# George E. P. Box

**George Edward Pelham Box** (October 18, 1919 – March 28, 2013) was a British mathematician and professor of statistics at the University of Wisconsin, and a pioneer in the areas of quality control, time series analysis, design of experiments and Bayesian inference. He was the son-in-law of Sir Ronald Fisher.

## Quotes

- Statistical criteria should (1) be sensitive to change in the specific factors tested, (2) be insensitive to changes, of a magnitude likely to occur in practice, in extraneous factors.
- G. E. P. Box (1955); cited in: JOC/EFR (2006) "George Edward Pelham Box" at history.mcs.st-and.ac.uk, Nov 2006.

- We have a large reservoir of engineers (and scientists) with a vast background of engineering know how. They need to learn statistical methods that can tap into the knowledge. Statistics used as a catalyst to engineering creation will, I believe, always result in the fastest and most economical progress…
- Statement of 1992, quoted in *Introduction to Statistical Experimental Design — What is it? Why and Where is it Useful?* (2002) Johan Trygg & Svante Wold

- All models are wrong; some models are useful.
- For instance in George E. P. Box, William Hunter and Stuart Hunter, *Statistics for Experimenters*, second edition, 2005, page 440. See "All models are wrong".

### *Science and Statistics* (1976)

- George E. P. Box (1976) *Science and Statistics*, *Journal of the American Statistical Association*, Vol. 71, No. 356 (Dec. 1976), pp. 791-799

- One important idea is that science is a means whereby learning is achieved, not by mere theoretical speculation on the one hand, nor by the undirected accumulation of practical facts on the other, but rather by a motivated iteration between theory and practice.
- p. 791

- Since all models are wrong the scientist cannot obtain a "correct" one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity.
- p. 792

- The researcher hoping to break new ground in the theory of experimental design should involve himself in the design of actual experiments. The investigator who hopes to revolutionize decision theory should observe and take part in the making of important decisions.
- p. 792

- For the theory-practice iteration to work, the scientist must be, as it were, mentally ambidextrous; fascinated equally on the one hand by possible meanings, theories, and tentative models to be induced from data and the practical reality of the real world, and on the other with the factual implications deducible from tentative theories, models and hypotheses.
- p. 792

- A man in daily muddy contact with field experiments could not be expected to have much faith in any direct assumption of independently distributed normal errors.
- p. 795

- The penalty for scientific irrelevance is, of course, that the statistician's work is ignored by the scientific community.
- p. 798

### *Empirical Model-Building and Response Surfaces* (1987)

- Box, G. E. P., and Draper, N. R. (1987), *Empirical Model-Building and Response Surfaces*, John Wiley & Sons, New York, NY.

- An innovative discussion of building empirical models and the fitting of surfaces to data. Introduces the general philosophy of response surface methodology, and details least squares for response surface work, factorial designs at two levels, fitting second-order models, adequacy of estimation and the use of transformation, occurrence and elucidation of ridge systems, and more. Some results are presented for the first time. Includes real-life exercises, nearly all with solutions.
- Introduction, book summary

- A mechanistic model has the following advantages:

1. It contributes to our scientific understanding of the phenomenon under study.

2. It usually provides a better basis for extrapolation (at least to conditions worthy of further experimental investigation if not through the entire range of all input variables).

3. It tends to be parsimonious (i.e., frugal) in the use of parameters and to provide better estimates of the response.
- pp. 13-14; as cited in: Andrew Odlyzko (2010) "Social Networks and Mathematical Models", *Electronic Commerce Research and Applications* 9(1): 26-28

- Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.
- p. 74

### *An Accidental Statistician* (2010)

- George E.P. Box (2010) "An Accidental Statistician". On stat.wisc.edu, 3 June 2010

- I want to tell you how I got to be a statistician. I was, of course, born in England and in 1939... when war broke out in September of that year, although I was close to getting a degree in Chemistry, I abandoned that and joined the Army. They put me in the Engineers (and when I see a bridge I still catch myself calculating where I would put the charges to blow it up).

Before I could actually do any of that I was moved to a highly secret experimental station in the south of England. At the time they were bombing London every night and our job was to help to find out what to do if, one night, they used poisonous gas.

Some of England's best scientists were there. There were a lot of experiments with small animals, I was a lab assistant making biochemical determinations, my boss was a professor of physiology dressed up as a colonel, and I was dressed up as a staff sergeant.

The results I was getting were very variable and I told my colonel that what we really needed was a statistician.

He said "we can't get one, what do you know about it?" I said "Nothing, I once tried to read a book about it by someone called R. A. Fisher but I didn't understand it". He said "You've read the book so you better do it", so I said, "Yes sir"

## About George E. P. Box

- [Box's 1960 paper *Fitting empirical data* is] a mature exposition of an important branch of statistics, to which the author has made great contributions. One feature of particular interest is practical discussion of genuinely nonlinear fitting problems and their solution with the help of tact and a special, publicly available, IBM-704 program. Another is insightful comments on the role of prior distributions in statistics.
- Leonard Jimmie Savage in the 1960s; cited in: JOC/EFR (2006) "George Edward Pelham Box" at history.mcs.st-and.ac.uk, Nov 2006.

- George Box is, in the field of the quality sciences, the consummate ‘Renaissance man’ who has made significant and enduring contributions to the profession of quality control and the allied arts and sciences... [His] contributions encompass considerable scope and have already had lasting effect.
- Frank Caplan (1996), then-editor of *Quality Engineering*, cited at "George E.P. Box: Accomplishments in statistics" as at asq.org, 2013.