In 1987 George Cobb published a paper evaluating statistics textbooks. I am very grateful for it, as it alerted me to the problems with textbooks, and introduced me to the man himself, whose work I greatly admire. Cobb explains that statistics is an inherently interesting and practical subject, but that many textbooks seem to have missed that, or concealed it from the students.

The discipline of statistics is inherently fascinating, applied and important. So why do so many textbooks make it seem mechanistic and abstract? I have been examining textbooks, and wonder if the writers even like their subject matter, or the students they are supposed to be reaching.

I am particularly interested in textbooks for non-mathematicians. The majority of students of statistics are not mathematicians, and are not planning to take any more statistics than they are required to. These students don’t like mathematics. They feel uneasy about taking the course. They are required to take a statistics course as part of their business, psychology or health sciences major. They aren’t even sure why they need to take the course, and hope to get it over and done with and forget about the experience as soon as possible. A previous post talks about how to help students who are feeling negatively towards the course. A textbook for these students needs to get the tone and content right.

A friendly but authoritative tone is important. Some textbooks go too far and become corny in their chattiness. It’s nice to be friendly, but it can get tiresome, and the examples can be too cute. Most, though, are just too dry, with too many words and far too many equations and algorithms. They seem bent on protectionism rather than empowerment.

Even more important is the choice of content, and I find this fascinating. I wonder what course some textbooks are designed for. A telling chapter is regression. Regression is an important statistical technique. But what do we tell students about regression? Here is how I have recently seen it done. Provide an example of real data taken from the web. Introduce the problem, then let them wait until the end to find out where you are going. Give the mathematical way of expressing a line, using Greek letters. Derive the least squares method of line fitting. Calculate the line by hand. Interpret the slope and the intercept. Calculate the coefficient of determination by hand. Interpret it. Define the residuals, and calculate them. Calculate the F-statistic and t-statistics. Interpret them. Then finish off the story you started at the beginning of the chapter (not that anyone cares anymore).

Some of you may be wondering what is wrong with that. Good – it means I am not preaching to the choir.

Students need to see the whole picture from the beginning. If you absolutely MUST do the mathematics, put it at the end of the chapter for the keen students, but don’t do the maths in the body of the text and scare the others. Do not assume the readers know how to interpret a line. Most don’t. Start with some examples that explain the context, show the line, and explain and apply the model equation. Next work through one example thoroughly, using computer output. Explain the different values and talk about what applies to the sample, and what helps us to generalize to the population. Then provide some more examples, making sure many of them are not statistically significant, some have negative slopes, and all are solving a problem using a sufficiently large sample of real data. Then give them a template for writing up a regression, explaining the different parts. Finally, if you must, you can give them the mathematics. This may keep the instructors happy so that they will buy your book.
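
For the keen readers, here is a minimal sketch in Python of the kind of computer output I have in mind, so that the effort goes into interpreting the numbers rather than producing them. The function name `fit_line` and the price and sales figures are invented placeholders of mine, not real data and not taken from any textbook; a real chapter should use a decent-sized sample of real data.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus R-squared.

    Assumes x and y are equal-length lists and that neither is constant.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)  # share of variation in y explained
    return slope, intercept, r_squared

# Invented placeholder values (NOT real data): price in dollars, units sold.
price = [1, 2, 3, 4, 5, 6, 7, 8]
units = [20, 19, 17, 18, 15, 14, 15, 12]

slope, intercept, r_squared = fit_line(price, units)
print(f"For each extra dollar, sales change by about {slope:.2f} units.")
print(f"Predicted sales at a price of zero: {intercept:.2f} units.")
print(f"Proportion of variation explained: {r_squared:.2f}")
```

The slope, intercept and R-squared here describe the sample only. The t-statistics and F-statistic that help us generalize to the population are better left to a statistics package, with the text explaining how to read them.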

Another telling bit of content is a textbook’s approach to ordinal data. In my video about types of data two instructors argue over whether it is permissible to calculate the mean for ordinal data. It ends with them calling each other “nit-picking mathematician” and “sloppy social scientist”. My approach is to take the middle ground. It is not ideal mathematically to calculate a mean for ordinal data, but much of the time people do, so it is best to know why it may cause problems and that there is an issue, rather than pretending that it never happens. Look in the textbook. I would be wary of any text that states categorically that you cannot find the mean for ordinal data.
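
To show why the mean can cause problems, here is a small sketch in Python. The response labels, the two numeric codings and the two groups are all made up by me for illustration. Both codings keep the categories in the same order, which is all that ordinal data promises, yet they disagree about which group has the higher mean. The median does not change its mind.

```python
from statistics import mean, median_low

LEVELS = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]

# Two hypothetical codings that both preserve the order of the categories.
EQUAL_STEPS = dict(zip(LEVELS, [1, 2, 3, 4, 5]))
STRETCHED_TOP = dict(zip(LEVELS, [1, 2, 3, 4, 8]))

# Invented responses, NOT real survey data.
group_a = ["neutral"] * 6
group_b = ["strongly disagree"] * 4 + ["strongly agree"] * 2

def summarise(responses, coding):
    """Return (mean, median) of the responses under the given coding."""
    codes = [coding[r] for r in responses]
    # median_low always returns a value that occurs in the data,
    # which suits ordinal categories.
    return mean(codes), median_low(codes)

for name, coding in [("equal steps", EQUAL_STEPS), ("stretched top", STRETCHED_TOP)]:
    mean_a, median_a = summarise(group_a, coding)
    mean_b, median_b = summarise(group_b, coding)
    print(f"{name}: mean A={mean_a:.2f}, mean B={mean_b:.2f}, "
          f"median A={median_a}, median B={median_b}")
```

With equal steps, group A has the higher mean; stretch the top category and group B overtakes it, even though nobody changed their answer. That is the issue a textbook should explain, rather than simply banning the calculation.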

There is also the issue of the purpose of the text, both its place in the course and its place in the lives of the students. Textbooks can take different roles in courses, largely as a function of the confidence and competence of the instructor. A novice instructor, unsure of the material, is well-advised to stick closely to the textbook. But an experienced and engaged instructor will find the text less and less important, and more a peripheral second opinion and source of homework exercises. The internet and Wikipedia have replaced the textbook as the source of background knowledge. We suspect a textbook is used more as an expensive combination of talisman and doorstop by the students.

“Judge a book by its exercises and you cannot go far wrong,” said George Cobb. All exercises in statistics should have context. There is no place for fitting a line by hand calculation to a set of five points with no context. Leave that to mathematics courses. Statistics is about context, and all examples need to reflect that. The data should be real data, so that an interesting result is authentic, not just something dreamed up by the instructor. The data should even be dirty occasionally (but not too early in the course, and not without warning). And there should be enough data. Don’t perpetuate bad habits by using too few data.

Having said all this, I do wonder what the role of textbooks is in the education of the future. On-line materials, which can be frequently updated, and crowd-sourced explanations such as those found on Wikipedia and elsewhere can fill the place of a textbook.

Or there is always our app – AtMyPace: statistics, which uses video and interactive lessons to teach some important concepts. We are now working to bring this to the web so all can use it. And then maybe I should write a textbook. 😉

## 16 Comments

So… let’s hear some engineering statistics textbook recommendations!

That’s a tricky one. Engineers probably understand formulas, so you don’t need to keep them out of sight. However I suspect, like mathematicians, they may miss the point of it all, and dislike the glorious fuzziness and subjectivity that real statistical analysts understand and revel in. A textbook has to deal with that. Hmmm.

Exactly. I don’t have a recommendation, but note that for engineers (& a good many mathematics major students) the words and uncertainty cause difficulties.

Write that textbook please!

And also – I can’t wait until you bring your app to the web. That’s a great idea.

We are using Moodle to convert the app to the web. We could have it available within a few weeks – well a rough version.

I’ll get onto it this afternoon! I think short is good.

I’m not sure if this is entirely relevant, but because this book was by far the best “textbook” I have read, I’ve always thought of it as what I’d try to emulate if possible. It was an economics textbook, but economics is a similar subject in that it scares the students away before they even start on it. It’s by the late Dr. Paul Heyne (U. Washington), who was legendary for teaching very large (and popular) classes. The book was called The Economic Way of Thinking.

I agree but I think you may have overlooked “Biostatistics: The Bare Essentials” by Geoffrey R. Norman and David L. Streiner http://books.google.com/books?id=8rkqWafdpuoC&dq=norman+streiner. It is extremely funny and fun to read.

I would love to see an implementation of their examples and thought processes in R.

Thanks for the suggestion. Looks a wee bit advanced for my students. Andy Field wrote a not-bad one using SPSS, but it is a bit too cutesy for my liking, and too advanced. It’s a fine line. What’s fun for skimming can get irritating for a whole textbook.

Here are a couple of light reads. They’re not suitable as textbooks, but the Gonick book does have an entertaining and informative explanation of basic principles. I’ve put the Gonick book on reserve for my elementary stats students, but I don’t know if they’ve used it.

http://www.amazon.com/gp/product/0062731025

http://www.amazon.com/gp/product/1593271891

[…] another previous post I lamented how “Statistics Textbooks suck out all the fun.” I cited the work by George Cobb, reviewing textbooks in […]

[…] I said in a previous post Statistics Textbooks suck out all the fun. Very few textbooks do no harm. I wonder if this site could provide a database of statistics texts […]

I find the “Primer of Biostatistics” by Glantz admirable. http://www.amazon.com/dp/0071781501

It contains many well-chosen and motivating medical examples suitable for the audience, with statistical rules of thumb, non-technical explanations and fully worked example problems. Sometimes I think he is too wordy and I’d like more summary boxes and clearer signposts for choosing tests. I agree with you about Andy Field’s book on Statistics with SPSS: a little too chatty and sometimes even kooky, but the explanations are mostly clear. Thanks for your helpful website and videos.

[…] Or cease to use a textbook. Or write one of your own. The first thing I ever read of George Cobb’s was an analysis of textbooks, back in the later years of last century. I strongly agree with his analysis that the questions were the most important part. This is even more applicable in these days of free online information of varying value. Depending on how confident the instructor is, a textbook can be a great help, but often they are expensive doorstop/lucky charm combinations […]