Recently I was discussing the process we use in a statistical enquiry. The ideal is that we start with a problem and follow the statistical enquiry cycle through the steps Problem, Plan, Data collection, Analysis and Conclusion, which then may lead to other enquiries.
I have previously written a post suggesting that the cyclical nature of the process was overstated.
The context of our discussion was a video I am working on, which acknowledges that we often start, not at the beginning, but in the middle, with a set of data. This may be because in an educational setting it is too expensive and time-consuming to require students to collect their own data. Or it may be that, as statistical consultants, we are brought into an investigation once the data has been collected, and are needed to make some sense out of it. Whatever the reason, it is common to start with the data, and then loop backwards to the Problem and Plan phases, before performing the analysis and writing the conclusions.
We, a group of statistical educators, were suggesting what we would do with a data set, which included looking at the level of measurement, the origins of the data, and the possible intentions of the people who collected it. One teacher said she suggests that her students do exploratory scatter plots of all the possible pairings, as well as comparative dotplots and boxplots. The students can then choose a problem that is likely to show a relationship – because they have already seen that there is a relationship in the data.
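For readers who want to try that exploratory step themselves, here is a minimal sketch in Python. The file name and column names are invented purely for illustration, and seaborn is just one convenient way to get all the pairwise plots at once.

```python
# A sketch of the "plot every pairing" exploratory step described above.
# The file name and column names are hypothetical, purely for illustration.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv("class_survey.csv")   # hypothetical data set

# Scatter plots of every pair of numeric variables at once
sns.pairplot(df.select_dtypes("number"))
plt.show()

# Comparative boxplots of a numeric variable split by a categorical one
sns.boxplot(data=df, x="gender", y="hours_exercise")   # hypothetical columns
plt.show()
```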
I have a bit of a problem with this. It is fine to get an overview of the relationships in the data – that is one of the beauties of statistical packages. And I can see that, for an assignment, it is more rewarding for students to have a result they can discuss. If they get a null result there is a tendency to think that they have failed. Yet the lack of evidence of a relationship may be more important than evidence of one.

The problem is that we value positive results over null results. This is a known problem in academic journals, and many words have been written about the over-occurrence of type 1 errors in the literature, or publication bias. Let me illustrate. A drug manufacturer hopes that drug X is effective in treating depression. In reality drug X is no more effective than a placebo. The manufacturer keeps funding different tests by different scientists. If all the experiments use a significance level of 0.05, then about 5% of the experiments will produce a type 1 error and say that there is an effect attributable to drug X. These (false) positive results can be published, because academic journals prefer positive results to null results. Conversely, the much larger number of researchers who correctly concluded that there is no effect do not get published, and the abundance of evidence to the contrary remains invisible. To be fair, it is hoped that these researchers will be able to refute the false positive paper.
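To make the arithmetic concrete, here is a small simulation sketch (not from the original post). The trial size and number of trials are invented, but the point stands: when the drug does nothing, roughly one experiment in twenty still comes out "significant" at the 0.05 level, and those are the experiments most likely to reach a journal.

```python
# A rough simulation of the drug X scenario: the drug truly has no effect,
# yet about 5% of independent trials declare one at alpha = 0.05.
# Trial size and number of trials are invented for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_trials, n_per_arm, alpha = 1000, 50, 0.05

false_positives = 0
for _ in range(n_trials):
    placebo = rng.normal(0, 1, n_per_arm)   # both arms drawn from the same distribution,
    drug_x = rng.normal(0, 1, n_per_arm)    # i.e. no real effect of drug X
    _, p = stats.ttest_ind(drug_x, placebo)
    if p < alpha:
        false_positives += 1

print(f"{false_positives} of {n_trials} trials were 'significant' "
      f"({false_positives / n_trials:.1%}) -- the ones most likely to be published.")
```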
So where does this leave us as teachers of statistics? Awareness is a good start. We need to show null effects and explain why they are important. For every example we give that ends up rejecting the null hypothesis, we need to have an example that does not. Textbooks tend to over-include results that reject the null, so that when students meet a non-significant result they are left wondering whether they have made a mistake. In my preparation of learning materials, I endeavour to keep a good spread of results – strongly positive, weakly positive, inconclusive, weakly negative and strongly negative. This way students come to accept a null result, and know what to say when they get one.
Another example is in the teaching of time series analysis. We love to show series with strong seasonality, because it tells a story. (See my post about time series analysis as storytelling.) Retail sales nearly all peak in December, and various goods have other peaks. Jewellery retail sales in the US have small peaks in February and May, and it is fun working out why. Seasonal patterns seem like magic. However, we also need to let students analyse data that does not have a strong seasonal pattern, so that they learn that such series exist too!
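One way to put a strongly seasonal series next to one with no seasonal story is sketched below. Both series are synthetic, and the decomposition via statsmodels is my own choice of tool, not something from the original post; in class, real retail data would take the place of the made-up series.

```python
# A sketch contrasting a seasonal and a non-seasonal monthly series.
# Both series are synthetic; real retail data would replace them in class.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

rng = np.random.default_rng(7)
months = pd.date_range("2015-01", periods=60, freq="MS")

seasonal = pd.Series(
    100 + 20 * np.sin(2 * np.pi * months.month / 12) + rng.normal(0, 3, 60),
    index=months)
no_pattern = pd.Series(100 + rng.normal(0, 3, 60), index=months)

for name, series in [("seasonal", seasonal), ("no clear season", no_pattern)]:
    result = seasonal_decompose(series, model="additive", period=12)
    # If the seasonal component is small relative to the residual noise,
    # there is no seasonal "story" to tell about this series.
    print(name,
          "seasonal std:", round(float(result.seasonal.std()), 2),
          "residual std:", round(float(result.resid.std()), 2))
```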
My final research project before leaving the world of academia involved an experiment on the students in my class of over 200. It was difficult to get through the human ethics committee, but it made it in the end. The students were divided into two groups: half were followed up weekly by tutors if they were not keeping up with assignments and testing, while the other half were left to their own devices, as had previously been the case. The interesting result was that it made no difference to the pass rate; in fact the proportion of passes was almost identical in the two groups. This was a null result. I had supposed that following up and helping students to keep up would increase their chances of passing the course. But it didn't. This important result saved us money in terms of tutor input in following years. Though it felt good to be helping our students more, it didn't actually help them pass, so it was not justifiable in straitened financial times.
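For anyone wanting to show students how such a comparison might be written up, a two-proportion test is one option. The counts below are invented to roughly match a class of just over 200; they are not the actual data from the study described above.

```python
# A hypothetical write-up of a null result like the tutoring study above.
# The counts are invented (NOT the actual class data).
from statsmodels.stats.proportion import proportions_ztest

passes = [78, 79]      # passes in the followed-up and control groups (hypothetical)
enrolled = [105, 106]  # students enrolled in each group (hypothetical)

stat, p_value = proportions_ztest(passes, enrolled)
print(f"z = {stat:.2f}, p = {p_value:.2f}")  # a large p-value: no evidence of a difference
```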
I wonder if it would have made it into a journal.
By the way, my reference to the silent dog in the title is to the famous Sherlock Holmes story, Silver Blaze, in which the fact that the dog did not bark was important, because it showed that the intruder was someone the dog knew.
3 Comments
Very nice. I see these type 1 errors every day.
Well, as an English major, you lost me with the example of the number of icecreams purchased being a “random variable.” Certainly that number would be variable, but I fail to see how it could be random, given the various factors that might influence that outcome: The time of day, the year, the longitude/latitude, the cultural influences just to name a few things that would impinge on my notion of randomness. Cheers.
Hi – sorry I took so long to reply. The problem is with the word “random”, which has a particular meaning in everyday life that is different from its meaning in statistics. A random variable is one whose value is not known in advance, even though many factors, like the ones you list, influence the outcome.