For people who understand them, graphs tell a story. To the initiated, even a p-value and some summary statistics can help to tell a story. Part of the role of a statistician is to extract the story from the data. The role of a statistics teacher is to enable students first to recognise that there is a story, then to enable them to tell the story through the tools of analysis and communication.
This idea of statistics as story-telling is explained in a paper by Pfannkuch, Regan, Wild and Horton, Telling Data Stories: Essential Dialogues for Comparative Reasoning, which won the inaugural Journal of Statistics Education Best Paper Award.
Time series data, especially seasonal time series data, yields its story abundantly. For this reason I changed my mind about the teaching of time series analysis at high school. I used to think that it was far too complex for high school students and should be left to higher education. In a way that is true, but if you stick to the basic concepts, it is a contextually rich area of study.
Time series data is full of little hazards, not the least being auto-correlation. We can use moving averages to take out the bumps and exponential smoothing to be more responsive to more recent data. We can deseasonalise and fit a trend line, predict and then put the seasonality back in. There are weighty (in more ways than one) volumes dedicated to time series analysis and the various discoveries and inventions that have helped us draw meaning from the past and forecast the future.
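To make those steps concrete, here is a minimal sketch in Python of one way they could fit together. It assumes an additive seasonal model and a straight-line trend, which is only one of many possible choices. The function names and the quarterly sales figures are made up for illustration, and this is not how iNZight or any other package does it.

```python
import numpy as np

def centred_moving_average(y, period):
    """Smooth out the seasonal bumps; even periods use a 2 x period average."""
    y = np.asarray(y, dtype=float)
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    out = np.full(len(y), np.nan)  # the ends cannot be averaged, so stay NaN
    out[half:len(y) - half] = np.convolve(y, weights, mode="valid")
    return out

def exponential_smoothing(y, alpha):
    """Simple exponential smoothing; larger alpha follows recent data more closely."""
    y = np.asarray(y, dtype=float)
    s = np.empty(len(y))
    s[0] = y[0]
    for t in range(1, len(y)):
        s[t] = alpha * y[t] + (1 - alpha) * s[t - 1]
    return s

def seasonal_effects(y, period):
    """Average gap between the data and the moving average, for each season."""
    y = np.asarray(y, dtype=float)
    detrended = y - centred_moving_average(y, period)
    effects = np.array([np.nanmean(detrended[k::period]) for k in range(period)])
    return effects - effects.mean()  # centre so the effects sum to zero

def forecast(y, period, steps):
    """Deseasonalise, fit a trend line, project it, then put seasonality back in."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    effects = seasonal_effects(y, period)
    deseasonalised = y - effects[t % period]
    slope, intercept = np.polyfit(t, deseasonalised, 1)
    future_t = np.arange(len(y), len(y) + steps)
    return intercept + slope * future_t + effects[future_t % period]

# Made-up quarterly sales: a steady upward trend plus a repeating seasonal pattern.
quarters = np.arange(12)
pattern = np.array([10.0, -5.0, -15.0, 10.0])
sales = 100 + 2 * quarters + pattern[quarters % 4]

print(forecast(sales, period=4, steps=4))            # the next four quarters
print(exponential_smoothing(sales, alpha=0.3)[-3:])  # the last three smoothed values
```

Swapping the straight line for a different trend, or the additive seasonal effects for multiplicative ones, would give a different model and a different forecast from the same data.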
Because of the inherent complexity of time series analysis, I used to think that time series was not an appropriate part of the high school curriculum.
However, if a storytelling approach is used, backed up by appropriate software, then time series is a wonderful introduction to statistics. It is a good example of modelling, it has clear purpose, and the contexts can be fascinating.
Time series analysis is a clear example of the concept of a model, as there are so many different ways that it is possible to model a set of time series data. In contrast, when you teach linear regression with only one possible predictor variable, on data that is nicely behaved, there is generally one sensible model to use. This gives students the idea that you are trying to find “the right model”. This is not the case with time series, where different modelling choices lead to different models of the same data.
Another selling-point for time series analysis is that its main function is forecasting. We all want to have crystal balls that can predict the future. The main reason we study a time series is to understand the patterns of data so that we can project into the future, usually for economic reasons. There is no question of “Why are we doing this, Miss?”, as the purpose of the analysis is self-evident.
There are numerous economic time series available from official statistics sites. In New Zealand I went to Infoshare and in the US there is Economagic. Some of the series are fascinating. (I like the three peaks per year in jewellery sales in the US – December, February and May.)
Analysis can be difficult, and Excel is hideous for time series graphing and deseasonalising. A free front end for R, called iNZight, has been set up and enables straightforward time series analysis. One drawback is that it only allows for one model, which I fear perpetuates the “there is one model” mindset.
But the opportunities for storytelling are there. You can talk about trend, seasonality, variation, and the relative contribution of each. As teachers and students are exposed to more and more time series graphs, they are better able to tell stories. The graphs of the seasonal shape are rich with story-telling potential.
To support this we have made four videos about time series analysis, and an app, which is still in the pipeline. We hope that these will help develop the confidence of teachers and students to tell stories about time series data. We also have further quizzes and a step-by-step guide to writing up a time series analysis.
For teachers with limited access to computer resources, I have an earlier post with some ideas of how to overcome this problem and emphasise the story in time series data: Teaching Time Series with Limited Computer access.