
Rocks on a stream near Castle Rock Village, New Zealand

For many people, a graph is not obvious. Let me illustrate:
Here are two graphs showing the results from two classes of students in some mythical test out of 10. Have a look at them and decide which one shows more variation. I won’t embarrass you by asking which one you chose.
Actually I might. Try not to look at the answer before you answer this. I’ll put a pretty picture that you will have to scroll past to help you not cheat.

The answer is that Class A shows more variation. This is a little counter-intuitive, as Class A looks nice and even, and not really varied at all. Each of the bars is about the same height, so it seems that there is not very much variation. In contrast, Class B is all up and down, and looks to have more variation. This is how MANY of my students, and even some of my tutors, think before they take my course. However, they are wrong: the height of a bar in these graphs tells you how many students got each score, and it is the spread of the scores themselves that measures variation.
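To check this kind of claim numerically, here is a minimal Python sketch. The frequency tables are invented for illustration (they are not the data behind the graphs above): `class_a` is flat, with every score from 1 to 10 occurring twice, and `class_b` is an "up and down" set of bars bunched in the middle. The helper name `spread_from_frequencies` is also just a label chosen here.

```python
from statistics import pstdev

def spread_from_frequencies(freq):
    """Expand a {score: count} table into individual scores
    and return their population standard deviation."""
    values = [score for score, count in freq.items() for _ in range(count)]
    return pstdev(values)

# Hypothetical frequency tables for two classes of 20 students each.
class_a = {score: 2 for score in range(1, 11)}   # flat bars: every score occurs twice
class_b = {3: 1, 4: 5, 5: 2, 6: 7, 7: 1, 8: 4}   # uneven bars, scores bunched in the middle

sd_a = spread_from_frequencies(class_a)
sd_b = spread_from_frequencies(class_b)

print(f"Class A standard deviation: {sd_a:.2f}")
print(f"Class B standard deviation: {sd_b:.2f}")
```

With these made-up numbers the flat graph has the larger standard deviation, because its scores reach all the way from 1 to 10 while the bumpy graph's scores stay between 3 and 8.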

## What kind of graph is it?

Excel is a fabulous tool, which makes the production of graphs so simple that we can do it without thinking. Problem is, that's what happens a lot of the time. Often column graphs are used to show values, rather than frequencies, and it can be difficult to tell what is going on. Let us now look at Class A shown as a value graph. Each of the bars now represents one student, and the students are in ID order along the x axis.

It can be difficult for people to see that this is the same data as the graph of Class A given above. What you can do is look at how many students scored 1 in the test. You can see that there are two students, student 7 and student 15. Similarly there are two students who scored 10, student 5 and student 9. If you then look up at the frequency graph of Class A, you will see that the frequency value for a score of 1 is 2 – indicating that 2 students scored 1 out of 10. Similarly the frequency value for a score of 10 is 2.
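The step from values to frequencies can be sketched in a few lines of Python. The list below is hypothetical: it only assumes a class of 20, and it is arranged so that students 7 and 15 score 1 and students 5 and 9 score 10, as described above. All the other scores are invented.

```python
from collections import Counter

# Hypothetical value data: position i holds the score of student i + 1.
scores_by_student = [4, 7, 2, 9, 10, 3, 1, 6, 10, 5,
                     8, 3, 6, 2, 1, 9, 4, 7, 5, 8]

# Frequency data: how many students obtained each score.
frequencies = Counter(scores_by_student)

# Which students scored 1? (IDs start at 1, list positions at 0.)
students_scoring_1 = [i + 1 for i, s in enumerate(scores_by_student) if s == 1]

for score in sorted(frequencies):
    print(score, frequencies[score])
```

The value graph plots `scores_by_student` directly, one bar per student. The frequency graph plots `frequencies`, one bar per score. Same data, two very different pictures.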
If you then look at the value graph for Class B, you can see that it does not vary as greatly. Another question to ask is, if there were no variation at all, what would the frequency graph look like? Well, the students would all get the same mark, so there would be one bar, thirty high, on one of the scores. What would it look like to have the maximum variation? I'm pretty sure having half the data at each end will do that. What is important is to get students to recognise the difference between frequencies and values.
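Those two extremes are easy to test with a rough sketch. It assumes a class of thirty and scores running from 1 to 10, and uses the population standard deviation as the measure of variation. The three lists are constructed for the purpose, not taken from any real class.

```python
from statistics import pstdev

# No variation: all thirty students get the same mark (6 is an arbitrary choice).
no_variation = [6] * 30

# Half the class at each end of the scale.
max_variation = [1] * 15 + [10] * 15

# For comparison: a flat frequency graph, three students on every score.
evenly_spread = [score for score in range(1, 11) for _ in range(3)]

sd_none = pstdev(no_variation)
sd_max = pstdev(max_variation)
sd_even = pstdev(evenly_spread)

print(sd_none, sd_max, round(sd_even, 2))
```

The single tall bar gives a standard deviation of zero, the two bars at the ends give the largest value of the three, and the flat graph sits in between.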

## Labelling

These graphs are correctly labelled so that a close inspection will yield correct insights. Sadly this is not always the case, making it more difficult to draw correct conclusions.

## Does this matter?

It matters greatly. As citizens in the world we are frequently exposed to graphs, some of which aim to instruct and some to obscure. Some obscure without even trying. All our students need to be able to recognise the difference between a value graph and a frequency graph, and to compare two graphs correctly. Those who go on to produce or promulgate graphs, such as scientists and journalists, should be well aware of the many pitfalls in graphs, one of which is the confusion between value and frequency graphs.