First there is the problem of lexical ambiguity. The colloquial meanings of “random” don’t tie in well with its technical, domain-specific meanings.
Then there is the fact that people can’t actually be random.
Then there is the problem of equal chance vs displaying a long-term distribution.
And there is the problem that there are several conflicting ideas associated with the word “random”.
In this post I will look at these issues, and ask some questions about how we can better teach students about randomness and random sampling. The same problem exists for many domain-specific terms whose colloquial meanings hinder comprehension of the idea in question. You can read about more of these words, and some teaching ideas, in the post Teaching Statistical Language.
First there is lexical ambiguity. Lexical ambiguity is the term for a word having more than one meaning. Kaplan, Rogness and Fisher write about this in their 2014 paper “Exploiting Lexical Ambiguity to Help Students Understand the Meaning of Random.” I recently studied this paper closely in order to present the ideas and findings to a group of high school teachers. I found the concept of leveraging lexical ambiguity very interesting. As an intervention, Kaplan et al. introduced a picture of “random zebras” to represent the colloquial meaning of random, and a picture of a hat to represent the idea of taking a random sample. I think it is a great idea to have pictures representing the different meanings, and it might be good to get students to come up with their own.
The first meaning of random describes something happening without pattern, method or conscious decision. An example is “random violence”.
Example: She dressed in a rather random fashion, putting on whatever she laid her hand on in the dark.
Most online dictionaries also give a statistical definition, which specifies that each item has an equal probability of being chosen.
Example: The students’ names were taken at random from a pile, to decide who would represent the school at the meeting.
One colloquial meaning: something random is either unknown, unidentified, or out of place.
Example: My father brought home some random strangers he found under a bridge.
Another colloquial meaning for random is odd and unpredictable in an amusing way.
Example: My social life is so random!
There has been considerable research into why people cannot produce a sequence of random numbers that looks like a truly randomly generated sequence. In our minds we like things to be shared out evenly, so the sequences people make up generally have fewer and shorter runs of the same number than a truly random sequence would.
Animals aren’t very random either, it seems. Yesterday I saw a whole lot of sheep in a paddock. While they weren’t exactly lined up, there was a pretty similar distance between all the sheep.
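One way to let students see this for themselves is to compare the longest run in a made-up sequence with the longest run in simulated coin flips. The following is a minimal sketch, not taken from the research mentioned above; the function names `longest_run` and `mean_longest_run`, the sequence length and the seed are all my own illustrative choices.

```python
import random


def longest_run(seq):
    """Return the length of the longest run of identical consecutive items."""
    best = 0
    current = 0
    previous = object()  # sentinel that equals nothing in the sequence
    for item in seq:
        current = current + 1 if item == previous else 1
        best = max(best, current)
        previous = item
    return best


def mean_longest_run(n_flips=100, n_trials=2000, seed=1):
    """Average longest run over many simulated sequences of fair coin flips.

    n_flips, n_trials and seed are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        flips = [rng.choice("HT") for _ in range(n_flips)]
        total += longest_run(flips)
    return total / n_trials


if __name__ == "__main__":
    # A hypothetical "human-style" sequence that avoids repeats entirely.
    made_up = "HTHTHHTHTTHTHTHHTHTH"
    print("longest run in made-up sequence:", longest_run(made_up))
    print("mean longest run in simulated flips:", mean_longest_run())
```

Students could write down their own sequence of 100 imagined flips first, then compare its longest run with the simulated average, which tends to be longer than most people expect.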
In the paper quoted earlier, Kaplan et al used the following definition of random:
“We call a phenomenon random if individual outcomes are uncertain, but there is nonetheless a regular distribution of outcomes in a large number of repetitions.” From Moore (2007) The Basic Practice of Statistics.
Now to me, that does not insist that each outcome be equally likely, which matches with my idea of randomness. In my mind, random implies chance, but not equal likelihood. When creating simulation models we would generate random variates following all sorts of distributions. The outcomes would be far from even, but in the long run they would display a distribution similar to the one being modelled.
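To illustrate the point about simulation, here is a small sketch that draws random variates from a deliberately uneven distribution and compares the observed proportions with the ones being modelled. The outcome labels, the weights and the name `empirical_distribution` are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def empirical_distribution(outcomes, weights, n_draws, seed=0):
    """Draw n_draws outcomes with the given weights and return observed proportions."""
    rng = random.Random(seed)
    draws = rng.choices(outcomes, weights=weights, k=n_draws)
    counts = Counter(draws)
    return {outcome: counts[outcome] / n_draws for outcome in outcomes}


if __name__ == "__main__":
    # Hypothetical, far-from-even distribution over three outcomes.
    outcomes = ["common", "occasional", "rare"]
    weights = [0.7, 0.2, 0.1]
    for n in (10, 1000, 100000):
        print(n, empirical_distribution(outcomes, weights, n))
```

Each individual draw is uncertain, and with only a handful of draws the proportions can be well off. With many draws they settle close to the weights, which is the "regular distribution of outcomes" in Moore's definition, without any requirement that the outcomes be equally likely.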
Yet the dictionaries and the later parts of the Kaplan paper insist that randomness requires equal opportunity to be chosen. What’s a person to do?
I propose that the meaning of the adjective “random” may depend on the noun that it is qualifying. There are random samples and random variables. There is also randomisation and randomness.
A random sample is a sample in which each object has an equal opportunity of being chosen, and each choice of object is by chance, and independent of the previous objects chosen. A random variable is one that can take a number of values, and will generally display a pattern of outcomes similar to a given distribution.
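The "equal opportunity" part of the random sample definition can be checked by simulation. The sketch below assumes a simple random sample drawn without replacement using Python's `random.sample`; the helper name `inclusion_frequencies`, the population of ten labelled objects and the sample size of three are my own assumptions for illustration.

```python
import random
from collections import Counter


def inclusion_frequencies(population, k, n_repeats, seed=0):
    """Estimate how often each object ends up in a random sample of size k."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_repeats):
        # Each call draws k distinct objects, every subset being equally likely.
        counts.update(rng.sample(population, k))
    return {item: counts[item] / n_repeats for item in population}


if __name__ == "__main__":
    population = list(range(10))  # ten hypothetical objects, labelled 0 to 9
    freqs = inclusion_frequencies(population, k=3, n_repeats=20000)
    for item, freq in freqs.items():
        print(item, round(freq, 3))
```

Under these assumptions every object should be included in roughly k/N of the samples (here 3 out of 10), which is the sense in which each object has an equal opportunity of being chosen.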
I wonder if the problem is that randomness is somehow equated with fairness. Our most familiar examples of true randomness come from gambling, with dice, cards, roulette wheels and lotto balls. In each case there is the requirement that each outcome be equally likely.
Bearing in mind the overwhelming evidence that the “statistical meaning” of randomness includes equality, I begin to think that it might not really matter if people equate randomness with equal opportunity.
However, if you think about medical or hazard risk, the story changes. Apart from known risk-increasing factors associated with lifestyle, whether a person succumbs to a disease appears to be random. But the likelihood of succumbing is not equal to the likelihood of not succumbing. Similarly there is a clear random element in whether a future child has a disability known to be caused by an autosomal recessive gene. It is definitely random, in that there is an element of chance, and that the effects on successive children are independent. But when both parents are carriers, the probability of a disability is one in four, not one in two. I suppose if you look at the outcomes as being which children are affected, there is an equal chance for each child.
But then think about a “lucky dip” containing many cheap prizes and a few expensive prizes. The choice of prize is random, but there is not an even chance of getting a cheap prize or an expensive prize.
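The lucky dip can be made concrete with a little arithmetic. In the sketch below the numbers (18 cheap prizes and 2 expensive ones) are invented for illustration, as is the function name `category_probabilities`; it assumes every individual prize in the dip is equally likely to be pulled out.

```python
from collections import Counter
from fractions import Fraction


def category_probabilities(prizes):
    """Probability of each prize category when every prize is equally likely."""
    counts = Counter(prizes)
    total = len(prizes)
    return {category: Fraction(count, total) for category, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical lucky dip: many cheap prizes, a few expensive ones.
    lucky_dip = ["cheap"] * 18 + ["expensive"] * 2
    for category, probability in category_probabilities(lucky_dip).items():
        print(category, probability)
```

Seen this way, the two ideas sit side by side: each individual prize has the same chance of being drawn, yet the chance of a cheap prize is very different from the chance of an expensive one.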
I think I have mused enough. I’m interested to know what the readers think. Whatever the conclusion is, it is clear that we need to spend some time making clear to the students what is meant by randomness, and a random sample.