The subject of statistics is rife with misleading terms. I have written about this before in such posts as Teaching Statistical Language and It is so random. But the terms sampling error and non-sampling error win the Dr Nic prize for counter-intuitivity and confusion generation.

Confusion abounds

To start with, the word error implies that a mistake has been made, so the term sampling error makes it sound as if we made a mistake while sampling. Well, this is wrong. And the term non-sampling error (why is this even a term?) sounds as if it is the error we make from not sampling. And that is wrong too. However, these terms are used extensively in the NZ statistics curriculum, so it is important that we clarify what they are about.
Fortunately the Glossary has some excellent explanations:

Sampling Error

“Sampling error is the error that arises in a data collection process as a result of taking a sample from a population rather than using the whole population.
Sampling error is one of two reasons for the difference between an estimate of a population parameter and the true, but unknown, value of the population parameter. The other reason is non-sampling error. Even if a sampling process has no non-sampling errors then estimates from different random samples (of the same size) will vary from sample to sample, and each estimate is likely to be different from the true value of the population parameter.
The sampling error for a given sample is unknown but when the sampling is random, for some estimates (for example, sample mean, sample proportion) theoretical methods may be used to measure the extent of the variation caused by sampling error.”

Non-sampling Error

“Non-sampling error is the error that arises in a data collection process as a result of factors other than taking a sample.
Non-sampling errors have the potential to cause bias in polls, surveys or samples.
There are many different types of non-sampling errors and the names used to describe them are not consistent. Examples of non-sampling errors are generally more useful than using names to describe them.”

And it proceeds to give some helpful examples.
These are great definitions, and I thought about turning them into a diagram, so here it is:

Table summarising types of error.


And there are now two videos to go with the diagram, to help explain sampling error and non-sampling error. Here is a link to the first:
Video about sampling error
One of my earliest posts, Sampling Error Isn’t, introduced the idea of using the terms “variation due to sampling” and “other variation” as a way to make sense of these ideas. The sampling video above is based on this approach.
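The difference between the two kinds of variation can be seen in a small simulation. What follows is only a minimal sketch under invented assumptions: the population of weights, the "flawed frame" that leaves out the heaviest 20% of items, and the helper name `summarise` are all hypothetical and chosen purely for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 weights in kg (invented for illustration).
population = [random.gauss(55, 8) for _ in range(10_000)]
true_mean = statistics.mean(population)

# A flawed sampling frame: the heaviest 20% can never be selected.
# This stands in for a non-sampling error such as undercoverage.
flawed_frame = sorted(population)[:8_000]


def summarise(frame, n, repeats=500):
    """Draw `repeats` random samples of size n from `frame`.

    Returns (average error of the sample mean, spread of the sample means),
    with error measured against the true mean of the whole population.
    """
    means = [statistics.mean(random.sample(frame, n)) for _ in range(repeats)]
    return statistics.mean(means) - true_mean, statistics.stdev(means)


results = {}
for label, frame in [("random sample", population), ("flawed frame", flawed_frame)]:
    for n in (25, 400):
        results[label, n] = summarise(frame, n)
        avg_error, spread = results[label, n]
        print(f"{label:13s} n={n:3d}  average error={avg_error:+.2f}  spread={spread:.2f}")
```

In both cases the spread of the sample means shrinks as the sample size grows, which is sampling error behaving as the theory says it should. The average error from the flawed frame, however, stays well below zero no matter how large the sample is: a bigger sample does nothing to fix non-sampling error.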
Students need lots of practice identifying potential sources of error in their own work, and in critiquing reports. In addition, I have found True/False questions surprisingly effective in practising the correct use of the terms. Whatever engages the students for a time in consciously deciding which term to use is helpful in getting them to understand and be aware of the concept. Then the odd terminology will cease to have its original confusing connotations.

27 Comments

  1. These concepts have been developed much further within the framework of total survey error. See the special issue of Public Opinion Quarterly on TSE: http://poq.oxfordjournals.org/content/74/5.toc, or at the very least the representation and measurement error branches of the TSE diagram, http://poq.oxfordjournals.org/content/74/5/849/F3.expansion.html.

  2. Nozipno Mahlalela says:

    Can you please explain more about sampling error, perhaps by giving an example?

    • Dr Nic says:

      Hi
      Another way of looking at it is to call it sampling variation. Say the true and unknown population mean weight of something is 55kg. We take a sample which happens to contain items that give a mean of 52kg. The sample may be representative and not have much non-sampling error at all, but there is sampling error.
      Or another example could be Lotto balls. In NZ there are 40 Lotto balls, numbered from 1 to 40, so the mean of them is 20.5. When 6 balls are drawn randomly, there is no non-sampling error, as this is a gambling machine that requires a high level of attention to eliminating bias and other non-sampling error. However, there is a high likelihood that any sample taken will have a mean different from 20.5. This is sampling error.
      I hope that helps
      Nic
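The Lotto example can be checked with a short simulation. This is a minimal sketch, assuming a perfectly fair machine (no non-sampling error) and an arbitrary choice of 10,000 simulated draws; the variable names are made up for illustration.

```python
import math
import random
import statistics

random.seed(2014)  # fixed seed so the run is repeatable

balls = list(range(1, 41))                 # NZ Lotto balls, numbered 1 to 40
population_mean = statistics.mean(balls)   # 20.5

# Simulate many fair draws of 6 balls, without replacement.
draws = [random.sample(balls, 6) for _ in range(10_000)]
draw_means = [sum(d) / 6 for d in draws]

# A draw has mean exactly 20.5 only when its six numbers add up to 123.
share_different = sum(sum(d) != 123 for d in draws) / len(draws)

# Theoretical standard error of the mean of 6 balls drawn without
# replacement from 40 (includes the finite population correction).
sigma = statistics.pstdev(balls)
theoretical_se = sigma / math.sqrt(6) * math.sqrt((40 - 6) / (40 - 1))

print(f"population mean:                 {population_mean}")
print(f"average of the draw means:       {statistics.mean(draw_means):.2f}")
print(f"share of draws with mean != 20.5: {share_different:.1%}")
print(f"simulated spread of draw means:  {statistics.stdev(draw_means):.2f}")
print(f"theoretical standard error:      {theoretical_se:.2f}")
```

Almost every individual draw misses 20.5, yet the draw means centre on 20.5 and their spread matches what theory predicts. That is sampling error on its own, with nothing having gone wrong in the sampling.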

      • G Chattapadhyay says:

        That’s a great way of teaching. Thanks.

      • Balqish says:

        Hello Dr. Nic,
        I still don’t understand what the differences are between sampling and non-sampling error. I hope that you can explain more to me, as I am not familiar with statistics. Thanks

  3. shady says:

    Your work is great. I would however love to see specific examples of sampling errors. God bless you in Jesus name.

    • Dr Nic says:

      I’m happy you like the blog.
      You can’t have examples of sampling error. Sampling error, or sampling variation, which is a better term for it, exists because you take a sample of the population. Any examples of mistakes you make while sampling are in fact non-sampling error.

  4. Ssesanga Enock says:

    Can you please explain more about the types of non sampling errors other than examples

  5. Mrunal gandhi says:

    can you tell me the non sampling error arise during the research study?

  6. Srivatsav Murthy says:

    Hi, can you please let me know – if my population size is 1000 items, out of which I select 100 items and do a quality check on the 100 items, and if I discover 6 errors, is the error percentage 6% (6/100) or 0.6% (6/1000)? I felt the 100 is representative of the 1000, so the errors discovered in the 100 items are taken as having been discovered from out of the 1000 items. Can you throw some light on this please? Thanks.

  7. Thank you, I had it all mixed up.

  8. NORBERT SMITH says:

    HEY, I JUST WANT THE SPECIFIC EXAMPLES OF BOTH TYPES OF ERRORS..

    • Dr Nic says:

      Hi Norbert
      I think you might need to read the post again. Basically any kind of error you think of is likely to be a non-sampling error. Sampling error occurs because the sample is not the whole population.
      Nic

  9. […] There are other problems – known as non-sampling error. I wrote a short post on it previously. […]

  11. Leonardo says:

    Thanks for your help! Could you explain this problem with sampling error:
    The percentage of a random sample of white respondents (N = 400) who say they have a favorable attitude toward the police is 53%. The percentage of a random sample of black respondents (N = 300) who say they have a favorable attitude toward the police is 45%.
    You are asked if there is a real difference between the percentage of whites and blacks who have a positive attitude toward the police in the larger population, or is this sample difference likely to have occurred by random chance or sampling error.
    How do you respond? Explain your answer.
    Construct a 95% confidence interval for the proportion of Blacks in the population who have a favorable attitude toward the police.

    • Dr Nic says:

      Hi Leonardo
      This looks a lot like a homework problem! You need to perform a test for the difference of two proportions to find the p-value. The p-value is the probability that you would get a difference at least this large through chance (sampling error) alone, if there were really no difference in the population. So really this question is not about sampling error.
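For readers who want to see the mechanics, here is a minimal sketch of the calculation referred to above, using the figures from the question. It assumes a pooled two-proportion z-test and a normal-approximation (Wald) confidence interval, which is only one of several reasonable choices; the variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the question.
n1, p1 = 400, 0.53   # first sample: size and proportion favourable
n2, p2 = 300, 0.45   # second sample: size and proportion favourable

# Two-proportion z-test using the pooled proportion (assumed method).
pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
se_diff = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se_diff
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided

# 95% confidence interval for the second proportion (normal approximation).
z_crit = NormalDist().inv_cdf(0.975)
se_p2 = sqrt(p2 * (1 - p2) / n2)
ci_low, ci_high = p2 - z_crit * se_p2, p2 + z_crit * se_p2

print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
print(f"95% CI for the second proportion: ({ci_low:.3f}, {ci_high:.3f})")
```

A small p-value says that sampling error alone would rarely produce a difference this large. It says nothing about non-sampling error, such as how the respondents were reached or how the question was worded.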

  12. […] let me explain what a sampling distribution is. (And let me add the term to Dr Nic’s long list of statistics terms that cause unnecessary confusion.) A sampling distribution of a mean is the distribution of the […]

  13. Nor farah nadiah says:

    Can i know the type of sampling error? Because i have to do an assignment about that.

    • Dr Nic says:

      There is only one type of sampling error. Sampling error exists because the sample is not the full population. Everything else is non-sampling error.

  14. Dominic Ansah says:

    This sampling and Non-sampling Error is so confusing. Thanks for your time but educate us more on this please , cause I have an assignment on that.

    – Distinguish Between Sampling And Non- Sampling –

  15. Ashani Fernando says:

    Thank you so much for the explanation. It helped a lot to understand the concept. Great job Dr Nic thank you once again. 😊

  16. Yasir ahmad says:

    What method would you use to control each of the errors?

  17. Kay says:

    Good evening, what are the sources of sampling error?
