# The median outclasses the mean

22 April 2013
6 May 2013

## The median suffers from poor marketing.

All my time at school the “average” was always calculated as the arithmetic mean, by adding up all the scores and then dividing by the number of scores. When we were taught about the median, it seemed like an inferior version of the mean. It was the thing you worked out when you weren’t smart enough to add and divide. It was used for house prices, and that was about it. Of course the mean was the superior product! Why wouldn’t you use the mean?
I’ve been preparing resources for teaching the fabulous new New Zealand curriculum, and have been brought face-to-face with my prejudices. It strikes me that the median has had very poor representation.

## Public opinion of the median and mean

I put a question on Facebook and Twitter to see what people felt about the mean and the median. I briefly explained what each was, then asked which one they thought was better. Some people had no idea what I was talking about, but most felt that the mean was the superior statistic. The following are a selection of responses:

The mean, but I don’t know why.. maybe that’s just what we were taught to use when I was back in school (a long time ago!) lol
When I think of “average” I always think of the mean. I don’t know if it’s actually better though
well the median is a real pain to work out. you have to make a list of all the numbers, in order, and then count how many they are and then go to the middle. PAIN IN THE BUM. the average… well that is somewhat quicker to do, no? and i don’t see the point in the median at all. unless well no, there is just no need for it. who cares what the15th person in the class got on a test? the lowes and highes is much more interesting. As i remember it, the mode is the most commonly occuring number out of a set of numbers… i think of this as the “mode” or in English (not French), the ‘fashionable” number. oh and it stresses me how all 3 start with Ms cos that is confusing. which is why i like to use the word average.
The mean, which I’m guessing is the same as the average? When the media refer to real estate stats they always use median price, which can distort reality, we would prefer the average price. (From a real estate agent)
I don’t really think it’s a case of which is better. They’re two different things aren’t they? I think it’s usually easier to work out the average.

A number of my Facebook friends did know about statistics, and responded in favour of the median in most cases. This was an interesting comment:

“It depends. Everyone who proof read my thesis was like why on earth are you using the median – no one uses it. And most of the other similar primate studies I’ve read use the mean (except one, that was published by my associate supervisor). But my means were off their rocker, and I’m pretty sure my medians were a much better representation of reality in this case. It makes making comparisons between studies a little awkward though.”

## Why NOT use the median all the time?

I am hard pressed to find an instance where the mean is actually a better measure of central tendency than the median. The purpose of the mean or median (or mode) is to provide a one number summary of a set of data. The whole idea of the mean is actually quite tricky, as you can read in one of my early posts about explaining what the mean is. Generally the summary value is used to compare with another sample or population.
In my lectures I often illustrated times when the median is a better summary measure of a sample or population than the mean. This is quite common in notes and YouTube videos. Never once did I show where the mean was preferred to the median! So why were/are we so loyal to the mean, bringing out the median for special occasions and real estate?
I think there are two answers, both of them no longer valid. It is a question of legacy.

## Time and ease to calculate

Despite first appearances, for anything larger than a trivial sample the mean is actually easier to calculate than the median. Putting a set of 100 values in order by hand is no easy task. (Pain in the bum, as my friend so elegantly expressed it.) Adding up scores and dividing by 100 is a walk in the park in comparison.  In the early 1980s when I learned programming (in Fortran, Pascal and Cobol), writing a sorting program was far from trivial and a large set of numbers would take a large amount of time to sort. Only in later years, as computing power has expanded, has it been possible to get a computer to calculate a median.

## Formulas for confidence intervals

Means behave nicely and give nice mathematical results when manipulated. Because of this we can calculate confidence intervals using a nifty little formula and statistical tables. Until bootstrapping by computer  became do-able on a large and small scale, there was no practical way to perform inference on a number of very useful statistics, including the median and the inter-quartile range.
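To make the bootstrapping idea concrete, here is a minimal sketch of a percentile bootstrap interval for the median. The function name `bootstrap_median_ci`, the number of resamples and the scores are all made up for illustration, and the percentile method is only one of several bootstrap variants; it has known weaknesses for medians, especially in small samples.

```python
import random
import statistics


def bootstrap_median_ci(data, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the median (illustrative sketch)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    # Resample with replacement, take the median each time, then sort.
    medians = sorted(
        statistics.median(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    alpha = (1 - level) / 2
    lower = medians[int(alpha * n_resamples)]
    upper = medians[int((1 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical test scores, invented for this example only.
scores = [12, 15, 15, 17, 18, 20, 21, 21, 22, 25, 27, 30, 34, 41, 95]
low, high = bootstrap_median_ci(scores)
print(f"sample median = {statistics.median(scores)}, interval = ({low}, {high})")
```

The same loop works for the inter-quartile range or any other statistic: swap `statistics.median` for a different summary function.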

## Conclusion: the median is better

A median is intrinsically understandable. It is the middle number when the values are put in order. End of story. – Well not quite – you do have that slightly tricky thing where the sample is even and you have to average the middle two terms, but apart from that it is easy!
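The whole recipe, including the slightly tricky even case, fits in a few lines. This is a sketch with a hypothetical function name `median`; in practice the standard library's `statistics.median` does the same job.

```python
def median(values):
    """Middle value of the sorted data; average of the middle two if even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd count: there is a single middle value.
        return ordered[middle]
    # Even count: average the two values either side of the middle.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 1, 5]))     # odd count: 5
print(median([7, 1, 5, 3]))  # even count: (3 + 5) / 2 = 4.0
```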
A median is not affected by outliers. I learned a new term for this when I was reading up in preparation for writing this post. The term is “resistant” and I learned it from one of Mr Tarrou’s videos for AP Statistics. I found these videos after my tirade against videos on confidence intervals. Tarrou’s videos are long and a bit more mathematical than I would like. (He can’t help it – he is a maths teacher and the AP Statistics syllabus seems to have been devised by mathematical statisticians trying to put students off ever taking the subject again.) But they are GOOD. Tarrou’s videos are sound, and interesting and well put together. I will be recommending them as complementary to my own offerings. (Because I sure as heck don’t want to have to do all that icky mathsy stuff).
But I digress. The median is “resistant” because it is not at the mercy of outliers. There are lots of great examples, including in Mr Tarrou’s video. If you have a median of 5 and then add another observation of 80, the median is unlikely to stray far from the 5. However a mean is a fickle beast, and easily swayed by a flashy outlier.
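A quick way to see this resistance is to try it with a small made-up data set whose median is 5, then add an observation of 80. The numbers are invented for illustration.

```python
import statistics

before = [3, 4, 5, 6, 7]   # hypothetical data with median 5 and mean 5
after = before + [80]      # the same data plus one flashy outlier

median_before = statistics.median(before)  # 5
median_after = statistics.median(after)    # 5.5
mean_before = statistics.mean(before)      # 5
mean_after = statistics.mean(after)        # 17.5

print(f"median: {median_before} -> {median_after}")
print(f"mean:   {mean_before} -> {mean_after}")
```

The median moves by half a unit, while the mean more than triples.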
The main disadvantage I can see for the median is that it can be a bit jumpy in small samples made up of discrete values. I guess if you have two well-behaved populations that are very similar and you want to see precise differences then the means might just be better – but even then you would possibly be over-interpreting small differences.
I have found it very interesting observing the behaviour of confidence intervals for the difference of two medians, compared with confidence intervals for the difference in two means. While I was preparing materials for our on-line resource, I performed nine such tests on different real data taken from students at university. The scores are very jumpy, and the differences between the medians often include exactly zero. Consequently the confidence intervals of the difference of two medians quite often have zero as their lower bound. This provides a challenge in interpretation, as I had not met this often when looking at the differences between means. However, it also illuminates the odd relationship we have with zero. Just because a confidence interval for a difference of two means is (-0.13, 3.98) and includes a zero, it is tempting to conclude that there is no significant difference. But is -0.13 really any different from zero in practical terms? The other point is that we should be leaving the confidence interval as it is, rather than stretching it into further inference.

## Word on the web

I did a little surfing to see what the word on the web was. To find out who said what, drop the entire phrase into Google. (Ah ‘tis a wonderful world we live in, indeed)

• “The mean is the one to use with symmetrically distributed data; otherwise, use the median.” Hmm – but if the data is symmetric, surely the mean = the median?
• “An important property of the mean is that it includes every value in your data set as part of the calculation. In addition, the mean is the only measure of central tendency where the sum of the deviations of each value from the mean is always zero.” Ok – hard to argue with that.
• “Calculation of medians is a popular technique in summary statistics and summarizing statistical data, since it is simple to understand and easy to calculate, while also giving a measure that is more robust in the presence of outlier values than is the mean.” Totally!
• “However, when the sample size is large and does not include outliers, the mean score usually provides a better measure of central tendency.” (Then goes on to give an example of when the median is better.)
• “Use the median to describe the middle of a set of data that does have an outlier. Advantages of the median: Extreme values (outliers) do not affect the median as strongly as they do the mean, useful when comparing sets of data, it is unique – there is only one answer. Disadvantages of the median: Not as popular as mean.” (Not as popular??!)

Sorry median  – you do not win X-Factor for summary statistics. You may be more robust, and less fickle, not to mention easier to understand, but you just aren’t as popular!
I can feel a video coming on – the median has been relegated to the periphery long enough!

## Update in 2018

Here is our video about different summary statistics, which also addresses the relative merits of mean and median, and why they even matter!

##### Dr Nic

1. Kevin Kane says:

Nice article. As a statistician, I’m a huge fan of the median. I’ve worked on a large simulation study, and even small departures from normality result in the median being a better estimate of central tendency (even in quite large sample sizes).
Most software gives a p-value for non-parametric tests, such as the Wilcoxon Rank Sum Test (WRST). What a lot of people don’t know is a neat trick to work out a confidence interval. If you have two samples, say two treatments in a clinical trial, and you add a constant to all the values in one sample and the p-value from the WRST is non-significant, then that constant is within the confidence interval. By either playing around, or adding an increasing range of constants to one sample, you can EASILY get the confidence interval.
Lastly, there’s one case where I do think it makes sense to use the mean, even when distributions aren’t normal. That’s where you are analysing amounts of money (say the cost of an illness). Quite often, distributions of cash are highly skewed, but there is interest in TOTAL spend. In these cases, the mean can be more relevant.
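The shifting trick can be sketched in a few lines. Everything here is an assumption made for illustration: the function names, the made-up treatment data, the grid of candidate shifts, and the use of a plain normal approximation to the rank-sum test with no tie or continuity correction. A proper stats package would compute the p-value more carefully.

```python
import math


def rank_sum_p_value(a, b):
    """Two-sided p-value for the Wilcoxon rank-sum (Mann-Whitney) test.

    Uses the normal approximation with no tie or continuity correction,
    so it is only a rough stand-in for what a stats package reports.
    """
    m, n = len(a), len(b)
    # U counts the pairs where the value from a beats the value from b.
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    mean_u = m * n / 2
    sd_u = math.sqrt(m * n * (m + n + 1) / 12)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


def shift_confidence_interval(a, b, candidates, alpha=0.05):
    """Range of shifts d for which 'a - d versus b' is not rejected."""
    kept = [
        d for d in candidates
        if rank_sum_p_value([x - d for x in a], b) > alpha
    ]
    return (min(kept), max(kept)) if kept else None


# Hypothetical outcomes for two treatments, invented for this example.
treatment_a = [12.1, 14.3, 15.0, 16.2, 17.8, 18.4, 19.9, 21.5]
treatment_b = [9.8, 10.4, 11.7, 12.9, 13.3, 14.1, 15.6, 16.0]

shifts = [i / 10 for i in range(-50, 151)]  # candidate shifts -5.0 .. 15.0
low, high = shift_confidence_interval(treatment_a, treatment_b, shifts)
print(f"approximate 95% interval for the shift: ({low}, {high})")
```

The resolution of the interval is limited by the spacing of the candidate shifts, here 0.1.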

• Dr Nic says:

Thanks Kevin – great to hear from a practitioner. That is a really good point about the total.
Someone on Twitter pointed out “Depends on the application. Median is good for giving a “typical” value, but median speed won’t help me predict my travel time”
This is another case where what we really want is related to the total, rather than the average. Or something like that.

• JOSEPHINE says:

thanks dr have gain a lot

• Carson Lochridge says:

After reading Kevin’s comment and then reading the comment about “predict my travel time” I am kind of confused and have some questions. First why couldn’t you predict your travel time with the Median? Second, I run a crew of about sixteen installers for work. I want to track their install rates at ft/hr and see what I can expect from them in a days work. I want to use the Median, but after reading this last comment of predicting travel time it seems like the mean might be better? Most days they install about 300 ft/hr, but there are many outliers, especially with bad weather. Any help on this would be great, Thanks.

• Dr Nic says:

Hi Carson
It would be good to graph your data and look at that before deciding between the mean and the median. It sounds as if the median is a better option in this case.

2. Paul Swank says:

The mean does have a smaller sampling error than the median and it is important to the calculation of the variance and standard deviation.

3. One number (mean, median etc) is seldom enough to describe a set of numbers, a standard deviation helps. But a plot of the distribution (cumulative?) is the often best answer.

• Dr Nic says:

So true, but the sad fact is that often only one number is given. Box plots and dotplots are emphasised in the new NZ curriculum, over single value summaries. In the world of politics and economics, however, we are usually fed only the mean. I wish we were told the standard deviation more often. (As in ever!)

4. mpledger says:

Regression is the main tool in the statistician’s tool box and it is based all around averages, e.g. if Y is height and X is an indicator 0,1 for men and women respectively, then the estimates in a regression give you the average male height and the difference between the average male height and average female height. Since regression (and its generalisations) are pervasive in the applied literature, it’s quite hard to change. (I quite like quantile regression but it doesn’t give unique solutions.)
I think it also revolves around this theorem (whose name I’ve forgotten) – “The most powerful test of size alpha is the Likelihood Ratio Test” and most estimators coming out of maximising likelihoods are means or functions of means.

5. Hi Dr. Nic,
To the best of my knowledge, bootstrap DOES NOT WORK for the median. There are asymptotic methods that involve estimating the density and there are non-parametric methods based on the median – basically inverting the sign test, which leads to intervals whose endpoints are quantiles (as I am sure you know).
In business, the mean is more “mean”ingful than the median! Would any business person really care about median monthly profit? Mean monthly profit means much more, because if you multiply it by 12 you get the total profit. Ditto for sport. The Aussie cricket team members might have a higher batting median than another team, but this would say little about the probability of winning. Batting averages on the other hand predict long-run team score.
Paul Swank. No, the mean does not always have a smaller sampling error than the median. It depends on the underlying distribution. For normal data you are correct. The median is about 64% efficient. For heavy tailed distributions, the median gets better. But for Laplace errors, it is more efficient than any other estimator – it is the MLE!
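The efficiency claim is easy to check by simulation. The sketch below estimates the ratio Var(sample mean) / Var(sample median); the sample size, number of repetitions, seed and function names are arbitrary choices made for illustration. For large samples the ratio is known to approach 2/π (about 0.64) for normal data and 2 for Laplace data, and a simulation of this size should land in that neighbourhood without matching exactly.

```python
import random
import statistics


def variance_ratio(draw, n=101, repetitions=2000, seed=1):
    """Estimate Var(sample mean) / Var(sample median) by simulation."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(repetitions):
        sample = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.variance(means) / statistics.variance(medians)


def normal(rng):
    return rng.gauss(0, 1)


def laplace(rng):
    # The difference of two independent exponentials is Laplace distributed.
    return rng.expovariate(1) - rng.expovariate(1)


normal_ratio = variance_ratio(normal)
laplace_ratio = variance_ratio(laplace)

# A ratio below 1 means the mean is the more precise estimator;
# a ratio above 1 means the median is.
print("normal :", round(normal_ratio, 2))
print("laplace:", round(laplace_ratio, 2))
```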

• “To the best of my knowledge, bootstrap DOES NOT WORK for the median.” Isn’t that an over-statement? See for example Biometrika (2001) 88 (2): 519-534. (Brown, Hall and Young). “Even in one dimension the sample median exhibits very poor performance when used in conjunction with the bootstrap. For example, both the percentile‐t bootstrap and the calibrated percentile method fail to give second‐order accuracy when applied to the median.” Much depends on the distribution.

The bootstrap is certainly unsatisfactory for extreme quantiles.

• 1 says:

I agree the business wouldn’t, but because the business actually cares about the total itself. Calculating a mean provides a cosmetic benefit over the total. The mean (“average monthly”) reveals nothing new. The mean is merely a more memorable/familiar unit.
Similarly, measuring in a metric unit rather than a less familiar unit doesn’t change the “true” quantity.

• Suspicious Mind says:

Actually, it may be outdated now, but there was an old saw about the income of lawyers, and it really dealt with the median, an astonishingly low dollar figure. The stratospheric incomes of the few highly successful skewed the average data, and led to misleading career advice regarding the profession. If one among a thousand people had won a billion dollar lottery, a million average looks pretty good, but the median would still be zero. The basic lay distinction is the mean or average is a mathematical result, focusing on the data taken from a number of samples, while the median gives equal weight to each sample, which can provide directions for additional study regarding what might account for distributions of results in the population. I once scored a 29 out of 100 on a calculus test, where two scored 98 and 99; I received a C, thankfully because the curve took note of the median score, not the average. Mean is essentially a single calculation, while median requires two steps in its calculation – an ordered distribution based on data, then a simple numerical midpoint in the distribution, independent of the data itself. Implicit, also, in any statistical analysis, is assuming relationships of some kind, which may not always exist; everyone has a height, but not everyone has an income. We can reasonably anticipate average and median heights to be close, but incomes especially in corrupt, bankrupt third world dictatorships may display peculiar statistical abnormalities. Numbers that reflect measurements of natural life processes may have a much narrower range, while those associated with abstracted measures of human invention can reach to the limits of the universe.

• Dr Nic says:

Thanks for those great examples to add to the discussion.

6. Peter Lane says:

Hi Nic
The mean has the property of being the best linear unbiased predictor, as long as the distribution of the data is reasonably well behaved. A lot of stats analysis, as opposed to description, is geared towards being able to predict things, so the mean is therefore preferable. The distribution issue is not usually a problem when using the mean, as long as data is not really sparse, because of the Central Limit Theorem.
I have never personally found a use for the mode. However, I guess it must have a place in descriptive stats applied in some areas.

• Dr Nic says:

Thanks for that. I appreciate getting the balance of the argument.
I’ve always wondered about the point of the mode.

• David Munroe says:

Hi Nic,
I thought the mode was only useful for ordinal data but otherwise was rather pointless as a measure of ‘central tendency’.

7. It is instructive that most of the comments relate to the purpose to which the estimate of mean/median will be used. Real statistical applications rarely have as their purpose the estimation of such a parameter, it is merely a step in the process. Unfortunately much teaching – high and low level – ignores this context.
This issue extends beyond the mean/median debate. For example, skewed data from many applications such as geochemistry is best approximated by the log normal. However it may not make sense to consider means on a log scale since they lose the additivity that may be fundamental to the application.
Orthogonal to this is the mathematical context. Medians are ugly mathematically and complex analyses based on them can be even more ugly (think of Tukey’s median polish, essentially iterative proportional scaling with medians – easy to describe, messy to do, impossible to really understand). While we should not over constrain any analysis to match our mathematical limitations, we should not ignore the enormous benefit we can get from applying mathematical understanding.

• Dr Nic says:

Thanks for that really helpful comment. As my interest is very much at the beginner/consumer level of statistical education it is great to have people provide a more advanced perspective.

8. JUSTICE MOSES K. AHETO says:

Technically and as a Statistician, I prefer the Median to the Mean. The median is robust and resistant to outliers in the dataset, unlike the mean, which is highly affected/influenced by extreme observations in the dataset.
Thanks

9. Simon Crouch says:

Perhaps it’s worth noting that in survival analysis, although things like “mean survival” are defined, they are very rarely used – the median, and other quantiles, reign supreme. Censoring of survival times means that the calculation of a mean involves extrapolation; the highly skew nature of survival times removes much of the interpretable meaning of the mean.

10. John Bibby says:

Please see MDST242 “Statistics in Society” – an Open University course. That uses median + quartiles + deciles + extremes. You can select from these and get measures of dispersion, skewness, kurtosis etc. Confidence interval on the median is also a cinch.

11. David Munroe says:

Hi Nic,
In slight defense of the mean (ie, when you’re forced to present only 1 number – such as on a balance sheet)… generally it is better to present the mean in respect of liabilities rather than the median. Typically liabilities are skewed (if things go bad, they go very bad), so an estimate that responds to outliers is actually handy here. 🙂
For instance, when estimating the ‘long-tail liability losses’ for insurance companies the total losses are very skewed. Although the median is a better estimate of the centre of the distribution, the mean is a better choice for presentation in a balance sheet as it is much more conservative (and as a slight bonus, everyone ‘understands’ a mean so they tend to request it). In practice, you need to ensure that much more capital is available than the mean – assuming you want to stay in business.
It comes back, as several respondents have mentioned, to the purpose of the statistic. Generally I want at least three numbers: mean, standard deviation, and skewness.
Incidentally, another reason the mean may be preferred computationally over the median is that you can calculate it while only keeping track of three numbers: SUM(X) [ie, the sum of x to n-1], N, x. The median is harder to restrict in this way.
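David's point about memory can be illustrated with a tiny running-mean tracker. The class name `RunningMean` is hypothetical; the idea is simply that each new value updates a sum and a count, so the data never need to be stored.

```python
class RunningMean:
    """Track a mean while storing only the running sum and the count."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, x):
        # Each new observation updates two numbers; x is then discarded.
        self.total += x
        self.count += 1

    def mean(self):
        if self.count == 0:
            raise ValueError("no observations yet")
        return self.total / self.count


tracker = RunningMean()
for value in [2, 4, 9]:
    tracker.add(value)
print(tracker.mean())  # (2 + 4 + 9) / 3 = 5.0
```

An exact median has no equally simple update rule, because any stored value might turn out to be the middle one once more data arrive.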

12. David Jones says:

I think no-one has yet mentioned the question of multidimensional data, and defining a “location” for these. Suppose data consist of locations of lightning strikes, expressed as 2 dimensional coordinates. Then “nice” properties for a location might include that the selected point should not depend on the coordinate system used (invariance to rotation). In more general cases this might be extended to invariance to affine transformations. A co-ordinatewise median point does not satisfy these. Of course there is a multivariate version of the median that can cope with the rotation invariance, and there are others, some based on counting shells. But still, these seem to fail the requirement, which might be either an extension or a reversal of the above, in that one might not want the per-coordinate location of the multidimensional location to depend on what other dimensions have been measured or are being considered.
Thus, in a multivariate setting, the mean is invariant to linear transformations, while the co-ordinatewise median is not. Neither the mean nor the co-ordinatewise median is affected by dropping variables that are not of interest. Still, stating the co-ordinatewise medians may be preferable to the mean, or not, depending on the purpose to which you think the summary might be put. For example, suppose data consist of daily values of (weight of) sediment transported past a point in a river … then the mean daily value is immediately informative about the total weight transported in, say, a month, while the median daily value is not.
Other comments have emphasized describing more of the distribution than just the location. This applies even more to multivariate data.
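The rotation point can be checked with three made-up points and a 45 degree rotation. The helper names `rotate` and `coordinatewise` are hypothetical; the sketch compares "summarise, then rotate" with "rotate, then summarise" for both the mean and the co-ordinatewise median.

```python
import math
import statistics


def rotate(point, degrees):
    """Rotate a 2D point anticlockwise about the origin."""
    angle = math.radians(degrees)
    x, y = point
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))


def coordinatewise(summary, points):
    """Apply a one-dimensional summary to each coordinate separately."""
    xs, ys = zip(*points)
    return (summary(xs), summary(ys))


# Three invented "lightning strike" locations.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
rotated = [rotate(p, 45) for p in points]

mean_then_rotate = rotate(coordinatewise(statistics.fmean, points), 45)
rotate_then_mean = coordinatewise(statistics.fmean, rotated)

median_then_rotate = rotate(coordinatewise(statistics.median, points), 45)
rotate_then_median = coordinatewise(statistics.median, rotated)

print("mean:  ", mean_then_rotate, rotate_then_mean)      # agree, up to rounding
print("median:", median_then_rotate, rotate_then_median)  # disagree
```

For these points the co-ordinatewise median sits at the origin before rotation, but after rotation it moves about 0.71 units up the y-axis, whereas the mean simply rotates with the data.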

13. David Jones says:

“The mean is the one to use with symmetrically distributed data; otherwise, use the median.” Hmm – but if the data is symmetric, surely the mean = the median?
The point here should be that: while “the mean = the median”, the properties of the sample mean and the sample median are different (and of course their values are usually different). So your question really splits into two parts… what thing should you be trying to estimate as a summary of the data, and then how should you estimate that quantity.

14. Allan Reese says:

I don’t see why you want to introduce “resilience” in place of “robustness”. The latter term is standard statistical terminology.
As pointed out previously, calculating any statistic is either a step in a wider inference or is a quick summary. You are right that given any collection of numbers most people’s instinct is to add them up! The mean may not be a “meaningful” summary. Apart from skewed data, multimodal data may deliver a mean that has no relevance to the measured variable. The mean salary in a company is bloated by the CEO’s plundering. Regarding modes, tailors seem to plan for the modal number of legs, not the mean.
Someone mentioned lognormal distributions, and I work with these a lot. When any distribution can be transformed to symmetry, the median and the mode of the transformed values become similar, so the log-mean or geomean of a skewed distribution is reasonably estimated by the median.
I too am a fan of the median, but as one of Tukey’s five-value summary, hence the median rather than the mean is usually shown on boxplots. Means are sometimes added as symbols.
Someone else mentioned SD. Mean and SD are sufficient statistics to describe a normal distribution. If you *assume* your sample comes from a normal distribution, Mean and SD are sensible summary values.
Sorting is of course a well-studied computing problem (cf Knuth). It’s very odd to have people writing as if we were still in the 1950s when computer power was scarce and expensive. For any non-trivial stats calculations, get a proper stats package. Then you type in the values (or scan, or download), and get the whole slew of summary statistics (mean, median, mode, quartiles, min max, SD, trimmed mean, …) to examine for data cleaning before deciding which to report as *information*.
Allan

15. John Maindonald says:

Several respondents have mentioned the difficulty of working with the median for purposes of statistical theory. Classically, expectations have been central; they are theoretical means. The theory is a good approximation, in modest sized samples, only if data is on the scale that is not badly asymmetric. Where a monotonic transformation is used, often log(), that leads to a roughly symmetric distribution, does one need to apply a correction to correct for the bias induced on the original scale? Not if the medians (which are unaffected by monotone transformation) are the appropriate measures.
Modern computational abilities free us somewhat from the constraints of an expectation-tied theory. ‘Somewhat’ is the key word here. As an aside, the limitations of that theory, and the common role of recourse to empirical approaches that may involve heavy computation, become even more important when dealing with the dependence that is often present in the observational data with which most statisticians work most of the time.

16. Sage Moore says:

“The median is ‘resistant’ because it is not at the mercy of outliers.”
I love it! I’ll try it out on my students tomorrow.

17. Murray Jorgensen says:

I was surprised not to see a mention of the trimmed mean given that the median is the 50% trimmed mean and the mean is the 0% trimmed mean. Most statistical packages have a trimmed mean function.
The median and the trimmed mean lead on to the general area of Robust Statistics, which has been studied extensively since the 70s. Robust Statistics have not made much of an impact in applied statistics, at least beyond the area of descriptive statistics. Why is this? Well much of applied statistics is built around the linear model and the ANOVA decomposition, which is built on the properties of the L2 norm which underlies the mean and variance.
Another chunk of applied statistics is based on statistical models and the likelihood function. Many of the MLEs lead to statistics that are not robust. One way to keep both the likelihood function and robustness is to mix the statistical model with a parameter-free heavy-tailed component intended to diminish the influence of outliers on model parameters. Rohan Maheswaran studied this approach in his thesis.
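To show how the trimmed mean bridges the mean and the median, here is a small sketch. The function name `trimmed_mean`, the made-up data and the convention used (drop the given proportion of observations from each end, rounding down) are assumptions; statistical packages differ in how they round the number of trimmed values.

```python
import statistics


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the observations from each end."""
    if not 0 <= proportion <= 0.5:
        raise ValueError("proportion must be between 0 and 0.5")
    ordered = sorted(values)
    n = len(ordered)
    k = int(proportion * n)   # number of values dropped from each end
    kept = ordered[k:n - k]
    if not kept:
        # Everything was trimmed (even n at 50%): the limiting case is the median.
        return statistics.median(ordered)
    return statistics.fmean(kept)


data = [2, 3, 4, 5, 6, 7, 80]   # invented data with one outlier

print(trimmed_mean(data, 0.0))  # nothing trimmed: the ordinary mean
print(trimmed_mean(data, 0.2))  # drops 2 and 80: mean of 3..7 is 5.0
print(trimmed_mean(data, 0.5))  # trims down to the median
```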

18. This is a great post, Dr. Nic, and I appreciate everyone’s discussions about it.
1) In summarizing or describing a data set, it’s always a good idea to use multiple summary statistics and visualizations. I like both the mean and the median, so I like to use both. In fact, I like the 5-number summary, the mean, the variance, and a plot of the data (histogram, bar chart, scatter plot).
2) The mean often shows up as a sufficient statistic for the parameters of many distributions. Thus, it is often used to find an estimator with a lower variance using the Rao-Blackwell Theorem.
3) The mean is often the maximum likelihood estimator for the parameters of many distributions. Beyond its use as a point estimator, it also has many nice large-sample properties that can be used for inference.
4) Echoing Peter Lane’s comment, a nice thing about the mean is the ease of using it for inference; its sampling distribution can be easily found based on the Central Limit Theorem!
Eric Cai – The Chemical Statistician
http://chemicalstatistician.wordpress.com

19. […] I’ve written previously about what a difficult concept a mean is, and then another post about why the median is often preferable to the mean. In that one I promised a video. Over two years ago – oops. But we have now put these ideas into a […]

20. patrick.gennow@gmail.com says:

Each statistic has its place. I most often use mean in situations where the values I’m getting are precise and when all data needs to be taken into account, such as physical measurements from sensors. These measurements can still have outliers but the scientist must then justify their omission with a relevant criterion. Median is used in situations where the data can be sporadic and is heavily prone to bias or outliers. I would use median for instance if I was asking people to rate a movie because many may say 0 or 10 which is not an accurate reflection on how they actually felt. An example of this being done is how in judged sports the high and low scores are most often omitted. Conversely, there are some situations where only mean makes sense. For instance, if lottery winnings were described by the median they may show a 0 % return which is not a good representation of the return. If the same was done with mean statistic it may show a 67 % return which is what players should actually expect.

• Dr Nic says:

Hi Patrick
Thanks for your excellent examples of when to use the median and the mean. I like that they are contextual rather than just rules.
Nic

21. Shennella Murze says:

Sorry median – you do not win X-Factor for summary statistics. You may be more robust, and less fickle, not to mention easier to understand, but you just aren’t as popular!
in all the debate, that sense of humor though!!!!
the info is quite useful !!

22. […] the measuring of income it uses mean income of the specialty instead of median of the specialty. Median income is the better than mean when it comes to methods of measuring income because it is more robust to outliers and more robust […]

23. Brett says:

If you transform the data with an affine transformation, then both the mean and the median (and mode) transform the same as the data. However, if you transform the data with a non-affine monotonic transformation, then only the median transforms with the data.
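Brett's point can be verified numerically. The data below are made up, with an odd number of values so that the median is an actual observation; with an even count the median averages the two middle values, and that averaging step does not commute exactly with a non-linear transformation.

```python
import math
import statistics

data = [1, 2, 4, 8, 1024]  # invented, strongly skewed, odd count

# Affine transformation 3x + 2: both summaries transform with the data.
affine = [3 * x + 2 for x in data]
median_affine_ok = statistics.median(affine) == 3 * statistics.median(data) + 2
mean_affine_ok = math.isclose(
    statistics.fmean(affine), 3 * statistics.fmean(data) + 2
)

# Monotonic but non-affine transformation log2(x): only the median follows.
logs = [math.log2(x) for x in data]
median_of_logs = statistics.median(logs)
log_of_median = math.log2(statistics.median(data))
mean_of_logs = statistics.fmean(logs)
log_of_mean = math.log2(statistics.fmean(data))

print(median_affine_ok, mean_affine_ok)  # True True
print(median_of_logs, log_of_median)     # equal: 2.0 and 2.0
print(mean_of_logs, log_of_mean)         # not equal
```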

24. سمر طلبة says:

Thank you so much. God bless you.

25. Taiwo says:

Perfectly written and clear to understand. Thank you.

26. Torbjörn Björkman says:

Very nice video!

I think the problem that your social media commentariat demonstrated is not so much the confusion about mean and median as a confusion about what we are trying to achieve by calculating them in the first place. For most people of non-technological vocation, that would be “representing the typical number”. Which is, for typical everyday distributions, more likely to be the median than the mean. I especially like that your video also includes the mode and strongly emphasizes the importance of actually looking at the distribution before deciding how to compress it into a single number.

I can think of only two things that I would add if I used this in teaching.

The first thing I might have added is to point out the reason for the skew of your example distribution, because it is so commonly occurring: the distribution is defined only on [0,+infinity[ since you can’t own a negative number of shoes. I might also have taken the opportunity to point out that this not infrequently leads to distributions which have their maximum at 0 and then just tail off to infinity. And that in such cases the central tendency, while perfectly definable mathematically, probably no longer tells you the most relevant thing about the distribution. But I guess you get to that in the next lesson 🙂

Secondly, I think that your graphics with the students being sorted for the median and then shoes and students just piled up and divided are useful for getting an intuitive idea about why the median is often a better single-number-measure: it contains more information about the distribution. You can compute the mean from less information than you need for the median. In your video this can be seen by comparing the simplistic concepts of the two piles for the mean with the more sophisticated treatment of first sorting and then divvying up the student group in two equal size groups.

…which brings us to another case in which the mean might have to be used.
Suppose we decided to make an independent measure of the distribution by sneak(er)ing into the empty student dorms on Barefoot Athletics Day in honor of Abebe Bikila. Due to the general student approach to tidiness, we come away with a great count of the total number of shoes for men and women, but a rather foggier view of precisely how these shoes couple to individual students. In this case we might conclude that the great uncertainties in our assignment of the individual shoes and the great accuracy of our total shoe count, 1623.5 pairs (that student tidiness again), might lead us to prefer the mean, since it can be computed from more certain data.

27. David H says:

Over the last x number of years I have found myself looking at trade statistics, particularly the calculation of cargo “dwell times” (the time between discharge of cargo at a port and removal from the port). The intention here is to calculate time taken for an average trade transaction – after all time=money! There are many reasons for doing this, as interval measurements of interim steps also allow for identification of bottlenecks, which can then be addressed.

For my purposes the mean is not particularly helpful because of the outlier effect. These exist at both ends, where special goods move quickly and others remain static for extended periods, up to and including becoming abandoned goods. Even in data sets numbering in the tens of thousands a small number of long-stay transactions significantly skew the outcome. Most transactions are cleared within days, in many cases hours and those few long stay (2 weeks and up to 12 months), while insignificant in terms of their proportion to the sample set have a marked effect on the overall numbers if mean is used.

What matters to me and others is the time taken for a normal trade transaction and those outliers are not normal.

28. Steve Owens says:

Whenever I pick a math formula, my main consideration is how many other formulas signed its yearbook. Let’s face it, being popular is important.

• Dr Nic says:

You got it!

29. DM says:

Median is something I wish could be obtained in an Excel Pivot table. Unfortunately, this is not the case 🙁

• Dr Nic says:

That is disappointing. Excel is not particularly good for statistical analysis, but is often all we have. But PivotTables are pretty amazing.

30. Richard says:

I am a valuer attempting to find a statistically robust value from a set of sample prices. I typically find between three and six comparable discrete values, very rarely multiple values. In this scenario, my gut feeling is that the mean is the best measure, as an ‘outlier’ is just as valid as a data point in the centre of the distribution. If I used the median it feels like I am throwing away or ignoring significant/important data. Comments, please

• Dr Nic says:

Hi Richard
I hear what you are saying. It depends on what you are using the summary value for. If the total price is a useful value, then the mean is more supportable. The median can be quite jumpy with very small samples. How about a trimmed mean? Remove the top and bottom values and find the mean of the middle 50% of the values.
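
A minimal sketch of a trimmed mean, assuming the simplest version in which one value is dropped from each end before averaging. The function name `trimmed_mean` and the prices are hypothetical, made up for illustration.

```python
from statistics import mean, median


def trimmed_mean(values, trim_each_end=1):
    """Mean of the values left after dropping the `trim_each_end`
    smallest and the `trim_each_end` largest values."""
    ordered = sorted(values)
    if len(ordered) <= 2 * trim_each_end:
        raise ValueError("not enough values left after trimming")
    return mean(ordered[trim_each_end:len(ordered) - trim_each_end])


# Hypothetical comparable sale prices, in thousands of dollars.
prices = [310, 325, 330, 340, 355, 520]

print("mean:        ", round(mean(prices), 1))
print("median:      ", median(prices))
print("trimmed mean:", trimmed_mean(prices))
```

With samples this small, each trimmed value is a large share of the data, so it is worth reporting how many values were dropped alongside the result.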

I am also looking at very small samples of prices (no more than 17; sometimes as small as three). Given the points made by Richard, and the fact that the median would ignore a significant amount of the data in this case, would you recommend using the mean instead of the median as a measure of the average price?

Apologies – I am not a statistician, but come from a legal background! Any guidance you can provide would be much appreciated.

31. GEORGE JOHN says:

Quote from an earlier entry: I have never personally found a use for the mode. However, I guess it must have a place in descriptive stats applied in some areas.
An addendum: If you were a shoe salesman, neither mean nor median is any good – go for the mode!

32. Simon Woodward says:

The median has its dangers too. Consider daily rainfall data for a year. The median daily rainfall is almost always zero. Not very helpful.

• Torbjörn Björkman says:

But how is the average of your chosen quantity “daily rainfall for a year” helpful in any way whatsoever, other than as a representation of the total rainfall in a year? And, to go on, how is even that number useful, other than as a description of an average year?

So the median daily rainfall is almost always zero where you live (suggesting that you do not live in Bergen, Norway; that’s OK, nobody’s perfect). That is good and valuable information: It typically does not rain on any given day, so you can probably assume that it’s safe to plan a picnic some time within the next two weeks. But of what use is the rainfall averaged over a year?

I mean, typical values for a month, yes, because that may influence how likely you are to experience things like flooding or your local governing body prohibiting you watering your plants. But a yearly average…? nope, that seems rarely useful. But maybe you are in some more specialized business.

33. Simon Woodward says:

Mean daily rainfall might be used to compare against mean daily evapotranspiration, to get some idea of the water deficit for crops. Admittedly this might be more useful on a monthly or annual basis.

Median streamflow then. This is probably non-zero but completely ignores stormflow, which is a big part of the annual total. My point is that while means are sensitive to outliers, medians may be too insensitive to skew, which is often important.

As a side note, it occurred to me that median is better for values that should not be added together. For example stream water quality such as Total Phosphorus (mg/l). Although this is skewed, it doesn’t make sense to add the values together since they are sampled at different flow rates. So median is probably better even though it hides the very large skew.

• Dr Nic says:

This is a great discussion with some excellent examples. Thanks for commenting. Nic

34. Texconsin says:

This is insane. The press reports a median salary of \$55,000, but LeBron James makes \$134.9 MILLION. Even if someone is working for free, the median salary – assuming James is the highest-paid American, which he is NOT – is \$67.45 MILLION. Saying that we should stick to “median” is saying Americans are pretty much the dumbest clucks in the universe.

• Dr Nic says:

That is why the press uses the median rather than the mean. The median is the middle value, and it is not affected by extreme values.
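
The \$67.45 million figure in the comment is the halfway point between the lowest and highest values (sometimes called the midrange), not the median. A short sketch with made-up salaries shows the difference; only the \$134.9 million top value is taken from the comment, and the other numbers are assumptions for illustration.

```python
from statistics import mean, median

# Hypothetical salaries: someone working for free, three ordinary
# earners, and the $134.9 million figure quoted in the comment.
salaries = [0, 40_000, 55_000, 70_000, 134_900_000]

midrange = (min(salaries) + max(salaries)) / 2  # halfway between extremes
middle_value = median(salaries)                 # the value in the middle
average = mean(salaries)                        # total divided by count

print(f"midrange: ${midrange:,.0f}")
print(f"median:   ${middle_value:,.0f}")
print(f"mean:     ${average:,.0f}")
```

Replacing the top salary with any larger number changes the midrange and the mean, but leaves the median where it is.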

35. […] both the median and the average (mean) costs when possible, we should all be aware that one is clearly better than the other […]

36. […] each legislative body, the slightly older median age makes the most sense as a reference point. A handful of outliers can throw off […]
