Lesson 71 — Measures of central tendency: mean, median, mode
Summarize a dataset with a single number: mean, median, mode. When to use each and what the choice reveals about the distribution.
Used in: 2nd year of upper secondary school (Brazil, ages 16-17) · Stochastik LK (Germany) · H2 Math Statistics (Singapore) · Math B (Japan)
Rigorous notation, full derivation, hypotheses
Definitions and properties
Descriptive statistics: the summary problem
Given a set of observations $x_1, x_2, \ldots, x_n$, we want a single number that represents the "center" of the distribution. There is no single answer: there are three different questions, and three different answers.
"The sample mean can be calculated for any quantitative variable. For a discrete distribution, the mean is the sum of each value multiplied by its probability; for a continuous distribution, the corresponding integral." — OpenIntro Statistics, §1.6
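As a minimal sketch of the three summaries, the snippet below uses Python's standard `statistics` module on a small made-up sample. The data and variable names are illustrative assumptions, not taken from the lesson's sources.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)        # arithmetic mean: sum of values / number of values
median = statistics.median(data)    # middle value; average of the two middle values when n is even
modes = statistics.multimode(data)  # list of all most frequent values (Python 3.8+)

print(mean, median, modes)
```

`multimode` returns every most frequent value, which matters for bimodal data; `statistics.mode` returns only one of them.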
Algebraic properties of the mean
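Two standard algebraic properties of the sample mean can be written as follows. The notation is assumed: $\bar{x}$ is the mean of $x_1, \ldots, x_n$, and $a$, $b$ are constants. Deviations from the mean sum to zero, and the mean commutes with transformations of the form $a x + b$.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sum_{i=1}^{n} (x_i - \bar{x}) = 0,
\qquad
y_i = a x_i + b \;\Longrightarrow\; \bar{y} = a\bar{x} + b .
```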
"The mean minimizes the sum of squared deviations ($L^2$ error). The median minimizes the sum of absolute deviations ($L^1$ error). This distinction has profound consequences in regression and machine learning." — OpenIntro Statistics, §2.1
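The variational claim can be checked numerically. The sketch below is a brute-force grid search on a made-up sample, not a proof; the helper names `sum_squared` and `sum_absolute` are assumptions introduced for illustration.

```python
import statistics

def sum_squared(data, c):
    """Sum of squared deviations of the data from a candidate center c."""
    return sum((x - c) ** 2 for x in data)

def sum_absolute(data, c):
    """Sum of absolute deviations of the data from a candidate center c."""
    return sum(abs(x - c) for x in data)

# Hypothetical sample with one large value (illustration only).
data = [1, 2, 4, 7, 21]

# Candidate centers on a grid from 0.0 to 25.0 in steps of 0.1.
grid = [k / 10 for k in range(251)]

best_l2 = min(grid, key=lambda c: sum_squared(data, c))
best_l1 = min(grid, key=lambda c: sum_absolute(data, c))

print(best_l2, statistics.mean(data))    # grid minimizer of squared error vs. the mean
print(best_l1, statistics.median(data))  # grid minimizer of absolute error vs. the median
```

The large value 21 pulls the squared-error minimizer toward it, while the absolute-error minimizer stays at the middle observation.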
Relationship between the three measures and skewness
Relationship between mode, median, and mean according to distribution skewness. In right skew (long positive tail): mode less than median less than mean.
| Distribution shape | Relationship |
|---|---|
| Symmetric unimodal | Mode = Median = Mean |
| Right skew (positive tail) | Mode < Median < Mean |
| Left skew (negative tail) | Mean < Median < Mode |
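The ordering in the table can be illustrated with a small made-up sample and its mirror image. This is a sketch under assumed data; `summarize` is a hypothetical helper, and the ordering is a typical pattern for unimodal data rather than a theorem.

```python
import statistics

# Hypothetical right-skewed sample: most values are small, a few are large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

# Mirroring around 15 flips the long tail to the left.
left_skewed = [15 - x for x in right_skewed]

def summarize(data):
    """Return (mode, median, mean) for a sample with a single mode."""
    return statistics.mode(data), statistics.median(data), statistics.mean(data)

mode_r, median_r, mean_r = summarize(right_skewed)
mode_l, median_l, mean_l = summarize(left_skewed)

print(mode_r, median_r, mean_r)  # right skew: mode, then median, then mean
print(mode_l, median_l, mean_l)  # left skew: the order reverses
```

Counterexamples to the ordering exist, particularly for discrete data, so it is best treated as a rule of thumb.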
Solved examples
Exercise list
42 exercises · 10 with worked solutions (about 24%)
- Ex. 71.1 · Application
Data: 5, 6, 7, 7, 7, 8, 9. Calculate the mean, median, and mode.
- Ex. 71.2 · Application
Grades of 8 students: 4, 5, 6, 6, 7, 8, 9, 10. Calculate the mean, median, and mode.
- Ex. 71.3 · Application
Monthly salaries ($k): 2, 2, 3, 4, 5, 50. Compare mean and median. Which better represents the typical salary?
- Ex. 71.4 · Application
Ages of 7 participants: 18, 19, 20, 21, 22, 23, 24. Calculate the mean, median, and mode.
- Ex. 71.5 · Application · Answer key
Loading times (s): 0.5; 0.7; 0.8; 0.9; 1.1; 1.5; 7.0. Calculate the mean and median. Is the median more informative than the mean in this case?
- Ex. 71.6 · Application
Car colors in a parking lot: 12 white, 8 black, 5 gray, 5 red. Which measure of central tendency is appropriate?
- Ex. 71.7 · Application · Answer key
Data: 1, 1, 2, 3, 5, 5, 7. Determine the mode(s). How is this distribution classified?
- Ex. 71.8 · Application
Frequency table: values $x_i$ = 4, 5, 6, 7, 8 with frequencies $f_i$ = 2, 3, 5, 3, 2. Calculate the arithmetic mean.
- Ex. 71.9 · Application
Grouped data: intervals , , with frequencies 5, 12, 3. Calculate the mean using midpoints.
- Ex. 71.10 · Application
A class has a mean age years. A new 20-year-old student joins and the new mean becomes years. How many students were there originally?
- Ex. 71.11 · Application · Answer key
Calculate mean, median, and mode for: 3, 3, 4, 5, 6, 6, 6, 9.
- Ex. 71.12 · Application · Answer key
Calculate mean, median, and mode(s) for: 10, 10, 11, 12, 13, 14, 14, 15, 19.
- Ex. 71.13 · Understanding · Answer key
Why does IBGE prefer to use the median (and not the mean) to describe the per capita household income of Brazil?
- Ex. 71.14 · Understanding
Emergency room waiting time: most patients are seen within 1 to 2 hours, but some serious cases wait more than 10 hours. Which measure should be used to describe the typical waiting time? Justify.
- Ex. 71.15 · Understanding
A manufacturer wants to declare the typical useful life of its LED bulbs. Suggest which measure of central tendency to use and justify.
- Ex. 71.16 · Understanding
An election poll asks 1,000 voters which party they intend to vote for. Which measure of central tendency will identify the preferred party?
- Ex. 71.17 · Understanding
For a unimodal distribution with right skew (long positive tail), what is the typical order between mode, median, and mean? Explain intuitively.
- Ex. 71.18 · Understanding · Answer key
Uniform distribution in . Determine the mean, median, and discuss the mode. What does this say about symmetric distributions?
- Ex. 71.19 · Understanding
ENEM grades have a distribution close to normal. Is the mean or the median more appropriate for describing typical performance? Justify.
- Ex. 71.20 · Understanding
An investor wants to know the most common number of rooms in apartments in a neighborhood. Which measure should be used?
- Ex. 71.21 · Understanding · Answer key
Page loading time: 95% of requests respond in less than 300 ms, but 1% take more than 5 s. Why do reliability engineers prefer the median (P50) and upper percentiles (P95, P99) to the mean?
- Ex. 71.22 · Understanding
Why are the three measures of central tendency equal for a continuous, unimodal, symmetric distribution? Explain geometrically.
- Ex. 71.23 · Modeling
A/B testing: checkout time for site A has mean 12 s and median 9 s. Site B has mean 10 s and median 10 s. Which site offers a better experience for the typical user? Justify.
- Ex. 71.24 · Modeling
Company A reports only an average salary of R$ 10k. Company B reports a mean of R$ 8k and a median of R$ 7k. What might the absence of the median in A's report be hiding?
- Ex. 71.25 · Modeling
In K-means, the centroid of a cluster is the mean of its points. What is the effect of an outlier on the centroid? How does K-medoids (which uses a medoid, an actual data point that minimizes total distance to the other points in the cluster) mitigate this problem?
- Ex. 71.26 · Modeling
Quality control: parts with mean diameter mm and approximately symmetric distribution. To what value would you expect the median to be close? Why?
- Ex. 71.27 · Modeling
In machine learning, using MSE as the loss function implies the model learns to estimate the conditional mean; using MAE implies it estimates the conditional median. Explain why this follows from the variational characterization of central measures.
- Ex. 71.28 · Modeling
A meta-analysis with 50 studies reports the median effect size instead of the mean. Why might the median be preferred in this setting?
- Ex. 71.29 · Modeling
Why does the boxplot use the median as the central line (and IQR as box width) instead of using mean and standard deviation?
- Ex. 71.30 · Modeling
In federated learning, why does replacing the mean of gradients with the median increase resistance to malicious clients (Byzantine attacks)?
- Ex. 71.31 · Modeling
For the log-normal distribution ($\ln X \sim N(\mu, \sigma^2)$): mode $e^{\mu - \sigma^2}$, median $e^{\mu}$, mean $e^{\mu + \sigma^2/2}$. Verify the ordering mode less than median less than mean for $\sigma > 0$.
- Ex. 71.32 · Modeling · Answer key
Salaries ($k): 4, 4, 5, 5, 6, 7, 7, 8 (8 employees). A CEO with a salary of $60k is added (without removing anyone). Calculate mean and median before and after. Which measure changed more?
- Ex. 71.33 · Modeling
Grades of 30 students on a test, grouped: : 3 students; : 8 students; : 12 students; : 7 students. Calculate the mean estimated by midpoints.
- Ex. 71.34 · Proof
Show that .
- Ex. 71.35 · Proof
Show that $\sum_{i=1}^{n} (x_i - c)^2$ is minimized at $c = \bar{x}$ for any sequence $x_1, \ldots, x_n$.
- Ex. 71.36 · Proof
Show that $\sum_{i=1}^{n} |x_i - c|$ is minimized at $c = \operatorname{median}(x_1, \ldots, x_n)$. (Hint: analyze what happens when $c$ shifts to one side or the other of the median, counting how many observations remain above and below.)
- Ex. 71.37 · Proof
Show that if $y_i = a x_i + b$ (linear transformation), then $\bar{y} = a\bar{x} + b$.
- Ex. 71.38 · Challenge
Does the mean satisfy in general? And the median? Investigate with and the data .
- Ex. 71.39 · Challenge
Cauchy distribution: $f(x) = \frac{1}{\pi (1 + x^2)}$. Calculate the median. Show that the mean does not exist (the integral diverges).
- Ex. 71.40 · Challenge
Show that, for a dataset with at least three observations, replacing the largest value with an even larger one leaves the median unchanged but increases the mean.
- Ex. 71.41 · Challenge · Answer key
Two groups have means $\bar{x}_1$ and $\bar{x}_2$ with sizes $n_1$ and $n_2$. Derive the formula for the combined mean of the two groups.
- Ex. 71.42 · Challenge · Answer key
Jensen's inequality states that for convex $\varphi$, $\varphi(E[X]) \le E[\varphi(X)]$. Apply it with $\varphi(x) = x^2$ to obtain an inequality between $(E[X])^2$ and $E[X^2]$. What does this imply about the variance?
Sources
- OpenIntro Statistics (4th ed.) — Diez, Çetinkaya-Rundel, Barr · CC-BY-SA 4.0 · §1.6 (basic descriptive measures, choice of measure, skewness) and §2.1 (variational characterization, robustness). Primary source for this lesson.
- Introductory Statistics 2e (OpenStax) — Illowsky, Dean et al. · CC-BY 4.0 · §2.5 (mean calculation for grouped data, extensive examples with frequency tables).
- Estatística (Wikilivros) — collaborative · CC-BY-SA 4.0 · Sections: Mean, Median, Mode, Measures of central tendency (reference in PT-BR; Czuber formula for mode in grouped data).
- 2000 Nobel Prize in Economics · Heckman and McFadden · microeconometric methods for analyzing selective samples (Heckman) and discrete choice (McFadden).