Math Club
v1 · canonical standard

Lesson 73 — Quartiles, percentiles, and boxplots

5-number summary: min, Q1, median, Q3, max. IQR, boxplots, and the 1.5 IQR rule for detecting outliers. Robust measures for skewed data.

Used in: Stochastik (German Leistungskurs) · H2 Math Statistics (Singapore) · AP Statistics (USA) · Math B (Japan)

$$IQR = Q_3 - Q_1, \quad \text{outlier if } x < Q_1 - 1.5\,IQR \text{ or } x > Q_3 + 1.5\,IQR$$
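
As a quick illustration of the rule above, here is the arithmetic for a pair of hypothetical quartiles. The values $Q_1 = 20$ and $Q_3 = 30$ and the test points 40 and 50 are assumptions chosen only to make the numbers easy to follow.

```latex
\begin{aligned}
% Hypothetical quartiles, chosen only to illustrate the arithmetic
IQR &= Q_3 - Q_1 = 30 - 20 = 10 \\
Q_1 - 1.5\,IQR &= 20 - 15 = 5 \\
Q_3 + 1.5\,IQR &= 30 + 15 = 45 \\
x = 50 &> 45 \;\Rightarrow\; \text{flagged as an outlier} \\
5 \le x = 40 &\le 45 \;\Rightarrow\; \text{not flagged}
\end{aligned}
```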

Rigorous definition

Order statistics and percentiles

"The first quartile, $Q_1$, is the value such that 25% of the data fall below it, and the third quartile, $Q_3$, is such that 75% of the data fall below it." (OpenIntro Statistics, §2.1)
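
For a finite sample, the definition above has to be turned into a rule for picking a value between order statistics. The sketch below uses one common convention, linear interpolation at the 0-based position $(n-1)\,p$. Other conventions exist and can give slightly different quartiles, so this is not necessarily the one used in this lesson's answer keys. The function name `percentile` and the sample data are hypothetical.

```python
import math


def percentile(data, p):
    """Return the p-th percentile (0 <= p <= 100) of data.

    Assumed convention: linear interpolation between order statistics
    at the fractional 0-based position h = (n - 1) * p / 100.
    """
    xs = sorted(data)  # order statistics x_(1) <= ... <= x_(n)
    if not xs:
        raise ValueError("data must be non-empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    h = (len(xs) - 1) * p / 100      # fractional position in the sorted list
    lo = math.floor(h)               # index of the order statistic just below
    hi = min(lo + 1, len(xs) - 1)    # index just above (clamped at the end)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])


sample = [10, 20, 30, 40, 50]  # made-up data
print(percentile(sample, 25), percentile(sample, 50), percentile(sample, 75))
# prints: 20.0 30.0 40.0
```

With this convention, $Q_1$, $Q_2$, and $Q_3$ are simply `percentile(data, 25)`, `percentile(data, 50)`, and `percentile(data, 75)`.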

[Figure: boxplot with min, $Q_1$, $Q_2$, $Q_3$, and max marked, the IQR spanning the box, and outliers drawn as isolated points]

Boxplot anatomy: box (Q1 to Q3), median line, whiskers to the non-outlier extreme, isolated points for outliers.
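
The anatomy described above can be sketched in code. This is a minimal illustration, not a reference implementation. It assumes the quartile convention of Python's `statistics.quantiles` with `method="inclusive"` and the usual 1.5 IQR fences. The helper name `boxplot_stats` and the sample data are hypothetical.

```python
import statistics


def boxplot_stats(data, k=1.5):
    """Return the elements needed to draw a boxplot (needs >= 2 points)."""
    xs = sorted(data)
    # Quartiles by linear interpolation (assumed convention, see lead-in).
    q1, median, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers stop at the most extreme data points that are NOT outliers.
    inside = [x for x in xs if lower_fence <= x <= upper_fence]
    outliers = [x for x in xs if x < lower_fence or x > upper_fence]
    return {
        "box": (q1, q3),
        "median": median,
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]  # made-up data
print(boxplot_stats(sample))
# prints: {'box': (4.0, 8.0), 'median': 6.0, 'whiskers': (2, 9), 'outliers': [30]}
```

Note that the upper whisker ends at 9, the largest observation inside the fence, not at the fence value 14 itself.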

Solved examples

Exercise list

40 exercises · 10 with worked solution (25%)

Application 21 · Understanding 4 · Modeling 10 · Challenge 2 · Proof 3
  1. Ex. 73.1 · Application · Answer key

    Data: 1, 3, 5, 7, 9. Calculate the median, $Q_1$, and $Q_3$.

  2. Ex. 73.2 · Application

    Data: 2, 4, 6, 8, 10, 12. Calculate the 5-number summary.

  3. Ex. 73.3 · Application · Answer key

    Grades: 4, 5, 6, 6, 7, 7, 8, 8, 9, 10. Calculate $Q_1$, $Q_2$, $Q_3$.

  4. Ex. 73.4 · Application

    Calculate the $IQR$ of the data: 12, 14, 18, 22, 25, 28, 32.

  5. Ex. 73.5 · Application · Answer key

    Ages: 18, 20, 21, 22, 23, 24, 25, 27, 30, 35, 60. Apply the 1.5 IQR rule. Is there an outlier?

  6. Ex. 73.6 · Application

    Salaries (\$k): 2, 3, 3, 4, 4, 5, 5, 6, 8, 50. Calculate the median and $IQR$.

  7. Ex. 73.7 · Application · Answer key

    For $n = 100$ sorted data points, what is the position of $Q_3$ by the linear interpolation method?

  8. Ex. 73.8 · Application

    Times (s): 10, 11, 11, 12, 13, 13, 14, 14, 15, 100. Calculate Tukey limits and identify the outlier(s).

  9. Ex. 73.9 · Application

    Weights (kg): 60, 62, 64, 65, 65, 67, 70, 72, 75, 80. Describe all elements of the boxplot.

  10. Ex. 73.10 · Application

    For $Z \sim \mathcal{N}(0,1)$, what is $Q_3 - Q_1$?

  11. Ex. 73.11 · Application

    Data with $IQR = 6.7$. Using the robust estimator $\hat\sigma = IQR/1.349$, calculate $\hat\sigma$.

  12. Ex. 73.12 · Application

    How many points above $Q_3 + 3 \cdot IQR$ would we expect in a sample of 1000 normal observations?

  13. Ex. 73.13 · Application

    Boxplot A: narrow box, centered median. Boxplot B: wide box, median close to $Q_1$. Compare dispersion and symmetry of the two sets.

  14. Ex. 73.14 · Application

    Distribution with a long right tail. Where is the mean in relation to the median?

  15. Ex. 73.15 · Application

    Set A has $IQR = 5$, set B has $IQR = 20$. Which has more dispersion in the central data?

  16. Ex. 73.16 · Application

    Sets A and B both have median 50. $Q_3$ is 55 for A and 80 for B. Which distribution is more right-skewed?

  17. Ex. 73.17 · Application

    $P_{90}$ of company salaries = \$30k. Interpret this information.

  18. Ex. 73.18 · Application

    A student is at $P_{85}$ on the exam. What does this mean?

  19. Ex. 73.19 · Application

    If $Q_1 = Q_2 = Q_3$, what can be concluded about the data?

  20. Ex. 73.20 · Understanding

    Is the statement "the 1.5 IQR rule flags 5% of data as outliers" correct for normal data?

  21. Ex. 73.21 · Application · Answer key

    Ages (years): 40, 52, 55, 58, 62, 66, 72. Calculate the 5-number summary and check for outliers.

  22. Ex. 73.22 · Application · Answer key

    Grades of 10 students: 3, 5, 6, 7, 7, 8, 8, 9, 10, 10. Construct the complete boxplot (with outlier check).

  23. Ex. 73.23 · Modeling

    Class of 100 students: $Q_1 = 5$, $Q_3 = 8$. A student scored 9.5. Are they in the top 25%?

  24. Ex. 73.24 · Modeling

    Why do statistical agencies report median income, rather than just the mean, in inequality reports?

  25. Ex. 73.25 · Modeling

    Parts produced with diameter: $Q_1 = 9.98$ mm, $Q_3 = 10.02$ mm. Specification: $10.00 \pm 0.05$ mm. Is the process centered? Is there significant risk of rejection?

  26. Ex. 73.26 · Modeling

    A/B test of a site: variant A has median 1.2 s and $IQR = 0.3$; variant B has median 1.1 s and $IQR = 1.5$. Which do you prefer for production? Justify using dispersion statistics.

  27. Ex. 73.27 · Modeling · Answer key

    You detect an outlier in financial transactions that appears to be fraud. Should you remove it before analyzing the data? Justify with statistical arguments.

  28. Ex. 73.28 · Modeling

    Response times (ms): 120, 130, 135, 140, 142, 145, 148, 150, 155, 380. Calculate the 5-number summary and evaluate if the system meets a 200 ms SLA based on quartiles.

  29. Ex. 73.29 · Modeling

    Hospital with 4 wings. Stay times (days): Wing A: 5, 8, 9, 10, 12; Wing B: 3, 4, 4, 5, 20; Wing C: 7, 8, 8, 9, 10; Wing D: 2, 3, 15, 18, 25. Construct 5-number summaries and identify which wing is most predictable for bed management.

  30. Ex. 73.30 · Modeling

    Exam scores by school. School A: median 650, $IQR = 80$. School B: median 620, $IQR = 200$. Which school has more uniform performance? What does each pattern suggest for pedagogical policy?

  31. Ex. 73.31 · Modeling

    Average monthly precipitation (mm): 234, 181, 130, 83, 68, 52, 44, 47, 82, 122, 145, 201. Calculate the 5-number summary and interpret seasonality.

  32. Ex. 73.32 · Modeling

    Real estate prices in a neighborhood ($k): 250, 280, 310, 320, 340, 350, 380, 390, 420, 1800. Calculate median and mean. Why should a buyer use the median as a reference for typical price?

  33. Ex. 73.33 · Understanding

    Explain, in your own words, why median and IQR are "robust" while mean and standard deviation are not. Use a concrete example.

  34. Ex. 73.34 · Understanding · Answer key

    Can a boxplot hide a bimodal distribution? Construct a concrete example of a bimodal distribution that has the same boxplot as a unimodal one.

  35. Ex. 73.35 · Understanding · Answer key

    For $X \sim \text{Uniform}(0, 1)$, the $IQR$ is:

  36. Ex. 73.36 · Challenge

    Calculate analytically the $IQR$ of $X \sim \text{Exponential}(\lambda)$. Express it in terms of $\lambda$.

  37. Ex. 73.37 · Challenge

    Argue why the breakdown point of the $IQR$ is 25%, that of the median is 50%, and that of the mean is 0%.

  38. Ex. 73.38 · Proof · Answer key

    Prove: if $X$ is a continuous random variable with density symmetric around $\mu$, then $\mu$ is the median of $X$.

  39. Ex. 73.39 · Proof

    Show that, as $n \to \infty$, for i.i.d. samples from $\text{Uniform}(0,1)$, the sample estimator of $Q_1$ converges to 0.25. Use properties of order statistics.

  40. Ex. 73.40 · Proof

    Prove that the median minimizes $E[|X - c|]$ over all values $c \in \mathbb{R}$.

Sources

Updated on 2025-05-14 · Author(s): Clube da Matemática
