Lesson 72 — Variance and standard deviation
Statistical dispersion: how much data deviates from the mean. Population and sample variance, standard deviation, computational formula, properties of linearity and independence.
Used in: 2nd year of Ensino Médio, Brazilian upper secondary school (ages 16-17) · Equiv. German Stochastik LK · Equiv. Japanese Math B · Equiv. Singaporean H2 Statistics
Rigorous notation, full derivation, hypotheses
Rigorous definition
Variance and standard deviation — population and sample
"Variance is more or less the mean squared distance of each data point from the mean. The unit associated with variance is in squared units. To ensure the dispersion measure has the same units as the data, we take the square root of the variance, called the standard deviation." — OpenIntro Statistics §2.1, Diez et al., CC-BY-SA.
"In statistics problems, we usually do not have access to the entire population, so we use sample data to estimate population parameters. For this, we divide by the sample degrees of freedom, $n - 1$, instead of $n$." — OpenStax Statistics §2.7, Illowsky & Dean, CC-BY.
Algebraic properties
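Two standard properties are named in the lesson summary: under a linear change of variable, $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$, and for independent $X$ and $Y$, $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$. The sketch below checks both, together with the computational formula $\mathrm{Var}(X) = E[X^2] - (E[X])^2$, using exact fractions. The fair four-sided die, the constants `a = 3` and `b = 2`, and the helper names `expectation` and `variance` are illustrative assumptions, not part of the original lesson.

```python
from fractions import Fraction
from itertools import product


def expectation(dist, f=lambda x: x):
    """E[f(X)] for a discrete distribution given as {value: probability}."""
    return sum(p * f(x) for x, p in dist.items())


def variance(dist):
    """Var(X) = E[(X - mu)^2], computed from the definition."""
    mu = expectation(dist)
    return expectation(dist, lambda x: (x - mu) ** 2)


# Hypothetical example: a fair four-sided die with faces 1..4.
die4 = {face: Fraction(1, 4) for face in range(1, 5)}
var_x = variance(die4)  # 5/4

# Computational formula: E[X^2] - (E[X])^2.
var_shortcut = expectation(die4, lambda x: x * x) - expectation(die4) ** 2

# Linear transformation Y = aX + b: the shift b drops out, the scale a is squared.
a, b = 3, 2
var_linear = variance({a * x + b: p for x, p in die4.items()})  # 45/4

# Sum of two independent dice: joint probabilities multiply.
sum_dist = {}
for (x, px), (y, py) in product(die4.items(), repeat=2):
    sum_dist[x + y] = sum_dist.get(x + y, 0) + px * py
var_sum = variance(sum_dist)  # 5/2

assert var_shortcut == var_x
assert var_linear == a ** 2 * var_x
assert var_sum == 2 * var_x
```

Using `Fraction` keeps every comparison exact, so the equalities hold without any floating-point tolerance.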
Geometric representation — scatter plot
Two data sets with the same mean but different dispersion. Points far from the dashed line (the mean) produce a high variance; points clustered near it produce a low variance.
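The same contrast can be reproduced numerically. The sketch below uses Python's standard `statistics` module on two hypothetical data sets (`tight` and `spread` are made-up values, not the ones plotted in the original figure) that share a mean of 10 but differ sharply in dispersion. The helper name `summarize` is also an assumption.

```python
import statistics


def summarize(data):
    """Return (mean, population variance, population standard deviation)."""
    return (
        statistics.mean(data),
        statistics.pvariance(data),
        statistics.pstdev(data),
    )


# Hypothetical data: both sets are centered on 10.
tight = [9, 10, 10, 10, 11]    # points close to the mean
spread = [2, 6, 10, 14, 18]    # points far from the mean

for name, data in (("tight", tight), ("spread", spread)):
    mean, var, sd = summarize(data)
    print(f"{name}: mean={mean}, variance={var}, sd={sd:.3f}")
```

The population variances are 0.4 and 32, so the mean alone cannot distinguish the two sets while the variance can.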
Solved examples
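As an illustrative worked example in code form, the sketch below computes population and sample variance step by step. The eight data values are hypothetical, chosen so the arithmetic comes out clean; they are not taken from the original lesson, and the function names are assumptions.

```python
import math


def population_variance(data):
    """Mean squared deviation from the mean (divisor n)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n


def sample_variance(data):
    """Sum of squared deviations divided by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


# Hypothetical data: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = population_variance(data)   # 32 / 8 = 4.0
pop_sd = math.sqrt(pop_var)           # 2.0
samp_var = sample_variance(data)      # 32 / 7, about 4.571
samp_sd = math.sqrt(samp_var)         # about 2.138

print(pop_var, pop_sd, round(samp_var, 3), round(samp_sd, 3))
```

The only difference between the two functions is the divisor, so the sample variance is always the larger of the two for the same data.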
Exercise list
40 exercises · 10 with worked solution (25%)
- Ex. 72.1 · Application
Calculate the population variance and standard deviation of .
- Ex. 72.2 · Application
Calculate the sample variance and sample standard deviation for .
- Ex. 72.3 · Application
Calculate the population standard deviation of .
- Ex. 72.4 · Application · Answer key
What is the variance of ? Explain geometrically.
- Ex. 72.5 · Application · Answer key
Calculate the population variance of .
- Ex. 72.6 · Application
Salaries (thousand R$): 3, 3, 4, 4, 5, 20. Calculate the mean and sample standard deviation. Comment on the effect of the outlier.
- Ex. 72.7 · Application
Use the computational formula to calculate the variance of .
- Ex. 72.8 · Application
Waiting time (min) in 8 service calls: . Calculate the sample standard deviation.
- Ex. 72.9 · Application · Answer key
Weights (kg) of 6 watermelons: . Calculate and .
- Ex. 72.10 · Application
assumes values with probabilities . Calculate .
- Ex. 72.11 · Application
Fair 6-sided die. Calculate .
- Ex. 72.12 · Application · Answer key
Sum of two independent fair dice. Calculate the variance of the sum using the independence property.
- Ex. 72.13 · Application
Maximum temperature (°C) over 7 days: . Calculate the sample variance.
- Ex. 72.14 · Application
Use the computational formula to calculate the variance of .
- Ex. 72.15 · Application · Answer key
If , calculate .
- Ex. 72.16 · Application
If , what is the standard deviation of ?
- Ex. 72.17 · Application · Answer key
, , and independent. Calculate and .
- Ex. 72.18 · Application
Standardize if , . Calculate the z-score.
- Ex. 72.19 · Application
(Celsius to Fahrenheit conversion). If °C, what is ?
- Ex. 72.20 · Application
Calculate the coefficient of variation for heights ( cm, cm) and weights ( kg, kg). Which set is relatively more variable?
- Ex. 72.21 · Application
Standardize using . What are the mean and standard deviation of the z-scores?
- Ex. 72.22 · Application
. What is ?
- Ex. 72.23 · Application · Answer key
Sample mean of independent observations with . What is the standard deviation of the mean?
- Ex. 72.24 · Application
Sum of 100 iid random variables with . What is the standard deviation of the sum?
- Ex. 72.25 · Understanding
Why does the sample variance use the divisor $n - 1$ instead of $n$?
- Ex. 72.26 · Understanding
To compare dispersion between salaries and a quantity measured in a different unit, is $\sigma$ or $CV$ preferred? Why?
- Ex. 72.27 · Understanding
Can variance be negative?
- Ex. 72.28 · Modeling
Production line: mean mass 500 g, g. Tolerance g. How many $\sigma$ does the tolerance represent?
- Ex. 72.29 · Modeling · Answer key
Two funds with 8% expected return, but and . Which one should a risk-averse investor choose? Why?
- Ex. 72.30 · Modeling
You measure a resistance 10 times: , . Estimate the standard deviation of the mean.
- Ex. 72.31 · Modeling
Home-to-work commute time: min, min. Using Chebyshev's inequality as a conservative bound, how many minutes early should you leave to have at least a 95% chance of arriving on time?
- Ex. 72.32 · Modeling
Six Sigma process: mm, tolerance to mm. What is the largest $\sigma$ that still satisfies the Six Sigma requirement?
- Ex. 72.33 · Modeling · Answer key
Stock A: ; Stock B: . 50-50 portfolio, zero correlation. Calculate the portfolio variance.
- Ex. 72.34 · Modeling
Same portfolio as the previous exercise, but with correlation between stocks. Calculate the variance and compare it with the zero-correlation case.
- Ex. 72.35 · Modeling
In machine learning, why should features with different scales be standardized before training gradient-based models?
- Ex. 72.36 · Modeling
ENEM Math grades: , points. A student got 740. Calculate the z-score and interpret (how many standard deviations above the mean are they?).
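- Ex. 72.37 · Proof
Prove that $\mathrm{Var}(X) = E[X^2] - (E[X])^2$ from the definition $\mathrm{Var}(X) = E[(X - \mu)^2]$.
- Ex. 72.38 · Proof
Prove that $\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$ for any constants $a, b$.
- Ex. 72.39 · Proof · Answer key
Prove that $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ when $X$ and $Y$ are independent.
- Ex. 72.40 · Proof
Prove Chebyshev's inequality: $P(|X - \mu| \ge k\sigma) \le 1/k^2$ for $k > 0$.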
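Several exercises in the list rely on standardization and on Chebyshev's inequality. As a self-check aid, the sketch below standardizes a hypothetical data set and compares the observed fraction of points at least $k$ standard deviations from the mean with the Chebyshev bound $1/k^2$. The data values and the helper names `z_scores` and `fraction_beyond` are assumptions for illustration, not part of the exercise list.

```python
import math


def z_scores(data):
    """Standardize data with the population mean and standard deviation."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    if sd == 0:
        raise ValueError("all values are equal; z-scores are undefined")
    return [(x - mean) / sd for x in data]


def fraction_beyond(data, k):
    """Fraction of points lying at least k standard deviations from the mean."""
    z = z_scores(data)
    return sum(1 for value in z if abs(value) >= k) / len(z)


# Hypothetical data with mean 5 and population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

z = z_scores(data)
z_mean = sum(z) / len(z)
z_sd = math.sqrt(sum((value - z_mean) ** 2 for value in z) / len(z))
print(z_mean, z_sd)  # standardized data has mean 0 and standard deviation 1

# Chebyshev: the observed fraction never exceeds 1 / k^2.
for k in (1.5, 2, 3):
    observed = fraction_beyond(data, k)
    bound = 1 / k ** 2
    print(f"k={k}: observed={observed:.3f}, bound={bound:.3f}")
    assert observed <= bound
```

The bound is deliberately loose: it holds for any distribution with finite variance, which is why it serves as a conservative estimate.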
Sources
- OpenIntro Statistics (4th ed.), Diez, Çetinkaya-Rundel, Barr · CC-BY-SA. Primary source for this lesson. §2.1–§2.2 cover sample variance, standard deviation, boxplots, and applied examples.
- Statistics (OpenStax), Illowsky, Dean · CC-BY. §2.7 covers dispersion measures, computational formula, calculator exercises, and education/health data.
- Introduction to Probability, Grinstead & Snell (Dartmouth) · GNU FDL. Ch. 6 covers variance of discrete random variables, algebraic properties, Chebyshev's inequality, and connection to the law of large numbers.