Math Club
v1 · canonical standard

Lesson 80 — Consolidation Term 8 — Applied Statistics and Probability

Integrative workshop: central measures, variance, quartiles, discrete r.v., binomial, normal, CLT, correlation, and Bayes in real-world problems.

Used in: 2nd year of Brazilian upper secondary school (Ensino Médio, ages 16-17) · Equiv. German Stochastik LK · Equiv. Japanese Math B · Equiv. H2 Maths Statistics (Singapore)

$$\text{Data} \xrightarrow{\text{describe}} \hat\mu,\,\hat\sigma \xrightarrow{\text{model}} P(X) \xrightarrow{\text{infer}} P(H \mid E)$$
Choose your door

Rigorous notation, full derivation, hypotheses

Formal synthesis of the term

Descriptive statistics

"Variance is the average of the squared deviations from the mean. For a sample, divide by $n-1$ (Bessel's correction) instead of $n$." (OpenIntro Statistics §2.1)

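As a minimal sketch of the two divisors mentioned in the quote above, the snippet below computes both variances for a small hypothetical sample (the data values are an illustration, not taken from the lesson) and cross-checks them against Python's standard `statistics` module.

```python
from statistics import mean, pvariance, variance

# Hypothetical sample chosen for easy arithmetic; not from the lesson.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
m = mean(data)
sum_sq_dev = sum((x - m) ** 2 for x in data)  # sum of squared deviations

pop_var = sum_sq_dev / n          # divide by n: population variance
samp_var = sum_sq_dev / (n - 1)   # divide by n - 1: Bessel's correction

print(f"mean = {m}")
print(f"variance with divisor n     = {pop_var:.4f}")
print(f"variance with divisor n - 1 = {samp_var:.4f}")

# The standard library implements both conventions.
assert abs(pvariance(data) - pop_var) < 1e-12
assert abs(variance(data) - samp_var) < 1e-12
```

Whenever the data are not all equal, the $n-1$ version is the larger of the two, and the gap shrinks as $n$ grows.
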
Discrete random variable

"Expectation is a weighted average of the possible values of $X$, weighted by their probabilities." (Grinstead & Snell §6.1)

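The weighted-average reading of expectation can be sketched directly from a probability mass function. The example below assumes a fair six-sided die and uses exact fractions; the helper names `expectation` and `variance` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from math import sqrt

def expectation(pmf):
    """E[X]: sum of x * P(X = x) over the support."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X): sum of (x - E[X])^2 * P(X = x) over the support."""
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Assumed example: a fair die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
assert sum(die.values()) == 1   # probabilities must sum to 1

mu = expectation(die)
var = variance(die)
sd = sqrt(var)

print(f"E[X] = {mu}, Var(X) = {var}, sd = {sd:.4f}")
```

Using `Fraction` keeps the results exact, so they can be compared with a hand calculation before converting to decimals.
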
Parametric distributions

Central Limit Theorem

"The CLT is arguably the most important result in all of probability theory. It states that the distribution of the sample mean approaches normal regardless of the original distribution of $X$." (OpenIntro Statistics §4.4)

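A small simulation can make the statement above concrete. This is a sketch under stated assumptions, not part of the lesson: observations are iid draws from an Exponential(1) distribution (right-skewed, mean 1, standard deviation 1, finite variance as the theorem requires), and the sample size `n`, the number of repetitions `reps`, and the seed are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean, stdev

rng = random.Random(0)   # fixed seed so reruns give the same numbers

n = 50        # size of each sample (assumed)
reps = 2000   # number of simulated samples (assumed)

# One sample mean per repetition, each from n skewed Exponential(1) draws.
sample_means = [
    fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

theory_sd = 1 / n ** 0.5             # sigma / sqrt(n) with sigma = 1
observed_mean = fmean(sample_means)
observed_sd = stdev(sample_means)

# Share of sample means within z standard errors of the true mean 1;
# a normal distribution would put about 95% of them there.
z = NormalDist().inv_cdf(0.975)
within = sum(abs(m - 1.0) <= z * theory_sd for m in sample_means) / reps

print(f"mean of sample means: {observed_mean:.4f} (theory 1)")
print(f"sd of sample means:   {observed_sd:.4f} (theory {theory_sd:.4f})")
print(f"share within z standard errors: {within:.3f}")
```

The individual draws stay skewed however large `n` is; only the distribution of the sample mean moves toward a normal shape.
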
Correlation and regression

Bayes' Rule

[Diagram: Data (sample) → EDA → Descriptive ($\mu$, $\sigma$, $Q$, $r$) → model → Probability ($P(X)$, CLT) → Bayes → Inference ($P(H \mid E)$, decision)]

Term 8 Pipeline. Each block corresponds to a group of lessons (72–73, 74–76, 77, 78–79).

Solved examples

Exercise list

37 exercises · 9 with worked solution (25%)

Application 10 · Understanding 4 · Modeling 13 · Challenge 5 · Proof 5
  1. Ex. 80.1 · Application · Answer key

    Sample: 4, 6, 8, 8, 9, 10, 10, 11, 12, 22. Calculate the median, $Q_1$, $Q_3$, and IQR, and identify outliers using Tukey's fences.

  2. Ex. 80.2 · Application

    Same sample as exercise 80.1. Calculate the mean and sample standard deviation. Compare with the median: which is more representative of the central position? Why?

  3. Ex. 80.3 · Application

    $X \sim \text{Bin}(20,\, 0.3)$. Calculate $E[X]$, $\text{Var}(X)$, and $\sigma$.

  4. Ex. 80.4 · Application

    For $X \sim \text{Bin}(20,\, 0.3)$: verify the normal approximation criterion and, even if borderline, use the approximation with continuity correction to estimate $P(X \geq 10)$.

  5. Ex. 80.5 · Application

    $X \sim \mathcal{N}(50,\, 100)$. Calculate $P(X > 65)$.

  6. Ex. 80.6 · Application

    A sample of size $n = 100$ is drawn from a population with $\mu = 10$, $\sigma = 3$. Calculate $P(\bar X > 10.3)$.

  7. Ex. 80.7 · Application

    Pairs $(x, y)$: (1, 2), (2, 4), (3, 5), (4, 4), (5, 7). Calculate the Pearson correlation coefficient $r$ and the regression line $\hat y = b_0 + b_1 x$.

  8. Ex. 80.8 · Application

    Disease with 2% prevalence, test with 90% sensitivity and 95% specificity. Calculate Positive Predictive Value (PPV).

  9. Ex. 80.9 · Application

    $Y = 2X + 5$ where $X \sim \mathcal{N}(10,\, 4)$. Determine the distribution of $Y$.

  10. Ex. 80.10 · Application · Answer key

    A fair die is rolled 50 times. Calculate the expectation and standard deviation of the sum $S = X_1 + \cdots + X_{50}$.

  11. Ex. 80.11 · Understanding

    Which statement about dispersion measures is correct?

  12. Ex. 80.12 · Understanding · Answer key

    Which relationship is true for the point probabilities of the binomials at the mode?

  13. Ex. 80.13 · Understanding

    Explain in your own words: does the CLT state that, for large $n$, individual data points follow a normal distribution? If not, what exactly converges to normal?

  14. Ex. 80.14 · Understanding · Answer key

    Which statement about Pearson correlation is correct?

  15. Ex. 80.15 · Modeling

    Pizza delivery time: $T \sim \mathcal{N}(32,\, 5^2)$ min. What delivery deadline is met by 95% of deliveries?

  16. Ex. 80.16 · Modeling · Answer key

    With the model from exercise 80.15 (SLA of 40.2 min, 5% violation rate), calculate the expectation of the number of violations in 100 deliveries.

  17. Ex. 80.17 · Modeling

    Real estate market: the correlation between area and price is $r = 0.8$. Means: $\bar x = 80\,\text{m}^2$, $\bar y = 450{,}000$ (in R\$); standard deviations: $s_x = 20$, $s_y = 80{,}000$. Find the regression line and predict the price for an average-area property.

  18. Ex. 80.18 · Modeling

    Financial portfolio: 100 independent stocks, each with daily return $\sim \mathcal{N}(0.001,\; 0.02^2)$. Determine the distribution of the daily return of the equal-weighted portfolio.

  19. Ex. 80.19 · Modeling

    Six Sigma: parts with dimension $X \sim \mathcal{N}(10,\; 0.02^2)$ mm. Tolerance $10 \pm 0.1$ mm. Calculate the proportion of defects and estimate defects per million.

  20. Ex. 80.20 · Modeling

    Election poll: $n = 2500$, $\hat p = 0.48$. Construct a 95% confidence interval for the true proportion $p$.

  21. Ex. 80.21 · Modeling

    Spam filter: 80% of spam contains "FREE", and 5% of ham contains it. $P(\text{spam}) = 0.3$. An email contains "FREE": apply Bayes' rule and classify it.

  22. Ex. 80.22 · Modeling · Answer key

    Production line: 2% defect rate, batch of 200 parts. Estimate $P(X \geq 5\text{ defects})$ via the Poisson approximation and the normal approximation. Compare the results.

  23. Ex. 80.23 · Modeling

    Financial portfolio: assets A ($\sigma_A = 1\%$) and B ($\sigma_B = 2\%$) with correlation $\rho_{AB} = 0.3$. Calculate the standard deviation of a 50%/50% portfolio.

  24. Ex. 80.24 · Modeling

    Two independent diagnostic tests, both positive: test 1 (sens 90%, spec 95%), test 2 (sens 85%, spec 90%). Prevalence 1%. Apply Bayes sequentially and calculate final PPV.

  25. Ex. 80.25 · Modeling

    Vaccine clinical trial: 100 vaccinated, 5 sick; 100 placebo, 25 sick. Calculate the vaccine efficacy and assess (informally) whether the difference is statistically significant.

  26. Ex. 80.26 · Modeling

    Call center: in each minute, each of 120 agents receives a call with 2% probability. Model the number of simultaneous calls in 1 minute and calculate $P(\text{at least 1 call})$.

  27. Ex. 80.27 · Modeling

    Heights of adult men in Brazil: $\mu = 173$ cm, $\sigma = 7$ cm. What percentage does not pass through a 180 cm door? What door height covers 99% of the male population?

  28. Ex. 80.28 · Challenge

    Explain, with a numerical example, why in highly right-skewed distributions the median is more informative than the mean, and IQR more informative than standard deviation.

  29. Ex. 80.29 · Challenge · Answer key

    Describe intuitively and mathematically how Bayesian inference converges to frequentist inference (MLE) as $n \to \infty$. Which theorem formalizes this convergence?

  30. Ex. 80.30 · Challenge · Answer key

    Construct a data example where $r > 0.8$ but the relationship between $X$ and $Y$ is entirely explained by a confounder $C$. Make the mathematical mechanism explicit.

  31. Ex. 80.31 · Challenge

    Generate (theoretically) 100 independent random variables $X_i \sim \text{Uniform}[0,1]$. Use the CLT to approximate $P(X_1 + \cdots + X_{100} > 55)$.

  32. Ex. 80.32 · Challenge

    ENEM: public school has $\mu_1 = 520$, $\sigma_1 = 90$ (Math); private school has $\mu_2 = 610$, $\sigma_2 = 80$. Samples of $n = 100$ from each. What is the probability that the private sample mean exceeds the public one by more than 80 points?

  33. Ex. 80.36 · Proof

    Prove that $\text{Var}(X) = E[X^2] - (E[X])^2$ from the definition $\text{Var}(X) = E[(X-\mu)^2]$.

  34. Ex. 80.37 · Proof

    Show that if $X_i \sim \text{Bernoulli}(p)$ are iid, then $S = \sum_{i=1}^n X_i \sim \text{Bin}(n,p)$. Conclude that $\bar X = S/n \xrightarrow{P} p$ by the Law of Large Numbers.

  35. Ex. 80.38 · Proof · Answer key

    Prove that $E[aX + b] = aE[X] + b$ and $\text{Var}(aX + b) = a^2 \text{Var}(X)$ for any $a, b \in \mathbb{R}$.

  36. Ex. 80.39 · Proof

    Derive Bayes' rule $P(H \mid E) = P(E \mid H)\,P(H)/P(E)$ from the definition of conditional probability and the law of total probability.

  37. Ex. 80.40 · Proof

    State the CLT formally. Sketch the proof via characteristic functions (indicate the steps; a justification of Lévy's continuity theorem is not required).
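Two of the numerical exercises that already carry an answer key, Ex. 80.1 and Ex. 80.22, can also be checked by machine. The sketch below is one possible way to do that, under assumptions the exercises do not fix. Quartiles are taken as medians of the lower and upper halves of the sorted data; other conventions can give slightly different values. The normal approximation uses a continuity correction. The function names `tukey_summary` and `tail_ge` are hypothetical.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist, median

def tukey_summary(values):
    """Median, quartiles (median-of-halves convention), IQR and Tukey outliers."""
    xs = sorted(values)
    n = len(xs)
    lower = xs[: n // 2]          # lower half (middle value excluded if n is odd)
    upper = xs[(n + 1) // 2:]     # upper half
    q1, q2, q3 = median(lower), median(xs), median(upper)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in xs if x < low_fence or x > high_fence]
    return {"median": q2, "q1": q1, "q3": q3, "iqr": iqr,
            "fences": (low_fence, high_fence), "outliers": outliers}

def tail_ge(k, n, p):
    """P(X >= k) for X ~ Bin(n, p): exact, Poisson and normal approximations."""
    lam = n * p                      # Poisson rate and binomial mean
    sigma = sqrt(n * p * (1 - p))    # binomial standard deviation
    exact = 1 - sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k))
    poisson = 1 - sum(exp(-lam) * lam**j / factorial(j) for j in range(k))
    normal = 1 - NormalDist(lam, sigma).cdf(k - 0.5)   # continuity correction (assumed)
    return exact, poisson, normal

summary = tukey_summary([4, 6, 8, 8, 9, 10, 10, 11, 12, 22])   # data from Ex. 80.1
exact, poisson, normal = tail_ge(5, 200, 0.02)                 # setup from Ex. 80.22

print(summary)
print(f"exact = {exact:.4f}, Poisson = {poisson:.4f}, normal = {normal:.4f}")
```

The exact binomial tail is included only as a reference point for the comparison. With $np = 4$ the distribution is visibly skewed, so the normal figure lands further from the exact value than the Poisson one.
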

Sources

Updated on 2025-05-14 · Author(s): Clube da Matemática
