Math Club
v1 · canonical standard

Lesson 103 — Hypothesis testing: structure and logic

Formal structure of hypothesis testing: H0 vs H1, test statistic, p-value, significance level, Type I and II errors, and test power.

Used in: 3rd year of Ensino Médio (Brazilian upper secondary, ages 17-18) · Equiv. German Stochastik LK · Equiv. Japanese Math B · Singaporean H2 Statistics

$$p\text{-value} = P(T \geq t_{\mathrm{obs}} \mid H_0) \leq \alpha \Rightarrow \text{reject } H_0$$
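The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, assuming a z-test with known $\sigma$; the function names `z_test_p_value` and `reject_h0` and the sample numbers are hypothetical, not part of the lesson.

```python
from statistics import NormalDist


def z_test_p_value(x_bar, mu0, sigma, n, alternative="greater"):
    """Return (z, p) for a z-test on the mean with known sigma."""
    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z = (x_bar - mu0) / (sigma / n ** 0.5)
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    if alternative == "greater":      # H1: mu > mu0, p = P(Z >= z)
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":       # H1: mu < mu0, p = P(Z <= z)
        p = std_normal.cdf(z)
    elif alternative == "two-sided":  # H1: mu != mu0, p = P(|Z| >= |z|)
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p


def reject_h0(p, alpha=0.05):
    """Decision rule: reject H0 when the p-value is at most alpha."""
    return p <= alpha


# Hypothetical data: x_bar = 103, mu0 = 100, sigma = 15, n = 100.
z, p = z_test_p_value(103, 100, 15, 100, alternative="greater")
print(round(z, 2), round(p, 4), reject_h0(p))  # z = 2.0, p is about 0.0228
```

The `alternative` argument only changes which tail of the standard normal distribution is counted as "at least as extreme" as the observed statistic.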

Rigorous notation, full derivation, hypotheses

Rigorous definition

The five elements of a hypothesis test

"The null hypothesis $H_0$ represents a claim of skepticism. It is the status quo that would be maintained unless there is sufficient evidence against it." — OpenIntro Statistics, §5.1

Errors and test power

Formal definition of the p-value

"The p-value measures how consistent the data are with $H_0$. A small p-value indicates that the data are incompatible with $H_0$ — not that $H_0$ is false with probability $1-p$." — OpenIntro Statistics, §5.1
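One way to see why a small p-value is not the probability that $H_0$ is false: even when $H_0$ is true by construction, p-values at or below $\alpha$ still occur in roughly a fraction $\alpha$ of repeated samples. The simulation below is a minimal sketch under assumed conditions (standard normal data, right-tailed z-test with known $\sigma = 1$); the function name and default parameters are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def simulate_type_i_rate(n=10, trials=20000, alpha=0.05, seed=0):
    """Fraction of simulated samples with p <= alpha when H0 (mu = 0) is true."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Draw a sample from N(0, 1), so H0: mu = 0 holds by construction.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) / (1.0 / n ** 0.5)  # known sigma = 1
        p = 1 - std_normal.cdf(z)             # right-tailed p-value
        if p <= alpha:
            rejections += 1
    return rejections / trials


rate = simulate_type_i_rate()
print(f"Share of true-H0 samples rejected at alpha = 0.05: {rate:.3f}")
```

Every rejection counted here is a Type I error, so the printed share should land close to $\alpha$ regardless of how small the individual p-values were.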

Types of alternative hypothesis

Solved examples

Exercise list

26 exercises · 6 with worked solution (23%)

Application 18 · Understanding 4 · Modeling 2 · Challenge 1 · Proof 1
  1. Ex. 103.1 · Application · Answer key

    Formulate the hypotheses $H_0$ and $H_1$ for the following scenario: a consumer protection agency wants to verify whether the average weight of a 500 g package of flour complies with the declared weight.

  2. Ex. 103.2 · Application

    Researchers want to verify whether teenagers sleep less than the recommended 8 hours per night. Formulate $H_0$ and $H_1$.

  3. Ex. 103.3 · Application

    $H_0: \mu = 50$, $H_1: \mu \neq 50$. Data: $n = 25$, $\bar X = 52$, $\sigma = 10$ (known). Calculate the z-statistic and the p-value. State your conclusion for $\alpha = 0.05$.

  4. Ex. 103.4 · Application

    A manufacturer claims its bulbs last on average 1000 h. A sample of $n = 64$ bulbs gives $\bar X = 985$ h with $\sigma = 50$ h (known). At the 5% level, is the average lifespan less than claimed?

  5. Ex. 103.5 · Application

    In a criminal trial, $H_0$ is "the defendant is innocent" and $H_1$ is "the defendant is guilty". Describe Type I and Type II errors in this context. Which is considered more serious in the legal system? Why?

  6. Ex. 103.6 · Understanding

    A test results in $p = 0.03$. Which of the statements below is correct?

  7. Ex. 103.7 · Understanding

    A test with $n = 10$ results in $p = 0.12$. The researcher concludes "the effect does not exist". What might be wrong?

  8. Ex. 103.8 · Application

    A school implemented a new methodology. The historical average grade is $\mu_0 = 35$ points. After the intervention, $n = 40$ students had $\bar X = 37$ with $\sigma = 8$ (known). At the 5% level, did the grade improve?

  9. Ex. 103.9 · Application

    A clinic wants to detect a 5-minute reduction in service time ($\delta = 5$, $\sigma = 10$). With $\alpha = 0.05$ and 90% power, what is the minimum $n$?

  10. Ex. 103.10 · Application · Answer key

    A coin is flipped 100 times and lands heads 60 times. At the 5% level, is the coin fair?

  11. Ex. 103.11 · Application

    A researcher changes the significance level from $\alpha = 0.05$ to $\alpha = 0.01$ while keeping $n$ fixed. Explain the effect on the Type II error and on test power.

  12. Ex. 103.12 · Application · Answer key

    Normal fasting blood glucose: $\mu_0 = 120$ mg/dL. A sample of $n = 50$ diabetics gives $\bar X = 128$ mg/dL with $\sigma = 20$ mg/dL. At the 1% level, is average blood glucose elevated?

  13. Ex. 103.13 · Understanding

    A result is "statistically significant at 5%". What does this correctly mean?

  14. Ex. 103.14 · Application

    A company wants to detect whether the average weight of its products dropped from $\mu_0 = 250$ g to $\mu_1 = 245$ g, with $\sigma = 20$ g, $\alpha = 0.05$ and 80% power. What is the minimum $n$?

  15. Ex. 103.15 · Application

    A genomics study performs 1000 simultaneous tests with $\alpha = 0.05$. All tested genes are null (no real effect). How many false positives are expected? If 60 genes are "significant", what is the estimated false discovery rate?

  16. Ex. 103.16 · Application

    A coin is flipped 800 times and lands heads 384 times. At the 5% level, is the coin fair?

  17. Ex. 103.17 · Application · Answer key

    A survey of $n = 30$ teenagers recorded an average sleep of $\bar X = 7.5$ h with $\sigma = 1.5$ h (from previous studies). At the 5% level, do they sleep less than 8 hours?

  18. Ex. 103.18 · Understanding · Answer key

    Which of the statements about statistical significance is correct?

  19. Ex. 103.19 · Modeling

    A clinical trial tests 20 endpoints simultaneously with $\alpha = 0.05$. What is the probability of at least one false positive without correction? Describe how the Bonferroni correction addresses the problem and discuss its limitation.

  20. Ex. 103.20 · Application

    The historical ENEM approval rate of a school is 30%. After a new methodology, 38 out of 100 students passed. At the 5% level, did the rate improve?

  21. Ex. 103.21 · Application

    Test $H_0: \mu = 50$ vs $H_1: \mu \neq 50$ with $\sigma = 10$ and $\bar X = 51$. Calculate the p-value for $n = 10$ and $n = 10000$. What does this reveal about the p-value and effect size?

  22. Ex. 103.22 · Application · Answer key

    Normal systolic pressure: $\mu_0 = 120$ mmHg. Sample of $n = 60$ sedentary adults: $\bar X = 125$ mmHg, $\sigma = 15$ mmHg. At the 1% level, is average pressure elevated?

  23. Ex. 103.23 · Application

    A veterinary study wants to detect whether the average weight of pigs of a given breed changed from 125 kg to 120 kg ($\delta = 5$, $\sigma = 15$). With a two-tailed $\alpha = 0.05$ and 80% power, how many animals are needed?

  24. Ex. 103.24 · Modeling

    A school's ENEM scores average $\bar X = 52$ points against a state average of $\mu_0 = 50$, with $s = 10$ and $n = 10000$ students. The result is "highly significant" ($p < 0.001$). Calculate Cohen's effect size $d$. Is the 2-point difference educationally relevant? Discuss.

  25. Ex. 103.25 · Challenge

    Show that, when $H_0$ is true, the p-value has a Uniform(0,1) distribution for continuous test statistics. Use this result to verify that $P(\text{reject } H_0 \mid H_0) = \alpha$.

  26. Ex. 103.26 · Proof

    Use the Neyman-Pearson Lemma to show that the one-tailed z-test (reject if $\bar X > c$) is the most powerful level-$\alpha$ test for $H_0: \mu = \mu_0$ vs $H_1: \mu = \mu_1 > \mu_0$ with normal data and known $\sigma$.
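For checking answers to the sample-size exercises in the list (103.9, 103.14, 103.23), the sketch below assumes the usual normal-approximation formula for a z-test with known $\sigma$, $n \geq \left(\frac{(z_{1-\alpha} + z_{1-\beta})\,\sigma}{\delta}\right)^2$, with $z_{1-\alpha/2}$ in place of $z_{1-\alpha}$ for a two-tailed test. The function name `min_sample_size` and the example values are hypothetical and are not taken from the exercises.

```python
import math
from statistics import NormalDist


def min_sample_size(delta, sigma, alpha=0.05, power=0.80, two_tailed=False):
    """Smallest n for a z-test (known sigma) to detect a shift of size delta.

    Assumes the normal-approximation formula; for two-tailed tests the
    negligible rejection probability in the far tail is ignored.
    """
    std_normal = NormalDist()
    tail = alpha / 2 if two_tailed else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)  # critical value of the test
    z_beta = std_normal.inv_cdf(power)      # quantile matching the target power
    n = ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)                     # round up to a whole observation


# Hypothetical inputs: detect a 2-unit shift when sigma = 8.
print(min_sample_size(delta=2, sigma=8))                   # one-tailed
print(min_sample_size(delta=2, sigma=8, two_tailed=True))  # two-tailed
```

Halving $\delta$ quadruples the required $n$, which is why small effects demand large samples.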

Sources

Updated on 2025-05-14 · Author(s): Clube da Matemática
