Lesson 79 — Deep Dive into Bayes' Theorem
Priors, posteriors, and sequential updating. Odds form, Beta-binomial conjugate prior, base rate fallacy, Naive Bayes. Applications in medical diagnosis, spam filtering, and ML.
Used in: German Stochastik LK · Singaporean H2 Math Statistics · Japanese Math B · equiv. US AP Statistics
Rigorous notation, full derivation, hypotheses
Definitions and theorems
Conditional probability
"The conditional probability P(A | B), the probability of A given B, expresses the probability of A when we know that B has occurred. It can be computed using the formula P(A | B) = P(A ∩ B) / P(B), assuming P(B) > 0." (Grinstead & Snell, Introduction to Probability, §4.1)
Law of total probability
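A standard statement of the law, written in notation assumed here (events $B_1, \dots, B_n$ that partition the sample space, each with $P(B_i) > 0$), might read:

```latex
P(A) \;=\; \sum_{i=1}^{n} P(A \cap B_i) \;=\; \sum_{i=1}^{n} P(A \mid B_i)\, P(B_i)
% Binary special case, splitting on an event B and its complement B^c:
P(A) \;=\; P(A \mid B)\, P(B) + P(A \mid B^{c})\, P(B^{c})
```

The binary case is the one used for a diagnostic test: B is "sick", its complement is "healthy", and A is "positive test".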
Bayes' theorem
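In symbols, and assuming a partition $B_1, \dots, B_n$ of the sample space with $P(B_i) > 0$ and an event $A$ with $P(A) > 0$ (notation chosen here for illustration), the theorem can be sketched as a two-step consequence of the definition of conditional probability and the law of total probability:

```latex
P(B_j \mid A)
  \;=\; \frac{P(A \cap B_j)}{P(A)}
  \;=\; \frac{P(A \mid B_j)\, P(B_j)}{\sum_{i=1}^{n} P(A \mid B_i)\, P(B_i)}
```

Here $P(B_j)$ plays the role of the prior, $P(A \mid B_j)$ the likelihood, and $P(B_j \mid A)$ the posterior.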
"Bayes' Theorem is just a formula that comes from the definition of conditional probability. Yet it is extremely powerful, and is the key to understanding what it means to rationally revise your beliefs in light of new evidence." — OpenIntro Statistics 4e, §3.2
Odds form
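The odds form can be sketched in a few lines of Python. The function name, its parameters, and the numbers in the usage line are illustrative assumptions, not values from the lesson. The sketch multiplies the prior odds by the positive likelihood ratio LR+ = sensitivity / (1 − specificity) and converts the result back to a probability.

```python
def posterior_from_odds(prior, sensitivity, specificity):
    """Posterior probability after one positive test, via the odds form.

    posterior odds = LR+ * prior odds, with LR+ = sens / (1 - spec).
    """
    prior_odds = prior / (1 - prior)
    lr_positive = sensitivity / (1 - specificity)
    posterior_odds = lr_positive * prior_odds
    # Convert odds back to a probability.
    return posterior_odds / (1 + posterior_odds)


# Hypothetical numbers: prior 20%, sensitivity 80%, specificity 90%.
# Prior odds 1:4, LR+ = 8, posterior odds 2:1.
posterior = posterior_from_odds(0.2, 0.8, 0.9)
print(round(posterior, 3))  # 0.667
```

Working in odds keeps the update to a single multiplication, which is why likelihood ratios are convenient when several pieces of evidence are combined.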
Sequential updating
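A minimal sketch of sequential updating in Python follows. It assumes the test results are conditionally independent given the true status and that every test has the same sensitivity and specificity; the function name and all numbers are hypothetical. After each result, the posterior becomes the prior for the next update.

```python
def update(prior, sensitivity, specificity, positive=True):
    """One Bayes update of P(condition) after a single binary test result."""
    if positive:
        like_yes, like_no = sensitivity, 1 - specificity
    else:
        like_yes, like_no = 1 - sensitivity, specificity
    # Law of total probability gives the chance of seeing this result.
    evidence = like_yes * prior + like_no * (1 - prior)
    return like_yes * prior / evidence


# Hypothetical run: prior 20%, test with 80% sensitivity and 90% specificity,
# observed results positive, positive, negative.
belief = 0.2
history = []
for result in [True, True, False]:
    belief = update(belief, 0.8, 0.9, positive=result)
    history.append(belief)
print([round(b, 3) for b in history])
```

With these numbers the belief moves to roughly 0.667, then 0.941, then back down to 0.780. Under the independence assumption the final value does not depend on the order in which the results arrive.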
Beta-binomial conjugate prior
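The conjugate update can be sketched as simple parameter arithmetic: a Beta(α, β) prior combined with k successes in n Bernoulli trials gives a Beta(α + k, β + n − k) posterior. The helper names and the data below are hypothetical and chosen only for illustration.

```python
def beta_update(alpha, beta, successes, trials):
    """Conjugate update: Beta(alpha, beta) prior plus binomial data."""
    failures = trials - successes
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: Beta(2, 2) prior, then 3 successes in 4 trials.
posterior = beta_update(2, 2, successes=3, trials=4)
print(posterior, beta_mean(*posterior))  # (5, 3) 0.625

# Updating one trial at a time reaches the same posterior.
step = (2, 2)
for outcome in [1, 1, 0, 1]:
    step = beta_update(*step, successes=outcome, trials=1)
print(step)  # (5, 3)
```

The prior parameters act like pseudo-counts of earlier successes and failures, so a prior with larger α + β needs more data before the posterior mean moves noticeably.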
Figure: Bayes diagram as a 2×2 table
Natural-frequency (absolute count) diagram. The PPV (positive predictive value) is the Bayesian posterior P(sick | positive test). When prevalence is low, false positives can outnumber true positives even for a high-quality test, which drives the PPV down.
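The frequency reasoning behind the 2×2 table can be sketched in Python. The function name, the default population of 10,000, and the test characteristics in the usage line are assumptions made for illustration. The counts are expected values, so they need not be whole numbers.

```python
def frequency_table(prevalence, sensitivity, specificity, population=10_000):
    """Expected 2x2 counts and the PPV for a binary diagnostic test."""
    sick = prevalence * population
    healthy = population - sick
    true_pos = sensitivity * sick            # sick and test positive
    false_neg = sick - true_pos              # sick but test negative
    true_neg = specificity * healthy         # healthy and test negative
    false_pos = healthy - true_neg           # healthy but test positive
    ppv = true_pos / (true_pos + false_pos)  # P(sick | positive test)
    return {"TP": true_pos, "FN": false_neg,
            "FP": false_pos, "TN": true_neg, "PPV": ppv}


# Hypothetical test: 1% prevalence, 99% sensitivity, 95% specificity.
table = frequency_table(prevalence=0.01, sensitivity=0.99, specificity=0.95)
for cell, value in table.items():
    print(cell, round(value, 3))
```

In this hypothetical case about 99 true positives compete with about 495 false positives, so the PPV is only about 1/6 even though the test looks accurate.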
Solved examples
Exercise list
40 exercises · 10 with worked solution (25%)
- Ex. 79.1 · Application · Answer key
, , . Calculate .
- Ex. 79.2 · Application
, . Calculate .
- Ex. 79.3 · Application
, , . Calculate .
- Ex. 79.4 · Application
With the data from exercise 79.3, calculate .
- Ex. 79.5 · Application · Answer key
Disease with 0.5% prevalence. Diagnostic test: 95% sensitivity, 95% specificity. Calculate the PPV using frequencies in 10,000 people.
- Ex. 79.6 · Application · Answer key
Same data as exercise 79.5, but with 50% prevalence. Calculate the PPV and compare with the previous result.
- Ex. 79.7 · Application
Spam filter: . Word "FREE" appears in 60% of spams and 5% of legitimate emails. Calculate .
- Ex. 79.8 · Application
Urn A: 2 red, 3 blue. Urn B: 5 red, 1 blue. An urn is chosen at random and a red ball is drawn. What is the probability the urn is A?
- Ex. 79.9 · Application · Answer key
3 coins: 2 fair, 1 double-headed. One is chosen at random, flipped once, comes up heads. What is the probability the chosen coin is the double-headed one?
- Ex. 79.10 · Application
. . . Given a person has cancer, what is the probability they are a smoker?
- Ex. 79.11 · Application
Sequential updating: two positive tests with 90% sensitivity and 90% specificity, applied to a disease with 1% prevalence. Use the posterior of the 1st test as the prior of the 2nd. What is the PPV after both consecutive positive results?
- Ex. 79.12 · Application
For a test with 90% sensitivity and 95% specificity, calculate the positive likelihood ratio LR+.
- Ex. 79.13 · Application
Prior odds of 1:99 (1% prevalence). LR+ = 18 (90% sensitivity, 95% specificity). Calculate the posterior odds and the posterior probability.
- Ex. 79.14 · Application
Which of the following values is the correct posterior in a context with prior odds 1:99 and ?
- Ex. 79.15 · Application
Prior . 7 heads observed in 10 flips. Determine the posterior.
- Ex. 79.16 · Application
Prior Beta(1, 1) (uniform). 0 heads observed in 5 flips. Determine the posterior and its mean.
- Ex. 79.17 · Application
In exercise 79.15, what is the posterior mean?
- Ex. 79.18 · Application
Prior . New batch: 30 parts inspected, 6 defective. Determine the posterior and posterior mean.
- Ex. 79.19 · Modeling · Answer key
COVID-19 in endemic phase: 5% prevalence. Rapid test: 80% sensitivity, 95% specificity. Calculate the PPV using frequencies in 10,000 people. Is it worth automatically isolating all positives?
- Ex. 79.20 · Modeling
Naive Bayes for email: . In training: "FREE" appears in 60% of spams and 5% of hams; "won" appears in 50% of spams and 10% of hams. An email contains both words. Classify assuming conditional independence.
- Ex. 79.21 · Modeling
Three diseases: A (10% in population), B (5%), C (1%). Patient presents symptom S with , , . Which disease is most likely?
- Ex. 79.22 · Modeling
Prosecutor's fallacy: DNA evidence has a frequency of 1/1000 in the population. The prosecutor claims the probability of innocence is 1/1000. Why is this reasoning wrong? Calculate the correct posterior assuming there are 100,000 plausible suspects in the city.
- Ex. 79.23 · Modeling · Answer key
Fraud classifier: 95% sensitivity, 99.9% specificity. Frauds: 0.1% of transactions. Calculate the PPV. How many false positives for every true positive?
- Ex. 79.24 · Modeling
Pregnancy test: 99% sensitivity, 98% specificity. Woman with prior probability of pregnancy of 30%. Calculate the PPV.
- Ex. 79.25 · Modeling · Answer key
Polygraph: 70% sensitivity, 80% specificity. A suspect under interrogation has a 5% prior probability of guilt. Calculate the posterior probability of guilt after a positive result. Is the result acceptable as sufficient evidence to convict?
- Ex. 79.26 · Modeling · Answer key
Two independent positive tests (sens = 0.9, spec = 0.95; sens = 0.85, spec = 0.90). Prevalence 2%. Calculate the posterior after both positive results via sequential updating.
- Ex. 79.27 · Modeling
In a lineup, a red-haired suspect (H) has a 70% prior probability of being the culprit. A witness identifies the red-haired suspect with 90% probability when the culprit is H, and erroneously 15% of the time when the culprit is not H. Given that the witness pointed to H, what is the posterior probability that H is guilty?
- Ex. 79.28 · Modeling
Quality control with 3 lines (A: 40% of production, 2% defect; B: 35%, 3%; C: 25%, 5%). A defective part is found. Determine the probability of each line being the origin.
- Ex. 79.29 · Understanding
What is the base rate fallacy?
- Ex. 79.30 · Understanding
Why does the prior matter even in "objective science"? An analysis that ignores the prior is equivalent to what implicit assumption?
- Ex. 79.31 · Understanding
Two independent positive tests have likelihood ratios LR₁ and LR₂. What is their combined effect in the odds form?
- Ex. 79.32 · Understanding
What is the practical difference between using a Beta(1,1) prior and a Beta(10,10) prior for a coin? In which case will the posterior be more sensitive to new data?
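- Ex. 79.33 · Challenge
Show that two positive tests T₁ and T₂ that are conditionally independent given the disease status D result in posterior odds equal to LR₁ × LR₂ × prior odds, where LRᵢ = P(Tᵢ positive | D) / P(Tᵢ positive | not D).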
- Ex. 79.34 · Challenge
Demonstrate that the posterior of the Bernoulli-Beta model is Beta(α + k, β + n − k) when the prior is Beta(α, β) and we observe k successes in n trials.
- Ex. 79.35 · Proof
Demonstrate Bayes' theorem from the definition of conditional probability and the law of total probability.
- Ex. 79.36 · Proof
Show that P(A | B) P(B) = P(B | A) P(A) using only the definition of conditional probability. Identify why P(A | B) ≠ P(B | A) in general.
- Ex. 79.37 · Challenge
Monty Hall problem with 3 doors. Use Bayes' theorem to calculate the probability of the car being behind each door after Monty (who knows where the car is) opens a door with no car behind it. Should you switch?
- Ex. 79.38 · Challenge · Answer key
In Naive Bayes with binary features, show that the classifier is equivalent to multiplying the individual LRs of each feature. What happens when the conditional independence assumption is violated?
- Ex. 79.39 · Proof · Answer key
Demonstrate that the odds form of Bayes, posterior odds = LR × prior odds, follows directly from the usual form of Bayes' theorem applied to two complementary events H and not-H.
- Ex. 79.40 · Challenge
Show that the mean of the posterior Beta(α + k, β + n − k) converges to the maximum likelihood estimator k/n as n → ∞, for any fixed prior Beta(α, β). What does this imply about the relationship between Bayes and frequentism for large samples?
Sources
- Grinstead, C.M. & Snell, J.L. Introduction to Probability (2nd ed.) · GNU FDL · Dartmouth College. Chapter 4 (§4.1): conditional probability, independence, Bayes' theorem. Primary source for most urn, coin, and proof exercises in this lesson.
- Diez, D.M., Çetinkaya-Rundel, M., Barr, C.D. OpenIntro Statistics (4th ed.) · CC-BY-SA · OpenIntro. Sections §3.2–3.4: conditional probability, Bayes, frequency tables, and Bayesian updating. Source for PPV, sequential updating, and conjugate prior exercises.
- Illowsky, B. & Dean, S. Statistics (OpenStax) · CC-BY · OpenStax. Section §3.4 (Contingency Tables and Probability Trees): medical diagnosis, spam filtering, and probability trees. Basis for Naive Bayes and fraud exercises.