Lesson 109 — Introductory Bayesian Statistics
Prior, likelihood, posterior. Bayes' rule. Beta-Bernoulli conjugacy. MAP versus MLE. Credible interval. Introduction to inference through the Bayesian paradigm.
Used in: Stochastik LK (Germany, Klasse 12) · H2 Math Statistics (Singapore) · AP Statistics (USA)
Rigorous notation, full derivation, hypotheses
Rigorous definition
Bayes' theorem
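A standard statement of the theorem, for reference. The symbols $\theta$ (the unknown quantity) and "data" (what was observed) are notation assumed here, not taken from the source.

```latex
% Bayes' theorem: posterior = likelihood x prior / evidence
P(\theta \mid \mathrm{data})
  = \frac{P(\mathrm{data} \mid \theta)\, P(\theta)}{P(\mathrm{data})},
\qquad
P(\mathrm{data}) = \sum_{i} P(\mathrm{data} \mid \theta_i)\, P(\theta_i)
% for continuous theta the sum becomes an integral over the prior density
```

Here $P(\theta)$ is the prior, $P(\mathrm{data} \mid \theta)$ the likelihood, $P(\mathrm{data})$ the evidence, and $P(\theta \mid \mathrm{data})$ the posterior. Because the evidence does not depend on $\theta$, the posterior is proportional to likelihood times prior.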
"Bayes' theorem is a basic result of conditional probability, but its interpretation changes everything: it offers a formal recipe for updating beliefs in light of evidence." — OpenIntro Statistics §3.6
Conjugate priors: the Beta-Bernoulli case
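A sketch of why the Beta prior is conjugate to the Bernoulli likelihood, assuming $k$ successes in $n$ independent trials and a $\mathrm{Beta}(\alpha, \beta)$ prior on the success probability $\theta$ (the symbol names are assumptions of this sketch):

```latex
\begin{align*}
p(\theta) &\propto \theta^{\alpha-1}(1-\theta)^{\beta-1}
  && \text{prior } \mathrm{Beta}(\alpha,\beta) \\
P(\mathrm{data} \mid \theta) &= \theta^{k}(1-\theta)^{n-k}
  && \text{one sequence with } k \text{ successes in } n \text{ trials} \\
p(\theta \mid \mathrm{data}) &\propto \theta^{\alpha+k-1}(1-\theta)^{\beta+n-k-1}
  && \Rightarrow\ \theta \mid \mathrm{data} \sim \mathrm{Beta}(\alpha+k,\ \beta+n-k)
\end{align*}
```

The prior parameters behave like pseudo-counts: $\alpha$ is added to the observed successes and $\beta$ to the observed failures.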
Point estimators
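The three point estimators used in this lesson can be written as follows for the Beta-Bernoulli model, assuming a $\mathrm{Beta}(\alpha, \beta)$ prior and $k$ successes in $n$ trials (notation assumed here):

```latex
\begin{align*}
\hat\theta_{\mathrm{MLE}} &= \arg\max_\theta P(\mathrm{data} \mid \theta) = \frac{k}{n} \\
\hat\theta_{\mathrm{MAP}} &= \arg\max_\theta p(\theta \mid \mathrm{data})
  = \frac{\alpha+k-1}{\alpha+\beta+n-2}
  \quad (\alpha+k>1,\ \beta+n-k>1) \\
E[\theta \mid \mathrm{data}] &= \frac{\alpha+k}{\alpha+\beta+n}
\end{align*}
```

The MAP expression is the mode of the posterior $\mathrm{Beta}(\alpha+k, \beta+n-k)$, which is why both posterior parameters must exceed 1.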
Credible interval
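One common way to write the definition, assuming a posterior density $p(\theta \mid \mathrm{data})$ with cumulative distribution function $F$ and credibility level $1 - \gamma$ (all symbols are assumptions of this sketch):

```latex
P(a \le \theta \le b \mid \mathrm{data})
  = \int_a^b p(\theta \mid \mathrm{data})\, d\theta = 1-\gamma,
\qquad
\text{central interval: } a = F^{-1}(\gamma/2),\quad b = F^{-1}(1-\gamma/2)
```

For a 95% interval, $\gamma = 0.05$, so the endpoints are the 2.5% and 97.5% percentiles of the posterior.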
Bayes Factor
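To compare hypotheses $H_0$ and $H_1$ given the observed data:
$$BF_{10} = \frac{P(\mathrm{data} \mid H_1)}{P(\mathrm{data} \mid H_0)}$$
Jeffreys scale: $BF_{10} > 10$, strong evidence for $H_1$; $BF_{10} < 1/10$, strong for $H_0$; $1/3 < BF_{10} < 3$, inconclusive.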
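The Bayes factor connects prior and posterior beliefs about the two hypotheses through the odds form of Bayes' theorem. A short statement, with $BF_{10}$ denoting the ratio of the likelihood of the data under $H_1$ to that under $H_0$ (notation assumed here):

```latex
\frac{P(H_1 \mid \mathrm{data})}{P(H_0 \mid \mathrm{data})}
  = BF_{10} \cdot \frac{P(H_1)}{P(H_0)},
\qquad
P(H_1) = P(H_0) \ \Rightarrow\ P(H_1 \mid \mathrm{data}) = \frac{BF_{10}}{1 + BF_{10}}
```

With equal prior probabilities the posterior odds equal the Bayes factor itself.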
Bayesian flow: prior × likelihood → posterior. The posterior becomes the new prior when more data arrives.
Worked examples
Problem. A disease affects 2% of the population. A test has sensitivity 90% and specificity 85%. A patient tests positive. What is the probability he has the disease?
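Strategy. Apply Bayes' formula with partition $\{D, D^c\}$ (diseased, healthy) and calculate $P(D \mid +)$.
Solution.
$P(D) = 0.02$, $P(D^c) = 0.98$, $P(+ \mid D) = 0.90$, $P(+ \mid D^c) = 1 - 0.85 = 0.15$.
Total evidence: $P(+) = P(+ \mid D)\,P(D) + P(+ \mid D^c)\,P(D^c) = 0.90 \cdot 0.02 + 0.15 \cdot 0.98 = 0.018 + 0.147 = 0.165$.
Posterior: $P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+)} = \frac{0.018}{0.165} \approx 0.109$, about 10.9%.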
Verification. With only 2% prevalence, even a relatively good test generates many false positives. The answer makes sense: most positives come from the huge healthy population.
Source. OpenIntro Statistics §3.6, medical diagnosis example — CC-BY-SA.
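The diagnostic-test calculation can be checked numerically. The following is a minimal sketch; the function name `posterior_positive` and its parameter names are illustrative choices, not taken from the source.

```python
def posterior_positive(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    false_positive_rate = 1.0 - specificity  # P(+ | healthy)
    # Law of total probability over the partition {diseased, healthy}
    evidence = sensitivity * prevalence + false_positive_rate * (1.0 - prevalence)
    return sensitivity * prevalence / evidence


# Values from the worked example: 2% prevalence, 90% sensitivity, 85% specificity
p = posterior_positive(prevalence=0.02, sensitivity=0.90, specificity=0.85)
print(f"P(disease | +) = {p:.4f}")  # prints P(disease | +) = 0.1091
```

Varying `prevalence` while holding the test fixed shows how strongly the base rate drives the result.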
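Problem. An urn has an unknown proportion $\theta$ of red balls. Prior: $\theta \sim \mathrm{Beta}(2, 2)$. Draw with replacement: 1st sample, 3 red in 5 draws; 2nd sample, 4 red in 6 draws. Calculate the posterior after each sample and the final posterior mean.
Strategy. Beta-Bernoulli: after $k$ successes in $n$ trials, $\mathrm{Beta}(\alpha, \beta) \to \mathrm{Beta}(\alpha + k, \beta + n - k)$. Apply iteratively.
Solution.
Prior: $\mathrm{Beta}(2, 2)$, mean $2/4 = 0.50$.
After 1st sample ($k = 3$, $n = 5$): $\mathrm{Beta}(2 + 3, 2 + 2) = \mathrm{Beta}(5, 4)$, mean $5/9 \approx 0.556$.
After 2nd sample ($k = 4$, $n = 6$): $\mathrm{Beta}(5 + 4, 4 + 2) = \mathrm{Beta}(9, 6)$, mean $9/15 = 0.60$.
Verification. Total data: 7 red in 11 draws, sample proportion $7/11 \approx 0.636$. The posterior mean 0.60 lies between the prior mean (0.50) and the sample proportion, as expected. With a weak prior, the posterior mean converges to the MLE as $n$ grows.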
Source. Think Bayes §3 — Allen Downey — CC-BY-NC-SA.
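The sequential update in this example can be sketched in a few lines. This assumes a Beta(2, 2) prior, which is the prior consistent with the means 0.50 and 0.60 quoted in the verification; the helper names `update` and `beta_mean` are hypothetical.

```python
from fractions import Fraction


def update(prior, successes, trials):
    """Beta-Bernoulli update: Beta(a, b) -> Beta(a + k, b + n - k)."""
    a, b = prior
    return (a + successes, b + trials - successes)


def beta_mean(params):
    """Mean of Beta(a, b), kept exact as a fraction."""
    a, b = params
    return Fraction(a, a + b)


prior = (2, 2)                            # ASSUMPTION: Beta(2, 2) prior
after_first = update(prior, 3, 5)         # 3 red in 5 draws
after_second = update(after_first, 4, 6)  # previous posterior acts as the new prior
batch = update(prior, 7, 11)              # all 11 draws processed at once

print(after_first, float(beta_mean(after_first)))
print(after_second, float(beta_mean(after_second)))
print("sequential == batch:", after_second == batch)
```

Processing the two samples one after the other gives the same posterior as processing all 11 draws in a single step, which is what makes the "posterior becomes the new prior" flow consistent.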
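Problem. For the Beta-Bernoulli model with $\alpha = 3$, $\beta = 3$, after 6 successes in 10 trials, calculate the MLE, MAP, and posterior mean. Interpret the difference.
Strategy. MLE maximizes the likelihood; MAP maximizes the posterior; the posterior mean is $E[\theta \mid \mathrm{data}] = \frac{\alpha + k}{\alpha + \beta + n}$.
Solution.
MLE: $\hat\theta_{MLE} = k/n = 6/10 = 0.600$.
Posterior: $\mathrm{Beta}(3 + 6, 3 + 4) = \mathrm{Beta}(9, 7)$.
MAP (mode of $\mathrm{Beta}(a, b)$ with $a, b > 1$ is $\frac{a - 1}{a + b - 2}$): $\hat\theta_{MAP} = 8/14 \approx 0.571$.
Posterior mean: $9/16 = 0.5625$.
Verification. Ordering: posterior mean (0.5625) < MAP (0.571) < MLE (0.600). The MLE is largest because the prior, symmetric around 0.5, pulls both Bayesian estimates toward 0.5. With large $n$, the MAP and the posterior mean both converge to the MLE.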
Source. Think Bayes §4, §6 — Allen Downey — CC-BY-NC-SA.
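The three estimators can be compared side by side with exact fractions. This sketch assumes prior parameters alpha = beta = 3, the values consistent with the Beta(9, 7) posterior named in the verification; the function name `estimators` is illustrative.

```python
from fractions import Fraction


def estimators(alpha, beta, k, n):
    """Return (MLE, MAP, posterior mean) for a Beta(alpha, beta) prior
    and k successes in n Bernoulli trials."""
    a, b = alpha + k, beta + n - k  # posterior is Beta(a, b)
    if a <= 1 or b <= 1:
        raise ValueError("the mode formula needs both posterior parameters > 1")
    mle = Fraction(k, n)
    map_ = Fraction(a - 1, a + b - 2)  # mode of Beta(a, b)
    mean = Fraction(a, a + b)
    return mle, map_, mean


# ASSUMPTION: prior Beta(3, 3); data: 6 successes in 10 trials
mle, map_est, mean = estimators(alpha=3, beta=3, k=6, n=10)
print(float(mle), float(map_est), float(mean))
```

Calling `estimators(1, 1, 6, 10)` instead shows the uniform-prior case, where the MAP and the MLE coincide.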
Problem. After 12 successes in 20 trials with prior $\mathrm{Beta}(1, 1)$ (uniform), calculate the central 95% credible interval for $\theta$.
Strategy. Posterior $\mathrm{Beta}(1 + 12, 1 + 8) = \mathrm{Beta}(13, 9)$. The central 95% interval is given by the 2.5% and 97.5% percentiles of this Beta distribution.
Solution.
Posterior: $\mathrm{Beta}(13, 9)$.
Posterior mean: $13/22 \approx 0.591$.
By table or software (R: qbeta(c(0.025, 0.975), 13, 9)):
2.5% percentile: $\approx 0.384$. 97.5% percentile: $\approx 0.782$.
95% credible interval: $[0.384, 0.782]$.
Verification. Direct interpretation: "given the uniform prior and the data, the probability that $\theta$ is between 0.384 and 0.782 is 95%". Note that the interval is not symmetric about the posterior mean 0.591 (it extends 0.207 below and 0.191 above), because $\mathrm{Beta}(13, 9)$ is left-skewed.
Source. Introduction to Probability §4.1 — Grinstead & Snell — GNU FDL.
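For readers without R, the percentiles of Beta(13, 9) can be approximated with the Python standard library alone. This is a rough sketch, not a general-purpose routine: it only handles integer parameters, relying on the identity between the Beta CDF and a binomial tail, and it inverts the CDF by bisection. The function names are hypothetical; where SciPy is available, `scipy.stats.beta.ppf` is the usual tool.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x, for positive integers a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def beta_quantile_int(q, a, b, tol=1e-10):
    """Invert the CDF by bisection on [0, 1]; the CDF is increasing in x."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Posterior Beta(13, 9): uniform prior plus 12 successes in 20 trials
lower = beta_quantile_int(0.025, 13, 9)
upper = beta_quantile_int(0.975, 13, 9)
print(f"95% central credible interval: ({lower:.3f}, {upper:.3f})")
```

Bisection is slow compared with library routines but makes the definition explicit: each endpoint is the point where the posterior CDF reaches the target probability.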
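Problem. To test $H_0: \theta = 0.5$ (fair coin) versus $H_1: \theta = 0.7$ (biased coin), with equiprobable prior ($P(H_0) = P(H_1) = 0.5$), calculate the Bayes factor and the posterior probability of $H_1$ after 8 heads in 10 flips.
Strategy. Calculate $P(\mathrm{data} \mid H_i)$ for each hypothesis, then apply Bayes.
Solution.
$P(\mathrm{data} \mid H_0) = \binom{10}{8}\, 0.5^{10} \approx 0.0439$ and $P(\mathrm{data} \mid H_1) = \binom{10}{8}\, 0.7^{8} \cdot 0.3^{2} \approx 0.2335$.
Bayes factor: $BF_{10} = \frac{P(\mathrm{data} \mid H_1)}{P(\mathrm{data} \mid H_0)} = \frac{0.7^{8} \cdot 0.3^{2}}{0.5^{10}} \approx 5.31$.
With prior $P(H_0) = P(H_1) = 0.5$: $P(H_1 \mid \mathrm{data}) = \frac{BF_{10}}{1 + BF_{10}} = \frac{5.31}{6.31} \approx 0.84$.
Verification. $BF_{10} \approx 5.3$: moderate evidence for $H_1$ (Jeffreys scale: between 3 and 10 is "moderate"). The posterior probability of a biased coin went from 50% to 84%. Consistent with the data (8 out of 10 favors $H_1$).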
Source. OpenIntro Statistics §3.7 — Diez, Çetinkaya-Rundel, Barr — CC-BY-SA.
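A numerical sketch of the Bayes factor calculation follows. It assumes the biased-coin hypothesis is a heads probability of 0.7, a value chosen because it reproduces the 84% posterior quoted in the verification; the variable and function names are illustrative.

```python
from math import comb


def binomial_likelihood(k, n, theta):
    """P(k successes in n trials | success probability theta)."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


theta_h0 = 0.5  # fair coin
theta_h1 = 0.7  # ASSUMPTION: biased-coin value consistent with the 84% posterior
k, n = 8, 10    # 8 heads in 10 flips

bf_10 = binomial_likelihood(k, n, theta_h1) / binomial_likelihood(k, n, theta_h0)

prior_h1 = 0.5  # equiprobable hypotheses
posterior_odds = bf_10 * prior_h1 / (1 - prior_h1)
posterior_h1 = posterior_odds / (1 + posterior_odds)

print(f"BF_10 = {bf_10:.2f}, P(H1 | data) = {posterior_h1:.3f}")
# prints BF_10 = 5.31, P(H1 | data) = 0.842
```

Changing `prior_h1` shows how the same Bayes factor leads to different posterior probabilities under different prior odds.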
Exercise list
34 exercises · 8 with worked solution (24%)
- Ex. 109.1 · Application
Prevalence of a disease: 1%. Test sensitivity: 95%. False positive rate: 10%. A patient tests positive. Calculate the probability of having the disease.
- Ex. 109.2 · Application
A coin flipped 10 times gives 4 heads. Prior: Beta(1,1) (uniform). Calculate the posterior, posterior mean, and compare with MLE.
- Ex. 109.3 · Application
Prior: Beta(4, 6). Sample: 7 successes in 10. Calculate the posterior, posterior mean, and MAP.
- Ex. 109.4 · Application
Prior: Beta(2, 2). Batch 1: 5 successes in 10. Batch 2: 8 successes in 10. Do the sequential update and calculate the final posterior mean.
- Ex. 109.5 · Application
Prevalence: 0.5%. Sensitivity: 99%. False positive rate: 2%. Patient tests positive. What is the probability of having the disease?
- Ex. 109.6 · Application
3 successes in 10 trials. Compare the posterior mean with priors Beta(1,1) and Beta(5,5). Which prior has greater influence on the posterior?
- Ex. 109.7 · Application
Three factories produce bolts: E1 (60% of production, 30% defective), E2 (30%, 50% defective), E3 (10%, 10% defective). A defective bolt is drawn. What is the probability it came from E1?
- Ex. 109.8 · Application
Prior: Beta(3, 3) (slight belief in fair coin, mean 0.5). Flip 5 times and get 0 heads. Calculate the posterior and the new mean.
- Ex. 109.9 · Application
Prior: Beta(1,1). Data: 15 successes in 20. Calculate MAP and MLE. Are they equal? Why?
- Ex. 109.10 · Application
Bag with two coins: one always gives heads (H), the other is fair (F). One is chosen at random. Flipped twice, both heads. What is the probability it is the H coin?
- Ex. 109.11 · Understanding
What does a 95% Bayesian credible interval mean?
- Ex. 109.12 · Understanding · Answer key
Which statement about MAP and MLE is INCORRECT?
- Ex. 109.13 · Understanding
How does sample size n affect the relationship between prior and posterior?
- Ex. 109.14 · Application
A student passes the exam (). Known: (studied hard, probability 60%), (did not study, probability 40%). Given that he passed, what is the probability he studied hard?
- Ex. 109.15 · Application
A machine has unknown success rate. Prior: Beta(4, 2) (history of 4 successes and 2 failures). New test: 6 consecutive successes. Calculate the posterior, mean, and MAP.
- Ex. 109.16 · Application
Calculate the Bayes Factor for versus after 8 heads in 10 flips.
- Ex. 109.17 · Application · Answer key
Three batches of 10 trials each: 7 successes, 6 successes, 7 successes. Prior: Beta(1,1). Do the sequential update and calculate the final posterior mean.
- Ex. 109.18 · Application
Prevalence: 30%. Sensitivity: 95%. False positive rate: 20%. Patient tests positive. Calculate the probability of having the disease and compare with exercise 109.1.
- Ex. 109.19 · Application · Answer key
Show that the posterior mean of the Beta-Bernoulli model is a weighted average between the prior and the sample proportion. Identify the weights.
- Ex. 109.20 · Application
Prior: Beta(2, 2). Data: 0 successes in 3. Calculate the posterior, MAP, and posterior mean.
- Ex. 109.21 · Application
Probability of rain in Fortaleza on a given day: 40%. If it rains, there is an 85% chance of dark clouds. If it does not rain, 30%. There are dark clouds. What is the probability it will rain?
- Ex. 109.22 · Application · Answer key
Production history: 10% defects (equivalent to 10 defects in 100 parts = Beta(10,90)). New inspection: 3 defects in 20. Calculate the posterior and posterior mean.
- Ex. 109.23 · Application
Bag with 3 coins: 1 always gives heads (H), 2 are fair (F). One coin is drawn randomly and flipped: heads appears. What is the probability it is the H coin?
- Ex. 109.24 · Application
Prior Beta(1,1). Data: 10 successes in 20. Describe the posterior and the central 95% credible interval (use the fact that the 2.5% percentile of Beta(11,11) ≈ 0.30).
- Ex. 109.25 · Modeling
A test prep course has historically had 70% of its students pass the ENEM. In a new cohort of 20 students, 15 passed. Propose a suitable Beta prior, then calculate the posterior and the posterior mean of the pass rate.
- Ex. 109.26 · Modeling · Answer key
Prevalence of pancreatic cancer: 0.2%. Biopsy: sensitivity 92%, specificity 97%. Test positive. Calculate P(cancer | positive) and discuss the medical decision.
- Ex. 109.27 · Modeling
A shipping company reports 20 delayed deliveries in 50 monitored deliveries. Using prior Beta(1,1), estimate the delay rate with a 90% credible interval.
- Ex. 109.28 · Modeling · Answer key
A fintech knows that 1% of transactions are fraudulent. An algorithm detects that the current transaction has a value outside the customer's normal pattern. P(abnormal value | fraud) = 85%, P(abnormal value | legitimate) = 2%. Calculate the probability of fraud.
- Ex. 109.29 · Proof
Show that, for the Bernoulli model with Beta prior, the posterior is also Beta. Identify the parameters.
- Ex. 109.30 · Proof · Answer key
Prove that, with uniform prior Beta(1,1), the MAP estimator coincides with the MLE for the Bernoulli model.
- Ex. 109.31 · Application · Answer key
Spam filter: 20% of emails are spam. In spam emails, each suspicious keyword appears with probability 60%; in legitimate emails, 5%. An email has 3 keywords. What is the probability it is spam?
- Ex. 109.32 · Application
Two groups of rats: lineage 1 (10 animals, 8 developed tumors after exposure) and lineage 2 (10 animals, 3 developed tumors). Prior Beta(1,1) for both rates. Calculate the posterior and posterior mean for each lineage.
- Ex. 109.33 · Application
An urn has unknown proportion of orange balls. After 100 draws with replacement, 50 are orange. Prior Beta(1,1). Calculate the posterior, the mean, and the 95% credible interval.
- Ex. 109.34 · Challenge
The Jeffreys prior for the Bernoulli model is Beta(0.5, 0.5). After 6 successes in 10, calculate the posterior. Research what it means for this prior to be "invariant under reparametrization" and compare the posterior mean with the one obtained from the Beta(1,1) prior.
Sources
- Think Bayes — Allen B. Downey · CC-BY-NC-SA · Greenteapress · Chapters 1–9.
- Introduction to Probability — Grinstead & Snell · GNU FDL · Dartmouth · §4.1.
- OpenIntro Statistics — Diez, Çetinkaya-Rundel, Barr · CC-BY-SA · OpenIntro · §3.6–3.7.