Math Club
v1 · canonical standard

Lesson 106 — Multiple regression

Model with p predictors, OLS matrix solution, adjusted R², multicollinearity, variable selection, and assumption diagnostics.

Used in: German Stochastik LK (Klasse 12) · Singapore H2 Mathematics (§15) · introductory econometrics

$$\hat{\boldsymbol\beta} = (X^TX)^{-1}X^T\mathbf{y}$$
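
As a minimal sketch of how the closed-form estimator could be evaluated numerically, the snippet below solves the normal equations with NumPy and compares the result with `np.linalg.lstsq`. The data and the helper name `ols_normal_equations` are made up for illustration and are not part of the lesson.

```python
import numpy as np

def ols_normal_equations(X, y):
    """Solve (X^T X) beta = X^T y for beta.

    X is assumed to already contain the column of ones for the intercept.
    """
    XtX = X.T @ X
    Xty = X.T @ y
    # Solving the linear system avoids forming the inverse explicitly.
    return np.linalg.solve(XtX, Xty)

# Illustrative (made-up) data generated exactly from y = 1 + 2*x1 + 3*x2.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
X = np.column_stack([np.ones_like(x1), x1, x2])
y = 1.0 + 2.0 * x1 + 3.0 * x2

beta_hat = ols_normal_equations(X, y)

# Cross-check with NumPy's least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because the illustrative data contain no noise, both estimates should recover the coefficients used to generate `y`. In practice a least-squares routine is often preferred over the explicit normal equations when the predictors are close to collinear.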

Rigorous notation, full derivation, hypotheses

Rigorous definition

Multiple linear regression model

"The multiple regression model is $Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + \varepsilon$. The coefficient $\beta_j$ measures the expected change in $Y$ per unit change in $X_j$ when all other predictors are held constant." (OpenIntro Statistics, §8.1, p. 362)

Goodness-of-fit metrics

Inference

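Design matrix X (n = 4, p = 2):

$$
\underbrace{\begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \end{pmatrix}}_{n \times 1}
=
\underbrace{\begin{pmatrix} 1 & X_{11} & X_{12} \\ 1 & X_{21} & X_{22} \\ 1 & X_{31} & X_{32} \\ 1 & X_{41} & X_{42} \end{pmatrix}}_{n \times (p+1)}
\underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}}_{(p+1) \times 1}
+
\underbrace{\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \end{pmatrix}}_{n \times 1}
$$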

Matrix representation of the model: $\mathbf{y} = X\boldsymbol\beta + \boldsymbol\varepsilon$. The first column of 1s in $X$ generates the intercept $\beta_0$.
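
The role of the column of 1s can be illustrated with a short sketch. The helper `design_matrix`, the predictor values, and the coefficient vector below are hypothetical, chosen only to match the n = 4, p = 2 layout.

```python
import numpy as np

def design_matrix(predictors):
    """Prepend a column of ones to an (n, p) array of predictor values."""
    predictors = np.asarray(predictors, dtype=float)
    if predictors.ndim == 1:
        # A single predictor is treated as one column.
        predictors = predictors.reshape(-1, 1)
    n = predictors.shape[0]
    return np.hstack([np.ones((n, 1)), predictors])

# Hypothetical values for n = 4 observations and p = 2 predictors.
raw = [[2.0, 5.0],
       [3.0, 1.0],
       [4.0, 4.0],
       [6.0, 2.0]]
X = design_matrix(raw)

# Arbitrary coefficients (beta_0, beta_1, beta_2), for illustration only.
beta = np.array([10.0, 1.5, -2.0])

# Systematic part X @ beta: each row is beta_0 + beta_1*X_i1 + beta_2*X_i2.
mean_response = X @ beta
```

Multiplying the first column by $\beta_0$ adds the same constant to every row, which is why that column produces the intercept.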

Solved examples

Exercise list

20 exercises · 5 with worked solution (25%)

Application 10 · Understanding 3 · Modeling 3 · Challenge 3 · Proof 1
  1. Ex. 106.1 · Application · Answer key

    Regression: (price in R$ thousand, area m², bedrooms, floor). Interpret each coefficient.

  2. Ex. 106.2 · Application

    Using , calculate the prediction and residual for an 80 m², 3-bedroom, 5th-floor apartment with an observed price of R$ 450 thousand.

  3. Ex. 106.3 · Application · Answer key

    , predictors, , . Calculate and .

  4. Ex. 106.4 · Application · Answer key

    . Three models with predictors and . Calculate for each and point out the preferable one.

  5. Ex. 106.5 · Application

    , , , . Build the ANOVA table and test the model at the 5% level.

  6. Ex. 106.6 · Application

    Auxiliary regressions for 3 predictors: , , . Calculate VIFs and identify severe multicollinearity.

  7. Ex. 106.7 · Application

    Regression of ENEM score on family income () and participation in a tutoring program (: 1=yes, 0=no): , . Interpret .

  8. Ex. 106.8 · Application

    , , , . Test at the 5% level (two-tailed).

  9. Ex. 106.9 · Application

    , , , . Construct 95% CI for . Use .

  10. Ex. 106.10 · Application

    Four of the five residuals of a regression are: . What is the fifth residual?

  11. Ex. 106.11 · Understanding

    Which statement about $R^2$ and adjusted $R^2$ is CORRECT?

  12. Ex. 106.12 · Understanding

    What is the main practical effect of multicollinearity in multiple regression?

  13. Ex. 106.13 · Understanding

    Which statement about partial coefficients in multiple regression is CORRECT?

  14. Ex. 106.14 · Modeling

    Regression of monthly family expenditure (R$ thousand) on 4 socioeconomic predictors: , , . Calculate , , and .

  15. Ex. 106.15 · Modeling

    Model: (salary in R$ thousand, experience in years, =1 if female). Calculate salaries for (a) male, 10 years; (b) female, 10 years. How to include interaction to check if the gap varies with experience?

  16. Ex. 106.16 · Modeling · Answer key

    A researcher has a regression model with 2 predictors ($X_1$, $X_2$) and considers adding a third predictor. Describe two criteria for deciding whether to include it.

  17. Ex. 106.17 · Challenge

    Prove that the hat matrix $H = X(X^TX)^{-1}X^T$ is idempotent: $H^2 = H$.

  18. Ex. 106.18 · Challenge

    Data: observations with , , . Write the design matrix and state the procedure to calculate (no need to perform inversion by hand — describe the steps).

  19. Ex. 106.19 · Proof · Answer key

    Prove that in any regression with an intercept, $\sum_{i=1}^{n} e_i = 0$, using the orthogonality $X^T\mathbf{e} = \mathbf{0}$.

  20. Ex. 106.20 · Challenge

    Show that adding a predictor to the model increases the adjusted $R^2$ if and only if the $|t|$ statistic of the new predictor is greater than 1.
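
A numerical check is not a proof, but it can help sanity-check the statements about the hat matrix and the residuals before writing the argument out. The sketch below assumes randomly generated, made-up data and the standard definition $H = X(X^TX)^{-1}X^T$. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n, p = 12, 3

# Random design matrix with an intercept column, and a random response.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = rng.normal(size=n)

# Hat matrix H = X (X^T X)^{-1} X^T, built with a linear solve
# instead of an explicit inverse.
H = X @ np.linalg.solve(X.T @ X, X.T)

# Fitted values are H @ y, so the residuals are y - H @ y.
residuals = y - H @ y

idempotent = np.allclose(H @ H, H)              # H^2 = H
orthogonal = np.allclose(X.T @ residuals, 0.0)  # X^T e = 0
residual_sum = residuals.sum()                  # first row of X^T e
trace_H = np.trace(H)                           # equals p + 1 for full-rank X
```

The first row of $X^T\mathbf{e}$ is the dot product of the column of 1s with the residuals, which is why orthogonality forces the residuals to sum to zero when an intercept is present.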

Sources

  • OpenIntro Statistics (4th ed.) — Diez, Çetinkaya-Rundel, Barr · CC-BY-SA · Chapter 8 (Multiple and logistic regression). Primary source for coefficient interpretation, , multicollinearity, and dummy variables.
  • Statistics — OpenStax — Illowsky, Dean · CC-BY · Chapter 13 (Linear Regression and Correlation — Multiple). Source for multiple regression ANOVA tables and global F-test.
  • Probabilidade e Estatística — Wikilivros — collaborative · CC-BY-SA · Multiple regression section. Reference in PT-BR with matrix notation.

Updated on 2025-05-14 · Author(s): Clube da Matemática
