Lesson 106 — Multiple regression
Model with p predictors, OLS matrix solution, adjusted R², multicollinearity, variable selection, and assumption diagnostics.
Used in: Stochastik LK, Germany (Klasse 12) · H2 Mathematics, Singapore (§15) · introductory econometrics
Rigorous notation, full derivation, hypotheses
Rigorous definition
Multiple linear regression model
"The multiple regression model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon$. The coefficient $\beta_j$ measures the expected change in $y$ per unit change in $x_j$ when all other predictors are held constant." (OpenIntro Statistics, §8.1, p. 362)
Goodness-of-fit metrics
Inference
Matrix representation of the model: $\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}$. The first column of 1s in $X$ generates the intercept $\beta_0$.
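The matrix form above leads to the normal equations $X^\top X \hat{\boldsymbol{\beta}} = X^\top \mathbf{y}$. The following is a minimal sketch of that solution in Python with NumPy. The data are made up for illustration and do not come from the lesson, and the variable names (`x1`, `x2`, `beta_hat`) are assumptions of this example.

```python
import numpy as np

# Made-up illustrative data: 6 observations, 2 predictors (not from the lesson).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])

# Response built exactly as y = 1 + 2*x1 + 3*x2 (no noise),
# so OLS should recover (1, 2, 3) up to floating-point error.
y = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix: a leading column of 1s for the intercept,
# then one column per predictor.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Solve the normal equations (X'X) beta = X'y without forming an explicit inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values and residuals follow directly from the estimate.
fitted = X @ beta_hat
residuals = y - fitted

print("beta_hat:", beta_hat)
print("sum of residuals:", residuals.sum())
```

Solving the linear system with `np.linalg.solve` rather than inverting $X^\top X$ is a common numerical choice. With strongly collinear predictors, `np.linalg.lstsq(X, y, rcond=None)` is usually the more stable option.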
Solved examples
Exercise list
20 exercises · 5 with worked solutions (25%)
- Ex. 106.1 · Application · Answer key
Regression: (price in R$ thousand, area m², bedrooms, floor). Interpret each coefficient.
- Ex. 106.2 · Application
Using , calculate the prediction and residual for an 80 m², 3-bedroom, 5th-floor apartment with an observed price of R$ 450 thousand.
- Ex. 106.3 · Application · Answer key
, predictors, , . Calculate and .
- Ex. 106.4 · Application · Answer key
. Three models with predictors and . Calculate for each and point out the preferable one.
- Ex. 106.5 · Application
, , , . Build the ANOVA table and test the model at the 5% level.
- Ex. 106.6 · Application
Auxiliary regressions for 3 predictors: , , . Calculate VIFs and identify severe multicollinearity.
- Ex. 106.7 · Application
Regression of ENEM score on family income () and participation in a tutoring program (: 1=yes, 0=no): , . Interpret .
- Ex. 106.8 · Application
, , , . Test at the 5% level (two-tailed).
- Ex. 106.9 · Application
, , , . Construct 95% CI for . Use .
- Ex. 106.10 · Application
Four of the five residuals of a regression are: . What is the fifth residual?
- Ex. 106.11 · Understanding
Which statement about $R^2$ and adjusted $R^2$ is CORRECT?
- Ex. 106.12 · Understanding
What is the main practical effect of multicollinearity in multiple regression?
- Ex. 106.13 · Understanding
Which statement about partial coefficients in multiple regression is CORRECT?
- Ex. 106.14 · Modeling
Regression of monthly family expenditure (R$ thousand) on 4 socioeconomic predictors: , , . Calculate , , and .
- Ex. 106.15 · Modeling
Model: (salary in R$ thousand, experience in years, =1 if female). Calculate salaries for (a) male, 10 years; (b) female, 10 years. How to include interaction to check if the gap varies with experience?
- Ex. 106.16 · Modeling · Answer key
A researcher has a regression model with 2 predictors (, ) and considers adding a third predictor. Describe two criteria for deciding whether to include it.
- Ex. 106.17 · Challenge
Prove that the hat matrix $H = X(X^\top X)^{-1}X^\top$ is idempotent: $H^2 = H$.
- Ex. 106.18 · Challenge
Data: observations with , , . Write the design matrix and state the procedure to calculate (no need to perform inversion by hand — describe the steps).
- Ex. 106.19 · Proof · Answer key
Prove that in any regression with an intercept, $\sum_{i=1}^{n} e_i = 0$, using the orthogonality $X^\top \mathbf{e} = \mathbf{0}$.
- Ex. 106.20 · Challenge
Show that adding a predictor to the model increases adjusted $R^2$ if and only if the $|t|$ statistic of the new predictor is greater than 1.
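For checking hand calculations in the exercises on adjusted R² and VIF, a small helper sketch may be useful. It assumes the standard formulas, adjusted $R^2 = 1 - (1 - R^2)\frac{n-1}{n-p-1}$ and $\mathrm{VIF}_j = \frac{1}{1 - R_j^2}$, where $R_j^2$ comes from the auxiliary regression of predictor $j$ on the others. The function names and the numbers in the usage lines are hypothetical and are not taken from any exercise.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n observations and p predictors (intercept not counted in p)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for the adjusted R^2 to be defined")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def vif(r2_aux: float) -> float:
    """Variance inflation factor from the R^2 of an auxiliary regression."""
    if not 0.0 <= r2_aux < 1.0:
        raise ValueError("auxiliary R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2_aux)


# Hypothetical inputs, chosen only to show the calls.
print(adjusted_r2(0.80, n=30, p=3))
print(vif(0.90))
```

Cutoffs such as VIF above 5 or above 10 are common rules of thumb for flagging multicollinearity. The threshold this lesson intends is not stated in the text above, so treat any cutoff as an assumption.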
Sources
- OpenIntro Statistics (4th ed.), Diez, Çetinkaya-Rundel, Barr · CC-BY-SA · Chapter 8 (Multiple and logistic regression). Primary source for coefficient interpretation, adjusted $R^2$, multicollinearity, and dummy variables.
- Statistics (OpenStax), Illowsky, Dean · CC-BY · Chapter 13 (Linear Regression and Correlation: Multiple). Source for multiple regression ANOVA tables and the global F-test.
- Probabilidade e Estatística (Wikilivros), collaborative · CC-BY-SA · Multiple regression section. Reference in Brazilian Portuguese (PT-BR) with matrix notation.