Adding election example questions by RossLinModelling · Pull Request #34 · elevien/math50_2025

RossLinModelling · 2025-11-25T16:06:51Z

Summary

This pull request adds a new set of contribution problems to Unit 5 of the Math 50 notes. The problems are based on my final project analyzing the “13 Keys to the White House” and comparative forecasting models for recent U.S. presidential elections (including fundamentals-plus-polls models). The goal is to give students a concrete, data-driven context for applying the concepts from Units 4–5 on linear regression, logistic regression, and model validation.

The new material appears under the heading:

Contribution Problems: Presidential Forecasting Models

in the Unit 5 LaTeX notes.

Content Added

The contribution consists of five multi-part problems, each linking a course concept to the election forecasting project:

Simple Linear Regression and Residual Analysis
- Uses the “13 Keys” dataset (1976–2020) with correlation ( r = -0.838 ) between number of false keys and incumbent two-party vote share.
- Asks students to compute ( R^2 ), estimate the regression line from two points, interpret the slope in plain language, and discuss why a W-shaped scatterplot violates the linearity assumption and breaks the model when predicting Electoral College vote share.
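As a rough illustration of the arithmetic in this problem, the sketch below squares the quoted correlation to get ( R^2 ) and fits a line through two points. The two points and the helper `line_through` are hypothetical placeholders, not values taken from the Keys dataset.

```python
# Correlation quoted in the problem statement (false keys vs. vote share).
r = -0.838
# In simple linear regression, R^2 equals the squared correlation.
r_squared = r ** 2


def line_through(p1, p2):
    """Return (slope, intercept) of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


# Hypothetical points: (number of false keys, incumbent vote share in %).
slope, intercept = line_through((2, 56.0), (8, 44.0))

print(f"R^2 = {r_squared:.3f}")
print(f"vote share ~ {intercept:.1f} + ({slope:.1f}) * false_keys")
```

With these made-up points, the slope reads as the change in vote share (in percentage points) per additional false key.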
Multiple Regression and Multicollinearity
- Uses regression output where incumbent vote share is predicted by:
  Num_False_Keys, Key 5 (short-term economy), and Key 6 (long-term economy).
- Asks students to:
  - Write the estimated multiple regression equation.
  - Decide which predictors are significant at (\alpha = 0.05).
  - Explain why Key 5 looks strong in a single-predictor model but becomes insignificant once Num_False_Keys is included (multicollinearity and variance inflation).
  - Reason about what happens to (\operatorname{Var}(\hat{\beta})) if Num_False_Keys is removed.
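To make the variance-inflation idea concrete, here is a minimal sketch for the two-predictor case, where the variance inflation factor reduces to 1 / (1 - r^2) with r the correlation between the two predictors. The data are invented for illustration, and the function names are hypothetical. Because Key 5 is one of the keys counted in Num_False_Keys, some correlation between the two is to be expected.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(xs, ys):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r ** 2)


# Invented data, NOT the real Keys dataset.
num_false_keys = [1, 2, 3, 4, 5, 6, 7, 8]
key5_false = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 means Key 5 is false

vif = vif_two_predictors(num_false_keys, key5_false)
print(f"VIF = {vif:.2f}")
```

A VIF well above 1 means the coefficient variance is inflated relative to the uncorrelated case. Dropping the correlated predictor brings the factor back toward 1, which is the direction of the effect the last part of the problem asks about.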
Logistic Regression and Classification Thresholds
- Treats the Keys model as a binary classifier (incumbent wins if ≤5 false keys).
- Students:
  - Use the logistic formula ( P(Y=1) = \frac{1}{1+e^{-(\beta_0 + \beta_1 x)}} ) to find the ratio (-\beta_0/\beta_1) when the decision boundary is at (x=5.5).
  - Discuss the trade-off between validity (error guarantees) and plausibility (modeling uncertainty) when comparing probabilistic forecasts to a deterministic “Keys” rule.
  - Explain how quasi-separation in a tiny dataset (n=12) creates near-step-function behavior and potential MLE instability.
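The decision-boundary calculation can be sketched as follows. The predicted probability equals 0.5 exactly where the linear predictor is zero, so the boundary sits at x = -beta0/beta1. The coefficient values below are hypothetical, chosen only so that the boundary lands at 5.5; they are not fitted estimates.

```python
import math


def logistic(x, beta0, beta1):
    """P(Y = 1 | x) under a one-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def decision_boundary(beta0, beta1):
    """Value of x where P(Y = 1) = 0.5, i.e. beta0 + beta1 * x = 0."""
    return -beta0 / beta1


# Hypothetical coefficients: negative slope, since more false keys
# should lower the incumbent's win probability.
beta0, beta1 = 11.0, -2.0
boundary = decision_boundary(beta0, beta1)

for x in (5, 5.5, 6):
    print(x, round(logistic(x, beta0, beta1), 3))

# Scaling both coefficients keeps the boundary at 5.5 but sharpens the
# curve toward a step function, the behavior associated with
# (quasi-)separation in very small samples.
steep_at_5 = logistic(5, 110.0, -20.0)
print(round(steep_at_5, 5))
```

Only the ratio -beta0/beta1 is pinned down by the boundary. The individual magnitudes control how abrupt the transition is.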
The Linear Extrapolation Problem
- Starts from a fundamentals-plus-polls linear model ( M = \beta_0 + \beta_1 E + \beta_2 P ).
- Problems ask students to:
  - Show algebraically how linear models can produce impossible predictions (vote shares below 0% or above 100%) when inputs are extreme or when a negative weight on polls is needed to fit a state.
  - Explain how applying a logistic transform ( f(z) = 1/(1+e^{-z}) ) constrains predictions to the open interval (0, 1).
  - Interpret the Monte Carlo simulation’s distribution of Electoral College outcomes in terms of the error term ( \varepsilon ) in ( Y = \beta X + \varepsilon ).
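A small sketch of the extrapolation issue, under assumed values: the coefficients and inputs below are hypothetical and chosen only to push the linear prediction past 1 (that is, past 100%). In practice the coefficients would be re-estimated on the logit scale rather than reused as they are here.

```python
import math


def linear_prediction(e, p, b0, b1, b2):
    """M = b0 + b1 * E + b2 * P, with no constraint on the output range."""
    return b0 + b1 * e + b2 * p


def logistic(z):
    """Logistic transform f(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients and an extreme economic input E.
b0, b1, b2 = 0.5, 0.1, 0.8
raw = linear_prediction(e=3.0, p=0.4, b0=b0, b1=b1, b2=b2)
print(f"linear prediction: {raw:.2f}")  # exceeds 1, i.e. more than 100%

# The same linear predictor passed through the logistic transform
# stays strictly between 0 and 1.
for z in (-10.0, 0.0, raw, 10.0):
    print(f"z = {z:6.2f} -> f(z) = {logistic(z):.5f}")
```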
Sample Size and Degrees of Freedom
- Emphasizes that the dataset contains only 12 elections (1976–2020).
- Students:
  - Compute the residual degrees of freedom for the regression with three predictors and discuss whether the “10 observations per predictor” rule of thumb is satisfied.
  - Analyze what happens to the OLS solution ( \hat{\beta} = (X^T X)^{-1} X^T Y ) if we try to regress on all 13 individual Keys with only (n=12) (singular (X^T X)).
  - Connect 2016 polling failures and structural breaks to the need for regularization (Ridge/Lasso) to improve generalization versus plain OLS on such a small sample.
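The counting arguments in this problem can be sketched in a few lines, assuming the model includes an intercept (so the residual degrees of freedom are n - p - 1). The function names are hypothetical.

```python
def residual_df(n_obs, n_predictors, intercept=True):
    """Residual degrees of freedom: n minus the number of fitted coefficients."""
    return n_obs - n_predictors - (1 if intercept else 0)


def obs_per_predictor(n_obs, n_predictors):
    """Ratio compared against the '10 observations per predictor' rule of thumb."""
    return n_obs / n_predictors


def more_columns_than_rows(n_obs, n_predictors, intercept=True):
    """True if X has more columns than rows, which forces X^T X to be singular."""
    n_columns = n_predictors + (1 if intercept else 0)
    return n_columns > n_obs


n = 12  # elections from 1976 to 2020

print(residual_df(n, 3))              # three-predictor model
print(obs_per_predictor(n, 3))        # well below 10
print(more_columns_than_rows(n, 13))  # all 13 Keys as separate predictors
```

With 13 Keys and only 12 rows, the rank of X is at most 12, so X^T X cannot be inverted and the OLS formula has no unique solution. Ridge regression avoids this by adding a positive multiple of the identity to X^T X before inverting.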

Pedagogical Motivation

These problems are designed to:

Tie abstract regression concepts directly to a high-interest real-world application (presidential elections).
Reinforce key Unit 5 topics:
- OLS diagnostics and residual patterns
- Multiple regression and multicollinearity
- Logistic regression and classification
- Domain constraints and link functions
- Sample size, degrees of freedom, and overfitting
- Regularization as a way to handle structural breaks and limited data
Provide a bridge between the course material and current debates about election models (fundamentals vs. polls, deterministic vs. probabilistic forecasts).

They can be used as:

Optional challenge problems,
A contribution exercise,
Or supplementary practice for students interested in political applications of regression.

Files Touched

public/latex_notes/unit5/unit5.tex (added the new subsection and problems near the end of Unit 5).

No existing content was deleted or modified; the contribution is additive and self-contained.

RossLinModelling added 2 commits November 25, 2025 07:48

- 1d3c8ca Update posts.html: Unit 5 and maybe onward based questions with the theme of election modeling.
- e2117d6 Update posts.html: fixed election questions

RossLinModelling wants to merge 2 commits into elevien:main from RossLinModelling:add-election-example
