Aiden Brown and Tony Nguyen's Spring 2026 Data Science Project
About our data: Every year, the CDC publishes results from a survey of Americans that covers health conditions and demographics. We chose the 2017-2018 data because it was the most accessible.
Analytical goals: We seek to determine whether depression is correlated with several demographic and lifestyle factors, such as race, sleep behavior, and income. Since we are not medical professionals, we make no diagnostic claims and do not assert that any of these factors meaningfully influences an individual’s risk of depression; instead, we look for relationships between these factors and depression that could guide the direction of future research.
Research Question: How well, if at all, do demographics, sleep behaviors, income, body weight, and hours worked each week predict depression score?
Methods: In our analysis, we use a linear regression model, a neural network, a LASSO model, and gradient boosting.
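Illustrative sketch: The four model families named above could be fit and compared roughly as follows. This is a minimal sketch, not our actual pipeline. It assumes scikit-learn and NumPy are available, and it uses synthetic placeholder data in place of the CDC survey. The feature matrix, the target, the hyperparameters (such as alpha=0.1 and a single 16-unit hidden layer), and the choice of held-out R^2 as the comparison metric are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data, NOT the CDC survey: five made-up numeric
# predictors and a made-up continuous target standing in for a depression score.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters below are illustrative assumptions, not tuned values.
# The linear models and the neural network are scaled first; tree-based
# gradient boosting does not need feature scaling.
models = {
    "linear regression": make_pipeline(StandardScaler(), LinearRegression()),
    "LASSO": make_pipeline(StandardScaler(), Lasso(alpha=0.1)),
    "neural network": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
    "gradient boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model on the training rows and record R^2 on the held-out rows.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

Scoring every model on the same held-out rows keeps the comparison fair. With real survey data, categorical variables such as race would also need to be encoded before fitting.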