This project analyzes institution-level data from the U.S. Department of Education’s College Scorecard to examine how institutional characteristics—such as ownership type, selectivity, and cost of attendance—are associated with student outcomes, including graduation rates, retention rates, and early employment outcomes.
The analysis focuses specifically on four-year, bachelor’s degree–granting institutions, with the goal of producing analysis-ready datasets and interpretable insights relevant to institutional research, higher education policy, and data analytics roles.
The main objectives of the project include:
- Clean and prepare a large unstructured public data set for analysis using Postgre SQL
- Explore how insitutional ownership (Public, Private Non-profit, Private For-profit) relates to students outcomes(retention and graduation)
- Find out does school selectivity (admission rate) have relationship with students' graduation rates and job placement rates
- Analyze the cost of attendance and students outcomes such as graduation rates and job placement rates
Source: U.S. Department of Education – College Scorecard
Academic Year: 2018–2019
Key Variables:
- Institutional characteristics (ownership/control, degree offerings, enrollment size)
- Admissions selectivity (admission rate)
- Cost of attendance (public/private cost measures)
- Student outcomes:
- Retention Rate
- Graduation Rate
- Retention Rate
- Employment outcomes one year after graduation
PostgreSQL
- Cleaned and prepared large-scale public datasets, including type conversion, standardization of missing values, and construction of analysis-ready variables using COALESCE
- Structured multi-step transformations with Common Table Expressions (CTEs)
- Applied window functions (ROW_NUMBER, PERCENTILE_CONT, NTILE) to compute percentiles, rankings, and group-level benchmarks
- Categorized institutions by ownership, selectivity tiers (admission rate quartiles), and cost tiers (cost of attendance quartiles) using CASE WHEN syntax
- Compared graduation, retention, and employment outcomes across institutional groups using multiple aggregate functions
Institution type: Public and private nonprofit schools have higher graduation and retention rates than private for-profit schools; private nonprofits are slightly higher than publics in graduation.
Selectivity: More selective institutions show higher graduation rates, but employment rates are similar across selectivity tiers (job quality/earnings not available).
Cost: Higher-cost institutions have higher graduation rates, while employment rates are relatively consistent across cost tiers.