Suggested data sources for data science students at Thinkful.
| Title | Description | Prospective Uses |
|---|---|---|
| American Community Survey 5-year Data (2009-2015) | Survey data from the U.S. Census Bureau available via API | Exploratory Analysis, Data Visualizations, Predictive Modeling |
| General Social Survey, 2012 Merged Data | Panel data from three iterations of the General Social Survey of the United States, covering attitudes toward a variety of subjects. Contains both panel data and cross-sectional data. | Exploratory Analysis, Data Visualizations, Predictive Modeling |
| Kaggle | Kaggle data sets, from a variety of sources and industries | Industry-based modeling/analytics |
| ICPSR | Political and social research data sets | Economics, Health? |
| UC Irvine ML Data | Machine Learning Repo from UC Irvine. Tons of data sets here with an emphasis on scale | Predictive Modeling, particularly unsupervised |
| UN Data | Government data sets from the UN | All |
| Quora Question | Great collection of various datasets, some explicitly called out here as well | All |
| AWS Public Data Sets | Data sets on AWS infrastructure | Big Data |
| Data.gov | US government open data sets | Economics, Government |
| GitHub Repo | Awesome GitHub repo with many, many datasets sorted by subject | All |
Highly specialized data sets:
| Title | Description | Prospective Uses |
|---|---|---|
| Rijksmuseum API | A public-facing API from the Rijksmuseum in Amsterdam | Exploratory Analysis, Data Visualizations |
| SFMOMA API | An API from SFMOMA (San Francisco Museum of Modern Art) | Exploratory Analysis, Data Visualizations |