Predict California Housing Prices Using Two AutoML Tools
This repository contains two tutorials that show how to use state-of-the-art AutoML methods—AutoGluon and TabPFN—on the California Housing dataset.
AutoML (Automated Machine Learning) automates the process of training machine learning models: model selection, hyperparameter tuning, ensembling, and evaluation—making it faster and easier to build high-performing models without manual coding.
We’ll use the California Housing dataset (a regression problem).
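For readers who want to look at the data outside the notebooks, here is a minimal sketch of loading it and making a reproducible train/test split. It assumes scikit-learn's copy of the dataset (`fetch_california_housing`), which may differ from how the notebooks load it. The helper names `split_indices` and `load_split` are made up for this example.

```python
import random

# Column names as they appear in scikit-learn's copy of the dataset.
TARGET = "MedHouseVal"  # median house value, in units of $100,000
FEATURES = ["MedInc", "HouseAge", "AveRooms", "AveBedrms",
            "Population", "AveOccup", "Latitude", "Longitude"]


def split_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row positions reproducibly and cut them into (train, test) lists."""
    positions = list(range(n_rows))
    random.Random(seed).shuffle(positions)
    n_test = round(n_rows * test_fraction)
    return positions[n_test:], positions[:n_test]


def load_split(test_fraction=0.2, seed=0):
    """Fetch the dataset through scikit-learn and return (train_df, test_df)."""
    # Imported here so the helper above works without scikit-learn installed.
    from sklearn.datasets import fetch_california_housing

    frame = fetch_california_housing(as_frame=True).frame  # 8 features + target
    train_pos, test_pos = split_indices(len(frame), test_fraction, seed)
    return frame.iloc[train_pos], frame.iloc[test_pos]
```

Calling `load_split()` downloads the data on first use and returns two pandas DataFrames that share the same columns, so either tool below can be trained on the first and scored on the second.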
A hands-on guide using AutoGluon, an open-source AutoML toolkit by Amazon.
- Open this Google Colab notebook: Tutorial Link
- Go to File→Save a copy in Drive
- Click Connect (upper right corner) and start running cells
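The notebook's own code is not reproduced here. As a rough idea of what fitting AutoGluon on this dataset can look like, the sketch below assumes scikit-learn's copy of the data (target column `MedHouseVal`), an 80/20 split, and a two-minute training budget. The function names `predictor_settings` and `run_autogluon` are hypothetical and the notebook may be organized differently.

```python
LABEL = "MedHouseVal"  # target column in scikit-learn's copy of the dataset (assumption)


def predictor_settings(label=LABEL):
    """Constructor arguments for a regression predictor scored by RMSE."""
    return {
        "label": label,
        "problem_type": "regression",
        "eval_metric": "root_mean_squared_error",
    }


def run_autogluon(time_limit=120, seed=0):
    """Fit AutoGluon on an 80/20 split and return (scores, leaderboard)."""
    # Heavy imports live inside the function so defining it stays cheap.
    from autogluon.tabular import TabularPredictor
    from sklearn.datasets import fetch_california_housing

    frame = fetch_california_housing(as_frame=True).frame
    train_df = frame.sample(frac=0.8, random_state=seed)
    test_df = frame.drop(train_df.index)

    predictor = TabularPredictor(**predictor_settings()).fit(
        train_df,
        time_limit=time_limit,  # training budget in seconds
    )
    scores = predictor.evaluate(test_df)          # dict: metric name -> value
    leaderboard = predictor.leaderboard(test_df)  # one row per trained model
    return scores, leaderboard
```

Calling `run_autogluon()` trains several models and an ensemble within the time budget. AutoGluon reports metrics so that higher is better, which means the RMSE in `scores` appears with a negative sign.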
Try out TabPFN, a fast transformer-based model trained to approximate Bayesian posteriors.
- Open this Google Colab notebook: Tutorial Link
- Go to File→Save a copy in Drive
- Click Connect to run
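For orientation, a minimal sketch of using TabPFN's scikit-learn-style regressor on this dataset follows. It is not the notebook's code. It assumes the `tabpfn` package exposes `TabPFNRegressor` (as in the v2 release) and that the pretrained model is aimed at training sets of roughly 10,000 rows or fewer, so the training data is subsampled to 5,000 rows here. The helper names `subsample_positions` and `run_tabpfn` and the 5,000-row cap are choices made for this example.

```python
import random


def subsample_positions(n_rows, max_rows, seed=0):
    """Pick at most max_rows distinct row positions, reproducibly and in order."""
    if n_rows <= max_rows:
        return list(range(n_rows))
    return sorted(random.Random(seed).sample(range(n_rows), max_rows))


def run_tabpfn(max_train_rows=5000, seed=0):
    """Fit TabPFN on a subsample of the training split and return test RMSE."""
    # Heavy imports live inside the function so the helper above has no dependencies.
    from sklearn.datasets import fetch_california_housing
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split
    from tabpfn import TabPFNRegressor

    X, y = fetch_california_housing(return_X_y=True, as_frame=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )

    # Keep the training set within the size range assumed above.
    keep = subsample_positions(len(X_train), max_train_rows, seed)
    model = TabPFNRegressor()
    model.fit(X_train.iloc[keep], y_train.iloc[keep])

    predictions = model.predict(X_test)
    return mean_squared_error(y_test, predictions) ** 0.5
```

Calling `run_tabpfn()` downloads the pretrained weights on first use. Prediction is much faster on a GPU runtime, which Colab can provide.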
“Although TabPFN provides a powerful drop-in replacement for traditional tabular data models such as CatBoost, similar to these models, it is intended to be only one component in the toolkit of a data scientist. Achieving top performance on real-world problems often requires domain expertise and the ingenuity of data scientists. As with other modeling approaches, data scientists should continue to apply their skills and insights in feature engineering, data cleaning and problem framing to get the most out of TabPFN. We hope that the training speed of TabPFN will facilitate faster iterations in the data science workflow.”
Both tutorials run in Google Colab, so no local installation is needed.
To run locally:

    pip install autogluon
    pip install tabpfn

| Tool | Task | Highlights |
|---|---|---|
| AutoGluon | Regression | Fast, interpretable, ensemble-based |
| TabPFN | Regression | Fast transformer, few-shot learning |
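Since both tools are applied to the same regression task, a like-for-like comparison needs the same test rows and the same metric. The sketch below gives two dependency-free metric helpers that can be applied to either tool's predictions. The function names are made up for this example and neither library is required to use them.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 minus residual variance over total variance."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true is constant, so R^2 is undefined")
    return 1 - ss_res / ss_tot
```

For example, `rmse(list(y_test), list(predictions))` works on the output of either model once both are evaluated on the same held-out rows.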
Miscellaneous
Script: https://colab.research.google.com/drive/1cFR4n7N0WxUI2vtQiEzjbEzbgULWXmfI?usp=sharing
AutoGluon: https://auto.gluon.ai/
Part B: California Housing dataset and TabPFN
Script: https://colab.research.google.com/drive/1Mbkvk2egLQkVN6qCimvhIDCSyrX1OWxj?usp=sharing