Predict a suitable weather category based on a song's audio features (valence, acousticness, tempo, loudness, energy)
This starts the FastAPI backend + the Vite frontend.
- Backend Setup:
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt- Set Spotify credentials (go to Spotify Web API):
export SPOTIPY_CLIENT_ID="your_spotify_client_id"
export SPOTIPY_CLIENT_SECRET="your_spotify_client_secret"- Start the API (uses
backend/models/model.pklif present):
uvicorn backend.app.main:app --reload --port 8000- Start the frontend in a second terminal:
cd frontend
npm install
npm run dev- Open the app using the given
localhostserver.
Video: https://drive.google.com/file/d/1nc7trHjiscxEsSmc9bRy0GR-1T8RSImB/view?usp=sharing
This runs a simple prompt-driven predictor and does not use the web app.
- Ensure the Python environment is ready (same as Demo 1).
- Run the script and enter features when prompted:
python 'ml/features->weather.py'The user will be prompted for energy [0.0, 1.0], valence [0.0, 1.0], tempo (BPM), acousticness [0.0, 1.0], and loudness (dB) in order, and the predicted weather category is printed.
- Backend: Python - scikit-learn, pandas, matplotlib, FastAPI (served via Uvicorn)
- Frontend: React, Typescript, Tailwind CSS
.
├── backend
│ ├── api
│ └── app
├── data
├── frontend
│ ├── public
│ └── src
├── ml
│ └── results
├── README.md
└── requirements.txt
- Spotify Web API (song lookup)
- ReccoBeats API (audio features)
- OpenWeatherMap API (real-time weather context) (NOT USED IN CURRENT VERSION)
- Data source:
data/track_data.csv. We ultmately used 5201 labelled tracks for training/testing, around a third of which were pulled using the Spotify Web API from Spotify-generated and user-made playlists. As no other canonical dataset mapping songs to weather labels exists, and weather labels are weakly supervised, we prompted LLMS (ChatGPT and Anthropic) to generate supplemental data based on some explicit heuristics (e.g. melancholic and reflective ambiance ->rainy). - Features:
energy,valence,tempo,acousticness,loudness.StandardScalarwas used to standardize each feature. - Target label:
weather: one ofsunny,cloudy,rainy,snowy - Models in
ml/models.py, trained usingPipelineto avoid data leakage: Naive Bayes, Logistic Regression (baseline), Random Forest, Gradient Boosting (production)
Training and evaluation scripts:
python ml/train.py
python ml/evaluate.pyEvaluation metrics and diagnostics used in ml/evaluate.py:
- Weighted F1 on a simple holdout split was initially used, and we enhanced by using Stratified K-Fold cross-validation with weighted F1
- Permutation feature importance (PFI)
- Confusion matrices (visualized using matplotlib)
see ml/results/evaluate_results.txt for other final diagnostics
We found that energy, valence, tempo, acousticness, loudness, standardized using StandardScalar, were a set of features with consistent, moderately high PFI scores across multiple CV folds and all four models. A structured imbalance was present, with 'energy' (mean 0.224) slightly dominating weaker supporting features such as tempo (mean 0.113). We ultimately achieved 0.758 ± 0.014 weighted F1 with 5-fold stratified cross-validation on our Gradient Boosting production model.
We initially intended the Naive Bayes model to be a deliberately weak baseline, having reasoned that audio features such as 'energy' and 'loudness' are highly correlated and conditionally dependent. However we ended up with a surprisingly high 0.745 CV F1. Indeed, in our regime, where features tend to cluster across classes and predictive signal is distributed relatively evenly (see PFI), Naive Bayes can still effectively aggregate signals.
More expressive models such as Gradient Boosting are nonetheless able to achieve additional gains by modeling residual feature dependencies. For instance, Naive Bayes struggled on differentiating between the similar classes of snowy and rainy, but Gradient Boosting reduced from 146 to 59 such misclassifications.
- As of November 2024, Spotify API does not provide access to audio features. ReccoBeats was thus added for audio features but API experienced high latency. In the future we can experiment with caching song information for faster response times.
- In the future, sentiment/semantic analysis on song lyrics could be used to further our model. For example, our model classifies "Jingle Bell Rock" by Brenda Lee as
sunny, but the lyrics are definitely more indicative of asnowysong.