- Lambda School Project Explanation, Guidelines, and Team Roles
- Song Suggester Architecture
- Team GitHub
- A Guide to what other roles within our team are doing
Create an algorithm to recommend songs based on songs the user inputs. In production, Spotify uses a mixture of deep learning, collaborative filtering, persistent user 'taste profiles', frequent itemset mining, and some sort of neighbors algorithm based on taste profiles, time listened, etc. Competitor Pandora has hired musicologists to work on its 'Music Genome Project', which is probably more of a feature engineering effort driven by domain experts.
We ended up using only Nearest Neighbors, because we couldn't find a data set to do collaborative filtering (CF) on. Feature engineering included scraping genres, predicting the language of each track name, and running sentiment analysis on the track names. While we found several research papers on applying neural networks and LSTMs to nearest-neighbor problems, we were unable to find source code or pre-trained models, or to create a model ourselves in the time allotted.
Research into Siamese network matching algorithms looks more promising, as they train much faster, but we were also unable to create or find a ready-to-use model in the time allotted. Siamese networks also scale much better than Nearest Neighbors as the data set increases in size.
Read more: K-Nearest Neighbors and the K-Nearest Neighbors Documentation.
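
For readers new to the idea, here is a minimal sketch of nearest-neighbor recommendation on audio features. It is not our production model: the toy catalog, the track names, and the choice of three features (danceability, energy, valence) are assumptions made for illustration, and the search is brute force using only the standard library. On the full data set a library implementation such as scikit-learn's `NearestNeighbors` would be the more practical choice.

```python
import math

# Hypothetical toy catalog: track name -> audio features
# (danceability, energy, valence), all assumed to be scaled to 0..1.
CATALOG = {
    "song_a": (0.80, 0.90, 0.70),
    "song_b": (0.78, 0.85, 0.65),
    "song_c": (0.20, 0.10, 0.30),
    "song_d": (0.25, 0.15, 0.35),
    "song_e": (0.75, 0.95, 0.60),
}


def euclidean(u, v):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def recommend(track, catalog, n=2):
    """Return the names of the n tracks whose features are closest to `track`."""
    query = catalog[track]
    distances = [
        (euclidean(query, features), name)
        for name, features in catalog.items()
        if name != track  # never recommend the input song itself
    ]
    # Sort by distance (ties broken by name) and keep the n closest.
    return [name for _, name in sorted(distances)[:n]]


print(recommend("song_a", CATALOG))
```

Because every feature here is on the same 0..1 scale, plain Euclidean distance is reasonable. Features on other scales (tempo, loudness, duration) would need to be standardized first so that they do not dominate the distance.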
Pitch: Build an app that lets users browse and visualize the audio features of over 116k Spotify songs.
MVP: A user can search for a specific song and see its audio features displayed in a visually appealing way. The app also identifies songs with similar audio features.
DS:
- Build a model to recommend songs based on similarity to user input (I like song x, here are n songs like it based on these similar features)
- Create visualizations using song data to highlight similarities and differences between recommendations.
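
One possible way to prepare data for the similarity visualizations is to compute the per-feature gap between the input song and each recommendation, then hand those numbers to whatever charting layer the front end uses. The sketch below is only an illustration: the feature names, the dictionary shape, and the helpers `feature_gaps` and `text_bar` are hypothetical, and the text bar stands in for a real chart.

```python
# Assumed subset of audio features, each in the range 0..1.
FEATURES = ("danceability", "energy", "valence")


def feature_gaps(query, candidate):
    """Signed per-feature difference (candidate minus query), rounded to 3 places."""
    return {f: round(candidate[f] - query[f], 3) for f in FEATURES}


def text_bar(value, width=20):
    """Render a 0..1 value as a fixed-width text bar, a stand-in for a chart."""
    filled = round(value * width)
    return "#" * filled + "." * (width - filled)


# Hypothetical input song and one recommendation.
query = {"danceability": 0.80, "energy": 0.90, "valence": 0.70}
candidate = {"danceability": 0.75, "energy": 0.95, "valence": 0.60}

gaps = feature_gaps(query, candidate)
for name in FEATURES:
    # One row per feature: input bar, recommendation bar, signed gap.
    print(f"{name:<13}{text_bar(query[name])} {text_bar(candidate[name])} {gaps[name]:+.3f}")
```

Small gaps across all features indicate a close match, while a single large gap points to the feature that most distinguishes the recommendation from the input.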
https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/
https://www.reddit.com/r/MachineLearning/comments/2f8jff/using_neural_networks_for_nearest_neighbor/
https://towardsdatascience.com/how-to-build-a-simple-song-recommender-296fcbc8c85
https://arxiv.org/pdf/1605.09477.pdf
https://medium.com/@b.terryjack/nlp-pre-trained-sentiment-analysis-1eb52a9d742c
https://arxiv.org/pdf/1810.12575.pdf
https://blogs.cornell.edu/info4220/2016/03/18/spotify-recommendation-matching-algorithm/
Future teams are welcome to build upon our data set, which includes genres and predicted track languages: https://raw.githubusercontent.com/Build-Week-Spotify-Song-Suggester-5/Data-Science/master/spotify_unique_track_id_lang.csv
The script to predict sentiment is still running and will be done soon(TM), but if your machines are faster you can run it yourself: https://github.com/Build-Week-Spotify-Song-Suggester-5/Data-Science/blob/master/Sentiment_Feature_Script.ipynb
Note that it only runs sentiment analysis on the top 75,000 songs by popularity, because of database size constraints for the Flask app.
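
The popularity cutoff can be sketched roughly as follows. This is not the code from the notebook: the column names `track_id`, `track_name`, and `popularity`, the sample rows, and the helper `top_by_popularity` are assumptions for illustration, so check them against the real CSV header before reusing any of it.

```python
import csv
import heapq
import io


def top_by_popularity(rows, n):
    """Return the n rows with the highest popularity, most popular first."""
    return heapq.nlargest(n, rows, key=lambda row: int(row["popularity"]))


# Tiny in-memory stand-in for the real CSV (hypothetical columns and values).
SAMPLE_CSV = """track_id,track_name,popularity
t1,First,42
t2,Second,87
t3,Third,65
t4,Fourth,12
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
top = top_by_popularity(rows, n=2)  # the project would use n=75000
print([row["track_id"] for row in top])
```

`heapq.nlargest` avoids sorting the whole file when only the top slice is needed, although a full sort would also work fine at this data size.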