A semantic search engine that leverages the Spotify API to discover music through natural language queries. Uses SBERT for embeddings and FAISS for efficient vector similarity search.
```
User Query → SBERT Embedding → FAISS Search → Spotify Metadata → Results
                                    ↑
                             Vector Database
                          (10k+ track embeddings)
```
- Data Ingestion - Spotify API integration for track metadata and lyrics
- Embedding Generation - SBERT converts text into dense vectors (384- or 768-dimensional, depending on the model)
- Vector Storage - FAISS index for efficient similarity search
- Search Engine - Natural language query interface
- Python 3.8+
- Spotify Developer Account (for API credentials)
- ~2GB disk space (for model and vector index)
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install spotipy sentence-transformers faiss-cpu numpy pandas python-dotenv requests beautifulsoup4 lyricsgenius
```

- Go to https://developer.spotify.com/dashboard
- Create a new app
- Copy your Client ID and Client Secret
- Create a `.env` file in the project root:
```
SPOTIFY_CLIENT_ID=your_client_id_here
SPOTIFY_CLIENT_SECRET=your_client_secret_here
GENIUS_ACCESS_TOKEN=your_genius_token_here  # Optional, for lyrics
```

```
audio-semantic-discovery/
├── .env
├── config.py
├── spotify_ingestion.py
├── lyrics_scraper.py
├── embeddings_generator.py
├── faiss_indexer.py
├── search_engine.py
├── main.py
├── data/
│   ├── tracks.json
│   ├── tracks_with_lyrics.json
│   └── faiss.index
└── README.md
```
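The credentials above would typically be loaded once in `config.py`. A minimal sketch, assuming `python-dotenv` and `spotipy` are installed (the variable and function names here are illustrative, not the project's actual API):

```python
# config.py — illustrative sketch of loading .env credentials.
import os

from dotenv import load_dotenv
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

load_dotenv()  # reads .env from the project root

SPOTIFY_CLIENT_ID = os.getenv("SPOTIFY_CLIENT_ID")
SPOTIFY_CLIENT_SECRET = os.getenv("SPOTIFY_CLIENT_SECRET")
GENIUS_ACCESS_TOKEN = os.getenv("GENIUS_ACCESS_TOKEN")  # optional


def spotify_client() -> spotipy.Spotify:
    """Client-credentials flow: enough for public metadata, no user login."""
    auth = SpotifyClientCredentials(
        client_id=SPOTIFY_CLIENT_ID,
        client_secret=SPOTIFY_CLIENT_SECRET,
    )
    return spotipy.Spotify(auth_manager=auth)
```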
```python
from search_engine import AudioSemanticSearch

# Initialize search engine
search = AudioSemanticSearch()

# Build index (first time only)
search.build_index(num_tracks=10000)

# Search with natural language
results = search.search("upbeat songs about summer and dancing", top_k=10)

# Display results
for i, result in enumerate(results, 1):
    print(f"{i}. {result['name']} by {result['artist']} (Score: {result['score']:.3f})")
```

```bash
# 1. Ingest track data from Spotify
python spotify_ingestion.py --num_tracks 10000

# 2. Fetch lyrics for tracks
python lyrics_scraper.py

# 3. Generate embeddings
python embeddings_generator.py

# 4. Build FAISS index
python faiss_indexer.py

# 5. Run search interface
python main.py
```

- Natural Language Queries: Search using descriptions like "melancholic indie songs about lost love"
- Multi-field Embeddings: Combines track name, artist, genre, and lyrics
- Fast Retrieval: Sub-second search across 10k+ tracks using FAISS
- Semantic Understanding: Goes beyond keyword matching to understand meaning
Each track is represented by a concatenated text field:
"{track_name} by {artist}. Genre: {genres}. Lyrics: {lyrics_snippet}"
This is encoded using sentence-transformers/all-MiniLM-L6-v2 (384-dim) or all-mpnet-base-v2 (768-dim).
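The composition step can be sketched as follows; the function name and dict keys are assumptions for illustration, not the project's actual code:

```python
# Illustrative sketch of composing the text each track is embedded from.
# The function name and dict keys are assumptions, not the project's API.
def track_text(track: dict, max_lyrics_chars: int = 500) -> str:
    """Concatenate name, artist, genres, and a lyrics snippet into one string."""
    genres = ", ".join(track.get("genres", []))
    lyrics = (track.get("lyrics") or "")[:max_lyrics_chars]
    return f"{track['name']} by {track['artist']}. Genre: {genres}. Lyrics: {lyrics}"

# The resulting strings would then be batch-encoded, e.g.:
#   from sentence_transformers import SentenceTransformer
#   model = SentenceTransformer("all-mpnet-base-v2")
#   embeddings = model.encode(texts, normalize_embeddings=True)
```

Truncating lyrics keeps the text within the model's input window; SBERT models silently truncate overlong inputs anyway.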
- IndexFlatL2: Brute-force L2 distance (exact search, good for <100k vectors)
- IndexIVFFlat: Inverted file index with clustering (faster, slight accuracy trade-off)
- Indexing: ~100 tracks/second
- Search latency: <50ms for 10k tracks
- Memory: ~30MB of raw index data for 10k tracks (768-dim float32 = 10,000 × 768 × 4 bytes), plus several hundred MB for the loaded SBERT model
```python
from embeddings_generator import EmbeddingGenerator

gen = EmbeddingGenerator(model_name='all-mpnet-base-v2')
```

```python
results = search.search(
    query="energetic workout music",
    top_k=20,
    min_score=0.7,
    filters={'genre': 'electronic'}
)
```

```python
queries = [
    "sad piano ballads",
    "aggressive rap with heavy bass",
    "relaxing ambient music"
]
results = search.batch_search(queries, top_k=5)
```

Issue: Rate limiting from Spotify API
- Solution: Add delays between requests or cache responses
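One way to add those delays is an exponential-backoff wrapper. This helper is hypothetical, not part of the project; with spotipy specifically you could instead catch `spotipy.SpotifyException`, check for HTTP 429, and honour the `Retry-After` header.

```python
import time

def with_backoff(fn, retries=5, base_delay=1.0):
    """Call fn(), retrying with exponential backoff (1s, 2s, 4s, ...) on failure.

    Hypothetical helper: wrap any rate-limited call, e.g.
    with_backoff(lambda: sp.search(q="summer", type="track")).
    """
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: propagate the error
            time.sleep(base_delay * 2 ** attempt)
```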
Issue: Out of memory during indexing
- Solution: Process in batches or use smaller embedding model
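Batch processing can be as simple as slicing the work into fixed-size chunks. The `batched` helper below is illustrative, not the project's code:

```python
def batched(items, batch_size=256):
    """Yield successive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Illustrative use: encode and index chunk by chunk instead of all at once,
# so only one chunk of embeddings is held in memory at a time:
#   for chunk in batched(texts, batch_size=256):
#       index.add(model.encode(chunk))
```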
Issue: Poor search results
- Solution: Tune embedding model, add more metadata, or adjust similarity threshold
- Audio feature analysis (tempo, key, energy)
- Playlist generation from queries
- User preference learning
- Real-time streaming integration
- Multi-modal search (audio + text)
MIT License - feel free to use for personal or commercial projects.
Pull requests welcome! Please ensure code follows PEP 8 style guidelines.