Reduce RAM usage #18
Co-authored-by: Copilot <copilot@github.com>
Codecov Report
❌ Patch coverage is

Coverage Diff

| Metric | main | #18 | +/- |
|---|---|---|---|
| Coverage | 90.36% | 90.52% | +0.15% |
| Files | 36 | 37 | +1 |
| Lines | 1754 | 1868 | +114 |
| Branches | 122 | 130 | +8 |
| Hits | 1585 | 1691 | +106 |
| Misses | 130 | 137 | +7 |
| Partials | 39 | 40 | +1 |
Pull request overview
This PR reduces memory usage in the recommender stack by compacting in-memory feature-store identifiers/metadata and by normalizing genres into dedicated tables.
Changes:
- Store MBIDs in the feature index as 16-byte UUID values (`V16`) and optionally avoid loading raw feature matrices in production (DEBUG-only).
- Replace track genre string fields with FK references to `GenreDortmund`/`GenreRosamerica`, updating serializers and genre/track APIs accordingly.
- Update ingest pipeline, management command, and tests/factories to use compact genre codes and UUID MBIDs.
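As a rough illustration of the first change, the sketch below packs MBID strings into a NumPy array with the `V16` dtype and converts one entry back to its string form. The helper names `pack_mbids` and `unpack_mbid` and the sample MBIDs are hypothetical and not taken from this PR; the actual ingest code may do this differently.

```python
import uuid

import numpy as np


def pack_mbids(mbids):
    """Pack UUID strings into a NumPy array of 16-byte void values (dtype V16)."""
    raw = b"".join(uuid.UUID(m).bytes for m in mbids)
    return np.frombuffer(raw, dtype="V16")


def unpack_mbid(value):
    """Convert one 16-byte array element back to a canonical UUID string."""
    return str(uuid.UUID(bytes=value.tobytes()))


# Hypothetical sample identifiers, used only for this illustration.
mbids = [
    "123e4567-e89b-12d3-a456-426614174000",
    "00000000-0000-0000-0000-000000000001",
]

packed = pack_mbids(mbids)
as_strings = np.array(mbids)  # NumPy unicode array: 4 bytes per character

print(packed.nbytes)      # 16 bytes per MBID
print(as_strings.nbytes)  # 36 characters * 4 bytes per MBID
print(unpack_mbid(packed[0]))
```

Compared with a NumPy unicode array, which spends 144 bytes on each 36-character MBID, the packed form uses 16 bytes per entry. Plain Python `str` objects carry their own per-object overhead, so the saving against a list of strings would differ.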
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| backend/recommend_api/services/recommender.py | Loads compact MBID/genre arrays, adds feature accessors, and emits UUID strings in recommendation output. |
| backend/recommend_api/models.py | Introduces genre lookup tables and switches Track genres to FK relationships. |
| backend/recommend_api/migrations/0021_genredortmund_genrerosamerica_and_more.py | Adds genre tables and alters track genre fields to FKs. |
| backend/recommend_api/serializers.py | Exposes genre labels via FK (genre_*.label) and makes raw_features optional. |
| backend/recommend_api/api/track.py | Uses new FeatureStore accessors; conditionally includes raw features. |
| backend/recommend_api/api/genre.py | Lists genres from new genre tables rather than distinct track fields. |
| backend/recommend_api/tests/services/test_recommender.py | Updates service tests to use UUID MBIDs and numeric genre codes. |
| backend/recommend_api/tests/factories.py | Adds genre factories and updates TrackFactory to create FK genres. |
| backend/recommend_api/tests/api/test_track_api.py | Updates tests for feature endpoint to use new store accessors and UUID MBIDs. |
| backend/recommend_api/tests/api/test_genre_api.py | Updates genre API tests to seed genre tables directly. |
| backend/ingest/pipeline.py | Writes compact MBID bytes and uint16 genre codes into the NPZ feature/index artifact; populates genre tables. |
| backend/ingest/track_processing_helpers.py | Minor formatting/comment cleanup; retains genre labels during extraction for later coding. |
| backend/ingest/management/commands/recommend.py | Updates CLI to map numeric genre codes back to labels and improves robustness. |
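The genre-code round trip described for `pipeline.py` and `recommend.py` can be sketched as follows, assuming a simple sorted vocabulary. The function `encode_genres` and the array keys `genre_codes` and `genre_vocab` are hypothetical names chosen for this example; the PR populates genre tables in the database, which this sketch replaces with an in-file vocabulary array for brevity.

```python
import io

import numpy as np


def encode_genres(labels):
    """Map genre labels to uint16 codes; returns (codes, vocabulary)."""
    vocab = sorted(set(labels))
    code_of = {label: i for i, label in enumerate(vocab)}
    codes = np.array([code_of[label] for label in labels], dtype=np.uint16)
    return codes, vocab


labels = ["rock", "jazz", "rock", "pop"]
codes, vocab = encode_genres(labels)

# Write the codes and the vocabulary to an in-memory NPZ archive.
buf = io.BytesIO()
np.savez(buf, genre_codes=codes, genre_vocab=np.array(vocab))
buf.seek(0)

# Read the archive back and map the numeric codes to labels again.
with np.load(buf) as data:
    loaded_codes = data["genre_codes"]
    loaded_vocab = data["genre_vocab"].tolist()

decoded = [loaded_vocab[c] for c in loaded_codes]
print(decoded)
```

Each track then costs 2 bytes for its genre instead of a full string, and the label text is stored once per distinct genre.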
Pull request overview
Copilot reviewed 21 out of 22 changed files in this pull request and generated 1 comment.
Files not reviewed (1)
- frontend/package-lock.json: Language not supported
Summary
This PR reduces total RAM usage by not loading debug data (`feature_matrix_raw`) in production, by storing track IDs as UUID bytes instead of strings, and by storing genre labels as integer codes instead of strings.
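One way the DEBUG-only loading could look is sketched below. It relies on the fact that `np.load` reads an array from an `.npz` archive only when its key is accessed, so a key that is never touched is never brought into memory. The function `load_feature_store`, its `debug` parameter, and the `feature_matrix` key are assumptions made for this example; only `feature_matrix_raw` is named in the PR.

```python
import io

import numpy as np


def load_feature_store(source, debug=False):
    """Load arrays from an NPZ archive, reading the raw matrix only in debug mode."""
    with np.load(source) as data:
        store = {"feature_matrix": data["feature_matrix"]}
        # Arrays in an NPZ archive are read on access, so skipping this key
        # means the raw matrix is never loaded into memory.
        store["feature_matrix_raw"] = data["feature_matrix_raw"] if debug else None
    return store


# Build a small in-memory archive to exercise both modes.
buf = io.BytesIO()
np.savez(
    buf,
    feature_matrix=np.zeros((4, 3), dtype=np.float32),
    feature_matrix_raw=np.ones((4, 3), dtype=np.float64),
)

buf.seek(0)
prod_store = load_feature_store(buf, debug=False)
buf.seek(0)
debug_store = load_feature_store(buf, debug=True)

print(prod_store["feature_matrix_raw"])         # None in production mode
print(debug_store["feature_matrix_raw"].shape)  # raw matrix available in debug mode
```

In a Django project the `debug` flag would most likely come from `settings.DEBUG`, and callers would need to handle the `None` case, which matches the review note that `raw_features` became optional in the serializers.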