Conversation
Sync branches
- Check for empty responses before JSON parsing - Validate JSON structure immediately after receiving response - Add detailed error logging for diagnostics - Propagate meaningful errors to retry mechanism
…cent songs Co-authored-by: Blueion76 <128919662+Blueion76@users.noreply.github.com>
Co-authored-by: Blueion76 <128919662+Blueion76@users.noreply.github.com>
…ain.py Co-authored-by: Blueion76 <128919662+Blueion76@users.noreply.github.com>
Improve playlist variety: randomize context, add decade hints, track recent songs
…rts, label UnboundLocalError, indentation, and stray comments Co-authored-by: Blueion76 <128919662+Blueion76@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Blueion76 <128919662+Blueion76@users.noreply.github.com>
Clean up dev branch for production: fix runtime bug, remove debug artifacts, and correct style issues
There was a problem hiding this comment.
Pull request overview
This PR merges the dev branch into main, bringing several improvements to the OctoGen music playlist generation engine: empty response checking, JSON validation for the Gemini backend, improved playlist variety through random sampling and shuffling, cross-run song deduplication, and general code cleanup including indentation fixes and removal of stale comments/unused imports.
Changes:
- Added cross-run song deduplication (
_load_recent_songs/_save_recent_songs) and playlist variety improvements (random sampling of artists/genres/favorites, decade hints, song shuffling) in bothengine.pyandmain.py - Added empty response check and strict JSON validation in the Gemini generation path, plus fixed the Shannon entropy calculation (
bit_length()→math.log2()) - Code cleanup: fixed indentation in the hybrid playlist block, moved
labelvariable to correct scope, removed unused imports (asyncio,load_config_from_env), removed stale inline comments, and deduplicated an innerimport json
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
octogen/main.py |
Added random import; removed unused asyncio/load_config_from_env imports; moved label variable out of conditional block (bug fix); added random.shuffle(songs) for variety; used random.sample for artist/genre/favorited song selection; replaced deterministic seed song with random.choice; fixed indentation in hybrid playlist block; removed stale comments and duplicate import json |
octogen/ai/engine.py |
Added math import; fixed Shannon entropy calculation; added _load_recent_songs/_save_recent_songs for cross-run deduplication; added random sampling in _build_cached_context; extended _build_task_prompt with decade hints, recent songs avoidance section, and variety rules; added empty response check and JSON validation in Gemini path; persists recommended songs after generation |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| # Validate JSON structure | ||
| try: | ||
| json.loads(response.text) | ||
| except json.JSONDecodeError as e: | ||
| logger.error(f"Gemini returned invalid JSON: {e}") | ||
| logger.debug(f"Problematic response start: {response.text[:500]}") | ||
| raise ValueError("Invalid JSON response from Gemini") from e | ||
|
|
There was a problem hiding this comment.
The strict JSON validation here will reject responses with minor JSON issues (e.g., markdown fencing like ```json ... ```, trailing commas, unescaped characters) and raise a ValueError. This ValueError propagates through _generate_with_retry (which only retries rate-limit errors) and is caught in generate_all_playlists as a generic "api_error", completely bypassing the extensive JSON repair logic downstream (lines 710-754: fence stripping, repair_json, regex cleanup, truncation fallbacks).
Since response_mime_type="application/json" is already set in the Gemini config (line 578), the API should usually return valid JSON. But the existing repair chain was clearly added because malformed responses do occur in practice. This strict pre-validation converts previously recoverable situations into hard failures that consume an API call with no result.
Consider removing this strict JSON validation, or moving it to a warning-only check (log but don't raise), and let the existing downstream repair pipeline handle malformed responses.
| # Validate JSON structure | |
| try: | |
| json.loads(response.text) | |
| except json.JSONDecodeError as e: | |
| logger.error(f"Gemini returned invalid JSON: {e}") | |
| logger.debug(f"Problematic response start: {response.text[:500]}") | |
| raise ValueError("Invalid JSON response from Gemini") from e | |
| # Validate JSON structure (warning-only; allow downstream repair logic to handle issues) | |
| try: | |
| json.loads(response.text) | |
| except json.JSONDecodeError as e: | |
| logger.warning("Gemini returned non-strict JSON; deferring to downstream repair logic: %s", e) | |
| logger.debug("Problematic response start: %s", response.text[:500]) |
| def _load_recent_songs(self) -> set: | ||
| """Load recently recommended songs from disk. | ||
|
|
||
| Returns: | ||
| Set of "artist - title" strings from recent runs, or empty set on failure. | ||
| """ | ||
| recent_file = self.data_dir / "recent_playlist_songs.json" | ||
| try: | ||
| if recent_file.exists(): | ||
| with open(recent_file, 'r', encoding='utf-8') as f: | ||
| data = json.load(f) | ||
| if isinstance(data, list): | ||
| return set(data) |
There was a problem hiding this comment.
_load_recent_songs() returns a set, which has no guaranteed iteration order. When this is converted back to a list via list(existing) on line 309, the "existing first (oldest)" ordering described in the comment is not preserved — the existing entries will be in arbitrary order. This means ordered[-200:] may drop recently recommended songs rather than the oldest ones.
To fix this, _load_recent_songs should return a list (preserving the on-disk order) instead of converting to a set. The deduplication in _save_recent_songs via the seen set already handles duplicates, so the return type can safely be list.
Added empty response check, JSON validation, improved playlist variety, general code cleanup/polishing