Skip to content

fix(MusicService): extend cache-corruption recovery to malformed container parse errors#812

Open
adam-adrian wants to merge 3 commits into
ArchiveTuneApp:devfrom
adam-adrian:fix/extend-corrupt-cache-recovery
Open

fix(MusicService): extend cache-corruption recovery to malformed container parse errors#812
adam-adrian wants to merge 3 commits into
ArchiveTuneApp:devfrom
adam-adrian:fix/extend-corrupt-cache-recovery

Conversation

@adam-adrian

Copy link
Copy Markdown
Contributor

Pull Request

Summary

This is a follow-up PR that extends the newly merged cache-corruption recovery path in MusicService to handle container-parsing errors (such as the 3001 ParserException experienced in the field) when they occur on cached media.

Previously, isCacheCorruptionError() only listened for ERROR_CODE_IO_UNSPECIFIED and ERROR_CODE_IO_READ_POSITION_OUT_OF_RANGE. This meant that a corrupted or truncated cached MP4 file throwing a 3001 container-parse error (ERROR_CODE_PARSING_CONTAINER_MALFORMED) would bypass the recovery block, fall back to the remote parser retry path (which simply retries without purging the cache), and cause an infinite playback failure/retry loop for the user on revisit.

This PR solves that by:

  1. Gating isCacheCorruptionError() on both IO and container-parse errors.
  2. Introducing a hasAnyCachedData signal (which checks if any cached spans, including partial ones, exist) to ensure we only treat parse failures as cache corruption when there are cached bytes on disk.
  3. Purging the corrupted cache when matched, so subsequent preparation cleanly fetches fresh, uncorrupted bytes.

Linked Work

  • Closes: -
  • Related: -

Change Type

  • Feature
  • Bug fix
  • UI / UX
  • Performance / memory
  • Playback / Media3
  • Lyrics / provider integration
  • Search / YouTube / network data
  • Local library / Room database
  • Widgets / notification / shortcuts
  • Settings / preferences / DataStore
  • Discord / Last.fm / ListenBrainz / external integration
  • Localization / strings / fastlane metadata
  • Build / Gradle / CI / release packaging
  • Dependency update
  • Documentation only

Affected Surfaces

  • :app
  • :innertube
  • :betterlyrics
  • :kugou
  • :lrclib
  • :lastfm
  • :simpmusic
  • :paxsenix
  • :unison
  • :canvas
  • :shazamkit
  • :spotifycore
  • server
  • fastlane
  • .github
  • Other: MusicService.kt playback error-handling path only

Screenshots / Recordings

Before After
N/A (no UI change) N/A (no UI change)

Behavior Notes

  • Trigger Paths:
    • When a playback error occurs, if it's a parsing error (3001), we check if there are cached spans (fully or partially) for the media ID.
    • If cached bytes exist (hasAnyCachedData == true) and isCacheCorruptionError matches a ParserException or typical corrupt atom messages ("Invalid integer size", "Skipping atom with length", "contentIsMalformed=true"), we mark it as cache corruption.
    • The recovery logic then purges the streaming cache, purges the download cache (if it wasn't a completed download), and re-prepares after the purge on the main thread.
  • Fallbacks preserved:
    • Genuinely broken remote stream URLs (uncached) will not trigger a cache purge. They correctly fall through to isRetryableRemoteParserFailure() to re-resolve the URL.
    • Local files are ignored entirely (!isLocalMedia).

Architecture Checklist

  • The change preserves UDF flow: UI -> ViewModel -> UseCase/domain -> Repository/data.
  • Screen state is represented structurally with Loading, Success, Empty, and Error where this PR introduces or changes screen state.
  • Composables receive immutable, UI-specific domain models instead of raw Room, network, or service entities.
  • Business work is not triggered directly from composition.
  • Exceptions are surfaced through explicit state or result types instead of being swallowed.
  • No runBlocking is introduced in app execution paths.

Compose / Material Checklist

  • UI state is hoisted out of composable layout code.
  • Reactive state is collected with collectAsStateWithLifecycle().
  • New UI models are annotated with @Immutable or @Stable.
  • Lazy layouts use stable key values and explicit contentType.
  • Non-primitive constants, structural lambdas, and allocation-heavy objects in hot paths are remembered.
  • Rapidly changing inputs such as scroll, gesture, animation, or playback progress use derivedStateOf where appropriate.
  • UI strings come from stringResource() and duplicated visible strings on the same screen are avoided.
  • Interactive elements keep a minimum 48dp touch target.
  • Material 3 / Material 3 Expressive tokens are used for color, typography, shape, and motion instead of hardcoded UI values.
  • Edge-to-edge layouts handle and consume WindowInsets correctly when this PR changes screen layout.

Concurrency / Performance Checklist

  • Business coroutines are scoped to viewModelScope or an existing lifecycle-owned application/service scope.
  • Disk, database, network, parsing, and heavy mapping work is dispatched to Dispatchers.IO or Dispatchers.Default.
  • Cancellation is handled explicitly where long-running work, playback, downloads, sync, lyrics fetching, or recognition is involved.
  • The change avoids blocking the main thread.
  • The change avoids unnecessary allocations in recomposition loops, playback loops, polling loops, and provider parsing paths.
  • Large lists, images, lyrics payloads, and network responses are bounded, streamed, cached, paged, or mapped off the main thread as appropriate.

Data / Persistence Checklist

  • Room entity, DAO, or database changes include a schema update under app/schemas.
  • Database migrations preserve existing user data and fail clearly when migration is impossible.
  • DataStore or preference-key changes are backward compatible.
  • Network DTOs remain isolated from UI models.
  • Provider responses handle missing, malformed, regional, or rate-limited data without crashing the app.

Playback / Integration Checklist

  • Media3 playback, queue, cache, and service behavior remains stable across foreground, background, notification, and process recreation paths.
  • Audio focus, notification controls, widgets, shortcuts, and media session actions are considered when playback behavior changes.
  • External integrations handle absent credentials, revoked auth, network failure, and API changes.
  • Native, AAR, or ABI-sensitive changes account for mobile/tv and universal, arm64, armeabi, x86, and x86_64 variants.

Localization / Assets Checklist

  • User-facing strings are added to the base resources and translated resources are updated or intentionally left for translation follow-up.
  • Unused string resources, drawables, and metadata are removed when replaced.
  • Image assets have explicit display dimensions and are decoded at the displayed size.
  • Fastlane metadata, screenshots, icons, or release text are updated when user-facing store behavior changes.

Privacy / Security Checklist

  • No secrets, keys, tokens, keystores, signing files, private certificates, or local machine paths are committed.
  • Logs do not expose access tokens, cookies, auth headers, user identifiers, listening history, or local file paths.
  • New network calls are justified by the feature and use existing client, proxy, timeout, and error-handling patterns.
  • User data remains local unless the PR explicitly documents the integration and consent path.

Verification

  • [] Android Studio sync:
  • Manual device/emulator verification: Verified that when a cached item encounters a malformed parsing exception (e.g., simulating corrupted container bytes), the cache is wiped successfully and re-prepared from current position instead of loop-crashing. Uncached items encountering parse failures fall back to re-resolving correctly.
  • [] UI screenshot/recording attached: N/A (no visual changes)
  • [] Accessibility or touch-target review: N/A
  • Regression areas checked: Stream URL re-resolving, bot-detection, and standard remote parser failure flows still trigger correctly.
  • CI expectation: Service compiles cleanly.

Reviewer Focus

  • moe.rukamori.archivetune.playback.MusicService.kt:
    • isCacheCorruptionError(): Check the added conditions for container-parse errors and how isContentCached (from hasAnyCachedData) protects remote-only streams from having their cache cleared pointlessly.
    • hasAnyCachedData: Check the lightweight query of getCachedSpans on both downloadCache and playerCache to verify partial downloads are accounted for.

Release Notes

Fixed an issue where playback would get stuck in a failure loop with error code 3001 (ParserException) when encountering a corrupted or truncated song file in the cache. ArchiveTune now automatically clears the bad segment and retries.

 cached media as cache corruption

Reported in the field: "Source error / Code: 3001 / ParserException:
Invalid integer size: 87 {contentIsMalformed=true, dataType=1}".

Error code 3001 is ERROR_CODE_PARSING_CONTAINER_MALFORMED, not an IO
error, so isCacheCorruptionError() returned early and never purged the
cache. The error fell through to isRetryableRemoteParserFailure(), which
only re-prepares (and is gated by !isFullyCachedMedia) without removing
the corrupted cached segment, so re-prepare re-read the same bad bytes
and the user kept hitting 3001 on revisit.

Expand isCacheCorruptionError() to also match
ERROR_CODE_PARSING_CONTAINER_MALFORMED / _UNSUPPORTED and detect
ParserException / "Invalid integer size" / "Skipping atom with length" /
contentIsMalformed in the cause chain. To avoid hijacking genuinely
malformed *remote* streams, the parse-error branch only counts as cache
corruption when the content is actually cached; an isContentCached hint
(hasAnyCachedData, including partial spans) is passed from onPlayerError.
Remote-only parse failures still fall through to the existing
re-resolve-URL retry path.
…e recovery

Remove PlaybackException.ERROR_CODE_PARSING_CONTAINER_UNSUPPORTED (3002) from isCacheCorruptionError().

An unsupported container represents a structural codec limitation, not file/cache corruption. Purging and re-downloading would not help in this case, so we narrow the check to strictly ERROR_CODE_PARSING_CONTAINER_MALFORMED (3001).
@sang765 sang765 added the BUG Something isn't working label Jun 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

BUG Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants