feat(tr): host-DMA double-buffer for tear-free bitmap video on no-REU backends #42
Merged
… backends

The TeensyROM's cycle-clean bus DMA can't rewrite a full mhires frame in the visible VIC bank without tearing (the per-cell "sparkle"). Profiling confirmed it's the bus, not the link (serial and TCP both ~106 KiB/s), so a faster transport doesn't help; the emulated REU moves bytes over the SAME bus DMA, so REU staging can't help either.

The fix that needs no REU: write each frame's bitmap+screen into the OFF-screen VIC bank over the normal host-DMA path, then flip $DD00 at vblank via a tiny raster IRQ. The visible bank is never written mid-display, so every shown frame is whole: tear-free at the same ~10-12 fps.

- modes.py: HOSTDMA_SWAP_IRQ_HANDLER (35-byte minimal handler at $C500 + 3-byte tracker at $C700), reusing the existing _install/_uninstall_bank_swap_irq; double_buffer setup/push/teardown on Hires + MultiHires via shared base helpers. Unlike the REU path the IRQ does NO in-IRQ DMA, so the swap lands inside vblank with no shimmer, and folded text overlays render crisply.
- config.py: [video].double_buffer tri-state (true|false|"auto", default auto) + resolve_double_buffer. auto enables it for bitmap modes when REU staging is off (mutually exclusive, since both flip $DD00) AND the backend has no REU at all (the TR); the U64 is left untouched. Threaded through build_scene (api.profile.supports_reu) and the slideshow random-mode rebuild.
- c64.py: per-bank RegionIDs so each VIC bank diffs against its own content.

Residual: mhires color RAM ($D800) is not VIC-banked, so the c3 slot still tears briefly before each flip; bitmap+screen (structure + c1/c2) go tear-free. Hires (no color RAM) and static-palette mhires (cheap/grayscale) are fully tear-free.

HW-verified on TeensyROM over serial: visibly smoother playback, no errors.
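The page flip described above comes down to two pieces of bookkeeping: which 16 KiB VIC bank receives the next frame, and which value to store in $DD00 to show it. The following is a minimal host-side sketch, not the code from modes.py. It relies only on the standard C64 fact that bits 0-1 of $DD00 select the VIC bank in inverted form (%11 is bank 0, %00 is bank 3). The names `bank_base`, `dd00_value`, and `DoubleBuffer` are hypothetical, and the choice of banks 1 and 2 is an arbitrary assumption; the PR text does not say which pair is used.

```python
from dataclasses import dataclass

VIC_BANK_SIZE = 0x4000  # each of the four VIC banks spans 16 KiB


def bank_base(bank: int) -> int:
    """Return the CPU address where the given VIC bank (0-3) starts."""
    if bank not in range(4):
        raise ValueError(f"VIC bank must be 0-3, got {bank}")
    return bank * VIC_BANK_SIZE


def dd00_value(current: int, bank: int) -> int:
    """Return a new $DD00 value selecting `bank`, preserving bits 2-7.

    The bank-select bits are inverted: bank 0 -> %11, bank 3 -> %00.
    """
    if bank not in range(4):
        raise ValueError(f"VIC bank must be 0-3, got {bank}")
    return (current & 0xFC) | (~bank & 0x03)


@dataclass
class DoubleBuffer:
    """Hypothetical tracker for a visible/off-screen bank pair."""

    visible: int = 1    # assumption: bank pair chosen for illustration only
    offscreen: int = 2

    def write_target(self) -> int:
        """Base address of the bank that is safe to write right now."""
        return bank_base(self.offscreen)

    def flip(self, dd00: int) -> int:
        """Swap the two roles and return the $DD00 value to store at vblank."""
        self.visible, self.offscreen = self.offscreen, self.visible
        return dd00_value(dd00, self.visible)


if __name__ == "__main__":
    db = DoubleBuffer()
    print(f"write next frame at ${db.write_target():04X}")
    print(f"then store ${db.flip(0x96):02X} in $DD00 at vblank")
```

In the PR the store to $DD00 is performed by the raster IRQ handler on the C64 side; this sketch only models the values involved, so the host never needs to touch the visible bank.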
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main      #42      +/-   ##
==========================================
+ Coverage   79.57%   79.68%   +0.11%
==========================================
  Files          68       68
  Lines       12861    12941      +80
  Branches     1898     1909      +11
==========================================
+ Hits        10234    10312      +78
- Misses       2188     2190       +2
  Partials      439      439
Problem
On the TeensyROM, single-buffered mhires video tears (the per-cell "sparkle") because the per-frame screen/color/bitmap writes are non-atomic and the slow cycle-clean bus DMA lets the VIC scan a half-updated visible bank. Profiling this session showed the bottleneck is the bus DMA, not the link: serial and TCP both sustain ~106 KiB/s, and the TR's emulated REU moves bytes over that same bus DMA, so neither a faster transport nor REU staging can fix it. The U64's use_reu_staged double-buffer isn't available here.

Fix
Host-DMA double-buffering (page flip), no REU. Write each frame's bitmap+screen into the off-screen VIC bank over the normal host-DMA path, then flip $DD00 at vblank via a tiny raster IRQ. The visible bank is never written mid-display, so every shown frame is whole and tear-free at the same ~10–12 fps. Because the IRQ does no in-IRQ DMA (unlike the REU handler's ~9000-cycle copy), the swap lands inside vblank with no shimmer, so it also works with text overlays, which the REU path can't claim.

What's here
- modes.py: HOSTDMA_SWAP_IRQ_HANDLER, a 35-byte minimal handler at $C500 with a 3-byte tracker ([bg0, bank, ready]) at $C700, reusing the existing _install_bank_swap_irq/_uninstall_bank_swap_irq. double_buffer setup/push/teardown on Hires + MultiHires via shared BitmapDisplayMode helpers.
- config.py: [video].double_buffer tri-state (true | false | "auto", default auto) + resolve_double_buffer. auto enables it for bitmap modes when REU staging is off (mutually exclusive, since both flip $DD00) and the backend has no REU at all (the TR); the U64 is left on its existing paths. Threaded through build_scene (api.profile.supports_reu) and the display = "random" slideshow rebuild. force_host_dma (SID-audio scenes) disables it too.
- c64.py: per-bank RegionIDs so each VIC bank diffs against its own prior content.

Residual (documented, not over-engineered)
mhires color RAM ($D800) isn't VIC-banked, so the c3 slot still tears in a brief (~9 ms) window before each flip; bitmap+screen (the structure + c1/c2) go tear-free. Hires (no color RAM) and static-palette mhires (cheap/grayscale) are fully tear-free.

Verification
- resolve_double_buffer truth table + load-time validation (tests/test_config.py); structure tests for setup/push/teardown on the off-screen bank + 3-byte tracker (tests/test_bitmap_compose.py).
- --strict, pyright, 1581 tests; schema drift clean.
- host-DMA double-buffer armed path resolved.

Independent of (and composes with) the bg0-hysteresis fix in the separate PR.
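The ~10–12 fps figure and the ~9 ms color-RAM window quoted in this description are consistent with simple throughput arithmetic. The sketch below is a back-of-the-envelope check, not a measurement. It assumes every frame is rewritten in full at the quoted ~106 KiB/s with no diffing and no protocol overhead, and it uses the standard C64 multicolor-bitmap sizes (8000-byte bitmap, 1000-byte screen matrix, 1000 bytes of color RAM). All names are hypothetical.

```python
# Throughput quoted in the PR text for both serial and TCP.
THROUGHPUT_KIB_S = 106

# Standard C64 multicolor-bitmap sizes for a 40x25-cell screen.
BITMAP_BYTES = 8000  # bitmap data
SCREEN_BYTES = 1000  # screen matrix, one byte per cell
COLOR_BYTES = 1000   # color RAM at $D800, one entry per cell


def transfer_seconds(n_bytes: int, kib_per_s: float = THROUGHPUT_KIB_S) -> float:
    """Time to move n_bytes at the given sustained throughput."""
    return n_bytes / (kib_per_s * 1024)


def frames_per_second(frame_bytes: int) -> float:
    """Upper bound on frame rate when every frame is rewritten in full."""
    return 1.0 / transfer_seconds(frame_bytes)


# Full mhires frame: bitmap + screen + color RAM.
full_frame_fps = frames_per_second(BITMAP_BYTES + SCREEN_BYTES + COLOR_BYTES)

# Only the VIC-banked part (bitmap + screen), as for static-palette frames.
banked_only_fps = frames_per_second(BITMAP_BYTES + SCREEN_BYTES)

# Time spent rewriting color RAM, which cannot be double-buffered.
color_window_ms = transfer_seconds(COLOR_BYTES) * 1000

if __name__ == "__main__":
    print(f"full frame:      {full_frame_fps:.1f} fps")
    print(f"bitmap + screen: {banked_only_fps:.1f} fps")
    print(f"color RAM write: {color_window_ms:.1f} ms")
```

Under these assumptions the estimate is roughly 10.9 fps for a full frame, 12.1 fps for bitmap+screen alone, and about 9.2 ms for the color-RAM write, which lines up with the figures above. Per-bank diffing would reduce the bytes actually sent, so real frame rates can differ.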