fix(producer): rebuildExtractedFramesFromPlanDir off-by-one in framePaths key indexing#1730
Conversation
|
cc @jrusso1020 |
There was a problem hiding this comment.
The one-line fix is the right direction: packages/producer/src/services/distributed/renderChunk.ts:197 now matches the extractor/consumer's 0-based framePaths contract instead of the old filename-numbering assumption. I also reproduced the failure locally through the real plan -> renderChunk -> assemble distributed-simulated path.
blocker: packages/producer/src/services/distributed/renderChunk.ts:197 changes the production distributed frame lookup contract, but this PR adds no regression coverage. The bug existed because distributed reconstruction drifted from the extractor/engine contract, and the existing suite did not assert that rebuilt framePaths resolves key 0 at a video clip's first paint. Please add a focused test before merge. Good shapes: a unit test for rebuildExtractedFramesFromPlanDir/FrameLookupTable asserting t=start returns the first extracted frame, or a small distributed-simulated fixture with back-to-back videos that fails on the previous i + 1 indexing.
CI note: current PR checks are still pending/failing, so I did not treat CI as merge-ready either.
Verdict: REQUEST CHANGES
Reasoning: The fix is correct, but the exact distributed-only contract that caused the production bug needs to be pinned so this cannot regress silently.
|
Added the regression test you asked for at 3df8a8f — pins the 0-based Verified the test fails against the previous Test lives in its own file ( — Jerrai |
|
No more changes from my side. |
Merge activity
|
|
@Tzuhany would you be able to sign your commit? we can't merge this at the moment without that otherwise I can recreate the fix myself in a new branch |
|
Setting up signing now — give me ~5 min and I'll force-push a signed version of my commit. |
…aths key indexing In distributed chunk-lambda render mode, every <video>'s first-paint frame (the moment a vid first becomes visible on the composition timeline) renders as PRISTINE Y=16 black. For a 3-vid back-to-back composition (v1: 0-4s, v2: 4-8s, v3: 8-12s at 30fps), frames 0, 121, 242 are all PRISTINE black; the render then either stays black for 1 frame, or shows body bg + persistent overlays only (Y~22 with sparse highlights from text/logo). The symptom only reproduces in distributed mode — local single-process renders are unaffected. Root cause: rebuildExtractedFramesFromPlanDir builds the framePaths Map with 1-based keys, but the consumer (getFrameAtTime at engine/videoFrameExtractor.ts:958) computes a 0-based frame index via Math.floor(localTime * fps + 1e-9). For each vid's first-paint frame (localTime === 0 → frameIndex === 0), framePaths.get(0) returns undefined; the vid is silently dropped from activePayloads, videoFrameInjector doesn't fire, syncVideoFrameVisibility hides everything, and BeginFrame screenshots an empty composition. Every other site in the codebase builds/consumes framePaths with 0-based keys: - engine/videoFrameExtractor.ts:317 framePaths.set(index, ...) - engine/extractionCache.ts:204 framePaths.set(idx, ...) - engine/videoFrameExtractor.test.ts:264/1066 framePaths.set(i, ...) - engine/videoFrameExtractor.ts:958 (consumer) Math.floor 0-based - producer/renderOrchestrator.test.ts:287/315/349 framePaths.get(0) Only producer/distributed/renderChunk.ts:198 was 1-based, with a stale comment claiming FrameLookupTable indexes frames 1-based — which the surrounding evidence contradicts. This is why local tests pass while distributed-lambda renders always had cold black at each vid first paint. Verified locally against a 3-vid composition and a single-vid 4-worker case in a Lambda render fleet. Before fix: every vid first-paint frame is PRISTINE Y=16 black. After fix: all frames are valid source content, blackdetect reports zero black regions outside legitimate source video content (intentional fade-ins / hard cuts in source mp4).
…ths contract Regression guard for the off-by-one fix in HF#1730. The pre-fix code indexed framePaths 1-based while the consumer (getFrameAtTime in engine/videoFrameExtractor.ts:958) reads 0-based, dropping every <video>'s first-paint frame in distributed chunk-lambda renders. Asserts framePaths.get(0) resolves to the first extracted frame, and framePaths.get(N-1) resolves to the last — pre-fix the keys were shifted to [1..N], so get(0) returned undefined and get(N) resolved. Verified to fail against the previous i+1 indexing. Also exports rebuildExtractedFramesFromPlanDir (was module-local) so the test can call it directly. Pure logic worth testing in isolation — the bug only reproduces under distributed mode and the existing renderChunk.test.ts already pays a multi-second Chrome smoke probe in its module-level beforeAll, so the regression check lives in its own file (rebuildExtractedFrames.test.ts) and runs Chrome-free in ~10ms. The function's doc comment said "1-based framePaths" — updated to "0-based" with a pointer to the consumer site and the bug context. Per Miguel's REQUEST_CHANGES on HF#1730. — Jerrai (https://claude.com/claude-code)
3df8a8f to
15e6d13
Compare
|
Done — force-pushed both commits re-signed (your authorship on the test commit is preserved, just the signature is mine since I rebased through them). Both show verified on the commit list now. @jrusso1020 @miguel-heygen — the new workflow runs are sitting in |
Closes #1731
Hit a 1-frame black flash at every
<video>boundary when rendering through the distributed chunk-lambda path. Local single-process renders were fine, so it didn't show up until things hit lambda.It's
rebuildExtractedFramesFromPlanDirinrenderChunk.ts:That comment isn't right —
getFrameAtTimeinengine/videoFrameExtractor.ts:958looks up withMath.floor(localTime * fps + 1e-9), which is 0-based. Every video's first-paint frame (localTime === 0→frameIndex === 0) looks upframePaths.get(0)and getsundefined. The vid then drops out ofactivePayloads, inject never fires for it, and the BeginFrame screenshot shows just the body background plus any persistent overlays you have on top — Y≈22 with sparse highlights from text/logo, basically a single-frame "almost black" flash at every vid start.Every other place in the codebase that builds or reads this Map is 0-based —
engine/videoFrameExtractor.ts:317andengine/extractionCache.ts:204bothset(i, ...), the consumer is 0-based, andrenderOrchestrator.test.tsassertsframePaths.get(0)is the first frame.renderChunkis the only 1-based one, which is why local tests don't catch it.Verified against a 3-vid back-to-back composition and a single-vid render split across 4 workers in chunk-lambda. Before: per-frame
signalstatsshows YAVG ≈ 16-22 at every first-paint frame. After:blackdetectonly picks up legitimate source content (fade-ins / hard cuts in the source mp4s).FWIW the diagnostic log we added on the
framePaths.getmiss made it pretty obvious: