Skip to content

feat(core): duck() programmatic audio ducking (re-submit of #1337)#1747

Open
mvanhorn wants to merge 2 commits into
heygen-com:mainfrom
mvanhorn:feat/1337-duck-js-api
Open

feat(core): duck() programmatic audio ducking (re-submit of #1337)#1747
mvanhorn wants to merge 2 commits into
heygen-com:mainfrom
mvanhorn:feat/1337-duck-js-api

Conversation

@mvanhorn

Copy link
Copy Markdown
Contributor

Re-submit of #1337 along the lines you suggested: instead of new data-duck/data-duck-fade/data-role HTML attributes, this adds a programmatic duck(musicTrack, voiceTracks, options) helper that writes deterministic compile-time volume keyframes into the same envelope used by hand-authored fades. No new HTML schema attributes; htmlCompiler/audioElementParser/the schema docs are untouched.

  • amount accepts linear gain or dB (default -12dB), fade ramp defaults to 0.3s, optional GSAP-compatible timeline receives the ramps.
  • New packages/core/src/runtime/audioDucking.ts + tests; exported from @hyperframes/core.
  • Full core suite green locally; oxlint/oxfmt/fallow clean on the changed files.

Happy to attach a rendered music-under-voiceover demo if useful.

…yframes

Adds a duck(musicTrack, voiceTracks, options) helper that lowers a music
track whenever any voice track is audible, emitting deterministic
compile-time volume keyframes into the same envelope used by hand-authored
fades. Replaces the previous data-duck/data-duck-fade/data-role HTML
attribute approach (heygen-com#1337) with a JS API per maintainer guidance — no new
HTML schema attributes.

amount accepts linear gain or dB (default -12dB); fade ramp defaults to
0.3s; an optional GSAP-compatible timeline receives the ramps.

@miguel-heygen miguel-heygen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the overall direction here: keep ducking framework-native by writing ordinary media .volume ramps onto the captured timeline, not by adding another public data-duck schema path. duck() is the right ergonomics layer over the existing volume-automation primitive.

blocker: preview and render disagree for ducked/probed volume automation when the music track starts at a non-zero data-start. The new helper computes/writes keyframes in composition time: element timing is resolved from data-start (packages/core/src/runtime/audioDucking.ts:118), the duck points are added at absolute overlap/ramp times (audioDucking.ts:208-214), and timeline writes use those same absolute first.time / previous.time values (audioDucking.ts:253-265). The existing probe path also samples absolute composition time (packages/core/src/runtime/mediaVolumeEnvelope.ts:90-119). Render handles that correctly by normalizing absolute keyframes with the track start before baking (packages/engine/src/services/audioVolumeEnvelope.ts:83-90). Preview does not: init.ts attaches the raw probed keyframes directly to the runtime clip (packages/core/src/runtime/init.ts:1652-1657), then syncRuntimeMedia computes a track-relative relTime (packages/core/src/runtime/media.ts:169) and interpolates the raw keyframes against that relative value (media.ts:192-197).

Concrete failure: if music has data-start="5" and ducking produces keyframes around 5.75/6/7/7.25, render subtracts 5 and ducks during track-relative 0.75-2.25. Preview queries those raw keyframes at relTime 0.75-2.25, so it stays on the first envelope segment and misses the duck. The tests only cover start=0 (packages/core/src/runtime/audioDucking.test.ts:124-133 normalizes with trackStart 0; the DOM media case uses music.dataset.start = "0" at audioDucking.test.ts:136-155), so this parity break is unpinned.

Please fix the preview path to use the same time basis as render and add a regression with non-zero music data-start (either normalize keyframes before attaching them to runtime clips, or interpolate with composition time, but keep preview/render semantics identical).

Non-blocking API note: the no-new-attribute choice is correct. I would just clarify in docs/API that plain timing objects are for computing keyframes; DOM media elements + the captured timeline are the render-affecting path. A future split into computeDuckKeyframes(...) and duck(...timeline...) would make that even clearer, but it should not block once the time-basis bug is fixed.

Verdict: REQUEST CHANGES
Reasoning: The helper rides the right native framework primitive, but it exposes a real preview/render parity bug for non-zero-start audio tracks and needs a targeted fix/test before merge.

— Magi

Probed volume keyframes carry absolute composition-time stamps, and the
renderer normalises them by the track start before baking, so its
envelope lookup is shift-invariant in composition time. The preview path
in syncRuntimeMedia interpolated those absolute keyframes at the
track-relative relTime (which also folds in mediaStart/playbackRate),
so for any track with a non-zero data-start it sampled the wrong window
and missed the duck entirely. Look the envelope up at absolute
composition time instead, matching the renderer.

Adds syncRuntimeMedia regression tests pinning preview/render parity for
a non-zero data-start track, and clarifies in the duck() docs which
inputs are render-affecting.

Addresses review feedback on heygen-com#1747.
@mvanhorn

Copy link
Copy Markdown
Contributor Author

Thanks for the precise trace — that's exactly it. Fixed in 945189f8.

The probe stamps keyframes in absolute composition time, and the renderer normalises them by the track start before baking, so its envelope lookup is shift-invariant in composition time. The preview path was interpolating those absolute keyframes at the track-relative relTime (which also folds in mediaStart/playbackRate), so a non-zero data-start shifted the query off the duck window. syncRuntimeMedia now looks the envelope up at absolute composition time, matching the renderer.

Added syncRuntimeMedia regression tests pinning preview/render parity for a data-start=5 track — they fail on the old relTime lookup and pass now — plus sampled points across the active window asserting preview equals the renderer's normalised envelope.

On the non-blocking note: added a docs line clarifying the DOM elements + captured timeline are the render-affecting path, while plain timing objects / the returned array just compute keyframes. The computeDuckKeyframes(...) / duck(...timeline...) split sounds like a good follow-up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants