Detect AV1 keyframes logic by olegokunevych · Pull Request #239 · elixir-webrtc/ex_webrtc

olegokunevych · 2025-12-08T11:13:02Z

Our media server, built on ex_webrtc, needs to send AV1 RTP packets to clients over WebRTC/WHEP. That requires a way to identify AV1 keyframes, which is why we are submitting this pull request.
This solution works for us, so we would like to ask you to review this PR and suggest any necessary modifications.

sgfn

Sorry for the long delay and the radio silence, and thanks for the contribution.

We can't accept this PR in its current form, but we're open to discussing whether our proposed changes work in your case.

sgfn · 2026-02-19T14:14:35Z

lib/ex_webrtc/rtp/av1.ex

+  According to the [AV1 RTP spec](https://aomediacodec.github.io/av1-rtp-spec/v1.0.0.html) §4.4,
+  the RTP aggregation header's N bit marks the start of a new coded video sequence (CVS).
+  A CVS must contain a sequence header and the first frame must be a KEY_FRAME as defined
+  by ISO/IEC 23094-1 §6.8:
+  - `show_existing_frame` = 0 (a new frame, not a reference reuse)
+  - `frame_type` = KEY_FRAME (0)
+  - `show_frame` = 1 (displayed frame)
+
+  Some encoders repeat sequence headers in non-key frames, therefore the
+  presence of a sequence header alone is not considered sufficient for keyframe
+  detection.


Please leave only the first sentence in @doc and change the rest to regular # comments

I'm not sure how ISO/IEC 23094-1 is relevant here -- it defines the EVC standard and has no section 6.8. Did you mean AV1 spec, sec. 6.8?

olegokunevych · 2026-03-04

Totally agree, I will commit the corresponding change.

sgfn · 2026-03-02T15:54:07Z

lib/ex_webrtc/rtp/av1.ex

+          (av1_payload.z == 0 and check_keyframe_in_payload(av1_payload.payload))
+


There's a problem with this approach. The AV1 RTP spec allows each OBU to be sent in a separate RTP packet with W=1. In the simplest case, the bitstream SEQ_HDR FRAME can be packetized into [ SEQ_HDR ] N=1 [ FRAME ] N=0.

AV1.keyframe? will return true for both packets. If we're looking for the right place to switch simulcast layers, both of these packets will be considered equally valid. If the first packet never got delivered due to packet loss, we're going to switch layers without changing params.
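To make the packetization above concrete, here is a minimal sketch that builds the two packets and reads their aggregation headers. The module name `AggregationHeaderSketch` and the sample bytes are hypothetical and not part of the PR or of ex_webrtc; the bit layout (Z, Y, W, N, then three reserved bits) is the one the AV1 RTP spec gives for the aggregation header.

```elixir
defmodule AggregationHeaderSketch do
  # Hypothetical helper for illustration only, not the ex_webrtc API.
  # Aggregation header: Z (1 bit), Y (1 bit), W (2 bits), N (1 bit), 3 reserved bits.
  def parse(<<z::1, y::1, w::2, n::1, _reserved::3, obu_elements::binary>>) do
    {:ok, %{z: z, y: y, w: w, n: n}, obu_elements}
  end

  def parse(_other), do: {:error, :invalid_packet}
end

# [ SEQ_HDR ] N=1: W=1, N=1, followed by a placeholder OBU header (obu_type 1).
seq_hdr_packet = <<0b00011000, 0x08>>
# [ FRAME ] N=0: W=1, N=0, followed by placeholder OBU bytes (obu_type 6).
frame_packet = <<0b00010000, 0x30, 0x10>>

{:ok, seq_hdr_header, _obu} = AggregationHeaderSketch.parse(seq_hdr_packet)
{:ok, frame_header, _obu} = AggregationHeaderSketch.parse(frame_packet)

IO.inspect(seq_hdr_header.n, label: "SEQ_HDR packet N")
IO.inspect(frame_header.n, label: "FRAME packet N")
```

Only the first packet carries N=1, so a receiver that sees just the second packet has no RTP-level hint that it belongs to the start of a new coded video sequence.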

For H264, we decided that the occasional freeze which will trigger a PLI feedback (and, eventually, a new keyframe) is preferable to the green pixelated glitchy mess the end user will be seeing in the alternate case. You can refer to lib/ex_webrtc/rtp/h264.ex for more info and further reading (source code of existing SFU implementations).

I'd opt for a simple N=1 check, even though 1) it will falsely flag SEQ_HDR repeats as keyframes, and 2) it's not technically the same thing as checking for the start of a CVS, or even a keyframe?. The optimistic approach was found to work well in our previous experiments.
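For reference, the simple N=1 check could look roughly like the sketch below. `N1CheckSketch` and `starts_new_cvs?/1` are hypothetical names used for illustration, not the existing ex_webrtc API, and the sample bytes are made up.

```elixir
defmodule N1CheckSketch do
  # Hypothetical module for illustration only.
  # Matches when the N bit (fifth bit of the aggregation header) is set.
  def starts_new_cvs?(<<_z::1, _y::1, _w::2, 1::1, _reserved::3, _rest::binary>>), do: true
  def starts_new_cvs?(_packet), do: false
end

# Packet carrying SEQ_HDR with N=1 is flagged.
IO.inspect(N1CheckSketch.starts_new_cvs?(<<0b00011000, 0x08>>))
# Packet carrying only FRAME with N=0 is not flagged.
IO.inspect(N1CheckSketch.starts_new_cvs?(<<0b00010000, 0x30, 0x10>>))
```

Under this check the FRAME-only packet is never treated as a switch point, so losing the SEQ_HDR packet delays the layer switch until the next keyframe instead of switching without the new parameters.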

olegokunevych · 2026-03-04

@sgfn Thank you for the detailed response. Here are a few points we found regarding keyframe detection:

N=1 alone is insufficient: we tested with the AOM AV1 encoder in OBS, which never set N=1. That makes an N=1-only check completely broken, not just suboptimal. The function is used beyond simulcast (initial stream setup, first-keyframe detection), so missing keyframes is unacceptable.

Double-detection is a narrow edge case: it requires that (a) the encoder splits SEQ_HDR and FRAME into separate packets, and (b) the SEQ_HDR packet is lost while the FRAME packet arrives. This is a subset of normal packet loss.

Even in the double-detection scenario, the outcome is acceptable: if the SEQ_HDR packet was lost, the stream is already degraded regardless of whether we flag the FRAME packet as a keyframe. The simulcast switching layer should handle incomplete keyframe data gracefully (the same way it handles any packet loss).

The H264 analogy doesn't fully apply: H264's SPS is reliably present as a distinct NAL unit type. AV1's N bit is an RTP-layer signal that depends on the payloader implementation, which varies. The payload-level check is more robust because it inspects the actual bitstream content.

Chromium's own depacketizer inspects OBU content: the codebase references Chromium implementations (leb128.ex, payloader.ex), which also perform content inspection rather than relying solely on aggregation header flags.
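As an illustration of what a payload-level check involves, here is a minimal sketch that classifies the first OBU of a payload. It is not the PR's `check_keyframe_in_payload/1` and not Chromium's code; `ObuInspectSketch` and `classify/1` are hypothetical names. It assumes the aggregation header and any element length field have already been stripped, that the payload starts with a complete OBU (Z=0), and that the sequence header has `reduced_still_picture_header` = 0, so the frame header begins with `show_existing_frame` (1 bit), `frame_type` (2 bits) and `show_frame` (1 bit).

```elixir
defmodule ObuInspectSketch do
  # Hypothetical module for illustration only.
  @obu_sequence_header 1
  @obu_frame_header 3
  @obu_frame 6

  # OBU header: forbidden bit, obu_type (4 bits), extension flag,
  # has_size_field flag, reserved bit.
  def classify(<<0::1, type::4, ext::1, has_size::1, _reserved::1, rest::binary>>) do
    body = rest |> skip_extension(ext) |> skip_size(has_size)
    classify_obu(type, body)
  end

  def classify(_payload), do: :unknown

  # The optional OBU extension header is one byte long.
  defp skip_extension(<<_extension::8, rest::binary>>, 1), do: rest
  defp skip_extension(rest, _flag), do: rest

  # The optional obu_size is leb128-encoded: skip bytes while the top bit is set.
  defp skip_size(rest, 0), do: rest
  defp skip_size(<<1::1, _::7, rest::binary>>, 1), do: skip_size(rest, 1)
  defp skip_size(<<0::1, _::7, rest::binary>>, 1), do: rest
  defp skip_size(_truncated, 1), do: <<>>

  defp classify_obu(@obu_sequence_header, _body), do: :sequence_header

  # show_existing_frame = 0, frame_type = KEY_FRAME (0), show_frame = 1
  defp classify_obu(type, <<0::1, 0::2, 1::1, _::bitstring>>)
       when type in [@obu_frame_header, @obu_frame],
       do: :key_frame

  defp classify_obu(type, _body) when type in [@obu_frame_header, @obu_frame],
    do: :other_frame

  defp classify_obu(_type, _body), do: :other
end

IO.inspect(ObuInspectSketch.classify(<<0x08, 0x00>>))
IO.inspect(ObuInspectSketch.classify(<<0x30, 0x10>>))
```

A check like this answers "does this packet start with a shown key frame?" independently of the N bit, which is why it also fires on a FRAME packet whose SEQ_HDR packet was lost.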

Let me know if these points make sense; we are open to further discussion :)

Detect AV1 keyframes logic.

4d10b20

olegokunevych force-pushed the av1_keyframe_detect branch from 435e349 to 4d10b20 Compare February 17, 2026 12:46

sgfn requested changes Mar 2, 2026

olegokunevych wants to merge 1 commit into elixir-webrtc:master from olegokunevych:av1_keyframe_detect


olegokunevych Mar 4, 2026
