
video-encoder-stats: add video encoder stats elements #4

Open
diegonieto wants to merge 46 commits into main from video-encoder-stats

Conversation

@diegonieto diegonieto commented Jun 15, 2025

  • video-encoder-stats:
    The element that collects statistics from a video encoder and attaches them to the GstBuffers as metadata. It helps analyze encoding performance and quality metrics.

  • video-compare-mixer:
    The element in charge of comparing and mixing multiple video streams. Useful for side-by-side quality comparisons or for blending outputs from different encoders.

  • videoencoderstatsmeta:
    Defines the metadata structures and the logic for handling video encoder statistics along the pipeline. It has been defined as GstVideoEncoderStatsMetaAPI.
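As a rough illustration of the kind of stats the meta carries, here is a minimal, hypothetical sketch in plain std Rust (field names and formatting are assumed from the example logs in this thread, not the real GstVideoEncoderStatsMetaAPI):

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical, heavily simplified version of the per-encoder stats.
struct VideoEncoderStats {
    num_buffers: u64,
    output_bytes: u64,
    total_processing_time: Duration,
}

impl VideoEncoderStats {
    /// Average processing time per buffer.
    fn avg_processing_time(&self) -> Duration {
        if self.num_buffers == 0 {
            return Duration::ZERO;
        }
        self.total_processing_time / self.num_buffers as u32
    }
}

impl fmt::Display for VideoEncoderStats {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        writeln!(f, "Num buffers: {}", self.num_buffers)?;
        writeln!(f, "Output size: {} KB", self.output_bytes / 1024)?;
        writeln!(f, "Processing time: {} ms", self.avg_processing_time().as_millis())
    }
}

fn main() {
    // Made-up numbers mirroring the first log in this thread.
    let stats = VideoEncoderStats {
        num_buffers: 100,
        output_bytes: 986 * 1024,
        total_processing_time: Duration::from_millis(25_100),
    };
    print!("{stats}");
}
```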

Running example: `cargo r --example video-encoder-stats`

(screenshot: encoder-stats)

Comment thread video/stats/src/videoencoderstats.rs Outdated
Comment thread video/stats/src/videoencoderstats.rs Outdated
Comment thread video/stats/src/videoencoderstats.rs Outdated
Comment on lines +155 to +162
```rust
let current_time = gst::SystemClock::obtain();
let latency = (current_time.time().unwrap().nseconds() as f64
    - self.input_time as f64)
    / 1_000_000.0;

writeln!(f, "End2End latency: {:.3} ms", latency)
```
IMHO I would drop this. It is not End2End latency, and IMHO it doesn't add any useful information because we cannot assume input_queue comes right after the source.

Also, it is out of the scope of video-encoder-stats.

Author

true, removed here: 4826fdc


```rust
let pre_encode_time = &self.pre_encode_time;
let post_encode_time = &self.post_encode_time;
let encode_latency = (*post_encode_time as f64 - *pre_encode_time as f64) / 1_000_000.0;
```
@rgonzalezfluendo commented Sep 18, 2025

Encode latency is almost the same as processing time. The processing time is the average over all buffers, and encode latency is per buffer, right?

But I made this change and the values are different. Strange...

patch:

```diff
diff --git a/video/stats/src/encoderstats/imp.rs b/video/stats/src/encoderstats/imp.rs
index a0c56e39..e834ee29 100644
--- a/video/stats/src/encoderstats/imp.rs
+++ b/video/stats/src/encoderstats/imp.rs
@@ -152,6 +152,7 @@ impl EncoderStats {
             if let Some(mut meta) = buffer.meta_mut::<VideoEncoderStatsMeta>() {
                 let mut new_stats = meta.stats().clone();
                 new_stats.post_encode_time = current_time;
+                stats.lock().unwrap().total_processing_time_2 += (new_stats.post_encode_time - new_stats.pre_encode_time);
 
                 meta.replace(new_stats);
 
diff --git a/video/stats/src/videoencoderstats.rs b/video/stats/src/videoencoderstats.rs
index f15ababe..24a97fd6 100644
--- a/video/stats/src/videoencoderstats.rs
+++ b/video/stats/src/videoencoderstats.rs
@@ -34,6 +34,7 @@ pub struct VideoEncoderStats {
     pub time_last_buffers: VecDeque<Instant>,
     pub max_buffers_inside: usize,
     pub total_processing_time: Duration,
+    pub total_processing_time_2: u64,
     pub threads_utime: u64,
     pub threads_stime: u64,
     pub framerate: Option<gst::Fraction>,
@@ -53,6 +54,7 @@ impl Default for VideoEncoderStats {
             time_last_buffers: VecDeque::<Instant>::new(),
             max_buffers_inside: 0,
             total_processing_time: Duration::ZERO,
+            total_processing_time_2: 0,
             threads_utime: 0,
             threads_stime: 0,
             vmaf_score: None,
@@ -116,8 +118,13 @@ impl fmt::Display for VideoEncoderStats {
         };
         let bitrate_str = bitrate/1024.0; // Convert to kbps
 
+        writeln!(f, "Num buffers: {}", self.num_buffers)?;
         writeln!(f, "Bitrate: {:.3} kbps", bitrate_str)?;
 
+        writeln!(f, "total_processing_time 1: {:?}", self.total_processing_time)?;
+        writeln!(f, "total_processing_time 2: {:?}", self.total_processing_time_2)?;
+
+
         let avg_processing_time = self.avg_processing_time().as_millis();
         writeln!(
             f,
```
For instance 25.184094281s vs 25.150373246s

```
/GstPipeline:pipeline0/GstEncoderStats:encoderstats0: last-message = Encoder: x265enc
Output size: 986 KB
Num buffers: 100
Bitrate: 2367.105 kbps
total_processing_time 1: 25.184094281s
total_processing_time 2: 25150373246
Processing time: 251 ms
CPU: 5.22 s
VMAF: N/A
Encode latency: 1032.306 ms
End2End latency: 1033.626 ms
```
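As a sanity check that the gap between the two totals is not a units or representation issue: summing the same nanosecond spans as a `Duration` and as a raw `u64` gives identical totals, so the drift seen in the logs must come from sampling the clock at different probe points. A self-contained sketch with made-up timestamps:

```rust
use std::time::Duration;

fn main() {
    // Hypothetical per-buffer (pre_encode, post_encode) timestamps in ns.
    let spans: &[(u64, u64)] = &[(0, 7_000_000), (10_000_000, 18_000_000)];

    // Method 1: accumulate as a Duration (like total_processing_time).
    let total_1: Duration = spans
        .iter()
        .map(|(pre, post)| Duration::from_nanos(post - pre))
        .sum();

    // Method 2: accumulate raw nanoseconds (like total_processing_time_2).
    let total_2: u64 = spans.iter().map(|(pre, post)| post - pre).sum();

    // Over identical spans both methods agree exactly.
    assert_eq!(total_1.as_nanos() as u64, total_2);
    println!("{total_1:?} vs {total_2} ns"); // prints: 15ms vs 15000000 ns
}
```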

Author

really weird, I've been running this pipeline:

```shell
gst-launch-1.0 souphttpsrc location=\"https://ftp.nluug.nl/pub/graphics/blender/demo/movies/ToS/tears_of_steel_1080p.mov\" ! qtdemux name=demux demux.video_0 ! queue ! decodebin3 ! videoconvertscale ! capsfilter caps=\"video/x-raw,aspect-ratio=1/1\" ! tee name=tee ! video-encoder-stats encoder=\"x264enc bitrate=1024 tune=zerolatency speed-preset=ultrafast threads=4\" name=vs0 vmaf-stats=false tee. ! video-encoder-stats encoder=\"x264enc bitrate=512 tune=zerolatency speed-preset=ultrafast\" name=vs1 video-compare-mixer split-screen=true backend=CPU name=mixer vs0.src ! h264parse ! avdec_h264 ! mixer.sink_0  vs1.src ! h264parse ! avdec_h264 ! mixer.sink_1  mixer. ! autovideosink -v
```

with your patch and I see almost identical values after a long run

```
/GstPipeline:pipeline0/GstEncoderStats:vs1: last-message = Encoder: x264enc
Output size: 8280 KB
Num buffers: 3570
Bitrate: 445.347 kbps
total_processing_time 1: 25.290121055s
total_processing_time 2: 25292717208
Processing time: 7 ms
CPU: 47.68 s
VMAF: N/A
Encode latency: 6.069 ms
End2End latency: 7.683 ms

/GstPipeline:pipeline0/GstEncoderStats:vs0: last-message = Encoder: x264enc
Output size: 16715 KB
Num buffers: 3570
Bitrate: 898.989 kbps
total_processing_time 1: 22.632750704s
total_processing_time 2: 22638976626
Processing time: 6 ms
CPU: 47.55 s
VMAF: N/A
Encode latency: 7.141 ms
End2End latency: 9.407 ms
```

and both encode latencies are in the range of the processing time, and the total processing times match 🤔
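For what it's worth, the bitrate figures are consistent with output size over stream duration. A quick back-of-the-envelope sketch using the vs1 numbers above (24 fps is assumed for tears_of_steel, and the formula itself is an assumption, not taken from the element's code):

```rust
fn main() {
    // Values from the vs1 log above; 24 fps assumed.
    let output_bytes: f64 = 8280.0 * 1024.0; // "Output size: 8280 KB"
    let num_buffers: f64 = 3570.0;
    let fps: f64 = 24.0;

    // Stream duration in seconds, then bits per second, then kbps.
    let duration_s = num_buffers / fps;
    let bitrate_kbps = output_bytes * 8.0 / duration_s / 1024.0;

    // Lands within rounding error of the logged "Bitrate: 445.347 kbps".
    println!("Bitrate: {:.3} kbps", bitrate_kbps);
}
```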

Author

I've now been comparing x265 vs x264:

```
/GstPipeline:pipeline0/GstEncoderStats:vs1: last-message = Encoder: x264enc
Output size: 8165 KB
Num buffers: 3446
Bitrate: 454.979 kbps
total_processing_time 1: 27.792252549s
total_processing_time 2: 27797596912
Processing time: 8 ms
CPU: 44.47 s
VMAF: N/A
Encode latency: 7.226 ms

/GstPipeline:pipeline0/GstEncoderStats:vs0: last-message = Encoder: x265enc
Output size: 14688 KB
Num buffers: 3446
Bitrate: 818.388 kbps
total_processing_time 1: 75.500975765s
total_processing_time 2: 75505265723
Processing time: 21 ms
CPU: 261.13 s
VMAF: N/A
Encode latency: 22.058 ms
```

Again I got almost the same total processing time in both cases. Can you share your pipeline, so I can try to replicate?

For long runs (143 s) I can explain the total processing time difference (5 ms): the avg. processing time is updated before the pre and post encoding timestamps are taken, and the former requires a lock to sync with the global stats:

```rust
stats.lock().unwrap().buffer_out();
gst::log!(CAT, "Buffer out encoder src pad");

let current_time = *gst::ClockTime::from_nseconds(
    gst::SystemClock::obtain().upcast::<gst::Clock>().time().unwrap().nseconds()
);

let buffer = buffer.make_mut();
if let Some(mut meta) = buffer.meta_mut::<VideoEncoderStatsMeta>() {
    let mut new_stats = meta.stats().clone();
    new_stats.post_encode_time = current_time;
    stats.lock().unwrap().total_processing_time_2 += new_stats.post_encode_time - new_stats.pre_encode_time;

    meta.replace(new_stats);

    gst::log!(CAT, "Buffer out encoder src pad, post_encode_time: {}", current_time);
} else {
    gst::warning!(CAT, "No VideoEncoderStatsMeta found on buffer in encoder src probe");
}
```

Any thoughts about this?
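On the lock-ordering point: taking the clock sample after a contended `stats.lock()` shifts the measured window by however long the probe waited on the mutex, which would explain a small per-buffer skew. A standalone sketch (plain std, no GStreamer; the 50 ms hold is an artificial stand-in for real contention) that makes the wait visible:

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

fn main() {
    let stats = Arc::new(Mutex::new(0u64));
    let (tx, rx) = mpsc::channel();

    // Simulate another thread holding the stats lock for ~50 ms.
    let held = Arc::clone(&stats);
    let holder = thread::spawn(move || {
        let _g = held.lock().unwrap();
        tx.send(()).unwrap(); // signal: the lock is now held
        thread::sleep(Duration::from_millis(50));
    });
    rx.recv().unwrap(); // wait until the holder really owns the lock

    // A timestamp taken before the lock includes the wait; one taken
    // after acquiring it does not. The difference is pure contention.
    let before = Instant::now();
    let _g = stats.lock().unwrap(); // blocks until the holder releases
    let waited = before.elapsed();
    drop(_g);

    holder.join().unwrap();
    println!("time attributed to lock contention: {waited:?}");
}
```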


```rust
fn main() -> Result<(), Error> {
    gst::init()?;
```
@rgonzalezfluendo commented Sep 18, 2025

Needed to avoid `Error: no element "video-encoder-stats"`.

Suggested change:

`gstvideostats::plugin_register_static().expect("Failed to register videostats plugin");`

Note: https://github.com/fluendo/gst-plugins-rs/blob/main/utils/togglerecord/examples/gtk_recording.rs#L326

Author

true. Added here for all the examples: 80dea0b

Comment thread video/stats/src/comparemixer/imp.rs Outdated
Comment on lines +364 to +365
```rust
self.add_overlay_probe(&self.overlay0);
self.add_overlay_probe(&self.overlay1);
```

I don't like having two different overlays. The UX with only one is much better.

(screenshots comparing the dual-overlay and single-overlay layouts)

Also, IMHO the overlay must be out of the video-compare-mixer. The application must do the overlay.

Author

I understand your point. I wanted to have two of them so each could be controlled independently, but it's true that when comparing we will probably show the same parameters in both. Also, your approach will improve the performance. As it implies a refactor, I will create a specific task for this: https://fluendo.atlassian.net/browse/RDI-2975

Author

I've unified the overlay here: 8010eea

Comment thread video/stats/src/encoderstats/imp.rs Outdated
```rust
let num_buffers = stats.num_buffers;

if num_buffers % (fps_n as u64) == 0 {
    let queue_name = if obj_name.contains("0") { "encq0:src" } else { "encq1:src" };
```
Author

@rgonzalezfluendo I am measuring the CPU time based on the time spent in the thread (queue) that contains the encoder, plus the encoder-specific threads. I see the x265 CPU time is really high compared with x264, but it can be normal. What do you think?

```
/GstPipeline:pipeline0/GstEncoderStats:vs1: last-message = Encoder: x264enc
Output size: 8165 KB
Num buffers: 3446
Bitrate: 454.979 kbps
total_processing_time 1: 27.792252549s
total_processing_time 2: 27797596912
Processing time: 8 ms
CPU: 44.47 s
VMAF: N/A
Encode latency: 7.226 ms

/GstPipeline:pipeline0/GstEncoderStats:vs0: last-message = Encoder: x265enc
Output size: 14688 KB
Num buffers: 3446
Bitrate: 818.388 kbps
total_processing_time 1: 75.500975765s
total_processing_time 2: 75505265723
Processing time: 21 ms
CPU: 261.13 s
VMAF: N/A
Encode latency: 22.058 ms
```
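On the CPU measurement itself: one Linux-only way to sample CPU time is to parse utime/stime out of procfs, and per-thread figures can be read the same way from `/proc/self/task/<tid>/stat`. This is a hedged sketch with an assumed helper name, not the element's actual implementation:

```rust
use std::fs;

/// Hypothetical helper: read the CPU time of the current process as
/// (utime, stime) in clock ticks from /proc/self/stat (Linux only).
fn process_cpu_ticks() -> Option<(u64, u64)> {
    let stat = fs::read_to_string("/proc/self/stat").ok()?;
    // The command name (field 2) is parenthesised and may contain spaces,
    // so split after the last ')' before whitespace-splitting the rest.
    let rest = stat.rsplit_once(')')?.1;
    let fields: Vec<&str> = rest.split_whitespace().collect();
    // After ')', state is field 0; utime and stime are fields 11 and 12
    // (fields 14 and 15 in proc(5) numbering).
    let utime = fields.get(11)?.parse().ok()?;
    let stime = fields.get(12)?.parse().ok()?;
    Some((utime, stime))
}

fn main() {
    // Burn a little CPU so the counters have something to show.
    let mut x = 0u64;
    for i in 0..5_000_000u64 {
        x = x.wrapping_add(i);
    }
    std::hint::black_box(x);

    if let Some((utime, stime)) = process_cpu_ticks() {
        println!("utime: {utime} ticks, stime: {stime} ticks");
    }
}
```

Dividing the tick counts by `sysconf(_SC_CLK_TCK)` (usually 100) would convert them to seconds, as in the "CPU: 44.47 s" lines above.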

diegonieto and others added 10 commits September 23, 2025 12:58
* get latency from SystemTime
** TODO: implement npt as webrtc-precise-sync-send.rs does
* add timestamp before encoding as part of videoencoderstatsmeta
* compare the input timestamp in video-compare-mixer with the current
timestamp (after decoding the video stream)
* add a new example video-stats-zero-latency to showcase the latency
difference
If the vmaf metric is not available, N/A will be shown instead of a 0 value.
This enables posting the stats as messages unless the new property silent is set to true; it defaults to false.
Remove throughput metric and leave avg. processing time instead
Co-authored-by: Ruben Gonzalez <56379722+rgonzalezfluendo@users.noreply.github.com>
Co-authored-by: Ruben Gonzalez <56379722+rgonzalezfluendo@users.noreply.github.com>
It might not represent the actual end2end latency
videoencoderstats meta is now read from the input queues. The overlay is
not linked after the compositor and the values are updated once per
second according to the given framerate of the pipeline

A new fmt for video-compare-mixer has been created to compose both stats
left and right into a single text for the overlay
In this way it is possible to disable the overlay of the stats in the
screen. It defaults to enabled.
```rust
}

impl fmt::Display for QueueStats {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
```
Author

I would need your help here @rgonzalezfluendo to properly align the text in the layout


What's the problem?

diegonieto and others added 9 commits October 12, 2025 13:32
Before, the encoder was mixed with other objects that may introduce a little overhead. This way we ensure only the encoder CPU is taken into account.
Silence intermediate messages and signals based on the property value.
Always provide a message and signal with the final stats on EOS.
From
```
Num. Buffers first output: -1
Num. Buffers first output: 1
Num. Buffers first output: 1
```

to

```
Num. Buffers first output: No first output yet
Num. Buffers first output: 1
Num. Buffers first output: 1
```

Issue: RDI-3210
This maximum value is only valid if there is no reordering of frames
within the encoder (b-frames). If reordering occurs, it is necessary
to mark the buffers to relate the input buffer to the output buffer.
Similar to MR9622

MR9622: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/9622

Issue: RDI-3209