video-encoder-stats: add video encoder stats elements#4
video-encoder-stats: add video encoder stats elements#4diegonieto wants to merge 46 commits intomainfrom
Conversation
4e8826e to
e26cae9
Compare
8adc49c to
2485f76
Compare
b8b868b to
b676a09
Compare
b676a09 to
2358245
Compare
| let current_time = gst::SystemClock::obtain(); | ||
| let latency = (current_time.time().unwrap().nseconds() as f64 - self.input_time as f64) / 1_000_000.0; | ||
|
|
||
| writeln!( | ||
| f, | ||
| "End2End latency: {:.3} ms", | ||
| latency | ||
| ) |
There was a problem hiding this comment.
IMHO I would drop this. it is not End2End latency and IMHO it doesn't add any useful information because we can not assume input_queue is after the source.
Also it out of the scope of video-encoder-stats.
|
|
||
| let pre_encode_time = &self.pre_encode_time; | ||
| let post_encode_time = &self.post_encode_time; | ||
| let encode_latency = (*post_encode_time as f64 - *pre_encode_time as f64) / 1_000_000.0; |
There was a problem hiding this comment.
Encode latency is almost the same as processing time. The processing time is the average of all buffers, and encode latency is for each buffer, right?
But I did this change and the values are different. Strange...
patch
diff --git a/video/stats/src/encoderstats/imp.rs b/video/stats/src/encoderstats/imp.rs
index a0c56e39..e834ee29 100644
--- a/video/stats/src/encoderstats/imp.rs
+++ b/video/stats/src/encoderstats/imp.rs
@@ -152,6 +152,7 @@ impl EncoderStats {
if let Some(mut meta) = buffer.meta_mut::<VideoEncoderStatsMeta>() {
let mut new_stats = meta.stats().clone();
new_stats.post_encode_time = current_time;
+ stats.lock().unwrap().total_processing_time_2 += (new_stats.post_encode_time - new_stats.pre_encode_time);
meta.replace(new_stats);
diff --git a/video/stats/src/videoencoderstats.rs b/video/stats/src/videoencoderstats.rs
index f15ababe..24a97fd6 100644
--- a/video/stats/src/videoencoderstats.rs
+++ b/video/stats/src/videoencoderstats.rs
@@ -34,6 +34,7 @@ pub struct VideoEncoderStats {
pub time_last_buffers: VecDeque<Instant>,
pub max_buffers_inside: usize,
pub total_processing_time: Duration,
+ pub total_processing_time_2: u64,
pub threads_utime: u64,
pub threads_stime: u64,
pub framerate: Option<gst::Fraction>,
@@ -53,6 +54,7 @@ impl Default for VideoEncoderStats {
time_last_buffers: VecDeque::<Instant>::new(),
max_buffers_inside: 0,
total_processing_time: Duration::ZERO,
+ total_processing_time_2: 0,
threads_utime: 0,
threads_stime: 0,
vmaf_score: None,
@@ -116,8 +118,13 @@ impl fmt::Display for VideoEncoderStats {
};
let bitrate_str = bitrate/1024.0; // Convert to kbps
+ writeln!(f, "Num buffers: {}", self.num_buffers)?;
writeln!(f, "Bitrate: {:.3} kbps", bitrate_str)?;
+ writeln!(f, "total_processing_time 1: {:?}", self.total_processing_time)?;
+ writeln!(f, "total_processing_time 2: {:?}", self.total_processing_time_2)?;
+
+
let avg_processing_time = self.avg_processing_time().as_millis();
writeln!(
f,
For instance 25.184094281s vs 25.150373246s
/GstPipeline:pipeline0/GstEncoderStats:encoderstats0: last-message = Encoder: x265enc
Output size: 986 KB
Num buffers: 100
Bitrate: 2367.105 kbps
total_processing_time 1: 25.184094281s
total_processing_time 2: 25150373246
Processing time: 251 ms
CPU: 5.22 s
VMAF: N/A
Encode latency: 1032.306 ms
End2End latency: 1033.626 ms
``
There was a problem hiding this comment.
really weird, I've been running this pipeline:
gst-launch-1.0 souphttpsrc location=\"https://ftp.nluug.nl/pub/graphics/blender/demo/movies/ToS/tears_of_steel_1080p.mov\" ! qtdemux name=demux demux.video_0 ! queue ! decodebin3 ! videoconvertscale ! capsfilter caps=\"video/x-raw,aspect-ratio=1/1\" ! tee name=tee ! video-encoder-stats encoder=\"x264enc bitrate=1024 tune=zerolatency speed-preset=ultrafast threads=4\" name=vs0 vmaf-stats=false tee. ! video-encoder-stats encoder=\"x264enc bitrate=512 tune=zerolatency speed-preset=ultrafast\" name=vs1 video-compare-mixer split-screen=true backend=CPU name=mixer vs0.src ! h264parse ! avdec_h264 ! mixer.sink_0 vs1.src ! h264parse ! avdec_h264 ! mixer.sink_1 mixer. ! autovideosink -v
with your patch and I see almost identical values after a long run
/GstPipeline:pipeline0/GstEncoderStats:vs1: last-message = Encoder: x264enc
Output size: 8280 KB
Num buffers: 3570
Bitrate: 445.347 kbps
total_processing_time 1: 25.290121055s
total_processing_time 2: 25292717208
Processing time: 7 ms
CPU: 47.68 s
VMAF: N/A
Encode latency: 6.069 ms
End2End latency: 7.683 ms
/GstPipeline:pipeline0/GstEncoderStats:vs0: last-message = Encoder: x264enc
Output size: 16715 KB
Num buffers: 3570
Bitrate: 898.989 kbps
total_processing_time 1: 22.632750704s
total_processing_time 2: 22638976626
Processing time: 6 ms
CPU: 47.55 s
VMAF: N/A
Encode latency: 7.141 ms
End2End latency: 9.407 ms
and both encode latency is in the range of the processing time, and the total processing time match 🤔
There was a problem hiding this comment.
I've been now comparing x265 vs x264:
/GstPipeline:pipeline0/GstEncoderStats:vs1: last-message = Encoder: x264enc
Output size: 8165 KB
Num buffers: 3446
Bitrate: 454.979 kbps
total_processing_time 1: 27.792252549s
total_processing_time 2: 27797596912
Processing time: 8 ms
CPU: 44.47 s
VMAF: N/A
Encode latency: 7.226 ms
/GstPipeline:pipeline0/GstEncoderStats:vs0: last-message = Encoder: x265enc
Output size: 14688 KB
Num buffers: 3446
Bitrate: 818.388 kbps
total_processing_time 1: 75.500975765s
total_processing_time 2: 75505265723
Processing time: 21 ms
CPU: 261.13 s
VMAF: N/A
Encode latency: 22.058 ms
I got again almost the same total processing time in both cases. Can you share your pipeline, so I can try to replicate?
For long runs (143s) I can explain the total processing time difference (5ms) can be due to avg. processing time is set before pre and post encoding timestamps, and the sooner requires a lock to sync with global:
stats.lock().unwrap().buffer_out();
gst::log!(CAT, "Buffer out encoder src pad");
let current_time = *gst::ClockTime::from_nseconds(
gst::SystemClock::obtain().upcast::<gst::Clock>().time().unwrap().nseconds()
);
let buffer = buffer.make_mut();
if let Some(mut meta) = buffer.meta_mut::<VideoEncoderStatsMeta>() {
let mut new_stats = meta.stats().clone();
new_stats.post_encode_time = current_time;
stats.lock().unwrap().total_processing_time_2 += (new_stats.post_encode_time - new_stats.pre_encode_time);
meta.replace(new_stats);
gst::log!(CAT, "Buffer out encoder src pad, post_encode_time: {}", current_time);
} else {
gst::warning!(CAT, "No VideoEncoderStatsMeta found on buffer in encoder src probe");
}
Any thoughts about this?
|
|
||
| fn main() -> Result<(), Error> { | ||
| gst::init()?; | ||
|
|
There was a problem hiding this comment.
Needed to avoid Error: no element "video-encoder-stats`
| `gstvideostats::plugin_register_static().expect("Failed to register videostats plugin");` |
| self.add_overlay_probe(&self.overlay0); | ||
| self.add_overlay_probe(&self.overlay1); |
There was a problem hiding this comment.
I understand your point. I wanted to make two of them to control independently, but it's true if we are comparing we will probably show the same parameters and so on. Also you approach will improve the performance. As it implies a refactor I will create a specific task for this: https://fluendo.atlassian.net/browse/RDI-2975
| let num_buffers = stats.num_buffers; | ||
|
|
||
| if num_buffers % (fps_n as u64) == 0 { | ||
| let queue_name = if obj_name.contains("0") { "encq0:src" } else { "encq1:src" }; |
There was a problem hiding this comment.
@rgonzalezfluendo I am measuring the CPU time based on the time spent on the thread (queue) that contains the encoder + encoder specific threads. I see x265 CPU is really high compared with x264, but it can be normal. What do you think?
/GstPipeline:pipeline0/GstEncoderStats:vs1: last-message = Encoder: x264enc
Output size: 8165 KB
Num buffers: 3446
Bitrate: 454.979 kbps
total_processing_time 1: 27.792252549s
total_processing_time 2: 27797596912
Processing time: 8 ms
CPU: 44.47 s
VMAF: N/A
Encode latency: 7.226 ms
/GstPipeline:pipeline0/GstEncoderStats:vs0: last-message = Encoder: x265enc
Output size: 14688 KB
Num buffers: 3446
Bitrate: 818.388 kbps
total_processing_time 1: 75.500975765s
total_processing_time 2: 75505265723
Processing time: 21 ms
CPU: 261.13 s
VMAF: N/A
Encode latency: 22.058 ms
Different encoders introduce different latencies and the pipelines where not starting properly. Introduce multiples queues in bifurcations and convergences. Introduce also videoconvert and fix the video caps format to I420
Change start-up sequence
This adds a new pad allowing to directly connect to the decoded data. It also introduce both videoencoderstats and videocomparemixer unit tests.
This reduces drastically the processing time taken by the vmaf element, avoiding to post scores for each frame.
* get latency from SystemTime ** TODO: implement npt as webrtc-precise-sync-send.rs does * add timestamp before encoding as part of videoencoderstatsmeta * compare the input timestamp in video-compare-mixer with the current timestamp (after decoding the video stream) * add a new example video-stats-zero-latency to showcase the latency difference
If vmaf metric is not available N/A will be shown in the metric instead of 0 value.
This enables posting the stats as messages unless the new property silent is set true, by default is false
Remove throughput metric and leave avg. processing time instead
…tead of global values Relates to RDI-2963
Co-authored-by: Ruben Gonzalez <56379722+rgonzalezfluendo@users.noreply.github.com>
Co-authored-by: Ruben Gonzalez <56379722+rgonzalezfluendo@users.noreply.github.com>
It might not represent the actual end2end latency
Fix time object new methods
80dea0b to
1242a6b
Compare
videoencoderstats meta is now read from the input queues. The overlay is not linked after the compositor and the values are updated once per second according to the given framerate of the pipeline A new fmt for video-compare-mixer has been created to compose both stats left and right into a single text for the overlay
In this way it is possible to disable the overlay of the stats in the screen. It defaults to enabled.
| } | ||
|
|
||
| impl fmt::Display for QueueStats { | ||
| fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { |
There was a problem hiding this comment.
I would need your help here @rgonzalezfluendo to properly align the text in the layout
Before the encoder was mixed with another objects that may introduce a little overhead. This way we ensure only encoder CPU is taken into account
Silent intermediate messages and signals based on property value. Provide always messages and signal with the final stats on EOS
From ``` Num. Buffers first output: -1 Num. Buffers first output: 1 Num. Buffers first output: 1 ``` to ``` Num. Buffers first output: No first output yet Num. Buffers first output: 1 Num. Buffers first output: 1 ``` Issue: RDI-3210
93cfd3c to
c7cb30d
Compare
This maximum value is only valid if there is no reordering of frames within the encoder (b-frames). If reordering occurs, it is necessary to mark the buffers to relate the input buffer to the output buffer. Similar to MR9622 MR9622: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/9622 Issue: RDI-3209




video-encoder-stats:The element that collects statistics from a video encoder, and attach them onto the
GstBuffersas metadata. It helps analyze encoding performance and quality metrics.video-compare-mixer:The element in charge of comparing and mixing multiple video streams. Useful for side-by-side quality comparisons or blending outputs from different encoders.
videoencoderstatsmeta:Defines metadata structures and logic for handling video encoder statistics along the pipeline. It has been defined as
GstVideoEncoderStatsMetaAPI.Running example:
cargo r --example video-encoder-stats