
Add batch json output and chunked inference #25

Open
StefanFabian wants to merge 3 commits into google:main from StefanFabian:main

Conversation

@StefanFabian

This also fixes #24
Main changes are:

  • Introduction of an option --batch_json_output to output the full result information, including frame scores, when using batch mode.
  • Chunked loading and processing of the video.
    This enables processing of longer videos, which would otherwise cause out-of-memory errors.
    A new option --chunk_size_frames lets users pick a chunk size that fits in their GPU memory (a rough sketch of the idea follows below).
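
For illustration only, here is a minimal sketch of the chunked-processing idea, assuming an OpenCV-based frame reader and a placeholder per-chunk scoring function; the actual UVQ model call and flag wiring in this PR are different:

import cv2
import numpy as np

def iter_chunks(video_path, chunk_size_frames):
    """Yield frame arrays of shape (n, H, W, 3) with n <= chunk_size_frames."""
    cap = cv2.VideoCapture(video_path)
    try:
        frames = []
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frames.append(frame)
            if len(frames) == chunk_size_frames:
                yield np.stack(frames)
                frames = []
        if frames:  # final partial chunk
            yield np.stack(frames)
    finally:
        cap.release()

def score_chunk(chunk):
    # Placeholder for the real UVQ inference call; returns one score per frame.
    return np.full(chunk.shape[0], 3.5)

def score_video(video_path, chunk_size_frames=64):  # 64 is an arbitrary example value
    frame_scores = []
    for chunk in iter_chunks(video_path, chunk_size_frames):
        frame_scores.extend(score_chunk(chunk).tolist())
    return float(np.mean(frame_scores)), frame_scores

Only one chunk of decoded frames is held in memory at a time, so peak memory is bounded by chunk_size_frames rather than by the length of the video.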

@google-cla

google-cla Bot commented Jan 2, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@ShortyCM

ShortyCM commented Mar 3, 2026

A tool like this should work on videos of arbitrary length. I'm trying to test with a 1080p 2-minute clip and it commits over 150 GB of memory when it runs. Why on earth is it doing that? It shouldn't require any more memory than is necessary to hold the current working chunk of data. The current working chunk of data should not consist of the entire video stream, and then some. Committing over 150 GB of memory for a 2-minute 23.976 fps 1920x1080 8-bit video is insanity. Sorry, but I can't afford 9 TB of RAM to process a movie. This whole project needs serious refactoring.

@StefanFabian
Author

StefanFabian commented Mar 3, 2026

I agree that it should, but I still think this is great work, even if a bit unpolished.
It's been really helpful for my evaluations.
Did my fork work for your application?
It should be able to handle your use case and also works with GPU inference.
I've even used it on some 60-minute-long videos.

@ShortyCM

ShortyCM commented Mar 4, 2026

I tried yours. Testing with a 2-minute clip from the middle of Blade Runner 2, which is 1080p 8-bit 23.976 fps, the original code resulted in this:

T:\UVQ-original>timethis python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv

TimeThis : Command Line : python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv
TimeThis : Start Time : Tue Mar 03 16:37:22 2026

3.6651830673217773

TimeThis : Command Line : python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv
TimeThis : Start Time : Tue Mar 03 16:37:22 2026
TimeThis : End Time : Tue Mar 03 16:39:54 2026
TimeThis : Elapsed Time : 00:02:32.335

The commit size peaked right around 150 GB and varies with video length: it keeps growing the longer the video is and the further processing gets into it. I don't dare try anything longer than 2 minutes, since that already commits 150 GB of memory and I only have 64 GB of RAM. Then I tried your fork, which resulted in this:

T:\uvq-StefanFabian>timethis python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv

TimeThis : Command Line : python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv
TimeThis : Start Time : Tue Mar 03 16:42:00 2026

3.660670757293701

TimeThis : Command Line : python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv
TimeThis : Start Time : Tue Mar 03 16:42:00 2026
TimeThis : End Time : Tue Mar 03 16:44:12 2026
TimeThis : Elapsed Time : 00:02:11.324

Your fork had its commit size peak right around 122.6 GB and still showed the same behaviour where the commit size continually grows as it gets deeper into the video. And "my" fork (in quotation marks because it is just ChatGPT code after much arguing and many iterations) resulted in this:

T:\uvq-mine>timethis python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv

TimeThis : Command Line : python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv
TimeThis : Start Time : Tue Mar 03 16:40:31 2026

3.665183088996194

TimeThis : Command Line : python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv
TimeThis : Start Time : Tue Mar 03 16:40:31 2026
TimeThis : End Time : Tue Mar 03 16:41:28 2026
TimeThis : Elapsed Time : 00:00:57.301

Its commit size peaked right around 2.9 GB and stays at 2.9 GB for 1080p 8-bit 23.976 fps video no matter how long it is. It doesn't care whether the video is 2 minutes, 2 hours, or 10 hours. As far as I'm aware it only works on a single frame at a time, so memory never grows.

I don't know if the code is any good, as I haven't even read it, but it produces the same score as the original code up to the 7th decimal place, whereas yours differs at the 3rd decimal place (though I doubt it really matters beyond the first). Feel free to take a look at my fork, but as I say, I didn't have anything to do with writing that code beyond yelling at ChatGPT for a while, haha. I think the key to reducing memory usage was simply limiting it to a single frame at a time. By the way, this is all with CPU processing; I haven't tried GPU processing with any fork yet.
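
For what it's worth, a minimal sketch of that single-frame idea could look like the following, assuming an OpenCV reader and a caller-supplied per-frame scorer; it is not the actual code from any of the forks discussed here:

import cv2

def score_video_streaming(video_path, score_frame):
    # Decode and score one frame at a time; only one decoded frame is ever
    # resident, so peak memory stays flat regardless of video length.
    cap = cv2.VideoCapture(video_path)
    total, count = 0.0, 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            total += float(score_frame(frame))  # score_frame is a placeholder hook
            count += 1
    finally:
        cap.release()
    return total / max(count, 1)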

@StefanFabian
Author

As stated at the top, my fork provides an option --chunk_size_frames, which you can reduce to limit the number of frames processed per batch.
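
For example, an invocation might look like the following (the flag is the one added in this PR; a chunk size of 64 is just an illustrative value, not a recommendation):

python uvq_inference.py blade2-1080p-x265-01-ultrafast.mkv --chunk_size_frames 64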



Development

Successfully merging this pull request may close these issues.

Single video inference fails with cuda
