
Add vllm-omni image generation benchmark#3

Open
spillai wants to merge 2 commits into main from feat/vllm-omni-pr

Conversation

@spillai
Contributor

@spillai spillai commented Feb 8, 2026

Summary

  • New imagegen subcommand for benchmarking text-to-image generation models via vllm-omni
  • 12 built-in diverse prompts (objects, landscapes, architecture, portraits, abstract, text rendering, etc.)
  • Uses OpenAI-compatible client.images.generate() API (/v1/images/generations)
  • Primary metric: s/img (seconds per image), secondary: img/min
  • Extends NativeVllmServerManager with omni=True param for vllm serve --omni
  • Full display: config panel, results panel, compare table for imagegen results
  • compare auto-detects "type": "imagegen" in JSON files
  • Optional --save-images to dump generated PNGs
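The request/metric flow described above can be sketched as follows. The endpoint path and payload shape follow the OpenAI Images API that the PR says it targets; the base URL, helper names, and defaults here are illustrative, not code from this PR.

```python
import json
import time
import urllib.request


def generate_image(base_url: str, model: str, prompt: str, size: str = "512x512") -> dict:
    """POST one request to an OpenAI-compatible /v1/images/generations endpoint.

    Defined for illustration only; not invoked here because it needs a live server.
    """
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "size": size,
        "n": 1,
        "response_format": "b64_json",
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/images/generations",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def imagegen_metrics(durations_s: list[float]) -> dict:
    """Derive the PR's primary metric (s/img) and secondary metric (img/min)
    from a list of per-image wall-clock durations in seconds."""
    s_per_img = sum(durations_s) / len(durations_s)
    return {"s_per_img": s_per_img, "img_per_min": 60.0 / s_per_img}
```

For example, two images taking 2 s and 4 s give 3 s/img and 20 img/min.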

Usage

vlmbench imagegen --model Qwen/Qwen-Image --backend vllm-omni
vlmbench imagegen --model Tongyi-MAI/Z-Image-Turbo --size 512x512 --save-images ./imgs

Models added to MODELS.md

  • Qwen/Qwen-Image
  • Tongyi-MAI/Z-Image-Turbo

Test plan

  • vlmbench imagegen --model Qwen/Qwen-Image --base-url http://localhost:8000/v1 --no-serve
  • vlmbench compare results/imagegen-*.json
  • Verify --save-images writes PNGs to disk
  • Run lint: make lint

🤖 Generated with Claude Code

@gemini-code-assist

Summary of Changes

Hello @spillai, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the vlmbench tool by introducing a dedicated benchmarking suite for text-to-image generation models. It enables users to assess the performance of vllm-omni powered image generation services using a set of predefined prompts and provides detailed metrics, configuration, and comparison capabilities. The update streamlines the process of evaluating image generation models and integrates seamlessly with existing vlmbench workflows.

Highlights

  • New imagegen subcommand: Introduced a new imagegen subcommand for benchmarking text-to-image generation models.
  • vLLM-Omni Backend Support: Added support for the vllm-omni backend, including updates to NativeVllmServerManager to handle the --omni flag.
  • OpenAI API Integration: Utilizes the OpenAI-compatible client.images.generate() API for image generation requests.
  • Diverse Prompts: Incorporated 12 built-in, diverse prompts for comprehensive image generation benchmarking.
  • New Metrics: Primary metric is seconds per image (s/img), with images per minute (img/min) as a secondary metric.
  • Enhanced Comparison: The compare command now automatically detects and displays imagegen benchmark results.
  • Image Saving Option: Added an optional --save-images flag to dump generated PNGs to disk.
  • Model Documentation: Updated MODELS.md to include Qwen/Qwen-Image and Tongyi-MAI/Z-Image-Turbo as supported image generation models.


Changelog
  • .claude/skills/vlmbench/MODELS.md
    • Added a new section for vllm-omni image generation models.
    • Listed Qwen/Qwen-Image and Tongyi-MAI/Z-Image-Turbo as supported models.
  • vlmbench/cli.py
    • Introduced IMAGEGEN_PROMPTS for diverse image generation scenarios.
    • Defined new dataclasses (ImageGenRunRaw, ImageGenConfig, ImageGenResults, ImageGenBenchmarkResult) to manage image generation benchmark data.
    • Extended NativeVllmServerManager to support vllm serve --omni via an omni parameter.
    • Updated backend resolution logic (_resolve_backend_for_monitor, _create_server_manager, resolve_server) to correctly identify and manage the vllm-omni backend.
    • Implemented core image generation benchmarking logic, including API calls, retry mechanisms, and optional image saving.
    • Added dedicated display functions (print_imagegen_config, print_imagegen_results, print_imagegen_compare_table) for image generation benchmarks.
    • Created a new imagegen CLI command with options for model, image size, concurrency, and output.
    • Modified the compare command to dynamically handle both existing and new imagegen benchmark result types.
    • Included imagegen in the list of recognized CLI subcommands.
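The compare command's auto-detection described above might work roughly like this: peek at the "type" field of the first result file. The "type": "imagegen" field name comes from the PR summary; the helper name is illustrative.

```python
import json
from pathlib import Path


def detect_imagegen(files: list[str]) -> bool:
    """Return True when the first JSON result file declares type=imagegen."""
    if not files:
        return False
    data = json.loads(Path(files[0]).read_text())
    return data.get("type") == "imagegen"
```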
Activity
  • The initial code for this pull request was generated using Claude Code.

@gemini-code-assist gemini-code-assist bot left a comment
Code Review

This pull request introduces a new imagegen subcommand to benchmark text-to-image generation models, which is a great addition. The implementation is comprehensive, including support for the vllm-omni backend, new data schemas for image generation results, and well-formatted display panels for configuration and results. The changes are well-integrated with the existing codebase. My feedback includes a couple of suggestions to improve code structure and maintainability by decoupling logic from the CLI framework and reducing code duplication.

Comment thread vlmbench/cli.py
Comment on lines +1491 to +1493
console.print()
print_error("Warmup failed", result["error"])
raise typer.Exit(1)
Severity: medium

Calling typer.Exit(1) from within this core benchmark function tightly couples it to the typer CLI framework. This can make the function harder to reuse or test in other contexts. A better practice is to raise a custom exception (e.g., BenchmarkWarmupError) and handle the exit gracefully in the imagegen command function that calls it. This would improve separation of concerns.
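The decoupling this comment suggests can be sketched as follows: raise a domain-specific exception in the benchmark core and translate it to an exit code only at the CLI boundary. `BenchmarkWarmupError` is the name the review proposes; `run_warmup` and `imagegen_command` are illustrative stand-ins for the PR's actual functions.

```python
class BenchmarkWarmupError(RuntimeError):
    """Raised by the benchmark core when the warmup request fails."""


def run_warmup(result: dict) -> None:
    # Core logic stays framework-agnostic: no typer import here.
    if "error" in result:
        raise BenchmarkWarmupError(result["error"])


def imagegen_command(result: dict) -> int:
    # CLI boundary: catch the domain error and map it to an exit code
    # (with typer this would be `raise typer.Exit(1)`).
    try:
        run_warmup(result)
    except BenchmarkWarmupError as exc:
        print(f"Warmup failed: {exc}")
        return 1
    return 0
```

This keeps `run_warmup` directly testable without invoking the CLI framework.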

Comment thread vlmbench/cli.py
Comment on lines +2517 to +2541
if is_imagegen:
    ig_results: list[ImageGenBenchmarkResult] = []
    for filepath in files:
        path = Path(filepath)
        if not path.exists():
            console.print(f"[red]File not found: {filepath}[/red]")
            raise typer.Exit(1)
        with open(path) as f:
            data = json.load(f)
        ig_results.append(_dc_from_dict(ImageGenBenchmarkResult, data))
    print_imagegen_compare_table(ig_results)
else:
    results: list[BenchmarkResult] = []
    for filepath in files:
        path = Path(filepath)
        if not path.exists():
            console.print(f"[red]File not found: {filepath}[/red]")
            raise typer.Exit(1)
        with open(path) as f:
            data = json.load(f)
        results.append(_dc_from_dict(BenchmarkResult, data))

    # Sort by total tokens_per_sec (across workers) descending
    results.sort(key=lambda r: r.results.tokens_per_sec * r.input.max_concurrency, reverse=True)
    print_compare_table(results)
Severity: medium

The logic for loading and parsing result files is duplicated in the if and else blocks. This can be consolidated to improve maintainability and reduce redundancy. You can determine the result class first, then have a single loop to load all files.

    result_cls = ImageGenBenchmarkResult if is_imagegen else BenchmarkResult
    all_results = []
    for filepath in files:
        path = Path(filepath)
        if not path.exists():
            console.print(f"[red]File not found: {filepath}[/red]")
            raise typer.Exit(1)
        with open(path) as f:
            data = json.load(f)
        all_results.append(_dc_from_dict(result_cls, data))

    if is_imagegen:
        print_imagegen_compare_table(all_results)
    else:
        # Sort by total tokens_per_sec (across workers) descending
        all_results.sort(key=lambda r: r.results.tokens_per_sec * r.input.max_concurrency, reverse=True)
        print_compare_table(all_results)

* Merge spillai/init-cli: accept shlex safety fixes, keep imagegen code
* Add .subtask to gitignore
* Add imagegen subcommand for vllm-omni image generation benchmarking
* Update README with configuration details and image

Subtask-Task: feat/vllm-omni
@spillai spillai changed the base branch from fix/pypdfium-pr to main on February 8, 2026 at 00:20
When --backend is explicitly set (e.g. vllm-omni), use it for the
monitor session name instead of the auto-detected backend type.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
