feat(vllm): add Gemma 4 models, image, and ROCm serving recipes#144

Open
coketaste wants to merge 1 commit into ROCm:develop from coketaste:coketaste/gemma4

Conversation

@coketaste
Contributor

  • Register pyt_vllm_gemma-4-26b-a4b-it and pyt_vllm_gemma-4-31b-it in models.json (gemma4 Docker stack).
  • Add docker/pyt_vllm_gemma4.ubuntu.amd.Dockerfile from vllm/vllm-openai-rocm:gemma4 with transformers 5.5.0.
  • Extend scripts/vllm/configs/default.yaml with Gemma 4 serving blocks (TRITON_ATTN, gfx942 float16; 26B MoE disables AITER fused MoE).
  • Quote JSON-like and whitespace-containing extra_args values in run_vllm.py (via shlex.quote) so --limit-mm-per-prompt works alongside the existing --flag YAML keys.
  • Document Gemma 4 in benchmark/vllm/README.md.

@coketaste coketaste requested a review from gargrahul as a code owner April 14, 2026 18:43
Copilot AI review requested due to automatic review settings April 14, 2026 18:43
@coketaste coketaste self-assigned this Apr 14, 2026

Copilot AI left a comment

Pull request overview

Adds Gemma 4 (26B-A4B-it and 31B-it) vLLM serving support to the MAD benchmarking stack, including new model registrations, ROCm/Gemma4 Docker build plumbing, and documented serving recipes.

Changes:

  • Registered two Gemma 4 vLLM models in models.json and documented them in benchmark/vllm/README.md.
  • Added a Gemma4-specific AMD Ubuntu Dockerfile based on vllm/vllm-openai-rocm:gemma4 and extended scripts/vllm/configs/default.yaml with Gemma 4 serving recipes/overrides.
  • Updated scripts/vllm/run_vllm.py to shell-quote JSON-like/whitespace-containing extra_args values (notably --limit-mm-per-prompt).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Summary per file:

  • scripts/vllm/run_vllm.py: Adjusts extra_args formatting/quoting when composing the vLLM command line.
  • scripts/vllm/configs/default.yaml: Adds Gemma 4 serving benchmark blocks and gfx942 dtype overrides.
  • models.json: Registers Gemma 4 vLLM models and their MAD metadata/output CSV names.
  • docker/pyt_vllm_gemma4.ubuntu.amd.Dockerfile: Introduces a Gemma4-tagged base image Dockerfile and pins transformers.
  • benchmark/vllm/README.md: Documents Gemma 4 image tag usage, gating/token requirements, and recipe details.


RUN pip3 list

# Specify entrypoint to override upstream
ENTRYPOINT [""]
Comment thread scripts/vllm/run_vllm.py
Comment on lines +494 to +503
    s = str(v)
    st = s.strip()
    if (
        k == "--limit-mm-per-prompt"
        or (st[:1] in "{[")
        or any(ch.isspace() for ch in s)
    ):
        extra_args_str += f" {k} {shlex.quote(s)}"
    else:
        extra_args_str += f" {k} {v}"
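A minimal standalone sketch of the quoting rule in the diff above, pulled out into a hypothetical helper (`quote_extra_arg` is not a name from the PR; `k` and `v` follow the snippet's conventions):

```python
import shlex


def quote_extra_arg(k: str, v) -> str:
    """Mirror the PR's rule: shell-quote the value when the flag is
    --limit-mm-per-prompt, the value looks JSON-like (starts with '{'
    or '['), or it contains whitespace; otherwise pass it through."""
    s = str(v)
    st = s.strip()
    if (
        k == "--limit-mm-per-prompt"
        or st[:1] in "{["
        or any(ch.isspace() for ch in s)
    ):
        return f" {k} {shlex.quote(s)}"
    return f" {k} {v}"


# JSON-like value is single-quoted so the shell passes it through intact.
print(quote_extra_arg("--limit-mm-per-prompt", '{"image": 2}'))
# -> --limit-mm-per-prompt '{"image": 2}'

# Plain scalar values are left unquoted.
print(quote_extra_arg("--max-model-len", 8192))
# -> --max-model-len 8192
```

Without the quoting, a value like `{"image": 2}` would be split by the shell at the space and the braces could trigger brace expansion, so the vLLM CLI would never see the intact JSON argument.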

2 participants