[4/n][trainer] feat: flowgrpo - add diffusers + fsdp engine support #50
Pull request overview
Adds a Diffusers-based diffusion model training path to VERL’s trainer stack, integrating a new FSDP/FSDP2 engine implementation and FlowGRPO-specific loss/scheduler utilities.
Changes:
- Introduces DiffusersFSDPEngine (FSDP/FSDP2) for diffusion-model training/inference, including checkpointing and LoRA handling.
- Adds diffusion-specific padding + loss utilities and FlowGRPO policy loss / image KL helper.
- Adds Diffusers model config + scheduler/model helpers and a sanity test for the new engine.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| verl/workers/utils/padding.py | Adds prompt-embed padding→no-padding conversion for diffusion batches. |
| verl/workers/utils/losses.py | Adds diffusion_loss and imports kl_penalty_image. |
| verl/workers/engine_workers.py | Makes worker logic tolerant of diffusion configs (missing HF fields / input_ids). |
| verl/workers/engine/fsdp/diffusers_impl.py | New Diffusers FSDP engine implementation (core of the PR). |
| verl/workers/engine/fsdp/__init__.py | Conditionally exports the Diffusers FSDP engine. |
| verl/workers/engine/__init__.py | Exposes Diffusers engine at the package level when available. |
| verl/workers/config/model.py | Adds DiffusersModelConfig dataclass. |
| verl/workers/config/engine.py | Allows TrainingWorkerConfig.model_config to be HF or Diffusers config. |
| verl/utils/fsdp_utils.py | Extends LoRA param collection to support diffusers module naming/prefixes. |
| verl/trainer/ppo/core_algos.py | Adds FlowGRPO policy loss + kl_penalty_image. |
| verl/trainer/config/model/diffusers_model.yaml | New Hydra config template for Diffusers models. |
| verl/models/diffusers_model/* | Adds diffusion-model abstraction utilities, Qwen-Image adapter, and FlowMatch SDE scheduler. |
| tests/special_sanity/check_device_api_usage.py | Updates allowlist for new engine file. |
| tests/models/test_diffusers_fsdp_engine.py | Adds a Ray-based integration test for diffusion FSDP engine. |
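The padding→no-padding conversion listed for verl/workers/utils/padding.py can be pictured with a small sketch. This is not the PR's implementation, whose signature this summary does not show. The helper name `unpad_embeds`, the nested-list layout, and the cumulative-length return value are all assumptions chosen for illustration, and a real version would presumably operate on tensors rather than Python lists.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def unpad_embeds(
    embeds: Sequence[Sequence[Vector]],
    mask: Sequence[Sequence[int]],
) -> Tuple[List[Vector], List[int]]:
    """Hypothetical helper: drop padded positions from prompt embeddings.

    embeds: [batch][seq][hidden] padded embeddings.
    mask:   [batch][seq], 1 for real tokens and 0 for padding.
    Returns the kept vectors concatenated over the batch, plus cumulative
    sequence lengths so each sample's span can be recovered later.
    """
    packed: List[Vector] = []
    cu_seqlens: List[int] = [0]
    for sample, sample_mask in zip(embeds, mask):
        # Keep only the positions the mask marks as real tokens.
        kept = [vec for vec, keep in zip(sample, sample_mask) if keep]
        packed.extend(kept)
        cu_seqlens.append(cu_seqlens[-1] + len(kept))
    return packed, cu_seqlens
```

Sample `i` then occupies `packed[cu_seqlens[i]:cu_seqlens[i + 1]]`, which is one common way to carry variable-length sequences without padding.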
Pull request overview
Adds first-pass Diffusers (diffusion model) training support on the existing worker/engine stack by introducing an FSDP/FSDP2 engine implementation and the accompanying FlowGRPO loss/scheduler plumbing.
Changes:
- Introduce DiffusersFSDPEngine (FSDP/FSDP2) plus diffusion-model utilities (scheduler + model-specific hooks).
- Add FlowGRPO policy loss + image KL helper and a diffusion-specific loss wrapper.
- Add prompt-embed padding→no-padding conversion and new CPU/GPU tests + config YAML for diffusers models.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| verl/workers/utils/padding.py | Add embeds_padding_2_no_padding for diffusion prompt embeds. |
| verl/workers/utils/losses.py | Add diffusion_loss and image KL integration. |
| verl/workers/engine_workers.py | Make worker logic robust when model_config lacks LLM-only fields (e.g., hf_config, input_ids). |
| verl/workers/engine/fsdp/diffusers_impl.py | New Diffusers FSDP/FSDP2 engine implementation. |
| verl/workers/engine/fsdp/__init__.py | Conditional export of DiffusersFSDPEngine. |
| verl/workers/engine/__init__.py | Export DiffusersFSDPEngine when available. |
| verl/workers/config/model.py | Add DiffusersModelConfig. |
| verl/workers/config/engine.py | Allow TrainingWorkerConfig.model_config to be HF or Diffusers config. |
| verl/utils/fsdp_utils.py | Extend LoRA param collection to support diffusers module structure. |
| verl/trainer/ppo/core_algos.py | Register flow_grpo policy loss + add kl_penalty_image. |
| verl/trainer/config/model/diffusers_model.yaml | Add Hydra config template for diffusers models. |
| verl/models/diffusers_model/* | New diffusion model base/registry, QwenImage hook, scheduler impl, and utils. |
| tests/utils/test_padding_on_cpu.py | Unit test for embed padding→no-padding conversion. |
| tests/trainer/ppo/test_core_algos_on_cpu.py | Unit test for FlowGRPO policy loss. |
| tests/special_sanity/check_device_api_usage.py | Allowlist new engine file. |
| tests/models/test_diffusers_fsdp_engine.py | End-to-end-ish smoke test for diffusers engine (fsdp/fsdp2). |
What does this PR do?
Checklist Before Starting
- Format the PR title as `[{modules}] {type}: {description}` (this will be checked by the CI).
- `{modules}` include `fsdp`, `megatron`, `veomni`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`, `cfg`, `reward`, `fully_async`, `one_step_off`.
- Multiple modules are listed like `[megatron, fsdp, doc]`.
- `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`.
- For a breaking change, add `[BREAKING]` to the beginning of the title, e.g. `[BREAKING][fsdp, megatron] feat: dynamic batching`.
Test
API and Usage Example
# Add code snippet or script demonstrating how to use this
Design & Code Changes
Checklist Before Submitting
Important
Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.
- Run the pre-commit checks: `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- Request CI through the `ci-request` channel in the `verl` Slack workspace. (If not accessible, please try the Feishu group (飞书群).)
- If the PR touches the `recipe` submodule, please also update the reference to the submodule commit via `git submodule update --remote` or `cd recipe && git pull origin main`.
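The title convention under Checklist Before Starting can be checked mechanically. The sketch below validates only the documented shape, `[{modules}] {type}: {description}` with an optional leading `[BREAKING]`. It is not the project's CI check, whose rules this page does not show, and `check_title` is a hypothetical name.

```python
import re

# Module and type names copied from the checklist above.
MODULES = {
    "fsdp", "megatron", "veomni", "sglang", "vllm", "rollout", "trainer", "ci",
    "training_utils", "recipe", "hardware", "deployment", "ray", "worker",
    "single_controller", "misc", "perf", "model", "algo", "env", "tool", "ckpt",
    "doc", "data", "cfg", "reward", "fully_async", "one_step_off",
}
TYPES = {"feat", "fix", "refactor", "chore", "test"}

# Optional [BREAKING], one bracketed module list, a type, then a description.
TITLE_RE = re.compile(r"^(\[BREAKING\])?\[([^\]]+)\] (\w+): (.+)$")


def check_title(title: str) -> bool:
    """Return True if `title` matches the documented PR title format."""
    match = TITLE_RE.match(title)
    if match is None:
        return False
    modules = [name.strip() for name in match.group(2).split(",")]
    return all(name in MODULES for name in modules) and match.group(3) in TYPES
```

For instance, `check_title("[BREAKING][fsdp, megatron] feat: dynamic batching")` accepts the template's own example, while a title with an unknown type such as `docs` is rejected.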