[2/n][rollout] feat: flowgrpo - add diffusion agent loop support #53
AndyZhou952 wants to merge 118 commits into zhtmike:main
Conversation
Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>
`class DiffusionAgentLoopWorker:`
Could this inherit from AgentLoopWorker? Please check for repeated methods.
It feels like it makes more sense to keep separate classes, considering the number of overrides needed (trace removed, DiffusersModelConfig, sampling params, etc.).
fix CI pls
Pull request overview
This PR introduces diffusion-model support to the experimental agent loop stack, adding a diffusion-specific single-turn agent loop, a diffusion worker pipeline for batching/postprocessing, and an initial test targeting diffusion rollouts.
Changes:
- Add `DiffusionSingleTurnAgentLoop` and register it as `diffusion_single_turn_agent`.
- Extend the agent loop manager/worker codepath with `DiffusionAgentLoopWorker` and diffusion-specific output schemas.
- Add a new diffusion agent loop test under `tests/experimental/agent_loop`.
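The registration step in the first bullet can be pictured with a small name-to-class registry. The following is a minimal sketch, not the code from this PR: the `register` decorator, the `_AGENT_LOOP_REGISTRY` dict, the `get_agent_loop` helper, and the `run` signature are all hypothetical stand-ins, and the class body only returns a placeholder instead of calling a rollout server.

```python
from typing import Callable, Dict, Type

# Hypothetical registry mapping an agent name to its loop class.
_AGENT_LOOP_REGISTRY: Dict[str, Type] = {}


def register(name: str) -> Callable[[Type], Type]:
    """Class decorator that records the decorated class under `name`."""

    def decorator(cls: Type) -> Type:
        if name in _AGENT_LOOP_REGISTRY:
            raise ValueError(f"agent loop {name!r} is already registered")
        _AGENT_LOOP_REGISTRY[name] = cls
        return cls

    return decorator


@register("diffusion_single_turn_agent")
class DiffusionSingleTurnAgentLoop:
    """Stand-in for the PR's class (assumption: the real one queries a rollout server)."""

    def run(self, prompt: str, negative_prompt: str = "") -> dict:
        # Placeholder output; a real loop would return generated image data.
        return {"prompt": prompt, "negative_prompt": negative_prompt, "image": None}


def get_agent_loop(name: str):
    """Look up a registered loop class by name and instantiate it."""
    return _AGENT_LOOP_REGISTRY[name]()
```

With a registry like this, a config value such as `diffusion_single_turn_agent` is enough to select the loop at runtime, which is presumably why the PR registers the new class under a string name.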
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| `verl/experimental/agent_loop/single_turn_agent_loop.py` | Adds a diffusion-oriented single-turn agent loop that passes negative prompts and expects image outputs. |
| `verl/experimental/agent_loop/agent_loop.py` | Adds diffusion output models, diffusion worker implementation, and routes manager to the diffusion worker based on `model_type`. |
| `verl/experimental/agent_loop/__init__.py` | Exposes `DiffusionAgentLoopWorker` in the package exports. |
| `tests/experimental/agent_loop/test_diffusion_agent_loop.py` | Adds an end-to-end-style diffusion rollout test. |
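
The routing described for `agent_loop.py` (manager picks the diffusion worker based on `model_type`) might look roughly like the sketch below. It is an illustration under assumptions: the `ModelConfig` dataclass, the `select_worker_cls` helper, and the string values `"language_model"` and `"diffusion_model"` are hypothetical, and the two worker classes are empty stand-ins for the real ones.

```python
from dataclasses import dataclass


class AgentLoopWorker:
    """Stand-in for the existing language-model worker."""

    kind = "language"


class DiffusionAgentLoopWorker:
    """Stand-in for the diffusion worker added in this PR."""

    kind = "diffusion"


@dataclass
class ModelConfig:
    # The field name `model_type` comes from the review summary;
    # the allowed values used here are assumptions.
    model_type: str = "language_model"


def select_worker_cls(config: ModelConfig) -> type:
    """Return the worker class the manager should instantiate."""
    if config.model_type == "diffusion_model":
        return DiffusionAgentLoopWorker
    return AgentLoopWorker
```

Keeping the choice in one small function matches the reviewers' preference for separate worker classes: the manager only needs to know which class to build, not how the two workers differ internally.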
move to verl PR
What does this PR do?
Add diffusion agent loop support
Checklist Before Starting
- Format the PR title as `[{modules}] {type}: {description}` (this will be checked by the CI).
  - `{modules}` include `fsdp`, `megatron`, `veomni`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`, `cfg`, `reward`, `fully_async`, `one_step_off`.
  - Multiple modules are written like `[megatron, fsdp, doc]`.
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`.
  - Add `[BREAKING]` to the beginning of the title for breaking changes, e.g. `[BREAKING][fsdp, megatron] feat: dynamic batching`.

Test
API and Usage Example
# Add code snippet or script demonstrating how to use this
Design & Code Changes
Checklist Before Submitting
Important
Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.
- Run `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`.
- Request CI in the `ci-request` channel in the `verl` Slack workspace. (If not accessible, please try the Feishu group (飞书群).)
- If the PR touches the `recipe` submodule, please also update the reference to the submodule commit via `git submodule update --remote` or `cd recipe && git pull origin main`.