Fix get_wandb_tags tag-limit handling and log all GRPO configs by mnoukhov · Pull Request #1727 · allenai/open-instruct

mnoukhov · 2026-06-17T20:28:14Z

What

Two small fixes ported over from another branch:

get_wandb_tags tag-limit handling. get_wandb_tags now accepts an extra_tags argument (e.g. the experiment name). Extra tags are prepended and, like all tags, truncated to W&B's 64-char limit. Previously callers did [args.exp_name] + get_wandb_tags(), which bypassed truncation and could exceed W&B's limit. All callers (grpo_fast.py, finetune.py, dpo_tune_cache.py, reward_modeling.py) are updated, and the OLMo-core GRPO WandBCallback now also receives tags.
Log all GRPO config dataclasses. grpo.py now logs tc, model_config, streaming_config, and vllm_config into the wandb config in addition to args, instead of only args.
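
To illustrate the first fix, here is a minimal sketch of prepend-then-truncate tag handling. It is not the actual open-instruct implementation: `build_wandb_tags`, `WANDB_TAG_MAX_CHARS`, and the example tag values are hypothetical names chosen for this sketch, and the 64-character limit is taken from the description above.

```python
WANDB_TAG_MAX_CHARS = 64  # limit quoted in the PR description (assumption for this sketch)


def build_wandb_tags(base_tags, extra_tags=None):
    """Hypothetical helper: prepend extra tags, then truncate every tag."""
    combined = list(extra_tags or []) + list(base_tags)
    return [tag[:WANDB_TAG_MAX_CHARS] for tag in combined]


long_exp_name = "grpo_" + "x" * 100  # 105 characters, longer than the limit

# Old calling pattern: the experiment name is added outside the helper,
# so it is never truncated.
old_style = [long_exp_name] + build_wandb_tags(["example-tag"])

# New calling pattern: the experiment name goes through the helper
# and is truncated like every other tag.
new_style = build_wandb_tags(["example-tag"], extra_tags=[long_exp_name])
```

In the old pattern the first tag keeps its full 105 characters, while in the new pattern every tag is at most 64 characters and the experiment name still comes first.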

Testing

make style && make quality pass.

🤖 Generated with Claude Code

gemini-code-assist

Code Review

This pull request refactors the Weights & Biases (W&B) integration across multiple training scripts to use an updated get_wandb_tags function that prepends extra tags. It also updates grpo.py to log multiple configuration dataclasses to the W&B config. Feedback was provided on grpo.py regarding the risk of unpacking multiple dataclasses using ** in a single .update() call, which could cause a runtime TypeError if there are overlapping field names; updating the dictionary sequentially is recommended instead.


gemini-code-assist · 2026-06-17T20:29:14Z

+    json_config.update(
+        **dataclasses.asdict(args),
+        **dataclasses.asdict(tc),
+        **dataclasses.asdict(model_config),
+        **dataclasses.asdict(streaming_config),
+        **dataclasses.asdict(vllm_config),
+    )


Unpacking multiple dataclasses as keyword arguments using ** in a single update() call is risky. If any of these dataclasses (args, tc, model_config, streaming_config, vllm_config) share any field names (either now or in the future), Python will raise a TypeError: update() got multiple values for keyword argument ... at runtime.

To prevent potential runtime crashes and ensure robustness, update the dictionary sequentially by passing the dictionaries directly to .update().

for config in (args, tc, model_config, streaming_config, vllm_config):
    json_config.update(dataclasses.asdict(config))

mnoukhov · Jun 17, 2026

This would silently overwrite some configs with others. I prefer to have it fail if two dataclasses share configs.
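
The trade-off discussed in this thread can be reproduced with a small sketch. The two dataclasses below are hypothetical stand-ins with a deliberately shared `seed` field; they are not the real open-instruct config classes.

```python
import dataclasses


@dataclasses.dataclass
class ArgsSketch:
    seed: int = 1
    exp_name: str = "demo"


@dataclasses.dataclass
class ModelConfigSketch:
    seed: int = 2  # deliberately collides with ArgsSketch.seed
    model_name: str = "tiny"


args = ArgsSketch()
model_config = ModelConfigSketch()

# Single update() call with ** unpacking: the shared field name makes Python
# raise TypeError while building the keyword arguments, before update() runs.
strict_config = {}
try:
    strict_config.update(**dataclasses.asdict(args), **dataclasses.asdict(model_config))
    collision_error = None
except TypeError as exc:
    collision_error = exc

# Sequential update(): no error, the later dataclass overwrites the earlier one.
loose_config = {}
for config in (args, model_config):
    loose_config.update(dataclasses.asdict(config))
```

Here `collision_error` holds a `TypeError` and `strict_config` stays empty, whereas `loose_config["seed"]` ends up as 2 with no warning that the value from `ArgsSketch` was dropped. The first behavior is the fail-fast one preferred in the reply above.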

mnoukhov added 3 commits June 17, 2026 16:27

correct wandb tag limit

c4de080

log all args

dd15db7

Update CHANGELOG

bf5e0f9

mnoukhov force-pushed the wandb-tags-and-grpo-config branch from de66fa9 to bf5e0f9 on June 17, 2026 20:28

gemini-code-assist Bot reviewed Jun 17, 2026

mnoukhov wants to merge 3 commits into main from wandb-tags-and-grpo-config
