Drop stale async rollout results by hamishivi · Pull Request #1738 · allenai/open-instruct

hamishivi · 2026-06-25T16:20:23Z

Summary

Add max_result_age_steps to accumulate_inference_batches: in async GRPO, a rollout whose generating policy is more than that many steps behind the current trainer step (training_step - result.model_step > max_result_age_steps) is dropped instead of trained on.
When a stale result is dropped, a fresh prompt is replenished into the generator (when replenish_prompts is on) so throughput is unaffected, and the result is not kept for requeue-on-timeout.
Logs a stale_results_dropped reward metric per accumulation.
Wired from DataPreparationActor using async_steps as the staleness threshold (results more than one async window behind are dropped).

Notes

model_step tracking and the model_step_min/max/mean / num_steps_off_policy metrics already existed on main; this only adds the drop policy on top.
The staleness predicate is factored into a small pure result_is_stale(...) helper for testability.

Test plan

New unit tests TestResultIsStale cover: disabled when max_result_age_steps/training_step/model_step is None, stale when lag exceeds the threshold, not stale at the threshold boundary, and not stale when fresh.
ruff, formatting, and whole-project ty check pass.

GPU_TESTS=bypass

Made with Cursor

Add max_result_age_steps to accumulate_inference_batches: rollouts whose generating policy is more than that many steps behind the trainer are dropped (and a fresh prompt is replenished) rather than trained on, with a stale_results_dropped metric. Wired from the DataPreparationActor using async_steps as the threshold. Adds a result_is_stale helper + unit tests. Co-authored-by: Cursor <cursoragent@cursor.com>

Co-authored-by: Cursor <cursoragent@cursor.com>

gemini-code-assist

Code Review

This pull request introduces a mechanism to drop stale async rollout results whose generating policy lags too far behind the trainer, replenishing them with fresh prompts and logging a stale_results_dropped metric. Unit tests were also added to verify the staleness logic. The reviewer identified a critical issue where dropping stale results when replenish_prompts is disabled could lead to a silent deadlock or hang due to a drained pipeline, and suggested adding a validation check to prevent this configuration.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-25T16:21:48Z

+        if result_is_stale(result.model_step, training_step, max_result_age_steps):
+            stale_results_dropped += 1
+            logger.warning(
+                "[accumulate_inference_batches] Dropping stale result for index=%s at training_step=%s: "
+                "model_step=%s lag=%s max_result_age_steps=%s",
+                result.index,
+                training_step,
+                result.model_step,
+                training_step - result.model_step,
+                max_result_age_steps,
+            )
+            if replenish_prompts:
+                assert iter_dataloader is not None and param_prompt_Q is not None
+                add_prompt_to_generator(
+                    next(iter_dataloader),
+                    iter_dataloader._epoch,
+                    param_prompt_Q,
+                    generation_config,
+                    is_eval=False,
+                    base_env_config=base_env_config,
+                    ground_truth_overrides=ground_truth_overrides,
+                )
+            continue


If max_result_age_steps is configured but replenish_prompts is False, dropping stale results without replenishing them will permanently reduce the number of prompts in flight. Over time, this will drain the pipeline and cause a silent deadlock/hang as the accumulator waits indefinitely for num_prompts results that will never arrive.

To prevent this, we should explicitly validate that replenish_prompts is enabled when staleness checking is active.

if result_is_stale(result.model_step, training_step, max_result_age_steps): if not replenish_prompts: raise ValueError("max_result_age_steps requires replenish_prompts to be True to avoid deadlocks.") stale_results_dropped += 1 logger.warning( "[accumulate_inference_batches] Dropping stale result for index=%s at training_step=%s: " "model_step=%s lag=%s max_result_age_steps=%s", result.index, training_step, result.model_step, training_step - result.model_step, max_result_age_steps, ) assert iter_dataloader is not None and param_prompt_Q is not None add_prompt_to_generator( next(iter_dataloader), iter_dataloader._epoch, param_prompt_Q, generation_config, is_eval=False, base_env_config=base_env_config, ground_truth_overrides=ground_truth_overrides, ) continue

Co-authored-by: Cursor <cursoragent@cursor.com>

Dropping stale results without replenishing would steadily drain the in-flight prompt pool and hang the accumulator, so validate up front that max_result_age_steps is only used with replenish_prompts=True. The drop path now replenishes unconditionally. Adds a unit test for the guard. Co-authored-by: Cursor <cursoragent@cursor.com>

hamishivi · 2026-06-25T16:29:40Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a mechanism to drop stale async rollout results whose generating policy lags too far behind the trainer (exceeding max_result_age_steps). When a stale result is detected, it is discarded, a fresh prompt is replenished to prevent the pipeline from draining, and a stale_results_dropped metric is logged. Unit tests have been added to verify the staleness detection logic and the associated configuration validation. There are no review comments, so no additional feedback is provided.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

farhatkevin

looks good!

hamishivi and others added 2 commits June 25, 2026 09:20

Add CHANGELOG entry for stale rollout dropping (#1738)

deb9a0a

Co-authored-by: Cursor <cursoragent@cursor.com>

gemini-code-assist Bot reviewed Jun 25, 2026

View reviewed changes

hamishivi and others added 2 commits June 25, 2026 09:22

Log per-batch count of dropped stale rollout results

bccb5ca

Co-authored-by: Cursor <cursoragent@cursor.com>

gemini-code-assist Bot reviewed Jun 25, 2026

View reviewed changes

hamishivi requested a review from farhatkevin June 25, 2026 16:48

farhatkevin approved these changes Jun 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Drop stale async rollout results#1738

Drop stale async rollout results#1738
hamishivi wants to merge 4 commits into
mainfrom
hamishivi/drop-stale-rollouts

hamishivi commented Jun 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Uh oh!

hamishivi commented Jun 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

farhatkevin left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

hamishivi commented Jun 25, 2026

Summary

Notes

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

hamishivi commented Jun 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

farhatkevin left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants