[fix] stop merging agentic turns at first non-COMPLETED turn #1323

Merged: Shi-Dong merged 3 commits into main from shi/fix-merge-samples-truncated on Jun 13, 2026

Conversation

@Shi-Dong

Contributor

When MILES_EXPERIMENTAL_ROLLOUT_REFACTOR is on, each LLM call within a multi-turn agentic trajectory produces its own Sample, and merge_samples folds them left into one training Sample via _merge_sample_pair. _merge_sample_pair asserts the accumulated turn is COMPLETED before appending the next, encoding the invariant that only the final turn may be non-COMPLETED.

A trajectory whose intermediate turn was TRUNCATED (hit --rollout-max-response-len mid-generation) but whose agent harness still produced later turns violated that invariant and crashed the rollout loop with AssertionError: a.status must be COMPLETED, got Status.TRUNCATED.
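The failure mode described above can be reproduced with a minimal sketch. `Status`, `Sample`, and `merge_samples_before_fix` here are simplified, hypothetical stand-ins (the real `Sample` in miles presumably carries more fields, and the pre-fix loop is assumed to be a plain left fold); only the assertion message is taken from the PR description.

```python
from dataclasses import dataclass, field
from enum import Enum
from functools import reduce


class Status(Enum):
    COMPLETED = "completed"
    TRUNCATED = "truncated"


@dataclass
class Sample:
    # Hypothetical, simplified stand-in for the real Sample class.
    tokens: list[int] = field(default_factory=list)
    status: Status = Status.COMPLETED


def _merge_sample_pair(a: Sample, b: Sample) -> Sample:
    # Invariant from the PR description: only a COMPLETED turn may be extended.
    assert a.status == Status.COMPLETED, f"a.status must be COMPLETED, got {a.status}"
    return Sample(tokens=a.tokens + b.tokens, status=b.status)


def merge_samples_before_fix(samples: list[Sample]) -> Sample:
    # Assumed pre-fix behavior: fold every turn left, with no early stop.
    return reduce(_merge_sample_pair, samples)


if __name__ == "__main__":
    turns = [
        Sample([1], Status.COMPLETED),
        Sample([2], Status.TRUNCATED),  # intermediate turn hit the length limit
        Sample([3], Status.COMPLETED),  # harness kept going anyway
    ]
    try:
        merge_samples_before_fix(turns)
    except AssertionError as exc:
        print(f"AssertionError: {exc}")
```

A truncated final turn is still fine in this sketch, since the assertion only checks the accumulated left operand.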

Fix: in merge_samples, stop folding at the first non-COMPLETED turn so the trajectory simply ends there (merged Sample retains the truncated status). Keeps the assertion in _merge_sample_pair as a real invariant. No behavior change for fully-COMPLETED trajectories.
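A rough sketch of the early-stop fold, not the actual patch: `Status` and `Sample` are simplified, hypothetical stand-ins, and the loop shape is an assumption based on the description above. Only the names `merge_samples` and `_merge_sample_pair` and the assertion message come from the PR.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    COMPLETED = "completed"
    TRUNCATED = "truncated"


@dataclass
class Sample:
    # Hypothetical, simplified stand-in for the real Sample class.
    tokens: list[int] = field(default_factory=list)
    status: Status = Status.COMPLETED


def _merge_sample_pair(a: Sample, b: Sample) -> Sample:
    # The assertion stays in place as a real invariant.
    assert a.status == Status.COMPLETED, f"a.status must be COMPLETED, got {a.status}"
    return Sample(tokens=a.tokens + b.tokens, status=b.status)


def merge_samples(samples: list[Sample]) -> Sample:
    merged = samples[0]
    for nxt in samples[1:]:
        # Stop folding at the first non-COMPLETED turn; later turns are dropped
        # and the merged Sample keeps the non-COMPLETED status.
        if merged.status != Status.COMPLETED:
            break
        merged = _merge_sample_pair(merged, nxt)
    return merged


if __name__ == "__main__":
    turns = [
        Sample([1], Status.COMPLETED),
        Sample([2], Status.TRUNCATED),
        Sample([3], Status.COMPLETED),
    ]
    merged = merge_samples(turns)
    print(merged.tokens, merged.status)
```

Because the check happens before each call to `_merge_sample_pair`, the assertion can no longer fire from this loop, and a trajectory whose turns are all COMPLETED is folded exactly as before.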

Repro: DeepSeek-V4-Flash agentic RL on Terminal-Bench-2/terminus-2; crashed right after step 0 on a polyglot task that truncated an intermediate turn.

merge_samples folds per-turn Samples of a multi-turn agentic trajectory
into one training Sample. _merge_sample_pair asserts the accumulated turn
is COMPLETED before appending the next, encoding the invariant that only
the final turn may be non-COMPLETED. When an intermediate turn TRUNCATED
(hit rollout-max-response-len mid-generation) yet the agent harness still
produced later turns, the assertion crashed the rollout loop. Stop folding
at the first non-COMPLETED turn so the trajectory ends there.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the merge_samples function in miles/rollout/generate_utils/sample_utils.py to ensure that only a COMPLETED turn can be extended by a later turn. If an intermediate turn is truncated (i.e., its status is not COMPLETED), the merging loop breaks early. There are no review comments, so there is no feedback to provide.

@guapisolo guapisolo left a comment

Collaborator

LGTM.

@Shi-Dong Shi-Dong merged commit 969c2a7 into main Jun 13, 2026
32 checks passed
@Shi-Dong Shi-Dong deleted the shi/fix-merge-samples-truncated branch June 13, 2026 02:07