fix(engine): resume length-truncated turns; unclamp Opus output (32K→128K) by mgoldsborough · Pull Request #374 · NimbleBrainInc/nimblebrain

mgoldsborough · 2026-06-03T07:01:35Z

What & why

RCA of a production session where a long agent turn was silently cut off mid-response. Two independent defects at two layers; both fixed here.

Symptom 1 — turn lost at the output ceiling

Magnitude: getModelCapabilities() in @ai-sdk/anthropic@3.0.64 predates Opus 4.7/4.8, so claude-opus-4-7 falls into the opus-4- catch-all (32K) and the SDK silently clamps the platform's 128K request down (dist/index.js:3168). The affected turn was truncated at exactly out=32000 with finishReason=length. Bumped to 3.0.81, whose table maps opus-4-7/opus-4-8 → 128K. With maxOutputTokens unset (the correct default), Opus now gets its real ceiling.
Mechanism: the agent loop ended the run on a finishReason: "length" response with no tool call, treating truncation as "done" (engine.ts). It now auto-resumes from the partial text, bounded by MAX_LENGTH_CONTINUATIONS (4), and emits a context.length_continuation event. Past the bound it ends with stopReason: "length" (unchanged from before).
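
The resume behavior described under Mechanism can be sketched roughly as below. This is a minimal illustration, not the code in engine.ts: the `ModelResponse` and `RunResult` shapes, the `generate` callback, and `runWithLengthResume` are hypothetical names, and the model call is synchronous here only to keep the sketch short. Only MAX_LENGTH_CONTINUATIONS (4) and the context.length_continuation event name come from this PR.

```typescript
// Hypothetical shapes; the real engine types are not shown in this PR.
type FinishReason = "stop" | "length" | "tool-calls";

interface ModelResponse {
  text: string;
  finishReason: FinishReason;
  hasToolCall: boolean;
}

interface RunResult {
  text: string;
  stopReason: FinishReason;
  continuations: number;
}

// Bound named in the PR description.
const MAX_LENGTH_CONTINUATIONS = 4;

// `generate` receives the partial text accumulated so far and returns the
// next chunk. A real engine would await a provider call here.
function runWithLengthResume(
  generate: (partialText: string) => ModelResponse,
  onEvent: (name: string) => void = () => {},
): RunResult {
  let text = "";
  let continuations = 0;
  for (;;) {
    const res = generate(text);
    text += res.text; // stitch the new chunk onto the partial text
    const truncated = res.finishReason === "length" && !res.hasToolCall;
    // End on any non-truncated response, or once the bound is exhausted.
    if (!truncated || continuations >= MAX_LENGTH_CONTINUATIONS) {
      return { text, stopReason: res.finishReason, continuations };
    }
    continuations += 1;
    onEvent("context.length_continuation");
  }
}
```

With a model that always truncates, this sketch makes five calls in total (the initial one plus four continuations) and then returns stopReason "length". A response that carries a tool call never enters the resume branch.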

Safety: reasoning models

A length cut mid-thinking drains an unsigned reasoning block; replaying it as the trailing assistant message is what Anthropic rejects ("thinking blocks in the latest assistant message cannot be modified"). The resume is guarded by hasUnsignedReasoning() — that case surfaces stopReason: "length" instead of resuming. (Caught by adversarial review; our default model runs with extended thinking.)
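
A guard of this kind might look like the sketch below. The function name hasUnsignedReasoning comes from this PR, but its signature, the `ContentPart` shape, and the `canResumeAfterLength` helper are assumptions made for illustration; the engine's real message types may differ.

```typescript
// Hypothetical content-part shape: a reasoning part carries a provider
// signature only once the provider has finished emitting the block.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string; signature?: string };

// True when any reasoning part is missing its signature, which is what a
// length cut in the middle of thinking would leave behind.
function hasUnsignedReasoning(parts: ContentPart[]): boolean {
  return parts.some((p) => p.type === "reasoning" && !p.signature);
}

// Assumed helper: resume only when replaying the partial message is safe.
function canResumeAfterLength(parts: ContentPart[]): boolean {
  return !hasUnsignedReasoning(parts);
}
```

Under these assumptions, a partial message with signed reasoning followed by truncated text is eligible for resume, while one whose reasoning block was cut before signing is not.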

Doc

Corrected the stale RuntimeConfig.maxOutputTokens doc ("Default: 16384") — unset resolves to the model's catalog ceiling; 16384 is only the unknown-model fallback.
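
The resolution order described here, and the catch-all clamp from the Magnitude item, can be illustrated with a small sketch. The tables and `resolveMaxOutputTokens` below are hypothetical and are not the SDK's or the platform's actual code; 128K is written as 128_000 as an assumption, and clamping an explicit setting to the ceiling is likewise assumed. The 32000 and 16384 values come from this PR.

```typescript
// Hypothetical capability entry: first matching pattern wins, so
// specific entries must precede the catch-all.
interface Capability {
  pattern: string;
  maxOutput: number;
}

// Stale table: no entry for opus-4-7/opus-4-8, only the catch-all.
const OLD_TABLE: Capability[] = [{ pattern: "opus-4-", maxOutput: 32_000 }];

// Updated table: specific entries ahead of the catch-all.
const NEW_TABLE: Capability[] = [
  { pattern: "opus-4-7", maxOutput: 128_000 }, // assumed exact value for "128K"
  { pattern: "opus-4-8", maxOutput: 128_000 },
  { pattern: "opus-4-", maxOutput: 32_000 },
];

// Fallback for models missing from the catalog, per the corrected doc.
const UNKNOWN_MODEL_FALLBACK = 16_384;

function resolveMaxOutputTokens(
  table: Capability[],
  modelId: string,
  configured?: number,
): number {
  const ceiling = table.find((c) => modelId.includes(c.pattern))?.maxOutput;
  if (ceiling === undefined) return configured ?? UNKNOWN_MODEL_FALLBACK;
  // Unset resolves to the catalog ceiling; an explicit value is clamped to it.
  return configured === undefined ? ceiling : Math.min(configured, ceiling);
}
```

In this sketch, `resolveMaxOutputTokens(OLD_TABLE, "claude-opus-4-7", 128_000)` returns 32000, reproducing the silent clamp, while the same call against `NEW_TABLE` returns 128000.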

Tests

New test/unit/engine-length-continuation.test.ts (6): resume + seamless stitch, continuation bound → length, non-length no-regression, tool-call path bypasses resume, unsigned-reasoning surfaces, signed-reasoning resumes.
Updated one existing finishReason test to the corrected behavior.
bun run verify:static ✅, tsc ✅, full unit suite green except one pre-existing unrelated failure (dompurify in an untouched bundle UI, fails identically on main).

Follow-ups (not in this PR)

coerce-input/validate: model stringified list arg reaches bundle as str (pydantic list_type) #372 — model occasionally emits a list arg as a JSON string that reaches the bundle as str; needs a repro with full args before fixing at the right layer.
The "Default: 16384" text also lives in the schemas repo (fetched artifact) — small follow-up there.

…128K)

A long agent turn that hit the model output ceiling was silently lost: the loop treated a `finishReason: "length"` response with no tool call as "model done" and ended the run mid-answer. Two independent causes, two layers:

- Magnitude: @ai-sdk/anthropic@3.0.64 predates Opus 4.7/4.8; both fall into its `opus-4-` catch-all (32K) and the SDK silently clamps the requested 128K down. Bump to 3.0.81, whose model table maps opus-4-7/4-8 → 128K, so an unset maxOutputTokens now resolves to the real catalog ceiling.
- Mechanism: the engine now auto-resumes a length-truncated turn from its partial text, bounded by MAX_LENGTH_CONTINUATIONS (4) and emitting a `context.length_continuation` event. Resumes are skipped when a reasoning block lacks its provider signature (a mid-thinking cut): replaying an unsigned thinking block as the trailing assistant message is rejected by Anthropic, so that case surfaces stopReason "length" instead.

Also corrects the stale "Default: 16384" doc on RuntimeConfig.maxOutputTokens (unset resolves to the catalog ceiling; 16384 is only the unknown-model fallback).

Tests: new test/unit/engine-length-continuation.test.ts covers resume + stitch, the continuation bound, the non-length no-regression, the tool-call path, and both signed/unsigned reasoning cases.

mgoldsborough added the qa-reviewed label ("QA review completed with no critical issues") Jun 3, 2026

docs(engine): note Anthropic prefill-continuation assumption on lengt…

19ca0d5

…h-resume Record why the resume isn't gated by provider (engine is provider-agnostic; the path is bounded and Anthropic-default) and what to do if a non-Anthropic model becomes a default. No behavior change.

mgoldsborough mentioned this pull request Jun 3, 2026

test(integration): stop mcp-resources flaking on subprocess-spawn timeout #377

Merged

mgoldsborough merged commit fac15c2 into main Jun 3, 2026
9 of 10 checks passed

mgoldsborough deleted the fix/engine-length-continuation-and-output branch June 3, 2026 07:39

mgoldsborough merged 2 commits into main from fix/engine-length-continuation-and-output
