feat: context optimization — summary limits, step truncation, min range, nudge tuning#36
Merged
…ge, nudge tuning

- maxSummaryLength config (default 100): reject compress if summary exceeds limit
- minCompressRange config (default 2000): reject compress if range too small
- stripStepMarkers in prune: skip step-start, truncate step-finish to 50 chars
- Nudge: target large tool outputs (>5000 chars) explicitly
- Shorter pressure level descriptions and per-message guidance
- Block ID list unchanged (accuracy requirement)

487 tests pass, typecheck clean
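The minCompressRange check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `RangeMessage` shape and the function names `countRangeChars` and `assertRangeLargeEnough` are hypothetical, and only the config name and its default of 2000 come from the commit message. The sketch counts each message ID once, on the assumption that a message should not contribute to the range size twice.

```typescript
// Hypothetical message shape; the plugin's real types are not shown in this PR.
interface RangeMessage {
  id: string;
  text: string;
}

// Default taken from the commit message above.
const DEFAULT_MIN_COMPRESS_RANGE = 2000;

// Sum the characters in the range, counting each message ID only once.
function countRangeChars(messages: RangeMessage[]): number {
  const seen = new Set<string>();
  let total = 0;
  for (const message of messages) {
    if (seen.has(message.id)) continue;
    seen.add(message.id);
    total += message.text.length;
  }
  return total;
}

// Throw when the range is too small for compression to pay for itself.
function assertRangeLargeEnough(
  messages: RangeMessage[],
  minCompressRange: number = DEFAULT_MIN_COMPRESS_RANGE,
): void {
  const chars = countRangeChars(messages);
  if (chars < minCompressRange) {
    throw new Error(
      `compress rejected: range has ${chars} chars, minimum is ${minCompressRange}`,
    );
  }
}
```

Rejecting before any work is done keeps a small range from costing more in summary and bookkeeping overhead than it saves.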
4b5e227 to 8aa9480
…ts, dedup

- config: raise maxSummaryLength default 100 -> 200 (less aggressive)
- prune: guard step-finish truncation with truncated !== reason so the parts array reference stays stable on idempotent re-runs (prefix cache)
- compress message/range: dedup messageId set in minCompressRange char counting (message.ts now matches range.ts); document that the throw is intentionally placed after prepareSession with no persisted state
- tests: add stripStepMarkers regression coverage (removal, truncation, short-reason preserve, idempotency, no-op on clean messages)

typecheck clean, 492 tests pass
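The idempotency guard on step-finish truncation can be illustrated with a small sketch. Only the part type names `step-start` and `step-finish`, the `truncated !== reason` comparison, and the 50-character limit come from the commit messages; the `Part` and `Message` shapes, the `reason` field layout, and this particular `stripStepMarkers` body are assumptions made for illustration.

```typescript
// Hypothetical part shapes; field names are assumptions, not the plugin's types.
type Part =
  | { type: "step-start" }
  | { type: "step-finish"; reason: string }
  | { type: "text"; text: string };

interface Message {
  parts: Part[];
}

const STEP_FINISH_MAX_CHARS = 50;

// Drop step-start parts and shorten long step-finish reasons.
// A message's parts array is replaced only when something actually changed,
// so a second run over already-clean messages keeps every array reference.
function stripStepMarkers(messages: Message[]): void {
  for (const message of messages) {
    let changed = false;
    const next: Part[] = [];
    for (const part of message.parts) {
      if (part.type === "step-start") {
        changed = true; // removed entirely
        continue;
      }
      if (part.type === "step-finish") {
        const truncated = part.reason.slice(0, STEP_FINISH_MAX_CHARS);
        if (truncated !== part.reason) {
          changed = true;
          next.push({ ...part, reason: truncated });
          continue;
        }
      }
      next.push(part); // short reasons and other parts pass through untouched
    }
    if (changed) message.parts = next;
  }
}
```

Keeping references stable on re-runs matters when downstream code compares messages by identity, which is presumably what the "prefix cache" note in the commit refers to.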
4016628 to 103afef
- Remove mark_block and unmark_block tool registrations (index.ts)
- Remove mark_block description from system prompt (system.ts)
- Keep gc/merge.ts + gc/truncate.ts as dormant safety nets
- GC at 100% context remains as ultimate fallback
- Model now only sees compress + decompress tools
…llback

- delete lib/compress/mark-block.ts + its export (tools already unregistered)
- remove mark_block/unmark_block from DEFAULT_PROTECTED_TOOLS
- gc/merge: remove buildNudgeText/collectActiveMarkedBlocks/multi-tier logic; runBatchCleanup now only force-merges old-gen blocks at 100% (hardcoded, not read from config). Fixes broken nudge that referenced removed tools.
- hooks: drop dead tier-1 nudge branch + appendBatchCleanupNudge helper
- tests: replace mark-tier runBatchCleanup tests with 100% fallback coverage (mergeMarkedBlocks primitive tests retained)

Full GC config/state cleanup deferred to a follow-up. typecheck clean, 483 tests pass.
Replace the aggressive 200-char hard reject (which forced expensive full retries and pushed the model to drop detail) with a two-tier scheme:

- maxSummaryLength (default 200, unchanged): now a SOFT target interpolated into the compress-message/compress-range tool descriptions, guiding the model to write concise summaries upfront.
- maxSummaryLengthHard (default 800): the new hard ceiling. Only summaries beyond this are rejected, so reasonable 220-700 char summaries pass one-shot. Compression becomes near-retry-free.

Also validates maxSummaryLengthHard >= maxSummaryLength. typecheck clean, 486 tests pass.
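A rough sketch of the two-tier scheme, under stated assumptions: the field names `maxSummaryLength` and `maxSummaryLengthHard` and their defaults come from this PR, while the `CompressConfig` interface, the function names, the description wording, and the error messages are all hypothetical.

```typescript
// Hypothetical config shape; only the two field names and defaults are from the PR.
interface CompressConfig {
  maxSummaryLength: number; // soft target, only shown to the model
  maxSummaryLengthHard: number; // hard ceiling, enforced at execution
}

const DEFAULT_COMPRESS_CONFIG: CompressConfig = {
  maxSummaryLength: 200,
  maxSummaryLengthHard: 800,
};

// Config validation: the hard ceiling must not be below the soft target.
function validateCompressConfig(config: CompressConfig): void {
  if (config.maxSummaryLengthHard < config.maxSummaryLength) {
    throw new Error("maxSummaryLengthHard must be >= maxSummaryLength");
  }
}

// Soft tier: the target is interpolated into the tool description text.
// The wording here is invented for illustration.
function describeSummaryField(config: CompressConfig): string {
  return `Summary of the compressed content; aim for about ${config.maxSummaryLength} characters.`;
}

type SummaryCheck = { ok: true } | { ok: false; error: string };

// Hard tier: reject only summaries that exceed the ceiling.
function checkSummaryLength(summary: string, config: CompressConfig): SummaryCheck {
  if (summary.length > config.maxSummaryLengthHard) {
    return {
      ok: false,
      error: `summary is ${summary.length} chars, hard limit is ${config.maxSummaryLengthHard}`,
    };
  }
  return { ok: true };
}
```

With the defaults, a 500-character summary exceeds the soft target but still passes, so the model is guided toward brevity without being forced into a retry.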
Summary
Systematic context optimization to reduce token waste in long sessions, based on a root-cause analysis of a session that grew to 47% context usage. Eight commits spanning cache-neutral fixes, tool cleanup, bug fixes, and documentation.
Changes
1. Summary Length Limits (two-tier)
- maxSummaryLength (default 200): Soft target shown in schema description; guides model to write concise summaries
- maxSummaryLengthHard (default 800): Hard ceiling enforced at execution; allows exceptions for file paths, decisions, signatures, exact values
- Validation: hard >= soft

2. Minimum Compress Range
- minCompressRange (default 2000): Reject compress if range too small; prevents overhead exceeding savings

3. Step Marker Truncation
- Skip step-start entirely, truncate step-finish to ~50 chars (was avg 155)

4. Nudge Strengthening

5. mark_block + unmark_block Removal

6. Auto-Detect Consumed Blocks (Plan B)
- requiredBlockIds + boundary detection, deduped (bN) placeholders; system handles it

7. Directive Nudge Range Fix
- startId/endId (could be outside visible range)
- anchorMessageId (where block summary appears in visible messages)

8. GC Simplification

9. README Updates

Config Options
    {
      "compress": {
        "maxSummaryLength": 200,      // soft target (chars)
        "maxSummaryLengthHard": 800,  // hard ceiling (chars)
        "minCompressRange": 2000      // min chars to allow compress
      }
    }

Not Implemented
- (bN) placeholder accuracy

Commits
- feat: context optimization — summary limits, step truncation, min range, nudge tuning
- review fixes: maxSummaryLength 200, step-finish idempotent, prune tests, dedup
- feat: remove mark_block + unmark_block from model tools
- feat: auto-detect consumed blocks in compress + fix directive nudge ranges
- refactor: retire mark_block mechanism, reduce GC to hardcoded 100% fallback
- docs: update README — remove mark_block, simplify GC to 100% fallback
- docs: sync Chinese README — remove mark_block, simplify GC to 100% fallback
- feat(compress): soft summary target + generous hard ceiling

Verification
- npm run typecheck: clean ✅
- npm run test: 495 pass, 0 fail ✅
- devlog/2026-06-29_context-optimization/{REQ,WORKLOG}.md