feat: context optimization — summary limits, step truncation, min range, nudge tuning#36

Merged
ranxianglei merged 8 commits into master from ranxianglei/2026-06-29_context-optimization
Jun 29, 2026

ranxianglei commented Jun 28, 2026

Summary

Systematic context optimization to reduce token waste in long sessions, based on a root-cause analysis of a session that grew to 47% context usage. Eight commits cover cache-neutral fixes, tool cleanup, bug fixes, and documentation.

Changes

1. Summary Length Limits (two-tier)

  • maxSummaryLength (default 200): Soft target shown in schema description — guides model to write concise summaries
  • maxSummaryLengthHard (default 800): Hard ceiling enforced at execution — allows exceptions for file paths, decisions, signatures, exact values
  • Config validation: type check + positivity + hard >= soft
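The three validation rules above (type check, positivity, hard >= soft) can be sketched roughly as follows. Only the two option names and their defaults come from this PR; the `validateCompressConfig` function, the `SUMMARY_DEFAULTS` constant, and the error strings are hypothetical and are not the plugin's actual code.

```typescript
// Hypothetical sketch: only the option names and defaults are taken from the PR.
const SUMMARY_DEFAULTS = {
  maxSummaryLength: 200, // soft target (chars)
  maxSummaryLengthHard: 800, // hard ceiling (chars)
};

// Returns a list of problems; an empty list means the config is acceptable.
function validateCompressConfig(raw: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const merged: Record<string, unknown> = { ...SUMMARY_DEFAULTS, ...raw };

  for (const key of ["maxSummaryLength", "maxSummaryLengthHard"]) {
    const value = merged[key];
    if (typeof value !== "number" || !Number.isFinite(value)) {
      errors.push(`${key} must be a number`); // type check
    } else if (value <= 0) {
      errors.push(`${key} must be positive`); // positivity
    }
  }

  const soft = merged["maxSummaryLength"];
  const hard = merged["maxSummaryLengthHard"];
  if (typeof soft === "number" && typeof hard === "number" && hard < soft) {
    errors.push("maxSummaryLengthHard must be >= maxSummaryLength"); // hard >= soft
  }
  return errors;
}
```

Collecting every problem instead of throwing on the first one is a design choice of this sketch, so a user can fix a broken config in one pass.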

2. Minimum Compress Range

  • minCompressRange (default 2000): Rejects a compress call whose range is below this many chars, so compression overhead cannot exceed the savings
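A minimal sketch of the range check, assuming the range size is measured in characters over a deduplicated set of message IDs (the review-fix commit mentions this dedup). The `RangeMessage` shape and both function names are hypothetical.

```typescript
// Hypothetical message shape: only an ID and its text are needed here.
interface RangeMessage {
  id: string;
  text: string;
}

// Sum characters over the range, counting each message ID only once.
function countRangeChars(messages: RangeMessage[]): number {
  const seen = new Set<string>();
  let total = 0;
  for (const message of messages) {
    if (seen.has(message.id)) continue; // skip duplicates
    seen.add(message.id);
    total += message.text.length;
  }
  return total;
}

// Throws when the range is too small to be worth compressing.
function assertRangeLargeEnough(messages: RangeMessage[], minCompressRange = 2000): void {
  const chars = countRangeChars(messages);
  if (chars < minCompressRange) {
    throw new Error(`compress range too small: ${chars} chars < ${minCompressRange}`);
  }
}
```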

3. Step Marker Truncation

  • Skip step-start entirely, truncate step-finish to ~50 chars (was avg 155)
  • Saves ~90K tokens in long sessions
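The stripping step could look roughly like the sketch below. The part type names `step-start` and `step-finish`, the 50-char limit, and the `truncated !== reason` guard come from this PR; the `Part` and `PrunableMessage` shapes are assumptions, and this `stripStepMarkers` is an illustration rather than the function shipped in the prune module.

```typescript
// Hypothetical shapes: real message parts carry more fields than this.
type Part = { type: string; reason?: string };
interface PrunableMessage {
  parts: Part[];
}

const STEP_FINISH_MAX_CHARS = 50;

// Drops step-start parts and truncates long step-finish reasons.
// Returns true only when something changed, and leaves the parts array
// reference untouched otherwise so repeated runs keep the prefix cache stable.
function stripStepMarkers(message: PrunableMessage): boolean {
  let changed = false;
  const next: Part[] = [];
  for (const part of message.parts) {
    if (part.type === "step-start") {
      changed = true; // skipped entirely
      continue;
    }
    if (part.type === "step-finish" && typeof part.reason === "string") {
      const truncated = part.reason.slice(0, STEP_FINISH_MAX_CHARS);
      if (truncated !== part.reason) {
        next.push({ ...part, reason: truncated });
        changed = true;
        continue;
      }
    }
    next.push(part);
  }
  if (changed) message.parts = next;
  return changed;
}
```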

4. Nudge Strengthening

  • Explicitly targets tool outputs >5000 chars for compression
  • Directive nudge: when >50 blocks, suggests specific consolidation ranges with token estimates
  • Guidance text shortened to 1-3 sentences per pressure level

5. mark_block + unmark_block Removal

  • Removed from model tools — description misled model ("use mark_block instead of compress")
  • Removed from system prompt and protected tools list
  • Batch cleanup Tiers 1-2 code retained but dormant (no new marks can be created)

6. Auto-Detect Consumed Blocks (Plan B)

  • compress handler now auto-detects ALL blocks within the compress range (not just boundary blocks)
  • Uses requiredBlockIds + boundary detection, deduped
  • Model no longer needs to manually include (bN) placeholders — system handles it
  • Placeholder validation downgraded from error to warning
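As an illustration of the behaviour described above, the sketch below merges two lists of block IDs with dedup and reports missing (bN) placeholders as warnings instead of errors. How the handler actually derives `requiredBlockIds` and the boundary blocks is not shown in this PR, so both inputs are treated as given; the function names and warning text are hypothetical.

```typescript
// Merge IDs from both detection passes, keeping first-seen order and dropping duplicates.
function mergeConsumedBlockIds(requiredBlockIds: number[], boundaryBlockIds: number[]): number[] {
  return Array.from(new Set([...requiredBlockIds, ...boundaryBlockIds]));
}

// Placeholder check as a warning: a missing (bN) no longer blocks the compress call.
function placeholderWarnings(summary: string, consumedBlockIds: number[]): string[] {
  return consumedBlockIds
    .filter((id) => !summary.includes(`(b${id})`))
    .map((id) => `warning: summary does not reference (b${id})`);
}
```

Returning warnings as data leaves it to the caller to decide whether to log them or surface them to the model.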

7. Directive Nudge Range Fix

  • Fixed bug: ranges used stale startId/endId (could be outside visible range)
  • Fixed bug: ranges could be backwards (end < start)
  • Now uses anchorMessageId (where block summary appears in visible messages)
  • Only suggests ranges within visible message window
  • No suggestion when no blocks are visible
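One way to get ranges that are always ordered and always inside the visible window is to resolve each block through its anchorMessageId against the visible message list, as in this sketch. The `BlockRef` and `SuggestedRange` shapes and the function name are hypothetical, and the token estimates mentioned earlier are left out.

```typescript
interface BlockRef {
  id: number;
  anchorMessageId: string; // where the block summary appears in visible messages
}

interface SuggestedRange {
  startId: string;
  endId: string;
}

// Suggest a range spanning the first to the last visible block anchor.
// Blocks whose anchor is not visible are ignored; no visible block means no suggestion.
function suggestVisibleRange(visibleMessageIds: string[], blocks: BlockRef[]): SuggestedRange | null {
  const position = new Map<string, number>();
  visibleMessageIds.forEach((id, index) => position.set(id, index));

  let first: { index: number; id: string } | null = null;
  let last: { index: number; id: string } | null = null;
  for (const block of blocks) {
    const index = position.get(block.anchorMessageId);
    if (index === undefined) continue; // anchor outside the visible window
    const hit = { index, id: block.anchorMessageId };
    if (first === null || index < first.index) first = hit;
    if (last === null || index > last.index) last = hit;
  }
  if (first === null || last === null) return null;
  return { startId: first.id, endId: last.id }; // first.index <= last.index by construction
}
```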

8. GC Simplification

  • Retired batchCleanup three-tier system (Tiers 1-3)
  • Reduced to hardcoded 100% GC as ultimate fallback
  • Simpler, more predictable, fewer moving parts
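With the tiers gone, the trigger reduces to a single comparison against a constant. The sketch below assumes context usage is tracked as used tokens against a context limit; the names `GC_FALLBACK_THRESHOLD` and `shouldRunFallbackGc` are hypothetical.

```typescript
// Hardcoded on purpose: per the PR, the threshold is no longer read from config.
const GC_FALLBACK_THRESHOLD = 1.0; // 100% of the context window

// True only when the session has filled (or exceeded) the whole context window.
function shouldRunFallbackGc(usedTokens: number, contextLimit: number): boolean {
  if (contextLimit <= 0) return false; // unknown limit: never trigger
  return usedTokens / contextLimit >= GC_FALLBACK_THRESHOLD;
}
```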

9. README Updates

  • English + Chinese README synced
  • Updated: tool list, lifecycle diagram, GC config, protected tools
  • Removed all mark_block/unmark_block references

Config Options

{
  "compress": {
    "maxSummaryLength": 200,       // soft target (chars)
    "maxSummaryLengthHard": 800,   // hard ceiling (chars)
    "minCompressRange": 2000       // min chars to allow compress
  }
}

Not Implemented

  • Exclude old reasoning: Cancelled — causes recurring cache breaks (prefix changes every turn as reasoning crosses age threshold)
  • Compress input cleanup: Not feasible with current OpenCode plugin API (no way to modify stored tool parts post-execution)
  • Block ID list → count only: Kept as-is — compress tool needs block IDs for (bN) placeholder accuracy

Commits

  1. feat: context optimization — summary limits, step truncation, min range, nudge tuning
  2. review fixes: maxSummaryLength 200, step-finish idempotent, prune tests, dedup
  3. feat: remove mark_block + unmark_block from model tools
  4. feat: auto-detect consumed blocks in compress + fix directive nudge ranges
  5. refactor: retire mark_block mechanism, reduce GC to hardcoded 100% fallback
  6. docs: update README — remove mark_block, simplify GC to 100% fallback
  7. docs: sync Chinese README — remove mark_block, simplify GC to 100% fallback
  8. feat(compress): soft summary target + generous hard ceiling

Verification

  • npm run typecheck: clean ✅
  • npm run test: 495 pass, 0 fail ✅
  • Devlog: devlog/2026-06-29_context-optimization/{REQ,WORKLOG}.md

…ge, nudge tuning

- maxSummaryLength config (default 100): reject compress if summary exceeds limit
- minCompressRange config (default 2000): reject compress if range too small
- stripStepMarkers in prune: skip step-start, truncate step-finish to 50 chars
- Nudge: target large tool outputs (>5000 chars) explicitly
- Shorter pressure level descriptions and per-message guidance
- Block ID list unchanged (accuracy requirement)

487 tests pass, typecheck clean
ranxianglei force-pushed the ranxianglei/2026-06-29_context-optimization branch from 4b5e227 to 8aa9480 on June 29, 2026 03:11
…ts, dedup

- config: raise maxSummaryLength default 100 -> 200 (less aggressive)
- prune: guard step-finish truncation with truncated !== reason so the
  parts array reference stays stable on idempotent re-runs (prefix cache)
- compress message/range: dedup messageId set in minCompressRange char
  counting (message.ts now matches range.ts); document that the throw is
  intentionally placed after prepareSession with no persisted state
- tests: add stripStepMarkers regression coverage (removal, truncation,
  short-reason preserve, idempotency, no-op on clean messages)

typecheck clean, 492 tests pass
ranxianglei force-pushed the ranxianglei/2026-06-29_context-optimization branch from 4016628 to 103afef on June 29, 2026 04:48
- Remove mark_block and unmark_block tool registrations (index.ts)
- Remove mark_block description from system prompt (system.ts)
- Keep gc/merge.ts + gc/truncate.ts as dormant safety nets
- GC at 100% context remains as ultimate fallback
- Model now only sees compress + decompress tools
…llback

- delete lib/compress/mark-block.ts + its export (tools already unregistered)
- remove mark_block/unmark_block from DEFAULT_PROTECTED_TOOLS
- gc/merge: remove buildNudgeText/collectActiveMarkedBlocks/multi-tier logic;
  runBatchCleanup now only force-merges old-gen blocks at 100% (hardcoded,
  not read from config). Fixes broken nudge that referenced removed tools.
- hooks: drop dead tier-1 nudge branch + appendBatchCleanupNudge helper
- tests: replace mark-tier runBatchCleanup tests with 100% fallback coverage
  (mergeMarkedBlocks primitive tests retained)

Full GC config/state cleanup deferred to a follow-up. typecheck clean,
483 tests pass.
Replace the aggressive 200-char hard reject (which forced expensive full
retries and pushed the model to drop detail) with a two-tier scheme:

- maxSummaryLength (default 200, unchanged): now a SOFT target interpolated
  into the compress-message/compress-range tool descriptions, guiding the
  model to write concise summaries upfront.
- maxSummaryLengthHard (default 800): the new hard ceiling. Only summaries
  beyond this are rejected, so reasonable 220-700 char summaries pass
  one-shot. Compression becomes near-retry-free.

Also validates maxSummaryLengthHard >= maxSummaryLength. typecheck clean,
486 tests pass.
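The two-tier behaviour in this commit message can be sketched as below: the soft target is only interpolated into the tool description, and only the hard ceiling rejects. The function names, the description wording, and the result shape are hypothetical; the defaults of 200 and 800 are the ones stated above.

```typescript
type SummaryCheck =
  | { ok: true; overSoftTarget: boolean }
  | { ok: false; error: string };

// Soft tier: guidance text for the tool description, never enforced.
function describeSummaryParam(maxSummaryLength = 200): string {
  return `Summary of the compressed range. Aim for about ${maxSummaryLength} chars.`;
}

// Hard tier: enforced at execution, rejects only summaries beyond the ceiling.
function checkSummaryLength(
  summary: string,
  maxSummaryLength = 200,
  maxSummaryLengthHard = 800,
): SummaryCheck {
  if (summary.length > maxSummaryLengthHard) {
    return {
      ok: false,
      error: `summary is ${summary.length} chars; hard limit is ${maxSummaryLengthHard}`,
    };
  }
  return { ok: true, overSoftTarget: summary.length > maxSummaryLength };
}
```

Under these assumptions a 500-char summary passes in one shot and is only flagged as over the soft target, which matches the near-retry-free goal described in the commit.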
@ranxianglei ranxianglei merged commit c734daf into master Jun 29, 2026
3 checks passed