Merged
26 changes: 9 additions & 17 deletions README.md
@@ -82,18 +82,20 @@ Or add to your opencode config:

ACP hands the context-compression tool directly to the model. The model is
**100% responsible** for context compression. The model's available tools are
mainly: **compress**, **decompress**, and **delete** (`mark_block` / `unmark_block`).
mainly: **compress** and **decompress**. A hardcoded 100% GC fallback acts as
a safety net when the context window is completely full.

### Lifecycle

Three operations: **compress**, **decompress**, and **delete**. Content loops
between raw and compressed, and eventually terminates in deletion:
Two operations: **compress** and **decompress**. Content loops between raw and
compressed. When context hits 100%, old-gen block summaries are truncated as
a last resort:

```mermaid
stateDiagram-v2
Raw --> Compressed : compress
Compressed --> Raw : decompress
Compressed --> Deleted : delete
Compressed --> Truncated : GC at 100%
```
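
The post-change lifecycle (compress, decompress, and the GC fallback at 100%) can be read as a small state machine. The sketch below is illustrative only: `BlockState`, `LifecycleEvent`, and `transition` are hypothetical names, not identifiers from the plugin.

```typescript
// Hypothetical names for illustration; not taken from the ACP source.
type BlockState = "Raw" | "Compressed" | "Truncated"
type LifecycleEvent = "compress" | "decompress" | "gc"

// Returns the next state, or the unchanged state when the event does not apply.
function transition(state: BlockState, event: LifecycleEvent): BlockState {
    if (state === "Raw" && event === "compress") return "Compressed"
    if (state === "Compressed" && event === "decompress") return "Raw"
    // GC only fires as a last resort, when context usage reaches 100%.
    if (state === "Compressed" && event === "gc") return "Truncated"
    return state
}
```

In this reading, `Truncated` has no outgoing transitions, which matches the diagram: truncation by GC is a terminal fallback rather than part of the normal compress/decompress loop.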

### Compression strategy
@@ -305,7 +307,7 @@ Each level overrides the previous, so project settings take priority over global
"protectedTools": [],
},
},
// Garbage collection and batch cleanup
// Garbage collection — hardcoded 100% fallback only
"gc": {
"algorithm": "truncate",
// young → old generation promotion after this many survivals
@@ -314,18 +316,8 @@ Each level overrides the previous, so project settings take priority over global
"maxBlockAge": 15,
// truncate old-gen summaries exceeding this length (chars)
"maxOldGenSummaryLength": 3000,
// run major GC when context usage exceeds this
// run major GC when context usage exceeds this (hardcoded, not configurable)
"majorGcThresholdPercent": "100%",
// Three-tier batch merge-cleanup for blocks flagged via mark_block.
// Accepts a number or "X%" of the model context window.
"batchCleanup": {
// At/above this usage, remind the model about marked blocks
"lowThreshold": "60%",
// At/above this usage, auto merge-compress all marked blocks into one
"highThreshold": "75%",
// At/above this usage, force-merge all old-gen blocks (before GC)
"forceThreshold": "90%",
},
},
}
```
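
As a rough illustration of what the `truncate` fallback described by these settings might do, the sketch below cuts old-generation summaries down to `maxOldGenSummaryLength` once usage reaches 100%. The `Block` shape and the `truncateOldGenSummaries` function are assumptions made for this example; the plugin's real GC code is not shown in this diff.

```typescript
// Hypothetical block shape; the real ACP block type is not shown in the diff.
interface Block {
    generation: "young" | "old"
    summary: string
}

// Sketch of a last-resort GC pass: does nothing below 100% usage, otherwise
// truncates old-gen summaries longer than maxOldGenSummaryLength (chars).
function truncateOldGenSummaries(
    blocks: Block[],
    usagePercent: number,
    maxOldGenSummaryLength: number = 3000
): Block[] {
    if (usagePercent < 100) return blocks
    return blocks.map((block) => {
        const tooLong = block.summary.length > maxOldGenSummaryLength
        if (block.generation !== "old" || !tooLong) return block
        return { ...block, summary: block.summary.slice(0, maxOldGenSummaryLength) }
    })
}
```

Young-generation blocks are left untouched in this sketch, on the assumption that only old-gen summaries are subject to truncation, as the config comments suggest.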
@@ -354,7 +346,7 @@ To reset an override, delete the matching file from your overrides directory.
### Protected Tools

By default, these tools are always protected from pruning:
`task`, `skill`, `todowrite`, `todoread`, `compress`, `decompress`, `mark_block`, `unmark_block`, `batch`, `plan_enter`, `plan_exit`, `write`, `edit`
`task`, `skill`, `todowrite`, `todoread`, `compress`, `decompress`, `batch`, `plan_enter`, `plan_exit`, `write`, `edit`

The `protectedTools` arrays in `commands` and `strategies` add to this default list.
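
A minimal sketch of the "add to the default list" behavior, assuming a simple merge with de-duplication. `DEFAULT_PROTECTED_TOOLS` and `effectiveProtectedTools` are hypothetical names; only the tool names themselves come from the list above.

```typescript
// Default list after this change (mark_block / unmark_block removed).
const DEFAULT_PROTECTED_TOOLS: string[] = [
    "task", "skill", "todowrite", "todoread", "compress", "decompress",
    "batch", "plan_enter", "plan_exit", "write", "edit",
]

// Hypothetical helper: user-supplied protectedTools extend the defaults,
// they never replace them. Duplicates are dropped, order is preserved.
function effectiveProtectedTools(extra: string[]): string[] {
    const merged = DEFAULT_PROTECTED_TOOLS.concat(extra)
    return merged.filter((tool, index) => merged.indexOf(tool) === index)
}
```

For example, passing `["bash", "write"]` would add `bash` and leave the already-protected `write` listed once.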

24 changes: 7 additions & 17 deletions README.zh-CN.md
@@ -73,17 +73,17 @@ opencode plugin opencode-acp@latest --global

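## How it works

ACP hands the context-compression tool directly to the model. The model is **fully responsible** for context compression. The model's available tools are mainly: **compress**, **decompress**, and **delete** (`mark_block` / `unmark_block`).
ACP hands the context-compression tool directly to the model. The model is **fully responsible** for context compression. The model's available tools are mainly: **compress** and **decompress**. When context reaches 100%, the system automatically triggers GC truncation as a fallback.

### Lifecycle

Three operations: **compress**, **decompress**, **delete**. Content loops between raw and compressed, and eventually terminates in deletion.
Two operations: **compress**, **decompress**. Content loops between raw and compressed. When context reaches 100%, GC automatically truncates old-gen blocks as a fallback.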

```mermaid
stateDiagram-v2
Raw --> Compressed : compress
Compressed --> Raw : decompress
Compressed --> Deleted : delete
Compressed --> GC_Truncated : GC (100%)
```

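### Compression strategy
@@ -104,9 +104,9 @@ stateDiagram-v2

The model decides when to decompress. When the context grows large enough to interfere with the model's self-attention, short blocks let the model compress part of the content first, handle the urgent work, and then decompress on demand later.

### Deletion strategy
### GC fallback

To deal with the buildup of many small blocks of historical content, the new version adds a deletion strategy. The model decides whether to delete. **Once deleted, content cannot be recovered.** This replaces the previous forced GC, so that forced garbage collection no longer deletes content the model considers important.
When context reaches 100%, the system automatically truncates old-gen block summaries to prevent context overflow. This is a last-resort fallback and does not affect the model's normal compress/decompress operations.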

---

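@@ -289,18 +289,8 @@ ACP uses its own config file, searched in the following order:
"maxBlockAge": 15,
// truncate old-gen summaries exceeding this length (chars)
"maxOldGenSummaryLength": 3000,
// run major GC when context usage exceeds this
// run major GC when context usage exceeds this (fallback, hardcoded to 100%)
"majorGcThresholdPercent": "100%",
// Three-tier batch merge-cleanup thresholds for blocks flagged via mark_block.
// Accepts a number or "X%" (percentage of the model context window).
"batchCleanup": {
// At this usage, remind the model about marked blocks
"lowThreshold": "60%",
// At this usage, auto merge-compress all marked blocks into one
"highThreshold": "75%",
// At this usage, force-merge all old-gen blocks (before GC)
"forceThreshold": "90%",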
},
},
}
```
@@ -329,7 +319,7 @@ ACP exposes six editable prompts:
### Protected Tools

By default, these tools are always protected from pruning:
`task`, `skill`, `todowrite`, `todoread`, `compress`, `decompress`, `mark_block`, `unmark_block`, `batch`, `plan_enter`, `plan_exit`, `write`, `edit`
`task`, `skill`, `todowrite`, `todoread`, `compress`, `decompress`, `batch`, `plan_enter`, `plan_exit`, `write`, `edit`

The `protectedTools` arrays in `commands` and `strategies` add to this default list.

31 changes: 31 additions & 0 deletions devlog/2026-06-29_context-optimization/REQ.md
@@ -0,0 +1,31 @@
# Context Optimization — Reduce Token Waste

## Problem

Session ses_102504697ffeYg89Sn0k8aknYg grew to 47% context usage. Root cause analysis revealed systematic token waste:

1. **Compress summaries too verbose**: avg 579 chars (~145 tokens), some up to 2011 chars. Include unnecessary metrics, reviewer quotes, experimental parameters.
2. **Compress tool calls are pure overhead**: 344 calls × 813 chars avg = 280K chars. Each stores full summary in input — duplicated with block summary.
3. **Step markers waste space**: 4698 step-start/step-finish parts × ~88 chars avg = 413K chars (~103K tokens). Only mark boundaries, no useful content.
4. **Large tool outputs not compressed**: Model keeps 20-50K char outputs "just in case".
5. **No minimum compress range**: Model compresses tiny ranges (<2K chars) where overhead exceeds savings.
6. **ACP guidance too verbose**: Multi-paragraph nudge text wastes ~200 tokens/turn.
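
The totals quoted in items 1 to 3 can be reproduced with a few lines of arithmetic. The 4 chars/token ratio is an assumption inferred from the figures above (579 chars given as ~145 tokens); it is a rough heuristic, not a tokenizer measurement.

```typescript
// Assumed heuristic, inferred from "579 chars (~145 tokens)".
const CHARS_PER_TOKEN = 4

// Item 2: 344 compress calls at 813 chars each on average.
const compressCallChars = 344 * 813 // 279,672, quoted as ~280K chars

// Item 3: 4698 step-start/step-finish parts at ~88 chars each.
const stepMarkerChars = 4698 * 88 // 413,424, quoted as ~413K chars
const stepMarkerTokens = Math.round(stepMarkerChars / CHARS_PER_TOKEN) // 103,356, ~103K tokens

// Item 1: average summary length expressed in tokens.
const avgSummaryTokens = Math.round(579 / CHARS_PER_TOKEN) // 145
```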

## Requirements

1. **R1**: Limit compress summary length to configurable max (default 100 chars). Reject if exceeded.
2. **R2**: ~~Truncate compress tool input after execution~~ — NOT FEASIBLE (no API to modify stored parts).
3. **R3**: Strengthen nudge to target large tool outputs (>5K chars) explicitly.
4. **R5**: Truncate step markers in context construction (skip step-start, truncate step-finish to 50 chars).
5. **R6**: Shorten ACP guidance text (pressure levels + per-message guidance).
6. **R7**: Enforce minimum compress range (default 2000 chars). Reject if below.

## Cache Safety

All fixes are either cache-neutral (only affect future operations) or one-time breaks that stabilize after deployment. No recurring cache breaks.

## Non-Goals

- Excluding old reasoning from context (causes recurring cache breaks — cancelled).
- Modifying block ID list (accuracy risk — kept as-is).
- compress tool input cleanup (not feasible with current API).
47 changes: 47 additions & 0 deletions devlog/2026-06-29_context-optimization/WORKLOG.md
@@ -0,0 +1,47 @@
# Worklog — Context Optimization

## Changes (8 files, +186/-8 lines)

### Fix 1: Summary length limit (R1)
- **config.ts**: Added `maxSummaryLength` (default 100) to CompressConfig
- **config-validation.ts**: Type + key validation
- **compress/message.ts, compress/range.ts**: Check `summary.length > maxSummaryLength` → throw error before creating block
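
The check described above might look roughly like the following. This is a sketch under assumptions: `SummaryLimitConfig` and `assertSummaryLength` are hypothetical names, and the error message wording is invented; only `maxSummaryLength` and its default of 100 come from the worklog.

```typescript
// Hypothetical slice of CompressConfig; only maxSummaryLength is from the worklog.
interface SummaryLimitConfig {
    maxSummaryLength: number
}

const DEFAULT_SUMMARY_LIMIT: SummaryLimitConfig = { maxSummaryLength: 100 }

// Rejects an over-long summary before any block is created.
function assertSummaryLength(
    summary: string,
    config: SummaryLimitConfig = DEFAULT_SUMMARY_LIMIT
): void {
    if (summary.length > config.maxSummaryLength) {
        throw new Error(
            `Summary is ${summary.length} chars; the limit is ${config.maxSummaryLength}.`
        )
    }
}
```

Throwing before the block exists means a rejected call leaves the stored context untouched, so the model can simply retry with a shorter summary.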

### Fix 2: Compress tool cleanup (R2) — NOT FEASIBLE
- ToolContext API only allows modifying output/title/metadata, NOT input args
- Added TODO comments in both handlers noting `experimental.chat.messages.transform` as alternative
- Documented for future investigation

### Fix 3: Nudge strengthening (R3)
- **inject/utils.ts**: Guidance text now explicitly mentions ">5000 characters" tool outputs
- Changed from generic "compress tool outputs" to targeted "if any tool output >5000 chars and you've finished reading, compress it into a summary NOW"

### Fix 5: Step marker truncation (R5)
- **prune.ts**: New `stripStepMarkers()` function
- Skips `step-start` parts entirely (zero-value boundary markers)
- Truncates `step-finish` reason to 50 chars (was avg 155 chars)
- Called from `prune()` before context injection
- Estimated savings: ~90K tokens per session with heavy reasoning
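
A minimal sketch of the behavior listed for `stripStepMarkers()`, assuming a simplified part shape. The `Part` interface here is hypothetical and much smaller than whatever OpenCode actually stores; the real function signature in `prune.ts` is not shown in this diff.

```typescript
// Simplified, hypothetical part shape for illustration.
interface Part {
    type: string
    reason?: string
    text?: string
}

// Drops step-start parts and truncates step-finish reasons to maxReasonLength chars.
function stripStepMarkers(parts: Part[], maxReasonLength: number = 50): Part[] {
    return parts
        .filter((part) => part.type !== "step-start")
        .map((part) => {
            if (part.type !== "step-finish" || part.reason === undefined) return part
            if (part.reason.length <= maxReasonLength) return part
            return { ...part, reason: part.reason.slice(0, maxReasonLength) }
        })
}
```

The sketch returns new objects instead of mutating the input, which keeps the stored parts intact and only changes what is injected into context.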

### Fix 6: ACP simplification (R6)
- **system.ts**: Pressure level descriptions shortened to 1 sentence each
- Normal: "Be frugal — compress tool outputs you've finished using into summaries."
- Elevated: "Context is growing — compress larger ranges you no longer need."
- Critical: "Compress aggressively now — target the largest visible ranges first."
- **inject/utils.ts**: Per-message guidance reduced from 5+ to 3 sentences
- Block ID list: UNCHANGED (accuracy requirement)

### Fix 7: Minimum compress range (R7)
- **config.ts**: Added `minCompressRange` (default 2000) to CompressConfig
- **config-validation.ts**: Type + key validation
- **compress/message.ts, compress/range.ts**: Calculate total message chars via `countMessageCharacters()` → throw error if < minCompressRange
- **token-utils.ts**: New `countMessageCharacters()` helper
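
One way the minimum-range check could be wired up is sketched below. The `Message` shape, the body of `countMessageCharacters()`, and the `assertMinCompressRange` wrapper are assumptions for illustration; only the helper's name and the 2000-char default come from the worklog.

```typescript
// Hypothetical message shape: a list of parts, some of which carry text.
interface Message {
    parts: { text?: string }[]
}

// Sums the text length of every part in every message (assumed behavior).
function countMessageCharacters(messages: Message[]): number {
    let total = 0
    for (const message of messages) {
        for (const part of message.parts) {
            total += part.text?.length ?? 0
        }
    }
    return total
}

// Rejects ranges too small for compression to pay for its own overhead.
function assertMinCompressRange(messages: Message[], minCompressRange: number = 2000): void {
    const chars = countMessageCharacters(messages)
    if (chars < minCompressRange) {
        throw new Error(`Range is ${chars} chars; minimum is ${minCompressRange}.`)
    }
}
```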

## Verification
- `npm run typecheck`: clean ✅
- `npm run test`: 487 pass, 0 fail ✅
- Block ID list: verified unchanged (empty git diff on nudge.ts)

## Not Implemented
- **Fix 4 (exclude old reasoning)**: Cancelled — causes recurring cache breaks as reasoning crosses age threshold every turn.
- **Fix 2 (compress input cleanup)**: Not feasible with current OpenCode plugin API. Needs `experimental.chat.messages.transform` hook investigation.
6 changes: 1 addition & 5 deletions index.ts
@@ -4,8 +4,6 @@ import {
createCompressMessageTool,
createCompressRangeTool,
createDecompressTool,
createMarkBlockTool,
createUnmarkBlockTool,
} from "./lib/compress"
import {
compressDisabledByOpencode,
@@ -91,8 +89,6 @@ const server: Plugin = (async (ctx) => {
? createCompressMessageTool(compressToolContext)
: createCompressRangeTool(compressToolContext),
decompress: createDecompressTool(compressToolContext),
mark_block: createMarkBlockTool(compressToolContext),
unmark_block: createUnmarkBlockTool(compressToolContext),
}),
},
config: async (opencodeConfig) => {
@@ -113,7 +109,7 @@ const server: Plugin = (async (ctx) => {

const toolsToAdd: string[] = []
if (config.compress.permission !== "deny" && !config.experimental.allowSubAgents) {
toolsToAdd.push("compress", "decompress", "mark_block", "unmark_block")
toolsToAdd.push("compress", "decompress")
}

if (toolsToAdd.length > 0) {
1 change: 0 additions & 1 deletion lib/compress/index.ts
@@ -2,4 +2,3 @@ export { ToolContext } from "./types"
export { createCompressMessageTool } from "./message"
export { createCompressRangeTool } from "./range"
export { createDecompressTool } from "./decompress"
export { createMarkBlockTool, createUnmarkBlockTool } from "./mark-block"
148 changes: 0 additions & 148 deletions lib/compress/mark-block.ts

This file was deleted.
