Active Context Pruning for OpenCode
The model decides when and what to compress — not a hard limit.
```
opencode plugin opencode-acp@latest --global
```
ACP hands all context-management authority to the model itself, rather than relying on an external model or a separate, complex mechanism. It is, to date, the best context-management implementation on the market.
This brings two concrete effects:
- It saves about two-thirds of tokens. A model with a 1,000,000-token context window effectively runs in the 200,000–300,000 token range.
- It supports ultra-long sessions without losing key content — 500M-token-level cumulative context, 100,000 messages per session.
Real engineering context, in practice.
Supports 500M-token-level cumulative context, with p95 context around 30% and an average prompt-cache hit ratio above 85%. (That average — not per-session — is explained in Impact on Prompt Caching, where it turns out to save far more tokens than traditional compression.)
| | Session 1 | Session 2 |
|---|---|---|
| Messages | 3,024 | 2,028 |
| Total tokens processed | 582 M | 463 M |
| Prompt-cache hit ratio | 86.2% | 89.0% |
| Context p50 (median) | 1.2 K (<1%) | 1.8 K (<1%) |
| Context p75 | 2.8 K | 3.5 K |
| Context p90 | 108 K (11%) | 58 K (6%) |
| Context p95 | 251 K (25%) | 335 K (34%) |
| Context p99 | 425 K (43%) | 442 K (44%) |
| Peak | 488 K (49%) | 769 K (77%) |
(Context percentages are of the 1M window.)
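As a rough check on what the hit ratios mean in token terms, the sketch below estimates how many tokens missed the prompt cache in each session. It assumes the hit ratio applies uniformly to the "total tokens processed" figure, which the table does not state explicitly; the function name is illustrative and not part of ACP.

```python
def uncached_tokens(total_tokens: float, hit_ratio: float) -> float:
    """Estimate tokens that were NOT served from the prompt cache."""
    return total_tokens * (1.0 - hit_ratio)


# (total tokens processed, prompt-cache hit ratio) taken from the table above.
sessions = {
    "Session 1": (582e6, 0.862),
    "Session 2": (463e6, 0.890),
}

# Assumption: the hit ratio is measured against total tokens processed.
estimates = {name: uncached_tokens(t, h) for name, (t, h) in sessions.items()}

for name, missed in estimates.items():
    print(f"{name}: ~{missed / 1e6:.1f} M tokens missed the cache")
```

Under that assumption, roughly 80 M of Session 1's 582 M tokens and roughly 51 M of Session 2's 463 M tokens were processed without a cache hit.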
```
opencode plugin opencode-acp@latest --global
```
Or add to your opencode config:
```json
{
    "plugin": {
        "opencode-acp": "latest"
    }
}
```
ACP hands the context-compression tool directly to the model. The model is 100% responsible for context compression. The model's available tools are mainly: compress, decompress, and delete (mark_block / unmark_block).
Three operations: compress, decompress, and delete. Content loops between raw and compressed, and eventually terminates in deletion:
```mermaid
stateDiagram-v2
    Raw --> Compressed : compress
    Compressed --> Raw : decompress
    Compressed --> Deleted : delete
```
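The same lifecycle can be written as a small transition table. This is a minimal sketch of the diagram only; the class and function names are hypothetical and do not reflect ACP's internal data structures.

```python
from enum import Enum


class BlockState(Enum):
    RAW = "raw"
    COMPRESSED = "compressed"
    DELETED = "deleted"


# Allowed transitions, copied from the state diagram.
TRANSITIONS = {
    (BlockState.RAW, "compress"): BlockState.COMPRESSED,
    (BlockState.COMPRESSED, "decompress"): BlockState.RAW,
    (BlockState.COMPRESSED, "delete"): BlockState.DELETED,
}


def apply(state: BlockState, operation: str) -> BlockState:
    """Return the next state, or raise if the diagram has no such edge."""
    try:
        return TRANSITIONS[(state, operation)]
    except KeyError:
        raise ValueError(
            f"cannot {operation} content in state {state.value}"
        ) from None
```

Note that Deleted has no outgoing edges, which matches the statement that deleted content cannot be recovered, and that raw content has to be compressed before it can be deleted.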
The system injects a prompt telling the model the current context ratio, the compression ratio, whether context is idle, and compression suggestions. When the trigger ratio is hit, content is compressed in priority order:
- Agent/subagent review & consultation results (largest block of uncompressed content)
- Verbose command output (build/test runs, git diff/log/status, directory listings)
- Exploration that led nowhere (failed approaches, dead-end searches)
- Redundant tool results (reading the same file repeatedly, repeated status checks)
- Intermediate steps of completed multi-step tasks
- Resolved discussion threads (once a decision is recorded)
- Large file contents already used
After compression, the original content is replaced by a short block that
references the original (recoverable via decompress).
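To make the priority order concrete, here is a deterministic sketch that picks candidates in the order listed above until a token target is covered. In ACP the model itself makes this choice from the injected prompt, so this is only an illustration; the category keys, the `Candidate` type, and `pick_for_compression` are all hypothetical names.

```python
from dataclasses import dataclass

# Categories in the priority order listed above (highest priority first).
# The key names are invented for this sketch.
PRIORITY = [
    "agent_results",
    "verbose_output",
    "dead_end_exploration",
    "redundant_tool_results",
    "completed_intermediate_steps",
    "resolved_discussion",
    "used_file_contents",
]


@dataclass
class Candidate:
    category: str
    tokens: int


def pick_for_compression(candidates, tokens_to_free):
    """Select candidates in priority order until enough tokens are covered."""
    ordered = sorted(candidates, key=lambda c: PRIORITY.index(c.category))
    chosen, covered = [], 0
    for candidate in ordered:
        if covered >= tokens_to_free:
            break
        chosen.append(candidate)
        covered += candidate.tokens
    return chosen


candidates = [
    Candidate("used_file_contents", 9_000),
    Candidate("verbose_output", 4_000),
    Candidate("agent_results", 6_000),
]
picked = pick_for_compression(candidates, tokens_to_free=8_000)
print([c.category for c in picked])
```

With these made-up numbers the agent results and the verbose command output are selected first, and the large file contents are left alone because the target is already covered.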
The model decides when to decompress. When the context grows large enough to interfere with the model's self-attention, the model can compress some content into short blocks, handle the urgent matter first, and then decompress what it needs in later work.
To handle the accumulation of many small historical blocks, the new version adds a deletion strategy. The model decides whether to delete. Once deleted, content is irrecoverable. This replaces the original forced GC, so that forced garbage collection no longer deletes things the model considers important.
Historically, ACP has fixed many of the low-cache-hit-rate problems caused by DCP. The overall cache hit rate is now ~87%.
Traditional compression only triggers at 80–90% context usage and, once it runs, rewrites the whole context, so 100% of it has to be cached again. ACP's effective hit rate is higher by comparison.
Additionally, ACP keeps total context around 30% most of the time, versus 50–80% for traditional compression, so total token savings are far higher.
Conclusion: ACP simultaneously raises the overall cache hit rate and ensures key context information is not lost.
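The caching argument can be illustrated with a toy model. The sketch assumes the provider's prompt cache reuses the longest unchanged prefix of the previous request, which is how prefix caches commonly behave; the message sizes are invented for illustration and are not measurements from ACP.

```python
def cached_prefix_tokens(previous, current):
    """Tokens in the longest shared prefix of two requests.

    Each request is a list of (message_text, token_count) pairs.
    """
    total = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        total += new[1]
    return total


def hit_ratio(previous, current):
    """Share of the current request that the shared prefix covers."""
    total = sum(tokens for _, tokens in current)
    return cached_prefix_tokens(previous, current) / total if total else 0.0


history = [("system", 2_000), ("task A", 40_000),
           ("task B", 30_000), ("task C", 20_000)]

# Whole-history compaction: everything after the system prompt is rewritten.
full_compaction = [("system", 2_000), ("summary of A, B, C", 5_000)]

# Targeted compression: only the latest finished task becomes a short block.
targeted = [("system", 2_000), ("task A", 40_000),
            ("task B", 30_000), ("block: task C", 1_000)]

print(f"full compaction: {hit_ratio(history, full_compaction):.0%} cached")
print(f"targeted:        {hit_ratio(history, targeted):.0%} cached")
```

In this toy case the rewritten history keeps only the system prompt in cache, while the targeted edit keeps everything before the replaced span. How much prefix survives in practice depends on where in the history the compressed span sits.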
ACP provides an /acp slash command (also accepts /dcp for backward compatibility):
| Command | Description |
|---|---|
| `/acp` | Shows available ACP commands |
| `/acp context` | Token usage breakdown by category (system, user, assistant, tools, etc.) and how much has been saved through pruning |
| `/acp stats` | Cumulative pruning statistics across all sessions |
| `/acp sweep [n]` | Prunes all tools since the last user message. Optional count: `/acp sweep 10` prunes the last 10 tools. Respects `commands.protectedTools` |
| `/acp manual [on\|off]` | Toggle manual mode. When on, the AI will not autonomously use context management tools |
| `/acp compress [focus]` | Trigger a single compress tool execution. Optional focus text directs what content to compress, following the active `compress.mode` |
| `/acp decompress <n>` | Restore a specific active compression by ID. Running without an argument shows available compression IDs, token sizes, and topics |
| `/acp recompress <n>` | Re-apply a user-decompressed compression by ID. Running without an argument shows recompressible IDs, token sizes, and topics |
ACP uses its own config file, searched in order:
- Global: `~/.config/opencode/acp.jsonc` (or `acp.json`), created automatically on first run
- Custom config directory: `$OPENCODE_CONFIG_DIR/acp.jsonc` (or `acp.json`), if `OPENCODE_CONFIG_DIR` is set
- Project: `.opencode/acp.jsonc` (or `acp.json`) in your project's `.opencode` directory
If no acp.jsonc is found, ACP falls back to dcp.jsonc / dcp.json (for backward compatibility with existing DCP installations) and auto-migrates on first write.
Each level overrides the previous, so project settings take priority over global. Restart OpenCode after making config changes.
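The layering can be sketched as follows. This assumes nested sections are merged key by key; whether ACP merges that way or replaces whole sections is not stated here. The `.json` alternates and the `dcp.jsonc` fallback are left out for brevity, and both function names are hypothetical.

```python
import posixpath


def candidate_paths(home, project_dir, env):
    """Config locations in search order; later entries override earlier ones."""
    paths = [posixpath.join(home, ".config", "opencode", "acp.jsonc")]
    config_dir = env.get("OPENCODE_CONFIG_DIR")
    if config_dir:  # only consulted when the variable is set
        paths.append(posixpath.join(config_dir, "acp.jsonc"))
    paths.append(posixpath.join(project_dir, ".opencode", "acp.jsonc"))
    return paths


def deep_merge(base, override):
    """Return base updated with override, merging nested dicts key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


global_cfg = {"compress": {"maxContextLimit": "55%", "minContextLimit": "45%"}}
project_cfg = {"compress": {"maxContextLimit": "40%"}}

effective = deep_merge(global_cfg, project_cfg)
print(candidate_paths("/home/me", "/work/app", {}))
print(effective)
```

Here the project file lowers `maxContextLimit` while the global `minContextLimit` is kept, which is the behavior "project settings take priority over global" suggests under the key-by-key assumption.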
Important
Disable OpenCode's built-in auto-compaction. ACP handles context management itself, and OpenCode's compaction conflicts with ACP and can cause issues (re-expanded messages, lost compression state). Add `{ "compaction": { "auto": false } }` to your opencode.json, or set the environment variable `OPENCODE_DISABLE_AUTOCOMPACT=1`.
Note
If you use models with smaller context windows, such as GitHub Copilot models or local models, lower compress.minContextLimit and compress.maxContextLimit in your configuration to match the available context.
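Both limits accept either an absolute token count or an "X%" string relative to the model's context window. The sketch below shows how such a value could be resolved and what the default 45% / 55% pair means for a smaller window. It ignores `summaryBuffer` and per-model overrides, and the function names are hypothetical rather than ACP's own.

```python
def resolve_limit(value, context_window):
    """Convert a limit given as a number or an "X%" string into tokens."""
    if isinstance(value, str) and value.endswith("%"):
        return int(context_window * float(value[:-1]) / 100)
    return int(value)


def nudge_zone(used_tokens, context_window, min_limit="45%", max_limit="55%"):
    """Classify current usage against the two soft thresholds."""
    low = resolve_limit(min_limit, context_window)
    high = resolve_limit(max_limit, context_window)
    if used_tokens > high:
        return "strong nudges"   # above maxContextLimit
    if used_tokens >= low:
        return "reminders on"    # at or above minContextLimit
    return "reminders off"       # below minContextLimit


window = 128_000  # example of a smaller context window
print(resolve_limit("45%", window), resolve_limit("55%", window))
print(nudge_zone(60_000, window))
```

On a 128,000-token window the percentage defaults resolve to 57,600 and 70,400 tokens, so percentage values scale with the model automatically, while absolute numbers tuned for a 1M window would need to be lowered by hand.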
Default Configuration
```jsonc
{
    "$schema": "https://raw.githubusercontent.com/ranxianglei/opencode-acp/master/dcp.schema.json",
    // Enable or disable the plugin
    "enabled": true,
    // Automatically update npm-installed ACP when a newer npm latest is available.
    // Version-locked plugin specs are not updated.
    "autoUpdate": true,
    // Enable debug logging to ~/.config/opencode/logs/acp/
    "debug": false,
    // Notification display: "off", "minimal", or "detailed"
    "pruneNotification": "detailed",
    // Notification type: "chat" (in-conversation) or "toast" (system toast)
    "pruneNotificationType": "chat",
    // Slash commands configuration
    "commands": {
        "enabled": true,
        // Additional tools to protect from pruning via commands (e.g., /acp sweep)
        "protectedTools": [],
    },
    // Manual mode: disables autonomous context management,
    // tools only run when explicitly triggered via /acp commands
    "manualMode": {
        "enabled": false,
        // When true, automatic cleanup (deduplication, purgeErrors)
        // still runs even in manual mode
        "automaticStrategies": true,
    },
    // Protect from pruning for <turns> message turns past tool invocation
    "turnProtection": {
        "enabled": false,
        "turns": 4,
    },
    // Experimental settings
    "experimental": {
        // Allow ACP processing in subagent sessions
        "allowSubAgents": false,
        // Enable user-editable prompt overrides under dcp-prompts directories
        // When false (default), prompt override files/directories are ignored
        "customPrompts": false,
    },
    // Protect file operations from pruning via glob patterns
    // Patterns match tool parameters.filePath (e.g. read/write/edit)
    "protectedFilePatterns": [],
    // Unified context compression tool and behavior settings
    "compress": {
        // Compression mode: "range" (compress spans into block summaries)
        // or experimental "message" (compress individual raw messages)
        "mode": "range",
        // Permission mode: "allow" (no prompt), "ask" (prompt), "deny" (tool not registered)
        "permission": "allow",
        // Show compression content in a chat notification
        "showCompression": true,
        // Let active summary tokens extend the effective maxContextLimit
        "summaryBuffer": true,
        // Soft upper threshold: above this, ACP keeps injecting strong
        // compression nudges (based on nudgeFrequency), so compression is
        // much more likely. Accepts: number or "X%" of model context window.
        "maxContextLimit": "55%",
        // Soft lower threshold for reminder nudges: below this, turn/iteration
        // reminders are off (compression less likely). At/above this, reminders
        // are on. Accepts: number or "X%" of model context window.
        "minContextLimit": "45%",
        // Optional per-model override for maxContextLimit by providerID/modelID.
        // If present, this wins over the global maxContextLimit.
        // Accepts: number or "X%".
        // Example:
        // "modelMaxLimits": {
        //     "openai/gpt-5.3-codex": 120000,
        //     "anthropic/claude-sonnet-4.6": "80%"
        // },
        // Optional per-model override for minContextLimit.
        // If present, this wins over the global minContextLimit.
        // "modelMinLimits": {
        //     "openai/gpt-5.3-codex": 50000,
        //     "anthropic/claude-sonnet-4.6": "25%"
        // },
        // How often the context-limit nudge fires (1 = every fetch, 5 = every 5th)
        "nudgeFrequency": 5,
        // Start adding compression reminders after this many
        // messages have happened since the last user message
        "iterationNudgeThreshold": 15,
        // Controls how likely compression is after user messages
        // ("strong" = more likely, "soft" = less likely)
        "nudgeForce": "soft",
        // Tool names whose completed outputs are appended to the compression
        "protectedTools": [],
        // Preserve text wrapped in <protect>...</protect> when compressed
        "protectTags": false,
        // Preserve your messages during compression.
        // Warning: large copy-pasted prompts will never be compressed away
        "protectUserMessages": false,
    },
    // Automatic pruning strategies
    "strategies": {
        // Remove duplicate tool calls (same tool with same arguments)
        "deduplication": {
            "enabled": true,
            // Additional tools to protect from pruning
            "protectedTools": [],
        },
        // Prune tool inputs for errored tools after X turns
        "purgeErrors": {
            "enabled": true,
            // Number of turns before errored tool inputs are pruned
            "turns": 4,
            // Additional tools to protect from pruning
            "protectedTools": [],
        },
    },
    // Garbage collection and batch cleanup
    "gc": {
        "algorithm": "truncate",
        // young → old generation promotion after this many survivals
        "promotionThreshold": 5,
        // deactivate a block after this many survivals
        "maxBlockAge": 15,
        // truncate old-gen summaries exceeding this length (chars)
        "maxOldGenSummaryLength": 3000,
        // run major GC when context usage exceeds this
        "majorGcThresholdPercent": "100%",
        // Three-tier batch merge-cleanup for blocks flagged via mark_block.
        // Accepts a number or "X%" of the model context window.
        "batchCleanup": {
            // At/above this usage, remind the model about marked blocks
            "lowThreshold": "60%",
            // At/above this usage, auto merge-compress all marked blocks into one
            "highThreshold": "75%",
            // At/above this usage, force-merge all old-gen blocks (before GC)
            "forceThreshold": "90%",
        },
    },
}
```
ACP exposes six editable prompts:
- system
- compress-range
- compress-message
- context-limit-nudge
- turn-nudge
- iteration-nudge
This feature is disabled by default. Set experimental.customPrompts to true in your ACP config to activate it.
When enabled, managed defaults are written to ~/.config/opencode/acp-prompts/defaults/ as plain-text prompt files. A single README.md in that directory explains each prompt and how to create overrides.
To customize behavior, add a file with the same name under an overrides directory and edit it as plain text.
To reset an override, delete the matching file from your overrides directory.
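The override rule amounts to "use the override file if one exists, otherwise the managed default". A minimal sketch of that lookup follows; the directory arguments, the extension-less file names, and `resolve_prompt` itself are assumptions made for illustration, since the exact overrides path and file naming are not given here.

```python
from pathlib import Path

# The six editable prompt names listed above.
PROMPT_NAMES = (
    "system",
    "compress-range",
    "compress-message",
    "context-limit-nudge",
    "turn-nudge",
    "iteration-nudge",
)


def resolve_prompt(name: str, defaults_dir: Path, overrides_dir: Path) -> str:
    """Return the override text if a same-named file exists, else the default."""
    if name not in PROMPT_NAMES:
        raise ValueError(f"unknown prompt: {name}")
    override = overrides_dir / name
    source = override if override.is_file() else defaults_dir / name
    return source.read_text(encoding="utf-8")
```

Deleting the file from the overrides directory makes the lookup fall through to the default again, which is why removing the file is enough to reset an override.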
By default, these tools are always protected from pruning:
task, skill, todowrite, todoread, compress, decompress, mark_block, unmark_block, batch, plan_enter, plan_exit, write, edit
The protectedTools arrays in commands and strategies add to this default list.
For the compress tool, compress.protectedTools ensures specific tool outputs are appended to the compressed summary. By default it includes task, skill, todowrite, todoread, and decompress.
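Since the `protectedTools` arrays under `commands` and `strategies` add to the default list rather than replace it, the effective set is a union. The sketch below illustrates that for those two sections only (it does not model `compress.protectedTools`, which has a different meaning); the function names are hypothetical.

```python
# Tools that are always protected from pruning, per the list above.
DEFAULT_PROTECTED = frozenset({
    "task", "skill", "todowrite", "todoread", "compress", "decompress",
    "mark_block", "unmark_block", "batch", "plan_enter", "plan_exit",
    "write", "edit",
})


def protected_for_commands(config):
    """Defaults plus commands.protectedTools (used by e.g. /acp sweep)."""
    extra = config.get("commands", {}).get("protectedTools", [])
    return DEFAULT_PROTECTED | set(extra)


def protected_for_strategy(config, strategy):
    """Defaults plus strategies.<strategy>.protectedTools."""
    extra = config.get("strategies", {}).get(strategy, {}).get("protectedTools", [])
    return DEFAULT_PROTECTED | set(extra)


config = {
    "commands": {"protectedTools": ["bash"]},
    "strategies": {"deduplication": {"protectedTools": ["read"]}},
}
print(sorted(protected_for_commands(config) - DEFAULT_PROTECTED))
```

An empty array therefore still leaves the thirteen defaults protected; the arrays can only widen the protected set.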
ACP is a drop-in replacement for DCP. To migrate:
- Remove the old DCP plugin from your `opencode.json`
- Install ACP: `opencode plugin install opencode-acp@latest --global`
- Restart OpenCode
What's preserved:
- Session state (compression blocks, message ID mappings) -- auto-migrated from `plugin/dcp/` to `~/.local/share/opencode/storage/plugin/acp/`
- Config file `~/.config/opencode/dcp.jsonc` -- ACP auto-migrates to `acp.jsonc`
- Prompt overrides in `~/.config/opencode/dcp-prompts/` -- auto-migrated to `acp-prompts/`
What changes:
- Storage directory: `plugin/dcp/` to `plugin/acp/` (auto-migrated on first launch)
- Log directory: `logs/dcp/` to `logs/acp/`
- Slash command: `/dcp` to `/acp` (both work for backward compatibility)
- Notification headers: `DCP` to `ACP`
- Context usage label: `DCP threshold` to `ACP threshold`
ACP auto-migrates config from dcp.jsonc to acp.jsonc and prompts from dcp-prompts/ to acp-prompts/ on first launch.
Bug Fixes (38 total) -- applied on top of DCP v3.1.11
| # | Severity | Summary |
|---|---|---|
| 1 | CRITICAL | State not persisted across restarts -- messageIds, block deactivation, save errors silently lost |
| 2 | CRITICAL | resetOnCompaction() clears all compression blocks -- undoes all pruning work |
| 3 | CRITICAL | prune silently drops summary -- DATA LOSS when no user message precedes anchor |
| 4 | CRITICAL | getCurrentTokenUsage returns 0 -- prevents nudge from ever triggering |
| 5 | HIGH | loadPruneMessagesState duplicates activeBlockIds + reasoning-strip undefined guard |
| 6 | HIGH | Synthetic summary messages get mNNNN refs but are invisible to boundary lookup |
| 7 | HIGH | State not persisted across restarts -- messageIds, block deactivation, and save errors silently lost |
| 8 | HIGH | isMessageCompacted() inconsistent with compaction summary message handling |
| 9 | HIGH | Compressed block summaries retain stale mNNNN message ID tags -- model copies stale IDs |
| 10 | HIGH | Model uses stale mNNNN IDs from nudges/summaries -- compress fails with "startId not available" |
| 11 | HIGH | Major GC skips legacy blocks without generation field -- oversized blocks never collected |
| 12 | HIGH | Percentage-based thresholds calculated against effective input context instead of full model context window |
| 13 | HIGH | Context window leaks -- compressed messages reappear after /compact |
| 14 | HIGH | Compression notifications write full block summaries to DB -- can reach 150KB+ per notification |
| 15 | HIGH | npm auto-install overwrites fork with upstream package |
| 16 | HIGH | Summary mNNNN refs in compress output -- model copies stale message IDs |
| 17 | HIGH | Synthetic messages not in messageIdToBlockId -- compress fails to find them |
| 18 | HIGH | Compress stops model from responding after compression completes |
| 19 | HIGH | Dynamic block guidance breaks API prefix cache |
| 20 | HIGH | GC never deactivates old blocks -- dead-weight accumulates indefinitely |
| 21 | HIGH | Logger + tokenizer 20-50s per-turn latency (268x slowdown) |
| 22 | HIGH | compress throws hard error on reversed block boundaries -- model gives up |
| 23--34 | MEDIUM | Various fixes for dedup, purge errors, schema validation, hook timing, etc. |
| 35 | HIGH | Aging warnings shown at low context usage (<50%) -- triggers unnecessary compress, wastes tokens |
| 36 | HIGH | Compression summary emitted as a standalone user message before the user's real turn -- model reads its own prior assistant output as user input, causing dialog role confusion / self-Q&A loops |
| 37 | HIGH | Message-transform pipeline runs on OpenCode's hidden title/summary/compaction agent requests -- corrupts the request and shared session state, breaking session title generation |
| 38 | CRITICAL | pruneToolOutputs/pruneToolInputs/pruneToolErrors mutate existing messages in-place -- invalidates LLM prefix cache, causing 89% of fresh input tokens to be wasted on cache-invalidating re-sends |
For the complete list with root cause analysis, see the bug tracker.
AGPL-3.0-or-later -- This project is a fork of @tarquinen/opencode-dcp. Original copyright belongs to the original author. Modifications and bug fixes by ranxianglei.
{ "compaction": { "auto": false } }