Caveman Claw

Cut your OpenClaw token costs without losing technical accuracy.

Every message you send to an LLM costs tokens — and most of that cost comes from the conversation history that gets sent along with it. As a conversation grows, the model receives your entire history on every single request. Caveman Claw reduces that cost in two ways:

  1. Output mode — tells the AI to respond in a terse, fragment-based style. Same technical accuracy, fewer words, fewer tokens billed.
  2. Auto-compression — automatically trims older messages in your conversation history before each request. You never notice it happening, but you pay for less.

Requirements


Install

git clone https://github.com/mCo0L/caveman-claw
openclaw plugins install ./caveman-claw
openclaw plugins enable caveman-claw

Once published to npm, the install will be:

openclaw plugins install @mco0l/caveman-claw
openclaw plugins enable caveman-claw

That's it. After enabling, caveman mode activates automatically for every new session (controlled by globalByDefault — on by default).


Updating

Since Caveman Claw isn't on ClawHub yet, updates are manual:

Option 1 — if you cloned the repo locally:

cd /path/to/your/caveman-claw
git pull
openclaw plugins install .
openclaw plugins disable caveman-claw
openclaw plugins enable caveman-claw

Option 2 — pull directly in the plugins directory:

cd ~/.openclaw/plugins/caveman-claw
git pull
openclaw plugins disable caveman-claw
openclaw plugins enable caveman-claw

What happens out of the box

Once installed with default settings:

  • Every new session gets caveman mode automatically — the AI responds in terse, fragment-based style from the first message
  • Once OpenClaw ships the context_before_request hook (see Current limitations), every LLM request has older conversation history algorithmically compressed before it's sent: articles stripped, filler cut, synonyms shortened
  • Token savings are tracked and available any time with /cc stats

You don't need to type anything to get the benefits. It just runs.


Commands

You can also control caveman mode manually from chat:

Command              What it does
/cc                  Activate caveman mode (full intensity)
/cc lite             Mild mode: drops filler, keeps full sentences
/cc ultra            Aggressive mode: abbreviates fn, param, cfg, err, etc.
/cc compress <file>  Compress a context file like SOUL.md or SKILL.md
/cc stats            Show how many tokens and dollars you've saved
normal               Turn off caveman mode for this session
stop caveman         Same as above

Intensity levels explained

lite — Removes filler phrases ("Sure!", "Of course", "I'd be happy to") and stops there. Full sentences preserved.

full (default) — Everything in lite, plus: fragments instead of full sentences, hedging removed ("might", "possibly", "appears to"), common words shortened ("utilize" → "use", "however" → "but", "additionally" → "also").

ultra — Everything in full, plus: technical terms abbreviated (function → fn, parameter → param, configuration → cfg, error → err, repository → repo).
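
The three levels stack: each one applies the previous level's rules plus its own. The following is a minimal TypeScript sketch of that layering, assuming simple regex rules. The rule tables and the `applyIntensity` name are hypothetical, and only the example words come from the descriptions above. Turning sentences into fragments is not modeled here.

```typescript
type Intensity = "lite" | "full" | "ultra";
type Rule = readonly [RegExp, string];

// lite: strip filler openers only (illustrative list, not the plugin's).
const LITE: Rule[] = [
  [/(?:Sure!|Of course[!,.]?|I'd be happy to(?: help)?[!,.]?)\s*/gi, ""],
];

// full: additionally drop hedging and shorten verbose synonyms.
const FULL: Rule[] = [
  [/\b(?:might|possibly|appears to)\s+/gi, ""],
  [/\butilize\b/gi, "use"],
  [/\bhowever\b/gi, "but"],
  [/\badditionally\b/gi, "also"],
];

// ultra: additionally abbreviate common technical terms.
const ULTRA: Rule[] = [
  [/\bfunction\b/gi, "fn"],
  [/\bparameter\b/gi, "param"],
  [/\bconfiguration\b/gi, "cfg"],
  [/\berror\b/gi, "err"],
  [/\brepository\b/gi, "repo"],
];

// Each level is the previous level's rules plus its own.
const LEVELS: Record<Intensity, Rule[]> = {
  lite: LITE,
  full: [...LITE, ...FULL],
  ultra: [...LITE, ...FULL, ...ULTRA],
};

function applyIntensity(text: string, level: Intensity): string {
  let out = text;
  for (const [pattern, replacement] of LEVELS[level]) {
    out = out.replace(pattern, replacement);
  }
  // Collapse double spaces left behind by removed words.
  return out.replace(/ {2,}/g, " ").trim();
}
```

For the input "Sure! However, we might utilize the function.", this sketch leaves everything after "Sure!" intact at lite, and at ultra it also rewrites "function" to "fn".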

Compressing context files

If you have large context files (SOUL.md, SKILL.md, project docs) that get loaded on every request, you can compress them permanently:

/cc compress SOUL.md

This rewrites the file in caveman format and saves a .original.md backup first. Code blocks, URLs, and numbers are never modified.
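
One common way to guarantee that code blocks, URLs, and numbers survive untouched is to swap them for placeholders before compressing and put them back afterward. The sketch below shows that technique under stated assumptions: the `PROTECTED` pattern, the placeholder character, and the `compressProtected` name are all hypothetical, and the plugin's actual patterns may differ.

```typescript
// Fenced code blocks, URLs, and numbers, in that priority order.
const PROTECTED = /`{3}[\s\S]*?`{3}|https?:\/\/\S+|\b\d+(?:[.,]\d+)*\b/g;

// Private-use character, assumed never to appear in real input.
const MARK = "\uE000";

function compressProtected(
  text: string,
  compress: (s: string) => string,
): string {
  const saved: string[] = [];

  // 1. Replace every protected span with an indexed placeholder.
  const masked = text.replace(PROTECTED, (match: string) => {
    saved.push(match);
    return `${MARK}${saved.length - 1}${MARK}`;
  });

  // 2. Compress only the masked text.
  const compressed = compress(masked);

  // 3. Restore the original spans exactly as they were.
  return compressed.replace(
    /\uE000(\d+)\uE000/g,
    (_match: string, index: string) => saved[Number(index)],
  );
}
```

With an article-stripping `compress` function, a URL such as https://example.com/a/the keeps its trailing "the" because the rule never sees it.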


Savings report

/cc stats
Caveman Claw savings:

Session:  -12,400 tokens  (~$0.01 saved)
Today:    -89,200 tokens  (~$0.07 saved)
All time: -1.2M tokens    (~$0.96 saved)

Model: openrouter/anthropic/claude-haiku-4-5 · $0.80/1M tokens
Note: tracks input compression only. Output savings not measured.

Pricing is read automatically from your OpenClaw config (~/.openclaw/openclaw.json) and model catalog — no manual setup needed.
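
The dollar figure is plain arithmetic on the token count and the per-million price. A minimal sketch, assuming a single input price per model; the `estimateSavingsUsd` name is hypothetical:

```typescript
// Dollars saved = tokens saved / 1,000,000 * price per million tokens.
function estimateSavingsUsd(
  tokensSaved: number,
  usdPerMillionTokens: number,
): number {
  return (tokensSaved / 1_000_000) * usdPerMillionTokens;
}

// Example: 250,000 tokens at $0.80 per million tokens.
const saved = estimateSavingsUsd(250_000, 0.8);
console.log(`~$${saved.toFixed(2)} saved`); // ~$0.20 saved
```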


Configuration

After installing, you can edit plugin.json in the Caveman Claw plugin directory to change defaults:

{
  "config": {
    "autoCompress": true,
    "historyDepth": 2,
    "intensity": "full",
    "globalByDefault": true
  }
}

Option           Default  Description
globalByDefault  true     Automatically activate caveman mode for every new session
autoCompress     true     Compress older conversation history before each LLM call
historyDepth     2        How many of the most recent assistant turns to leave uncompressed
intensity        "full"   Compression level: lite, full, or ultra

Setting globalByDefault: false means caveman mode is opt-in — you activate it manually per session with /cc.
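
For contributors, the four options map naturally onto a typed config object. The sketch below assumes that missing or invalid values fall back to the documented defaults; that fallback behavior and the `resolveConfig` name are assumptions, not confirmed plugin behavior.

```typescript
type Intensity = "lite" | "full" | "ultra";

interface CavemanConfig {
  autoCompress: boolean;
  historyDepth: number;
  intensity: Intensity;
  globalByDefault: boolean;
}

// Defaults as documented in the options table.
const DEFAULT_CONFIG: CavemanConfig = {
  autoCompress: true,
  historyDepth: 2,
  intensity: "full",
  globalByDefault: true,
};

function isIntensity(value: unknown): value is Intensity {
  return value === "lite" || value === "full" || value === "ultra";
}

// Merge the "config" object from plugin.json over the defaults,
// ignoring any value with the wrong type or range (assumed policy).
function resolveConfig(raw: Record<string, unknown> = {}): CavemanConfig {
  const cfg: CavemanConfig = { ...DEFAULT_CONFIG };
  const { autoCompress, historyDepth, intensity, globalByDefault } = raw;

  if (typeof autoCompress === "boolean") cfg.autoCompress = autoCompress;
  if (typeof globalByDefault === "boolean") cfg.globalByDefault = globalByDefault;
  if (
    typeof historyDepth === "number" &&
    Number.isInteger(historyDepth) &&
    historyDepth >= 0
  ) {
    cfg.historyDepth = historyDepth;
  }
  if (isIntensity(intensity)) cfg.intensity = intensity;

  return cfg;
}
```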


How compression works

When a request is sent, Caveman Claw processes the conversation history before it reaches the model:

  • User messages are never compressed
  • System messages are never compressed
  • The most recent N assistant turns are left uncompressed (controlled by historyDepth)
  • Older assistant messages are compressed: articles removed, filler phrases stripped, synonyms shortened, hedging cut
  • Code blocks, URLs, and numbers are extracted and restored exactly — never touched

When compression reduces the history, a short system note is injected before your latest message. It tells the model that older context was pre-compressed and asks it to respond with matching brevity.
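
The selection rules above can be sketched as a single pass over the history, newest message first. This is an illustration under stated assumptions: the `Message` shape, the `prepareHistory` name, and the wording of the note are hypothetical, and `compress` stands in for the plugin's compression function.

```typescript
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// Hypothetical wording; the plugin's actual note may differ.
const COMPRESSION_NOTE: Message = {
  role: "system",
  content: "Older assistant messages were pre-compressed. Respond with matching brevity.",
};

function prepareHistory(
  messages: Message[],
  historyDepth: number,
  compress: (s: string) => string,
): Message[] {
  const out = messages.slice(); // never mutate the caller's array
  let recentKept = 0;
  let reduced = false;

  // Walk newest to oldest so the first assistant turns seen are the most recent.
  for (let i = out.length - 1; i >= 0; i--) {
    const msg = out[i];
    if (msg.role !== "assistant") continue; // user and system stay as-is
    if (recentKept < historyDepth) {
      recentKept++; // within the uncompressed window
      continue;
    }
    const content = compress(msg.content);
    if (content.length < msg.content.length) reduced = true;
    out[i] = { ...msg, content };
  }

  if (!reduced) return out;

  // Inject the note directly before the latest user message.
  for (let i = out.length - 1; i >= 0; i--) {
    if (out[i].role === "user") {
      out.splice(i, 0, COMPRESSION_NOTE);
      break;
    }
  }
  return out;
}
```

With `historyDepth` set to 2 and three assistant turns in the history, only the oldest assistant turn is compressed.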


Current limitations

  • Auto-compression requires context_before_request hook support in OpenClaw (issue #33017). OpenClaw does not ship this hook yet; the plugin registers it in advance, so auto-compression will activate automatically once the hook is available. Until then, globalByDefault: true (the default) still gives you caveman output mode on every session, and /cc compress handles manual context file compression.
  • Savings tracking covers input tokens only — output token reduction from shorter responses is not measured.
  • Token estimation is approximate — uses a character-based heuristic (~4 chars/token). Accurate enough for tracking savings trends; not suitable for billing.
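
The character-based heuristic is simple enough to show in full. A minimal sketch, assuming the estimate rounds up; the rounding choice and the function names are hypothetical:

```typescript
// Roughly 4 characters per token; an estimate, not a tokenizer.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Estimated tokens saved by compressing `before` into `after`.
function estimateTokensSaved(before: string, after: string): number {
  return Math.max(0, estimateTokens(before) - estimateTokens(after));
}
```

Real tokenizers split text differently per model, which is why the result suits trend tracking rather than billing.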

Troubleshooting

/cc not showing up in chat
Run openclaw plugins list to confirm the plugin is enabled. If it shows as disabled, run openclaw plugins enable caveman-claw.

Stats show $0 savings
Pricing requires ~/.openclaw/openclaw.json (for the active model) and ~/.openclaw/agents/main/agent/models.json (for per-model pricing). OpenClaw creates these files automatically. If they're missing, token counts are still tracked, just without dollar estimates.

Caveman mode feels too aggressive
Switch to lite mode: set "intensity": "lite" in plugin.json, or type /cc lite in chat for the current session only.


Under the hood

For contributors and curious users:

  • Compression is purely algorithmic — regex-based rules strip articles, filler phrases, hedging words, and replace verbose synonyms with shorter ones. Code blocks, URLs, and numbers are extracted before processing and restored exactly afterward.
  • Hook registration uses api.registerHook({ name: 'context_before_request', handler }) — a forward-compatible registration that activates once OpenClaw ships the hook.
  • Savings tracking writes to data/savings.json using atomic temp-file-then-rename writes to prevent corruption on concurrent requests.
  • Model pricing is read from ~/.openclaw/openclaw.json (active model) and ~/.openclaw/agents/main/agent/models.json (per-model costs) — the same files OpenClaw uses for its own dashboard.
  • Session activation for globalByDefault listens to session:patch and command:new events and records activated session IDs in data/activated-sessions.json.
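
The temp-file-then-rename pattern mentioned for savings tracking can be sketched with Node's standard `fs` module. This is an illustration of the general technique, not the plugin's code: the `writeJsonAtomic` name and the temp-file naming scheme are assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Write JSON so that readers see either the old file or the new one,
// never a partially written file.
function writeJsonAtomic(filePath: string, data: unknown): void {
  const dir = path.dirname(filePath);
  fs.mkdirSync(dir, { recursive: true });

  // Keep the temp file in the same directory so the rename stays on one filesystem.
  const tmpPath = path.join(
    dir,
    `.${path.basename(filePath)}.${process.pid}.${Date.now()}.tmp`,
  );

  try {
    fs.writeFileSync(tmpPath, JSON.stringify(data, null, 2), "utf8");
    fs.renameSync(tmpPath, filePath); // replaces the target in one step
  } catch (err) {
    fs.rmSync(tmpPath, { force: true }); // clean up on failure
    throw err;
  }
}
```

The rename protects against torn files, but two concurrent writers can still overwrite each other's totals; the last rename wins.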

Credits

Caveman Claw was directly inspired by caveman by Julius Brussee — the original token-saving caveman-speak plugin for Claude Code. The core idea, compression philosophy, and intensity levels all trace back to that project.

Caveman Claw is a ground-up reimplementation for the OpenClaw ecosystem: TypeScript compression library, native plugin registration, token savings tracking against OpenClaw's own model pricing, and an auto-compression hook. Credit for the concept belongs entirely to Julius.


Contributing

Issues and PRs welcome at github.com/mCo0L/caveman-claw.


License

MIT
