Commit d454136

ArchieIndian and claude authored
Add compaction-resilience-guard skill (#38)
Monitors compaction for failures (empty, inflation, garbled, repetition) and enforces a 3-level fallback chain: normal → aggressive → deterministic truncation. Ensures compaction always makes forward progress. Inspired by lossless-claw's three-level escalation system.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent c85d820 commit d454136

4 files changed

Lines changed: 562 additions & 0 deletions

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
---
name: compaction-resilience-guard
version: "1.0"
category: openclaw-native
description: Monitors memory compaction for failures and enforces a three-level fallback chain (normal, aggressive, deterministic truncation), ensuring compaction always makes forward progress.
stateful: true
---

# Compaction Resilience Guard

## What it does

Memory compaction can fail silently: the LLM produces empty output, summaries that are *larger* than their input, or garbled text. When this happens, compaction stalls and the context overflows.

Compaction Resilience Guard enforces a three-level escalation chain inspired by [lossless-claw](https://github.com/Martian-Engineering/lossless-claw):

| Level | Strategy | When used |
|---|---|---|
| L1: Normal | Standard summarization prompt | First attempt |
| L2: Aggressive | Low temperature, reduced reasoning, shorter output target | After L1 failure |
| L3: Deterministic | Pure truncation: keep the first 30% and last 20% of the input, drop the middle | After L2 failure |

This ensures compaction **always makes progress**, even if the LLM is broken.

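
The escalation order in the table can be sketched as a small driver loop. This is an illustration under stated assumptions, not the code in `guard.py`: `run_chain` and the `Strategy` and `Validator` aliases are hypothetical names, and the two LLM-backed strategies are supplied by the caller. The level names passed in are assumed to mirror the `level_usage` keys in the state file.

```python
from typing import Callable, Iterable

# Hypothetical aliases, introduced only for this sketch.
Strategy = Callable[[str], str]          # source text in, candidate summary out
Validator = Callable[[str, str], bool]   # (source, summary) -> acceptable?


def run_chain(source: str,
              llm_levels: Iterable[tuple[str, Strategy]],
              fallback: Strategy,
              is_valid: Validator) -> tuple[str, str]:
    """Return (level_name, output), escalating until one level succeeds."""
    for name, strategy in llm_levels:
        try:
            candidate = strategy(source)
        except Exception:
            continue  # an LLM error is treated like an invalid summary
        if is_valid(source, candidate):
            return name, candidate
    # The last level never calls the LLM, so it cannot fail the same way.
    return "l3_deterministic", fallback(source)
```

Because the final level is a pure function of the input, the chain returns a result even when every LLM-backed level raises or produces invalid output.
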
## When to invoke

- After any compaction event, to validate the output
- When context usage approaches 90%, since compaction may be failing
- When summaries seem unusually long or empty, to detect inflation or empty output
- As a pre-check before memory-dag-compactor runs

## How to use

```bash
python3 guard.py --check                          # Validate recent compaction outputs
python3 guard.py --check --file <summary.yaml>    # Check a specific summary file
python3 guard.py --simulate <text>                # Run the 3-level chain on sample text
python3 guard.py --report                         # Show failure/escalation history
python3 guard.py --status                         # Last check summary
python3 guard.py --format json                    # Machine-readable output
```

## Failure detection

The guard detects these compaction failures:

| Failure | How detected | Action |
|---|---|---|
| Empty output | Summary length < 10 chars | Escalate to next level |
| Inflation | Summary tokens > input tokens | Escalate to next level |
| Garbled text | Shannon character entropy > 5.0 (near-random characters) | Escalate to next level |
| Repetition | Same 20+ char phrase repeated 3+ times | Escalate to next level |
| Truncation marker | Contains `[FALLBACK]` or `[TRUNCATED]` | Record as L3 usage |
| Stale | Summary unchanged from previous run | Flag for review |

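
The first four checks in the table are simple enough to sketch directly. The code below is an illustration built on assumptions, not the guard's actual implementation: all function names are hypothetical, tokens are approximated by whitespace-separated words, entropy is computed in bits (log base 2), and "repeated" is taken to mean non-overlapping occurrences of the same 20-character window. The truncation-marker and stale checks are left out for brevity.

```python
import math
from collections import Counter
from typing import Optional


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(text).values())


def has_repetition(text: str, min_len: int = 20, min_repeats: int = 3) -> bool:
    """True if any min_len-character window occurs at least min_repeats times."""
    seen = set()
    for i in range(len(text) - min_len + 1):
        phrase = text[i:i + min_len]
        if phrase in seen:
            continue
        seen.add(phrase)
        if text.count(phrase) >= min_repeats:  # non-overlapping count
            return True
    return False


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer (assumption): whitespace words."""
    return len(text.split())


def detect_failure(source: str, summary: str) -> Optional[str]:
    """Return the first matching failure type, or None if the summary passes."""
    if len(summary.strip()) < 10:
        return "empty"
    if count_tokens(summary) > count_tokens(source):
        return "inflation"
    if shannon_entropy(summary) > 5.0:
        return "garbled"
    if has_repetition(summary):
        return "repetition"
    return None
```

The returned strings match the `failure_type` values used in the state file, so a caller could store them without translation. The window scan in `has_repetition` is quadratic in the summary length, which is acceptable for short summaries but would need a smarter approach for long ones.
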
## Procedure

**Step 1: Check recent compaction outputs**

```bash
python3 guard.py --check
```

Validates all summary nodes in the memory-dag-compactor state. Reports failures by level and whether escalation was needed.

**Step 2: Simulate the fallback chain**

```bash
python3 guard.py --simulate "$(cat long-text.txt)"
```

Runs the 3-level chain on sample text to test that each level produces valid output.

**Step 3: Review escalation history**

```bash
python3 guard.py --report
```

Shows how often each level was used. High L2/L3 usage indicates that the primary summarization prompt needs improvement.

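
To make "high usage" concrete, the per-level shares can be derived from the `level_usage` counters. The sketch below is hypothetical (the helper names `usage_shares` and `l3_alert` are not taken from `guard.py`); the 10% threshold is the figure given in the Notes section.

```python
def usage_shares(level_usage: dict[str, int]) -> dict[str, int]:
    """Percentage of tracked compactions handled by each level, rounded."""
    total = sum(level_usage.values())
    if total == 0:
        return {name: 0 for name in level_usage}
    return {name: round(100 * count / total)
            for name, count in level_usage.items()}


def l3_alert(level_usage: dict[str, int], threshold: float = 0.10) -> bool:
    """True when deterministic truncation exceeds the given share of runs."""
    total = sum(level_usage.values())
    return total > 0 and level_usage.get("l3_deterministic", 0) / total > threshold
```

With the counters from the example state in this commit (42, 3, and 1), `usage_shares` yields 91, 7, and 2, and `l3_alert` returns `False`.
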
## State

Failure counts, escalation history, and per-summary validation results are stored in `~/.openclaw/skill-state/compaction-resilience-guard/state.yaml`.

Fields: `last_check_at`, `level_usage`, `failures`, `check_history`.

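
A minimal sketch of how a check might be recorded against these fields follows. It works on a plain dictionary and leaves YAML loading and saving out, since the serialization code is not shown here. `empty_state`, `record_check`, and `MAX_HISTORY` are hypothetical names; the cap of 20 entries comes from the state schema's description of `check_history`, and the newest-first ordering is inferred from the example state file.

```python
from datetime import datetime
from typing import Optional

MAX_HISTORY = 20  # schema describes check_history as "last 20"


def empty_state() -> dict:
    """State with the four documented top-level fields at their defaults."""
    return {
        "last_check_at": None,
        "level_usage": {"l1_normal": 0, "l2_aggressive": 0, "l3_deterministic": 0},
        "failures": [],
        "check_history": [],
    }


def record_check(state: dict, summaries_checked: int, failures_found: int,
                 escalations: int, now: Optional[datetime] = None) -> dict:
    """Prepend one check entry and trim the history to MAX_HISTORY items."""
    stamp = (now or datetime.now()).isoformat(timespec="microseconds")
    entry = {
        "checked_at": stamp,
        "summaries_checked": summaries_checked,
        "failures_found": failures_found,
        "escalations": escalations,
    }
    state["last_check_at"] = stamp
    state["check_history"] = [entry] + state["check_history"][:MAX_HISTORY - 1]
    return state
```

Passing `now` explicitly keeps the function deterministic in tests; the timestamp format matches the one used in the example state (`2026-03-16T23:05:00.000000`).
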
## Notes

- Read-only monitoring: does not perform compaction itself
- Works alongside memory-dag-compactor as a quality gate
- Deterministic truncation (L3) preserves the first 30% and last 20% of the input and drops the middle
- Entropy is Shannon entropy over the character distribution
- High L3 usage (>10% of compactions) suggests a systemic LLM issue
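
The L3 rule in the notes above can be sketched as a pure function. This is an assumption-based illustration rather than the guard's code: `truncate_middle` is a hypothetical name, the split is done on lines, and inserting the `[TRUNCATED]` marker at the cut is a guess based on the marker listed under failure detection.

```python
def truncate_middle(text: str, head_pct: int = 30, tail_pct: int = 20,
                    marker: str = "[TRUNCATED]") -> str:
    """Keep the first head_pct% and last tail_pct% of lines; drop the middle."""
    lines = text.splitlines()
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    keep_head = max(1, len(lines) * head_pct // 100)
    keep_tail = max(1, len(lines) * tail_pct // 100)
    if keep_head + keep_tail >= len(lines):
        return text  # too short to drop anything
    return "\n".join(lines[:keep_head] + [marker] + lines[-keep_tail:])
```

For a 10-line input this keeps lines 1 to 3 and 9 to 10 with the marker between them. Input that arrives as one very long line is returned unchanged, so a character-based split would be needed to cover that case.
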
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
version: "1.0"
description: Compaction failure tracking, escalation history, and level usage stats.
fields:
  last_check_at:
    type: datetime
  level_usage:
    type: object
    description: How often each fallback level was used
    fields:
      l1_normal: { type: integer, default: 0 }
      l2_aggressive: { type: integer, default: 0 }
      l3_deterministic: { type: integer, default: 0 }
  failures:
    type: list
    description: Recent compaction failures detected
    items:
      summary_id: { type: string }
      failure_type: { type: enum, values: [empty, inflation, garbled, repetition, stale] }
      level_used: { type: integer, description: "1, 2, or 3" }
      input_tokens: { type: integer }
      output_tokens: { type: integer }
      detected_at: { type: datetime }
  check_history:
    type: list
    description: Rolling log of past checks (last 20)
    items:
      checked_at: { type: datetime }
      summaries_checked: { type: integer }
      failures_found: { type: integer }
      escalations: { type: integer }
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
# Example runtime state for compaction-resilience-guard
last_check_at: "2026-03-16T23:05:00.000000"
level_usage:
  l1_normal: 42
  l2_aggressive: 3
  l3_deterministic: 1
failures:
  - summary_id: s-d0-012
    failure_type: inflation
    level_used: 2
    input_tokens: 500
    output_tokens: 620
    detected_at: "2026-03-16T23:04:58.000000"
  - summary_id: s-d1-005
    failure_type: repetition
    level_used: 3
    input_tokens: 800
    output_tokens: 200
    detected_at: "2026-03-15T23:05:00.000000"
check_history:
  - checked_at: "2026-03-16T23:05:00.000000"
    summaries_checked: 18
    failures_found: 1
    escalations: 1
  - checked_at: "2026-03-15T23:05:00.000000"
    summaries_checked: 15
    failures_found: 1
    escalations: 1
  - checked_at: "2026-03-14T23:05:00.000000"
    summaries_checked: 12
    failures_found: 0
    escalations: 0
# ── Walkthrough ──────────────────────────────────────────────────────────────
# python3 guard.py --check
#
# Compaction Resilience Check — 2026-03-16 23:05
# ──────────────────────────────────────────────────
# Summaries checked: 18
# Failures found: 1
# Escalations needed: 1
# Status: DEGRADED
#
# ! s-d0-012: inflation (entropy=3.2, 620 tok)
#
# python3 guard.py --report
#
# Compaction Resilience Report
# ──────────────────────────────────────────────────
# Total compactions tracked: 46
# L1 Normal: 42 (91%)
# L2 Aggressive: 3 (7%)
# L3 Deterministic: 1 (2%)
#
# Recent failures: 2
# s-d0-012: inflation (L2)
# s-d1-005: repetition (L3)
#
# python3 guard.py --simulate "$(cat long-text.txt)"
#
# Fallback Chain Simulation
# ──────────────────────────────────────────────────
# Input: 2500 tokens (10000 chars)
# Level used: L1 (l1_normal)
# Output: 1000 tokens
# Compression: 40%
