Add compaction-resilience-guard skill (#38)

ArchieIndian · claude · web-flow · commit d454136f2c4c · 2026-03-17T00:53:31.000+05:30
Monitors compaction for failures (empty, inflation, garbled, repetition)
and enforces a 3-level fallback chain: normal → aggressive → deterministic
truncation. Ensures compaction always makes forward progress.

Inspired by lossless-claw's three-level escalation system.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/skills/openclaw-native/compaction-resilience-guard/SKILL.md b/skills/openclaw-native/compaction-resilience-guard/SKILL.md
@@ -0,0 +1,94 @@
+---
+name: compaction-resilience-guard
+version: "1.0"
+category: openclaw-native
+description: Monitors memory compaction for failures and enforces a three-level fallback chain — normal, aggressive, deterministic truncation — ensuring compaction always makes forward progress.
+stateful: true
+---
+
+# Compaction Resilience Guard
+
+## What it does
+
+Memory compaction can fail silently: the LLM produces empty output, summaries that are *larger* than their input, or garbled text. When this happens, compaction stalls and context overflows.
+
+Compaction Resilience Guard enforces a three-level escalation chain inspired by [lossless-claw](https://github.com/Martian-Engineering/lossless-claw):
+
+| Level | Strategy | When used |
+|---|---|---|
+| L1 — Normal | Standard summarization prompt | First attempt |
+| L2 — Aggressive | Low temperature, reduced reasoning, shorter output target | After L1 failure |
+| L3 — Deterministic | Pure truncation: keep the first 30% and last 20% of the input, drop the middle | After L2 failure |
+
+This ensures compaction **always makes progress** — even if the LLM is broken.
+
+## When to invoke
+
+- After any compaction event — validate the output
+- When context usage approaches 90% — compaction may be failing
+- When summaries seem unusually long or empty — detect inflation
+- As a pre-check before memory-dag-compactor runs
+
+## How to use
+
+```bash
+python3 guard.py --check                       # Validate recent compaction outputs
+python3 guard.py --check --file <summary.yaml> # Check a specific summary file
+python3 guard.py --simulate <text>             # Run the 3-level chain on sample text
+python3 guard.py --report                      # Show failure/escalation history
+python3 guard.py --status                      # Last check summary
+python3 guard.py --format json                 # Machine-readable output
+```
+
+## Failure detection
+
+The guard detects these compaction failures:
+
+| Failure | How detected | Action |
+|---|---|---|
+| Empty output | Summary length < 10 chars | Escalate to next level |
+| Inflation | Summary tokens > input tokens | Escalate to next level |
+| Garbled text | Character-level Shannon entropy > 5.0 (random-looking characters) | Escalate to next level |
+| Repetition | Same 20+ char phrase repeated 3+ times | Escalate to next level |
+| Truncation marker | Contains `[FALLBACK]` or `[TRUNCATED]` | Record as L3 usage |
+| Stale | Summary unchanged from previous run | Flag for review |
+
+## Procedure
+
+**Step 1 — Check recent compaction outputs**
+
+```bash
+python3 guard.py --check
+```
+
+Validates all summary nodes in memory-dag-compactor state. Reports failures by level and whether escalation was needed.
+
+**Step 2 — Simulate the fallback chain**
+
+```bash
+python3 guard.py --simulate "$(cat long-text.txt)"
+```
+
+Runs the 3-level chain on sample text to test that each level produces valid output.
+
+**Step 3 — Review escalation history**
+
+```bash
+python3 guard.py --report
+```
+
+Shows how often each level was used. High L2/L3 usage indicates the primary summarization prompt needs improvement.
+
+## State
+
+Failure counts, escalation history, and per-summary validation results are stored in `~/.openclaw/skill-state/compaction-resilience-guard/state.yaml`.
+
+Fields: `last_check_at`, `level_usage`, `failures`, `check_history`.
+
+## Notes
+
+- Read-only monitoring — does not perform compaction itself
+- Works alongside memory-dag-compactor as a quality gate
+- Deterministic truncation (L3) preserves first 30% and last 20% of input, drops middle
+- Entropy is measured using Shannon entropy on character distribution
+- High L3 usage (>10% of compactions) suggests a systemic LLM issue
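Review note: the failure-detection table in SKILL.md can be sketched roughly as below. This is a minimal illustration, not the code in `guard.py`. The function names, the order of the checks, and the choice to count overlapping phrase windows are all assumptions. The 4-characters-per-token estimate is borrowed from the walkthrough in `example-state.yaml` (10000 chars shown as 2500 tokens). The stale and truncation-marker checks are left out because they need the previous run's state.

```python
import math
from collections import Counter
from typing import Optional


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def has_repetition(text: str, min_len: int = 20, min_repeats: int = 3) -> bool:
    """True if any min_len-character window occurs at least min_repeats times.

    Assumption: overlapping occurrences are counted.
    """
    windows = Counter(text[i:i + min_len] for i in range(len(text) - min_len + 1))
    return any(n >= min_repeats for n in windows.values())


def estimate_tokens(text: str) -> int:
    """Rough token estimate (assumed: about 4 characters per token)."""
    return len(text) // 4


def detect_failure(summary: str, source: str) -> Optional[str]:
    """Return a failure_type name from the table, or None if the summary passes."""
    if len(summary.strip()) < 10:
        return "empty"
    if estimate_tokens(summary) > estimate_tokens(source):
        return "inflation"
    if shannon_entropy(summary) > 5.0:
        return "garbled"
    if has_repetition(summary):
        return "repetition"
    return None
```

Any non-`None` result corresponds to the "Escalate to next level" action in the table. The thresholds (10 chars, 5.0, 20 chars repeated 3 times) are the ones documented above.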
diff --git a/skills/openclaw-native/compaction-resilience-guard/STATE_SCHEMA.yaml b/skills/openclaw-native/compaction-resilience-guard/STATE_SCHEMA.yaml
@@ -0,0 +1,30 @@
+version: "1.0"
+description: Compaction failure tracking, escalation history, and level usage stats.
+fields:
+  last_check_at:
+    type: datetime
+  level_usage:
+    type: object
+    description: How often each fallback level was used
+    fields:
+      l1_normal:        { type: integer, default: 0 }
+      l2_aggressive:    { type: integer, default: 0 }
+      l3_deterministic: { type: integer, default: 0 }
+  failures:
+    type: list
+    description: Recent compaction failures detected
+    items:
+      summary_id:   { type: string }
+      failure_type: { type: enum, values: [empty, inflation, garbled, repetition, stale] }
+      level_used:   { type: integer, description: "1, 2, or 3" }
+      input_tokens: { type: integer }
+      output_tokens: { type: integer }
+      detected_at:  { type: datetime }
+  check_history:
+    type: list
+    description: Rolling log of past checks (last 20)
+    items:
+      checked_at:   { type: datetime }
+      summaries_checked: { type: integer }
+      failures_found:    { type: integer }
+      escalations:       { type: integer }
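Review note: a minimal sketch of how state matching this schema might be updated, assuming the state is held as a plain dict before being written to YAML. The helper names (`empty_state`, `record_level`, `record_check`) are hypothetical and not taken from `guard.py`. Newest-first ordering of `check_history` is inferred from `example-state.yaml`.

```python
from datetime import datetime, timezone
from typing import Optional

MAX_HISTORY = 20  # schema: "Rolling log of past checks (last 20)"


def empty_state() -> dict:
    """State with the schema's top-level fields and default counters."""
    return {
        "last_check_at": None,
        "level_usage": {"l1_normal": 0, "l2_aggressive": 0, "l3_deterministic": 0},
        "failures": [],
        "check_history": [],
    }


def record_level(state: dict, level: str) -> None:
    """Count one compaction at the given level; unknown levels raise KeyError."""
    state["level_usage"][level] += 1


def record_check(state: dict, summaries_checked: int, failures_found: int,
                 escalations: int, now: Optional[datetime] = None) -> None:
    """Prepend a check entry and trim the log to the newest MAX_HISTORY entries."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    state["last_check_at"] = stamp
    state["check_history"].insert(0, {
        "checked_at": stamp,
        "summaries_checked": summaries_checked,
        "failures_found": failures_found,
        "escalations": escalations,
    })
    del state["check_history"][MAX_HISTORY:]
```

Trimming in place keeps the file bounded no matter how often `--check` runs.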
diff --git a/skills/openclaw-native/compaction-resilience-guard/example-state.yaml b/skills/openclaw-native/compaction-resilience-guard/example-state.yaml
@@ -0,0 +1,65 @@
+# Example runtime state for compaction-resilience-guard
+last_check_at: "2026-03-16T23:05:00.000000"
+level_usage:
+  l1_normal: 42
+  l2_aggressive: 3
+  l3_deterministic: 1
+failures:
+  - summary_id: s-d0-012
+    failure_type: inflation
+    level_used: 2
+    input_tokens: 500
+    output_tokens: 620
+    detected_at: "2026-03-16T23:04:58.000000"
+  - summary_id: s-d1-005
+    failure_type: repetition
+    level_used: 3
+    input_tokens: 800
+    output_tokens: 200
+    detected_at: "2026-03-15T23:05:00.000000"
+check_history:
+  - checked_at: "2026-03-16T23:05:00.000000"
+    summaries_checked: 18
+    failures_found: 1
+    escalations: 1
+  - checked_at: "2026-03-15T23:05:00.000000"
+    summaries_checked: 15
+    failures_found: 1
+    escalations: 1
+  - checked_at: "2026-03-14T23:05:00.000000"
+    summaries_checked: 12
+    failures_found: 0
+    escalations: 0
+# ── Walkthrough ──────────────────────────────────────────────────────────────
+# python3 guard.py --check
+#
+#   Compaction Resilience Check — 2026-03-16 23:05
+#   ──────────────────────────────────────────────────
+#     Summaries checked:  18
+#     Failures found:     1
+#     Escalations needed: 1
+#     Status: DEGRADED
+#
+#     ! s-d0-012: inflation (entropy=3.2, 620 tok)
+#
+# python3 guard.py --report
+#
+#   Compaction Resilience Report
+#   ──────────────────────────────────────────────────
+#     Total compactions tracked: 46
+#     L1 Normal:           42 (91%)
+#     L2 Aggressive:        3 (7%)
+#     L3 Deterministic:     1 (2%)
+#
+#     Recent failures: 2
+#       s-d0-012: inflation (L2)
+#       s-d1-005: repetition (L3)
+#
+# python3 guard.py --simulate "$(cat long-text.txt)"
+#
+#   Fallback Chain Simulation
+#   ──────────────────────────────────────────────────
+#     Input:  2500 tokens (10000 chars)
+#     Level used: L1 (l1_normal)
+#     Output: 1000 tokens
+#     Compression: 40%
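Review note: the three-level chain exercised by `--simulate` could look roughly like the sketch below. It is an illustration under stated assumptions, not the implementation in `guard.py`. The two summarizer callables, the `is_valid` stand-in check, and the exact marker text are hypothetical. The 30%/20% split comes from the Notes in SKILL.md, and `[TRUNCATED]` is one of the markers listed in its failure table.

```python
from typing import Callable, Tuple

Summarizer = Callable[[str], str]


def truncate_deterministic(text: str, head_pct: int = 30, tail_pct: int = 20) -> str:
    """L3: keep the first head_pct% and last tail_pct% of lines, drop the middle."""
    lines = text.splitlines()
    head_n = max(1, len(lines) * head_pct // 100)
    tail_n = max(1, len(lines) * tail_pct // 100)
    if head_n + tail_n >= len(lines):
        return text  # too short to cut; a real guard would need another fallback
    dropped = len(lines) - head_n - tail_n
    marker = f"[TRUNCATED] {dropped} lines dropped"
    return "\n".join(lines[:head_n] + [marker] + lines[-tail_n:])


def is_valid(summary: str, source: str) -> bool:
    """Stand-in check (assumption): non-empty and shorter than the input."""
    return len(summary.strip()) >= 10 and len(summary) < len(source)


def compact(text: str, normal: Summarizer, aggressive: Summarizer) -> Tuple[str, str]:
    """Try L1, then L2; fall back to deterministic truncation if both fail."""
    for level, summarize in (("l1_normal", normal), ("l2_aggressive", aggressive)):
        try:
            candidate = summarize(text)
        except Exception:
            continue  # treat a crashing summarizer like an invalid output
        if is_valid(candidate, text):
            return level, candidate
    return "l3_deterministic", truncate_deterministic(text)
```

Because L3 never calls the model, the chain returns a shorter result for any multi-line input even when both summarizers fail, which is the forward-progress guarantee the commit message describes.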
diff --git a/skills/openclaw-native/compaction-resilience-guard/guard.py b/skills/openclaw-native/compaction-resilience-guard/guard.py