94 changes: 94 additions & 0 deletions skills/openclaw-native/compaction-resilience-guard/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
---
name: compaction-resilience-guard
version: "1.0"
category: openclaw-native
description: Monitors memory compaction for failures and enforces a three-level fallback chain — normal, aggressive, deterministic truncation — ensuring compaction always makes forward progress.
stateful: true
---

# Compaction Resilience Guard

## What it does

Memory compaction can fail silently: the LLM produces empty output, summaries that are *larger* than their input, or garbled text. When this happens, compaction stalls and context overflows.

Compaction Resilience Guard enforces a three-level escalation chain inspired by [lossless-claw](https://github.com/Martian-Engineering/lossless-claw):

| Level | Strategy | When used |
|---|---|---|
| L1 — Normal | Standard summarization prompt | First attempt |
| L2 — Aggressive | Low temperature, reduced reasoning, shorter output target | After L1 failure |
| L3 — Deterministic | Pure truncation: keep the first 30% and last 20% of the input, drop the middle | After L2 failure |

This ensures compaction **always makes progress** — even if the LLM is broken.
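The escalation order in the table can be sketched as follows. This is a minimal illustration, not the code in `guard.py`: the names `compact` and `is_valid` and the three callables are hypothetical, and validity is reduced to the empty-output and inflation checks, with character length standing in for token counts.

```python
from typing import Callable, Tuple

Summarizer = Callable[[str], str]


def is_valid(summary: str, source: str) -> bool:
    """Hypothetical check: reject near-empty output and output that grew."""
    return len(summary.strip()) >= 10 and len(summary) < len(source)


def compact(
    source: str,
    normal: Summarizer,
    aggressive: Summarizer,
    deterministic: Summarizer,
) -> Tuple[int, str]:
    """Try L1, then L2; fall back to L3, which is accepted unconditionally."""
    for level, summarize in ((1, normal), (2, aggressive)):
        try:
            candidate = summarize(source)
        except Exception:
            # A crashed summarizer counts as a failed level.
            continue
        if is_valid(candidate, source):
            return level, candidate
    # L3 involves no LLM call, so the chain always returns something.
    return 3, deterministic(source)
```

Because the last level is a pure function of the input, the chain terminates with output even when both LLM-backed levels fail.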

## When to invoke

- After any compaction event — validate the output
- When context usage approaches 90% — compaction may be failing
- When summaries seem unusually long or empty — detect inflation
- As a pre-check before memory-dag-compactor runs

## How to use

```bash
python3 guard.py --check # Validate recent compaction outputs
python3 guard.py --check --file <summary.yaml> # Check a specific summary file
python3 guard.py --simulate <text> # Run the 3-level chain on sample text
python3 guard.py --report # Show failure/escalation history
python3 guard.py --status # Last check summary
python3 guard.py --format json # Machine-readable output
```

## Failure detection

The guard detects these compaction failures:

| Failure | How detected | Action |
|---|---|---|
| Empty output | Summary length < 10 chars | Escalate to next level |
| Inflation | Summary tokens > input tokens | Escalate to next level |
| Garbled text | Entropy score > 5.0 (random chars) | Escalate to next level |
| Repetition | Same 20+ char phrase repeated 3+ times | Escalate to next level |
| Truncation marker | Contains `[FALLBACK]` or `[TRUNCATED]` | Record as L3 usage |
| Stale | Summary unchanged from previous run | Flag for review |
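The checks in the table could look roughly like the sketch below. The function names are hypothetical and the thresholds are taken from the table. Two details are assumptions: entropy is computed in bits per character (log base 2), and repetition counts overlapping occurrences of any 20-character window.

```python
import math
from collections import Counter
from typing import List, Optional


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def has_repetition(text: str, min_len: int = 20, min_count: int = 3) -> bool:
    """True if any min_len-character window occurs at least min_count times."""
    windows = Counter(text[i:i + min_len] for i in range(len(text) - min_len + 1))
    return any(n >= min_count for n in windows.values())


def detect_failures(
    summary: str,
    input_tokens: int,
    output_tokens: int,
    previous: Optional[str] = None,
) -> List[str]:
    """Return the failure types from the table that apply to one summary."""
    failures = []
    if len(summary.strip()) < 10:
        failures.append("empty")
    if output_tokens > input_tokens:
        failures.append("inflation")
    if shannon_entropy(summary) > 5.0:
        failures.append("garbled")
    if has_repetition(summary):
        failures.append("repetition")
    if previous is not None and summary == previous:
        failures.append("stale")
    return failures
```

A summary can trip more than one check, so the sketch returns a list rather than a single label.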

## Procedure

**Step 1 — Check recent compaction outputs**

```bash
python3 guard.py --check
```

Validates all summary nodes in memory-dag-compactor state. Reports failures by level and whether escalation was needed.

**Step 2 — Simulate the fallback chain**

```bash
python3 guard.py --simulate "$(cat long-text.txt)"
```

Runs the 3-level chain on sample text to test that each level produces valid output.

**Step 3 — Review escalation history**

```bash
python3 guard.py --report
```

Shows how often each level was used. High L2/L3 usage indicates the primary summarization prompt needs improvement.
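As a rough illustration of how the usage shares might be derived, the sketch below computes each level's fraction from a `level_usage` mapping and applies the 10% L3 threshold mentioned in the Notes. The helper name `level_shares` is hypothetical, and the counts are the ones from the example state file in this PR.

```python
from typing import Dict


def level_shares(level_usage: Dict[str, int]) -> Dict[str, float]:
    """Fraction of tracked compactions handled by each fallback level."""
    total = sum(level_usage.values())
    if total == 0:
        return {name: 0.0 for name in level_usage}
    return {name: count / total for name, count in level_usage.items()}


usage = {"l1_normal": 42, "l2_aggressive": 3, "l3_deterministic": 1}
shares = level_shares(usage)

# Flag a possible systemic LLM issue when L3 handles more than 10%.
l3_alert = shares["l3_deterministic"] > 0.10
```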

## State

Failure counts, escalation history, and per-summary validation results are stored in `~/.openclaw/skill-state/compaction-resilience-guard/state.yaml`.

Fields: `last_check_at`, `level_usage`, `failures`, `check_history`.

## Notes

- Read-only monitoring — does not perform compaction itself
- Works alongside memory-dag-compactor as a quality gate
- Deterministic truncation (L3) preserves first 30% and last 20% of input, drops middle
- Entropy is measured using Shannon entropy on character distribution
- High L3 usage (>10% of compactions) suggests a systemic LLM issue
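
The L3 truncation rule can be sketched as below. This is an assumption-laden illustration rather than the shipped implementation: it measures the 30%/20% split in lines (the notes do not say whether the unit is lines, characters, or tokens), the function name is hypothetical, and it reuses the `[TRUNCATED]` marker from the failure table.

```python
def truncate_deterministic(text: str, head_pct: int = 30, tail_pct: int = 20) -> str:
    """Keep the first head_pct% and last tail_pct% of lines, drop the middle."""
    lines = text.splitlines()
    # Integer arithmetic avoids floating-point rounding surprises.
    head_n = len(lines) * head_pct // 100
    tail_n = len(lines) * tail_pct // 100
    if head_n + tail_n == 0 or head_n + tail_n >= len(lines):
        # Too short to cut anything meaningful; return the input unchanged.
        return text
    head = lines[:head_n]
    tail = lines[len(lines) - tail_n:]
    return "\n".join(head + ["[TRUNCATED]"] + tail)
```

For a 10-line input this keeps lines 1 to 3 and 9 to 10, with the marker in between, so a later check can recognize the result as L3 output.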
@@ -0,0 +1,30 @@
version: "1.0"
description: Compaction failure tracking, escalation history, and level usage stats.
fields:
  last_check_at:
    type: datetime
  level_usage:
    type: object
    description: How often each fallback level was used
    fields:
      l1_normal: { type: integer, default: 0 }
      l2_aggressive: { type: integer, default: 0 }
      l3_deterministic: { type: integer, default: 0 }
  failures:
    type: list
    description: Recent compaction failures detected
    items:
      summary_id: { type: string }
      failure_type: { type: enum, values: [empty, inflation, garbled, repetition, stale] }
      level_used: { type: integer, description: "1, 2, or 3" }
      input_tokens: { type: integer }
      output_tokens: { type: integer }
      detected_at: { type: datetime }
  check_history:
    type: list
    description: Rolling log of past checks (last 20)
    items:
      checked_at: { type: datetime }
      summaries_checked: { type: integer }
      failures_found: { type: integer }
      escalations: { type: integer }
@@ -0,0 +1,65 @@
# Example runtime state for compaction-resilience-guard
last_check_at: "2026-03-16T23:05:00.000000"
level_usage:
  l1_normal: 42
  l2_aggressive: 3
  l3_deterministic: 1
failures:
  - summary_id: s-d0-012
    failure_type: inflation
    level_used: 2
    input_tokens: 500
    output_tokens: 620
    detected_at: "2026-03-16T23:04:58.000000"
  - summary_id: s-d1-005
    failure_type: repetition
    level_used: 3
    input_tokens: 800
    output_tokens: 200
    detected_at: "2026-03-15T23:05:00.000000"
check_history:
  - checked_at: "2026-03-16T23:05:00.000000"
    summaries_checked: 18
    failures_found: 1
    escalations: 1
  - checked_at: "2026-03-15T23:05:00.000000"
    summaries_checked: 15
    failures_found: 1
    escalations: 1
  - checked_at: "2026-03-14T23:05:00.000000"
    summaries_checked: 12
    failures_found: 0
    escalations: 0
# ── Walkthrough ──────────────────────────────────────────────────────────────
# python3 guard.py --check
#
# Compaction Resilience Check — 2026-03-16 23:05
# ──────────────────────────────────────────────────
# Summaries checked: 18
# Failures found: 1
# Escalations needed: 1
# Status: DEGRADED
#
# ! s-d0-012: inflation (entropy=3.2, 620 tok)
#
# python3 guard.py --report
#
# Compaction Resilience Report
# ──────────────────────────────────────────────────
# Total compactions tracked: 46
# L1 Normal: 42 (91%)
# L2 Aggressive: 3 (7%)
# L3 Deterministic: 1 (2%)
#
# Recent failures: 2
# s-d0-012: inflation (L2)
# s-d1-005: repetition (L3)
#
# python3 guard.py --simulate "$(cat long-text.txt)"
#
# Fallback Chain Simulation
# ──────────────────────────────────────────────────
# Input: 2500 tokens (10000 chars)
# Level used: L1 (l1_normal)
# Output: 1000 tokens
# Compression: 40%