PR: Implement Conductor State Retention and Garbage Collection #16
Open
0-robert wants to merge 1 commit into Rigos0:main from
- Adds 'ctl gc-state' to prune stale wakeups, inbox items, and archived workers.
- Implements atomic log rotation for 'events.jsonl' and 'runs.jsonl'.
- Includes comprehensive safety guards for active runs and pending notifications.
- Adds 33 TDD behavioral tests and 10 manual edge-case verification guides.
- Organizes design, architecture (Mermaid), and test docs in 'docs/gc/'.
PR: Implement Conductor State Retention and Garbage Collection
Overview
This PR implements a robust state retention and garbage collection (GC) system for the Conductor. It prevents the .superturtle/state/ directory from growing indefinitely by pruning stale terminal records and rotating append-only logs after a configurable retention window (default: 7 days).

Backlog Reference
.superturtle/state/ does not grow forever (Line 139 in CLAUDE.md).

Visual Architecture
    flowchart TD
        subgraph Trigger["1. Trigger & Configuration"]
            CLI[("./ctl gc-state --max-age 7d")] --> Store["Init ConductorStateStore"]
        end

        subgraph Discovery["2. Safety Guard Discovery"]
            direction LR
            ScanW["Scan /workers/"] --> ActiveSet["Map 'Active Run IDs'<br/>(Lifecycle != archived)"]
            ScanWK["Scan /wakeups/"] --> PendingSet["Map 'Pending Run IDs'<br/>(State == pending|processing)"]
        end

        subgraph Engine["3. Surgical Pruning Engine"]
            direction TB
            subgraph Decisions["Safety Logic Gates"]
                direction LR
                D1{{"Prune Wakeup?"}}
                D2{{"Prune Inbox?"}}
                D3{{"Prune Worker?"}}
            end
            D1 -- "NOT Active" --> P1[("os.unlink(wakeups/)")]
            D2 -- "Age > Cutoff" --> P2[("os.unlink(inbox/)")]
            D3 -- "No Pending" --> P3[("os.unlink(workers/)")]
        end

        subgraph Rotation["4. Atomic Log Rotation"]
            direction LR
            Logs[("events.jsonl<br/>runs.jsonl")] --> Split["Filter Old Lines"]
            Split --> Archive["Append to .1"]
            Split --> Replace["Atomic replace()"]
        end

        Trigger --> Discovery
        Discovery --> Engine
        Engine --> Rotation
        Rotation --> Summary["GcResult Summary"]

        %% Styling
        style Trigger fill:#f9f0ff,stroke:#722ed1,stroke-width:2px
        style Discovery fill:#e6f7ff,stroke:#1890ff,stroke-width:2px
        style Engine fill:#fff7e6,stroke:#fa8c16,stroke-width:2px
        style Rotation fill:#f6ffed,stroke:#52c41a,stroke-width:2px
        style Decisions fill:#ffffff,stroke:#333,stroke-dasharray: 5 5
        style Summary fill:#fff1f0,stroke:#f5222d,stroke-width:4px

Core Implementation
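As a reading aid for the "Safety Guard Discovery" and "Safety Logic Gates" stages of the diagram, here is a minimal sketch of how the two guard sets could gate pruning. It is not the code from this PR: the `Worker` and `Wakeup` records, their field names, and the `plan_gc` function are all hypothetical, and the age check on wakeups and workers is an assumption (the diagram only labels it for inbox items).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical record shapes; the PR's real JSON schema is not shown here.
@dataclass
class Worker:
    run_id: str
    lifecycle: str        # e.g. "running" or "archived"
    updated_at: datetime

@dataclass
class Wakeup:
    run_id: str
    state: str            # e.g. "pending", "processing", "sent"
    updated_at: datetime

UNDELIVERED = ("pending", "processing")

def plan_gc(workers, wakeups, now, max_age=timedelta(days=7)):
    """Return (wakeups_to_prune, workers_to_prune) without deleting anything."""
    cutoff = now - max_age
    # Guard set 1: runs whose worker is not archived are still active.
    active_runs = {w.run_id for w in workers if w.lifecycle != "archived"}
    # Guard set 2: runs that still have undelivered notifications.
    pending_runs = {k.run_id for k in wakeups if k.state in UNDELIVERED}

    prune_wakeups = [
        k for k in wakeups
        if k.updated_at < cutoff
        and k.state not in UNDELIVERED
        and k.run_id not in active_runs      # "NOT Active" gate
    ]
    prune_workers = [
        w for w in workers
        if w.updated_at < cutoff
        and w.lifecycle == "archived"
        and w.run_id not in pending_runs     # "No Pending" gate
    ]
    return prune_wakeups, prune_workers
```

Building both sets before any deletion means every decision is made against the same snapshot of state, and returning a plan instead of unlinking directly is one way a `--dry-run` mode could reuse the same logic.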
- state/conductor_gc.py: A resilient Python module that performs surgical unlinking of stale JSON records.
- Wakeups are not pruned while they belong to an active run_id.
- Workers are not pruned until all pending or processing notifications for that run are delivered.
- Log rotation uses os.replace to avoid log corruption even if the process is interrupted.
- Log files are read with errors="replace" and resilient parsing.
- Adds the gc-state command to ctl with --dry-run and human-readable durations.

Testing & Verification
- 33 TDD behavioral tests in super_turtle/state/test_conductor_gc.py covering state transitions, safety gates, and timestamp precision.
- 10 manual edge-case verification guides (super_turtle/docs/gc/manual_testing.md), including binary log injection and interruption simulation.

New Documentation
- super_turtle/docs/gc/planning.md: Detailed design and scope.
- super_turtle/docs/gc/architecture.md: Visual technical reference.
- super_turtle/docs/gc/manual_testing.md: Manual edge-case verification guide.
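
For reviewers who want a concrete picture of the rotation step ("Filter Old Lines", "Append to .1", "Atomic replace()"), the following is a minimal sketch under stated assumptions, not the module from this PR. The function name `rotate_jsonl` and the `ts` timestamp field are hypothetical; only `os.replace`, `errors="replace"`, and the `.1` archive suffix come from the description above.

```python
import json
import os
import tempfile
from datetime import datetime
from pathlib import Path

def rotate_jsonl(path: Path, cutoff: datetime, ts_field: str = "ts") -> int:
    """Move lines older than `cutoff` from `path` into `path.1`.

    Returns the number of lines archived. `ts_field` is an assumed name
    for an ISO 8601 timestamp key in each JSON line.
    """
    keep, old = [], []
    # errors="replace" keeps decoding alive when the log contains stray bytes.
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if not line.endswith("\n"):
                line += "\n"  # normalise a final line without a newline
            try:
                stamp = datetime.fromisoformat(json.loads(line)[ts_field])
                is_old = stamp < cutoff
            except (ValueError, KeyError, TypeError):
                is_old = False  # unparseable lines are kept, never dropped
            (old if is_old else keep).append(line)

    if not old:
        return 0

    # 1. Append old lines to the archive and flush them to disk first.
    with open(f"{path}.1", "a", encoding="utf-8") as archive:
        archive.writelines(old)
        archive.flush()
        os.fsync(archive.fileno())

    # 2. Write the surviving lines to a temp file in the same directory,
    #    then swap it in. os.replace is atomic on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    with os.fdopen(fd, "w", encoding="utf-8") as out:
        out.writelines(keep)
        out.flush()
        os.fsync(out.fileno())
    os.replace(tmp_name, path)
    return len(old)
```

With this ordering, an interruption before `os.replace` leaves the live log untouched; the cost is that a rerun may append the same old lines to the archive twice. The sketch also does not coordinate with writers appending to the log during rotation, which a real implementation would need to address.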