From b7eed7f9c324b22304491140bc7e56b12e125dd6 Mon Sep 17 00:00:00 2001 From: Sergei Garin Date: Sun, 24 May 2026 20:59:37 +0300 Subject: [PATCH 1/3] Add loop skill --- README.md | 5 ++ skills/loop/SKILL.md | 65 ++++++++++++++++++++ skills/loop/references/protocol.md | 96 ++++++++++++++++++++++++++++++ 3 files changed, 166 insertions(+) create mode 100644 skills/loop/SKILL.md create mode 100644 skills/loop/references/protocol.md diff --git a/README.md b/README.md index 02bdd51..4b139fa 100644 --- a/README.md +++ b/README.md @@ -200,6 +200,11 @@ If a skill needs reusable instructions that are not a runnable skill: - Use when: scope is locked and the main job is execution. - Do not use when: the task still needs discovery or approval shaping. +- `skills/loop` + - What it is: bounded agent-agnostic iteration loop that runs a fresh subagent/executor for each pass and reports progress after every iteration. + - Use when: the user says `Loop:` or explicitly asks to keep iterating a task until done, blocked, no-progress, or max iterations. + - Do not use when: the task is a normal one-shot request or when looping would bypass required approval, safety, or external-action gates. + - `skills/code-review-orchestrator` - What it is: one entrypoint for multi-role code review with merged findings. - Use when: the user wants a repo, diff, branch, or PR reviewed from one or more specialist angles. diff --git a/skills/loop/SKILL.md b/skills/loop/SKILL.md new file mode 100644 index 0000000..26ff9a4 --- /dev/null +++ b/skills/loop/SKILL.md @@ -0,0 +1,65 @@ +--- +name: loop +description: Agent-agnostic iteration loop for commands like "Loop: find bugs in this project", "Loop until this test passes", or "run an agent loop on this task". Use when a main orchestrator should run the same entrypoint task through fresh subagents/executors one iteration at a time, report progress after each run, and stop on completion, no progress, blocker, safety, user stop, or max iterations. 
+--- + +# Loop + +Run a bounded, agent-agnostic loop where the main orchestrator delegates one task iteration at a time to a fresh subagent, executor, or worker, then decides whether another iteration is useful. + +For the detailed state fields, worker prompt, and report templates, read `references/protocol.md`. + +## Trigger + +Use this skill when the user explicitly asks for a loop or iterative agent pass, especially: +- `Loop: find bugs in this project` +- `Loop until the failing test is fixed` +- `Run a loop over this cleanup task` +- `Keep launching workers until no new issues are found` + +Do not use it for ordinary one-shot tasks unless the user requests iterative delegation or the work clearly needs repeated independent attempts. + +## Inputs to capture + +Before the first iteration, identify: +- `task`: exact entrypoint task to run once per iteration +- `target`: repo, path, branch, document, issue, or other scope +- `maxIterations`: default `3` unless the user sets another bounded value +- `checks`: project-appropriate verification, chosen by the orchestrator from the task context +- `stopSignals`: any user-specific success criteria, blockers, or safety limits + +If a missing input makes safe execution impossible, ask one concise question instead of starting the loop. + +## Protocol + +1. Initialize loop state: task, target, iteration number, max iterations, last result, progress summary, open risks, stop decision. +2. For each iteration, start a fresh subagent/executor with the task, target, current state summary, constraints, expected output contract, and allowed action boundary. +3. The worker runs the entrypoint task once. It must not silently start its own unbounded loop. +4. Worker returns compact evidence: status, result, changed files or findings, verification, blocker, risk, and recommended next step. +5. Orchestrator reports progress to the user after each iteration. +6. Orchestrator decides continue or stop using the stop conditions below. +7. 
If continuing, update state and launch the next fresh worker with the new context. +8. End with a completion report: iterations run, final status, evidence, checks, unresolved risks, and next action. + +## Stop conditions + +Stop immediately when any condition is true: +- task is complete or acceptance criteria pass +- no meaningful new progress since the previous iteration +- the issue cannot be reproduced or the worker found no actionable next step +- max iterations reached +- blocker requires user input, credentials, approval, or external dependency +- user says stop, pause, or changes scope +- safety, privacy, destructive-action, financial, legal, or external-write rules require approval or refusal + +## Safety and approvals + +Normal agent policy still governs every iteration. +- Do not use looping to bypass approval gates, destructive-action confirmations, external-write rules, spending limits, security boundaries, or privacy constraints. +- Keep loops bounded; never interpret `Loop:` as permission for infinite autonomous execution. +- If an iteration would perform destructive, external, or sensitive action, pause and get the required approval before that action. +- Preserve exact errors, paths, IDs, commands, and user constraints when handing state to the next worker. + +## Verification + +Choose checks from the actual project/task, not from this skill by default. Examples: tests, lint, typecheck, build, `git diff --check`, direct inspection, reproduced issue, or rendered/manual validation. If no meaningful check can run, state why in the progress and final report. diff --git a/skills/loop/references/protocol.md b/skills/loop/references/protocol.md new file mode 100644 index 0000000..94d6c7e --- /dev/null +++ b/skills/loop/references/protocol.md @@ -0,0 +1,96 @@ +# Loop protocol reference + +Use this reference when running a bounded agent-agnostic loop. Keep the visible `SKILL.md` lean; use these templates for state and handoffs. 
+ +## State fields + +Track these fields in the orchestrator: + +```yaml +task: "exact one-iteration entrypoint" +target: "repo/path/doc/issue/scope" +maxIterations: 3 +iteration: 0 +checks: ["project-appropriate verification"] +constraints: ["user constraints", "policy constraints", "repo constraints"] +progressSoFar: "compact summary of completed work/findings" +lastResult: "status/result/evidence from previous worker" +openRisks: ["unresolved risks or blockers"] +stopDecision: "continue | stop" +stopReason: "why" +``` + +Default `maxIterations` is `3`. The external source material inspected for this skill describes loop patterns but does not provide a default max iteration count, so this skill uses a conservative bounded default. The user may override it with another explicit bounded value. + +## Worker prompt shape + +Give each worker a compact contract like this, adapted to the current runtime: + +```text +You are iteration {n}/{maxIterations} of a bounded loop. + +Task: {task} +Target: {target} + +State from previous iterations: +{progressSoFar} + +Constraints: +- Run the entrypoint task once. +- Do not start your own unbounded loop. +- Preserve exact paths, commands, IDs, errors, and user constraints. +- Follow normal safety, approval, destructive-action, privacy, and external-write rules. +- Run or recommend project-appropriate checks. 
+ +Return: +- status: complete | progress | no-progress | blocked | unsafe | failed +- result: concise outcome +- evidence: files, commands, findings, or reproduction details +- checks: run / not run / why not +- blocker: missing approval/input/dependency if any +- risk: remaining risk +- next: stop or one concrete next iteration goal +``` + +## Progress report after each iteration + +Report after every worker returns, before deciding silently to continue: + +```text +Iteration {n}/{maxIterations}: {status} +Progress: {what changed or was learned} +Evidence/checks: {compact evidence} +Decision: {continue/stop} — {reason} +Next: {next iteration goal or final next step} +``` + +## Continue decision rubric + +Continue only when all are true: +- there is a concrete next iteration goal +- the previous iteration produced progress or a new lead +- the next iteration does not require missing approval or user input +- the loop remains within the original scope +- max iterations has not been reached +- expected value justifies another worker + +Stop when another iteration would mostly repeat the same attempt, widen scope without approval, or hide a blocker. + +## Completion report + +Final report format: + +```text +Loop complete: {final status} +Iterations: {ran}/{maxIterations} +Result: {final outcome} +Evidence: {key files/findings/commands} +Checks: {what passed/failed/not run} +Stopped because: {stop reason} +Remaining risks: {risks or none} +Next: {recommended user/action step} +``` + +## Source inspiration + +This skill was written as original repo-local guidance after inspecting ECC `continuous-agent-loop` and `autonomous-loops` material. Borrowed ideas are conceptual only: bounded orchestration, subagent waves, context bridging, quality gates, and recovery from loop churn. Runtime-specific command syntax from those sources is intentionally not required here. 
From a38a0539fc673299f04e654ac4e8feb50c673c71 Mon Sep 17 00:00:00 2001 From: Sergei Garin Date: Sun, 24 May 2026 22:32:49 +0300 Subject: [PATCH 2/3] Replace loop skill with upstream ECC source --- README.md | 6 +- skills/loop/LICENSE | 21 + skills/loop/SKILL.md | 635 +++++++++++++++++++++++++++-- skills/loop/references/protocol.md | 96 ----- 4 files changed, 614 insertions(+), 144 deletions(-) create mode 100644 skills/loop/LICENSE delete mode 100644 skills/loop/references/protocol.md diff --git a/README.md b/README.md index 4b139fa..a940152 100644 --- a/README.md +++ b/README.md @@ -201,9 +201,9 @@ If a skill needs reusable instructions that are not a runnable skill: - Do not use when: the task still needs discovery or approval shaping. - `skills/loop` - - What it is: bounded agent-agnostic iteration loop that runs a fresh subagent/executor for each pass and reports progress after every iteration. - - Use when: the user says `Loop:` or explicitly asks to keep iterating a task until done, blocked, no-progress, or max iterations. - - Do not use when: the task is a normal one-shot request or when looping would bypass required approval, safety, or external-action gates. + - What it is: ECC autonomous loop pattern catalog for Claude Code, from sequential `claude -p` pipelines to RFC-driven multi-agent DAG orchestration. + - Use when: selecting or designing an autonomous development loop, CI/PR loop, parallel generation loop, or quality-gated agent workflow. + - Do not use when: the task only needs a one-shot answer or when autonomous execution would bypass required approval, safety, or external-action gates. - `skills/code-review-orchestrator` - What it is: one entrypoint for multi-role code review with merged findings. 
diff --git a/skills/loop/LICENSE b/skills/loop/LICENSE new file mode 100644 index 0000000..b832b6f --- /dev/null +++ b/skills/loop/LICENSE @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2026 Affaan Mustafa + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/skills/loop/SKILL.md b/skills/loop/SKILL.md index 26ff9a4..704c0cb 100644 --- a/skills/loop/SKILL.md +++ b/skills/loop/SKILL.md @@ -1,65 +1,610 @@ --- name: loop -description: Agent-agnostic iteration loop for commands like "Loop: find bugs in this project", "Loop until this test passes", or "run an agent loop on this task". Use when a main orchestrator should run the same entrypoint task through fresh subagents/executors one iteration at a time, report progress after each run, and stop on completion, no progress, blocker, safety, user stop, or max iterations. +description: "Patterns and architectures for autonomous Claude Code loops — from simple sequential pipelines to RFC-driven multi-agent DAG systems." 
+origin: ECC --- -# Loop +# Autonomous Loops Skill -Run a bounded, agent-agnostic loop where the main orchestrator delegates one task iteration at a time to a fresh subagent, executor, or worker, then decides whether another iteration is useful. +> Compatibility note (v1.8.0): `autonomous-loops` is retained for one release. +> The canonical skill name is now `continuous-agent-loop`. New loop guidance +> should be authored there, while this skill remains available to avoid +> breaking existing workflows. -For the detailed state fields, worker prompt, and report templates, read `references/protocol.md`. +Patterns, architectures, and reference implementations for running Claude Code autonomously in loops. Covers everything from simple `claude -p` pipelines to full RFC-driven multi-agent DAG orchestration. -## Trigger +## When to Use -Use this skill when the user explicitly asks for a loop or iterative agent pass, especially: -- `Loop: find bugs in this project` -- `Loop until the failing test is fixed` -- `Run a loop over this cleanup task` -- `Keep launching workers until no new issues are found` +- Setting up autonomous development workflows that run without human intervention +- Choosing the right loop architecture for your problem (simple vs complex) +- Building CI/CD-style continuous development pipelines +- Running parallel agents with merge coordination +- Implementing context persistence across loop iterations +- Adding quality gates and cleanup passes to autonomous workflows -Do not use it for ordinary one-shot tasks unless the user requests iterative delegation or the work clearly needs repeated independent attempts. 
+## Loop Pattern Spectrum -## Inputs to capture +From simplest to most sophisticated: -Before the first iteration, identify: -- `task`: exact entrypoint task to run once per iteration -- `target`: repo, path, branch, document, issue, or other scope -- `maxIterations`: default `3` unless the user sets another bounded value -- `checks`: project-appropriate verification, chosen by the orchestrator from the task context -- `stopSignals`: any user-specific success criteria, blockers, or safety limits +| Pattern | Complexity | Best For | +|---------|-----------|----------| +| [Sequential Pipeline](#1-sequential-pipeline-claude--p) | Low | Daily dev steps, scripted workflows | +| [NanoClaw REPL](#2-nanoclaw-repl) | Low | Interactive persistent sessions | +| [Infinite Agentic Loop](#3-infinite-agentic-loop) | Medium | Parallel content generation, spec-driven work | +| [Continuous Claude PR Loop](#4-continuous-claude-pr-loop) | Medium | Multi-day iterative projects with CI gates | +| [De-Sloppify Pattern](#5-the-de-sloppify-pattern) | Add-on | Quality cleanup after any Implementer step | +| [Ralphinho / RFC-Driven DAG](#6-ralphinho--rfc-driven-dag-orchestration) | High | Large features, multi-unit parallel work with merge queue | -If a missing input makes safe execution impossible, ask one concise question instead of starting the loop. +--- + +## 1. Sequential Pipeline (`claude -p`) + +**The simplest loop.** Break daily development into a sequence of non-interactive `claude -p` calls. Each call is a focused step with a clear prompt. + +### Core Insight + +> If you can't figure out a loop like this, it means you can't even drive the LLM to fix your code in interactive mode. + +The `claude -p` flag runs Claude Code non-interactively with a prompt, exits when done. Chain calls to build a pipeline: + +```bash +#!/bin/bash +# daily-dev.sh — Sequential pipeline for a feature branch + +set -e + +# Step 1: Implement the feature +claude -p "Read the spec in docs/auth-spec.md. 
Implement OAuth2 login in src/auth/. Write tests first (TDD). Do NOT create any new documentation files." + +# Step 2: De-sloppify (cleanup pass) +claude -p "Review all files changed by the previous commit. Remove any unnecessary type tests, overly defensive checks, or testing of language features (e.g., testing that TypeScript generics work). Keep real business logic tests. Run the test suite after cleanup." + +# Step 3: Verify +claude -p "Run the full build, lint, type check, and test suite. Fix any failures. Do not add new features." + +# Step 4: Commit +claude -p "Create a conventional commit for all staged changes. Use 'feat: add OAuth2 login flow' as the message." +``` + +### Key Design Principles + +1. **Each step is isolated** — A fresh context window per `claude -p` call means no context bleed between steps. +2. **Order matters** — Steps execute sequentially. Each builds on the filesystem state left by the previous. +3. **Negative instructions are dangerous** — Don't say "don't test type systems." Instead, add a separate cleanup step (see [De-Sloppify Pattern](#5-the-de-sloppify-pattern)). +4. **Exit codes propagate** — `set -e` stops the pipeline on failure. + +### Variations + +**With model routing:** +```bash +# Research with Opus (deep reasoning) +claude -p --model opus "Analyze the codebase architecture and write a plan for adding caching..." + +# Implement with Sonnet (fast, capable) +claude -p "Implement the caching layer according to the plan in docs/caching-plan.md..." + +# Review with Opus (thorough) +claude -p --model opus "Review all changes for security issues, race conditions, and edge cases..." +``` + +**With environment context:** +```bash +# Pass context via files, not prompt length +echo "Focus areas: auth module, API rate limiting" > .claude-context.md +claude -p "Read .claude-context.md for priorities. Work through them in order." 
+rm .claude-context.md +``` + +**With `--allowedTools` restrictions:** +```bash +# Read-only analysis pass +claude -p --allowedTools "Read,Grep,Glob" "Audit this codebase for security vulnerabilities..." + +# Implementation pass (read, write, edit, and shell tools allowed) +claude -p --allowedTools "Read,Write,Edit,Bash" "Implement the fixes from security-audit.md..." +``` + +--- + +## 2. NanoClaw REPL + +**ECC's built-in persistent loop.** A session-aware REPL that calls `claude -p` synchronously with full conversation history. + +```bash +# Start the default session +node scripts/claw.js + +# Named session with skill context +CLAW_SESSION=my-project CLAW_SKILLS=tdd-workflow,security-review node scripts/claw.js +``` + +### How It Works + +1. Loads conversation history from `~/.claude/claw/{session}.md` +2. Each user message is sent to `claude -p` with full history as context +3. Responses are appended to the session file (Markdown-as-database) +4. Sessions persist across restarts + +### When to Use NanoClaw vs Sequential Pipeline + +| Use Case | NanoClaw | Sequential Pipeline | +|----------|----------|-------------------| +| Interactive exploration | Yes | No | +| Scripted automation | No | Yes | +| Session persistence | Built-in | Manual | +| Context accumulation | Grows per turn | Fresh each step | +| CI/CD integration | Poor | Excellent | + +See the `/claw` command documentation for full details. + +--- + +## 3. Infinite Agentic Loop + +**A two-prompt system** that orchestrates parallel sub-agents for specification-driven generation. Developed by disler (credit: @disler). 
+ +### Architecture: Two-Prompt System + +``` +PROMPT 1 (Orchestrator) PROMPT 2 (Sub-Agents) +┌─────────────────────┐ ┌──────────────────────┐ +│ Parse spec file │ │ Receive full context │ +│ Scan output dir │ deploys │ Read assigned number │ +│ Plan iteration │────────────│ Follow spec exactly │ +│ Assign creative dirs │ N agents │ Generate unique output │ +│ Manage waves │ │ Save to output dir │ +└─────────────────────┘ └──────────────────────┘ +``` + +### The Pattern + +1. **Spec Analysis** — Orchestrator reads a specification file (Markdown) defining what to generate +2. **Directory Recon** — Scans existing output to find the highest iteration number +3. **Parallel Deployment** — Launches N sub-agents, each with: + - The full spec + - A unique creative direction + - A specific iteration number (no conflicts) + - A snapshot of existing iterations (for uniqueness) +4. **Wave Management** — For infinite mode, deploys waves of 3-5 agents until context is exhausted + +### Implementation via Claude Code Commands + +Create `.claude/commands/infinite.md`: + +```markdown +Parse the following arguments from $ARGUMENTS: +1. spec_file — path to the specification markdown +2. output_dir — where iterations are saved +3. count — integer 1-N or "infinite" + +PHASE 1: Read and deeply understand the specification. +PHASE 2: List output_dir, find highest iteration number. Start at N+1. +PHASE 3: Plan creative directions — each agent gets a DIFFERENT theme/approach. +PHASE 4: Deploy sub-agents in parallel (Task tool). Each receives: + - Full spec text + - Current directory snapshot + - Their assigned iteration number + - Their unique creative direction +PHASE 5 (infinite mode): Loop in waves of 3-5 until context is low. 
+``` + +**Invoke:** +```bash +/project:infinite specs/component-spec.md src/ 5 +/project:infinite specs/component-spec.md src/ infinite +``` + +### Batching Strategy + +| Count | Strategy | +|-------|----------| +| 1-5 | All agents simultaneously | +| 6-20 | Batches of 5 | +| infinite | Waves of 3-5, progressive sophistication | + +### Key Insight: Uniqueness via Assignment + +Don't rely on agents to self-differentiate. The orchestrator **assigns** each agent a specific creative direction and iteration number. This prevents duplicate concepts across parallel agents. + +--- + +## 4. Continuous Claude PR Loop + +**A production-grade shell script** that runs Claude Code in a continuous loop, creating PRs, waiting for CI, and merging automatically. Created by AnandChowdhary (credit: @AnandChowdhary). + +### Core Loop + +``` +┌─────────────────────────────────────────────────────┐ +│ CONTINUOUS CLAUDE ITERATION │ +│ │ +│ 1. Create branch (continuous-claude/iteration-N) │ +│ 2. Run claude -p with enhanced prompt │ +│ 3. (Optional) Reviewer pass — separate claude -p │ +│ 4. Commit changes (claude generates message) │ +│ 5. Push + create PR (gh pr create) │ +│ 6. Wait for CI checks (poll gh pr checks) │ +│ 7. CI failure? → Auto-fix pass (claude -p) │ +│ 8. Merge PR (squash/merge/rebase) │ +│ 9. Return to main → repeat │ +│ │ +│ Limit by: --max-runs N | --max-cost $X │ +│ --max-duration 2h | completion signal │ +└─────────────────────────────────────────────────────┘ +``` + +### Installation + +> **Warning:** Install continuous-claude from its repository after reviewing the code. Do not pipe external scripts directly to bash. 
+ +### Usage + +```bash +# Basic: 10 iterations +continuous-claude --prompt "Add unit tests for all untested functions" --max-runs 10 + +# Cost-limited +continuous-claude --prompt "Fix all linter errors" --max-cost 5.00 + +# Time-boxed +continuous-claude --prompt "Improve test coverage" --max-duration 8h + +# With code review pass +continuous-claude \ + --prompt "Add authentication feature" \ + --max-runs 10 \ + --review-prompt "Run npm test && npm run lint, fix any failures" + +# Parallel via worktrees +continuous-claude --prompt "Add tests" --max-runs 5 --worktree tests-worker & +continuous-claude --prompt "Refactor code" --max-runs 5 --worktree refactor-worker & +wait +``` + +### Cross-Iteration Context: SHARED_TASK_NOTES.md + +The critical innovation: a `SHARED_TASK_NOTES.md` file persists across iterations: + +```markdown +## Progress +- [x] Added tests for auth module (iteration 1) +- [x] Fixed edge case in token refresh (iteration 2) +- [ ] Still need: rate limiting tests, error boundary tests + +## Next Steps +- Focus on rate limiting module next +- The mock setup in tests/helpers.ts can be reused +``` + +Claude reads this file at iteration start and updates it at iteration end. This bridges the context gap between independent `claude -p` invocations. + +### CI Failure Recovery + +When PR checks fail, Continuous Claude automatically: +1. Fetches the failed run ID via `gh run list` +2. Spawns a new `claude -p` with CI fix context +3. Claude inspects logs via `gh run view`, fixes code, commits, pushes +4. Re-waits for checks (up to `--ci-retry-max` attempts) + +### Completion Signal + +Claude can signal "I'm done" by outputting a magic phrase: + +```bash +continuous-claude \ + --prompt "Fix all bugs in the issue tracker" \ + --completion-signal "CONTINUOUS_CLAUDE_PROJECT_COMPLETE" \ + --completion-threshold 3 # Stops after 3 consecutive signals +``` + +Three consecutive iterations signaling completion stops the loop, preventing wasted runs on finished work. 
+ +### Key Configuration -## Protocol +| Flag | Purpose | +|------|---------| +| `--max-runs N` | Stop after N successful iterations | +| `--max-cost $X` | Stop after spending $X | +| `--max-duration 2h` | Stop after time elapsed | +| `--merge-strategy squash` | squash, merge, or rebase | +| `--worktree NAME` | Parallel execution via git worktrees | +| `--disable-commits` | Dry-run mode (no git operations) | +| `--review-prompt "..."` | Add reviewer pass per iteration | +| `--ci-retry-max N` | Auto-fix CI failures (default: 1) | -1. Initialize loop state: task, target, iteration number, max iterations, last result, progress summary, open risks, stop decision. -2. For each iteration, start a fresh subagent/executor with the task, target, current state summary, constraints, expected output contract, and allowed action boundary. -3. The worker runs the entrypoint task once. It must not silently start its own unbounded loop. -4. Worker returns compact evidence: status, result, changed files or findings, verification, blocker, risk, and recommended next step. -5. Orchestrator reports progress to the user after each iteration. -6. Orchestrator decides continue or stop using the stop conditions below. -7. If continuing, update state and launch the next fresh worker with the new context. -8. End with a completion report: iterations run, final status, evidence, checks, unresolved risks, and next action. +--- + +## 5. The De-Sloppify Pattern + +**An add-on pattern for any loop.** Add a dedicated cleanup/refactor step after each Implementer step. + +### The Problem + +When you ask an LLM to implement with TDD, it takes "write tests" too literally: +- Tests that verify TypeScript's type system works (testing `typeof x === 'string'`) +- Overly defensive runtime checks for things the type system already guarantees +- Tests for framework behavior rather than business logic +- Excessive error handling that obscures the actual code + +### Why Not Negative Instructions? 
+ +Adding "don't test type systems" or "don't add unnecessary checks" to the Implementer prompt has downstream effects: +- The model becomes hesitant about ALL testing +- It skips legitimate edge case tests +- Quality degrades unpredictably + +### The Solution: Separate Pass + +Instead of constraining the Implementer, let it be thorough. Then add a focused cleanup agent: + +```bash +# Step 1: Implement (let it be thorough) +claude -p "Implement the feature with full TDD. Be thorough with tests." + +# Step 2: De-sloppify (separate context, focused cleanup) +claude -p "Review all changes in the working tree. Remove: +- Tests that verify language/framework behavior rather than business logic +- Redundant type checks that the type system already enforces +- Over-defensive error handling for impossible states +- Console.log statements +- Commented-out code + +Keep all business logic tests. Run the test suite after cleanup to ensure nothing breaks." +``` + +### In a Loop Context + +```bash +for feature in "${features[@]}"; do + # Implement + claude -p "Implement $feature with TDD." + + # De-sloppify + claude -p "Cleanup pass: review changes, remove test/code slop, run tests." + + # Verify + claude -p "Run build + lint + tests. Fix any failures." + + # Commit + claude -p "Commit with message: feat: add $feature" +done +``` + +### Key Insight + +> Rather than adding negative instructions which have downstream quality effects, add a separate de-sloppify pass. Two focused agents outperform one constrained agent. + +--- + +## 6. Ralphinho / RFC-Driven DAG Orchestration + +**The most sophisticated pattern.** An RFC-driven, multi-agent pipeline that decomposes a spec into a dependency DAG, runs each unit through a tiered quality pipeline, and lands them via an agent-driven merge queue. Created by enitrat (credit: @enitrat). 
+ +### Architecture Overview + +``` +RFC/PRD Document + │ + ▼ + DECOMPOSITION (AI) + Break RFC into work units with dependency DAG + │ + ▼ +┌──────────────────────────────────────────────────────┐ +│ RALPH LOOP (up to 3 passes) │ +│ │ +│ For each DAG layer (sequential, by dependency): │ +│ │ +│ ┌── Quality Pipelines (parallel per unit) ───────┐ │ +│ │ Each unit in its own worktree: │ │ +│ │ Research → Plan → Implement → Test → Review │ │ +│ │ (depth varies by complexity tier) │ │ +│ └────────────────────────────────────────────────┘ │ +│ │ +│ ┌── Merge Queue ─────────────────────────────────┐ │ +│ │ Rebase onto main → Run tests → Land or evict │ │ +│ │ Evicted units re-enter with conflict context │ │ +│ └────────────────────────────────────────────────┘ │ +│ │ +└──────────────────────────────────────────────────────┘ +``` + +### RFC Decomposition + +AI reads the RFC and produces work units: + +```typescript +interface WorkUnit { + id: string; // kebab-case identifier + name: string; // Human-readable name + rfcSections: string[]; // Which RFC sections this addresses + description: string; // Detailed description + deps: string[]; // Dependencies (other unit IDs) + acceptance: string[]; // Concrete acceptance criteria + tier: "trivial" | "small" | "medium" | "large"; +} +``` + +**Decomposition Rules:** +- Prefer fewer, cohesive units (minimize merge risk) +- Minimize cross-unit file overlap (avoid conflicts) +- Keep tests WITH implementation (never separate "implement X" + "test X") +- Dependencies only where real code dependency exists + +The dependency DAG determines execution order: +``` +Layer 0: [unit-a, unit-b] ← no deps, run in parallel +Layer 1: [unit-c] ← depends on unit-a +Layer 2: [unit-d, unit-e] ← depend on unit-c +``` + +### Complexity Tiers + +Different tiers get different pipeline depths: + +| Tier | Pipeline Stages | +|------|----------------| +| **trivial** | implement → test | +| **small** | implement → test → code-review | +| **medium** | 
research → plan → implement → test → PRD-review + code-review → review-fix | +| **large** | research → plan → implement → test → PRD-review + code-review → review-fix → final-review | + +This prevents expensive operations on simple changes while ensuring architectural changes get thorough scrutiny. + +### Separate Context Windows (Author-Bias Elimination) + +Each stage runs in its own agent process with its own context window: + +| Stage | Model | Purpose | +|-------|-------|---------| +| Research | Sonnet | Read codebase + RFC, produce context doc | +| Plan | Opus | Design implementation steps | +| Implement | Codex | Write code following the plan | +| Test | Sonnet | Run build + test suite | +| PRD Review | Sonnet | Spec compliance check | +| Code Review | Opus | Quality + security check | +| Review Fix | Codex | Address review issues | +| Final Review | Opus | Quality gate (large tier only) | + +**Critical design:** The reviewer never wrote the code it reviews. This eliminates author bias — the most common source of missed issues in self-review. -## Stop conditions +### Merge Queue with Eviction -Stop immediately when any condition is true: -- task is complete or acceptance criteria pass -- no meaningful new progress since the previous iteration -- the issue cannot be reproduced or the worker found no actionable next step -- max iterations reached -- blocker requires user input, credentials, approval, or external dependency -- user says stop, pause, or changes scope -- safety, privacy, destructive-action, financial, legal, or external-write rules require approval or refusal +After quality pipelines complete, units enter the merge queue: -## Safety and approvals +``` +Unit branch + │ + ├─ Rebase onto main + │ └─ Conflict? → EVICT (capture conflict context) + │ + ├─ Run build + tests + │ └─ Fail? → EVICT (capture test output) + │ + └─ Pass → Fast-forward main, push, delete branch +``` -Normal agent policy still governs every iteration. 
-- Do not use looping to bypass approval gates, destructive-action confirmations, external-write rules, spending limits, security boundaries, or privacy constraints. -- Keep loops bounded; never interpret `Loop:` as permission for infinite autonomous execution. -- If an iteration would perform destructive, external, or sensitive action, pause and get the required approval before that action. -- Preserve exact errors, paths, IDs, commands, and user constraints when handing state to the next worker. +**File Overlap Intelligence:** +- Non-overlapping units land speculatively in parallel +- Overlapping units land one-by-one, rebasing each time + +**Eviction Recovery:** +When evicted, full context is captured (conflicting files, diffs, test output) and fed back to the implementer on the next Ralph pass: + +```markdown +## MERGE CONFLICT — RESOLVE BEFORE NEXT LANDING + +Your previous implementation conflicted with another unit that landed first. +Restructure your changes to avoid the conflicting files/lines below. + +{full eviction context with diffs} +``` + +### Data Flow Between Stages + +``` +research.contextFilePath ──────────────────→ plan +plan.implementationSteps ──────────────────→ implement +implement.{filesCreated, whatWasDone} ─────→ test, reviews +test.failingSummary ───────────────────────→ reviews, implement (next pass) +reviews.{feedback, issues} ────────────────→ review-fix → implement (next pass) +final-review.reasoning ────────────────────→ implement (next pass) +evictionContext ───────────────────────────→ implement (after merge conflict) +``` + +### Worktree Isolation + +Every unit runs in an isolated worktree (uses jj/Jujutsu, not git): +``` +/tmp/workflow-wt-{unit-id}/ +``` + +Pipeline stages for the same unit **share** a worktree, preserving state (context files, plan files, code changes) across research → plan → implement → test → review. + +### Key Design Principles + +1. 
**Deterministic execution** — Upfront decomposition locks in parallelism and ordering +2. **Human review at leverage points** — The work plan is the single highest-leverage intervention point +3. **Separate concerns** — Each stage in a separate context window with a separate agent +4. **Conflict recovery with context** — Full eviction context enables intelligent re-runs, not blind retries +5. **Tier-driven depth** — Trivial changes skip research/review; large changes get maximum scrutiny +6. **Resumable workflows** — Full state persisted to SQLite; resume from any point + +### When to Use Ralphinho vs Simpler Patterns + +| Signal | Use Ralphinho | Use Simpler Pattern | +|--------|--------------|-------------------| +| Multiple interdependent work units | Yes | No | +| Need parallel implementation | Yes | No | +| Merge conflicts likely | Yes | No (sequential is fine) | +| Single-file change | No | Yes (sequential pipeline) | +| Multi-day project | Yes | Maybe (continuous-claude) | +| Spec/RFC already written | Yes | Maybe | +| Quick iteration on one thing | No | Yes (NanoClaw or pipeline) | + +--- + +## Choosing the Right Pattern + +### Decision Matrix + +``` +Is the task a single focused change? +├─ Yes → Sequential Pipeline or NanoClaw +└─ No → Is there a written spec/RFC? + ├─ Yes → Do you need parallel implementation? + │ ├─ Yes → Ralphinho (DAG orchestration) + │ └─ No → Continuous Claude (iterative PR loop) + └─ No → Do you need many variations of the same thing? + ├─ Yes → Infinite Agentic Loop (spec-driven generation) + └─ No → Sequential Pipeline with de-sloppify +``` + +### Combining Patterns + +These patterns compose well: + +1. **Sequential Pipeline + De-Sloppify** — The most common combination. Every implement step gets a cleanup pass. + +2. **Continuous Claude + De-Sloppify** — Add `--review-prompt` with a de-sloppify directive to each iteration. + +3. 
**Any loop + Verification** — Use ECC's `/verify` command or `verification-loop` skill as a gate before commits. + +4. **Ralphinho's tiered approach in simpler loops** — Even in a sequential pipeline, you can route simple tasks to Haiku and complex tasks to Opus: + ```bash + # Simple formatting fix + claude -p --model haiku "Fix the import ordering in src/utils.ts" + + # Complex architectural change + claude -p --model opus "Refactor the auth module to use the strategy pattern" + ``` + +--- + +## Anti-Patterns + +### Common Mistakes + +1. **Infinite loops without exit conditions** — Always have a max-runs, max-cost, max-duration, or completion signal. + +2. **No context bridge between iterations** — Each `claude -p` call starts fresh. Use `SHARED_TASK_NOTES.md` or filesystem state to bridge context. + +3. **Retrying the same failure** — If an iteration fails, don't just retry. Capture the error context and feed it to the next attempt. + +4. **Negative instructions instead of cleanup passes** — Don't say "don't do X." Add a separate pass that removes X. + +5. **All agents in one context window** — For complex workflows, separate concerns into different agent processes. The reviewer should never be the author. + +6. **Ignoring file overlap in parallel work** — If two parallel agents might edit the same file, you need a merge strategy (sequential landing, rebase, or conflict resolution). + +--- -## Verification +## References -Choose checks from the actual project/task, not from this skill by default. Examples: tests, lint, typecheck, build, `git diff --check`, direct inspection, reproduced issue, or rendered/manual validation. If no meaningful check can run, state why in the progress and final report. 
+| Project | Author | Link | +|---------|--------|------| +| Ralphinho | enitrat | credit: @enitrat | +| Infinite Agentic Loop | disler | credit: @disler | +| Continuous Claude | AnandChowdhary | credit: @AnandChowdhary | +| NanoClaw | ECC | `/claw` command in this repo | +| Verification Loop | ECC | `skills/verification-loop/` in this repo | diff --git a/skills/loop/references/protocol.md b/skills/loop/references/protocol.md deleted file mode 100644 index 94d6c7e..0000000 --- a/skills/loop/references/protocol.md +++ /dev/null @@ -1,96 +0,0 @@ -# Loop protocol reference - -Use this reference when running a bounded agent-agnostic loop. Keep the visible `SKILL.md` lean; use these templates for state and handoffs. - -## State fields - -Track these fields in the orchestrator: - -```yaml -task: "exact one-iteration entrypoint" -target: "repo/path/doc/issue/scope" -maxIterations: 3 -iteration: 0 -checks: ["project-appropriate verification"] -constraints: ["user constraints", "policy constraints", "repo constraints"] -progressSoFar: "compact summary of completed work/findings" -lastResult: "status/result/evidence from previous worker" -openRisks: ["unresolved risks or blockers"] -stopDecision: "continue | stop" -stopReason: "why" -``` - -Default `maxIterations` is `3`. The external source material inspected for this skill describes loop patterns but does not provide a default max iteration count, so this skill uses a conservative bounded default. The user may override it with another explicit bounded value. - -## Worker prompt shape - -Give each worker a compact contract like this, adapted to the current runtime: - -```text -You are iteration {n}/{maxIterations} of a bounded loop. - -Task: {task} -Target: {target} - -State from previous iterations: -{progressSoFar} - -Constraints: -- Run the entrypoint task once. -- Do not start your own unbounded loop. -- Preserve exact paths, commands, IDs, errors, and user constraints. 
-- Follow normal safety, approval, destructive-action, privacy, and external-write rules. -- Run or recommend project-appropriate checks. - -Return: -- status: complete | progress | no-progress | blocked | unsafe | failed -- result: concise outcome -- evidence: files, commands, findings, or reproduction details -- checks: run / not run / why not -- blocker: missing approval/input/dependency if any -- risk: remaining risk -- next: stop or one concrete next iteration goal -``` - -## Progress report after each iteration - -Report after every worker returns, before deciding silently to continue: - -```text -Iteration {n}/{maxIterations}: {status} -Progress: {what changed or was learned} -Evidence/checks: {compact evidence} -Decision: {continue/stop} — {reason} -Next: {next iteration goal or final next step} -``` - -## Continue decision rubric - -Continue only when all are true: -- there is a concrete next iteration goal -- the previous iteration produced progress or a new lead -- the next iteration does not require missing approval or user input -- the loop remains within the original scope -- max iterations has not been reached -- expected value justifies another worker - -Stop when another iteration would mostly repeat the same attempt, widen scope without approval, or hide a blocker. - -## Completion report - -Final report format: - -```text -Loop complete: {final status} -Iterations: {ran}/{maxIterations} -Result: {final outcome} -Evidence: {key files/findings/commands} -Checks: {what passed/failed/not run} -Stopped because: {stop reason} -Remaining risks: {risks or none} -Next: {recommended user/action step} -``` - -## Source inspiration - -This skill was written as original repo-local guidance after inspecting ECC `continuous-agent-loop` and `autonomous-loops` material. Borrowed ideas are conceptual only: bounded orchestration, subagent waves, context bridging, quality gates, and recovery from loop churn. 
Runtime-specific command syntax from those sources is intentionally not required here. From bd08750b6190f4ed818a40c8394482045455c537 Mon Sep 17 00:00:00 2001 From: Sergei Garin Date: Sun, 24 May 2026 23:17:45 +0300 Subject: [PATCH 3/3] Simplify loop skill into generic workflow --- README.md | 6 +- skills/loop/SKILL.md | 701 ++++++----------------------- skills/loop/references/protocol.md | 86 ++++ 3 files changed, 237 insertions(+), 556 deletions(-) create mode 100644 skills/loop/references/protocol.md diff --git a/README.md b/README.md index a940152..2c29962 100644 --- a/README.md +++ b/README.md @@ -201,9 +201,9 @@ If a skill needs reusable instructions that are not a runnable skill: - Do not use when: the task still needs discovery or approval shaping. - `skills/loop` - - What it is: ECC autonomous loop pattern catalog for Claude Code, from sequential `claude -p` pipelines to RFC-driven multi-agent DAG orchestration. - - Use when: selecting or designing an autonomous development loop, CI/PR loop, parallel generation loop, or quality-gated agent workflow. - - Do not use when: the task only needs a one-shot answer or when autonomous execution would bypass required approval, safety, or external-action gates. + - What it is: generic agent-agnostic loop router for bounded repeated task cycles with state baton, one-cycle executors, progress reporting, and explicit stop rules. + - Use when: a task should run through repeated independent cycles, such as bug hunts, docs cleanup passes, PR comment handling, or quality-gated review/fix loops. + - Do not use when: the task only needs a one-shot answer, continuation criteria are missing, or another cycle would bypass required approval, safety, or external-action gates. - `skills/code-review-orchestrator` - What it is: one entrypoint for multi-role code review with merged findings. 
diff --git a/skills/loop/SKILL.md b/skills/loop/SKILL.md index 704c0cb..59329f1 100644 --- a/skills/loop/SKILL.md +++ b/skills/loop/SKILL.md @@ -1,610 +1,205 @@ --- name: loop -description: "Patterns and architectures for autonomous Claude Code loops — from simple sequential pipelines to RFC-driven multi-agent DAG systems." -origin: ECC +description: "Generic agent-agnostic loop router for repeated task cycles with explicit state, worker handoff, progress reporting, and safe stop conditions." +origin: ECC-adapted --- -# Autonomous Loops Skill +# Loop Skill -> Compatibility note (v1.8.0): `autonomous-loops` is retained for one release. -> The canonical skill name is now `continuous-agent-loop`. New loop guidance -> should be authored there, while this skill remains available to avoid -> breaking existing workflows. +Run a bounded, stateful agent loop for tasks that benefit from repeated independent cycles, for example: `Loop: find bugs in project X`. -Patterns, architectures, and reference implementations for running Claude Code autonomously in loops. Covers everything from simple `claude -p` pipelines to full RFC-driven multi-agent DAG orchestration. +This skill is agent-agnostic. It does not require Claude, CI, PRs, or any specific implementation runner. Use the executor/subagent mechanism available in the current environment. -## When to Use +## Source note -- Setting up autonomous development workflows that run without human intervention -- Choosing the right loop architecture for your problem (simple vs complex) -- Building CI/CD-style continuous development pipelines -- Running parallel agents with merge coordination -- Implementing context persistence across loop iterations -- Adding quality gates and cleanup passes to autonomous workflows +Adapted from the MIT-licensed ECC autonomous loop material. 
The upstream catalog informed the stability mechanisms here: bounded iterations, persistent state/baton, separate worker contexts, retry context, verification gates, saturation stops, and approval boundaries. The upstream Claude-specific command catalog is intentionally not the operational path for this skill. -## Loop Pattern Spectrum +## Triggers -From simplest to most sophisticated: +Use when the request asks for a repeated autonomous cycle, or starts with forms like: -| Pattern | Complexity | Best For | -|---------|-----------|----------| -| [Sequential Pipeline](#1-sequential-pipeline-claude--p) | Low | Daily dev steps, scripted workflows | -| [NanoClaw REPL](#2-nanoclaw-repl) | Low | Interactive persistent sessions | -| [Infinite Agentic Loop](#3-infinite-agentic-loop) | Medium | Parallel content generation, spec-driven work | -| [Continuous Claude PR Loop](#4-continuous-claude-pr-loop) | Medium | Multi-day iterative projects with CI gates | -| [De-Sloppify Pattern](#5-the-de-sloppify-pattern) | Add-on | Quality cleanup after any Implementer step | -| [Ralphinho / RFC-Driven DAG](#6-ralphinho--rfc-driven-dag-orchestration) | High | Large features, multi-unit parallel work with merge queue | +- `Loop: ` +- `Run a loop to ` +- `Keep iterating on until ` +- `Find more bugs in ` +- `Repeat cleanup/review passes until no progress` ---- - -## 1. Sequential Pipeline (`claude -p`) - -**The simplest loop.** Break daily development into a sequence of non-interactive `claude -p` calls. Each call is a focused step with a clear prompt. - -### Core Insight - -> If you can't figure out a loop like this, it means you can't even drive the LLM to fix your code in interactive mode. - -The `claude -p` flag runs Claude Code non-interactively with a prompt, exits when done. Chain calls to build a pipeline: - -```bash -#!/bin/bash -# daily-dev.sh — Sequential pipeline for a feature branch - -set -e - -# Step 1: Implement the feature -claude -p "Read the spec in docs/auth-spec.md. 
Implement OAuth2 login in src/auth/. Write tests first (TDD). Do NOT create any new documentation files." - -# Step 2: De-sloppify (cleanup pass) -claude -p "Review all files changed by the previous commit. Remove any unnecessary type tests, overly defensive checks, or testing of language features (e.g., testing that TypeScript generics work). Keep real business logic tests. Run the test suite after cleanup." - -# Step 3: Verify -claude -p "Run the full build, lint, type check, and test suite. Fix any failures. Do not add new features." - -# Step 4: Commit -claude -p "Create a conventional commit for all staged changes. Use 'feat: add OAuth2 login flow' as the message." -``` - -### Key Design Principles - -1. **Each step is isolated** — A fresh context window per `claude -p` call means no context bleed between steps. -2. **Order matters** — Steps execute sequentially. Each builds on the filesystem state left by the previous. -3. **Negative instructions are dangerous** — Don't say "don't test type systems." Instead, add a separate cleanup step (see [De-Sloppify Pattern](#5-the-de-sloppify-pattern)). -4. **Exit codes propagate** — `set -e` stops the pipeline on failure. - -### Variations - -**With model routing:** -```bash -# Research with Opus (deep reasoning) -claude -p --model opus "Analyze the codebase architecture and write a plan for adding caching..." - -# Implement with Sonnet (fast, capable) -claude -p "Implement the caching layer according to the plan in docs/caching-plan.md..." - -# Review with Opus (thorough) -claude -p --model opus "Review all changes for security issues, race conditions, and edge cases..." -``` - -**With environment context:** -```bash -# Pass context via files, not prompt length -echo "Focus areas: auth module, API rate limiting" > .claude-context.md -claude -p "Read .claude-context.md for priorities. Work through them in order." 
-rm .claude-context.md -``` - -**With `--allowedTools` restrictions:** -```bash -# Read-only analysis pass -claude -p --allowedTools "Read,Grep,Glob" "Audit this codebase for security vulnerabilities..." - -# Write-only implementation pass -claude -p --allowedTools "Read,Write,Edit,Bash" "Implement the fixes from security-audit.md..." -``` - ---- - -## 2. NanoClaw REPL - -**ECC's built-in persistent loop.** A session-aware REPL that calls `claude -p` synchronously with full conversation history. - -```bash -# Start the default session -node scripts/claw.js - -# Named session with skill context -CLAW_SESSION=my-project CLAW_SKILLS=tdd-workflow,security-review node scripts/claw.js -``` - -### How It Works - -1. Loads conversation history from `~/.claude/claw/{session}.md` -2. Each user message is sent to `claude -p` with full history as context -3. Responses are appended to the session file (Markdown-as-database) -4. Sessions persist across restarts - -### When NanoClaw vs Sequential Pipeline - -| Use Case | NanoClaw | Sequential Pipeline | -|----------|----------|-------------------| -| Interactive exploration | Yes | No | -| Scripted automation | No | Yes | -| Session persistence | Built-in | Manual | -| Context accumulation | Grows per turn | Fresh each step | -| CI/CD integration | Poor | Excellent | - -See the `/claw` command documentation for full details. - ---- - -## 3. Infinite Agentic Loop - -**A two-prompt system** that orchestrates parallel sub-agents for specification-driven generation. Developed by disler (credit: @disler). 
- -### Architecture: Two-Prompt System - -``` -PROMPT 1 (Orchestrator) PROMPT 2 (Sub-Agents) -┌─────────────────────┐ ┌──────────────────────┐ -│ Parse spec file │ │ Receive full context │ -│ Scan output dir │ deploys │ Read assigned number │ -│ Plan iteration │────────────│ Follow spec exactly │ -│ Assign creative dirs │ N agents │ Generate unique output │ -│ Manage waves │ │ Save to output dir │ -└─────────────────────┘ └──────────────────────┘ -``` - -### The Pattern - -1. **Spec Analysis** — Orchestrator reads a specification file (Markdown) defining what to generate -2. **Directory Recon** — Scans existing output to find the highest iteration number -3. **Parallel Deployment** — Launches N sub-agents, each with: - - The full spec - - A unique creative direction - - A specific iteration number (no conflicts) - - A snapshot of existing iterations (for uniqueness) -4. **Wave Management** — For infinite mode, deploys waves of 3-5 agents until context is exhausted - -### Implementation via Claude Code Commands - -Create `.claude/commands/infinite.md`: - -```markdown -Parse the following arguments from $ARGUMENTS: -1. spec_file — path to the specification markdown -2. output_dir — where iterations are saved -3. count — integer 1-N or "infinite" - -PHASE 1: Read and deeply understand the specification. -PHASE 2: List output_dir, find highest iteration number. Start at N+1. -PHASE 3: Plan creative directions — each agent gets a DIFFERENT theme/approach. -PHASE 4: Deploy sub-agents in parallel (Task tool). Each receives: - - Full spec text - - Current directory snapshot - - Their assigned iteration number - - Their unique creative direction -PHASE 5 (infinite mode): Loop in waves of 3-5 until context is low. 
-``` - -**Invoke:** -```bash -/project:infinite specs/component-spec.md src/ 5 -/project:infinite specs/component-spec.md src/ infinite -``` - -### Batching Strategy - -| Count | Strategy | -|-------|----------| -| 1-5 | All agents simultaneously | -| 6-20 | Batches of 5 | -| infinite | Waves of 3-5, progressive sophistication | - -### Key Insight: Uniqueness via Assignment - -Don't rely on agents to self-differentiate. The orchestrator **assigns** each agent a specific creative direction and iteration number. This prevents duplicate concepts across parallel agents. - ---- - -## 4. Continuous Claude PR Loop - -**A production-grade shell script** that runs Claude Code in a continuous loop, creating PRs, waiting for CI, and merging automatically. Created by AnandChowdhary (credit: @AnandChowdhary). - -### Core Loop - -``` -┌─────────────────────────────────────────────────────┐ -│ CONTINUOUS CLAUDE ITERATION │ -│ │ -│ 1. Create branch (continuous-claude/iteration-N) │ -│ 2. Run claude -p with enhanced prompt │ -│ 3. (Optional) Reviewer pass — separate claude -p │ -│ 4. Commit changes (claude generates message) │ -│ 5. Push + create PR (gh pr create) │ -│ 6. Wait for CI checks (poll gh pr checks) │ -│ 7. CI failure? → Auto-fix pass (claude -p) │ -│ 8. Merge PR (squash/merge/rebase) │ -│ 9. Return to main → repeat │ -│ │ -│ Limit by: --max-runs N | --max-cost $X │ -│ --max-duration 2h | completion signal │ -└─────────────────────────────────────────────────────┘ -``` - -### Installation - -> **Warning:** Install continuous-claude from its repository after reviewing the code. Do not pipe external scripts directly to bash. 
- -### Usage - -```bash -# Basic: 10 iterations -continuous-claude --prompt "Add unit tests for all untested functions" --max-runs 10 - -# Cost-limited -continuous-claude --prompt "Fix all linter errors" --max-cost 5.00 - -# Time-boxed -continuous-claude --prompt "Improve test coverage" --max-duration 8h - -# With code review pass -continuous-claude \ - --prompt "Add authentication feature" \ - --max-runs 10 \ - --review-prompt "Run npm test && npm run lint, fix any failures" - -# Parallel via worktrees -continuous-claude --prompt "Add tests" --max-runs 5 --worktree tests-worker & -continuous-claude --prompt "Refactor code" --max-runs 5 --worktree refactor-worker & -wait -``` - -### Cross-Iteration Context: SHARED_TASK_NOTES.md - -The critical innovation: a `SHARED_TASK_NOTES.md` file persists across iterations: +Do not use when a one-shot answer is enough, when safe continuation criteria are missing and cannot be inferred, or when the next action requires approval that has not been granted. -```markdown -## Progress -- [x] Added tests for auth module (iteration 1) -- [x] Fixed edge case in token refresh (iteration 2) -- [ ] Still need: rate limiting tests, error boundary tests - -## Next Steps -- Focus on rate limiting module next -- The mock setup in tests/helpers.ts can be reused -``` +## Inputs -Claude reads this file at iteration start and updates it at iteration end. This bridges the context gap between independent `claude -p` invocations. +Collect or infer: -### CI Failure Recovery +- `task`: the work objective for each cycle. +- `maxIterations`: default `3` unless the user gives another bound. +- `successCriteria`: what counts as done or good enough. +- `stopConditions`: max iterations, no progress, blocker, user stop, approval boundary, risk threshold, or success signal. +- `verificationRequirements`: optional checks/review/tests that must run before considering a cycle successful. 
+- `executorConstraints`: allowed tools, write permissions, external actions, target paths/repos, and reporting format. -When PR checks fail, Continuous Claude automatically: -1. Fetches the failed run ID via `gh run list` -2. Spawns a new `claude -p` with CI fix context -3. Claude inspects logs via `gh run view`, fixes code, commits, pushes -4. Re-waits for checks (up to `--ci-retry-max` attempts) +If the task is risky, destructive, external-writing, financial, privacy-sensitive, or under-specified in a way that changes safety, ask for approval or clarification before starting. -### Completion Signal +## State Baton -Claude can signal "I'm done" by outputting a magic phrase: +Keep a compact baton after every iteration and pass it to the next executor. -```bash -continuous-claude \ - --prompt "Fix all bugs in the issue tracker" \ - --completion-signal "CONTINUOUS_CLAUDE_PROJECT_COMPLETE" \ - --completion-threshold 3 # Stops after 3 consecutive signals +```yaml +task: "..." +maxIterations: 3 +iteration: 1 +successCriteria: + - "..." +stopConditions: + - "..." +verificationRequirements: + - "..." +lastResult: "none yet" +nextAction: "start first cycle" +noProgressCount: 0 +blocker: null +artifacts: + files: [] + prs: [] + issues: [] + notes: [] +retryContext: null +approvalBoundaries: + - "..." ``` -Three consecutive iterations signaling completion stops the loop, preventing wasted runs on finished work. - -### Key Configuration - -| Flag | Purpose | -|------|---------| -| `--max-runs N` | Stop after N successful iterations | -| `--max-cost $X` | Stop after spending $X | -| `--max-duration 2h` | Stop after time elapsed | -| `--merge-strategy squash` | squash, merge, or rebase | -| `--worktree ` | Parallel execution via git worktrees | -| `--disable-commits` | Dry-run mode (no git operations) | -| `--review-prompt "..."` | Add reviewer pass per iteration | -| `--ci-retry-max N` | Auto-fix CI failures (default: 1) | - ---- - -## 5. 
The De-Sloppify Pattern +Minimum per-iteration state: + +- iteration number +- last result +- next action +- no-progress count +- blocker, if any +- artifacts/PRs/issues/files, if any +- verification result, if any +- retry context for failures -**An add-on pattern for any loop.** Add a dedicated cleanup/refactor step after each Implementer step. +## Loop Lifecycle -### The Problem +1. Initialize the state baton from the inputs. +2. Start exactly one executor/subagent for the current iteration. +3. Executor performs one focused cycle, reports result, then stops. +4. Main/orchestrator reports progress to the user. +5. Main/orchestrator updates the state baton. +6. Main/orchestrator decides: continue, retry with context, pause for approval, or stop. +7. If continuing, start the next executor with the updated baton. +8. Do not give a final answer until either the next iteration has been started or the loop has explicitly stopped. -When you ask an LLM to implement with TDD, it takes "write tests" too literally: -- Tests that verify TypeScript's type system works (testing `typeof x === 'string'`) -- Overly defensive runtime checks for things the type system already guarantees -- Tests for framework behavior rather than business logic -- Excessive error handling that obscures the actual code +## Executor Protocol -### Why Not Negative Instructions? +Each executor gets one iteration only. 
-Adding "don't test type systems" or "don't add unnecessary checks" to the Implementer prompt has downstream effects: -- The model becomes hesitant about ALL testing -- It skips legitimate edge case tests -- Quality degrades unpredictably +Executor instructions must include: -### The Solution: Separate Pass +- the task and current baton +- the exact iteration number +- what to inspect/change/run in this cycle +- verification/review requirements for this cycle +- approval and safety boundaries +- required report fields: result, evidence, artifacts, verification, blocker, next recommended action -Instead of constraining the Implementer, let it be thorough. Then add a focused cleanup agent: +Executor must: -```bash -# Step 1: Implement (let it be thorough) -claude -p "Implement the feature with full TDD. Be thorough with tests." +- run only its assigned cycle +- preserve exact paths, commands, IDs, errors, and artifacts in the report +- stop after reporting +- not launch its own loop or next iteration unless explicitly assigned +- not bypass approval, external-action, destructive-action, or privacy boundaries -# Step 2: De-sloppify (separate context, focused cleanup) -claude -p "Review all changes in the working tree. Remove: -- Tests that verify language/framework behavior rather than business logic -- Redundant type checks that the type system already enforces -- Over-defensive error handling for impossible states -- Console.log statements -- Commented-out code +## Orchestrator Contract -Keep all business logic tests. Run the test suite after cleanup to ensure nothing breaks." -``` +The main/orchestrator owns continuity. -### In a Loop Context +After each executor result: -```bash -for feature in "${features[@]}"; do - # Implement - claude -p "Implement $feature with TDD." 
+- summarize progress to the user before continuing when the environment supports visible progress updates +- update the baton +- increment `iteration` +- update `noProgressCount` +- capture failures as `retryContext` +- decide whether continuation criteria still allow another cycle +- start the next executor if continuing +- explicitly stop if done, saturated, blocked, unsafe, out of iterations, or waiting for approval - # De-sloppify - claude -p "Cleanup pass: review changes, remove test/code slop, run tests." +Final report must include: - # Verify - claude -p "Run build + lint + tests. Fix any failures." +- why the loop stopped +- iterations completed +- results found or changes made +- verification performed +- open blockers/risks +- artifacts/PRs/issues/files touched, if any +- recommended next step - # Commit - claude -p "Commit with message: feat: add $feature" -done -``` +## Continuation and Stop Rules -### Key Insight +Continue only when all are true: -> Rather than adding negative instructions which have downstream quality effects, add a separate de-sloppify pass. Two focused agents outperform one constrained agent. +- `iteration < maxIterations` +- no stop condition has fired +- there is a concrete next action +- the last cycle made progress, or the retry context changes the next attempt materially +- no new approval boundary is required before the next action ---- +Stop when any are true: -## 6. Ralphinho / RFC-Driven DAG Orchestration +- success criteria are met +- max iterations reached +- `noProgressCount >= 2` by default +- the same blocker repeats without new information +- verification proves the approach is failing +- continuation would require unapproved external, destructive, risky, or privacy-sensitive action +- user asks to stop -**The most sophisticated pattern.** An RFC-driven, multi-agent pipeline that decomposes a spec into a dependency DAG, runs each unit through a tiered quality pipeline, and lands them via an agent-driven merge queue. 
Created by enitrat (credit: @enitrat). +## Stability Mechanisms -### Architecture Overview +- **Bounded loop:** default `maxIterations: 3`; never infinite by default. +- **State baton:** durable, compact handoff between independent contexts. +- **One-cycle executors:** workers cannot silently self-loop. +- **Progress reporting:** user sees cycle results before the next cycle decision. +- **Saturation stop:** repeated no-progress cycles stop instead of burning time. +- **Retry context:** failures carry exact logs/errors/diffs into the next attempt. +- **Verification gates:** checks/reviews are explicit, not assumed. +- **Approval boundaries:** external writes, destructive operations, secrets, finance, security-sensitive changes, and remote pushes still need the surrounding environment's approval rules. +- **Resume behavior:** if interrupted, reload the last baton/artifacts, verify repo/task state, then continue only from the next safe iteration. +- **Separate contexts:** author/reviewer or search/fix phases can be separate executors when bias or complexity matters. 
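As a rough illustration of how the continuation and stop rules could be evaluated by an orchestrator, here is a minimal TypeScript sketch. It is not part of the skill. The `LoopState` shape, the `decideNext` function, and the boolean flags (`successMet`, `blockerRepeated`, `needsApproval`) are hypothetical names introduced for this example, and it assumes `iteration` holds the number of the cycle that just finished.

```typescript
// Hypothetical sketch: names below are illustrative and not defined by the skill.
// Assumption: `iteration` is the number of the cycle that just finished.

interface LoopState {
  iteration: number;
  maxIterations: number;
  successMet: boolean;       // assumed flag: orchestrator judged successCriteria met
  noProgressCount: number;
  blockerRepeated: boolean;  // assumed flag: same blocker, no new information
  needsApproval: boolean;    // assumed flag: next action crosses an approval boundary
  nextAction: string | null; // null means there is no concrete next action
}

interface Decision {
  action: "continue" | "pause" | "stop";
  reason: string;
}

const NO_PROGRESS_LIMIT = 2; // mirrors the default `noProgressCount >= 2` stop rule

function decideNext(state: LoopState): Decision {
  if (state.successMet) {
    return { action: "stop", reason: "success criteria met" };
  }
  if (state.needsApproval) {
    return { action: "pause", reason: "waiting for approval" };
  }
  if (state.blockerRepeated) {
    return { action: "stop", reason: "same blocker repeated" };
  }
  if (state.noProgressCount >= NO_PROGRESS_LIMIT) {
    return { action: "stop", reason: "no progress" };
  }
  if (state.iteration >= state.maxIterations) {
    return { action: "stop", reason: "max iterations reached" };
  }
  if (state.nextAction === null) {
    return { action: "stop", reason: "no concrete next action" };
  }
  return { action: "continue", reason: state.nextAction };
}

// Usage: after the first of three cycles, with a concrete next action.
const firstCycle: LoopState = {
  iteration: 1,
  maxIterations: 3,
  successMet: false,
  noProgressCount: 0,
  blockerRepeated: false,
  needsApproval: false,
  nextAction: "audit the next module",
};
console.log(decideNext(firstCycle)); // action is "continue"
```

Keeping the decision a pure function of the baton is one possible design. It makes every stop reason explicit and easy to surface in the final report, but the order in which rules are checked here is a choice of this sketch, not something the skill prescribes.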
-``` -RFC/PRD Document - │ - ▼ - DECOMPOSITION (AI) - Break RFC into work units with dependency DAG - │ - ▼ -┌──────────────────────────────────────────────────────┐ -│ RALPH LOOP (up to 3 passes) │ -│ │ -│ For each DAG layer (sequential, by dependency): │ -│ │ -│ ┌── Quality Pipelines (parallel per unit) ───────┐ │ -│ │ Each unit in its own worktree: │ │ -│ │ Research → Plan → Implement → Test → Review │ │ -│ │ (depth varies by complexity tier) │ │ -│ └────────────────────────────────────────────────┘ │ -│ │ -│ ┌── Merge Queue ─────────────────────────────────┐ │ -│ │ Rebase onto main → Run tests → Land or evict │ │ -│ │ Evicted units re-enter with conflict context │ │ -│ └────────────────────────────────────────────────┘ │ -│ │ -└──────────────────────────────────────────────────────┘ -``` +## Generic Examples -### RFC Decomposition +### Bughunt loop -AI reads the RFC and produces work units: +Input: `Loop: find bugs in project X, max 3 iterations, verify with targeted tests when possible.` -```typescript -interface WorkUnit { - id: string; // kebab-case identifier - name: string; // Human-readable name - rfcSections: string[]; // Which RFC sections this addresses - description: string; // Detailed description - deps: string[]; // Dependencies (other unit IDs) - acceptance: string[]; // Concrete acceptance criteria - tier: "trivial" | "small" | "medium" | "large"; -} -``` +Cycle shape: -**Decomposition Rules:** -- Prefer fewer, cohesive units (minimize merge risk) -- Minimize cross-unit file overlap (avoid conflicts) -- Keep tests WITH implementation (never separate "implement X" + "test X") -- Dependencies only where real code dependency exists +1. Executor audits one area and reports a concrete bug or no finding. +2. Orchestrator records evidence and chooses the next area. +3. Stop on two no-finding cycles, max iterations, or unapproved write boundary. 
-The dependency DAG determines execution order: -``` -Layer 0: [unit-a, unit-b] ← no deps, run in parallel -Layer 1: [unit-c] ← depends on unit-a -Layer 2: [unit-d, unit-e] ← depend on unit-c -``` +### Docs cleanup loop -### Complexity Tiers +Input: `Loop: improve onboarding docs until the quickstart is coherent.` -Different tiers get different pipeline depths: +Cycle shape: -| Tier | Pipeline Stages | -|------|----------------| -| **trivial** | implement → test | -| **small** | implement → test → code-review | -| **medium** | research → plan → implement → test → PRD-review + code-review → review-fix | -| **large** | research → plan → implement → test → PRD-review + code-review → review-fix → final-review | +1. Executor reviews one doc path or user journey and edits only approved files. +2. Executor reports changed files and remaining gaps. +3. Orchestrator runs/requests verification if configured, then continues or stops. -This prevents expensive operations on simple changes while ensuring architectural changes get thorough scrutiny. +### PR comment loop -### Separate Context Windows (Author-Bias Elimination) +Input: `Loop: address unresolved review comments, max 5, stop before pushing.` -Each stage runs in its own agent process with its own context window: - -| Stage | Model | Purpose | -|-------|-------|---------| -| Research | Sonnet | Read codebase + RFC, produce context doc | -| Plan | Opus | Design implementation steps | -| Implement | Codex | Write code following the plan | -| Test | Sonnet | Run build + test suite | -| PRD Review | Sonnet | Spec compliance check | -| Code Review | Opus | Quality + security check | -| Review Fix | Codex | Address review issues | -| Final Review | Opus | Quality gate (large tier only) | - -**Critical design:** The reviewer never wrote the code it reviews. This eliminates author bias — the most common source of missed issues in self-review. 
- -### Merge Queue with Eviction - -After quality pipelines complete, units enter the merge queue: - -``` -Unit branch - │ - ├─ Rebase onto main - │ └─ Conflict? → EVICT (capture conflict context) - │ - ├─ Run build + tests - │ └─ Fail? → EVICT (capture test output) - │ - └─ Pass → Fast-forward main, push, delete branch -``` - -**File Overlap Intelligence:** -- Non-overlapping units land speculatively in parallel -- Overlapping units land one-by-one, rebasing each time - -**Eviction Recovery:** -When evicted, full context is captured (conflicting files, diffs, test output) and fed back to the implementer on the next Ralph pass: - -```markdown -## MERGE CONFLICT — RESOLVE BEFORE NEXT LANDING - -Your previous implementation conflicted with another unit that landed first. -Restructure your changes to avoid the conflicting files/lines below. - -{full eviction context with diffs} -``` - -### Data Flow Between Stages - -``` -research.contextFilePath ──────────────────→ plan -plan.implementationSteps ──────────────────→ implement -implement.{filesCreated, whatWasDone} ─────→ test, reviews -test.failingSummary ───────────────────────→ reviews, implement (next pass) -reviews.{feedback, issues} ────────────────→ review-fix → implement (next pass) -final-review.reasoning ────────────────────→ implement (next pass) -evictionContext ───────────────────────────→ implement (after merge conflict) -``` +Cycle shape: -### Worktree Isolation - -Every unit runs in an isolated worktree (uses jj/Jujutsu, not git): -``` -/tmp/workflow-wt-{unit-id}/ -``` - -Pipeline stages for the same unit **share** a worktree, preserving state (context files, plan files, code changes) across research → plan → implement → test → review. - -### Key Design Principles - -1. **Deterministic execution** — Upfront decomposition locks in parallelism and ordering -2. **Human review at leverage points** — The work plan is the single highest-leverage intervention point -3. 
**Separate concerns** — Each stage in a separate context window with a separate agent -4. **Conflict recovery with context** — Full eviction context enables intelligent re-runs, not blind retries -5. **Tier-driven depth** — Trivial changes skip research/review; large changes get maximum scrutiny -6. **Resumable workflows** — Full state persisted to SQLite; resume from any point - -### When to Use Ralphinho vs Simpler Patterns - -| Signal | Use Ralphinho | Use Simpler Pattern | -|--------|--------------|-------------------| -| Multiple interdependent work units | Yes | No | -| Need parallel implementation | Yes | No | -| Merge conflicts likely | Yes | No (sequential is fine) | -| Single-file change | No | Yes (sequential pipeline) | -| Multi-day project | Yes | Maybe (continuous-claude) | -| Spec/RFC already written | Yes | Maybe | -| Quick iteration on one thing | No | Yes (NanoClaw or pipeline) | - ---- - -## Choosing the Right Pattern - -### Decision Matrix - -``` -Is the task a single focused change? -├─ Yes → Sequential Pipeline or NanoClaw -└─ No → Is there a written spec/RFC? - ├─ Yes → Do you need parallel implementation? - │ ├─ Yes → Ralphinho (DAG orchestration) - │ └─ No → Continuous Claude (iterative PR loop) - └─ No → Do you need many variations of the same thing? - ├─ Yes → Infinite Agentic Loop (spec-driven generation) - └─ No → Sequential Pipeline with de-sloppify -``` - -### Combining Patterns - -These patterns compose well: - -1. **Sequential Pipeline + De-Sloppify** — The most common combination. Every implement step gets a cleanup pass. - -2. **Continuous Claude + De-Sloppify** — Add `--review-prompt` with a de-sloppify directive to each iteration. - -3. **Any loop + Verification** — Use ECC's `/verify` command or `verification-loop` skill as a gate before commits. - -4. 
**Ralphinho's tiered approach in simpler loops** — Even in a sequential pipeline, you can route simple tasks to Haiku and complex tasks to Opus: - ```bash - # Simple formatting fix - claude -p --model haiku "Fix the import ordering in src/utils.ts" - - # Complex architectural change - claude -p --model opus "Refactor the auth module to use the strategy pattern" - ``` - ---- - -## Anti-Patterns - -### Common Mistakes - -1. **Infinite loops without exit conditions** — Always have a max-runs, max-cost, max-duration, or completion signal. - -2. **No context bridge between iterations** — Each `claude -p` call starts fresh. Use `SHARED_TASK_NOTES.md` or filesystem state to bridge context. - -3. **Retrying the same failure** — If an iteration fails, don't just retry. Capture the error context and feed it to the next attempt. - -4. **Negative instructions instead of cleanup passes** — Don't say "don't do X." Add a separate pass that removes X. - -5. **All agents in one context window** — For complex workflows, separate concerns into different agent processes. The reviewer should never be the author. - -6. **Ignoring file overlap in parallel work** — If two parallel agents might edit the same file, you need a merge strategy (sequential landing, rebase, or conflict resolution). - ---- +1. Executor handles one coherent group of comments. +2. Executor reports comment IDs, files changed, and checks run. +3. Orchestrator continues while comments remain and local changes are safe; stops before any unapproved push/merge. 
-## References +## Reference Protocol -| Project | Author | Link | -|---------|--------|------| -| Ralphinho | enitrat | credit: @enitrat | -| Infinite Agentic Loop | disler | credit: @disler | -| Continuous Claude | AnandChowdhary | credit: @AnandChowdhary | -| NanoClaw | ECC | `/claw` command in this repo | -| Verification Loop | ECC | `skills/verification-loop/` in this repo | +For a copyable baton schema and worker/orchestrator report templates, see `references/protocol.md`. diff --git a/skills/loop/references/protocol.md b/skills/loop/references/protocol.md new file mode 100644 index 0000000..9dab311 --- /dev/null +++ b/skills/loop/references/protocol.md @@ -0,0 +1,86 @@ +# Generic Loop Protocol + +This reference is the copyable protocol for `skills/loop`. It is intentionally agent-agnostic: replace "executor" with the current runtime's worker, subagent, script, or human-assisted pass. + +## Iteration Baton + +```yaml +loopId: "short human-readable id" +task: "original user task" +maxIterations: 3 +iteration: 1 +successCriteria: + - "observable done condition" +stopConditions: + - "max iterations" + - "success criteria met" + - "noProgressCount >= 2" + - "approval boundary reached without approval" +verificationRequirements: + - "tests/checks/review to run when relevant" +lastResult: + summary: "none yet" + evidence: [] + artifacts: [] + verification: [] +nextAction: "specific next cycle objective" +noProgressCount: 0 +blocker: null +retryContext: null +approvalBoundaries: + - "no remote push without approval" + - "no destructive command without approval" +``` + +## Executor Prompt Template + +```markdown +You are the executor for loop iteration {iteration}/{maxIterations}. + +Task: {task} +Current baton: +{baton} + +Do exactly one cycle: +1. Work only on `nextAction` unless a small prerequisite lookup is required. +2. Respect approval/safety boundaries. +3. Run configured verification when applicable and safe. +4. Report compactly, then stop.
Do not start the next iteration. + +Return: +- status: completed | partial | blocked | no-progress +- result: +- evidence: exact paths/commands/IDs/errors/findings +- artifacts: files/PRs/issues/notes created or changed +- verification: checks run and outcomes, or why not run +- blocker: +- next: recommended next action +``` + +## Orchestrator Update Template + +```markdown +Iteration {n} result: {one-line summary} +Evidence: {key evidence} +Verification: {check status} +Decision: continue | retry-with-context | stop | pause-for-approval +Reason: {continuation/stop rule} +Next action: {specific next cycle objective, if continuing} +``` + +## Progress Accounting + +Update `noProgressCount` as follows: + +- Reset to `0` when the executor finds a new useful result, lands an approved change, removes a blocker, or produces new evidence that changes the next action. +- Increment by `1` when the executor repeats known information, cannot act for the same reason, or produces no actionable evidence. +- Stop at `noProgressCount >= 2` unless the user explicitly requested a larger saturation window. + +## Resume Checklist + +Before resuming an interrupted loop: + +1. Read the last baton and final executor report. +2. Inspect current artifacts/state; do not trust stale baton entries blindly. +3. Re-run only cheap, relevant verification if state may have changed. +4. Continue from the next safe iteration, or stop/pause if an approval boundary is now active.
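
Editor's sketch (not part of the patch): the Progress Accounting rules and the continue/stop decision can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not the skill's implementation: the `Baton` class, the snake_case field names, the `saturation_window` field, and the `account_progress`, `next_iteration`, and `decide` helpers are all hypothetical, and the priority order inside `decide` is one reasonable reading of the stop rules. Whether a cycle "made progress" is still a judgment the orchestrator makes from the executor report; the `retry-with-context` decision is omitted for brevity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Baton:
    """Hypothetical subset of the iteration baton (snake_case stand-ins)."""
    iteration: int = 1
    max_iterations: int = 3
    no_progress_count: int = 0
    # Assumption: stores the default window of 2, or a larger user-requested one.
    saturation_window: int = 2


def account_progress(baton: Baton, made_progress: bool) -> Baton:
    """Reset the counter on progress; otherwise increment it by one."""
    count = 0 if made_progress else baton.no_progress_count + 1
    return replace(baton, no_progress_count=count)


def next_iteration(baton: Baton) -> Baton:
    """Advance the iteration number before starting the next executor."""
    return replace(baton, iteration=baton.iteration + 1)


def decide(baton: Baton, success: bool = False, needs_approval: bool = False) -> str:
    """Return a decision string from the orchestrator update template."""
    if success:
        return "stop"  # success criteria met
    if needs_approval:
        return "pause-for-approval"  # never cross an approval boundary silently
    if baton.no_progress_count >= baton.saturation_window:
        return "stop"  # saturated: repeated no-progress cycles
    if baton.iteration >= baton.max_iterations:
        return "stop"  # out of iterations
    return "continue"
```

With the defaults, two consecutive no-progress cycles end the loop even though an iteration remains: `decide(account_progress(account_progress(Baton(), False), False))` evaluates to `"stop"`, while a single no-progress cycle still yields `"continue"`.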