Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
1. **Understand before changing** — For non-trivial work in unfamiliar code, read `graphify-out/GRAPH_REPORT.md` and `wiki/index.md` first. If `graphify-out/` is missing, **create it first** before any non-trivial exploration (see §0 for the rebuild command). Don't edit code you haven't mapped.
2. **Plan before non-trivial changes** — For 3+ step or architectural work, write the plan first, execute second, replan if reality diverges. Skip for mechanical one-liners.
3. **Reuse before writing** — Before adding a new function, type, or helper, grep for existing functionality. Exact fit: reuse. Close fit (~80%): refactor existing code (flag the scope change in the plan). Never silently copy-paste. (§1a)
4. **Delegate to subagents** — Offload research, parallel exploration, and focused subtasks to keep the main context clean. Match model tier (Haiku/Sonnet/Opus) to task complexity. (§2)
4. **Delegate to subagents** — Offload research, parallel exploration, and focused subtasks to keep the main context clean. Match model tier (Haiku/Sonnet/Opus) to task complexity. Reuse a context-warm agent (`SendMessage`) before spawning a fresh one when the follow-up touches the same files. (§2)
5. **Capture every correction** — When the user corrects an approach, immediately save a memory entry that prevents the same mistake. Review relevant memories at session start. (§3)
6. **No "done" without proof** — Run tests, check logs, exercise the UI. "Should work" is not a status. If verification isn't possible, say so explicitly. (§4)
7. **Prefer elegance to hacks** — On non-trivial changes, pause and ask "is there a cleaner way?" before shipping. If a fix feels hacky, do it right. Skip for obvious one-liners. (§5)
Expand Down Expand Up @@ -124,13 +124,15 @@ The loop:
4. **Opus re-reviews** the updated diff.
5. Repeat 3-4 until a review pass returns no actionable findings. "Until everything is addressed" means a clean pass, not "the obvious ones are fixed".

Rules: reviewer and implementer are distinct roles, ideally distinct agents (review the diff as if a stranger wrote it); implementer on Sonnet, reviewer on Opus (set via `model`); log per-round findings in the plan file for auditability; review per task as it lands, don't batch; skip only for the same trivially mechanical edits §1b lets you skip the worktree for.
Rules: reviewer and implementer are distinct roles, ideally distinct agents (review the diff as if a stranger wrote it); implementer on Sonnet, reviewer on Opus (set via `model`); log per-round findings in the plan file for auditability; review per task as it lands, don't batch; skip only for the same trivially mechanical edits §1b lets you skip the worktree for. Across rounds, keep the SAME implementer and SAME reviewer agents alive and continue them via `SendMessage` (§2 agent reuse): the diff and files stay loaded in each agent's context, so round N+1 costs only the delta, not a full re-read. Reviewer/implementer separation is between the two roles, not between rounds of the same role.

### 2. Subagent Strategy

Full rubric, PR-shipping tier split, and rationale in `~/.claude/subagent-strategy.md`. Headline rules:

- Use subagents liberally to keep the main context clean; one focused task per subagent. **Brief them fully** — they start cold: goal, relevant context (what you tried/ruled out), expected output format, length cap.
- **Reuse a live agent before spawning a new one.** Before any `Agent` spawn, check whether an agent from earlier in the session already holds the relevant context: same files, same diff, same investigation thread. If so, continue it via `SendMessage` instead of starting fresh; the files it read stay loaded, so nothing is re-read and the briefing shrinks to just the delta. Prime cases: re-review rounds on an updated diff (§1c), an implementer fixing findings in files it just wrote, follow-up questions to a research/Explore agent. Do NOT reuse when independence is the point (adversarial verification per §4, fresh-eyes review where reviewer must not trust the implementer), when the agent needs a different model tier, or when its context is bloated or polluted by failed attempts. Detail in `subagent-strategy.md` ("Reuse agents before spawning new ones").
- **In `Workflow` scripts, batch same-file work into one agent.** `agent()` calls always start cold, so structure the script so each file/module is owned by a single agent doing all its steps (read, fix, verify), rather than per-step agents each re-reading the same files.
- **When NOT to use subagents**: tight debugging loops where each iteration informs the next, work needing multiple rounds of your own judgement, interactive refinement with the user.
- **Background-first: don't block the main chat.** Default to running independent or long-running work in the background so the main session stays responsive instead of waiting on it: spawn subagents with `run_in_background: true` (builds, full test suites, CI/deploy/CR watchers, migrations, broad sweeps, research fan-outs, anything that takes more than ~30s or fans out), and use Bash `run_in_background: true` for long shell commands. Keep moving on other independent work or hand control back to the user while it runs, and collect results when the completion notification arrives - **never poll** (`TaskOutput`/status loops just re-block you). Run in the **foreground only** when the very next step truly needs that result, or for the 'When NOT to use subagents' carve-outs (tight debugging loops, interactive refinement). Fire independent calls together: multiple queries -> one message with parallel `Agent` calls. Lock patterns for concurrent agents in `multi-agent-comms.md`.
- **Delegate to the cheapest sufficient tier — actively, not just when in doubt.** The main session is usually the most expensive option; reserving it for work that needs it is the biggest cost lever.
Expand Down
29 changes: 28 additions & 1 deletion subagent-strategy.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Subagent Strategy — Detailed Rubric

Detail extracted from `CLAUDE.md` §2. The headline triggers stay in `CLAUDE.md`; the rationale, the model rubric, and the PR-shipping tier split live here. Read this when deciding how to delegate work or which model tier to spawn a subagent on.
Detail extracted from `CLAUDE.md` §2. The headline triggers stay in `CLAUDE.md`; the rationale, the model rubric, and the PR-shipping tier split live here. Read this when deciding how to delegate work, which model tier to spawn a subagent on, or whether to continue an existing agent instead of spawning a new one.

## Delegate to the cheapest sufficient tier — actively

Expand Down Expand Up @@ -28,6 +28,33 @@ The main session is the user's interactive channel; blocking it on work that cou
- **Hand control back while work runs.** After dispatching background work, return to the user or pick up the next independent task instead of idling - summarize what is running and what you will do when it lands.
- **Keep verification honest.** Backgrounding must not skip the post-implementation review or end-to-end verification (CLAUDE.md section 4). Collect and check each background result before reporting it done: a launched agent is not a completed one.

## Reuse agents before spawning new ones (context economy)

Every fresh `Agent` spawn starts cold: it re-reads the project docs, re-greps, and re-loads every file it needs before doing anything useful. When an agent from earlier in the session already holds that context, continuing it via `SendMessage` (by agent ID or name) makes the follow-up cost only the delta. This is the agent-level analog of §1a "reuse before writing": check what already exists before creating something new.

**The check, before any spawn**: does a running or recently finished agent already have the relevant files, diff, or investigation thread in context? Signals that it does:

- The follow-up touches the **same files or module** the agent just read or edited (fix findings in code it wrote, extend a change it made, answer another question about the area it explored).
- It is the **next round of the same loop**: §1c re-review of an updated diff, a CR-fix push to the same branch, a watcher follow-up on the same PR/run.
- It is a **follow-up question** to a research/Explore/triage agent about material it already surveyed.

In all of these, send the agent the new instruction with just the delta ("review the updated diff; previous findings 1 and 3 were fixed in <files>") instead of a full cold briefing.

**When NOT to reuse** (spawn fresh instead):

- **Independence is the point.** Adversarial verification (CLAUDE.md §4), fresh-eyes review, refute-style judging: a verifier that shares the implementer's context inherits its blind spots. Reviewer and implementer stay distinct agents; but the same reviewer SHOULD persist across rounds of its own loop.
- **Wrong tier.** An agent's model is fixed at spawn. If the follow-up needs Opus judgement and the warm agent is Haiku/Sonnet (or the follow-up is mechanical and the warm agent is Opus, where each continued turn re-reads its whole accumulated context at Opus prices), a fresh right-tier spawn is cheaper than a wrong-tier continuation.
- **Polluted or bloated context.** The agent went down failed paths, accumulated huge tool output, or is near its context limit. A fresh agent with a tight briefing beats a confused warm one.
- **Unrelated task.** Overlap in time is not overlap in context; don't funnel misc work through one long-lived agent.

**Tie-breaker**: when the follow-up reads the same >2-3 files the agent already loaded, reuse usually wins; when the briefing is two sentences and the files are small, a cold spawn at a cheaper tier may still be cheaper. Decide by which context is larger: the files to re-read, or the delta message.

**`Workflow` scripts have no SendMessage**: each `agent()` call is a cold start. Get the same economy structurally:

- **Partition by file/module, not by step.** One agent owns each file or cluster and performs ALL steps on it (read, fix, test, verify) in a single `agent()` call, instead of a per-step pipeline where stage 2's agent re-reads everything stage 1's agent just read.
- Multi-stage pipelines are still right when stages genuinely need different perspectives (find -> adversarially verify) or different tiers; accept the re-read there, it buys independence.
- When stages must stay separate but stage 2 only needs stage 1's *conclusions*, pass them in the prompt (file paths, line numbers, findings) so stage 2 reads only the cited spans, not the whole surface again.

## Model rubric — match tier to task complexity

- **Haiku / gpt-5.4-mini / Gemini 3.1 Flash-Lite** (default for most delegations): file renames, typo fixes, mechanical edits with a clear spec, simple lookups (grep for a symbol, find where X is called), reading one file to answer a factual question, formatting/style fixes, running a single command (or routine `gh`/`git` op) and reporting output, implementing a tightly-specified function, writing a test from a tight spec, code review of a small single-file diff, mechanical API/SDK migration with a documented mapping, classifying/labelling against a clear rubric (e.g. triage chunks), summarising one file or short diff. Cheap, fast, good enough when the answer is mostly mechanical or rubric-driven.
Expand Down