diff --git a/workflows/bugfix/.claude/skills/assess/SKILL.md b/workflows/bugfix/.claude/skills/assess/SKILL.md
index 8b161ff3..8688b2a1 100644
--- a/workflows/bugfix/.claude/skills/assess/SKILL.md
+++ b/workflows/bugfix/.claude/skills/assess/SKILL.md
@@ -68,7 +68,49 @@ gh repo clone OWNER/REPO /workspace/repos/REPO
 
 This is read-only exploration — understand the code, don't change it.
 
-### Step 3: Summarize Your Understanding
+### Step 3: Check for Existing Work
+
+Before investing effort, check whether this bug is already being addressed:
+
+- **Check for linked PRs on the issue:**
+
+```bash
+gh issue view NUMBER --repo OWNER/REPO --json body,comments --jq '[.body, .comments[].body] | join("\n")' | grep -iE "pull request|#[0-9]+"
+```
+
+- **Scan recent PR titles for overlap:**
+
+```bash
+gh pr list --repo OWNER/REPO --state open --limit 30 --json number,title,headRefName --jq '.[] | "\(.number)\t\(.title)"'
+```
+
+  Skim the titles for anything related to the bug's component, error, or
+  symptoms. If a PR looks relevant, read its description before proceeding.
+
+- **Check for duplicate or related issues:**
+
+```bash
+gh issue list --repo OWNER/REPO --state open --limit 30 --json number,title --jq '.[] | "\(.number)\t\(.title)"'
+```
+
+If you find an open PR that appears to directly address this bug, **stop here
+and use `AskUserQuestion`** before continuing the assessment. Present the
+options:
+
+- "PR #N appears to address this bug — review it instead of starting fresh"
+- "PR #N is related but doesn't fully cover it — continue with assessment"
+- "Not sure if PR #N is relevant — continue with assessment"
+
+This gate applies in both normal and speedrun mode. Do not continue to Step 4
+until the user responds. The `AskUserQuestion` tool triggers platform
+notifications so the user knows you need their input.
+
+If duplicate or related issues are found (but no PR), note them in the
+assessment and continue — these inform the assessment but don't block it.
+
+If no existing work is found, note that and continue.
+
+### Step 4: Summarize Your Understanding
 
 Present a clear, concise summary to the user covering:
 
@@ -81,7 +123,7 @@ Present a clear, concise summary to the user covering:
 - **Severity/impact:** Your assessment of how serious this is, based on the
   information available
 
-### Step 4: Identify What You Know vs. What's Missing
+### Step 5: Identify What You Know vs. What's Missing
 
 Be explicit about gaps:
 
@@ -93,7 +135,7 @@ Be explicit about gaps:
 - **Assumptions:** Any assumptions you're making — call them out so the user
   can confirm or correct them
 
-### Step 5: Propose a Reproduction Plan
+### Step 6: Propose a Reproduction Plan
 
 Based on your understanding, outline how you would approach reproduction:
 
@@ -105,7 +147,7 @@ Based on your understanding, outline how you would approach reproduction:
 If the bug seems straightforward, the plan can be brief. If it's complex or
 ambiguous, be thorough.
 
-### Step 6: Present to the User
+### Step 7: Present to the User
 
 Deliver your assessment in this structure:
 
@@ -115,6 +157,9 @@ Deliver your assessment in this structure:
 **Issue:** [title or one-line summary]
 **Source:** [issue URL, conversation, etc.]
 
+### Existing Work
+[Any related PRs, duplicate issues, or prior attempts — or "None found"]
+
 ### Understanding
 [Your 2-3 sentence summary of the bug]
 
@@ -139,12 +184,22 @@ Deliver your assessment in this structure:
 Be direct. If the bug report is clear and complete, say so. If it's vague or
 missing critical details, say that too.
 
-### Step 7: Write the Assessment Artifact
+**If the bug doesn't actually apply**, say so clearly and present options:
+
+- "This issue doesn't affect your project — here's why. Want to close it?"
+- "The reported issue doesn't apply directly, but here's a related improvement
+  we could make (with trade-offs): ..."
+- "This appears to be a duplicate of #N — should we close this one?"
+
+Do not proceed to fix something you've concluded isn't broken. Present your
+finding and let the user decide.
+
+### Step 8: Write the Assessment Artifact
 
 Save your assessment to `artifacts/bugfix/reports/assessment.md` so that
 subsequent phases (and speedrun resumption) can detect that this phase is
 complete. The file should contain the same content you presented to the user
-in Step 6.
+in Step 7.
 
 ## Output
diff --git a/workflows/bugfix/.claude/skills/controller/SKILL.md b/workflows/bugfix/.claude/skills/controller/SKILL.md
index 70697f57..3456b416 100644
--- a/workflows/bugfix/.claude/skills/controller/SKILL.md
+++ b/workflows/bugfix/.claude/skills/controller/SKILL.md
@@ -34,6 +34,10 @@ executing phases and handling transitions between them.
 8. **PR** (`/pr`) — `.claude/skills/pr/SKILL.md`
    Push the branch to a fork and create a draft pull request.
 
+9. **Summary** (`/summary`) — `.claude/skills/summary/SKILL.md`
+   Scan all artifacts and present a synthesized summary of findings, decisions,
+   and status. Can also be invoked at any point mid-workflow.
+
 Phases can be skipped or reordered at the user's discretion.
 
 ## How to Execute a Phase
@@ -43,12 +47,20 @@ Phases can be skipped or reordered at the user's discretion.
    "Starting the /fix phase (dispatched by `.claude/skills/controller/SKILL.md`)."
    This is very important so the user knows the workflow is working, learns
    about the available phases, and so skills can find their way back here.
-2. **Read** the skill file from the list above
+2. **Read** the skill file from the list above. You MUST call the Read tool on
+   the skill's `SKILL.md` file before executing. If you find yourself executing
+   a phase without having read its skill file, you are doing it wrong — stop
+   and read it now.
 3. **Execute** the skill's steps directly — the user should see your progress
 4. When the skill is done, it will report its findings and re-read this
    controller. Then use "Recommending Next Steps" below to offer options.
-5. Present the skill's results and your recommendations to the user
-6. **Stop and wait** for the user to tell you what to do next
+5. Present the skill's results and your recommendations to the user.
+6. **Use `AskUserQuestion` to get the user's decision.** Present the
+   recommended next step and alternatives as options. Do NOT continue until the
+   user responds. This is a hard gate — the `AskUserQuestion` tool triggers
+   platform notifications and status indicators so the user knows you need
+   their input. Plain-text questions do not create these signals and the user
+   may not see them.
 
 ## Recommending Next Steps
 
@@ -59,7 +71,7 @@ happened.
 ### Typical Flow
 
 ```text
-assess → reproduce → diagnose → fix → test → review → document → pr
+assess → reproduce → diagnose → fix → test → review → document → pr → summary
 ```
 
 ### What to Recommend
@@ -110,6 +122,17 @@ directly — don't force them through earlier phases.
 
 ## Rules
 
-- **Never auto-advance.** Always wait for the user between phases.
+- **Never auto-advance.** Always use `AskUserQuestion` and wait for the user's
+  response between phases. This is the single most important rule in this
+  controller. If you proceed to another phase without the user's explicit
+  go-ahead, the workflow is broken.
+- **Always read skill files.** Every phase execution MUST begin with a Read
+  tool call on the skill's `SKILL.md` file. Do not execute a phase from memory
+  or general knowledge — the skill files contain specific, tested instructions
+  that differ from what you might do ad-hoc.
+- **Urgency does not bypass process.** Security advisories, critical bugs, and
+  production incidents may create pressure to act fast. The phase-gated
+  workflow exists precisely to prevent hasty action. Follow every phase gate
+  regardless of perceived urgency.
 - **Recommendations come from this file, not from skills.** Skills report
   findings; this controller decides what to recommend next.
diff --git a/workflows/bugfix/.claude/skills/fix/SKILL.md b/workflows/bugfix/.claude/skills/fix/SKILL.md
index f0113a39..bb09c684 100755
--- a/workflows/bugfix/.claude/skills/fix/SKILL.md
+++ b/workflows/bugfix/.claude/skills/fix/SKILL.md
@@ -59,21 +59,42 @@ Before finalizing the implementation, ensure thoroughness:
 
 - **Check for complete enumeration**: If implementing switch/case logic or
   conditional checks, verify you've handled all possible values. Search the
   codebase for where these values are defined or used.
 - **Example**: If implementing polling that stops on "terminal" session
   phases, search the codebase for all usages of session phases to build a
   complete list (Stopped, Completed, Failed, Error) rather than assuming you
   know them all.
 
-### Step 4: Address Related Code
+### Step 4: Review Error Handling UX
+
+If your fix involves error handling, validation, or user-facing messages,
+review the error paths for clarity:
+
+- **Match error context to error type.** A CLI argument error should use the
+  CLI framework's error type (e.g., `click.BadParameter`), while a
+  configuration file error should use a general exception that says which file
+  and line caused the problem. Don't report config file errors as CLI parameter
+  errors, or vice versa.
+- **Test every error path manually.** Trigger each error condition and read the
+  message from the user's perspective. Is it clear what went wrong? Does it
+  point to the right place to fix it?
+- **Consider different error contexts:**
+  - CLI errors → should reference the flag or argument
+  - Config file errors → should reference the file path and setting
+  - Runtime errors → should include enough context to reproduce
+  - API errors → should include the endpoint and status code
+- **Ensure error messages don't leak internals.** Stack traces, internal paths,
+  and raw exception types are useful for developers but confusing for users.
+
+### Step 5: Address Related Code
 
 - Fix similar patterns identified in root cause analysis
 - Update affected function signatures if necessary
 - Ensure consistency across the codebase
 - Consider adding defensive programming where appropriate
 
-### Step 5: Update Documentation
+### Step 6: Update Documentation
 
 - Update inline code documentation
 - Modify API documentation if interfaces changed
 - Update configuration documentation if settings changed
 - Note any breaking changes clearly
 
-### Step 6: Pre-commit Quality Checks
+### Step 7: Pre-commit Quality Checks
 
 - Run code formatters (e.g., `gofmt`, `black`, `prettier`)
 - Run linters and fix all warnings (e.g., `golangci-lint`, `flake8`, `eslint`)
@@ -81,7 +102,7 @@
 - Check for any new security vulnerabilities introduced
 - Verify no secrets or sensitive data added
 
-### Step 7: Document Implementation
+### Step 8: Document Implementation
 
 Create `artifacts/bugfix/fixes/implementation-notes.md` containing:
diff --git a/workflows/bugfix/.claude/skills/pr/SKILL.md b/workflows/bugfix/.claude/skills/pr/SKILL.md
index 981e2c4a..e13a5c7a 100644
--- a/workflows/bugfix/.claude/skills/pr/SKILL.md
+++ b/workflows/bugfix/.claude/skills/pr/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: pr
-description: Create a pull request for a bug fix, handling fork workflows, authentication, and remote setup systematically.
+description: Create a pull request from the current branch. Use this instead of running gh pr create directly — it detects GitHub App vs user auth, finds or creates forks, syncs workflow files, detects the upstream default branch, and falls back to compare URLs when API access is limited.
 ---
 
 # Create Pull Request Skill
@@ -54,7 +54,10 @@ These are determined during pre-flight checks. Record each value as you go.
 | `AUTH_TYPE` | Step 0: `gh auth status` + `gh api user` | `user-token` / `github-app` / `none` |
 | `GH_USER` | Step 0: `gh api user` or `/installation/repositories` (for bots) | `jsmith` |
 | `UPSTREAM_OWNER/REPO` | Step 2c: `gh repo view --json nameWithOwner` | `acme/myproject` |
+| `DEFAULT_BRANCH` | Step 2c: detected from upstream repo | `main` / `dev` / `master` |
+| `UPSTREAM_REMOTE` | Step 2b: the git remote name pointing to the upstream org | `origin` / `upstream` |
 | `FORK_OWNER` | Step 3: owner portion of fork's `nameWithOwner`, or `GH_USER` if newly created | `jsmith` |
+| `FORK_REMOTE` | Step 4: the git remote name pointing to the fork | `fork` / `origin` |
 | `REPO` | The repository name (without owner) | `myproject` |
 | `BRANCH_NAME` | Step 5: the branch you create | `bugfix/issue-42-null-check` |
@@ -86,7 +89,43 @@ Record `GH_USER` and `AUTH_TYPE`:
 
 - If `gh api user` succeeded: `AUTH_TYPE` = `user-token`, `GH_USER` = the login
 - If `gh api user` failed but `/installation/repositories` worked:
   `AUTH_TYPE` = `github-app`, `GH_USER` = the repo owner login
-- If `gh auth status` itself failed: `AUTH_TYPE` = `none`
+- If `gh auth status` itself failed: try recovering from an expired token
+  (see below). If recovery fails: `AUTH_TYPE` = `none`
+
+**Recovering from expired tokens:** Platform sessions often start with a
+`GITHUB_TOKEN` env var that can expire mid-session. The `refresh_credentials`
+MCP tool refreshes the backend but does NOT update the shell env var. If
+`gh auth status` fails, check for a git credential helper:
+
+```bash
+git config --global credential.helper 2>/dev/null
+```
+
+If a credential helper exists (e.g., `/tmp/git-credential-ambient`), query it
+for a fresh token:
+
+```bash
+FRESH_TOKEN=$(printf 'protocol=https\nhost=github.com\n\n' | git credential fill 2>/dev/null | grep '^password=' | cut -d= -f2)
+```
+
+If that returns a token, export it and re-check auth:
+
+```bash
+export GITHUB_TOKEN="$FRESH_TOKEN"
+gh auth status
+```
+
+**Important:** Each shell invocation gets a fresh environment, so you must
+prepend the export to every subsequent `gh` command, or write the token to
+`~/.config/gh/hosts.yml` so `gh` picks it up natively:
+
+```bash
+gh auth login --with-token <<< "$FRESH_TOKEN"
+```
+
+The `gh auth login` approach is preferred — it persists for all subsequent
+`gh` commands without per-command exports. After recovery, re-run the identity
+checks above to set `AUTH_TYPE` and `GH_USER`.
 
 ### Step 1: Locate the Project Repository
 
@@ -142,21 +181,33 @@ the user's fork. Common patterns:
 | `fork` | user's name | Fork (read-write) |
 | `upstream` | upstream org | Upstream (read-only) |
 
-**2c. Identify the upstream repo:**
+**2c. Identify the upstream repo and its default branch:**
 
 If `gh` is authenticated:
 
 ```bash
-gh repo view --json nameWithOwner --jq .nameWithOwner
+gh repo view --json nameWithOwner,defaultBranchRef --jq '{nameWithOwner, defaultBranch: .defaultBranchRef.name}'
 ```
 
+This returns both the repo name and its default branch. Record
+`UPSTREAM_OWNER/REPO` and `DEFAULT_BRANCH` from the output.
+
+If `gh` is NOT authenticated, extract from the upstream remote (identified in
+Step 2b as `UPSTREAM_REMOTE`):
 
 ```bash
-git remote get-url origin | sed -E 's#.*/([^/]+/[^/]+?)(\.git)?$#\1#'
+# Repo name (use UPSTREAM_REMOTE, not necessarily origin — origin may be the fork)
+git remote get-url UPSTREAM_REMOTE | sed -E 's#.*/([^/]+/[^/]+?)(\.git)?$#\1#'
+
+# Default branch
+git remote show UPSTREAM_REMOTE 2>/dev/null | grep 'HEAD branch' | awk '{print $NF}'
 ```
 
-Record the result as `UPSTREAM_OWNER/REPO` — you'll need it later.
+Record the results as `UPSTREAM_OWNER/REPO` and `DEFAULT_BRANCH`.
+
+**Do not assume the default branch is `main`.** Many projects use `dev`,
+`master`, `develop`, or other branch names. All subsequent steps that
+reference the base branch MUST use `DEFAULT_BRANCH`.
 
 **2d. Check current branch and changes:**
 
@@ -181,6 +232,8 @@ Pre-flight summary:
 | AUTH_TYPE | ___ |
 | GH_USER | ___ |
 | UPSTREAM_OWNER/REPO | ___ |
+| DEFAULT_BRANCH | ___ |
+| UPSTREAM_REMOTE | ___ |
 | EXISTING_REMOTES | ___ |
 | HAS_CHANGES | yes / no |
 | CURRENT_BRANCH | ___ |
@@ -313,8 +366,15 @@ If not present, add it:
 git remote add fork https://github.com/FORK_OWNER/REPO.git
 ```
 
-Use `fork` as the remote name. If `origin` already points to the fork, that's
-fine — just use `origin` in subsequent commands instead of `fork`.
+**Set `FORK_REMOTE`** based on what you find:
+
+- If you just added a remote named `fork` → `FORK_REMOTE` = `fork`
+- If `origin` already points to the fork (URL contains `FORK_OWNER`) →
+  `FORK_REMOTE` = `origin`
+- If another existing remote points to the fork → `FORK_REMOTE` = that name
+
+Record `FORK_REMOTE` now. **Use it in all subsequent commands** (push, fetch,
+ls-remote) instead of hardcoding `fork` or `origin`.
 
 ### Step 4a: Check Fork Sync Status
 
@@ -335,12 +395,13 @@ permission by design.
 
 **Detection:**
 
 ```bash
-# Fetch the fork to get its current state
-git fetch fork
+# Fetch both the fork and the upstream remote (identified in Step 2b)
+git fetch FORK_REMOTE
+git fetch UPSTREAM_REMOTE
 
-# Check for workflow file differences between fork/main and local main
-# (local main should be synced with upstream)
-WORKFLOW_DIFF=$(git diff fork/main..main -- .github/workflows/ --name-only 2>/dev/null)
+# Compare fork's DEFAULT_BRANCH against upstream's DEFAULT_BRANCH
+# (don't rely on the local branch — it may be stale)
+WORKFLOW_DIFF=$(git diff --name-only FORK_REMOTE/DEFAULT_BRANCH..UPSTREAM_REMOTE/DEFAULT_BRANCH -- .github/workflows/ 2>/dev/null)
 
 if [ -n "$WORKFLOW_DIFF" ]; then
   echo "Fork is out of sync with upstream (workflow files differ):"
 fi
 ```
 
+`UPSTREAM_REMOTE` is whichever remote was identified as pointing to the
+upstream org in Step 2b (typically `origin` when origin is the upstream repo,
+or `upstream` if the user added it separately).
+
 **If workflow differences exist — attempt automated sync:**
 
 ```bash
-# Try to sync the fork's main branch with upstream
-gh api --method POST repos/FORK_OWNER/REPO/merge-upstream -f branch=main
+# Try to sync the fork's default branch with upstream
+gh api --method POST repos/FORK_OWNER/REPO/merge-upstream -f branch=DEFAULT_BRANCH
 ```
 
-- If this succeeds: fetch the fork again (`git fetch fork`) and continue
+- If this succeeds: fetch the fork again (`git fetch FORK_REMOTE`) and continue
 - If this fails (usually due to workflow permission restrictions): guide the
   user to sync manually
 
@@ -372,7 +437,7 @@ gh api --method POST repos/FORK_OWNER/REPO/merge-upstream -f branch=main
 >
 > 2. **Via command line** (may require `gh auth refresh -s workflow` first):
 >    ```
->    gh repo sync FORK_OWNER/REPO --branch main
+>    gh repo sync FORK_OWNER/REPO --branch DEFAULT_BRANCH
 >    ```
 >
 > Let me know when the sync is complete and I'll continue with the PR.
@@ -381,12 +446,13 @@ gh api --method POST repos/FORK_OWNER/REPO/merge-upstream -f branch=main
 
 ```bash
 # Fetch the updated fork
-git fetch fork
+git fetch FORK_REMOTE
 
-# Rebase the feature branch onto the synced fork/main
-git rebase fork/main
+# Update local DEFAULT_BRANCH before creating the feature branch
+git checkout DEFAULT_BRANCH
+git rebase FORK_REMOTE/DEFAULT_BRANCH
 
-# Continue to Step 5 (create branch)
+# Continue to Step 5 (create feature branch from updated DEFAULT_BRANCH)
 ```
 
 ### Step 5: Create a Branch
 
@@ -434,8 +500,8 @@ and include that instead.
 # Ensure git uses gh for authentication
 gh auth setup-git
 
-# Push the branch
-git push -u fork BRANCH_NAME
+# Push the branch (use FORK_REMOTE from Step 4)
+git push -u FORK_REMOTE BRANCH_NAME
 ```
 
 **If this fails:**
 
@@ -445,7 +511,7 @@ git push -u fork BRANCH_NAME
   re-authenticate or the sandbox may be blocking network access.
 - **Remote not found**: Verify the fork remote URL is correct.
 - **Permission denied**: The fork remote may be pointing to upstream, not the
-  actual fork. Verify with `git remote get-url fork`.
+  actual fork. Verify with `git remote get-url FORK_REMOTE`.
 
 ### Step 8: Create the Draft PR
 
@@ -465,7 +531,7 @@ gh pr create \
   --draft \
   --repo UPSTREAM_OWNER/REPO \
   --head FORK_OWNER:BRANCH_NAME \
-  --base main \
+  --base DEFAULT_BRANCH \
   --title "fix(SCOPE): SHORT_DESCRIPTION" \
   --body-file artifacts/bugfix/docs/pr-description.md
 ```
 
@@ -486,7 +552,7 @@ Do NOT retry, do NOT debug further, do NOT fall back to a patch file. Instead:
    query parameters so the PR form opens fully populated:
 
    ```text
-   https://github.com/UPSTREAM_OWNER/REPO/compare/main...FORK_OWNER:BRANCH_NAME?expand=1&title=URL_ENCODED_TITLE&body=URL_ENCODED_BODY
+   https://github.com/UPSTREAM_OWNER/REPO/compare/DEFAULT_BRANCH...FORK_OWNER:BRANCH_NAME?expand=1&title=URL_ENCODED_TITLE&body=URL_ENCODED_BODY
   ```
 
   URL-encode the title and body. If the encoded URL would exceed ~8KB
@@ -513,7 +579,7 @@ Do NOT retry, do NOT debug further, do NOT fall back to a patch file. Instead:
   ```
 
 **If "branch not found"**: The push in Step 7 may have failed silently.
-Verify with `git ls-remote fork BRANCH_NAME`.
+Verify with `git ls-remote FORK_REMOTE BRANCH_NAME`.
 
 ### Step 9: Confirm and Report
 
@@ -598,6 +664,7 @@ or network access is completely blocked:
 
 | Symptom | Cause | Fix |
 | --- | --- | --- |
 | `gh auth status` fails | Not logged in | User must run `gh auth login` |
+| `gh` commands return 401 "Bad credentials" | `GITHUB_TOKEN` expired mid-session | Query git credential helper for fresh token, then `gh auth login --with-token` (see Step 0 recovery) |
 | `git push` "could not read Username" | git credential helper not configured | Run `gh auth setup-git` then retry push |
 | `git push` permission denied | Pushing to upstream, not fork | Verify remote URL, switch to fork |
 | `git push` "refusing to allow...without `workflows` permission" | Fork out of sync with upstream (missing workflow files) | Run Step 4a: sync fork, then rebase and retry push |
 | `gh repo fork` fails | Sandbox blocks forking | User creates fork manually |
 | Branch not found on remote | Push failed silently | Re-run `git push`, check network |
 | No changes to commit | Changes already committed or not staged | Check `git status`, `git log` |
-| Wrong base branch | Upstream default isn't `main` | Check with `gh repo view --json defaultBranchRef` |
+| Wrong base branch | `DEFAULT_BRANCH` wasn't detected or was overridden | Re-run `gh repo view --json defaultBranchRef` and update `DEFAULT_BRANCH` |
 
 ## Notes
diff --git a/workflows/bugfix/.claude/skills/reproduce/SKILL.md b/workflows/bugfix/.claude/skills/reproduce/SKILL.md
index af1ea35a..f8f7fe60 100644
--- a/workflows/bugfix/.claude/skills/reproduce/SKILL.md
+++ b/workflows/bugfix/.claude/skills/reproduce/SKILL.md
@@ -35,11 +35,67 @@ Methodically reproduce bugs and document their behavior so that diagnosis and fi
 
 ### Step 2: Set Up Environment
 
+**Before installing anything**, inspect the project's dependency configuration
+to understand what's needed:
+
+```bash
+# Check for Python project metadata
+cat pyproject.toml 2>/dev/null | head -40
+cat setup.py 2>/dev/null | head -20
+cat requirements.txt 2>/dev/null | head -20
+
+# Check for Node.js project metadata
+cat package.json 2>/dev/null | head -30
+
+# Check for Go project metadata
+cat go.mod 2>/dev/null | head -10
+```
+
+**Key things to look for:**
+
+- **Required language version** (e.g., `requires-python = ">=3.12"`,
+  `"engines": { "node": ">=18" }`, `go 1.22`)
+- **Package manager** (look for `uv.lock`, `poetry.lock`, `Pipfile.lock`,
+  `pnpm-lock.yaml`, `yarn.lock`, `package-lock.json`)
+- **Dev dependencies** and test frameworks
+
+**Environment setup by project type:**
+
+| Indicator | Package Manager | Setup Command |
+| --- | --- | --- |
+| `uv.lock` or `[tool.uv]` in pyproject.toml | uv | `uv sync` |
+| `poetry.lock` | Poetry | `poetry install` |
+| `Pipfile.lock` | pipenv | `pipenv install --dev` |
+| `requirements.txt` only | pip | `python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt` |
+| `pnpm-lock.yaml` | pnpm | `pnpm install` |
+| `yarn.lock` | Yarn | `yarn install` |
+| `package-lock.json` | npm | `npm ci` |
+| `go.mod` | Go modules | `go mod download` |
+
+**Check for version managers** before concluding a runtime isn't available:
+
+```bash
+# Python
+uv python list 2>/dev/null || pyenv versions 2>/dev/null
+# Node
+nvm ls 2>/dev/null || fnm list 2>/dev/null
+```
+
+Then proceed with the standard setup:
+
 - Verify environment matches the conditions described in the bug report
 - Check dependencies, configuration files, and required data
 - Document any environment variables or special setup needed
 - Ensure you're on the correct branch or commit
 
+**If environment setup fails**, don't keep retrying the same approach. Stop,
+read the error message, and try a different strategy. Common recovery patterns:
+
+- Wrong Python version → use `uv python install X.Y` or `pyenv install X.Y`
+- Missing system dependency → check if there's a Docker/container option
+- Permission errors → check if a virtualenv is needed
+- Build failures → look for a `Makefile`, `justfile`, or `scripts/` directory
+
 ### Step 3: Attempt Reproduction
 
 - Follow the reported steps to reproduce exactly as described
diff --git a/workflows/bugfix/.claude/skills/speedrun/SKILL.md b/workflows/bugfix/.claude/skills/speedrun/SKILL.md
index 86a57d76..a451be46 100644
--- a/workflows/bugfix/.claude/skills/speedrun/SKILL.md
+++ b/workflows/bugfix/.claude/skills/speedrun/SKILL.md
@@ -23,7 +23,7 @@ phases to include or skip.
 Each time you read this file, you will:
 
 1. Determine which phase to run next (see "Determine Next Phase" below)
-2. If all phases are done, print the completion report and stop
+2. If all phases are done (including `/summary`), stop
 3. Otherwise, execute that one phase (see "Execute a Phase" below)
 4. The phase skill will tell you to return to the file that dispatched it —
    that's this file (`.claude/skills/speedrun/SKILL.md`). Re-read it and repeat.
 
@@ -47,6 +47,7 @@ context, then pick the first phase that is NOT done.
 | review | `.claude/skills/review/SKILL.md` | `artifacts/bugfix/review/verdict.md` exists |
 | document | `.claude/skills/document/SKILL.md` | `artifacts/bugfix/docs/pr-description.md` exists |
 | pr | `.claude/skills/pr/SKILL.md` | A PR URL has been shared in conversation |
+| summary | `.claude/skills/summary/SKILL.md` | `artifacts/bugfix/summary.md` exists |
 
 ### Rules
 
@@ -128,9 +129,19 @@ context, then pick the first phase that is NOT done.
 - Follow the PR skill's full process including its fallback ladder.
 - If PR creation fails after exhausting fallbacks, report and stop.
 
-## Completion Report
+### summary
 
-When all phases are done (or if you stop early due to escalation), present:
+- Always run this as the final phase. It replaces the Completion Report below.
+- The summary skill scans all artifacts and presents a synthesized overview.
+  This is the last thing the user sees — it surfaces findings that might
+  otherwise get buried in earlier artifacts.
+- Speedrun still MUST honor any `AskUserQuestion` hard gates from earlier
+  phases (e.g., the existing-PR decision in assess). Those gates block until
+  the user responds — do not skip or work around them.
+
+## Completion Report (Early Stop Only)
+
+If you stop early due to escalation (before `/summary` runs), present:
 
 ```markdown
 ## Speedrun Complete
diff --git a/workflows/bugfix/.claude/skills/summary/SKILL.md b/workflows/bugfix/.claude/skills/summary/SKILL.md
new file mode 100644
index 00000000..b2b67574
--- /dev/null
+++ b/workflows/bugfix/.claude/skills/summary/SKILL.md
@@ -0,0 +1,141 @@
+---
+name: summary
+description: Scan all workflow artifacts and present a synthesized summary of findings, decisions, and status.
+---
+
+# Workflow Summary Skill
+
+## Dispatch
+
+This skill can be invoked at any point in the workflow — from the controller,
+from speedrun, or directly by the user. It does not require prior phases to
+have completed; it summarizes whatever exists so far.
+
+---
+
+You are producing a concise, high-signal summary of everything the bugfix
+workflow has done so far. Your audience is someone who hasn't been watching
+the workflow run — they want to know what happened, what was decided, and
+what needs attention, without reading every artifact.
+
+## Your Role
+
+Scan the artifact directory, read what's there, and synthesize the important
+findings into a single summary. Surface things that might otherwise get buried:
+related PRs found, reproduction failures, review concerns, assumptions that
+were never confirmed, decisions made on the user's behalf.
+
+## Process
+
+### Step 1: Discover Artifacts
+
+Scan the artifact root directory to find everything the workflow has produced:
+
+```bash
+find artifacts/bugfix/ -type f -name '*.md' ! -name 'summary.md' 2>/dev/null | sort
+```
+
+If `artifacts/bugfix/` doesn't exist or is empty, report that no artifacts
+have been generated yet and stop.
+
+### Step 2: Read All Artifacts
+
+Read every artifact file found in Step 1. Don't skip any — even small or
+seemingly unimportant files may contain notable findings.
+
+### Step 3: Extract Key Findings
+
+As you read each artifact, pull out information in these categories:
+
+**Existing work discovered**
+- Related PRs, duplicate issues, or prior fix attempts found during assessment
+- Whether any of these were acted on or deferred
+
+**Bug understanding**
+- What the bug is and whether it was confirmed
+- If reproduction failed, why
+- If the assessment concluded the bug doesn't apply, what happened next
+
+**Root cause and fix**
+- The identified root cause (one sentence)
+- What was changed and why
+- Any alternative approaches that were considered but rejected
+
+**Testing status**
+- Whether the full test suite was run and passed
+- New regression tests added
+- Any test failures or gaps flagged during review
+
+**Review concerns**
+- The review verdict (solid / tests incomplete / fix inadequate)
+- Any specific concerns or caveats raised
+- Whether concerns were addressed or are still outstanding
+
+**Outstanding items**
+- Assumptions that were never confirmed by the user
+- Decisions made without explicit user input
+- Follow-up work recommended but not yet done
+- Known limitations or edge cases not covered
+
+**PR status**
+- Whether a PR was created, and the URL
+- Whether it was created via `gh pr create` or a manual compare URL
+- What branch it targets
+
+### Step 4: Present the Summary
+
+Present the summary directly to the user using this structure:
+
+```markdown
+## Bugfix Workflow Summary
+
+**Issue:** [title or one-line description]
+**Status:** [where the workflow stopped — e.g., "PR created", "review complete, PR pending", "assessment only"]
+
+### Key Findings
+- [The most important things the user should know — 3-5 bullet points max]
+
+### Decisions Made
+- [Any choices made during the workflow, especially those made without explicit user input]
+
+### Outstanding Concerns
+- [Review caveats, untested edge cases, unconfirmed assumptions — or "None"]
+
+### Artifacts
+- [List of all artifact files with one-line descriptions]
+
+### PR
+- [PR URL and status, or "Not yet created"]
+```
+
+Keep it tight. The value of this summary is density — if it's as long as the
+artifacts themselves, it's not a summary.
+
+### Step 5: Write the Summary Artifact
+
+Save the summary to `artifacts/bugfix/summary.md`.
+
+## Rules
+
+- **Read, don't assume.** Base everything on what the artifacts actually say,
+  not on what you think happened during the workflow. If you weren't the agent
+  that ran the earlier phases, you don't know what happened — read the files.
+- **Flag what's missing.** If a phase was skipped or an artifact is absent,
+  say so. "No reproduction report was generated" is useful information.
+- **Don't editorialize.** Report what the artifacts say. If the review flagged
+  a concern, include it. Don't soften it or add your own interpretation.
+- **Keep it short.** The whole point is that nobody reads the full artifacts.
+  If your summary is more than ~40 lines of Markdown, cut it down.
+
+## Output
+
+- Summary presented directly to the user (inline)
+- Summary saved to `artifacts/bugfix/summary.md`
+
+## When This Phase Is Done
+
+The summary is the deliverable. Present it and stop — there is no next phase
+to recommend.
+
+If dispatched by the controller or speedrun, announce which file you are
+returning to and re-read it.
diff --git a/workflows/bugfix/.claude/skills/test/SKILL.md b/workflows/bugfix/.claude/skills/test/SKILL.md index 235093e3..46ffac8d 100755 --- a/workflows/bugfix/.claude/skills/test/SKILL.md +++ b/workflows/bugfix/.claude/skills/test/SKILL.md @@ -26,15 +26,58 @@ Systematically verify fixes and build test coverage that prevents recurrence. Yo ## Process -### Step 1: Create Regression Test +### Step 1: Survey Existing Test Patterns + +Before writing any tests, examine how the project already tests its code. +This prevents style clashes and ensures you use the right tools. + +- **Identify the test framework and runner:** + +```bash +# Check for test configuration (headers are printed per file found) +head -20 pytest.ini pyproject.toml setup.cfg 2>/dev/null +cat jest.config.* 2>/dev/null || cat vitest.config.* 2>/dev/null +cat *_test.go 2>/dev/null | head -5 +``` + +- **Read 2-3 existing test files** in the same area of the codebase as your + fix. Look for: + - How tests are structured (arrange/act/assert, BDD, table-driven) + - What assertion style is used (`assert`, `expect`, `require`) + - What mocking approach is used (fixtures, factories, `unittest.mock`, + `jest.mock`, `testify/mock`) + - Whether there are shared test helpers, fixtures, or utilities + - How tests are named (conventions for test functions and descriptions) + +- **Check for existing test fixtures and helpers:** + +```bash +# Common locations for test utilities +ls tests/conftest.py 2>/dev/null +ls tests/helpers/ 2>/dev/null || ls tests/utils/ 2>/dev/null +ls tests/fixtures/ 2>/dev/null +ls __tests__/setup.* 2>/dev/null +``` + +- **Prefer project infrastructure over generic approaches:** + - If the project has a `CliRunner` fixture → use it instead of `capsys` + - If the project has factory functions → use them instead of raw constructors + - If the project uses `httpx` test client → don't switch to `requests` + +### Step 2: Create Regression Test - Write a test that reproduces the original bug - Verify the test **fails**
without your fix (proves it catches the bug) - Verify the test **passes** with your fix (proves the fix works) - Use descriptive test names that reference the issue (e.g., `TestStatusUpdateRetry_Issue425`) -- Follow project test conventions and patterns +- **Match the style of existing tests** — use the same assertion patterns, + mock strategies, and naming conventions you found in Step 1 +- Prefer modern, readable APIs over legacy patterns: + - Python: use `call_args.args[0]` over `call_args[0][0]` (tuple indexing) + - Use framework-specific test utilities over generic ones + - Use named parameters over positional where the API supports it -### Step 2: Unit Testing +### Step 3: Unit Testing - Test the specific functions/methods that were modified - Cover all code paths in the fix @@ -44,44 +87,84 @@ Systematically verify fixes and build test coverage that prevents recurrence. Yo - **Test all states/phases/conditions**: If the fix involves state-dependent logic, ensure tests cover ALL possible states, not just the common ones. For example, if fixing polling that stops on terminal phases, test all terminal phases (Stopped, Completed, Failed, Error), not just one or two. - **Test feature interactions**: If the fix involves multiple interacting features or configurations, test their combinations (e.g., pagination + polling together, not separately) -### Step 3: Integration Testing +### Step 4: Integration Testing - Test the fix in realistic scenarios with dependent components - Verify end-to-end behavior matches expectations - Test interactions with databases, APIs, or external systems - Ensure the fix works in the full system context -### Step 4: Regression Testing +### Step 5: Run the Full Test Suite (MANDATORY) + +This step is not optional. Do not skip it. Do not run only your new tests. 
-- Run the **entire** test suite to catch unintended side effects -- Verify no existing tests were broken by the changes +- Run the **entire** test suite for the project's stack — unit, integration, and E2E: + +```bash +# Run the command that matches this project's language/tooling. +pytest tests/ # Python projects +npm test # Node.js projects +go test ./... # Go projects +``` + +- If the project has separate test directories (e.g., `tests/unit/`, + `tests/e2e/`, `tests/integration/`), run ALL of them. +- **Why this matters:** Your fix may break tests in unrelated areas. A config + validation change can break an E2E test that uses a mock config. A new + import can cause a circular dependency. These failures will surface in CI — + catch them here instead. - If tests fail, investigate whether: - The test was wrong (update it) - The fix broke something (revise the fix) - Test needs updating due to intentional behavior change (document it) +- **Do not proceed to the next step until the full suite passes.** If you + cannot get all tests passing, document the failures clearly. + +### Step 6: Lint and Format All Modified Files + +Run the project's formatters and linters on **every file you've touched** — +both source files and test files. Test code must meet the same formatting +standards as production code. + +```bash +# Identify all modified files +git diff --name-only HEAD + +# Then run the appropriate formatters on ALL of them: +# Python: black FILE1 FILE2 ... && isort FILE1 FILE2 ... && ruff check --fix FILE1 FILE2 ... +# Node.js: npx prettier --write FILE1 FILE2 ... && npx eslint --fix FILE1 FILE2 ... +# Go: gofmt -w FILE1 FILE2 ... +``` + +If the project has a pre-commit hook or a `make lint` / `npm run lint:fix` +target, use that instead — it will apply all configured checks at once. + +**Why this is a separate step:** It's common to run formatters after writing +source code in `/fix` but forget to re-run them after writing test code in +`/test`. 
This step ensures both are covered. -### Step 5: Manual Verification +### Step 7: Manual Verification - Manually execute the original reproduction steps from the reproduction report - Verify the expected behavior is now observed - Test related functionality to ensure no side effects - Test in multiple environments if applicable (dev, staging) -### Step 6: Performance Validation +### Step 8: Performance Validation - If the fix touches performance-sensitive code, measure impact - Profile before/after if the bug was performance-related - Ensure no performance degradation introduced - Document any performance changes in test report -### Step 7: Security Check +### Step 9: Security Check - Verify the fix doesn't introduce security vulnerabilities - Check for common issues: SQL injection, XSS, CSRF, etc. - Ensure error messages don't leak sensitive information - Validate input handling and sanitization -### Step 8: Document Test Results +### Step 10: Document Test Results Create comprehensive test report at `artifacts/bugfix/tests/verification.md` containing: @@ -96,7 +179,7 @@ Create comprehensive test report at `artifacts/bugfix/tests/verification.md` con - **Known Limitations**: Any edge cases not fully addressed - **Recommendations**: Follow-up work or monitoring needed -### Step 9: Report Results to the User +### Step 11: Report Results to the User After writing `artifacts/bugfix/tests/verification.md`: diff --git a/workflows/bugfix/CLAUDE.md b/workflows/bugfix/CLAUDE.md index 88f8262b..9a55676c 100755 --- a/workflows/bugfix/CLAUDE.md +++ b/workflows/bugfix/CLAUDE.md @@ -10,6 +10,7 @@ Systematic bug resolution through these phases: 6. **Review** (`/review`) — *(Optional)* Critically evaluate fix and tests 7. **Document** (`/document`) — Release notes and documentation 8. **PR** (`/pr`) — Submit a pull request +9. **Summary** (`/summary`) — Synthesize all artifacts into a final status report All phases are implemented as skills at `.claude/skills/{name}/SKILL.md`. 
The workflow controller at `.claude/skills/controller/SKILL.md` manages phase