TDD workflow commands for AI coding agents (Claude Code, OpenCode, Codex).
"TDD helps you to pay attention to the right issues at the right time so you can make your designs cleaner, you can refine your designs as you learn." - Kent Beck
AI coding agents like Claude Code, OpenCode, and Codex can be extended with project- or user-level instruction files. Claude Code and OpenCode expose them as slash commands (/foo → contents of foo.md). Codex exposes them as skills ($foo to mention, or /skills to list); Codex does not support user-defined /foo slash commands. This repo provides ready-made content for Test-Driven Development workflows that installs into each agent's native mechanism.
Custom commands are just a glorified copy-paste mechanism, but that simplicity is what makes them effective for establishing consistent development practices.
Instead of explaining TDD principles each session, type /red to write a failing test, /green to make it pass, /refactor to clean up. The commands guide Claude through each step methodically: you focus on what to build, Claude handles the how.
Want to go faster? Use /cycle to let Claude run the entire red-green-refactor sequence before checking in with you. For even more autonomy (your mileage may vary), /tdd gives Claude full discretion on when to advance between phases.
Also included are commands for commits, PRs, code reviews, and other tasks that come up during day-to-day development.
One-off run (no install):
npx @wbern/agent-instructions # npm
pnpm dlx @wbern/agent-instructions # pnpm
Install globally:
# Homebrew (tap once, then install)
brew tap wbern/tap
brew install wbern/tap/agent-instructions
# npm
npm install -g @wbern/agent-instructions
Per-agent install examples:
# Claude Code + OpenCode, project-scope slash commands
agent-instructions --scope=project --agent=both --overwrite
# Codex, user-scope skills
agent-instructions --scope=user --agent=codex --overwrite
The interactive installer lets you choose:
- Feature flags: Enable optional integrations like Beads MCP
- Scope: User-level (global) or project-level installation
- Agent: claude, opencode, codex, or both (claude + opencode)
After installation, restart your agent if it's currently running.
Codex CLI does not support user-defined /foo slash commands. When you install with --agent=codex, this package writes Codex skills to ~/.codex/skills/<name>/SKILL.md (user scope) or .codex/skills/<name>/SKILL.md (project scope). Invoke them by:
- Typing $red, $green, $tdd, etc. to mention a skill explicitly
- Running /skills to list installed skills
- Letting Codex pick implicitly: it can select a skill when your prompt matches its description
The --agent=both shortcut targets Claude Code + OpenCode only. To install for Codex, pass --agent=codex explicitly.
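As a rough illustration of the Codex skill locations described above, the sketch below builds the SKILL.md path for a given scope and skill name. The `Scope` type and `codexSkillPath` helper are hypothetical names made up for this example; they are not part of the package's API, and the `~` is left unexpanded on purpose.

```typescript
// Hypothetical helper: mirrors the two skill locations described above.
type Scope = "user" | "project";

function codexSkillPath(scope: Scope, name: string): string {
  // User scope lives under the home directory, project scope under the repo root.
  const root = scope === "user" ? "~/.codex" : ".codex";
  return `${root}/skills/${name}/SKILL.md`;
}

// How a skill is mentioned explicitly in a Codex prompt ($red, $green, ...).
function codexMention(name: string): string {
  return `$${name}`;
}
```

For example, `codexSkillPath("project", "red")` yields `.codex/skills/red/SKILL.md`, and `codexMention("red")` yields `$red`.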
To automatically regenerate commands when teammates install dependencies, add the package as a dev dependency:
npm install --save-dev @wbern/agent-instructions
Then add a postinstall script to your package.json:
{
  "scripts": {
    "postinstall": "agent-instructions --scope=project --agent=both --overwrite"
  },
  "devDependencies": {
    "@wbern/agent-instructions": "^4.0.0"
  }
}
This ensures commands are regenerated whenever anyone runs npm install, pnpm install, or yarn install.
CLI Options:
| Option | Description |
|---|---|
| `--scope=project` | Installation scope (project, user, or a custom path) |
| `--agent=opencode` | Target agent (opencode, claude, codex, both) |
| `--prefix=my-` | Add prefix to command names |
| `--commands=commit,red,green` | Install only specific commands |
| `--skip-template-injection` | Skip injecting project template customizations |
| `--update-existing` | Only update already-installed commands |
| `--overwrite` | Overwrite conflicting files without prompting |
| `--skip-on-conflict` | Skip conflicting files without prompting |
| `--flags=beads,github` | Enable feature flags (beads, github, gitlab, etc.) |
| `--allowed-tools=Bash(git diff:*),Bash(git status:*)` | Pre-approve tools for commands (non-interactive mode) |
| `--skills=tdd,commit` | Generate selected commands as skills |
| `--help`, `-h` | Show help message |
| `--version`, `-v` | Show version number |
You can inject project-specific instructions into generated commands by adding a template block to your AGENTS.md or CLAUDE.md file.
Both <claude-commands-template> and <agent-commands-template> tags are supported; use whichever fits your project.
Add this to your project's AGENTS.md (or CLAUDE.md):
# My Project
Other instructions here...
<agent-commands-template>
## Project-Specific Rules
- Always use pnpm instead of npm
- Run tests with `pnpm test`
</agent-commands-template>
When you run agent-instructions, the template content is appended to all generated commands.
Use the commands attribute to inject content only into specific commands:
<agent-commands-template commands="commit,pr-ask">
## Git Conventions
- Use conventional commits format
- Reference issue numbers in commits
</agent-commands-template>
This injects the content only into commit.md and pr-ask.md.
The generator checks CLAUDE.md first, then AGENTS.md. Only the first file found is used.
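The injection rules above can be pictured with a small sketch. This is not the generator's actual implementation, only one way the described behavior could work: `pickTemplateSource`, `extractTemplates`, `appendTemplates`, and the `TemplateBlock` shape are all hypothetical names, and the regular expression assumes the tags appear exactly as in the examples above.

```typescript
// Hypothetical model of template injection; names and parsing are assumptions.
interface TemplateBlock {
  content: string;
  commands: string[] | null; // null means "inject into every command"
}

// CLAUDE.md is checked first, then AGENTS.md; only the first file found is used.
function pickTemplateSource(files: Record<string, string | undefined>): string | undefined {
  for (const name of ["CLAUDE.md", "AGENTS.md"]) {
    const contents = files[name];
    if (contents !== undefined) return contents;
  }
  return undefined;
}

// Collect every template block, accepting either supported tag name.
function extractTemplates(markdown: string): TemplateBlock[] {
  const pattern =
    /<(claude-commands-template|agent-commands-template)(?:\s+commands="([^"]*)")?\s*>([\s\S]*?)<\/\1>/g;
  const blocks: TemplateBlock[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const commands =
      match[2] !== undefined
        ? match[2].split(",").map((name) => name.trim()).filter((name) => name.length > 0)
        : null;
    blocks.push({ content: match[3].trim(), commands });
  }
  return blocks;
}

// Append the blocks that apply to one command to that command's body.
function appendTemplates(commandName: string, body: string, blocks: TemplateBlock[]): string {
  const applicable = blocks.filter(
    (block) => block.commands === null || block.commands.indexOf(commandName) !== -1,
  );
  return [body].concat(applicable.map((block) => block.content)).join("\n\n");
}
```

With the two example templates above in one file, a command named commit would receive both blocks, while red would receive only the block without a commands attribute.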
This is the core TDD workflow. Additional utility commands (worktrees, spikes, etc.) are listed in Available Commands below.
flowchart TB
Start([Start New Work])
Start --> Step1[<b>1. PLAN</b>]
Step1 --> Issue["/issue<br/>Have GitHub issue<br/><i>Requires: GitHub MCP</i>"]
Step1 --> CreateIssues["/create-issues<br/>No issue yet<br/><i>Optional: Beads MCP</i>"]
Issue --> Step2[<b>2. CODE with TDD</b>]
CreateIssues --> Step2
Step2 -->|Manual| Red[🔴 /red<br/>Write failing test]
Red --> Green[🟢 /green<br/>Make it pass]
Green --> Refactor[🔵 /refactor<br/>Clean up code]
Refactor --> MoreTests{More tests?}
Step2 -->|Automated| Cycle["/cycle<br/>Runs red+green+refactor"]
Cycle --> MoreTests
MoreTests -->|Yes| Step2
MoreTests -->|No| Step3
Step3[<b>3. SHIP</b>]
Step3 --> Commit[📦 /commit<br/>Create commit]
Commit --> ShipChoice{How to<br/>merge?}
ShipChoice -->|Simple change| Ship[🚢 /pr-ship<br/>Direct to main<br/><i>Requires: GitHub MCP</i>]
ShipChoice -->|Show team| Show["/pr-show<br/>Auto-merge + notify<br/><i>Requires: GitHub MCP</i>"]
ShipChoice -->|Needs review| Ask[💬 /pr-ask<br/>Create PR<br/><i>Requires: GitHub MCP</i>]
Ship --> Done([✅ Done])
Show --> Done
Ask --> Done
- /issue - Analyze GitHub issue and create TDD implementation plan
- /create-issues - Create implementation plan from feature/requirement with PRD-style discovery and TDD acceptance criteria

- /spike - Execute TDD Spike Phase - exploratory coding to understand problem space before TDD
- /tdd - Remind agent about TDD approach and continue conversation
- /red - Execute TDD Red Phase - write ONE failing test
- /green - Execute TDD Green Phase - write minimal implementation to pass the failing test
- /refactor - Execute TDD Refactor Phase - improve code structure while keeping tests green
- /cycle - Execute complete TDD cycle - Red, Green, and Refactor phases in sequence
- /simplify - Reduce code complexity while keeping tests green
- /tdd-review - Review test suite quality against FIRST principles and TDD anti-patterns

- /commit - Create a git commit following project standards
- /busycommit - Create multiple atomic git commits, one logical change at a time
- /pr - Creates a pull request using GitHub MCP
- /summarize - Summarize conversation progress and next steps
- /gap - Analyze conversation context for unaddressed items and gaps
- /code-review - Code review using dynamic category detection and domain-specific analysis
- /polish - Review and address issues in existing code - fix problems or justify skipping

- /pr-ship - Ship code directly to main - for small, obvious changes that don't need review
- /pr-show - Show code to team with auto-merge - for changes that should be visible but don't need approval
- /pr-ask - Request team review and approval - for complex changes needing discussion

- /worktree-setup - Initial setup of a repo for the worktree-friendly [repo]/main layout, with optional main.2/main.3 parallel copies for trunk-based work
- /worktree-add - Add a new git worktree from branch name or issue URL, copy settings, install deps, and open in current IDE
- /worktree-cleanup - Clean up merged worktrees by verifying PR/issue status, consolidating settings, and removing stale worktrees

- /beepboop - Communicate AI-generated content with transparent attribution
- /add-command - Guide for creating new slash commands
- /kata - Generate a TDD practice challenge with boilerplate test setup
- /create-adr - Create a new Architecture Decision Record (ADR)
- /research - Research a problem in parallel via web docs, web search, codebase exploration, and deep ultrathink
- /commit-hook-checklist - Audit commit-hook automation as gates against AI slop and broken commits (Node.js, Go, polyglot)
- /upgrade-deps - Check for dependency upgrades and assess safety before updating
- /gastown-setup - Bootstrap a new Gas Town installation, optionally with the wbern/gastown-me-and-my-crew preset for manual-mode operation
Here's a simple example to get you started with the TDD workflow:
1. Write a failing test (/red)
You: /red add a function that validates email addresses
Claude: I'll write a failing test for email validation.
[Creates test file with a test that checks email validation]
[Runs test - shows it failing because function doesn't exist yet]
2. Make it pass (/green)
You: /green
Claude: I'll implement the minimal code to pass the test.
[Creates the email validation function with basic implementation]
[Runs test - shows it passing]
3. Refactor for quality (/refactor)
You: /refactor extract regex pattern to a constant
Claude: I'll refactor while keeping tests green.
[Extracts magic values, improves code structure]
[Runs tests - confirms they still pass]
Starting from a GitHub issue:
/issue 123
Claude analyzes the GitHub issue and creates a TDD implementation plan showing what tests to write.
Running a full TDD cycle:
/cycle implement user authentication with password hashing
Claude executes the complete red-green-refactor cycle: writes a failing test, implements it, then refactors.
Individual phases for more control:
/red test that users can't login with wrong password
/green
/refactor move password verification to separate function
Committing and creating PRs:
/commit
Claude reviews changes, drafts a commit message following project standards, and creates the commit.
/pr-ask
Claude analyzes commits, creates a PR with summary and test plan.
- /red - Claude writes ONE failing test based on your description
- /green - Claude writes minimal implementation to pass the current failing test
- /refactor - Claude improves code structure without changing behavior
- /cycle - Claude runs all three phases in sequence for a complete feature
The commands enforce TDD discipline: you can't refactor with failing tests, can't write multiple tests at once, and implementation must match test requirements.
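One way to picture these rules is as a small guard over the current number of failing tests. The commands themselves are prompt text rather than code, so this is only an illustrative model: `Phase`, `blockedReason`, and the failing-test count are hypothetical names, and the rule that implementation must match test requirements is not modeled here.

```typescript
// Hypothetical model of the phase rules; the real commands are prompts, not code.
type Phase = "red" | "green" | "refactor";

// Returns why a phase should not start, or null when it may proceed.
function blockedReason(phase: Phase, failingTests: number): string | null {
  if (phase === "red") {
    // Only one failing test at a time: finish the current one first.
    return failingTests > 0 ? "a test is already failing; run /green before adding another" : null;
  }
  if (phase === "green") {
    // Green needs something to make pass.
    return failingTests === 0 ? "no failing test to make pass; run /red first" : null;
  }
  // Refactoring is only allowed while every test passes.
  return failingTests > 0 ? "tests must be green before refactoring" : null;
}
```

For instance, `blockedReason("refactor", 1)` returns a reason string, while `blockedReason("red", 0)` returns null.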
Summary: User works through solving a code challenge: given a list of integers, find the closest to zero. If there is a tie, choose the positive value. (challenge from cyber-dojo.org)
Commands: /red /green /refactor
Legend: 🔴 Red (test fails) | 🟢 Green (test passes) | 🔵 Refactor (cleanup)
👤 /red single element [1] returns 1
🔴 writes test
expect(closestToZero([1])).toBe(1);
FAIL: closestToZero is not defined
👤 /green
🟢 returns hardcoded 1
function closestToZero(nums: number[]): number {
  return 1; // minimal - only need to pass current test
}
✓ Test passes
👤 /red two numbers [3, 2] returns 2
🔴 writes test
expect(closestToZero([3, 2])).toBe(2);
FAIL: expected 1 to be 2
👤 /green
🟢 sorts and returns first
function closestToZero(nums: number[]): number {
  return nums.sort((a, b) => a - b)[0]; // minimal
}
✓ 2/2 tests pass
👤 /red with negative [-3, 2] returns 2
🔴 writes test
expect(closestToZero([-3, 2])).toBe(2);
FAIL: expected -3 to be 2 (regular sort puts -3 first)
👤 /green
🟢 sorts by absolute value
function closestToZero(nums: number[]): number {
  return nums.sort((a, b) => Math.abs(a) - Math.abs(b))[0];
}
✓ 3/3 tests pass
👤 /red exact tie [-2, 2] returns positive 2
🔴 writes test
expect(closestToZero([-2, 2])).toBe(2);
FAIL: expected -2 to be 2 (returns first when tied)
👤 /green
🟢 adds tie-breaking: prefer positive
function closestToZero(nums: number[]): number {
  return nums.sort((a, b) => {
    const diff = Math.abs(a) - Math.abs(b);
    return diff !== 0 ? diff : b - a;
  })[0];
}
✓ 4/4 tests pass
👤 /refactor extract helper and use iteration instead of sort
🔵 extracts helper function for comparison logic
function isCloser(candidate: number, current: number): boolean {
  const candAbs = Math.abs(candidate);
  const currAbs = Math.abs(current);
  return candAbs < currAbs || (candAbs === currAbs && candidate > current);
}
function closestToZero(nums: number[]): number {
  if (nums.length === 0) return 0;
  let closest = nums[0];
  for (const num of nums) {
    if (isCloser(num, closest)) {
      closest = num;
    }
  }
  return closest;
}
✓ 4/4 tests still pass
👤 /refactor
🔵 uses reduce for more functional style
function closestToZero(nums: number[]): number {
  if (nums.length === 0) return 0;
  return nums.reduce((closest, num) =>
    isCloser(num, closest) ? num : closest,
  );
}
✓ 4/4 tests still pass
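To check the end state of the walkthrough without a test runner, the sketch below bundles the final `isCloser` and `closestToZero` from the session with the four cases that were written along the way. The `kataCases` table and `runKataCases` helper are hypothetical additions for this example; the session itself used `expect(...).toBe(...)` assertions.

```typescript
// Final helper from the walkthrough: closer to zero wins, positive wins ties.
function isCloser(candidate: number, current: number): boolean {
  const candAbs = Math.abs(candidate);
  const currAbs = Math.abs(current);
  return candAbs < currAbs || (candAbs === currAbs && candidate > current);
}

// Final implementation from the walkthrough (reduce-based).
function closestToZero(nums: number[]): number {
  if (nums.length === 0) return 0;
  return nums.reduce((closest, num) => (isCloser(num, closest) ? num : closest));
}

// Hypothetical table of the four cases added during the /red steps.
const kataCases: Array<{ input: number[]; expected: number }> = [
  { input: [1], expected: 1 },      // single element
  { input: [3, 2], expected: 2 },   // two numbers
  { input: [-3, 2], expected: 2 },  // with negative
  { input: [-2, 2], expected: 2 },  // exact tie prefers positive
];

// Returns a description of every failing case; an empty list means all pass.
function runKataCases(): string[] {
  const failures: string[] = [];
  for (const { input, expected } of kataCases) {
    const actual = closestToZero(input);
    if (actual !== expected) {
      failures.push(`closestToZero([${input.join(", ")}]) returned ${actual}, expected ${expected}`);
    }
  }
  return failures;
}
```

Calling `runKataCases()` returns an empty array when all four cases pass, which matches the 4/4 result reported above.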
Counted across all Claude Code sessions, filtered to commands shipped by this repo. Total: 1,286 invocations over ~5 weeks. Movement shown vs. the Jan 20 to Feb 3, 2026 sample (previously mislabeled as 2025).
| Command | Usage | Movement |
|---|---|---|
| /code-review | 27% | ▲ up from 13% |
| /tdd | 20% | ▼ down from 26% |
| /research | 18% | ▲ up from 15% |
| /commit | 15% | ▲ up from 8% |
| /gap | 12% | ▼ down from 15% |
| /polish | 4% | ▲ up from 2% |
| /red | 1% | ▼ down from 2% |
| /green | 1% | ▲ new (not in prior sample) |
| /summarize | <1% | → flat (was 1%) |
| /refactor | <1% | ▼ down from 5% |
Other commands from the prior sample that fell out of regular use this window: /create-issues (was 4%), /issue (2%), /worktree-add (2%), /pr (1%), /spike (1%), /tdd-review (1%), /create-adr (1%).
The rest (/kata, /busycommit, /beepboop, /simplify, /pr-show, /pr-ship, /pr-ask, /worktree-cleanup, /worktree-setup, /gastown-setup, /upgrade-deps, /commit-hook-checklist, /add-command) didn't see use in this window. They are kept around because they earn their keep occasionally, even if not weekly.
See CONTRIBUTING.md for development workflow, build system, and fragment management.
TDD workflow instructions adapted from TDD Guard by Nizar.
FIRST principles and test quality criteria from TDD Manifesto.
Example kata from Cyber-Dojo.
- citypaul/.dotfiles - Claude Code configuration with TDD workflows and custom commands
- nizos/tdd-guard - Original TDD Guard instructions for Claude