@wbern/agent-instructions

TDD workflow commands for AI coding agents (Claude Code, OpenCode, Codex).

"TDD helps you to pay attention to the right issues at the right time so you can make your designs cleaner, you can refine your designs as you learn." — Kent Beck

AI coding agents like Claude Code, OpenCode, and Codex can be extended with project- or user-level instruction files. Claude Code and OpenCode expose them as slash commands (/foo → contents of foo.md). Codex exposes them as skills ($foo to mention, or /skills to list — Codex does not support user-defined /foo slash commands). This repo provides ready-made content for Test-Driven Development workflows that installs into each agent's native mechanism.

Custom commands are just a glorified copy-paste mechanism—but that simplicity is what makes them effective for establishing consistent development practices.

Instead of explaining TDD principles each session, type /red to write a failing test, /green to make it pass, /refactor to clean up. The commands guide Claude through each step methodically—you focus on what to build, Claude handles the how.

Want to go faster? Use /cycle to let Claude run the entire red-green-refactor sequence before checking in with you. For even more autonomy (your mileage may vary), /tdd gives Claude full discretion on when to advance between phases.

Also included are commands for commits, PRs, code reviews, and other tasks that come up during day-to-day development.

Installation

One-off run (no install):

npx @wbern/agent-instructions       # npm
pnpm dlx @wbern/agent-instructions  # pnpm

Install globally:

# Homebrew (tap once, then install)
brew tap wbern/tap
brew install wbern/tap/agent-instructions

# npm
npm install -g @wbern/agent-instructions

Per-agent install examples:

# Claude Code + OpenCode, project-scope slash commands
agent-instructions --scope=project --agent=both --overwrite

# Codex, user-scope skills
agent-instructions --scope=user --agent=codex --overwrite

The interactive installer lets you choose:

Feature flags: Enable optional integrations like Beads MCP
Scope: User-level (global) or project-level installation
Agent: claude, opencode, codex, or both (claude + opencode)

After installation, restart your agent if it's currently running.

Codex: skills, not slash commands

Codex CLI does not support user-defined /foo slash commands. When you install with --agent=codex, this package writes Codex skills to ~/.codex/skills/<name>/SKILL.md (user scope) or .codex/skills/<name>/SKILL.md (project scope). Invoke them by:

Typing $red, $green, $tdd, etc. to mention a skill explicitly
Running /skills to list installed skills
Letting Codex pick implicitly — it can select a skill when your prompt matches its description

The --agent=both shortcut targets Claude Code + OpenCode only. To install for Codex, pass --agent=codex explicitly.

Adding to Your Repository

To automatically regenerate commands when teammates install dependencies, add it as a dev dependency with a postinstall script:

npm install --save-dev @wbern/agent-instructions

Then add a postinstall script to your package.json:

{
  "scripts": {
    "postinstall": "agent-instructions --scope=project --agent=both --overwrite"
  },
  "devDependencies": {
    "@wbern/agent-instructions": "^4.0.0"
  }
}

This ensures commands are regenerated whenever anyone runs npm install, pnpm install, or yarn install.

CLI Options:

Option	Description
`--scope=project`	Installation scope (project, user, or a custom path)
`--agent=opencode`	Target agent (opencode, claude, codex, both)
`--prefix=my-`	Add prefix to command names
`--commands=commit,red,green`	Install only specific commands
`--skip-template-injection`	Skip injecting project template customizations
`--update-existing`	Only update already-installed commands
`--overwrite`	Overwrite conflicting files without prompting
`--skip-on-conflict`	Skip conflicting files without prompting
`--flags=beads,github`	Enable feature flags (beads, github, gitlab, etc.)
`--allowed-tools=Bash(git diff:),Bash(git status:)`	Pre-approve tools for commands (non-interactive mode)
`--skills=tdd,commit`	Generate selected commands as skills
`--help, -h`	Show help message
`--version, -v`	Show version number

Customizing Commands

You can inject project-specific instructions into generated commands by adding a template block to your AGENTS.md or CLAUDE.md file.

Both <claude-commands-template> and <agent-commands-template> tags are supported — use whichever fits your project.

Basic Usage

Add this to your project's AGENTS.md (or CLAUDE.md):

# My Project

Other instructions here...

<agent-commands-template>
## Project-Specific Rules

- Always use pnpm instead of npm
- Run tests with `pnpm test`
</agent-commands-template>

When you run agent-instructions, the template content is appended to all generated commands.

Targeting Specific Commands

Use the commands attribute to inject content only into specific commands:

<agent-commands-template commands="commit,pr-ask">
## Git Conventions

- Use conventional commits format
- Reference issue numbers in commits
</agent-commands-template>

This injects the content only into commit.md and pr-ask.md.

File Priority

The generator checks CLAUDE.md first, then AGENTS.md. Only the first file found is used.

Which Command Should I Use?

Main Workflow

This is the core TDD workflow. Additional utility commands (worktrees, spikes, etc.) are listed in Available Commands below.

flowchart TB
    Start([Start New Work])

    Start --> Step1[<b>1. PLAN</b>]

    Step1 --> Issue[📋 /issue<br/>Have GitHub issue<br/><i>Requires: GitHub MCP</i>]
    Step1 --> CreateIssues[📝 /create-issues<br/>No issue yet<br/><i>Optional: Beads MCP</i>]

    Issue --> Step2[<b>2. CODE with TDD</b>]
    CreateIssues --> Step2

    Step2 -->|Manual| Red[🔴 /red<br/>Write failing test]
    Red --> Green[🟢 /green<br/>Make it pass]
    Green --> Refactor[🔵 /refactor<br/>Clean up code]
    Refactor --> MoreTests{More tests?}

    Step2 -->|Automated| Cycle[🔄 /cycle<br/>Runs red+green+refactor]
    Cycle --> MoreTests

    MoreTests -->|Yes| Step2
    MoreTests -->|No| Step3

    Step3[<b>3. SHIP</b>]

    Step3 --> Commit[📦 /commit<br/>Create commit]
    Commit --> ShipChoice{How to<br/>merge?}

    ShipChoice -->|Simple change| Ship[🚢 /pr-ship<br/>Direct to main<br/><i>Requires: GitHub MCP</i>]
    ShipChoice -->|Show team| Show[👀 /pr-show<br/>Auto-merge + notify<br/><i>Requires: GitHub MCP</i>]
    ShipChoice -->|Needs review| Ask[💬 /pr-ask<br/>Create PR<br/><i>Requires: GitHub MCP</i>]

    Ship --> Done([✅ Done])
    Show --> Done
    Ask --> Done

Available Commands

Planning

/issue - Analyze GitHub issue and create TDD implementation plan
/create-issues - Create implementation plan from feature/requirement with PRD-style discovery and TDD acceptance criteria

Test-Driven Development

/spike - Execute TDD Spike Phase - exploratory coding to understand problem space before TDD
/tdd - Remind agent about TDD approach and continue conversation
/red - Execute TDD Red Phase - write ONE failing test
/green - Execute TDD Green Phase - write minimal implementation to pass the failing test
/refactor - Execute TDD Refactor Phase - improve code structure while keeping tests green
/cycle - Execute complete TDD cycle - Red, Green, and Refactor phases in sequence
/simplify - Reduce code complexity while keeping tests green
/tdd-review - Review test suite quality against FIRST principles and TDD anti-patterns

Workflow

/commit - Create a git commit following project standards
/busycommit - Create multiple atomic git commits, one logical change at a time
/pr - Creates a pull request using GitHub MCP
/summarize - Summarize conversation progress and next steps
/gap - Analyze conversation context for unaddressed items and gaps
/code-review - Code review using dynamic category detection and domain-specific analysis
/polish - Review and address issues in existing code - fix problems or justify skipping

Ship / Show / Ask

/pr-ship - Ship code directly to main - for small, obvious changes that don't need review
/pr-show - Show code to team with auto-merge - for changes that should be visible but don't need approval
/pr-ask - Request team review and approval - for complex changes needing discussion

Worktree Management

/worktree-setup - Initial setup of a repo for the worktree-friendly [repo]/main layout, with optional main.2/main.3 parallel copies for trunk-based work
/worktree-add - Add a new git worktree from branch name or issue URL, copy settings, install deps, and open in current IDE
/worktree-cleanup - Clean up merged worktrees by verifying PR/issue status, consolidating settings, and removing stale worktrees

Utilities

/beepboop - Communicate AI-generated content with transparent attribution
/add-command - Guide for creating new slash commands
/kata - Generate a TDD practice challenge with boilerplate test setup
/create-adr - Create a new Architecture Decision Record (ADR)
/research - Research a problem in parallel via web docs, web search, codebase exploration, and deep ultrathink
/commit-hook-checklist - Audit commit-hook automation as gates against AI slop and broken commits (Node.js, Go, polyglot)
/upgrade-deps - Check for dependency upgrades and assess safety before updating
/gastown-setup - Bootstrap a new Gas Town installation, optionally with the wbern/gastown-me-and-my-crew preset for manual-mode operation

Getting Started

Quick Start: Your First TDD Cycle

Here's a simple example to get you started with the TDD workflow:

1. Write a failing test (/red)

You: /red add a function that validates email addresses

Claude: I'll write a failing test for email validation.
[Creates test file with a test that checks email validation]
[Runs test - shows it failing because function doesn't exist yet]

2. Make it pass (/green)

You: /green

Claude: I'll implement the minimal code to pass the test.
[Creates the email validation function with basic implementation]
[Runs test - shows it passing]

3. Refactor for quality (/refactor)

You: /refactor extract regex pattern to a constant

Claude: I'll refactor while keeping tests green.
[Extracts magic values, improves code structure]
[Runs tests - confirms they still pass]

Complete Workflow Example

Starting from a GitHub issue:

/issue 123

Claude analyzes the GitHub issue and creates a TDD implementation plan showing what tests to write.

Running a full TDD cycle:

/cycle implement user authentication with password hashing

Claude executes the complete red-green-refactor cycle: writes a failing test, implements it, then refactors.

Individual phases for more control:

/red test that users can't login with wrong password
/green
/refactor move password verification to separate function

Committing and creating PRs:

/commit

Claude reviews changes, drafts a commit message following project standards, and creates the commit.

/pr-ask

Claude analyzes commits, creates a PR with summary and test plan.

What to Expect

/red - Claude writes ONE failing test based on your description
/green - Claude writes minimal implementation to pass the current failing test
/refactor - Claude improves code structure without changing behavior
/cycle - Claude runs all three phases in sequence for a complete feature

The commands enforce TDD discipline: you can't refactor with failing tests, can't write multiple tests at once, and implementation must match test requirements.

Example Conversations

Conversation: Solving "Find Closest to Zero" Code Challenge

Summary: User works through solving a code challenge: given a list of integers, find the closest to zero. If there is a tie, choose the positive value. (challenge from cyber-dojo.org)

Commands: /red /green /refactor Legend: 🔴 Red (test fails) | 🟢 Green (test passes) | 🔵 Refactor (cleanup)

👤 /red single element [1] returns 1

🔴 writes test

expect(closestToZero([1])).toBe(1);

FAIL: closestToZero is not defined

👤 /green

🟢 returns hardcoded 1

function closestToZero(nums: number[]): number {
  return 1; // minimal - only need to pass current test
}

✅ Test passes

👤 /red two numbers [3, 2] returns 2

🔴 writes test

expect(closestToZero([3, 2])).toBe(2);

FAIL: expected 1 to be 2

👤 /green

🟢 sorts and returns first

function closestToZero(nums: number[]): number {
  return nums.sort((a, b) => a - b)[0]; // minimal
}

✅ 2/2 tests pass

👤 /red with negative [-3, 2] returns 2

🔴 writes test

expect(closestToZero([-3, 2])).toBe(2);

FAIL: expected -3 to be 2 (regular sort puts -3 first)

👤 /green

🟢 sorts by absolute value

function closestToZero(nums: number[]): number {
  return nums.sort((a, b) => Math.abs(a) - Math.abs(b))[0];
}

✅ 3/3 tests pass

👤 /red exact tie [-2, 2] returns positive 2

🔴 writes test

expect(closestToZero([-2, 2])).toBe(2);

FAIL: expected -2 to be 2 (returns first when tied)

👤 /green

🟢 adds tie-breaking: prefer positive

function closestToZero(nums: number[]): number {
  return nums.sort((a, b) => {
    const diff = Math.abs(a) - Math.abs(b);
    return diff !== 0 ? diff : b - a;
  })[0];
}

✅ 4/4 tests pass

👤 /refactor extract helper and use iteration instead of sort

🔵 extracts helper function for comparison logic

function isCloser(candidate: number, current: number): boolean {
  const candAbs = Math.abs(candidate);
  const currAbs = Math.abs(current);
  return candAbs < currAbs || (candAbs === currAbs && candidate > current);
}

function closestToZero(nums: number[]): number {
  if (nums.length === 0) return 0;
  let closest = nums[0];
  for (const num of nums) {
    if (isCloser(num, closest)) {
      closest = num;
    }
  }
  return closest;
}

✅ 4/4 tests still pass

👤 /refactor

🔵 uses reduce for more functional style

function closestToZero(nums: number[]): number {
  if (nums.length === 0) return 0;
  return nums.reduce((closest, num) =>
    isCloser(num, closest) ? num : closest,
  );
}

✅ 4/4 tests still pass

Transparency: @wbern's Usage Stats (Apr 8 – May 15, 2026)

Counted across all Claude Code sessions, filtered to commands shipped by this repo. Total: 1,286 invocations over ~5 weeks. Movement shown vs. the Jan 20 – Feb 3, 2026 sample (previously mislabeled as 2025).

Command	Usage	Movement
/code-review	27%	▲ up from 13%
/tdd	20%	▼ down from 26%
/research	18%	▲ up from 15%
/commit	15%	▲ up from 8%
/gap	12%	▼ down from 15%
/polish	4%	▲ up from 2%
/red	1%	▼ down from 2%
/green	1%	▲ new (not in prior sample)
/summarize	<1%	≈ flat (was 1%)
/refactor	<1%	▼ down from 5%

Other commands from the prior sample that fell out of regular use this window: /create-issues (was 4%), /issue (2%), /worktree-add (2%), /pr (1%), /spike (1%), /tdd-review (1%), /create-adr (1%).

The rest (/kata, /busycommit, /beepboop, /simplify, /pr-show, /pr-ship, /pr-ask, /worktree-cleanup, /worktree-setup, /gastown-setup, /upgrade-deps, /commit-hook-checklist, /add-command) didn't see use in this window — kept around because they earn their keep occasionally, even if not weekly.

Contributing

See CONTRIBUTING.md for development workflow, build system, and fragment management.

Credits

TDD workflow instructions adapted from TDD Guard by Nizar.

FIRST principles and test quality criteria from TDD Manifesto.

Example kata from Cyber-Dojo.

Related Projects

citypaul/.dotfiles - Claude Code configuration with TDD workflows and custom commands
nizos/tdd-guard - Original TDD Guard instructions for Claude

Name		Name	Last commit message	Last commit date
Latest commit History 185 Commits
.claude		.claude
.github		.github
.husky		.husky
.opencode		.opencode
example-conversations		example-conversations
scripts		scripts
src		src
.gitattributes		.gitattributes
.gitignore		.gitignore
.jscpd.json		.jscpd.json
.markdownlint.json		.markdownlint.json
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
biome.json		biome.json
knip.json		knip.json
lint-staged.config.js		lint-staged.config.js
logo-dark.svg		logo-dark.svg
logo.svg		logo.svg
package.json		package.json
tsconfig.json		tsconfig.json
tsup.config.ts		tsup.config.ts
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

@wbern/agent-instructions

Installation

Codex: skills, not slash commands

Adding to Your Repository

Customizing Commands

Basic Usage

Targeting Specific Commands

File Priority

Which Command Should I Use?

Main Workflow

Available Commands

Planning

Test-Driven Development

Workflow

Ship / Show / Ask

Worktree Management

Utilities

Getting Started

Quick Start: Your First TDD Cycle

Complete Workflow Example

What to Expect

Example Conversations

Conversation: Solving "Find Closest to Zero" Code Challenge

Transparency: @wbern's Usage Stats (Apr 8 – May 15, 2026)

Contributing

Credits

Related Projects

About

Uh oh!

Releases 41

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

@wbern/agent-instructions

Installation

Codex: skills, not slash commands

Adding to Your Repository

Customizing Commands

Basic Usage

Targeting Specific Commands

File Priority

Which Command Should I Use?

Main Workflow

Available Commands

Planning

Test-Driven Development

Workflow

Ship / Show / Ask

Worktree Management

Utilities

Getting Started

Quick Start: Your First TDD Cycle

Complete Workflow Example

What to Expect

Example Conversations

Conversation: Solving "Find Closest to Zero" Code Challenge

Transparency: @wbern's Usage Stats (Apr 8 – May 15, 2026)

Contributing

Credits

Related Projects

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 41

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages