2 changes: 1 addition & 1 deletion PRODUCT.md
@@ -140,7 +140,7 @@ Output: a single append-only markdown file with blind research, structured debat
- [ ] More adapters (Cursor, Windsurf, Aider)

### Future
- [ ] 3+ participant panels with role assignment
- [x] ~~3+ participant panels with role assignment~~: eval showed that 5-agent panels are counterproductive (92% coverage vs. 97% for 2-agent, at 2x the cost). Cross-model diversity matters more than agent count.
- [ ] Async mode (participants contribute hours/days apart)
- [ ] Web viewer for discussion logs
- [ ] Cost tracking (tokens per discussion)
10 changes: 9 additions & 1 deletion README.md
@@ -60,12 +60,20 @@ This installs the `/discuss` command and the council orchestrator script to `~/.

That's it. Two AI instances debate the topic with full reasoning and produce a consensus. Everything runs from one terminal — no copy-pasting between windows, no manual coordination.

By default, both debaters use the same AI you're running the command in — two Claudes in Claude Code, two Codex instances in Codex. To run a cross-model debate:
By default, both debaters use the same AI you're running the command in: two Claudes in Claude Code, two Codex instances in Codex. **For best results, use cross-model debates.** Different models have different blind spots, and in our evals Claude + Codex produced better analysis than two instances of either model alone (see [eval results](tests/eval-results/)):

```
/discuss "Should we use a monorepo?" monorepo.md --agents claude,codex
```

### Discuss a PR

```
/discuss --pr 123
```

Two agents debate the design decisions in a pull request: not code style, but architectural tradeoffs, approach, and alternatives. The consensus is posted as a PR comment when done.

### From Codex CLI

Point Codex to the adapter file in this repo:
107 changes: 104 additions & 3 deletions adapters/claude/.claude/commands/discuss.md
@@ -8,6 +8,7 @@ A single command for structured, turn-based AI discussions. Supports three modes
/discuss "topic" file.md → council mode (default): orchestrates two Claude instances debating to completion
/discuss "topic" file.md --agents claude,codex → council with cross-model debate (Claude vs Codex)
/discuss "topic" file.md --mode external → external mode: creates discussion file, waits for another AI to join manually
/discuss --pr 123 → PR discussion: debate the design decisions in a pull request
/discuss file.md → join mode: joins an existing discussion as a participant
```

@@ -46,13 +47,14 @@ When invoked, print this to the user so they know what's happening:

Parse the user's input to determine the mode:

1. If a **topic string in quotes** AND a **file path** are provided:
1. If `--pr NUMBER` is provided → PR discussion mode (see below)
2. If a **topic string in quotes** AND a **file path** are provided:
- Check for `--mode external` flag → external mode
- Check for `--agents X,Y` flag (council mode only) → set `agent_a_cli` and `agent_b_cli` (e.g. `--agents claude,codex`)
- Check for `--lens LENS_ID` flag (council mode only) → set `lens_id` directly, skip picker. Validate against the IDs in `~/.claude/scripts/prompts/lenses.json`. If the ID is not found, error with the list of valid IDs from the registry.
- Otherwise → council mode (default)
2. If **only a file path** is provided and the file exists → join mode
3. If **only a file path** is provided and the file does NOT exist → error: "File not found. To start a new discussion, provide a topic: `/discuss \"your topic\" file.md`"
3. If **only a file path** is provided and the file exists → join mode
4. If **only a file path** is provided and the file does NOT exist → error: "File not found. To start a new discussion, provide a topic: `/discuss \"your topic\" file.md`"
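
The precedence above can be sketched as a small Node function. This is an illustration only, not the command's actual implementation: `detectMode`, its return shape, the injected `fileExists` callback, and the final fallback message are all hypothetical, and the arguments are assumed to arrive already tokenized (topic without its quotes). Validation of `--lens` against `lenses.json` is left out.

```javascript
// Hypothetical helper: classify already-tokenized /discuss arguments into a mode.
// `fileExists` is injected so the sketch can run without touching the disk.
function detectMode(args, fileExists) {
  const valueFlags = new Set(["--pr", "--mode", "--agents", "--lens"]);
  const flags = {};
  const positional = [];

  for (let i = 0; i < args.length; i++) {
    if (valueFlags.has(args[i])) {
      flags[args[i]] = args[i + 1]; // the flag's value is the next token
      i++;
    } else {
      positional.push(args[i]);
    }
  }

  // Rule 1: --pr takes precedence over everything else.
  if ("--pr" in flags) {
    return { mode: "pr", prNumber: Number(flags["--pr"]) };
  }

  // Rule 2: topic + file path means external or council mode.
  if (positional.length === 2) {
    const [topic, file] = positional;
    if (flags["--mode"] === "external") {
      return { mode: "external", topic, file };
    }
    const result = { mode: "council", topic, file };
    if (flags["--agents"]) {
      const [a, b] = flags["--agents"].split(",");
      result.agent_a_cli = a;
      result.agent_b_cli = b;
    }
    if (flags["--lens"]) result.lens_id = flags["--lens"];
    return result;
  }

  // Rules 3 and 4: a lone file path joins if the file exists, otherwise errors.
  if (positional.length === 1) {
    const [file] = positional;
    if (fileExists(file)) return { mode: "join", file };
    return {
      mode: "error",
      message:
        'File not found. To start a new discussion, provide a topic: `/discuss "your topic" file.md`',
    };
  }

  // Assumed fallback; the original text does not specify this case.
  return { mode: "error", message: "Unrecognized arguments." };
}
```

For example, `detectMode(["--pr", "123"], () => false)` resolves to PR discussion mode before any file check runs, which is why rule 1 comes first in the list.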

---

@@ -129,6 +131,105 @@ For each response turn, follow the **Turn Structure** below.

---

## PR Discussion Mode (`--pr`)

Debates the design decisions in a pull request. This is not a code review — it's a structured discussion about the architectural tradeoffs, design choices, and approach taken in the PR.

### How it works

1. **Gather PR context** using `gh` CLI:
```bash
gh pr view NUMBER --json title,body,baseRefName,headRefName
gh pr diff NUMBER
```

2. **Generate the topic** from the PR title and body. The topic should frame the discussion around the design decisions, not the code style.

3. **Create the discussion file** as `pr-NUMBER-discussion.md` in the current directory. Include the PR context as a preamble section before the Key Questions:

```markdown
---
topic: "<generated from PR title>"
mode: council
pr_number: NUMBER
lens_id: "simplicity-vs-correctness"
selection_mode: "default"
max_rounds: 5
git_commit: none
agent_a: "Claude Agent A"
agent_b: "Claude Agent B"
agent_a_cli: "claude"
agent_b_cli: "claude"
agent_a_lens: "simplicity/pragmatism"
agent_b_lens: "correctness/rigor"
status: researching
turn: A
round: 0
created: <ISO 8601 timestamp>
last_updated: <ISO 8601 timestamp>
---

# Discussion: <PR title — design tradeoffs>

## PR Context

**PR #NUMBER:** <title>
**Branch:** <head> → <base>

### Description
<PR body>

### Diff Summary
<summary of changed files and key changes — not the full diff>

## Key Questions
1. [Generated from the PR — focus on design/architecture decisions]
2. ...
3. ...
```

4. **Default lens is `simplicity-vs-correctness`** — most PR discussions are about design tradeoffs. The picker is still shown so the user can override.

5. **Run the orchestrator** as normal: `node ~/.claude/scripts/headless-council.js pr-NUMBER-discussion.md`

6. **Post the consensus as a PR comment** when done:
```bash
gh pr comment NUMBER --body "$(cat <<'EOF'
## AI Council Discussion

<formatted consensus summary from the discussion>

<link to full discussion file>
EOF
)"
```
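
As a rough sketch of steps 1 to 3, the function below turns the parsed output of `gh pr view NUMBER --json title,body,baseRefName,headRefName` into the discussion file. The name `buildPrDiscussion`, the trivial topic derivation (title plus a "design tradeoffs" suffix), and the `(no description)` placeholder are assumptions made for illustration; the real command may have the model write the topic, diff summary, and key questions.

```javascript
// Hypothetical helper: build pr-NUMBER-discussion.md from `gh pr view --json` output.
// `now` is injectable so the timestamps are reproducible.
function buildPrDiscussion(prNumber, pr, now = new Date()) {
  const timestamp = now.toISOString(); // ISO 8601
  // Assumption: a trivial topic derived from the PR title.
  const topic = `${pr.title} (design tradeoffs)`;

  const frontmatter = [
    "---",
    `topic: ${JSON.stringify(topic)}`, // JSON quoting also escapes quotes in titles
    "mode: council",
    `pr_number: ${prNumber}`,
    'lens_id: "simplicity-vs-correctness"',
    'selection_mode: "default"',
    "max_rounds: 5",
    "git_commit: none",
    'agent_a: "Claude Agent A"',
    'agent_b: "Claude Agent B"',
    'agent_a_cli: "claude"',
    'agent_b_cli: "claude"',
    'agent_a_lens: "simplicity/pragmatism"',
    'agent_b_lens: "correctness/rigor"',
    "status: researching",
    "turn: A",
    "round: 0",
    `created: ${timestamp}`,
    `last_updated: ${timestamp}`,
    "---",
  ].join("\n");

  const body = [
    `# Discussion: ${topic}`,
    "",
    "## PR Context",
    "",
    `**PR #${prNumber}:** ${pr.title}`,
    `**Branch:** ${pr.headRefName} → ${pr.baseRefName}`,
    "",
    "### Description",
    pr.body || "(no description)", // placeholder text is an assumption
    "",
    "### Diff Summary",
    "<summary of changed files and key changes>",
    "",
    "## Key Questions",
    "1. ...",
  ].join("\n");

  return {
    fileName: `pr-${prNumber}-discussion.md`,
    content: `${frontmatter}\n\n${body}\n`,
  };
}
```

The diff summary and key questions are deliberately left as placeholders here, since the text above describes them as generated from the PR rather than computed mechanically.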

Print to the user:
> Starting PR discussion for #NUMBER: "<PR title>"
> Lens: simplicity-vs-correctness (enter to accept, or pick 1-3)
> Output: pr-NUMBER-discussion.md
> Running...

When complete:
> Discussion complete. Consensus posted as PR comment.
> Full discussion: pr-NUMBER-discussion.md
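
If the comment in step 6 is posted from a Node script rather than a shell heredoc, one option is to build the `gh pr comment` argument list directly and pass it to `execFileSync`, which avoids shell quoting of the consensus text. The sketch below only builds the arguments; `buildCommentCommand` and the "Full discussion:" line are hypothetical stand-ins for the formatted summary and link described above.

```javascript
// Hypothetical helper: build the argv for `gh pr comment` without a shell.
function buildCommentCommand(prNumber, consensusSummary, discussionFile) {
  const body = [
    "## AI Council Discussion",
    "",
    consensusSummary.trim(),
    "",
    `Full discussion: ${discussionFile}`, // assumed wording for the link line
  ].join("\n");

  return {
    file: "gh",
    args: ["pr", "comment", String(prNumber), "--body", body],
  };
}

// Usage (not executed here; needs an authenticated gh CLI):
// const { execFileSync } = require("node:child_process");
// const cmd = buildCommentCommand(123, summary, "pr-123-discussion.md");
// execFileSync(cmd.file, cmd.args, { stdio: "inherit" });
```

Because the body is passed as a single argument, backticks, quotes, and `$` in the consensus text reach `gh` unchanged.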

### What it focuses on

The PR discussion should focus on:
- Is this the right approach / abstraction?
- What are the tradeoffs being made?
- What alternatives were considered (or should have been)?
- Does this scale to the known future requirements?
- Are there hidden assumptions or coupling?

It should NOT focus on:
- Code style, naming, formatting
- Individual line-level bugs (that's code review)
- Test coverage specifics

---

## Council Mode (`--mode council`)

Orchestrates two independent top-level Claude instances that debate the topic with full reasoning capabilities. Each instance runs as a separate `claude -p` process with `--effort max`, ensuring extended thinking is available for every turn. The orchestrator (you) manages the discussion file, frontmatter, and turn sequencing.
6 changes: 6 additions & 0 deletions docs/research.md
@@ -46,6 +46,12 @@ Eo et al. show that debate does not need to happen on every problem, and adaptiv

Source: [Debate Only When Necessary: Adaptive Multiagent Collaboration for Efficient LLM Reasoning](https://arxiv.org/abs/2504.05047)

### Agent roles and multi-agent chat rooms

MindStudio (2026) provides a practical guide to multi-agent debate with distinct personas (advocate, skeptic, synthesizer) and recommends 3-5 agents with a neutral synthesizer for tie-breaking. Our own eval (15 discussions, 5 configs, 3 topics) found that a 3-agent panel with a synthesizer matches cross-model quality, but a 5-agent panel is counterproductive: agents go deep on their role's angle and lose breadth. Cross-model diversity (Claude + Codex) consistently outperformed same-model multi-agent setups.

Source: [How to Build Agent Chat Rooms: Multi-Agent Debate for Better AI Outputs](https://www.mindstudio.ai/blog/agent-chat-rooms-multi-agent-debate-claude-code)

## Caveat

A [2025 ICLR analysis](https://d2jud02ci9yv69.cloudfront.net/2025-04-28-mad-159/blog/mad/) noted that multi-agent debate doesn't consistently outperform simpler methods like chain-of-thought on all benchmarks. The benefits are most pronounced on tasks requiring diverse perspectives, factual verification, and structured reasoning, which is exactly what this tool targets.