Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 3 additions & 4 deletions autoplan/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -546,7 +546,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
What alternatives were dismissed too quickly? What competitive or market risks are
unaddressed? What scope decisions will look foolish in 6 months? Be adversarial.
No compliments. Just the strategic blind spots.
File: <plan_path>" -s read-only --enable web_search_cached`
File: <plan_path>" -s read-only --search`
Timeout: 10 minutes

**Claude CEO subagent** (via Agent tool):
Expand Down Expand Up @@ -657,7 +657,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
accessibility requirements (keyboard nav, contrast, touch targets) specified or
aspirational? Does the plan describe specific UI decisions or generic patterns?
What design decisions will haunt the implementer if left ambiguous?
Be opinionated. No hedging." -s read-only --enable web_search_cached`
Be opinionated. No hedging." -s read-only --search`
Timeout: 10 minutes

**Claude design subagent** (via Agent tool):
Expand Down Expand Up @@ -722,7 +722,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
CEO: <insert CEO consensus table summary — key concerns, DISAGREEs>
Design: <insert Design consensus table summary, or 'skipped, no UI scope'>

File: <plan_path>" -s read-only --enable web_search_cached`
File: <plan_path>" -s read-only --search`
Timeout: 10 minutes

**Claude eng subagent** (via Agent tool):
Expand All @@ -737,7 +737,6 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
NO prior-phase context — subagent must be truly independent.

Error handling: same as Phase 1 (non-blocking, degradation matrix applies).

- Architecture choices: explicit over clever (P5). If codex disagrees with valid reason → TASTE DECISION.
- Evals: always include all relevant suites (P1)
- Test plan: generate artifact at `~/.gstack/projects/$SLUG/{user}-{branch}-test-plan-{datetime}.md`
Expand Down
7 changes: 3 additions & 4 deletions autoplan/SKILL.md.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -203,7 +203,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
What alternatives were dismissed too quickly? What competitive or market risks are
unaddressed? What scope decisions will look foolish in 6 months? Be adversarial.
No compliments. Just the strategic blind spots.
File: <plan_path>" -s read-only --enable web_search_cached`
File: <plan_path>" -s read-only --search`
Timeout: 10 minutes

**Claude CEO subagent** (via Agent tool):
Expand Down Expand Up @@ -314,7 +314,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
accessibility requirements (keyboard nav, contrast, touch targets) specified or
aspirational? Does the plan describe specific UI decisions or generic patterns?
What design decisions will haunt the implementer if left ambiguous?
Be opinionated. No hedging." -s read-only --enable web_search_cached`
Be opinionated. No hedging." -s read-only --search`
Timeout: 10 minutes

**Claude design subagent** (via Agent tool):
Expand Down Expand Up @@ -379,7 +379,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
CEO: <insert CEO consensus table summary — key concerns, DISAGREEs>
Design: <insert Design consensus table summary, or 'skipped, no UI scope'>

File: <plan_path>" -s read-only --enable web_search_cached`
File: <plan_path>" -s read-only --search`
Timeout: 10 minutes

**Claude eng subagent** (via Agent tool):
Expand All @@ -394,7 +394,6 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
NO prior-phase context — subagent must be truly independent.

Error handling: same as Phase 1 (non-blocking, degradation matrix applies).

- Architecture choices: explicit over clever (P5). If codex disagrees with valid reason → TASTE DECISION.
- Evals: always include all relevant suites (P1)
- Test plan: generate artifact at `~/.gstack/projects/$SLUG/{user}-{branch}-test-plan-{datetime}.md`
Expand Down
Binary file added bun.lockb
Binary file not shown.
16 changes: 8 additions & 8 deletions codex/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -373,13 +373,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)

2. Run the review (5-minute timeout):
```bash
codex review --base <base> -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR"
codex exec review --base <base> -c 'model_reasoning_effort="high"' --json 2>"$TMPERR"
```

Use `timeout: 300000` on the Bash call. If the user provided custom instructions
(e.g., `/codex review focus on security`), pass them as the prompt argument:
```bash
codex review "focus on security" --base <base> -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR"
codex exec review "focus on security" --base <base> -c 'model_reasoning_effort="high"' --json 2>"$TMPERR"
```

3. Capture the output. Then parse cost from stderr:
Expand Down Expand Up @@ -517,7 +517,7 @@ With focus (e.g., "security"):

2. Run codex exec with **JSONL output** to capture reasoning traces and tool calls (5-minute timeout):
```bash
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached --json 2>/dev/null | python3 -c "
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --search --json 2>/dev/null | python3 -c "
import sys, json
for line in sys.stdin:
line = line.strip()
Expand Down Expand Up @@ -602,7 +602,7 @@ THE PLAN:

For a **new session:**
```bash
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached --json 2>"$TMPERR" | python3 -c "
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --search --json 2>"$TMPERR" | python3 -c "
import sys, json
for line in sys.stdin:
line = line.strip()
Expand Down Expand Up @@ -635,7 +635,7 @@ for line in sys.stdin:

For a **resumed session** (user chose "Continue"):
```bash
codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached --json 2>"$TMPERR" | python3 -c "
codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --search --json 2>"$TMPERR" | python3 -c "
<same python streaming parser as above>
"
```
Expand Down Expand Up @@ -671,10 +671,10 @@ Session saved — run /codex again to continue this conversation.
agentic coding model). This means as OpenAI ships newer models, /codex automatically
uses them. If the user wants a specific model, pass `-m` through to codex.

**Reasoning effort:** All modes use `xhigh` — maximum reasoning power. When reviewing code, breaking code, or consulting on architecture, you want the model thinking as hard as possible.
**Reasoning effort:** All modes use `high` — the maximum reasoning power available. Valid values are `minimal`, `low`, `medium`, `high`. When reviewing code, breaking code, or consulting on architecture, you want the model thinking as hard as possible.

**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up
docs and APIs during review. This is OpenAI's cached index — fast, no extra cost.
**Web search:** Challenge and consult modes use `--search` so Codex can look up
docs and APIs. Review mode (`codex exec review`) does not need `--search` — it has its own built-in review prompt.

If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`
or `/codex challenge -m gpt-5.2`), pass the `-m` flag through to codex.
Expand Down
16 changes: 8 additions & 8 deletions codex/SKILL.md.tmpl
Original file line number Diff line number Diff line change
Expand Up @@ -79,13 +79,13 @@ TMPERR=$(mktemp /tmp/codex-err-XXXXXX.txt)

2. Run the review (5-minute timeout):
```bash
codex review --base <base> -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR"
codex exec review --base <base> -c 'model_reasoning_effort="high"' --json 2>"$TMPERR"
```

Use `timeout: 300000` on the Bash call. If the user provided custom instructions
(e.g., `/codex review focus on security`), pass them as the prompt argument:
```bash
codex review "focus on security" --base <base> -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR"
codex exec review "focus on security" --base <base> -c 'model_reasoning_effort="high"' --json 2>"$TMPERR"
```

3. Capture the output. Then parse cost from stderr:
Expand Down Expand Up @@ -158,7 +158,7 @@ With focus (e.g., "security"):

2. Run codex exec with **JSONL output** to capture reasoning traces and tool calls (5-minute timeout):
```bash
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached --json 2>/dev/null | python3 -c "
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --search --json 2>/dev/null | python3 -c "
import sys, json
for line in sys.stdin:
line = line.strip()
Expand Down Expand Up @@ -243,7 +243,7 @@ THE PLAN:

For a **new session:**
```bash
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached --json 2>"$TMPERR" | python3 -c "
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --search --json 2>"$TMPERR" | python3 -c "
import sys, json
for line in sys.stdin:
line = line.strip()
Expand Down Expand Up @@ -276,7 +276,7 @@ for line in sys.stdin:

For a **resumed session** (user chose "Continue"):
```bash
codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached --json 2>"$TMPERR" | python3 -c "
codex exec resume <session-id> "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --search --json 2>"$TMPERR" | python3 -c "
<same python streaming parser as above>
"
```
Expand Down Expand Up @@ -312,10 +312,10 @@ Session saved — run /codex again to continue this conversation.
agentic coding model). This means as OpenAI ships newer models, /codex automatically
uses them. If the user wants a specific model, pass `-m` through to codex.

**Reasoning effort:** All modes use `xhigh` — maximum reasoning power. When reviewing code, breaking code, or consulting on architecture, you want the model thinking as hard as possible.
**Reasoning effort:** All modes use `high` — the maximum reasoning power available. Valid values are `minimal`, `low`, `medium`, `high`. When reviewing code, breaking code, or consulting on architecture, you want the model thinking as hard as possible.

**Web search:** All codex commands use `--enable web_search_cached` so Codex can look up
docs and APIs during review. This is OpenAI's cached index — fast, no extra cost.
**Web search:** Challenge and consult modes use `--search` so Codex can look up
docs and APIs. Review mode (`codex exec review`) does not need `--search` — it has its own built-in review prompt.

If the user specifies a model (e.g., `/codex review -m gpt-5.1-codex-max`
or `/codex challenge -m gpt-5.2`), pass the `-m` flag through to codex.
Expand Down
2 changes: 1 addition & 1 deletion design-consultation/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -453,7 +453,7 @@ codex exec "Given this product context, propose a complete design direction:
- Differentiation: 2 deliberate departures from category norms
- Anti-slop: no purple gradients, no 3-column icon grids, no centered everything, no decorative blobs

Be opinionated. Be specific. Do not hedge. This is YOUR design direction — own it." -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached 2>"$TMPERR_DESIGN"
Be opinionated. Be specific. Do not hedge. This is YOUR design direction — own it." -s read-only -c 'model_reasoning_effort="medium"' --search 2>"$TMPERR_DESIGN"
```
Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
```bash
Expand Down
2 changes: 1 addition & 1 deletion design-review/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -993,7 +993,7 @@ HARD REJECTION — flag if ANY apply:
6. Carousel with no narrative purpose
7. App UI made of stacked cards instead of layout

Be specific. Reference file:line for every finding." -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR_DESIGN"
Be specific. Reference file:line for every finding." -s read-only -c 'model_reasoning_effort="high"' --search 2>"$TMPERR_DESIGN"
```
Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
```bash
Expand Down
4 changes: 2 additions & 2 deletions docs/skills.md
Original file line number Diff line number Diff line change
Expand Up @@ -716,9 +716,9 @@ When `/review` catches bugs from Claude's perspective, `/codex` brings a complet

### Three modes

**Review** — run `codex review` against the current diff. Codex reads every changed file, classifies findings by severity (P1 critical, P2 high, P3 medium), and returns a PASS/FAIL verdict. Any P1 finding = FAIL. The review is fully independent — Codex doesn't see Claude's review.
**Review** — run `/codex review` against the current diff. Codex reads every changed file, classifies findings by severity (P1 critical, P2 high, P3 medium), and returns a PASS/FAIL verdict. Any P1 finding = FAIL. The review is fully independent — Codex doesn't see Claude's review.

**Challenge** — adversarial mode. Codex actively tries to break your code. It looks for edge cases, race conditions, security holes, and assumptions that would fail under load. Uses maximum reasoning effort (`xhigh`). Think of it as a penetration test for your logic.
**Challenge** — adversarial mode. Codex actively tries to break your code. It looks for edge cases, race conditions, security holes, and assumptions that would fail under load. Uses maximum reasoning effort (`high`). Think of it as a penetration test for your logic.

**Consult** — open conversation with session continuity. Ask Codex anything about the codebase. Follow-up questions reuse the same session, so context carries over. Great for "am I thinking about this correctly?" moments.

Expand Down
4 changes: 2 additions & 2 deletions office-hours/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -687,7 +687,7 @@ Write the full prompt (context block + instructions) to this file. Use the mode-

```bash
TMPERR_OH=$(mktemp /tmp/codex-oh-err-XXXXXXXX)
codex exec "$(cat "$CODEX_PROMPT_FILE")" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR_OH"
codex exec "$(cat "$CODEX_PROMPT_FILE")" -s read-only -c 'model_reasoning_effort="high"' --search 2>"$TMPERR_OH"
```

Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
Expand Down Expand Up @@ -838,7 +838,7 @@ If user chooses A, launch both voices simultaneously:
1. **Codex** (via Bash, `model_reasoning_effort="medium"`):
```bash
TMPERR_SKETCH=$(mktemp /tmp/codex-sketch-XXXXXXXX)
codex exec "For this product approach, provide: a visual thesis (one sentence — mood, material, energy), a content plan (hero → support → detail → CTA), and 2 interaction ideas that change page feel. Apply beautiful defaults: composition-first, brand-first, cardless, poster not document. Be opinionated." -s read-only -c 'model_reasoning_effort="medium"' --enable web_search_cached 2>"$TMPERR_SKETCH"
codex exec "For this product approach, provide: a visual thesis (one sentence — mood, material, energy), a content plan (hero → support → detail → CTA), and 2 interaction ideas that change page feel. Apply beautiful defaults: composition-first, brand-first, cardless, poster not document. Be opinionated." -s read-only -c 'model_reasoning_effort="medium"' --search 2>"$TMPERR_SKETCH"
```
Use a 5-minute timeout (`timeout: 300000`). After completion: `cat "$TMPERR_SKETCH" && rm -f "$TMPERR_SKETCH"`

Expand Down
2 changes: 1 addition & 1 deletion plan-ceo-review/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -1044,7 +1044,7 @@ THE PLAN:

```bash
TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX)
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR_PV"
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --search 2>"$TMPERR_PV"
```

Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
Expand Down
2 changes: 1 addition & 1 deletion plan-design-review/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -467,7 +467,7 @@ HARD RULES — first classify as MARKETING/LANDING PAGE vs APP UI vs HYBRID, the
- APP UI: Calm surface hierarchy, dense but readable, utility language, minimal chrome
- UNIVERSAL: CSS variables for colors, no default font stacks, one job per section, cards earn existence

For each finding: what's wrong, what will happen if it ships unresolved, and the specific fix. Be opinionated. No hedging." -s read-only -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR_DESIGN"
For each finding: what's wrong, what will happen if it ships unresolved, and the specific fix. Be opinionated. No hedging." -s read-only -c 'model_reasoning_effort="high"' --search 2>"$TMPERR_DESIGN"
```
Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
```bash
Expand Down
2 changes: 1 addition & 1 deletion plan-eng-review/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -722,7 +722,7 @@ THE PLAN:

```bash
TMPERR_PV=$(mktemp /tmp/codex-planreview-XXXXXXXX)
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="xhigh"' --enable web_search_cached 2>"$TMPERR_PV"
codex exec "<prompt>" -s read-only -c 'model_reasoning_effort="high"' --search 2>"$TMPERR_PV"
```

Use a 5-minute timeout (`timeout: 300000`). After the command completes, read stderr:
Expand Down
Loading