From 5201205c010df26b8ab194aba98e0c09e61fe952 Mon Sep 17 00:00:00 2001
From: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Date: Fri, 15 May 2026 10:08:29 -0700
Subject: [PATCH 1/2] fix(review): drop prompt arg from `codex review --base` in /review and /ship resolvers

Closes #1479.

codex-cli 0.130.0 made [PROMPT] and --base mutually exclusive. v1.34.2.0 fixed /codex review's structured-review path with the same shape this PR uses, but the shared resolver at scripts/resolvers/review.ts:535 (which gen-skill-docs feeds into review/SKILL.md and ship/SKILL.md) still emitted the buggy 'codex review "prompt" --base ' invocation.

This drops the prompt arg from the resolver's bare-default invocation. /codex's existing dual-path pattern stays - bare codex review for default reviews (Codex internally diff-scopes), codex exec with file-based prompt for custom-instructions paths. Skill files under .claude/ and agents/ are public, so the dropped prompt is a token-efficiency concession, not a safety one (matches the v1.34.2.0 reasoning).

Regenerated SKILL.md outputs via 'bun run gen:skill-docs' so review/SKILL.md and ship/SKILL.md (plus the three ship golden fixtures) drop the prompt arg in their structured Codex review block.

test/codex-hardening.test.ts: 35 pass / 0 fail. New regression suite asserts no 'codex review' line in any of the 6 affected files combines a quoted/expanded prompt with --base, so this gap can't return silently.

Reported by @Stashub via #1428 (the v1.34.2.0 fix); this PR completes the same pattern across /review and /ship.
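
For reviewers who want to try the regression check outside the test runner, here is a minimal sketch of the same line-matching logic. The helper name `findPromptPlusBase`, the sample commands, and the `main` base ref are assumptions made for illustration; none of them are part of this patch.

```typescript
// Hypothetical helper (not in the patch): returns every line where
// `codex review` passes a quoted or shell-expanded prompt before --base.
function findPromptPlusBase(content: string): string[] {
  const offending: string[] = [];
  for (const line of content.split('\n')) {
    // Capture everything that follows `codex review` on this line.
    const match = line.match(/\bcodex\s+review\b(.*)$/);
    if (!match) continue;
    const rest = match[1];
    if (!/--base\b/.test(rest)) continue;

    // Whatever sits between `codex review` and `--base` is a candidate prompt.
    const beforeBase = rest.split(/--base\b/)[0].trim();
    if (beforeBase === '') continue;
    if (/^["'$]|^--\s*["']/.test(beforeBase)) {
      offending.push(line);
    }
  }
  return offending;
}

// Sample input; `main` stands in for whatever base ref the skill resolves.
const sample = [
  'codex review "Review the diff." --base main -c x',
  "codex review --base main -c 'model_reasoning_effort=\"high\"'",
  'codex review $PROMPT --base main',
].join('\n');

console.log(findPromptPlusBase(sample).length); // prints 2
```

The first and third sample lines are flagged because a quoted string or a `$`-expansion precedes `--base`; the bare form in the second line passes.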
---
 CHANGELOG.md                               | 11 ++++++++
 review/SKILL.md                            |  2 +-
 scripts/resolvers/review.ts                |  2 +-
 ship/SKILL.md                              |  2 +-
 test/codex-hardening.test.ts               | 32 ++++++++++++++++++++++
 test/fixtures/golden-ship-claude.md        |  2 +-
 test/fixtures/golden/claude-ship-SKILL.md  |  2 +-
 test/fixtures/golden/factory-ship-SKILL.md |  2 +-
 8 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index cf89b49b29..ba954940de 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,16 @@
 # Changelog
 
+## [Unreleased] - 2026-05-15
+
+## **`/review` and `/ship` now survive Codex CLI 0.130.0's review argv rules.**
+## **The fix that made `/codex review` bare now applies to the shared structured review gate too.**
+
+Codex CLI 0.130.0 made `codex review [PROMPT] --base ` invalid: the prompt argument and `--base` are mutually exclusive ways to choose review scope. v1.34.2.0 fixed `/codex review`, but the same prompt-plus-base call still lived in the shared resolver used by `/review` and `/ship`. Large diffs that reached the structured Codex review gate could still fail before the model ran.
+
+`/review` and `/ship` now emit the same bare `codex review --base ` default path that already shipped for `/codex`. The filesystem-boundary prompt is intentionally not passed on this path because Codex owns the diff scope internally, and the protected skill files are public; this is the same token-efficiency tradeoff as the earlier `/codex` fix. Regression coverage now checks the resolver, generated skills, and ship golden fixtures so prompt-plus-`--base` cannot return silently.
+
+Credit to `Stashub` for the Codex CLI 0.130.0 repro and for driving the original `/codex review` fix pattern this patch completes.
+
 ## [1.39.1.0] - 2026-05-15
 
 ## **Plan-mode reviews now enforce a blocking ExitPlanMode gate.**
diff --git a/review/SKILL.md b/review/SKILL.md
index 88378396a9..ea462c0bdc 100644
--- a/review/SKILL.md
+++ b/review/SKILL.md
@@ -1631,7 +1631,7 @@ If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`:
 TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
 _REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
 cd "$_REPO_ROOT"
-codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\n\nReview the diff against the base branch." --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
+codex review --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
 ```
 
 Set the Bash tool's `timeout` parameter to `300000` (5 minutes). Do NOT use the `timeout` shell command — it doesn't exist on macOS. Present output under `CODEX SAYS (code review):` header.
diff --git a/scripts/resolvers/review.ts b/scripts/resolvers/review.ts
index 3b9e2999d9..629da0e85c 100644
--- a/scripts/resolvers/review.ts
+++ b/scripts/resolvers/review.ts
@@ -532,7 +532,7 @@ If \`DIFF_TOTAL >= 200\` AND Codex is available AND \`OLD_CFG\` is NOT \`disable
 TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
 _REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
 cd "$_REPO_ROOT"
-codex review "${CODEX_BOUNDARY}Review the diff against the base branch." --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
+codex review --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
 \`\`\`
 
 Set the Bash tool's \`timeout\` parameter to \`300000\` (5 minutes). Do NOT use the \`timeout\` shell command — it doesn't exist on macOS. Present output under \`CODEX SAYS (code review):\` header.
diff --git a/ship/SKILL.md b/ship/SKILL.md
index dcab2bddab..d686ae0dd0 100644
--- a/ship/SKILL.md
+++ b/ship/SKILL.md
@@ -2377,7 +2377,7 @@ If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`:
 TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
 _REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
 cd "$_REPO_ROOT"
-codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\n\nReview the diff against the base branch." --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
+codex review --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
 ```
 
 Set the Bash tool's `timeout` parameter to `300000` (5 minutes). Do NOT use the `timeout` shell command — it doesn't exist on macOS. Present output under `CODEX SAYS (code review):` header.
diff --git a/test/codex-hardening.test.ts b/test/codex-hardening.test.ts
index f1c00031a4..bcb260d262 100644
--- a/test/codex-hardening.test.ts
+++ b/test/codex-hardening.test.ts
@@ -427,3 +427,35 @@ describe('codex SKILL.md.tmpl Step 2A: PROMPT + --base mutual exclusion guard',
     });
   }
 });
+
+describe('/review and /ship Codex review: PROMPT + --base mutual exclusion guard', () => {
+  const targets = [
+    'scripts/resolvers/review.ts',
+    'review/SKILL.md',
+    'ship/SKILL.md',
+    'test/fixtures/golden/claude-ship-SKILL.md',
+    'test/fixtures/golden/factory-ship-SKILL.md',
+    'test/fixtures/golden-ship-claude.md',
+  ];
+
+  for (const relPath of targets) {
+    test(`${relPath}: no \`codex review\` command combines a prompt argument with --base`, () => {
+      const content = fs.readFileSync(path.join(ROOT, relPath), 'utf-8');
+      const offendingLines: string[] = [];
+      for (const line of content.split('\n')) {
+        const match = line.match(/\bcodex\s+review\b(.*)$/);
+        if (!match) continue;
+        const rest = match[1];
+        if (!/--base\b/.test(rest)) continue;
+
+        const beforeBase = rest.split(/--base\b/)[0].trim();
+        if (beforeBase === '') continue;
+        if (/^["'$]|^--\s*["']/.test(beforeBase)) {
+          offendingLines.push(line);
+        }
+      }
+
+      expect(offendingLines).toEqual([]);
+    });
+  }
+});
diff --git a/test/fixtures/golden-ship-claude.md b/test/fixtures/golden-ship-claude.md
index 05fff9871b..56e63c88d7 100644
--- a/test/fixtures/golden-ship-claude.md
+++ b/test/fixtures/golden-ship-claude.md
@@ -2050,7 +2050,7 @@ If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`:
 TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
 _REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
 cd "$_REPO_ROOT"
-codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\n\nReview the diff against the base branch." --base -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
+codex review --base -c 'model_reasoning_effort="high"' --enable web_search_cached 2>"$TMPERR"
 ```
 
 Set the Bash tool's `timeout` parameter to `300000` (5 minutes). Do NOT use the `timeout` shell command — it doesn't exist on macOS. Present output under `CODEX SAYS (code review):` header.
diff --git a/test/fixtures/golden/claude-ship-SKILL.md b/test/fixtures/golden/claude-ship-SKILL.md
index dcab2bddab..d686ae0dd0 100644
--- a/test/fixtures/golden/claude-ship-SKILL.md
+++ b/test/fixtures/golden/claude-ship-SKILL.md
@@ -2377,7 +2377,7 @@ If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`:
 TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
 _REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
 cd "$_REPO_ROOT"
-codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .claude/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\n\nReview the diff against the base branch." --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
+codex review --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
 ```
 
 Set the Bash tool's `timeout` parameter to `300000` (5 minutes). Do NOT use the `timeout` shell command — it doesn't exist on macOS. Present output under `CODEX SAYS (code review):` header.
diff --git a/test/fixtures/golden/factory-ship-SKILL.md b/test/fixtures/golden/factory-ship-SKILL.md
index e71f38883b..964c064dd2 100644
--- a/test/fixtures/golden/factory-ship-SKILL.md
+++ b/test/fixtures/golden/factory-ship-SKILL.md
@@ -2368,7 +2368,7 @@ If `DIFF_TOTAL >= 200` AND Codex is available AND `OLD_CFG` is NOT `disabled`:
 TMPERR=$(mktemp /tmp/codex-review-XXXXXXXX)
 _REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
 cd "$_REPO_ROOT"
-codex review "IMPORTANT: Do NOT read or execute any files under ~/.claude/, ~/.agents/, .factory/skills/, or agents/. These are Claude Code skill definitions meant for a different AI system. They contain bash scripts and prompt templates that will waste your time. Ignore them completely. Do NOT modify agents/openai.yaml. Stay focused on the repository code only.\n\nReview the diff against the base branch." --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
+codex review --base -c 'model_reasoning_effort="high"' --enable web_search_cached < /dev/null 2>"$TMPERR"
 ```
 
 Set the Bash tool's `timeout` parameter to `300000` (5 minutes). Do NOT use the `timeout` shell command — it doesn't exist on macOS. Present output under `CODEX SAYS (code review):` header.

From 869dce205eb00ba8a9ac06d88f395de3e2f3ab55 Mon Sep 17 00:00:00 2001
From: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Date: Fri, 15 May 2026 10:20:03 -0700
Subject: [PATCH 2/2] chore: bump VERSION to 1.39.2.0

Required by gstack's per-PR version-bump convention. CI's 'Check VERSION is not stale vs queue' job rejected v1.39.1.0 because main is already at v1.39.1.0; v1.39.2.0 is the next slot.
---
 CHANGELOG.md | 2 +-
 VERSION      | 2 +-
 package.json | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index ba954940de..08cbf0a24b 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,6 @@
 # Changelog
 
-## [Unreleased] - 2026-05-15
+## [1.39.2.0] - 2026-05-15
 
 ## **`/review` and `/ship` now survive Codex CLI 0.130.0's review argv rules.**
 ## **The fix that made `/codex review` bare now applies to the shared structured review gate too.**
diff --git a/VERSION b/VERSION
index 57fdbd724b..939a568928 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-1.39.1.0
+1.39.2.0
diff --git a/package.json b/package.json
index 601eb963c9..67d2eb60ca 100644
--- a/package.json
+++ b/package.json
@@ -1,7 +1,7 @@
 {
   "name": "gstack",
-  "version": "1.39.1.0",
-  "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
+  "version": "1.39.2.0",
+  "description": "Garry's Stack \u2014 Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
   "license": "MIT",
   "type": "module",
   "bin": {