diff --git a/design-review/SKILL.md b/design-review/SKILL.md
index 953d9d1a..19f9d201 100644
--- a/design-review/SKILL.md
+++ b/design-review/SKILL.md
@@ -910,16 +910,119 @@ Record baseline design score and AI slop score at end of Phase 6.
 
 ## Design Outside Voices (parallel)
 
-**Automatic:** Outside voices run automatically when Codex is available. No opt-in needed.
+**Automatic:** Outside voices run automatically when Codex or Gemini is available. No opt-in needed.
 
-**Check Codex availability:**
+**Check tool availability:**
 ```bash
 which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
+which gemini 2>/dev/null && echo "GEMINI_AVAILABLE" || echo "GEMINI_NOT_AVAILABLE"
 ```
 
-**If Codex is available**, launch both voices simultaneously:
+Launch all available voices simultaneously:
 
-1. **Codex design voice** (via Bash):
+1. **Gemini visual voice** (PRIORITY — via Bash, if GEMINI_AVAILABLE):
+
+Gemini 3.1 Pro has the strongest multimodal vision of the three voices. Use it as the **primary visual reviewer** — feed actual screenshots from Phase 1 and Phase 3 directly to Gemini for pixel-level design analysis. This catches issues that source-code-only reviewers miss: visual hierarchy balance, whitespace rhythm, color harmony, compositional weight, and AI slop patterns.
+
+**Collect screenshots from previous phases:**
+```bash
+SCREENSHOTS=$(find "$REPORT_DIR/screenshots" -name "*.png" -type f 2>/dev/null | head -5)
+if [ -z "$SCREENSHOTS" ]; then
+  echo "NO_SCREENSHOTS"
+else
+  echo "SCREENSHOTS_FOUND: $(echo "$SCREENSHOTS" | wc -l | tr -d ' ')"
+fi
+```
+
+**If screenshots exist**, feed them to Gemini for visual analysis:
+```bash
+TMPERR_GM_VISUAL=$(mktemp /tmp/gemini-visual-XXXXXXXX)
+TMPOUT_GM_VISUAL=$(mktemp /tmp/gemini-visual-out-XXXXXXXX)
+
+# Build the image args for the gemini CLI (screenshot paths are assumed to contain no spaces)
+IMG_ARGS=""
+for img in $SCREENSHOTS; do
+  IMG_ARGS="$IMG_ARGS --image $img"
+done
+
+gemini --model gemini-3.1-pro-preview $IMG_ARGS \
+  "You are a world-class product designer doing a VISUAL audit of these screenshots. You are looking at the RENDERED output, not source code. Evaluate what you SEE:
+
+VISUAL HIERARCHY & COMPOSITION:
+- Where does the eye land first? Is that intentional?
+- Is there clear primary > secondary > tertiary hierarchy?
+- White space rhythm: intentional or accidental gaps?
+- Squint test: does the layout hold when blurred?
+- Compositional weight: balanced or lopsided?
+
+COLOR & HARMONY:
+- Is the palette cohesive across all screenshots?
+- Contrast ratios: any text barely readable?
+- Semantic color usage: red=error, green=success consistent?
+- Dark mode: surfaces use elevation, not just inverted?
+
+TYPOGRAPHY AS RENDERED:
+- Type scale feels systematic or random?
+- Line lengths readable (45-75 chars)?
+- Font pairing: complementary or competing?
+- Heading hierarchy visually clear?
+
+SPACING & RHYTHM:
+- Grid alignment: everything snaps or drifts?
+- Consistent padding/margin rhythm?
+- Border-radius hierarchy or uniform bubbly?
+- Inner radius = outer radius - gap on nested elements?
+
+AI SLOP DETECTION (the 10 anti-patterns — visual check):
+1. Purple/violet gradient backgrounds?
+2. The 3-column feature grid with icon circles?
+3. Icons in colored circles as decoration?
+4. Centered everything?
+5. Uniform bubbly border-radius on all elements?
+6. Decorative blobs, floating circles, wavy dividers?
+7. Emoji as design elements?
+8. Colored left-border on cards?
+9. Generic hero copy ('Welcome to...', 'Unlock the power...')?
+10. Cookie-cutter section rhythm?
+
+RESPONSIVE (if mobile/tablet screenshots present):
+- Mobile layout intentional or just stacked desktop?
+- Touch targets >= 44px?
+- Navigation appropriate for viewport?
+
+For each finding: describe WHAT you see, WHY it's a problem, and HOW to fix it. Rate severity: critical/high/medium/polish.
+
+LITMUS CHECKS — answer YES/NO based on what you SEE:
+1. Brand/product unmistakable in first screen?
+2. One strong visual anchor present?
+3. Page understandable by scanning headlines only?
+4. Each section has one job?
+5. Are cards actually necessary?
+6. Does motion feel intentional (or absent)?
+7. Would design feel premium with all decorative shadows removed?" \
+  > "$TMPOUT_GM_VISUAL" 2>"$TMPERR_GM_VISUAL"
+```
+
+Use a 5-minute timeout (`timeout: 300000`). After the command completes:
+```bash
+cat "$TMPOUT_GM_VISUAL"
+cat "$TMPERR_GM_VISUAL" && rm -f "$TMPERR_GM_VISUAL" "$TMPOUT_GM_VISUAL"
+```
+
+**If no screenshots exist** (Phase 1/3 not yet run), defer the Gemini visual voice:
+```bash
+echo "No screenshots available yet — Gemini visual voice deferred to post-Phase 3."
+```
+
+Present output under a `GEMINI SAYS (visual audit — pixel-level):` header.
+
+**Error handling (non-blocking):**
+- Auth failure: "Gemini authentication failed. Check your API key configuration."
+- Timeout: "Gemini timed out after 5 minutes."
+- Empty response: "Gemini returned no response."
+- On any Gemini error: proceed with Codex and/or Claude subagent output only.
+
+2. **Codex design voice** (via Bash, if CODEX_AVAILABLE — source-code audit):
 ```bash
 TMPERR_DESIGN=$(mktemp /tmp/codex-design-XXXXXXXX)
 codex exec "Review the frontend source code in this repo. Evaluate against these design hard rules:
@@ -958,7 +1061,7 @@ Use a 5-minute timeout (`timeout: 300000`). After the command completes, read st
 cat "$TMPERR_DESIGN" && rm -f "$TMPERR_DESIGN"
 ```
 
-2. **Claude design subagent** (via Agent tool):
+3. **Claude design subagent** (via Agent tool — consistency patterns):
 Dispatch a subagent with this prompt: "Review the frontend source code in this repo. You are an independent senior product designer doing a source-code design audit. Focus on CONSISTENCY PATTERNS across files rather than individual violations:
 - Are spacing values systematic across the codebase?
@@ -972,22 +1075,25 @@ For each finding: what's wrong, severity (critical/high/medium), and the file:li
 - **Auth failure:** If stderr contains "auth", "login", "unauthorized", or "API key": "Codex authentication failed. Run `codex login` to authenticate."
 - **Timeout:** "Codex timed out after 5 minutes."
 - **Empty response:** "Codex returned no response."
-- On any Codex error: proceed with Claude subagent output only, tagged `[single-model]`.
-- If Claude subagent also fails: "Outside voices unavailable — continuing with primary review."
+- On any Codex error: proceed with Gemini and/or Claude subagent output only.
+- On any Gemini error: proceed with Codex and/or Claude subagent output only.
+- If all outside voices fail: "Outside voices unavailable — continuing with primary review."
 
+Present Gemini output under a `GEMINI SAYS (visual audit — pixel-level):` header.
 Present Codex output under a `CODEX SAYS (design source audit):` header.
 Present subagent output under a `CLAUDE SUBAGENT (design consistency):` header.
 
 **Synthesis — Litmus scorecard:**
 
-Use the same scorecard format as /plan-design-review (shown above). Fill in from both outputs.
-Merge findings into the triage with `[codex]` / `[subagent]` / `[cross-model]` tags.
+Use the same scorecard format as /plan-design-review (shown above). Fill in from all outputs.
+Gemini's visual litmus checks take PRIORITY for visual dimensions (hierarchy, composition, color, spacing) because it evaluates the rendered output, not source code.
+Merge findings into the triage with `[gemini-visual]` / `[codex]` / `[subagent]` / `[cross-model]` tags.
 
 **Log the result:**
 ```bash
 ~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"design-outside-voices","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}'
 ```
-Replace STATUS with "clean" or "issues_found", SOURCE with "codex+subagent", "codex-only", "subagent-only", or "unavailable".
+Replace STATUS with "clean" or "issues_found", SOURCE with "gemini+codex+subagent", "gemini+subagent", "gemini+codex", "codex+subagent", "gemini-only", "codex-only", "subagent-only", or "unavailable".
 
 ## Phase 7: Triage
diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md
index a274efc0..3206fee5 100644
--- a/plan-ceo-review/SKILL.md
+++ b/plan-ceo-review/SKILL.md
@@ -1240,7 +1240,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
 - **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
 - **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
 - **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
-- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
+- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Fallback chain: Codex → Gemini (gemini-3.1-pro-preview) → Claude subagent. Never gates shipping.
 
 **Verdict logic:**
 - **CLEARED**: Eng Review has >= 1 entry within 7 days from either \`review\` or \`plan-eng-review\` with status "clean" (or \`skip_eng_review\` is \`true\`)
diff --git a/plan-design-review/SKILL.md b/plan-design-review/SKILL.md
index ce5f9e75..41a11fc1 100644
--- a/plan-design-review/SKILL.md
+++ b/plan-design-review/SKILL.md
@@ -383,21 +383,115 @@ AskUserQuestion: "I've rated this plan {N}/10 on design completeness. The bigges
 
 ## Design Outside Voices (parallel)
 
 Use AskUserQuestion:
-> "Want outside design voices before the detailed review? Codex evaluates against OpenAI's design hard rules + litmus checks; Claude subagent does an independent completeness review."
+> "Want outside design voices before the detailed review? Gemini analyzes mockups/wireframes visually (best-in-class vision); Codex evaluates against design hard rules + litmus checks; Claude subagent does an independent completeness review."
 >
 > A) Yes — run outside design voices
 > B) No — proceed without
 
 If user chooses B, skip this step and continue.
 
-**Check Codex availability:**
+**Check tool availability:**
 ```bash
 which codex 2>/dev/null && echo "CODEX_AVAILABLE" || echo "CODEX_NOT_AVAILABLE"
+which gemini 2>/dev/null && echo "GEMINI_AVAILABLE" || echo "GEMINI_NOT_AVAILABLE"
 ```
 
-**If Codex is available**, launch both voices simultaneously:
+Launch all available voices simultaneously:
 
-1. **Codex design voice** (via Bash):
+1. **Gemini visual voice** (PRIORITY — via Bash, if GEMINI_AVAILABLE):
+
+Gemini 3.1 Pro has the strongest multimodal vision of the three voices. At plan stage, it provides **vision-guided design critique** by:
+- Analyzing any mockups, wireframes, or screenshots referenced in or alongside the plan
+- Catching design issues in plan descriptions that text-only models miss
+- Detecting AI slop patterns from plan descriptions (e.g., "3-column feature grid" → instant flag)
+
+**Scan for visual assets referenced in the plan:**
+```bash
+PLAN_DIR=$(dirname "[plan-file-path]")
+MOCKUPS=$(find "$PLAN_DIR" -maxdepth 2 \( -name "*.png" -o -name "*.jpg" -o -name "*.jpeg" -o -name "*.webp" -o -name "*.svg" \) -type f 2>/dev/null | head -5)
+if [ -z "$MOCKUPS" ]; then
+  echo "NO_MOCKUPS"
+else
+  echo "MOCKUPS_FOUND: $(echo "$MOCKUPS" | wc -l | tr -d ' ')"
+fi
+```
+
+**If mockups/wireframes exist**, feed them to Gemini for visual analysis:
+```bash
+TMPERR_GM=$(mktemp /tmp/gemini-plan-visual-XXXXXXXX)
+TMPOUT_GM=$(mktemp /tmp/gemini-plan-visual-out-XXXXXXXX)
+
+IMG_ARGS=""
+for img in $MOCKUPS; do
+  IMG_ARGS="$IMG_ARGS --image $img"
+done
+
+cat "[plan-file-path]" | gemini --model gemini-3.1-pro-preview $IMG_ARGS \
+  "You are a world-class product designer reviewing a DESIGN PLAN. You have the plan text via stdin and any mockups/wireframes as images.
+
+VISUAL DESIGN ASSESSMENT (from mockups, if provided):
+- Does the visual hierarchy match the plan's stated priorities?
+- Are the mockups specific (real fonts, colors, spacing) or generic wireframes?
+- Do the mockups avoid AI slop patterns (3-column grids, purple gradients, icon circles)?
+- Is there compositional intention or just layout?
+- Would a user feel this was designed by a human with taste?
+
+PLAN DESIGN COMPLETENESS (from text):
+- Does the plan specify CONCRETE visual decisions (font names, color hex, spacing scale)?
+- Or does it use vague terms ('clean', 'modern', 'intuitive')?
+- Are interaction states specified (loading, empty, error, success)?
+- Is there a visual hierarchy defined for each screen?
+- Does the plan distinguish between marketing/landing vs app UI patterns?
+
+AI SLOP RISK (visual pattern detection):
+1. '3-column feature grid' → THE most recognizable AI layout
+2. Purple/violet gradients → AI color default
+3. 'Hero section' without brand specificity → cookie-cutter
+4. 'Card-based layout' without justification → lazy default
+5. 'Clean modern UI' → meaningless descriptor
+
+LITMUS CHECKS — answer YES/NO:
+1. Brand/product unmistakable in first screen?
+2. One strong visual anchor present?
+3. Page understandable by scanning headlines only?
+4. Each section has one job?
+5. Are cards actually necessary?
+6. Does motion improve hierarchy or atmosphere?
+7. Would design feel premium with all decorative shadows removed?
+
+For each finding: what's wrong, severity, and the specific fix to add to the plan." \
+  > "$TMPOUT_GM" 2>"$TMPERR_GM"
+```
+Use a 5-minute timeout (`timeout: 300000`). After the command completes:
+```bash
+cat "$TMPOUT_GM"
+cat "$TMPERR_GM" && rm -f "$TMPERR_GM" "$TMPOUT_GM"
+```
+
+**If no mockups exist**, still run Gemini for a text-based plan critique (it provides valuable design insight even without mockups):
+```bash
+TMPERR_GM=$(mktemp /tmp/gemini-plan-XXXXXXXX)
+TMPOUT_GM=$(mktemp /tmp/gemini-plan-out-XXXXXXXX)
+
+cat "[plan-file-path]" | gemini --model gemini-3.1-pro-preview \
+  "You are a world-class product designer reviewing a design plan (text only, no mockups). Focus on: 1) specificity vs vagueness of visual decisions, 2) AI slop risk in described patterns, 3) missing interaction states, 4) whether the plan will produce something that looks DESIGNED vs GENERATED. Answer the 7 litmus checks YES/NO. For each finding: what's wrong, severity, and the fix." \
+  > "$TMPOUT_GM" 2>"$TMPERR_GM"
+```
+Use a 5-minute timeout (`timeout: 300000`). After the command completes:
+```bash
+cat "$TMPOUT_GM"
+cat "$TMPERR_GM" && rm -f "$TMPERR_GM" "$TMPOUT_GM"
+```
+
+Present output under a `GEMINI SAYS (visual design critique):` header.
+
+**Error handling (non-blocking):**
+- Auth failure: "Gemini authentication failed. Check your API key configuration."
+- Timeout: "Gemini timed out after 5 minutes."
+- Empty response: "Gemini returned no response."
+- On any Gemini error: proceed with Codex and/or Claude subagent output only.
+
+2. **Codex design voice** (via Bash, if CODEX_AVAILABLE):
 ```bash
 TMPERR_DESIGN=$(mktemp /tmp/codex-design-XXXXXXXX)
 codex exec "Read the plan file at [plan-file-path]. Evaluate this plan's UI/UX design against these criteria.
@@ -432,7 +526,7 @@ Use a 5-minute timeout (`timeout: 300000`). After the command completes, read st
 cat "$TMPERR_DESIGN" && rm -f "$TMPERR_DESIGN"
 ```
 
-2. **Claude design subagent** (via Agent tool):
+3. **Claude design subagent** (via Agent tool):
 Dispatch a subagent with this prompt: "Read the plan file at [plan-file-path]. You are an independent senior product designer reviewing this plan. You have NOT seen any prior review. Evaluate:
@@ -448,9 +542,11 @@ For each finding: what's wrong, severity (critical/high/medium), and the fix."
 - **Auth failure:** If stderr contains "auth", "login", "unauthorized", or "API key": "Codex authentication failed. Run `codex login` to authenticate."
 - **Timeout:** "Codex timed out after 5 minutes."
 - **Empty response:** "Codex returned no response."
-- On any Codex error: proceed with Claude subagent output only, tagged `[single-model]`.
-- If Claude subagent also fails: "Outside voices unavailable — continuing with primary review."
+- On any Codex error: proceed with Gemini and/or Claude subagent output only.
+- On any Gemini error: proceed with Codex and/or Claude subagent output only.
+- If all outside voices fail: "Outside voices unavailable — continuing with primary review."
 
+Present Gemini output under a `GEMINI SAYS (visual design critique):` header.
 Present Codex output under a `CODEX SAYS (design critique):` header.
 Present subagent output under a `CLAUDE SUBAGENT (design completeness):` header.
@@ -458,26 +554,26 @@ Present subagent output under a `CLAUDE SUBAGENT (design completeness):` header.
 
 ```
 DESIGN OUTSIDE VOICES — LITMUS SCORECARD:
-═══════════════════════════════════════════════════════════════
- Check                                   Claude  Codex   Consensus
- ─────────────────────────────────────── ─────── ─────── ─────────
- 1. Brand unmistakable in first screen?    —       —        —
- 2. One strong visual anchor?              —       —        —
- 3. Scannable by headlines only?           —       —        —
- 4. Each section has one job?              —       —        —
- 5. Cards actually necessary?              —       —        —
- 6. Motion improves hierarchy?             —       —        —
- 7. Premium without decorative shadows?    —       —        —
- ─────────────────────────────────────── ─────── ─────── ─────────
- Hard rejections triggered:                —       —        —
-═══════════════════════════════════════════════════════════════
-```
-
-Fill in each cell from the Codex and subagent outputs. CONFIRMED = both agree. DISAGREE = models differ. NOT SPEC'D = not enough info to evaluate.
+═══════════════════════════════════════════════════════════════════════════
+ Check                                   Gemini  Claude  Codex   Consensus
+ ─────────────────────────────────────── ─────── ─────── ─────── ─────────
+ 1. Brand unmistakable in first screen?    —       —       —        —
+ 2. One strong visual anchor?              —       —       —        —
+ 3. Scannable by headlines only?           —       —       —        —
+ 4. Each section has one job?              —       —       —        —
+ 5. Cards actually necessary?              —       —       —        —
+ 6. Motion improves hierarchy?             —       —       —        —
+ 7. Premium without decorative shadows?    —       —       —        —
+ ─────────────────────────────────────── ─────── ─────── ─────── ─────────
+ Hard rejections triggered:                —       —       —        —
+═══════════════════════════════════════════════════════════════════════════
+```
+
+Fill in each cell from all outputs. Gemini's visual litmus checks take PRIORITY when mockups are present because it evaluates the actual visual output. CONFIRMED = majority agree. DISAGREE = models differ. NOT SPEC'D = not enough info to evaluate.
 
 **Pass integration (respects existing 7-pass contract):**
 - Hard rejections → raised as the FIRST items in Pass 1, tagged `[HARD REJECTION]`
-- Litmus DISAGREE items → raised in the relevant pass with both perspectives
+- Litmus DISAGREE items → raised in the relevant pass with all perspectives
 - Litmus CONFIRMED failures → pre-loaded as known issues in the relevant pass
 - Passes can skip discovery and go straight to fixing for pre-identified issues
 
@@ -485,7 +581,7 @@ Fill in each cell from the Codex and subagent outputs. CONFIRMED = both agree. D
 ```bash
 ~/.claude/skills/gstack/bin/gstack-review-log '{"skill":"design-outside-voices","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","status":"STATUS","source":"SOURCE","commit":"'"$(git rev-parse --short HEAD)"'"}'
 ```
-Replace STATUS with "clean" or "issues_found", SOURCE with "codex+subagent", "codex-only", "subagent-only", or "unavailable".
+Replace STATUS with "clean" or "issues_found", SOURCE with "gemini+codex+subagent", "gemini+subagent", "gemini+codex", "codex+subagent", "gemini-only", "codex-only", "subagent-only", or "unavailable".
 
 ## The 0-10 Rating Method
@@ -746,7 +842,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
 - **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
 - **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
 - **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
-- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
+- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Fallback chain: Codex → Gemini (gemini-3.1-pro-preview) → Claude subagent. Never gates shipping.
 
 **Verdict logic:**
 - **CLEARED**: Eng Review has >= 1 entry within 7 days from either \`review\` or \`plan-eng-review\` with status "clean" (or \`skip_eng_review\` is \`true\`)
diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md
index ecf0ae30..d48f670d 100644
--- a/plan-eng-review/SKILL.md
+++ b/plan-eng-review/SKILL.md
@@ -868,7 +868,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
 - **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
 - **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
 - **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
-- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
+- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Fallback chain: Codex → Gemini (gemini-3.1-pro-preview) → Claude subagent. Never gates shipping.
 
 **Verdict logic:**
 - **CLEARED**: Eng Review has >= 1 entry within 7 days from either \`review\` or \`plan-eng-review\` with status "clean" (or \`skip_eng_review\` is \`true\`)
diff --git a/ship/SKILL.md b/ship/SKILL.md
index 16d0e4b3..944372b2 100644
--- a/ship/SKILL.md
+++ b/ship/SKILL.md
@@ -339,7 +339,7 @@ Parse the output. Find the most recent entry for each skill (plan-ceo-review, pl
 - **CEO Review (optional):** Use your judgment. Recommend it for big product/business changes, new user-facing features, or scope decisions. Skip for bug fixes, refactors, infra, and cleanup.
 - **Design Review (optional):** Use your judgment. Recommend it for UI/UX changes. Skip for backend-only, infra, or prompt-only changes.
 - **Adversarial Review (automatic):** Auto-scales by diff size. Small diffs (<50 lines) skip adversarial. Medium diffs (50–199) get cross-model adversarial. Large diffs (200+) get all 4 passes: Claude structured, Codex structured, Claude adversarial subagent, Codex adversarial. No configuration needed.
-- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Falls back to Claude subagent if Codex is unavailable. Never gates shipping.
+- **Outside Voice (optional):** Independent plan review from a different AI model. Offered after all review sections complete in /plan-ceo-review and /plan-eng-review. Fallback chain: Codex → Gemini (gemini-3.1-pro-preview) → Claude subagent. Never gates shipping.
 
 **Verdict logic:**
 - **CLEARED**: Eng Review has >= 1 entry within 7 days from either \`review\` or \`plan-eng-review\` with status "clean" (or \`skip_eng_review\` is \`true\`)