diff --git a/.cargo/config.toml b/.cargo/config.toml index 2f334038e6..bb4d51925b 100644 --- a/.cargo/config.toml +++ b/.cargo/config.toml @@ -14,6 +14,7 @@ rules-audit = "run --profile tool --bin rules-audit --features audit --" semantic-audit = "run --profile tool --features cli --bin oracle-gen -- semantic-audit" scrape-feeds = "run --release -p feed-scraper --" tune-ai = "run --release --features tune --bin ai-tune --" +ai-gate = "run --bin ai-gate --" engine-inventory = "run --quiet -p engine-inventory-gen" [env] diff --git a/.claude/agents/engine-implementation-executor.md b/.claude/agents/engine-implementation-executor.md index 4f2ac0486f..dcf89a84d1 100644 --- a/.claude/agents/engine-implementation-executor.md +++ b/.claude/agents/engine-implementation-executor.md @@ -125,12 +125,12 @@ If any modified file is under `crates/engine/src/parser/`, inspect added lines f ```bash git diff --name-only | grep 'crates/engine/src/parser/' | while read f; do git diff "$f" | grep '^+' | grep -v '^+++' | grep -vE '^\+\s*//' \ - | grep -E '\.(contains|starts_with|ends_with|find)\(' \ + | grep -E '\.(contains|starts_with|ends_with|find|rfind|split|splitn|rsplit|split_once)\(' \ | grep -v '#\[test\]' | grep -v '#\[cfg(test)\]' done ``` -Any output is a hard failure unless it is a test, comment, explicitly annotated non-dispatch structural use, or `oracle_util.rs` dual-string `TextPair` helper work. +The `rfind`/`split`/`split_once`/`rsplit` arms are deliberate: `scripts/check-parser-combinators.sh` does not catch them, so a green gate is not proof of combinator compliance — this inline grep covers that blind spot. Any output is a hard failure unless it is a test, comment, explicitly annotated non-dispatch structural use, or `oracle_util.rs` dual-string `TextPair` helper work. For parser changes always run additionally: @@ -140,6 +140,28 @@ cargo coverage cargo semantic-audit ``` +### Discriminating-test gate + +Every behavioral change MUST ship at least one test that drives the real pipeline (`apply()` / the scenario runner / the cast-pipeline harness) and **would fail if the fix were reverted**. A test that only asserts the parsed AST shape — an `assert_eq!` on a parsed `AbilityDefinition` / `Effect` / `StaticMode` without resolving it through the engine — does NOT satisfy this gate. It is a shape test, not a regression test. + +Confirm discrimination concretely before returning: + +- For the primary fix, name the assertion that flips when the fix is reverted. If you cannot name one, the test does not discriminate — add one that does. +- Trace each test fixture through the fix's first input-shape dispatches (`is_none()` / `is_empty()` / variant `match` / "has-X" guards). If every fixture is degenerate in the same way (no ability, no targets, empty or single-element collection, all-generic cost), the test likely takes a different internal branch than production inputs and silently passes — reach the real arm instead. (Precedent: an Emerge cost-reduction test whose all-generic sacrifice made the wrong reduction coincide with the right one; an Undaunted test that called a function the reduction never runs in, so the positive case could not pass.) + +This is the single most common defect the `/review-impl` loop catches (shape-only tests on keyword and parser PRs). Catch it here, before review. + +### CR-annotation diff gate + +Before returning, grep every CR number you added or changed **in the diff** against `docs/MagicCompRules.txt` — not just the ones you remember writing: + +```bash +git diff | grep -E '^\+' | grep -oE 'CR [0-9]{3}(\.[0-9]+[a-z]?)?' | sed 's/^CR //' | sort -u \ + | while read -r n; do grep -qE "^${n}([^0-9]|$)" docs/MagicCompRules.txt || echo "UNVERIFIED: CR ${n}"; done +``` + +Any `UNVERIFIED:` line is a hard stop — the rule number does not exist in the rules text (a hallucinated subpart, e.g. the recurring `702.808` / wrong-keyword-subpart class) or is malformed. Re-derive the correct rule or flag it explicitly; never ship an unverified CR annotation. A clean grep is necessary but not sufficient: also confirm the cited rule actually *describes* the annotated code, not merely that the number exists. + ## Output Return a structured report to the orchestrator: @@ -147,10 +169,12 @@ Return a structured report to the orchestrator: 1. **Diff summary** — files touched, grouped by subsystem, with a one-line purpose per file. 2. **Verification results** — which Tilt resources are green; any failures with `tilt logs` excerpts (own vs unrelated). 3. **Parser diff gate** — pass/fail with offending lines if any. -4. **Judgement calls** — any place you had to choose between two readings of the plan, with the reasoning. -5. **Stop-and-return items** — any places you stopped rather than improvise. -6. **CR annotations added/changed** — each one with the grep command that verified it. -7. **Deviations from the plan** — what changed vs. the plan and why. -8. **Risks** — anything the orchestrator's `/review-impl` loop should pay extra attention to. +4. **Discriminating-test gate** — for the primary fix, the assertion that flips when the fix is reverted, and confirmation no production-reachable arm is left covered only by a degenerate fixture. State if any test is shape-only. +5. **CR-annotation diff gate** — the grep result; list any `UNVERIFIED:` rule, or confirm zero. +6. **Judgement calls** — any place you had to choose between two readings of the plan, with the reasoning. +7. **Stop-and-return items** — any places you stopped rather than improvise. +8. **CR annotations added/changed** — each one with the grep command that verified it. +9. **Deviations from the plan** — what changed vs. the plan and why. +10. **Risks** — anything the orchestrator's `/review-impl` loop should pay extra attention to. Do NOT commit. Do NOT push. The orchestrator decides what to stage and when. diff --git a/.claude/agents/pr-review-comment-resolver.md b/.claude/agents/pr-review-comment-resolver.md index 7f0d5567ed..9320d291b4 100644 --- a/.claude/agents/pr-review-comment-resolver.md +++ b/.claude/agents/pr-review-comment-resolver.md @@ -52,14 +52,43 @@ Accept: ### 2. Fetch Review Feedback -Fetch all relevant feedback: +Fetch all feedback in ONE GraphQL call. **Most actionable feedback on this repo is top-level — PR review bodies (where Gemini's summary review and most human reviews land) and issue/conversation comments — not file/line-anchored inline threads.** GraphQL is the idle rate-limit bucket, so one query avoids the per-PR REST `--paginate` walk that drains `core`; more importantly it fetches all three feedback objects **comprehensively**, which the "all blocking review comments resolved" gate requires: ```bash -gh pr view --repo phase-rs/phase --json reviewDecision,reviews,comments -gh api repos/phase-rs/phase/pulls//comments --paginate -gh api repos/phase-rs/phase/issues//comments --paginate +# First page: omit the cursor vars (they default to null = from the start). +# To paginate, pass the connection's endCursor back via -F Cursor=. +gh api graphql -F owner=phase-rs -F name=phase -F pr= -f query=' +query($owner:String!,$name:String!,$pr:Int!,$reviewsCursor:String,$commentsCursor:String,$threadsCursor:String) { + repository(owner:$owner, name:$name) { + pullRequest(number:$pr) { + reviewDecision + reviews(first:100, after:$reviewsCursor) { # top-level review bodies (Gemini summary, human reviews) — the MAJORITY + pageInfo { hasNextPage endCursor } + nodes { author{login} body state submittedAt url } + } + comments(first:100, after:$commentsCursor) { # top-level issue/conversation comments + pageInfo { hasNextPage endCursor } + nodes { author{login} body url createdAt } + } + reviewThreads(first:100, after:$threadsCursor) { # inline (file/line) threads + resolved state — the minority + pageInfo { hasNextPage endCursor } + nodes { isResolved isOutdated path line + comments(first:50) { pageInfo { hasNextPage endCursor } + nodes { author{login} body url createdAt } } } + } + } + } +}' --jq '.data.repository.pullRequest' ``` +Comprehensiveness rules for the gate: + +- **Top-level `reviews` and `comments` have no `isResolved` flag** — read every one's **body** for actionable findings and judge whether each is addressed in the current commits. Do NOT gate on review `state`: the dominant reviewer here (Gemini) posts its findings in a `COMMENTED` review with `reviewDecision: null` (verified on live PRs), so a `state`-based gate would skip the majority of review content. Treat `state == CHANGES_REQUESTED` as an *additional* hard block, never the sole blocking signal. Compare reviews by each author's **latest** `submittedAt` — a later approval from the same author supersedes their earlier change-request, and vice-versa. +- **Inline `reviewThreads`** are the minority here but still binding: filter to `isResolved == false` for actionable items. +- If ANY connection's `pageInfo.hasNextPage` is true, paginate by passing that connection's `endCursor` back via its cursor variable (`-F reviewsCursor=`, `-F commentsCursor=…`, `-F threadsCursor=…`) and re-running — do not stop at the first page; this is the comprehensive gate, not a triage slice. Truncating `reviews`/`comments` would silently hide top-level blockers, the majority case. The nested per-thread `comments` connection also exposes `pageInfo`: if a single thread's `comments.pageInfo.hasNextPage` is true (50+ replies — rare), refetch that thread's comments with its own `after` cursor. + +Never substitute a per-PR `gh api .../pulls//comments --paginate` REST walk (drains `core`, inline-only, lacks resolved-state) or a time-windowed `since=` sweep (risks dropping old unaddressed top-level feedback). + For each item, extract source, author, body, file path and line/range, timestamps, and whether it is resolved, outdated, duplicated, or still actionable. Skip resolved and informational comments. If uncertain, keep the item and mark it `needs-human-confirmation`. ### 3. Categorize And Prioritize diff --git a/.claude/skills/add-ai-feature-policy/SKILL.md b/.claude/skills/add-ai-feature-policy/SKILL.md index 1d1661d511..ec8fd09760 100644 --- a/.claude/skills/add-ai-feature-policy/SKILL.md +++ b/.claude/skills/add-ai-feature-policy/SKILL.md @@ -7,9 +7,9 @@ description: Use when adding a new deck-aware AI feature (`DeckFeatures` axis) a This is the authoritative checklist for adding a new feature to the three-layer AI architecture (Features → Plan → Policies). Nine features exist as reference: `landfall`, `mana_ramp`, `tribal`, `control`, `aristocrats`, `aggro_pressure`, `tokens_wide`, `plus_one_counters`, `spellslinger_prowess`. Each follows the same pattern; mirror it. -**Before you start:** Read `landfall.rs` (simplest), then `aristocrats.rs` (geometric-mean commitment with identity lookup), then one feature similar in shape to yours. Trace from `features/.rs::detect()` through `session.rs::features_for` through `policies/.rs::verdict` so the data flow is concrete. +**Before you start:** Read `landfall.rs` (simplest), then `aristocrats.rs` (geometric-mean commitment with identity lookup), then one feature similar in shape to yours. Trace from `features/.rs::detect()` through `DeckFeatures::analyze` and `AiSession::from_game` through `policies/.rs::verdict` so the data flow is concrete. -> **CR Verification Rule.** Every CR number in annotations MUST be verified by greppign `docs/MagicCompRules.txt` BEFORE you write it. Do NOT rely on memory or prior prompts. The 701.x and 702.x sections are arbitrary sequential assignments that hallucinate consistently. Run `grep -n "^701.34" docs/MagicCompRules.txt` for every number. CR 122.1d is Stun (NOT +1/+1 — that's CR 122.1a). CR 701.27 is Transform (NOT Proliferate — that's CR 701.34). Two real catches from the existing batch — you will have similar misses unless you grep. +> **CR Verification Rule.** Every CR number in annotations MUST be verified by grepping `docs/MagicCompRules.txt` BEFORE you write it. Do NOT rely on memory or prior prompts. The 701.x and 702.x sections are arbitrary sequential assignments that hallucinate consistently. Run `grep -n "^701.34" docs/MagicCompRules.txt` for every number. CR 122.1d is Stun (NOT +1/+1 — that's CR 122.1a). CR 701.27 is Transform (NOT Proliferate — that's CR 701.34). Two real catches from the existing batch — you will have similar misses unless you grep. --- @@ -18,7 +18,8 @@ This is the authoritative checklist for adding a new feature to the three-layer ``` AiSession (per-game cache; built once in AiSession::from_game) ├─ features: HashMap ← Layer 1: structural detection - ├─ plan: HashMap ← Layer 2: derived schedule + ├─ plan: HashMap ← Layer 2: bottoming + curve schedule + ├─ deck_profile / strategy: per-player strategic profiles ├─ synergy: HashMap └─ memory: PolicyMemory @@ -42,8 +43,24 @@ Both run inside their respective registries (`PolicyRegistry::default`, `MulliganRegistry::default`). Activation is the single multiplicative knob. ``` +`AiSession::from_game` analyzes `current_main` plus `current_commander`; commander entries are +weighted as build-around cards before feature/profile/synergy detection. One-off analysis helpers +that only receive a deck slice stay pure over that slice. + **Single-knob rule.** A `TacticalPolicy` exposes `activation()` returning `Option`. The registry multiplies `verdict.delta * activation` exactly once. There is no `archetype_scale`, no `turn_mult`, no second pass. If a policy needs archetype/turn-sensitive weight, it computes it inside `activation()` from the inputs. +**Score contract.** Tactical policy deltas are card-equivalent units: `delta = 1.0` means one card of expected value. Use the band helpers on `PolicyVerdict` so the chosen strength is visible at the call site: + +| Band | Helper | Range | Meaning | +|---|---|---|---| +| Nudge | `PolicyVerdict::nudge` | `(0.0, 0.3]` | tie-breaker | +| Preference | `PolicyVerdict::preference` | `(0.3, 1.5]` | half-card to 1.5-card preference | +| Strong | `PolicyVerdict::strong` | `(1.5, 5.0]` | multi-card swing | +| Critical | `PolicyVerdict::critical` | `(5.0, 15.0]` | game-deciding | +| Veto | `PolicyVerdict::reject` | n/a | never take this action | + +All scoring scalars come from `AiConfig::policy_penalties` / `PolicyPenalties` with rationale comments. Thresholds that only gate activation stay as `pub const` in the policy/feature module. Never encode vetoes as sentinel scores. + --- ## 2. Hard constraints (non-negotiable) @@ -101,7 +118,7 @@ Every feature must produce ALL of these: 1. **AST verification table in module docstring** — one line per axis citing the exact `crates/engine/src/types/...:line` where the type lives + a CR number where applicable. Mirror `landfall.rs:1-12` and `aggro_pressure.rs:1-25`. This is your proof that the AST supports detection without parser changes. 2. **`Feature` struct** — counters as `u32`, ratios as `f32`, identity-lookup lists as `Vec`, single `commitment: f32` (0.0..=1.0). NO bool fields. Document each field with a CR if it implements a rule. -3. **Commitment formula** with calibration anchor in a doc comment. Show the math (e.g., `commitment = clamp01(2.0 * low_density + 1.0 * hasty_density + 0.6 * burn_density)`) AND a real-deck calibration ("Mono-Red Burn: ~16 one-drops + 4 burn → commitment ≈ 0.85"). Anti-calibration ("UW control → commitment ≈ 0.0") is also required. Two formula shapes are established: +3. **Commitment formula** with calibration anchor in a doc comment. Commitments are density-normalized per 60 nonland cards so 60-card and 99-card decks with the same density score the same. Use only the shared helpers in `features/commitment.rs`: `commitment::density_per_60(count, total_nonland)`, `commitment::weighted_sum(&[(weight, density)])`, and `commitment::geometric_mean(&[pillar_density])`. Show the math AND a real-deck calibration ("Mono-Red Burn: ~16 one-drops + 4 burn → commitment ≈ 0.85"). Anti-calibration ("UW control → commitment ≈ 0.0") is also required. Two formula shapes are established: - **Weighted sum, clamped** (when missing pillars are tolerable). Pattern: control, aggro_pressure, spellslinger. - **Geometric mean with zero-pillar collapse** (when missing a pillar = not this archetype). Pattern: aristocrats, tokens_wide. 4. **Parts-based predicate exports** — one `pub(crate) fn is__parts() -> bool` per axis. Each takes the slice the policy actually has access to on `GameObject` (e.g., `is_burn_spell_parts(core_types: &[CoreType], abilities: &[AbilityDefinition])`). Add a public `is_X(face: &CardFace)` wrapper if useful for tests; remove it if dead per clippy. **The `_parts` predicates are the contract** — they let the policy classify live `GameObject`s without reconstructing a `CardFace`. @@ -179,7 +196,7 @@ pub trait MulliganPolicy: Send + Sync { - Constant `Some(1.0)` requires an `// activation-constant: ` marker (lint at `policies/tests/activation_marker_lint.rs`). **`verdict()` patterns:** -- Return `PolicyVerdict::Score { delta: f64, reason: PolicyReason }` (no Reject unless you genuinely want to veto). +- Return `PolicyVerdict::{nudge,preference,strong,critical}(ctx.config.policy_penalties., reason)` for scores. Return `PolicyVerdict::reject(reason)` only for genuine vetoes. - `PolicyReason::new("stable_kind_string").with_fact("metric_name", value as i64)` — kind is `&'static str`, frozen per policy. - Mirror `free_outlet_activation.rs:80-94`: re-classify the live ability structurally using parts predicates against `obj.abilities`/`obj.keywords`/etc. — don't trust deck-time classification at decision time. @@ -187,6 +204,7 @@ pub trait MulliganPolicy: Send + Sync { - Opt-out below `MULLIGAN_FLOOR` returns `Score { delta: 0.0, reason: _keepables_na }`. - Walk `hand` (`Vec`); look up each via `state.objects.get(&oid)`. Classify by core type, mana value, identity lookup against feature `payoff_names`. - Verdict tiers in priority order (first match returns): `_keepable_ideal` (+1.5..2.0), `_workable` (+0.5..1.0), penalty (-1.0..-1.5), default (0.0). +- `turn_order` and `mulligans_taken` are real inputs. New policies must consume them or carry an `// input-unused: ` marker next to the binding. --- @@ -198,13 +216,13 @@ For a feature named `` with `Feature` struct and policies `Pol |---|---| | `crates/phase-ai/src/features/mod.rs` | Add `pub mod ;` + `pub use ::Feature;` + `pub : Feature` field on `DeckFeatures`. | | `crates/phase-ai/src/features/.rs` | NEW. The feature module + tests. | -| `crates/phase-ai/src/session.rs` | Import + call `::detect(deck)` in `features_for()`. | +| `crates/phase-ai/src/features/mod.rs` | Call `::detect(deck)` in `DeckFeatures::analyze`. | | `crates/phase-ai/src/policies/mod.rs` | Add `pub mod ;` (and any new tactical policy module). | | `crates/phase-ai/src/policies/.rs` | NEW. Tactical policy + tests. | | `crates/phase-ai/src/policies/registry.rs` | Add `PolicyId::` (and `Mulligan`) variants. Import `super::::Policy`. Add `Box::new(Policy)` to `PolicyRegistry::default`'s vec. | | `crates/phase-ai/src/policies/mulligan/_keepables.rs` | NEW (if a mulligan policy is in scope). | | `crates/phase-ai/src/policies/mulligan/mod.rs` | Add module + `pub use` + `Box::new(Mulligan)` to `MulliganRegistry::default`'s vec. | -| `crates/phase-ai/src/plan/curves.rs` | OPTIONAL: add tempo-class branch in `tempo_class_for` (carefully ordered — see Section 8). Add expected-curve adjustments in `expected_lands_for`/`expected_mana_for`/`expected_threats_for` if archetype shifts the curve. | +| `crates/phase-ai/src/plan/curves.rs` | OPTIONAL: add tempo-class branch in `tempo_class_for` (carefully ordered — see Section 8). Add expected-curve adjustments in `expected_lands_for`/`expected_mana_for`/`expected_threats_for` if archetype shifts the curve. Plan data is consumed by mulligan bottoming and curve expectations; do not add zombie plan fields. | The `no_name_matching` lint at `features/tests/no_name_matching.rs` automatically scans both `src/features/` and `src/policies/` — no manual exemption needed unless your file names appear in the lint's allow-list (unlikely). @@ -292,23 +310,29 @@ Every feature MUST have: - `opts_out_when_commitment_low` - `ideal_hand_` → positive delta with `_keepable_ideal` reason - Each verdict tier (workable, penalty, default) +- A `turn_order` / `mulligans_taken` test, or an explicit `// input-unused: ` marker in the implementation. + +**Measurement evidence:** +- New-policy PRs attach a `rtk cargo ai-gate` paired-seed report. No flips is acceptable for narrow policies, but the run must exist. --- ## 12. Implementation order (each step compiles before the next) 1. Read `landfall.rs` + the most-similar reference feature end to end. -2. Verify each CR number you'll cite by greppign `docs/MagicCompRules.txt`. +2. Verify each CR number you'll cite by grepping `docs/MagicCompRules.txt`. 3. STEP 0 (if needed): refactor an existing predicate from another feature to `pub(crate)` parts shape (e.g., `control::is_card_draw_parts` was promoted for spellslinger). Run the existing tests; behavior must be preserved. -4. Create `features/.rs`: struct + thresholds + `detect()` + parts predicates + tests. -5. Wire `features/mod.rs` and `session.rs`. -6. Create `policies/.rs` (tactical) + tests. Add `PolicyId` variant + `Box::new()` registration. -7. Create `policies/mulligan/_keepables.rs` + tests. Add `PolicyId` variant + register in `MulliganRegistry::default`. -8. (Optional) Add `plan/curves.rs` branches with tests. -9. (Optional) Cross-feature amplification (Section 10). -10. `cargo fmt && cargo clippy --all-targets -p phase-ai -- -D warnings && cargo test -p phase-ai --lib`. All must be clean. -11. Commit with explicit `git add path1 path2 ...`. NEVER `git add -A`. -12. Spawn an opus advisory review (see `git log --grep="address review"` for prior fix-up commit messages as templates). Address MUST FIX in a follow-up commit; CONSIDER items at your discretion. +4. Optionally start from `scripts/new-ai-policy.sh `; it writes the three scaffold files and prints the wiring checklist. +5. Create `features/.rs`: struct + thresholds + `detect()` + parts predicates + tests. +6. Wire `features/mod.rs`. +7. Create `policies/.rs` (tactical) + tests. Add `PolicyId` variant + `Box::new()` registration. +8. Create `policies/mulligan/_keepables.rs` + tests. Add `PolicyId` variant + register in `MulliganRegistry::default`. +9. (Optional) Add `plan/curves.rs` branches with tests. +10. (Optional) Cross-feature amplification (Section 10). +11. Run `cargo fmt --all`, then use Tilt verification: `./scripts/tilt-wait.sh clippy test-ai`. If Tilt is down, fall back to the direct phase-ai clippy/test commands. +12. Run `rtk cargo ai-gate` for new policy work. +13. Commit with explicit `git add path1 path2 ...`. NEVER `git add -A`. +14. Spawn an opus advisory review (see `git log --grep="address review"` for prior fix-up commit messages as templates). Address MUST FIX in a follow-up commit; CONSIDER items at your discretion. --- diff --git a/.claude/skills/ai-duel/SKILL.md b/.claude/skills/ai-duel/SKILL.md index eed19a7ec5..83c6d2ca39 100644 --- a/.claude/skills/ai-duel/SKILL.md +++ b/.claude/skills/ai-duel/SKILL.md @@ -11,13 +11,26 @@ Run AI-vs-AI game simulations to test decision quality, validate matchups, and c ```bash # Default: Red Aggro vs Green Midrange, 5 games, Medium difficulty -cargo run --release --bin ai-duel -- client/public --batch 5 +rtk cargo run --release --bin ai-duel -- client/public --batch 5 # Single verbose game (see every combat action and spell cast) -cargo run --release --bin ai-duel -- client/public --seed 42 --difficulty VeryHard +rtk cargo run --release --bin ai-duel -- client/public --seed 42 --difficulty VeryHard # Batch with specific seed for reproducibility -cargo run --release --bin ai-duel -- client/public --batch 20 --seed 1000 --difficulty Medium +rtk cargo run --release --bin ai-duel -- client/public --batch 20 --seed 1000 --difficulty Medium + +# Full registered matchup suite in measurement mode +rtk cargo run --release --bin ai-duel -- client/public --suite --games 10 --seed 42 \ + --output target/duel-suite-results.json + +# Compare two suite reports with paired-seed sign-test status +rtk cargo run --release --bin ai-duel -- compare crates/phase-ai/baselines/duel-suite.json \ + target/duel-suite-results.json + +# Commander candidate-seat measurement +rtk cargo run --release --bin ai-duel -- client/public --commander-suite --games 8 --seed 42 \ + --difficulty Hard --baseline-difficulty Medium \ + --output target/commander-suite-results.json ``` ## CLI Options @@ -30,6 +43,34 @@ cargo run --release --bin ai-duel -- client/public --batch 20 --seed 1000 --diff | `--matchup NAME` | Deck matchup preset | red-vs-green | | `--list-matchups` | Show available matchups | - | | `--verbose` | Print every action (full trace) | off | +| `--suite` | Run every registered `MatchupSpec` in measurement mode | off | +| `--games N` | Games per matchup/suite cell | 10 for suite, 4 for commander suite | +| `--output PATH` | JSON report path | `target/duel-suite-results.json` | +| `--suite-filter STR` | Run only suite matchups whose id contains STR | all | +| `--show-attribution` | Capture `phase_ai::decision_trace` policy attribution | off | +| `--commander-suite` | Run 4-player Commander candidate-seat rotations | off | +| `--baseline-difficulty LEVEL` | Baseline seats for `--commander-suite` | Medium | +| `--feed PATH` | Commander feed under data root | `feeds/mtggoldfish-commander.json` | + +## Measurement And Gates + +`ai-duel` uses `AiConfig::into_measurement(seed)` for single, suite, and +Commander regression runs. Measurement mode disables wall-clock search budgets +and bounds search by node/depth budgets, so outcomes are functions of the +binary, config, matchup, and seed. + +Use `rtk cargo ai-gate` for the normal regression gate. It runs the pinned suite +against `crates/phase-ai/baselines/`, compares paired seeds, and reports +FAIL/WARN/PASS with a binomial sign-test p-value. WARN means movement was +observed but not significant enough to fail the gate. + +When intentionally changing AI behavior, refresh the baseline in the same PR +after review: + +```bash +rtk ./scripts/refresh-ai-baseline.sh +rtk cargo ai-gate +``` ## Performance Guide @@ -45,7 +86,10 @@ All times are release mode (`--release`). Debug mode is 5-10x slower. ## Deck Configuration -The `ai-duel` binary uses hardcoded decks selected via `--matchup`. Deck functions are in `crates/phase-ai/src/bin/ai_duel.rs`. +The `ai-duel` binary resolves `--matchup` and `--suite` entries from +`crates/phase-ai/src/duel_suite/spec.rs`. Inline starter/metagame deck builders +live in `crates/phase-ai/src/duel_suite/inline_decks.rs`; snapshot decks live +under `crates/phase-ai/duel_decks/`. ### Available Matchups @@ -79,7 +123,10 @@ Use `--matchup NAME` to select a preset. Use `--list-matchups` to see all option ### Changing Decks -To add new matchups, add a deck builder function and matchup entry in `ai_duel.rs`. Card names must match entries in `client/public/card-data.json`. Use `jq 'keys[]' client/public/card-data.json | grep -i "card name"` to find exact names. +To add new matchups, add a `MatchupSpec` in `duel_suite/spec.rs`. Use an inline +builder only for small stable decks; prefer snapshot decks for real metagame +coverage. Card names must match entries in `client/public/card-data.json`. Use +`jq 'keys[]' client/public/card-data.json | rg -i "card name"` to find exact names. To find high-coverage metagame decks for testing, check the feed data: ```bash @@ -112,6 +159,15 @@ Control decks improve more at higher difficulty levels (they need search to time For testing AI quality independent of deck matchup advantage, use mirror matches (`prowess-mirror`, `red-mirror`, etc.). Win rates should be close to 50/50. +### Commander Suite + +`--commander-suite` loads four resolvable decks from the Commander feed, runs +one candidate seat against three baseline seats, and rotates the candidate seat +across P0-P3. Metrics are candidate win rate, survival turns, and elimination +order. Use this as the first strength signal for multiplayer/Commander changes; +it is slower than the two-player suite and should run nightly or on targeted AI +changes rather than on every parser-only change. + ## Interpreting Results **Healthy signs:** @@ -136,14 +192,30 @@ When running single verbose games, look for: - **Not attacking with lethal**: Having lethal on board but not swinging - **Tapping out into lethal**: Casting sorcery-speed when opponent has lethal on board +## Community Scenario Fixtures + +When a Discord `#ai-suggestions` thread includes a saved game-state zip, turn it +into a deterministic AI scenario before landing the fix: + +1. Download the zip and commit it under `crates/phase-ai/fixtures/scenarios/`. +2. Add an entry to `crates/phase-ai/fixtures/scenarios/community-scenarios.json` + with the Discord thread id, archive name, and expected action assertion. +3. Run the fixture through `crates/phase-ai/tests/community_scenarios.rs`; it + loads the same `{ "gameState": ... }` export used by `ai-bench-state`, applies + saved-state compatibility migrations, and evaluates the action in measurement + mode with a fixed seed. + ## Related Files | File | Purpose | |------|---------| | `crates/phase-ai/src/bin/ai_duel.rs` | Duel simulation binary | | `crates/phase-ai/src/bin/ai_tune.rs` | CMA-ES weight optimization | +| `crates/phase-ai/src/bin/ai_gate.rs` | Pinned regression gate wrapper | | `crates/phase-ai/src/auto_play.rs` | AI action driver | | `crates/phase-ai/src/combat_ai.rs` | Combat decisions | +| `crates/phase-ai/src/duel_suite/` | Matchup registry, runner, compare, attribution | | `crates/phase-ai/src/search.rs` | Action selection + search | | `crates/phase-ai/tests/ai_quality.rs` | Regression test suite | | `crates/phase-ai/tests/scenarios.rs` | Scenario integration tests | +| `crates/phase-ai/tests/community_scenarios.rs` | Discord saved-state scenario tests | diff --git a/.claude/skills/bug-triage/SKILL.md b/.claude/skills/bug-triage/SKILL.md index 80c479c710..fc8f9b6922 100644 --- a/.claude/skills/bug-triage/SKILL.md +++ b/.claude/skills/bug-triage/SKILL.md @@ -730,6 +730,11 @@ This is the inverse of the `fully_parsed` hard rule: `fully_parsed` never proves ### AI bugs (area:ai) 1. Check `crates/phase-ai/` for the relevant evaluation/action-generation logic 2. AI bugs are rarely caught by parser coverage — they need gameplay testing +3. If the report includes a saved game-state zip, convert it before the fix + lands: download the zip to `crates/phase-ai/fixtures/scenarios/`, add a + `community-scenarios.json` assertion with the Discord thread id and expected + action shape, then verify `crates/phase-ai/tests/community_scenarios.rs` + exercises it in measurement mode. ## Triage Data Files diff --git a/.claude/skills/card-test/SKILL.md b/.claude/skills/card-test/SKILL.md new file mode 100644 index 0000000000..2acef57498 --- /dev/null +++ b/.claude/skills/card-test/SKILL.md @@ -0,0 +1,138 @@ +--- +name: card-test +description: Canonical recipe for writing engine cast-pipeline tests. Use GameScenario + GameRunner::cast(...).resolve() and assert via CastOutcome deltas. Covers the five test-harness foot-guns (hand-written TargetRef vectors, incomplete modal target submission, wrong-point hand baseline, inline-keyword cards, AST-internal flag assertions) and the right-way fix for each. Use whenever writing or porting a runtime test that casts a spell and checks its effect. +--- + +# card-test — the canonical cast-pipeline test recipe + +This skill gives ONE rigid recipe for runtime engine tests that cast a spell and +assert its effect. It exists because five test-harness foot-guns recur; the +[`SpellCast`] driver and [`CastOutcome`] (in `crates/engine/src/game/scenario.rs`) +make them structurally impossible when you follow this recipe. + +Reference ports that demonstrate the pattern: + +- `crates/engine/tests/chord_of_calling.rs` — X-spell, convoke, `final_waiting_for()` boundary. +- `crates/engine/src/game/casting.rs` — `exsanguinate_*` (life deltas, 2- and 3-player), `vicious_rivalry_*` (X-cost filter + zones), `kozileks_command_modes_*` (modal + X + multi-target). + +## The recipe + +```rust +use engine::game::scenario::{GameScenario, GameRunner}; +// ... typed imports for the effect under test ... + +#[test] +fn my_card_does_the_thing() { + // 1. BUILD STATE via GameScenario (or wrap a raw GameState with + // GameRunner::from_state(state) when the test builds state imperatively). + let mut scenario = GameScenario::new(); + scenario.at_phase(Phase::PreCombatMain); + let spell = scenario + .add_spell_to_hand_from_oracle(P0, "My Card", /* is_instant */ true, ORACLE) + .id(); + // ... stage targets / mana / library as needed ... + let mut runner = scenario.build(); + + // 2. CAST through the pipeline. Declare INTENT, never targets-by-hand. + let outcome = runner + .cast(spell) + .modes(&[0, 2]) // modal "choose N" — omit if non-modal + .x(3) // announce X — omit if non-X + .target_player(P1) // declare a player intent (reusable across slots) + .target_objects(&[victim]) // declare object intents (one per slot, in order) + .convoke_with(&[creature]) // tap creatures to pay via Convoke — omit otherwise + .resolve(); + + // 3. ASSERT via CastOutcome deltas — behavior/semantics, never AST internals. + outcome.assert_life_delta(P1, -3); + outcome.assert_zone(&[victim], Zone::Exile); + outcome.assert_hand_drawn(P0, 1); + // For "no further prompt" boundaries (e.g. fail-to-find), inspect the halt state: + assert!(matches!(outcome.final_waiting_for(), WaitingFor::Priority { .. })); +} +``` + +### What the driver does for you + +`resolve()` runs a bounded state-machine over `state.waiting_for` (CR 601.2a–h): + +- `ModeChoice` → submits `.modes(..)` (panics if modal but no modes declared). +- `ChooseXValue` → submits `.x(..)` (panics if X needed but not declared). +- `ManaPayment { convoke_mode }` → taps each `.convoke_with(..)` creature with + mana of its color, then finalizes (CR 702.51b). Pool-funded casts auto-pay and + never surface this window. +- `TargetSelection` → answers **one slot at a time, in written order** + (CR 601.2c). Object intent is consumed (one declared object per slot); player + intent is reusable (one player may be targeted by several modes). +- `Priority` (post-cast) → captures the per-player hand baseline at stack commit + (CR 601.2a) and proceeds to resolution. +- Resolution auto-answers `OrderTriggers` (CR 603.3b) and `ScryChoice` (keeps + looked-at cards on top, CR 701.22a); it stops at stack-empty or at any prompt + it is not taught to answer (e.g. `SearchChoice`) so you can assert on it via + `final_waiting_for()`. +- Any unhandled prompt → a clear `extend the driver or drive this case manually` + panic. Extending the driver is the correct response; never assert around a + silent skip. + +### CastOutcome accessors + +| Accessor | Returns | +|---|---| +| `hand_drawn(p)` / `assert_hand_drawn(p, n)` | net cards drawn since stack commit (the clean resolution delta) | +| `zone_of(o)` / `assert_zone(&[o], zone)` | an object's current zone | +| `life_delta(p)` / `assert_life_delta(p, n)` | net life change since before the cast | +| `final_waiting_for()` | the state the pipeline halted in | +| `state()` | read-only `&GameState` for assertions the typed accessors don't cover | + +## Anti-patterns — the five foot-guns and the right-way fix + +1. **Hand-written `TargetRef` vectors.** Building a flat `Vec` and + submitting `SelectTargets { targets }` is fragile (`TargetRef` is non-`Copy`; + `.copied()` won't compile, and the order must match the slots exactly). + *Fix:* declare intent with `.target_object(..)` / `.target_player(..)`; the + driver matches each intent to its slot. + +2. **Incomplete modal target submission.** Omitting one mode's slot yields + `InvalidAction("Illegal target selected")` because targets are validated one + per slot, in written order (`ability_utils::choose_target`). + *Fix:* the driver answers every slot via `ChooseTarget` in order — you cannot + forget a slot. + +3. **Hand/zone baseline captured at the wrong point.** `handle_cast_spell` does + NOT remove the spell from hand; CR 601.2a removes it only when the cast + commits to the stack (the `Priority` window). A baseline taken before that is + off by one. + *Fix:* `CastOutcome::hand_drawn` is measured against the stack-commit + baseline the driver captures for you. Use `assert_hand_drawn`, never a + hand-picked `let hand_before = ...`. + +4. **Keywords fed as inline Oracle reminder text.** Writing `"Convoke (Your + creatures ...)"` as plain text parses to `Effect::Unimplemented`. + *Fix:* build keyworded cards via + `from_oracle_text_with_keywords(&["Convoke"], text)`, and pay convoke via + `.convoke_with(&[..])` (which the driver routes through `TapForConvoke`). + +5. **Asserting representation-internal dual-encoded flags.** Asserting an + AST field such as `ChangeZone.up_to` couples the test to one encoding of the + spec rather than the behavior. + *Fix:* assert behavior via `CastOutcome` deltas (`zone_of`, `life_delta`, + `hand_drawn`). When a SHAPE assertion is genuinely needed (parser structure), + label the test SHAPE and assert via semantic accessors (e.g. a + `MultiTargetSpec`), not internal bools. + +## Hard rules + +- **Never call the raw `resolve()` stack function directly.** Drive through the + `apply()` pipeline (via `runner.cast(..).resolve()` or `runner.act(..)`). + Calling `stack::resolve_top` / `effect::resolve` directly bypasses the + intervening-if recheck and the `cast_from_zone` carry-through, so cast-trigger + bugs (cascade/storm/casualty) go invisible. See the memory landmine + `project_cast_from_zone_stack_wipe`. +- **Prefer runtime-behavior tests.** Reserve AST-SHAPE tests for parser + structure; label them SHAPE and assert via semantic accessors, not internal + flags. The two Kozilek/Chord SHAPE tests + (`kozileks_command_full_four_mode_parse`, + `chord_of_calling_parses_x_mana_value_search_shape`) are the correct pattern + for that and are intentionally NOT driven through `cast()`. +- **CR-annotate any assertion that encodes a rule** (verify the number against + `docs/MagicCompRules.txt` before writing it). diff --git a/.claude/skills/engine-implementer/SKILL.md b/.claude/skills/engine-implementer/SKILL.md index f8edeae164..25cb7ae81b 100644 --- a/.claude/skills/engine-implementer/SKILL.md +++ b/.claude/skills/engine-implementer/SKILL.md @@ -77,6 +77,8 @@ cargo fmt --all After a non-zero `tilt-wait.sh`, fetch details with `tilt logs --tail 50 --since 2m`. Distinguish your diff's errors from concurrent-agent errors per CLAUDE.md's "Defer to other active agents" guidance. +Confirm the executor's two pre-commit gates came back clean (items 4 and 5 of its report): the **discriminating-test gate** (at least one test drives the real pipeline and would fail if the fix were reverted — AST-shape-only coverage does not count) and the **CR-annotation diff gate** (every added/changed `CR ` resolves in `docs/MagicCompRules.txt`). If the executor shipped only shape tests, or any CR annotation came back `UNVERIFIED`, loop back to Step 3 with that as a fix constraint — do not commit shape-only coverage or an unverified CR number. + ### Step 5 — Review implementation until clean (unbounded loop) Spawn a `general-purpose` agent and instruct it to invoke `/review-impl` against the implementation diff. The reviewer MUST also verify the originally reported bug or requirement is actually fixed via a discriminating runtime test — not just that the code looks clean (`feedback_review_impl_verify_bug_fixed`). diff --git a/.claude/skills/oracle-parser/SKILL.md b/.claude/skills/oracle-parser/SKILL.md index a24b02af69..c9a3effae3 100644 --- a/.claude/skills/oracle-parser/SKILL.md +++ b/.claude/skills/oracle-parser/SKILL.md @@ -72,7 +72,7 @@ Each rule below is defined in CLAUDE.md. One-sentence principle + codebase examp | Rule | Example Location | |------|-----------------| | **Never match verbatim Oracle text** — decompose every phrase into typed building blocks (grammar + helpers + enums). A verbatim string match handles exactly one card and poisons the architecture. | Contrast: typed `QuantityRef`/`Comparator` vs. literal string | -| **Compose combinators by dimension** — N independent dimensions = N chained `alt()` calls, not N! branches. | `oracle_nom/condition.rs` multi-axis composition | +| **Compose combinators by dimension** — N independent axes = a sum of per-axis `alt()`/`opt()` calls inside one sequence, never a flat `alt` of full-string `tag`s (the product). Variation in the *middle* or an *optional* segment still factors: `recognize((tag(..), alt(..), tag(..), opt(..), tag(..)))`. Smell: a flat `alt` whose arms share a long common prefix **and** suffix. | `oracle_nom/condition.rs` multi-axis composition; PATTERNS.md §8b | | **Nest by prefix dispatch** — shared prefixes use `preceded(tag(...), sub_combinator)` to eliminate redundant matching. | `oracle_trigger.rs` phase trigger nesting | | **Word-boundary scanning** — try a combinator at each word boundary via scanning loop, not `contains()` chains. | `oracle_casting.rs::scan_timing_restrictions`, `oracle_trigger.rs::scan_for_phase` | | **`parse_inner_condition` is the single authority** for all game-state conditions. Trigger/static parsers MUST delegate to it. | `oracle_nom/condition.rs::parse_inner_condition` | @@ -98,26 +98,61 @@ If any answer is wrong, **stop and refactor before moving on.** ## 2. Architecture Overview -### Parse Pipeline +### Parse Pipeline — Document-Level Two-Phase (parse → IR → lower) + +`parse_oracle_text()` (oracle.rs, near the bottom of the file) is the public +entry point and a **thin wrapper** over two phases: `parse_oracle_ir()` (IR +production) followed by `lower_oracle_ir()` (IR lowering). Diagnostics flow +through `OracleDocIr.diagnostics` → `ParsedAbilities.parse_warnings`. ``` Oracle text (from MTGJSON) ↓ -strip_reminder_text() — remove (parenthesized text) +parse_oracle_ir() — oracle.rs: the priority router (see §3) + ├─ normalize_card_name_refs() — card name / "this creature" → ~ (once, at entry) + ├─ pre-parsers (before the line loop): Saga chapters [oracle_saga.rs], + │ Attraction visit lines [oracle_attraction.rs], Class levels + │ [oracle_class.rs], Leveler LEVEL blocks [oracle_level.rs], + │ Spacecraft "N+ |" thresholds [oracle_spacecraft.rs], Strive cost + ├─ per line: strip_reminder_text(), then classify by priority slot (§3) ↓ -normalize_self_refs() — card name / "this creature" → ~ +OracleDocIr — oracle_ir/doc.rs: Vec + diagnostics. + │ Core variants carry typed IR (EffectChainIr, + │ TriggerIr, StaticIr, ReplacementIr); PreLowered + │ variants carry already-assembled engine types. ↓ -parse_oracle_text() — classify line by priority (see §3) +lower_oracle_ir() — oracle.rs: exhaustive match on each OracleItemIr + ↓ (core IR → dedicated lowering fn; PreLowered → identity) +ParsedAbilities — abilities / triggers / statics / replacements / + keywords / casting options + parse_warnings +``` + +Per-line classification inside `parse_oracle_ir` (simplified; §3 has the full slot table): + +``` ├─ Keywords-only → keyword extraction ├─ "When/Whenever/At" → parse_trigger_line() [oracle_trigger.rs] ├─ Contains ":" → activated ability parsing [oracle_cost.rs + oracle_effect/] - ├─ is_static_pattern() → parse_static_line() [oracle_classifier.rs → oracle_static.rs] + ├─ is_static_pattern() → parse_static_line() [oracle_classifier.rs → oracle_static/] ├─ is_replacement_pattern() → parse_replacement_line() [oracle_classifier.rs → oracle_replacement.rs] ├─ Imperative verb → parse_effect_chain() [oracle_effect/] ├─ dispatch_line_nom() → parse_effect_chain_with_context() [oracle_dispatch.rs → oracle_effect/] └─ Fallback → Effect::Unimplemented ``` +### IR Layer — `oracle_ir/` + +| Sub-module | Purpose | +|-----------|---------| +| `ast.rs` | All parser AST types (`ParsedEffectClause`, `ClauseAst`, modal/loyalty AST — moved here from `oracle_effect/types.rs`, `oracle_modal.rs`, `oracle.rs`) | +| `doc.rs` | Document-level IR: `OracleDocIr`, `OracleItemIr` | +| `context.rs` | `ParseContext` for stateful parsing (subject, actor, card_name, host_self_reference, …) | +| `diagnostic.rs` | `OracleDiagnostic` — structured parse warnings | +| `effect_chain.rs` | `EffectChainIr` + lowering for spell/ability effect chains | +| `trigger.rs` | `TriggerIr` + lowering | +| `static_ir.rs` | `StaticIr` + lowering | +| `replacement.rs` | `ReplacementIr` + lowering | + ### Nom Combinator Layer — `oracle_nom/` All parser branches delegate atomic parsing to shared nom 8.0 combinators: @@ -130,16 +165,20 @@ All parser branches delegate atomic parsing to shared nom 8.0 combinators: | `duration.rs` | Duration phrase combinators ("until end of turn", etc.) | | `condition.rs` | Condition phrase combinators ("if", "unless", "as long as") | | `filter.rs` | Filter property combinators (zone, type, controller, "with") | -| `error.rs` | `OracleResult` type, `parse_or_unimplemented` error boundary | -| `context.rs` | `ParseContext` for stateful parsing | +| `error.rs` | `OracleError` / `OracleResult` type aliases, `oracle_err` error constructor. Parse-failure authority is `Effect::unimplemented(name, fragment)` (`types/ability.rs`) — never hand-construct `Effect::Unimplemented { .. }` literals | +| `context.rs` | Re-export shim for `ParseContext` (canonical home: `oracle_ir/context.rs`) | | `bridge.rs` | `nom_on_lower`, `nom_on_lower_required`, `nom_parse_lower` — mixed-case bridging | +| `enchant.rs` | "Enchant {filter}" attachment-restriction combinators | +| `return_as_aura.rs` | "return ~ ... attached to" / return-as-Aura combinators | -### Two-Phase Parse/Lower Architecture +### Two-Phase Parse/Lower Architecture (clause level) -The parser uses a two-phase approach: **parse → AST → lower → Effect**. +Within the effect branch, the parser uses the same two-phase approach at clause +granularity: **parse → AST → lower → Effect**. ``` parse_effect_clause() — entry point (oracle_effect/mod.rs) + → clause_shell::peel_clause() — strip structural slots first (see below) → parse_clause_ast() — classify sentence shape → ClauseAst → lower_clause_ast() — convert AST to Effect → lower_subject_predicate_ast() — for SubjectPredicate clauses @@ -149,6 +188,45 @@ parse_effect_clause() — entry point (oracle_effect/mod.rs) → lower_imperative_family_ast() — convert to Effect ``` +### Clause Shell — Structural Slot Peeling (`clause_shell.rs`) + +**This is the destination architecture for structural slots.** Phase 2 of the +no-text-swallowing refactor (see `data/parser-swallow-progress.md`) inverts +slot-consumption responsibility: instead of every body parser recognizing AND +consuming surrounding structural slots inline (and silently dropping them when +it forgets — the swallowing bug class), `peel_clause(text)` recursively strips +slot-bearing prefixes/suffixes off the clause, accumulating them into a +`ClauseContext` (synthesized attributes). The bare imperative remainder is +handed to the existing body parsers, and the context is applied back onto the +parsed result (`apply_optional`, `duration()`, `condition()`), so no recognized +slot can be dropped. + +**Slots live in the shell today:** +- **Optional** — `"you may [verb]"` → `ClauseContext.optional` (CR 608.2d), with + a specialized-phrase blocklist (`is_specialized_you_may_phrase`) for "you may" + constructions whose dedicated body parsers need the full surface form + (alt-costs, impulse-draw permission, causative "have", retarget, Dig + keep-from-among, specialized reveals). +- **Duration** — trailing duration suffix → `ClauseContext.duration`, delegating + to `strip_trailing_duration` so the suffix table stays single-source (with an + `is_specialized_duration_carrier` guard for parsers that need the suffix as a + disambiguation signal, e.g. impulse-draw vs `CastFromZone`). +- **Leading condition** — `"if [cond], [effect]"` → `ClauseContext.condition`, + delegating to `strip_leading_general_conditional` (which routes through + `parse_inner_condition`). + +**Call site:** `parse_effect_clause()` in `oracle_effect/mod.rs` (single peel +point; `oracle_trigger.rs` and `oracle_effect/subject.rs` reference the peel +in comments where their behavior depends on it). If the peeled variant lands +in `Unimplemented`, the original text is retried — the shell is conservative. + +**Rule for new work:** when adding handling for a new *structural* slot type +(durations, optional clauses, conditions, and future siblings like APNAP order +or activation limits), migrate it into the shell rather than adding another +per-callsite `strip_*` pass: add a field on `ClauseContext`, a branch in +`peel_inner`, an `apply_*` method, and remove the linear `strip_*` calls at the +body-parser call sites. Do not add new inline slot consumption to body parsers. + ### Parser Dispatch Architecture - **Nom combinators** handle ALL parsing dispatch — atomic, structural, sentence-level verb dispatch, and top-level routing. @@ -163,31 +241,82 @@ parse_effect_clause() — entry point (oracle_effect/mod.rs) ## 3. Parsing Priority System -Lines in `parse_oracle_text()` are classified in this exact order. **First match wins.** - -| Priority | Pattern | Router | Module | -|----------|---------|--------|--------| -| 1 | Keywords-only line (comma-separated) | Keyword extraction | `oracle.rs` | -| 2 | `"Enchant {filter}"` | Skipped (external) | — | -| 3 | `"Equip {cost}"` / `"Equip — {cost}"` | `try_parse_equip()` | `oracle.rs` | -| 4 | `"Choose one/two —"` (modal) | Bullet parsing | `oracle_modal.rs` | -| 5 | Planeswalker loyalty `[+N]/[-N]/[0]:` | `try_parse_loyalty_line()` | `oracle.rs` | -| 6 | Contains `":"` with cost prefix | Activated ability | `oracle_cost.rs` | -| 7 | Starts with `"When"` / `"Whenever"` / `"At"` | `parse_trigger_line()` | `oracle_trigger.rs` | -| 8 | `is_static_pattern()` matches | `parse_static_line()` | `oracle_static.rs` | -| 9 | `is_replacement_pattern()` matches | `parse_replacement_line()` | `oracle_replacement.rs` | -| 10 | Card is Instant/Sorcery + imperative | `parse_effect_chain()` | `oracle_effect/` | -| 11 | Roman numeral (saga chapter) | Skipped | — | -| 12 | Keyword cost line (kicker, etc.) | Skipped (MTGJSON handles) | — | -| 13 | Ability word prefix (`"Landfall —"`) | Strip prefix, re-classify from P7 | `oracle.rs` | -| 14a | `dispatch_line_nom()` — effect candidates | `parse_effect_chain_with_context()` | `oracle.rs` | -| 15 | Fallback | `Effect::Unimplemented` | — | - -### `is_static_pattern()` detects: -`"gets +"`, `"gets -"`, `"get +"`, `"get -"`, `"have "`, `"has "`, `"can't be blocked"`, `"can't attack"`, `"can't block"`, `"enchanted "`, `"equipped "`, `"all creatures "`, `"enters with "`, `"cost {"`, and more. Check the function for the full list. - -### `is_replacement_pattern()` detects: -`"as ~ enters"`, `"enters tapped"`, `"if damage would be dealt"`, `"instead"`. +Lines in `parse_oracle_ir()` (oracle.rs) are classified slot by slot. **First +match wins — the binding order is the SOURCE ORDER of the slots in the line +loop, NOT the numeric labels.** The labels are historical and non-monotonic +(`8b (early)` runs before `0`; loyalty `11` runs before the `3x` keyword-line +slots). When inserting a new slot, place it by evaluation position and grep +`// Priority` in oracle.rs to see the live order; the table below mirrors that +grep in evaluation order (one row per `// Priority