QuickMUD Development Guide for AI Agents

Purpose

rom24-quickmud-python is a ROM 2.4b6 → Python faithful port. The goal is 100% behavioral parity with the original C engine, not improvement. When ROM C behavior is "wrong" or quirky, we replicate it exactly.

The companion browser frontend lives in the sibling project quickmud-web-client (../quickmud-web-client) and is versioned independently.

Source of Truth

  • Engine repo (this project): https://github.com/Nostoi/rom24-quickmud-python
  • Original ROM C source: src/ (read-only reference; do not modify)
  • Frontend repo: ../quickmud-web-client

Before changing engine behavior, read the corresponding ROM C function. Do not guess.

Resuming in Any Harness (Claude Code, Codex, Opencode, …)

A fresh agent can pick up this project regardless of which CLI it runs in. Required reading order at session start:

  1. docs/sessions/SESSION_STATUS.md — single canonical "where we are" pointer.
  2. The latest docs/sessions/SESSION_SUMMARY_*.md it references.
  3. docs/parity/ROM_C_SUBSYSTEM_AUDIT_TRACKER.md — pick the next ⚠️ Partial / ❌ Not Audited row.
  4. The per-file audit doc for the chosen target (e.g. docs/parity/ACT_OBJ_C_AUDIT.md).

Workflow skills live as plain markdown in .claude/skills/ and are readable from any harness:

  • Claude Code agents: invoke them via the Skill tool (e.g. Skill({skill: "rom-parity-audit", args: "scan.c"})). Do not read the SKILL.md and follow it inline — let the harness load it.

  • Codex / Opencode / other harnesses: Read the SKILL.md file and follow the instructions manually. Same workflow, just no Skill tool.

  • .claude/skills/rom-parity-audit/SKILL.md — file-level audit (5 phases).

  • .claude/skills/rom-gap-closer/SKILL.md — single-gap TDD close (one test, one commit).

  • .claude/skills/rom-session-handoff/SKILL.md — end-of-session SUMMARY + STATUS writer.

  • .claude/skills/gitnexus/gitnexus-cli/SKILL.md — wraps npx gitnexus … with canonical flags.

GitNexus MCP must be configured per-harness for gitnexus_impact / gitnexus_detect_changes calls to work. Without it, fall back to grep and run the area-specific integration test suite to catch regressions.


ROM Parity Rules (CRITICAL)

These are non-negotiable. Violations are bugs even if tests pass.

  • RNG: use mud.math.rng_mm.number_*, never random.* in combat/affects.
  • Integer math: use c_div/c_mod from mud.math.c_compat whenever an operand can be negative. ROM is C: integer division truncates toward zero and % takes the sign of the dividend, whereas Python // floors toward −∞ and % takes the sign of the divisor — they diverge only when a negative operand is involved (e.g. alignment, saving_throw, stat modifiers, any signed delta). Bare // / % are acceptable and widely used for provably non-negative operands (levels, percents, dice counts, gold splits), where they are bit-for-bit identical to C. When in doubt, use c_div/c_mod. This is NOT grep-enforceable (a blanket // ban would flag ~30 correct non-negative uses); it requires reading the operand domain — cite the ROM C line on signed-math sites.
  • Flag values: use enums (PlayerFlag.AUTOLOOT, WearFlag.NO_SAC, WearLocation.HOLD, …). Never hardcode hex bit values — ROM C uses bit shifts and the hex you'd guess from the constant name is often wrong.
    • Wrong: PLR_AUTOLOOT = 0x00000800
    • Right: PlayerFlag.AUTOLOOT
  • Equipment lookup: char.equipment is keyed STRICTLY by the integer wear slot — char.equipment[int(WearLocation.HOLD)] (IntEnum keys hash-equal to their int, so char.equipment[WearLocation.HOLD] is the same key). NEVER use a string slot name (char.equipment["hold"], char.equipped["held"]) and never char.carrying. ROM keys equipment by int iWear (src/handler.c:1733 get_eq_char loops obj->wear_loc == iWear); there are no string slot names in ROM. All write paths canonicalize: do_wear stores int(wear_loc), Character.equip_object(obj, slot) runs slot through mud.models.constants.canonical_wear_slot (which accepts an int, a numeric string, or a legacy name like "wield"), and the JSON restore in from_orm coerces the str-keyed reload back to int. Readers MUST look up the int key — a string-literal read like equipment.get("shield") silently misses objects equipped under the canonical int key (the 2.9.87 school-light / combat-shield bug class). Enforced by tests/test_equipment_key_convention.py (grep-guard, same pattern as test_rng_determinism.py).
  • Character inventory: char.inventory, not char.carrying.
  • Room occupants: room.people, not room.characters.
  • Entity identity: compare entity instances (Character, MobInstance, Object, Room) with is/is not, NEVER ==/!=. ROM compares entities by pointer. The live runtime types — Character (PCs), MobInstance (mobs — spawn_mob returns it, NOT a Character subclass), Object (spawn_object), and Room (ROOM_INDEX_DATA *) — are all now @dataclass(eq=False) (INV-034, 2.12.78–80), so ==/is agree — __eq__/__hash__ are identity, exactly ROM pointer semantics. Always use is regardless — it documents pointer intent, is grep-auditable, and is immune to a future dataclass-decorator regression.
    • Wrong: if victim == ch: / if obj in ch.inventory: to mean this exact object / room.people.remove(target)
    • Right: if victim is ch: and identity-keyed membership checks

    This is structural divergence class 6, now fully closed. The root cause (value-equality) is fixed for every entity runtime type (Character/MobInstance/Object/Room, plus legacy ObjectInstance) and tracked ✅ ENFORCED as INV-034 in docs/parity/CROSS_FILE_INVARIANTS_TRACKER.md (all five eq=False). Note that eq=False does not propagate to sibling/subclass entity dataclasses: each entity runtime type needs its own eq=False (the MobInstance/ObjectInstance miss, caught post-commit by review). This rule is not grep-enforceable, because ==/!= can't be type-discriminated lexically; cite the ROM C pointer compare on identity sites, as INV-031(c) did for is_same_group.
  • Comments: reference ROM C source on parity-sensitive code, e.g. # mirroring ROM src/fight.c:one_hit.
  • No deferring. When an audit finds a missing/partial ROM C function, implement it; do not mark it "P2 — optional". ROM parity gaps are always P0.
  • Integration tests are mandatory for new parity work — they must verify ROM behavior, not just code coverage.
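
The integer-math rule above rests on two small differences between C and Python. The sketch below re-implements truncating division and remainder to show exactly where the two languages diverge. The function bodies are illustrative assumptions, not the project's actual mud.math.c_compat source; read that module for the real helpers.

```python
def c_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero, as C does."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def c_mod(a: int, b: int) -> int:
    """Remainder whose sign follows the dividend, as C's % does."""
    return a - b * c_div(a, b)


# Non-negative operands: Python and C agree, so bare // and % are safe.
assert 7 // 2 == c_div(7, 2) == 3
assert 7 % 2 == c_mod(7, 2) == 1

# Negative dividend (e.g. an alignment delta): the results diverge.
assert -7 // 2 == -4 and c_div(-7, 2) == -3
assert -7 % 2 == 1 and c_mod(-7, 2) == -1
```

The last two assertions are the whole reason the rule exists: a signed delta run through bare // lands one step further from zero than ROM would compute.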

Mandatory reading before any audit or integration-test work: docs/ROM_PARITY_VERIFICATION_GUIDE.md.
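
The entity-identity rule in the list above can be illustrated with plain dataclasses. The sketch uses hypothetical stand-in classes (ValueMob, IdentityMob, IdentitySubMob), not the project's Character or MobInstance types. It shows why value equality removes the wrong occupant from a list, and how a subclass that omits eq=False silently regains value equality.

```python
from dataclasses import dataclass


@dataclass
class ValueMob:
    """Default dataclass: the generated __eq__ compares field values."""
    name: str


@dataclass(eq=False)
class IdentityMob:
    """eq=False keeps object.__eq__/__hash__, i.e. pointer-style identity."""
    name: str


@dataclass  # omitting eq=False on a subclass regenerates a value-based __eq__
class IdentitySubMob(IdentityMob):
    pass


# Two distinct rats with identical fields.
a, b = ValueMob("rat"), ValueMob("rat")
people = [a, b]
people.remove(b)  # list.remove matches on ==, so it drops the FIRST equal entry
removed_wrong_rat = people[0] is b  # `a` was removed although `b` was requested

x, y = IdentityMob("rat"), IdentityMob("rat")
identity_people = [x, y]
identity_people.remove(y)  # == is identity here, so `y` itself is removed
kept_right_rat = identity_people[0] is x

s, t = IdentitySubMob("rat"), IdentitySubMob("rat")
subclass_regressed = s == t  # value equality came back on the subclass
```

Writing `is` at the call site sidesteps all three cases, which is why the rule asks for it even where eq=False already makes == behave.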

Message Delivery (Architectural Divergence)

ROM C delivers messages directly to the socket descriptor via write_to_buffer() during the game tick. Python cannot do synchronous socket writes inside the synchronous game_tick(). Instead, we use asyncio.create_task(send_to_char(...)) for fire-and-forget delivery, matching ROM C's real-time prompt behavior.

Messages generated during combat ticks must reach connected players immediately. The char.messages list is a test fallback only — it must NOT be the primary delivery mechanism for combat output. Use _push_message() from engine.py or broadcast_room() from protocol.py.

Full rationale: docs/divergences/MESSAGE_DELIVERY.md.
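
A minimal sketch of the fire-and-forget pattern described above, assuming the tick is invoked from inside a running event loop. The send_to_char body, the delivered list, and the _pending set are hypothetical stand-ins for illustration; the project's real helpers are _push_message() and broadcast_room().

```python
import asyncio

delivered: list[tuple[str, str]] = []
_pending: set[asyncio.Task] = set()


async def send_to_char(name: str, text: str) -> None:
    """Hypothetical stand-in for the real async socket write."""
    delivered.append((name, text))


def game_tick() -> None:
    """Synchronous tick: it cannot await, so it schedules delivery instead."""
    task = asyncio.create_task(send_to_char("Alice", "You hit the rat.\n"))
    # Keep a strong reference until the task finishes; the event loop itself
    # only holds weak references to tasks.
    _pending.add(task)
    task.add_done_callback(_pending.discard)


async def main() -> None:
    game_tick()             # returns immediately; nothing is delivered yet
    assert delivered == []
    await asyncio.sleep(0)  # yield to the loop so the scheduled task runs
    assert delivered == [("Alice", "You hit the rat.\n")]


asyncio.run(main())
```

The tick never blocks on the write, and the message goes out as soon as control returns to the event loop, rather than waiting in a per-character list for a later flush.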

Cross-File Invariants

The per-file audit (/rom-parity-audit) checks one ROM C file against its Python equivalent function-by-function. That methodology is necessary but not sufficient — it misses contracts that span modules: single-delivery of messages, character_registry membership, prompt-render-after-raw_kill ordering, and so on. Three production bugs shipped this year against files marked ≥95% audited because the broken contract lived in code outside the audited file.

ALWAYS re-verify a written ✅ / status / "verified-as-of-vX" claim (in an audit doc, tracker row, or load-bearing warning) against the ROM C source (or an empirical run on the installed tool version) before relying on or relaying it. A ✅ records when someone last checked, not that it is still true. Anti-pattern: relaying "the audit says X is ROM-correct" or "the warning says tool Y is broken" without re-checking source/version. Three 2026-05-31 bugs and one stale tool warning trace back to unverified doc claims: CAST-008 (hidden behind a do_cast "immediately deducted mana" audit note), NANNY-015 and TRAIN-002 (false-✅ audit rows asserting the buggy behavior was ROM-correct), and the GitNexus 32 KB warning (stale since 1.6.5). All four were caught by reading ROM C or by an empirical run, not by the docs.

These contracts are tracked in docs/parity/CROSS_FILE_INVARIANTS_TRACKER.md with stable IDs (INV-NNN) and one enforcement test each.

When you audit, close, or extend a file:

  1. Skim the cross-file tracker. If any INV touches the surface you're working on, run its enforcement test before claiming the work done.
  2. If the gap you fix touches code in a different module than the audit's "primary" Python file, add a line to the relevant INV's "Touched by" trail in the tracker so the call chain stays visible.
  3. If a NEW invariant surfaces (root cause of a bug crosses files), add the next free INV-NNN with: name, ROM mechanism, Python enforcement point, regression test, status. Keep the list small — if it grows past ~20 entries, the per-file methodology itself needs revisiting.

When the cross-file invariants pass is the active mode: when docs/parity/ROM_C_SUBSYSTEM_AUDIT_TRACKER.md has no ⚠️ Partial / ❌ Not Audited rows (current state: all P0/P1/P2 at 100%, P3 at 75% + 3 N/A), the per-file audit default is exhausted and cross-file invariants becomes the primary pass. Method: pick a candidate area not yet covered by an INV row (affect ticks, position transitions, mob script triggers, group/follower chain are current candidates), run a 5-minute probe (read ROM C contract → read Python equivalent → write one failing test for the contract), then either close as a gap (single commit) or file as the next free INV-NNN. The probe-then-scope pattern is faster than full audit phases when you're hunting for divergences rather than cataloguing them; the 2.9.3–2.9.6 session (INV-014 + INV-013 carrier-field sweep + decay-loop coverage) is the worked example.

The point is not to re-audit every file under the new lens; it's to enumerate the contracts once, lock each one with a test, and reference those tests from the per-file rows. The /rom-parity-audit skill's Phase 2 calls this out explicitly.

Completeness lens & verification epistemics

The per-file audit and the cross-INV process answer "is this function/contract correct?" They do not answer "how close to done are we, and how would we know?" That question has a tractable reformulation: ROM↔Python parity risk only exists where the two engines diverge structurally (sync vs async, pointer vs GC identity, array vs dict, C-int vs Python-int, static-buffer reuse, fread parsing). That set of divergence classes is small (~11) and enumerable, and each class has a single correct verification layer. The roster is docs/parity/DIVERGENCE_CLASS_ROSTER.md; run a sweep via /rom-divergence-sweep.

This lens contains the existing processes, it does not replace them: the grep-guards (test_rng_determinism.py, test_equipment_key_convention.py, test_attribute_convention.py) are its Layer A (static bypass-guards); the cross-INV process is its Layer C (ordering/lifecycle, no chokepoint); signed-math is Layer B (domain-conditional, human-read). Keep using cross-INV, /rom-gap-closer, and /rom-parity-audit — the roster just routes work into them.

Four guardrails — learned the hard way, durable on purpose:

  1. Durable surfaces hold method; trackers hold status. Never write a number, a ✅, or "class X is clean" into AGENTS.md / CLAUDE.md / a SKILL.md. Status goes in the roster / INV trackers (which are expected to rot and be re-verified on read).
  2. A committed guard beats a doc-✅. A rglob → forbid-pattern → assert test is self-maintaining; a tracker note rots. Prefer turning a verified contract into a test over writing "verified."
  3. The roster is enumeration-dependent — blind to classes nobody named. The only enumeration-independent check is differential execution (tools/diff_harness/). Therefore "close on the known surface" ≠ "close to ROM parity" — never blur the two.
  4. Re-verify any ✅/status claim against ROM C source (or an empirical run on the installed tool) before relying on or relaying it — reinforcing the rule above; a recall oracle (re-deriving a known-open item) is how you prove a sweep has recall before trusting its silence.
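
Guardrail 2's "rglob → forbid-pattern → assert" shape can be sketched as follows. This is an illustrative guard, not the contents of tests/test_equipment_key_convention.py: the regex, the find_violations helper, and the throwaway demo tree are assumptions made for the example.

```python
import re
import tempfile
from pathlib import Path

# Forbidden: a string-literal key on `equipment`,
# e.g. equipment["hold"] or equipment.get("shield").
FORBIDDEN = re.compile(r"""equipment(?:\[|\.get\()\s*["']""")


def find_violations(root: Path) -> list[str]:
    """Return 'file:line' for every source line matching the forbidden pattern."""
    hits: list[str] = []
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if FORBIDDEN.search(line):
                hits.append(f"{path.name}:{lineno}")
    return hits


# Demo against a throwaway tree. A real guard would scan the package
# directory and end with `assert not find_violations(...)`.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "good.py").write_text(
        "held = ch.equipment[int(WearLocation.HOLD)]\n", encoding="utf-8"
    )
    (root / "bad.py").write_text(
        'x = 1\nshield = ch.equipment.get("shield")\n', encoding="utf-8"
    )
    violations = find_violations(root)
```

Because the guard re-reads the source tree on every run, it cannot go stale the way a tracker note can; the cost is that it only catches patterns that are lexically recognizable.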

Build / Lint / Test

# All tests — runs in parallel by default (-n auto --dist loadscope via
# pyproject addopts); ~94s on a 10-core machine (~517s serial).
pytest

# Single-test debugging — disable parallelism for readable output / pdb:
pytest -n0 tests/test_foo.py::test_bar

# Integration tests
pytest tests/integration/ -v

# Coverage gate (CI requires ≥80%)
pytest --cov=mud --cov-report=term --cov-fail-under=80

# Lint / format
ruff check .
ruff format --check .

# Type check (strict on selected modules)
mypy mud/net/ansi.py mud/security/hash_utils.py --follow-imports=skip

# Comprehensive command registration check
python3 test_all_commands.py

# Differential parity harness (ROM C ⇄ Python) — pure-Python replay, no C build
pytest tests/test_differential_smoke.py tests/test_diff_harness_unit.py

Three test layers — unit (tests/test_*.py), integration (tests/integration/), and command-registry (test_all_commands.py). Run all three when adding commands.

Differential testing harness (tools/diff_harness/)

A fourth, complementary verification layer: it runs the original ROM 2.4b6 C engine and the Python port through identical scripted scenarios and diffs observable state + output, so parity divergences surface mechanically instead of relying on hand-written assertions. Full docs: tools/diff_harness/README.md; open divergences: tools/diff_harness/FINDINGS.md.

When to use it:

  • You changed an engine surface covered by a scenario (movement/get-drop, affect_armor, combat_melee_rounds, spell_combat) — run the replay to confirm you didn't drift from ROM.
  • You're hunting a subtle parity divergence and want C ground truth rather than a Python-authored expectation. Author a new scenario (see below) to capture it.
  • Before claiming a parity gap closed on a surface the harness can exercise.

How to use it:

  • Everyday replay (no C build): pytest tests/test_differential_smoke.py tests/test_diff_harness_unit.py. Replays pure-Python against the committed goldens in tests/data/golden/diff/. A missing golden skips; a known, not-yet-resolved divergence xfails (KNOWN_DIVERGENCES in the replay test).
  • Author a scenario: drop tools/diff_harness/scenarios/<name>.json (name, seed, start_room, char, watch set, steps). A snapshot over the watch-set is auto-inserted after every step. v1 is a deterministic no-RNG slice (look/movement/get-drop/inventory/wear-remove).
  • Regenerate/verify goldens (needs the instrumented C binary): cd src && make -f Makefile.diffshim diffshim (one-time, additive — ROM src/*.c stay byte-for-byte unchanged), then python3 -m tools.diff_harness.capture --all (regenerate) or --check (CI-style diff vs committed). Goldens are stamped with the repo HEAD sha + build flags + seed; the trace is C-engine output (immutable unless ROM C or a scenario changes), so a regen typically only refreshes the provenance stamp.

A divergence is a finding, not a golden to overwrite. ROM is the source of truth: when replay fails on a real divergence, triage it, record it in FINDINGS.md, file a parity gap (per the routing table below), and fix Python or the data — never edit the golden to make the test pass.

Test fixtures (from conftest.py)

movable_char_factory(name, room_vnum, points=100)
movable_mob_factory(vnum, room_vnum, points=100)
object_factory(proto_kwargs)
place_object_factory(room_vnum, vnum=..., proto_kwargs=...)
portal_factory(room_vnum, to_vnum, closed=False)
ensure_can_move(entity, points=100)

Note: Object.__post_init__ does not auto-sync value from the prototype. Test fixtures must do obj.value = list(proto.value) after construction.
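
The reason for list(proto.value) rather than a bare assignment is aliasing: the instance must get its own copy. A small sketch with hypothetical stand-in classes (ProtoSketch and ObjectSketch are not the project's real models, and the five-slot default is an assumption):

```python
from dataclasses import dataclass, field


@dataclass
class ProtoSketch:
    """Hypothetical stand-in for an object prototype."""
    value: list[int] = field(default_factory=lambda: [0, 0, 0, 0, 0])


@dataclass(eq=False)
class ObjectSketch:
    """Hypothetical stand-in for a spawned object; value is NOT auto-synced."""
    prototype: ProtoSketch
    value: list[int] = field(default_factory=lambda: [0, 0, 0, 0, 0])


proto = ProtoSketch(value=[0, 0, 0, 100, 0])
obj = ObjectSketch(prototype=proto)
unsynced = obj.value           # still the default, not the prototype's values

obj.value = list(proto.value)  # the fixture step: copy, do not alias
obj.value[3] = 50              # mutating the instance leaves the prototype intact
```

Assigning obj.value = proto.value instead would make the last line rewrite the prototype, and every later object built from it would inherit the change.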

Test determinism (RNG)

The Mitchell-Moore RNG (mud.utils.rng_mm) is global mutable state, so RNG-dependent tests are flaky if state leaks across test boundaries.

  • tests/integration/conftest.py has an autouse fixture that calls rng_mm.seed_mm(12345) before every integration test. Do not remove it. Without it, tests that depend on probabilistic outcomes (scavenger acts on a 1/64 roll, AoE saves, holy_word damage rolls, combat hit/miss) flake on ordering. This was added in v2.6.2 — see CHANGELOG.
  • If your test needs a specific RNG sequence, call rng_mm.seed_mm(<seed>) inside the test (after fixture setup) — that overrides the autouse default.
  • Never use random.* in production code or in tests that are checking ROM parity. Use rng_mm.number_* so the seed actually controls behavior.
  • Don't write a new test that "just runs more iterations until something happens" without seeding. That's a flake waiting to surface in CI.

A test asserting a behavior that contradicts ROM C is a bug in the test, not in the implementation. ROM is the source of truth. When a test fails: read the corresponding ROM C function before assuming the Python code is wrong. (Example: test_giant_strength_refuses_to_stack was originally test_stat_modifiers_stack_from_same_spell — the test asserted stacking, but ROM magic.c:3022-3030 explicitly anti-stacks.)
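
The seeding rules in this section follow from the generator being global state. The sketch below uses a toy linear congruential generator as a stand-in (it is not ROM's Mitchell-Moore algorithm, and seed, number_range, and scavenger_test are hypothetical names) to show that re-seeding before each test makes the outcome independent of whatever ran earlier.

```python
_state = 1  # module-level state, analogous to a global generator


def seed(value: int) -> None:
    """Reset the stand-in generator."""
    global _state
    _state = value


def number_range(low: int, high: int) -> int:
    """Toy LCG draw in [low, high]; NOT ROM's Mitchell-Moore algorithm."""
    global _state
    _state = (_state * 1103515245 + 12345) % 2**31
    return low + _state % (high - low + 1)


def scavenger_test() -> int:
    """Stand-in for a test whose outcome hinges on one roll."""
    return number_range(0, 63)


# Baseline: the test run alone from a known seed.
seed(12345)
alone = scavenger_test()

# Seeded per test (what an autouse fixture does): an earlier test may burn
# any number of draws, yet the result is unchanged.
seed(12345)
first_run = scavenger_test()
number_range(1, 100)  # an unrelated test consumes a draw
seed(12345)           # the fixture re-seeds before the next test
second_run = scavenger_test()
```

Without the second seed call, second_run would depend on how many draws the unrelated test consumed, which is the ordering flake the autouse fixture exists to prevent.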

Parallel test execution & isolation

The suite runs in parallel by default — pyproject.toml sets addopts = "-n auto --dist loadscope" (pytest-xdist). ~94s vs ~517s serial.

pytest                              # all tests, parallel (default)
pytest -n0                          # force serial (debugging, pdb, readable output)
pytest -n0 tests/test_foo.py::bar   # one test, serial
pytest tests/integration/ -v        # a subdir, still parallel

--dist loadscope keeps every test in a module/class on one worker, so module-scoped fixtures and intra-file ordering behave exactly as in a serial run. What it does not protect against is cross-file shared state. xdist workers are separate processes: across workers, tests share only the filesystem and network ports; within a worker, global singletons persist across the multiple files assigned to it. When you add or touch a test, keep it parallel-safe:

  • Never depend on global state another test file sets up. A test must pass when run alone (pytest -n0 path::test). If it only passes in the full suite, it has a hidden cross-file dependency (e.g. time_info.sunlight left at daytime, a populated area_registry, a seeded DB schema). Set up what you need locally instead.
  • Reset global mutable singletons you mutate: area_registry, character_registry, mud.registry.descriptor_list, time_info, object_registry. Snapshot/clear/restore in an autouse fixture (see tests/conftest.py _reset_object_registry / _reset_descriptor_list for the pattern). A test that leaves a registry dirty breaks whatever lands next on the same worker.
  • Per-worker resources for anything on disk or a port. The shared SQLite engine (mud/db/session.py, fixed sqlite:///mud.db) is isolated by tests/conftest.py setting a per-worker DATABASE_URL from PYTEST_XDIST_WORKER. If you add a test that binds a port or writes a fixed file, namespace it by worker_id / PYTEST_XDIST_WORKER (or use tmp_path).
  • Diagnosing an xdist-only failure: run the file alone (-n0). Passes alone but fails parallel → your test leaks OR a sibling leaks into it (reset the singleton). Fails alone too → it depends on cross-file setup (make it self-contained). Worker grouping shifts between runs, so a different latent leak can surface each run — fix the root cause, don't just re-run.
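
The snapshot/clear/restore and per-worker-resource bullets above can be sketched without pytest. Everything here is a hypothetical illustration: the registry is a plain list, preserved() plays the role an autouse fixture would, and the per-worker file naming is an assumption rather than what tests/conftest.py actually does. PYTEST_XDIST_WORKER itself is the environment variable pytest-xdist sets in each worker process.

```python
import contextlib
from collections.abc import Iterator, Mapping

# Hypothetical stand-in for a global singleton such as character_registry.
character_registry: list[str] = []


@contextlib.contextmanager
def preserved(registry: list) -> Iterator[None]:
    """Snapshot a global list, clear it for the test, restore it afterwards."""
    snapshot = list(registry)
    registry.clear()
    try:
        yield
    finally:
        registry[:] = snapshot  # restore in place so other references stay valid


def per_worker_db_url(environ: Mapping[str, str]) -> str:
    """Derive a SQLite URL unique to the xdist worker (file naming is assumed)."""
    worker = environ.get("PYTEST_XDIST_WORKER")  # e.g. "gw0"; unset when serial
    return f"sqlite:///mud_{worker}.db" if worker else "sqlite:///mud.db"


character_registry.append("pre-existing")
with preserved(character_registry):
    character_registry.append("test-only mob")  # the test mutates freely
    inside = list(character_registry)
after = list(character_registry)
```

Restoring with slice assignment rather than rebinding matters when other modules hold a reference to the same list object.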

Code Style

  • from __future__ import annotations first; stdlib / third-party / local.
  • Strict type annotations; TYPE_CHECKING guard for circular imports.
  • snake_case functions/vars, PascalCase classes, UPPER_CASE constants, _prefix private.
  • Public functions have docstrings; parity-sensitive code cites ROM C source.
  • Line length 120, double quotes, 4-space indent (ruff/black).

Trackers (single source of truth — do not duplicate status into AGENTS.md)

| File | Purpose | When to update |
|---|---|---|
| docs/parity/ROM_C_SUBSYSTEM_AUDIT_TRACKER.md | Per-file ROM C audit status (43 files) | Any audit work |
| docs/parity/CROSS_FILE_INVARIANTS_TRACKER.md | Cross-module contracts (INV-NNN: SINGLE-DELIVERY, REGISTRY-MEMBERSHIP, PROMPT-CLAMP, …); what per-file audits miss | Whenever a bug's root cause crosses module boundaries, or when an audit touches a file referenced by an INV |
| docs/parity/INTEGRATION_TEST_COVERAGE_TRACKER.md | Coverage of 21 gameplay systems | Adding integration tests |
| docs/parity/ROM_PARITY_FEATURE_TRACKER.md | Feature-level parity backlog | Implementing parity features |
| docs/parity/ACT_OBJ_C_AUDIT.md, ACT_INFO_C_AUDIT.md, etc. | Per-file gap tables (GET-001, PUT-002, …) | Closing specific gaps |
| PROJECT_COMPLETION_STATUS.md | Subsystem confidence scores | After major subsystem work |
| CHANGELOG.md | User-visible / dev-visible change log | Every push (see Repo Hygiene) |

TODO.md and ARCHITECTURAL_TASKS.md are historical and complete — do not update them.

For the audit methodology itself (5 phases, ROM-C → audit doc → implementation → integration tests → completion), see docs/ROM_PARITY_VERIFICATION_GUIDE.md.

Out-of-scope bugs surfaced mid-audit (file durably, do not just mention)

Bugs discovered while reading code for an audit or gap-closer — but that are NOT the gap you're closing — must be filed durably before moving on. Chat-only mentions evaporate at session end and have caused real regressions to ship.

Routing by bug shape:

| Bug shape | File where |
|---|---|
| Cross-file / cross-module contract violation (root cause spans files) | New INV-NNN row in docs/parity/CROSS_FILE_INVARIANTS_TRACKER.md. Use the next free ID; check the table for the current max (IDs are non-contiguous because of retirements). |
| Local divergence in a single function (typo, wrong constant, missing branch) | Per-file docs/parity/<FILE>_C_AUDIT.md with a stable ID matching the file's pattern (e.g. MLOAD-001, OLOAD-002). |
| Blocks a different audit's gap closure | Annotate the blocked row's status to ⚠️ BLOCKED and reference the new bug ID under "Notes". Add a short "Blocked rows" subsection at the bottom of the audit doc if the file doesn't already have one. |
| Generic feature gap / missing helper | docs/parity/ROM_PARITY_FEATURE_TRACKER.md. |

Then surface it in this session's SESSION_SUMMARY_*.md "Outstanding" section so the next agent sees the new ID without having to re-discover it.

A verbal mention to the user is not filing. The chat transcript is not durable workflow state — the trackers are.


Session Notes

Per-session work logs live under docs/sessions/. AGENTS.md itself does not carry a running session narrative — keep it stable.

Conventions

  • Filename: SESSION_SUMMARY_YYYY-MM-DD_<short-topic>.md. Multi-session days get distinct topics (..._ACT_MOVE_COMPLETE.md, ..._ACT_OBJ_C_AUDIT_PHASES_1-4.md).
  • Handoff documents (when a session ends mid-stream and another agent picks up): HANDOFF_YYYY-MM-DD_<topic>.md, also under docs/sessions/.
  • Single canonical "current" pointer: docs/sessions/SESSION_STATUS.md always reflects the latest state. Newer sessions overwrite it; the old contents are preserved in the dated SESSION_SUMMARY_*.md for that session.

Workflow

  1. Start of session: read docs/sessions/SESSION_STATUS.md and the most recent SESSION_SUMMARY_*.md for context. Skim the relevant tracker docs for the active audit.
  2. During session: keep work-in-progress notes wherever you like, but do not commit transient scratch files to the repo root.
  3. End of session: create docs/sessions/SESSION_SUMMARY_<date>_<topic>.md summarizing what landed (gaps closed, files touched, tests added). Update docs/sessions/SESSION_STATUS.md to point at the new summary and state the next intended task.
  4. Do not add session logs to AGENTS.md, README.md, or CHANGELOG.md. Those are stable surfaces; sessions are an append-only log.

If you find new SESSION_SUMMARY_*.md or HANDOFF_*.md files at the repo root, move them into docs/sessions/ as part of repo hygiene.


Repo Hygiene

Before pushing changes:

  1. Update CHANGELOG.md

    • Add an entry under ## [Unreleased] (or the next version section).
    • Format: Keep a Changelog (Added / Changed / Fixed / Removed).
    • Summarize user-visible behavior, dev-workflow changes, and important fixes — not internal refactors.
  2. Update README.md when needed

    • Setup, usage, architecture, command-parity claims, badges, or test counts that changed.
    • Keep ROM-parity status badges honest. Do not claim percentages the trackers don't support.
    • Whenever you touch README's "Project Status" / badges / metrics, also refresh AGENTS.md's tracker pointers and docs/sessions/SESSION_STATUS.md in the same commit so the three surfaces (README, AGENTS, SESSION_STATUS) never disagree. Source of truth for the underlying numbers stays in docs/parity/* trackers — do not invent figures the trackers don't back.
  3. Update the version

    • Single source: pyproject.toml version = "X.Y.Z".
    • Bump on any branch push, PR update, or master push intended to be shared remotely. Use semver:
      • patch — fixes, docs, parity gap closures with no API surface change
      • minor — new commands, new gameplay features, new test infrastructure
      • major — breaking protocol/save-format/data-model changes
    • Keep the version bump in the same commit/PR as the related code and changelog entries.
  4. PR and master hygiene

    • Before opening or updating a PR: bump pyproject.toml version, update CHANGELOG.md, update README.md if setup/workflow/parity status changed.
    • Before merging or pushing to master:
      • confirm pyproject.toml version reflects the full state being merged
      • confirm CHANGELOG.md describes the shipped behavior
      • confirm README.md still matches current setup and parity status
      • confirm docs/sessions/SESSION_STATUS.md is current
    • Treat master as the publish-ready branch. Do not push stale versions or stale changelog entries to master.
  5. Verify

    • pytest passes (or the only failures are the documented pre-existing ones).
    • ruff check . clean.
    • For integration-test or audit work, update the relevant tracker doc.
    • Run gitnexus_detect_changes() before committing (see GitNexus section).

Specialized Agent Files

| File | Use when |
|---|---|
| AGENT.md | Architectural integration analysis (subsystem confidence < 0.92) |
| AGENT.EXECUTOR.md | Executing tasks already defined by AGENT.md |
| FUNCTION_COMPLETION_AGENT.md | Implementing the remaining ~57 unmapped helper functions (optional) |

Autonomous Mode

When the user explicitly says "auto mode" or "complete all tasks":

  1. Build a todo list from the relevant tracker docs.
  2. Execute sequentially without per-step approval; fix errors immediately.
  3. Run pytest after each major task; do not proceed past a regression.
  4. Update tracker docs and docs/sessions/SESSION_STATUS.md as you go.
  5. Stop on time limit, scope completion, or unrecoverable error after 3 fix attempts.

Auto mode is not a license for destructive operations (force pushes, history rewrites, dropping data). Those still need explicit confirmation.


GitNexus — Code Intelligence

This project is indexed by GitNexus as rom24-quickmud-python (33 200 symbols, 54 952 relationships, 300 execution flows). If any GitNexus tool warns the index is stale, run npx gitnexus analyze first.

Use the gitnexus-cli skill for any GitNexus CLI invocation (analyze, status, clean, wiki, list). Do not call npx gitnexus … raw — invoke Skill({skill: "gitnexus-cli", args: "<intent>"}) so the canonical flags (e.g. --skip-agents-md to preserve the auto-managed CLAUDE.md/AGENTS.md block) are applied.

Always Do

  • Run impact analysis before editing any symbol. Before modifying a function, class, or method, run gitnexus_impact({target: "symbolName", direction: "upstream"}) and report blast radius (callers, affected processes, risk) to the user.
  • Run gitnexus_detect_changes() before committing to verify your changes only affect expected symbols and execution flows.
  • Warn the user on HIGH/CRITICAL risk before proceeding.
  • For unfamiliar code, prefer gitnexus_query({query: "concept"}) over grep.
  • For full context on a symbol (callers, callees, flows), use gitnexus_context({name: "symbolName"}).

Never Do

  • Edit a function/class/method without first running gitnexus_impact.
  • Ignore HIGH or CRITICAL risk warnings.
  • Rename symbols with find-and-replace — use gitnexus_rename.
  • Commit without running gitnexus_detect_changes().

Resources

| Resource | Use for |
|---|---|
| gitnexus://repo/rom24-quickmud-python/context | Codebase overview, index freshness |
| gitnexus://repo/rom24-quickmud-python/clusters | Functional areas |
| gitnexus://repo/rom24-quickmud-python/processes | Execution flows |
| gitnexus://repo/rom24-quickmud-python/process/{name} | Step-by-step trace |

Skill files

| Task | Skill file |
|---|---|
| "How does X work?" | .claude/skills/gitnexus/gitnexus-exploring/SKILL.md |
| Blast radius | .claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md |
| Trace bugs | .claude/skills/gitnexus/gitnexus-debugging/SKILL.md |
| Refactor / rename | .claude/skills/gitnexus/gitnexus-refactoring/SKILL.md |
| Tools / schema | .claude/skills/gitnexus/gitnexus-guide/SKILL.md |
| CLI commands | .claude/skills/gitnexus/gitnexus-cli/SKILL.md |

GitNexus — Code Intelligence

This project is indexed by GitNexus as rom24-quickmud-python (39196 symbols, 65340 relationships, 300 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.

If any GitNexus tool warns the index is stale, run npx gitnexus analyze in terminal first.

Always Do

  • MUST run impact analysis before editing any symbol. Before modifying a function, class, or method, run gitnexus_impact({target: "symbolName", direction: "upstream"}) and report the blast radius (direct callers, affected processes, risk level) to the user.
  • MUST run gitnexus_detect_changes() before committing to verify your changes only affect expected symbols and execution flows.
  • MUST warn the user if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
  • When exploring unfamiliar code, use gitnexus_query({query: "concept"}) to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
  • When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use gitnexus_context({name: "symbolName"}).

Never Do

  • NEVER edit a function, class, or method without first running gitnexus_impact on it.
  • NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
  • NEVER rename symbols with find-and-replace — use gitnexus_rename which understands the call graph.
  • NEVER commit changes without running gitnexus_detect_changes() to check affected scope.

Resources

| Resource | Use for |
|---|---|
| gitnexus://repo/rom24-quickmud-python/context | Codebase overview, check index freshness |
| gitnexus://repo/rom24-quickmud-python/clusters | All functional areas |
| gitnexus://repo/rom24-quickmud-python/processes | All execution flows |
| gitnexus://repo/rom24-quickmud-python/process/{name} | Step-by-step execution trace |

CLI

| Task | Read this skill file |
|---|---|
| Understand architecture / "How does X work?" | .claude/skills/gitnexus/gitnexus-exploring/SKILL.md |
| Blast radius / "What breaks if I change X?" | .claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md |
| Trace bugs / "Why is X failing?" | .claude/skills/gitnexus/gitnexus-debugging/SKILL.md |
| Rename / extract / split / refactor | .claude/skills/gitnexus/gitnexus-refactoring/SKILL.md |
| Tools, resources, schema reference | .claude/skills/gitnexus/gitnexus-guide/SKILL.md |
| Index, status, clean, wiki CLI commands | .claude/skills/gitnexus/gitnexus-cli/SKILL.md |