From 806bbb80f89d9b6a058dd9c651d784e2df77049e Mon Sep 17 00:00:00 2001
From: John Morrissey <544926+tachyon-beep@users.noreply.github.com>
Date: Sat, 6 Jun 2026 12:16:56 +1000
Subject: [PATCH 01/72] chore: untrack generated agent instruction files (CLAUDE.md, AGENTS.md)

These are filigree-generated agent instruction files (carrying the ``
marker) regenerated by the session-start hook on every run, so tracking
them produces churn-only diffs as the filigree version bumps. They are
already listed in .gitignore; untrack them so the ignore actually takes
effect (git keeps tracking files added before they were ignored). They
remain on disk for local use and continue to be regenerated. Mirrors
PR #5's untracking of local integration config.

Co-Authored-By: Claude Opus 4.8
---
 AGENTS.md | 119 ------------------------------------------------------
 CLAUDE.md | 119 ------------------------------------------------------
 2 files changed, 238 deletions(-)
 delete mode 100644 AGENTS.md
 delete mode 100644 CLAUDE.md

diff --git a/AGENTS.md b/AGENTS.md
deleted file mode 100644
index d2ea656..0000000
--- a/AGENTS.md
+++ /dev/null
@@ -1,119 +0,0 @@
-
-## Filigree Issue Tracker
-
-`filigree` tracks tasks for this project. Data lives in `.filigree/`. Prefer
-the MCP tools (`mcp__filigree__*`) when available; fall back to the `filigree`
-CLI otherwise.
-
-### Workflow
-
-```bash
-# At session start
-filigree session-context # ready / in-progress / critical path
-
-# Pick up the next startable issue (atomic claim + transition into its working status)
-filigree start-next-work --assignee
-# ...or claim a specific issue
-filigree start-work --assignee
-
-# Do the work, commit, then
-filigree close
-```
-
-Use the atomic claim+transition verbs — `work_start` / `work_start_next`
-(MCP) or `start-work` / `start-next-work` (CLI). Do **not** chain
-`work_claim` (MCP) or `filigree claim` (CLI) with a subsequent status
-update — the two-step form races against other agents; the combined verb is
-atomic.
-
-**Ready ≠ startable.** The working status is type-specific (tasks →
-`in_progress`, features → `building`). Bugs start at `triage`, which has no
-single-hop transition into work (`triage → confirmed → fixing`), so a triage
-bug is *ready* but not directly *startable*: `work_start` on one returns
-`INVALID_TRANSITION` naming the next status, and `work_start_next` skips it.
-`work_ready` items carry a `startable` flag (plus a `next_action` hint when
-false). Pass `advance=true` (MCP) / `--advance` (CLI) to walk the soft
-transitions to the nearest working status automatically.
-
-### Observations: when (and when not) to use them
-
-`observation_create` is a fire-and-forget scratchpad for *incidental* defects — things
-you notice *outside the scope of your current task* (a code smell in a
-neighbouring file, a stale TODO, a missing test for an edge case you happened
-to spot). Notes expire after 14 days unless promoted. Include `file_path` and
-`line` when relevant. At session end, skim `observation_list` and either
-`observation_dismiss` or `observation_promote` for what has accumulated.
-
-**You fix bugs in your currently defined scope. You do NOT use observations
-to finish work prematurely.** If a defect, gap, or follow-up belongs to your
-current task, you own it — handle it as part of that task: fix it now, expand
-the task's scope, file a proper issue with a dependency, or surface it to the
-user. Filing it as an observation and closing the task is *not* completing
-the task; it is shipping known-broken work and hiding the debt in a 14-day
-expiring scratchpad. The test is "would I have noticed this even if I weren't
-working on this task?" If no, it's task scope, not an observation.
-
-### Priority scale
-
-- P0: Critical (drop everything)
-- P1: High (do next)
-- P2: Medium (default)
-- P3: Low
-- P4: Backlog
-
-### Reaching for tools
-
-MCP tool schemas describe each tool; `filigree --help` and `filigree
---help` are the authoritative CLI reference. You do not need to memorise
-either catalogue. The verbs you will reach for most:
-
-- **Find work:** `work_ready`, `work_blocked`, `issue_list`, `issue_search`
-- **Claim work:** `work_start`, `work_start_next`
-- **Update:** `comment_add`, `label_add`, `issue_update`, `issue_close`
-- **Admin (irreversible):** `issue_delete` (MCP) / `delete-issue` (CLI) —
-  hard-deletes a terminal issue and its rows; `admin_undo_last` cannot reverse it.
-- **Scratchpad:** `observation_create`, `observation_list`, `observation_promote`, `observation_dismiss`
-- **Cross-product entity bindings (ADR-029):** `entity_association_add`,
-  `entity_association_remove`, `entity_association_list`,
-  `entity_association_list_by_entity`. Used when a sibling tool (e.g.
-  Clarion) needs to bind a Filigree issue to a function, class, or
-  module identifier it owns. The `entity_id` is an opaque external string
-  from Filigree's perspective and may be a `clarion:eid:...` SEI or a legacy
-  locator; callers may also supply `entity_kind` explicitly. The consumer (the sibling tool's read
-  path) does drift detection against the stored
-  `content_hash_at_attach`. `entity_association_list_by_entity` is the
-  reverse-lookup surface — given an opaque external entity ID, return every
-  Filigree issue bound to it (project isolation is by DB file). Also
-  reachable over HTTP as
-  `GET/POST /api/issue/{issue_id}/entity-associations`,
-  `DELETE /api/issue/{issue_id}/entity-associations?entity_id=…`,
-  and `GET /api/entity-associations?entity_id=…`.
-- **Health:** `stats_get`, `metrics_get`, `mcp_status_get`
-
-Pass `--actor ` (CLI) so events attribute to your agent identity. It
-works in either position — before the verb (`filigree --actor X update …`) or
-after it (`filigree update … --actor X`); the post-verb value overrides the
-group-level one.
-
-### Error handling
-
-Errors return `{error: str, code: ErrorCode, details?: dict}`. Switch on
-`code`, not on message text. Codes: `VALIDATION`, `NOT_FOUND`, `CONFLICT`,
-`INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`,
-`INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`,
-`CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`,
-`BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`.
-
-On `INVALID_TRANSITION`, call `workflow_transition_list` (MCP) or
-`filigree transitions ` to see what the workflow allows from here.
-
-Two failure modes deserve a specific response:
-
-- **`SCHEMA_MISMATCH`** — the installed `filigree` is older than the project
-  database. The error message contains upgrade guidance. Surface it to the
-  user; do not retry.
-- **`ForeignDatabaseError`** — filigree found a parent project's database
-  but no local `.filigree.conf`. Run `filigree init` in the current
-  directory. Do **not** `cd` upward to a different project unless that was
-  the actual intent.
-
diff --git a/CLAUDE.md b/CLAUDE.md
deleted file mode 100644
index d2ea656..0000000
--- a/CLAUDE.md
+++ /dev/null
@@ -1,119 +0,0 @@
-
-## Filigree Issue Tracker
-
-`filigree` tracks tasks for this project. Data lives in `.filigree/`. Prefer
-the MCP tools (`mcp__filigree__*`) when available; fall back to the `filigree`
-CLI otherwise.
-
-### Workflow
-
-```bash
-# At session start
-filigree session-context # ready / in-progress / critical path
-
-# Pick up the next startable issue (atomic claim + transition into its working status)
-filigree start-next-work --assignee
-# ...or claim a specific issue
-filigree start-work --assignee
-
-# Do the work, commit, then
-filigree close
-```
-
-Use the atomic claim+transition verbs — `work_start` / `work_start_next`
-(MCP) or `start-work` / `start-next-work` (CLI). Do **not** chain
-`work_claim` (MCP) or `filigree claim` (CLI) with a subsequent status
-update — the two-step form races against other agents; the combined verb is
-atomic.
-
-**Ready ≠ startable.** The working status is type-specific (tasks →
-`in_progress`, features → `building`). Bugs start at `triage`, which has no
-single-hop transition into work (`triage → confirmed → fixing`), so a triage
-bug is *ready* but not directly *startable*: `work_start` on one returns
-`INVALID_TRANSITION` naming the next status, and `work_start_next` skips it.
-`work_ready` items carry a `startable` flag (plus a `next_action` hint when
-false). Pass `advance=true` (MCP) / `--advance` (CLI) to walk the soft
-transitions to the nearest working status automatically.
-
-### Observations: when (and when not) to use them
-
-`observation_create` is a fire-and-forget scratchpad for *incidental* defects — things
-you notice *outside the scope of your current task* (a code smell in a
-neighbouring file, a stale TODO, a missing test for an edge case you happened
-to spot). Notes expire after 14 days unless promoted. Include `file_path` and
-`line` when relevant. At session end, skim `observation_list` and either
-`observation_dismiss` or `observation_promote` for what has accumulated.
-
-**You fix bugs in your currently defined scope. You do NOT use observations
-to finish work prematurely.** If a defect, gap, or follow-up belongs to your
-current task, you own it — handle it as part of that task: fix it now, expand
-the task's scope, file a proper issue with a dependency, or surface it to the
-user. Filing it as an observation and closing the task is *not* completing
-the task; it is shipping known-broken work and hiding the debt in a 14-day
-expiring scratchpad. The test is "would I have noticed this even if I weren't
-working on this task?" If no, it's task scope, not an observation.
-
-### Priority scale
-
-- P0: Critical (drop everything)
-- P1: High (do next)
-- P2: Medium (default)
-- P3: Low
-- P4: Backlog
-
-### Reaching for tools
-
-MCP tool schemas describe each tool; `filigree --help` and `filigree
---help` are the authoritative CLI reference. You do not need to memorise
-either catalogue. The verbs you will reach for most:
-
-- **Find work:** `work_ready`, `work_blocked`, `issue_list`, `issue_search`
-- **Claim work:** `work_start`, `work_start_next`
-- **Update:** `comment_add`, `label_add`, `issue_update`, `issue_close`
-- **Admin (irreversible):** `issue_delete` (MCP) / `delete-issue` (CLI) —
-  hard-deletes a terminal issue and its rows; `admin_undo_last` cannot reverse it.
-- **Scratchpad:** `observation_create`, `observation_list`, `observation_promote`, `observation_dismiss`
-- **Cross-product entity bindings (ADR-029):** `entity_association_add`,
-  `entity_association_remove`, `entity_association_list`,
-  `entity_association_list_by_entity`. Used when a sibling tool (e.g.
-  Clarion) needs to bind a Filigree issue to a function, class, or
-  module identifier it owns. The `entity_id` is an opaque external string
-  from Filigree's perspective and may be a `clarion:eid:...` SEI or a legacy
-  locator; callers may also supply `entity_kind` explicitly. The consumer (the sibling tool's read
-  path) does drift detection against the stored
-  `content_hash_at_attach`. `entity_association_list_by_entity` is the
-  reverse-lookup surface — given an opaque external entity ID, return every
-  Filigree issue bound to it (project isolation is by DB file). Also
-  reachable over HTTP as
-  `GET/POST /api/issue/{issue_id}/entity-associations`,
-  `DELETE /api/issue/{issue_id}/entity-associations?entity_id=…`,
-  and `GET /api/entity-associations?entity_id=…`.
-- **Health:** `stats_get`, `metrics_get`, `mcp_status_get`
-
-Pass `--actor ` (CLI) so events attribute to your agent identity. It
-works in either position — before the verb (`filigree --actor X update …`) or
-after it (`filigree update … --actor X`); the post-verb value overrides the
-group-level one.
-
-### Error handling
-
-Errors return `{error: str, code: ErrorCode, details?: dict}`. Switch on
-`code`, not on message text. Codes: `VALIDATION`, `NOT_FOUND`, `CONFLICT`,
-`INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`,
-`INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`,
-`CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`,
-`BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`.
-
-On `INVALID_TRANSITION`, call `workflow_transition_list` (MCP) or
-`filigree transitions ` to see what the workflow allows from here.
-
-Two failure modes deserve a specific response:
-
-- **`SCHEMA_MISMATCH`** — the installed `filigree` is older than the project
-  database. The error message contains upgrade guidance. Surface it to the
-  user; do not retry.
-- **`ForeignDatabaseError`** — filigree found a parent project's database
-  but no local `.filigree.conf`. Run `filigree init` in the current
-  directory. Do **not** `cd` upward to a different project unless that was
-  the actual intent.
-

From a565c27a2e96dcaf56e709edbe69f1bdf953ea6a Mon Sep 17 00:00:00 2001
From: John Morrissey <544926+tachyon-beep@users.noreply.github.com>
Date: Sat, 6 Jun 2026 12:23:45 +1000
Subject: [PATCH 02/72] chore: gitignore CLAUDE.md and AGENTS.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds the ignore entries that make the prior untracking stick — without
these, the session-start hook's regenerated copies would show up as
untracked and could be re-added.

Co-Authored-By: Claude Opus 4.8
---
 .gitignore | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.gitignore b/.gitignore
index ab4b814..9cd1a79 100644
--- a/.gitignore
+++ b/.gitignore
@@ -19,3 +19,5 @@ __pycache__/
 loomweave.yaml
 wardline.yaml
 .loomweave/loomweave.lock
+AGENTS.md
+CLAUDE.md

From 54dabd587ba512886e9d5b7c76ddfd43956ad7ce Mon Sep 17 00:00:00 2001
From: John Morrissey <544926+tachyon-beep@users.noreply.github.com>
Date: Sat, 6 Jun 2026 12:46:35 +1000
Subject: [PATCH 03/72] test(filigree/client): cover transport/error branches (roadmap 13)

filigree/client.py was the lowest-covered module when arch-analysis filed
this; subsequent PRs lifted it to 94%, leaving the transport/error paths a
security reviewer cares about uncovered (lines 35, 43, 118-119, 127, 130).

Add focused tests for the unsigned-transport seam (Q-M4):
- _json_body_bytes(None) -> b"" (zero-byte signed body)
- _path_and_query carries the query string the HMAC commits to
- _urllib_fetch wraps urllib.error.URLError as a typed FiligreeError
- _decode_json_response rejects non-JSON content type
- _decode_json_response rejects oversized responses before decode

client.py now at 100% line coverage; 527 passed.

Closes legis-f675dc5cd4

Co-Authored-By: Claude Opus 4.8
---
 tests/filigree/test_client.py | 62 +++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/tests/filigree/test_client.py b/tests/filigree/test_client.py
index 6eaf477..052fb07 100644
--- a/tests/filigree/test_client.py
+++ b/tests/filigree/test_client.py
@@ -1,5 +1,6 @@
 import pytest
 
+import legis.filigree.client as client_mod
 from legis.filigree.client import FiligreeError, HttpFiligreeClient
 
 
@@ -213,3 +214,64 @@ def fake_urlopen(req, timeout=None):
     ).encode("utf-8")
     expected = hmac.new(key, message, hashlib.sha256).hexdigest()
     assert signature == expected
+
+
+# --- roadmap 13: transport / error-path branches (the surface a security
+# reviewer cares about, and the unsigned-transport seam tied to Q-M4) ---
+
+def test_json_body_bytes_none_is_empty():
+    # A None body signs and sends zero bytes (the body-hash is over b"").
+    assert client_mod._json_body_bytes(None) == b""
+
+
+def test_path_and_query_includes_query_string():
+    # The signed message commits to path AND query; a verifier that dropped the
+    # query would compute a different signature, so the query must be carried.
+    assert (
+        client_mod._path_and_query("https://filigree/api/entity-associations?entity_id=x")
+        == "/api/entity-associations?entity_id=x"
+    )
+    # No query -> bare path; empty path -> "/".
+    assert client_mod._path_and_query("https://filigree/api/x") == "/api/x"
+    assert client_mod._path_and_query("https://filigree") == "/"
+
+
+def test_urllib_fetch_wraps_transport_error(monkeypatch):
+    # A urllib URLError (DNS failure, connection refused, timeout) surfaces as a
+    # typed FiligreeError, never an unhandled urllib exception.
+    import urllib.request
+
+    def boom(req, timeout=None):
+        raise urllib.error.URLError("connection refused")
+
+    monkeypatch.setattr(urllib.request, "urlopen", boom)
+    with pytest.raises(FiligreeError, match="connection refused"):
+        client_mod._urllib_fetch("GET", "https://filigree.example/api/x", None)
+
+
+def test_decode_rejects_non_json_content_type():
+    # A proxy/error page returning text/html must not be json-parsed; it is a
+    # typed transport error.
+    class _HtmlResp:
+        headers = {"Content-Type": "text/html; charset=utf-8"}
+
+        def read(self, _n):  # pragma: no cover - not reached; type check first
+            return b"503"
+
+    with pytest.raises(FiligreeError, match="non-JSON content type"):
+        client_mod._decode_json_response(_HtmlResp(), "GET /api/x")
+
+
+def test_decode_rejects_oversized_response():
+    # A response larger than MAX_RESPONSE_BYTES is rejected before decode so a
+    # hostile/buggy Filigree cannot exhaust memory.
+    big = b"x" * (client_mod.MAX_RESPONSE_BYTES + 1)
+
+    class _BigResp:
+        headers = {"Content-Type": "application/json"}
+
+        def read(self, n):
+            return big[:n]
+
+    with pytest.raises(FiligreeError, match="response too large"):
+        client_mod._decode_json_response(_BigResp(), "GET /api/x")

From 4ca617f70796c1242dbe819134f98ccd89f6cff5 Mon Sep 17 00:00:00 2001
From: John Morrissey <544926+tachyon-beep@users.noreply.github.com>
Date: Sat, 6 Jun 2026 12:50:13 +1000
Subject: [PATCH 04/72] ci: raise coverage floor to 88%, add ruff lint gate, clear F401s (Q-L7 / roadmap 11)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

CI ran --cov-fail-under=70 against ~91% actual coverage — 20 points of
silent-regression headroom — and had no lint step. Two F401 unused imports
existed (policy/grammar.py Hashable, mcp.py WardlinePayloadError).

- Clear both F401s.
- Add ruff (dev dep + [tool.ruff.lint] default rule set E4/E7/E9/F) and a
  `ruff check src` CI gate. Import-sorting/pyupgrade left off deliberately
  to avoid unrelated churn and F821 on the honesty-gate test fixtures.
- Raise the global floor to 88% (--cov-fail-under and
  [tool.coverage.report]).
- Add scripts/check_coverage_floors.py: per-package floors for the
  security-critical packages (enforcement 93, service 92, governance 90,
  api 88, mcp.py 80), set a few points below current so they catch a real
  regression without tripping on incidental churn. Wired as a CI step.

527 passed, total 91.12%, all floors hold; ruff/mypy/boundary-gate green.

Closes legis-d16c4fab16

Co-Authored-By: Claude Opus 4.8
---
 .github/workflows/ci.yml         |  6 +-
 .gitignore                       |  1 +
 pyproject.toml                   | 21 +++++++
 scripts/check_coverage_floors.py | 95 ++++++++++++++++++++++++++++++++
 src/legis/mcp.py                 |  2 +-
 src/legis/policy/grammar.py      |  2 +-
 uv.lock                          | 27 +++++++++
 7 files changed, 151 insertions(+), 3 deletions(-)
 create mode 100644 scripts/check_coverage_floors.py

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index d77e3a3..a6c44fb 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -15,8 +15,12 @@ jobs:
           enable-cache: true
       - name: Install dependencies
         run: uv sync --dev
+      - name: Run lint
+        run: uv run ruff check src
       - name: Run test suite
-        run: uv run pytest --cov=legis --cov-report=term-missing --cov-fail-under=70
+        run: uv run pytest --cov=legis --cov-report=term-missing --cov-report=json --cov-fail-under=88
+      - name: Enforce per-package coverage floors
+        run: uv run python scripts/check_coverage_floors.py
       - name: Run SEI conformance oracle
         run: uv run pytest tests/conformance/test_sei_oracle.py
       - name: Run live Loomweave oracle
diff --git a/.gitignore b/.gitignore
index ab4b814..09b1965 100644
--- a/.gitignore
+++ b/.gitignore
@@ -15,6 +15,7 @@ __pycache__/
 .filigree
 .filigree.conf
 .coverage
+coverage.json
 .mcp.json
 loomweave.yaml
 wardline.yaml
diff --git a/pyproject.toml b/pyproject.toml
index 0f23bc0..f134bad 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -40,6 +40,7 @@ dev = [
     "pytest-cov>=5.0",
     "httpx>=0.27",
     "mypy>=1.19",
+    "ruff>=0.8",
     "types-PyYAML>=6.0",
 ]
 
@@ -62,3 +63,23 @@ python_version = "3.12"
 files = ["src/legis"]
 show_error_codes = true
 warn_unused_configs = true
+
+[tool.ruff]
+target-version = "py312"
+src = ["src"]
+
+[tool.ruff.lint]
+# Ruff's default rule set (pyflakes F + the safe slice of pycodestyle E). This
+# is the level that caught the F401 unused imports this floor-raise cleared; the
+# CI gate (`ruff check src`) keeps src clean. Import-sorting (I) / pyupgrade (UP)
+# are deliberately not enabled here — turning them on would impose unrelated
+# reformatting churn and trip F821 on the honesty-gate test fixtures that inject
+# `handler` dynamically; out of scope for a lint-gating change.
+select = ["E4", "E7", "E9", "F"]
+
+[tool.coverage.report]
+# Global silent-regression floor. CI passes --cov-fail-under explicitly (same
+# value) so a local `pytest --cov` matches the gate. Per-package floors for the
+# security-critical packages are enforced separately by
+# scripts/check_coverage_floors.py.
+fail_under = 88
diff --git a/scripts/check_coverage_floors.py b/scripts/check_coverage_floors.py
new file mode 100644
index 0000000..43943f7
--- /dev/null
+++ b/scripts/check_coverage_floors.py
@@ -0,0 +1,95 @@
+#!/usr/bin/env python3
+"""Per-package coverage-floor gate (roadmap 11 / Q-L7).
+
+The global ``--cov-fail-under`` floor closes the aggregate silent-regression
+headroom, but a regression concentrated in one security-critical package can
+hide behind a high total. This gate enforces a minimum line-coverage percentage
+per package (or single module) against ``coverage.json``.
+
+Floors are intentionally set a few points below current coverage: tight enough
+to catch a real regression, loose enough not to trip on incidental churn. Raise
+a floor when a package's coverage rises and you want to lock the gain in.
+
+Usage:
+    python scripts/check_coverage_floors.py [coverage.json]
+
+Exit status 0 if every floor holds, 1 otherwise (with a per-package report).
+"""
+
+from __future__ import annotations
+
+import json
+import sys
+
+# path-prefix (relative to repo root, as coverage records it) -> floor percent.
+# A prefix ending in ".py" matches a single module; otherwise it matches a
+# package subtree. Current coverage (2026-06-06) shown in the trailing comment.
+FLOORS: dict[str, float] = {
+    "src/legis/enforcement/": 93.0,  # currently ~95.0
+    "src/legis/service/": 92.0,  # currently ~94.1
+    "src/legis/governance/": 90.0,  # currently ~92.7
+    "src/legis/api/": 88.0,  # currently ~89.8
+    "src/legis/mcp.py": 80.0,  # currently ~82
+}
+
+
+def _load(path: str) -> dict:
+    with open(path, encoding="utf-8") as fh:
+        return json.load(fh)
+
+
+def _aggregate(files: dict, prefix: str) -> tuple[int, int]:
+    """Sum (covered_lines, num_statements) over files matching ``prefix``."""
+    covered = statements = 0
+    for path, info in files.items():
+        norm = path.replace("\\", "/")
+        if prefix.endswith(".py"):
+            match = norm == prefix
+        else:
+            match = norm.startswith(prefix)
+        if match:
+            summary = info["summary"]
+            covered += summary["covered_lines"]
+            statements += summary["num_statements"]
+    return covered, statements
+
+
+def main(argv: list[str]) -> int:
+    report_path = argv[1] if len(argv) > 1 else "coverage.json"
+    try:
+        data = _load(report_path)
+    except FileNotFoundError:
+        print(
+            f"coverage report not found: {report_path}\n"
+            "Run pytest with --cov-report=json first.",
+            file=sys.stderr,
+        )
+        return 1
+
+    files = data.get("files", {})
+    failures: list[str] = []
+    print(f"Per-package coverage floors ({report_path}):")
+    for prefix, floor in sorted(FLOORS.items()):
+        covered, statements = _aggregate(files, prefix)
+        if statements == 0:
+            failures.append(f" {prefix}: no statements measured (prefix matched nothing)")
+            continue
+        pct = 100.0 * covered / statements
+        status = "ok" if pct >= floor else "FAIL"
+        print(f" [{status}] {prefix:28} {pct:5.1f}% (floor {floor:.1f}%, {covered}/{statements})")
+        if pct < floor:
+            failures.append(
+                f" {prefix}: {pct:.1f}% < floor {floor:.1f}%"
+            )
+
+    if failures:
+        print("\nCoverage floor breach:", file=sys.stderr)
+        for line in failures:
+            print(line, file=sys.stderr)
+        return 1
+    print("All per-package coverage floors hold.")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main(sys.argv))
diff --git a/src/legis/mcp.py b/src/legis/mcp.py
index 53e901e..736218f 100644
--- a/src/legis/mcp.py
+++ b/src/legis/mcp.py
@@ -54,7 +54,7 @@
 from legis.service.wardline import route_wardline_scan
 from legis.store.audit_store import AuditStore
 from legis.wardline.governor import WardlineCellPolicy
-from legis.wardline.ingest import WardlinePayloadError, WardlineSeverity
+from legis.wardline.ingest import WardlineSeverity
 
 
 _AGENT_TOOLS = frozenset(
diff --git a/src/legis/policy/grammar.py b/src/legis/policy/grammar.py
index 0517928..7b654f9 100644
--- a/src/legis/policy/grammar.py
+++ b/src/legis/policy/grammar.py
@@ -12,7 +12,7 @@
 
 from __future__ import annotations
 
-from collections.abc import Hashable, Mapping
+from collections.abc import Mapping
 from dataclasses import dataclass
 from enum import Enum
 from typing import Any, Protocol, runtime_checkable
diff --git a/uv.lock b/uv.lock
index f8f4e34..63ed943 100644
--- a/uv.lock
+++ b/uv.lock
@@ -371,6 +371,7 @@ dev = [
     { name = "mypy" },
     { name = "pytest" },
     { name = "pytest-cov" },
+    { name = "ruff" },
     { name = "types-pyyaml" },
 ]
 
@@ -389,6 +390,7 @@ dev = [
     { name = "mypy", specifier = ">=1.19" },
     { name = "pytest", specifier = ">=8.0" },
     { name = "pytest-cov", specifier = ">=5.0" },
+    { name = "ruff", specifier = ">=0.8" },
     { name = "types-pyyaml", specifier = ">=6.0" },
 ]
 
@@ -716,6 +718,31 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/f1/12/de94a39c2ef588c7e6455cfbe7343d3b2dc9d6b6b2f40c4c6565744c873d/pyyaml-6.0.3-cp314-cp314t-win_arm64.whl", hash = "sha256:ebc55a14a21cb14062aa4162f906cd962b28e2e9ea38f9b4391244cd8de4ae0b", size = 149341, upload-time = "2025-09-25T21:32:56.828Z" },
 ]
 
+[[package]]
+name = "ruff"
+version = "0.15.16"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/a6/bd/5f7ec371001337d8fa61701c186ff8b613ecac1651848c5950f4c4d5f2e9/ruff-0.15.16.tar.gz", hash = "sha256:d05e78d38c78caf020b03789e25106c93017db5a0cb6e2819885018c61343b78", size = 4714267, upload-time = "2026-06-04T16:33:09.974Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/0c/42/53ef1c3953f157956db9bf7861e3bc50b9b887ce93300aa48cdba8336fe6/ruff-0.15.16-py3-none-linux_armv6l.whl", hash = "sha256:6ac3c0b3969cc6cf6b158c4e2f8f682acb58e7d700d8a44b65ecdc72d66ab0b2", size = 10709025, upload-time = "2026-06-04T16:32:51.935Z" },
+    { url = "https://files.pythonhosted.org/packages/93/9a/a79159346f19134a956607754e57d8d128f7a4c00f4ad2f7514d224c172c/ruff-0.15.16-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:197c207ed75ffba54a0dec23db4aa939a27a3053073e085e0042433cbdc58e4a", size = 11063550, upload-time = "2026-06-04T16:32:42.24Z" },
+    { url = "https://files.pythonhosted.org/packages/bc/72/3ce2ac000a5299ec238e01f51397b3b653c93b077d9b1bfe8715bb895f20/ruff-0.15.16-py3-none-macosx_11_0_arm64.whl", hash = "sha256:3a39fec45ab316cc23e7558f23fea4a70403ddb5648ea9a4a3854a16973d0071", size = 10421345, upload-time = "2026-06-04T16:32:37.251Z" },
+    { url = "https://files.pythonhosted.org/packages/b0/c2/cc7fad3ec9169373f5b6a18f1917b91080feec40c3f9658334a1d28e2f03/ruff-0.15.16-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:ba93191d79003116b95128c9d306e045200fdbd0bccb782b110f3cd1d4abc5cf", size = 10757217, upload-time = "2026-06-04T16:32:54.722Z" },
+    { url = "https://files.pythonhosted.org/packages/69/d2/3474009eaa0a65b31fa7152a2fad5e2f050c640ceb1e6b02ee6922e94c82/ruff-0.15.16-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:c6ee4b90520630120ef032aa5cc10db483852dff950e78b1d717e2993a61ac8d", size = 10507035, upload-time = "2026-06-04T16:33:05.343Z" },
+    { url = "https://files.pythonhosted.org/packages/ca/81/b7ae6ccbd11f0c8dc3d5d67fc4be9b57ff57ca86ba56152021378e1277f2/ruff-0.15.16-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4e4215bc938bc3c8215c1472c1aa437e310fee20cd427335fec9d7e609563628", size = 11255291, upload-time = "2026-06-04T16:32:49.49Z" },
+    { url = "https://files.pythonhosted.org/packages/d9/e1/46e526f1a7cc90857ce6ddf25fbb77eb6568651ac38d71b033af07076dd5/ruff-0.15.16-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:7c8d26be963b090f10e29abc8b3e74a2a321f6fa34e02424e30b5af89350ecbb", size = 12124922, upload-time = "2026-06-04T16:33:07.821Z" },
+    { url = "https://files.pythonhosted.org/packages/1a/da/5c791b088b596b24d0deb967fa28ae02ad751a140c0b9ea81c5ab915d6c0/ruff-0.15.16-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f198cf4123602a2280ed46c307bcbafe41758d6fee5b456b6b6058ca1514b3b4", size = 11332186, upload-time = "2026-06-04T16:33:02.971Z" },
+    { url = "https://files.pythonhosted.org/packages/72/11/5da87abe20047c8962361473923ebb2f62b595250126aadfad8c20649c1e/ruff-0.15.16-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bb27515fa6240fb586ae82b901a59e67d24acff86f2190b433dc542fe0435aeb", size = 11373541, upload-time = "2026-06-04T16:32:47.007Z" },
+    { url = "https://files.pythonhosted.org/packages/fe/2a/8554754c23a854ae3fd6b507e36ad61ddb121e298c6d5d617dec94ed0f14/ruff-0.15.16-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:a267c46ba1593fc26b8eecbea050b39d40c0b6bb7781ee11c90a02cd10032951", size = 11353014, upload-time = "2026-06-04T16:32:34.795Z" },
+    { url = "https://files.pythonhosted.org/packages/62/25/62ea41529ec89f742ea3fed9cb1059c72877ec7cf9b9e99ac9cf3294d1d9/ruff-0.15.16-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:528c68f39a91498a8d50e91ff5985df3d105782bab49cc378e73ac26bff083e8", size = 10737467, upload-time = "2026-06-04T16:32:26.348Z" },
+    { url = "https://files.pythonhosted.org/packages/90/17/334d3ad9de4d40f9dd58fdd09e35ce64553bb501e2f19a839e2fb6be14fc/ruff-0.15.16-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:7ed55c58950df60589a9a7a5d2f8fa5f54ebd287163be805adfe6ee95a9de123", size = 10521910, upload-time = "2026-06-04T16:32:32.54Z" },
+    { url = "https://files.pythonhosted.org/packages/4d/bd/3ac7c6ae77a885c1004b3dda2446ea401768d24f851c14b4ad4b24f6639c/ruff-0.15.16-py3-none-musllinux_1_2_i686.whl", hash = "sha256:d482feaf51512b50f9790ceb417a56a61dd1e9d9bf967662b9ed27c01b34f53a", size = 10979190, upload-time = "2026-06-04T16:32:57.492Z" },
+    { url = "https://files.pythonhosted.org/packages/33/d7/609546e6a413c3f216fbf2a50c928f97c80939154f6a0503114094a86191/ruff-0.15.16-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:1e15bc8c94513dae2a40cc9ef07c94fdd4ecc9e29dabebeebe170f952322c9e3", size = 11477014, upload-time = "2026-06-04T16:32:44.687Z" },
+    { url = "https://files.pythonhosted.org/packages/74/0d/f2cd247ad32633a5c36e97141a2c21b11c6279f7957bc2ff360b1e08fddd/ruff-0.15.16-py3-none-win32.whl", hash = "sha256:580378f7bd4aa25f72e74aa54948a9622f142b1e509521dd10902e886681cc1e", size = 10735541, upload-time = "2026-06-04T16:32:30.145Z" },
+    { url = "https://files.pythonhosted.org/packages/8b/9e/02e845ef151b1dee585e55c4739f8e1734ae1d9f1221dff65761c162208b/ruff-0.15.16-py3-none-win_amd64.whl", hash = "sha256:408256017284eddf98fff77b29aa4fb30f586042d535b2d9befc6512f400aaec", size = 11843403, upload-time = "2026-06-04T16:32:39.76Z" },
+    { url = "https://files.pythonhosted.org/packages/15/19/016553f86f207450aebebc2b2b5088d086b901cc8186c02ac4284db3bd88/ruff-0.15.16-py3-none-win_arm64.whl", hash = "sha256:8cd61783afb39638a7133ef0d2dfb1e91277593962f81b5a8423eb0b888a6121", size = 11134555, upload-time = "2026-06-04T16:33:00.136Z" },
+]
+
 [[package]]
 name = "sqlalchemy"
 version = "2.0.50"

From fa01db198604a56572cd05f4ade3eddf27d25f1e Mon Sep 17 00:00:00 2001
From: John Morrissey <544926+tachyon-beep@users.noreply.github.com>
Date: Sat, 6 Jun 2026 12:53:20 +1000
Subject: [PATCH 05/72] fix(identity): TTL-revalidate capability latch; type-check content_hash (Q-L6)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

resolver.py probed the Loomweave sei capability once per instance and
latched the result for the resolver's whole life. A long-lived service
resolver whose Loomweave LOST the capability mid-life kept treating it as
capable until a later call happened to raise (and symmetrically never
noticed a capability regained).

- Add a capability TTL (default 300s): the latch — positive or negative —
  ages out and is re-probed. Time is injected (`monotonic`, default
  time.monotonic) for deterministic tests, mirroring the client.py
  timestamp/nonce injection. Transient probe errors still clear the latch
  and retry on the next resolve.
- Type-check content_hash from the resolve response: a non-string degrades
  to None rather than landing in the typed str|None content axis.

531 passed; ruff/mypy/coverage-floors green.
Closes legis-5875d0f156 Co-Authored-By: Claude Opus 4.8 --- src/legis/identity/resolver.py | 50 +++++++++++++++++++---- tests/identity/test_resolver.py | 72 +++++++++++++++++++++++++++++++++ 2 files changed, 115 insertions(+), 7 deletions(-) diff --git a/src/legis/identity/resolver.py b/src/legis/identity/resolver.py index e9f6589..f719a8a 100644 --- a/src/legis/identity/resolver.py +++ b/src/legis/identity/resolver.py @@ -9,13 +9,19 @@ from __future__ import annotations +import time from dataclasses import dataclass -from typing import Any +from typing import Any, Callable from legis.canonical import content_hash from legis.identity.loomweave_client import LoomweaveIdentity from legis.identity.entity_key import EntityKey +# A long-lived resolver re-probes the Loomweave sei capability at most once per +# this window. Without it a positive latch is permanent: a Loomweave that loses +# the capability mid-life would be trusted forever (Q-L6). +_DEFAULT_CAPABILITY_TTL_SECONDS = 300.0 + @dataclass(frozen=True) class IdentityResolution: @@ -28,9 +34,18 @@ class IdentityResolution: class IdentityResolver: - def __init__(self, client: LoomweaveIdentity | None) -> None: + def __init__( + self, + client: LoomweaveIdentity | None, + *, + capability_ttl: float = _DEFAULT_CAPABILITY_TTL_SECONDS, + monotonic: Callable[[], float] = time.monotonic, + ) -> None: self._client = client - self._capable: bool | None = None # probe once per instance + self._capable: bool | None = None # cached probe result; None = unknown + self._capable_checked_at: float | None = None + self._capability_ttl = capability_ttl + self._monotonic = monotonic @property def client(self) -> LoomweaveIdentity | None: @@ -40,12 +55,28 @@ def client(self) -> LoomweaveIdentity | None: def _capability(self) -> bool: if self._client is None: return False - if self._capable is None: + now = self._monotonic() + checked_at = self._capable_checked_at + # The latch (positive OR negative) is fresh only while within the 
TTL. + # The original code latched the first result for the resolver's whole + # life, so a capability lost (or gained) upstream was never noticed by a + # long-lived resolver (Q-L6). + fresh = ( + self._capable is not None + and checked_at is not None + and now - checked_at < self._capability_ttl + ) + if not fresh: try: self._capable = bool(self._client.capability()) except Exception: - return False # honest transient degrade — retry on next resolve - return self._capable + # Honest transient degrade — clear the latch so the next resolve + # retries rather than trusting a stale value. + self._capable = None + self._capable_checked_at = None + return False + self._capable_checked_at = now + return self._capable if self._capable is not None else False def _snapshot(self, sei: str) -> tuple[dict[str, Any] | None, str]: try: @@ -86,10 +117,15 @@ def resolve(self, locator: str) -> IdentityResolution: if not isinstance(sei, str) or not sei: return degraded snapshot, snapshot_status = self._snapshot(sei) + # content_hash is carried verbatim into the governance record; trust only + # a string. A non-string from a buggy/hostile Loomweave degrades to None + # rather than polluting the typed content axis (Q-L6). 
+ raw_content_hash = res.get("content_hash") + content_hash_value = raw_content_hash if isinstance(raw_content_hash, str) else None return IdentityResolution( EntityKey.from_sei(sei), True, - res.get("content_hash"), + content_hash_value, snapshot, "resolved", snapshot_status, diff --git a/tests/identity/test_resolver.py b/tests/identity/test_resolver.py index 8461c77..5a2e3cc 100644 --- a/tests/identity/test_resolver.py +++ b/tests/identity/test_resolver.py @@ -108,3 +108,75 @@ def test_alive_sei_with_lineage_failure_records_unavailable_status(): assert res.lineage_snapshot is None assert res.identity_resolution_status == "resolved" assert res.lineage_snapshot_status == "unavailable" + + +# --- Q-L6: the capability latch must revalidate (TTL), and content_hash must be +# type-checked, not trusted verbatim from the Loomweave response. --- + + +class _Probe(FakeClient): + """A client whose capability can be flipped, counting probes.""" + + def __init__(self, *, capable=True, resolve=None, lineage=None): + super().__init__(capable=capable, resolve=resolve, lineage=lineage) + self.probes = 0 + + def capability(self): + self.probes += 1 + return self._capable + + +def test_capability_is_cached_within_ttl(): + # Within the TTL window the positive latch is reused — one probe across many + # resolves (the caching the original code intended). + clock = {"t": 1000.0} + client = _Probe(resolve=ALIVE, lineage=[{"event": "born"}]) + r = IdentityResolver(client, capability_ttl=300.0, monotonic=lambda: clock["t"]) + for _ in range(5): + assert r.resolve("python:function:m.f").entity_key.identity_stable is True + assert client.probes == 1 + + +def test_capability_latch_revalidates_after_ttl(): + # A Loomweave that LOSES the sei capability mid-life must not be treated as + # capable forever by a long-lived resolver. After the TTL elapses the latch + # is re-probed and the resolver honestly degrades. 
+ clock = {"t": 1000.0} + client = _Probe(resolve=ALIVE, lineage=[{"event": "born"}]) + r = IdentityResolver(client, capability_ttl=300.0, monotonic=lambda: clock["t"]) + + assert r.resolve("python:function:m.f").entity_key.identity_stable is True + assert client.probes == 1 + + client._capable = False # capability revoked upstream + clock["t"] += 299.0 # still within TTL → stale latch reused + assert r.resolve("python:function:m.f").entity_key.identity_stable is True + assert client.probes == 1 + + clock["t"] += 2.0 # now past TTL → re-probe, sees the loss + assert r.resolve("python:function:m.f").entity_key.identity_stable is False + assert client.probes == 2 + + +def test_capability_regained_after_ttl_is_noticed(): + # Symmetric to revocation: a negative latch must also age out, so a Loomweave + # that GAINS the capability is eventually picked up. + clock = {"t": 0.0} + client = _Probe(capable=False, resolve=ALIVE, lineage=[{"event": "born"}]) + r = IdentityResolver(client, capability_ttl=300.0, monotonic=lambda: clock["t"]) + + assert r.resolve("python:function:m.f").entity_key.identity_stable is False + client._capable = True + clock["t"] += 301.0 + assert r.resolve("python:function:m.f").entity_key.identity_stable is True + + +def test_non_string_content_hash_is_dropped(): + # content_hash is carried verbatim into the record; a non-string value from a + # buggy/hostile Loomweave must not land in the typed str|None field. 
+ for bad in (12345, {"nested": "obj"}, ["list"], 3.14): + resolve = {**ALIVE, "content_hash": bad} + r = IdentityResolver(FakeClient(resolve=resolve, lineage=[{"event": "born"}])) + res = r.resolve("python:function:m.f") + assert res.entity_key.value == "loomweave:eid:deadbeef" + assert res.content_hash is None From 881c80fd3a4efa8e701e79a8e420686b320a8b65 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 12:55:59 +1000 Subject: [PATCH 06/72] ci: make live Loomweave conformance non-optional for releases (roadmap 12) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The per-PR live oracle in ci.yml is gated on `vars.LOOMWEAVE_URL != ''`, so an absent variable silently skips it and Loomweave endpoint/header drift sails through default CI. Releases inherited the same blind spot. - Add .github/workflows/loomweave-conformance.yml: a FAIL-CLOSED live oracle gate. A missing LOOMWEAVE_URL / LOOMWEAVE_LIVE_ORACLE_LOCATOR / HMAC secret is an ::error and exit 1, not a skip. Runs on a daily schedule (drift sweep), workflow_dispatch, and workflow_call. - Gate release publish on it: release.yml grows a `conformance` job that calls the reusable workflow, and `publish` now needs [build, conformance]. A release cannot be published unless live conformance passes. The per-PR ci.yml step stays opt-in (unchanged) — releases and the schedule are where conformance is now mandatory. Live behavior can only be exercised with a real Loomweave endpoint; YAML structure and the local skip path are verified. 
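The fail-closed check described above can be restated as a short Python sketch. This is an illustration only: the actual gate is the shell step in loomweave-conformance.yml, and `REQUIRED`, `missing_settings`, and `gate_exit_code` are hypothetical names that exist nowhere in the repository. Only the three setting names come from the patch.

```python
from typing import Mapping

# The three settings the workflow requires (names taken from the patch).
REQUIRED = (
    "LOOMWEAVE_URL",
    "LOOMWEAVE_LIVE_ORACLE_LOCATOR",
    "LEGIS_LOOMWEAVE_HMAC_KEY",
)


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return every required name that is unset or empty, in declaration order."""
    return [name for name in REQUIRED if not env.get(name)]


def gate_exit_code(env: Mapping[str, str]) -> int:
    """Fail closed: any missing setting is an error (exit 1), never a skip."""
    missing = missing_settings(env)
    for name in missing:
        # Report every gap before failing, as the shell step does.
        print(f"::error::{name} is not set")
    return 1 if missing else 0


# Hypothetical example values, for illustration only.
complete = {
    "LOOMWEAVE_URL": "https://loomweave.example",
    "LOOMWEAVE_LIVE_ORACLE_LOCATOR": "python:function:m.f",
    "LEGIS_LOOMWEAVE_HMAC_KEY": "not-a-real-key",
}
partial = {"LOOMWEAVE_URL": "https://loomweave.example"}
```

Collecting all missing names before exiting mirrors the `missing=1` accumulator in the workflow, so one run reports every gap rather than only the first.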
Closes legis-2087bfca94 Co-Authored-By: Claude Opus 4.8 --- .github/workflows/loomweave-conformance.yml | 64 +++++++++++++++++++++ .github/workflows/release.yml | 10 +++- 2 files changed, 73 insertions(+), 1 deletion(-) create mode 100644 .github/workflows/loomweave-conformance.yml diff --git a/.github/workflows/loomweave-conformance.yml b/.github/workflows/loomweave-conformance.yml new file mode 100644 index 0000000..7a28077 --- /dev/null +++ b/.github/workflows/loomweave-conformance.yml @@ -0,0 +1,64 @@ +name: loomweave-conformance + +# Live cross-repo Loomweave SEI conformance. +# +# Unlike the per-PR oracle step in ci.yml (opt-in, silently skipped when +# LOOMWEAVE_URL is unset), this gate is FAIL-CLOSED: a missing endpoint, locator +# fixture, or HMAC credential is an ERROR, not a pass. That closes the roadmap-12 +# hole where an absent var let Loomweave endpoint/header drift sail through CI. +# +# It runs on a schedule (catch drift between releases) and is callable as a +# reusable workflow (`workflow_call`) so the release pipeline gates publish on it +# — making conformance non-optional for releases. + +on: + schedule: + - cron: "0 7 * * *" # daily 07:00 UTC drift sweep + workflow_dispatch: + workflow_call: + +permissions: + contents: read + +jobs: + live-loomweave-oracle: + name: Live Loomweave oracle (fail-closed) + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - uses: astral-sh/setup-uv@v5 + with: + enable-cache: true + - name: Install dependencies + run: uv sync --dev + - name: Require live Loomweave configuration + env: + LOOMWEAVE_URL: ${{ vars.LOOMWEAVE_URL }} + LOOMWEAVE_LIVE_ORACLE_LOCATOR: ${{ vars.LOOMWEAVE_LIVE_ORACLE_LOCATOR }} + LEGIS_LOOMWEAVE_HMAC_KEY: ${{ secrets.LEGIS_LOOMWEAVE_HMAC_KEY }} + run: | + missing=0 + if [ -z "${LOOMWEAVE_URL}" ]; then + echo "::error::LOOMWEAVE_URL variable is not set — live Loomweave conformance cannot run. Configure it under Settings → Secrets and variables → Actions → Variables." 
+ missing=1 + fi + if [ -z "${LOOMWEAVE_LIVE_ORACLE_LOCATOR}" ]; then + echo "::error::LOOMWEAVE_LIVE_ORACLE_LOCATOR variable is not set — the round-trip locator fixture is required for conformance." + missing=1 + fi + if [ -z "${LEGIS_LOOMWEAVE_HMAC_KEY}" ]; then + echo "::error::LEGIS_LOOMWEAVE_HMAC_KEY secret is not set — the signed Loomweave channel credential is required." + missing=1 + fi + if [ "${missing}" -ne 0 ]; then + exit 1 + fi + - name: Run live Loomweave conformance oracle + env: + LOOMWEAVE_URL: ${{ vars.LOOMWEAVE_URL }} + LOOMWEAVE_LIVE_ORACLE_LOCATOR: ${{ vars.LOOMWEAVE_LIVE_ORACLE_LOCATOR }} + LEGIS_LOOMWEAVE_HMAC_KEY: ${{ secrets.LEGIS_LOOMWEAVE_HMAC_KEY }} + # -rs reports any skip in the log; the guard above makes the test file's + # own skipif conditions (unset URL / locator) unreachable, so a skip here + # would signal an unexpected gap rather than a benign opt-out. + run: uv run pytest tests/conformance/test_live_loomweave_oracle.py -q -rs diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml index 2b903a5..09760d8 100644 --- a/.github/workflows/release.yml +++ b/.github/workflows/release.yml @@ -54,9 +54,17 @@ jobs: name: dist path: dist/ + conformance: + # Live cross-repo Loomweave SEI conformance, required before publish. The + # reusable workflow is fail-closed: a missing LOOMWEAVE_URL / locator / HMAC + # credential fails the release rather than silently skipping (roadmap 12). 
+ name: Live Loomweave conformance + uses: ./.github/workflows/loomweave-conformance.yml + secrets: inherit + publish: name: Publish to PyPI - needs: build + needs: [build, conformance] runs-on: ubuntu-latest environment: name: pypi From c1f726d4490311609a66aa14a86c53d0d75e098b Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 13:00:01 +1000 Subject: [PATCH 07/72] fix(store): enforce AuditStore batch read-free invariant + regression test (Q-M5) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit AuditStore.transaction()'s all-or-nothing guarantee depended on an UNENFORCED contract: every public read (read_all / read_by_seq / verify_integrity / get_latest_sequence_and_hash) opens its own connection, so a read issued inside a held BEGIN IMMEDIATE batch would miss the uncommitted appends and contend with the write lock (SQLITE_BUSY on SQLite, possibly silent on other backends). Only governor's own discipline (resolve all entities before opening the batch) kept the gate append paths read-free — nothing in the store stopped a future gate method from introducing an in-batch read. - Enforce: each read method guards on the thread-local and raises a clear RuntimeError if called inside an active transaction() batch on the same thread — turning silent, backend-dependent contention into a deterministic, loud failure a future regression's tests catch immediately. - Regression test (on real on-disk SQLite, not in-memory/shared-cache): drives the real EnforcementEngine.submit_override and SignoffGate.request append paths through route_findings' batch and asserts they complete (proving read- free), plus guard-fires-on-read and all-or-nothing rollback tests. - Update transaction() docstrings (store + protocol) to state the invariant is now enforced. 539 passed; ruff/mypy/coverage-floors green. 
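The guard can be sketched with an in-memory stand-in, assuming a thread-local flag marks the held batch. `GuardedStore` is hypothetical and is not the real `AuditStore`: it uses a plain list instead of SQLite and a boolean instead of the ambient connection, and it shows only the shape of the check and the rollback.

```python
import threading
from contextlib import contextmanager
from typing import Iterator


class GuardedStore:
    """Hypothetical in-memory stand-in: a list for the log, a flag for the batch."""

    def __init__(self) -> None:
        self._txn = threading.local()
        self._rows: list[str] = []

    @contextmanager
    def transaction(self) -> Iterator[None]:
        if getattr(self._txn, "active", False):
            # Already inside a batch on this thread: nested no-op.
            yield
            return
        self._txn.active = True
        checkpoint = len(self._rows)
        try:
            yield
        except BaseException:
            # All-or-nothing: discard every append made inside the batch.
            del self._rows[checkpoint:]
            raise
        finally:
            self._txn.active = False

    def append(self, row: str) -> None:
        self._rows.append(row)

    def read_all(self) -> list[str]:
        # The guard: a read inside a held batch fails loudly instead of contending.
        if getattr(self._txn, "active", False):
            raise RuntimeError(
                "read_all() called inside an active transaction() batch"
            )
        return list(self._rows)


store = GuardedStore()
store.append("before")
error = ""
try:
    with store.transaction():
        store.append("in-batch")
        store.read_all()  # trips the guard
except RuntimeError as exc:
    error = str(exc)
rows_after = store.read_all()  # the batch has exited, so reads work again
```

Because the flag is thread-local, a read on a different thread is not blocked by this check. The commit message likewise scopes the guard to reads on the same thread.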
Closes legis-79f42be309 Co-Authored-By: Claude Opus 4.8 --- src/legis/store/audit_store.py | 41 +++++- src/legis/store/protocol.py | 4 +- tests/store/test_batch_read_free_invariant.py | 134 ++++++++++++++++++ 3 files changed, 171 insertions(+), 8 deletions(-) create mode 100644 tests/store/test_batch_read_free_invariant.py diff --git a/src/legis/store/audit_store.py b/src/legis/store/audit_store.py index c17b623..b3e550f 100644 --- a/src/legis/store/audit_store.py +++ b/src/legis/store/audit_store.py @@ -122,13 +122,16 @@ def transaction(self) -> Iterator[None]: connection thread-locally; nested ``transaction()`` calls reuse the outer one. - Appends only. ``read_all`` / ``read_by_seq`` / ``verify_integrity`` open - their own connection via ``self._engine.begin()`` — they will NOT see - this batch's uncommitted appends, and on SQLite a read connection can - hit ``SQLITE_BUSY`` against the held ``BEGIN IMMEDIATE`` write lock. Do - all reads before entering the context (as ``wardline.governor`` does: it - resolves every entity before opening the batch). Only ``append``'s own - chain-head read is safe here, because it runs on the ambient connection. + Appends only, and now *enforced*: ``read_all`` / ``read_by_seq`` / + ``verify_integrity`` / ``get_latest_sequence_and_hash`` open their own + connection via ``self._engine.begin()``, so a read issued inside this + context would not see the batch's uncommitted appends and on SQLite would + hit ``SQLITE_BUSY`` against the held ``BEGIN IMMEDIATE`` write lock. Each + guards on the thread-local and raises ``RuntimeError`` rather than + contending silently (``_assert_no_batch_in_progress``). Do all reads + before entering the context (as ``wardline.governor`` does: it resolves + every entity before opening the batch). Only ``append``'s own chain-head + read is safe here, because it runs on the ambient connection. 
""" if getattr(self._txn, "conn", None) is not None: # Already inside a batch on this thread — reuse it (nested no-op). @@ -143,6 +146,26 @@ def transaction(self) -> Iterator[None]: finally: self._txn.conn = None + def _assert_no_batch_in_progress(self, method: str) -> None: + """Fail loudly if a fresh-connection read runs inside a held batch (Q-M5). + + ``transaction()`` holds a ``BEGIN IMMEDIATE`` write lock on the ambient + thread-local connection. Every public read opens its OWN connection, so + a read issued while the batch is held would (a) contend with that lock + (``SQLITE_BUSY`` on SQLite, and possibly no error on other backends) and + (b) miss the batch's uncommitted appends. The original contract relied on + callers never doing this; this guard *enforces* it, turning a silent, + backend-dependent contention into an explicit, deterministic error so a + future in-batch read in a gate append path fails its tests immediately. + """ + if getattr(self._txn, "conn", None) is not None: + raise RuntimeError( + f"AuditStore.{method}() called inside an active transaction() batch " + "on this thread. Fresh-connection reads contend with the batch's " + "held BEGIN IMMEDIATE write lock and cannot see its uncommitted " + "appends — resolve all reads before opening the batch (Q-M5)." 
+ ) + def _insert(self, conn: Any, payload: dict[str, Any]) -> int: c_hash = content_hash(payload) prev = conn.execute( @@ -176,6 +199,7 @@ def append(self, payload: dict[str, Any]) -> int: return self._insert(conn, payload) def read_all(self) -> list[AuditRecord]: + self._assert_no_batch_in_progress("read_all") with self._engine.begin() as conn: rows = conn.execute( select(self._log).order_by(self._log.c.seq.asc()) @@ -192,6 +216,7 @@ def read_all(self) -> list[AuditRecord]: ] def read_by_seq(self, seq: int) -> AuditRecord | None: + self._assert_no_batch_in_progress("read_by_seq") with self._engine.begin() as conn: row = conn.execute( select(self._log).where(self._log.c.seq == seq) @@ -207,6 +232,7 @@ def read_by_seq(self, seq: int) -> AuditRecord | None: ) def verify_integrity(self) -> bool: + self._assert_no_batch_in_progress("verify_integrity") prev_hash = GENESIS try: records = self.read_all() @@ -231,6 +257,7 @@ def verify_integrity(self) -> bool: return True def get_latest_sequence_and_hash(self) -> tuple[int, str]: + self._assert_no_batch_in_progress("get_latest_sequence_and_hash") with self._engine.begin() as conn: row = conn.execute( select(self._log.c.seq, self._log.c.chain_hash) diff --git a/src/legis/store/protocol.py b/src/legis/store/protocol.py index dc0a3e8..db10c6f 100644 --- a/src/legis/store/protocol.py +++ b/src/legis/store/protocol.py @@ -37,6 +37,8 @@ def transaction(self) -> AbstractContextManager[None]: ``read_by_seq``, ``verify_integrity``) is NOT guaranteed to observe uncommitted appends from the same batch — it sees a pre-batch snapshot — and on a single-connection backend (SQLite) may contend with the - held write transaction. Resolve all reads before opening the batch. + held write transaction. Resolve all reads before opening the batch. The + SQLite implementation (``AuditStore``) *enforces* this: an in-batch read + on the same thread raises ``RuntimeError`` instead of contending. """ ... 
diff --git a/tests/store/test_batch_read_free_invariant.py b/tests/store/test_batch_read_free_invariant.py new file mode 100644 index 0000000..5d84eef --- /dev/null +++ b/tests/store/test_batch_read_free_invariant.py @@ -0,0 +1,134 @@ +"""The transaction() read-free invariant is enforced and gate-path-proven (Q-M5). + +`AuditStore.transaction()` groups appends into one all-or-nothing batch behind a +held `BEGIN IMMEDIATE` write lock. Its contract is appends-only: a fresh- +connection read inside the batch would miss the uncommitted appends and contend +with the lock (`SQLITE_BUSY`). These tests pin that the store now *enforces* the +invariant (turning silent contention into a loud error), that the real gate +append paths driven through `route_findings` honour it, and that the batch is +genuinely all-or-nothing on a real on-disk SQLite file. +""" + +from __future__ import annotations + +import pytest + +from legis.clock import FixedClock +from legis.enforcement.engine import EnforcementEngine +from legis.enforcement.signoff import SignoffGate +from legis.identity.entity_key import EntityKey +from legis.store.audit_store import AuditStore +from legis.wardline.governor import WardlineCellPolicy, route_findings +from legis.wardline.ingest import active_defects + +_CLOCK = "2026-06-02T12:00:00+00:00" + + +def _on_disk_store(tmp_path, name="g.db") -> AuditStore: + # A real file, NOT sqlite:///:memory: and NOT shared-cache — so the held + # BEGIN IMMEDIATE genuinely locks a second connection out (the condition the + # invariant protects against). 
+ return AuditStore(f"sqlite:///{tmp_path / name}") + + +def _scan(n: int) -> dict: + return { + "findings": [ + { + "rule_id": f"PY-WL-{100 + i}", + "message": f"untrusted reaches trusted #{i}", + "severity": "ERROR", + "kind": "defect", + "fingerprint": f"fp{i}", + "qualname": f"m.f{i}", + "properties": {"actual_return": "UNKNOWN_RAW"}, + "suppressed": "active", + } + for i in range(n) + ] + } + + +# --- the guard itself: a read inside a held batch raises, not contends --- + +@pytest.mark.parametrize( + "call", + [ + lambda s: s.read_all(), + lambda s: s.read_by_seq(1), + lambda s: s.verify_integrity(), + lambda s: s.get_latest_sequence_and_hash(), + ], +) +def test_read_inside_batch_raises_runtime_error(tmp_path, call): + store = _on_disk_store(tmp_path) + store.append({"event": "before"}) + with pytest.raises(RuntimeError, match="active transaction"): + with store.transaction(): + store.append({"event": "in-batch"}) + call(store) + + +def test_reads_work_again_after_batch_exits(tmp_path): + store = _on_disk_store(tmp_path) + with store.transaction(): + store.append({"event": "a"}) + store.append({"event": "b"}) + # Once the batch commits and the thread-local clears, reads are fine again. + assert len(store.read_all()) == 2 + assert store.verify_integrity() is True + + +# --- the real gate append paths, driven through route_findings' batch --- + +def test_surface_override_batch_is_read_free_on_disk(tmp_path): + # EnforcementEngine.submit_override is the append path here. If it (or + # anything it calls) issued a fresh-connection read inside the batch, the + # guard would raise; a clean completion proves the path is read-free. 
+ engine = EnforcementEngine(_on_disk_store(tmp_path), FixedClock(_CLOCK)) + results = route_findings( + active_defects(_scan(3)), + policy=WardlineCellPolicy.SURFACE_OVERRIDE, + agent_id="agent-1", + resolve=lambda q: (EntityKey.from_locator(q or "unknown"), {}), + engine=engine, + ) + assert len(results) == 3 + # All three landed atomically and the chain is intact (reads outside batch). + assert len(engine.records()) == 3 + assert engine._store.verify_integrity() is True + + +def test_block_escalate_batch_is_read_free_on_disk(tmp_path): + # SignoffGate.request is the append path here. + gate = SignoffGate(_on_disk_store(tmp_path), FixedClock(_CLOCK)) + results = route_findings( + active_defects(_scan(3)), + policy=WardlineCellPolicy.BLOCK_ESCALATE, + agent_id="agent-1", + resolve=lambda q: (EntityKey.from_locator(q or "unknown"), {}), + signoff=gate, + ) + assert len(results) == 3 + assert len(gate.records()) == 3 + assert gate._store.verify_integrity() is True + + +# --- all-or-nothing on a real file: a mid-batch failure rolls everything back --- + +def test_batch_rolls_back_atomically_on_disk(tmp_path): + store = _on_disk_store(tmp_path) + store.append({"event": "committed-before-batch"}) + + with pytest.raises(RuntimeError, match="boom"): + with store.transaction(): + store.append({"event": "batch-1"}) + store.append({"event": "batch-2"}) + raise RuntimeError("boom") # mid-loop failure + + # The two in-batch appends rolled back; only the pre-batch record survives, + # and the hash chain is unbroken — proving real on-disk atomicity, not a + # half-written batch. 
+ records = store.read_all() + assert [r.payload["event"] for r in records] == ["committed-before-batch"] + assert store.verify_integrity() is True From 0b7d41c487aeef3e379e685e08531cd50f2a0927 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 13:07:08 +1000 Subject: [PATCH 08/72] fix(policy): reconcile gate/scanner fingerprint extraction; defer RFC-8785 (Q-L5/Q-L4) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Q-L5 (reconcile fingerprints) — DONE: The runtime honesty gate (decorator.fingerprint via inspect.getsource) and the static scanner (boundary_scan via ast.get_source_segment of the FunctionDef) could compute divergent fingerprints for a decorated or class-method test_ref: inspect.getsource INCLUDES decorator lines, get_source_segment EXCLUDES them, and only the runtime path normalized CRLF or had a parse-failure fallback. - get_normalized_ast_str now strips decorator_list (FunctionDef/AsyncFunctionDef/ ClassDef), making the AST fingerprint decorator-insensitive. - New fingerprint_source(): the single canonicalization (CRLF->LF, dedent, normalized-AST hash, fallback) both paths now route through — they can no longer diverge. - Verified empirically: a decorated test fingerprinted e2b8.. (runtime) vs d833.. (static) before; both d833.. after. Undecorated module-level tests are unchanged, so the 7 real boundaries' stored fingerprints still verify (policy-boundary-check PASS). Q-L4 (RFC-8785) — ASSESSED & DEFERRED with evidence: RFC-8785 is gated on "when cross-language verification is needed" (ADR-0001/0002, arch-handover item 15). No current consumer verifies a legis hash from a non- Python runtime; content_hash always derives utf-8 bytes, so v1 is deterministic for single-language use today. The fingerprints are Python ast.dump output (not cross-language JSON) so RFC-8785 does not apply to them. 
Recorded the assessment in canonical.py; the single-choke-point design keeps it a one-file upgrade when a cross-language verifier lands. Building it now would be speculative. 543 passed; ruff/mypy/coverage-floors/boundary-gate green. Closes legis-b4445c2f42 Co-Authored-By: Claude Opus 4.8 --- src/legis/canonical.py | 13 +++- src/legis/policy/boundary_scan.py | 11 ++-- src/legis/policy/decorator.py | 43 +++++++++---- tests/policy/test_boundary_scan.py | 6 +- tests/policy/test_decorator.py | 98 +++++++++++++++++++++++++++++- 5 files changed, 149 insertions(+), 22 deletions(-) diff --git a/src/legis/canonical.py b/src/legis/canonical.py index eb5df71..636a566 100644 --- a/src/legis/canonical.py +++ b/src/legis/canonical.py @@ -2,7 +2,18 @@ v1 uses sorted-key, tight-separator JSON for deterministic hashing. RFC 8785 is a future hardening (elspeth uses RFC 8785); legis should converge there before -the protected cell ships cryptographic guarantees (see ADR-0001). +the protected cell ships cryptographic guarantees (see ADR-0001 / ADR-0002). + +Q-L4 deferral (assessed 2026-06-06): RFC-8785 is gated on "when cross-language +verification is needed." No current consumer verifies a legis hash from a +non-Python runtime — every hash is produced and checked in-process, and +``content_hash`` always derives bytes via ``.encode("utf-8")``, so the +``ensure_ascii=False`` byte output is deterministic for legis's single-language +use today. Because this is the single canonicalization choke point, the RFC-8785 +upgrade stays a one-file change for the day a cross-language verifier lands. The +companion Q-L5 fingerprint reconciliation (decorator.py / boundary_scan.py) is +independent and is done — those fingerprints are Python ``ast.dump`` output, not +cross-language JSON, so RFC-8785 does not apply to them. 
""" from __future__ import annotations diff --git a/src/legis/policy/boundary_scan.py b/src/legis/policy/boundary_scan.py index fa44cee..38cd505 100644 --- a/src/legis/policy/boundary_scan.py +++ b/src/legis/policy/boundary_scan.py @@ -3,13 +3,11 @@ from __future__ import annotations import ast -import textwrap from dataclasses import asdict, dataclass from pathlib import Path from typing import Any, cast -from legis.canonical import content_hash -from legis.policy.decorator import get_normalized_ast_str +from legis.policy.decorator import fingerprint_source from legis.policy.evidence import evaluate_test_evidence @@ -154,9 +152,10 @@ def _visit_function(self, node: ast.FunctionDef | ast.AsyncFunctionDef) -> None: test_source, test_node = test_result test_segment = ast.get_source_segment(test_source, test_node) or "" - actual_fingerprint = content_hash( - get_normalized_ast_str(textwrap.dedent(test_segment)) - ) + # Same canonicalization the runtime honesty gate uses — CRLF/dedent + # normalization and a decorator-insensitive AST hash — so the two + # paths cannot diverge for a decorated / class-method test_ref (Q-L5). + actual_fingerprint = fingerprint_source(test_segment) if actual_fingerprint != test_fingerprint: self._add( "POLICY_BOUNDARY_TEST_FINGERPRINT_MISMATCH", diff --git a/src/legis/policy/decorator.py b/src/legis/policy/decorator.py index aa32c14..fdf19d8 100644 --- a/src/legis/policy/decorator.py +++ b/src/legis/policy/decorator.py @@ -104,16 +104,44 @@ def wrapper(*args: Any, **kwargs: Any) -> Any: def get_normalized_ast_str(source: str) -> str: import ast parsed = ast.parse(source) - # Strip docstrings for node in ast.walk(parsed): + # Strip docstrings. 
if isinstance(node, (ast.FunctionDef, ast.ClassDef, ast.Module)): if node.body and isinstance(node.body[0], ast.Expr): val = node.body[0].value if isinstance(val, ast.Constant) and isinstance(val.value, str): node.body.pop(0) + # Strip decorators so the fingerprint does not depend on whether the + # extracted source carried the decorator lines. The runtime gate reads + # the test via inspect.getsource (decorators INCLUDED); the static + # scanner reads it via ast.get_source_segment of the FunctionDef + # (decorators EXCLUDED). Without this, a decorated or class-method + # test_ref fingerprints differently on each path (Q-L5). + if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)): + node.decorator_list = [] return ast.dump(parsed) +def fingerprint_source(source: str) -> str: + """The single canonicalization both fingerprint paths share (Q-L5). + + Normalizes platform line endings (CRLF->LF) and indentation, then hashes the + docstring- and decorator-stripped AST. Falls back to hashing the normalized + source text when it cannot be parsed (e.g. an extracted fragment). The + runtime honesty gate (``fingerprint``) and the static scanner + (``boundary_scan``) MUST both route through here so they can never compute + divergent fingerprints for the same referenced test. + """ + import textwrap + + source = source.replace("\r\n", "\n") + source = textwrap.dedent(source) + try: + return content_hash(get_normalized_ast_str(source)) + except Exception: + return content_hash(source) + + def fingerprint(test_fn: Callable[..., Any]) -> str: """Content hash of a test function's source — the gate's anti-vibe teeth. 
@@ -126,16 +154,9 @@ def fingerprint(test_fn: Callable[..., Any]) -> str: except (OSError, TypeError) as exc: raise OSError(f"Source code not available for test: {exc}") from exc - # Normalize CRLF to LF to handle platform line ending differences - source = source.replace("\r\n", "\n") - - try: - import textwrap - source = textwrap.dedent(source) - normalized = get_normalized_ast_str(source) - return content_hash(normalized) - except Exception: - return content_hash(source) + # Route through the shared canonicalization the static scanner also uses, so + # the two paths cannot diverge (Q-L5). + return fingerprint_source(source) @dataclass(frozen=True) diff --git a/tests/policy/test_boundary_scan.py b/tests/policy/test_boundary_scan.py index aad91da..c2f58b9 100644 --- a/tests/policy/test_boundary_scan.py +++ b/tests/policy/test_boundary_scan.py @@ -1,12 +1,12 @@ from pathlib import Path -from legis.canonical import content_hash from legis.policy.boundary_scan import scan_policy_boundaries -from legis.policy.decorator import get_normalized_ast_str +from legis.policy.decorator import fingerprint_source def _test_fingerprint(source: str) -> str: - return content_hash(get_normalized_ast_str(source)) + # The canonical fingerprint both the gate and scanner compute (Q-L5). 
+ return fingerprint_source(source) def _write_boundary_subject( diff --git a/tests/policy/test_decorator.py b/tests/policy/test_decorator.py index c95a087..a99eec1 100644 --- a/tests/policy/test_decorator.py +++ b/tests/policy/test_decorator.py @@ -1,6 +1,102 @@ +import ast +import importlib.util + import pytest -from legis.policy.decorator import PolicyBoundaryMetadata, policy_boundary +from legis.policy.decorator import ( + PolicyBoundaryMetadata, + fingerprint, + fingerprint_source, + policy_boundary, +) + + +# --- Q-L5: the runtime gate and the static scanner must agree --- + +def _static_fingerprint(module_source: str, name: str) -> str: + """Reproduce the static scanner's extraction: the FunctionDef segment + (decorators excluded) run through the shared canonicalization.""" + tree = ast.parse(module_source) + node = next( + n + for n in ast.walk(tree) + if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef)) and n.name == name + ) + segment = ast.get_source_segment(module_source, node) or "" + return fingerprint_source(segment) + + +def _runtime_fingerprint(tmp_path, module_source: str, name: str) -> str: + """Reproduce the runtime gate's extraction: inspect.getsource of the live + function (decorators included).""" + path = tmp_path / "refmod.py" + path.write_text(module_source, encoding="utf-8") + spec = importlib.util.spec_from_file_location("refmod_ql5", path) + mod = importlib.util.module_from_spec(spec) + spec.loader.exec_module(mod) + return fingerprint(getattr(mod, name)) + + +_DECORATED_TEST_MODULE = ( + "import functools\n" + "\n" + "def deco(f):\n" + " @functools.wraps(f)\n" + " def w(*a, **k):\n" + " return f(*a, **k)\n" + " return w\n" + "\n" + "@deco\n" + "def referenced_test():\n" + ' """exercises the boundary"""\n' + " assert True\n" +) + + +def test_runtime_and_static_fingerprints_agree_for_decorated_test(tmp_path): + # The crux of Q-L5: inspect.getsource includes the @deco line, while + # ast.get_source_segment of the FunctionDef does not 
— decorator-insensitive + # normalization makes the two paths converge. + runtime = _runtime_fingerprint(tmp_path, _DECORATED_TEST_MODULE, "referenced_test") + static = _static_fingerprint(_DECORATED_TEST_MODULE, "referenced_test") + assert runtime == static + + +def test_runtime_and_static_fingerprints_agree_for_class_method(tmp_path): + # Class methods are indented and may be decorated; dedent + decorator strip + # must still make the two extraction paths agree. + module = ( + "import functools\n" + "\n" + "def deco(f):\n" + " return f\n" + "\n" + "class TestThing:\n" + " @deco\n" + " def referenced_test(self):\n" + " assert 1 + 1 == 2\n" + ) + path = tmp_path / "refmod.py" + path.write_text(module, encoding="utf-8") + spec = importlib.util.spec_from_file_location("refmod_ql5_cls", path) + mod = importlib.util.module_from_spec(spec) + spec.loader.exec_module(mod) + runtime = fingerprint(mod.TestThing.referenced_test) + static = _static_fingerprint(module, "referenced_test") + assert runtime == static + + +def test_fingerprint_source_is_crlf_invariant(): + lf = "def t():\n assert True\n" + crlf = lf.replace("\n", "\r\n") + assert fingerprint_source(lf) == fingerprint_source(crlf) + + +def test_fingerprint_source_unparsable_fragment_falls_back(): + # A non-parseable fragment hashes the normalized text rather than raising — + # both paths share this fallback, so they still agree. + frag = " assert broken(:\n" + assert isinstance(fingerprint_source(frag), str) def test_decorator_is_passthrough_and_attaches_metadata(): From 01dcc56dd6641a053093eca8311f8b5067137787 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 13:13:00 +1000 Subject: [PATCH 09/72] refactor(mcp): table-driven call_tool dispatch + stdin line bound (Q-L8 / roadmap 14) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit mcp.py was the single-file complexity hotspot. Three parts from the ticket: 1. 
Table-driven dispatch: call_tool was a ~360-line if/elif over the tool name. Each branch is now a `_tool_(runtime, args)` handler behind a `_TOOL_HANDLERS` dict; call_tool is a 9-line shell that keeps the same argument-key validation, UNKNOWN_TOOL fallback, and _service_error wrapper. Extracted mechanically (AST-driven) to avoid transcription drift; all 39 mcp tests and the full suite pass unchanged — behavior-preserving. 2. Stdin line-size bound: the hand-rolled one-object-per-line framing read lines unbounded, so a peer sending a line with no newline forced an unbounded read. run_jsonrpc now reads via _read_bounded_line (default 16 MiB, override with LEGIS_MCP_MAX_REQUEST_BYTES); an over-long record is rejected with -32700 and the framing realigns at the next newline rather than mis-parsing. 3. Shared config module (mcp->api edge): already landed — DEFAULT_*_DB live in legis.config and mcp imports from there (no legis.api import remains). Verified, no further change needed. mcp.py coverage 82.2% -> 83.4%; 545 passed; ruff/mypy/coverage-floors green. Closes legis-d70b003e93 Co-Authored-By: Claude Opus 4.8 --- src/legis/mcp.py | 770 ++++++++++++++++++++++----------------- tests/mcp/test_server.py | 42 +++ 2 files changed, 475 insertions(+), 337 deletions(-) diff --git a/src/legis/mcp.py b/src/legis/mcp.py index 736218f..8171305 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -8,6 +8,7 @@ from __future__ import annotations +from collections.abc import Callable from dataclasses import asdict, dataclass import json import os @@ -78,6 +79,25 @@ _SUPPORTED_PROTOCOL_VERSIONS = ("2024-11-05", "2025-03-26") _DEFAULT_PROTOCOL_VERSION = _SUPPORTED_PROTOCOL_VERSIONS[-1] +# Upper bound on a single JSON-RPC line read from stdin. The hand-rolled framing +# is one object per line; without a bound a peer (or a corrupted pipe) sending a +# line with no newline forces an unbounded read into memory. 
16 MiB comfortably +# fits a maximal scan_route request (MAX_FINDINGS=500 with properties) while +# refusing a pathological one. Override with LEGIS_MCP_MAX_REQUEST_BYTES. +_DEFAULT_MAX_REQUEST_BYTES = 16 * 1024 * 1024 + + +def _max_request_bytes() -> int: + raw = os.environ.get("LEGIS_MCP_MAX_REQUEST_BYTES") + if raw: + try: + value = int(raw) + except ValueError: + return _DEFAULT_MAX_REQUEST_BYTES + if value > 0: + return value + return _DEFAULT_MAX_REQUEST_BYTES + @dataclass class McpRuntime: @@ -688,365 +708,398 @@ def _verified_records(runtime: McpRuntime) -> list[Any]: return runtime.engine.records() -def call_tool(runtime: McpRuntime, name: str, args: dict[str, Any]) -> dict[str, Any]: - try: - _validate_argument_keys(name, args) - if name == "policy_explain": - explanation = explain_policy( - _registry(runtime), - policy=_require(args, "policy"), - entity=_require(args, "entity"), - engine=runtime.engine, - protected_gate=runtime.protected_gate, - signoff_gate=runtime.signoff_gate, - ) - return _tool_result(_explanation_payload(explanation)) - - if name == "override_submit": - policy = _require(args, "policy") - entity = _require(args, "entity") - rationale = _require(args, "rationale") - idempotency_key = _optional_string(args, "idempotency_key") - simple_engine = ( - _engine(runtime) - if _registry(runtime).cell_for(policy) in ("chill", "coached") - else runtime.engine - ) - explanation = explain_policy( - _registry(runtime), - policy=policy, - entity=entity, - engine=simple_engine, - protected_gate=runtime.protected_gate, - signoff_gate=runtime.signoff_gate, - ) - if not explanation.enabled: - raise NotEnabledError( - f"cell {explanation.cell!r} is not enabled for override submission" - ) - idempotency_request_hash = ( - _override_idempotency_request_hash( - agent_id=runtime.agent_id, - policy=policy, - entity=entity, - rationale=rationale, - cell=explanation.cell, - file_fingerprint=_optional_string(args, "file_fingerprint"), - 
ast_path=_optional_string(args, "ast_path"), - ) - if idempotency_key is not None - else None +def _tool_policy_explain(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + explanation = explain_policy( + _registry(runtime), + policy=_require(args, "policy"), + entity=_require(args, "entity"), + engine=runtime.engine, + protected_gate=runtime.protected_gate, + signoff_gate=runtime.signoff_gate, + ) + return _tool_result(_explanation_payload(explanation)) + + +def _tool_override_submit(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + policy = _require(args, "policy") + entity = _require(args, "entity") + rationale = _require(args, "rationale") + idempotency_key = _optional_string(args, "idempotency_key") + simple_engine = ( + _engine(runtime) + if _registry(runtime).cell_for(policy) in ("chill", "coached") + else runtime.engine + ) + explanation = explain_policy( + _registry(runtime), + policy=policy, + entity=entity, + engine=simple_engine, + protected_gate=runtime.protected_gate, + signoff_gate=runtime.signoff_gate, + ) + if not explanation.enabled: + raise NotEnabledError( + f"cell {explanation.cell!r} is not enabled for override submission" + ) + idempotency_request_hash = ( + _override_idempotency_request_hash( + agent_id=runtime.agent_id, + policy=policy, + entity=entity, + rationale=rationale, + cell=explanation.cell, + file_fingerprint=_optional_string(args, "file_fingerprint"), + ast_path=_optional_string(args, "ast_path"), + ) + if idempotency_key is not None + else None + ) + extra_extensions = ( + { + "mcp_idempotency_key": idempotency_key, + "mcp_idempotency_request_hash": idempotency_request_hash, + "mcp_cell": explanation.cell, + } + if idempotency_key is not None + else {"mcp_cell": explanation.cell} + ) + if idempotency_key is not None and idempotency_request_hash is not None: + existing = _existing_idempotent_record( + runtime, idempotency_key, idempotency_request_hash + ) + if existing is not None: + return _tool_result( 
+ _idempotent_override_response(existing.payload, existing.seq) ) - extra_extensions = ( + if explanation.cell in ("chill", "coached"): + override_result = submit_override( + _engine(runtime), + identity=runtime.identity, + policy=policy, + entity=entity, + rationale=rationale, + agent_id=runtime.agent_id, + extra_extensions=extra_extensions, + ) + if explanation.cell == "chill": + return _tool_result( { - "mcp_idempotency_key": idempotency_key, - "mcp_idempotency_request_hash": idempotency_request_hash, - "mcp_cell": explanation.cell, + "outcome": "ACCEPTED_SELF", + "cell": "chill", + "seq": override_result.seq, + "note": "self-cleared; human reviews asynchronously", } - if idempotency_key is not None - else {"mcp_cell": explanation.cell} ) - if idempotency_key is not None and idempotency_request_hash is not None: - existing = _existing_idempotent_record( - runtime, idempotency_key, idempotency_request_hash - ) - if existing is not None: - return _tool_result( - _idempotent_override_response(existing.payload, existing.seq) - ) - if explanation.cell in ("chill", "coached"): - override_result = submit_override( - _engine(runtime), - identity=runtime.identity, - policy=policy, - entity=entity, - rationale=rationale, - agent_id=runtime.agent_id, - extra_extensions=extra_extensions, - ) - if explanation.cell == "chill": - return _tool_result( - { - "outcome": "ACCEPTED_SELF", - "cell": "chill", - "seq": override_result.seq, - "note": "self-cleared; human reviews asynchronously", - } - ) - return _tool_result( - _judged_result_payload( - cell="coached", - seq=override_result.seq, - accepted=override_result.accepted, - judge_model=override_result.judge_model, - judge_rationale=override_result.judge_rationale, - ) - ) - if explanation.cell == "structured": - signoff = request_signoff( - runtime.signoff_gate, - identity=runtime.identity, - policy=policy, - entity=entity, - rationale=rationale, - agent_id=runtime.agent_id, - extra_extensions=extra_extensions, - ) - return 
_tool_result( - { - "outcome": "ESCALATED_PENDING", - "cell": "structured", - "seq": signoff.seq, - "cleared": signoff.cleared, - "human_required": True, - "operator_instruction": ( - f"Human sign-off required for seq {signoff.seq}." - ), - "poll_tool": "signoff_status_get", - "poll_handle": signoff.seq, - } - ) - if explanation.cell == "protected": - missing = [ - item.to_payload() - for item in explanation.required_inputs - if not _optional_string(args, item.field) - ] - if missing: - return _tool_result( - { - "outcome": "NEED_INPUTS", - "cell": "protected", - "required_inputs": missing, - } - ) - protected = submit_protected_override( - runtime.protected_gate, - identity=runtime.identity, - policy=policy, - entity=entity, - rationale=rationale, - agent_id=runtime.agent_id, - file_fingerprint=_require(args, "file_fingerprint"), - ast_path=_require(args, "ast_path"), - source_root=runtime.source_root, - extra_extensions=extra_extensions, - ) - return _tool_result( - _judged_result_payload( - cell="protected", - seq=protected.seq, - accepted=protected.accepted, - judge_model=protected.judge_model, - judge_rationale=protected.judge_rationale, - ) - ) - raise NotEnabledError(f"unsupported policy cell {explanation.cell!r}") - - if name == "signoff_status_get": - seq = _require_int(args, "seq") - if runtime.signoff_gate is None: - raise NotEnabledError("structured cell not enabled") - request = runtime.signoff_gate.request_record(seq) - if request is None: - return _tool_error("NO_SUCH_REQUEST", f"no sign-off request at seq {seq}") - if not runtime.signoff_gate.is_cleared(seq): - return _tool_result({"cleared": False, "seq": seq}) - signed = _signoff_signed_record(runtime, seq) - payload: dict[str, Any] = {"cleared": True, "seq": seq} - if signed is not None: - payload["signed_by"] = signed.get("agent_id") - payload["signed_at"] = signed.get("recorded_at") - return _tool_result(payload) - - if name == "policy_evaluate": - ev = evaluate_policy( - _grammar(runtime), - 
engine=_engine(runtime), - policy=_require(args, "policy"), - target=_require_object(args, "target"), + return _tool_result( + _judged_result_payload( + cell="coached", + seq=override_result.seq, + accepted=override_result.accepted, + judge_model=override_result.judge_model, + judge_rationale=override_result.judge_rationale, ) + ) + if explanation.cell == "structured": + signoff = request_signoff( + runtime.signoff_gate, + identity=runtime.identity, + policy=policy, + entity=entity, + rationale=rationale, + agent_id=runtime.agent_id, + extra_extensions=extra_extensions, + ) + return _tool_result( + { + "outcome": "ESCALATED_PENDING", + "cell": "structured", + "seq": signoff.seq, + "cleared": signoff.cleared, + "human_required": True, + "operator_instruction": ( + f"Human sign-off required for seq {signoff.seq}." + ), + "poll_tool": "signoff_status_get", + "poll_handle": signoff.seq, + } + ) + if explanation.cell == "protected": + missing = [ + item.to_payload() + for item in explanation.required_inputs + if not _optional_string(args, item.field) + ] + if missing: return _tool_result( { - "outcome": ev.result.value, - "detail": ev.detail, - "provenance_gap": ev.provenance_gap, + "outcome": "NEED_INPUTS", + "cell": "protected", + "required_inputs": missing, } ) + protected = submit_protected_override( + runtime.protected_gate, + identity=runtime.identity, + policy=policy, + entity=entity, + rationale=rationale, + agent_id=runtime.agent_id, + file_fingerprint=_require(args, "file_fingerprint"), + ast_path=_require(args, "ast_path"), + source_root=runtime.source_root, + extra_extensions=extra_extensions, + ) + return _tool_result( + _judged_result_payload( + cell="protected", + seq=protected.seq, + accepted=protected.accepted, + judge_model=protected.judge_model, + judge_rationale=protected.judge_rationale, + ) + ) + raise NotEnabledError(f"unsupported policy cell {explanation.cell!r}") + + +def _tool_signoff_status_get(runtime: McpRuntime, args: dict[str, Any]) -> 
dict[str, Any]: + seq = _require_int(args, "seq") + if runtime.signoff_gate is None: + raise NotEnabledError("structured cell not enabled") + request = runtime.signoff_gate.request_record(seq) + if request is None: + return _tool_error("NO_SUCH_REQUEST", f"no sign-off request at seq {seq}") + if not runtime.signoff_gate.is_cleared(seq): + return _tool_result({"cleared": False, "seq": seq}) + signed = _signoff_signed_record(runtime, seq) + payload: dict[str, Any] = {"cleared": True, "seq": seq} + if signed is not None: + payload["signed_by"] = signed.get("agent_id") + payload["signed_at"] = signed.get("recorded_at") + return _tool_result(payload) + + +def _tool_policy_evaluate(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + ev = evaluate_policy( + _grammar(runtime), + engine=_engine(runtime), + policy=_require(args, "policy"), + target=_require_object(args, "target"), + ) + return _tool_result( + { + "outcome": ev.result.value, + "detail": ev.detail, + "provenance_gap": ev.provenance_gap, + } + ) - if name == "scan_route": - server_cell = os.environ.get("LEGIS_WARDLINE_CELL") - server_cell_by_severity = os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY") - if server_cell and server_cell_by_severity: - return _tool_error( - "INVALID_CELL_SPEC", "server Wardline routing is misconfigured" - ) - has_cell = "cell" in args - has_map = "severity_map" in args - has_fail_on = "fail_on" in args - server_routing = server_cell is not None or server_cell_by_severity is not None - if server_routing and (has_cell or has_map or has_fail_on): + +def _tool_scan_route(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + server_cell = os.environ.get("LEGIS_WARDLINE_CELL") + server_cell_by_severity = os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY") + if server_cell and server_cell_by_severity: + return _tool_error( + "INVALID_CELL_SPEC", "server Wardline routing is misconfigured" + ) + has_cell = "cell" in args + has_map = "severity_map" in args + has_fail_on = 
"fail_on" in args + server_routing = server_cell is not None or server_cell_by_severity is not None + if server_routing and (has_cell or has_map or has_fail_on): + return _tool_error( + "INVALID_CELL_SPEC", "Wardline routing is server-owned" + ) + if not server_routing: + if os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") != "1": + return _tool_error( + "INVALID_CELL_SPEC", + "Wardline routing is server-owned; configure " + "LEGIS_WARDLINE_CELL or LEGIS_WARDLINE_CELL_BY_SEVERITY", + ) + if has_fail_on: + if not has_cell or has_map: return _tool_error( - "INVALID_CELL_SPEC", "Wardline routing is server-owned" + "INVALID_CELL_SPEC", + "fail_on routing requires cell and forbids severity_map", ) - if not server_routing: - if os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") != "1": - return _tool_error( - "INVALID_CELL_SPEC", - "Wardline routing is server-owned; configure " - "LEGIS_WARDLINE_CELL or LEGIS_WARDLINE_CELL_BY_SEVERITY", - ) - if has_fail_on: - if not has_cell or has_map: - return _tool_error( - "INVALID_CELL_SPEC", - "fail_on routing requires cell and forbids severity_map", - ) - elif has_cell == has_map: - return _tool_error( - "INVALID_CELL_SPEC", - "provide exactly one of cell or severity_map", - ) - scan = _require_object(args, "scan") - scan_policy: WardlineCellPolicy | None = None - scan_cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None - scan_fail_on: WardlineSeverity | None = None - try: - if server_cell_by_severity is not None: - scan_cell_map = _parse_wardline_cell_map(server_cell_by_severity) - elif server_cell is not None: - scan_policy = WardlineCellPolicy(server_cell) - elif has_cell: - scan_policy = WardlineCellPolicy(_require(args, "cell")) - if has_fail_on: - scan_fail_on = WardlineSeverity[_require(args, "fail_on")] - else: - raw_map = _require_object(args, "severity_map") - scan_cell_map = { - WardlineSeverity[severity]: WardlineCellPolicy(cell) - for severity, cell in raw_map.items() - } - except 
(KeyError, ValueError) as exc: - return _tool_error("INVALID_CELL_SPEC", str(exc)) - routed = route_wardline_scan( - scan, - agent_id=runtime.agent_id, - identity=runtime.identity, - engine=_engine(runtime), - signoff=runtime.signoff_gate, - policy=scan_policy, - cell_map=scan_cell_map, - fail_on=scan_fail_on, - artifact_key=( - runtime.wardline_artifact_key - or ( - os.environ["LEGIS_WARDLINE_ARTIFACT_KEY"].encode("utf-8") - if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") - else None - ) - ), + elif has_cell == has_map: + return _tool_error( + "INVALID_CELL_SPEC", + "provide exactly one of cell or severity_map", ) - return _tool_result({"outcome": "ROUTED", "routed": routed}) - - if name == "git_branch_list": - return _tool_result( - {"branches": [asdict(branch) for branch in _git(runtime).branches()]} + scan = _require_object(args, "scan") + scan_policy: WardlineCellPolicy | None = None + scan_cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None + scan_fail_on: WardlineSeverity | None = None + try: + if server_cell_by_severity is not None: + scan_cell_map = _parse_wardline_cell_map(server_cell_by_severity) + elif server_cell is not None: + scan_policy = WardlineCellPolicy(server_cell) + elif has_cell: + scan_policy = WardlineCellPolicy(_require(args, "cell")) + if has_fail_on: + scan_fail_on = WardlineSeverity[_require(args, "fail_on")] + else: + raw_map = _require_object(args, "severity_map") + scan_cell_map = { + WardlineSeverity[severity]: WardlineCellPolicy(cell) + for severity, cell in raw_map.items() + } + except (KeyError, ValueError) as exc: + return _tool_error("INVALID_CELL_SPEC", str(exc)) + routed = route_wardline_scan( + scan, + agent_id=runtime.agent_id, + identity=runtime.identity, + engine=_engine(runtime), + signoff=runtime.signoff_gate, + policy=scan_policy, + cell_map=scan_cell_map, + fail_on=scan_fail_on, + artifact_key=( + runtime.wardline_artifact_key + or ( + os.environ["LEGIS_WARDLINE_ARTIFACT_KEY"].encode("utf-8") + if 
os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") + else None ) + ), + ) + return _tool_result({"outcome": "ROUTED", "routed": routed}) - if name == "git_commit_get": - return _tool_result( - {"commit": asdict(_git(runtime).commit(_require(args, "sha")))} - ) - if name == "git_rename_list": - return _tool_result( - { - "renames": [ - asdict(rename) - for rename in _git(runtime).renames(_require(args, "rev_range")) - ] - } - ) +def _tool_git_branch_list(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + return _tool_result( + {"branches": [asdict(branch) for branch in _git(runtime).branches()]} + ) - if name == "git_rename_feed_get": - from legis.git.rename_feed import build_rename_feed - return _tool_result( - build_rename_feed( - runtime.source_root or os.getcwd(), - base=_require(args, "base"), - head=args.get("head", "HEAD"), - include_worktree=bool(args.get("include_worktree", False)), - ) - ) +def _tool_git_commit_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + return _tool_result( + {"commit": asdict(_git(runtime).commit(_require(args, "sha")))} + ) - if name == "filigree_closure_gate_get": - from legis.governance.filigree_gate import evaluate_issue_closure - if runtime.binding_ledger is None: - raise NotEnabledError("binding ledger not enabled") - return _tool_result( - evaluate_issue_closure(runtime.binding_ledger, issue_id=_require(args, "issue_id")) - ) +def _tool_git_rename_list(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + return _tool_result( + { + "renames": [ + asdict(rename) + for rename in _git(runtime).renames(_require(args, "rev_range")) + ] + } + ) - if name == "pull_request_get": - number = _require_int(args, "number") - pull = _pulls(runtime).get(number) - if pull is None: - return _tool_error("NOT_FOUND", f"unknown PR: {number}") - pull_payload = asdict(pull) - pull_payload["state"] = pull.state.value - pull_checks = ( - _checks(runtime).for_pr(number) - if runtime.check_surface is not None - 
else [] - ) - pull_payload["checks"] = [_check_to_dict(run) for run in pull_checks] - return _tool_result(pull_payload) - - if name == "check_list": - check_surface = _checks(runtime) - target_type = _require(args, "target_type") - target = _require(args, "target") - if target_type == "commit": - checks = check_surface.for_commit(target) - response_target: str | int = target - elif target_type == "branch": - checks = check_surface.for_branch(target) - response_target = target - elif target_type == "pr": - try: - pr_number = int(target) - except ValueError as exc: - raise InvalidArgumentError( - "target_type 'pr' requires an integer target" - ) from exc - checks = check_surface.for_pr(pr_number) - response_target = pr_number - else: - raise InvalidArgumentError( - "target_type must be one of: commit, branch, pr" - ) - return _tool_result( - { - "target_type": target_type, - "target": response_target, - "checks": [_check_to_dict(run) for run in checks], - } - ) - if name == "override_rate_get": - rate = compute_override_rate(_verified_records(runtime)) - return _tool_result( - { - "status": rate.status.value, - "rate": rate.rate, - "sample_size": rate.sample_size, - "note": _OVERRIDE_RATE_NOTE, - } - ) +def _tool_git_rename_feed_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + from legis.git.rename_feed import build_rename_feed + + return _tool_result( + build_rename_feed( + runtime.source_root or os.getcwd(), + base=_require(args, "base"), + head=args.get("head", "HEAD"), + include_worktree=bool(args.get("include_worktree", False)), + ) + ) + + +def _tool_filigree_closure_gate_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + from legis.governance.filigree_gate import evaluate_issue_closure - return _tool_error("UNKNOWN_TOOL", f"unknown tool: {name}") + if runtime.binding_ledger is None: + raise NotEnabledError("binding ledger not enabled") + return _tool_result( + evaluate_issue_closure(runtime.binding_ledger, 
issue_id=_require(args, "issue_id")) + ) + + +def _tool_pull_request_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + number = _require_int(args, "number") + pull = _pulls(runtime).get(number) + if pull is None: + return _tool_error("NOT_FOUND", f"unknown PR: {number}") + pull_payload = asdict(pull) + pull_payload["state"] = pull.state.value + pull_checks = ( + _checks(runtime).for_pr(number) + if runtime.check_surface is not None + else [] + ) + pull_payload["checks"] = [_check_to_dict(run) for run in pull_checks] + return _tool_result(pull_payload) + + +def _tool_check_list(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + check_surface = _checks(runtime) + target_type = _require(args, "target_type") + target = _require(args, "target") + if target_type == "commit": + checks = check_surface.for_commit(target) + response_target: str | int = target + elif target_type == "branch": + checks = check_surface.for_branch(target) + response_target = target + elif target_type == "pr": + try: + pr_number = int(target) + except ValueError as exc: + raise InvalidArgumentError( + "target_type 'pr' requires an integer target" + ) from exc + checks = check_surface.for_pr(pr_number) + response_target = pr_number + else: + raise InvalidArgumentError( + "target_type must be one of: commit, branch, pr" + ) + return _tool_result( + { + "target_type": target_type, + "target": response_target, + "checks": [_check_to_dict(run) for run in checks], + } + ) + + +def _tool_override_rate_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: + rate = compute_override_rate(_verified_records(runtime)) + return _tool_result( + { + "status": rate.status.value, + "rate": rate.rate, + "sample_size": rate.sample_size, + "note": _OVERRIDE_RATE_NOTE, + } + ) + + +_TOOL_HANDLERS: dict[str, Callable[["McpRuntime", dict[str, Any]], dict[str, Any]]] = { + "policy_explain": _tool_policy_explain, + "override_submit": _tool_override_submit, + "signoff_status_get": 
_tool_signoff_status_get, + "policy_evaluate": _tool_policy_evaluate, + "scan_route": _tool_scan_route, + "git_branch_list": _tool_git_branch_list, + "git_commit_get": _tool_git_commit_get, + "git_rename_list": _tool_git_rename_list, + "git_rename_feed_get": _tool_git_rename_feed_get, + "filigree_closure_gate_get": _tool_filigree_closure_gate_get, + "pull_request_get": _tool_pull_request_get, + "check_list": _tool_check_list, + "override_rate_get": _tool_override_rate_get, +} + + +def call_tool(runtime: McpRuntime, name: str, args: dict[str, Any]) -> dict[str, Any]: + try: + _validate_argument_keys(name, args) + handler = _TOOL_HANDLERS.get(name) + if handler is None: + return _tool_error("UNKNOWN_TOOL", f"unknown tool: {name}") + return handler(runtime, args) except Exception as exc: return _service_error(exc) @@ -1113,8 +1166,51 @@ def handle_request(request: dict[str, Any], runtime: McpRuntime) -> dict[str, An return {"jsonrpc": "2.0", "id": request_id, "result": result} +def _read_bounded_line(stream: TextIO, max_chars: int) -> tuple[str, bool]: + """Read one newline-terminated record, bounded to ``max_chars``. + + Returns ``(line, overflow)``. ``overflow`` is True when the record exceeded + the bound — the remainder of that over-long line is then drained to the next + newline so framing stays aligned for the following request. Returns + ``("", False)`` at EOF. ``readline(max_chars + 1)`` stops at a newline OR the + size cap, so a record longer than the bound comes back without a trailing + newline — the signal we key on. 
+ """ + line = stream.readline(max_chars + 1) + if line == "": + return "", False + if len(line) > max_chars and not line.endswith("\n"): + while True: + extra = stream.readline(max_chars + 1) + if extra == "" or extra.endswith("\n"): + break + return line, True + return line, False + + def run_jsonrpc(input_stream: TextIO, output_stream: TextIO, runtime: McpRuntime) -> None: - for line in input_stream: + max_chars = _max_request_bytes() + while True: + line, overflow = _read_bounded_line(input_stream, max_chars) + if not line: + break # EOF + if overflow: + output_stream.write( + json.dumps( + { + "jsonrpc": "2.0", + "id": None, + "error": { + "code": -32700, + "message": f"request exceeds maximum size of {max_chars} bytes", + }, + }, + separators=(",", ":"), + ) + + "\n" + ) + output_stream.flush() + continue if not line.strip(): continue try: diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index d48c991..4c45b29 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -1393,3 +1393,45 @@ def get_by_issue_id(self, issue_id): assert result["isError"] is True assert result["structuredContent"]["error_code"] == "AUDIT_INTEGRITY_FAILURE" + + +# --- roadmap 14: stdin JSON-RPC line-size bound --- + +def test_run_jsonrpc_rejects_oversized_line_and_stays_framed(tmp_path, monkeypatch): + # A single line over the bound is rejected with -32700 and does not consume + # the following request — framing realigns at the next newline. 
+ monkeypatch.setenv("LEGIS_MCP_MAX_REQUEST_BYTES", "400") + runtime, _store = _runtime(tmp_path) + runtime.initialized = False + oversized = { + "jsonrpc": "2.0", "id": 99, "method": "tools/list", + "params": {"pad": "A" * 2000}, + } + responses = _run( + _messages( + {"jsonrpc": "2.0", "id": 1, "method": "initialize", + "params": {"protocolVersion": "2025-03-26"}}, + oversized, + {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}, + ), + runtime, + ) + + assert responses[0]["id"] == 1 and "result" in responses[0] + assert responses[1]["id"] is None + assert responses[1]["error"]["code"] == -32700 + assert "maximum size" in responses[1]["error"]["message"] + # The request AFTER the oversized line is still parsed and answered. + assert responses[2]["id"] == 2 and "result" in responses[2] + + +def test_max_request_bytes_env_override_and_fallback(monkeypatch): + from legis.mcp import _DEFAULT_MAX_REQUEST_BYTES, _max_request_bytes + + monkeypatch.delenv("LEGIS_MCP_MAX_REQUEST_BYTES", raising=False) + assert _max_request_bytes() == _DEFAULT_MAX_REQUEST_BYTES + monkeypatch.setenv("LEGIS_MCP_MAX_REQUEST_BYTES", "4096") + assert _max_request_bytes() == 4096 + for bad in ("not-an-int", "0", "-5"): + monkeypatch.setenv("LEGIS_MCP_MAX_REQUEST_BYTES", bad) + assert _max_request_bytes() == _DEFAULT_MAX_REQUEST_BYTES From d7e7c81d04e8c2867776c0cd3c11d5d7d3ee33b6 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 17:20:24 +1000 Subject: [PATCH 10/72] chore: stop tracking AGENTS.md and CLAUDE.md Untrack the filigree-generated AGENTS.md and CLAUDE.md instruction files and add them to .gitignore so they stay local-only. 
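For reference, the untracking step described above can be sketched as follows. This is a minimal sketch in a throwaway repository, not a record of the exact commands run for this commit; the temporary directory, the demo identity, and the commit messages are assumptions made for illustration.

```shell
set -eu

# Throwaway repository; every name and path here is illustrative only.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Simulate the starting state: generated files committed before being ignored.
printf 'generated\n' > AGENTS.md
printf 'generated\n' > CLAUDE.md
git add AGENTS.md CLAUDE.md
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "track generated files"

# Ignore the files, then drop them from the index only.
# --cached removes the index entries and leaves the working copies on disk.
printf 'AGENTS.md\nCLAUDE.md\n' > .gitignore
git rm -q --cached AGENTS.md CLAUDE.md
git add .gitignore
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q -m "stop tracking generated files"

# List what is still tracked; the generated files should no longer appear.
tracked="$(git ls-files)"
echo "$tracked"
```

Listing a path in .gitignore only affects untracked files, which is why the index entries have to be removed explicitly before the ignore rule takes effect.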
Co-Authored-By: Claude Opus 4.8 --- .gitignore | 2 + AGENTS.md | 119 ----------------------------------------------------- CLAUDE.md | 119 ----------------------------------------------------- 3 files changed, 2 insertions(+), 238 deletions(-) delete mode 100644 AGENTS.md delete mode 100644 CLAUDE.md diff --git a/.gitignore b/.gitignore index 09b1965..1f1fc6c 100644 --- a/.gitignore +++ b/.gitignore @@ -20,3 +20,5 @@ coverage.json loomweave.yaml wardline.yaml .loomweave/loomweave.lock +AGENTS.md +CLAUDE.md diff --git a/AGENTS.md b/AGENTS.md deleted file mode 100644 index d2ea656..0000000 --- a/AGENTS.md +++ /dev/null @@ -1,119 +0,0 @@ - -## Filigree Issue Tracker - -`filigree` tracks tasks for this project. Data lives in `.filigree/`. Prefer -the MCP tools (`mcp__filigree__*`) when available; fall back to the `filigree` -CLI otherwise. - -### Workflow - -```bash -# At session start -filigree session-context # ready / in-progress / critical path - -# Pick up the next startable issue (atomic claim + transition into its working status) -filigree start-next-work --assignee -# ...or claim a specific issue -filigree start-work --assignee - -# Do the work, commit, then -filigree close -``` - -Use the atomic claim+transition verbs — `work_start` / `work_start_next` -(MCP) or `start-work` / `start-next-work` (CLI). Do **not** chain -`work_claim` (MCP) or `filigree claim` (CLI) with a subsequent status -update — the two-step form races against other agents; the combined verb is -atomic. - -**Ready ≠ startable.** The working status is type-specific (tasks → -`in_progress`, features → `building`). Bugs start at `triage`, which has no -single-hop transition into work (`triage → confirmed → fixing`), so a triage -bug is *ready* but not directly *startable*: `work_start` on one returns -`INVALID_TRANSITION` naming the next status, and `work_start_next` skips it. -`work_ready` items carry a `startable` flag (plus a `next_action` hint when -false). 
Pass `advance=true` (MCP) / `--advance` (CLI) to walk the soft -transitions to the nearest working status automatically. - -### Observations: when (and when not) to use them - -`observation_create` is a fire-and-forget scratchpad for *incidental* defects — things -you notice *outside the scope of your current task* (a code smell in a -neighbouring file, a stale TODO, a missing test for an edge case you happened -to spot). Notes expire after 14 days unless promoted. Include `file_path` and -`line` when relevant. At session end, skim `observation_list` and either -`observation_dismiss` or `observation_promote` for what has accumulated. - -**You fix bugs in your currently defined scope. You do NOT use observations -to finish work prematurely.** If a defect, gap, or follow-up belongs to your -current task, you own it — handle it as part of that task: fix it now, expand -the task's scope, file a proper issue with a dependency, or surface it to the -user. Filing it as an observation and closing the task is *not* completing -the task; it is shipping known-broken work and hiding the debt in a 14-day -expiring scratchpad. The test is "would I have noticed this even if I weren't -working on this task?" If no, it's task scope, not an observation. - -### Priority scale - -- P0: Critical (drop everything) -- P1: High (do next) -- P2: Medium (default) -- P3: Low -- P4: Backlog - -### Reaching for tools - -MCP tool schemas describe each tool; `filigree --help` and `filigree ---help` are the authoritative CLI reference. You do not need to memorise -either catalogue. The verbs you will reach for most: - -- **Find work:** `work_ready`, `work_blocked`, `issue_list`, `issue_search` -- **Claim work:** `work_start`, `work_start_next` -- **Update:** `comment_add`, `label_add`, `issue_update`, `issue_close` -- **Admin (irreversible):** `issue_delete` (MCP) / `delete-issue` (CLI) — - hard-deletes a terminal issue and its rows; `admin_undo_last` cannot reverse it. 
-- **Scratchpad:** `observation_create`, `observation_list`, `observation_promote`, `observation_dismiss` -- **Cross-product entity bindings (ADR-029):** `entity_association_add`, - `entity_association_remove`, `entity_association_list`, - `entity_association_list_by_entity`. Used when a sibling tool (e.g. - Clarion) needs to bind a Filigree issue to a function, class, or - module identifier it owns. The `entity_id` is an opaque external string - from Filigree's perspective and may be a `clarion:eid:...` SEI or a legacy - locator; callers may also supply `entity_kind` explicitly. The consumer (the sibling tool's read - path) does drift detection against the stored - `content_hash_at_attach`. `entity_association_list_by_entity` is the - reverse-lookup surface — given an opaque external entity ID, return every - Filigree issue bound to it (project isolation is by DB file). Also - reachable over HTTP as - `GET/POST /api/issue/{issue_id}/entity-associations`, - `DELETE /api/issue/{issue_id}/entity-associations?entity_id=…`, - and `GET /api/entity-associations?entity_id=…`. -- **Health:** `stats_get`, `metrics_get`, `mcp_status_get` - -Pass `--actor ` (CLI) so events attribute to your agent identity. It -works in either position — before the verb (`filigree --actor X update …`) or -after it (`filigree update … --actor X`); the post-verb value overrides the -group-level one. - -### Error handling - -Errors return `{error: str, code: ErrorCode, details?: dict}`. Switch on -`code`, not on message text. Codes: `VALIDATION`, `NOT_FOUND`, `CONFLICT`, -`INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`, -`INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`, -`CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`, -`BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`. - -On `INVALID_TRANSITION`, call `workflow_transition_list` (MCP) or -`filigree transitions ` to see what the workflow allows from here. 
- -Two failure modes deserve a specific response: - -- **`SCHEMA_MISMATCH`** — the installed `filigree` is older than the project - database. The error message contains upgrade guidance. Surface it to the - user; do not retry. -- **`ForeignDatabaseError`** — filigree found a parent project's database - but no local `.filigree.conf`. Run `filigree init` in the current - directory. Do **not** `cd` upward to a different project unless that was - the actual intent. - diff --git a/CLAUDE.md b/CLAUDE.md deleted file mode 100644 index d2ea656..0000000 --- a/CLAUDE.md +++ /dev/null @@ -1,119 +0,0 @@ - -## Filigree Issue Tracker - -`filigree` tracks tasks for this project. Data lives in `.filigree/`. Prefer -the MCP tools (`mcp__filigree__*`) when available; fall back to the `filigree` -CLI otherwise. - -### Workflow - -```bash -# At session start -filigree session-context # ready / in-progress / critical path - -# Pick up the next startable issue (atomic claim + transition into its working status) -filigree start-next-work --assignee -# ...or claim a specific issue -filigree start-work --assignee - -# Do the work, commit, then -filigree close -``` - -Use the atomic claim+transition verbs — `work_start` / `work_start_next` -(MCP) or `start-work` / `start-next-work` (CLI). Do **not** chain -`work_claim` (MCP) or `filigree claim` (CLI) with a subsequent status -update — the two-step form races against other agents; the combined verb is -atomic. - -**Ready ≠ startable.** The working status is type-specific (tasks → -`in_progress`, features → `building`). Bugs start at `triage`, which has no -single-hop transition into work (`triage → confirmed → fixing`), so a triage -bug is *ready* but not directly *startable*: `work_start` on one returns -`INVALID_TRANSITION` naming the next status, and `work_start_next` skips it. -`work_ready` items carry a `startable` flag (plus a `next_action` hint when -false). 
Pass `advance=true` (MCP) / `--advance` (CLI) to walk the soft -transitions to the nearest working status automatically. - -### Observations: when (and when not) to use them - -`observation_create` is a fire-and-forget scratchpad for *incidental* defects — things -you notice *outside the scope of your current task* (a code smell in a -neighbouring file, a stale TODO, a missing test for an edge case you happened -to spot). Notes expire after 14 days unless promoted. Include `file_path` and -`line` when relevant. At session end, skim `observation_list` and either -`observation_dismiss` or `observation_promote` for what has accumulated. - -**You fix bugs in your currently defined scope. You do NOT use observations -to finish work prematurely.** If a defect, gap, or follow-up belongs to your -current task, you own it — handle it as part of that task: fix it now, expand -the task's scope, file a proper issue with a dependency, or surface it to the -user. Filing it as an observation and closing the task is *not* completing -the task; it is shipping known-broken work and hiding the debt in a 14-day -expiring scratchpad. The test is "would I have noticed this even if I weren't -working on this task?" If no, it's task scope, not an observation. - -### Priority scale - -- P0: Critical (drop everything) -- P1: High (do next) -- P2: Medium (default) -- P3: Low -- P4: Backlog - -### Reaching for tools - -MCP tool schemas describe each tool; `filigree --help` and `filigree ---help` are the authoritative CLI reference. You do not need to memorise -either catalogue. The verbs you will reach for most: - -- **Find work:** `work_ready`, `work_blocked`, `issue_list`, `issue_search` -- **Claim work:** `work_start`, `work_start_next` -- **Update:** `comment_add`, `label_add`, `issue_update`, `issue_close` -- **Admin (irreversible):** `issue_delete` (MCP) / `delete-issue` (CLI) — - hard-deletes a terminal issue and its rows; `admin_undo_last` cannot reverse it. 
-- **Scratchpad:** `observation_create`, `observation_list`, `observation_promote`, `observation_dismiss` -- **Cross-product entity bindings (ADR-029):** `entity_association_add`, - `entity_association_remove`, `entity_association_list`, - `entity_association_list_by_entity`. Used when a sibling tool (e.g. - Clarion) needs to bind a Filigree issue to a function, class, or - module identifier it owns. The `entity_id` is an opaque external string - from Filigree's perspective and may be a `clarion:eid:...` SEI or a legacy - locator; callers may also supply `entity_kind` explicitly. The consumer (the sibling tool's read - path) does drift detection against the stored - `content_hash_at_attach`. `entity_association_list_by_entity` is the - reverse-lookup surface — given an opaque external entity ID, return every - Filigree issue bound to it (project isolation is by DB file). Also - reachable over HTTP as - `GET/POST /api/issue/{issue_id}/entity-associations`, - `DELETE /api/issue/{issue_id}/entity-associations?entity_id=…`, - and `GET /api/entity-associations?entity_id=…`. -- **Health:** `stats_get`, `metrics_get`, `mcp_status_get` - -Pass `--actor ` (CLI) so events attribute to your agent identity. It -works in either position — before the verb (`filigree --actor X update …`) or -after it (`filigree update … --actor X`); the post-verb value overrides the -group-level one. - -### Error handling - -Errors return `{error: str, code: ErrorCode, details?: dict}`. Switch on -`code`, not on message text. Codes: `VALIDATION`, `NOT_FOUND`, `CONFLICT`, -`INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`, -`INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`, -`CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`, -`BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`. - -On `INVALID_TRANSITION`, call `workflow_transition_list` (MCP) or -`filigree transitions ` to see what the workflow allows from here. 
- -Two failure modes deserve a specific response: - -- **`SCHEMA_MISMATCH`** — the installed `filigree` is older than the project - database. The error message contains upgrade guidance. Surface it to the - user; do not retry. -- **`ForeignDatabaseError`** — filigree found a parent project's database - but no local `.filigree.conf`. Run `filigree init` in the current - directory. Do **not** `cd` upward to a different project unless that was - the actual intent. - From fa5ca3b59114c3a88793f94eaf920eaff3fad715 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 17:42:09 +1000 Subject: [PATCH 11/72] feat(wardline): typed SKIPPED_DIRTY_TREE amber state + dirty-tree dev path MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit wardline now ships `scan --format legis --allow-dirty`, emitting an UNSIGNED `dirty: true` dev artifact (signing stays clean-tree-only). The upstream contract the legis-side work was deferred on now exists, so: - keyless dev posture: a dirty artifact governs but records the marker honestly (artifact_status="dirty"), distinguishable from a clean unsigned scan. - CI posture (artifact key configured): a dirty dev artifact is a typed amber `SKIPPED_DIRTY_TREE` outcome on scan_route / /wardline/scan-results — not the generic red (WardlinePayloadError -> 422 / INVALID_ARGUMENT), and nothing is governed. `LEGIS_WARDLINE_ALLOW_DIRTY=1` is the explicit server-side dev-mode opt-in that governs it unsigned (recorded "dirty", never "verified"). Relaxation is scoped to exactly `dirty is True AND no signature`: a signed payload still verifies (a forged signature stays red), and a clean unsigned payload still requires a signature. `dirty` is checked as strict boolean True because the scan dict is caller-controlled, and dev-mode comes only from server config, never the payload — so the clean-tree signing guarantee is intact. 
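As a rough sketch of the decision table described above (illustrative
only: `classify_artifact`, its `key_configured` flag, and the
"verify-signature" / "unverified" return labels are hypothetical names
chosen for this note, not part of legis; the real logic lives in
`verify_wardline_artifact` in the diff below):

```python
from typing import Any, Mapping


def classify_artifact(
    scan: Mapping[str, Any],
    *,
    key_configured: bool,
    allow_dirty: bool = False,
) -> str:
    """Return which path a scan would take (hypothetical helper)."""
    signature = scan.get("artifact_signature")
    signed = isinstance(signature, str) and bool(signature)
    # Strict identity check: a truthy "true" or 1 must not count as dirty,
    # because the scan dict is supplied by the caller.
    dirty_dev = scan.get("dirty") is True and not signed

    if not key_configured:
        # Keyless dev posture: always governs, but records the marker.
        return "dirty" if dirty_dev else "unverified"
    if dirty_dev:
        # CI posture: amber skip unless the server opted in to dev mode.
        return "dirty" if allow_dirty else "SKIPPED_DIRTY_TREE"
    # Anything signed, or clean and unsigned, goes through normal
    # signature verification (and fails there if the signature is
    # missing or forged).
    return "verify-signature"
```

Only the `dirty is True` and unsigned combination is ever relaxed; every
other input falls through to the normal signature check.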
Closes legis-d731c760c5 (P0 dirty-tree dev path), legis-7e85e8e7ba (P1 typed amber state). Co-Authored-By: Claude Opus 4.8 --- CHANGELOG.md | 17 +++++ src/legis/api/app.py | 18 ++++- src/legis/mcp.py | 60 ++++++++++----- src/legis/service/wardline.py | 5 +- src/legis/wardline/ingest.py | 63 ++++++++++++++++ tests/api/test_combinations_api.py | 51 +++++++++++++ tests/mcp/test_server.py | 66 +++++++++++++++++ tests/wardline/test_ingest.py | 114 +++++++++++++++++++++++++++++ 8 files changed, 371 insertions(+), 23 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index e6e160f..f28db55 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,22 @@ All notable changes to Legis are documented here. The format follows versions per [PEP 440](https://peps.python.org/pep-0440/) / [SemVer](https://semver.org/) (pre-release: `1.0.0rc1`). +## [Unreleased] + +### Added +- **Dirty-tree dev path** — `verify_wardline_artifact` now recognises the + unsigned `dirty: true` dev artifact emitted by `wardline scan --format legis + --allow-dirty`. In the keyless posture it governs but records the marker + honestly (`artifact_status: "dirty"`). In the CI posture (artifact key + configured) a dirty dev artifact is a typed amber **`SKIPPED_DIRTY_TREE`** + outcome on `scan_route` / `/wardline/scan-results` — distinguishable from the + generic red, never governed — unless `LEGIS_WARDLINE_ALLOW_DIRTY=1` opts into + governing it unsigned (recorded as `"dirty"`). The relaxation is scoped to + exactly `dirty is True AND no signature`: a signed payload still verifies + (a forged signature stays red) and a clean unsigned payload still requires a + signature, so the clean-tree signing guarantee is intact. (legis-d731c760c5, + legis-7e85e8e7ba; upstream wardline `--allow-dirty`.) + ## [1.0.0rc1] — 2026-06-03 First release candidate for 1.0. Everything built through Sprint 6 plus the @@ -49,4 +65,5 @@ WP-M1 service-layer extraction, consolidated behind a stable version. 
(Filigree signature column, live-Loomweave oracle + HMAC auth, operative git-rename feed) remain. +[Unreleased]: https://peps.python.org/pep-0440/ [1.0.0rc1]: https://peps.python.org/pep-0440/ diff --git a/src/legis/api/app.py b/src/legis/api/app.py index 15f7448..17134d8 100644 --- a/src/legis/api/app.py +++ b/src/legis/api/app.py @@ -59,7 +59,11 @@ from legis.pulls.models import PullRequest, PullRequestState from legis.pulls.surface import PullSurface from legis.wardline.governor import WardlineCellPolicy -from legis.wardline.ingest import WardlinePayloadError, WardlineSeverity +from legis.wardline.ingest import ( + WardlineDirtyTreeError, + WardlinePayloadError, + WardlineSeverity, +) security = HTTPBearer(auto_error=False) @@ -834,11 +838,21 @@ def wardline_scan_results(body: ScanResultsIn, actor: str = Depends(verify_write if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") else None ), + allow_dirty=os.environ.get("LEGIS_WARDLINE_ALLOW_DIRTY") == "1", ) + except WardlineDirtyTreeError as exc: + # Amber, not red: a dirty dev tree is "environment not ready", not a + # broken/tampered scan. 200 with a typed skip so a harness can tell + # it apart from the 422 generic failure and nothing is governed. 
+ return { + "outcome": exc.reason, + "routed": [], + "detail": str(exc), + } except WardlinePayloadError as exc: raise HTTPException(status_code=422, detail=f"invalid Wardline scan: {exc}") except ValueError as exc: raise HTTPException(status_code=409, detail=str(exc)) - return {"routed": routed} + return {"outcome": "ROUTED", "routed": routed} return app diff --git a/src/legis/mcp.py b/src/legis/mcp.py index 8171305..fd2bfa4 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -55,7 +55,7 @@ from legis.service.wardline import route_wardline_scan from legis.store.audit_store import AuditStore from legis.wardline.governor import WardlineCellPolicy -from legis.wardline.ingest import WardlineSeverity +from legis.wardline.ingest import WardlineDirtyTreeError, WardlineSeverity _AGENT_TOOLS = frozenset( @@ -116,6 +116,7 @@ class McpRuntime: grammar: PolicyGrammar | None = None source_root: str | Path | None = None wardline_artifact_key: bytes | None = None + wardline_allow_dirty: bool = False binding_ledger: Any | None = None @@ -202,6 +203,7 @@ def build_runtime(agent_id: str) -> McpRuntime: if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") else None ), + wardline_allow_dirty=os.environ.get("LEGIS_WARDLINE_ALLOW_DIRTY") == "1", binding_ledger=binding_ledger, ) @@ -269,7 +271,12 @@ def tool_definitions() -> list[dict[str, Any]]: "name": "scan_route", "description": ( "Route Wardline scan findings through one cell, a severity_map " - "policy, or a cell plus fail_on threshold." + "policy, or a cell plus fail_on threshold. Returns a discriminated " + "outcome: ROUTED (governed) or SKIPPED_DIRTY_TREE (an unsigned " + "dirty-tree dev artifact arrived where signed provenance is " + "required — a typed amber skip, not a failure; commit for a " + "signed artifact, or set LEGIS_WARDLINE_ALLOW_DIRTY=1 to govern " + "it unsigned in dev)." 
), "inputSchema": _schema( ["scan"], @@ -949,24 +956,37 @@ def _tool_scan_route(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any } except (KeyError, ValueError) as exc: return _tool_error("INVALID_CELL_SPEC", str(exc)) - routed = route_wardline_scan( - scan, - agent_id=runtime.agent_id, - identity=runtime.identity, - engine=_engine(runtime), - signoff=runtime.signoff_gate, - policy=scan_policy, - cell_map=scan_cell_map, - fail_on=scan_fail_on, - artifact_key=( - runtime.wardline_artifact_key - or ( - os.environ["LEGIS_WARDLINE_ARTIFACT_KEY"].encode("utf-8") - if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") - else None - ) - ), - ) + try: + routed = route_wardline_scan( + scan, + agent_id=runtime.agent_id, + identity=runtime.identity, + engine=_engine(runtime), + signoff=runtime.signoff_gate, + policy=scan_policy, + cell_map=scan_cell_map, + fail_on=scan_fail_on, + artifact_key=( + runtime.wardline_artifact_key + or ( + os.environ["LEGIS_WARDLINE_ARTIFACT_KEY"].encode("utf-8") + if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") + else None + ) + ), + allow_dirty=( + runtime.wardline_allow_dirty + or os.environ.get("LEGIS_WARDLINE_ALLOW_DIRTY") == "1" + ), + ) + except WardlineDirtyTreeError as exc: + # Amber, not red (INVALID_ARGUMENT): a dirty dev tree is "environment + # not ready", not a broken/tampered scan. A typed outcome lets a harness + # tell "commit first" apart from a genuine legis/scan fault; nothing is + # governed. 
+ return _tool_result( + {"outcome": exc.reason, "routed": [], "detail": str(exc)} + ) return _tool_result({"outcome": "ROUTED", "routed": routed}) diff --git a/src/legis/service/wardline.py b/src/legis/service/wardline.py index cb86e9e..a34f410 100644 --- a/src/legis/service/wardline.py +++ b/src/legis/service/wardline.py @@ -32,8 +32,11 @@ def route_wardline_scan( cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None, fail_on: WardlineSeverity | None = None, artifact_key: bytes | None = None, + allow_dirty: bool = False, ) -> list[dict[str, Any]]: - artifact_provenance = verify_wardline_artifact(scan, artifact_key) + artifact_provenance = verify_wardline_artifact( + scan, artifact_key, allow_dirty=allow_dirty + ) findings = active_defects(scan) def resolve(qualname: str | None) -> tuple[EntityKey, dict[str, Any]]: diff --git a/src/legis/wardline/ingest.py b/src/legis/wardline/ingest.py index 825c36c..c66f905 100644 --- a/src/legis/wardline/ingest.py +++ b/src/legis/wardline/ingest.py @@ -53,6 +53,28 @@ class WardlinePayloadError(ValueError): """A Wardline scan payload is not shaped like the trusted wire contract.""" +# A dirty working tree is not a malformed payload — it is "the dev environment +# is not ready for a signed artifact yet". wardline emits an UNSIGNED, dirty:true +# dev artifact for this case (signing stays clean-tree-only). legis classifies it +# as a typed amber/skipped state, NOT a generic red, so a harness can tell +# "commit first" apart from "legis/the scan is broken". +SKIPPED_DIRTY_TREE = "SKIPPED_DIRTY_TREE" + + +class WardlineDirtyTreeError(Exception): + """A dirty-tree dev artifact arrived where signed CI provenance is required. + + Deliberately NOT a ``WardlinePayloadError`` (which boundaries map to a + generic red — HTTP 422 / MCP ``INVALID_ARGUMENT``): the whole point is that + this amber/skipped state is *distinguishable* from a malformed-or-tampered + payload. 
Raised only in the CI posture (artifact key configured) when the + dirty dev artifact is unsigned and the dev-mode opt-in is off. Boundaries + catch it and surface a typed ``SKIPPED_DIRTY_TREE`` outcome. + """ + + reason = SKIPPED_DIRTY_TREE + + def wardline_artifact_fields(scan: Mapping[str, Any]) -> dict[str, Any]: """The Wardline artifact payload covered by ``artifact_signature``.""" if not isinstance(scan, Mapping): @@ -67,12 +89,32 @@ def wardline_artifact_fields(scan: Mapping[str, Any]) -> dict[str, Any]: def verify_wardline_artifact( scan: Mapping[str, Any], artifact_key: bytes | None, + *, + allow_dirty: bool = False, ) -> dict[str, Any]: """Validate optional server-required artifact authentication. When ``artifact_key`` is configured, the scan must carry signed scanner, rule-set, commit, and tree provenance. Without a configured key we still record any supplied metadata, but mark it explicitly unverified. + + Dirty-tree dev artifacts (``dirty: true`` + no signature — wardline + ``--allow-dirty``) are a typed amber case, never a generic red: + + * keyless dev posture — already permissive; the scan governs, but the + dirty marker is recorded honestly (``artifact_status == "dirty"``) so a + dirty dev scan is distinguishable from a clean unsigned one. + * CI posture (``artifact_key`` configured) — by default a dirty dev + artifact raises :class:`WardlineDirtyTreeError` (the boundary surfaces a + typed ``SKIPPED_DIRTY_TREE`` outcome). ``allow_dirty`` is the explicit + server-side dev-mode opt-in that lets legis govern it UNSIGNED, recorded + as ``"dirty"`` (never ``"verified"``). + + The relaxation is scoped to exactly ``dirty is True AND no signature``: a + signed payload still verifies normally (so a forged signature stays red), + and a clean unsigned payload still requires a signature (``allow_dirty`` + relaxes only the dirty case, not "any unsigned"). ``dirty`` is checked as + strict boolean ``True`` because the scan dict is caller-controlled. 
""" fields = wardline_artifact_fields(scan) provenance = { @@ -83,9 +125,30 @@ def verify_wardline_artifact( if isinstance(value, str) and value: provenance[key] = value + signature_present = isinstance(scan.get(ARTIFACT_SIGNATURE_FIELD), str) and bool( + scan.get(ARTIFACT_SIGNATURE_FIELD) + ) + is_dirty_dev_artifact = scan.get("dirty") is True and not signature_present + if artifact_key is None: + if is_dirty_dev_artifact: + provenance["artifact_status"] = "dirty" return provenance + if is_dirty_dev_artifact: + if not allow_dirty: + raise WardlineDirtyTreeError( + "wardline emitted an unsigned dirty-tree dev artifact " + "(dirty: true); signing is clean-tree-only. Commit for a " + "signed artifact, or set LEGIS_WARDLINE_ALLOW_DIRTY=1 to " + "govern it unsigned in dev." + ) + return { + "artifact_status": "dirty", + **{key: value for key in ARTIFACT_PROVENANCE_FIELDS + if isinstance(value := scan.get(key), str) and value}, + } + missing = [ key for key in ARTIFACT_PROVENANCE_FIELDS if not isinstance(scan.get(key), str) or not scan[key] diff --git a/tests/api/test_combinations_api.py b/tests/api/test_combinations_api.py index 80a885a..ce7c74c 100644 --- a/tests/api/test_combinations_api.py +++ b/tests/api/test_combinations_api.py @@ -556,6 +556,57 @@ def test_scan_results_records_verified_artifact_provenance(tmp_path, monkeypatch assert wardline["artifact_signature"].startswith("hmac-sha256:v2:") +def _dirty_wardline_scan(): + return { + "scanner_identity": "wardline@1.0.0rc1", + "rule_set_version": "rules@abc123", + "commit_sha": "a" * 40, + "tree_sha": "b" * 40, + "dirty": True, + "findings": [ + {"rule_id": "R", "message": "m", "severity": "INFO", "kind": "defect", + "fingerprint": "fp", "qualname": "m.f", "properties": {}, "suppressed": "active"} + ], + } + + +def test_scan_results_dirty_tree_is_amber_skip_not_red(tmp_path, monkeypatch): + # P1: key configured, dirty + unsigned, no dev-mode -> HTTP 200 typed amber + # SKIPPED_DIRTY_TREE (distinguishable from the 
422 generic red); nothing + # governed. + monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + monkeypatch.delenv("LEGIS_WARDLINE_ALLOW_DIRTY", raising=False) + c = _client(tmp_path) + + resp = c.post("/wardline/scan-results", + json={"cell": "surface_only", "agent_id": "a", + "scan": _dirty_wardline_scan()}) + + assert resp.status_code == 200 + body = resp.json() + assert body["outcome"] == "SKIPPED_DIRTY_TREE" + assert body["routed"] == [] + assert c.get("/overrides").json() == [] + + +def test_scan_results_dirty_tree_governs_under_devmode_optin(tmp_path, monkeypatch): + # P0: the explicit dev-mode opt-in governs the unsigned dirty artifact, + # recorded honestly as artifact_status="dirty". + monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + monkeypatch.setenv("LEGIS_WARDLINE_ALLOW_DIRTY", "1") + c = _client(tmp_path) + + resp = c.post("/wardline/scan-results", + json={"cell": "surface_only", "agent_id": "a", + "scan": _dirty_wardline_scan()}) + + assert resp.status_code == 200 + assert resp.json()["outcome"] == "ROUTED" + wardline = c.get("/overrides").json()[0]["extensions"]["wardline"] + assert wardline["artifact_status"] == "dirty" + assert "artifact_signature" not in wardline + + def test_scan_results_single_cell_still_works(tmp_path): c = _client(tmp_path) body = {"cell": "surface_override", "agent_id": "agent-1", "scan": {"findings": [ diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index 4c45b29..a253701 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -997,6 +997,72 @@ def test_scan_route_records_verified_artifact_provenance(tmp_path, monkeypatch): assert wardline["artifact_signature"].startswith("hmac-sha256:v2:") +def _dirty_scan(): + return { + "scanner_identity": "wardline@1.0.0rc1", + "rule_set_version": "rules@abc123", + "commit_sha": "a" * 40, + "tree_sha": "b" * 40, + "dirty": True, + **_active_scan(), + } + + +def test_scan_route_dirty_tree_is_amber_skip_not_red(tmp_path, 
monkeypatch): + # P1: a dirty dev artifact in the CI posture (key configured) is a typed + # amber SKIPPED_DIRTY_TREE outcome, NOT the generic INVALID_ARGUMENT red, + # and nothing is governed. + monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + monkeypatch.setenv("LEGIS_WARDLINE_CELL", "surface_only") + monkeypatch.delenv("LEGIS_WARDLINE_ALLOW_DIRTY", raising=False) + runtime, store = _runtime(tmp_path) + + result = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": {"name": "scan_route", "arguments": {"scan": _dirty_scan()}}, + } + ), + runtime, + )[0]["result"] + + assert result.get("isError") is not True + structured = result["structuredContent"] + assert structured["outcome"] == "SKIPPED_DIRTY_TREE" + assert structured["routed"] == [] + assert store.read_all() == [] + + +def test_scan_route_dirty_tree_governs_under_devmode_optin(tmp_path, monkeypatch): + # P0: the explicit server-side dev-mode opt-in governs the unsigned dirty + # artifact, recorded honestly as artifact_status="dirty". 
+ monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + monkeypatch.setenv("LEGIS_WARDLINE_CELL", "surface_only") + monkeypatch.setenv("LEGIS_WARDLINE_ALLOW_DIRTY", "1") + runtime, store = _runtime(tmp_path) + + result = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": {"name": "scan_route", "arguments": {"scan": _dirty_scan()}}, + } + ), + runtime, + )[0]["result"]["structuredContent"] + + assert result["outcome"] == "ROUTED" + assert result["routed"][0]["mode"] == "surface_only" + wardline = store.read_all()[0].payload["extensions"]["wardline"] + assert wardline["artifact_status"] == "dirty" + assert "artifact_signature" not in wardline + + def test_scan_route_fail_on_threshold_routes_each_finding(tmp_path, monkeypatch): monkeypatch.setenv("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING", "1") runtime, _store = _runtime(tmp_path) diff --git a/tests/wardline/test_ingest.py b/tests/wardline/test_ingest.py index 75572ca..06dc028 100644 --- a/tests/wardline/test_ingest.py +++ b/tests/wardline/test_ingest.py @@ -109,3 +109,117 @@ def test_unknown_suppression_state_is_still_rejected(): scan = {"findings": [_finding(fingerprint="x", suppressed="haunted")]} with pytest.raises(WardlinePayloadError, match="unsupported suppression state"): active_defects(scan) + + +# --- dirty-tree dev artifact (P0 dev path + P1 typed amber SKIPPED_DIRTY_TREE) --- +# +# wardline `scan --format legis --allow-dirty` emits an UNSIGNED dev artifact +# marked `dirty: true` (signing stays clean-tree-only). legis must: +# - keyless dev: govern it, but record the dirty marker honestly; +# - CI posture (key configured): NOT conflate "dirty dev tree" with a +# tampered/malformed payload (a generic red). Default to a typed amber +# SKIPPED_DIRTY_TREE; govern unsigned only under an explicit dev-mode opt-in. 
+# The relaxation is scoped to exactly `dirty is True AND signature absent` — a +# signed (or clean) payload still verifies normally, so a real tamper stays red. + +from legis.enforcement.signing import sign # noqa: E402 +from legis.wardline.ingest import ( # noqa: E402 + SKIPPED_DIRTY_TREE, + WardlineDirtyTreeError, + verify_wardline_artifact, + wardline_artifact_fields, +) + +_KEY = b"wardline-artifact-key" + + +def _artifact(*, dirty=None, signed=False, key=_KEY, **over): + scan = { + "scanner_identity": "wardline@1.0.0rc1", + "rule_set_version": "rules@abc123", + "commit_sha": "a" * 40, + "tree_sha": "b" * 40, + "findings": [], + } + if dirty is not None: + scan["dirty"] = dirty + scan.update(over) + if signed: + scan["artifact_signature"] = sign(wardline_artifact_fields(scan), key) + return scan + + +def test_dirty_error_is_not_a_generic_payload_error(): + # The amber skip must be DISTINGUISHABLE from the generic red at the + # boundary — so it is not a WardlinePayloadError (which maps to 422 / + # INVALID_ARGUMENT). It carries a typed reason instead. + assert not issubclass(WardlineDirtyTreeError, WardlinePayloadError) + assert WardlineDirtyTreeError.reason == SKIPPED_DIRTY_TREE + + +def test_keyless_dirty_artifact_governs_with_honest_dirty_status(): + # Keyless local dev is already permissive; the only change is that the + # dirty marker is recorded honestly so a dirty dev scan is distinguishable + # from a clean unsigned one. + prov = verify_wardline_artifact(_artifact(dirty=True), None) + assert prov["artifact_status"] == "dirty" + assert prov["commit_sha"] == "a" * 40 + + +def test_keyless_clean_unsigned_artifact_stays_unverified(): + prov = verify_wardline_artifact(_artifact(), None) + assert prov["artifact_status"] == "unverified" + + +def test_ci_dirty_without_devmode_is_typed_amber_skip_not_red(): + # P1: key configured, dirty + unsigned, dev-mode OFF -> typed amber skip, + # NOT a generic WardlinePayloadError red. 
+ with pytest.raises(WardlineDirtyTreeError) as exc: + verify_wardline_artifact(_artifact(dirty=True), _KEY, allow_dirty=False) + assert exc.value.reason == SKIPPED_DIRTY_TREE + + +def test_ci_dirty_with_devmode_governs_unsigned_as_dirty(): + # P0: key configured, dirty + unsigned, dev-mode ON -> govern unsigned, + # recorded honestly as dirty (never "verified"). + prov = verify_wardline_artifact(_artifact(dirty=True), _KEY, allow_dirty=True) + assert prov["artifact_status"] == "dirty" + assert "artifact_signature" not in prov + assert prov["scanner_identity"] == "wardline@1.0.0rc1" + + +def test_devmode_does_not_relax_a_tampered_signature(): + # Security row: dirty + a PRESENT-but-invalid signature is tampering, not a + # dev tree. Relaxation is scoped to UNSIGNED only, so this stays red even + # with dev-mode on. + scan = _artifact(dirty=True) + scan["artifact_signature"] = "hmac-sha256:v2:" + "0" * 64 # forged + with pytest.raises(WardlinePayloadError, match="does not verify"): + verify_wardline_artifact(scan, _KEY, allow_dirty=True) + + +def test_devmode_does_not_relax_a_clean_unsigned_artifact(): + # Security row: dev-mode relaxes ONLY dirty+unsigned, never "any unsigned". + # A clean (dirty absent/false) unsigned artifact still requires a signature. + with pytest.raises(WardlinePayloadError, match="signature is required"): + verify_wardline_artifact(_artifact(dirty=False), _KEY, allow_dirty=True) + with pytest.raises(WardlinePayloadError, match="signature is required"): + verify_wardline_artifact(_artifact(), _KEY, allow_dirty=True) + + +def test_dirty_marker_must_be_strict_boolean_true(): + # The scan dict is attacker-controlled. A truthy non-True dirty value + # (string "true", 1) must NOT trip the dev relaxation — it falls through to + # normal verification (red when a key is configured and it is unsigned). 
+ for bogus in ("true", "True", 1, [1]): + with pytest.raises(WardlinePayloadError, match="signature is required"): + verify_wardline_artifact(_artifact(dirty=bogus), _KEY, allow_dirty=True) + + +def test_signed_dirty_artifact_verifies_normally(): + # A validly-signed payload that also carries dirty:true is trusted via its + # signature (only the key-holder can produce it); the dirty marker does not + # hijack the signed path into a skip. + scan = _artifact(dirty=True, signed=True) + prov = verify_wardline_artifact(scan, _KEY, allow_dirty=False) + assert prov["artifact_status"] == "verified" From dbc8303d536cc2972cd2a3639476a74a5ae999aa Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 17:49:13 +1000 Subject: [PATCH 12/72] style(tests): clear remaining ruff findings (release prep) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit `ruff check src tests` is now clean: - remove unused imports (EnforcementEngine, EntityKey in test_regressions; ExemptionRegistry in test_exemptions) — F401. - hoist mid-file `import pytest` / `TamperError` to the module top in test_protected_extensions — E402. - `# noqa: F821` on the honesty-gate fixture functions (fake_/weak_/nested_ boundary_test) whose free `handler` name is intentional: those functions are fingerprinted BY SOURCE and never executed, so `handler` stands for the real boundary call the gate looks for. AST-based behavioural check is unaffected (comments are ignored), and the fingerprint round-trip is computed at runtime. 
Co-Authored-By: Claude Opus 4.8 --- tests/enforcement/test_protected_extensions.py | 13 ++++++++----- tests/enforcement/test_regressions.py | 2 -- tests/policy/test_exemptions.py | 1 - tests/policy/test_honesty_gate.py | 10 ++++++---- 4 files changed, 14 insertions(+), 12 deletions(-) diff --git a/tests/enforcement/test_protected_extensions.py b/tests/enforcement/test_protected_extensions.py index a49021b..c3b6176 100644 --- a/tests/enforcement/test_protected_extensions.py +++ b/tests/enforcement/test_protected_extensions.py @@ -1,5 +1,12 @@ +import pytest + from legis.clock import FixedClock -from legis.enforcement.protected import ProtectedGate, TrailVerifier, signing_fields +from legis.enforcement.protected import ( + ProtectedGate, + TamperError, + TrailVerifier, + signing_fields, +) from legis.enforcement.signing import verify from legis.enforcement.verdict import JudgeOpinion, Verdict from legis.identity.entity_key import EntityKey @@ -48,10 +55,6 @@ def test_loomweave_block_does_not_break_the_signature(tmp_path): assert verify(signing_fields(payload), sig, KEY) is True -import pytest -from legis.enforcement.protected import TamperError - - def test_mutating_loomweave_block_invalidates_the_signature(tmp_path): # Discriminating regression lock for WP-A1/L-05: the loomweave block must be bound # to the signed field set. Mutating it after signing MUST break the signature. 
diff --git a/tests/enforcement/test_regressions.py b/tests/enforcement/test_regressions.py index ca43c97..ba20af2 100644 --- a/tests/enforcement/test_regressions.py +++ b/tests/enforcement/test_regressions.py @@ -5,10 +5,8 @@ from legis.api.app import create_app from legis.cli import main from legis.clock import FixedClock -from legis.enforcement.engine import EnforcementEngine from legis.enforcement.signoff import SignoffGate from legis.git.surface import GitSurface, GitError -from legis.identity.entity_key import EntityKey from legis.policy.decorator import check_policy_boundary, policy_boundary, fingerprint from legis.policy.grammar import PolicyGrammar, PolicyResult from legis.policy.exemptions import ExemptionRegistry, Exemption diff --git a/tests/policy/test_exemptions.py b/tests/policy/test_exemptions.py index 2ae7283..c9f576d 100644 --- a/tests/policy/test_exemptions.py +++ b/tests/policy/test_exemptions.py @@ -6,7 +6,6 @@ Exemption, ExemptionAllowlist, ExemptionError, - ExemptionRegistry, load_exemptions, ) diff --git a/tests/policy/test_honesty_gate.py b/tests/policy/test_honesty_gate.py index 58516f0..8dac7a1 100644 --- a/tests/policy/test_honesty_gate.py +++ b/tests/policy/test_honesty_gate.py @@ -7,9 +7,11 @@ ) -# A real, resolvable "test" function the gate will fingerprint. +# Fixture functions the gate fingerprints BY SOURCE — they are never executed, +# so the free `handler` name is intentional (it stands for the real boundary +# call the gate looks for); noqa keeps that deliberate undefined name. 
def fake_boundary_test(): - result = handler("payload") + result = handler("payload") # noqa: F821 assert result == "payload", "no-eval" @@ -20,7 +22,7 @@ def string_only_boundary_test(): def weak_policy_boundary_test(): - assert handler("payload") == "payload" + assert handler("payload") == "payload" # noqa: F821 assert "no-eval" == "no-eval" @@ -57,7 +59,7 @@ def test_gate_passes_with_a_pinned_unmodified_test(): def test_gate_parses_nested_test_sources_consistently(): def nested_boundary_test(): - result = handler("payload") + result = handler("payload") # noqa: F821 assert result == "payload", "no-eval" good = fingerprint(nested_boundary_test) From 01d26c6076df82df9c8a83d12a7bc246b6eb99c9 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 17:49:20 +1000 Subject: [PATCH 13/72] chore(release): bump version to 1.0.0rc4 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Promote the dirty-tree dev path (typed SKIPPED_DIRTY_TREE amber + LEGIS_WARDLINE_ALLOW_DIRTY) and the test-suite lint cleanup into a tagged release candidate. `__version__` and pyproject agree at 1.0.0rc4; CHANGELOG [Unreleased] promoted to [1.0.0rc4] — 2026-06-06. Co-Authored-By: Claude Opus 4.8 --- CHANGELOG.md | 10 ++++++++-- pyproject.toml | 2 +- src/legis/__init__.py | 2 +- 3 files changed, 10 insertions(+), 4 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index f28db55..dafc9aa 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,7 +5,7 @@ All notable changes to Legis are documented here. The format follows versions per [PEP 440](https://peps.python.org/pep-0440/) / [SemVer](https://semver.org/) (pre-release: `1.0.0rc1`). -## [Unreleased] +## [1.0.0rc4] — 2026-06-06 ### Added - **Dirty-tree dev path** — `verify_wardline_artifact` now recognises the @@ -21,6 +21,12 @@ versions per [PEP 440](https://peps.python.org/pep-0440/) / signature, so the clean-tree signing guarantee is intact. 
(legis-d731c760c5, legis-7e85e8e7ba; upstream wardline `--allow-dirty`.) +### Fixed +- **Lint** — cleared the remaining `ruff` findings in the test suite (unused + imports, mid-file imports hoisted to module top, and `# noqa: F821` on the + honesty-gate fixture functions whose free `handler` name is fingerprinted by + source, not executed). `ruff check src tests` is now clean. + ## [1.0.0rc1] — 2026-06-03 First release candidate for 1.0. Everything built through Sprint 6 plus the @@ -65,5 +71,5 @@ WP-M1 service-layer extraction, consolidated behind a stable version. (Filigree signature column, live-Loomweave oracle + HMAC auth, operative git-rename feed) remain. -[Unreleased]: https://peps.python.org/pep-0440/ +[1.0.0rc4]: https://peps.python.org/pep-0440/ [1.0.0rc1]: https://peps.python.org/pep-0440/ diff --git a/pyproject.toml b/pyproject.toml index f134bad..8809ce7 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "legis" -version = "1.0.0rc3" +version = "1.0.0rc4" description = "Legis — the git/CI + governance layer of the Weft suite" readme = "README.md" license = "MIT" diff --git a/src/legis/__init__.py b/src/legis/__init__.py index df1f691..7986973 100644 --- a/src/legis/__init__.py +++ b/src/legis/__init__.py @@ -1,3 +1,3 @@ """Legis — the git/CI + governance layer of the Weft suite.""" -__version__ = "1.0.0rc3" +__version__ = "1.0.0rc4" From 67946df4890869c67d0fc75e481cee6fc024edd0 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 18:16:48 +1000 Subject: [PATCH 14/72] chore(gitignore): separate project-conduct artifacts from the capability MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Tidy .gitignore and apply one consistent rule across the four loom tools: their working folders are *information about the conduct of the project* (issue data, code-archaeology index, scan cache, audit DBs) and do not belong in the repo 
that holds the *capability* (the shipping code + material). - group entries under clear headings; fold the scattered .loomweave/* lines into a single .loomweave/; add .wardline/ and the SQLite WAL sidecars (*.db-shm / *.db-wal) so the *.db audit-data ignore is complete. - ignore all four tools' working dirs: .filigree/, .loomweave/, .wardline/, legis audit *.db (filigree/wardline/legis already complied). - untrack the loomweave files committed under ADR-005 (config.json, instance_id, .gitignore). That ADR is loomweave's storage opinion for its own output; legis is a consumer and sets its own repo policy. Only conduct noise was tracked here anyway — loomweave.db was already untracked, leaving a near-empty config and a per-machine UUID. Files stay on disk (git rm --cached); loomweave regenerates them locally. Co-Authored-By: Claude Opus 4.8 --- .gitignore | 36 +++++++++++++++++++++++++++--------- .loomweave/.gitignore | 26 -------------------------- .loomweave/config.json | 4 ---- .loomweave/instance_id | 1 - 4 files changed, 27 insertions(+), 40 deletions(-) delete mode 100644 .loomweave/.gitignore delete mode 100644 .loomweave/config.json delete mode 100644 .loomweave/instance_id diff --git a/.gitignore b/.gitignore index 1f1fc6c..e765ee6 100644 --- a/.gitignore +++ b/.gitignore @@ -1,4 +1,4 @@ -.worktrees/ +# OS / editor cruft .DS_Store Thumbs.db .idea/ @@ -8,17 +8,35 @@ Thumbs.db .venv/ __pycache__/ *.py[cod] -.pytest_cache/ *.egg-info/ -# Local audit/scratch databases (never commit audit data) -*.db -.filigree -.filigree.conf +.pytest_cache/ +.mypy_cache/ +.ruff_cache/ .coverage coverage.json + +# Worktrees +.worktrees/ + +# Local tooling config (machine-specific, never commit) .mcp.json -loomweave.yaml -wardline.yaml -.loomweave/loomweave.lock + +# Agent instruction files — filigree-generated, regenerated each session AGENTS.md CLAUDE.md + +# --- Loom suite working folders & local config (regenerated/local; never commit) --- +# Filigree — issue-tracker database 
+ project config +.filigree/ +.filigree.conf +# Loomweave — code-archaeology index/cache + config +.loomweave/ +loomweave.yaml +# Wardline — scanner cache + config +.wardline/ +wardline.yaml +# Legis — local audit/scratch databases + their SQLite WAL sidecars +# (audit data is never committed) +*.db +*.db-shm +*.db-wal diff --git a/.loomweave/.gitignore b/.loomweave/.gitignore deleted file mode 100644 index e861d9e..0000000 --- a/.loomweave/.gitignore +++ /dev/null @@ -1,26 +0,0 @@ -# Loomweave .gitignore — ADR-005 tracked-vs-excluded list. -# Tracked (committed): loomweave.db, config.json, .gitignore itself. -# Excluded (ignored): WAL sidecars, shadow DB, per-run logs, tmp scratch. - -# SQLite write-ahead files never belong in the repo. -*-wal -*-shm -*.db-wal -*.db-shm - -# Shadow DB intermediate (ADR-011 --shadow-db). -*.shadow.db -*.db.new - -# Semantic-search embeddings sidecar (ADR-040): large + rebuildable, never -# committed (keeps loomweave.db unbloated). WAL files are covered by *.db-wal/-shm. -embeddings.db - -# Scratch / temp space. -tmp/ - -# Per-run log directories (see detailed-design §File layout). The run dir -# metadata (config.yaml, stats.json, partial.json) is tracked; only the -# raw LLM request/response log is excluded. 
-logs/ -runs/*/log.jsonl diff --git a/.loomweave/config.json b/.loomweave/config.json deleted file mode 100644 index d7ef3ef..0000000 --- a/.loomweave/config.json +++ /dev/null @@ -1,4 +0,0 @@ -{ - "schema_version": 1, - "last_run_id": null -} diff --git a/.loomweave/instance_id b/.loomweave/instance_id deleted file mode 100644 index 16ed381..0000000 --- a/.loomweave/instance_id +++ /dev/null @@ -1 +0,0 @@ -48bbdc71-c426-4b23-8217-a0ea17e349e7 From 0127b66ebfcd8a7149eee1997ed7a60dd3e21c10 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 18:47:28 +1000 Subject: [PATCH 15/72] feat(install): inject legis instructions + skill pack with automatic versioning MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Legis now "stands itself up" like its siblings: `legis install` injects a lean agent-orientation block into CLAUDE.md / AGENTS.md, installs the legis-workflow skill pack (Claude + Codex), registers a SessionStart hook, and extends .gitignore with the local config surface (.legis/, legis.yaml). The block carries a versioned, content-hashed marker (); a drift check re-injects it when either the bundled content or the package version changes. Two triggers keep it fresh: the Claude Code SessionStart hook (`legis session-context`) and a best-effort refresh on `legis mcp` boot — the latter closes the Codex-only-repo gap that the hook-only approach (filigree, loomweave) leaves open. Mirrors filigree's install/hooks mechanism (inject/replace/append, atomic write, symlink rejection, idempotent hook registration), right-sized for legis (no dashboard, no server mode). Lean block + skill-pack split keeps the injected context small while the skill carries the full CLI + MCP-tool reference. 
Design spec: docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md Gates: ruff + mypy clean; 616 passed / 91.00% coverage; per-package floors hold; wheel ships data/; end-to-end install + session-context smoke idempotent. Co-Authored-By: Claude Opus 4.8 (1M context) --- .gitignore | 4 +- ...6-06-legis-instruction-injection-design.md | 192 +++++++ src/legis/cli.py | 79 +++ src/legis/data/instructions.md | 16 + src/legis/data/skills/legis-workflow/SKILL.md | 249 +++++++++ src/legis/hooks.py | 105 ++++ src/legis/install.py | 515 ++++++++++++++++++ tests/test_cli_install.py | 114 ++++ tests/test_hooks.py | 101 ++++ tests/test_install.py | 405 ++++++++++++++ 10 files changed, 1779 insertions(+), 1 deletion(-) create mode 100644 docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md create mode 100644 src/legis/data/instructions.md create mode 100644 src/legis/data/skills/legis-workflow/SKILL.md create mode 100644 src/legis/hooks.py create mode 100644 src/legis/install.py create mode 100644 tests/test_cli_install.py create mode 100644 tests/test_hooks.py create mode 100644 tests/test_install.py diff --git a/.gitignore b/.gitignore index e765ee6..5508bbc 100644 --- a/.gitignore +++ b/.gitignore @@ -36,7 +36,9 @@ loomweave.yaml .wardline/ wardline.yaml # Legis — local audit/scratch databases + their SQLite WAL sidecars -# (audit data is never committed) +# (audit data is never committed) and local working dir / config *.db *.db-shm *.db-wal +.legis/ +legis.yaml diff --git a/docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md b/docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md new file mode 100644 index 0000000..3578c44 --- /dev/null +++ b/docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md @@ -0,0 +1,192 @@ +# Legis instruction injection — design spec + +**Date:** 2026-06-06 +**Status:** Approved for implementation (ultracode) +**Author:** John Morrissey (with Claude) + +## 
Goal + +Make legis "stand itself up" the way its siblings do: a coding agent that opens +a legis project finds an **agent-calibrated orientation block** in +`CLAUDE.md` / `AGENTS.md` plus a `legis-workflow` **skill pack**, and that +content stays **automatically fresh** (versioned content hash; re-injected on +drift) for **both** Claude Code and Codex agents. + +This mirrors Filigree's proven mechanism +(`filigree/src/filigree/install.py`, `hooks.py`, +`install_support/hooks.py`) and adopts Loomweave's skill-tree fingerprint +drift detection — with one improvement over both siblings: refresh also fires +on **MCP server boot**, closing the "Codex-only repo never refreshes" gap. + +## Doctrine anchor + +From `README.md`: *"Each tool stands itself up preloaded with agent-calibrated +instructions — the instruction layer is the configuration mechanism."* and +*"Agent-first: humans on the loop, not in the loop."* This feature is the legis +realization of that instruction layer. + +## Architecture + +### Two-tier content (best practice: lean block + skill pack) + +1. **Lean orientation block** (~20 lines) injected into `CLAUDE.md` / `AGENTS.md`. + - States what legis is (the git/CI + governance layer of Weft), how to reach + it (`mcp__legis__*` tools when present; `legis` CLI fallback), and points to + the `legis-workflow` skill for the full reference. + - Delimited by versioned markers: + - open: `` + - close: `` + - `{version}` = `importlib.metadata.version("legis")` → falls back to + `legis.__version__` (currently `1.0.0rc4`). + - `{hash}` = first 8 hex chars of `sha256(block_body_text)`. + - **Freshness compares the full `v{version}:{hash}` token**, so a body edit + (hash drift) *or* a package-version bump both trigger re-injection and keep + the marker truthful. (Filigree compares hash-only; legis compares both so + "automatic versioning" actually tracks the version.) + +2. 
**`legis-workflow` skill pack** carrying the depth: CLI command reference, + MCP tool catalogue, error-code/recovery table, workflow patterns. Shipped as + package data; installed into `.claude/skills/legis-workflow/` and + `.agents/skills/legis-workflow/` (Codex). Drift-detected via a skill-tree + fingerprint (sorted relative POSIX path + bytes, sha256[:8]). + +### Refresh triggers (two — full coverage) + +- **Claude Code SessionStart hook** (`legis session-context`) registered in + `.claude/settings.json`. Refreshes block + skill drift when Claude opens the + repo. +- **`legis mcp` startup** — best-effort `refresh_instructions(cwd)` invoked from + the CLI `mcp` branch before the stdio loop starts. This is the **load-bearing + trigger for Codex-only repos** (no `.claude/` hook). Idempotent: writes only + when the embedded hash differs, so no git churn in steady state. All failures + are swallowed — the refresh must never block or crash the MCP server. + +Both triggers call the same `refresh_instructions(root)`. Refresh **only updates +files/skills that already carry the marker** (drift refresh in place). Initial +**creation** is the job of `legis install` — an MCP boot or hook never +surprise-creates `CLAUDE.md`. (Matches Filigree's freshness semantics.) + +## Components + +### `src/legis/data/instructions.md` +The lean block body (no markers — markers are added programmatically). Content: +what legis is, `mcp__legis__*` + CLI fallback, the six CLI subcommands, and a +pointer to the `legis-workflow` skill. + +### `src/legis/data/skills/legis-workflow/SKILL.md` +Skill pack with YAML frontmatter (`name: legis-workflow`, a `description:` that +triggers on governance/override/policy-cell/CI-gate/git-rename/closure-gate +tasks). Body documents: +- CLI: `serve`, `mcp`, `check-override-rate`, `governance-gate`, + `sei-backfill`, `policy-boundary-check`. 
+- MCP tools: `policy_explain`, `override_submit`, `signoff_status_get`, + `policy_evaluate`, `scan_route`, `git_branch_list`, `git_commit_get`, + `git_rename_list`, `git_rename_feed_get`, `filigree_closure_gate_get`, + `pull_request_get`, `check_list`, `override_rate_get`. +- Error codes / recovery (sourced from `legis/mcp.py` `_recovery_for`). + +### `src/legis/install.py` +Mirrors Filigree's injection core, right-sized (no dashboard, no server mode): +- `INSTRUCTIONS_MARKER = "`) to the current + version+hash, re-inject on mismatch; for each installed skill root, compare + tree fingerprint to source and reinstall on mismatch. Returns human-readable + update messages. `root` defaults to the caller's cwd; the MCP-boot caller + passes `Path.cwd()` and accepts that a non-project cwd simply no-ops (refresh + only ever touches marker-bearing files). + Best-effort: callers guard against `OSError`/`UnicodeDecodeError`/`ValueError`. +- `generate_session_context() -> str | None`: run `refresh_instructions(cwd)`; + return the joined update messages, or `None` when nothing changed (silent — + no governance snapshot, no DB dependency). + +### `src/legis/cli.py` +- `legis install` subcommand: flags `--claude-md`, `--agents-md`, `--skills`, + `--codex-skills`, `--hooks`, `--gitignore`; no flags ⇒ all. Steps: inject + `CLAUDE.md`, inject `AGENTS.md`, install skills, install codex skills, install + hooks, ensure gitignore. Print a per-step result table. +- `legis session-context` subcommand: prints `generate_session_context()` (or + nothing) and exits 0. +- In the existing `mcp` branch: call `refresh_instructions(Path.cwd())` inside a + broad `try/except` (swallow all) **before** `mcp_main(...)`. + +### `pyproject.toml` +Ensure `src/legis/data/**` (the `instructions.md` and the skill tree) ships in +the wheel/sdist under `uv_build`. Verify via +`importlib.resources.files("legis.data")` at test time. 
+
+### `.gitignore`
+Extend the existing `# Legis —` stanza so it also ignores the (prophylactic,
+sibling-consistent) local config surface:
+```
+# Legis — local audit/scratch databases + their SQLite WAL sidecars
+# and local working dir / config (regenerated/local; never commit)
+*.db
+*.db-shm
+*.db-wal
+.legis/
+legis.yaml
+```
+
+## Out of scope (YAGNI)
+
+- Dashboard / ephemeral-port / server-mode machinery (legis has none).
+- A PreToolUse hook (no dashboard to restart).
+- A Codex-native hook (the MCP-boot refresh supersedes it).
+- Changing how `CLAUDE.md`/`AGENTS.md` are tracked — they remain gitignored
+  regenerated artifacts; the legis block coexists with whatever else regenerates
+  them.
+
+## Testing
+
+Mirror Filigree/Loomweave coverage (repo floor: 88%):
+- `inject_instructions`: create / append / replace / malformed (missing end
+  marker) / idempotent re-run.
+- `_instructions_hash` stable; `_build_instructions_block` marker shape;
+  marker-hash regex extraction.
+- `_skill_tree_fingerprint` changes on content/path change; `refresh_instructions`
+  updates a drifted `CLAUDE.md` **and** `AGENTS.md` and a drifted skill pack;
+  no-ops (returns `[]`) when fresh; skips files without the marker.
+- `install_claude_code_hooks`: fresh install, idempotent re-run, bare→absolute
+  upgrade, malformed `settings.json` backup, does not duplicate, reuses only
+  unscoped blocks.
+- `ensure_gitignore`: adds `.legis/`/`legis.yaml`, idempotent, preserves
+  existing content.
+- `_atomic_write_text`: preserves existing file mode; new file respects umask;
+  rejects symlink target.
+- CLI: `legis install` (all + each selective flag) writes expected artifacts;
+  `legis session-context` prints refresh messages / nothing; `mcp` branch
+  refresh is best-effort (a raising `refresh_instructions` does not break
+  `mcp` startup).
+- Packaging: `importlib.resources.files("legis.data")` resolves the template and
+  skill tree.
+ +## Gates + +`ruff`, `mypy` (py312, the repo's strict config), `pytest` with the 88% floor, +all green before done. diff --git a/src/legis/cli.py b/src/legis/cli.py index d9532f3..5b2e5e1 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -140,6 +140,22 @@ def build_parser() -> argparse.ArgumentParser: help="Output format: human-readable text (default) or machine-readable json", ) + install = subparsers.add_parser( + "install", + help="Inject legis instructions, install the legis-workflow skill, and register the hook", + ) + install.add_argument("--claude-md", action="store_true", help="Inject instructions into CLAUDE.md only") + install.add_argument("--agents-md", action="store_true", help="Inject instructions into AGENTS.md only") + install.add_argument("--skills", action="store_true", help="Install the Claude Code skill pack only") + install.add_argument("--codex-skills", action="store_true", help="Install the Codex skill pack only") + install.add_argument("--hooks", action="store_true", help="Register the Claude Code SessionStart hook only") + install.add_argument("--gitignore", action="store_true", help="Add legis config rules to .gitignore only") + + subparsers.add_parser( + "session-context", + help="SessionStart hook: refresh drifted legis instructions/skills in the cwd", + ) + return parser @@ -221,6 +237,52 @@ def _check_override_rate(db_url: str) -> int: return 1 if res.status is GateStatus.FAIL else 0 +def _run_install(args) -> int: + from legis.install import ( + ensure_gitignore, + inject_instructions, + install_claude_code_hooks, + install_codex_skills, + install_skills, + ) + + project_root = Path.cwd() + install_all = not any( + [args.claude_md, args.agents_md, args.skills, args.codex_skills, args.hooks, args.gitignore] + ) + + steps: list[tuple[bool, str, object]] = [ + (install_all or args.claude_md, "CLAUDE.md", lambda: inject_instructions(project_root / "CLAUDE.md")), + (install_all or args.agents_md, "AGENTS.md", lambda: 
inject_instructions(project_root / "AGENTS.md")), + (install_all or args.skills, "Claude Code skill", lambda: install_skills(project_root)), + (install_all or args.codex_skills, "Codex skill", lambda: install_codex_skills(project_root)), + (install_all or args.hooks, "Claude Code hook", lambda: install_claude_code_hooks(project_root)), + (install_all or args.gitignore, ".gitignore", lambda: ensure_gitignore(project_root)), + ] + + failures = 0 + for selected, name, step in steps: + if not selected: + continue + ok, message = step() # type: ignore[operator] + mark = "OK" if ok else "FAIL" + print(f"[{mark}] {name}: {message}") + if not ok: + failures += 1 + return 1 if failures else 0 + + +def _refresh_instructions_best_effort() -> None: + """Refresh drifted legis instructions on MCP boot. Never raises.""" + try: + from legis.hooks import refresh_instructions + + for message in refresh_instructions(Path.cwd()): + print(message, file=sys.stderr) + except Exception: # noqa: BLE001 (boot refresh must never break the server) + pass + + def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: if argv is None: argv = sys.argv[1:] @@ -247,6 +309,17 @@ def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: run("legis.api.app:create_app", host=args.host, port=args.port, factory=True) return 0 + if args.command == "install": + return _run_install(args) + + if args.command == "session-context": + from legis.hooks import generate_session_context + + context = generate_session_context() + if context: + print(context) + return 0 + if args.command in {"check-override-rate", "governance-gate"}: return _check_override_rate(args.db) @@ -275,6 +348,12 @@ def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: os.environ["LEGIS_POLICY_CELLS"] = args.policy_cells _apply_judge_env(args) + # Universal refresh trigger: every agent (Claude or Codex) reaches legis + # by booting this MCP server, so refreshing here keeps the instruction + # block + skill 
pack fresh even in Codex-only repos with no SessionStart + # hook. Best-effort — it must never block or break server startup. + _refresh_instructions_best_effort() + from legis.mcp import main as mcp_main return mcp_main(args.agent_id) diff --git a/src/legis/data/instructions.md b/src/legis/data/instructions.md new file mode 100644 index 0000000..e951079 --- /dev/null +++ b/src/legis/data/instructions.md @@ -0,0 +1,16 @@ +## Legis (git/CI + governance) + +Legis is the git/CI and governance layer of the Weft suite. Reach for it when a policy fires at the CI/git boundary and a change needs a *recordable* override or human sign-off, when you need governance attestations keyed to stable code identity (SEI), or when you need git/CI context — branches, commits, pull requests, check outcomes, and the Loomweave-bound rename feed — around the work. Enforcement is graded: agent-programmable policy cells decide whether a violation self-clears with an audit trail, is judged inline, or escalates to a human; every decision lands in an append-only, SEI-keyed audit trail that survives rename/move. + +Prefer the `mcp__legis__*` MCP tools when available; fall back to the `legis` CLI. + +CLI subcommands: + +- `serve` — run the Legis API server. +- `mcp` — run the Legis MCP stdio server (launch-bound `--agent-id`). +- `check-override-rate` — exit 1 if the override-rate gate is FAIL (for CI). +- `governance-gate` — run governance CI gates (currently the override-rate gate). +- `sei-backfill` — resolve legacy locator-keyed governance records through Loomweave batch resolve. +- `policy-boundary-check` — fail when `@policy_boundary` metadata lacks current behavioural evidence. + +Full command + MCP-tool reference: see the `legis-workflow` skill. 
diff --git a/src/legis/data/skills/legis-workflow/SKILL.md b/src/legis/data/skills/legis-workflow/SKILL.md
new file mode 100644
index 0000000..8056e00
--- /dev/null
+++ b/src/legis/data/skills/legis-workflow/SKILL.md
@@ -0,0 +1,249 @@
+---
+name: legis-workflow
+description: >
+  This skill should be used when the user asks to explain or evaluate a policy cell,
+  submit a graded override, check the override-rate CI gate, run a governance gate,
+  read git branch/commit context, read the git-rename feed for Loomweave, gate a
+  Filigree closure on verified binding evidence, route Wardline scan findings through
+  governance, read recorded pull-request or CI check outcomes, run the
+  policy-boundary-check, or back-fill SEI-keyed governance records — or when working
+  in a project that uses legis for git/CI governance and graded enforcement.
+---
+
+# Legis Workflow
+
+Legis is the git/CI and **governance** side of the Weft suite. This skill is the
+depth behind the lean `CLAUDE.md` block: the full CLI reference, the MCP tool
+catalogue, the error/recovery table, and the worked patterns an agent actually
+runs. Keep it faithful to the installed `legis` — when in doubt, `legis --help`
+and `legis --help` are authoritative.
+
+## What legis is
+
+Legis answers *what changed, in which branch/commit/PR/check context, and what
+governance or attestation state exists for that change?* It is an SEI **consumer**
+(Loomweave remains the identity authority) and the suite's single governed judge —
+**Wardline analyses trust; Legis governs it, one judge not two**. It does not own
+issue state (Filigree) or code identity (Loomweave); it adds branch/commit/PR/check
+context and a graded enforcement layer on top.
+
+Enforcement is a **2×2** of policy *cells*, each agent-set, each a distinct
+override flow:
+
+| | Judge OFF | Judge ON |
+|---|---|---|
+| **Simple** | **chill** — agent self-reports a recordable override; human reviews async (`ACCEPTED_SELF`) | **coached** — an LLM wall evaluates the override *before* it records; `ACCEPTED_BY_JUDGE` or `BLOCKED` (not self-clearable) |
+| **Complex** | **structured** — block + escalate; a human operator must sign off before the gate clears (`ESCALATED_PENDING`) | **protected** — full machinery: HMAC-signed verdicts, decay sweep, override-rate gate, operator override |
+
+The operating invariant is **agent-first: humans on the loop, not in the loop.**
+Every cell produces an append-only audit trail keyed on SEI, so the record survives
+rename/move. The recorded override is the safety mechanism — an attributable audit
+event, never a silent pass.
+
+## Reaching the tools
+
+Prefer the MCP tools (`mcp__legis__*`) when a Legis MCP server is attached; fall
+back to the `legis` CLI otherwise. Each surface maps thinly over the same service
+layer, so they agree on outcomes.
+
+**Identity is launch-bound.** The MCP server is started with
+`legis mcp --agent-id `; that `--agent-id` is the actor for every override,
+sign-off, and audit record the session produces. **No tool schema accepts an actor
+argument** — you cannot spoof or override identity from a call. (Contrast the CLI's
+`sei-backfill --actor`, which stamps appended backfill events from a one-shot
+command, not an interactive session.)
+
+The MCP transport is stdio JSON-RPC (one object per line). Tool errors come back as
+`isError` results with a `structuredContent` envelope carrying `error_code`,
+`message`, `recoverable`, and `next_action` (see Error handling).
+
+## CLI reference
+
+`legis [flags]`. Most stores fall back to environment variables; flags
+override.
+ +### `legis serve` — run the Legis API server +- `--host` (default `127.0.0.1`), `--port` (default `8000`) — bind address. +- `--governance-db` — governance store URL (env `LEGIS_GOVERNANCE_DB`). +- `--check-db` — check store URL (env `LEGIS_CHECK_DB`). +- `--protected-policies` — comma-separated protected policy list (env `LEGIS_PROTECTED_POLICIES`). +- `--loomweave-url` — Loomweave identity API URL (env `LOOMWEAVE_API_URL`). +- `--filigree-url` — Filigree issue-tracker API URL (env `FILIGREE_API_URL`). +- `--binding-db` — sign-off binding ledger URL (env `LEGIS_BINDING_DB`). +- Judge flags (shared): `--judge-provider` (`openrouter`; omit to keep protected cells fail-closed), `--judge-model` (env `LEGIS_JUDGE_MODEL`), `--judge-max-tokens` (env `LEGIS_JUDGE_MAX_TOKENS`). + +### `legis mcp` — run the MCP stdio server +- `--agent-id` (**required**) — launch-bound agent identity; the actor for all records this session. +- `--governance-db` (env `LEGIS_GOVERNANCE_DB`), `--check-db` (env `LEGIS_CHECK_DB`). +- `--policy-cells` — policy cell registry TOML path (env `LEGIS_POLICY_CELLS`). +- `--protected-policies` (env `LEGIS_PROTECTED_POLICIES`), `--loomweave-url` (env `LOOMWEAVE_API_URL`). +- Judge flags (shared): `--judge-provider`, `--judge-model`, `--judge-max-tokens`. + +### `legis check-override-rate` — CI gate +Fails (exit 1) if the override-rate gate is `FAIL`. For CI use. +- `--db` — governance store URL (default mirrors the server's `LEGIS_GOVERNANCE_DB` / `DEFAULT_GOVERNANCE_DB`). + +Prints `override-rate gate: (rate=…, sample=…)`. A missing SQLite DB under +`CI=true` (without `LEGIS_ALLOW_MISSING_GOVERNANCE_DB=1`) fails; otherwise it prints +`PASS_WITH_NOTICE` and exits 0. A failed hash-chain integrity check exits 1. + +### `legis governance-gate` — run governance CI gates +Currently runs the override-rate gate (same implementation and `--db` semantics as +`check-override-rate`). Use this name for the general CI gate entry point. 
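The exit-code rules described for `check-override-rate` (and therefore `governance-gate`) can be summarised as a small decision function. This is a sketch under stated assumptions: `gate_exit_code` and its parameters are hypothetical names, the order in which the real command evaluates these conditions is not stated in this document, and any status other than `FAIL` is assumed to exit 0.

```python
def gate_exit_code(status, db_missing=False, integrity_ok=True, environ=None):
    """Hypothetical summary of the documented exit codes for the override-rate gate."""
    env = {} if environ is None else environ
    if db_missing:
        in_ci = env.get("CI") == "true"
        allowed = env.get("LEGIS_ALLOW_MISSING_GOVERNANCE_DB") == "1"
        # A missing SQLite DB only fails under CI without the explicit allowance;
        # otherwise the gate reports PASS_WITH_NOTICE and exits 0.
        return 1 if (in_ci and not allowed) else 0
    if not integrity_ok:
        # A failed hash-chain integrity check always exits 1.
        return 1
    # Assumption: only a FAIL gate status produces a non-zero exit.
    return 1 if status == "FAIL" else 0
```

A CI step would treat any non-zero return as a failed build, which matches the "exit 1 on FAIL" behaviour described above.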
+ +### `legis sei-backfill` — resolve legacy locator-keyed records +Resolves legacy locator-keyed governance records through Loomweave batch resolve and +emits a JSON report. +- `--db` — governance store URL (env `LEGIS_GOVERNANCE_DB`). +- `--loomweave-url` (**required**) — Loomweave identity API URL. +- `--execute` — append backfill events (omit for a dry-run report). +- `--actor` (default `legis-sei-backfill`) — actor stamped on appended events. + +### `legis policy-boundary-check` — boundary-evidence gate +Fails (exit 1) when `@policy_boundary` metadata lacks current behavioural evidence. +- `--root` (default `src`) — Python source root to scan. +- `--repo-root` (default `.`) — repo root for `test_ref` resolution. +- `--format` (`text` | `json`, default `text`) — human-readable lines vs machine-readable findings. + +Prints `policy-boundary-check: PASS` (exit 0) when clean; otherwise one +`path:line: rule_id: qualname: reason` per finding (exit 1). + +## MCP tool catalogue + +All tools return a `structuredContent` JSON payload. Names are exact. + +### Governance / policy +| Tool | Purpose | +|---|---| +| `policy_explain` | Explain which governance cell controls a policy/entity pair, whether that cell is enabled here, and which move the agent may make next. | +| `policy_evaluate` | Evaluate a policy against a target **without recording an override**. Returns outcome, detail, and any `provenance_gap`. | +| `override_submit` | Submit an override as the launch-bound agent. Routes to the governing cell and returns a discriminated outcome envelope (`ACCEPTED_SELF` / `ACCEPTED_BY_JUDGE` / `BLOCKED` / `ESCALATED_PENDING` / `NEED_INPUTS`). | +| `signoff_status_get` | Poll whether a **structured** sign-off request (by `seq`) has been cleared. | +| `override_rate_get` | Read the fixed operator force-past override-rate gate (status / rate / sample_size). Measures operator force-pasts; **not** movable by agent retries. 
| +| `scan_route` | Route Wardline scan findings through one cell, a `severity_map`, or a cell + `fail_on` threshold. Returns `ROUTED` or `SKIPPED_DIRTY_TREE` (typed amber skip). | + +### Git +| Tool | Purpose | +|---|---| +| `git_branch_list` | List local git branches and upstream divergence facts. | +| `git_commit_get` | Read one git commit by SHA or safe ref. | +| `git_rename_list` | List git rename evidence for a revision range (`rev_range`). | +| `git_rename_feed_get` | Loomweave-ready rename feed: committed renames over `base..head` plus optional uncommitted working-tree renames (`include_worktree`). | + +### Pulls / checks +| Tool | Purpose | +|---|---| +| `pull_request_get` | Read recorded pull-request metadata (`number`) with joined check outcomes. | +| `check_list` | Read recorded CI/check outcomes for a `target_type` of `commit`, `branch`, or `pr` plus a `target`. | + +### Filigree binding +| Tool | Purpose | +|---|---| +| `filigree_closure_gate_get` | Read whether legis holds **verified binding evidence** for closing a Filigree issue (`issue_id`). Requires the binding ledger to be enabled. | + +### Override-submit outcomes (by cell) +- **chill** → `ACCEPTED_SELF` — self-cleared; human reviews asynchronously. +- **coached** / **protected** → `ACCEPTED_BY_JUDGE` (may be re-judged later) or `BLOCKED`. A `BLOCKED` verdict carries a `blocked_reason_code` (`RATIONALE_INSUFFICIENT` / `CODE_VIOLATION` / `POLICY_HARD_BLOCK` / `UNCLASSIFIED`), `self_clearable: false`, and `next_actions: [REVISE_CODE, REVISE_RATIONALE]`. A blocked attempt **does not count toward your override-rate** — you cannot self-clear past the judge. +- **structured** → `ESCALATED_PENDING` — human sign-off required; poll `signoff_status_get` with the returned `seq`. +- **protected** with missing inputs → `NEED_INPUTS` — supply the listed fields (e.g. `file_fingerprint`, `ast_path`) and resubmit. 
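The per-cell outcomes listed above lend themselves to a simple dispatch on the outcome value. The sketch below is illustrative only: `next_step` is a hypothetical helper, the discriminator key `outcome` is an assumption about the envelope shape, and the returned strings are invented labels rather than anything legis emits. The outcome names, `next_actions`, and `seq` are taken from the text.

```python
def next_step(envelope):
    """Hypothetical mapping from an override_submit envelope to the agent's next move."""
    outcome = envelope.get("outcome")  # assumed name of the discriminator key
    if outcome in ("ACCEPTED_SELF", "ACCEPTED_BY_JUDGE"):
        # The override is recorded; the agent may continue.
        return "proceed"
    if outcome == "BLOCKED":
        # Not self-clearable: revise code or rationale instead of retrying verbatim.
        return "revise: " + ", ".join(envelope.get("next_actions", []))
    if outcome == "ESCALATED_PENDING":
        # Human sign-off required: poll signoff_status_get with the returned seq.
        return f"poll signoff_status_get seq={envelope['seq']}"
    if outcome == "NEED_INPUTS":
        # Supply the fields the envelope lists, then resubmit.
        return "resubmit with requested inputs"
    raise ValueError(f"unexpected outcome: {outcome!r}")
```

Raising on an unknown outcome, rather than defaulting to "proceed", keeps the sketch fail-closed, in line with the governance posture described here.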
+ +Pass an `idempotency_key` on `override_submit` to make retries safe: a repeat with +the same request returns the original outcome; a reused key with a *different* +request is rejected (`INVALID_ARGUMENT`). + +## Error handling + +Tool errors carry `error_code`, `message`, `recoverable`, and a `next_action` hint. +Branch on `error_code`, not message text. + +| `error_code` | Recoverable | `next_action` | +|---|---|---| +| `INVALID_ARGUMENT` | yes | Correct the tool arguments and retry. | +| `INVALID_CELL_SPEC` | yes | Use server-owned routing or a valid cell configuration. | +| `CELL_NOT_ENABLED` | yes | Ask the operator to enable the required governance cell. | +| `NO_SUCH_REQUEST` | yes | Poll a known sign-off sequence returned by `override_submit`. | +| `NOT_FOUND` | yes | Refresh the target identifier and retry. | +| `UNKNOWN_TOOL` | yes | Call `tools/list` and use one of the advertised tool names. | +| `GIT_ERROR` | yes | Check the git ref or revision range and retry. | +| `SERVICE_ERROR` | yes | Inspect the error message before retrying. | +| `AUDIT_INTEGRITY_FAILURE` | **no** | Stop and ask an operator to inspect the governance trail. | +| `INTERNAL_ERROR` | **no** | Inspect the error message before retrying. | + +`AUDIT_INTEGRITY_FAILURE` (raised on a failed hash-chain verification or a binding +ledger error) and `INTERNAL_ERROR` are **not recoverable** — do not retry; surface +them to a human. Everything else is recoverable by fixing the input or asking the +operator to enable a cell. + +Two routing-specific notes for `scan_route`: +- Wardline routing is **server-owned**. Passing `cell` / `severity_map` / `fail_on` + when the server already configures routing (`LEGIS_WARDLINE_CELL` / + `LEGIS_WARDLINE_CELL_BY_SEVERITY`) returns `INVALID_CELL_SPEC`. Request-side + routing is only honoured under the explicit `LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING=1` + escape hatch. 
+- An unsigned dirty-tree dev artifact arriving where signed provenance is required + is **not** an error — it returns `outcome: SKIPPED_DIRTY_TREE` (a typed amber skip; + nothing is governed). Commit for a signed artifact, or set + `LEGIS_WARDLINE_ALLOW_DIRTY=1` to govern it unsigned in dev. + +## Workflow patterns + +### Evaluate a policy cell, then submit a graded override +``` +policy_explain {policy, entity} # which cell governs, is it enabled, what move is next +# read explanation.cell and available_moves (already filtered to agent-callable tools) +override_submit {policy, entity, rationale [, file_fingerprint, ast_path, idempotency_key]} +``` +- **chill** → `ACCEPTED_SELF`; you are done, the human reviews the trail async. +- **coached/protected** → if `BLOCKED`, do not retry verbatim — `REVISE_CODE` or + `REVISE_RATIONALE` per `next_actions`; the judge cannot be talked past and the + blocked attempt costs you nothing on the override-rate. +- **structured** → `ESCALATED_PENDING`; poll `signoff_status_get {seq}` until + `cleared: true`. Do not proceed on the gated change until then. +- **protected** → if `NEED_INPUTS`, supply `file_fingerprint` + `ast_path` (the + bytes and AST node the judge binds its verdict to) and resubmit. + +### Check the override-rate gate in CI +The gate measures **operator force-pasts**, not agent retries — a high rate means +the policy is miscalibrated or an operator is breaking their own rules. +``` +# in-session read: +override_rate_get {} # → {status, rate, sample_size} +# CI step (exit 1 on FAIL): +legis check-override-rate --db +# or the general entry point: +legis governance-gate --db +``` + +### Read the git-rename feed for Loomweave +Legis is the (contract-locked) rename provider Loomweave's SEI re-binding matcher +consumes. 
+``` +git_rename_feed_get {base, head?, include_worktree?} +# committed renames over base..head, plus optional uncommitted working-tree renames +# lower-level evidence over an explicit range: +git_rename_list {rev_range} +``` + +### Gate a Filigree closure on verified binding evidence +Before closing a governed Filigree issue, confirm Legis holds verified, SEI-keyed +sign-off binding evidence for it. +``` +filigree_closure_gate_get {issue_id} # requires the binding ledger to be enabled +# only close in Filigree once this reports verified binding evidence; +# Filigree retains lifecycle authority — Legis only certifies the evidence. +``` +If the ledger is not enabled you get `CELL_NOT_ENABLED` — ask the operator to wire +`LEGIS_BINDING_DB` / `--binding-db`. + +### Route Wardline findings through governance +``` +scan_route {scan} # routing is server-owned; pass only the scan +# → ROUTED (governed into the configured cell) or SKIPPED_DIRTY_TREE (commit, or +# set LEGIS_WARDLINE_ALLOW_DIRTY=1 in dev) +``` + +### Gate boundary evidence in CI +``` +legis policy-boundary-check --root src --repo-root . --format json +# exit 1 with findings when @policy_boundary metadata lacks current behavioural evidence +``` diff --git a/src/legis/hooks.py b/src/legis/hooks.py new file mode 100644 index 0000000..62dfaa7 --- /dev/null +++ b/src/legis/hooks.py @@ -0,0 +1,105 @@ +"""SessionStart / MCP-boot refresh for legis instruction artifacts. + +Two callers drive ``refresh_instructions``: + +- the ``legis session-context`` CLI subcommand, registered as a Claude Code + SessionStart hook, and +- ``legis mcp`` startup (best-effort), which is the universal trigger that also + covers Codex-only repos with no ``.claude/`` hook. + +Both refresh *in place* only — they never create a block or skill pack that is +not already present (that is ``legis install``'s job). A non-project cwd simply +produces no work, because the refresh only ever touches marker-bearing files. 
+""" + +from __future__ import annotations + +import logging +import re +from pathlib import Path + +from legis.install import ( + INSTRUCTIONS_MARKER, + SKILL_NAME, + _build_instructions_block, # noqa: F401 (kept for symmetry / tests) + _get_skills_source_dir, + _marker_token, + _skill_tree_fingerprint, + inject_instructions, + install_codex_skills, + install_skills, +) + +logger = logging.getLogger(__name__) + +_MARKER_TOKEN_RE = re.compile(r"") + + +def _extract_marker_token(content: str) -> str | None: + """Return the ``v{version}:{hash}`` token from a legis marker, or ``None``.""" + m = _MARKER_TOKEN_RE.search(content) + return m.group(1) if m else None + + +def refresh_instructions(root: Path) -> list[str]: + """Refresh drifted legis instruction blocks and skill packs under *root*. + + Compares the embedded ``v{version}:{hash}`` token against the current one + for ``CLAUDE.md`` / ``AGENTS.md`` (re-injecting on drift), and each installed + skill pack's tree fingerprint against the bundled source (reinstalling on + drift). Returns human-readable update messages (empty when everything is + current). Only marker-bearing files and already-installed skill packs are + touched. 
+ """ + messages: list[str] = [] + current_token = _marker_token() + + for filename in ("CLAUDE.md", "AGENTS.md"): + md_path = root / filename + if not md_path.exists(): + continue + try: + content = md_path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + logger.debug("Could not read %s for freshness check", md_path, exc_info=True) + continue + if INSTRUCTIONS_MARKER not in content: + continue + if _extract_marker_token(content) == current_token: + continue + ok, _ = inject_instructions(md_path) + if ok: + messages.append(f"Updated legis instructions in {filename}") + + source_root = _get_skills_source_dir() / SKILL_NAME + if source_root.is_dir(): + source_hash = _skill_tree_fingerprint(source_root) + skill_targets = ( + (root / ".claude" / "skills" / SKILL_NAME, install_skills, "Updated legis skill pack"), + (root / ".agents" / "skills" / SKILL_NAME, install_codex_skills, "Updated legis Codex skill pack"), + ) + for target_root, installer, msg in skill_targets: + if not target_root.is_dir(): + continue + if _skill_tree_fingerprint(target_root) != source_hash: + ok, _ = installer(root) + if ok: + messages.append(msg) + + return messages + + +def generate_session_context() -> str | None: + """Refresh instruction drift in the cwd and return any update messages. + + Returns ``None`` when nothing changed (silent SessionStart output — legis + keeps no project snapshot and depends on no governance database here). + """ + try: + messages = refresh_instructions(Path.cwd()) + except (OSError, UnicodeDecodeError, ValueError): + logger.warning("Instruction freshness check failed", exc_info=True) + return None + if not messages: + return None + return "\n".join(messages) diff --git a/src/legis/install.py b/src/legis/install.py new file mode 100644 index 0000000..6a66487 --- /dev/null +++ b/src/legis/install.py @@ -0,0 +1,515 @@ +"""Project installation helpers for legis. 
+ +Legis "stands itself up": ``legis install`` injects a lean agent-orientation +block into ``CLAUDE.md`` / ``AGENTS.md``, installs the ``legis-workflow`` skill +pack, registers a Claude Code SessionStart hook, and extends ``.gitignore``. + +The block carries a versioned, content-hashed marker +(````) so a drift check can +re-inject it when either the bundled content or the package version changes. +This mirrors filigree's mechanism (``filigree/src/filigree/install.py`` and +``install_support/``), right-sized for legis: no dashboard, no server mode. +""" + +from __future__ import annotations + +import hashlib +import importlib.metadata +import importlib.resources +import json +import os +import shlex +import shutil +import stat +import tempfile +from pathlib import Path +from typing import Any + +# --------------------------------------------------------------------------- +# Constants +# --------------------------------------------------------------------------- + +INSTRUCTIONS_MARKER = "" + +SKILL_NAME = "legis-workflow" +"""Name of the legis skill pack directory.""" + +SESSION_CONTEXT_COMMAND = "legis session-context" +"""Bare form of the SessionStart hook command.""" + + +# --------------------------------------------------------------------------- +# Symlink-safe project paths +# --------------------------------------------------------------------------- + + +class UnsafeInstallPathError(ValueError): + """Raised when an installer target could escape the project root.""" + + +def _is_relative_to(path: Path, root: Path) -> bool: + try: + path.relative_to(root) + except ValueError: + return False + return True + + +def _check_existing_components_not_symlinks(path: Path, root: Path) -> None: + """Reject symlinks in existing path components between root and path.""" + current = root + try: + relative_parts = path.relative_to(root).parts + except ValueError as exc: # pragma: no cover - guarded by callers + msg = f"Installer target {path} is outside project root 
{root}" + raise UnsafeInstallPathError(msg) from exc + + for part in relative_parts: + current = current / part + if current.is_symlink(): + msg = f"Refusing to write through symlinked installer target: {current}" + raise UnsafeInstallPathError(msg) + + +def project_path(project_root: Path, *parts: str) -> Path: + """Return a project-contained path, rejecting symlink escape hatches.""" + root = project_root.resolve(strict=True) + target = root.joinpath(*parts) + _check_existing_components_not_symlinks(target, root) + resolved_target = target.resolve(strict=False) + if not _is_relative_to(resolved_target, root): + msg = f"Installer target {target} resolves outside project root {root}" + raise UnsafeInstallPathError(msg) + return target + + +def ensure_project_dir(project_root: Path, *parts: str) -> Path: + """Create and return a project-contained directory without following links.""" + target = project_path(project_root, *parts) + target.mkdir(parents=True, exist_ok=True) + _check_existing_components_not_symlinks(target, project_root.resolve(strict=True)) + if not target.is_dir(): + msg = f"Installer target directory is not a directory: {target}" + raise UnsafeInstallPathError(msg) + return target + + +def reject_symlink(path: Path) -> None: + """Reject a direct installer target that is a symlink, including dangling.""" + if path.is_symlink(): + msg = f"Refusing to write through symlinked installer target: {path}" + raise UnsafeInstallPathError(msg) + + +# --------------------------------------------------------------------------- +# Instructions block +# --------------------------------------------------------------------------- + + +def _instructions_text() -> str: + """Read the instructions template from the shipped data file.""" + ref = importlib.resources.files("legis.data").joinpath("instructions.md") + return ref.read_text(encoding="utf-8") + + +def _instructions_hash() -> str: + """Return the first 8 hex characters of SHA256 of the instructions content.""" + 
return hashlib.sha256(_instructions_text().encode()).hexdigest()[:8] + + +def _instructions_version() -> str: + """Return a sensible legis version for instructions markers.""" + try: + return importlib.metadata.version("legis") + except importlib.metadata.PackageNotFoundError: + from legis import __version__ + + return __version__ or "0.0.0-dev" + + +def _marker_token() -> str: + """Return the ``v{version}:{hash}`` identity carried by the open marker. + + Freshness compares this whole token, so a content edit (hash drift) *or* a + package-version bump both re-inject and keep the marker truthful. + """ + return f"v{_instructions_version()}:{_instructions_hash()}" + + +def _build_instructions_block() -> str: + """Build the full instructions block with versioned markers.""" + text = _instructions_text() + opening = f"{INSTRUCTIONS_MARKER}:{_marker_token()} -->" + return f"{opening}\n{text}{_END_MARKER}" + + +def _atomic_write_text(path: Path, content: str) -> None: + """Write *content* to *path* atomically (temp + rename), preserving mode.""" + reject_symlink(path) + existing_mode: int | None + try: + existing_mode = stat.S_IMODE(path.stat().st_mode) + except FileNotFoundError: + existing_mode = None + + fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp", prefix=path.name) + try: + with os.fdopen(fd, "w", encoding="utf-8") as f: + f.write(content) + if existing_mode is not None: + os.chmod(tmp, existing_mode) + else: + umask = os.umask(0) + os.umask(umask) + os.chmod(tmp, 0o666 & ~umask) + os.replace(tmp, path) + except BaseException: + Path(tmp).unlink(missing_ok=True) + raise + + +def inject_instructions(file_path: Path) -> tuple[bool, str]: + """Inject legis workflow instructions into a markdown file. + + - missing file → create with just the block; + - has the marker → replace the block in place; + - exists without the marker → append the block. 
+ """ + try: + reject_symlink(file_path) + except UnsafeInstallPathError as exc: + return False, str(exc) + + block = _build_instructions_block() + + if not file_path.exists(): + _atomic_write_text(file_path, block + "\n") + return True, f"Created {file_path}" + + content = file_path.read_text(encoding="utf-8") + if INSTRUCTIONS_MARKER in content: + start = content.index(INSTRUCTIONS_MARKER) + end_pos = content.find(_END_MARKER, start) + if end_pos != -1: + end = end_pos + len(_END_MARKER) + content = content[:start] + block + content[end:] + else: + # Missing end marker — the block is unclosed, so everything from + # the start marker onward belongs to the broken block. Replace + # from the start marker through EOF rather than leaving orphan + # content the next run can no longer distinguish from user text. + content = content[:start] + block + _atomic_write_text(file_path, content) + return True, f"Updated instructions in {file_path}" + + if not content.endswith("\n"): + content += "\n" + content += "\n" + block + "\n" + _atomic_write_text(file_path, content) + return True, f"Appended instructions to {file_path}" + + +# --------------------------------------------------------------------------- +# Skill pack +# --------------------------------------------------------------------------- + + +def _get_skills_source_dir() -> Path: + """Return the path to the bundled skills directory inside the package.""" + return Path(__file__).parent / "data" / "skills" + + +def _skill_tree_fingerprint(root: Path) -> str: + """Return a short hash of every file under *root* (relative path + bytes).""" + digest = hashlib.sha256() + files = sorted(p for p in root.rglob("*") if p.is_file()) + for path in files: + rel = path.relative_to(root).as_posix().encode("utf-8") + digest.update(rel) + digest.update(b"\0") + try: + digest.update(path.read_bytes()) + except OSError: + digest.update(b"") + digest.update(b"\0") + return digest.hexdigest()[:8] + + +def _install_skill_to(project_root: 
Path, target_subpath: Path) -> tuple[bool, str]: + """Copy the legis skill pack into *target_subpath* under *project_root*. + + Idempotent — overwrites existing skill files to track the installed legis + version. Safe under concurrent invocation: each call stages into a unique + directory and tolerates a peer winning the final rename race. + """ + skill_source = _get_skills_source_dir() / SKILL_NAME + if not skill_source.is_dir(): + return False, f"Skill source not found at {skill_source}" + + try: + target_parent = ensure_project_dir(project_root, *target_subpath.parts) + except UnsafeInstallPathError as exc: + return False, str(exc) + target_dir = target_parent / SKILL_NAME + try: + reject_symlink(target_dir) + except UnsafeInstallPathError as exc: + return False, str(exc) + + staging = Path(tempfile.mkdtemp(dir=target_dir.parent, prefix=f"{SKILL_NAME}.installing.")) + staging.rmdir() + staging_consumed = False + backup: Path | None = None + try: + shutil.copytree(skill_source, staging) + if target_dir.exists(): + backup_holder = Path(tempfile.mkdtemp(dir=target_dir.parent, prefix=f"{SKILL_NAME}.old.")) + backup_holder.rmdir() + try: + os.rename(target_dir, backup_holder) + backup = backup_holder + except FileNotFoundError: + pass + try: + os.rename(staging, target_dir) + staging_consumed = True + except OSError: + # A peer raced ahead with identical content — accept their result. 
+ pass + finally: + if not staging_consumed and staging.exists(): + shutil.rmtree(staging, ignore_errors=True) + if backup is not None and backup.exists(): + shutil.rmtree(backup, ignore_errors=True) + + return True, f"Installed skill pack to {target_dir}" + + +def install_skills(project_root: Path) -> tuple[bool, str]: + """Copy the legis skill pack into ``.claude/skills/`` for the project.""" + return _install_skill_to(project_root, Path(".claude") / "skills") + + +def install_codex_skills(project_root: Path) -> tuple[bool, str]: + """Copy the legis skill pack into ``.agents/skills/`` for Codex.""" + return _install_skill_to(project_root, Path(".agents") / "skills") + + +# --------------------------------------------------------------------------- +# Claude Code SessionStart hook +# --------------------------------------------------------------------------- + + +def _find_legis_command() -> list[str]: + """Resolve how to invoke legis for a hook command. + + Prefer a ``legis`` binary on PATH; otherwise fall back to the safe-path + module form `` -P -m legis`` so module resolution does not prepend + the project directory. 
+ """ + found = shutil.which("legis") + if found: + return [found] + import sys + + return [sys.executable, "-P", "-m", "legis"] + + +def _hook_cmd_matches(hook_command: str, bare_command: str) -> bool: + """Whether *hook_command* is a bare, absolute-path, or module form of *bare_command*.""" + if hook_command == bare_command: + return True + try: + hook_tokens = shlex.split(hook_command) + bare_tokens = shlex.split(bare_command) + except ValueError: + return False + if not hook_tokens or not bare_tokens: + return False + n = len(bare_tokens) + bare_bin = bare_tokens[0] # "legis" + + if len(hook_tokens) == n: + if hook_tokens[1:] != bare_tokens[1:]: + return False + hook_bin = hook_tokens[0] + if hook_bin == bare_bin: + return True + hook_base = hook_bin.rsplit("/", 1)[-1].rsplit("\\", 1)[-1] + return hook_base.lower() in {bare_bin.lower(), f"{bare_bin.lower()}.exe"} + + module_prefixes = (["-m", bare_bin], ["-P", "-m", bare_bin]) + for prefix in module_prefixes: + if len(hook_tokens) == n + len(prefix) and hook_tokens[1 : 1 + len(prefix)] == prefix: + return hook_tokens[1 + len(prefix) :] == bare_tokens[1:] + + return False + + +def _has_unscoped_session_start_hook(settings: dict[str, Any], command: str) -> bool: + """Whether *command* appears in an unscoped/wildcard SessionStart block.""" + if not isinstance(settings, dict): + return False + hooks = settings.get("hooks", {}) + if not isinstance(hooks, dict): + return False + session_start = hooks.get("SessionStart", []) + if not isinstance(session_start, list): + return False + for matcher in session_start: + if not isinstance(matcher, dict): + continue + if "matcher" in matcher and matcher.get("matcher") not in (None, "*"): + continue + hook_list = matcher.get("hooks", []) + if not isinstance(hook_list, list): + continue + for hook in hook_list: + if isinstance(hook, dict) and _hook_cmd_matches(hook.get("command", ""), command): + return True + return False + + +def _upgrade_hook_commands(settings: dict[str, 
Any], bare_command: str, new_command: str) -> bool: + """Replace hook commands matching *bare_command* with *new_command*.""" + changed = False + hooks = settings.get("hooks", {}) + if not isinstance(hooks, dict): + return False + session_start = hooks.get("SessionStart", []) + if not isinstance(session_start, list): + return False + for matcher in session_start: + if not isinstance(matcher, dict): + continue + hook_list = matcher.get("hooks", []) + if not isinstance(hook_list, list): + continue + for hook in hook_list: + if not isinstance(hook, dict): + continue + cmd = hook.get("command", "") + if _hook_cmd_matches(cmd, bare_command) and cmd != new_command: + hook["command"] = new_command + changed = True + return changed + + +def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]: + """Register ``legis session-context`` as a Claude Code SessionStart hook. + + Idempotent: re-running upgrades a bare/stale command to the resolved binary + and never duplicates the entry. Reuses only an unscoped block already + carrying the legis hook; otherwise appends a dedicated matcher-less block so + the hook fires on every SessionStart source. 
+ """ + try: + claude_dir = ensure_project_dir(project_root, ".claude") + except UnsafeInstallPathError as exc: + return False, str(exc) + settings_path = claude_dir / "settings.json" + try: + reject_symlink(settings_path) + except UnsafeInstallPathError as exc: + return False, str(exc) + + settings: dict[str, Any] = {} + if settings_path.exists(): + try: + parsed = json.loads(settings_path.read_text(encoding="utf-8")) + if not isinstance(parsed, dict): + raise ValueError("settings.json is not a JSON object") + settings = parsed + except (json.JSONDecodeError, ValueError): + backup = settings_path.with_suffix(".json.bak") + try: + reject_symlink(backup) + except UnsafeInstallPathError as exc: + return False, str(exc) + shutil.copy2(settings_path, backup) + + prefix = shlex.join(_find_legis_command()) + session_context_cmd = f"{prefix} session-context" + + upgraded = _upgrade_hook_commands(settings, SESSION_CONTEXT_COMMAND, session_context_cmd) + needs_add = not _has_unscoped_session_start_hook(settings, SESSION_CONTEXT_COMMAND) + + if not needs_add: + _atomic_write_text(settings_path, json.dumps(settings, indent=2) + "\n") + if upgraded: + return True, f"Upgraded hook command in .claude/settings.json to use {prefix}" + return True, "Hook already registered in .claude/settings.json" + + hooks = settings.setdefault("hooks", {}) + if not isinstance(hooks, dict): + hooks = settings["hooks"] = {} + session_start = hooks.setdefault("SessionStart", []) + if not isinstance(session_start, list): + session_start = hooks["SessionStart"] = [] + + matcher_block: dict[str, Any] | None = None + for matcher in session_start: + if not isinstance(matcher, dict): + continue + if "matcher" in matcher and matcher.get("matcher") not in (None, "*"): + continue + hook_list = matcher.get("hooks", []) + if not isinstance(hook_list, list): + continue + if any( + isinstance(hook, dict) and _hook_cmd_matches(hook.get("command", ""), SESSION_CONTEXT_COMMAND) + for hook in hook_list + ): + 
matcher_block = matcher + break + + if matcher_block is None: + matcher_block = {"hooks": []} + session_start.append(matcher_block) + + matcher_block.setdefault("hooks", []).append( + {"type": "command", "command": session_context_cmd, "timeout": 5000} + ) + + _atomic_write_text(settings_path, json.dumps(settings, indent=2) + "\n") + return True, f"Registered hook in .claude/settings.json: {session_context_cmd}" + + +# --------------------------------------------------------------------------- +# .gitignore +# --------------------------------------------------------------------------- + +_LEGIS_IGNORE_RULES = (".legis/", "legis.yaml") +_LEGIS_IGNORE_BLOCK = ( + "\n# Legis — local working dir / config (regenerated/local; never commit)\n" + ".legis/\n" + "legis.yaml\n" +) + + +def ensure_gitignore(project_root: Path) -> tuple[bool, str]: + """Ensure legis's local config surface (``.legis/``, ``legis.yaml``) is ignored.""" + try: + gitignore = project_path(project_root, ".gitignore") + except UnsafeInstallPathError as exc: + return False, str(exc) + + if gitignore.exists(): + content = gitignore.read_text(encoding="utf-8") + present = { + line.strip() for line in content.splitlines() if line.strip() and not line.lstrip().startswith("#") + } + missing = [rule for rule in _LEGIS_IGNORE_RULES if rule not in present] + if not missing: + return True, "legis config already in .gitignore" + if not content.endswith("\n"): + content += "\n" + content += _LEGIS_IGNORE_BLOCK + _atomic_write_text(gitignore, content) + return True, f"Added {', '.join(missing)} to .gitignore" + + _atomic_write_text(gitignore, _LEGIS_IGNORE_BLOCK.lstrip("\n")) + return True, "Created .gitignore with legis config rules" diff --git a/tests/test_cli_install.py b/tests/test_cli_install.py new file mode 100644 index 0000000..50b3b01 --- /dev/null +++ b/tests/test_cli_install.py @@ -0,0 +1,114 @@ +"""Tests for the install / session-context CLI surfaces and MCP-boot refresh.""" + +from __future__ import 
annotations + +import json + +from legis import install +from legis.cli import build_parser, main +from legis.install import INSTRUCTIONS_MARKER, SKILL_NAME + + +def test_install_all_creates_every_artifact(tmp_path, monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + rc = main(["install"]) + assert rc == 0 + + assert INSTRUCTIONS_MARKER in (tmp_path / "CLAUDE.md").read_text() + assert INSTRUCTIONS_MARKER in (tmp_path / "AGENTS.md").read_text() + assert (tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md").is_file() + assert (tmp_path / ".agents" / "skills" / SKILL_NAME / "SKILL.md").is_file() + settings = json.loads((tmp_path / ".claude" / "settings.json").read_text()) + assert "SessionStart" in settings["hooks"] + gitignore = (tmp_path / ".gitignore").read_text() + assert ".legis/" in gitignore and "legis.yaml" in gitignore + + +def test_install_selective_gitignore_only(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = main(["install", "--gitignore"]) + assert rc == 0 + assert (tmp_path / ".gitignore").exists() + assert not (tmp_path / "CLAUDE.md").exists() + assert not (tmp_path / ".claude").exists() + + +def test_install_claude_md_only(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = main(["install", "--claude-md"]) + assert rc == 0 + assert (tmp_path / "CLAUDE.md").exists() + assert not (tmp_path / "AGENTS.md").exists() + + +def test_install_reports_failure_rc1_on_symlink(tmp_path, monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + real = tmp_path / "real.md" + real.write_text("x") + (tmp_path / "CLAUDE.md").symlink_to(real) + rc = main(["install", "--claude-md"]) + assert rc == 1 + assert "FAIL" in capsys.readouterr().out + + +def test_session_context_silent_when_fresh(tmp_path, monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + install.inject_instructions(tmp_path / "CLAUDE.md") + rc = main(["session-context"]) + assert rc == 0 + assert capsys.readouterr().out == "" + + +def test_session_context_prints_on_drift(tmp_path, 
monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + install.inject_instructions(tmp_path / "CLAUDE.md") + monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED\n") + rc = main(["session-context"]) + assert rc == 0 + assert "CLAUDE.md" in capsys.readouterr().out + + +def test_install_subcommand_parses_flags(): + args = build_parser().parse_args(["install", "--claude-md", "--hooks"]) + assert args.command == "install" + assert args.claude_md is True + assert args.hooks is True + assert args.agents_md is False + + +# --------------------------------------------------------------------------- +# MCP-boot refresh wiring +# --------------------------------------------------------------------------- + + +def test_mcp_boot_refreshes_drifted_instructions(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + install.inject_instructions(tmp_path / "CLAUDE.md") + monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED ON BOOT\n") + + import legis.mcp as mcp_module + + monkeypatch.setattr(mcp_module, "main", lambda agent_id: 0) + + rc = main(["mcp", "--agent-id", "agent-1"]) + assert rc == 0 + assert "DRIFTED ON BOOT" in (tmp_path / "CLAUDE.md").read_text() + + +def test_mcp_boot_refresh_failure_does_not_break_startup(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + + import legis.hooks as hooks_module + import legis.mcp as mcp_module + + calls = [] + + def boom(_root): + raise RuntimeError("refresh exploded") + + monkeypatch.setattr(hooks_module, "refresh_instructions", boom) + monkeypatch.setattr(mcp_module, "main", lambda agent_id: calls.append(agent_id) or 0) + + rc = main(["mcp", "--agent-id", "agent-1"]) + assert rc == 0 + assert calls == ["agent-1"] diff --git a/tests/test_hooks.py b/tests/test_hooks.py new file mode 100644 index 0000000..1404b58 --- /dev/null +++ b/tests/test_hooks.py @@ -0,0 +1,101 @@ +"""Tests for legis.hooks — drift refresh and SessionStart context.""" + +from __future__ import annotations + +from legis import 
hooks, install +from legis.hooks import ( + _extract_marker_token, + generate_session_context, + refresh_instructions, +) +from legis.install import ( + SKILL_NAME, + _marker_token, + inject_instructions, + install_skills, +) + + +def test_extract_marker_token_roundtrip(): + token = _marker_token() + content = f"x\n\nbody\n" + assert _extract_marker_token(content) == token + + +def test_extract_marker_token_absent(): + assert _extract_marker_token("no marker here") is None + + +def test_refresh_noop_when_fresh(tmp_path): + inject_instructions(tmp_path / "CLAUDE.md") + inject_instructions(tmp_path / "AGENTS.md") + assert refresh_instructions(tmp_path) == [] + + +def test_refresh_updates_drifted_block_in_both_files(tmp_path, monkeypatch): + inject_instructions(tmp_path / "CLAUDE.md") + inject_instructions(tmp_path / "AGENTS.md") + + # Simulate drift: the bundled content now hashes differently. + monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED BODY\n") + messages = refresh_instructions(tmp_path) + + assert any("CLAUDE.md" in m for m in messages) + assert any("AGENTS.md" in m for m in messages) + assert "DRIFTED BODY" in (tmp_path / "CLAUDE.md").read_text() + assert "DRIFTED BODY" in (tmp_path / "AGENTS.md").read_text() + + +def test_refresh_skips_file_without_marker(tmp_path): + (tmp_path / "CLAUDE.md").write_text("# plain file, no legis marker\n") + assert refresh_instructions(tmp_path) == [] + assert "legis:instructions" not in (tmp_path / "CLAUDE.md").read_text() + + +def test_refresh_skips_absent_files(tmp_path): + # Neither CLAUDE.md nor AGENTS.md exists and no skills installed. + assert refresh_instructions(tmp_path) == [] + + +def test_refresh_reinstalls_drifted_skill_pack(tmp_path): + install_skills(tmp_path) + skill = tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md" + source = skill.read_text() + # Corrupt the installed copy so its fingerprint diverges from source. 
+ skill.write_text(source + "\nLOCAL EDIT THAT MUST BE OVERWRITTEN\n") + + messages = refresh_instructions(tmp_path) + + assert any("skill pack" in m for m in messages) + assert skill.read_text() == source + + +def test_refresh_does_not_create_skill_pack_when_absent(tmp_path): + # No skill installed → refresh must not create one. + refresh_instructions(tmp_path) + assert not (tmp_path / ".claude" / "skills" / SKILL_NAME).exists() + + +def test_generate_session_context_returns_none_when_fresh(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + inject_instructions(tmp_path / "CLAUDE.md") + assert generate_session_context() is None + + +def test_generate_session_context_returns_messages_on_drift(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + inject_instructions(tmp_path / "CLAUDE.md") + monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED\n") + context = generate_session_context() + assert context is not None + assert "CLAUDE.md" in context + + +def test_generate_session_context_swallows_errors(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + + def boom(_root): + raise OSError("disk gone") + + monkeypatch.setattr(hooks, "refresh_instructions", boom) + assert generate_session_context() is None diff --git a/tests/test_install.py b/tests/test_install.py new file mode 100644 index 0000000..086f97f --- /dev/null +++ b/tests/test_install.py @@ -0,0 +1,405 @@ +"""Tests for legis.install — instruction injection, skills, hooks, gitignore.""" + +from __future__ import annotations + +import json +import os +import stat + +import pytest + +from legis import install +from legis.install import ( + INSTRUCTIONS_MARKER, + SKILL_NAME, + UnsafeInstallPathError, + _build_instructions_block, + _instructions_hash, + _instructions_text, + _instructions_version, + _marker_token, + _skill_tree_fingerprint, + ensure_gitignore, + inject_instructions, + install_claude_code_hooks, + install_codex_skills, + install_skills, + reject_symlink, +) + + +# 
--------------------------------------------------------------------------- +# Instructions block primitives +# --------------------------------------------------------------------------- + + +def test_instructions_text_is_nonempty_and_marker_free(): + text = _instructions_text() + assert text.strip() + # The body must not contain markers; they are added programmatically. + assert INSTRUCTIONS_MARKER not in text + assert "/legis:instructions" not in text + + +def test_instructions_hash_is_stable_8_hex(): + h = _instructions_hash() + assert len(h) == 8 + assert all(c in "0123456789abcdef" for c in h) + assert h == _instructions_hash() + + +def test_instructions_version_prefers_dist_metadata(): + import importlib.metadata + + # Prefers installed distribution metadata; falls back to legis.__version__. + # (In a dev venv the editable dist metadata can lag the source __version__; + # in a real release they agree. Assert the documented preference, not a + # hardcoded string.) + try: + expected = importlib.metadata.version("legis") + except importlib.metadata.PackageNotFoundError: + from legis import __version__ + + expected = __version__ + assert _instructions_version() == expected + assert _instructions_version() # non-empty + + +def test_instructions_version_falls_back_to_dunder(monkeypatch): + import importlib.metadata + + def _raise(_name): + raise importlib.metadata.PackageNotFoundError("legis") + + monkeypatch.setattr(install.importlib.metadata, "version", _raise) + from legis import __version__ + + assert _instructions_version() == __version__ + + +def test_build_block_has_open_and_close_markers(): + block = _build_instructions_block() + assert block.startswith(f"{INSTRUCTIONS_MARKER}:{_marker_token()} -->") + assert block.rstrip().endswith("") + assert _instructions_text() in block + + +# --------------------------------------------------------------------------- +# inject_instructions +# --------------------------------------------------------------------------- 
+ + +def test_inject_creates_missing_file(tmp_path): + target = tmp_path / "CLAUDE.md" + ok, msg = inject_instructions(target) + assert ok + assert "Created" in msg + content = target.read_text() + assert INSTRUCTIONS_MARKER in content + assert "" in content + + +def test_inject_appends_to_existing_file_without_marker(tmp_path): + target = tmp_path / "AGENTS.md" + target.write_text("# My project\n\nExisting guidance.\n") + ok, msg = inject_instructions(target) + assert ok + assert "Appended" in msg + content = target.read_text() + assert "Existing guidance." in content + assert content.index("Existing guidance.") < content.index(INSTRUCTIONS_MARKER) + + +def test_inject_replaces_existing_block_preserving_surrounding_text(tmp_path, monkeypatch): + target = tmp_path / "CLAUDE.md" + target.write_text("TOP\n\n") + inject_instructions(target) + # Append trailing user content after the block. + target.write_text(target.read_text() + "\nBOTTOM\n") + + monkeypatch.setattr(install, "_instructions_text", lambda: "NEW BODY CONTENT\n") + ok, msg = inject_instructions(target) + assert ok + assert "Updated" in msg + content = target.read_text() + assert "TOP" in content + assert "BOTTOM" in content + assert "NEW BODY CONTENT" in content + # Exactly one block remains. + assert content.count(INSTRUCTIONS_MARKER) == 1 + assert content.count("") == 1 + + +def test_inject_idempotent_when_content_unchanged(tmp_path): + target = tmp_path / "CLAUDE.md" + inject_instructions(target) + first = target.read_text() + inject_instructions(target) + assert target.read_text() == first + + +def test_inject_repairs_block_with_missing_end_marker(tmp_path): + target = tmp_path / "CLAUDE.md" + # Open marker but no close marker, plus trailing junk. 
+ target.write_text(f"HEAD\n{INSTRUCTIONS_MARKER}:vX:dead -->\norphan body no close\n") + ok, msg = inject_instructions(target) + assert ok + content = target.read_text() + assert "HEAD" in content + assert "orphan body no close" not in content + assert content.count(INSTRUCTIONS_MARKER) == 1 + assert "" in content + + +def test_inject_rejects_symlink_target(tmp_path): + real = tmp_path / "real.md" + real.write_text("x") + link = tmp_path / "CLAUDE.md" + link.symlink_to(real) + ok, msg = inject_instructions(link) + assert ok is False + assert "symlink" in msg.lower() + + +# --------------------------------------------------------------------------- +# _atomic_write_text +# --------------------------------------------------------------------------- + + +def test_atomic_write_preserves_existing_mode(tmp_path): + target = tmp_path / "CLAUDE.md" + target.write_text("seed") + os.chmod(target, 0o640) + inject_instructions(target) + mode = stat.S_IMODE(target.stat().st_mode) + assert mode == 0o640 + + +def test_reject_symlink_raises_on_symlink(tmp_path): + real = tmp_path / "r" + real.write_text("x") + link = tmp_path / "l" + link.symlink_to(real) + with pytest.raises(UnsafeInstallPathError): + reject_symlink(link) + + +# --------------------------------------------------------------------------- +# Skill pack +# --------------------------------------------------------------------------- + + +def test_install_skills_copies_pack(tmp_path): + ok, msg = install_skills(tmp_path) + assert ok + skill = tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md" + assert skill.is_file() + assert "legis-workflow" in skill.read_text() + + +def test_install_codex_skills_targets_agents_dir(tmp_path): + ok, _ = install_codex_skills(tmp_path) + assert ok + assert (tmp_path / ".agents" / "skills" / SKILL_NAME / "SKILL.md").is_file() + + +def test_install_skills_idempotent(tmp_path): + install_skills(tmp_path) + skill = tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md" + first = 
skill.read_text() + ok, _ = install_skills(tmp_path) + assert ok + assert skill.read_text() == first + + +def test_skill_tree_fingerprint_changes_with_content(tmp_path): + root = tmp_path / "pack" + root.mkdir() + (root / "a.md").write_text("one") + fp1 = _skill_tree_fingerprint(root) + (root / "a.md").write_text("two") + fp2 = _skill_tree_fingerprint(root) + assert fp1 != fp2 + + +# --------------------------------------------------------------------------- +# Hook registration +# --------------------------------------------------------------------------- + + +def _session_commands(settings: dict) -> list[str]: + cmds: list[str] = [] + for block in settings.get("hooks", {}).get("SessionStart", []): + for hook in block.get("hooks", []): + cmds.append(hook.get("command", "")) + return cmds + + +def test_install_hooks_fresh(tmp_path): + ok, msg = install_claude_code_hooks(tmp_path) + assert ok + settings = json.loads((tmp_path / ".claude" / "settings.json").read_text()) + cmds = _session_commands(settings) + assert any(c.endswith("session-context") for c in cmds) + + +def test_install_hooks_idempotent_no_duplicate(tmp_path): + install_claude_code_hooks(tmp_path) + install_claude_code_hooks(tmp_path) + settings = json.loads((tmp_path / ".claude" / "settings.json").read_text()) + cmds = [c for c in _session_commands(settings) if c.endswith("session-context")] + assert len(cmds) == 1 + + +def test_install_hooks_upgrades_bare_command(tmp_path, monkeypatch): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text( + json.dumps( + {"hooks": {"SessionStart": [{"hooks": [{"type": "command", "command": "legis session-context"}]}]}} + ) + ) + # Force a resolved binary path so the bare command must be upgraded. 
+ monkeypatch.setattr(install, "_find_legis_command", lambda: ["/opt/bin/legis"]) + ok, msg = install_claude_code_hooks(tmp_path) + assert ok + settings = json.loads((claude / "settings.json").read_text()) + cmds = _session_commands(settings) + assert "/opt/bin/legis session-context" in cmds + assert cmds.count("/opt/bin/legis session-context") == 1 + + +def test_install_hooks_backs_up_malformed_settings(tmp_path): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text("{ this is not json") + ok, _ = install_claude_code_hooks(tmp_path) + assert ok + assert (claude / "settings.json.bak").is_file() + settings = json.loads((claude / "settings.json").read_text()) + assert any(c.endswith("session-context") for c in _session_commands(settings)) + + +def test_install_hooks_does_not_reuse_scoped_block(tmp_path): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text( + json.dumps( + { + "hooks": { + "SessionStart": [ + {"matcher": "resume", "hooks": [{"type": "command", "command": "legis session-context"}]} + ] + } + } + ) + ) + install_claude_code_hooks(tmp_path) + settings = json.loads((claude / "settings.json").read_text()) + # A new unscoped block must be added — the scoped one does not cover cold start. 
+ blocks = settings["hooks"]["SessionStart"] + unscoped = [b for b in blocks if "matcher" not in b or b.get("matcher") in (None, "*")] + assert unscoped + assert any(h["command"].endswith("session-context") for b in unscoped for h in b["hooks"]) + + +# --------------------------------------------------------------------------- +# _hook_cmd_matches +# --------------------------------------------------------------------------- + + +@pytest.mark.parametrize( + "command,expected", + [ + ("legis session-context", True), + ("/usr/local/bin/legis session-context", True), + ("/path/python -P -m legis session-context", True), + ("/path/python -m legis session-context", True), + ("echo legis session-context", False), + ("legis serve", False), + ], +) +def test_hook_cmd_matches(command, expected): + assert install._hook_cmd_matches(command, "legis session-context") is expected + + +# --------------------------------------------------------------------------- +# .gitignore +# --------------------------------------------------------------------------- + + +def test_ensure_gitignore_creates_file(tmp_path): + ok, msg = ensure_gitignore(tmp_path) + assert ok + content = (tmp_path / ".gitignore").read_text() + assert ".legis/" in content + assert "legis.yaml" in content + + +def test_ensure_gitignore_appends_missing_rules(tmp_path): + (tmp_path / ".gitignore").write_text("*.db\n") + ok, msg = ensure_gitignore(tmp_path) + assert ok + content = (tmp_path / ".gitignore").read_text() + assert "*.db" in content + assert ".legis/" in content + assert "legis.yaml" in content + + +def test_ensure_gitignore_idempotent(tmp_path): + ensure_gitignore(tmp_path) + first = (tmp_path / ".gitignore").read_text() + ok, msg = ensure_gitignore(tmp_path) + assert ok + assert "already" in msg + assert (tmp_path / ".gitignore").read_text() == first + + +# --------------------------------------------------------------------------- +# Command resolution and safe-path edges +# 
--------------------------------------------------------------------------- + + +def test_find_legis_command_prefers_binary_on_path(monkeypatch): + monkeypatch.setattr(install.shutil, "which", lambda _name: "/opt/bin/legis") + assert install._find_legis_command() == ["/opt/bin/legis"] + + +def test_find_legis_command_module_fallback(monkeypatch): + monkeypatch.setattr(install.shutil, "which", lambda _name: None) + cmd = install._find_legis_command() + assert cmd[-3:] == ["-P", "-m", "legis"] + + +def test_project_path_rejects_symlinked_component(tmp_path): + real_dir = tmp_path / "real_dir" + real_dir.mkdir() + link_dir = tmp_path / ".claude" + link_dir.symlink_to(real_dir, target_is_directory=True) + with pytest.raises(UnsafeInstallPathError): + install.project_path(tmp_path, ".claude", "settings.json") + + +def test_ensure_project_dir_creates_and_returns_dir(tmp_path): + created = install.ensure_project_dir(tmp_path, ".claude", "skills") + assert created.is_dir() + assert created == tmp_path / ".claude" / "skills" + + +def test_install_skills_reports_missing_source(tmp_path, monkeypatch): + empty = tmp_path / "no_skills_here" + empty.mkdir() + monkeypatch.setattr(install, "_get_skills_source_dir", lambda: empty) + ok, msg = install_skills(tmp_path) + assert ok is False + assert "not found" in msg + + +def test_upgrade_hook_commands_tolerates_non_dict_settings(): + assert install._upgrade_hook_commands({"hooks": []}, "legis session-context", "x") is False + assert install._upgrade_hook_commands({}, "legis session-context", "x") is False + + +def test_has_unscoped_session_start_hook_tolerates_non_dict(): + assert install._has_unscoped_session_start_hook({"hooks": "nope"}, "legis session-context") is False + assert install._has_unscoped_session_start_hook({}, "legis session-context") is False From b2457108abbad9c8c9f8ca310e556b149323ab40 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 19:01:57 +1000 
Subject: [PATCH 16/72] fix(install): harden skill swap, hook upgrade, gitignore, nested-corrupt settings Applies the confirmed findings from the adversarial review of the instruction-injection feature: - Skill install no longer reports success over a destroyed pack: a genuine (non-peer) rename failure during the staging->target swap now restores the original pack from backup and returns (False, ...) instead of swallowing the error and returning "Installed". The prior pack is only discarded once the new one is in place. - _upgrade_hook_commands now skips user-scoped SessionStart blocks (matcher not in {None, "*"}), so a user's portable bare `legis session-context` is never rewritten into a venv-specific absolute path. Matches the scope filter the reuse/has-unscoped paths already use. - Removed the dead matcher-block reuse loop (unreachable: needs_add implies no unscoped legis hook exists) in favour of always appending a dedicated matcher-less block. - A valid settings.json whose nested hooks/SessionStart has the wrong type is now backed up to settings.json.bak before the in-place reset, instead of silently discarding that user data. - ensure_gitignore appends only the absent rules, fixing a duplicate-write when one of (.legis/, legis.yaml) was already present. Documented the intentional version+hash freshness divergence from filigree in _marker_token (legis git-ignores the regenerated CLAUDE.md/AGENTS.md, so a version-bump re-inject produces no committed diff). Adds regression tests for each fix plus the previously-untested version-bump freshness path and Codex skill-pack drift path. Gates: ruff + mypy clean; 623 passed / 91.19%; per-package floors hold; install.py 82%->87%, hooks.py 94%; end-to-end smoke confirms no gitignore duplication and scoped-block preservation. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/install.py | 101 ++++++++++++++++++++++++++++-------------- tests/test_hooks.py | 24 ++++++++++ tests/test_install.py | 85 +++++++++++++++++++++++++++++++++++ 3 files changed, 176 insertions(+), 34 deletions(-) diff --git a/src/legis/install.py b/src/legis/install.py index 6a66487..7f35813 100644 --- a/src/legis/install.py +++ b/src/legis/install.py @@ -134,7 +134,11 @@ def _marker_token() -> str: """Return the ``v{version}:{hash}`` identity carried by the open marker. Freshness compares this whole token, so a content edit (hash drift) *or* a - package-version bump both re-inject and keep the marker truthful. + package-version bump both re-inject and keep the marker truthful. This is a + deliberate divergence from filigree (which compares the hash segment only): + legis treats ``CLAUDE.md`` / ``AGENTS.md`` as regenerated, git-ignored + artifacts, so a marker-only rewrite on a version bump produces no committed + diff — it just keeps the embedded version honest. """ return f"v{_instructions_version()}:{_instructions_hash()}" @@ -262,6 +266,7 @@ def _install_skill_to(project_root: Path, target_subpath: Path) -> tuple[bool, s staging = Path(tempfile.mkdtemp(dir=target_dir.parent, prefix=f"{SKILL_NAME}.installing.")) staging.rmdir() staging_consumed = False + swap_done = False backup: Path | None = None try: shutil.copytree(skill_source, staging) @@ -276,13 +281,32 @@ def _install_skill_to(project_root: Path, target_subpath: Path) -> tuple[bool, s try: os.rename(staging, target_dir) staging_consumed = True + swap_done = True except OSError: - # A peer raced ahead with identical content — accept their result. - pass + # Distinguish a peer winning the race (target now holds their + # identical content) from a genuine failure. Only the former is + # safe to report as success — otherwise we would claim a successful + # install over a pack we just destroyed. 
+ if target_dir.exists() and target_dir.is_dir(): + swap_done = True + else: + # Genuine failure: restore the original pack we set aside and + # report failure rather than a false-positive "Installed". + if backup is not None and backup.exists(): + try: + os.rename(backup, target_dir) + backup = None + except OSError: + # Could not restore — leave the backup in place (it may + # be the only surviving copy) rather than delete it. + pass + return False, f"Failed to install skill pack to {target_dir}: swap failed" finally: if not staging_consumed and staging.exists(): shutil.rmtree(staging, ignore_errors=True) - if backup is not None and backup.exists(): + # Only discard the prior pack once the new one is in place. If the swap + # failed we must not delete the backup — it may be the only copy left. + if backup is not None and swap_done and backup.exists(): shutil.rmtree(backup, ignore_errors=True) return True, f"Installed skill pack to {target_dir}" @@ -385,6 +409,11 @@ def _upgrade_hook_commands(settings: dict[str, Any], bare_command: str, new_comm for matcher in session_start: if not isinstance(matcher, dict): continue + # Only upgrade commands in unscoped blocks legis owns. A user's scoped + # block (e.g. {"matcher": "resume"}) is their config — never rewrite a + # portable bare command there into a venv-specific absolute path. 
+ if "matcher" in matcher and matcher.get("matcher") not in (None, "*"): + continue hook_list = matcher.get("hooks", []) if not isinstance(hook_list, list): continue @@ -443,35 +472,36 @@ def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]: return True, f"Upgraded hook command in .claude/settings.json to use {prefix}" return True, "Hook already registered in .claude/settings.json" - hooks = settings.setdefault("hooks", {}) - if not isinstance(hooks, dict): - hooks = settings["hooks"] = {} - session_start = hooks.setdefault("SessionStart", []) - if not isinstance(session_start, list): - session_start = hooks["SessionStart"] = [] - - matcher_block: dict[str, Any] | None = None - for matcher in session_start: - if not isinstance(matcher, dict): - continue - if "matcher" in matcher and matcher.get("matcher") not in (None, "*"): - continue - hook_list = matcher.get("hooks", []) - if not isinstance(hook_list, list): - continue - if any( - isinstance(hook, dict) and _hook_cmd_matches(hook.get("command", ""), SESSION_CONTEXT_COMMAND) - for hook in hook_list - ): - matcher_block = matcher - break - - if matcher_block is None: - matcher_block = {"hooks": []} - session_start.append(matcher_block) - - matcher_block.setdefault("hooks", []).append( - {"type": "command", "command": session_context_cmd, "timeout": 5000} + # A valid top-level object whose "hooks"/"SessionStart" is the wrong type + # parses cleanly (so the malformed-JSON backup above did not fire), but the + # resets below would silently drop that user data — preserve a recoverable + # copy first. 
+ existing_hooks = settings.get("hooks") + existing_ss = existing_hooks.get("SessionStart") if isinstance(existing_hooks, dict) else None + nested_corrupt = (existing_hooks is not None and not isinstance(existing_hooks, dict)) or ( + isinstance(existing_hooks, dict) and "SessionStart" in existing_hooks and not isinstance(existing_ss, list) + ) + if nested_corrupt and settings_path.exists(): + backup = settings_path.with_suffix(".json.bak") + try: + reject_symlink(backup) + except UnsafeInstallPathError as exc: + return False, str(exc) + shutil.copy2(settings_path, backup) + + if not isinstance(settings.get("hooks"), dict): + settings["hooks"] = {} + hooks = settings["hooks"] + if not isinstance(hooks.get("SessionStart"), list): + hooks["SessionStart"] = [] + session_start = hooks["SessionStart"] + + # needs_add is True only when no unscoped block already carries the legis + # hook (see _has_unscoped_session_start_hook), so there is never a reusable + # block to find — append a dedicated matcher-less block that fires on every + # SessionStart source regardless of how neighbouring blocks are scoped. + session_start.append( + {"hooks": [{"type": "command", "command": session_context_cmd, "timeout": 5000}]} ) _atomic_write_text(settings_path, json.dumps(settings, indent=2) + "\n") @@ -507,7 +537,10 @@ def ensure_gitignore(project_root: Path) -> tuple[bool, str]: return True, "legis config already in .gitignore" if not content.endswith("\n"): content += "\n" - content += _LEGIS_IGNORE_BLOCK + # Append only the rules that are actually absent — writing the whole + # block when one rule is already present would duplicate the other. 
+ content += "\n# Legis — local working dir / config (regenerated/local; never commit)\n" + content += "".join(f"{rule}\n" for rule in missing) _atomic_write_text(gitignore, content) return True, f"Added {', '.join(missing)} to .gitignore" diff --git a/tests/test_hooks.py b/tests/test_hooks.py index 1404b58..f4b4939 100644 --- a/tests/test_hooks.py +++ b/tests/test_hooks.py @@ -12,6 +12,7 @@ SKILL_NAME, _marker_token, inject_instructions, + install_codex_skills, install_skills, ) @@ -46,6 +47,29 @@ def test_refresh_updates_drifted_block_in_both_files(tmp_path, monkeypatch): assert "DRIFTED BODY" in (tmp_path / "AGENTS.md").read_text() +def test_refresh_updates_on_version_bump_with_identical_content(tmp_path, monkeypatch): + # Pins the documented "automatic versioning" contract: a package-version + # bump re-injects even when instructions.md is byte-identical. This is the + # only test that would catch a regression collapsing freshness to hash-only. + inject_instructions(tmp_path / "CLAUDE.md") + monkeypatch.setattr(install, "_instructions_version", lambda: "9.9.9") + messages = refresh_instructions(tmp_path) + assert any("CLAUDE.md" in m for m in messages) + assert "v9.9.9:" in (tmp_path / "CLAUDE.md").read_text() + + +def test_refresh_reinstalls_drifted_codex_skill_pack(tmp_path): + install_codex_skills(tmp_path) + skill = tmp_path / ".agents" / "skills" / SKILL_NAME / "SKILL.md" + source = skill.read_text() + skill.write_text(source + "\nLOCAL EDIT\n") + + messages = refresh_instructions(tmp_path) + + assert any("Codex skill pack" in m for m in messages) + assert skill.read_text() == source + + def test_refresh_skips_file_without_marker(tmp_path): (tmp_path / "CLAUDE.md").write_text("# plain file, no legis marker\n") assert refresh_instructions(tmp_path) == [] diff --git a/tests/test_install.py b/tests/test_install.py index 086f97f..980c5e2 100644 --- a/tests/test_install.py +++ b/tests/test_install.py @@ -403,3 +403,88 @@ def 
test_upgrade_hook_commands_tolerates_non_dict_settings(): def test_has_unscoped_session_start_hook_tolerates_non_dict(): assert install._has_unscoped_session_start_hook({"hooks": "nope"}, "legis session-context") is False assert install._has_unscoped_session_start_hook({}, "legis session-context") is False + + +def test_install_hooks_leaves_user_scoped_block_command_untouched(tmp_path, monkeypatch): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text( + json.dumps( + { + "hooks": { + "SessionStart": [ + {"matcher": "resume", "hooks": [{"type": "command", "command": "legis session-context"}]} + ] + } + } + ) + ) + monkeypatch.setattr(install, "_find_legis_command", lambda: ["/opt/bin/legis"]) + install_claude_code_hooks(tmp_path) + blocks = json.loads((claude / "settings.json").read_text())["hooks"]["SessionStart"] + + scoped = [b for b in blocks if b.get("matcher") == "resume"][0] + # The user's portable bare command must NOT be pinned to a venv path. + assert scoped["hooks"][0]["command"] == "legis session-context" + # legis still adds its own unscoped block with the resolved command. 
+ unscoped = [b for b in blocks if "matcher" not in b or b.get("matcher") in (None, "*")] + assert any(h["command"] == "/opt/bin/legis session-context" for b in unscoped for h in b["hooks"]) + + +def test_install_hooks_backs_up_nested_corrupt_structure(tmp_path): + claude = tmp_path / ".claude" + claude.mkdir() + (claude / "settings.json").write_text(json.dumps({"hooks": "important user data", "keep": 1})) + ok, _ = install_claude_code_hooks(tmp_path) + assert ok + bak = claude / "settings.json.bak" + assert bak.is_file() + assert "important user data" in bak.read_text() + settings = json.loads((claude / "settings.json").read_text()) + assert settings.get("keep") == 1 # sibling key preserved + assert any(c.endswith("session-context") for c in _session_commands(settings)) + + +def test_install_skills_restores_original_on_genuine_swap_failure(tmp_path, monkeypatch): + install_skills(tmp_path) + skill = tmp_path / ".claude" / "skills" / SKILL_NAME / "SKILL.md" + original = skill.read_text() + + real_rename = os.rename + calls = {"n": 0} + + def flaky_rename(src, dst): + calls["n"] += 1 + if calls["n"] == 2: # the staging -> target swap + raise OSError("simulated swap failure") + return real_rename(src, dst) + + monkeypatch.setattr(install.os, "rename", flaky_rename) + ok, msg = install_skills(tmp_path) + + assert ok is False + assert "swap failed" in msg + # The previously installed pack must survive a genuine swap failure. 
+ assert skill.is_file() + assert skill.read_text() == original + + +def test_inject_append_keeps_marker_off_users_last_line(tmp_path): + target = tmp_path / "CLAUDE.md" + target.write_text("# Project\nlast line no newline") # no trailing newline + inject_instructions(target) + content = target.read_text() + assert "last line no newline\n" in content + idx = content.index(INSTRUCTIONS_MARKER) + assert content[idx - 1] == "\n" + + +def test_ensure_gitignore_partial_present_appends_only_missing(tmp_path): + (tmp_path / ".gitignore").write_text("*.db\n.legis/\n") # legis.yaml absent + ok, msg = ensure_gitignore(tmp_path) + assert ok + assert "legis.yaml" in msg + assert ".legis/" not in msg # already present — not re-reported + content = (tmp_path / ".gitignore").read_text() + assert content.count(".legis/") == 1 # not duplicated + assert content.count("legis.yaml") == 1 From e87fbf3e38aa7c446eee9cb35bfef8a37a59f76b Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 19:42:36 +1000 Subject: [PATCH 17/72] fix(mcp): pull_request_get reports recorded checks unconditionally MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The ternary guarded `_checks(runtime).for_pr(number)` on `runtime.check_surface is not None`, which evaluates the field BEFORE the lazy `_checks()` builder can initialise it. On a fresh build_runtime (check_surface=None) the guard short-circuited to `[]` and never built the surface, so a PR's CI outcomes were call-order-dependent: empty until some other tool (e.g. check_list) happened to initialise the surface first. For a governance tool that is the "but legis said it was clean" failure — an agent could be told a PR has no checks when failing checks exist. Call `_checks(runtime).for_pr(number)` unconditionally, matching _tool_check_list; `_checks()` already handles the None case. 
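The guard-before-lazy-builder problem described above can be reproduced in a few lines. This is a minimal sketch, not the code in mcp.py: `FakeCheckSurface`, `FakeRuntime`, `lazy_checks`, `checks_guarded`, and `checks_unconditional` are hypothetical names chosen for illustration, and the only behaviour assumed from the commit message is that the surface starts as `None` and is built on first use.

```python
from dataclasses import dataclass
from typing import Optional


class FakeCheckSurface:
    """Hypothetical stand-in for a check surface holding one failing check on PR 7."""

    def for_pr(self, number: int) -> list:
        return ["unit:fail"] if number == 7 else []


@dataclass
class FakeRuntime:
    # A fresh runtime has no surface yet; the lazy builder creates it on demand.
    check_surface: Optional[FakeCheckSurface] = None


def lazy_checks(runtime: FakeRuntime) -> FakeCheckSurface:
    # Lazy builder: initialise on first use, then reuse the same instance.
    if runtime.check_surface is None:
        runtime.check_surface = FakeCheckSurface()
    return runtime.check_surface


def checks_guarded(runtime: FakeRuntime, number: int) -> list:
    # Anti-pattern: the condition reads the field BEFORE the builder can set it,
    # so a fresh runtime always takes the empty branch.
    return lazy_checks(runtime).for_pr(number) if runtime.check_surface is not None else []


def checks_unconditional(runtime: FakeRuntime, number: int) -> list:
    # Fix: always go through the builder; it already handles the None case.
    return lazy_checks(runtime).for_pr(number)
```

On a fresh `FakeRuntime()`, the guarded form returns an empty list while the unconditional form returns the failing check; the guarded form only agrees once some other call has built the surface, which is the call-order dependence the fix removes.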
Confirmed the sole instance of the guard-before-lazy-builder anti-pattern in mcp.py. Regression test reproduces the real path (fresh runtime, checks reachable via LEGIS_CHECK_DB) and asserts the FAIL check is reported. Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/mcp.py | 11 ++++---- tests/mcp/test_server.py | 56 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 62 insertions(+), 5 deletions(-) diff --git a/src/legis/mcp.py b/src/legis/mcp.py index fd2bfa4..459d595 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -1043,11 +1043,12 @@ def _tool_pull_request_get(runtime: McpRuntime, args: dict[str, Any]) -> dict[st return _tool_error("NOT_FOUND", f"unknown PR: {number}") pull_payload = asdict(pull) pull_payload["state"] = pull.state.value - pull_checks = ( - _checks(runtime).for_pr(number) - if runtime.check_surface is not None - else [] - ) + # Build the check surface unconditionally — `_checks()` lazily initialises it + # from LEGIS_CHECK_DB. Guarding on `runtime.check_surface is not None` made the + # result call-order-dependent: a fresh runtime (build_runtime sets it to None) + # reported no checks until some other tool happened to initialise the surface + # first, so an agent could be told a PR is clean when checks exist and fail. 
+ pull_checks = _checks(runtime).for_pr(number) pull_payload["checks"] = [_check_to_dict(run) for run in pull_checks] return _tool_result(pull_payload) diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index a253701..2c9bc26 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -1250,6 +1250,62 @@ def test_read_tools_return_git_pull_checks_and_override_rate(tmp_path, git_repo) assert rate["note"] == "measures operator force-pasts; not movable by agent retries" +def test_pull_request_get_returns_checks_on_a_fresh_runtime(tmp_path, monkeypatch): + # Regression: build_runtime yields check_surface=None, and the first tool + # call an agent makes may be pull_request_get (no prior check_list to lazily + # initialise the surface). The result must NOT be call-order-dependent — a PR + # with recorded checks must report them, or a governance agent is told a PR is + # clean when checks exist and may be failing. + checks = CheckSurface(f"sqlite:///{tmp_path / 'checks.db'}") + checks.record( + CheckRun( + check_name="unit", + run_id="run-1", + commit_sha="abc123", + outcome=CheckOutcome.FAIL, + pr=7, + ran_against="abc123", + ) + ) + # The lazy _checks() builder resolves the DB from LEGIS_CHECK_DB, exactly as a + # deployed server does — so the surface is uninitialised but reachable. + monkeypatch.setenv("LEGIS_CHECK_DB", f"sqlite:///{tmp_path / 'checks.db'}") + pulls = PullSurface(f"sqlite:///{tmp_path / 'pulls.db'}") + pulls.record( + PullRequest( + number=7, + title="Feature", + base="main", + head="feature", + state=PullRequestState.OPEN, + url="https://example.test/pr/7", + ) + ) + # Fresh runtime: check_surface left at its build_runtime default (None). 
+ runtime, _store = _runtime(tmp_path, check_surface=None) + runtime.pull_surface = pulls + + responses = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": { + "name": "pull_request_get", + "arguments": {"number": "7"}, + }, + }, + ), + runtime, + ) + + pr = responses[0]["result"]["structuredContent"] + assert pr["number"] == 7 + assert pr["checks"][0]["check_name"] == "unit" + assert pr["checks"][0]["outcome"] == "fail" + + def test_check_list_reads_recorded_checks_by_commit_and_pr(tmp_path): checks = CheckSurface(f"sqlite:///{tmp_path / 'checks.db'}") checks.record( From fe50792a063f808158832ae6494122e61e6284c7 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 19:42:46 +1000 Subject: [PATCH 18/72] docs(canonical): correct Q-L4 note clause + add non-ASCII regression test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit A code-review re-flagged canonical_json(ensure_ascii=False) as a cross-tool HMAC hazard ("non-ASCII findings fail verification"). Disproven: Wardline's artifact signer (wardline/core/legis.py) is a deliberate byte-for-byte Python replica using the identical ensure_ascii=False, pinned by a golden HMAC vector from the real legis signer plus an "é" canonicalization test. Both sides are the same Python json.dumps call, so non-ASCII round-trips and verifies — ensure_ascii=False is what makes them match, not a hazard. The Q-L4 note's *conclusion* (RFC-8785 deferral) was correct, but one clause was inaccurate: "every hash is produced and checked in-process". Wardline produces the artifact_signature out-of-process, cross-repo — it is cross-repo and cross-process, but NOT cross-language, which is the RFC-8785 trigger. Rewrite the clause to name the real cross-repo Python verifier and separate the two guarantees (the ASCII-only cross-impl golden vector vs. 
each side's own non-ASCII unit test); the cross-impl non-ASCII payload is a Wardline-side follow-up. Add a non-ASCII canonical test on the legis side (it had none) mirroring Wardline's — locks legis's own ensure_ascii=False output. Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/canonical.py | 35 +++++++++++++++++++++++++---------- tests/test_canonical.py | 16 ++++++++++++++++ 2 files changed, 41 insertions(+), 10 deletions(-) diff --git a/src/legis/canonical.py b/src/legis/canonical.py index 636a566..7816473 100644 --- a/src/legis/canonical.py +++ b/src/legis/canonical.py @@ -4,16 +4,31 @@ a future hardening (elspeth uses RFC 8785); legis should converge there before the protected cell ships cryptographic guarantees (see ADR-0001 / ADR-0002). -Q-L4 deferral (assessed 2026-06-06): RFC-8785 is gated on "when cross-language -verification is needed." No current consumer verifies a legis hash from a -non-Python runtime — every hash is produced and checked in-process, and -``content_hash`` always derives bytes via ``.encode("utf-8")``, so the -``ensure_ascii=False`` byte output is deterministic for legis's single-language -use today. Because this is the single canonicalization choke point, the RFC-8785 -upgrade stays a one-file change for the day a cross-language verifier lands. The -companion Q-L5 fingerprint reconciliation (decorator.py / boundary_scan.py) is -independent and is done — those fingerprints are Python ``ast.dump`` output, not -cross-language JSON, so RFC-8785 does not apply to them. +Q-L4 deferral (assessed 2026-06-06; clause corrected 2026-06-06): RFC-8785 is +gated on "when cross-language verification is needed." One consumer verifies a +hash this module did NOT produce — ``wardline/ingest.verify_wardline_artifact`` +checks the ``artifact_signature`` Wardline computes in its OWN repo/process over +``canonical_json(scan-minus-signature)``. 
That is genuinely cross-repo and +cross-process, but it is NOT cross-language: Wardline's signer +(``wardline/src/wardline/core/legis.py``) is a deliberate byte-for-byte Python +replica using the same ``ensure_ascii=False`` params. Two guarantees back this, +and they are NOT the same: a golden HMAC vector captured from the real legis +signer is the *cross-impl* pin (it proves the two signers agree byte-for-byte — +but its payload is ASCII-only today); a separate ``"é"`` canonicalization unit +test on each side proves that side preserves a non-ASCII char as the literal +byte rather than a ``\\uXXXX`` escape. Because both serializers are the identical +Python ``json.dumps`` call, non-ASCII findings round-trip and verify — the +``ensure_ascii=False`` choice is what makes them match, not a hazard. The +*cross-impl non-ASCII* case is therefore guaranteed by construction but not yet +pinned by a golden vector; doing so (a non-ASCII payload in the shared golden +HMAC vector) is a Wardline-side follow-up, because that vector lives in +Wardline's repo and only Wardline's repo can detect Wardline drifting. RFC-8785 +is needed only the day a *non-Python* verifier lands; because this is the single +canonicalization choke point, that upgrade stays a one-file change. The +companion Q-L5 fingerprint +reconciliation (decorator.py / boundary_scan.py) is independent and is done — +those fingerprints are Python ``ast.dump`` output, not cross-language JSON, so +RFC-8785 does not apply to them. 
""" from __future__ import annotations diff --git a/tests/test_canonical.py b/tests/test_canonical.py index 100c2c2..e88cfb0 100644 --- a/tests/test_canonical.py +++ b/tests/test_canonical.py @@ -17,3 +17,19 @@ def test_content_hash_is_stable_and_hex(): def test_canonical_json_rejects_non_standard_float_values(): with pytest.raises(ValueError): canonical_json({"bad": float("nan")}) + + +def test_canonical_json_preserves_non_ascii(): + # ``ensure_ascii=False`` is a deliberate, load-bearing choice: a Wardline + # ``artifact_signature`` is an HMAC over these exact bytes, and Wardline's + # signer (wardline/core/legis.py) is a byte-for-byte Python replica using the + # same params. A non-ASCII finding message must therefore serialise to the + # literal character, not a ``\\uXXXX`` escape, or the cross-tool signature + # would diverge. This locks legis's own output; the cross-impl pin lives in + # Wardline's golden HMAC vector. Mirrors Wardline's + # ``test_canonical_json_is_sorted_tight_unicode``. + assert canonical_json({"b": 1, "a": "é"}) == '{"a":"é","b":1}' + # Round-trips through the UTF-8 encode content_hash uses. + assert canonical_json({"msg": "café—naïve"}).encode("utf-8").decode("utf-8") == ( + '{"msg":"café—naïve"}' + ) From 9cb0ff98793bc2ae00d98d5e38a49d53b2d31c60 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 19:42:55 +1000 Subject: [PATCH 19/72] docs(changelog): backfill rc2/rc3 entries + complete rc4 The CHANGELOG jumped rc1 -> rc4 (the rc2 and rc3 tags had no entries) and the rc4 entry documented only the dirty-tree path + lint, omitting the headline install instruction-injection system, the table-driven MCP dispatch (Q-L8), and the Q-L5/Q-M5/Q-L6 fixes. 
- rc2: the MCP stdio surface (WP-M2/M3, verified run_jsonrpc present at the v1.0.0rc2 tag), deployable OpenRouter judge, Filigree closure-gate, git rename feed, policy-boundary honesty gate, Clarion->Loomweave/Loom->Weft rebrand, PyPI publishing. - rc3: the Q-* audit-remediation series (single-secret writer scope, judge advisory-only on protected, Weft-component HMAC transport, fail-closed policy-cell config, etc.). - rc4: added the self-install system, table-driven MCP dispatch, and the Q-L5/Q-M5/Q-L6 fixes + the pull_request_get checks fix from this branch. Fixed the link refs (rc2/rc3 were missing; all pointed at a placeholder URL; repo org corrected to foundryside-dev). rc2/rc3 entries are reconstructed from git history. Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 128 ++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 126 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index dafc9aa..e8dbec0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,23 @@ versions per [PEP 440](https://peps.python.org/pep-0440/) / ## [1.0.0rc4] — 2026-06-06 ### Added +- **Self-install (`legis install`)** — legis now stands itself up like its + siblings: it injects a lean, versioned agent-orientation block into CLAUDE.md / + AGENTS.md, installs the `legis-workflow` skill pack (Claude + Codex), registers + a `SessionStart` hook, and extends `.gitignore` with the local config surface + (`.legis/`, `legis.yaml`). The block carries a content-hashed, version-pinned + marker (``); a drift check + re-injects it when either the bundled content or the package version changes. + Two triggers keep it fresh — the SessionStart hook (`legis session-context`) + and a best-effort refresh on `legis mcp` boot, the latter closing the + Codex-only-repo gap a hook-only approach leaves open. 
Mirrors filigree's + inject/replace/append install mechanism (atomic write, symlink rejection, + idempotent hook registration), right-sized for legis; the lean block + + skill-pack split keeps the injected context small while the skill carries the + full CLI + MCP-tool reference. Design spec: + `docs/superpowers/specs/2026-06-06-legis-instruction-injection-design.md`. + (legis-0127b66; hardening — skill swap, hook upgrade, gitignore, nested-corrupt + settings — in legis-b245710.) - **Dirty-tree dev path** — `verify_wardline_artifact` now recognises the unsigned `dirty: true` dev artifact emitted by `wardline scan --format legis --allow-dirty`. In the keyless posture it governs but records the marker @@ -21,11 +38,116 @@ versions per [PEP 440](https://peps.python.org/pep-0440/) / signature, so the clean-tree signing guarantee is intact. (legis-d731c760c5, legis-7e85e8e7ba; upstream wardline `--allow-dirty`.) +### Changed +- **Table-driven MCP dispatch (Q-L8)** — `call_tool` now routes through a tool + table instead of an if/elif ladder, and the stdio server bounds each stdin + line so a malformed client cannot stream unbounded input. Behavior-preserving. +- **Release CI gates** — the coverage floor is raised to 88% with a `ruff` lint + gate added (Q-L7), live Loomweave conformance is now non-optional for releases + (no silent skip when the oracle is down), and the Filigree client's transport / + error branches are covered. + ### Fixed +- **Fingerprint reconciliation + RFC-8785 deferral (Q-L5 / Q-L4)** — the policy + gate and the static boundary scanner now extract the same fingerprint (they had + diverged); the RFC-8785 canonical-JSON upgrade is explicitly deferred (its + trigger is a *non-Python* verifier, and the one cross-tool verifier — Wardline — + is a byte-for-byte Python replica pinned by a golden vector). 
+- **AuditStore batch read-free invariant (Q-M5)** — the batch append path is + guarded against issuing a read mid-batch, with a regression test pinning the + three-layer append-only enforcement. +- **Capability-latch TTL revalidation (Q-L6)** — the SEI capability latch is + TTL-revalidated rather than cached indefinitely, and `content_hash` is + type-checked at its call sites. - **Lint** — cleared the remaining `ruff` findings in the test suite (unused imports, mid-file imports hoisted to module top, and `# noqa: F821` on the honesty-gate fixture functions whose free `handler` name is fingerprinted by source, not executed). `ruff check src tests` is now clean. +- **`pull_request_get` reports recorded checks unconditionally** — the tool no + longer short-circuits to an empty `checks` list on a fresh runtime whose check + surface has not yet been lazily initialised. A PR's CI outcomes are now + call-order-independent, so a governance agent can never be told a PR is clean + when failing checks exist. + +## [1.0.0rc3] — 2026-06-06 + +Audit remediation: the `Q-*` series hardening the governance, transport, and +read paths surfaced by the rc2 architecture analysis. + +### Changed +- **Service layer is the one path to governance decisions (Q-H2)** — the FastAPI + and MCP adapters both drive `legis.service`; no decision logic lives in a route + closure. +- **Weft-component HMAC on the Filigree transport (Q-M4)** — the Filigree binding + hop is authenticated, and the wire carries the canonical signed bytes (signing + and transport agree byte-for-byte). +- **Recorded check/PR facts labelled unauthenticated (Q-M2 / Q-M4)** — `Check` + and `PullRequest` records carry an explicit unauthenticated provenance label; + legis never presents an unsigned upstream fact as verified. +- **Core modules typed against the `AppendOnlyStore` protocol (Q-L3)** — the + governance modules depend on the append-only contract, not a concrete store. 
+ +### Fixed +- **Single-secret mode is writer-scoped (Q-H1)** — a single shared secret grants + writer scope only; operator force-past stays an explicit opt-in, never implied. +- **LLM judge is advisory-only on protected policies (Q-H3)** — on a protected + cell the judge cannot clear a verdict; the protected gate decides. +- **`verify_integrity` fails on non-finite-float tamper (Q-M3)** — a record + carrying a NaN/Inf that survives decode now fails integrity verification rather + than passing silently. +- **Fail closed when policy-cell config is absent (Q-M7)** — a missing cell + configuration is a block, not a default-allow. +- **Honesty gate requires the boundary result as the assertion subject (Q-M8)** — + the static policy-boundary gate cannot be satisfied by an unrelated assertion. +- **Same-cell batch routing is atomic (Q-M5)** — a batch routed into one cell + commits or fails as a unit. +- **Read paths hardened against malformed `entity_key` (Q-L1 / Q-L2)** — the + governance read surfaces reject a malformed locator instead of raising. +- **Source-binding contract clarified and signed status proven (Q-M1 / Q-M6)** — + the Filigree binding-availability contract is decided and documented + (ADR-0003). +- Declared the `pydantic` runtime dependency explicitly. + +## [1.0.0rc2] — 2026-06-06 + +The agent-facing MCP surface, the deployable LLM judge, and the sibling +integration surfaces (Filigree closure-gate, Loomweave git-rename feed) that rc1 +listed as not-yet-built. + +### Added +- **MCP stdio surface (WP-M2 / WP-M3)** — the ratified agent tool catalog is + loaded and callable over an MCP stdio server: the policy-cell registry and + `policy_explain` contract (WP-M2), the callable tool catalog with store/registry + flags (WP-M3), plus the `git_rename_feed_get` and `filigree_closure_gate_get` + tools. 
+- **Deployable LLM judge** — an OpenRouter judge client behind the `LLMClient` + seam, wired into both the API and MCP runtimes via deployable judge + configuration flags. +- **Filigree closure-gate** — a governance decision function exposed over + `GET /filigree/closure-gate` and the `filigree_closure_gate_get` MCP tool, with + a verified `get_by_issue_id` on the `BindingLedger`. +- **Git rename feed** — a Clarion/Loomweave-ready rename-feed builder with + working-tree rename detection on `GitSurface`, exposed over `GET /git/rename-feed` + and the `git_rename_feed_get` MCP tool; the feed contract is locked. +- **Static policy-boundary honesty gate** — a static scanner plus the + `legis policy-boundary-check` CLI command, enforced in CI; the static scanner + is converged onto the same runtime evidence gate. +- **PyPI Trusted Publishing** — a release workflow and package metadata for + publishing to PyPI. + +### Changed +- **Rebrand Clarion→Loomweave and Loom→Weft** across legis (identifiers, docs, + and config references). +- **MCP idempotency replays scoped** so a replayed call resolves against its own + prior result, not a sibling's. + +### Fixed +- **Ingest accepts realistic scans** — the over-strict Wardline ingest validator + was relaxed to accept the diagnostics a real scan carries while keeping the + trust-grammar projection. +- **CLI fails closed on protected override-rate trails** — a missing or + unverifiable protected trail exits non-zero rather than reporting a clean rate. +- Hardened the governance audit boundaries with regression coverage. ## [1.0.0rc1] — 2026-06-03 @@ -71,5 +193,7 @@ WP-M1 service-layer extraction, consolidated behind a stable version. (Filigree signature column, live-Loomweave oracle + HMAC auth, operative git-rename feed) remain. 
-[1.0.0rc4]: https://peps.python.org/pep-0440/ -[1.0.0rc1]: https://peps.python.org/pep-0440/ +[1.0.0rc4]: https://github.com/foundryside-dev/legis/compare/v1.0.0rc3...HEAD +[1.0.0rc3]: https://github.com/foundryside-dev/legis/compare/v1.0.0rc2...v1.0.0rc3 +[1.0.0rc2]: https://github.com/foundryside-dev/legis/releases/tag/v1.0.0rc2 +[1.0.0rc1]: https://github.com/foundryside-dev/legis/releases/tag/v1.0.0rc1 From b4a59acbcc089c5c95d70feb9241a601ceed2f94 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 21:58:02 +1000 Subject: [PATCH 20/72] fix(cli): log best-effort instruction-refresh failures on MCP boot MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The boot-time refresh swallowed all exceptions with a bare `pass` (the broad catch is justified — boot must never break the server), but it was silent while the sibling SessionStart path (hooks.generate_session_context) logs. Add a logger.debug(..., exc_info=True) so a recurring refresh failure is diagnosable without breaking boot. 
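As a rough illustration of the pattern this message describes (a broad catch that still leaves a trace), the sketch below wraps an arbitrary callable. The helper name `refresh_best_effort`, the logger name, and the boolean return value are assumptions added so the behaviour is observable; the function in cli.py returns nothing.

```python
import logging

logger = logging.getLogger("best_effort_sketch")  # hypothetical logger name


def refresh_best_effort(refresh) -> bool:
    """Run ``refresh``; never propagate its failure, but record it at DEBUG."""
    try:
        refresh()
    except Exception:  # deliberately broad: the caller must keep booting
        # exc_info=True attaches the traceback, so a recurring failure can be
        # diagnosed by turning on debug logging.
        logger.debug("Best-effort refresh failed", exc_info=True)
        return False
    return True
```

Logging at DEBUG keeps normal boots quiet while still giving an operator something to look at when the refresh fails repeatedly.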
Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/cli.py | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/legis/cli.py b/src/legis/cli.py index 5b2e5e1..5fb1690 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -1,5 +1,6 @@ import argparse import json +import logging import sys from pathlib import Path @@ -11,6 +12,8 @@ from legis.policy.boundary_scan import scan_policy_boundaries from legis.store.audit_store import AuditStore +logger = logging.getLogger(__name__) + def _add_judge_flags(parser: argparse.ArgumentParser) -> None: parser.add_argument( @@ -280,7 +283,9 @@ def _refresh_instructions_best_effort() -> None: for message in refresh_instructions(Path.cwd()): print(message, file=sys.stderr) except Exception: # noqa: BLE001 (boot refresh must never break the server) - pass + # Best-effort: never break the server, but don't vanish silently either — + # the sibling SessionStart path (hooks.generate_session_context) logs too. + logger.debug("Best-effort instruction refresh on MCP boot failed", exc_info=True) def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: From 5ac24991698dd24a642c4e0447cf4f964a6e8dfb Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 21:58:03 +1000 Subject: [PATCH 21/72] chore(hooks): drop dead _build_instructions_block re-export MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The re-export was annotated "(kept for symmetry / tests)", but no test imports it via legis.hooks — tests/test_install.py imports it straight from legis.install. Likely residue of the abandoned INSTRUCTIONS constant. Remove the unused import; every other name imported here is used by refresh_instructions. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/hooks.py | 1 - 1 file changed, 1 deletion(-) diff --git a/src/legis/hooks.py b/src/legis/hooks.py index 62dfaa7..cec69ea 100644 --- a/src/legis/hooks.py +++ b/src/legis/hooks.py @@ -21,7 +21,6 @@ from legis.install import ( INSTRUCTIONS_MARKER, SKILL_NAME, - _build_instructions_block, # noqa: F401 (kept for symmetry / tests) _get_skills_source_dir, _marker_token, _skill_tree_fingerprint, From a632541d03c7343ab5863cf4f3458d480cee6409 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 21:58:16 +1000 Subject: [PATCH 22/72] fix(mcp): bound stdin reads by bytes, not characters MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The stdin framing read passed LEGIS_MCP_MAX_REQUEST_BYTES (16 MiB) as the limit to TextIO.readline(), which counts CHARACTERS. A line of multibyte UTF-8 that fits the character cap could exceed the nominal byte cap up to ~4x — fail-safe (still bounded) but the limit did not mean what its name promises. Keep the char-capped readline for memory safety (a decoded str holds <=4 bytes per char, so the in-memory read stays bounded) and reject a complete line whose UTF-8 encoding exceeds the byte budget. The existing truncate-and-drain path for a too-long line is unchanged; the -32700 "maximum size ... bytes" message is now accurate. Test pins a multibyte record that is under the char cap but over the byte cap, and confirms the following record stays framed. 
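The character-versus-byte gap can be seen with the standard library alone. The snippet below is an illustrative sketch using a 400-unit cap picked for readability, not the 16 MiB default; it relies only on `io.StringIO.readline(size)` counting characters and on "中" encoding to three UTF-8 bytes.

```python
import io

CAP = 400  # illustrative cap; stands in for the configured byte limit

record = "中" * 200 + "\n"           # 200 three-byte characters plus a newline
stream = io.StringIO(record)

line = stream.readline(CAP + 1)      # for text streams, size counts characters
chars = len(line)                    # characters actually read
utf8_bytes = len(line.encode("utf-8"))

over_char_cap = chars > CAP          # the old, character-based test
over_byte_cap = utf8_bytes > CAP     # the byte-based test the limit's name implies
```

Here the record passes the character test yet exceeds the byte budget, which is the case the added byte check is meant to reject.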
Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/mcp.py | 35 ++++++++++++++++++++++------------- tests/mcp/test_server.py | 19 +++++++++++++++++++ 2 files changed, 41 insertions(+), 13 deletions(-) diff --git a/src/legis/mcp.py b/src/legis/mcp.py index 459d595..d1876d2 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -1187,32 +1187,41 @@ def handle_request(request: dict[str, Any], runtime: McpRuntime) -> dict[str, An return {"jsonrpc": "2.0", "id": request_id, "result": result} -def _read_bounded_line(stream: TextIO, max_chars: int) -> tuple[str, bool]: - """Read one newline-terminated record, bounded to ``max_chars``. +def _read_bounded_line(stream: TextIO, max_bytes: int) -> tuple[str, bool]: + """Read one newline-terminated record, bounded to ``max_bytes`` UTF-8 bytes. Returns ``(line, overflow)``. ``overflow`` is True when the record exceeded - the bound — the remainder of that over-long line is then drained to the next - newline so framing stays aligned for the following request. Returns - ``("", False)`` at EOF. ``readline(max_chars + 1)`` stops at a newline OR the - size cap, so a record longer than the bound comes back without a trailing - newline — the signal we key on. + the bound. ``readline(max_bytes + 1)`` caps the *character* read — a decoded + ``str`` holds at most 4 bytes per char, so this keeps the in-memory read + bounded — and is the cheap first gate: a record longer than the cap in + characters comes back without a trailing newline, so its physical remainder + is drained to the next newline to keep framing aligned. A record that fits in + characters but whose UTF-8 encoding still exceeds ``max_bytes`` (multibyte + content) is rejected too, so the limit means bytes as its name promises. + Returns ``("", False)`` at EOF. 
""" - line = stream.readline(max_chars + 1) + line = stream.readline(max_bytes + 1) if line == "": return "", False - if len(line) > max_chars and not line.endswith("\n"): + if len(line) > max_bytes and not line.endswith("\n"): + # Truncated mid-record at the character cap: drain the rest of the + # physical line so the next read starts on a record boundary. while True: - extra = stream.readline(max_chars + 1) + extra = stream.readline(max_bytes + 1) if extra == "" or extra.endswith("\n"): break return line, True + if len(line.encode("utf-8")) > max_bytes: + # Complete (newline-terminated) but over the byte budget; framing is + # already aligned past the newline, so no drain is needed. + return line, True return line, False def run_jsonrpc(input_stream: TextIO, output_stream: TextIO, runtime: McpRuntime) -> None: - max_chars = _max_request_bytes() + max_bytes = _max_request_bytes() while True: - line, overflow = _read_bounded_line(input_stream, max_chars) + line, overflow = _read_bounded_line(input_stream, max_bytes) if not line: break # EOF if overflow: @@ -1223,7 +1232,7 @@ def run_jsonrpc(input_stream: TextIO, output_stream: TextIO, runtime: McpRuntime "id": None, "error": { "code": -32700, - "message": f"request exceeds maximum size of {max_chars} bytes", + "message": f"request exceeds maximum size of {max_bytes} bytes", }, }, separators=(",", ":"), diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index 2c9bc26..3c0b64e 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -1557,3 +1557,22 @@ def test_max_request_bytes_env_override_and_fallback(monkeypatch): for bad in ("not-an-int", "0", "-5"): monkeypatch.setenv("LEGIS_MCP_MAX_REQUEST_BYTES", bad) assert _max_request_bytes() == _DEFAULT_MAX_REQUEST_BYTES + + +def test_read_bounded_line_enforces_bytes_not_chars(): + # The bound is named in BYTES; readline() counts characters. 
A record that + # fits the char count but whose UTF-8 encoding exceeds the cap (multibyte + # content) must still overflow — otherwise the byte limit could be exceeded + # ~4×. The record AFTER it must stay framed. + from legis.mcp import _read_bounded_line + + multibyte = "中" * 200 # 200 chars, 600 UTF-8 bytes — under 400 chars, over 400 bytes + stream = io.StringIO(f"{multibyte}\n" + '{"next":true}\n') + + line, overflow = _read_bounded_line(stream, 400) + assert overflow is True + assert line.startswith("中") + + nxt, nxt_overflow = _read_bounded_line(stream, 400) + assert nxt_overflow is False + assert nxt == '{"next":true}\n' From 129d0bbcde74e987a0eb0cc87212f66b31f13e00 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 21:58:16 +1000 Subject: [PATCH 23/72] test(wardline): pin allow-dirty fail-safe default + CI missing-provenance red Two untested branches surfaced by review: - LEGIS_WARDLINE_ALLOW_DIRTY was only ever tested with "1". The opt-in gates governing UNSIGNED dirty artifacts, so the strict `== "1"` parse must fail safe: pin that "0"/"true"/"yes"/"" all stay the typed amber SKIPPED_DIRTY_TREE skip, guarding against a future drift to truthiness. - The CI-posture (key configured, clean tree) missing-provenance-field branch in verify_wardline_artifact raised no test. Pin that a scan missing a required provenance field is a generic red before signature verification, not an amber skip. 
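The fail-safe property in the first bullet comes down to an exact string comparison. The sketch below is an assumption about the shape of that parse, not the code in legis: `allow_dirty_enabled` is a hypothetical helper, and only the environment variable name and the `== "1"` rule are taken from the message above.

```python
def allow_dirty_enabled(env: dict) -> bool:
    """Return True only for the exact opt-in value "1".

    Truthy-looking values ("true", "yes", "2") and an unset or empty variable
    all leave the opt-in off, so unsigned dirty artifacts are not governed by
    accident.
    """
    return env.get("LEGIS_WARDLINE_ALLOW_DIRTY") == "1"
```

A truthiness-based parse such as `bool(env.get(...))` would treat "0" as enabled, which is the drift the new test is meant to catch.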
Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/api/test_combinations_api.py | 17 +++++++++++++++++ tests/wardline/test_ingest.py | 11 +++++++++++ 2 files changed, 28 insertions(+) diff --git a/tests/api/test_combinations_api.py b/tests/api/test_combinations_api.py index ce7c74c..62edb21 100644 --- a/tests/api/test_combinations_api.py +++ b/tests/api/test_combinations_api.py @@ -607,6 +607,23 @@ def test_scan_results_dirty_tree_governs_under_devmode_optin(tmp_path, monkeypat assert "artifact_signature" not in wardline +def test_scan_results_devmode_optin_is_strict_and_fails_safe(tmp_path, monkeypatch): + # The dev-mode opt-in is `LEGIS_WARDLINE_ALLOW_DIRTY == "1"` exactly. A + # governing knob that gates UNSIGNED artifacts must fail safe: any value other + # than "1" (truthy-looking "true", "0", "yes") must NOT govern — it stays the + # typed amber skip. Pins the strict parse against a future drift to truthiness. + monkeypatch.setenv("LEGIS_WARDLINE_ARTIFACT_KEY", "wardline-key") + for value in ("0", "true", "True", "yes", "2", ""): + monkeypatch.setenv("LEGIS_WARDLINE_ALLOW_DIRTY", value) + c = _client(tmp_path) + resp = c.post("/wardline/scan-results", + json={"cell": "surface_only", "agent_id": "a", + "scan": _dirty_wardline_scan()}) + assert resp.status_code == 200, value + assert resp.json()["outcome"] == "SKIPPED_DIRTY_TREE", value + assert resp.json()["routed"] == [], value + + def test_scan_results_single_cell_still_works(tmp_path): c = _client(tmp_path) body = {"cell": "surface_override", "agent_id": "agent-1", "scan": {"findings": [ diff --git a/tests/wardline/test_ingest.py b/tests/wardline/test_ingest.py index 06dc028..3844f75 100644 --- a/tests/wardline/test_ingest.py +++ b/tests/wardline/test_ingest.py @@ -223,3 +223,14 @@ def test_signed_dirty_artifact_verifies_normally(): scan = _artifact(dirty=True, signed=True) prov = verify_wardline_artifact(scan, _KEY, allow_dirty=False) assert prov["artifact_status"] == "verified" + + +def 
test_ci_posture_missing_provenance_field_is_red(): + # Key configured, clean (not dirty), but a required provenance field is + # absent -> generic red BEFORE signature verification is even attempted. This + # is the non-dirty CI branch that demands signed scanner/rule-set/commit/tree + # provenance; a scan missing any of them is malformed, not an amber skip. + scan = _artifact() # all four provenance fields present, unsigned + del scan["tree_sha"] + with pytest.raises(WardlinePayloadError, match="missing required field"): + verify_wardline_artifact(scan, _KEY) From b100d45cbaacae1982f349d1843e45a5d769b5ab Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 22:29:57 +1000 Subject: [PATCH 24/72] fix(audit): surface SQLite PRAGMA failures instead of swallowing them MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The audit-store connect listener applied journal_mode=WAL/synchronous/busy_timeout inside `try: ... except Exception: pass`. The real silent failure was never even an exception: PRAGMA journal_mode=WAL does NOT raise when WAL is unavailable (read-only mount, some network filesystems, in-memory DBs) — it returns the journal mode actually in force. So the connection ran without WAL and surfaced much later as opaque "database is locked" under concurrency, in a governance-critical store. Extract the listener body into a module-level _apply_sqlite_pragmas() (module-level to avoid an engine->listener->self reference cycle, and testable without constructing a store — in-memory AuditStore is unconstructable under NullPool). Log both failure channels at WARNING: a raising PRAGMA (with exc_info) and the silent WAL-not-applied case (by capturing the journal_mode return value). Keep best-effort semantics — a PRAGMA failure must not break connect. Delete the stale force_immediate_transaction comment (finding 2). 
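The silent channel can be reproduced with the standard library alone. The sketch below is not the listener from this patch; it only assumes an in-memory database, where SQLite ignores the request for WAL and reports the mode that stays in force.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
try:
    # No exception is raised here; the PRAGMA returns one row holding the
    # journal mode that is actually in force after the request.
    row = conn.execute("PRAGMA journal_mode=WAL").fetchone()
    journal_mode = row[0] if row else None
finally:
    conn.close()

# Same check the patch applies: anything other than "wal" means WAL did not land.
wal_applied = journal_mode is not None and str(journal_mode).lower() == "wal"
```

Because nothing raises, an `except Exception: pass` around the PRAGMA never sees this case; comparing the returned mode is the only way to notice it.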
Tests: WAL actually lands on the file (external sqlite3 reads journal_mode=wal) + busy_timeout=5000 on a listener-fired connection; WAL-not-applied warning fires; raising PRAGMA logs with exc_info and still closes the cursor. Closes legis-5a85c48c41 Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/store/audit_store.py | 55 +++++++++++++++---- tests/store/test_audit_store.py | 93 ++++++++++++++++++++++++++++++++- 2 files changed, 136 insertions(+), 12 deletions(-) diff --git a/src/legis/store/audit_store.py b/src/legis/store/audit_store.py index b3e550f..deeca4f 100644 --- a/src/legis/store/audit_store.py +++ b/src/legis/store/audit_store.py @@ -16,6 +16,7 @@ import hashlib import json +import logging import threading from collections.abc import Iterator from contextlib import contextmanager @@ -37,9 +38,51 @@ from legis.canonical import canonical_json, content_hash +logger = logging.getLogger(__name__) + GENESIS = "0" * 64 +def _apply_sqlite_pragmas(dbapi_connection: Any, url: str) -> None: + """Apply the durability/concurrency PRAGMAs to a freshly-opened connection. + + Best-effort: a PRAGMA failure must not break connection setup (the store is + still usable without WAL), but it must NOT vanish silently either. Two + distinct failure channels are surfaced: + + * An exception while issuing a PRAGMA → logged with ``exc_info``. + * WAL silently not taking effect → ``PRAGMA journal_mode=WAL`` does *not* + raise when WAL is unavailable (read-only mount, some network filesystems, + in-memory DBs); it returns the journal mode actually in force. The old + ``except Exception: pass`` never caught this most-likely case, so the + connection ran without WAL and the symptom surfaced much later as an + opaque "database is locked" under concurrency. Detect and log it here. 
+ """ + cursor = dbapi_connection.cursor() + try: + journal_row = cursor.execute("PRAGMA journal_mode=WAL").fetchone() + cursor.execute("PRAGMA synchronous=NORMAL") + cursor.execute("PRAGMA busy_timeout=5000") + journal_mode = journal_row[0] if journal_row else None + if journal_mode is not None and str(journal_mode).lower() != "wal": + logger.warning( + "audit store SQLite did not enter WAL mode (journal_mode=%r, " + "url=%s); concurrent appends may surface as opaque 'database is " + "locked' errors instead of waiting", + journal_mode, + url, + ) + except Exception: # noqa: BLE001 (PRAGMA failure must not break connect) + logger.warning( + "audit store failed to apply SQLite PRAGMAs (url=%s); connection " + "falls back to defaults (no WAL / default busy_timeout)", + url, + exc_info=True, + ) + finally: + cursor.close() + + @dataclass(frozen=True) class AuditRecord: seq: int @@ -68,17 +111,7 @@ def __init__(self, url: str) -> None: @event.listens_for(self._engine, "connect") def set_sqlite_pragma(dbapi_connection, connection_record): if "sqlite" in url: - cursor = dbapi_connection.cursor() - try: - cursor.execute("PRAGMA journal_mode=WAL") - cursor.execute("PRAGMA synchronous=NORMAL") - cursor.execute("PRAGMA busy_timeout=5000") - except Exception: - pass - finally: - cursor.close() - - # Remove the global force_immediate_transaction event listener to prevent locking on read-only queries. 
+ _apply_sqlite_pragmas(dbapi_connection, url) self._md = MetaData() self._log = Table( diff --git a/tests/store/test_audit_store.py b/tests/store/test_audit_store.py index 7c9fa85..4c7de5d 100644 --- a/tests/store/test_audit_store.py +++ b/tests/store/test_audit_store.py @@ -1,8 +1,9 @@ +import logging import sqlite3 import pytest -from legis.store.audit_store import AuditStore +from legis.store.audit_store import AuditStore, _apply_sqlite_pragmas def db_path(tmp_path): @@ -128,6 +129,96 @@ def run_appends(tid, count): assert s.verify_integrity() is True +def test_pragma_wal_actually_applied_on_file(tmp_path): + # The connect listener must put the on-disk DB into WAL mode. journal_mode is + # a persistent file-header property, so an *external* connection that never + # ran our listener still observes it — proof WAL truly applied to the file. + make_store(tmp_path) + conn = raw_conn(tmp_path) + try: + mode = conn.execute("PRAGMA journal_mode").fetchone()[0] + finally: + conn.close() + assert mode.lower() == "wal" + + +def test_pragma_busy_timeout_set_on_listener_connection(tmp_path): + # busy_timeout is per-connection (not persistent), so it must be read on a + # connection that went through the listener — i.e. one from the store engine. 
+ s = make_store(tmp_path) + with s._engine.connect() as conn: + timeout = conn.exec_driver_sql("PRAGMA busy_timeout").scalar() + assert timeout == 5000 + + +class _FakeCursor: + def __init__(self, journal_mode): + self._journal_mode = journal_mode + self.closed = False + + def execute(self, _sql): + return self + + def fetchone(self): + return (self._journal_mode,) + + def close(self): + self.closed = True + + +class _FakeConn: + def __init__(self, journal_mode): + self.cursor_obj = _FakeCursor(journal_mode) + + def cursor(self): + return self.cursor_obj + + +class _RaisingCursor: + def __init__(self): + self.closed = False + + def execute(self, _sql): + raise sqlite3.OperationalError("PRAGMA rejected") + + def close(self): + self.closed = True + + +class _RaisingConn: + def __init__(self): + self.cursor_obj = _RaisingCursor() + + def cursor(self): + return self.cursor_obj + + +def test_apply_pragmas_warns_when_wal_not_applied(caplog): + # The silent failure the bare `except` never caught: PRAGMA journal_mode=WAL + # does NOT raise when WAL is unavailable — it returns the mode actually in + # force (e.g. 'delete'/'memory'). That must surface as a warning. + conn = _FakeConn("delete") + with caplog.at_level(logging.WARNING, logger="legis.store.audit_store"): + _apply_sqlite_pragmas(conn, "sqlite:///some.db") + assert any( + "wal" in r.getMessage().lower() for r in caplog.records + ), f"expected a WAL-not-applied warning; got {[r.getMessage() for r in caplog.records]}" + assert conn.cursor_obj.closed is True + + +def test_apply_pragmas_warns_with_exc_info_on_pragma_exception(caplog): + # A PRAGMA that genuinely raises must be logged (with exc_info), not swallowed, + # and the connection setup must still complete (cursor closed, no re-raise). 
+ conn = _RaisingConn() + with caplog.at_level(logging.WARNING, logger="legis.store.audit_store"): + _apply_sqlite_pragmas(conn, "sqlite:///some.db") + assert caplog.records, "expected a warning when PRAGMA application raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None + assert conn.cursor_obj.closed is True + + def test_verify_integrity_handles_non_finite_float_as_integrity_failure(tmp_path): # json.loads accepts Infinity/NaN, so the payload survives read_all's # decode guard, but content_hash -> canonical_json(allow_nan=False) raises From e77d6e474ee4e85b9ce30d43078a9075511ed197 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 22:45:24 +1000 Subject: [PATCH 25/72] refactor(types): convert stringly-typed outcome/status axes to str Enums MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Five outcome/status axes were bare strings where ingest.py already shipped WardlineSeverity(str, Enum) as the model. Convert each to a str,Enum — the member IS its wire string, so json.dumps/canonical_json emit byte-identical payloads (HMAC artifact-signature path unaffected: it signs the raw scan, not legis's provenance dict). - ScanOutcome (ROUTED / SKIPPED_DIRTY_TREE) — wardline ingest + scan_route boundaries (mcp.py, api/app.py). SKIPPED_DIRTY_TREE kept as a back-compat alias to the enum member. - ArtifactStatus (verified / dirty / unverified) — wardline/ingest.py. - IdentityResolutionStatus + LineageSnapshotStatus — resolver.py, with a __post_init__ bijection (alive None<->UNAVAILABLE, False<->NOT_ALIVE, True<->RESOLVED) so a self-contradictory frozen record is unrepresentable. Consumers updated: service/governance.py (dead getattr fallbacks dropped), governance/sei_backfill.py. 
- Suppressed (active / waived / suppressed / baselined / judged) — the field stays str on the wire-facing dataclass (validation timing/error-type unchanged); the enum is the vocabulary source of truth for the frozensets. Full suite + ruff + mypy green. Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/api/app.py | 3 +- src/legis/governance/sei_backfill.py | 20 +++++--- src/legis/identity/resolver.py | 66 +++++++++++++++++++++---- src/legis/mcp.py | 4 +- src/legis/service/governance.py | 15 ++---- src/legis/wardline/ingest.py | 69 ++++++++++++++++++++------ tests/identity/test_resolver.py | 73 +++++++++++++++++++++++++++- tests/service/test_governance.py | 17 +++++++ 8 files changed, 223 insertions(+), 44 deletions(-) diff --git a/src/legis/api/app.py b/src/legis/api/app.py index 17134d8..2c3086f 100644 --- a/src/legis/api/app.py +++ b/src/legis/api/app.py @@ -60,6 +60,7 @@ from legis.pulls.surface import PullSurface from legis.wardline.governor import WardlineCellPolicy from legis.wardline.ingest import ( + ScanOutcome, WardlineDirtyTreeError, WardlinePayloadError, WardlineSeverity, @@ -853,6 +854,6 @@ def wardline_scan_results(body: ScanResultsIn, actor: str = Depends(verify_write raise HTTPException(status_code=422, detail=f"invalid Wardline scan: {exc}") except ValueError as exc: raise HTTPException(status_code=409, detail=str(exc)) - return {"outcome": "ROUTED", "routed": routed} + return {"outcome": ScanOutcome.ROUTED, "routed": routed} return app diff --git a/src/legis/governance/sei_backfill.py b/src/legis/governance/sei_backfill.py index 60c2309..9024b7b 100644 --- a/src/legis/governance/sei_backfill.py +++ b/src/legis/governance/sei_backfill.py @@ -16,6 +16,7 @@ from legis.clock import Clock from legis.identity.loomweave_client import LoomweaveIdentity from legis.identity.entity_key import EntityKey +from legis.identity.resolver import IdentityResolutionStatus, LineageSnapshotStatus from legis.store.protocol import AppendOnlyStore, AuditRecordLike 
SEI_PREFIX = "loomweave:eid:" @@ -206,7 +207,7 @@ def _resolved_event( "alive": True, "content_hash": resolution.get("content_hash"), "lineage_snapshot": lineage_snapshot, - "identity_resolution_status": "resolved", + "identity_resolution_status": IdentityResolutionStatus.RESOLVED, "lineage_snapshot_status": lineage_status, }, "backfill": { @@ -226,7 +227,11 @@ def _unresolved_event( reason: str, ) -> dict[str, Any]: locator_key = EntityKey.from_dict(rec.payload["entity_key"]) - status = "invalid" if reason == "invalid" else "not_alive" + status = ( + IdentityResolutionStatus.INVALID + if reason == "invalid" + else IdentityResolutionStatus.NOT_ALIVE + ) return { "event": "SEI_BACKFILL_UNRESOLVED", "original_seq": rec.seq, @@ -239,7 +244,7 @@ def _unresolved_event( "loomweave": { "alive": False, "identity_resolution_status": status, - "lineage_snapshot_status": "not_applicable", + "lineage_snapshot_status": LineageSnapshotStatus.NOT_APPLICABLE, }, "backfill": { "source": "pre_sei_locator", @@ -252,9 +257,12 @@ def _unresolved_event( def _lineage_snapshot( client: LoomweaveIdentity, sei: str -) -> tuple[dict[str, Any] | None, str]: +) -> tuple[dict[str, Any] | None, LineageSnapshotStatus]: try: lineage = client.lineage(sei) except Exception: - return None, "unavailable" - return {"length": len(lineage), "hash": content_hash(lineage)}, "verified" + return None, LineageSnapshotStatus.UNAVAILABLE + return ( + {"length": len(lineage), "hash": content_hash(lineage)}, + LineageSnapshotStatus.VERIFIED, + ) diff --git a/src/legis/identity/resolver.py b/src/legis/identity/resolver.py index f719a8a..3db5f25 100644 --- a/src/legis/identity/resolver.py +++ b/src/legis/identity/resolver.py @@ -11,6 +11,7 @@ import time from dataclasses import dataclass +from enum import Enum from typing import Any, Callable from legis.canonical import content_hash @@ -23,14 +24,54 @@ _DEFAULT_CAPABILITY_TTL_SECONDS = 300.0 +class IdentityResolutionStatus(str, Enum): + """The identity axis verdict 
(str,Enum — serializes as the bare string). + + ``INVALID`` is produced only by the SEI backfill path (it keys raw dicts, + not :class:`IdentityResolution`); the resolver itself emits only the other + three. + """ + + RESOLVED = "resolved" + NOT_ALIVE = "not_alive" + UNAVAILABLE = "unavailable" + INVALID = "invalid" + + +class LineageSnapshotStatus(str, Enum): + """The REQ-L-01 lineage-snapshot verdict (str,Enum — bare-string wire).""" + + VERIFIED = "verified" + UNAVAILABLE = "unavailable" + NOT_APPLICABLE = "not_applicable" + + @dataclass(frozen=True) class IdentityResolution: entity_key: EntityKey alive: bool | None # identity axis; None when no capability/decision content_hash: str | None # content axis; None when unavailable lineage_snapshot: dict[str, Any] | None # {"length": N, "hash": ...} or None - identity_resolution_status: str - lineage_snapshot_status: str + identity_resolution_status: IdentityResolutionStatus + lineage_snapshot_status: LineageSnapshotStatus + + def __post_init__(self) -> None: + # The identity axis and its status are two views of one fact — keep them + # from contradicting each other at construction. A "resolved" record with + # alive=False (or any other crossed pair) was representable before this + # guard; the invariant lived only in the construction sites. The bijection + # is exactly the three shapes the resolver actually builds. 
+ expected = { + None: IdentityResolutionStatus.UNAVAILABLE, + False: IdentityResolutionStatus.NOT_ALIVE, + True: IdentityResolutionStatus.RESOLVED, + }[self.alive] + if self.identity_resolution_status is not expected: + raise ValueError( + f"contradictory IdentityResolution: alive={self.alive!r} " + f"requires identity_resolution_status=" + f"{expected.value!r}, got {self.identity_resolution_status.value!r}" + ) class IdentityResolver: @@ -78,12 +119,17 @@ def _capability(self) -> bool: self._capable_checked_at = now return self._capable if self._capable is not None else False - def _snapshot(self, sei: str) -> tuple[dict[str, Any] | None, str]: + def _snapshot( + self, sei: str + ) -> tuple[dict[str, Any] | None, LineageSnapshotStatus]: try: lineage = self._client.lineage(sei) # type: ignore[union-attr] except Exception: - return None, "unavailable" - return {"length": len(lineage), "hash": content_hash(lineage)}, "verified" + return None, LineageSnapshotStatus.UNAVAILABLE + return ( + {"length": len(lineage), "hash": content_hash(lineage)}, + LineageSnapshotStatus.VERIFIED, + ) def resolve(self, locator: str) -> IdentityResolution: degraded = IdentityResolution( @@ -91,8 +137,8 @@ def resolve(self, locator: str) -> IdentityResolution: None, None, None, - "unavailable", - "not_applicable", + IdentityResolutionStatus.UNAVAILABLE, + LineageSnapshotStatus.NOT_APPLICABLE, ) if not self._capability(): return degraded @@ -110,8 +156,8 @@ def resolve(self, locator: str) -> IdentityResolution: False, None, None, - "not_alive", - "not_applicable", + IdentityResolutionStatus.NOT_ALIVE, + LineageSnapshotStatus.NOT_APPLICABLE, ) sei = res.get("sei") if not isinstance(sei, str) or not sei: @@ -127,6 +173,6 @@ def resolve(self, locator: str) -> IdentityResolution: True, content_hash_value, snapshot, - "resolved", + IdentityResolutionStatus.RESOLVED, snapshot_status, ) diff --git a/src/legis/mcp.py b/src/legis/mcp.py index d1876d2..5362f4d 100644 --- a/src/legis/mcp.py +++ 
b/src/legis/mcp.py @@ -55,7 +55,7 @@ from legis.service.wardline import route_wardline_scan from legis.store.audit_store import AuditStore from legis.wardline.governor import WardlineCellPolicy -from legis.wardline.ingest import WardlineDirtyTreeError, WardlineSeverity +from legis.wardline.ingest import ScanOutcome, WardlineDirtyTreeError, WardlineSeverity _AGENT_TOOLS = frozenset( @@ -987,7 +987,7 @@ def _tool_scan_route(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any return _tool_result( {"outcome": exc.reason, "routed": [], "detail": str(exc)} ) - return _tool_result({"outcome": "ROUTED", "routed": routed}) + return _tool_result({"outcome": ScanOutcome.ROUTED, "routed": routed}) def _tool_git_branch_list(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: diff --git a/src/legis/service/governance.py b/src/legis/service/governance.py index 2fc1582..e624ba7 100644 --- a/src/legis/service/governance.py +++ b/src/legis/service/governance.py @@ -51,20 +51,15 @@ def resolve_for_record( res = identity.resolve(locator) ext: dict = {} if res.alive is not None: - identity_status = getattr( - res, "identity_resolution_status", "resolved" if res.alive else "not_alive" - ) - lineage_status = getattr( - res, - "lineage_snapshot_status", - "verified" if res.lineage_snapshot is not None else "not_applicable", - ) + # Both status axes are mandatory str,Enum fields on IdentityResolution now, + # so read them directly — the old getattr fallbacks guarded a shape the + # type no longer permits. The members serialize as their bare strings. 
ext["loomweave"] = { "alive": res.alive, "content_hash": res.content_hash, "lineage_snapshot": res.lineage_snapshot, - "identity_resolution_status": identity_status, - "lineage_snapshot_status": lineage_status, + "identity_resolution_status": res.identity_resolution_status, + "lineage_snapshot_status": res.lineage_snapshot_status, } return res.entity_key, ext diff --git a/src/legis/wardline/ingest.py b/src/legis/wardline/ingest.py index c66f905..70a16d2 100644 --- a/src/legis/wardline/ingest.py +++ b/src/legis/wardline/ingest.py @@ -53,12 +53,33 @@ class WardlinePayloadError(ValueError): """A Wardline scan payload is not shaped like the trusted wire contract.""" -# A dirty working tree is not a malformed payload — it is "the dev environment -# is not ready for a signed artifact yet". wardline emits an UNSIGNED, dirty:true -# dev artifact for this case (signing stays clean-tree-only). legis classifies it -# as a typed amber/skipped state, NOT a generic red, so a harness can tell -# "commit first" apart from "legis/the scan is broken". -SKIPPED_DIRTY_TREE = "SKIPPED_DIRTY_TREE" +class ArtifactStatus(str, Enum): + """How far the Wardline artifact's provenance verified (str,Enum — the member + IS its bare-string wire value, so records serialize byte-identically).""" + + VERIFIED = "verified" + DIRTY = "dirty" + UNVERIFIED = "unverified" + + +class ScanOutcome(str, Enum): + """The ``scan_route`` boundary outcome (str,Enum — bare-string wire). + + ``ROUTED`` — findings were governed into the configured cell. A dirty working + tree is not a malformed payload — it is "the dev environment is not ready for + a signed artifact yet". wardline emits an UNSIGNED, ``dirty: true`` dev + artifact for this case (signing stays clean-tree-only); legis classifies it + as the typed amber ``SKIPPED_DIRTY_TREE`` state, NOT a generic red, so a + harness can tell "commit first" apart from "legis/the scan is broken". 
+ """ + + ROUTED = "ROUTED" + SKIPPED_DIRTY_TREE = "SKIPPED_DIRTY_TREE" + + +# Back-compat alias for the bare-string constant callers/tests imported before the +# enum existed; ``== "SKIPPED_DIRTY_TREE"`` still holds (str,Enum). +SKIPPED_DIRTY_TREE = ScanOutcome.SKIPPED_DIRTY_TREE class WardlineDirtyTreeError(Exception): @@ -117,8 +138,8 @@ def verify_wardline_artifact( strict boolean ``True`` because the scan dict is caller-controlled. """ fields = wardline_artifact_fields(scan) - provenance = { - "artifact_status": "unverified", + provenance: dict[str, Any] = { + "artifact_status": ArtifactStatus.UNVERIFIED, } for key in ARTIFACT_PROVENANCE_FIELDS: value = scan.get(key) @@ -132,7 +153,7 @@ def verify_wardline_artifact( if artifact_key is None: if is_dirty_dev_artifact: - provenance["artifact_status"] = "dirty" + provenance["artifact_status"] = ArtifactStatus.DIRTY return provenance if is_dirty_dev_artifact: @@ -144,7 +165,7 @@ def verify_wardline_artifact( "govern it unsigned in dev." ) return { - "artifact_status": "dirty", + "artifact_status": ArtifactStatus.DIRTY, **{key: value for key in ARTIFACT_PROVENANCE_FIELDS if isinstance(value := scan.get(key), str) and value}, } @@ -164,7 +185,7 @@ def verify_wardline_artifact( if not verify(fields, signature, artifact_key): raise WardlinePayloadError("Wardline artifact signature does not verify") return { - "artifact_status": "verified", + "artifact_status": ArtifactStatus.VERIFIED, **{key: scan[key] for key in ARTIFACT_PROVENANCE_FIELDS}, "artifact_signature": signature, } @@ -232,8 +253,28 @@ def from_wire(cls, d: Mapping[str, Any]) -> "WardlineFinding": # be able to silently dismiss a defect. Non-agent suppressions # (``baselined`` / ``judged``) are simply not active and carry no proof. Any # other state is malformed and rejected. 
-AGENT_SUPPRESSED: frozenset[str] = frozenset({"waived", "suppressed"}) -NON_AGENT_SUPPRESSED: frozenset[str] = frozenset({"baselined", "judged"}) +class Suppressed(str, Enum): + """The finding suppression-state vocabulary (str,Enum — bare-string wire). + + The ``suppressed`` field stays ``str`` on the wire-facing dataclass so the + validation timing is unchanged (any string is accepted off the wire; only a + *defect* with an out-of-vocabulary state is rejected, in ``active_defects``). + This enum is the single source of truth for the vocabulary — members compare + and hash equal to their strings, so the frozensets below match the bare + ``suppressed`` strings carried verbatim from the scan. + """ + + ACTIVE = "active" + WAIVED = "waived" + SUPPRESSED = "suppressed" + BASELINED = "baselined" + JUDGED = "judged" + + +AGENT_SUPPRESSED: frozenset[Suppressed] = frozenset({Suppressed.WAIVED, Suppressed.SUPPRESSED}) +NON_AGENT_SUPPRESSED: frozenset[Suppressed] = frozenset( + {Suppressed.BASELINED, Suppressed.JUDGED} +) def _has_suppression_proof(finding: Mapping[str, Any]) -> bool: @@ -272,7 +313,7 @@ def active_defects(scan: Mapping[str, Any]) -> list[WardlineFinding]: f = WardlineFinding.from_wire(raw) if f.kind != "defect": continue - if f.suppressed == "active": + if f.suppressed == Suppressed.ACTIVE: out.append(f) continue if f.suppressed in AGENT_SUPPRESSED: diff --git a/tests/identity/test_resolver.py b/tests/identity/test_resolver.py index 5a2e3cc..fc83088 100644 --- a/tests/identity/test_resolver.py +++ b/tests/identity/test_resolver.py @@ -1,5 +1,13 @@ +import pytest + from legis.canonical import content_hash -from legis.identity.resolver import IdentityResolver +from legis.identity.entity_key import EntityKey +from legis.identity.resolver import ( + IdentityResolution, + IdentityResolutionStatus, + IdentityResolver, + LineageSnapshotStatus, +) class FakeClient: @@ -45,6 +53,69 @@ def test_alive_sei_is_keyed_opaquely_with_two_axes(): assert 
res.lineage_snapshot_status == "verified" +# --- the str,Enum axes + the IdentityResolution construction invariant --- + + +def test_status_axes_are_str_enums_serializing_to_bare_strings(): + # str,Enum members ARE their wire string — comparison and serialization + # are byte-identical to the old bare strings (the whole compat argument). + assert IdentityResolutionStatus.RESOLVED == "resolved" + assert LineageSnapshotStatus.NOT_APPLICABLE == "not_applicable" + assert content_hash({"s": IdentityResolutionStatus.NOT_ALIVE}) == content_hash( + {"s": "not_alive"} + ) + + +def test_identity_resolution_rejects_contradictory_status_alive(): + # The sharpest case: a frozen record claiming "resolved" while alive is False + # is self-contradictory and must be unrepresentable at construction. + ek = EntityKey.from_locator("python:function:m.f") + with pytest.raises(ValueError): + IdentityResolution( + ek, + False, + None, + None, + IdentityResolutionStatus.RESOLVED, + LineageSnapshotStatus.NOT_APPLICABLE, + ) + with pytest.raises(ValueError): + IdentityResolution( + ek, + None, + None, + None, + IdentityResolutionStatus.NOT_ALIVE, + LineageSnapshotStatus.NOT_APPLICABLE, + ) + with pytest.raises(ValueError): + IdentityResolution( + ek, + True, + None, + None, + IdentityResolutionStatus.UNAVAILABLE, + LineageSnapshotStatus.NOT_APPLICABLE, + ) + + +def test_identity_resolution_accepts_the_three_consistent_shapes(): + ek = EntityKey.from_locator("python:function:m.f") + # alive None ↔ UNAVAILABLE, False ↔ NOT_ALIVE, True ↔ RESOLVED + IdentityResolution( + ek, None, None, None, + IdentityResolutionStatus.UNAVAILABLE, LineageSnapshotStatus.NOT_APPLICABLE, + ) + IdentityResolution( + ek, False, None, None, + IdentityResolutionStatus.NOT_ALIVE, LineageSnapshotStatus.NOT_APPLICABLE, + ) + IdentityResolution( + ek, True, "h", {"length": 1, "hash": "x"}, + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.VERIFIED, + ) + + def test_capability_absent_degrades_to_locator(): r = 
IdentityResolver(FakeClient(capable=False)) res = r.resolve("python:function:m.f") diff --git a/tests/service/test_governance.py b/tests/service/test_governance.py index f3a22e4..d69c597 100644 --- a/tests/service/test_governance.py +++ b/tests/service/test_governance.py @@ -6,6 +6,7 @@ from legis.enforcement.protected import ProtectedGate, TamperError from legis.enforcement.verdict import JudgeOpinion, Verdict from legis.identity.entity_key import EntityKey +from legis.identity.resolver import IdentityResolutionStatus, LineageSnapshotStatus from legis.service.errors import AuditIntegrityError, InvalidArgumentError from legis.service.governance import ( compute_override_rate, @@ -18,11 +19,27 @@ class _FakeResult: + # Mirrors IdentityResolution, including the two mandatory str,Enum status + # axes. Defaults derive from ``alive`` via the same bijection the real type + # now enforces in __post_init__, so a contradictory fake can't sneak through. def __init__(self, entity_key, alive, content_hash, lineage_snapshot): self.entity_key = entity_key self.alive = alive self.content_hash = content_hash self.lineage_snapshot = lineage_snapshot + self.identity_resolution_status = { + True: IdentityResolutionStatus.RESOLVED, + False: IdentityResolutionStatus.NOT_ALIVE, + None: IdentityResolutionStatus.UNAVAILABLE, + }[alive] + if alive: + self.lineage_snapshot_status = ( + LineageSnapshotStatus.VERIFIED + if lineage_snapshot is not None + else LineageSnapshotStatus.UNAVAILABLE + ) + else: + self.lineage_snapshot_status = LineageSnapshotStatus.NOT_APPLICABLE class _FakeIdentity: From 6361e0334853a27e040733934f9d1fcbf835d70f Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 22:52:50 +1000 Subject: [PATCH 26/72] docs(changelog): record str,Enum outcome/status axes conversion Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/CHANGELOG.md 
b/CHANGELOG.md index e8dbec0..ab96370 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -39,6 +39,21 @@ versions per [PEP 440](https://peps.python.org/pep-0440/) / legis-7e85e8e7ba; upstream wardline `--allow-dirty`.) ### Changed +- **Typed outcome/status axes (str Enums)** — five stringly-typed axes are now + `str, Enum` following the existing `WardlineSeverity` model: `ScanOutcome` + (`ROUTED` / `SKIPPED_DIRTY_TREE`), `ArtifactStatus` + (`verified` / `dirty` / `unverified`), `IdentityResolutionStatus`, + `LineageSnapshotStatus`, and `Suppressed`. A `str, Enum` serializes identically + to the bare string, so wire payloads and HMAC artifact signatures are + byte-identical (the signature path signs the raw scan, not legis's + enum-bearing provenance). `IdentityResolution` gains a `__post_init__` + bijection (`alive` `None`↔`UNAVAILABLE`, `False`↔`NOT_ALIVE`, + `True`↔`RESOLVED`) so a self-contradictory frozen record is no longer + representable; the dead `getattr` fallbacks in `service/governance.py` are + dropped. The `suppressed` field stays `str` on the wire-facing dataclass + (validation timing and error type unchanged); the enum is the vocabulary + source of truth. Behavior-preserving. (legis-bba4f22949; deferred from the + rc4 code review.) - **Table-driven MCP dispatch (Q-L8)** — `call_tool` now routes through a tool table instead of an if/elif ladder, and the stdio server bounds each stdin line so a malformed client cannot stream unbounded input. Behavior-preserving. 
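The compatibility claim in the str Enum entry above can be checked in isolation. In this sketch `Status` is a hypothetical stand-in for the axes named in the changelog, not a class from legis; it relies only on standard `enum` and `json` behaviour.

```python
import json
from enum import Enum


class Status(str, Enum):  # hypothetical stand-in, not a legis class
    RESOLVED = "resolved"
    NOT_ALIVE = "not_alive"


# A member compares equal to its bare wire string.
same_equality = Status.RESOLVED == "resolved"

# json.dumps treats the member as a str, so the payload text is unchanged.
same_json = json.dumps({"s": Status.NOT_ALIVE}) == json.dumps({"s": "not_alive"})

# str comes first in the MRO, so members hash like their strings and a bare
# string read off the wire is found in a frozenset of members.
same_membership = "resolved" in frozenset({Status.RESOLVED})
```

All three flags come out true, which is why the conversion can be behavior-preserving for callers that still compare against or serialize bare strings.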
From 9100e64c5a856aa9da205b66d4bc14b6c02bc718 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 22:58:19 +1000 Subject: [PATCH 27/72] test(wardline): pin str,Enum axes to byte-identical bare-string wire Direct regression pin for the load-bearing compat contract: each ScanOutcome / ArtifactStatus / Suppressed member serializes identically to its bare string through json.dumps and canonical_json (and content_hash agrees), so a future Python/enum change that alters str,Enum serialization fails here loudly instead of silently breaking wire payloads and the content-hashed audit chain. Also notes WardlineDirtyTreeError.reason as the one enum-on-the-wire class attribute. Surfaced as the top nice-to-have by the adversarial review (verdict: SHIP). Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/wardline/ingest.py | 3 +++ tests/wardline/test_ingest.py | 36 +++++++++++++++++++++++++++++++++++ 2 files changed, 39 insertions(+) diff --git a/src/legis/wardline/ingest.py b/src/legis/wardline/ingest.py index 70a16d2..538f723 100644 --- a/src/legis/wardline/ingest.py +++ b/src/legis/wardline/ingest.py @@ -93,6 +93,9 @@ class WardlineDirtyTreeError(Exception): catch it and surface a typed ``SKIPPED_DIRTY_TREE`` outcome. """ + # A ScanOutcome member (via the alias). Boundaries put it straight into the + # response as ``{"outcome": exc.reason}`` (app.py / mcp.py), so it is relied + # on to serialize as the bare ``"SKIPPED_DIRTY_TREE"`` string on the wire. 
reason = SKIPPED_DIRTY_TREE diff --git a/tests/wardline/test_ingest.py b/tests/wardline/test_ingest.py index 3844f75..bcddfb5 100644 --- a/tests/wardline/test_ingest.py +++ b/tests/wardline/test_ingest.py @@ -1,7 +1,13 @@ +import json + import pytest +from legis.canonical import canonical_json, content_hash from legis.wardline.ingest import ( TRUST_TIERS, + ArtifactStatus, + ScanOutcome, + Suppressed, WardlineFinding, WardlinePayloadError, WardlineSeverity, @@ -9,6 +15,36 @@ ) +def test_str_enum_axes_are_byte_identical_to_bare_strings_on_the_wire(): + # The load-bearing compat contract: a str,Enum serializes EXACTLY like its + # bare string through json.dumps and canonical_json (so wire payloads and the + # content-hashed audit chain are unchanged). Pin it directly so a future + # Python/enum change that alters str,Enum serialization fails here loudly, + # not silently downstream. + cases = [ + (ScanOutcome.ROUTED, "ROUTED"), + (ScanOutcome.SKIPPED_DIRTY_TREE, "SKIPPED_DIRTY_TREE"), + (ArtifactStatus.VERIFIED, "verified"), + (ArtifactStatus.DIRTY, "dirty"), + (ArtifactStatus.UNVERIFIED, "unverified"), + (Suppressed.ACTIVE, "active"), + (Suppressed.WAIVED, "waived"), + (Suppressed.SUPPRESSED, "suppressed"), + (Suppressed.BASELINED, "baselined"), + (Suppressed.JUDGED, "judged"), + ] + for member, raw in cases: + assert member == raw + assert json.dumps({"k": member}) == json.dumps({"k": raw}) + assert canonical_json({"k": member}) == canonical_json({"k": raw}) + assert content_hash({"k": member}) == content_hash({"k": raw}) + # The back-compat alias and the error's reason still equal the bare string + # that callers/boundaries imported and serialized before the enum existed + # (both are bound by the module-level import block below). 
+ assert SKIPPED_DIRTY_TREE == "SKIPPED_DIRTY_TREE" + assert WardlineDirtyTreeError.reason == "SKIPPED_DIRTY_TREE" + + def _finding(**over): base = {"rule_id": "PY-WL-101", "message": "m", "severity": "ERROR", "kind": "defect", "fingerprint": "fp1", "qualname": "m.f", From 6417b6964cb019d97262baec77a3b1b92eb3f3bc Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 23:10:11 +1000 Subject: [PATCH 28/72] fix(observability): surface silent degrade paths + pin MCP registry sync MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three rc4-review issues, all about failures that vanished at the default log level or went untested: - cli: boot-time instruction refresh logged at DEBUG, dropped by the default-WARNING root logger despite a comment claiming it surfaced. Raise to WARNING (exc_info), matching hooks.generate_session_context — this is the only refresh trigger in a Codex-only repo with no SessionStart hook, so a persistent failure must not run agents on drifted instructions silently. - identity/resolver: three bare `except Exception` degrade paths (capability probe, locator resolve, lineage snapshot) had no logger. A broken Loomweave (auth/network/HMAC) returns the same typed-degraded record as a genuine "no SEI", so when governance shows `unavailable` en masse an operator cannot tell integration-broken from absent. Add a module logger + WARNING(exc_info) per path; typed returns unchanged. - mcp: pin the three hand-maintained tool registries (tool_definitions, _TOOL_HANDLERS, _AGENT_TOOLS) in sync with one assertion — the table-driven dispatch makes a missing/extra entry (reachable-but- unvalidated handler, or advertised-but-UNKNOWN_TOOL schema) easy to introduce. Tests: one per resolver path (each needs a differently-configured fake) and the boot-refresh path assert a WARNING record with exc_info; the registry test is a passing regression guard. 
Gate green: ruff, mypy, 643 passed, coverage 91.45% (floor 88%). The rc4-review CHANGELOG-omission claim was disproven against HEAD — the "Self-install (legis install)" entry already documents legis-0127b66 — so no CHANGELOG change is included. Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/cli.py | 2 +- src/legis/identity/resolver.py | 20 ++++++++++- tests/identity/test_resolver.py | 59 ++++++++++++++++++++++++++++++++- tests/mcp/test_server.py | 14 ++++++++ tests/test_cli_install.py | 28 ++++++++++++++++ 5 files changed, 120 insertions(+), 3 deletions(-) diff --git a/src/legis/cli.py b/src/legis/cli.py index 5fb1690..4e91d46 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -285,7 +285,7 @@ def _refresh_instructions_best_effort() -> None: except Exception: # noqa: BLE001 (boot refresh must never break the server) # Best-effort: never break the server, but don't vanish silently either — # the sibling SessionStart path (hooks.generate_session_context) logs too. - logger.debug("Best-effort instruction refresh on MCP boot failed", exc_info=True) + logger.warning("Best-effort instruction refresh on MCP boot failed", exc_info=True) def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: diff --git a/src/legis/identity/resolver.py b/src/legis/identity/resolver.py index 3db5f25..5122f26 100644 --- a/src/legis/identity/resolver.py +++ b/src/legis/identity/resolver.py @@ -9,6 +9,7 @@ from __future__ import annotations +import logging import time from dataclasses import dataclass from enum import Enum @@ -18,6 +19,8 @@ from legis.identity.loomweave_client import LoomweaveIdentity from legis.identity.entity_key import EntityKey +logger = logging.getLogger(__name__) + # A long-lived resolver re-probes the Loomweave sei capability at most once per # this window. Without it a positive latch is permanent: a Loomweave that loses # the capability mid-life would be trusted forever (Q-L6). 
@@ -112,7 +115,14 @@ def _capability(self) -> bool: self._capable = bool(self._client.capability()) except Exception: # Honest transient degrade — clear the latch so the next resolve - # retries rather than trusting a stale value. + # retries rather than trusting a stale value. Log it: the typed + # return is indistinguishable from a Loomweave that genuinely has + # no sei capability, so the warning is the only operator signal + # that the integration is broken rather than absent. + logger.warning( + "Loomweave sei-capability probe failed; degrading to locator keys", + exc_info=True, + ) self._capable = None self._capable_checked_at = None return False @@ -125,6 +135,10 @@ def _snapshot( try: lineage = self._client.lineage(sei) # type: ignore[union-attr] except Exception: + logger.warning( + "Loomweave lineage snapshot failed; recording lineage as unavailable", + exc_info=True, + ) return None, LineageSnapshotStatus.UNAVAILABLE return ( {"length": len(lineage), "hash": content_hash(lineage)}, @@ -145,6 +159,10 @@ def resolve(self, locator: str) -> IdentityResolution: try: res = self._client.resolve_locator(locator) # type: ignore[union-attr] except Exception: + logger.warning( + "Loomweave locator resolve failed; degrading to locator key", + exc_info=True, + ) return degraded if not isinstance(res, dict): return degraded diff --git a/tests/identity/test_resolver.py b/tests/identity/test_resolver.py index fc83088..fa3a9cf 100644 --- a/tests/identity/test_resolver.py +++ b/tests/identity/test_resolver.py @@ -1,3 +1,5 @@ +import logging + import pytest from legis.canonical import content_hash @@ -11,12 +13,22 @@ class FakeClient: - def __init__(self, *, capable=True, resolve=None, lineage=None, boom=False, lineage_boom=False): + def __init__( + self, + *, + capable=True, + resolve=None, + lineage=None, + boom=False, + lineage_boom=False, + resolve_boom=False, + ): self._capable = capable self._resolve = resolve or {"alive": False} self._lineage = lineage or [] 
self._boom = boom self._lineage_boom = lineage_boom + self._resolve_boom = resolve_boom def capability(self): if self._boom: @@ -24,6 +36,8 @@ def capability(self): return self._capable def resolve_locator(self, locator): + if self._resolve_boom: + raise RuntimeError("resolve_locator down") return self._resolve def resolve_sei(self, sei): # not used by the resolver @@ -181,6 +195,49 @@ def test_alive_sei_with_lineage_failure_records_unavailable_status(): assert res.lineage_snapshot_status == "unavailable" +# --- each degrade path must leave an operator-visible trail. A broken Loomweave +# (auth/network/HMAC failure) returns the SAME typed-degraded record as a genuine +# "no SEI" — so when governance shows `unavailable` en masse, the WARNING is the +# only thing telling an operator "integration broken" from "nothing to resolve". +# One test per except block, each needing a differently-configured fake. --- + + +def test_capability_probe_failure_is_logged_with_exc_info(caplog): + r = IdentityResolver(FakeClient(boom=True)) + with caplog.at_level(logging.WARNING, logger="legis.identity.resolver"): + res = r.resolve("python:function:m.f") + assert res.entity_key.identity_stable is False # typed return unchanged + assert caplog.records, "expected a warning when capability() raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None + + +def test_resolve_locator_failure_is_logged_with_exc_info(caplog): + r = IdentityResolver(FakeClient(resolve_boom=True)) + with caplog.at_level(logging.WARNING, logger="legis.identity.resolver"): + res = r.resolve("python:function:m.f") + assert res.entity_key.identity_stable is False # typed return unchanged + assert caplog.records, "expected a warning when resolve_locator() raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None + + +def test_lineage_snapshot_failure_is_logged_with_exc_info(caplog): + r = 
IdentityResolver(FakeClient(resolve=ALIVE, lineage_boom=True)) + with caplog.at_level(logging.WARNING, logger="legis.identity.resolver"): + res = r.resolve("python:function:m.f") + # The resolution still succeeds; only the lineage axis degrades — but the + # failure must still surface. + assert res.alive is True + assert res.lineage_snapshot_status == "unavailable" + assert caplog.records, "expected a warning when lineage() raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None + + # --- Q-L6: the capability latch must revalidate (TTL), and content_hash must be # type-checked, not trusted verbatim from the Loomweave response. --- diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index 3c0b64e..3509f7c 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -1461,6 +1461,20 @@ def test_build_runtime_loads_policy_cells_from_configured_path(tmp_path, monkeyp assert runtime.cell_registry.cell_for("ordinary.policy") == "chill" +def test_tool_registries_are_in_sync(): + # mcp.py hand-maintains three parallel name registries: the public schema + # (tool_definitions), the dispatch table (_TOOL_HANDLERS), and the agent- + # exposed set (_AGENT_TOOLS). They MUST agree. A handler without a schema + # entry is reachable-but-unvalidated (it accepts arbitrary arg keys); a + # schema entry without a handler advertises a tool that errors UNKNOWN_TOOL. + # The table-driven dispatch makes exactly this drift easy to introduce, so + # pin it directly rather than inferring it from per-tool listing tests. 
+ from legis.mcp import _AGENT_TOOLS, _TOOL_HANDLERS, tool_definitions + + defined = {t["name"] for t in tool_definitions()} + assert defined == set(_TOOL_HANDLERS) == set(_AGENT_TOOLS) + + def test_git_rename_feed_get_is_listed(): from legis.mcp import tool_definitions diff --git a/tests/test_cli_install.py b/tests/test_cli_install.py index 50b3b01..9d11c4c 100644 --- a/tests/test_cli_install.py +++ b/tests/test_cli_install.py @@ -112,3 +112,31 @@ def boom(_root): rc = main(["mcp", "--agent-id", "agent-1"]) assert rc == 0 assert calls == ["agent-1"] + + +def test_mcp_boot_refresh_failure_is_logged_with_exc_info(tmp_path, monkeypatch, caplog): + # The boot refresh is the ONLY refresh trigger in a Codex-only repo with no + # SessionStart hook. A persistently failing refresh must be visible at the + # default level (WARNING), not swallowed at DEBUG — otherwise agents run on + # drifted instructions with no signal. Mirrors hooks.generate_session_context. + monkeypatch.chdir(tmp_path) + + import logging + + import legis.hooks as hooks_module + import legis.mcp as mcp_module + + def boom(_root): + raise RuntimeError("refresh exploded") + + monkeypatch.setattr(hooks_module, "refresh_instructions", boom) + monkeypatch.setattr(mcp_module, "main", lambda agent_id: 0) + + with caplog.at_level(logging.WARNING, logger="legis.cli"): + rc = main(["mcp", "--agent-id", "agent-1"]) + + assert rc == 0 + assert caplog.records, "expected a warning when boot refresh raises" + rec = caplog.records[-1] + assert rec.levelno >= logging.WARNING + assert rec.exc_info is not None From 7f5ad87cdd5fa72dfe89d155604e102c26dc70fc Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 23:13:02 +1000 Subject: [PATCH 29/72] Rebrand Loom suite residue -> Weft in .gitignore comment Sole remaining Loom federation-brand string in legis (repo was already migrated Clarion->Loomweave and Loom->Weft per CHANGELOG). 
Display-only comment edit; no code, no schema, no SEI lines touched. SEI count (loomweave:eid in src/crates) unchanged at 1. Co-Authored-By: Claude Opus 4.8 (1M context) --- .gitignore | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.gitignore b/.gitignore index 5508bbc..e9e7df6 100644 --- a/.gitignore +++ b/.gitignore @@ -25,7 +25,7 @@ coverage.json AGENTS.md CLAUDE.md -# --- Loom suite working folders & local config (regenerated/local; never commit) --- +# --- Weft suite working folders & local config (regenerated/local; never commit) --- # Filigree — issue-tracker database + project config .filigree/ .filigree.conf From 9910c4130e89b2f78c0cc3e5121e551ebb60ab41 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sat, 6 Jun 2026 23:57:52 +1000 Subject: [PATCH 30/72] fix(observability): name failed seq in verify_integrity; warn on bad config The live subset of the rc4 review follow-ups. The str,Enum status-axis divergence (#3) and _snapshot logging (#7) already landed in 9100e64 / 6417b69, so only these five remain: - audit_store.verify_integrity logs the offending seq before every `return False` (and a distinct message for the read-decode path) so an investigator can locate the tamper in the append-only trail instead of getting a bare False. - mcp._max_request_bytes warns on BOTH bad-config paths (unparseable and non-positive) rather than silently falling back to the 16 MiB default, so an operator lowering the bound sees why it was ignored. - cli._run_install wraps each step: a raising step renders `[FAIL] {name}` and continues, consistent with the per-step model, instead of aborting half-applied with a traceback. Tests (each new behavior gets a regression guard): - scan_route malformed finding -> INVALID_ARGUMENT red asserted at the MCP boundary (the other half of the dirty->amber contract). 
- caplog guards for the verify_integrity seq logging, both _max_request_bytes warnings, the _run_install [FAIL]-and-continue path, and the generate_session_context swallowed-error warning. Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/cli.py | 11 +++++++- src/legis/mcp.py | 18 +++++++++++++ src/legis/store/audit_store.py | 33 +++++++++++++++++++++++ tests/mcp/test_server.py | 48 +++++++++++++++++++++++++++++++-- tests/store/test_audit_store.py | 13 ++++++--- tests/test_cli_install.py | 17 ++++++++++++ tests/test_hooks.py | 10 +++++-- 7 files changed, 141 insertions(+), 9 deletions(-) diff --git a/src/legis/cli.py b/src/legis/cli.py index 4e91d46..085ec77 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -267,7 +267,16 @@ def _run_install(args) -> int: for selected, name, step in steps: if not selected: continue - ok, message = step() # type: ignore[operator] + try: + ok, message = step() # type: ignore[operator] + except Exception as exc: # noqa: BLE001 — one bad step must not abort the rest + # Stay consistent with the per-step [OK]/[FAIL] model instead of + # aborting the whole install with a traceback and leaving it + # half-applied. Render the failure, count it, keep going. + logger.warning("install step %r raised", name, exc_info=True) + print(f"[FAIL] {name}: {exc}") + failures += 1 + continue mark = "OK" if ok else "FAIL" print(f"[{mark}] {name}: {message}") if not ok: diff --git a/src/legis/mcp.py b/src/legis/mcp.py index 5362f4d..b5bb914 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -11,6 +11,7 @@ from collections.abc import Callable from dataclasses import asdict, dataclass import json +import logging import os from pathlib import Path import sys @@ -86,6 +87,8 @@ # refusing a pathological one. Override with LEGIS_MCP_MAX_REQUEST_BYTES. 
_DEFAULT_MAX_REQUEST_BYTES = 16 * 1024 * 1024 +logger = logging.getLogger(__name__) + def _max_request_bytes() -> int: raw = os.environ.get("LEGIS_MCP_MAX_REQUEST_BYTES") @@ -93,9 +96,24 @@ def _max_request_bytes() -> int: try: value = int(raw) except ValueError: + logger.warning( + "LEGIS_MCP_MAX_REQUEST_BYTES=%r is not an integer; ignoring it " + "and using the default %d-byte bound", + raw, + _DEFAULT_MAX_REQUEST_BYTES, + ) return _DEFAULT_MAX_REQUEST_BYTES if value > 0: return value + # A non-positive bound (a fat-fingered 0 or negative) would otherwise + # fall through silently — the operator meant to lower the cap and it was + # ignored. Say so. + logger.warning( + "LEGIS_MCP_MAX_REQUEST_BYTES=%r is not positive; ignoring it and " + "using the default %d-byte bound", + raw, + _DEFAULT_MAX_REQUEST_BYTES, + ) return _DEFAULT_MAX_REQUEST_BYTES diff --git a/src/legis/store/audit_store.py b/src/legis/store/audit_store.py index deeca4f..a85a516 100644 --- a/src/legis/store/audit_store.py +++ b/src/legis/store/audit_store.py @@ -270,6 +270,14 @@ def verify_integrity(self) -> bool: try: records = self.read_all() except (json.JSONDecodeError, TypeError, ValueError): + # No seq survives a decode failure of the whole read; name the + # failure mode so an investigator knows the trail is unreadable + # rather than merely mismatched. 
+ logger.error( + "audit trail integrity check failed: a record payload did not " + "decode as JSON", + exc_info=True, + ) return False for rec in records: # json.loads accepts Infinity/NaN, so a directly-tampered payload @@ -279,12 +287,37 @@ def verify_integrity(self) -> bool: try: computed = content_hash(rec.payload) except (ValueError, TypeError): + logger.error( + "audit trail integrity check failed at seq=%s: payload is " + "not canonicalizable (tamper)", + rec.seq, + exc_info=True, + ) return False if computed != rec.content_hash: + logger.error( + "audit trail integrity check failed at seq=%s: content hash " + "mismatch (recorded %s, recomputed %s)", + rec.seq, + rec.content_hash, + computed, + ) return False if rec.prev_hash != prev_hash: + logger.error( + "audit trail integrity check failed at seq=%s: broken chain " + "link (prev_hash %s != expected %s)", + rec.seq, + rec.prev_hash, + prev_hash, + ) return False if rec.chain_hash != _chain(rec.prev_hash, rec.content_hash): + logger.error( + "audit trail integrity check failed at seq=%s: chain hash " + "does not match prev+content", + rec.seq, + ) return False prev_hash = rec.chain_hash return True diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index 3509f7c..138b7f0 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -1,5 +1,6 @@ import io import json +import logging import sqlite3 from legis.canonical import canonical_json, content_hash @@ -1063,6 +1064,43 @@ def test_scan_route_dirty_tree_governs_under_devmode_optin(tmp_path, monkeypatch assert "artifact_signature" not in wardline +def test_scan_route_malformed_finding_is_invalid_argument_red(tmp_path, monkeypatch): + # The other half of the dirty-vs-malformed contract (cf. the amber test + # above): a malformed finding — here an unknown severity — is a generic red + # INVALID_ARGUMENT, NOT the amber SKIPPED_DIRTY_TREE. 
WardlinePayloadError is + # deliberately not a WardlineDirtyTreeError, so the boundary keeps "broken or + # tampered scan" distinct from "commit first". Nothing is governed. + monkeypatch.setenv("LEGIS_WARDLINE_CELL", "surface_only") + runtime, store = _runtime(tmp_path) + malformed = { + "findings": [ + { + "rule_id": "PY-WL-101", + "message": "untrusted reaches trusted", + "severity": "NOT_A_SEVERITY", + "kind": "defect", + "fingerprint": "fp1", + } + ] + } + + result = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": {"name": "scan_route", "arguments": {"scan": malformed}}, + } + ), + runtime, + )[0]["result"] + + assert result["isError"] is True + assert result["structuredContent"]["error_code"] == "INVALID_ARGUMENT" + assert store.read_all() == [] + + def test_scan_route_fail_on_threshold_routes_each_finding(tmp_path, monkeypatch): monkeypatch.setenv("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING", "1") runtime, _store = _runtime(tmp_path) @@ -1561,16 +1599,22 @@ def test_run_jsonrpc_rejects_oversized_line_and_stays_framed(tmp_path, monkeypat assert responses[2]["id"] == 2 and "result" in responses[2] -def test_max_request_bytes_env_override_and_fallback(monkeypatch): +def test_max_request_bytes_env_override_and_fallback(monkeypatch, caplog): from legis.mcp import _DEFAULT_MAX_REQUEST_BYTES, _max_request_bytes monkeypatch.delenv("LEGIS_MCP_MAX_REQUEST_BYTES", raising=False) assert _max_request_bytes() == _DEFAULT_MAX_REQUEST_BYTES monkeypatch.setenv("LEGIS_MCP_MAX_REQUEST_BYTES", "4096") assert _max_request_bytes() == 4096 + # Both the unparseable and the non-positive fat-finger fall back, but neither + # may do so silently — an operator lowering the bound must see why it was + # ignored. 
for bad in ("not-an-int", "0", "-5"): + caplog.clear() monkeypatch.setenv("LEGIS_MCP_MAX_REQUEST_BYTES", bad) - assert _max_request_bytes() == _DEFAULT_MAX_REQUEST_BYTES + with caplog.at_level(logging.WARNING, logger="legis.mcp"): + assert _max_request_bytes() == _DEFAULT_MAX_REQUEST_BYTES + assert "LEGIS_MCP_MAX_REQUEST_BYTES" in caplog.text def test_read_bounded_line_enforces_bytes_not_chars(): diff --git a/tests/store/test_audit_store.py b/tests/store/test_audit_store.py index 4c7de5d..6e8362c 100644 --- a/tests/store/test_audit_store.py +++ b/tests/store/test_audit_store.py @@ -70,7 +70,7 @@ def test_verify_integrity_passes_on_clean_chain(tmp_path): assert s.verify_integrity() is True -def test_verify_integrity_detects_out_of_band_tamper(tmp_path): +def test_verify_integrity_detects_out_of_band_tamper(tmp_path, caplog): s = make_store(tmp_path) s.append({"k": "a"}) s.append({"k": "b"}) @@ -85,10 +85,13 @@ def test_verify_integrity_detects_out_of_band_tamper(tmp_path): conn.commit() finally: conn.close() - assert s.verify_integrity() is False + with caplog.at_level(logging.ERROR, logger="legis.store.audit_store"): + assert s.verify_integrity() is False + # An investigator needs the offending seq, not a bare False. 
+ assert "integrity check failed at seq=1" in caplog.text -def test_verify_integrity_handles_malformed_json_as_integrity_failure(tmp_path): +def test_verify_integrity_handles_malformed_json_as_integrity_failure(tmp_path, caplog): s = make_store(tmp_path) s.append({"k": "a"}) conn = raw_conn(tmp_path) @@ -102,7 +105,9 @@ def test_verify_integrity_handles_malformed_json_as_integrity_failure(tmp_path): finally: conn.close() - assert s.verify_integrity() is False + with caplog.at_level(logging.ERROR, logger="legis.store.audit_store"): + assert s.verify_integrity() is False + assert "integrity check failed" in caplog.text def test_audit_store_concurrent_writes(tmp_path): diff --git a/tests/test_cli_install.py b/tests/test_cli_install.py index 9d11c4c..413c5c5 100644 --- a/tests/test_cli_install.py +++ b/tests/test_cli_install.py @@ -51,6 +51,23 @@ def test_install_reports_failure_rc1_on_symlink(tmp_path, monkeypatch, capsys): assert "FAIL" in capsys.readouterr().out +def test_install_renders_fail_and_continues_when_a_step_raises(tmp_path, monkeypatch, capsys): + monkeypatch.chdir(tmp_path) + + def boom(_root): + raise RuntimeError("step blew up") + + monkeypatch.setattr(install, "install_skills", boom) + rc = main(["install"]) + out = capsys.readouterr().out + # A raising step is rendered as a [FAIL] line, not a traceback that aborts + # the run and leaves the install half-applied... + assert "[FAIL] Claude Code skill: step blew up" in out + # ...and the steps after it still run. 
+ assert (tmp_path / ".gitignore").exists() + assert rc == 1 + + def test_session_context_silent_when_fresh(tmp_path, monkeypatch, capsys): monkeypatch.chdir(tmp_path) install.inject_instructions(tmp_path / "CLAUDE.md") diff --git a/tests/test_hooks.py b/tests/test_hooks.py index f4b4939..ed0d9b7 100644 --- a/tests/test_hooks.py +++ b/tests/test_hooks.py @@ -2,6 +2,8 @@ from __future__ import annotations +import logging + from legis import hooks, install from legis.hooks import ( _extract_marker_token, @@ -115,11 +117,15 @@ def test_generate_session_context_returns_messages_on_drift(tmp_path, monkeypatc assert "CLAUDE.md" in context -def test_generate_session_context_swallows_errors(tmp_path, monkeypatch): +def test_generate_session_context_swallows_errors(tmp_path, monkeypatch, caplog): monkeypatch.chdir(tmp_path) def boom(_root): raise OSError("disk gone") monkeypatch.setattr(hooks, "refresh_instructions", boom) - assert generate_session_context() is None + with caplog.at_level(logging.WARNING, logger="legis.hooks"): + assert generate_session_context() is None + # Swallowing must not be silent — a regression dropping the warning would + # hide a broken freshness check. + assert "Instruction freshness check failed" in caplog.text From a9a358a4e1f59ce305eede52ea1d82c0b3355e16 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 00:19:52 +1000 Subject: [PATCH 31/72] fix(install): bound instruction injector at foreign fences (peer of filigree-bcbd4d66fd) The injector identified its block by substring marker only, with no foreign-owner concept, so two branches could delete a co-resident sibling tool's block (wardline/filigree) in a shared CLAUDE.md / AGENTS.md: - truncate-to-EOF when legis's own end marker was absent, destroying any sibling block physically after an unclosed legis block; and - a Shape-2 splice where find() jumps over a sandwiched foreign block to a later legis close. 
Both auto-fire with no user action via the SessionStart drift refresh (hooks.py -> inject_instructions). Replace the two divergent branches with one bounded scan: legis's writable region runs to the first of (a) its own close if it precedes any foreign fence, (b) the next foreign-namespace fence, or (c) EOF. Own-namespace fences are absorbed, preserving the orphan-tail idempotency invariant. Monotonic-safe: bound <= the old cut point in every branch, so it can only preserve bytes the old code deleted. Folds in case-insensitive namespace matching, a separating newline on bounded recovery, a stale-duplicate split-brain warning, and a refuse-to-empty guard in _atomic_write_text (filigree-04bad2a2bf parity). Tracked as legis-068e359d28; weft C-4 contract promotion as weft-e408dc2b82. Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/install.py | 83 ++++++++++++++++++++++++++---- tests/test_hooks.py | 24 +++++++++ tests/test_install.py | 115 ++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 213 insertions(+), 9 deletions(-) diff --git a/src/legis/install.py b/src/legis/install.py index 7f35813..13e92fa 100644 --- a/src/legis/install.py +++ b/src/legis/install.py @@ -17,7 +17,9 @@ import importlib.metadata import importlib.resources import json +import logging import os +import re import shlex import shutil import stat @@ -25,6 +27,8 @@ from pathlib import Path from typing import Any +logger = logging.getLogger(__name__) + # --------------------------------------------------------------------------- # Constants # --------------------------------------------------------------------------- @@ -34,6 +38,29 @@ _END_MARKER = "" +# Recognises ANY tool's instruction-block fence (open or close) by its vendor +# namespace, so legis can bound its own rewrite at a *foreign* fence and never +# delete a co-resident sibling block (wardline/filigree) in a shared +# CLAUDE.md/AGENTS.md (peer of filigree-bcbd4d66fd). 
The namespace match is +# case-insensitive: an uppercase-namespaced sibling must still register as a +# boundary. The cross-tool multi-owner block contract lives in weft +# conventions.md (C-4). +_INSTR_FENCE_RE = re.compile(r"\n" + "legis body, block NOT closed\n" + "\n" + "wardline body\n" + "\n" + ) + messages = refresh_instructions(tmp_path) + content = md.read_text() + assert any("CLAUDE.md" in m for m in messages) # drift was acted on + assert "wardline body" in content + assert "" in content + + def test_generate_session_context_swallows_errors(tmp_path, monkeypatch, caplog): monkeypatch.chdir(tmp_path) diff --git a/tests/test_install.py b/tests/test_install.py index 980c5e2..b6064fd 100644 --- a/tests/test_install.py +++ b/tests/test_install.py @@ -3,6 +3,7 @@ from __future__ import annotations import json +import logging import os import stat @@ -161,6 +162,109 @@ def test_inject_rejects_symlink_target(tmp_path): assert "symlink" in msg.lower() +# --------------------------------------------------------------------------- +# inject_instructions — foreign-block safety (peer of filigree-bcbd4d66fd) +# --------------------------------------------------------------------------- + +_WARDLINE_BLOCK = ( + "\n" + "wardline body\n" + "\n" +) + + +def test_inject_malformed_block_preserves_coresident_foreign_block(tmp_path): + """An unclosed legis block must NOT truncate a sibling block that follows it.""" + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "legis body, block NOT closed\n" + + _WARDLINE_BLOCK + ) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + # The foreign block survives intact. + assert "wardline body" in content + assert "" in content + assert "" in content + # Exactly one well-formed legis block remains; the orphan body is gone. 
+ assert content.count(INSTRUCTIONS_MARKER) == 1 + assert "block NOT closed" not in content + assert content.count("") == 1 + + +def test_inject_shape2_sandwich_preserves_foreign_block(tmp_path, caplog): + """Unclosed-first / closed-later legis must not splice over a sandwiched sibling. + + The stale second legis block surviving beyond the foreign fence must also be + surfaced as a warning (refinement 4), not silently shipped as a split brain. + """ + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "first legis body (unclosed)\n" + + _WARDLINE_BLOCK + + f"{INSTRUCTIONS_MARKER}:vY:beef -->\n" + "second legis body\n" + "\n" + ) + with caplog.at_level(logging.WARNING, logger="legis.install"): + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert "wardline body" in content + assert "" in content + # Stale duplicate beyond the foreign fence is surfaced, not silent. + assert "duplicate that could not be canonicalised" in caplog.text + + +def test_inject_uppercase_namespace_sibling_survives(tmp_path): + """A sibling block with an upper-cased namespace is still a boundary (refinement 1).""" + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "legis body no close\n" + "\n" + "wardline body\n" + "\n" + ) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert "wardline body" in content + assert "" in content + + +def test_instructions_body_has_no_fence_token(): + """Pin: the shipped body must not contain a ``:instructions`` fence (refinement 2). + + The bounded scan runs across legis's own body; a fence token there would + misroute the common well-formed path into bounded recovery. 
+ """ + assert ":instructions" not in _instructions_text() + + +def test_inject_bounded_recovery_is_idempotent(tmp_path): + """Repairing a malformed block next to a foreign one is byte-stable on re-run (refinement 3).""" + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "legis body no close\n" + + _WARDLINE_BLOCK + ) + inject_instructions(target) + first = target.read_text() + inject_instructions(target) + second = target.read_text() + assert first == second + assert "wardline body" in second + + # --------------------------------------------------------------------------- # _atomic_write_text # --------------------------------------------------------------------------- @@ -184,6 +288,17 @@ def test_reject_symlink_raises_on_symlink(tmp_path): reject_symlink(link) +@pytest.mark.parametrize("payload", ["", " \n\t \n"]) +def test_atomic_write_refuses_empty_content(tmp_path, payload): + """Refuse-to-empty guard (filigree-04bad2a2bf parity): never truncate a file to nothing.""" + target = tmp_path / "CLAUDE.md" + target.write_text("populated content\n") + with pytest.raises(ValueError, match="empty"): + install._atomic_write_text(target, payload) + # The populated file is left untouched. + assert target.read_text() == "populated content\n" + + # --------------------------------------------------------------------------- # Skill pack # --------------------------------------------------------------------------- From 1a221de8ed228278529a977470d07ad9f70b7690 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 00:20:15 +1000 Subject: [PATCH 32/72] chore(release): sync uv.lock legis version to 1.0.0rc4 The rc4 version bump (01d26c6) left the lockfile's own legis entry at 1.0.0rc3. Re-sync it so the tree is clean on the candidate branch. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- uv.lock | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/uv.lock b/uv.lock index 63ed943..c1797e1 100644 --- a/uv.lock +++ b/uv.lock @@ -355,7 +355,7 @@ wheels = [ [[package]] name = "legis" -version = "1.0.0rc3" +version = "1.0.0rc4" source = { editable = "." } dependencies = [ { name = "fastapi" }, From 645cc64fc61b1cf975e4a76a707b8a6a6c143242 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 01:13:33 +1000 Subject: [PATCH 33/72] fix(install): span-aware injector anchor + surface drift-refresh failures MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the two residual gaps the rc4 PR review found in the instruction injector (both Important, neither shipping-blocking on their own): 1. inject_instructions anchored its block with a bare substring search for `" in content +def test_refresh_warns_when_drift_reinjection_fails(tmp_path, monkeypatch, caplog): + """A *detected-drift* re-injection that fails must not be dropped silently. + + ``inject_instructions`` returns ``(False, reason)`` (it does not raise) for a + recoverable refusal such as a symlinked target, so the upstream ``except`` in + the session-context path never sees it. If the refresh swallows the ``False``, + agents run on drifted instructions with zero operator signal. + """ + real = tmp_path / "real.md" + inject_instructions(real) + link = tmp_path / "CLAUDE.md" + link.symlink_to(real) + # Drift so the refresh attempts a re-injection (which then fails on the symlink). 
+ monkeypatch.setattr(install, "_instructions_text", lambda: "DRIFTED BODY\n") + + with caplog.at_level(logging.WARNING, logger="legis.hooks"): + messages = refresh_instructions(tmp_path) + + assert not any("CLAUDE.md" in m for m in messages) # no false success + assert "CLAUDE.md" in caplog.text + assert "symlink" in caplog.text.lower() + + +def test_refresh_warns_when_skill_reinstall_fails(tmp_path, monkeypatch, caplog): + """A failed skill-pack re-install on drift must warn, not silently no-op.""" + install.install_skills(tmp_path) + # Drift the installed pack so the refresh attempts a reinstall. + next( + (tmp_path / ".claude" / "skills" / install.SKILL_NAME).rglob("*.md") + ).write_text("DRIFTED\n") + monkeypatch.setattr(hooks, "install_skills", lambda _root: (False, "swap failed")) + + with caplog.at_level(logging.WARNING, logger="legis.hooks"): + messages = refresh_instructions(tmp_path) + + assert not any("skill" in m.lower() for m in messages) # no false success + assert "swap failed" in caplog.text + + def test_generate_session_context_swallows_errors(tmp_path, monkeypatch, caplog): monkeypatch.chdir(tmp_path) diff --git a/tests/test_install.py b/tests/test_install.py index b6064fd..df5910e 100644 --- a/tests/test_install.py +++ b/tests/test_install.py @@ -248,6 +248,64 @@ def test_instructions_body_has_no_fence_token(): assert ":instructions" not in _instructions_text() +def test_inject_marker_text_inside_foreign_block_not_mistaken_for_own(tmp_path): + """A legis marker quoted *inside* a sibling block is not legis's own anchor. + + The literal ```` can legitimately appear inside + another tool's block (a quoted example, documentation). A bare substring anchor + would splice there and gut the sibling. The anchor must respect foreign block + spans, so this file has *no* legis block of its own → append, sibling untouched. 
+ """ + target = tmp_path / "CLAUDE.md" + foreign_block = ( + "\n" + f"See example: {INSTRUCTIONS_MARKER}:v0:0000 -->\n" + "WARDLINE BODY MUST SURVIVE\n" + "\n" + ) + target.write_text("HEAD\n" + foreign_block) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + # The sibling block is preserved verbatim — not gutted, not spliced into. + assert foreign_block in content + assert "WARDLINE BODY MUST SURVIVE" in content + # Exactly one well-formed legis block was appended, after the sibling close. + assert content.count("") == 1 + assert content.rindex(INSTRUCTIONS_MARKER) > content.index( + "" + ) + + +def test_inject_reinject_preserves_foreign_block_placed_before_legis(tmp_path): + """A sibling block *before* the legis block survives re-injection on drift. + + The shared-file layout where wardline installs before legis is realistic; the + in-place replace must not reach backwards past ``start`` into a preceding block. + """ + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + + _WARDLINE_BLOCK + + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "stale legis body\n" + "\n" + ) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert "wardline body" in content + assert "" in content + assert "" in content + # The legis block was replaced in place (stale body gone), exactly one remains. + assert content.count(INSTRUCTIONS_MARKER) == 1 + assert "stale legis body" not in content + # The sibling still precedes the legis block. 
+ assert content.index("") < content.index( + INSTRUCTIONS_MARKER + ) + + def test_inject_bounded_recovery_is_idempotent(tmp_path): """Repairing a malformed block next to a foreign one is byte-stable on re-run (refinement 3).""" target = tmp_path / "CLAUDE.md" From af32ed4a8f6d6983657c790fcb1a0dfe9562f21d Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 04:48:54 +1000 Subject: [PATCH 34/72] fix: fold in non-blocking rc4 review suggestions MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses the remaining (non-shipping-blocking) findings from the rc4 PR review, each with a regression test: - identity: IdentityResolution.__post_init__ now also constrains the lineage axis (lineage_snapshot present iff status is VERIFIED) so the *whole* record, not just the identity half, is impossible to construct contradictory; a non-bool `alive` now raises the guard's own ValueError instead of a KeyError (and int aliases 1/0 are rejected by identity, not == True/False). - mcp: the INTERNAL_ERROR fall-through in _service_error logs at ERROR with the exception attached — it reached the agent caller but left no server/Sentry record. The typed expected errors stay quiet. - install: a corrupt settings.json recovery (backup to .json.bak + reset) is now surfaced via a WARNING and named in the return message, not silent. - install: injecting into an empty / whitespace-only file writes just the block (like create) instead of leaving leading blank-line artifacts. - tests: bounded-read byte-boundary (max-1/max/max+1) and oversized-multibyte drain-loop locks; injector CRLF, empty-file, and two-clean-blocks locks (the last documents the *safe* replace-first/warn-second behavior — collapsing would need a deletion window over non-legis bytes between the blocks). - comment: corrected the _read_bounded_line byte-overflow comment for the final EOF-no-newline record. 
665 passed / 2 skipped; ruff + mypy clean; per-package coverage floors hold. Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 16 ++++++- src/legis/identity/resolver.py | 23 ++++++++++ src/legis/install.py | 25 ++++++++++- src/legis/mcp.py | 11 ++++- tests/identity/test_resolver.py | 50 ++++++++++++++++++++++ tests/mcp/test_server.py | 56 +++++++++++++++++++++++++ tests/test_install.py | 74 +++++++++++++++++++++++++++++++-- 7 files changed, 248 insertions(+), 7 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 46cb621..b9f7a62 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -50,7 +50,9 @@ versions per [PEP 440](https://peps.python.org/pep-0440/) / bijection (`alive` `None`↔`UNAVAILABLE`, `False`↔`NOT_ALIVE`, `True`↔`RESOLVED`) so a self-contradictory frozen record is no longer representable; the dead `getattr` fallbacks in `service/governance.py` are - dropped. The `suppressed` field stays `str` on the wire-facing dataclass + dropped. The guard now covers the record's *other* half too — the lineage axis + (`lineage_snapshot` present iff `lineage_snapshot_status` is `VERIFIED`) — and + rejects a non-bool `alive` with its own `ValueError` rather than a `KeyError`. The `suppressed` field stays `str` on the wire-facing dataclass (validation timing and error type unchanged); the enum is the vocabulary source of truth. Behavior-preserving. (legis-bba4f22949; deferred from the rc4 code review.) @@ -100,6 +102,18 @@ versions per [PEP 440](https://peps.python.org/pep-0440/) / refusal such as a symlinked target, the upstream `except` never saw it and agents could run on drifted instructions with zero signal. Both paths now log a `WARNING` with the reason on failure (peer of the boot-log path closed earlier). 
+- **Unexpected MCP tool errors are logged server-side** — the `INTERNAL_ERROR` + fall-through in `_service_error` reached the agent caller but left no + server/Sentry record; an unexpected exception now logs at `ERROR` with the + exception attached. The typed, expected errors (`NOT_FOUND`, `INVALID_ARGUMENT`, + …) stay quiet. +- **Corrupt `settings.json` recovery is surfaced** — `install_claude_code_hooks` + already backed a malformed or wrong-typed `settings.json` up to `.json.bak` + before resetting it, but reported ordinary success; it now logs a `WARNING` and + names the backup in its return message so the user knows to reconcile. +- **Injector handles an empty target file cleanly** — injecting into an existing + zero-byte / whitespace-only `CLAUDE.md` / `AGENTS.md` now writes just the block + (like the create path) instead of leaving leading blank-line artifacts. ## [1.0.0rc3] — 2026-06-06 diff --git a/src/legis/identity/resolver.py b/src/legis/identity/resolver.py index 5122f26..c0de786 100644 --- a/src/legis/identity/resolver.py +++ b/src/legis/identity/resolver.py @@ -64,6 +64,14 @@ def __post_init__(self) -> None: # alive=False (or any other crossed pair) was representable before this # guard; the invariant lived only in the construction sites. The bijection # is exactly the three shapes the resolver actually builds. + # + # ``alive`` must be exactly None/False/True by identity: a bare ``in`` / + # dict lookup would alias ints (1 == True, 0 == False) and a non-bool + # would surface as a KeyError, not this guard's ValueError. 
+ if not any(self.alive is v for v in (None, False, True)): + raise ValueError( + f"IdentityResolution.alive must be None/False/True, got {self.alive!r}" + ) expected = { None: IdentityResolutionStatus.UNAVAILABLE, False: IdentityResolutionStatus.NOT_ALIVE, @@ -75,6 +83,21 @@ def __post_init__(self) -> None: f"requires identity_resolution_status=" f"{expected.value!r}, got {self.identity_resolution_status.value!r}" ) + # The lineage axis is the record's other half: a snapshot is present iff + # the status is VERIFIED (the resolver pairs them so — VERIFIED carries a + # snapshot; UNAVAILABLE/NOT_APPLICABLE carry None). Keep that pairing from + # contradicting itself too, so the whole record — not just the identity + # axis — is impossible to construct in a self-contradictory state. + snapshot_present = self.lineage_snapshot is not None + verified = self.lineage_snapshot_status is LineageSnapshotStatus.VERIFIED + if snapshot_present != verified: + raise ValueError( + f"contradictory IdentityResolution: lineage_snapshot " + f"{'present' if snapshot_present else 'absent'} requires " + f"lineage_snapshot_status" + f"{'==' if snapshot_present else '!='} VERIFIED, got " + f"{self.lineage_snapshot_status.value!r}" + ) class IdentityResolver: diff --git a/src/legis/install.py b/src/legis/install.py index 1716cf5..b44c954 100644 --- a/src/legis/install.py +++ b/src/legis/install.py @@ -303,6 +303,12 @@ def inject_instructions(file_path: Path) -> tuple[bool, str]: _atomic_write_text(file_path, content) return True, f"Updated instructions in {file_path}" + if not content.strip(): + # An existing empty / whitespace-only file is effectively a create: write + # just the block rather than leaving leading blank-line artifacts. 
+ _atomic_write_text(file_path, block + "\n") + return True, f"Created {file_path}" + if not content.endswith("\n"): content += "\n" content += "\n" + block + "\n" @@ -539,6 +545,8 @@ def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]: except UnsafeInstallPathError as exc: return False, str(exc) + recovered_backup: str | None = None # set when a corrupt file was backed up + settings: dict[str, Any] = {} if settings_path.exists(): try: @@ -553,6 +561,12 @@ def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]: except UnsafeInstallPathError as exc: return False, str(exc) shutil.copy2(settings_path, backup) + recovered_backup = backup.name + logger.warning( + "malformed .claude/settings.json backed up to %s and replaced with " + "a fresh file; reconcile any lost settings by hand", + backup.name, + ) prefix = shlex.join(_find_legis_command()) session_context_cmd = f"{prefix} session-context" @@ -582,6 +596,12 @@ def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]: except UnsafeInstallPathError as exc: return False, str(exc) shutil.copy2(settings_path, backup) + recovered_backup = backup.name + logger.warning( + "corrupt hooks structure in .claude/settings.json backed up to %s " + "before resetting it; reconcile any lost hooks by hand", + backup.name, + ) if not isinstance(settings.get("hooks"), dict): settings["hooks"] = {} @@ -599,7 +619,10 @@ def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]: ) _atomic_write_text(settings_path, json.dumps(settings, indent=2) + "\n") - return True, f"Registered hook in .claude/settings.json: {session_context_cmd}" + msg = f"Registered hook in .claude/settings.json: {session_context_cmd}" + if recovered_backup is not None: + msg += f" (backed up malformed settings.json to {recovered_backup})" + return True, msg # --------------------------------------------------------------------------- diff --git a/src/legis/mcp.py b/src/legis/mcp.py index b5bb914..bf11523 
100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -420,6 +420,12 @@ def _service_error(exc: Exception) -> dict[str, Any]: return _tool_error("SERVICE_ERROR", str(exc)) if isinstance(exc, ValueError): return _tool_error("INVALID_ARGUMENT", str(exc)) + # Unexpected: the typed cases above are expected and reach the caller as their + # own codes, so they stay quiet. This fall-through is a genuine surprise — the + # caller gets INTERNAL_ERROR, but the operator/Sentry would see nothing unless + # we log it here with the exception. (exc_info=exc, not True: _service_error + # may be called outside an active except block.) + logger.error("unhandled MCP tool error: %s", exc, exc_info=exc) return _tool_error("INTERNAL_ERROR", str(exc)) @@ -1230,8 +1236,9 @@ def _read_bounded_line(stream: TextIO, max_bytes: int) -> tuple[str, bool]: break return line, True if len(line.encode("utf-8")) > max_bytes: - # Complete (newline-terminated) but over the byte budget; framing is - # already aligned past the newline, so no drain is needed. + # Complete record (newline-terminated, or the final EOF record with no + # trailing newline) but over the byte budget; framing is already aligned + # — nothing follows the read — so no drain is needed. return line, True return line, False diff --git a/tests/identity/test_resolver.py b/tests/identity/test_resolver.py index fa3a9cf..d3bb159 100644 --- a/tests/identity/test_resolver.py +++ b/tests/identity/test_resolver.py @@ -130,6 +130,56 @@ def test_identity_resolution_accepts_the_three_consistent_shapes(): ) +def test_identity_resolution_rejects_contradictory_lineage_axis(): + # The lineage axis is the other half of the record: a snapshot is present + # iff the status is VERIFIED. Any crossed pair is self-contradictory. + ek = EntityKey.from_locator("python:function:m.f") + # VERIFIED but no snapshot. 
+ with pytest.raises(ValueError): + IdentityResolution( + ek, True, "h", None, + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.VERIFIED, + ) + # Snapshot present but status NOT_APPLICABLE. + with pytest.raises(ValueError): + IdentityResolution( + ek, False, None, {"length": 1, "hash": "x"}, + IdentityResolutionStatus.NOT_ALIVE, LineageSnapshotStatus.NOT_APPLICABLE, + ) + # Snapshot present but status UNAVAILABLE. + with pytest.raises(ValueError): + IdentityResolution( + ek, True, "h", {"length": 1, "hash": "x"}, + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.UNAVAILABLE, + ) + + +def test_identity_resolution_accepts_resolved_with_unavailable_lineage(): + # A real producer shape: RESOLVED identity but the lineage probe failed — + # snapshot None, status UNAVAILABLE. Must construct. + ek = EntityKey.from_locator("python:function:m.f") + IdentityResolution( + ek, True, "h", None, + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.UNAVAILABLE, + ) + + +def test_identity_resolution_rejects_non_bool_alive_as_value_error(): + # A non-bool alive (and int aliases like 1/0 that collide with True/False) + # must raise the guard's own ValueError, not a KeyError. 
+ ek = EntityKey.from_locator("python:function:m.f") + with pytest.raises(ValueError): + IdentityResolution( + ek, "yes", None, None, # type: ignore[arg-type] + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.VERIFIED, + ) + with pytest.raises(ValueError): + IdentityResolution( + ek, 1, None, None, # type: ignore[arg-type] + IdentityResolutionStatus.RESOLVED, LineageSnapshotStatus.VERIFIED, + ) + + def test_capability_absent_degrades_to_locator(): r = IdentityResolver(FakeClient(capable=False)) res = r.resolve("python:function:m.f") diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index 138b7f0..fdebc50 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -1634,3 +1634,59 @@ def test_read_bounded_line_enforces_bytes_not_chars(): nxt, nxt_overflow = _read_bounded_line(stream, 400) assert nxt_overflow is False assert nxt == '{"next":true}\n' + + +def test_read_bounded_line_at_byte_boundary(): + # The bound counts the trailing newline (fail-safe off-by-one): a 399-byte + # data record + "\n" == 400 bytes passes; one more byte overflows. + from legis.mcp import _read_bounded_line + + ok_line, ok_overflow = _read_bounded_line(io.StringIO("x" * 399 + "\n"), 400) + assert ok_overflow is False + assert ok_line == "x" * 399 + "\n" + + _, over_overflow = _read_bounded_line(io.StringIO("x" * 400 + "\n"), 400) + assert over_overflow is True + + +def test_read_bounded_line_drains_oversized_multibyte_record(): + # A record longer than the *character* cap forces the drain loop (first + # branch) — exercise it with multibyte content and assert the next record + # stays framed (the existing multibyte test stays under the char cap and + # hits the second branch instead). 
+ from legis.mcp import _read_bounded_line + + stream = io.StringIO("中" * 20 + "\n" + "{}\n") # 20 chars > 10-char cap + line, overflow = _read_bounded_line(stream, 10) + assert overflow is True + assert line.startswith("中") + + nxt, nxt_overflow = _read_bounded_line(stream, 10) + assert nxt == "{}\n" + assert nxt_overflow is False + + +def test_service_error_logs_unexpected_internal_error(caplog): + # An unexpected exception is surfaced to the caller as INTERNAL_ERROR; it must + # also be logged server-side (with the exception) so the operator/Sentry sees + # what the agent caller's payload alone would hide. + from legis.mcp import _service_error + + with caplog.at_level(logging.ERROR, logger="legis.mcp"): + result = _service_error(RuntimeError("kaboom")) + + assert result["structuredContent"]["error_code"] == "INTERNAL_ERROR" + assert any(r.levelno == logging.ERROR and r.exc_info for r in caplog.records) + + +def test_service_error_does_not_log_expected_typed_errors(caplog): + # Expected, typed service errors map to typed codes and must NOT spam the + # server log — only the unexpected INTERNAL_ERROR fall-through logs. 
+ from legis.mcp import _service_error + from legis.service.errors import NotFoundError + + with caplog.at_level(logging.ERROR, logger="legis.mcp"): + result = _service_error(NotFoundError("nope")) + + assert result["structuredContent"]["error_code"] == "NOT_FOUND" + assert not caplog.records diff --git a/tests/test_install.py b/tests/test_install.py index df5910e..2fbc962 100644 --- a/tests/test_install.py +++ b/tests/test_install.py @@ -323,6 +323,68 @@ def test_inject_bounded_recovery_is_idempotent(tmp_path): assert "wardline body" in second +def test_inject_into_empty_file_produces_clean_single_block(tmp_path): + """An existing zero-byte file gets a clean block, not leading blank lines.""" + target = tmp_path / "CLAUDE.md" + target.write_text("") + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert content.count(INSTRUCTIONS_MARKER) == 1 + # No leading blank-line artifact: the block starts at byte 0. + assert content.startswith(INSTRUCTIONS_MARKER) + + +def test_inject_crlf_file_preserves_foreign_block(tmp_path): + """A CRLF-terminated shared file: the sibling block still survives recovery.""" + target = tmp_path / "CLAUDE.md" + target.write_bytes( + ( + "HEAD\r\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\r\n" + "legis body, block NOT closed\r\n" + "\r\n" + "wardline body\r\n" + "\r\n" + ).encode("utf-8") + ) + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + assert "wardline body" in content + assert "" in content + assert content.count(INSTRUCTIONS_MARKER) == 1 + + +def test_inject_two_clean_legis_blocks_canonicalises_first_keeps_second(tmp_path, caplog): + """Two well-formed legis blocks: the first is canonicalised, the second is kept. + + Bounding at the first own close (not EOF) is deliberate — it preserves any + trailing content legis does not own, so a second block in the tail is surfaced + via a warning rather than silently deleted. 
Collapsing would require a deletion + window over the bytes between the two blocks, which may be user content. + """ + target = tmp_path / "CLAUDE.md" + target.write_text( + "HEAD\n" + f"{INSTRUCTIONS_MARKER}:vX:dead -->\n" + "first legis body\n" + "\n" + f"{INSTRUCTIONS_MARKER}:vY:beef -->\n" + "second legis body\n" + "\n" + ) + with caplog.at_level(logging.WARNING, logger="legis.install"): + ok, _ = inject_instructions(target) + assert ok + content = target.read_text() + # First block canonicalised (stale body gone); second block NOT deleted. + assert "first legis body" not in content + assert "second legis body" in content + # The surviving duplicate is surfaced, not silent. + assert caplog.records + + # --------------------------------------------------------------------------- # _atomic_write_text # --------------------------------------------------------------------------- @@ -442,15 +504,19 @@ def test_install_hooks_upgrades_bare_command(tmp_path, monkeypatch): assert cmds.count("/opt/bin/legis session-context") == 1 -def test_install_hooks_backs_up_malformed_settings(tmp_path): +def test_install_hooks_backs_up_malformed_settings(tmp_path, caplog): claude = tmp_path / ".claude" claude.mkdir() (claude / "settings.json").write_text("{ this is not json") - ok, _ = install_claude_code_hooks(tmp_path) + with caplog.at_level(logging.WARNING, logger="legis.install"): + ok, msg = install_claude_code_hooks(tmp_path) assert ok assert (claude / "settings.json.bak").is_file() settings = json.loads((claude / "settings.json").read_text()) assert any(c.endswith("session-context") for c in _session_commands(settings)) + # The reset is not silent: the user is told a backup was written. 
+ assert ".bak" in msg + assert ".bak" in caplog.text def test_install_hooks_does_not_reuse_scoped_block(tmp_path): @@ -608,7 +674,7 @@ def test_install_hooks_backs_up_nested_corrupt_structure(tmp_path): claude = tmp_path / ".claude" claude.mkdir() (claude / "settings.json").write_text(json.dumps({"hooks": "important user data", "keep": 1})) - ok, _ = install_claude_code_hooks(tmp_path) + ok, msg = install_claude_code_hooks(tmp_path) assert ok bak = claude / "settings.json.bak" assert bak.is_file() @@ -616,6 +682,8 @@ def test_install_hooks_backs_up_nested_corrupt_structure(tmp_path): settings = json.loads((claude / "settings.json").read_text()) assert settings.get("keep") == 1 # sibling key preserved assert any(c.endswith("session-context") for c in _session_commands(settings)) + # The recovery of the corrupt nested structure is surfaced, not silent. + assert ".bak" in msg def test_install_skills_restores_original_on_genuine_swap_failure(tmp_path, monkeypatch): From 8ad8b3b1069596dce3ba4a1821103ba8d00e1f5b Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 04:50:12 +1000 Subject: [PATCH 35/72] docs(readme): refresh rc1-era status to rc4 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Status section still claimed 1.0.0rc1 ("first release candidate") and described the agent-facing MCP surface as "forthcoming" — it has since shipped (`legis mcp`), as has self-install (`legis install`). Update the version and the two stale feature claims; no other content change. Co-Authored-By: Claude Opus 4.8 (1M context) --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 1a3fe9a..f9f4b2f 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ Legis is the fourth Weft product: the git/CI and governance side of the suite's ## Status -Legis is at **`1.0.0rc1`** — the first release candidate. 
The standalone git/CI surfaces, the graded 2×2 enforcement engine, the agent-programmable policy grammar, SEI-keyed attestations, and the Wardline/Filigree suite combinations are all built and tested; the git-rename provider to Loomweave is contract-locked, operative pending Loomweave's committed-range driving. The transport-agnostic service layer (WP-M1) underpinning the forthcoming agent-facing MCP surface has landed. See the combination matrix below for per-pairing status and `CHANGELOG.md` for the release notes. +Legis is at **`1.0.0rc4`** — the fourth release candidate. The standalone git/CI surfaces, the graded 2×2 enforcement engine, the agent-programmable policy grammar, SEI-keyed attestations, and the Wardline/Filigree suite combinations are all built and tested; the git-rename provider to Loomweave is contract-locked, operative pending Loomweave's committed-range driving. The transport-agnostic service layer (WP-M1) and the agent-facing MCP surface on top of it have landed (`legis mcp`), and Legis now stands itself up via `legis install` (instruction block + `legis-workflow` skill pack + SessionStart hook). See the combination matrix below for per-pairing status and `CHANGELOG.md` for the release notes. ## The Weft suite From 38836acbb2785e4e35fb98c1316230ba1d3e61d6 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 07:45:40 +1000 Subject: [PATCH 36/72] feat(config): consolidate legis stores under .weft/legis federation subtree Move all four cwd-relative SQLite stores (check, governance, binding, pulls) into the federated .weft/legis/ subtree, the convention shared with the other weft members. legis is the sole writer of this subtree. 
- Replace the import-time DEFAULT_CHECK_DB/DEFAULT_GOVERNANCE_DB constants with lazy resolver functions in config.py (check_db_url/governance_db_url/ binding_db_url/pull_db_url); fold in the binding/pull URLs that were inline literals duplicated across app.py and mcp.py. - Read the operator-authored [legis] table from weft.toml (READ-ONLY, enrich-only): a store_dir knob relocates the subtree. An absent, section-less, or malformed weft.toml still boots on built-in defaults -- never load-bearing. - ensure_sqlite_parent() creates the parent dir at store-open time (in the three store __init__s), never at URL-compute time, preserving the MCP initialize-path no-leak guarantee. - Precedence: LEGIS_*_DB env var > weft.toml [legis] store_dir > default. - Clean break: no legacy fallback. Existing deployments move their files into .weft/legis/ or pin the LEGIS_*_DB env vars. - Operator signing keys are untouched (env-provided secrets, not files). - .gitignore + install injector ignore .weft/legis/ only (never .weft/ wholesale); CI governance-gate uses the resolved default store. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/ci.yml | 4 +- .gitignore | 2 + src/legis/api/app.py | 24 ++--- src/legis/checks/surface.py | 3 + src/legis/cli.py | 8 +- src/legis/config.py | 137 ++++++++++++++++++++++++++++- src/legis/install.py | 8 +- src/legis/mcp.py | 18 ++-- src/legis/pulls/surface.py | 3 + src/legis/store/audit_store.py | 5 ++ tests/api/test_combinations_api.py | 2 +- tests/mcp/test_server.py | 9 +- tests/test_config.py | 95 ++++++++++++++++++++ 13 files changed, 286 insertions(+), 32 deletions(-) create mode 100644 tests/test_config.py diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index a6c44fb..8661f6e 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -50,4 +50,6 @@ jobs: # Remove this once a real governance DB is wired into CI. 
env: LEGIS_ALLOW_MISSING_GOVERNANCE_DB: "1" - run: uv run legis governance-gate --db sqlite:///legis-governance.db + # No --db: use the resolved default store (.weft/legis/legis-governance.db), + # the same location the server/MCP write to. + run: uv run legis governance-gate diff --git a/.gitignore b/.gitignore index e9e7df6..9c7a00b 100644 --- a/.gitignore +++ b/.gitignore @@ -42,3 +42,5 @@ wardline.yaml *.db-wal .legis/ legis.yaml +# Federated runtime-state subtree (legis is the sole writer; never .weft/ wholesale) +.weft/legis/ diff --git a/src/legis/api/app.py b/src/legis/api/app.py index 2c3086f..e26703a 100644 --- a/src/legis/api/app.py +++ b/src/legis/api/app.py @@ -26,10 +26,14 @@ from pydantic import BaseModel from legis import __version__ -# Re-exported so existing `from legis.api.app import DEFAULT_*_DB` call sites -# keep working, while the canonical definition lives in the transport-agnostic -# config module instead of the HTTP layer (Q-H2). -from legis.config import DEFAULT_CHECK_DB, DEFAULT_GOVERNANCE_DB +# Store-location resolvers live in the transport-agnostic config module, not the +# HTTP layer, so `mcp` and any other composition root share one source (Q-H2). 
+from legis.config import ( + binding_db_url, + check_db_url, + governance_db_url, + pull_db_url, +) from legis.checks.models import CheckOutcome, CheckRun from legis.checks.surface import CheckSurface from legis.enforcement.engine import EnforcementEngine @@ -336,7 +340,7 @@ def create_app( from legis.clock import SystemClock from legis.store.audit_store import AuditStore - gov_db_url = os.environ.get("LEGIS_GOVERNANCE_DB", DEFAULT_GOVERNANCE_DB) + gov_db_url = os.environ.get("LEGIS_GOVERNANCE_DB", governance_db_url()) gov_store = AuditStore(gov_db_url) clock = SystemClock() @@ -367,7 +371,7 @@ def create_app( if binding_ledger is None: from legis.governance.binding_ledger import BindingLedger - bind_db_url = os.environ.get("LEGIS_BINDING_DB", "sqlite:///legis-binding.db") + bind_db_url = os.environ.get("LEGIS_BINDING_DB", binding_db_url()) binding_ledger = BindingLedger(AuditStore(bind_db_url), clock, hmac_key) state: dict[str, Any] = { "checks": check_surface, @@ -381,13 +385,13 @@ def git() -> GitSurface: def checks() -> CheckSurface: if state["checks"] is None: - check_db = os.environ.get("LEGIS_CHECK_DB", DEFAULT_CHECK_DB) + check_db = os.environ.get("LEGIS_CHECK_DB", check_db_url()) state["checks"] = CheckSurface(check_db) return state["checks"] def pulls() -> PullSurface: if state["pulls"] is None: - pull_db = os.environ.get("LEGIS_PULL_DB", "sqlite:///legis-pulls.db") + pull_db = os.environ.get("LEGIS_PULL_DB", pull_db_url()) state["pulls"] = PullSurface(pull_db) return state["pulls"] @@ -396,7 +400,7 @@ def engine() -> EnforcementEngine: from legis.clock import SystemClock from legis.store.audit_store import AuditStore - gov_db_url = os.environ.get("LEGIS_GOVERNANCE_DB", DEFAULT_GOVERNANCE_DB) + gov_db_url = os.environ.get("LEGIS_GOVERNANCE_DB", governance_db_url()) state["enforcement"] = EnforcementEngine( AuditStore(gov_db_url), SystemClock() ) @@ -820,7 +824,7 @@ def wardline_scan_results(body: ScanResultsIn, actor: str = Depends(verify_write raise 
HTTPException(status_code=422, detail=f"unknown cell/severity: {exc}") # Only provision the governance store when a surface cell can actually run: - # engine() lazily creates legis-governance.db, so a pure block_escalate scan + # engine() lazily creates .weft/legis/legis-governance.db, so a pure block_escalate scan # must not touch it. signoff_gate is an injected param (no side effect). needs_engine = bool(cells & {WardlineCellPolicy.SURFACE_OVERRIDE, WardlineCellPolicy.SURFACE_ONLY}) diff --git a/src/legis/checks/surface.py b/src/legis/checks/surface.py index d627ef8..55cfa91 100644 --- a/src/legis/checks/surface.py +++ b/src/legis/checks/surface.py @@ -27,6 +27,9 @@ class CheckSurface: def __init__(self, db_url: str) -> None: + from legis.config import ensure_sqlite_parent + + ensure_sqlite_parent(db_url) self._engine = create_engine(db_url, future=True, poolclass=NullPool) self._md = MetaData() self._runs = Table( diff --git a/src/legis/cli.py b/src/legis/cli.py index 085ec77..49f58b8 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -90,14 +90,16 @@ def build_parser() -> argparse.ArgumentParser: _add_judge_flags(mcp) import os - gov_db_default = os.environ.get("LEGIS_GOVERNANCE_DB", "sqlite:///legis-governance.db") + + from legis.config import governance_db_url + gov_db_default = os.environ.get("LEGIS_GOVERNANCE_DB", governance_db_url()) rate = subparsers.add_parser( "check-override-rate", help="Fail (exit 1) if the override-rate gate is FAIL — for CI", ) rate.add_argument( "--db", default=gov_db_default, - help="Governance store URL (mirrors the server's DEFAULT_GOVERNANCE_DB)", + help="Governance store URL (defaults to the server's governance store)", ) gate = subparsers.add_parser( "governance-gate", @@ -105,7 +107,7 @@ def build_parser() -> argparse.ArgumentParser: ) gate.add_argument( "--db", default=gov_db_default, - help="Governance store URL (mirrors the server's DEFAULT_GOVERNANCE_DB)", + help="Governance store URL (defaults to the server's 
governance store)", ) backfill = subparsers.add_parser( "sei-backfill", diff --git a/src/legis/config.py b/src/legis/config.py index c3ea9b7..5447214 100644 --- a/src/legis/config.py +++ b/src/legis/config.py @@ -1,13 +1,142 @@ -"""Shared default store locations — the single source for the governance and -check database URLs. +"""Store-location resolver — the single source for legis's database URLs. These previously lived on ``legis.api.app``, which forced ``mcp`` (and any other composition root) to import from the HTTP layer just to learn where the governance store lives (Q-H2). They are transport-agnostic configuration, so they belong here; ``api`` and ``mcp`` both import them from this module. + +**Federated store layout.** legis's machine-written runtime state lives under +``.weft/legis/`` at the project root — the federation convention shared with +the other weft members. legis is the *sole writer* of this subtree. Resolution +is anchored at the current working directory: the same notion the installer +uses (``cli.py`` sets ``project_root = Path.cwd()``), and every member resolves +``.weft/`` against that same cwd, so running each tool from the project root +keeps them in agreement. The default URLs are therefore cwd-relative +(``sqlite:///.weft/legis/...``), preserving the historical resolution semantics. + +**weft.toml is enrich-only, never load-bearing.** The operator-authored +``weft.toml`` may carry a ``[legis]`` table; we read it but never write it. +The single enrichment knob is ``store_dir`` (relocate the subtree; relative to +the project root, or absolute). Per-DB overrides remain the ``LEGIS_*_DB`` env +vars, which take precedence over weft.toml. An absent file, an absent +``[legis]`` section, or even a malformed weft.toml must still boot on the +built-in defaults — legis never *depends* on weft.toml (Doctrine §5 deletion +test). + +**Clean break.** There is no fallback to the old cwd-root locations +(``legis-governance.db`` &c.). 
Existing deployments move their files into +``.weft/legis/`` or pin the ``LEGIS_*_DB`` env vars. + +**Keys are out of scope.** Operator-held signing keys are the authority-key +carve-out — capability-confined and deliberately not agent-reachable. They are +env-provided secrets, not files under this subtree; nothing here touches key +storage. """ from __future__ import annotations -DEFAULT_CHECK_DB = "sqlite:///legis-checks.db" -DEFAULT_GOVERNANCE_DB = "sqlite:///legis-governance.db" +import logging +import tomllib +from pathlib import Path + +from sqlalchemy.engine import make_url + +logger = logging.getLogger(__name__) + +WEFT_MEMBER = "legis" + +# Built-in DB filenames under the member's runtime-state subtree. The legacy +# names are preserved so a clean-break move is a relocation, not a rename. +_CHECK_DB_NAME = "legis-checks.db" +_GOVERNANCE_DB_NAME = "legis-governance.db" +_BINDING_DB_NAME = "legis-binding.db" +_PULL_DB_NAME = "legis-pulls.db" + + +def project_root() -> Path: + """The directory the federation treats as project root (the cwd).""" + return Path.cwd() + + +def _weft_legis_config() -> dict: + """Read the operator-authored ``[legis]`` table from ``weft.toml``. + + Returns an empty enrichment ({}) when the file is absent, has no ``[legis]`` + table, or cannot be parsed — weft.toml is never load-bearing, so a missing + or broken operator file degrades to built-in defaults rather than failing + boot. We are READ-ONLY here; this function never writes weft.toml. + """ + path = project_root() / "weft.toml" + try: + with path.open("rb") as fh: + data = tomllib.load(fh) + except FileNotFoundError: + return {} + except (OSError, tomllib.TOMLDecodeError): + # A broken operator file must not be load-bearing. Surface it on the log + # (so a fat-fingered weft.toml is diagnosable) but boot on defaults. 
+ logger.warning( + "weft.toml present but unreadable (%s); legis booting on built-in " + "store defaults", + path, + exc_info=True, + ) + return {} + section = data.get(WEFT_MEMBER) + return section if isinstance(section, dict) else {} + + +def _store_dir() -> Path: + """The runtime-state subtree: ``.weft/legis`` by default, or the operator's + ``[legis] store_dir`` if set. Relative paths resolve against cwd at connect + time (three-slash URL); an absolute store_dir yields an absolute URL. + """ + configured = _weft_legis_config().get("store_dir") + if isinstance(configured, str) and configured: + return Path(configured) + return Path(".weft") / WEFT_MEMBER + + +def _sqlite_url(path: Path) -> str: + """Render a filesystem path as a SQLite URL, preserving relative-ness. + + A relative path stays relative (``sqlite:///.weft/legis/x.db``, resolved by + SQLite against cwd); an absolute path renders with the leading slash intact + (``sqlite:////abs/x.db``). + """ + return f"sqlite:///{path.as_posix()}" + + +def check_db_url() -> str: + return _sqlite_url(_store_dir() / _CHECK_DB_NAME) + + +def governance_db_url() -> str: + return _sqlite_url(_store_dir() / _GOVERNANCE_DB_NAME) + + +def binding_db_url() -> str: + return _sqlite_url(_store_dir() / _BINDING_DB_NAME) + + +def pull_db_url() -> str: + return _sqlite_url(_store_dir() / _PULL_DB_NAME) + + +def ensure_sqlite_parent(url: str) -> None: + """Create the parent directory for a SQLite *file* URL, if needed. + + Called at store-open time (not at URL-compute time) so that merely importing + config or computing a default URL never litters ``.weft/`` directories — the + subtree appears only when a DB is actually opened. No-op for in-memory or + non-SQLite URLs. SQLite creates the ``.db`` file but never its parent, so + without this an open against a fresh ``.weft/legis/`` raises "unable to open + database file". 
+ """ + parsed = make_url(url) + if not parsed.drivername.startswith("sqlite"): + return + database = parsed.database + if not database or database == ":memory:": + return + Path(database).expanduser().parent.mkdir(parents=True, exist_ok=True) diff --git a/src/legis/install.py b/src/legis/install.py index b44c954..78cf312 100644 --- a/src/legis/install.py +++ b/src/legis/install.py @@ -629,11 +629,15 @@ def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]: # .gitignore # --------------------------------------------------------------------------- -_LEGIS_IGNORE_RULES = (".legis/", "legis.yaml") +# Only legis's OWN rules — never another member's. ``.weft/legis/`` is legis's +# machine-written runtime-state subtree (DBs &c.); ``.weft/`` as a whole is the +# shared federation namespace and must NOT be claimed wholesale here. +_LEGIS_IGNORE_RULES = (".legis/", "legis.yaml", ".weft/legis/") _LEGIS_IGNORE_BLOCK = ( - "\n# Legis — local working dir / config (regenerated/local; never commit)\n" + "\n# Legis — local working dir / config + runtime state (regenerated/local; never commit)\n" ".legis/\n" "legis.yaml\n" + ".weft/legis/\n" ) diff --git a/src/legis/mcp.py b/src/legis/mcp.py index bf11523..0d877ac 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -158,7 +158,7 @@ def _load_policy_cell_registry() -> PolicyCellRegistry: def build_runtime(agent_id: str) -> McpRuntime: - from legis.config import DEFAULT_GOVERNANCE_DB + from legis.config import binding_db_url, governance_db_url clock = SystemClock() engine = None @@ -179,7 +179,7 @@ def build_runtime(agent_id: str) -> McpRuntime: hmac_key = os.environ.get("LEGIS_HMAC_KEY") if hmac_key: key = hmac_key.encode("utf-8") - store = AuditStore(os.environ.get("LEGIS_GOVERNANCE_DB", DEFAULT_GOVERNANCE_DB)) + store = AuditStore(os.environ.get("LEGIS_GOVERNANCE_DB", governance_db_url())) protected_policies_str = os.environ.get("LEGIS_PROTECTED_POLICIES", "") protected_policies = frozenset( p.strip() for p in 
protected_policies_str.split(",") if p.strip() @@ -198,7 +198,7 @@ def build_runtime(agent_id: str) -> McpRuntime: from legis.governance.binding_ledger import BindingLedger binding_ledger = BindingLedger( - AuditStore(os.environ.get("LEGIS_BINDING_DB", "sqlite:///legis-binding.db")), + AuditStore(os.environ.get("LEGIS_BINDING_DB", binding_db_url())), clock, key, ) @@ -559,27 +559,29 @@ def _git(runtime: McpRuntime) -> GitSurface: def _engine(runtime: McpRuntime) -> EnforcementEngine: if runtime.engine is None: - from legis.config import DEFAULT_GOVERNANCE_DB + from legis.config import governance_db_url - store = AuditStore(os.environ.get("LEGIS_GOVERNANCE_DB", DEFAULT_GOVERNANCE_DB)) + store = AuditStore(os.environ.get("LEGIS_GOVERNANCE_DB", governance_db_url())) runtime.engine = EnforcementEngine(store, SystemClock()) return runtime.engine def _checks(runtime: McpRuntime) -> CheckSurface: if runtime.check_surface is None: - from legis.config import DEFAULT_CHECK_DB + from legis.config import check_db_url runtime.check_surface = CheckSurface( - os.environ.get("LEGIS_CHECK_DB", DEFAULT_CHECK_DB) + os.environ.get("LEGIS_CHECK_DB", check_db_url()) ) return runtime.check_surface def _pulls(runtime: McpRuntime) -> PullSurface: if runtime.pull_surface is None: + from legis.config import pull_db_url + runtime.pull_surface = PullSurface( - os.environ.get("LEGIS_PULL_DB", "sqlite:///legis-pulls.db") + os.environ.get("LEGIS_PULL_DB", pull_db_url()) ) return runtime.pull_surface diff --git a/src/legis/pulls/surface.py b/src/legis/pulls/surface.py index 7c17eb6..753db20 100644 --- a/src/legis/pulls/surface.py +++ b/src/legis/pulls/surface.py @@ -10,6 +10,9 @@ class PullSurface: def __init__(self, db_url: str) -> None: + from legis.config import ensure_sqlite_parent + + ensure_sqlite_parent(db_url) self._engine = create_engine(db_url, future=True, poolclass=NullPool) self._md = MetaData() self._pulls = Table( diff --git a/src/legis/store/audit_store.py 
b/src/legis/store/audit_store.py index a85a516..f5a97f8 100644 --- a/src/legis/store/audit_store.py +++ b/src/legis/store/audit_store.py @@ -98,6 +98,11 @@ def _chain(prev_hash: str, c_hash: str) -> str: class AuditStore: def __init__(self, url: str) -> None: + # The federated store subtree (.weft/legis) is created lazily, here at + # open time — SQLite makes the .db file but never its parent directory. + from legis.config import ensure_sqlite_parent + + ensure_sqlite_parent(url) # NullPool: hold no connection between operations — an append-only # audit store wants no lingering locks and clean resource lifecycle. self._engine = create_engine(url, future=True, poolclass=NullPool) diff --git a/tests/api/test_combinations_api.py b/tests/api/test_combinations_api.py index 62edb21..16ca506 100644 --- a/tests/api/test_combinations_api.py +++ b/tests/api/test_combinations_api.py @@ -386,7 +386,7 @@ def test_scan_results_rejects_both_or_neither_cell_form(tmp_path): def test_scan_results_block_escalate_only_needs_no_engine(tmp_path): # A pure block_escalate scan must route with only a signoff gate wired — no - # enforcement engine, so engine()'s lazy legis-governance.db is never created. + # enforcement engine, so engine()'s lazy .weft/legis/legis-governance.db is never created. 
sg = SignoffGate(AuditStore(f"sqlite:///{tmp_path / 's.db'}"), FixedClock("2026-06-02T12:00:00+00:00")) c = TestClient(create_app(signoff_gate=sg)) # NOT _client: no enforcement injected diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index fdebc50..4a20d8d 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -245,9 +245,12 @@ def test_build_runtime_initialize_does_not_create_local_state(tmp_path, monkeypa ) assert responses[0]["result"]["serverInfo"]["name"] == "legis" - assert not (tmp_path / "legis-governance.db").exists() - assert not (tmp_path / "legis-checks.db").exists() - assert not (tmp_path / "legis-pulls.db").exists() + # The federated store subtree must not be created on the initialize path — + # stores are opened lazily, so neither the .weft/legis dir nor any DB appears. + assert not (tmp_path / ".weft").exists() + assert not (tmp_path / ".weft" / "legis" / "legis-governance.db").exists() + assert not (tmp_path / ".weft" / "legis" / "legis-checks.db").exists() + assert not (tmp_path / ".weft" / "legis" / "legis-pulls.db").exists() def test_policy_explain_returns_service_explanation_payload(tmp_path): diff --git a/tests/test_config.py b/tests/test_config.py new file mode 100644 index 0000000..53789a5 --- /dev/null +++ b/tests/test_config.py @@ -0,0 +1,95 @@ +"""Store-location resolver: the federated ``.weft/legis`` subtree. + +These pin the contract from the weft config/store consolidation: + * machine-written DBs default under ``.weft/legis/`` (cwd-anchored, the same + notion the installer uses for project root); + * the operator-authored ``weft.toml`` ``[legis]`` table may relocate the + subtree but is enrich-only — absent, section-less, or malformed weft.toml + must still boot on built-in defaults (never load-bearing); + * computing a URL is pure (creates nothing); the directory materialises only + when a DB is actually opened, via ``ensure_sqlite_parent``. 
+""" + +from __future__ import annotations + +from legis import config + + +def test_all_four_db_urls_default_under_weft_legis(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + assert config.check_db_url() == "sqlite:///.weft/legis/legis-checks.db" + assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" + assert config.binding_db_url() == "sqlite:///.weft/legis/legis-binding.db" + assert config.pull_db_url() == "sqlite:///.weft/legis/legis-pulls.db" + + +def test_db_urls_use_builtin_defaults_with_no_weft_toml(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + assert not (tmp_path / "weft.toml").exists() + assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" + + +def test_weft_toml_store_dir_relocates_the_subtree(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text( + '[legis]\nstore_dir = "var/legis-state"\n', encoding="utf-8" + ) + assert config.governance_db_url() == "sqlite:///var/legis-state/legis-governance.db" + assert config.check_db_url() == "sqlite:///var/legis-state/legis-checks.db" + + +def test_weft_toml_absolute_store_dir_yields_absolute_url(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + abs_dir = tmp_path / "srv" / "legis" + (tmp_path / "weft.toml").write_text( + f'[legis]\nstore_dir = "{abs_dir.as_posix()}"\n', encoding="utf-8" + ) + assert config.governance_db_url() == f"sqlite:///{abs_dir.as_posix()}/legis-governance.db" + + +def test_weft_toml_without_legis_section_uses_defaults(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text('[filigree]\ndb = "x"\n', encoding="utf-8") + assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" + + +def test_malformed_weft_toml_is_not_load_bearing(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text("this is = = not valid toml [[[", encoding="utf-8") + assert config.governance_db_url() == 
"sqlite:///.weft/legis/legis-governance.db" + + +def test_computing_db_url_creates_no_directories(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + _ = config.governance_db_url() + _ = config.check_db_url() + _ = config.binding_db_url() + _ = config.pull_db_url() + assert not (tmp_path / ".weft").exists() + + +def test_ensure_sqlite_parent_creates_dir_for_relative_file_url(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + config.ensure_sqlite_parent("sqlite:///.weft/legis/legis-checks.db") + assert (tmp_path / ".weft" / "legis").is_dir() + + +def test_ensure_sqlite_parent_creates_dir_for_absolute_file_url(tmp_path): + target = tmp_path / "a" / "b" / "x.db" + config.ensure_sqlite_parent(f"sqlite:///{target.as_posix()}") + assert (tmp_path / "a" / "b").is_dir() + + +def test_ensure_sqlite_parent_is_noop_for_in_memory_and_non_sqlite(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + config.ensure_sqlite_parent("sqlite://") + config.ensure_sqlite_parent("sqlite:///:memory:") + config.ensure_sqlite_parent("postgresql://localhost/x") + assert list(tmp_path.iterdir()) == [] + + +def test_ensure_sqlite_parent_is_idempotent(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + config.ensure_sqlite_parent("sqlite:///.weft/legis/legis-checks.db") + config.ensure_sqlite_parent("sqlite:///.weft/legis/legis-checks.db") + assert (tmp_path / ".weft" / "legis").is_dir() From 12d24942214e031607b2d8953c3af48d5b487659 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 07:57:18 +1000 Subject: [PATCH 37/72] fix(tests): isolate store locations; retire vestigial gitignore entries MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Resolves legis-3d295a6f7f. 
Test-isolation leak: several tests built default-path stores without
pinning a location, so with the .weft/legis consolidation they dropped a
real subtree into the repo working tree (previously bare legis-*.db;
gitignored either way, so invisible to git status). Fix it centrally with
an autouse conftest fixture that redirects the four LEGIS_*_DB env vars to
a per-test tmp dir — isolating every current and future default-path test.
A test that sets/deletes its own LEGIS_*_DB still overrides it. Guarded by
a regression test pinning the fixture; a full suite run now leaves the
repo root clean.

Also retire the now-vestigial .legis/ and legis.yaml gitignore entries (no
legis code reads them; legis.yaml was the per-member config that weft.toml
[legis] replaces). The install injector and repo .gitignore now ignore
.weft/legis/ only; affected install tests updated.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .gitignore                |  2 --
 src/legis/install.py      | 13 +++++++------
 tests/conftest.py         | 24 ++++++++++++++++++++++++
 tests/test_cli_install.py |  2 +-
 tests/test_config.py      | 18 ++++++++++++++++++
 tests/test_install.py     | 17 +++++++----------
 6 files changed, 57 insertions(+), 19 deletions(-)

diff --git a/.gitignore b/.gitignore
index 9c7a00b..112731a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -40,7 +40,5 @@ wardline.yaml
 *.db
 *.db-shm
 *.db-wal
-.legis/
-legis.yaml
 # Federated runtime-state subtree (legis is the sole writer; never .weft/ wholesale)
 .weft/legis/
diff --git a/src/legis/install.py b/src/legis/install.py
index 78cf312..0a9527e 100644
--- a/src/legis/install.py
+++ b/src/legis/install.py
@@ -631,18 +631,19 @@ def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]:
 # Only legis's OWN rules — never another member's. ``.weft/legis/`` is legis's
 # machine-written runtime-state subtree (DBs &c.); ``.weft/`` as a whole is the
-# shared federation namespace and must NOT be claimed wholesale here.
-_LEGIS_IGNORE_RULES = (".legis/", "legis.yaml", ".weft/legis/") +# shared federation namespace and must NOT be claimed wholesale here. The legacy +# ``.legis/`` / ``legis.yaml`` surfaces were retired with the weft store +# consolidation — no legis code reads them (``legis.yaml`` was the per-member +# config that ``weft.toml`` ``[legis]`` now replaces). +_LEGIS_IGNORE_RULES = (".weft/legis/",) _LEGIS_IGNORE_BLOCK = ( - "\n# Legis — local working dir / config + runtime state (regenerated/local; never commit)\n" - ".legis/\n" - "legis.yaml\n" + "\n# Legis — machine-written runtime state (regenerated/local; never commit)\n" ".weft/legis/\n" ) def ensure_gitignore(project_root: Path) -> tuple[bool, str]: - """Ensure legis's local config surface (``.legis/``, ``legis.yaml``) is ignored.""" + """Ensure legis's runtime-state subtree (``.weft/legis/``) is ignored.""" try: gitignore = project_path(project_root, ".gitignore") except UnsafeInstallPathError as exc: diff --git a/tests/conftest.py b/tests/conftest.py index 2db5518..0f466f8 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -15,6 +15,30 @@ } +@pytest.fixture(autouse=True) +def _isolate_legis_store_locations( + tmp_path_factory: pytest.TempPathFactory, monkeypatch: pytest.MonkeyPatch +) -> None: + """Redirect every legis store to a per-test tmp dir. + + Store URLs default to the cwd-relative ``.weft/legis/`` subtree (see + ``legis.config``); a test that builds a default-path store without pinning a + location would otherwise drop that subtree into the repo working tree. + Pointing the four ``LEGIS_*_DB`` env vars at a unique tmp directory isolates + them centrally for the whole suite (legis-3d295a6f7f). A test that sets — or + deletes — its own ``LEGIS_*_DB`` still overrides this, since its monkeypatch + runs after the fixture. 
+ """ + store = tmp_path_factory.mktemp("legis-store") + for var, name in ( + ("LEGIS_CHECK_DB", "legis-checks.db"), + ("LEGIS_GOVERNANCE_DB", "legis-governance.db"), + ("LEGIS_BINDING_DB", "legis-binding.db"), + ("LEGIS_PULL_DB", "legis-pulls.db"), + ): + monkeypatch.setenv(var, f"sqlite:///{(store / name).as_posix()}") + + @pytest.fixture def unsafe_dev_auth(monkeypatch: pytest.MonkeyPatch) -> None: monkeypatch.setenv("LEGIS_UNSAFE_DEV_AUTH", "1") diff --git a/tests/test_cli_install.py b/tests/test_cli_install.py index 413c5c5..1ad799c 100644 --- a/tests/test_cli_install.py +++ b/tests/test_cli_install.py @@ -21,7 +21,7 @@ def test_install_all_creates_every_artifact(tmp_path, monkeypatch, capsys): settings = json.loads((tmp_path / ".claude" / "settings.json").read_text()) assert "SessionStart" in settings["hooks"] gitignore = (tmp_path / ".gitignore").read_text() - assert ".legis/" in gitignore and "legis.yaml" in gitignore + assert ".weft/legis/" in gitignore def test_install_selective_gitignore_only(tmp_path, monkeypatch): diff --git a/tests/test_config.py b/tests/test_config.py index 53789a5..a12e944 100644 --- a/tests/test_config.py +++ b/tests/test_config.py @@ -93,3 +93,21 @@ def test_ensure_sqlite_parent_is_idempotent(tmp_path, monkeypatch): config.ensure_sqlite_parent("sqlite:///.weft/legis/legis-checks.db") config.ensure_sqlite_parent("sqlite:///.weft/legis/legis-checks.db") assert (tmp_path / ".weft" / "legis").is_dir() + + +def test_suite_isolates_store_locations_to_tmp(): + """Regression guard for legis-3d295a6f7f: the autouse conftest fixture must + redirect every store env var off the repo-relative `.weft/legis/` default, + so a test that builds a default-path store can't leak a subtree into the + working tree.""" + import os + + for var in ( + "LEGIS_CHECK_DB", + "LEGIS_GOVERNANCE_DB", + "LEGIS_BINDING_DB", + "LEGIS_PULL_DB", + ): + val = os.environ.get(var, "") + assert val.startswith("sqlite:"), f"{var} not redirected: {val!r}" + assert 
"legis-store" in val, f"{var} not pointed at the isolated tmp dir: {val!r}" diff --git a/tests/test_install.py b/tests/test_install.py index 2fbc962..44bca1b 100644 --- a/tests/test_install.py +++ b/tests/test_install.py @@ -571,8 +571,7 @@ def test_ensure_gitignore_creates_file(tmp_path): ok, msg = ensure_gitignore(tmp_path) assert ok content = (tmp_path / ".gitignore").read_text() - assert ".legis/" in content - assert "legis.yaml" in content + assert ".weft/legis/" in content def test_ensure_gitignore_appends_missing_rules(tmp_path): @@ -581,8 +580,7 @@ def test_ensure_gitignore_appends_missing_rules(tmp_path): assert ok content = (tmp_path / ".gitignore").read_text() assert "*.db" in content - assert ".legis/" in content - assert "legis.yaml" in content + assert ".weft/legis/" in content def test_ensure_gitignore_idempotent(tmp_path): @@ -720,12 +718,11 @@ def test_inject_append_keeps_marker_off_users_last_line(tmp_path): assert content[idx - 1] == "\n" -def test_ensure_gitignore_partial_present_appends_only_missing(tmp_path): - (tmp_path / ".gitignore").write_text("*.db\n.legis/\n") # legis.yaml absent +def test_ensure_gitignore_present_among_other_rules_not_duplicated(tmp_path): + # legis's rule already present alongside unrelated rules → nothing to add. 
+ (tmp_path / ".gitignore").write_text("*.db\n.weft/legis/\n") ok, msg = ensure_gitignore(tmp_path) assert ok - assert "legis.yaml" in msg - assert ".legis/" not in msg # already present — not re-reported + assert "already" in msg # detected as present, not re-appended content = (tmp_path / ".gitignore").read_text() - assert content.count(".legis/") == 1 # not duplicated - assert content.count("legis.yaml") == 1 + assert content.count(".weft/legis/") == 1 # not duplicated From 2b4588c8fd815707dc5e4876fcede94b860a6d89 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 08:00:17 +1000 Subject: [PATCH 38/72] fix(docs): update error messages in SKILL.md for Loomweave and Filigree workflows --- .agents/skills/filigree-workflow/SKILL.md | 2 +- .agents/skills/loomweave-workflow/.fingerprint | 2 +- .agents/skills/loomweave-workflow/SKILL.md | 16 +++++++++++++--- .claude/skills/filigree-workflow/SKILL.md | 2 +- .claude/skills/loomweave-workflow/.fingerprint | 2 +- .claude/skills/loomweave-workflow/SKILL.md | 16 +++++++++++++--- 6 files changed, 30 insertions(+), 10 deletions(-) diff --git a/.agents/skills/filigree-workflow/SKILL.md b/.agents/skills/filigree-workflow/SKILL.md index 76e81e4..aae6e10 100644 --- a/.agents/skills/filigree-workflow/SKILL.md +++ b/.agents/skills/filigree-workflow/SKILL.md @@ -196,7 +196,7 @@ When parsing `--json` output or MCP responses, expect these unified envelopes: one of: `VALIDATION`, `NOT_FOUND`, `CONFLICT`, `INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`, `INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`, - `CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`, + `LOOMWEAVE_REGISTRY_VERSION_MISMATCH`, `LOOMWEAVE_OUT_OF_SYNC`, `BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`. 
Branch on `code` for retry policy (`CONFLICT` → exit 4, retryable; everything at exit 1 needs operator diff --git a/.agents/skills/loomweave-workflow/.fingerprint b/.agents/skills/loomweave-workflow/.fingerprint index e44b7ed..b8934d2 100644 --- a/.agents/skills/loomweave-workflow/.fingerprint +++ b/.agents/skills/loomweave-workflow/.fingerprint @@ -1 +1 @@ -fe04e6fd9d528b07738f527b41d817dff89344f051465af012fc42ed44377ea3 \ No newline at end of file +8af48023ff74748434eec046b718fe586bce8784e51d474c9c58daf8f292326b \ No newline at end of file diff --git a/.agents/skills/loomweave-workflow/SKILL.md b/.agents/skills/loomweave-workflow/SKILL.md index 1b07457..fd7ab55 100644 --- a/.agents/skills/loomweave-workflow/SKILL.md +++ b/.agents/skills/loomweave-workflow/SKILL.md @@ -65,18 +65,27 @@ tell which case you're in. | `execution_paths_from` | bounded call paths out of an entity | `{"id": "", "max_depth": 5}` | | `subsystem_members` | modules in a subsystem | `{"id": "core:subsystem:"}` | | `subsystem_of` | the subsystem an entity belongs to (reverse of `subsystem_members`) | `{"id": ""}` | -| `summary` | on-demand prose summary of one entity | `{"id": ""}` | +| `summary` † | on-demand prose summary of one entity | `{"id": ""}` | | `summary_preview_cost` | preview a `summary` call's cache status / cost before spending | `{"id": ""}` | | `issues_for` | Filigree issues attached to an entity | `{"id": ""}` | | `source_for_entity` | an entity's exact indexed source span + bounded context | `{"id": "", "context_lines": 10}` | | `call_sites` | the source line(s) behind a calls/references edge | `{"id": "", "role": "caller"}` | | `orientation_pack` | one deterministic orientation packet for an entity or file:line (entity + context + neighbors + paths + issues + freshness) | `{"file": "rel/path.py", "line": 42}` | | `index_diff` | index freshness / drift vs. 
the current working tree | `{}` | -| `analyze_start` | launch a background re-index, return its `run_id` | `{}` | +| `analyze_start` † | launch a background re-index, return its `run_id` | `{}` | | `analyze_status` | poll a started analyze (queued/running/terminal + progress) | `{"run_id": ""}` | -| `analyze_cancel` | stop a running analyze (group-kills plugin + Pyright) | `{"run_id": ""}` | +| `analyze_cancel` † | stop a running analyze (group-kills plugin + Pyright) | `{"run_id": ""}` | | `project_status` | index freshness, counts, LLM + Filigree status | `{}` | +† **Write-gated.** `summary` (`entity_summary_get`), `analyze_start`, +`analyze_cancel`, `propose_guidance`, and `promote_guidance` are registered only +when `serve.mcp.enable_write_tools: true` is set in `loomweave.yaml` (default +`false`). When the gate is off they do not appear in `tools/list` and a call +returns a tool-disabled error — run `loomweave config check` to see the active +policy. `summary` additionally requires the live LLM provider to be enabled +(`llm_policy.enabled: true` + `allow_live_provider: true`), or it serves cache +only. + `callers_of` / `neighborhood` / `execution_paths_from` take a `confidence` tier — one of `"resolved"` (default; only high-confidence edges), `"ambiguous"`, or `"inferred"`. There is no `"all"` value. When you suspect an @@ -163,6 +172,7 @@ for team sharing). Agents may call `propose_guidance` to create a Filigree observation, but that proposal is inert until an operator promotes it through `promote_guidance` or the CLI. Promoted sheets reach you through `guidance_for` and are composed into `summary` prompts with a real guidance fingerprint. +(`propose_guidance` and `promote_guidance` are write-gated — see the † note above.) 
## Workflow: orient, then navigate diff --git a/.claude/skills/filigree-workflow/SKILL.md b/.claude/skills/filigree-workflow/SKILL.md index 76e81e4..aae6e10 100644 --- a/.claude/skills/filigree-workflow/SKILL.md +++ b/.claude/skills/filigree-workflow/SKILL.md @@ -196,7 +196,7 @@ When parsing `--json` output or MCP responses, expect these unified envelopes: one of: `VALIDATION`, `NOT_FOUND`, `CONFLICT`, `INVALID_TRANSITION`, `PERMISSION`, `NOT_INITIALIZED`, `IO`, `INVALID_API_URL`, `FILE_REGISTRY_DISPLACED`, `REGISTRY_UNAVAILABLE`, - `CLARION_REGISTRY_VERSION_MISMATCH`, `CLARION_OUT_OF_SYNC`, + `LOOMWEAVE_REGISTRY_VERSION_MISMATCH`, `LOOMWEAVE_OUT_OF_SYNC`, `BRIEFING_BLOCKED`, `STOP_FAILED`, `SCHEMA_MISMATCH`, `INTERNAL`. Branch on `code` for retry policy (`CONFLICT` → exit 4, retryable; everything at exit 1 needs operator diff --git a/.claude/skills/loomweave-workflow/.fingerprint b/.claude/skills/loomweave-workflow/.fingerprint index e44b7ed..b8934d2 100644 --- a/.claude/skills/loomweave-workflow/.fingerprint +++ b/.claude/skills/loomweave-workflow/.fingerprint @@ -1 +1 @@ -fe04e6fd9d528b07738f527b41d817dff89344f051465af012fc42ed44377ea3 \ No newline at end of file +8af48023ff74748434eec046b718fe586bce8784e51d474c9c58daf8f292326b \ No newline at end of file diff --git a/.claude/skills/loomweave-workflow/SKILL.md b/.claude/skills/loomweave-workflow/SKILL.md index 1b07457..fd7ab55 100644 --- a/.claude/skills/loomweave-workflow/SKILL.md +++ b/.claude/skills/loomweave-workflow/SKILL.md @@ -65,18 +65,27 @@ tell which case you're in. 
| `execution_paths_from` | bounded call paths out of an entity | `{"id": "", "max_depth": 5}` | | `subsystem_members` | modules in a subsystem | `{"id": "core:subsystem:"}` | | `subsystem_of` | the subsystem an entity belongs to (reverse of `subsystem_members`) | `{"id": ""}` | -| `summary` | on-demand prose summary of one entity | `{"id": ""}` | +| `summary` † | on-demand prose summary of one entity | `{"id": ""}` | | `summary_preview_cost` | preview a `summary` call's cache status / cost before spending | `{"id": ""}` | | `issues_for` | Filigree issues attached to an entity | `{"id": ""}` | | `source_for_entity` | an entity's exact indexed source span + bounded context | `{"id": "", "context_lines": 10}` | | `call_sites` | the source line(s) behind a calls/references edge | `{"id": "", "role": "caller"}` | | `orientation_pack` | one deterministic orientation packet for an entity or file:line (entity + context + neighbors + paths + issues + freshness) | `{"file": "rel/path.py", "line": 42}` | | `index_diff` | index freshness / drift vs. the current working tree | `{}` | -| `analyze_start` | launch a background re-index, return its `run_id` | `{}` | +| `analyze_start` † | launch a background re-index, return its `run_id` | `{}` | | `analyze_status` | poll a started analyze (queued/running/terminal + progress) | `{"run_id": ""}` | -| `analyze_cancel` | stop a running analyze (group-kills plugin + Pyright) | `{"run_id": ""}` | +| `analyze_cancel` † | stop a running analyze (group-kills plugin + Pyright) | `{"run_id": ""}` | | `project_status` | index freshness, counts, LLM + Filigree status | `{}` | +† **Write-gated.** `summary` (`entity_summary_get`), `analyze_start`, +`analyze_cancel`, `propose_guidance`, and `promote_guidance` are registered only +when `serve.mcp.enable_write_tools: true` is set in `loomweave.yaml` (default +`false`). 
When the gate is off they do not appear in `tools/list` and a call +returns a tool-disabled error — run `loomweave config check` to see the active +policy. `summary` additionally requires the live LLM provider to be enabled +(`llm_policy.enabled: true` + `allow_live_provider: true`), or it serves cache +only. + `callers_of` / `neighborhood` / `execution_paths_from` take a `confidence` tier — one of `"resolved"` (default; only high-confidence edges), `"ambiguous"`, or `"inferred"`. There is no `"all"` value. When you suspect an @@ -163,6 +172,7 @@ for team sharing). Agents may call `propose_guidance` to create a Filigree observation, but that proposal is inert until an operator promotes it through `promote_guidance` or the CLI. Promoted sheets reach you through `guidance_for` and are composed into `summary` prompts with a real guidance fingerprint. +(`propose_guidance` and `promote_guidance` are write-gated — see the † note above.) ## Workflow: orient, then navigate From c3637ebe57da9b5d481e2671a4e1b18e27543fa5 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 12:03:14 +1000 Subject: [PATCH 39/72] =?UTF-8?q?fix(enforcement):=20drop=20vestigial=20v1?= =?UTF-8?q?/legacy=20signing=20path=20after=20clarion=E2=86=92loomweave=20?= =?UTF-8?q?rename?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The rebrand changed the protected-cell signing field set (clarion_content_hash → loomweave_content_hash, ext["clarion"] → ext["loomweave"]). legis is unreleased, so no signed governance records predate the rename and the legacy fallback that tolerated the old field set is dead weight. Remove legacy_signing_fields and the hmac-sha256:v1 acceptance path from TrailVerifier and signing.py, keeping the version-tag mechanism (v2) so a future field-set change can ship as a new tag. Correct signing.py's docstring (it implied a shipped v1). The verify path is otherwise unchanged. 
Verified clean: grep of `clarion` finds no surviving producer or golden vector; full suite + coverage floors + ruff + mypy + policy-boundary-check + governance-gate green. (legis-30d68f6766) Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 9 ++++++++- src/legis/enforcement/protected.py | 23 ++--------------------- src/legis/enforcement/signing.py | 15 +++++---------- tests/enforcement/test_signing.py | 17 ++++++++++++----- tests/enforcement/test_trail_verify.py | 15 --------------- 5 files changed, 27 insertions(+), 52 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index b9f7a62..cd1555e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -183,7 +183,14 @@ listed as not-yet-built. ### Changed - **Rebrand Clarion→Loomweave and Loom→Weft** across legis (identifiers, docs, - and config references). + and config references). The protected-cell signing field set follows the + rename (`clarion_content_hash` → `loomweave_content_hash`, `ext["clarion"]` → + `ext["loomweave"]`). This is a deliberate **clean break**, not a migration: + legis is unreleased, so no signed governance records predate the rename. The + now-impossible legacy fallback (`legacy_signing_fields` and the + `hmac-sha256:v1` acceptance path in `TrailVerifier` / `signing`) is removed + accordingly; the version-tag mechanism (`v2`) is retained so a future + field-set change can still be introduced as a new tag without ambiguity. - **MCP idempotency replays scoped** so a replayed call resolves against its own prior result, not a sibling's. 
diff --git a/src/legis/enforcement/protected.py b/src/legis/enforcement/protected.py index 16f7390..9e33be9 100644 --- a/src/legis/enforcement/protected.py +++ b/src/legis/enforcement/protected.py @@ -16,7 +16,7 @@ from legis.clock import Clock from legis.enforcement.judge import Judge -from legis.enforcement.signing import SIG_PREFIX_V1, sign, verify +from legis.enforcement.signing import sign, verify from legis.enforcement.signoff import signoff_signing_fields from legis.enforcement.verdict import Verdict from legis.identity.entity_key import EntityKey @@ -78,21 +78,6 @@ def signing_fields(payload: dict[str, Any]) -> dict[str, Any]: return fields -def legacy_signing_fields(payload: dict[str, Any]) -> dict[str, Any]: - """Protected override fields signed by legacy ``hmac-sha256:v1`` records.""" - ext = payload.get("extensions") or {} - return { - "policy": payload.get("policy"), - "entity": payload.get("entity_key"), - "verdict": ext.get("judge_verdict"), - "model": ext.get("judge_model"), - "recorded_at": payload.get("recorded_at"), - "rationale": payload.get("rationale"), - "file_fingerprint": ext.get("file_fingerprint"), - "ast_path": ext.get("ast_path"), - } - - class TrailVerifier: """Load-time signature check. A record whose policy is protected MUST carry a valid signature; a missing or mismatched signature is tampering. 
@@ -153,11 +138,7 @@ def verify(self, records) -> None: raise TamperError( f"protected record seq={rec.seq} is structurally malformed: {exc}" ) from exc - if not verify(fields, sig, self._key) and not ( - isinstance(sig, str) - and sig.startswith(SIG_PREFIX_V1) - and verify(legacy_signing_fields(rec.payload), sig, self._key) - ): + if not verify(fields, sig, self._key): raise TamperError( f"protected record seq={rec.seq} signature does not verify" ) diff --git a/src/legis/enforcement/signing.py b/src/legis/enforcement/signing.py index 8f99d2e..2853528 100644 --- a/src/legis/enforcement/signing.py +++ b/src/legis/enforcement/signing.py @@ -2,9 +2,10 @@ The Sprint 0 hash chain detects edits by an actor who *cannot* recompute it; an actor with DB-file access can re-chain a forged record. The HMAC closes that: -without the key, a forged record cannot carry a valid signature. Versioned -(`v2` pins the expanded audit field set and canonical-JSON v1) so future -canonicalisation or field-set upgrades can be introduced without ambiguity. +without the key, a forged record cannot carry a valid signature. Every signature +carries a version tag (currently `v2`, which pins the audit field set and +canonical-JSON v1) so a future canonicalisation or field-set change can be +introduced as a new tag without ambiguity. 
""" from __future__ import annotations @@ -14,14 +15,11 @@ from legis.canonical import canonical_json -SIG_PREFIX_V1 = "hmac-sha256:v1:" SIG_PREFIX_V2 = "hmac-sha256:v2:" SIG_PREFIX = SIG_PREFIX_V2 def _prefix_for(version: str) -> str: - if version == "v1": - return SIG_PREFIX_V1 if version == "v2": return SIG_PREFIX_V2 raise ValueError(f"unsupported signature version: {version}") @@ -41,7 +39,4 @@ def sign(fields: dict, key: bytes, *, version: str = "v2") -> str: def verify(fields: dict, signature: str, key: bytes) -> bool: if signature.startswith(SIG_PREFIX_V2): return hmac.compare_digest(_signed(fields, key, SIG_PREFIX_V2), signature) - if signature.startswith(SIG_PREFIX_V1): - return hmac.compare_digest(_signed(fields, key, SIG_PREFIX_V1), signature) - else: - return False + return False diff --git a/tests/enforcement/test_signing.py b/tests/enforcement/test_signing.py index afb5514..524171b 100644 --- a/tests/enforcement/test_signing.py +++ b/tests/enforcement/test_signing.py @@ -1,4 +1,6 @@ -from legis.enforcement.signing import SIG_PREFIX, SIG_PREFIX_V1, sign, verify +import pytest + +from legis.enforcement.signing import SIG_PREFIX, sign, verify def test_sign_is_prefixed_and_deterministic(): @@ -19,8 +21,13 @@ def test_verify_round_trips_and_rejects_wrong_key_or_tamper(): assert verify(fields, "", b"key-1") is False -def test_verify_accepts_explicit_legacy_v1_signature(): +def test_verify_rejects_unknown_prefix(): fields = {"verdict": "ACCEPTED", "policy": "p"} - sig = sign(fields, b"key-1", version="v1") - assert sig.startswith(SIG_PREFIX_V1) - assert verify(fields, sig, b"key-1") is True + sig = sign(fields, b"key-1") + forged = sig.replace("v2", "v1", 1) # a tag verify no longer recognises + assert verify(fields, forged, b"key-1") is False + + +def test_sign_rejects_unknown_version(): + with pytest.raises(ValueError, match="unsupported signature version"): + sign({"verdict": "ACCEPTED"}, b"key-1", version="v1") diff --git 
a/tests/enforcement/test_trail_verify.py b/tests/enforcement/test_trail_verify.py index 3c32654..a67edb0 100644 --- a/tests/enforcement/test_trail_verify.py +++ b/tests/enforcement/test_trail_verify.py @@ -7,9 +7,7 @@ ProtectedGate, TamperError, TrailVerifier, - legacy_signing_fields, ) -from legis.enforcement.signing import sign from legis.enforcement.verdict import JudgeOpinion, Verdict from legis.identity.entity_key import EntityKey from legis.store.audit_store import GENESIS, AuditStore, _chain @@ -55,19 +53,6 @@ def test_clean_protected_trail_verifies(tmp_path): TrailVerifier(KEY, PROTECTED).verify(store.read_all()) # no raise -def test_legacy_v1_protected_signature_still_verifies(tmp_path): - g, store = _gate(tmp_path / "gov.db") - _submit(g) - - def replace_with_legacy_signature(p): - p["extensions"]["judge_metadata_signature"] = sign( - legacy_signing_fields(p), KEY, version="v1" - ) - - _edit_payload_and_rechain(tmp_path / "gov.db", replace_with_legacy_signature) - TrailVerifier(KEY, PROTECTED).verify(store.read_all()) # no raise - - def test_missing_signature_on_protected_policy_is_tampering(tmp_path): g, store = _gate(tmp_path / "gov.db") _submit(g) From 015a2db910acef7bd39b89f32ec8ea8a216eea26 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 12:03:29 +1000 Subject: [PATCH 40/72] fix(governance): stop override-rate gate over-detecting protected records MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The keyless-branch protected-detector sniffed file_fingerprint/ast_path in a record's extensions. The simple-tier engine accepts an arbitrary extra_extensions dict, so a chill/coached record carrying those keys would fail-close a non-protected deployment's `legis governance-gate` on a record that was never signed. 
Drop the two soft sniffs; keep the policy set plus the protected_cell / signature markers (a record purporting to BE protected, where failing closed is correct even if injected). TrailVerifier's deliberately over-inclusive verify-path heuristic is unchanged — the two detectors intentionally diverge (keyless "must I refuse to score?" vs with-key "must this be signed?"). Latent over-reach: no shipped simple-tier producer writes those top-level keys. TDD test added (RED→GREEN); full suite + coverage floors + ruff + mypy + policy-boundary-check + governance-gate green. (legis-fa1811186a) Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 8 ++++++++ src/legis/service/governance.py | 20 ++++++++++++++++++-- tests/service/test_governance.py | 21 +++++++++++++++++++++ 3 files changed, 47 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index cd1555e..9986e97 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -200,6 +200,14 @@ listed as not-yet-built. trust-grammar projection. - **CLI fails closed on protected override-rate trails** — a missing or unverifiable protected trail exits non-zero rather than reporting a clean rate. +- **Override-rate gate no longer over-detects protected records** — the + keyless-branch protected-detector dropped its soft `file_fingerprint` / + `ast_path` extension sniffs, which a chill/coached record could carry via an + arbitrary `extra_extensions` dict and thereby fail-close a non-protected + deployment's `legis governance-gate`. It now keys off the policy set plus the + `protected_cell` / signature markers the simple-tier engine never writes; + `TrailVerifier`'s (deliberately over-inclusive) verify-path heuristic is + unchanged. - Hardened the governance audit boundaries with regression coverage. 
## [1.0.0rc1] — 2026-06-03 diff --git a/src/legis/service/governance.py b/src/legis/service/governance.py index e624ba7..214f8ea 100644 --- a/src/legis/service/governance.py +++ b/src/legis/service/governance.py @@ -114,14 +114,30 @@ def compute_override_rate(records: list): def _requires_protected_verification(payload: dict[str, Any], protected_policies) -> bool: + """Gate-local protected-detection for the KEYLESS branch of the override-rate + gate: would refusing to score this record be right because it genuinely needs + a signature we have no key to check? + + The discriminator is *status-claim vs incidental metadata*. The markers kept + below — ``protected_cell`` and the signature keys — are a record purporting to + BE protected, so failing closed on them in a keyless deployment is correct + even if injected. ``file_fingerprint`` / ``ast_path`` carry no such claim: + they are ordinary metadata, and the simple-tier engine accepts an arbitrary + ``extensions`` dict, so they can ride on a never-signed chill/coached record — + flagging them would fail-close a non-protected deployment on a record that has + nothing to verify. That over-reach is why those two sniffs are dropped here. + + Intentionally NARROWER than ``TrailVerifier._requires_verification`` (the + verify path, which must stay over-inclusive): the two answer different + questions — keyless "must I refuse to score this?" vs with-key "must this be + signed?" — so do NOT re-merge them. 
+ """ ext = payload.get("extensions", {}) or {} return ( payload.get("policy") in protected_policies or ext.get("protected_cell") is True or "judge_metadata_signature" in ext or "signoff_signature" in ext - or "file_fingerprint" in ext - or "ast_path" in ext ) diff --git a/tests/service/test_governance.py b/tests/service/test_governance.py index d69c597..10766cf 100644 --- a/tests/service/test_governance.py +++ b/tests/service/test_governance.py @@ -282,6 +282,27 @@ def test_evaluate_override_rate_gate_scores_with_key(tmp_path): assert res.status in {GateStatus.PASS, GateStatus.PASS_WITH_NOTICE, GateStatus.FAIL} +def test_evaluate_override_rate_gate_ignores_soft_sniffs_on_simple_records(tmp_path): + # A chill/coached record can carry an arbitrary extra_extensions dict through + # the simple-tier engine. Such a record holding file_fingerprint/ast_path is + # NOT protected (the engine never writes protected_cell or a signature), so a + # keyless, non-protected deployment must score it rather than fail closed. 
+ from legis.service.governance import evaluate_override_rate_gate + + store = AuditStore(f"sqlite:///{tmp_path / 'gov.db'}") + engine = EnforcementEngine(store, SystemClock()) # chill: no judge + engine.submit_override( + policy="some-policy", + entity_key=EntityKey.from_locator("src/x.py:f"), + rationale="r", + agent_id="a", + extensions={"file_fingerprint": "fp", "ast_path": "ap"}, + ) + records = store.read_all() + res = evaluate_override_rate_gate(records, hmac_key=None, protected_policies=frozenset()) + assert res.status in {GateStatus.PASS, GateStatus.PASS_WITH_NOTICE, GateStatus.FAIL} + + def test_sign_off_raises_not_enabled_when_gate_absent(): from legis.service.errors import NotEnabledError from legis.service.governance import sign_off From db9a38eec376c3e6baa2fe38f44f969cace0c310 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 17:05:59 +1000 Subject: [PATCH 41/72] refactor(config): centralize LEGIS_*_DB env precedence into the store resolvers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit config.py's docstring documented that LEGIS_*_DB env vars take precedence over weft.toml, but the *_db_url() resolvers only returned the weft.toml/built-in default — so every consumer re-implemented the precedence as os.environ.get("LEGIS_*_DB", *_db_url()) at 11 sites (api/app.py ×5, mcp.py ×5, cli.py ×1). Changing precedence or adding an alias meant editing all 11; miss one and that store silently ignores its override. Move resolution into a single config._resolve_db_url(env_var, db_name) helper that the four *_db_url() functions call; collapse the 11 wrappers to bare resolver calls. Faithful semantics — `env_var in os.environ` (not `.get(...) or default`) — so a present-but-empty override is returned verbatim, byte-identical to the old behaviour. No change to resolved URLs for any existing deployment. 
Latent fragility, not a live bug: a whole-repo grep confirmed all 11 sites wrapped correctly and no bare resolver call existed. Also hardens the test-isolation path in legis-189eeafb5d — a direct resolver caller now honours the isolation env vars instead of leaking a default-path .weft/legis/ store (that ticket's broader scope is left open). TDD: added test_legis_db_env_var_takes_precedence_over_weft_toml_and_default (RED→GREEN); the 7 default-layer config tests now request a _clear_db_env fixture since the autouse suite fixture's overrides now reach the resolver. Full suite (679) + coverage floors + ruff + mypy + policy-boundary-check + governance-gate green. (legis-0db1fcfda6) Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 8 +++++++ src/legis/api/app.py | 10 ++++----- src/legis/cli.py | 4 +--- src/legis/config.py | 38 +++++++++++++++++++++++++++------ src/legis/mcp.py | 14 +++++-------- tests/test_config.py | 50 +++++++++++++++++++++++++++++++++++++------- 6 files changed, 94 insertions(+), 30 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 9986e97..07617af 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -193,6 +193,14 @@ listed as not-yet-built. field-set change can still be introduced as a new tag without ambiguity. - **MCP idempotency replays scoped** so a replayed call resolves against its own prior result, not a sibling's. +- **Store-URL resolution centralised in `config.py`** — the `LEGIS_*_DB` env + override precedence the module documents is now implemented inside the + `*_db_url()` resolvers themselves (via `_resolve_db_url`), instead of being + re-wrapped as `os.environ.get("LEGIS_*_DB", *_db_url())` at ~11 call sites + across `api/app.py`, `mcp.py`, and `cli.py`. Consumers call the resolver + directly; precedence/alias changes are a one-line edit in one place, and a + direct resolver call can no longer silently ignore its override. No change to + the resolved URLs for existing deployments. 
### Fixed - **Ingest accepts realistic scans** — the over-strict Wardline ingest validator diff --git a/src/legis/api/app.py b/src/legis/api/app.py index e26703a..9fe6173 100644 --- a/src/legis/api/app.py +++ b/src/legis/api/app.py @@ -340,7 +340,7 @@ def create_app( from legis.clock import SystemClock from legis.store.audit_store import AuditStore - gov_db_url = os.environ.get("LEGIS_GOVERNANCE_DB", governance_db_url()) + gov_db_url = governance_db_url() gov_store = AuditStore(gov_db_url) clock = SystemClock() @@ -371,7 +371,7 @@ def create_app( if binding_ledger is None: from legis.governance.binding_ledger import BindingLedger - bind_db_url = os.environ.get("LEGIS_BINDING_DB", binding_db_url()) + bind_db_url = binding_db_url() binding_ledger = BindingLedger(AuditStore(bind_db_url), clock, hmac_key) state: dict[str, Any] = { "checks": check_surface, @@ -385,13 +385,13 @@ def git() -> GitSurface: def checks() -> CheckSurface: if state["checks"] is None: - check_db = os.environ.get("LEGIS_CHECK_DB", check_db_url()) + check_db = check_db_url() state["checks"] = CheckSurface(check_db) return state["checks"] def pulls() -> PullSurface: if state["pulls"] is None: - pull_db = os.environ.get("LEGIS_PULL_DB", pull_db_url()) + pull_db = pull_db_url() state["pulls"] = PullSurface(pull_db) return state["pulls"] @@ -400,7 +400,7 @@ def engine() -> EnforcementEngine: from legis.clock import SystemClock from legis.store.audit_store import AuditStore - gov_db_url = os.environ.get("LEGIS_GOVERNANCE_DB", governance_db_url()) + gov_db_url = governance_db_url() state["enforcement"] = EnforcementEngine( AuditStore(gov_db_url), SystemClock() ) diff --git a/src/legis/cli.py b/src/legis/cli.py index 49f58b8..c4e48c6 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -89,10 +89,8 @@ def build_parser() -> argparse.ArgumentParser: ) _add_judge_flags(mcp) - import os - from legis.config import governance_db_url - gov_db_default = os.environ.get("LEGIS_GOVERNANCE_DB", 
governance_db_url()) + gov_db_default = governance_db_url() rate = subparsers.add_parser( "check-override-rate", help="Fail (exit 1) if the override-rate gate is FAIL — for CI", diff --git a/src/legis/config.py b/src/legis/config.py index 5447214..1a452c7 100644 --- a/src/legis/config.py +++ b/src/legis/config.py @@ -18,8 +18,10 @@ ``weft.toml`` may carry a ``[legis]`` table; we read it but never write it. The single enrichment knob is ``store_dir`` (relocate the subtree; relative to the project root, or absolute). Per-DB overrides remain the ``LEGIS_*_DB`` env -vars, which take precedence over weft.toml. An absent file, an absent -``[legis]`` section, or even a malformed weft.toml must still boot on the +vars, which take precedence over weft.toml — a precedence the ``*_db_url()`` +resolvers below implement directly (via ``_resolve_db_url``), so every consumer +gets it by calling the resolver, not by re-wrapping it. An absent file, an +absent ``[legis]`` section, or even a malformed weft.toml must still boot on the built-in defaults — legis never *depends* on weft.toml (Doctrine §5 deletion test). @@ -36,6 +38,7 @@ from __future__ import annotations import logging +import os import tomllib from pathlib import Path @@ -52,6 +55,12 @@ _BINDING_DB_NAME = "legis-binding.db" _PULL_DB_NAME = "legis-pulls.db" +# Per-DB override env vars. Highest precedence (see ``_resolve_db_url``). 
+_CHECK_DB_ENV = "LEGIS_CHECK_DB" +_GOVERNANCE_DB_ENV = "LEGIS_GOVERNANCE_DB" +_BINDING_DB_ENV = "LEGIS_BINDING_DB" +_PULL_DB_ENV = "LEGIS_PULL_DB" + def project_root() -> Path: """The directory the federation treats as project root (the cwd).""" @@ -107,20 +116,37 @@ def _sqlite_url(path: Path) -> str: return f"sqlite:///{path.as_posix()}" +def _resolve_db_url(env_var: str, db_name: str) -> str: + """Resolve a store URL with the documented precedence (module docstring): + the per-DB ``LEGIS_*_DB`` override wins; otherwise the URL is composed from + the weft.toml ``store_dir`` (or the built-in ``.weft/legis`` default) under + the canonical filename. + + This is THE single resolution point — callers invoke the ``*_db_url()`` + function directly and never re-implement the env layering, so changing + precedence or adding an alias is a one-line edit here, not ~11 call sites. + ``env_var in os.environ`` (not ``.get(...) or``) so a present-but-empty + override is returned verbatim rather than silently falling through. 
+ """ + if env_var in os.environ: + return os.environ[env_var] + return _sqlite_url(_store_dir() / db_name) + + def check_db_url() -> str: - return _sqlite_url(_store_dir() / _CHECK_DB_NAME) + return _resolve_db_url(_CHECK_DB_ENV, _CHECK_DB_NAME) def governance_db_url() -> str: - return _sqlite_url(_store_dir() / _GOVERNANCE_DB_NAME) + return _resolve_db_url(_GOVERNANCE_DB_ENV, _GOVERNANCE_DB_NAME) def binding_db_url() -> str: - return _sqlite_url(_store_dir() / _BINDING_DB_NAME) + return _resolve_db_url(_BINDING_DB_ENV, _BINDING_DB_NAME) def pull_db_url() -> str: - return _sqlite_url(_store_dir() / _PULL_DB_NAME) + return _resolve_db_url(_PULL_DB_ENV, _PULL_DB_NAME) def ensure_sqlite_parent(url: str) -> None: diff --git a/src/legis/mcp.py b/src/legis/mcp.py index 0d877ac..336cfa0 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -179,7 +179,7 @@ def build_runtime(agent_id: str) -> McpRuntime: hmac_key = os.environ.get("LEGIS_HMAC_KEY") if hmac_key: key = hmac_key.encode("utf-8") - store = AuditStore(os.environ.get("LEGIS_GOVERNANCE_DB", governance_db_url())) + store = AuditStore(governance_db_url()) protected_policies_str = os.environ.get("LEGIS_PROTECTED_POLICIES", "") protected_policies = frozenset( p.strip() for p in protected_policies_str.split(",") if p.strip() @@ -198,7 +198,7 @@ def build_runtime(agent_id: str) -> McpRuntime: from legis.governance.binding_ledger import BindingLedger binding_ledger = BindingLedger( - AuditStore(os.environ.get("LEGIS_BINDING_DB", binding_db_url())), + AuditStore(binding_db_url()), clock, key, ) @@ -561,7 +561,7 @@ def _engine(runtime: McpRuntime) -> EnforcementEngine: if runtime.engine is None: from legis.config import governance_db_url - store = AuditStore(os.environ.get("LEGIS_GOVERNANCE_DB", governance_db_url())) + store = AuditStore(governance_db_url()) runtime.engine = EnforcementEngine(store, SystemClock()) return runtime.engine @@ -570,9 +570,7 @@ def _checks(runtime: McpRuntime) -> CheckSurface: if 
runtime.check_surface is None: from legis.config import check_db_url - runtime.check_surface = CheckSurface( - os.environ.get("LEGIS_CHECK_DB", check_db_url()) - ) + runtime.check_surface = CheckSurface(check_db_url()) return runtime.check_surface @@ -580,9 +578,7 @@ def _pulls(runtime: McpRuntime) -> PullSurface: if runtime.pull_surface is None: from legis.config import pull_db_url - runtime.pull_surface = PullSurface( - os.environ.get("LEGIS_PULL_DB", pull_db_url()) - ) + runtime.pull_surface = PullSurface(pull_db_url()) return runtime.pull_surface diff --git a/tests/test_config.py b/tests/test_config.py index a12e944..4d1f52e 100644 --- a/tests/test_config.py +++ b/tests/test_config.py @@ -12,10 +12,29 @@ from __future__ import annotations +import pytest + from legis import config -def test_all_four_db_urls_default_under_weft_legis(tmp_path, monkeypatch): +@pytest.fixture +def _clear_db_env(monkeypatch): + """Clear the per-DB ``LEGIS_*_DB`` overrides so a test can probe the lower + weft.toml / built-in-default precedence layers. The autouse suite fixture + (tests/conftest.py) sets these to isolate stores, and the resolvers now honour + them (highest precedence), so a default-layer assertion must drop them first. + A test's own monkeypatch runs after the autouse fixture, so this wins. 
+ """ + for var in ( + "LEGIS_CHECK_DB", + "LEGIS_GOVERNANCE_DB", + "LEGIS_BINDING_DB", + "LEGIS_PULL_DB", + ): + monkeypatch.delenv(var, raising=False) + + +def test_all_four_db_urls_default_under_weft_legis(_clear_db_env, tmp_path, monkeypatch): monkeypatch.chdir(tmp_path) assert config.check_db_url() == "sqlite:///.weft/legis/legis-checks.db" assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" @@ -23,13 +42,30 @@ def test_all_four_db_urls_default_under_weft_legis(tmp_path, monkeypatch): assert config.pull_db_url() == "sqlite:///.weft/legis/legis-pulls.db" -def test_db_urls_use_builtin_defaults_with_no_weft_toml(tmp_path, monkeypatch): +def test_legis_db_env_var_takes_precedence_over_weft_toml_and_default(tmp_path, monkeypatch): + # The documented precedence (module docstring): a per-DB LEGIS_*_DB override + # wins over both the weft.toml store_dir and the built-in default. The + # resolvers must implement this themselves, so a bare call honours the env. + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text( + '[legis]\nstore_dir = "var/legis-state"\n', encoding="utf-8" + ) + monkeypatch.setenv("LEGIS_GOVERNANCE_DB", "sqlite:///explicit-gov.db") + monkeypatch.setenv("LEGIS_CHECK_DB", "sqlite:///explicit-check.db") + assert config.governance_db_url() == "sqlite:///explicit-gov.db" + assert config.check_db_url() == "sqlite:///explicit-check.db" + # An unset var still falls through to weft.toml store_dir for that DB. 
+ monkeypatch.delenv("LEGIS_BINDING_DB", raising=False) + assert config.binding_db_url() == "sqlite:///var/legis-state/legis-binding.db" + + +def test_db_urls_use_builtin_defaults_with_no_weft_toml(_clear_db_env, tmp_path, monkeypatch): monkeypatch.chdir(tmp_path) assert not (tmp_path / "weft.toml").exists() assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" -def test_weft_toml_store_dir_relocates_the_subtree(tmp_path, monkeypatch): +def test_weft_toml_store_dir_relocates_the_subtree(_clear_db_env, tmp_path, monkeypatch): monkeypatch.chdir(tmp_path) (tmp_path / "weft.toml").write_text( '[legis]\nstore_dir = "var/legis-state"\n', encoding="utf-8" @@ -38,7 +74,7 @@ def test_weft_toml_store_dir_relocates_the_subtree(tmp_path, monkeypatch): assert config.check_db_url() == "sqlite:///var/legis-state/legis-checks.db" -def test_weft_toml_absolute_store_dir_yields_absolute_url(tmp_path, monkeypatch): +def test_weft_toml_absolute_store_dir_yields_absolute_url(_clear_db_env, tmp_path, monkeypatch): monkeypatch.chdir(tmp_path) abs_dir = tmp_path / "srv" / "legis" (tmp_path / "weft.toml").write_text( @@ -47,19 +83,19 @@ def test_weft_toml_absolute_store_dir_yields_absolute_url(tmp_path, monkeypatch) assert config.governance_db_url() == f"sqlite:///{abs_dir.as_posix()}/legis-governance.db" -def test_weft_toml_without_legis_section_uses_defaults(tmp_path, monkeypatch): +def test_weft_toml_without_legis_section_uses_defaults(_clear_db_env, tmp_path, monkeypatch): monkeypatch.chdir(tmp_path) (tmp_path / "weft.toml").write_text('[filigree]\ndb = "x"\n', encoding="utf-8") assert config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" -def test_malformed_weft_toml_is_not_load_bearing(tmp_path, monkeypatch): +def test_malformed_weft_toml_is_not_load_bearing(_clear_db_env, tmp_path, monkeypatch): monkeypatch.chdir(tmp_path) (tmp_path / "weft.toml").write_text("this is = = not valid toml [[[", encoding="utf-8") assert 
config.governance_db_url() == "sqlite:///.weft/legis/legis-governance.db" -def test_computing_db_url_creates_no_directories(tmp_path, monkeypatch): +def test_computing_db_url_creates_no_directories(_clear_db_env, tmp_path, monkeypatch): monkeypatch.chdir(tmp_path) _ = config.governance_db_url() _ = config.check_db_url() From 020c0c62bb358cec6e604304dc5cb1ceb8e09702 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 17:25:55 +1000 Subject: [PATCH 42/72] refactor(identity): extract shared Weft-component transport-HMAC seam MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit filigree/client.py and identity/loomweave_client.py signed their requests with byte-for-byte copies of the same X-Weft-Component scheme (_json_body_bytes / _path_and_query / sign_*_request / *_hmac_key_from_env). The wire format lived in two modules; a change to canonicalization or the X-Weft-* headers would have to touch both or the channels silently diverge. Extract src/legis/weft_signing.py as the single definition (weft_body_bytes, weft_path_and_query, sign_weft_request(component, ...), weft_hmac_key_from_env(env_var)). Both clients delegate; module-level _json_body_bytes/_path_and_query aliases keep internal transport and existing call sites stable. The serializer deliberately stays off canonical.canonical_json — its ensure_ascii=False would change the signed bytes (the cross-tool HMAC contract with Wardline). Behavior-preserving: existing per-channel golden vectors unchanged and green; adds tests/test_weft_signing.py with a cross-channel anti-drift test (signatures identical modulo the component prefix) and an ascii-escaping guard. Full suite 684 passed, mypy + ruff clean. rc4 review finding #5 (legis-3012f98aaa). 
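A minimal sketch of why the serializer choice matters, not the code in this patch. It assumes the two serializers differ only in `ensure_ascii` (the only property of `canonical.canonical_json` stated above); the helper names `body_bytes` and `sign`, the key, the path, the timestamp and the nonce are hypothetical values chosen for illustration.

```python
import hashlib
import hmac
import json


def body_bytes(body: dict, *, ensure_ascii: bool) -> bytes:
    # Compact, key-sorted JSON; only ensure_ascii varies between the two forms.
    return json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=ensure_ascii
    ).encode("utf-8")


def sign(key: bytes, raw_body: bytes) -> str:
    # Message layout described above:
    # METHOD \n path?query \n sha256(body) \n timestamp \n nonce
    # (method, path, timestamp and nonce here are illustrative constants).
    body_hash = hashlib.sha256(raw_body).hexdigest()
    message = f"POST\n/api/x\n{body_hash}\n1700000000\nnonce-1".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


key = b"weft-key"  # hypothetical key material
non_ascii = {"k": "é"}
ascii_only = {"k": "e"}

escaped = body_bytes(non_ascii, ensure_ascii=True)   # transport form: \u00e9 escape
raw = body_bytes(non_ascii, ensure_ascii=False)      # raw UTF-8 bytes for é

# Different bytes, therefore a different body hash and a different HMAC.
signatures_differ = sign(key, escaped) != sign(key, raw)

# For an ASCII-only body the two forms coincide, so the drift would only
# surface on non-ASCII payloads.
ascii_forms_match = body_bytes(ascii_only, ensure_ascii=True) == body_bytes(
    ascii_only, ensure_ascii=False
)
```

Under that assumption the divergence is silent for ASCII-only bodies and shows up only once a payload carries a non-ASCII character, which is presumably why the patch pins the escaping with a dedicated guard test.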
Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 11 ++++ src/legis/filigree/client.py | 58 +++++++----------- src/legis/identity/loomweave_client.py | 44 ++++++------- src/legis/weft_signing.py | 82 +++++++++++++++++++++++++ tests/test_weft_signing.py | 85 ++++++++++++++++++++++++++ 5 files changed, 217 insertions(+), 63 deletions(-) create mode 100644 src/legis/weft_signing.py create mode 100644 tests/test_weft_signing.py diff --git a/CHANGELOG.md b/CHANGELOG.md index 07617af..e606262 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -201,6 +201,17 @@ listed as not-yet-built. directly; precedence/alias changes are a one-line edit in one place, and a direct resolver call can no longer silently ignore its override. No change to the resolved URLs for existing deployments. +- **Weft-component transport-HMAC seam extracted to `weft_signing`** — the + Loomweave SEI client and the Filigree association client signed their requests + with byte-for-byte copies of the same `X-Weft-Component` scheme + (`_json_body_bytes` / `_path_and_query` / `sign_*_request` / + `*_hmac_key_from_env`). The wire format now has a single definition; both + clients delegate to it (component name and channel env var parameterised), so + a future canonicalization or header change can no longer touch one channel and + silently diverge the other. The shared serializer deliberately stays off + `canonical.canonical_json` (whose `ensure_ascii=False` would change the signed + bytes). Behavior-preserving — existing per-channel golden vectors unchanged, + plus a new cross-channel anti-drift test. No change to signatures on the wire. 
### Fixed - **Ingest accepts realistic scans** — the over-strict Wardline ingest validator diff --git a/src/legis/filigree/client.py b/src/legis/filigree/client.py index 5bbf190..87608b8 100644 --- a/src/legis/filigree/client.py +++ b/src/legis/filigree/client.py @@ -8,8 +8,6 @@ from __future__ import annotations -import hashlib -import hmac import json import ipaddress import os @@ -20,6 +18,13 @@ import urllib.request from typing import Any, Callable, Protocol, runtime_checkable +from legis.weft_signing import ( + sign_weft_request, + weft_body_bytes, + weft_hmac_key_from_env, + weft_path_and_query, +) + Fetch = Callable[[str, str, "dict | None"], dict] @@ -30,18 +35,13 @@ class FiligreeError(RuntimeError): MAX_RESPONSE_BYTES = 1_000_000 -def _json_body_bytes(body: dict | None) -> bytes: - if body is None: - return b"" - return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8") - - -def _path_and_query(url: str) -> str: - parsed = urllib.parse.urlsplit(url) - path_and_query = parsed.path or "/" - if parsed.query: - path_and_query = f"{path_and_query}?{parsed.query}" - return path_and_query +# The Weft-component transport-HMAC scheme is shared with the Loomweave channel; +# both delegate to ``weft_signing`` so the wire format (canonicalization + +# ``X-Weft-*`` headers) has a single definition and cannot silently diverge. The +# module-level ``_json_body_bytes`` / ``_path_and_query`` aliases keep the +# internal transport and existing call sites stable. +_json_body_bytes = weft_body_bytes +_path_and_query = weft_path_and_query def sign_filigree_request( @@ -55,28 +55,15 @@ def sign_filigree_request( ) -> dict[str, str]: """Weft-component HMAC headers for a legis->Filigree request (Q-M4). - Mirrors ``identity.loomweave_client.sign_loomweave_request`` so the Filigree - channel has the same transport authentication the Loomweave channel already - had. 
The attach ``signature`` is an app-level attestation about WHAT is + Delegates to the shared ``weft_signing`` seam (same scheme as the Loomweave + channel). The attach ``signature`` is an app-level attestation about WHAT is bound; this proves WHO is calling. ``timestamp`` and ``nonce`` are injected - (not generated here) so the signature is deterministically testable. - - Canonicalization contract: the body hash is taken over ``_json_body_bytes`` - (sorted keys, compact ``(",", ":")`` separators). The wire transport - (``_urllib_fetch``) sends those exact bytes, and a Filigree verifier MUST - canonicalize the received body identically before hashing — any spacing or - key-ordering drift on either side breaks every signature. See ADR-0003. + (not generated here) so the signature is deterministically testable. See + ``weft_signing`` for the canonicalization contract and ADR-0003. """ - body_hash = hashlib.sha256(_json_body_bytes(body)).hexdigest() - message = ( - f"{method}\n{_path_and_query(url)}\n{body_hash}\n{timestamp}\n{nonce}" - ).encode("utf-8") - signature = hmac.new(key, message, hashlib.sha256).hexdigest() - return { - "X-Weft-Component": f"filigree:{signature}", - "X-Weft-Timestamp": str(timestamp), - "X-Weft-Nonce": nonce, - } + return sign_weft_request( + "filigree", key, method, url, body, timestamp=timestamp, nonce=nonce + ) def filigree_hmac_key_from_env() -> bytes | None: @@ -85,8 +72,7 @@ def filigree_hmac_key_from_env() -> bytes | None: Absent key -> unsigned (backward compatible with deployments that have not provisioned the channel key yet), mirroring ``loomweave_hmac_key_from_env``. 
""" - value = os.environ.get("LEGIS_FILIGREE_HMAC_KEY") or os.environ.get("LEGIS_HMAC_KEY") - return value.encode("utf-8") if value else None + return weft_hmac_key_from_env("LEGIS_FILIGREE_HMAC_KEY") @runtime_checkable diff --git a/src/legis/identity/loomweave_client.py b/src/legis/identity/loomweave_client.py index 4ff897e..19e1d7c 100644 --- a/src/legis/identity/loomweave_client.py +++ b/src/legis/identity/loomweave_client.py @@ -18,8 +18,6 @@ from __future__ import annotations -import hashlib -import hmac import json import ipaddress import os @@ -31,6 +29,13 @@ from collections.abc import Mapping from typing import Any, Callable, Protocol, runtime_checkable +from legis.weft_signing import ( + sign_weft_request, + weft_body_bytes, + weft_hmac_key_from_env, + weft_path_and_query, +) + Fetch = Callable[[str, str, "dict | None", Mapping[str, str]], dict] @@ -50,18 +55,12 @@ def resolve_sei(self, sei: str) -> dict[str, Any]: ... def lineage(self, sei: str) -> list[dict[str, Any]]: ... -def _json_body_bytes(body: dict | None) -> bytes: - if body is None: - return b"" - return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8") - - -def _path_and_query(url: str) -> str: - parsed = urllib.parse.urlsplit(url) - path_and_query = parsed.path or "/" - if parsed.query: - path_and_query = f"{path_and_query}?{parsed.query}" - return path_and_query +# The Weft-component transport-HMAC scheme is shared with the Filigree channel; +# both delegate to ``weft_signing`` so the wire format has a single definition +# (the module-level ``_json_body_bytes`` / ``_path_and_query`` aliases keep the +# internal transport and existing call sites stable). 
+_json_body_bytes = weft_body_bytes +_path_and_query = weft_path_and_query def sign_loomweave_request( @@ -74,23 +73,14 @@ def sign_loomweave_request( nonce: str, ) -> dict[str, str]: """Return Loomweave's current Weft-component HMAC request headers.""" - body_bytes = _json_body_bytes(body) - body_hash = hashlib.sha256(body_bytes).hexdigest() - message = ( - f"{method}\n{_path_and_query(url)}\n{body_hash}\n{timestamp}\n{nonce}" - ).encode("utf-8") - signature = hmac.new(key, message, hashlib.sha256).hexdigest() - return { - "X-Weft-Component": f"loomweave:{signature}", - "X-Weft-Timestamp": str(timestamp), - "X-Weft-Nonce": nonce, - } + return sign_weft_request( + "loomweave", key, method, url, body, timestamp=timestamp, nonce=nonce + ) def loomweave_hmac_key_from_env() -> bytes | None: """Resolve Loomweave HMAC key material from env without making it mandatory.""" - value = os.environ.get("LEGIS_LOOMWEAVE_HMAC_KEY") or os.environ.get("LEGIS_HMAC_KEY") - return value.encode("utf-8") if value else None + return weft_hmac_key_from_env("LEGIS_LOOMWEAVE_HMAC_KEY") def _urllib_fetch( diff --git a/src/legis/weft_signing.py b/src/legis/weft_signing.py new file mode 100644 index 0000000..bfa4f24 --- /dev/null +++ b/src/legis/weft_signing.py @@ -0,0 +1,82 @@ +"""Shared Weft-component transport-HMAC seam. + +The Loomweave SEI client (``identity/loomweave_client.py``) and the Filigree +association client (``filigree/client.py``) authenticate their requests to a +sibling Weft component with the *same* wire scheme: an +``X-Weft-Component: :`` header alongside ``X-Weft-Timestamp`` and +``X-Weft-Nonce``, where the HMAC is computed over +``METHOD\\npath?query\\nsha256(body)\\ntimestamp\\nnonce``. This module is the +single definition of that scheme so the two channels cannot silently diverge — +a change to the canonicalization or header shape now happens in one place. 
+ +Canonicalization contract: the signed body bytes are +``json.dumps(body, sort_keys=True, separators=(",", ":"))`` with the default +``ensure_ascii=True``. This is deliberately **NOT** ``canonical.canonical_json``, +whose ``ensure_ascii=False`` is the byte-for-byte HMAC contract shared with +Wardline; routing a transport body through it would change every signed +request's bytes. The wire transport MUST send exactly ``weft_body_bytes(body)`` +and a verifier MUST recanonicalize identically before hashing. +""" + +from __future__ import annotations + +import hashlib +import hmac +import json +import os +import urllib.parse + + +def weft_body_bytes(body: dict | None) -> bytes: + """Serialize a request body to the exact bytes the signature commits to.""" + if body is None: + return b"" + return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8") + + +def weft_path_and_query(url: str) -> str: + """The path (plus query, if any) the signed message commits to.""" + parsed = urllib.parse.urlsplit(url) + path_and_query = parsed.path or "/" + if parsed.query: + path_and_query = f"{path_and_query}?{parsed.query}" + return path_and_query + + +def sign_weft_request( + component: str, + key: bytes, + method: str, + url: str, + body: dict | None, + *, + timestamp: int, + nonce: str, +) -> dict[str, str]: + """Return the Weft-component HMAC request headers for ``component``. + + ``timestamp`` and ``nonce`` are injected (not generated here) so the + signature is deterministically testable. 
+ """ + body_hash = hashlib.sha256(weft_body_bytes(body)).hexdigest() + message = ( + f"{method}\n{weft_path_and_query(url)}\n{body_hash}\n{timestamp}\n{nonce}" + ).encode("utf-8") + signature = hmac.new(key, message, hashlib.sha256).hexdigest() + return { + "X-Weft-Component": f"{component}:{signature}", + "X-Weft-Timestamp": str(timestamp), + "X-Weft-Nonce": nonce, + } + + +def weft_hmac_key_from_env(component_env_var: str) -> bytes | None: + """Resolve a channel HMAC key without making it mandatory. + + The channel-specific variable (e.g. ``LEGIS_LOOMWEAVE_HMAC_KEY``) wins; an + absent channel key falls back to the shared ``LEGIS_HMAC_KEY``; absent both, + the channel is unsigned (backward compatible with deployments that have not + provisioned a key yet). + """ + value = os.environ.get(component_env_var) or os.environ.get("LEGIS_HMAC_KEY") + return value.encode("utf-8") if value else None diff --git a/tests/test_weft_signing.py b/tests/test_weft_signing.py new file mode 100644 index 0000000..eb306cd --- /dev/null +++ b/tests/test_weft_signing.py @@ -0,0 +1,85 @@ +"""The shared Weft-component transport-HMAC seam. + +These pin the single wire definition that ``identity/loomweave_client`` and +``filigree/client`` both delegate to, and guard against the two channels +silently re-diverging (the duplication this module was extracted to remove). +""" + +from __future__ import annotations + +import hashlib +import hmac + +import pytest + +from legis.filigree.client import sign_filigree_request +from legis.identity.loomweave_client import sign_loomweave_request +from legis.weft_signing import ( + sign_weft_request, + weft_body_bytes, + weft_hmac_key_from_env, + weft_path_and_query, +) + + +def test_weft_body_bytes_is_compact_sorted_ascii(): + # The signed bytes are compact, key-sorted, and ASCII-escaped — deliberately + # NOT canonical.canonical_json (ensure_ascii=False), which would change the + # signed bytes and break the cross-tool HMAC contract. 
+ assert weft_body_bytes({"b": 1, "a": "x"}) == b'{"a":"x","b":1}' + assert weft_body_bytes({"k": "é"}) == b'{"k":"\\u00e9"}' # escaped, not raw utf-8 + assert weft_body_bytes(None) == b"" + + +def test_weft_path_and_query_carries_query_and_defaults_root(): + assert weft_path_and_query("https://h/api/x?e=1") == "/api/x?e=1" + assert weft_path_and_query("https://h/api/x") == "/api/x" + assert weft_path_and_query("https://h") == "/" + + +def test_sign_weft_request_matches_explicit_hmac_contract(): + key = b"weft-key" + body = {"locator": "python:function:m.f"} + headers = sign_weft_request( + "loomweave", key, "POST", "https://h/api/v1/identity/resolve", body, + timestamp=1_900_000_000, nonce="nonce-1", + ) + body_hash = hashlib.sha256(weft_body_bytes(body)).hexdigest() + message = ( + f"POST\n/api/v1/identity/resolve\n{body_hash}\n1900000000\nnonce-1" + ).encode("utf-8") + expected = hmac.new(key, message, hashlib.sha256).hexdigest() + assert headers == { + "X-Weft-Component": f"loomweave:{expected}", + "X-Weft-Timestamp": "1900000000", + "X-Weft-Nonce": "nonce-1", + } + + +def test_both_channels_share_one_seam_differing_only_by_component(): + # Anti-drift guard: for identical inputs the Loomweave and Filigree channels + # must produce the SAME signature — only the component namespace differs. If + # a future change to one channel's canonicalization slips in, this fails. + key, method, url = b"weft-key", "POST", "https://h/api/issue/I-1/x?q=1" + body = {"entity_id": "loomweave:eid:abc", "content_hash": "h"} + kwargs = dict(timestamp=1_700_000_000, nonce="cafef00d") + + loom = sign_loomweave_request(key, method, url, body, **kwargs) + fil = sign_filigree_request(key, method, url, body, **kwargs) + + assert loom["X-Weft-Component"].startswith("loomweave:") + assert fil["X-Weft-Component"].startswith("filigree:") + # Strip the namespace prefix -> the HMACs are byte-identical. 
+ assert loom["X-Weft-Component"].split(":", 1)[1] == fil["X-Weft-Component"].split(":", 1)[1] + assert loom["X-Weft-Timestamp"] == fil["X-Weft-Timestamp"] + assert loom["X-Weft-Nonce"] == fil["X-Weft-Nonce"] + + +def test_weft_hmac_key_from_env_prefers_channel_then_shared(monkeypatch): + monkeypatch.delenv("LEGIS_CHAN_KEY", raising=False) + monkeypatch.delenv("LEGIS_HMAC_KEY", raising=False) + assert weft_hmac_key_from_env("LEGIS_CHAN_KEY") is None + monkeypatch.setenv("LEGIS_HMAC_KEY", "shared") + assert weft_hmac_key_from_env("LEGIS_CHAN_KEY") == b"shared" + monkeypatch.setenv("LEGIS_CHAN_KEY", "channel") + assert weft_hmac_key_from_env("LEGIS_CHAN_KEY") == b"channel" # channel-specific wins From 779023d20b78a8d3d388127ebd044b52df7ade9c Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 17:34:32 +1000 Subject: [PATCH 43/72] refactor(wardline): centralize scan-routing validation in the service layer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The scan-route request-routing decision — "is request-side routing allowed, and is the cell-spec well-formed?" — was hand-copied into both the HTTP (/wardline/scan-results) and MCP (scan_route) adapters, alongside the cell-spec parse and a byte-for-byte _parse_wardline_cell_map helper. The copies had already drifted: HTTP rejected an empty cell_by_severity (422) while MCP silently accepted an empty severity_map and routed nothing — the "check added to one transport not the other" failure mode the layering exists to prevent. Extract service.resolve_scan_routing (+ ResolvedRouting, _parse_cell_map_env) as the single decision. It raises WardlineRoutingError carrying a `kind` discriminator; each adapter maps kind to its own taxonomy — HTTP 500/403/422 (_WARDLINE_ROUTING_STATUS), MCP collapses all three to INVALID_CELL_SPEC (via _service_error, before the generic ServiceError case). 
Both dead _parse_wardline_cell_map copies are removed; env reads stay in the adapters. Behavior-preserving for every pinned case (HTTP 403/"server-owned" + 422; MCP INVALID_CELL_SPEC; the malformed-finding path still raises INVALID_ARGUMENT from inside route_wardline_scan, kept distinct from routing errors). One intended change closes the drift: an empty per-severity map is now rejected up front on both transports — no silent governance skip. TDD: tests/service/test_wardline.py (14 cases) + a new MCP regression test_scan_route_rejects_empty_severity_map. Full suite 699 passed, mypy + ruff clean, coverage floors hold, policy-boundary-check + governance-gate + SEI oracle green. rc4 review finding #4 (legis-604ddb8dd4). Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 13 +++ src/legis/api/app.py | 107 +++++++++---------------- src/legis/mcp.py | 102 +++++++----------------- src/legis/service/errors.py | 19 +++++ src/legis/service/wardline.py | 132 +++++++++++++++++++++++++++++++ tests/mcp/test_server.py | 25 ++++++ tests/service/test_wardline.py | 140 +++++++++++++++++++++++++++++++++ 7 files changed, 393 insertions(+), 145 deletions(-) create mode 100644 tests/service/test_wardline.py diff --git a/CHANGELOG.md b/CHANGELOG.md index e606262..9d47600 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -212,6 +212,19 @@ listed as not-yet-built. `canonical.canonical_json` (whose `ensure_ascii=False` would change the signed bytes). Behavior-preserving — existing per-channel golden vectors unchanged, plus a new cross-channel anti-drift test. No change to signatures on the wire. +- **Wardline scan-routing validation centralised in the service layer** — "is + request-side routing allowed, and is the cell-spec well-formed?" is a + governance decision that was hand-copied into both the HTTP + (`/wardline/scan-results`) and MCP (`scan_route`) adapters, along with the + cell-spec parse and a `_parse_wardline_cell_map` helper. 
The copies had already + drifted: the HTTP adapter rejected an empty `cell_by_severity` (422) while MCP + silently accepted an empty `severity_map` and routed nothing. The decision now + lives in `service.resolve_scan_routing`, raising a `WardlineRoutingError` whose + `kind` each adapter maps to its own taxonomy (HTTP 500/403/422 by kind; MCP + collapses to `INVALID_CELL_SPEC`) — so a new routing rule is added once and + cannot reach one transport but not the other. Behavior-preserving for every + pinned case; the one intended change closes the drift (an empty per-severity + map is now rejected up front on both transports — no silent governance skip). ### Fixed - **Ingest accepts realistic scans** — the over-strict Wardline ingest validator diff --git a/src/legis/api/app.py b/src/legis/api/app.py index 9fe6173..da70f13 100644 --- a/src/legis/api/app.py +++ b/src/legis/api/app.py @@ -48,7 +48,12 @@ from legis.governance.signoff_binding import bind_signoff_to_issue from legis.identity.entity_key import EntityKey from legis.identity.resolver import IdentityResolver -from legis.service.errors import AuditIntegrityError, InvalidArgumentError, NotEnabledError +from legis.service.errors import ( + AuditIntegrityError, + InvalidArgumentError, + NotEnabledError, + WardlineRoutingError, +) from legis.service.governance import compute_override_rate as _compute_override_rate from legis.service.governance import evaluate_policy as _evaluate_policy from legis.service.governance import request_signoff as _request_signoff @@ -58,7 +63,10 @@ from legis.service.governance import submit_override as _submit_override from legis.service.governance import submit_protected_override as _submit_protected_override from legis.service.governance import verified_records as _verified_records -from legis.service.wardline import route_wardline_scan as _route_wardline_scan +from legis.service.wardline import ( + resolve_scan_routing, + route_wardline_scan as _route_wardline_scan, +) from 
legis.policy.grammar import PolicyGrammar, default_grammar from legis.pulls.models import PullRequest, PullRequestState from legis.pulls.surface import PullSurface @@ -67,7 +75,6 @@ ScanOutcome, WardlineDirtyTreeError, WardlinePayloadError, - WardlineSeverity, ) security = HTTPBearer(auto_error=False) @@ -247,20 +254,13 @@ class CheckRunIn(BaseModel): finished_at: str | None = None -def _parse_wardline_cell_map(raw: str) -> dict[WardlineSeverity, WardlineCellPolicy]: - mapping: dict[WardlineSeverity, WardlineCellPolicy] = {} - for part in raw.split(","): - if not part.strip(): - continue - severity_raw, sep, cell_raw = part.partition("=") - if not sep: - raise ValueError("cell map entries must be SEVERITY=cell") - mapping[WardlineSeverity[severity_raw.strip()]] = WardlineCellPolicy( - cell_raw.strip() - ) - if not mapping: - raise ValueError("cell map must not be empty") - return mapping +# Wardline scan-routing rejections (raised by service.resolve_scan_routing) map +# to HTTP status by kind; the MCP adapter collapses the same kinds to one code. 
+_WARDLINE_ROUTING_STATUS = { + WardlineRoutingError.SERVER_MISCONFIGURED: 500, + WardlineRoutingError.SERVER_OWNED: 403, + WardlineRoutingError.MALFORMED: 422, +} def _check_to_dict(run: CheckRun) -> dict: @@ -772,62 +772,27 @@ def policy_evaluate(body: PolicyEvalIn, actor: str = Depends(verify_writer)) -> @app.post("/wardline/scan-results") def wardline_scan_results(body: ScanResultsIn, actor: str = Depends(verify_writer)) -> dict: - server_cell = os.environ.get("LEGIS_WARDLINE_CELL") - server_cell_by_severity = os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY") - if server_cell and server_cell_by_severity: - raise HTTPException(status_code=500, detail="server Wardline routing is misconfigured") - server_routing = server_cell is not None or server_cell_by_severity is not None - if server_routing and ( - body.cell is not None or body.cell_by_severity is not None or body.fail_on is not None - ): - raise HTTPException(status_code=403, detail="Wardline routing is server-owned") - if not server_routing: - if os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") != "1": - raise HTTPException( - status_code=403, - detail="Wardline routing is server-owned; configure LEGIS_WARDLINE_CELL or LEGIS_WARDLINE_CELL_BY_SEVERITY", - ) - if body.fail_on is not None: - if body.cell is None or body.cell_by_severity is not None: - raise HTTPException( - status_code=422, - detail="fail_on routing requires cell and forbids cell_by_severity", - ) - elif (body.cell is None) == (body.cell_by_severity is None): - raise HTTPException(status_code=422, - detail="provide exactly one of cell or cell_by_severity") - if body.cell_by_severity is not None and not body.cell_by_severity: - raise HTTPException(status_code=422, detail="cell_by_severity must not be empty") - - policy: WardlineCellPolicy | None = None - cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None - fail_on: WardlineSeverity | None = None try: - if server_cell_by_severity is not None: - cell_map = 
_parse_wardline_cell_map(server_cell_by_severity) - cells = set(cell_map.values()) - elif server_cell is not None: - policy = WardlineCellPolicy(server_cell) - cells = {policy} - elif body.cell_by_severity is not None: - cell_map = {WardlineSeverity[sev]: WardlineCellPolicy(cell) - for sev, cell in body.cell_by_severity.items()} - cells = set(cell_map.values()) - else: - policy = WardlineCellPolicy(body.cell) - if body.fail_on is not None: - fail_on = WardlineSeverity[body.fail_on] - cells = {policy, WardlineCellPolicy.SURFACE_ONLY} - else: - cells = {policy} - except (KeyError, ValueError) as exc: - raise HTTPException(status_code=422, detail=f"unknown cell/severity: {exc}") + routing = resolve_scan_routing( + server_cell=os.environ.get("LEGIS_WARDLINE_CELL"), + server_cell_by_severity=os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY"), + request_cell=body.cell, + request_severity_map=body.cell_by_severity, + request_fail_on=body.fail_on, + allow_request_routing=( + os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") == "1" + ), + ) + except WardlineRoutingError as exc: + raise HTTPException( + status_code=_WARDLINE_ROUTING_STATUS[exc.kind], detail=str(exc) + ) from exc # Only provision the governance store when a surface cell can actually run: # engine() lazily creates .weft/legis/legis-governance.db, so a pure block_escalate scan # must not touch it. signoff_gate is an injected param (no side effect). 
- needs_engine = bool(cells & {WardlineCellPolicy.SURFACE_OVERRIDE, - WardlineCellPolicy.SURFACE_ONLY}) + needs_engine = bool(routing.cells & {WardlineCellPolicy.SURFACE_OVERRIDE, + WardlineCellPolicy.SURFACE_ONLY}) try: routed = _route_wardline_scan( body.scan, @@ -835,9 +800,9 @@ def wardline_scan_results(body: ScanResultsIn, actor: str = Depends(verify_write identity=identity, engine=engine() if needs_engine else None, signoff=signoff_gate, - policy=policy, - cell_map=cell_map, - fail_on=fail_on, + policy=routing.policy, + cell_map=routing.cell_map, + fail_on=routing.fail_on, artifact_key=( os.environ["LEGIS_WARDLINE_ARTIFACT_KEY"].encode("utf-8") if os.environ.get("LEGIS_WARDLINE_ARTIFACT_KEY") diff --git a/src/legis/mcp.py b/src/legis/mcp.py index 336cfa0..dbd81a9 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -43,6 +43,7 @@ NotEnabledError, NotFoundError, ServiceError, + WardlineRoutingError, ) from legis.service.explain import explain_policy from legis.service.governance import ( @@ -53,10 +54,9 @@ request_signoff, verified_records as service_verified_records, ) -from legis.service.wardline import route_wardline_scan +from legis.service.wardline import resolve_scan_routing, route_wardline_scan from legis.store.audit_store import AuditStore -from legis.wardline.governor import WardlineCellPolicy -from legis.wardline.ingest import ScanOutcome, WardlineDirtyTreeError, WardlineSeverity +from legis.wardline.ingest import ScanOutcome, WardlineDirtyTreeError _AGENT_TOOLS = frozenset( @@ -414,6 +414,11 @@ def _service_error(exc: Exception) -> dict[str, Any]: return _tool_error("NOT_FOUND", str(exc)) if isinstance(exc, InvalidArgumentError): return _tool_error("INVALID_ARGUMENT", str(exc)) + if isinstance(exc, WardlineRoutingError): + # All three routing kinds (server-misconfigured / server-owned / + # malformed) collapse to one MCP code; the HTTP adapter splits them by + # status. Must precede the generic ServiceError case below. 
+ return _tool_error("INVALID_CELL_SPEC", str(exc)) if isinstance(exc, GitError): return _tool_error("GIT_ERROR", str(exc)) if isinstance(exc, ServiceError): @@ -519,22 +524,6 @@ def _registry(runtime: McpRuntime) -> PolicyCellRegistry: return runtime.cell_registry or fail_closed_policy_cells() -def _parse_wardline_cell_map(raw: str) -> dict[WardlineSeverity, WardlineCellPolicy]: - mapping: dict[WardlineSeverity, WardlineCellPolicy] = {} - for part in raw.split(","): - if not part.strip(): - continue - severity_raw, sep, cell_raw = part.partition("=") - if not sep: - raise ValueError("cell map entries must be SEVERITY=cell") - mapping[WardlineSeverity[severity_raw.strip()]] = WardlineCellPolicy( - cell_raw.strip() - ) - if not mapping: - raise ValueError("cell map must not be empty") - return mapping - - def _explanation_payload(explanation) -> dict[str, Any]: payload = explanation.to_payload() payload["available_moves"] = [ @@ -925,59 +914,24 @@ def _tool_policy_evaluate(runtime: McpRuntime, args: dict[str, Any]) -> dict[str def _tool_scan_route(runtime: McpRuntime, args: dict[str, Any]) -> dict[str, Any]: - server_cell = os.environ.get("LEGIS_WARDLINE_CELL") - server_cell_by_severity = os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY") - if server_cell and server_cell_by_severity: - return _tool_error( - "INVALID_CELL_SPEC", "server Wardline routing is misconfigured" - ) - has_cell = "cell" in args - has_map = "severity_map" in args - has_fail_on = "fail_on" in args - server_routing = server_cell is not None or server_cell_by_severity is not None - if server_routing and (has_cell or has_map or has_fail_on): - return _tool_error( - "INVALID_CELL_SPEC", "Wardline routing is server-owned" - ) - if not server_routing: - if os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") != "1": - return _tool_error( - "INVALID_CELL_SPEC", - "Wardline routing is server-owned; configure " - "LEGIS_WARDLINE_CELL or LEGIS_WARDLINE_CELL_BY_SEVERITY", - ) - if has_fail_on: - if not 
has_cell or has_map: - return _tool_error( - "INVALID_CELL_SPEC", - "fail_on routing requires cell and forbids severity_map", - ) - elif has_cell == has_map: - return _tool_error( - "INVALID_CELL_SPEC", - "provide exactly one of cell or severity_map", - ) + # "severity_map" must be an object if present (transport-type check); the + # governance decision — is request routing allowed, and is the spec + # well-formed? — lives in resolve_scan_routing, shared with the HTTP adapter. + # A WardlineRoutingError propagates to call_tool's translator → INVALID_CELL_SPEC. + request_severity_map = ( + _require_object(args, "severity_map") if "severity_map" in args else None + ) + routing = resolve_scan_routing( + server_cell=os.environ.get("LEGIS_WARDLINE_CELL"), + server_cell_by_severity=os.environ.get("LEGIS_WARDLINE_CELL_BY_SEVERITY"), + request_cell=args.get("cell"), + request_severity_map=request_severity_map, + request_fail_on=args.get("fail_on"), + allow_request_routing=( + os.environ.get("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING") == "1" + ), + ) scan = _require_object(args, "scan") - scan_policy: WardlineCellPolicy | None = None - scan_cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None - scan_fail_on: WardlineSeverity | None = None - try: - if server_cell_by_severity is not None: - scan_cell_map = _parse_wardline_cell_map(server_cell_by_severity) - elif server_cell is not None: - scan_policy = WardlineCellPolicy(server_cell) - elif has_cell: - scan_policy = WardlineCellPolicy(_require(args, "cell")) - if has_fail_on: - scan_fail_on = WardlineSeverity[_require(args, "fail_on")] - else: - raw_map = _require_object(args, "severity_map") - scan_cell_map = { - WardlineSeverity[severity]: WardlineCellPolicy(cell) - for severity, cell in raw_map.items() - } - except (KeyError, ValueError) as exc: - return _tool_error("INVALID_CELL_SPEC", str(exc)) try: routed = route_wardline_scan( scan, @@ -985,9 +939,9 @@ def _tool_scan_route(runtime: McpRuntime, args: 
dict[str, Any]) -> dict[str, Any identity=runtime.identity, engine=_engine(runtime), signoff=runtime.signoff_gate, - policy=scan_policy, - cell_map=scan_cell_map, - fail_on=scan_fail_on, + policy=routing.policy, + cell_map=routing.cell_map, + fail_on=routing.fail_on, artifact_key=( runtime.wardline_artifact_key or ( diff --git a/src/legis/service/errors.py b/src/legis/service/errors.py index 0b952e2..94065d3 100644 --- a/src/legis/service/errors.py +++ b/src/legis/service/errors.py @@ -28,6 +28,25 @@ class InvalidArgumentError(ServiceError): """Caller input is structurally valid for the transport but invalid for Legis.""" +class WardlineRoutingError(ServiceError): + """A Wardline scan-routing request is not permitted or is malformed. + + Carries a ``kind`` discriminator so each adapter can preserve its own + taxonomy without re-implementing the decision: the HTTP adapter maps + ``server_misconfigured`` → 500, ``server_owned`` → 403, ``malformed`` → 422, + while the MCP adapter collapses all three to ``INVALID_CELL_SPEC``. Adapters + switch on the ``kind`` attribute, never on message text. + """ + + SERVER_MISCONFIGURED = "server_misconfigured" + SERVER_OWNED = "server_owned" + MALFORMED = "malformed" + + def __init__(self, kind: str, message: str) -> None: + super().__init__(message) + self.kind = kind + + class ProtectedKeyRequiredError(ServiceError): """A protected trail was read without the HMAC key needed to verify it. 
diff --git a/src/legis/service/wardline.py b/src/legis/service/wardline.py index a34f410..33c0aef 100644 --- a/src/legis/service/wardline.py +++ b/src/legis/service/wardline.py @@ -3,6 +3,7 @@ from __future__ import annotations from collections.abc import Mapping +from dataclasses import dataclass from typing import Any from legis.canonical import content_hash @@ -10,6 +11,7 @@ from legis.enforcement.signoff import SignoffGate from legis.identity.entity_key import EntityKey from legis.identity.resolver import IdentityResolver +from legis.service.errors import WardlineRoutingError from legis.service.governance import resolve_for_record from legis.wardline.governor import WardlineCellPolicy, route_findings from legis.wardline.ingest import ( @@ -21,6 +23,136 @@ from legis.wardline.policy import resolve_cell +@dataclass(frozen=True) +class ResolvedRouting: + """The resolved Wardline routing intent for a single scan. + + Exactly one of ``policy`` / ``cell_map`` is set unless ``fail_on`` is given + (then ``policy`` is the gate cell and per-finding resolution happens inside + ``route_wardline_scan``). ``cells`` is the set of cells that may actually run + — an adapter uses it to decide whether the governance engine is needed. 
+ """ + + policy: WardlineCellPolicy | None + cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None + fail_on: WardlineSeverity | None + cells: frozenset[WardlineCellPolicy] + + +def _parse_cell_map_env(raw: str) -> dict[WardlineSeverity, WardlineCellPolicy]: + mapping: dict[WardlineSeverity, WardlineCellPolicy] = {} + for part in raw.split(","): + if not part.strip(): + continue + severity_raw, sep, cell_raw = part.partition("=") + if not sep: + raise ValueError("cell map entries must be SEVERITY=cell") + mapping[WardlineSeverity[severity_raw.strip()]] = WardlineCellPolicy( + cell_raw.strip() + ) + if not mapping: + raise ValueError("cell map must not be empty") + return mapping + + +def resolve_scan_routing( + *, + server_cell: str | None, + server_cell_by_severity: str | None, + request_cell: str | None, + request_severity_map: dict[str, str] | None, + request_fail_on: str | None, + allow_request_routing: bool, +) -> ResolvedRouting: + """Resolve a scan-routing request to a ``ResolvedRouting`` or reject it. + + This is the single home for the governance decision the two transports used + to hand-copy: *is request-side routing allowed, and is the cell-spec + well-formed?* The caller passes already-read server-config values (env stays + in the adapter) plus the normalized request fields; every rejection is a + ``WardlineRoutingError`` whose ``kind`` the adapter maps to its own taxonomy. + + Routing is server-owned by default: a deployment pins the cell(s) via env and + callers may not override. ``allow_request_routing`` (the + ``LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING`` opt-in) is the only path to a + caller-supplied spec. Check order is part of the contract: + misconfigured → server-owned → malformed. 
+ """ + if server_cell is not None and server_cell_by_severity is not None: + raise WardlineRoutingError( + WardlineRoutingError.SERVER_MISCONFIGURED, + "server Wardline routing is misconfigured", + ) + server_routing = server_cell is not None or server_cell_by_severity is not None + request_routing = ( + request_cell is not None + or request_severity_map is not None + or request_fail_on is not None + ) + if server_routing: + if request_routing: + raise WardlineRoutingError( + WardlineRoutingError.SERVER_OWNED, "Wardline routing is server-owned" + ) + else: + if not allow_request_routing: + raise WardlineRoutingError( + WardlineRoutingError.SERVER_OWNED, + "Wardline routing is server-owned; configure LEGIS_WARDLINE_CELL " + "or LEGIS_WARDLINE_CELL_BY_SEVERITY", + ) + if request_fail_on is not None: + if request_cell is None or request_severity_map is not None: + raise WardlineRoutingError( + WardlineRoutingError.MALFORMED, + "fail_on routing requires cell and forbids a per-severity map", + ) + elif (request_cell is None) == (request_severity_map is None): + raise WardlineRoutingError( + WardlineRoutingError.MALFORMED, + "provide exactly one of cell or a per-severity map", + ) + if request_severity_map is not None and not request_severity_map: + raise WardlineRoutingError( + WardlineRoutingError.MALFORMED, "per-severity map must not be empty" + ) + + policy: WardlineCellPolicy | None = None + cell_map: dict[WardlineSeverity, WardlineCellPolicy] | None = None + fail_on: WardlineSeverity | None = None + try: + if server_cell_by_severity is not None: + cell_map = _parse_cell_map_env(server_cell_by_severity) + elif server_cell is not None: + policy = WardlineCellPolicy(server_cell) + elif request_severity_map is not None: + cell_map = { + WardlineSeverity[sev]: WardlineCellPolicy(cell) + for sev, cell in request_severity_map.items() + } + else: + policy = WardlineCellPolicy(request_cell) # type: ignore[arg-type] + if request_fail_on is not None: + fail_on = 
WardlineSeverity[request_fail_on] + except (KeyError, ValueError) as exc: + raise WardlineRoutingError( + WardlineRoutingError.MALFORMED, f"unknown cell/severity: {exc}" + ) from exc + + if fail_on is not None: + cells = {policy, WardlineCellPolicy.SURFACE_ONLY} + elif cell_map is not None: + cells = set(cell_map.values()) + else: + cells = {policy} + return ResolvedRouting( + policy=policy, + cell_map=cell_map, + fail_on=fail_on, + cells=frozenset(c for c in cells if c is not None), + ) + + def route_wardline_scan( scan: Mapping[str, Any], *, diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index 4a20d8d..8d05f9b 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -866,6 +866,31 @@ def test_scan_route_requires_exactly_one_cell_spec_and_routes_findings(tmp_path, } +def test_scan_route_rejects_empty_severity_map(tmp_path, monkeypatch): + # Drift fix: the HTTP adapter already rejected an empty cell_by_severity, but + # MCP silently accepted an empty severity_map (routed nothing). Both transports + # now reject it up front via the shared resolver — no silent governance skip. 
+ monkeypatch.setenv("LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING", "1") + runtime, store = _runtime(tmp_path) + result = _run( + _messages( + { + "jsonrpc": "2.0", + "id": 1, + "method": "tools/call", + "params": { + "name": "scan_route", + "arguments": {"scan": _active_scan(), "severity_map": {}}, + }, + } + ), + runtime, + )[0]["result"] + assert result["isError"] is True + assert result["structuredContent"]["error_code"] == "INVALID_CELL_SPEC" + assert store.read_all() == [] + + def test_scan_route_rejects_request_routing_when_server_owned(tmp_path, monkeypatch): monkeypatch.setenv("LEGIS_WARDLINE_CELL", "surface_only") runtime, store = _runtime(tmp_path) diff --git a/tests/service/test_wardline.py b/tests/service/test_wardline.py new file mode 100644 index 0000000..9859e61 --- /dev/null +++ b/tests/service/test_wardline.py @@ -0,0 +1,140 @@ +"""Transport-agnostic Wardline scan-routing resolution. + +These pin the single governance decision — "is request-side routing allowed, +and is the cell-spec well-formed?" — that both the HTTP and MCP adapters now +delegate to instead of hand-copying (the duplication this resolver removed). 
+""" + +from __future__ import annotations + +import pytest + +from legis.service.errors import WardlineRoutingError +from legis.service.wardline import resolve_scan_routing +from legis.wardline.governor import WardlineCellPolicy +from legis.wardline.ingest import WardlineSeverity + + +def _resolve(**overrides): + base = dict( + server_cell=None, + server_cell_by_severity=None, + request_cell=None, + request_severity_map=None, + request_fail_on=None, + allow_request_routing=False, + ) + base.update(overrides) + return resolve_scan_routing(**base) + + +def test_server_cell_resolves_to_single_policy(): + r = _resolve(server_cell="surface_override") + assert r.policy is WardlineCellPolicy.SURFACE_OVERRIDE + assert r.cell_map is None and r.fail_on is None + assert r.cells == frozenset({WardlineCellPolicy.SURFACE_OVERRIDE}) + + +def test_server_cell_by_severity_resolves_to_cell_map(): + r = _resolve(server_cell_by_severity="CRITICAL=surface_override,INFO=surface_only") + assert r.policy is None + assert r.cell_map == { + WardlineSeverity.CRITICAL: WardlineCellPolicy.SURFACE_OVERRIDE, + WardlineSeverity.INFO: WardlineCellPolicy.SURFACE_ONLY, + } + + +def test_both_server_env_set_is_server_misconfigured(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(server_cell="surface_only", server_cell_by_severity="INFO=surface_only") + assert exc.value.kind == WardlineRoutingError.SERVER_MISCONFIGURED + + +def test_request_routing_under_server_ownership_is_rejected(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(server_cell="surface_only", request_cell="surface_override") + assert exc.value.kind == WardlineRoutingError.SERVER_OWNED + assert "server-owned" in str(exc.value) + + +def test_request_routing_without_optin_is_server_owned(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(request_cell="surface_override", allow_request_routing=False) + assert exc.value.kind == WardlineRoutingError.SERVER_OWNED + assert "server-owned" in 
str(exc.value) + + +def test_request_cell_resolves_when_optedin(): + r = _resolve(request_cell="surface_override", allow_request_routing=True) + assert r.policy is WardlineCellPolicy.SURFACE_OVERRIDE + + +def test_request_severity_map_resolves_when_optedin(): + r = _resolve( + request_severity_map={"CRITICAL": "surface_override"}, + allow_request_routing=True, + ) + assert r.cell_map == {WardlineSeverity.CRITICAL: WardlineCellPolicy.SURFACE_OVERRIDE} + + +def test_request_fail_on_with_cell_resolves_and_exposes_surface_only(): + r = _resolve( + request_cell="surface_override", request_fail_on="ERROR", + allow_request_routing=True, + ) + assert r.policy is WardlineCellPolicy.SURFACE_OVERRIDE + assert r.fail_on is WardlineSeverity.ERROR + # fail_on resolves per-finding to the gate cell or surface_only, so both may run. + assert r.cells == frozenset( + {WardlineCellPolicy.SURFACE_OVERRIDE, WardlineCellPolicy.SURFACE_ONLY} + ) + + +def test_fail_on_without_cell_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve( + request_fail_on="ERROR", + request_severity_map={"ERROR": "surface_only"}, + allow_request_routing=True, + ) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_both_cell_and_map_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve( + request_cell="surface_only", + request_severity_map={"INFO": "surface_only"}, + allow_request_routing=True, + ) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_neither_cell_nor_map_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(allow_request_routing=True) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_empty_request_severity_map_is_malformed(): + # The drift fix: HTTP already rejected an empty cell_by_severity; MCP silently + # accepted an empty severity_map (routed nothing). The resolver rejects it for + # both transports. 
+ with pytest.raises(WardlineRoutingError) as exc: + _resolve(request_severity_map={}, allow_request_routing=True) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_unknown_cell_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve(request_cell="not_a_cell", allow_request_routing=True) + assert exc.value.kind == WardlineRoutingError.MALFORMED + + +def test_unknown_fail_on_severity_is_malformed(): + with pytest.raises(WardlineRoutingError) as exc: + _resolve( + request_cell="surface_only", request_fail_on="SEVERE", + allow_request_routing=True, + ) + assert exc.value.kind == WardlineRoutingError.MALFORMED From a217b00fed1812c42749ecf373d67928e3c55232 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 17:51:44 +1000 Subject: [PATCH 44/72] style(tests): drop unused pytest import in test_weft_signing Leftover from the weft_signing extraction (020c0c6). The CI ruff gate scopes to src/, so `ruff check src tests` is needed to catch it; now clean. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/test_weft_signing.py | 2 -- 1 file changed, 2 deletions(-) diff --git a/tests/test_weft_signing.py b/tests/test_weft_signing.py index eb306cd..a69163b 100644 --- a/tests/test_weft_signing.py +++ b/tests/test_weft_signing.py @@ -10,8 +10,6 @@ import hashlib import hmac -import pytest - from legis.filigree.client import sign_filigree_request from legis.identity.loomweave_client import sign_loomweave_request from legis.weft_signing import ( From f32801f355ddc53c31fb831bcd44ca1f68dd8adc Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 17:52:22 +1000 Subject: [PATCH 45/72] refactor(install): colocate instruction-marker reader with its writer MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit hooks.py re-encoded the marker format with its own regex (), independently of install.py, which builds the marker and owns INSTRUCTIONS_MARKER. A change to the marker spacing or token shape in the writer would silently desync the reader, breaking the SessionStart / MCP-boot drift refresh — the hook's entire job. Move the token-extraction helper (_extract_marker_token) into install.py next to _build_instructions_block / _marker_token. Its regex is now re.escape'd from the INSTRUCTIONS_MARKER constant (the prefix can't desync) and captures the token opaquely as \S+ rather than re-encoding the v{version}:{hash} shape, so a future token-shape change needs no edit here. hooks.py imports it; its now-unused `re` import is dropped. Relocate the marker-token tests to test_install.py (the writer's home) and strengthen the round-trip to parse the ACTUAL writer output: _extract_marker_token(_build_instructions_block()) == _marker_token(). A future marker-format change now fails this loudly instead of silently desyncing. Full suite 699 passed, ruff (src+tests) + mypy clean, coverage floors hold. rc4 review finding #6 (legis-bd49bb8048). 
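A minimal sketch of the reader-derived-from-writer idea described above, not the actual install.py code. The marker prefix, `build_block`, and `extract_token` are hypothetical stand-ins (the real `INSTRUCTIONS_MARKER` value is not reproduced here); only the shape of the technique is taken from this commit: escape the prefix from the shared constant and capture the token opaquely.

```python
from __future__ import annotations

import re

# Hypothetical stand-in for the shared marker constant owned by the writer.
MARKER_PREFIX = "<!-- example:instructions"


def build_block(token: str, body: str) -> str:
    """Writer: emit an opening marker carrying the token, the body, then a close marker."""
    return f"{MARKER_PREFIX}:{token} -->\n{body}\n<!-- /example:instructions -->"


# Reader: the prefix is escaped from the same constant the writer uses, and the
# token is captured as opaque non-whitespace, so its internal shape is not re-encoded.
_TOKEN_RE = re.compile(re.escape(MARKER_PREFIX) + r":(\S+) -->")


def extract_token(content: str) -> str | None:
    """Return the token from the first opening marker, or None if there is none."""
    m = _TOKEN_RE.search(content)
    return m.group(1) if m else None
```

Round-tripping through the writer's real output (`extract_token(build_block(...))`) rather than a hand-written marker string is what would make a format change fail the test instead of silently desyncing.
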
Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 13 +++++++++++++ src/legis/hooks.py | 10 +--------- src/legis/install.py | 15 +++++++++++++++ tests/test_hooks.py | 12 ------------ tests/test_install.py | 16 ++++++++++++++++ 5 files changed, 45 insertions(+), 21 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 9d47600..0f1f7ec 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -225,6 +225,19 @@ listed as not-yet-built. cannot reach one transport but not the other. Behavior-preserving for every pinned case; the one intended change closes the drift (an empty per-severity map is now rejected up front on both transports — no silent governance skip). +- **Instruction-marker reader colocated with its writer** — the SessionStart / + MCP-boot freshness check in `hooks.py` re-encoded the marker format + (``) with its own regex, + independently of `install.py`, which builds the marker and owns + `INSTRUCTIONS_MARKER`. A change to the marker spacing or token shape in the + writer would silently desync the reader, and the drift check — the hook's whole + job — would stop matching. The token-extraction helper (`_extract_marker_token`) + now lives next to the writer in `install.py`; its regex is `re.escape`d from the + `INSTRUCTIONS_MARKER` constant and captures the token opaquely (`\S+`), so it + cannot desync from the prefix and needs no edit if the token shape changes. A + round-trip test (`_extract_marker_token(_build_instructions_block())` == + `_marker_token()`) pins reader-to-writer, failing loudly on any future format + change. 
### Fixed - **Ingest accepts realistic scans** — the over-strict Wardline ingest validator diff --git a/src/legis/hooks.py b/src/legis/hooks.py index 142b35b..9a95813 100644 --- a/src/legis/hooks.py +++ b/src/legis/hooks.py @@ -15,12 +15,12 @@ from __future__ import annotations import logging -import re from pathlib import Path from legis.install import ( INSTRUCTIONS_MARKER, SKILL_NAME, + _extract_marker_token, _get_skills_source_dir, _marker_token, _skill_tree_fingerprint, @@ -31,14 +31,6 @@ logger = logging.getLogger(__name__) -_MARKER_TOKEN_RE = re.compile(r"") - - -def _extract_marker_token(content: str) -> str | None: - """Return the ``v{version}:{hash}`` token from a legis marker, or ``None``.""" - m = _MARKER_TOKEN_RE.search(content) - return m.group(1) if m else None - def refresh_instructions(root: Path) -> list[str]: """Refresh drifted legis instruction blocks and skill packs under *root*. diff --git a/src/legis/install.py b/src/legis/install.py index 0a9527e..c336473 100644 --- a/src/legis/install.py +++ b/src/legis/install.py @@ -204,6 +204,21 @@ def _build_instructions_block() -> str: return f"{opening}\n{text}{_END_MARKER}" +# Reader counterpart to the opening marker built in `_build_instructions_block`. +# It lives next to the writer (and is derived from the same `INSTRUCTIONS_MARKER` +# constant) so the freshness check cannot silently desync from the marker format: +# the prefix is `re.escape`d from the constant, and the token is captured as an +# opaque `\S+` rather than re-encoding its `v{version}:{hash}` shape — so a future +# change to the token shape needs no edit here. The round-trip is pinned by a test. 
+_MARKER_TOKEN_RE = re.compile(re.escape(INSTRUCTIONS_MARKER) + r":(\S+) -->") + + +def _extract_marker_token(content: str) -> str | None: + """Return the token from the first legis instruction marker, or ``None``.""" + m = _MARKER_TOKEN_RE.search(content) + return m.group(1) if m else None + + def _atomic_write_text(path: Path, content: str) -> None: """Write *content* to *path* atomically (temp + rename), preserving mode.""" # Refuse-to-empty guard (filigree-04bad2a2bf parity). Every caller of this diff --git a/tests/test_hooks.py b/tests/test_hooks.py index ccd848b..18d82ec 100644 --- a/tests/test_hooks.py +++ b/tests/test_hooks.py @@ -6,29 +6,17 @@ from legis import hooks, install from legis.hooks import ( - _extract_marker_token, generate_session_context, refresh_instructions, ) from legis.install import ( SKILL_NAME, - _marker_token, inject_instructions, install_codex_skills, install_skills, ) -def test_extract_marker_token_roundtrip(): - token = _marker_token() - content = f"x\n\nbody\n" - assert _extract_marker_token(content) == token - - -def test_extract_marker_token_absent(): - assert _extract_marker_token("no marker here") is None - - def test_refresh_noop_when_fresh(tmp_path): inject_instructions(tmp_path / "CLAUDE.md") inject_instructions(tmp_path / "AGENTS.md") diff --git a/tests/test_install.py b/tests/test_install.py index 44bca1b..5329e0e 100644 --- a/tests/test_install.py +++ b/tests/test_install.py @@ -15,6 +15,7 @@ SKILL_NAME, UnsafeInstallPathError, _build_instructions_block, + _extract_marker_token, _instructions_hash, _instructions_text, _instructions_version, @@ -85,6 +86,21 @@ def test_build_block_has_open_and_close_markers(): assert _instructions_text() in block +def test_extract_marker_token_round_trips_the_writer(): + # The freshness check's reader must parse the exact marker the writer emits. 
+ # Driving it off the real `_build_instructions_block()` output (not a + # hand-written marker) is what keeps the reader from silently desyncing if + # the marker format ever changes — both live in install.py now. + assert _extract_marker_token(_build_instructions_block()) == _marker_token() + + +def test_extract_marker_token_ignores_the_close_marker_and_absence(): + # The close marker (``) carries no token and must + # not be mistaken for the open marker; absent any marker yields None. + assert _extract_marker_token("") is None + assert _extract_marker_token("no marker here") is None + + # --------------------------------------------------------------------------- # inject_instructions # --------------------------------------------------------------------------- From 1805e3780cda5c5294a190b9facd15539eaad17f Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 17:55:30 +1000 Subject: [PATCH 46/72] docs(governance): record why full per-read trail verification is intentional MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit rc4 review #7 flagged verify_integrity's O(N) full-chain re-hash on interactive read paths (verified_records, hit by the keyed override-submit idempotency check and override_rate_get) and suggested incremental verification or reserving full verification for the explicit governance-gate. Investigation: the cost is the tamper-evidence property, not a defect. There is no load-time/open-time verification anywhere (AuditStore.__init__ only creates the schema), so verified_records is the ONLY tamper check on interactive paths — reserving it for the gate would leave every interactive read unverified. Incremental verification cannot detect out-of-band tampering of an already-verified prefix record (the chain gives O(1) verification of appends, not of a mutated prefix) and would not reach O(1) anyway, since the signature pass is O(N) regardless. 
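To illustrate the prefix-tampering argument, here is a toy hash chain, offered as a sketch under stated assumptions rather than the AuditStore implementation. The record layout, `GENESIS` value, and all function names are hypothetical; the only claim is that tail-only verification accepts a chain whose already-verified prefix was edited out of band, while a full re-hash rejects it.

```python
import hashlib

GENESIS = "0" * 64  # hypothetical genesis hash for this sketch


def link(prev_hash: str, payload: str) -> str:
    """Hash of a record: binds its payload to the previous record's hash."""
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def build_chain(payloads):
    """Append records one by one, each storing the hash that chains it to its predecessor."""
    records, prev = [], GENESIS
    for p in payloads:
        h = link(prev, p)
        records.append({"payload": p, "hash": h})
        prev = h
    return records


def verify_full(records) -> bool:
    """O(N): re-hash every record from genesis."""
    prev = GENESIS
    for r in records:
        if link(prev, r["payload"]) != r["hash"]:
            return False
        prev = r["hash"]
    return True


def verify_tail(records, verified_upto: int) -> bool:
    """Incremental: trust records[:verified_upto] and re-hash only the tail."""
    prev = records[verified_upto - 1]["hash"] if verified_upto else GENESIS
    for r in records[verified_upto:]:
        if link(prev, r["payload"]) != r["hash"]:
            return False
        prev = r["hash"]
    return True


chain = build_chain(["a", "b", "c"])
chain[0]["payload"] = "tampered"  # out-of-band edit of an already-verified record
full_ok = verify_full(chain)      # full re-hash sees the mismatch at record 0
tail_ok = verify_tail(chain, 2)   # tail-only check never looks at record 0
```

In this toy, `full_ok` is false and `tail_ok` is true, which is the gap the message describes.
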
Operator-confirmed decision: document & decline. Add a cost-note to service.verified_records (intentional cost + the two rejected optimizations and why) and a pointer comment at audit_store.verify_integrity. No behavior change; if trail size ever becomes latency-bound the honest lever is retention/ compaction, not narrowing what each read verifies. rc4 review finding #7 (legis-4ab36517df). Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/service/governance.py | 17 +++++++++++++++++ src/legis/store/audit_store.py | 5 +++++ 2 files changed, 22 insertions(+) diff --git a/src/legis/service/governance.py b/src/legis/service/governance.py index 214f8ea..24f2747 100644 --- a/src/legis/service/governance.py +++ b/src/legis/service/governance.py @@ -84,6 +84,23 @@ def verified_records( owner exposing ``records()`` / ``verify_integrity()`` and a verifier exposing ``verify()``) so the service layer is not coupled to the enforcement concrete types. + + Cost note (rc4 review #7): this verifies the *whole* trail on every call — + ``verify_integrity()`` re-hashes the chain (O(N)) and ``trail_verifier.verify`` + re-checks signatures (O(N)) — including on interactive paths (the keyed + override-submit idempotency check and every override-rate read). That cost is + the tamper-evidence property, not an oversight: there is no load-time or + open-time verification anywhere (``AuditStore.__init__`` only creates the + schema), so this path is the only thing standing between a tampered record and + an interactive read. 
Two tempting optimizations are deliberately NOT taken: + reserving full verification for the explicit governance-gate would leave every + interactive read unverified (a silent tamper window); and incremental + verification (trusting a cached last-verified prefix and re-hashing only the + new tail) cannot detect out-of-band tampering of an already-verified record — + exactly what the hash chain exists to catch — and still would not reach O(1), + because the signature pass is O(N) regardless. If trail size ever makes this + latency-bound, the honest lever is trail retention/compaction, not narrowing + what each read verifies. """ if trail_owner is not None: records = trail_owner.records() diff --git a/src/legis/store/audit_store.py b/src/legis/store/audit_store.py index f5a97f8..c999ddc 100644 --- a/src/legis/store/audit_store.py +++ b/src/legis/store/audit_store.py @@ -270,6 +270,11 @@ def read_by_seq(self, seq: int) -> AuditRecord | None: ) def verify_integrity(self) -> bool: + # O(N) by design: a full chain re-hash is the only way to detect + # out-of-band tampering of an arbitrary record (the hash chain gives O(1) + # verification of *appends*, never of a mutated prefix). Callers on + # interactive read paths (service.verified_records) pay this deliberately; + # see that function's cost note (rc4 review #7) for why it is not narrowed. self._assert_no_batch_in_progress("verify_integrity") prev_hash = GENESIS try: From f7eafa78bc003ae2e9ee561c36c9076579bccbff Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 18:25:44 +1000 Subject: [PATCH 47/72] refactor(types): give recorded-fact provenance a shared str,Enum vocabulary MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit CheckRun.provenance and PullRequest.provenance were bare str literals defaulting to "unauthenticated" while the sibling outcome/status axes were converted to str,Enum in e77d6e4. 
provenance is a closed enumeration (today "unauthenticated"; tomorrow a signed-webhook value) hand-typed in ~4 places, so the authenticated value would arrive as another bare literal a typo could not catch. Add Provenance(str,Enum) in a new package-root module shared by checks and pulls (neither imports the other, and the vocabulary is genuinely common). Following the Suppressed precedent from that batch, the field stays typed str: a str,Enum member IS its wire string (json.dumps / canonical_json emit byte-identical payloads — HMAC contract unaffected), and the Text-column reads (row.provenance or ...) need no coercion that could raise on a legacy value. rc4 review finding #10. Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/checks/models.py | 4 +++- src/legis/checks/surface.py | 3 ++- src/legis/provenance.py | 27 +++++++++++++++++++++++++++ src/legis/pulls/models.py | 4 +++- src/legis/pulls/surface.py | 3 ++- 5 files changed, 37 insertions(+), 4 deletions(-) create mode 100644 src/legis/provenance.py diff --git a/src/legis/checks/models.py b/src/legis/checks/models.py index ea687c2..2aea94d 100644 --- a/src/legis/checks/models.py +++ b/src/legis/checks/models.py @@ -10,6 +10,8 @@ from dataclasses import dataclass from enum import Enum +from legis.provenance import Provenance + class CheckOutcome(str, Enum): PASS = "pass" @@ -37,4 +39,4 @@ class CheckRun: # "unauthenticated" so a consumer is never misled into treating a # writer-asserted "pass" as authoritative. An authenticated path (a signed # forge webhook) would set a stronger value; none exists today. 
- provenance: str = "unauthenticated" + provenance: str = Provenance.UNAUTHENTICATED diff --git a/src/legis/checks/surface.py b/src/legis/checks/surface.py index 55cfa91..c15414e 100644 --- a/src/legis/checks/surface.py +++ b/src/legis/checks/surface.py @@ -23,6 +23,7 @@ from sqlalchemy.pool import NullPool from legis.checks.models import CheckOutcome, CheckRun +from legis.provenance import Provenance class CheckSurface: @@ -111,7 +112,7 @@ def _to_run(r) -> CheckRun: finished_at=r.finished_at, recorded_by=r.recorded_by, # Rows written before this column existed are still writer-asserted. - provenance=r.provenance or "unauthenticated", + provenance=r.provenance or Provenance.UNAUTHENTICATED, ) def for_commit(self, sha: str) -> list[CheckRun]: diff --git a/src/legis/provenance.py b/src/legis/provenance.py new file mode 100644 index 0000000..4be22af --- /dev/null +++ b/src/legis/provenance.py @@ -0,0 +1,27 @@ +"""Provenance vocabulary shared by recorded forge/CI facts. + +``CheckRun`` and ``PullRequest`` are both *writer-supplied claims* — legis +records what a writer asserted, not what a forge cryptographically attested. The +provenance axis names how far that claim is backed. Today there is exactly one +member; an authenticated path (e.g. a signed forge webhook) would add a stronger +value here rather than as another hand-typed string literal. + +This is the single vocabulary source for both ``checks`` and ``pulls``; neither +package imports the other, so the enum lives at the package root they share. The +field stays typed ``str`` on the wire-facing dataclasses (matching the +``Suppressed`` precedent in the rc-series str,Enum conversion): a ``str,Enum`` +member *is* its wire string, so ``json.dumps`` / ``canonical_json`` emit +byte-identical payloads, and raw values read back out of the ``Text`` DB columns +never need coercion that could raise on a legacy/unexpected value. 
+""" + +from __future__ import annotations + +from enum import Enum + + +class Provenance(str, Enum): + """How far a recorded forge/CI claim is backed.""" + + # A writer-asserted fact with no signature or forge attestation behind it. + UNAUTHENTICATED = "unauthenticated" diff --git a/src/legis/pulls/models.py b/src/legis/pulls/models.py index 7141742..bba946f 100644 --- a/src/legis/pulls/models.py +++ b/src/legis/pulls/models.py @@ -5,6 +5,8 @@ from dataclasses import dataclass from enum import Enum +from legis.provenance import Provenance + class PullRequestState(str, Enum): OPEN = "open" @@ -24,4 +26,4 @@ class PullRequest: # Q-M4: recorded PR metadata is a writer-supplied claim, not forge-verified. # "unauthenticated" so a consumer never treats writer-asserted PR state as # authoritative (see CheckRun.provenance). - provenance: str = "unauthenticated" + provenance: str = Provenance.UNAUTHENTICATED diff --git a/src/legis/pulls/surface.py b/src/legis/pulls/surface.py index 753db20..3e883de 100644 --- a/src/legis/pulls/surface.py +++ b/src/legis/pulls/surface.py @@ -5,6 +5,7 @@ from sqlalchemy import Column, Integer, MetaData, String, Table, Text, create_engine, delete, insert, select from sqlalchemy.pool import NullPool +from legis.provenance import Provenance from legis.pulls.models import PullRequest, PullRequestState @@ -72,5 +73,5 @@ def get(self, number: int) -> PullRequest | None: state=PullRequestState(row.state), url=row.url, recorded_by=row.recorded_by, - provenance=row.provenance or "unauthenticated", + provenance=row.provenance or Provenance.UNAUTHENTICATED, ) From cf1aded10f93a2c682dbd29341267bbadae1cb5a Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 18:25:52 +1000 Subject: [PATCH 48/72] refactor(config): centralize LEGIS_PROTECTED_POLICIES resolution MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The frozenset(p.strip() for p in split(",")) 
idiom was duplicated verbatim in three composition roots (api/app.py, cli.py, mcp.py). A future change to the delimiter/trim rule applied to one root would diverge the protected-policy set between the API, the CLI override-rate gate, and the MCP server — and that set decides whether a judge ACCEPTED is downgraded to operator sign-off, so a divergence is a real authority split, not a cosmetic one. Add config.protected_policies(), the single parse point, alongside the *_db_url resolvers it mirrors. Read at call time (like those resolvers) because cli.py writes the env var from --protected-policies before the downstream root reads it. The bare frozenset() empty-default at app.py (no-verifier case) is a different concern and left as-is. rc4 review finding #9. Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/api/app.py | 10 ++++------ src/legis/cli.py | 7 ++----- src/legis/config.py | 21 +++++++++++++++++++++ src/legis/mcp.py | 11 ++++------- 4 files changed, 31 insertions(+), 18 deletions(-) diff --git a/src/legis/api/app.py b/src/legis/api/app.py index da70f13..cc0df06 100644 --- a/src/legis/api/app.py +++ b/src/legis/api/app.py @@ -32,6 +32,7 @@ binding_db_url, check_db_url, governance_db_url, + protected_policies, pull_db_url, ) from legis.checks.models import CheckOutcome, CheckRun @@ -344,14 +345,11 @@ def create_app( gov_store = AuditStore(gov_db_url) clock = SystemClock() - protected_policies_str = os.environ.get("LEGIS_PROTECTED_POLICIES", "") - protected_policies = frozenset( - p.strip() for p in protected_policies_str.split(",") if p.strip() - ) + protected = protected_policies() if trail_verifier is None: from legis.enforcement.protected import TrailVerifier - trail_verifier = TrailVerifier(hmac_key, protected_policies) + trail_verifier = TrailVerifier(hmac_key, protected) if protected_gate is None: from legis.enforcement.judge_factory import build_judge_from_env @@ -362,7 +360,7 @@ def create_app( # downgraded and the agent must obtain operator 
sign-off. protected_gate = ProtectedGate( gov_store, clock, build_judge_from_env("API"), hmac_key, - protected_policies=protected_policies, + protected_policies=protected, ) if signoff_gate is None: diff --git a/src/legis/cli.py b/src/legis/cli.py index c4e48c6..bfd89bd 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -188,6 +188,7 @@ def _apply_judge_env(args) -> None: def _check_override_rate(db_url: str) -> int: import os + from legis.config import protected_policies from legis.enforcement.lifecycle import GateStatus from legis.service.errors import AuditIntegrityError, ProtectedKeyRequiredError from legis.service.governance import evaluate_override_rate_gate @@ -217,10 +218,6 @@ def _check_override_rate(db_url: str) -> int: return 1 records = store.read_all() - protected_policies_str = os.environ.get("LEGIS_PROTECTED_POLICIES", "") - protected_policies = frozenset( - p.strip() for p in protected_policies_str.split(",") if p.strip() - ) # The detect -> require-key -> verify -> score decision lives in the service # layer (Q-H2), so the cli, the api, and any future consumer all measure the @@ -229,7 +226,7 @@ def _check_override_rate(db_url: str) -> int: res = evaluate_override_rate_gate( records, hmac_key=os.environ.get("LEGIS_HMAC_KEY"), - protected_policies=protected_policies, + protected_policies=protected_policies(), ) except (ProtectedKeyRequiredError, AuditIntegrityError) as exc: print(f"Error: {exc}", file=sys.stderr) diff --git a/src/legis/config.py b/src/legis/config.py index 1a452c7..f72da0d 100644 --- a/src/legis/config.py +++ b/src/legis/config.py @@ -61,6 +61,11 @@ _BINDING_DB_ENV = "LEGIS_BINDING_DB" _PULL_DB_ENV = "LEGIS_PULL_DB" +# Protected-policy set: the policy names whose judge-ACCEPTED verdicts are +# downgraded to operator sign-off (Q-H3). Composition-root config like the DB +# URLs above, so resolved here. 
+_PROTECTED_POLICIES_ENV = "LEGIS_PROTECTED_POLICIES" + def project_root() -> Path: """The directory the federation treats as project root (the cwd).""" @@ -149,6 +154,22 @@ def pull_db_url() -> str: return _resolve_db_url(_PULL_DB_ENV, _PULL_DB_NAME) +def protected_policies() -> frozenset[str]: + """Resolve the protected-policy set from ``LEGIS_PROTECTED_POLICIES``. + + THE single parse point for the env var: the API factory, the MCP runtime, + and the CLI override-rate gate all call this rather than re-implementing the + ``frozenset(split(","))`` idiom, so the delimiter/trim rule cannot diverge + between composition roots (it decides whether a judge ACCEPTED is downgraded + to sign-off, so a divergence would be a real authority split). Read at call + time — like the ``*_db_url()`` resolvers — because ``cli.py`` writes the env + var from ``--protected-policies`` before the downstream root reads it. Empty, + whitespace-only, and absent all yield the empty set. + """ + raw = os.environ.get(_PROTECTED_POLICIES_ENV, "") + return frozenset(p.strip() for p in raw.split(",") if p.strip()) + + def ensure_sqlite_parent(url: str) -> None: """Create the parent directory for a SQLite *file* URL, if needed. 
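To make the parse rule that `protected_policies()` centralizes concrete, here is a minimal sketch that copies the one-line idiom from the diff above into a standalone helper. The name `parse_protected_policies`, its `raw` parameter, and the policy names in the usage note are hypothetical and exist only for illustration; the real function reads `LEGIS_PROTECTED_POLICIES` from the environment at call time.

```python
def parse_protected_policies(raw: str) -> frozenset[str]:
    """Hypothetical stand-in for config.protected_policies().

    Takes the raw env value as an argument instead of reading os.environ,
    so the delimiter/trim rule can be exercised in isolation.
    """
    # Split on commas, trim each name, and drop fragments that are empty
    # after trimming (covers "", "   ", and stray double commas).
    return frozenset(p.strip() for p in raw.split(",") if p.strip())
```

Under this rule an input such as `"deploy, release ,,deploy"` (made-up policy names) collapses to the two-element set of `deploy` and `release`, and an empty or whitespace-only value yields the empty set, matching the docstring in the diff.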
diff --git a/src/legis/mcp.py b/src/legis/mcp.py index dbd81a9..e226ec2 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -158,7 +158,7 @@ def _load_policy_cell_registry() -> PolicyCellRegistry: def build_runtime(agent_id: str) -> McpRuntime: - from legis.config import binding_db_url, governance_db_url + from legis.config import binding_db_url, governance_db_url, protected_policies clock = SystemClock() engine = None @@ -180,18 +180,15 @@ def build_runtime(agent_id: str) -> McpRuntime: if hmac_key: key = hmac_key.encode("utf-8") store = AuditStore(governance_db_url()) - protected_policies_str = os.environ.get("LEGIS_PROTECTED_POLICIES", "") - protected_policies = frozenset( - p.strip() for p in protected_policies_str.split(",") if p.strip() - ) - trail_verifier = TrailVerifier(key, protected_policies) + protected = protected_policies() + trail_verifier = TrailVerifier(key, protected) # Protected policies: the LLM judge is advisory only (Q-H3). With no # deterministic validator wired, a judge ACCEPTED is downgraded and the # agent must escalate to operator sign-off. protected_gate = ProtectedGate( store, clock, build_judge_from_env("MCP"), key, - protected_policies=protected_policies, + protected_policies=protected, ) signoff_gate = SignoffGate(store, clock, signer=True, key=key) From cf07ddaac8c641b6cb772e254156536c98a4c0a2 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 18:26:10 +1000 Subject: [PATCH 49/72] docs(mcp): note the idempotency scan's verification cost is intentional MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit rc4 review #8 flagged _existing_idempotent_record's O(N) read + O(N) hash + O(N) HMAC and proposed a keyed O(1) lookup. 
That is the same optimization operator-confirmed declined as finding #7
(1805e37): the cost is _verified_records' whole-trail tamper check, and a
keyed single-row lookup reaches O(1) only by skipping verification — the
silent tamper window #7 deliberately refused. The #7 commit already names
this idempotency path; add a pointer comment here so the equivalence is
visible at the call site and the finding is not re-litigated a third time.

No behavior change. rc4 review finding #8 (duplicate of #7,
legis-4ab36517df).

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 src/legis/mcp.py | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/legis/mcp.py b/src/legis/mcp.py
index e226ec2..fe5c857 100644
--- a/src/legis/mcp.py
+++ b/src/legis/mcp.py
@@ -636,6 +636,11 @@ def _override_idempotency_request_hash(
 def _existing_idempotent_record(
     runtime: McpRuntime, key: str, request_hash: str
 ) -> Any | None:
+    # The O(N) hash + HMAC cost of the scan below is `_verified_records`' whole-
+    # trail tamper check, paid deliberately on this interactive path — NOT a
+    # keyed single-row lookup, which would skip verification (the optimization
+    # operator-confirmed declined in rc4 review #7; see service.verified_records'
+    # cost note). The scan itself is over the already-verified list.
     for rec in _verified_records(runtime):
         ext = rec.payload.get("extensions", {})
         if ext.get("mcp_idempotency_key") != key:
From f70feb4a5fb1328ad14a3b9961300d91260e3621 Mon Sep 17 00:00:00 2001
From: John Morrissey <544926+tachyon-beep@users.noreply.github.com>
Date: Sun, 7 Jun 2026 18:26:30 +1000
Subject: [PATCH 50/72] chore(skills): regenerate loomweave-workflow for .weft/loomweave store path

loomweave install regenerated the workflow skill after loomweave moved its
index under .weft/loomweave/ (the federation store convention). Doc-only
path references; no behavior change.
Co-Authored-By: Claude Opus 4.8 (1M context) --- .agents/skills/loomweave-workflow/.fingerprint | 2 +- .agents/skills/loomweave-workflow/SKILL.md | 6 +++--- .claude/skills/loomweave-workflow/.fingerprint | 2 +- .claude/skills/loomweave-workflow/SKILL.md | 6 +++--- 4 files changed, 8 insertions(+), 8 deletions(-) diff --git a/.agents/skills/loomweave-workflow/.fingerprint b/.agents/skills/loomweave-workflow/.fingerprint index b8934d2..f1af0a2 100644 --- a/.agents/skills/loomweave-workflow/.fingerprint +++ b/.agents/skills/loomweave-workflow/.fingerprint @@ -1 +1 @@ -8af48023ff74748434eec046b718fe586bce8784e51d474c9c58daf8f292326b \ No newline at end of file +4c1af074f42ec147611923aafeb704eba54cd7dca4dcec2489907921b7f94233 \ No newline at end of file diff --git a/.agents/skills/loomweave-workflow/SKILL.md b/.agents/skills/loomweave-workflow/SKILL.md index fd7ab55..5b8e4d8 100644 --- a/.agents/skills/loomweave-workflow/SKILL.md +++ b/.agents/skills/loomweave-workflow/SKILL.md @@ -26,7 +26,7 @@ calls this?" without reading a single file. - You need a function's neighborhood, execution paths, or which subsystem it belongs to. **Not for:** editing code, reading exact implementation bodies (use `summary` or -read the file once you have its path), or codebases with no `.loomweave/` index. +read the file once you have its path), or codebases with no `.weft/loomweave/` index. ## Entity IDs — the model @@ -161,7 +161,7 @@ honest-empty unless a plugin emits those tags. Likewise `high_churn` and `search_semantic` is also in the catalogue. It is opt-in under `semantic_search:`; when enabled, `loomweave analyze` populates the git-ignored -`.loomweave/embeddings.db` sidecar and the query path filters stale vectors by +`.weft/loomweave/embeddings.db` sidecar and the query path filters stale vectors by content hash. > Not in this catalogue: `emit_observation` as a general-purpose write surface. 
@@ -202,7 +202,7 @@ and are composed into `summary` prompts with a real guidance fingerprint. ## Launch -`loomweave serve --path ` where `` contains `.loomweave/loomweave.db` +`loomweave serve --path ` where `` contains `.weft/loomweave/loomweave.db` (built by `loomweave analyze `). In an MCP client the tools appear as `mcp__loomweave__find_entity`, etc. diff --git a/.claude/skills/loomweave-workflow/.fingerprint b/.claude/skills/loomweave-workflow/.fingerprint index b8934d2..f1af0a2 100644 --- a/.claude/skills/loomweave-workflow/.fingerprint +++ b/.claude/skills/loomweave-workflow/.fingerprint @@ -1 +1 @@ -8af48023ff74748434eec046b718fe586bce8784e51d474c9c58daf8f292326b \ No newline at end of file +4c1af074f42ec147611923aafeb704eba54cd7dca4dcec2489907921b7f94233 \ No newline at end of file diff --git a/.claude/skills/loomweave-workflow/SKILL.md b/.claude/skills/loomweave-workflow/SKILL.md index fd7ab55..5b8e4d8 100644 --- a/.claude/skills/loomweave-workflow/SKILL.md +++ b/.claude/skills/loomweave-workflow/SKILL.md @@ -26,7 +26,7 @@ calls this?" without reading a single file. - You need a function's neighborhood, execution paths, or which subsystem it belongs to. **Not for:** editing code, reading exact implementation bodies (use `summary` or -read the file once you have its path), or codebases with no `.loomweave/` index. +read the file once you have its path), or codebases with no `.weft/loomweave/` index. ## Entity IDs — the model @@ -161,7 +161,7 @@ honest-empty unless a plugin emits those tags. Likewise `high_churn` and `search_semantic` is also in the catalogue. It is opt-in under `semantic_search:`; when enabled, `loomweave analyze` populates the git-ignored -`.loomweave/embeddings.db` sidecar and the query path filters stale vectors by +`.weft/loomweave/embeddings.db` sidecar and the query path filters stale vectors by content hash. > Not in this catalogue: `emit_observation` as a general-purpose write surface. 
@@ -202,7 +202,7 @@ and are composed into `summary` prompts with a real guidance fingerprint. ## Launch -`loomweave serve --path ` where `` contains `.loomweave/loomweave.db` +`loomweave serve --path ` where `` contains `.weft/loomweave/loomweave.db` (built by `loomweave analyze `). In an MCP client the tools appear as `mcp__loomweave__find_entity`, etc. From e289bc6824513b6d404e69867045afefca4c5232 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 18:55:04 +1000 Subject: [PATCH 51/72] =?UTF-8?q?docs(spec):=20legis=20doctor=20design=20?= =?UTF-8?q?=E2=80=94=20view/repair=20install+config=20health?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit New `legis doctor [--root] [--repair] [--format text|json]`: an operator/CLI health view + safe repair for legis's install and config layer, mirroring wardline doctor. Four check domains (install wiring incl. a new .mcp.json registration capability, config & stores, governance hash-chain integrity, runtime & sibling wiring). Doctrine-bound: C-9(b) makes doctor fully report-only on weft.toml (no scaffold — reversed from an earlier scoping choice once C-9(b) surfaced); repairs touch only legis's own per-member artifacts (install wiring, .mcp.json, .weft/legis/). Keys are presence-checked only, values never shown (Rust key sidecar planned). Not on the agent MCP surface. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .../specs/2026-06-07-legis-doctor-design.md | 245 ++++++++++++++++++ 1 file changed, 245 insertions(+) create mode 100644 docs/superpowers/specs/2026-06-07-legis-doctor-design.md diff --git a/docs/superpowers/specs/2026-06-07-legis-doctor-design.md b/docs/superpowers/specs/2026-06-07-legis-doctor-design.md new file mode 100644 index 0000000..8cb26a3 --- /dev/null +++ b/docs/superpowers/specs/2026-06-07-legis-doctor-design.md @@ -0,0 +1,245 @@ +# Legis doctor — design spec + +**Date:** 2026-06-07 +**Status:** Approved for implementation +**Author:** John Morrissey (with Claude) + +## Goal + +Give legis a `legis doctor` command that **views and repairs install/config +problems**, the way its siblings do (`wardline doctor`, +`filigree`'s `install_support/doctor.py`; loomweave has none). One command +answers *"is my legis wiring healthy, and if not, fix what's safe to fix."* + +Two distinct gaps motivate it: + +1. **No affirmative health view.** legis already self-heals install drift on + `SessionStart` / MCP boot (`hooks.refresh_instructions`), but that path is + silent on success — `session-context` prints nothing whether everything is + current or nothing was checked. There is no way to *ask* "is this healthy?" + and get an affirmative answer. +2. **No coverage of the config/store layer.** The install path checks + instruction blocks / skills / hook / `.gitignore`, but nothing checks + `weft.toml` parseability, the `.weft/legis/` stores, audit-chain integrity, + the `.mcp.json` server registration, or key/sibling-URL wiring. + +These were surfaced concretely while scoping this work (see **Worked examples**): +legis was absent from `.mcp.json` entirely, and `session-context` returned +nothing — both real, both exactly what doctor should catch. + +## Doctrine anchors + +- **C-9(a) — per-member subtree.** Each member is the **sole writer of its own + `.weft//` subtree** and never reads/writes a sibling's. 
doctor may + create/repair `.weft/legis/`. +- **C-9(b) — `weft.toml` is operator-write-only; `doctor` is named.** *"No + member's installer / CLI / `doctor` writes or rewrites `weft.toml`."* Precedent + is the multi-writer truncation gate `weft-eb3dee402f`. **`legis doctor` is + fully report-only on `weft.toml` — it does not even scaffold an absent + `[legis]` table.** Matches `wardline doctor` ("never weft.toml, never a + sibling's"). +- **C-9(c) — malformed = absent (silent fallback) at runtime.** A + malformed/unreadable `weft.toml` must still boot on defaults. doctor's job is + to **restore the operator signal** that runtime silences: it reports + malformed `weft.toml` as an **error** (your config is silently not applying) + — a diagnostic, never a write. +- **Capability honesty / key carve-out.** Operator signing keys are + capability-confined and not agent-reachable (`config.py`). doctor + **presence-checks** keys only — it never prints, logs, or writes a key value. + legis operator keys are held securely (a Rust key sidecar is planned); + filigree's auto-generated federation comms key is a separate concern. +- **Agent-first, humans on the loop.** doctor is an **operator/CLI** tool. It + inspects and repairs the *host* install and operator files, which is not an + agent-reachable concern, so it is **not** added to the legis MCP tool surface + or the transport-agnostic `service/` decision layer. + +## Architecture + +A single new module plus thin CLI wiring and one install capability — +mirroring `wardline/install/doctor.py` and matching legis's flat-module style +(`config.py`, `install.py`, `hooks.py`). + +- **`src/legis/doctor.py`** — the logic. A `DoctorCheck` dataclass, one function + per check, a `run_doctor(root, *, repair, fmt) -> int` orchestrator, and + `machine_readable_doctor(root, *, repair) -> dict` for the JSON shape. 
+- **`src/legis/cli.py`** — a `doctor` subparser and a thin `_run_doctor` + dispatcher (I/O shell + exit code only; same pattern as `_check_override_rate`). +- **`src/legis/install.py`** — a new `register_mcp_json(project_root) -> + tuple[bool, str]` (and a matching `--mcp` install flag, included in + install-all), so the `.mcp.json` check has a repair capability to call. This + closes the asymmetry where `wardline install` registers `.mcp.json` but + `legis install` did not. + +**Reuse (no logic duplication):** +- `install.py`: `INSTRUCTIONS_MARKER`, `_extract_marker_token`, `_marker_token`, + `_skill_tree_fingerprint`, `_get_skills_source_dir`, `inject_instructions`, + `install_skills`, `install_codex_skills`, `install_claude_code_hooks`, + `ensure_gitignore`, and the new `register_mcp_json`. +- `config.py`: `project_root`, `_weft_legis_config`, `_store_dir`, + `*_db_url`, `protected_policies`, `ensure_sqlite_parent`. +- `store/audit_store.py`: `verify_integrity`. + +### `DoctorCheck` + +```python +@dataclass(frozen=True, slots=True) +class DoctorCheck: + id: str # stable, e.g. "install.mcp_json", "store.governance_chain" + status: str # "ok" | "warn" | "error" + fixed: bool = False # True if --repair changed state from not-ok to ok + message: str | None = None + + @property + def ok(self) -> bool: return self.status == "ok" +``` + +`warn` is non-fatal (does not affect exit code); `error` is fatal (exit 1). + +## Surface + +``` +legis doctor [--root .] [--repair] [--format {text,json}] +``` + +- **default** — report-only, human text. Exit `0` if no `error` checks, else `1`. +- **`--repair`** — apply safe repairs (see model below), **re-check**, then + report the post-repair state. +- **`--format json`** — emit the federation machine-readable shape: + `{"ok": bool, "checks": [DoctorCheck.to_dict()...], "next_actions": [str...]}`. + `next_actions` lists `"{id}: {message}"` for each non-ok check with a message. 
+ +`--format` (not wardline's `--fix`) is deliberate: it matches legis's *own* +existing `policy-boundary-check --format {text,json}` convention. `--repair` and +`--format` are orthogonal (you can `--repair --format json`). Exit `2` on usage +error. + +## Checks + +### Install wiring (repairable) +- `install.claude_md` — CLAUDE.md instruction block present and **not drifted** + (marker token = current `version:hash`). +- `install.agents_md` — AGENTS.md block present and not drifted. +- `install.claude_skill` — `.claude` skill pack present, tree fingerprint fresh. +- `install.agents_skill` — `.agents` (Codex) skill pack present, fingerprint fresh. +- `install.hook` — Claude Code `SessionStart` hook registered. +- `install.gitignore` — legis `.gitignore` rules present. +- `install.mcp_json` — `.mcp.json` has a `legis` server entry matching the + canonical local entry (`legis mcp --agent-id ` via the resolved binary). + +### Config & stores +- `config.weft_toml` — **report-only.** ABSENT → `ok` (defaults intentional); + PRESENT-and-`[legis]`-valid → `ok`; PRESENT-but-unparseable, or `[legis]` not a + table → `error` ("weft.toml present but malformed; legis is booting on + defaults and your `[legis]` config is silently not applying"). +- `store.dir` — the resolved `store_dir` is usable: its parent is writable so + stores can be created. An **absent** `.weft/legis/` is `ok` (created lazily on + first store open — preserves the import-time no-leak guarantee + `test_build_runtime_initialize_does_not_create_local_state`); a + **present-but-unwritable** dir is `error`. `--repair` ensures the dir exists as + a convenience — an explicit operator action, categorically distinct from the + import-time no-leak guarantee (C-9(a)). +- `store.db_overrides` — any set `LEGIS_*_DB` env var is a well-formed URL. + Report-only. +- `store.legacy_stray` — legacy `legis-*.db` at the repo root → `warn` + (informational; never deleted — operator data). 
+ +### Governance integrity (report-only) +- `store.governance_chain` — `AuditStore(governance_db_url()).verify_integrity()`. + Absent DB → `ok` (nothing to verify, not an error). Tamper/broken chain → + `error` (report-only; a hash chain cannot and must not be auto-repaired). +- `store.binding_chain` — same for the binding ledger. + +### Runtime & siblings (report-only) +- `runtime.hmac_key` — if `LEGIS_PROTECTED_POLICIES` is non-empty (protected / + structured cells configured) but no signing key is available → `warn` + ("protected policies configured but no signing key; protected submissions + will fail"). **Presence only; the value is never read out or shown.** +- `runtime.loomweave_url` / `runtime.filigree_url` — if set, well-formed + http(s) URL; unset → `ok` ("not configured"). Report-only. + +## Repair model + +`--repair` mutates **only legis's own per-member artifacts**: + +| Artifact | Repaired? | How | +|---|---|---| +| CLAUDE.md / AGENTS.md blocks | ✅ | `inject_instructions` (idempotent, drift-aware) | +| `.claude` / `.agents` skills | ✅ | `install_skills` / `install_codex_skills` | +| SessionStart hook | ✅ | `install_claude_code_hooks` | +| `.gitignore` | ✅ | `ensure_gitignore` | +| `.mcp.json` legis entry | ✅ | `register_mcp_json` (new) | +| `.weft/legis/` dir | ✅ | `ensure_sqlite_parent` / `mkdir` | +| `weft.toml` | ❌ never | C-9(b) — report-only, even when absent | +| Audit hash chains | ❌ never | tamper-evidence; report-only | +| Keys, sibling URLs | ❌ never | secrets/values; report-only with guidance | + +After repair, every check is **re-run** so the report reflects true post-repair +state and `fixed=True` is set only where a not-ok check became ok. + +## `.mcp.json` registration (new install capability) + +`register_mcp_json(project_root)` adds/updates a `legis` entry under +`mcpServers` in `/.mcp.json` (creating the file if absent), merging +without disturbing sibling entries. 
The canonical entry: + +```json +"legis": { + "args": ["mcp", "--agent-id", ""], + "command": "", + "env": {}, + "type": "stdio" +} +``` + +- **Binary resolution** reuses the same logic as the hook installer + (`install._find_legis_command`) so the entry points at the real `legis`. +- **Agent id**: `legis mcp` requires `--agent-id` (it stamps the governance + actor). Default `"claude-code"`; overridable via a `--agent-id` option on + `legis install --mcp` (and `legis doctor --repair` uses the default unless an + existing entry already carries one, which it preserves). +- Wired into `legis install` as `--mcp` and included in install-all. + +## What doctor does NOT do + +- Never writes `weft.toml` (C-9(b)). +- Never repairs a hash chain (tamper-evidence is the point). +- Never prints, logs, or writes a key value. +- Never deletes operator data (legacy stray DBs are warned, not removed). +- Not exposed on the agent MCP surface or the `service/` layer. + +## Testing + +`tests/test_doctor.py` (mirrors `src/legis/doctor.py`), `tmp_path` project +roots, with the **Worked examples** below as red→green fixtures: + +- missing `.mcp.json` legis entry → `error`; `--repair` → `fixed=True`, re-check `ok`. +- drifted instruction block (stale marker token) → `error` → repaired. +- absent `weft.toml` → `ok`; malformed `weft.toml` → `error` and **file + unchanged after `--repair`** (asserts C-9(b)). +- tampered governance chain → `error`, **report-only** (file unchanged after `--repair`). +- `LEGIS_PROTECTED_POLICIES` set with no key → `warn`; assert **no key value + appears** anywhere in text/JSON output. +- JSON shape: `{ok, checks:[{id,status,fixed,message?}], next_actions}`. +- exit codes: `0` healthy, `1` any error, `2` usage error. + +A new per-package coverage floor entry covers `doctor.py`. + +## Worked examples (the findings that motivated this) + +1. **legis absent from `.mcp.json`** — its `mcp__legis__*` tools never loaded. 
+ `install.mcp_json` → `error`; repaired by `register_mcp_json`. (Fixed + manually during scoping; doctor makes it self-diagnosing.) +2. **`session-context` returns nothing** — honest-empty by design + (`refresh_instructions` → `[]` on no drift). doctor supplies the missing + affirmative "all current" signal. +3. **wardline rc1↔rc4 version skew (reported)** — not reproducible in this + environment (uniformly rc4). Cross-tool *version* reconciliation is **out of + scope** for v1 (doctor checks legis's own wiring, not sibling tool versions); + noted as a candidate future check. + +## Out of scope / future + +- Cross-tool version-skew checks (sibling binary versions). +- Reading keys from the planned Rust key sidecar (doctor stays presence-only; + it will check availability through whatever resolution path exists then). +- Any `weft.toml` write capability (blocked by C-9(b)). From dcc38c826562741d9d40b1037bd05942543e411a Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:02:31 +1000 Subject: [PATCH 52/72] docs(plan): legis doctor implementation plan (10 TDD tasks) Bite-sized TDD tasks: DoctorCheck record + rendering, CLI wiring, the new register_mcp_json install capability, then the four check domains (install wiring, config & stores, governance integrity, runtime & siblings), an end-to-end --repair test, and docs/coverage. Each invariant from the spec (no weft.toml write, no key-value rendering, no-leak DB creation) is pinned by a named test. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .../plans/2026-06-07-legis-doctor.md | 1162 +++++++++++++++++ 1 file changed, 1162 insertions(+) create mode 100644 docs/superpowers/plans/2026-06-07-legis-doctor.md diff --git a/docs/superpowers/plans/2026-06-07-legis-doctor.md b/docs/superpowers/plans/2026-06-07-legis-doctor.md new file mode 100644 index 0000000..a844d13 --- /dev/null +++ b/docs/superpowers/plans/2026-06-07-legis-doctor.md @@ -0,0 +1,1162 @@ +# Legis doctor Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Add `legis doctor [--root .] [--repair] [--format {text,json}]` — an operator/CLI health view that diagnoses and (safely) repairs legis's install + config layer. + +**Architecture:** One new module `src/legis/doctor.py` (a `DoctorCheck` dataclass, one function per check, a `run_doctor` orchestrator, `machine_readable_doctor` for JSON), a thin `doctor` subparser in `cli.py`, and one new install capability `register_mcp_json` in `install.py` (with a `legis install --mcp` flag). Checks reuse existing `install.py` / `config.py` / `store` helpers; repairs touch only legis's own per-member artifacts. Bound by C-9(b): **never writes `weft.toml`**. + +**Tech Stack:** Python 3.12, argparse, stdlib `tomllib`/`json`, SQLAlchemy `make_url`, pytest, uv. + +**Spec:** `docs/superpowers/specs/2026-06-07-legis-doctor-design.md` + +--- + +## File Structure + +- **Create `src/legis/doctor.py`** — all doctor logic. Responsibilities: the `DoctorCheck` record, every check function (pure: `root: Path` + env → `DoctorCheck`, no mutation), the repair dispatch, the `run_doctor`/`machine_readable_doctor` orchestrators, and text/JSON rendering. 
+- **Modify `src/legis/install.py`** — add `register_mcp_json(project_root)` + `_legis_mcp_entry(agent_id)` (the `.mcp.json` writer/canonical entry), reusing `_find_legis_command`, `_atomic_write_text`, `reject_symlink`, `project_path`. +- **Modify `src/legis/cli.py`** — add the `doctor` subparser, a `--mcp` flag (+ optional `--agent-id`) on the `install` subparser and its step list, and a thin `_run_doctor` dispatcher. +- **Create `tests/test_doctor.py`** — mirrors `src/legis/doctor.py`. +- **Modify `tests/test_install.py`** — tests for `register_mcp_json`. +- **Modify `scripts/check_coverage_floors.py`** — (only if it enumerates modules) add a floor for `doctor.py`; otherwise the top-level src floor covers it. +- **Modify `CHANGELOG.md`**, **`README.md`** — document the new command. + +Reused symbols (verify they exist before relying on them): +- `install.py`: `INSTRUCTIONS_MARKER`, `SKILL_NAME`, `SESSION_CONTEXT_COMMAND`, `_marker_token`, `_extract_marker_token`, `_get_skills_source_dir`, `_skill_tree_fingerprint`, `_has_unscoped_session_start_hook`, `_find_legis_command`, `_LEGIS_IGNORE_RULES`, `inject_instructions`, `install_skills`, `install_codex_skills`, `install_claude_code_hooks`, `ensure_gitignore`, `_atomic_write_text`, `reject_symlink`, `project_path`. +- `config.py`: `project_root`, `governance_db_url`, `binding_db_url`, `protected_policies`, `_store_dir`. +- `store/audit_store.py`: `AuditStore(url).verify_integrity() -> bool`. 
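For orientation, the merge behaviour that `register_mcp_json` is meant to have (add or update the `legis` entry under `mcpServers`, create the file if absent, leave sibling entries alone, keep an agent id that is already present) might be sketched like this. It is a hedged illustration only: `register_mcp_json_sketch` and its `command` and `default_agent_id` parameters are hypothetical, it uses a plain `write_text` where the plan reuses `_atomic_write_text` and `reject_symlink`, and it does not handle a malformed existing `.mcp.json`.

```python
import json
from pathlib import Path


def register_mcp_json_sketch(
    project_root: Path, command: str, default_agent_id: str = "claude-code"
) -> tuple[bool, str]:
    """Illustrative merge of a legis entry into .mcp.json (not the real code)."""
    path = project_root / ".mcp.json"
    doc = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = doc.setdefault("mcpServers", {})

    existing = servers.get("legis")
    existing = existing if isinstance(existing, dict) else {}
    args = existing.get("args")
    args = args if isinstance(args, list) else []

    # Preserve an agent id that an existing entry already carries.
    agent_id = default_agent_id
    if "--agent-id" in args:
        idx = args.index("--agent-id")
        if idx + 1 < len(args):
            agent_id = args[idx + 1]

    entry = {
        "type": "stdio",
        "command": command,
        "args": ["mcp", "--agent-id", agent_id],
        "env": {},
    }
    if existing == entry:
        return True, "legis already registered in .mcp.json"

    servers["legis"] = entry  # sibling entries in `servers` are untouched
    path.write_text(json.dumps(doc, indent=2) + "\n", encoding="utf-8")
    return True, "registered legis in .mcp.json"
```

Only the `legis` key is assigned, so entries registered by sibling tools survive the merge unchanged.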
+ +--- + +## Task 1: `DoctorCheck` record + rendering + empty orchestrator + +**Files:** +- Create: `src/legis/doctor.py` +- Test: `tests/test_doctor.py` + +- [ ] **Step 1: Write the failing test** + +```python +# tests/test_doctor.py +from __future__ import annotations + +import json + +from legis.doctor import DoctorCheck, render_json, render_text + + +def test_doctorcheck_to_dict_omits_empty_message(): + assert DoctorCheck("a.b", "ok").to_dict() == {"id": "a.b", "status": "ok", "fixed": False} + assert DoctorCheck("a.b", "error", message="boom").to_dict() == { + "id": "a.b", + "status": "error", + "fixed": False, + "message": "boom", + } + + +def test_render_json_shape(): + checks = [DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")] + payload = json.loads(render_json(checks)) + assert payload["ok"] is False + assert payload["checks"][0] == {"id": "a", "status": "ok", "fixed": False} + assert payload["next_actions"] == ["b: bad"] + + +def test_render_text_lists_only_problems_when_healthy_says_ok(): + assert "legis doctor: ok" in render_text([DoctorCheck("a", "ok")]) + out = render_text([DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")]) + assert "b: bad" in out + assert "legis doctor: ok" not in out +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -v` +Expected: FAIL with `ModuleNotFoundError: No module named 'legis.doctor'`. + +- [ ] **Step 3: Write minimal implementation** + +```python +# src/legis/doctor.py +"""`legis doctor` — view and repair legis install/config health. + +Operator/CLI tool only: it inspects and repairs the *host* install and legis's +own per-member artifacts. It is NOT on the agent MCP surface or the service +layer, and per hub doctrine C-9(b) it NEVER writes weft.toml. 
+""" + +from __future__ import annotations + +import json +from dataclasses import dataclass +from typing import Any + + +@dataclass(frozen=True, slots=True) +class DoctorCheck: + id: str + status: str # "ok" | "warn" | "error" + fixed: bool = False + message: str | None = None + + @property + def ok(self) -> bool: + return self.status != "error" + + def to_dict(self) -> dict[str, Any]: + data: dict[str, Any] = {"id": self.id, "status": self.status, "fixed": self.fixed} + if self.message: + data["message"] = self.message + return data + + +def _next_actions(checks: list[DoctorCheck]) -> list[str]: + return [f"{c.id}: {c.message}" for c in checks if c.status != "ok" and c.message] + + +def render_json(checks: list[DoctorCheck]) -> str: + payload = { + "ok": all(c.ok for c in checks), + "checks": [c.to_dict() for c in checks], + "next_actions": _next_actions(checks), + } + return json.dumps(payload, indent=2, sort_keys=True) + + +def render_text(checks: list[DoctorCheck]) -> str: + healthy = all(c.status == "ok" for c in checks) + if healthy: + return "legis doctor: ok" + lines = ["legis doctor:"] + for c in checks: + if c.status == "ok": + continue + lines.append(f" {c.id}: {c.status} — {c.message}" if c.message else f" {c.id}: {c.status}") + return "\n".join(lines) +``` + +Note: `ok` is True for `warn` (non-fatal) and False only for `error`. `render_text`'s "all ok" banner uses strict `== "ok"` so warns still print. + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -v` +Expected: PASS (3 tests). 
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add src/legis/doctor.py tests/test_doctor.py
+git commit -m "feat(doctor): DoctorCheck record + text/json rendering"
+```
+
+---
+
+## Task 2: `collect_checks` orchestrator + `run_doctor` (still no real checks)
+
+**Files:**
+- Modify: `src/legis/doctor.py`
+- Test: `tests/test_doctor.py`
+
+- [ ] **Step 1: Write the failing test**
+
+```python
+# add to tests/test_doctor.py
+from pathlib import Path
+
+from legis.doctor import run_doctor
+
+
+def test_run_doctor_empty_is_healthy(tmp_path, capsys):
+    # With no checks registered yet, an empty list renders healthy, exit 0.
+    rc = run_doctor(tmp_path, repair=False, fmt="text")
+    assert rc == 0
+    assert "legis doctor: ok" in capsys.readouterr().out
+
+
+def test_run_doctor_json_format(tmp_path, capsys):
+    rc = run_doctor(tmp_path, repair=False, fmt="json")
+    assert rc == 0
+    payload = json.loads(capsys.readouterr().out)
+    assert payload == {"ok": True, "checks": [], "next_actions": []}
+```
+
+- [ ] **Step 2: Run test to verify it fails**
+
+Run: `uv run pytest tests/test_doctor.py -k run_doctor -v`
+Expected: FAIL with `ImportError: cannot import name 'run_doctor'`.
+
+- [ ] **Step 3: Write minimal implementation**
+
+```python
+# add to src/legis/doctor.py
+from pathlib import Path
+
+
+def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]:
+    """Run every check against *root*. Repairs run inside individual checks
+    when *repair* is True; each returned check reflects post-repair state."""
+    checks: list[DoctorCheck] = []
+    # Check functions are appended here in later tasks.
+    return checks
+
+
+def run_doctor(root: Path, *, repair: bool, fmt: str) -> int:
+    checks = collect_checks(root, repair=repair)
+    print(render_json(checks) if fmt == "json" else render_text(checks))
+    return 0 if all(c.ok for c in checks) else 1
+```
+
+- [ ] **Step 4: Run test to verify it passes**
+
+Run: `uv run pytest tests/test_doctor.py -k run_doctor -v`
+Expected: PASS (2 tests).
+ +- [ ] **Step 5: Commit** + +```bash +git add src/legis/doctor.py tests/test_doctor.py +git commit -m "feat(doctor): collect_checks + run_doctor orchestrator skeleton" +``` + +--- + +## Task 3: CLI `doctor` subparser + dispatch (walking skeleton end-to-end) + +**Files:** +- Modify: `src/legis/cli.py` (subparser in `build_parser`, dispatch in `main`) +- Test: `tests/test_cli.py` (or `tests/test_doctor.py` — match where CLI tests live) + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +from legis.cli import main as cli_main + + +def test_cli_doctor_runs_and_exits_zero(tmp_path, capsys, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = cli_main(["doctor"]) + assert rc == 0 + assert "legis doctor: ok" in capsys.readouterr().out + + +def test_cli_doctor_json(tmp_path, capsys, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = cli_main(["doctor", "--format", "json"]) + assert rc == 0 + assert json.loads(capsys.readouterr().out)["ok"] is True +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -k cli_doctor -v` +Expected: FAIL — argparse exits non-zero / `doctor` is not a known subcommand. 
+ +- [ ] **Step 3: Write minimal implementation** + +In `src/legis/cli.py`, inside `build_parser()` (after the `install` subparser block, before `return parser`): + +```python + doctor = subparsers.add_parser( + "doctor", + help="View and repair legis install/config health", + ) + doctor.add_argument("--root", default=".", help="Project root to inspect (default: cwd)") + doctor.add_argument("--repair", action="store_true", help="Apply safe repairs, then re-check") + doctor.add_argument( + "--format", choices=("text", "json"), default="text", + help="Output format: human text (default) or machine-readable json", + ) +``` + +Add a dispatcher function near `_check_override_rate`: + +```python +def _run_doctor(args) -> int: + from pathlib import Path + + from legis.doctor import run_doctor + + return run_doctor(Path(args.root), repair=args.repair, fmt=args.format) +``` + +In `main()`, add a branch alongside the other `args.command` checks: + +```python + if args.command == "doctor": + return _run_doctor(args) +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -k cli_doctor -v` +Expected: PASS (2 tests). 
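
To see what the new subparser yields before wiring it into `main()`, the argument handling can be exercised on its own. This is a sketch only: the `build_parser` below is a stand-in that models just the `doctor` subcommand, and `dest="command"` is an assumption inferred from the plan's use of `args.command`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Stand-in for the real build_parser(); only the doctor subcommand is modelled.
    parser = argparse.ArgumentParser(prog="legis")
    subparsers = parser.add_subparsers(dest="command")  # assumed dest name
    doctor = subparsers.add_parser("doctor", help="View and repair legis install/config health")
    doctor.add_argument("--root", default=".")
    doctor.add_argument("--repair", action="store_true")
    doctor.add_argument("--format", choices=("text", "json"), default="text")
    return parser


defaults = build_parser().parse_args(["doctor"])
custom = build_parser().parse_args(["doctor", "--repair", "--format", "json", "--root", "/srv/app"])
```

An unknown `--format` value is rejected by argparse itself, so `run_doctor` only ever sees `"text"` or `"json"`.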
+ +- [ ] **Step 5: Commit** + +```bash +git add src/legis/cli.py tests/test_doctor.py +git commit -m "feat(doctor): wire 'legis doctor' CLI subcommand" +``` + +--- + +## Task 4: `register_mcp_json` install capability + `legis install --mcp` + +**Files:** +- Modify: `src/legis/install.py` (add `_legis_mcp_entry`, `register_mcp_json`) +- Modify: `src/legis/cli.py` (`--mcp` flag + step in `_run_install`) +- Test: `tests/test_install.py` + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_install.py +import json +from pathlib import Path + +from legis.install import register_mcp_json, _legis_mcp_entry + + +def test_register_mcp_json_creates_file_with_legis_entry(tmp_path): + ok, msg = register_mcp_json(tmp_path) + assert ok, msg + data = json.loads((tmp_path / ".mcp.json").read_text()) + entry = data["mcpServers"]["legis"] + assert entry["type"] == "stdio" + assert entry["args"][0] == "mcp" + assert "--agent-id" in entry["args"] + + +def test_register_mcp_json_preserves_sibling_entries(tmp_path): + (tmp_path / ".mcp.json").write_text( + json.dumps({"mcpServers": {"filigree": {"command": "x", "type": "stdio"}}}) + ) + ok, _ = register_mcp_json(tmp_path) + assert ok + data = json.loads((tmp_path / ".mcp.json").read_text()) + assert "filigree" in data["mcpServers"] + assert "legis" in data["mcpServers"] + + +def test_register_mcp_json_idempotent(tmp_path): + register_mcp_json(tmp_path) + first = (tmp_path / ".mcp.json").read_text() + register_mcp_json(tmp_path) + assert (tmp_path / ".mcp.json").read_text() == first +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_install.py -k mcp_json -v` +Expected: FAIL with `ImportError: cannot import name 'register_mcp_json'`. 
+ +- [ ] **Step 3: Write minimal implementation** + +In `src/legis/install.py` (after the `.gitignore` section), add: + +```python +# --------------------------------------------------------------------------- +# .mcp.json (agent MCP server registration) +# --------------------------------------------------------------------------- + +import shlex + +_DEFAULT_AGENT_ID = "claude-code" + + +def _legis_mcp_entry(agent_id: str = _DEFAULT_AGENT_ID) -> dict[str, Any]: + """The canonical legis stdio server entry for .mcp.json.""" + return { + "args": ["mcp", "--agent-id", agent_id], + "command": _find_legis_command()[0] if len(_find_legis_command()) == 1 else shlex.join(_find_legis_command()), + "env": {}, + "type": "stdio", + } + + +def register_mcp_json(project_root: Path, agent_id: str = _DEFAULT_AGENT_ID) -> tuple[bool, str]: + """Register (or refresh) the legis server in /.mcp.json. + + Creates the file if absent; merges into mcpServers without disturbing + sibling entries. Preserves an existing legis entry's agent-id if it already + carries one (operator choice), refreshing only the command/args shape. 
+ """ + try: + path = project_path(project_root, ".mcp.json") + except UnsafeInstallPathError as exc: + return False, str(exc) + + data: dict[str, Any] = {} + if path.exists(): + try: + parsed = json.loads(path.read_text(encoding="utf-8")) + if isinstance(parsed, dict): + data = parsed + except (json.JSONDecodeError, OSError): + return False, ".mcp.json present but unreadable; fix or remove it by hand" + + servers = data.get("mcpServers") + if not isinstance(servers, dict): + servers = {} + data["mcpServers"] = servers + + existing = servers.get("legis") + keep_agent = agent_id + if isinstance(existing, dict): + args = existing.get("args", []) + if isinstance(args, list) and "--agent-id" in args: + i = args.index("--agent-id") + if i + 1 < len(args) and isinstance(args[i + 1], str): + keep_agent = args[i + 1] + + desired = _legis_mcp_entry(keep_agent) + if existing == desired: + return True, "legis already registered in .mcp.json" + servers["legis"] = desired + _atomic_write_text(path, json.dumps(data, indent=2, sort_keys=True) + "\n") + return True, "Registered legis server in .mcp.json" +``` + +Note: `Any` and `json` are already imported at the top of `install.py`; if not, add `from typing import Any` and `import json`. Move `import shlex` to the module top if a linter flags the inline import. 
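
The two merge rules in `register_mcp_json` (keep an operator-chosen agent id, and leave sibling servers alone) can be sketched without touching the filesystem. The helper names `preserved_agent_id` and `merge_entry` are hypothetical, and the `"command": "legis"` value is a placeholder for whatever `_find_legis_command()` returns in a real install.

```python
from typing import Any, Dict

DEFAULT_AGENT_ID = "claude-code"


def preserved_agent_id(existing: Any, requested: str = DEFAULT_AGENT_ID) -> str:
    """Return the agent id already stored in an entry's args, else *requested*."""
    if not isinstance(existing, dict):
        return requested
    args = existing.get("args", [])
    if isinstance(args, list) and "--agent-id" in args:
        i = args.index("--agent-id")
        # Guard against a dangling flag with no value after it.
        if i + 1 < len(args) and isinstance(args[i + 1], str):
            return args[i + 1]
    return requested


def merge_entry(data: Dict[str, Any], entry: Dict[str, Any]) -> Dict[str, Any]:
    """Set mcpServers.legis to *entry*, leaving sibling servers untouched."""
    servers = data.get("mcpServers")
    if not isinstance(servers, dict):
        servers = {}
        data["mcpServers"] = servers
    servers["legis"] = entry
    return data


existing_file = {
    "mcpServers": {
        "filigree": {"command": "x", "type": "stdio"},
        "legis": {"args": ["mcp", "--agent-id", "ops-bot"], "command": "old-legis", "type": "stdio"},
    }
}
agent = preserved_agent_id(existing_file["mcpServers"]["legis"])
merged = merge_entry(
    existing_file,
    # "legis" is a placeholder command for this sketch.
    {"args": ["mcp", "--agent-id", agent], "command": "legis", "env": {}, "type": "stdio"},
)
```

The refreshed entry carries the new command shape while the `ops-bot` id chosen by the operator survives.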
+ +In `src/legis/cli.py` `build_parser()`, add to the `install` subparser: + +```python + install.add_argument("--mcp", action="store_true", help="Register the legis MCP server in .mcp.json only") + install.add_argument( + "--agent-id", default="claude-code", + help="Agent id stamped in the .mcp.json legis entry (default: claude-code)", + ) +``` + +In `_run_install` (the `steps` list and the imports from `legis.install`), add `register_mcp_json` to the import and a step: + +```python + (install_all or args.mcp, ".mcp.json", lambda: register_mcp_json(project_root, args.agent_id)), +``` + +and update the `install_all` computation to include `args.mcp`: + +```python + install_all = not any( + [args.claude_md, args.agents_md, args.skills, args.codex_skills, args.hooks, args.gitignore, args.mcp] + ) +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_install.py -k mcp_json -v` +Expected: PASS (3 tests). + +- [ ] **Step 5: Commit** + +```bash +git add src/legis/install.py src/legis/cli.py tests/test_install.py +git commit -m "feat(install): register legis MCP server in .mcp.json (+ --mcp flag)" +``` + +--- + +## Task 5: doctor `.mcp.json` check + repair + +**Files:** +- Modify: `src/legis/doctor.py` +- Test: `tests/test_doctor.py` + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +from legis.doctor import check_mcp_json + + +def test_mcp_json_absent_is_error(tmp_path): + c = check_mcp_json(tmp_path, repair=False) + assert c.id == "install.mcp_json" + assert c.status == "error" + assert c.fixed is False + + +def test_mcp_json_repair_fixes_it(tmp_path): + c = check_mcp_json(tmp_path, repair=True) + assert c.status == "ok" + assert c.fixed is True + assert (tmp_path / ".mcp.json").exists() + + +def test_mcp_json_present_is_ok(tmp_path): + from legis.install import register_mcp_json + register_mcp_json(tmp_path) + c = check_mcp_json(tmp_path, repair=False) + assert c.status == "ok" + assert c.fixed is 
False +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -k mcp_json -v` +Expected: FAIL with `ImportError: cannot import name 'check_mcp_json'`. + +- [ ] **Step 3: Write minimal implementation** + +```python +# add to src/legis/doctor.py +# (json from Task 1 and Path from Task 2 are already imported at the top of the module) + + +def check_mcp_json(root: Path, *, repair: bool) -> DoctorCheck: + cid = "install.mcp_json" + path = root / ".mcp.json" + present = False + if path.exists(): + try: + data = json.loads(path.read_text(encoding="utf-8")) + present = isinstance(data, dict) and isinstance(data.get("mcpServers"), dict) and "legis" in data["mcpServers"] + except (json.JSONDecodeError, OSError): + present = False + if present: + return DoctorCheck(cid, "ok") + if repair: + from legis.install import register_mcp_json + + ok, msg = register_mcp_json(root) + if ok: + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck(cid, "error", message="legis server not registered (run: legis install --mcp)") +``` + +`json` is already imported at the top of the module from Task 1, so this step adds no imports. Register the check in `collect_checks`: + +```python + checks.append(check_mcp_json(root, repair=repair)) +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -k mcp_json -v` +Expected: PASS (3 tests). 
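
The check in Step 3 follows a probe, optionally repair, report shape. A generic version of that shape might look like the sketch below. It is an illustration under assumptions: `check_with_repair` is a hypothetical helper, results are plain tuples instead of `DoctorCheck`, and it re-probes after the repair, which `check_mcp_json` as written does not do.

```python
import tempfile
from pathlib import Path
from typing import Callable, Tuple


def check_with_repair(
    probe: Callable[[], bool],
    fix: Callable[[], Tuple[bool, str]],
    *,
    repair: bool,
    hint: str,
) -> Tuple[str, bool, str]:
    """Return (status, fixed, message) for one check."""
    if probe():
        return ("ok", False, "")
    if not repair:
        return ("error", False, hint)
    ok, msg = fix()
    # Re-probe so "fixed" is only reported when the repair is observable.
    if ok and probe():
        return ("ok", True, "")
    return ("error", False, msg)


with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "marker.txt"

    def probe() -> bool:
        return marker.exists()

    def fix() -> Tuple[bool, str]:
        marker.write_text("present\n", encoding="utf-8")
        return True, "created marker"

    before = check_with_repair(probe, fix, repair=False, hint="marker missing")
    repaired = check_with_repair(probe, fix, repair=True, hint="marker missing")
    after = check_with_repair(probe, fix, repair=False, hint="marker missing")
```

The three results mirror the three tests in Step 1: absent is an error, repair reports `fixed`, and an already-present artifact is `ok` without `fixed`.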
+ +- [ ] **Step 5: Commit** + +```bash +git add src/legis/doctor.py tests/test_doctor.py +git commit -m "feat(doctor): .mcp.json registration check + repair" +``` + +--- + +## Task 6: doctor install-wiring checks (blocks, skills, hook, gitignore) + +**Files:** +- Modify: `src/legis/doctor.py` +- Test: `tests/test_doctor.py` + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +from legis.doctor import check_instruction_block, check_skill_pack, check_hook, check_gitignore +from legis import install as legis_install + + +def test_instruction_block_absent_is_error(tmp_path): + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=False) + assert c.id == "install.claude_md" + assert c.status == "error" + + +def test_instruction_block_repair_creates_it(tmp_path): + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=True) + assert c.status == "ok" + assert c.fixed is True + assert legis_install.INSTRUCTIONS_MARKER in (tmp_path / "CLAUDE.md").read_text() + + +def test_gitignore_absent_is_error_then_repaired(tmp_path): + assert check_gitignore(tmp_path, repair=False).status == "error" + fixed = check_gitignore(tmp_path, repair=True) + assert fixed.status == "ok" and fixed.fixed is True + assert ".weft/legis/" in (tmp_path / ".gitignore").read_text() + + +def test_skill_pack_absent_is_error(tmp_path): + assert check_skill_pack(tmp_path, ".claude", repair=False).status == "error" + + +def test_skill_pack_repair_installs(tmp_path): + c = check_skill_pack(tmp_path, ".claude", repair=True) + assert c.status == "ok" and c.fixed is True +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -k "instruction_block or gitignore or skill_pack" -v` +Expected: FAIL — those check functions don't exist yet. 
+ +- [ ] **Step 3: Write minimal implementation** + +```python +# add to src/legis/doctor.py +from legis import install as _install + + +def _block_fresh(root: Path, filename: str) -> bool: + """True iff / has the legis block at the current token.""" + path = root / filename + if not path.exists(): + return False + try: + content = path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + return False + if _install.INSTRUCTIONS_MARKER not in content: + return False + return _install._extract_marker_token(content) == _install._marker_token() + + +def check_instruction_block(root: Path, filename: str, *, repair: bool) -> DoctorCheck: + cid = "install.claude_md" if filename == "CLAUDE.md" else "install.agents_md" + if _block_fresh(root, filename): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = _install.inject_instructions(root / filename) + if ok and _block_fresh(root, filename): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + missing = "missing" if not (root / filename).exists() else "block missing or drifted" + return DoctorCheck(cid, "error", message=f"{filename} {missing} (run: legis install)") + + +def _skill_fresh(root: Path, base: str) -> bool: + source = _install._get_skills_source_dir() / _install.SKILL_NAME + target = root / base / "skills" / _install.SKILL_NAME + if not source.is_dir() or not target.is_dir(): + return False + return _install._skill_tree_fingerprint(target) == _install._skill_tree_fingerprint(source) + + +def check_skill_pack(root: Path, base: str, *, repair: bool) -> DoctorCheck: + cid = "install.claude_skill" if base == ".claude" else "install.agents_skill" + installer = _install.install_skills if base == ".claude" else _install.install_codex_skills + if _skill_fresh(root, base): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = installer(root) + if ok and _skill_fresh(root, base): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", 
message=msg) + return DoctorCheck(cid, "error", message=f"{base}/skills/{_install.SKILL_NAME} missing or drifted (run: legis install)") + + +def _hook_present(root: Path) -> bool: + settings_path = root / ".claude" / "settings.json" + if not settings_path.exists(): + return False + try: + settings = json.loads(settings_path.read_text(encoding="utf-8")) + except (json.JSONDecodeError, OSError): + return False + return _install._has_unscoped_session_start_hook(settings, _install.SESSION_CONTEXT_COMMAND) + + +def check_hook(root: Path, *, repair: bool) -> DoctorCheck: + cid = "install.hook" + if _hook_present(root): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = _install.install_claude_code_hooks(root) + if ok and _hook_present(root): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck(cid, "error", message="SessionStart hook not registered (run: legis install)") + + +def _gitignore_present(root: Path) -> bool: + path = root / ".gitignore" + if not path.exists(): + return False + try: + content = path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + return False + present = {ln.strip() for ln in content.splitlines() if ln.strip() and not ln.lstrip().startswith("#")} + return all(rule in present for rule in _install._LEGIS_IGNORE_RULES) + + +def check_gitignore(root: Path, *, repair: bool) -> DoctorCheck: + cid = "install.gitignore" + if _gitignore_present(root): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = _install.ensure_gitignore(root) + if ok and _gitignore_present(root): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck(cid, "error", message=".weft/legis/ not in .gitignore (run: legis install)") +``` + +Register them in `collect_checks` (before the `.mcp.json` check): + +```python + checks.append(check_instruction_block(root, "CLAUDE.md", repair=repair)) + checks.append(check_instruction_block(root, 
"AGENTS.md", repair=repair)) + checks.append(check_skill_pack(root, ".claude", repair=repair)) + checks.append(check_skill_pack(root, ".agents", repair=repair)) + checks.append(check_hook(root, repair=repair)) + checks.append(check_gitignore(root, repair=repair)) +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -k "instruction_block or gitignore or skill_pack or hook" -v` +Expected: PASS. + +- [ ] **Step 5: Commit** + +```bash +git add src/legis/doctor.py tests/test_doctor.py +git commit -m "feat(doctor): install-wiring checks (blocks, skills, hook, gitignore)" +``` + +--- + +## Task 7: doctor config & store checks (weft.toml report-only, store dir, db overrides, legacy) + +**Files:** +- Modify: `src/legis/doctor.py` +- Test: `tests/test_doctor.py` + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +from legis.doctor import check_weft_toml, check_store_dir, check_db_overrides, check_legacy_stray_db + + +def test_weft_toml_absent_is_ok(tmp_path): + assert check_weft_toml(tmp_path).status == "ok" + + +def test_weft_toml_valid_legis_table_is_ok(tmp_path): + (tmp_path / "weft.toml").write_text('[legis]\nstore_dir = ".weft/legis"\n') + assert check_weft_toml(tmp_path).status == "ok" + + +def test_weft_toml_malformed_is_error_and_unchanged(tmp_path): + wt = tmp_path / "weft.toml" + wt.write_text("[legis]\nstore_dir = \n") # malformed TOML + before = wt.read_text() + c = check_weft_toml(tmp_path) + assert c.status == "error" + assert wt.read_text() == before # C-9(b): never written + + +def test_weft_toml_legis_not_a_table_is_error(tmp_path): + (tmp_path / "weft.toml").write_text('legis = "oops"\n') + assert check_weft_toml(tmp_path).status == "error" + + +def test_store_dir_writable_parent_is_ok(tmp_path): + assert check_store_dir(tmp_path).status == "ok" + + +def test_db_override_bad_url_is_error(tmp_path, monkeypatch): + monkeypatch.setenv("LEGIS_GOVERNANCE_DB", "::not a url::") + 
assert check_db_overrides(tmp_path).status == "error" + + +def test_legacy_stray_db_is_warn(tmp_path): + (tmp_path / "legis-governance.db").write_text("x") + assert check_legacy_stray_db(tmp_path).status == "warn" +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -k "weft_toml or store_dir or db_override or legacy" -v` +Expected: FAIL — functions undefined. + +- [ ] **Step 3: Write minimal implementation** + +```python +# add to src/legis/doctor.py +import os +import tomllib + +from sqlalchemy.engine import make_url + +_DB_OVERRIDE_ENVS = ("LEGIS_CHECK_DB", "LEGIS_GOVERNANCE_DB", "LEGIS_BINDING_DB", "LEGIS_PULL_DB") +_LEGACY_DB_NAMES = ("legis-checks.db", "legis-governance.db", "legis-binding.db", "legis-pulls.db") + + +def check_weft_toml(root: Path) -> DoctorCheck: + """Report-only (C-9(b)): NEVER writes weft.toml. Distinguishes ABSENT (ok — + defaults intentional) from PRESENT-BUT-BROKEN (error — config silently not + applying), restoring the operator signal that C-9(c) silences at runtime.""" + cid = "config.weft_toml" + path = root / "weft.toml" + if not path.exists(): + return DoctorCheck(cid, "ok", message="absent (built-in defaults)") + try: + data = tomllib.loads(path.read_text(encoding="utf-8")) + except (tomllib.TOMLDecodeError, OSError, UnicodeDecodeError) as exc: + return DoctorCheck(cid, "error", message=f"present but unparseable; [legis] silently not applying ({exc})") + table = data.get("legis") + if table is not None and not isinstance(table, dict): + return DoctorCheck(cid, "error", message="[legis] in weft.toml must be a table") + return DoctorCheck(cid, "ok") + + +def _nearest_existing(path: Path) -> Path: + p = path + while not p.exists() and p != p.parent: + p = p.parent + return p + + +def check_store_dir(root: Path, *, repair: bool = False) -> DoctorCheck: + """An absent .weft/legis/ is ok (created lazily). A present-but-unwritable + dir is an error. 
--repair ensures the dir exists (explicit operator action).""" + cid = "store.dir" + from legis import config + + store_dir = (root / config._store_dir()) if not config._store_dir().is_absolute() else config._store_dir() + if store_dir.exists(): + if not os.access(store_dir, os.W_OK): + return DoctorCheck(cid, "error", message=f"{store_dir} not writable") + return DoctorCheck(cid, "ok") + if repair: + try: + store_dir.mkdir(parents=True, exist_ok=True) + return DoctorCheck(cid, "ok", fixed=True) + except OSError as exc: + return DoctorCheck(cid, "error", message=f"cannot create {store_dir}: {exc}") + anchor = _nearest_existing(store_dir) + if not os.access(anchor, os.W_OK): + return DoctorCheck(cid, "error", message=f"{store_dir} not creatable ({anchor} not writable)") + return DoctorCheck(cid, "ok", message="absent (created on first store open)") + + +def check_db_overrides(root: Path) -> DoctorCheck: + cid = "store.db_overrides" + bad = [] + for env in _DB_OVERRIDE_ENVS: + val = os.environ.get(env) + if not val: + continue + try: + make_url(val) + except Exception: # noqa: BLE001 — any parse failure is a bad override + bad.append(env) + if bad: + return DoctorCheck(cid, "error", message="invalid URL in: " + ", ".join(bad)) + return DoctorCheck(cid, "ok") + + +def check_legacy_stray_db(root: Path) -> DoctorCheck: + cid = "store.legacy_stray" + stray = [n for n in _LEGACY_DB_NAMES if (root / n).is_file()] + if stray: + return DoctorCheck(cid, "warn", message="legacy DB at repo root (move to .weft/legis/): " + ", ".join(stray)) + return DoctorCheck(cid, "ok") +``` + +Register in `collect_checks`: + +```python + checks.append(check_weft_toml(root)) + checks.append(check_store_dir(root, repair=repair)) + checks.append(check_db_overrides(root)) + checks.append(check_legacy_stray_db(root)) +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -k "weft_toml or store_dir or db_override or legacy" -v` +Expected: PASS. 
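
The "absent but creatable" branch of `check_store_dir` rests on finding the closest ancestor that already exists and asking whether it is writable. A minimal sketch of that probe, run against a temporary directory (the `creatable` wrapper is a hypothetical name added for the illustration):

```python
import os
import tempfile
from pathlib import Path


def nearest_existing(path: Path) -> Path:
    # Walk upward until an existing entry (or the filesystem root) is found.
    p = path
    while not p.exists() and p != p.parent:
        p = p.parent
    return p


def creatable(path: Path) -> bool:
    # A missing directory is creatable if its closest existing ancestor is writable.
    return os.access(nearest_existing(path), os.W_OK)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    store_dir = root / ".weft" / "legis"  # neither .weft nor legis exists yet
    anchor_is_root = nearest_existing(store_dir) == root
    store_creatable = creatable(store_dir)
    store_exists_after = store_dir.exists()
```

The probe is read-only: the directory is still absent afterwards, which matches the rule that only `--repair` creates it.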
+ +- [ ] **Step 5: Commit** + +```bash +git add src/legis/doctor.py tests/test_doctor.py +git commit -m "feat(doctor): config & store checks (weft.toml report-only, store dir, db overrides, legacy)" +``` + +--- + +## Task 8: doctor governance integrity + runtime/sibling checks + +**Files:** +- Modify: `src/legis/doctor.py` +- Test: `tests/test_doctor.py` + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +from legis.doctor import check_audit_chain, check_hmac_key, check_sibling_url + + +def test_audit_chain_absent_db_is_ok(tmp_path): + c = check_audit_chain("store.governance_chain", "sqlite:///" + str(tmp_path / "nope.db")) + assert c.status == "ok" + + +def test_audit_chain_intact_db_is_ok(tmp_path): + from legis.store.audit_store import AuditStore + url = "sqlite:///" + str(tmp_path / "gov.db") + AuditStore(url) # creates schema + assert check_audit_chain("store.governance_chain", url).status == "ok" + + +def test_hmac_key_warn_when_protected_set_without_key(tmp_path, monkeypatch): + monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read") + monkeypatch.delenv("LEGIS_HMAC_KEY", raising=False) + c = check_hmac_key(tmp_path) + assert c.status == "warn" + + +def test_hmac_key_never_prints_value(tmp_path, monkeypatch): + monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read") + monkeypatch.setenv("LEGIS_HMAC_KEY", "super-secret-value") + c = check_hmac_key(tmp_path) + assert c.status == "ok" + assert "super-secret-value" not in (c.message or "") + + +def test_sibling_url_invalid_is_error(tmp_path, monkeypatch): + monkeypatch.setenv("LOOMWEAVE_API_URL", "localhost:9620") # no scheme + c = check_sibling_url("runtime.loomweave_url", "LOOMWEAVE_API_URL") + assert c.status == "error" +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `uv run pytest tests/test_doctor.py -k "audit_chain or hmac_key or sibling_url" -v` +Expected: FAIL — functions undefined. 
+ +- [ ] **Step 3: Write minimal implementation** + +```python +# add to src/legis/doctor.py +from urllib.parse import urlsplit + + +def check_audit_chain(cid: str, url: str) -> DoctorCheck: + """Report-only. Absent file store => ok (nothing to verify). A tampered + chain => error (a hash chain cannot/must not be auto-repaired).""" + try: + parsed = make_url(url) + except Exception: # noqa: BLE001 + return DoctorCheck(cid, "ok", message="store URL not a file store") + db = parsed.database + if not str(parsed.drivername).startswith("sqlite") or not db or db == ":memory:": + return DoctorCheck(cid, "ok", message="not a file store") + if not Path(db).exists(): + return DoctorCheck(cid, "ok", message="no store yet") + from legis.store.audit_store import AuditStore + + try: + intact = AuditStore(url).verify_integrity() + except Exception as exc: # noqa: BLE001 — surface any verify failure, never raise from doctor + return DoctorCheck(cid, "error", message=f"integrity check failed: {exc}") + return DoctorCheck(cid, "ok") if intact else DoctorCheck(cid, "error", message="hash chain verification FAILED (report-only; cannot repair)") + + +def check_hmac_key(root: Path) -> DoctorCheck: + """Presence-only; NEVER renders the key value.""" + cid = "runtime.hmac_key" + from legis import config + + if not config.protected_policies(): + return DoctorCheck(cid, "ok", message="no protected policies configured") + if os.environ.get("LEGIS_HMAC_KEY"): + return DoctorCheck(cid, "ok") + return DoctorCheck(cid, "warn", message="protected policies configured but LEGIS_HMAC_KEY not set; protected submissions will fail") + + +def check_sibling_url(cid: str, env: str) -> DoctorCheck: + url = os.environ.get(env) + if not url: + return DoctorCheck(cid, "ok", message="not configured") + parsed = urlsplit(url) + if parsed.scheme.lower() in {"http", "https"} and parsed.netloc: + return DoctorCheck(cid, "ok") + return DoctorCheck(cid, "error", message=f"{env} invalid URL: {url!r}") +``` + 
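
The rule `check_sibling_url` applies (an http or https scheme plus a non-empty network location) is easy to probe by hand. The sketch below isolates that predicate; `is_http_url` is a hypothetical name and the sample hosts are made up.

```python
from urllib.parse import urlsplit


def is_http_url(url: str) -> bool:
    parsed = urlsplit(url)
    # Both conditions are required: a bare "https://" has a scheme but no host.
    return parsed.scheme.lower() in {"http", "https"} and bool(parsed.netloc)


samples = (
    "http://localhost:9620",
    "HTTPS://weave.example/api",
    "localhost:9620",  # no scheme, the case the test suite covers
    "https://",
    "ftp://host/x",
)
results = {u: is_http_url(u) for u in samples}
```

A value such as `localhost:9620` is rejected because nothing before the colon is an accepted scheme, so the operator sees the error at `doctor` time rather than at first request.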
+Register in `collect_checks`: + +```python + from legis import config + + checks.append(check_audit_chain("store.governance_chain", config.governance_db_url())) + checks.append(check_audit_chain("store.binding_chain", config.binding_db_url())) + checks.append(check_hmac_key(root)) + checks.append(check_sibling_url("runtime.loomweave_url", "LOOMWEAVE_API_URL")) + checks.append(check_sibling_url("runtime.filigree_url", "FILIGREE_API_URL")) +``` + +Note: `config.governance_db_url()` / `binding_db_url()` resolve store paths relative to the current working directory, whereas `collect_checks` must resolve them relative to `root`, which may not be the cwd. Rather than changing directory, build the URLs in a small helper that joins `config._store_dir()` onto `root`; this also avoids cwd coupling in tests: + +```python +def _store_url(root: Path, db_name: str, env: str) -> str: + val = os.environ.get(env) + if val: + return val + from legis import config + + store_dir = config._store_dir() + base = store_dir if store_dir.is_absolute() else (root / store_dir) + return "sqlite:///" + (base / db_name).as_posix() +``` + +and replace the two `check_audit_chain` registrations above with: + +```python + checks.append(check_audit_chain("store.governance_chain", _store_url(root, "legis-governance.db", "LEGIS_GOVERNANCE_DB"))) + checks.append(check_audit_chain("store.binding_chain", _store_url(root, "legis-binding.db", "LEGIS_BINDING_DB"))) +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `uv run pytest tests/test_doctor.py -k "audit_chain or hmac_key or sibling_url" -v` +Expected: PASS. 
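
The path arithmetic in `_store_url` is worth checking by hand, because an absolute POSIX path produces four slashes after `sqlite:`. The sketch below reproduces only the joining step; it uses `PurePosixPath` so the result is the same on any platform, and the `/srv/app` and `/var/lib/legis` paths are invented for the example.

```python
from pathlib import PurePosixPath


def store_url(root: PurePosixPath, store_dir: PurePosixPath, db_name: str) -> str:
    # A relative store dir is anchored at root; an absolute one is used as-is.
    base = store_dir if store_dir.is_absolute() else root / store_dir
    return "sqlite:///" + (base / db_name).as_posix()


relative = store_url(PurePosixPath("/srv/app"), PurePosixPath(".weft/legis"), "legis-governance.db")
absolute = store_url(PurePosixPath("/srv/app"), PurePosixPath("/var/lib/legis"), "legis-governance.db")
```

Neither result depends on the process's working directory, which is the property the note above is after.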
+ +- [ ] **Step 5: Commit** + +```bash +git add src/legis/doctor.py tests/test_doctor.py +git commit -m "feat(doctor): governance-chain integrity + runtime/sibling checks" +``` + +--- + +## Task 9: end-to-end `--repair` re-check + JSON regression test + +**Files:** +- Test: `tests/test_doctor.py` (no new logic — repairs already run inside checks; this proves the whole pipeline) + +- [ ] **Step 1: Write the failing test** + +```python +# add to tests/test_doctor.py +def test_repair_makes_fresh_project_healthy(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + # First run: unhealthy (no install artifacts, no .mcp.json). + assert run_doctor(tmp_path, repair=False, fmt="text") == 1 + # Repair run: install-wiring + .mcp.json get fixed; re-check is healthy. + assert run_doctor(tmp_path, repair=True, fmt="text") == 0 + # Third run, no repair: stays healthy. + assert run_doctor(tmp_path, repair=False, fmt="text") == 0 + + +def test_repair_never_writes_weft_toml(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text("[legis]\nstore_dir = \n") # malformed + before = (tmp_path / "weft.toml").read_text() + run_doctor(tmp_path, repair=True, fmt="json") + assert (tmp_path / "weft.toml").read_text() == before + + +def test_json_output_has_no_secret(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read") + monkeypatch.setenv("LEGIS_HMAC_KEY", "TOP-SECRET") + import io, contextlib + buf = io.StringIO() + with contextlib.redirect_stdout(buf): + run_doctor(tmp_path, repair=False, fmt="json") + assert "TOP-SECRET" not in buf.getvalue() +``` + +- [ ] **Step 2: Run test to verify it fails or passes** + +Run: `uv run pytest tests/test_doctor.py -k "repair_makes or never_writes or no_secret" -v` +Expected: PASS if Tasks 5–8 are wired correctly. 
If `test_repair_makes_fresh_project_healthy` fails, the offending check's `repair=True` branch isn't reaching `ok` — fix that check, not this test. + +- [ ] **Step 3: (only if a test failed) fix the implicated check** + +No new code if green. If red, the failing check is reported by name in the assertion — return to that check's task and correct its repair branch. + +- [ ] **Step 4: Run the full doctor test file** + +Run: `uv run pytest tests/test_doctor.py -v` +Expected: PASS (all). + +- [ ] **Step 5: Commit** + +```bash +git add tests/test_doctor.py +git commit -m "test(doctor): end-to-end repair pipeline + weft.toml/secret invariants" +``` + +--- + +## Task 10: docs, coverage floor, and full gate run + +**Files:** +- Modify: `CHANGELOG.md`, `README.md` +- Modify: `scripts/check_coverage_floors.py` (only if it lists modules explicitly) + +- [ ] **Step 1: Update CHANGELOG and README** + +Add to `CHANGELOG.md` under the unreleased/rc4 section: + +```markdown +### Added +- `legis doctor [--root] [--repair] [--format text|json]` — operator health view + and safe repair for the install + config layer (instruction blocks, skills, + SessionStart hook, `.gitignore`, `.mcp.json` registration, store dir, audit + hash-chain integrity, key/sibling wiring). Report-only on `weft.toml` (C-9(b)) + and on hash chains; key values are never rendered. +- `legis install --mcp` — register the legis MCP server in `.mcp.json` + (also part of `legis install` with no flags). +``` + +In `README.md`, under the surfaces/commands section, add a `legis doctor` line mirroring the existing `legis install` description. + +- [ ] **Step 2: Run the full test suite + lint + types** + +Run: +```bash +uv run ruff check src +uv run mypy src/legis +uv run pytest -q +``` +Expected: ruff clean, mypy clean, all tests pass. + +- [ ] **Step 3: Run coverage floors** + +Run: `uv run pytest --cov=legis --cov-report=term-missing && uv run python scripts/check_coverage_floors.py` +Expected: floors hold. 
If `check_coverage_floors.py` enumerates packages and `doctor.py` is top-level (not in a covered package dir), confirm it falls under the global floor; if the script needs a per-module entry, add one a few points below the achieved coverage. + +- [ ] **Step 4: Manual smoke test** + +Run: +```bash +cd /tmp && rm -rf doctortest && mkdir doctortest && cd doctortest +legis doctor # expect: several errors (fresh dir), exit 1 +legis doctor --repair # expect: install wiring + .mcp.json fixed, exit 0 +legis doctor --format json # expect: {"ok": true, ...} +``` +Expected: matches the comments. + +- [ ] **Step 5: Commit** + +```bash +git add CHANGELOG.md README.md scripts/check_coverage_floors.py +git commit -m "docs(doctor): changelog + readme for legis doctor; coverage floor" +``` + +--- + +## Self-Review notes (for the implementer) + +- **Spec coverage:** install wiring (T6) ✓, `.mcp.json` install+check (T4/T5) ✓, config & stores (T7) ✓, governance integrity (T8) ✓, runtime & siblings (T8) ✓, `--repair` model (repairs live inside checks; T9 proves it) ✓, JSON shape + exit codes (T1/T2/T9) ✓, weft.toml never-written invariant (T7/T9) ✓, key-value-never-shown invariant (T8/T9) ✓. +- **C-9(b) guard** is asserted by `test_weft_toml_malformed_is_error_and_unchanged` and `test_repair_never_writes_weft_toml`. +- **No-leak guard:** `check_audit_chain` constructs `AuditStore` only when the DB file already exists; `check_store_dir` creates `.weft/legis/` only under `--repair`. +- **Verify reused private symbols exist** before Task 6/8 (`SESSION_CONTEXT_COMMAND`, `SKILL_NAME`, `_has_unscoped_session_start_hook`, `_LEGIS_IGNORE_RULES`, `_extract_marker_token`, `_marker_token`). If any name differs, adjust the call site — do not duplicate the logic. 
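
The last self-review item (confirming that the reused private symbols exist before starting Task 6 and Task 8) can be done mechanically. A minimal sketch, assuming a hypothetical helper named `missing_symbols`; it is demonstrated on the standard-library `json` module so it runs anywhere.

```python
import importlib
from typing import Iterable, List


def missing_symbols(module_name: str, names: Iterable[str]) -> List[str]:
    """Return the names from *names* that the imported module does not define."""
    module = importlib.import_module(module_name)
    return [n for n in names if not hasattr(module, n)]


# Demonstrated on the standard library; "_no_such_helper" is deliberately absent.
demo_missing = missing_symbols("json", ["loads", "dumps", "_no_such_helper"])
```

For this plan the call would be `missing_symbols("legis.install", [...])` with the six names listed in the bullet above; an empty list means the call sites in Tasks 6 and 8 can be written as shown.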
From 430a7596427a7871dbda4830cea367ba4b2e2f5b Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:09:53 +1000 Subject: [PATCH 53/72] feat(doctor): DoctorCheck record + text/json rendering --- src/legis/doctor.py | 70 ++++++++++++++++++++++++++++++++++++++++++++ tests/test_doctor.py | 30 +++++++++++++++++++ 2 files changed, 100 insertions(+) create mode 100644 src/legis/doctor.py create mode 100644 tests/test_doctor.py diff --git a/src/legis/doctor.py b/src/legis/doctor.py new file mode 100644 index 0000000..5cb328b --- /dev/null +++ b/src/legis/doctor.py @@ -0,0 +1,70 @@ +"""`legis doctor` — view and repair legis install/config health. + +Operator/CLI tool only: it inspects and repairs the *host* install and legis's +own per-member artifacts. It is NOT on the agent MCP surface or the service +layer, and per hub doctrine C-9(b) it NEVER writes weft.toml. +""" + +from __future__ import annotations + +import json +from dataclasses import dataclass +from pathlib import Path +from typing import Any + + +@dataclass(frozen=True, slots=True) +class DoctorCheck: + id: str + status: str # "ok" | "warn" | "error" + fixed: bool = False + message: str | None = None + + @property + def ok(self) -> bool: + return self.status != "error" + + def to_dict(self) -> dict[str, Any]: + data: dict[str, Any] = {"id": self.id, "status": self.status, "fixed": self.fixed} + if self.message: + data["message"] = self.message + return data + + +def _next_actions(checks: list[DoctorCheck]) -> list[str]: + return [f"{c.id}: {c.message}" for c in checks if c.status != "ok" and c.message] + + +def render_json(checks: list[DoctorCheck]) -> str: + payload = { + "ok": all(c.ok for c in checks), + "checks": [c.to_dict() for c in checks], + "next_actions": _next_actions(checks), + } + return json.dumps(payload, indent=2, sort_keys=True) + + +def render_text(checks: list[DoctorCheck]) -> str: + healthy = all(c.status == "ok" for c in 
checks) + if healthy: + return "legis doctor: ok" + lines = ["legis doctor:"] + for c in checks: + if c.status == "ok": + continue + lines.append(f" {c.id}: {c.status} — {c.message}" if c.message else f" {c.id}: {c.status}") + return "\n".join(lines) + + +def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]: + """Run every check against *root*. Repairs run inside individual checks + when *repair* is True; each returned check reflects post-repair state.""" + checks: list[DoctorCheck] = [] + # Check functions are appended here in later tasks. + return checks + + +def run_doctor(root: Path, *, repair: bool, fmt: str) -> int: + checks = collect_checks(root, repair=repair) + print(render_json(checks) if fmt == "json" else render_text(checks)) + return 0 if all(c.ok for c in checks) else 1 diff --git a/tests/test_doctor.py b/tests/test_doctor.py new file mode 100644 index 0000000..38eb827 --- /dev/null +++ b/tests/test_doctor.py @@ -0,0 +1,30 @@ +from __future__ import annotations + +import json + +from legis.doctor import DoctorCheck, render_json, render_text + + +def test_doctorcheck_to_dict_omits_empty_message(): + assert DoctorCheck("a.b", "ok").to_dict() == {"id": "a.b", "status": "ok", "fixed": False} + assert DoctorCheck("a.b", "error", message="boom").to_dict() == { + "id": "a.b", + "status": "error", + "fixed": False, + "message": "boom", + } + + +def test_render_json_shape(): + checks = [DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")] + payload = json.loads(render_json(checks)) + assert payload["ok"] is False + assert payload["checks"][0] == {"id": "a", "status": "ok", "fixed": False} + assert payload["next_actions"] == ["b: bad"] + + +def test_render_text_lists_only_problems_when_healthy_says_ok(): + assert "legis doctor: ok" in render_text([DoctorCheck("a", "ok")]) + out = render_text([DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")]) + assert "b: error" in out + assert "legis doctor: ok" not in out From 
077e668f605d0d3dc5751e39ff638c514edee327 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:10:12 +1000 Subject: [PATCH 54/72] feat(doctor): collect_checks + run_doctor orchestrator skeleton --- tests/test_doctor.py | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 38eb827..44149f3 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -28,3 +28,22 @@ def test_render_text_lists_only_problems_when_healthy_says_ok(): out = render_text([DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")]) assert "b: error" in out assert "legis doctor: ok" not in out + + +from pathlib import Path + +from legis.doctor import run_doctor + + +def test_run_doctor_empty_is_healthy(tmp_path, capsys): + # With no checks registered yet, an empty list renders healthy, exit 0. + rc = run_doctor(tmp_path, repair=False, fmt="text") + assert rc == 0 + assert "legis doctor: ok" in capsys.readouterr().out + + +def test_run_doctor_json_format(tmp_path, capsys): + rc = run_doctor(tmp_path, repair=False, fmt="json") + assert rc == 0 + payload = json.loads(capsys.readouterr().out) + assert payload == {"ok": True, "checks": [], "next_actions": []} From 77b118f3c5572ae6e719835a22cacbf821685456 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:11:06 +1000 Subject: [PATCH 55/72] feat(doctor): wire 'legis doctor' CLI subcommand --- src/legis/cli.py | 22 ++++++++++++++++++++++ tests/test_doctor.py | 17 +++++++++++++++++ 2 files changed, 39 insertions(+) diff --git a/src/legis/cli.py b/src/legis/cli.py index bfd89bd..5077029 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -159,6 +159,17 @@ def build_parser() -> argparse.ArgumentParser: help="SessionStart hook: refresh drifted legis instructions/skills in the cwd", ) + doctor = subparsers.add_parser( + "doctor", + help="View and 
repair legis install/config health", + ) + doctor.add_argument("--root", default=".", help="Project root to inspect (default: cwd)") + doctor.add_argument("--repair", action="store_true", help="Apply safe repairs, then re-check") + doctor.add_argument( + "--format", choices=("text", "json"), default="text", + help="Output format: human text (default) or machine-readable json", + ) + return parser @@ -237,6 +248,14 @@ def _check_override_rate(db_url: str) -> int: return 1 if res.status is GateStatus.FAIL else 0 +def _run_doctor(args) -> int: + from pathlib import Path + + from legis.doctor import run_doctor + + return run_doctor(Path(args.root), repair=args.repair, fmt=args.format) + + def _run_install(args) -> int: from legis.install import ( ensure_gitignore, @@ -380,5 +399,8 @@ def main(argv: list[str] | None = None, *, run=uvicorn.run) -> int: print("policy-boundary-check: PASS") return 1 if findings else 0 + if args.command == "doctor": + return _run_doctor(args) + parser.print_help(sys.stderr) return 2 diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 44149f3..23c7676 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -47,3 +47,20 @@ def test_run_doctor_json_format(tmp_path, capsys): assert rc == 0 payload = json.loads(capsys.readouterr().out) assert payload == {"ok": True, "checks": [], "next_actions": []} + + +from legis.cli import main as cli_main + + +def test_cli_doctor_runs_and_exits_zero(tmp_path, capsys, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = cli_main(["doctor"]) + assert rc == 0 + assert "legis doctor: ok" in capsys.readouterr().out + + +def test_cli_doctor_json(tmp_path, capsys, monkeypatch): + monkeypatch.chdir(tmp_path) + rc = cli_main(["doctor", "--format", "json"]) + assert rc == 0 + assert json.loads(capsys.readouterr().out)["ok"] is True From 87975020721bc6ae14e9aff526c31561371a936e Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:16:40 
+1000 Subject: [PATCH 56/72] style(doctor): consolidate test imports; drop redundant Path import --- src/legis/cli.py | 2 -- tests/test_doctor.py | 11 ++--------- 2 files changed, 2 insertions(+), 11 deletions(-) diff --git a/src/legis/cli.py b/src/legis/cli.py index 5077029..a1e97ae 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -249,8 +249,6 @@ def _check_override_rate(db_url: str) -> int: def _run_doctor(args) -> int: - from pathlib import Path - from legis.doctor import run_doctor return run_doctor(Path(args.root), repair=args.repair, fmt=args.format) diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 23c7676..00924e5 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -2,7 +2,8 @@ import json -from legis.doctor import DoctorCheck, render_json, render_text +from legis.cli import main as cli_main +from legis.doctor import DoctorCheck, render_json, render_text, run_doctor def test_doctorcheck_to_dict_omits_empty_message(): @@ -30,11 +31,6 @@ def test_render_text_lists_only_problems_when_healthy_says_ok(): assert "legis doctor: ok" not in out -from pathlib import Path - -from legis.doctor import run_doctor - - def test_run_doctor_empty_is_healthy(tmp_path, capsys): # With no checks registered yet, an empty list renders healthy, exit 0. 
rc = run_doctor(tmp_path, repair=False, fmt="text") @@ -49,9 +45,6 @@ def test_run_doctor_json_format(tmp_path, capsys): assert payload == {"ok": True, "checks": [], "next_actions": []} -from legis.cli import main as cli_main - - def test_cli_doctor_runs_and_exits_zero(tmp_path, capsys, monkeypatch): monkeypatch.chdir(tmp_path) rc = cli_main(["doctor"]) From c11360d03b474ad32775f20aef571018171976a5 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:22:37 +1000 Subject: [PATCH 57/72] feat(install): register legis MCP server in .mcp.json (+ --mcp flag) Co-Authored-By: Claude Sonnet 4.6 --- src/legis/cli.py | 9 +++++- src/legis/install.py | 64 +++++++++++++++++++++++++++++++++++++++++++ tests/test_install.py | 39 ++++++++++++++++++++++++++ 3 files changed, 111 insertions(+), 1 deletion(-) diff --git a/src/legis/cli.py b/src/legis/cli.py index a1e97ae..f281881 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -153,6 +153,11 @@ def build_parser() -> argparse.ArgumentParser: install.add_argument("--codex-skills", action="store_true", help="Install the Codex skill pack only") install.add_argument("--hooks", action="store_true", help="Register the Claude Code SessionStart hook only") install.add_argument("--gitignore", action="store_true", help="Add legis config rules to .gitignore only") + install.add_argument("--mcp", action="store_true", help="Register the legis MCP server in .mcp.json only") + install.add_argument( + "--agent-id", default="claude-code", + help="Agent id stamped in the .mcp.json legis entry (default: claude-code)", + ) subparsers.add_parser( "session-context", @@ -261,11 +266,12 @@ def _run_install(args) -> int: install_claude_code_hooks, install_codex_skills, install_skills, + register_mcp_json, ) project_root = Path.cwd() install_all = not any( - [args.claude_md, args.agents_md, args.skills, args.codex_skills, args.hooks, args.gitignore] + [args.claude_md, args.agents_md, 
args.skills, args.codex_skills, args.hooks, args.gitignore, args.mcp] ) steps: list[tuple[bool, str, object]] = [ @@ -275,6 +281,7 @@ def _run_install(args) -> int: (install_all or args.codex_skills, "Codex skill", lambda: install_codex_skills(project_root)), (install_all or args.hooks, "Claude Code hook", lambda: install_claude_code_hooks(project_root)), (install_all or args.gitignore, ".gitignore", lambda: ensure_gitignore(project_root)), + (install_all or args.mcp, ".mcp.json", lambda: register_mcp_json(project_root, args.agent_id)), ] failures = 0 diff --git a/src/legis/install.py b/src/legis/install.py index c336473..b994d6c 100644 --- a/src/legis/install.py +++ b/src/legis/install.py @@ -683,3 +683,67 @@ def ensure_gitignore(project_root: Path) -> tuple[bool, str]: _atomic_write_text(gitignore, _LEGIS_IGNORE_BLOCK.lstrip("\n")) return True, "Created .gitignore with legis config rules" + + +# --------------------------------------------------------------------------- +# .mcp.json (agent MCP server registration) +# --------------------------------------------------------------------------- + +_DEFAULT_AGENT_ID = "claude-code" + + +def _legis_mcp_entry(agent_id: str = _DEFAULT_AGENT_ID) -> dict[str, Any]: + """The canonical legis stdio server entry for .mcp.json.""" + cmd_parts = _find_legis_command() + command = cmd_parts[0] if len(cmd_parts) == 1 else shlex.join(cmd_parts) + return { + "args": ["mcp", "--agent-id", agent_id], + "command": command, + "env": {}, + "type": "stdio", + } + + +def register_mcp_json( + project_root: Path, agent_id: str = _DEFAULT_AGENT_ID +) -> tuple[bool, str]: + """Register (or refresh) the legis server in /.mcp.json. + + Creates the file if absent; merges into mcpServers without disturbing + sibling entries. Preserves an existing legis entry's agent-id if it already + carries one (operator choice), refreshing only the command/args shape. 
+ """ + try: + path = project_path(project_root, ".mcp.json") + except UnsafeInstallPathError as exc: + return False, str(exc) + + data: dict[str, Any] = {} + if path.exists(): + try: + parsed = json.loads(path.read_text(encoding="utf-8")) + if isinstance(parsed, dict): + data = parsed + except (json.JSONDecodeError, OSError): + return False, ".mcp.json present but unreadable; fix or remove it by hand" + + servers = data.get("mcpServers") + if not isinstance(servers, dict): + servers = {} + data["mcpServers"] = servers + + existing = servers.get("legis") + keep_agent = agent_id + if isinstance(existing, dict): + args = existing.get("args", []) + if isinstance(args, list) and "--agent-id" in args: + i = args.index("--agent-id") + if i + 1 < len(args) and isinstance(args[i + 1], str): + keep_agent = args[i + 1] + + desired = _legis_mcp_entry(keep_agent) + if existing == desired: + return True, "legis already registered in .mcp.json" + servers["legis"] = desired + _atomic_write_text(path, json.dumps(data, indent=2, sort_keys=True) + "\n") + return True, "Registered legis server in .mcp.json" diff --git a/tests/test_install.py b/tests/test_install.py index 5329e0e..1505919 100644 --- a/tests/test_install.py +++ b/tests/test_install.py @@ -578,6 +578,45 @@ def test_hook_cmd_matches(command, expected): assert install._hook_cmd_matches(command, "legis session-context") is expected +# --------------------------------------------------------------------------- +# register_mcp_json +# --------------------------------------------------------------------------- + + +def test_register_mcp_json_creates_file_with_legis_entry(tmp_path): + from legis.install import register_mcp_json, _legis_mcp_entry + + ok, msg = register_mcp_json(tmp_path) + assert ok, msg + data = json.loads((tmp_path / ".mcp.json").read_text()) + entry = data["mcpServers"]["legis"] + assert entry["type"] == "stdio" + assert entry["args"][0] == "mcp" + assert "--agent-id" in entry["args"] + + +def 
test_register_mcp_json_preserves_sibling_entries(tmp_path): + from legis.install import register_mcp_json + + (tmp_path / ".mcp.json").write_text( + json.dumps({"mcpServers": {"filigree": {"command": "x", "type": "stdio"}}}) + ) + ok, _ = register_mcp_json(tmp_path) + assert ok + data = json.loads((tmp_path / ".mcp.json").read_text()) + assert "filigree" in data["mcpServers"] + assert "legis" in data["mcpServers"] + + +def test_register_mcp_json_idempotent(tmp_path): + from legis.install import register_mcp_json + + register_mcp_json(tmp_path) + first = (tmp_path / ".mcp.json").read_text() + register_mcp_json(tmp_path) + assert (tmp_path / ".mcp.json").read_text() == first + + # --------------------------------------------------------------------------- # .gitignore # --------------------------------------------------------------------------- From 0517cd0d34c64d1daa4e8d6827af4b6cb944f4ff Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:23:57 +1000 Subject: [PATCH 58/72] feat(doctor): .mcp.json registration check + repair Co-Authored-By: Claude Sonnet 4.6 --- src/legis/doctor.py | 31 ++++++++++++++++++++++++++++- tests/test_doctor.py | 47 ++++++++++++++++++++++++++++++++++++++------ 2 files changed, 71 insertions(+), 7 deletions(-) diff --git a/src/legis/doctor.py b/src/legis/doctor.py index 5cb328b..75cede4 100644 --- a/src/legis/doctor.py +++ b/src/legis/doctor.py @@ -56,11 +56,40 @@ def render_text(checks: list[DoctorCheck]) -> str: return "\n".join(lines) +def check_mcp_json(root: Path, *, repair: bool) -> DoctorCheck: + """Check that `.mcp.json` exists and has a `legis` server entry.""" + cid = "install.mcp_json" + path = root / ".mcp.json" + present = False + if path.exists(): + try: + data = json.loads(path.read_text(encoding="utf-8")) + present = ( + isinstance(data, dict) + and isinstance(data.get("mcpServers"), dict) + and "legis" in data["mcpServers"] + ) + except (json.JSONDecodeError, 
OSError): + present = False + if present: + return DoctorCheck(cid, "ok") + if repair: + from legis.install import register_mcp_json + + ok, msg = register_mcp_json(root) + if ok: + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck( + cid, "error", message="legis server not registered (run: legis install --mcp)" + ) + + def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]: """Run every check against *root*. Repairs run inside individual checks when *repair* is True; each returned check reflects post-repair state.""" checks: list[DoctorCheck] = [] - # Check functions are appended here in later tasks. + checks.append(check_mcp_json(root, repair=repair)) return checks diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 00924e5..5313dc5 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -3,7 +3,7 @@ import json from legis.cli import main as cli_main -from legis.doctor import DoctorCheck, render_json, render_text, run_doctor +from legis.doctor import DoctorCheck, check_mcp_json, render_json, render_text, run_doctor def test_doctorcheck_to_dict_omits_empty_message(): @@ -31,29 +31,64 @@ def test_render_text_lists_only_problems_when_healthy_says_ok(): assert "legis doctor: ok" not in out -def test_run_doctor_empty_is_healthy(tmp_path, capsys): - # With no checks registered yet, an empty list renders healthy, exit 0. +def test_run_doctor_healthy_when_mcp_present(tmp_path, capsys): + # A project with .mcp.json registered renders healthy, exit 0. 
+ from legis.install import register_mcp_json + + register_mcp_json(tmp_path) rc = run_doctor(tmp_path, repair=False, fmt="text") assert rc == 0 assert "legis doctor: ok" in capsys.readouterr().out def test_run_doctor_json_format(tmp_path, capsys): + from legis.install import register_mcp_json + + register_mcp_json(tmp_path) rc = run_doctor(tmp_path, repair=False, fmt="json") assert rc == 0 payload = json.loads(capsys.readouterr().out) - assert payload == {"ok": True, "checks": [], "next_actions": []} + assert payload["ok"] is True + assert payload["next_actions"] == [] def test_cli_doctor_runs_and_exits_zero(tmp_path, capsys, monkeypatch): monkeypatch.chdir(tmp_path) - rc = cli_main(["doctor"]) + rc = cli_main(["doctor", "--repair"]) assert rc == 0 assert "legis doctor: ok" in capsys.readouterr().out def test_cli_doctor_json(tmp_path, capsys, monkeypatch): monkeypatch.chdir(tmp_path) - rc = cli_main(["doctor", "--format", "json"]) + rc = cli_main(["doctor", "--repair", "--format", "json"]) assert rc == 0 assert json.loads(capsys.readouterr().out)["ok"] is True + + +# --------------------------------------------------------------------------- +# check_mcp_json +# --------------------------------------------------------------------------- + + +def test_mcp_json_absent_is_error(tmp_path): + c = check_mcp_json(tmp_path, repair=False) + assert c.id == "install.mcp_json" + assert c.status == "error" + assert c.fixed is False + + +def test_mcp_json_repair_fixes_it(tmp_path): + c = check_mcp_json(tmp_path, repair=True) + assert c.status == "ok" + assert c.fixed is True + assert (tmp_path / ".mcp.json").exists() + + +def test_mcp_json_present_is_ok(tmp_path): + from legis.install import register_mcp_json + + register_mcp_json(tmp_path) + c = check_mcp_json(tmp_path, repair=False) + assert c.status == "ok" + assert c.fixed is False From fc38a6d1be703cc5249ac4a8de59f07c01975213 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: 
Sun, 7 Jun 2026 19:25:46 +1000 Subject: [PATCH 59/72] fix(install): split command/args in .mcp.json entry for module fallback Co-Authored-By: Claude Sonnet 4.6 --- src/legis/install.py | 15 ++++++++++----- tests/test_install.py | 7 +++++++ 2 files changed, 17 insertions(+), 5 deletions(-) diff --git a/src/legis/install.py b/src/legis/install.py index b994d6c..a194c94 100644 --- a/src/legis/install.py +++ b/src/legis/install.py @@ -693,12 +693,17 @@ def ensure_gitignore(project_root: Path) -> tuple[bool, str]: def _legis_mcp_entry(agent_id: str = _DEFAULT_AGENT_ID) -> dict[str, Any]: - """The canonical legis stdio server entry for .mcp.json.""" - cmd_parts = _find_legis_command() - command = cmd_parts[0] if len(cmd_parts) == 1 else shlex.join(cmd_parts) + """The canonical legis stdio server entry for .mcp.json. + + Splits the resolved invocation into a bare ``command`` (the executable an + MCP client execs directly) plus ``args`` so the module-fallback form + (`` -P -m legis ...``) launches correctly — a single joined string + in ``command`` would not be exec'd as separate argv tokens. 
+ """ + cmd = _find_legis_command() return { - "args": ["mcp", "--agent-id", agent_id], - "command": command, + "args": cmd[1:] + ["mcp", "--agent-id", agent_id], + "command": cmd[0], "env": {}, "type": "stdio", } diff --git a/tests/test_install.py b/tests/test_install.py index 1505919..cbb6fba 100644 --- a/tests/test_install.py +++ b/tests/test_install.py @@ -617,6 +617,13 @@ def test_register_mcp_json_idempotent(tmp_path): assert (tmp_path / ".mcp.json").read_text() == first +def test_legis_mcp_entry_module_fallback_splits_command_and_args(monkeypatch): + monkeypatch.setattr(install, "_find_legis_command", lambda: ["/usr/bin/python3", "-P", "-m", "legis"]) + entry = install._legis_mcp_entry("claude-code") + assert entry["command"] == "/usr/bin/python3" + assert entry["args"] == ["-P", "-m", "legis", "mcp", "--agent-id", "claude-code"] + + # --------------------------------------------------------------------------- # .gitignore # --------------------------------------------------------------------------- From ae99cdf29b4adbc0dae0fd65e64ec5b38e282db7 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:32:53 +1000 Subject: [PATCH 60/72] fix(install): let explicit --agent-id win; guard non-dict .mcp.json Co-Authored-By: Claude Sonnet 4.6 --- src/legis/cli.py | 5 +++-- src/legis/install.py | 30 ++++++++++++++++++------------ tests/test_install.py | 33 +++++++++++++++++++++++++++++++++ 3 files changed, 54 insertions(+), 14 deletions(-) diff --git a/src/legis/cli.py b/src/legis/cli.py index f281881..57d4983 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -155,8 +155,9 @@ def build_parser() -> argparse.ArgumentParser: install.add_argument("--gitignore", action="store_true", help="Add legis config rules to .gitignore only") install.add_argument("--mcp", action="store_true", help="Register the legis MCP server in .mcp.json only") install.add_argument( - "--agent-id", default="claude-code", - 
help="Agent id stamped in the .mcp.json legis entry (default: claude-code)", + "--agent-id", default=None, + help="Agent id stamped in the .mcp.json legis entry " + "(default: claude-code, or preserve an existing entry's id)", ) subparsers.add_parser( diff --git a/src/legis/install.py b/src/legis/install.py index a194c94..563d791 100644 --- a/src/legis/install.py +++ b/src/legis/install.py @@ -710,13 +710,15 @@ def _legis_mcp_entry(agent_id: str = _DEFAULT_AGENT_ID) -> dict[str, Any]: def register_mcp_json( - project_root: Path, agent_id: str = _DEFAULT_AGENT_ID + project_root: Path, agent_id: str | None = None ) -> tuple[bool, str]: """Register (or refresh) the legis server in /.mcp.json. Creates the file if absent; merges into mcpServers without disturbing - sibling entries. Preserves an existing legis entry's agent-id if it already - carries one (operator choice), refreshing only the command/args shape. + sibling entries. An explicit *agent_id* always wins; when it is ``None`` + (the default), an existing legis entry's agent-id is preserved (operator + choice), falling back to ``_DEFAULT_AGENT_ID`` for a fresh entry. Refreshes + only the command/args shape otherwise. 
""" try: path = project_path(project_root, ".mcp.json") @@ -727,10 +729,11 @@ def register_mcp_json( if path.exists(): try: parsed = json.loads(path.read_text(encoding="utf-8")) - if isinstance(parsed, dict): - data = parsed except (json.JSONDecodeError, OSError): return False, ".mcp.json present but unreadable; fix or remove it by hand" + if not isinstance(parsed, dict): + return False, ".mcp.json present but not a JSON object; fix or remove it by hand" + data = parsed servers = data.get("mcpServers") if not isinstance(servers, dict): @@ -738,13 +741,16 @@ def register_mcp_json( data["mcpServers"] = servers existing = servers.get("legis") - keep_agent = agent_id - if isinstance(existing, dict): - args = existing.get("args", []) - if isinstance(args, list) and "--agent-id" in args: - i = args.index("--agent-id") - if i + 1 < len(args) and isinstance(args[i + 1], str): - keep_agent = args[i + 1] + if agent_id is not None: + keep_agent = agent_id # explicit caller wins + else: + keep_agent = _DEFAULT_AGENT_ID # default... 
+ if isinstance(existing, dict): # ...but preserve an existing entry's id + args = existing.get("args", []) + if isinstance(args, list) and "--agent-id" in args: + i = args.index("--agent-id") + if i + 1 < len(args) and isinstance(args[i + 1], str): + keep_agent = args[i + 1] desired = _legis_mcp_entry(keep_agent) if existing == desired: diff --git a/tests/test_install.py b/tests/test_install.py index cbb6fba..19e0ed4 100644 --- a/tests/test_install.py +++ b/tests/test_install.py @@ -624,6 +624,39 @@ def test_legis_mcp_entry_module_fallback_splits_command_and_args(monkeypatch): assert entry["args"] == ["-P", "-m", "legis", "mcp", "--agent-id", "claude-code"] +def test_register_mcp_json_explicit_agent_id_wins_over_existing(tmp_path): + from legis.install import register_mcp_json + + register_mcp_json(tmp_path, "claude-code") + register_mcp_json(tmp_path, "new-bot") + data = json.loads((tmp_path / ".mcp.json").read_text()) + args = data["mcpServers"]["legis"]["args"] + i = args.index("--agent-id") + assert args[i + 1] == "new-bot" + + +def test_register_mcp_json_default_preserves_existing_agent_id(tmp_path): + from legis.install import register_mcp_json + + register_mcp_json(tmp_path, "operator-pick") + register_mcp_json(tmp_path) # default (None) → preserve operator choice + data = json.loads((tmp_path / ".mcp.json").read_text()) + args = data["mcpServers"]["legis"]["args"] + i = args.index("--agent-id") + assert args[i + 1] == "operator-pick" + + +def test_register_mcp_json_non_dict_top_level_is_rejected_unchanged(tmp_path): + from legis.install import register_mcp_json + + mcp = tmp_path / ".mcp.json" + mcp.write_text("[]") + ok, msg = register_mcp_json(tmp_path) + assert ok is False + assert "not a JSON object" in msg + assert mcp.read_text() == "[]" + + # --------------------------------------------------------------------------- # .gitignore # --------------------------------------------------------------------------- From 
71fbcd17dfa4c370062e93721f970e9cce0bb711 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:36:22 +1000 Subject: [PATCH 61/72] feat(doctor): install-wiring checks (blocks, skills, hook, gitignore) Co-Authored-By: Claude Sonnet 4.6 --- src/legis/doctor.py | 119 +++++++++++++++++++++++++++++++++++++++++++ tests/test_doctor.py | 61 ++++++++++++++++++---- 2 files changed, 171 insertions(+), 9 deletions(-) diff --git a/src/legis/doctor.py b/src/legis/doctor.py index 75cede4..48ec488 100644 --- a/src/legis/doctor.py +++ b/src/legis/doctor.py @@ -12,6 +12,8 @@ from pathlib import Path from typing import Any +from legis import install as _install + @dataclass(frozen=True, slots=True) class DoctorCheck: @@ -85,10 +87,127 @@ def check_mcp_json(root: Path, *, repair: bool) -> DoctorCheck: ) +# --------------------------------------------------------------------------- +# Install-wiring checks (Task 6) +# --------------------------------------------------------------------------- + + +def _block_fresh(root: Path, filename: str) -> bool: + """True iff / has the legis block at the current token.""" + path = root / filename + if not path.exists(): + return False + try: + content = path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + return False + if _install.INSTRUCTIONS_MARKER not in content: + return False + return _install._extract_marker_token(content) == _install._marker_token() + + +def check_instruction_block(root: Path, filename: str, *, repair: bool) -> DoctorCheck: + """Check that / has the legis instruction block at the current token.""" + cid = "install.claude_md" if filename == "CLAUDE.md" else "install.agents_md" + if _block_fresh(root, filename): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = _install.inject_instructions(root / filename) + if ok and _block_fresh(root, filename): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", 
message=msg) + missing = "missing" if not (root / filename).exists() else "block missing or drifted" + return DoctorCheck(cid, "error", message=f"{filename} {missing} (run: legis install)") + + +def _skill_fresh(root: Path, base: str) -> bool: + """True iff the skill pack under //skills/ matches the source fingerprint.""" + source = _install._get_skills_source_dir() / _install.SKILL_NAME + target = root / base / "skills" / _install.SKILL_NAME + if not source.is_dir() or not target.is_dir(): + return False + return _install._skill_tree_fingerprint(target) == _install._skill_tree_fingerprint(source) + + +def check_skill_pack(root: Path, base: str, *, repair: bool) -> DoctorCheck: + """Check that the legis skill pack under //skills/ is present and fresh.""" + cid = "install.claude_skill" if base == ".claude" else "install.agents_skill" + installer = _install.install_skills if base == ".claude" else _install.install_codex_skills + if _skill_fresh(root, base): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = installer(root) + if ok and _skill_fresh(root, base): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck( + cid, + "error", + message=f"{base}/skills/{_install.SKILL_NAME} missing or drifted (run: legis install)", + ) + + +def _hook_present(root: Path) -> bool: + """True iff the SessionStart hook is registered in .claude/settings.json.""" + settings_path = root / ".claude" / "settings.json" + if not settings_path.exists(): + return False + try: + settings = json.loads(settings_path.read_text(encoding="utf-8")) + except (json.JSONDecodeError, OSError): + return False + return _install._has_unscoped_session_start_hook(settings, _install.SESSION_CONTEXT_COMMAND) + + +def check_hook(root: Path, *, repair: bool) -> DoctorCheck: + """Check that the legis SessionStart hook is registered.""" + cid = "install.hook" + if _hook_present(root): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = 
_install.install_claude_code_hooks(root) + if ok and _hook_present(root): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck(cid, "error", message="SessionStart hook not registered (run: legis install)") + + +def _gitignore_present(root: Path) -> bool: + """True iff all legis ignore rules are present in .gitignore.""" + path = root / ".gitignore" + if not path.exists(): + return False + try: + content = path.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + return False + present = {ln.strip() for ln in content.splitlines() if ln.strip() and not ln.lstrip().startswith("#")} + return all(rule in present for rule in _install._LEGIS_IGNORE_RULES) + + +def check_gitignore(root: Path, *, repair: bool) -> DoctorCheck: + """Check that legis .gitignore rules are present.""" + cid = "install.gitignore" + if _gitignore_present(root): + return DoctorCheck(cid, "ok") + if repair: + ok, msg = _install.ensure_gitignore(root) + if ok and _gitignore_present(root): + return DoctorCheck(cid, "ok", fixed=True) + return DoctorCheck(cid, "error", message=msg) + return DoctorCheck(cid, "error", message=".weft/legis/ not in .gitignore (run: legis install)") + + def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]: """Run every check against *root*. 
Repairs run inside individual checks when *repair* is True; each returned check reflects post-repair state.""" checks: list[DoctorCheck] = [] + checks.append(check_instruction_block(root, "CLAUDE.md", repair=repair)) + checks.append(check_instruction_block(root, "AGENTS.md", repair=repair)) + checks.append(check_skill_pack(root, ".claude", repair=repair)) + checks.append(check_skill_pack(root, ".agents", repair=repair)) + checks.append(check_hook(root, repair=repair)) + checks.append(check_gitignore(root, repair=repair)) checks.append(check_mcp_json(root, repair=repair)) return checks diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 5313dc5..22ef0fb 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -3,7 +3,18 @@ import json from legis.cli import main as cli_main -from legis.doctor import DoctorCheck, check_mcp_json, render_json, render_text, run_doctor +from legis.doctor import ( + DoctorCheck, + check_gitignore, + check_hook, + check_instruction_block, + check_mcp_json, + check_skill_pack, + render_json, + render_text, + run_doctor, +) +from legis import install as legis_install def test_doctorcheck_to_dict_omits_empty_message(): @@ -31,20 +42,18 @@ def test_render_text_lists_only_problems_when_healthy_says_ok(): assert "legis doctor: ok" not in out -def test_run_doctor_healthy_when_mcp_present(tmp_path, capsys): - # A project with .mcp.json registered renders healthy, exit 0. - from legis.install import register_mcp_json - - register_mcp_json(tmp_path) +def test_run_doctor_healthy_after_repair(tmp_path, capsys): + # A project repaired via run_doctor renders healthy on re-check, exit 0. 
+ run_doctor(tmp_path, repair=True, fmt="text") + capsys.readouterr() # discard repair output rc = run_doctor(tmp_path, repair=False, fmt="text") assert rc == 0 assert "legis doctor: ok" in capsys.readouterr().out def test_run_doctor_json_format(tmp_path, capsys): - from legis.install import register_mcp_json - - register_mcp_json(tmp_path) + run_doctor(tmp_path, repair=True, fmt="json") + capsys.readouterr() # discard repair output rc = run_doctor(tmp_path, repair=False, fmt="json") assert rc == 0 payload = json.loads(capsys.readouterr().out) @@ -92,3 +101,37 @@ def test_mcp_json_present_is_ok(tmp_path): c = check_mcp_json(tmp_path, repair=False) assert c.status == "ok" assert c.fixed is False + + +# --------------------------------------------------------------------------- +# Task 6: install-wiring checks (blocks, skills, hook, gitignore) +# --------------------------------------------------------------------------- + + +def test_instruction_block_absent_is_error(tmp_path): + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=False) + assert c.id == "install.claude_md" + assert c.status == "error" + + +def test_instruction_block_repair_creates_it(tmp_path): + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=True) + assert c.status == "ok" + assert c.fixed is True + assert legis_install.INSTRUCTIONS_MARKER in (tmp_path / "CLAUDE.md").read_text() + + +def test_gitignore_absent_is_error_then_repaired(tmp_path): + assert check_gitignore(tmp_path, repair=False).status == "error" + fixed = check_gitignore(tmp_path, repair=True) + assert fixed.status == "ok" and fixed.fixed is True + assert ".weft/legis/" in (tmp_path / ".gitignore").read_text() + + +def test_skill_pack_absent_is_error(tmp_path): + assert check_skill_pack(tmp_path, ".claude", repair=False).status == "error" + + +def test_skill_pack_repair_installs(tmp_path): + c = check_skill_pack(tmp_path, ".claude", repair=True) + assert c.status == "ok" and c.fixed is True From 
8f11e0333485535638554469d32c90efcb544214 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:43:12 +1000 Subject: [PATCH 62/72] refactor(doctor): share gitignore predicate with install; test install-wiring drift Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/doctor.py | 17 ++------------ src/legis/install.py | 20 ++++++++++++++-- tests/test_doctor.py | 55 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 75 insertions(+), 17 deletions(-) diff --git a/src/legis/doctor.py b/src/legis/doctor.py index 48ec488..48d854f 100644 --- a/src/legis/doctor.py +++ b/src/legis/doctor.py @@ -172,27 +172,14 @@ def check_hook(root: Path, *, repair: bool) -> DoctorCheck: return DoctorCheck(cid, "error", message="SessionStart hook not registered (run: legis install)") -def _gitignore_present(root: Path) -> bool: - """True iff all legis ignore rules are present in .gitignore.""" - path = root / ".gitignore" - if not path.exists(): - return False - try: - content = path.read_text(encoding="utf-8") - except (OSError, UnicodeDecodeError): - return False - present = {ln.strip() for ln in content.splitlines() if ln.strip() and not ln.lstrip().startswith("#")} - return all(rule in present for rule in _install._LEGIS_IGNORE_RULES) - - def check_gitignore(root: Path, *, repair: bool) -> DoctorCheck: """Check that legis .gitignore rules are present.""" cid = "install.gitignore" - if _gitignore_present(root): + if _install.gitignore_rules_present(root): return DoctorCheck(cid, "ok") if repair: ok, msg = _install.ensure_gitignore(root) - if ok and _gitignore_present(root): + if ok and _install.gitignore_rules_present(root): return DoctorCheck(cid, "ok", fixed=True) return DoctorCheck(cid, "error", message=msg) return DoctorCheck(cid, "error", message=".weft/legis/ not in .gitignore (run: legis install)") diff --git a/src/legis/install.py b/src/legis/install.py index 563d791..0c8f587 100644 --- 
a/src/legis/install.py +++ b/src/legis/install.py @@ -657,6 +657,22 @@ def install_claude_code_hooks(project_root: Path) -> tuple[bool, str]: ) +def gitignore_rules_present(project_root: Path) -> bool: + """True iff every legis ignore rule is already a non-comment line in .gitignore.""" + try: + gitignore = project_path(project_root, ".gitignore") + except UnsafeInstallPathError: + return False + if not gitignore.exists(): + return False + try: + content = gitignore.read_text(encoding="utf-8") + except (OSError, UnicodeDecodeError): + return False + present = {ln.strip() for ln in content.splitlines() if ln.strip() and not ln.lstrip().startswith("#")} + return all(rule in present for rule in _LEGIS_IGNORE_RULES) + + def ensure_gitignore(project_root: Path) -> tuple[bool, str]: """Ensure legis's runtime-state subtree (``.weft/legis/``) is ignored.""" try: @@ -665,13 +681,13 @@ def ensure_gitignore(project_root: Path) -> tuple[bool, str]: return False, str(exc) if gitignore.exists(): + if gitignore_rules_present(project_root): + return True, "legis config already in .gitignore" content = gitignore.read_text(encoding="utf-8") present = { line.strip() for line in content.splitlines() if line.strip() and not line.lstrip().startswith("#") } missing = [rule for rule in _LEGIS_IGNORE_RULES if rule not in present] - if not missing: - return True, "legis config already in .gitignore" if not content.endswith("\n"): content += "\n" # Append only the rules that are actually absent — writing the whole diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 22ef0fb..3da2d6b 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -135,3 +135,58 @@ def test_skill_pack_absent_is_error(tmp_path): def test_skill_pack_repair_installs(tmp_path): c = check_skill_pack(tmp_path, ".claude", repair=True) assert c.status == "ok" and c.fixed is True + + +# --------------------------------------------------------------------------- +# Task 6 (drift): stale block / stale skill 
pack are the headline behavior +# --------------------------------------------------------------------------- + + +def test_instruction_block_stale_token_is_error_then_repaired(tmp_path): + # A real block with a mutated marker token: marker present, token mismatch. + legis_install.inject_instructions(tmp_path / "CLAUDE.md") + path = tmp_path / "CLAUDE.md" + content = path.read_text() + fresh_token = legis_install._marker_token() + stale = content.replace(f":{fresh_token} -->", ":v0:deadbeef -->", 1) + assert stale != content # the token really was rewritten + path.write_text(stale) + assert legis_install._extract_marker_token(stale) != fresh_token + + c = check_instruction_block(tmp_path, "CLAUDE.md", repair=False) + assert c.status == "error" + + fixed = check_instruction_block(tmp_path, "CLAUDE.md", repair=True) + assert fixed.status == "ok" + assert fixed.fixed is True + assert legis_install._extract_marker_token((tmp_path / "CLAUDE.md").read_text()) == fresh_token + + +def test_skill_pack_stale_fingerprint_is_error_then_repaired(tmp_path): + legis_install.install_skills(tmp_path) + pack = tmp_path / ".claude" / "skills" / legis_install.SKILL_NAME + # Mutate a file under the installed pack so its fingerprint diverges from source. 
+ skill_md = pack / "SKILL.md" + skill_md.write_text(skill_md.read_text() + "\n\n") + + c = check_skill_pack(tmp_path, ".claude", repair=False) + assert c.status == "error" + + fixed = check_skill_pack(tmp_path, ".claude", repair=True) + assert fixed.status == "ok" + assert fixed.fixed is True + + +# --------------------------------------------------------------------------- +# Task 6: hook check +# --------------------------------------------------------------------------- + + +def test_hook_absent_is_error_then_repaired(tmp_path): + c = check_hook(tmp_path, repair=False) + assert c.id == "install.hook" + assert c.status == "error" + + fixed = check_hook(tmp_path, repair=True) + assert fixed.status == "ok" + assert fixed.fixed is True From e9e92a04ef5875f4940caf45233213f13b4fbd8e Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:48:50 +1000 Subject: [PATCH 63/72] feat(doctor): config & store checks (weft.toml report-only, store dir, db overrides, legacy) Adds check_weft_toml (C-9(b): never writes, absent=ok, malformed=error), check_store_dir (absent=ok, present-unwritable=error, --repair creates), check_db_overrides (validate LEGIS_*_DB env URL syntax), and check_legacy_stray_db (warn on legis-*.db at repo root). All registered in collect_checks. 
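
The store-dir states listed above (absent=ok, present-unwritable=error, --repair creates) can be sketched in isolation as follows. This is a minimal illustration, not the patch's code: `Check` is a hypothetical stand-in for `DoctorCheck`, and `store_dir_check` takes the already-resolved directory as an argument instead of reading config.

```python
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Check:
    """Hypothetical stand-in for the patch's DoctorCheck (illustration only)."""

    id: str
    status: str  # "ok" | "warn" | "error"
    message: str = ""
    fixed: bool = False


def nearest_existing(path: Path) -> Path:
    """Walk upward until an existing ancestor (or the filesystem root) is found."""
    p = path
    while not p.exists() and p != p.parent:
        p = p.parent
    return p


def store_dir_check(store_dir: Path, *, repair: bool = False) -> Check:
    """Classify a store dir: present, absent-but-creatable, or broken."""
    cid = "store.dir"
    if store_dir.exists():
        # Present: the only failure mode is a directory we cannot write to.
        if not os.access(store_dir, os.W_OK):
            return Check(cid, "error", f"{store_dir} not writable")
        return Check(cid, "ok")
    if repair:
        # Explicit operator action: create the directory and report the fix.
        try:
            store_dir.mkdir(parents=True, exist_ok=True)
        except OSError as exc:
            return Check(cid, "error", f"cannot create {store_dir}: {exc}")
        return Check(cid, "ok", fixed=True)
    # Absent without repair: healthy only if lazy creation could succeed later.
    anchor = nearest_existing(store_dir)
    if not os.access(anchor, os.W_OK):
        return Check(cid, "error", f"{store_dir} not creatable ({anchor} not writable)")
    return Check(cid, "ok", "absent (created on first store open)")


def demo() -> list[Check]:
    """Exercise absent, repaired, and present states in a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / ".weft" / "legis"
        absent = store_dir_check(target)
        repaired = store_dir_check(target, repair=True)
        present = store_dir_check(target)
    return [absent, repaired, present]
```

The unwritable branches are not exercised by `demo()`: `os.access` reports every directory as writable when run as root, so that state is hard to reproduce portably.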
Co-Authored-By: Claude Sonnet 4.6 --- src/legis/doctor.py | 97 ++++++++++++++++++++++++++++++++++++++++++++ tests/test_doctor.py | 45 ++++++++++++++++++++ 2 files changed, 142 insertions(+) diff --git a/src/legis/doctor.py b/src/legis/doctor.py index 48d854f..0d80d1e 100644 --- a/src/legis/doctor.py +++ b/src/legis/doctor.py @@ -8,10 +8,14 @@ from __future__ import annotations import json +import os +import tomllib from dataclasses import dataclass from pathlib import Path from typing import Any +from sqlalchemy.engine import make_url + from legis import install as _install @@ -185,6 +189,95 @@ def check_gitignore(root: Path, *, repair: bool) -> DoctorCheck: return DoctorCheck(cid, "error", message=".weft/legis/ not in .gitignore (run: legis install)") +# --------------------------------------------------------------------------- +# Task 7: config & store checks +# --------------------------------------------------------------------------- + +_DB_OVERRIDE_ENVS = ("LEGIS_CHECK_DB", "LEGIS_GOVERNANCE_DB", "LEGIS_BINDING_DB", "LEGIS_PULL_DB") +_LEGACY_DB_NAMES = ("legis-checks.db", "legis-governance.db", "legis-binding.db", "legis-pulls.db") + + +def check_weft_toml(root: Path) -> DoctorCheck: + """Report-only (C-9(b)): NEVER writes weft.toml. 
Distinguishes ABSENT (ok — + defaults intentional) from PRESENT-BUT-BROKEN (error — config silently not + applying), restoring the operator signal that C-9(c) silences at runtime.""" + cid = "config.weft_toml" + path = root / "weft.toml" + if not path.exists(): + return DoctorCheck(cid, "ok", message="absent (built-in defaults)") + try: + data = tomllib.loads(path.read_text(encoding="utf-8")) + except (tomllib.TOMLDecodeError, OSError, UnicodeDecodeError) as exc: + return DoctorCheck( + cid, + "error", + message=f"present but unparseable; [legis] silently not applying ({exc})", + ) + table = data.get("legis") + if table is not None and not isinstance(table, dict): + return DoctorCheck(cid, "error", message="[legis] in weft.toml must be a table") + return DoctorCheck(cid, "ok") + + +def _nearest_existing(path: Path) -> Path: + p = path + while not p.exists() and p != p.parent: + p = p.parent + return p + + +def check_store_dir(root: Path, *, repair: bool = False) -> DoctorCheck: + """An absent .weft/legis/ is ok (created lazily). A present-but-unwritable + dir is an error. 
--repair ensures the dir exists (explicit operator action).""" + cid = "store.dir" + from legis import config + + store_dir_rel = config._store_dir() + store_dir = store_dir_rel if store_dir_rel.is_absolute() else (root / store_dir_rel) + if store_dir.exists(): + if not os.access(store_dir, os.W_OK): + return DoctorCheck(cid, "error", message=f"{store_dir} not writable") + return DoctorCheck(cid, "ok") + if repair: + try: + store_dir.mkdir(parents=True, exist_ok=True) + return DoctorCheck(cid, "ok", fixed=True) + except OSError as exc: + return DoctorCheck(cid, "error", message=f"cannot create {store_dir}: {exc}") + anchor = _nearest_existing(store_dir) + if not os.access(anchor, os.W_OK): + return DoctorCheck(cid, "error", message=f"{store_dir} not creatable ({anchor} not writable)") + return DoctorCheck(cid, "ok", message="absent (created on first store open)") + + +def check_db_overrides(root: Path) -> DoctorCheck: # noqa: ARG001 + cid = "store.db_overrides" + bad = [] + for env in _DB_OVERRIDE_ENVS: + val = os.environ.get(env) + if not val: + continue + try: + make_url(val) + except Exception: # noqa: BLE001 — any parse failure is a bad override + bad.append(env) + if bad: + return DoctorCheck(cid, "error", message="invalid URL in: " + ", ".join(bad)) + return DoctorCheck(cid, "ok") + + +def check_legacy_stray_db(root: Path) -> DoctorCheck: + cid = "store.legacy_stray" + stray = [n for n in _LEGACY_DB_NAMES if (root / n).is_file()] + if stray: + return DoctorCheck( + cid, + "warn", + message="legacy DB at repo root (move to .weft/legis/): " + ", ".join(stray), + ) + return DoctorCheck(cid, "ok") + + def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]: """Run every check against *root*. 
Repairs run inside individual checks when *repair* is True; each returned check reflects post-repair state.""" @@ -196,6 +289,10 @@ def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]: checks.append(check_hook(root, repair=repair)) checks.append(check_gitignore(root, repair=repair)) checks.append(check_mcp_json(root, repair=repair)) + checks.append(check_weft_toml(root)) + checks.append(check_store_dir(root, repair=repair)) + checks.append(check_db_overrides(root)) + checks.append(check_legacy_stray_db(root)) return checks diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 3da2d6b..40a21ba 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -190,3 +190,48 @@ def test_hook_absent_is_error_then_repaired(tmp_path): fixed = check_hook(tmp_path, repair=True) assert fixed.status == "ok" assert fixed.fixed is True + + +# --------------------------------------------------------------------------- +# Task 7: config & store checks (weft.toml report-only, store dir, db overrides, legacy) +# --------------------------------------------------------------------------- + + +from legis.doctor import check_weft_toml, check_store_dir, check_db_overrides, check_legacy_stray_db + + +def test_weft_toml_absent_is_ok(tmp_path): + assert check_weft_toml(tmp_path).status == "ok" + + +def test_weft_toml_valid_legis_table_is_ok(tmp_path): + (tmp_path / "weft.toml").write_text('[legis]\nstore_dir = ".weft/legis"\n') + assert check_weft_toml(tmp_path).status == "ok" + + +def test_weft_toml_malformed_is_error_and_unchanged(tmp_path): + wt = tmp_path / "weft.toml" + wt.write_text("[legis]\nstore_dir = \n") # malformed TOML + before = wt.read_text() + c = check_weft_toml(tmp_path) + assert c.status == "error" + assert wt.read_text() == before # C-9(b): never written + + +def test_weft_toml_legis_not_a_table_is_error(tmp_path): + (tmp_path / "weft.toml").write_text('legis = "oops"\n') + assert check_weft_toml(tmp_path).status == "error" + + +def 
test_store_dir_writable_parent_is_ok(tmp_path): + assert check_store_dir(tmp_path).status == "ok" + + +def test_db_override_bad_url_is_error(tmp_path, monkeypatch): + monkeypatch.setenv("LEGIS_GOVERNANCE_DB", "::not a url::") + assert check_db_overrides(tmp_path).status == "error" + + +def test_legacy_stray_db_is_warn(tmp_path): + (tmp_path / "legis-governance.db").write_text("x") + assert check_legacy_stray_db(tmp_path).status == "warn" From f0025785a39e5e4b88d25816649a7661612fa7e8 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 19:50:12 +1000 Subject: [PATCH 64/72] feat(doctor): governance-chain integrity + runtime/sibling checks Adds check_audit_chain (absent DB=ok, no-leak: never creates DB file; tampered chain=error, report-only), _store_url helper (resolves DB paths relative to root, not cwd, so tests are cwd-agnostic), check_hmac_key (presence-only, never renders value; warns if protected policies configured without key), and check_sibling_url (validates LOOMWEAVE_API_URL/FILIGREE_API_URL as http(s)). All registered in collect_checks. 
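
The sibling-URL rule above (http(s) scheme plus a non-empty host, unset is fine) can be sketched as below. `sibling_url_status` is a hypothetical helper for illustration; unlike `check_sibling_url` it takes the environment as a plain dict and returns a `(status, message)` tuple so it can be exercised without touching `os.environ`.

```python
from urllib.parse import urlsplit


def sibling_url_status(env: dict[str, str], name: str) -> tuple[str, str]:
    """Return (status, message) for one sibling URL variable.

    `env` is passed in explicitly rather than read from os.environ; that is
    a choice made for this example only.
    """
    url = env.get(name)
    if not url:
        # Unset (or empty) means the sibling integration is simply not in use.
        return "ok", "not configured"
    parsed = urlsplit(url)
    # Require both an http(s) scheme and a network location; a bare
    # "host:port" has no netloc, so it is rejected.
    if parsed.scheme.lower() in {"http", "https"} and parsed.netloc:
        return "ok", ""
    return "error", f"{name} invalid URL: {url!r}"
```

A value such as `localhost:9620` is the typical mistake this catches: `urlsplit` never yields a netloc without a `//` prefix, so the check reports an error rather than letting a client fail later.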
Co-Authored-By: Claude Sonnet 4.6 --- src/legis/doctor.py | 76 ++++++++++++++++++++++++++++++++++++++++++++ tests/test_doctor.py | 44 +++++++++++++++++++++++++ 2 files changed, 120 insertions(+) diff --git a/src/legis/doctor.py b/src/legis/doctor.py index 0d80d1e..0777c8a 100644 --- a/src/legis/doctor.py +++ b/src/legis/doctor.py @@ -13,6 +13,7 @@ from dataclasses import dataclass from pathlib import Path from typing import Any +from urllib.parse import urlsplit from sqlalchemy.engine import make_url @@ -278,6 +279,76 @@ def check_legacy_stray_db(root: Path) -> DoctorCheck: return DoctorCheck(cid, "ok") +# --------------------------------------------------------------------------- +# Task 8: governance integrity + runtime/sibling checks +# --------------------------------------------------------------------------- + + +def _store_url(root: Path, db_name: str, env: str) -> str: + """Resolve a store URL relative to *root* (not cwd). Respects the LEGIS_*_DB + env-var override first; otherwise constructs a file URL under the configured + store_dir joined to root so tests using a tmp_path don't touch the real repo.""" + val = os.environ.get(env) + if val: + return val + from legis import config + + store_dir = config._store_dir() + base = store_dir if store_dir.is_absolute() else (root / store_dir) + return "sqlite:///" + (base / db_name).as_posix() + + +def check_audit_chain(cid: str, url: str) -> DoctorCheck: + """Report-only. Absent file store => ok (nothing to verify; must NOT create + the DB). 
A tampered chain => error (cannot/must not be auto-repaired).""" + try: + parsed = make_url(url) + except Exception: # noqa: BLE001 + return DoctorCheck(cid, "ok", message="store URL not a file store") + db = parsed.database + if not str(parsed.drivername).startswith("sqlite") or not db or db == ":memory:": + return DoctorCheck(cid, "ok", message="not a file store") + if not Path(db).exists(): + return DoctorCheck(cid, "ok", message="no store yet") + from legis.store.audit_store import AuditStore + + try: + intact = AuditStore(url).verify_integrity() + except Exception as exc: # noqa: BLE001 — surface any verify failure, never raise from doctor + return DoctorCheck(cid, "error", message=f"integrity check failed: {exc}") + if intact: + return DoctorCheck(cid, "ok") + return DoctorCheck( + cid, "error", message="hash chain verification FAILED (report-only; cannot repair)" + ) + + +def check_hmac_key(root: Path) -> DoctorCheck: # noqa: ARG001 + """Presence-only; NEVER renders the key value.""" + cid = "runtime.hmac_key" + from legis import config + + if not config.protected_policies(): + return DoctorCheck(cid, "ok", message="no protected policies configured") + if os.environ.get("LEGIS_HMAC_KEY"): + return DoctorCheck(cid, "ok") + return DoctorCheck( + cid, + "warn", + message="protected policies configured but LEGIS_HMAC_KEY not set; protected submissions will fail", + ) + + +def check_sibling_url(cid: str, env: str) -> DoctorCheck: + url = os.environ.get(env) + if not url: + return DoctorCheck(cid, "ok", message="not configured") + parsed = urlsplit(url) + if parsed.scheme.lower() in {"http", "https"} and parsed.netloc: + return DoctorCheck(cid, "ok") + return DoctorCheck(cid, "error", message=f"{env} invalid URL: {url!r}") + + def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]: """Run every check against *root*. 
Repairs run inside individual checks when *repair* is True; each returned check reflects post-repair state.""" @@ -293,6 +364,11 @@ def collect_checks(root: Path, *, repair: bool) -> list[DoctorCheck]: checks.append(check_store_dir(root, repair=repair)) checks.append(check_db_overrides(root)) checks.append(check_legacy_stray_db(root)) + checks.append(check_audit_chain("store.governance_chain", _store_url(root, "legis-governance.db", "LEGIS_GOVERNANCE_DB"))) + checks.append(check_audit_chain("store.binding_chain", _store_url(root, "legis-binding.db", "LEGIS_BINDING_DB"))) + checks.append(check_hmac_key(root)) + checks.append(check_sibling_url("runtime.loomweave_url", "LOOMWEAVE_API_URL")) + checks.append(check_sibling_url("runtime.filigree_url", "FILIGREE_API_URL")) return checks diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 40a21ba..9980d9c 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -235,3 +235,47 @@ def test_db_override_bad_url_is_error(tmp_path, monkeypatch): def test_legacy_stray_db_is_warn(tmp_path): (tmp_path / "legis-governance.db").write_text("x") assert check_legacy_stray_db(tmp_path).status == "warn" + + +# --------------------------------------------------------------------------- +# Task 8: governance integrity + runtime/sibling checks +# --------------------------------------------------------------------------- + + +from legis.doctor import check_audit_chain, check_hmac_key, check_sibling_url + + +def test_audit_chain_absent_db_is_ok(tmp_path): + c = check_audit_chain("store.governance_chain", "sqlite:///" + str(tmp_path / "nope.db")) + assert c.status == "ok" + # No-leak invariant: must NOT create the file + assert not (tmp_path / "nope.db").exists() + + +def test_audit_chain_intact_db_is_ok(tmp_path): + from legis.store.audit_store import AuditStore + + url = "sqlite:///" + str(tmp_path / "gov.db") + AuditStore(url) # creates schema + assert check_audit_chain("store.governance_chain", url).status == "ok" + + +def 
test_hmac_key_warn_when_protected_set_without_key(tmp_path, monkeypatch): + monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read") + monkeypatch.delenv("LEGIS_HMAC_KEY", raising=False) + c = check_hmac_key(tmp_path) + assert c.status == "warn" + + +def test_hmac_key_never_prints_value(tmp_path, monkeypatch): + monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read") + monkeypatch.setenv("LEGIS_HMAC_KEY", "super-secret-value") + c = check_hmac_key(tmp_path) + assert c.status == "ok" + assert "super-secret-value" not in (c.message or "") + + +def test_sibling_url_invalid_is_error(tmp_path, monkeypatch): + monkeypatch.setenv("LOOMWEAVE_API_URL", "localhost:9620") # no scheme + c = check_sibling_url("runtime.loomweave_url", "LOOMWEAVE_API_URL") + assert c.status == "error" From 6b77ff368d484b64711bedb464b397c51ba15c5c Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 20:01:19 +1000 Subject: [PATCH 65/72] fix(doctor): root-anchor store_dir; source store specs from config; tighten override checks Review follow-ups on Chunk D: 1. Root-anchor store_dir resolution. New _store_dir_for(root) reads root/weft.toml (never cwd) and returns an absolute path, used by both check_store_dir and _store_url, so --root != cwd is consistent and an absolute cwd store_dir can't escape root. Malformed weft.toml falls back to the default (check_weft_toml reports it). 2. Source store identity from config. New config.STORE_DB_SPECS (env, name) tuples; doctor derives _DB_OVERRIDE_ENVS / _LEGACY_DB_NAMES from it so adding a store can't silently drop doctor coverage. 3. Match config's empty-override precedence: present-but-empty LEGIS_*_DB is a verbatim broken override (membership, not truthiness) in both check_db_overrides and _store_url. 4. Robust sqlite predicate via make_url(url).get_backend_name() == "sqlite". Tests: root-anchored store_dir via weft.toml; empty-string override -> error. 
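
Point 3 (membership, not truthiness) is easiest to see side by side. A minimal sketch, assuming an illustrative default URL; `resolve_by_truthiness`, `resolve_by_membership`, and `DEFAULT_URL` are hypothetical names, not identifiers from the patch.

```python
from collections.abc import Mapping

# Illustrative default only; the real default is derived from the store dir.
DEFAULT_URL = "sqlite:///.weft/legis/legis-governance.db"


def resolve_by_truthiness(env: Mapping[str, str], name: str) -> str:
    """Pre-fix shape: an empty value is indistinguishable from 'unset'."""
    return env.get(name) or DEFAULT_URL


def resolve_by_membership(env: Mapping[str, str], name: str) -> str:
    """Post-fix shape: any value that is present wins verbatim, even ''."""
    if name in env:
        return env[name]
    return DEFAULT_URL
```

With `LEGIS_GOVERNANCE_DB=""` the truthiness form quietly falls back to the default, while the membership form returns the empty string, which then fails URL validation and surfaces as an error.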
Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/config.py | 11 ++++++++ src/legis/doctor.py | 62 ++++++++++++++++++++++++++++---------------- tests/test_doctor.py | 34 ++++++++++++++++++++++++ 3 files changed, 84 insertions(+), 23 deletions(-) diff --git a/src/legis/config.py b/src/legis/config.py index f72da0d..c89fca6 100644 --- a/src/legis/config.py +++ b/src/legis/config.py @@ -61,6 +61,17 @@ _BINDING_DB_ENV = "LEGIS_BINDING_DB" _PULL_DB_ENV = "LEGIS_PULL_DB" +# Public, stably-ordered (override env var, default filename) for every store. +# THE single source of store identity so consumers (e.g. ``legis doctor``) never +# re-list the env vars / filenames: adding a 5th store here automatically extends +# their coverage instead of silently dropping it. +STORE_DB_SPECS: tuple[tuple[str, str], ...] = ( + (_CHECK_DB_ENV, _CHECK_DB_NAME), + (_GOVERNANCE_DB_ENV, _GOVERNANCE_DB_NAME), + (_BINDING_DB_ENV, _BINDING_DB_NAME), + (_PULL_DB_ENV, _PULL_DB_NAME), +) + # Protected-policy set: the policy names whose judge-ACCEPTED verdicts are # downgraded to operator sign-off (Q-H3). Composition-root config like the DB # URLs above, so resolved here. diff --git a/src/legis/doctor.py b/src/legis/doctor.py index 0777c8a..76a88b1 100644 --- a/src/legis/doctor.py +++ b/src/legis/doctor.py @@ -17,6 +17,7 @@ from sqlalchemy.engine import make_url +from legis import config from legis import install as _install @@ -194,8 +195,31 @@ def check_gitignore(root: Path, *, repair: bool) -> DoctorCheck: # Task 7: config & store checks # --------------------------------------------------------------------------- -_DB_OVERRIDE_ENVS = ("LEGIS_CHECK_DB", "LEGIS_GOVERNANCE_DB", "LEGIS_BINDING_DB", "LEGIS_PULL_DB") -_LEGACY_DB_NAMES = ("legis-checks.db", "legis-governance.db", "legis-binding.db", "legis-pulls.db") +# Sourced from config's single store-identity registry so adding a store there +# can't silently drop doctor coverage (review #2). 
+_DB_OVERRIDE_ENVS = tuple(env for env, _ in config.STORE_DB_SPECS) +_LEGACY_DB_NAMES = tuple(name for _, name in config.STORE_DB_SPECS) + + +def _store_dir_for(root: Path) -> Path: + """legis's store dir resolved from root/weft.toml (root-anchored, never cwd). + Returns an absolute path: an operator-set absolute store_dir is honored as-is; + otherwise the (relative) store_dir / default is joined to root. Malformed + weft.toml falls back to the default (check_weft_toml reports the malformed file).""" + configured: Path | None = None + wt = root / "weft.toml" + if wt.exists(): + try: + data = tomllib.loads(wt.read_text(encoding="utf-8")) + except (tomllib.TOMLDecodeError, OSError, UnicodeDecodeError): + data = {} + legis = data.get("legis") + if isinstance(legis, dict): + sd = legis.get("store_dir") + if isinstance(sd, str) and sd: + configured = Path(sd) + store_dir = configured if configured is not None else Path(".weft") / "legis" + return store_dir if store_dir.is_absolute() else (root / store_dir) def check_weft_toml(root: Path) -> DoctorCheck: @@ -231,10 +255,7 @@ def check_store_dir(root: Path, *, repair: bool = False) -> DoctorCheck: """An absent .weft/legis/ is ok (created lazily). A present-but-unwritable dir is an error. --repair ensures the dir exists (explicit operator action).""" cid = "store.dir" - from legis import config - - store_dir_rel = config._store_dir() - store_dir = store_dir_rel if store_dir_rel.is_absolute() else (root / store_dir_rel) + store_dir = _store_dir_for(root) if store_dir.exists(): if not os.access(store_dir, os.W_OK): return DoctorCheck(cid, "error", message=f"{store_dir} not writable") @@ -255,11 +276,12 @@ def check_db_overrides(root: Path) -> DoctorCheck: # noqa: ARG001 cid = "store.db_overrides" bad = [] for env in _DB_OVERRIDE_ENVS: - val = os.environ.get(env) - if not val: + # Match config's precedence: a present-but-empty override is a verbatim + # (broken) override, not "unset" — so validate membership, not truthiness. 
+ if env not in os.environ: continue try: - make_url(val) + make_url(os.environ[env]) except Exception: # noqa: BLE001 — any parse failure is a bad override bad.append(env) if bad: @@ -285,17 +307,13 @@ def check_legacy_stray_db(root: Path) -> DoctorCheck: def _store_url(root: Path, db_name: str, env: str) -> str: - """Resolve a store URL relative to *root* (not cwd). Respects the LEGIS_*_DB - env-var override first; otherwise constructs a file URL under the configured - store_dir joined to root so tests using a tmp_path don't touch the real repo.""" - val = os.environ.get(env) - if val: - return val - from legis import config - - store_dir = config._store_dir() - base = store_dir if store_dir.is_absolute() else (root / store_dir) - return "sqlite:///" + (base / db_name).as_posix() + """Resolve a store URL anchored at *root* via ``root/weft.toml`` (never cwd). + The LEGIS_*_DB override wins when set (present-but-empty included, matching + config's verbatim-override precedence); otherwise a file URL is built under + the root-anchored store_dir.""" + if env in os.environ: + return os.environ[env] + return "sqlite:///" + (_store_dir_for(root) / db_name).as_posix() def check_audit_chain(cid: str, url: str) -> DoctorCheck: @@ -306,7 +324,7 @@ def check_audit_chain(cid: str, url: str) -> DoctorCheck: except Exception: # noqa: BLE001 return DoctorCheck(cid, "ok", message="store URL not a file store") db = parsed.database - if not str(parsed.drivername).startswith("sqlite") or not db or db == ":memory:": + if parsed.get_backend_name() != "sqlite" or not db or db == ":memory:": return DoctorCheck(cid, "ok", message="not a file store") if not Path(db).exists(): return DoctorCheck(cid, "ok", message="no store yet") @@ -326,8 +344,6 @@ def check_audit_chain(cid: str, url: str) -> DoctorCheck: def check_hmac_key(root: Path) -> DoctorCheck: # noqa: ARG001 """Presence-only; NEVER renders the key value.""" cid = "runtime.hmac_key" - from legis import config - if not 
config.protected_policies(): return DoctorCheck(cid, "ok", message="no protected policies configured") if os.environ.get("LEGIS_HMAC_KEY"): diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 9980d9c..30ec89b 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -279,3 +279,37 @@ def test_sibling_url_invalid_is_error(tmp_path, monkeypatch): monkeypatch.setenv("LOOMWEAVE_API_URL", "localhost:9620") # no scheme c = check_sibling_url("runtime.loomweave_url", "LOOMWEAVE_API_URL") assert c.status == "error" + + +# --------------------------------------------------------------------------- +# Review follow-ups: root-anchored store_dir + empty-override precedence +# --------------------------------------------------------------------------- + + +from legis.doctor import _store_url + + +def test_store_dir_root_anchored_via_weft_toml(tmp_path, monkeypatch): + # --root != cwd, with a weft.toml that relocates the store. Resolution must + # honor root/weft.toml, not cwd's, and stay under root (review #1). + monkeypatch.chdir(tmp_path) # cwd has no weft.toml + # Clear the conftest store override so weft.toml resolution is exercised. + monkeypatch.delenv("LEGIS_GOVERNANCE_DB", raising=False) + root = tmp_path / "proj" + (root / "custom_store").mkdir(parents=True) + (root / "weft.toml").write_text('[legis]\nstore_dir = "custom_store"\n') + + c = check_store_dir(root) + assert c.status == "ok" + + # The audit-chain URL must point under root/custom_store, not cwd/.weft. + url = _store_url(root, "legis-governance.db", "LEGIS_GOVERNANCE_DB") + assert (root / "custom_store" / "legis-governance.db").as_posix() in url + assert ".weft" not in url + + +def test_db_override_empty_string_is_error(tmp_path, monkeypatch): + # Present-but-empty override is a verbatim broken override, not "unset" + # (matches config precedence; review #3). 
+ monkeypatch.setenv("LEGIS_GOVERNANCE_DB", "") + assert check_db_overrides(tmp_path).status == "error" From 7c529cfa413b6495211b62d65f1506d045344b7b Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 20:06:05 +1000 Subject: [PATCH 66/72] test(doctor): end-to-end repair pipeline + weft.toml/secret invariants Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/test_doctor.py | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 30ec89b..ce92f65 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -313,3 +313,38 @@ def test_db_override_empty_string_is_error(tmp_path, monkeypatch): # (matches config precedence; review #3). monkeypatch.setenv("LEGIS_GOVERNANCE_DB", "") assert check_db_overrides(tmp_path).status == "error" + + +# --------------------------------------------------------------------------- +# Task 9: end-to-end --repair pipeline + invariant tests +# --------------------------------------------------------------------------- + + +def test_repair_makes_fresh_project_healthy(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + # First run: unhealthy (no install artifacts, no .mcp.json). + assert run_doctor(tmp_path, repair=False, fmt="text") == 1 + # Repair run: install-wiring + .mcp.json get fixed; re-check is healthy. + assert run_doctor(tmp_path, repair=True, fmt="text") == 0 + # Third run, no repair: stays healthy. 
+ assert run_doctor(tmp_path, repair=False, fmt="text") == 0 + + +def test_repair_never_writes_weft_toml(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + (tmp_path / "weft.toml").write_text("[legis]\nstore_dir = \n") # malformed + before = (tmp_path / "weft.toml").read_text() + run_doctor(tmp_path, repair=True, fmt="json") + assert (tmp_path / "weft.toml").read_text() == before + + +def test_json_output_has_no_secret(tmp_path, monkeypatch): + monkeypatch.chdir(tmp_path) + monkeypatch.setenv("LEGIS_PROTECTED_POLICIES", "secrets.read") + monkeypatch.setenv("LEGIS_HMAC_KEY", "TOP-SECRET") + import contextlib + import io + buf = io.StringIO() + with contextlib.redirect_stdout(buf): + run_doctor(tmp_path, repair=False, fmt="json") + assert "TOP-SECRET" not in buf.getvalue() From 7030be55876abf20e2daced1e824dda9eec6c7b4 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 20:07:49 +1000 Subject: [PATCH 67/72] docs(doctor): changelog + readme for legis doctor; coverage floor Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 7 +++++++ README.md | 2 +- scripts/check_coverage_floors.py | 1 + 3 files changed, 9 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0f1f7ec..4b9d51f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,13 @@ versions per [PEP 440](https://peps.python.org/pep-0440/) / ## [1.0.0rc4] — 2026-06-06 ### Added +- **`legis doctor [--root] [--repair] [--format text|json]`** — operator health + view and safe repair for the install + config layer (instruction blocks, skills, + SessionStart hook, `.gitignore`, `.mcp.json` registration, store dir, audit + hash-chain integrity, key/sibling wiring). Report-only on `weft.toml` (C-9(b)) + and on hash chains; key values are never rendered. +- **`legis install --mcp`** — register the legis MCP server in `.mcp.json` + (also part of `legis install` with no flags). 
- **Self-install (`legis install`)** — legis now stands itself up like its siblings: it injects a lean, versioned agent-orientation block into CLAUDE.md / AGENTS.md, installs the `legis-workflow` skill pack (Claude + Codex), registers diff --git a/README.md b/README.md index f9f4b2f..625ffd5 100644 --- a/README.md +++ b/README.md @@ -6,7 +6,7 @@ Legis is the fourth Weft product: the git/CI and governance side of the suite's ## Status -Legis is at **`1.0.0rc4`** — the fourth release candidate. The standalone git/CI surfaces, the graded 2×2 enforcement engine, the agent-programmable policy grammar, SEI-keyed attestations, and the Wardline/Filigree suite combinations are all built and tested; the git-rename provider to Loomweave is contract-locked, operative pending Loomweave's committed-range driving. The transport-agnostic service layer (WP-M1) and the agent-facing MCP surface on top of it have landed (`legis mcp`), and Legis now stands itself up via `legis install` (instruction block + `legis-workflow` skill pack + SessionStart hook). See the combination matrix below for per-pairing status and `CHANGELOG.md` for the release notes. +Legis is at **`1.0.0rc4`** — the fourth release candidate. The standalone git/CI surfaces, the graded 2×2 enforcement engine, the agent-programmable policy grammar, SEI-keyed attestations, and the Wardline/Filigree suite combinations are all built and tested; the git-rename provider to Loomweave is contract-locked, operative pending Loomweave's committed-range driving. The transport-agnostic service layer (WP-M1) and the agent-facing MCP surface on top of it have landed (`legis mcp`), and Legis now stands itself up via `legis install` (instruction block + `legis-workflow` skill pack + SessionStart hook + `.mcp.json` registration). `legis doctor [--repair]` provides an operator health view and safe repair for the install + config layer. See the combination matrix below for per-pairing status and `CHANGELOG.md` for the release notes. 
## The Weft suite diff --git a/scripts/check_coverage_floors.py b/scripts/check_coverage_floors.py index 43943f7..5d421ce 100644 --- a/scripts/check_coverage_floors.py +++ b/scripts/check_coverage_floors.py @@ -30,6 +30,7 @@ "src/legis/governance/": 90.0, # currently ~92.7 "src/legis/api/": 88.0, # currently ~89.8 "src/legis/mcp.py": 80.0, # currently ~82 + "src/legis/doctor.py": 88.0, # currently ~91 } From 872657e726447d3a521da672dd1819a66cb5b278 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 20:15:42 +1000 Subject: [PATCH 68/72] test(doctor): harden secret guard + make fresh-project e2e test hermetic Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/test_doctor.py | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/tests/test_doctor.py b/tests/test_doctor.py index ce92f65..20b4aa8 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -322,6 +322,11 @@ def test_db_override_empty_string_is_error(tmp_path, monkeypatch): def test_repair_makes_fresh_project_healthy(tmp_path, monkeypatch): monkeypatch.chdir(tmp_path) + # Hermetic: an inherited sibling URL env var (valid or not) would otherwise + # leak into the repair → exit 0 assertion. Unset both so the check is "not + # configured" (ok), never a non-repairable error. + monkeypatch.delenv("LOOMWEAVE_API_URL", raising=False) + monkeypatch.delenv("FILIGREE_API_URL", raising=False) # First run: unhealthy (no install artifacts, no .mcp.json). assert run_doctor(tmp_path, repair=False, fmt="text") == 1 # Repair run: install-wiring + .mcp.json get fixed; re-check is healthy. 
@@ -347,4 +352,12 @@ def test_json_output_has_no_secret(tmp_path, monkeypatch): buf = io.StringIO() with contextlib.redirect_stdout(buf): run_doctor(tmp_path, repair=False, fmt="json") - assert "TOP-SECRET" not in buf.getvalue() + out = buf.getvalue() + assert "TOP-SECRET" not in out + # Prove the secret-bearing path actually ran: with both the protected policy + # and the key set, check_hmac_key reads the key and reports ok. Asserting the + # check is present (and ok) keeps this guard from passing vacuously if the + # key-reading check were ever removed. + payload = json.loads(out) + hmac_checks = [c for c in payload["checks"] if c["id"] == "runtime.hmac_key"] + assert hmac_checks and hmac_checks[0]["status"] == "ok" From c078d69799a276793cc49dce8b145ac67a026f10 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 20:28:33 +1000 Subject: [PATCH 69/72] fix(doctor): resolvability-based .mcp.json drift check; align text/json health MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fix 1: replace presence-only mcp_json check with mcp_entry_is_current() — a new predicate in install.py that verifies the entry exists, its args invoke mcp, and its command resolves to an existing executable. Byte-equality is deliberately avoided so valid binary-path variation (uv-tool vs venv) doesn't read as drift. check_mcp_json in doctor.py uses the predicate and re-verifies after repair. Fix 2: align render_text with render_json / exit-code semantics — the "ok" banner is now suppressed only on error, not on warn. A warn-only project shows "legis doctor: ok (N warning(s))" while still listing the warn checks below. Spec doc updated to describe the resolvability-based freshness check. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .../specs/2026-06-07-legis-doctor-design.md | 8 +- src/legis/doctor.py | 46 ++++---- src/legis/install.py | 33 ++++++ tests/test_doctor.py | 105 +++++++++++++++++- 4 files changed, 166 insertions(+), 26 deletions(-) diff --git a/docs/superpowers/specs/2026-06-07-legis-doctor-design.md b/docs/superpowers/specs/2026-06-07-legis-doctor-design.md index 8cb26a3..81819bf 100644 --- a/docs/superpowers/specs/2026-06-07-legis-doctor-design.md +++ b/docs/superpowers/specs/2026-06-07-legis-doctor-design.md @@ -123,8 +123,12 @@ error. - `install.agents_skill` — `.agents` (Codex) skill pack present, fingerprint fresh. - `install.hook` — Claude Code `SessionStart` hook registered. - `install.gitignore` — legis `.gitignore` rules present. -- `install.mcp_json` — `.mcp.json` has a `legis` server entry matching the - canonical local entry (`legis mcp --agent-id ` via the resolved binary). +- `install.mcp_json` — `.mcp.json` has a usable `legis` server entry: present, + args invoke `mcp`, and `command` resolves to an existing executable. Deliberately + NOT byte-canonical — a valid but differently-resolved legis binary (uv-tool vs + venv path) must not read as drift; only a missing entry, malformed args, or a + dead `command` path is stale. `--repair` writes the canonical entry via + `register_mcp_json` (resolved binary at repair time). 
### Config & stores - `config.weft_toml` — **report-only.** ABSENT → `ok` (defaults intentional); diff --git a/src/legis/doctor.py b/src/legis/doctor.py index 76a88b1..fb64234 100644 --- a/src/legis/doctor.py +++ b/src/legis/doctor.py @@ -53,43 +53,43 @@ def render_json(checks: list[DoctorCheck]) -> str: def render_text(checks: list[DoctorCheck]) -> str: - healthy = all(c.status == "ok" for c in checks) - if healthy: - return "legis doctor: ok" - lines = ["legis doctor:"] - for c in checks: - if c.status == "ok": - continue + has_error = any(c.status == "error" for c in checks) + has_warn = any(c.status == "warn" for c in checks) + problems = [c for c in checks if c.status != "ok"] + if not has_error: + # warn-only or all-ok: the project is healthy; surface any warns below + if has_warn: + warn_count = sum(1 for c in checks if c.status == "warn") + lines = [f"legis doctor: ok ({warn_count} warning(s))"] + else: + return "legis doctor: ok" + else: + lines = ["legis doctor:"] + for c in problems: lines.append(f" {c.id}: {c.status} — {c.message}" if c.message else f" {c.id}: {c.status}") return "\n".join(lines) def check_mcp_json(root: Path, *, repair: bool) -> DoctorCheck: - """Check that `.mcp.json` exists and has a `legis` server entry.""" + """Check that `.mcp.json` has a current legis server entry. + + 'Current' means: a legis entry exists, its args invoke `mcp`, and its + command resolves to an existing executable. Byte-equality with the canonical + entry is deliberately NOT required — a valid but differently-resolved legis + binary (uv-tool vs venv path) must not read as drift. 
+ """ cid = "install.mcp_json" - path = root / ".mcp.json" - present = False - if path.exists(): - try: - data = json.loads(path.read_text(encoding="utf-8")) - present = ( - isinstance(data, dict) - and isinstance(data.get("mcpServers"), dict) - and "legis" in data["mcpServers"] - ) - except (json.JSONDecodeError, OSError): - present = False - if present: + if _install.mcp_entry_is_current(root): return DoctorCheck(cid, "ok") if repair: from legis.install import register_mcp_json ok, msg = register_mcp_json(root) - if ok: + if ok and _install.mcp_entry_is_current(root): return DoctorCheck(cid, "ok", fixed=True) return DoctorCheck(cid, "error", message=msg) return DoctorCheck( - cid, "error", message="legis server not registered (run: legis install --mcp)" + cid, "error", message="legis server missing or stale (run: legis install --mcp)" ) diff --git a/src/legis/install.py b/src/legis/install.py index 0c8f587..2a0e0ba 100644 --- a/src/legis/install.py +++ b/src/legis/install.py @@ -673,6 +673,39 @@ def gitignore_rules_present(project_root: Path) -> bool: return all(rule in present for rule in _LEGIS_IGNORE_RULES) +def mcp_entry_is_current(project_root: Path) -> bool: + """True iff .mcp.json has a usable legis stdio server entry: a dict whose + args invoke `mcp` and whose command resolves to an existing executable. + Deliberately NOT byte-equality with the canonical entry — a valid but + differently-resolved legis binary (uv-tool vs venv path) must not read as + drift. Only a missing entry, malformed args, or a dead command path is stale. 
+ """ + try: + path = project_path(project_root, ".mcp.json") + except UnsafeInstallPathError: + return False + if not path.is_file(): + return False + try: + data = json.loads(path.read_text(encoding="utf-8")) + except (json.JSONDecodeError, OSError): + return False + if not isinstance(data, dict): + return False + servers = data.get("mcpServers") + entry = servers.get("legis") if isinstance(servers, dict) else None + if not isinstance(entry, dict): + return False + args = entry.get("args") + if not (isinstance(args, list) and "mcp" in args): + return False + command = entry.get("command") + if not isinstance(command, str) or not command: + return False + # command resolves: absolute/relative existing file OR found on PATH + return bool(shutil.which(command)) or Path(command).is_file() + + def ensure_gitignore(project_root: Path) -> tuple[bool, str]: """Ensure legis's runtime-state subtree (``.weft/legis/``) is ignored.""" try: diff --git a/tests/test_doctor.py b/tests/test_doctor.py index 20b4aa8..26b4003 100644 --- a/tests/test_doctor.py +++ b/tests/test_doctor.py @@ -36,11 +36,19 @@ def test_render_json_shape(): def test_render_text_lists_only_problems_when_healthy_says_ok(): - assert "legis doctor: ok" in render_text([DoctorCheck("a", "ok")]) + # all-ok: banner present, no problem lines + assert render_text([DoctorCheck("a", "ok")]) == "legis doctor: ok" + + # error present: no "ok" in headline, error listed out = render_text([DoctorCheck("a", "ok"), DoctorCheck("b", "error", message="bad")]) assert "b: error" in out assert "legis doctor: ok" not in out + # warn-only: banner present with warning count AND warn check is listed + out_warn = render_text([DoctorCheck("a", "ok"), DoctorCheck("b", "warn", message="heads up")]) + assert "legis doctor: ok" in out_warn + assert "b: warn" in out_warn + def test_run_doctor_healthy_after_repair(tmp_path, capsys): # A project repaired via run_doctor renders healthy on re-check, exit 0. 
@@ -103,6 +111,101 @@ def test_mcp_json_present_is_ok(tmp_path): assert c.fixed is False +def test_mcp_json_stale_command_is_error_then_repaired(tmp_path): + """An entry with a dead command path is stale and must trigger repair.""" + stale_entry = { + "mcpServers": { + "legis": { + "type": "stdio", + "command": "/nonexistent/legis-xyz", + "args": ["mcp", "--agent-id", "claude-code"], + "env": {}, + } + } + } + (tmp_path / ".mcp.json").write_text(json.dumps(stale_entry)) + c = check_mcp_json(tmp_path, repair=False) + assert c.id == "install.mcp_json" + assert c.status == "error" + + fixed = check_mcp_json(tmp_path, repair=True) + assert fixed.status == "ok" + assert fixed.fixed is True + + +# --------------------------------------------------------------------------- +# Direct unit tests for mcp_entry_is_current predicate +# --------------------------------------------------------------------------- + + +from legis.install import mcp_entry_is_current, register_mcp_json as _register_mcp_json + + +def test_mcp_entry_is_current_absent_file(tmp_path): + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_malformed_json(tmp_path): + (tmp_path / ".mcp.json").write_text("{not valid json") + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_non_dict_top_level(tmp_path): + (tmp_path / ".mcp.json").write_text('["just", "an", "array"]') + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_missing_mcp_servers(tmp_path): + (tmp_path / ".mcp.json").write_text('{"other": {}}') + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_mcp_servers_not_dict(tmp_path): + (tmp_path / ".mcp.json").write_text('{"mcpServers": "not a dict"}') + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_no_legis_entry(tmp_path): + (tmp_path / ".mcp.json").write_text('{"mcpServers": {"other": {}}}') + assert mcp_entry_is_current(tmp_path) is False + + 
+def test_mcp_entry_is_current_legis_entry_not_dict(tmp_path): + (tmp_path / ".mcp.json").write_text('{"mcpServers": {"legis": "string"}}') + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_args_without_mcp(tmp_path): + entry = {"mcpServers": {"legis": {"command": "legis", "args": ["serve"]}}} + (tmp_path / ".mcp.json").write_text(json.dumps(entry)) + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_empty_command(tmp_path): + entry = {"mcpServers": {"legis": {"command": "", "args": ["mcp"]}}} + (tmp_path / ".mcp.json").write_text(json.dumps(entry)) + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_dead_command_path(tmp_path): + entry = { + "mcpServers": { + "legis": { + "command": "/nonexistent/legis-xyz", + "args": ["mcp", "--agent-id", "claude-code"], + } + } + } + (tmp_path / ".mcp.json").write_text(json.dumps(entry)) + assert mcp_entry_is_current(tmp_path) is False + + +def test_mcp_entry_is_current_fresh_registered_entry(tmp_path): + """A freshly registered entry must read as current.""" + _register_mcp_json(tmp_path) + assert mcp_entry_is_current(tmp_path) is True + + # --------------------------------------------------------------------------- # Task 6: install-wiring checks (blocks, skills, hook, gitignore) # --------------------------------------------------------------------------- From dbb8e221f3c5dd59bfcf66097e9fb6a950ca1dba Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 20:55:51 +1000 Subject: [PATCH 70/72] fix(gitignore): add .weft/ to ignore list for Filigree issue tracker --- .gitignore | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.gitignore b/.gitignore index 112731a..5bfb44f 100644 --- a/.gitignore +++ b/.gitignore @@ -42,3 +42,6 @@ wardline.yaml *.db-wal # Federated runtime-state subtree (legis is the sole writer; never .weft/ wholesale) .weft/legis/ + +# Filigree issue 
tracker +.weft/ From ca513d73beea5bbe07ce64f7017968e30fc66542 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Sun, 7 Jun 2026 21:10:57 +1000 Subject: [PATCH 71/72] fix(mcp): negotiate unsupported protocolVersion instead of hard-erroring MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The hand-rolled MCP initialize handler rejected any protocolVersion outside _SUPPORTED_PROTOCOL_VERSIONS with a -32602 error. Newer MCP clients negotiate 2025-06-18, so legis failed to connect entirely while sibling servers (loomweave, wardline) connected fine. Per the MCP spec, when the client requests a version the server does not support (or omits it), the server must respond with a version it does support and let the client decide whether to proceed — not hard-error. Reply with _DEFAULT_PROTOCOL_VERSION in that case. Replaces test_initialize_rejects_unsupported_protocol_version with test_initialize_negotiates_unsupported_protocol_version. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- src/legis/mcp.py | 20 +++++++++----------- tests/mcp/test_server.py | 12 ++++++++---- 2 files changed, 17 insertions(+), 15 deletions(-) diff --git a/src/legis/mcp.py b/src/legis/mcp.py index fe5c857..eac4086 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -1122,17 +1122,15 @@ def handle_request(request: dict[str, Any], runtime: McpRuntime) -> dict[str, An "error": {"code": -32602, "message": "initialize params must be an object"}, } requested = params.get("protocolVersion") - if requested is not None and requested not in _SUPPORTED_PROTOCOL_VERSIONS: - return { - "jsonrpc": "2.0", - "id": request_id, - "error": { - "code": -32602, - "message": f"unsupported protocolVersion: {requested}", - "data": {"supported": list(_SUPPORTED_PROTOCOL_VERSIONS)}, - }, - } - runtime.protocol_version = requested or _DEFAULT_PROTOCOL_VERSION + if requested in _SUPPORTED_PROTOCOL_VERSIONS: + runtime.protocol_version = requested + else: + # MCP spec: when the client requests a protocolVersion the server + # does not support (or omits it), the server responds with a version + # it does support and lets the client decide whether to proceed — + # it must not hard-error. Hard-erroring here made newer clients + # (e.g. those negotiating 2025-06-18) fail to connect entirely. 
+ runtime.protocol_version = _DEFAULT_PROTOCOL_VERSION runtime.initialized = True result = { "protocolVersion": runtime.protocol_version, diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index 8d05f9b..c2a83da 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -209,7 +209,10 @@ def test_tools_reject_before_initialize(tmp_path): assert responses[0]["error"]["code"] == -32002 -def test_initialize_rejects_unsupported_protocol_version(tmp_path): +def test_initialize_negotiates_unsupported_protocol_version(tmp_path): + # MCP spec: an unsupported (or newer) requested version must not hard-error; + # the server replies with a version it does support and lets the client + # decide. This is what lets newer clients (e.g. 2025-06-18) connect. runtime, _store = _runtime(tmp_path) runtime.initialized = False @@ -219,14 +222,15 @@ def test_initialize_rejects_unsupported_protocol_version(tmp_path): "jsonrpc": "2.0", "id": 1, "method": "initialize", - "params": {"protocolVersion": "1999-01-01"}, + "params": {"protocolVersion": "2025-06-18"}, } ), runtime, ) - assert responses[0]["error"]["code"] == -32602 - assert "2025-03-26" in responses[0]["error"]["data"]["supported"] + assert "error" not in responses[0] + assert responses[0]["result"]["protocolVersion"] == "2025-03-26" + assert responses[0]["result"]["serverInfo"]["name"] == "legis" def test_build_runtime_initialize_does_not_create_local_state(tmp_path, monkeypatch): From 5af3bfa72b1291ea34b77142abca2edcd4274474 Mon Sep 17 00:00:00 2001 From: John Morrissey <544926+tachyon-beep@users.noreply.github.com> Date: Mon, 8 Jun 2026 00:08:14 +1000 Subject: [PATCH 72/72] feat(cli): legis --version; name CELL_NOT_ENABLED enablement path; doc verified_author gap MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Dogfood #2 friction-tail closeout for the legis-owned items (weft-f506e5f845, weft-9da517a67e): - LG-1: add a top-level `legis --version` flag (argparse 
version action over legis.__version__), closing the gap where the running build could only be identified indirectly. - Le1: the MCP `CELL_NOT_ENABLED` recovery hint now names the concrete enablement path (LEGIS_HMAC_KEY + LEGIS_POLICY_CELLS / policy/cells.toml / LEGIS_DEV_DEFAULT_CELLS=1) instead of a generic "ask the operator"; the per-cell message still names which cell is unenabled. - C3: charter records the self-asserted-write-actor gap (verified_author:null) as a known governance gap — acceptable trust-local, deferred multi-principal. Tests: new version-flag test (exits 0, prints version) and an assertion that the closure-gate CELL_NOT_ENABLED next_action names LEGIS_HMAC_KEY. Full suite green (754 passed), ruff + mypy + coverage floors hold. Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 15 ++++++++++++++- docs/design/legis-charter.md | 13 +++++++++++++ src/legis/cli.py | 7 +++++++ src/legis/mcp.py | 8 +++++++- tests/mcp/test_server.py | 5 +++++ tests/test_cli.py | 14 ++++++++++++++ 6 files changed, 60 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 4b9d51f..9b85bfc 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,9 +5,12 @@ All notable changes to Legis are documented here. The format follows versions per [PEP 440](https://peps.python.org/pep-0440/) / [SemVer](https://semver.org/) (pre-release: `1.0.0rc1`). -## [1.0.0rc4] — 2026-06-06 +## [1.0.0rc4] — 2026-06-08 ### Added +- **`legis --version`** — top-level version flag (LG-1, weft-9da517a67e); reports + the installed package version and exits. Closes the dogfood gap where the only + way to identify the running build was an indirect probe. 
- **`legis doctor [--root] [--repair] [--format text|json]`** — operator health view and safe repair for the install + config layer (instruction blocks, skills, SessionStart hook, `.gitignore`, `.mcp.json` registration, store dir, audit @@ -66,6 +69,16 @@ versions per [PEP 440](https://peps.python.org/pep-0440/) / - **Table-driven MCP dispatch (Q-L8)** — `call_tool` now routes through a tool table instead of an if/elif ladder, and the stdio server bounds each stdin line so a malformed client cannot stream unbounded input. Behavior-preserving. +- **`CELL_NOT_ENABLED` recovery hint names the enablement path (Le1, + weft-f506e5f845)** — the MCP error's `next_action` now tells the agent *how* to + enable a governance cell (set `LEGIS_HMAC_KEY`; configure policy cells via + `LEGIS_POLICY_CELLS` / `policy/cells.toml` / `LEGIS_DEV_DEFAULT_CELLS=1`) instead + of a generic "ask the operator". The per-cell message still names which cell is + unenabled. +- **Charter documents the self-asserted-write-actor gap (C3, weft-f506e5f845)** — + `docs/design/legis-charter.md` now records `verified_author: null` (federation + write attribution is self-asserted, not cryptographically verified) as a known + governance gap, acceptable for trust-local use and deferred for multi-principal. - **Release CI gates** — the coverage floor is raised to 88% with a `ruff` lint gate added (Q-L7), live Loomweave conformance is now non-optional for releases (no silent skip when the oracle is down), and the Filigree client's transport / diff --git a/docs/design/legis-charter.md b/docs/design/legis-charter.md index c9405d8..1ed449b 100644 --- a/docs/design/legis-charter.md +++ b/docs/design/legis-charter.md @@ -35,6 +35,19 @@ Legis can describe repository change and CI state on its own. Legis becomes the common operating picture for project change and governance while preserving the authority boundaries of the other Weft products. 
+## Known governance gaps + +- **Self-asserted write actor (`verified_author: null`).** Actor identity on + federation write events (e.g. a comment or status change attributed to an + agent) is self-asserted by the caller, not cryptographically verified. For + trust-local, single-operator use this is acceptable. A multi-principal + deployment that needs non-repudiable write attribution would require a + verified-identity binding at the write boundary — Legis governs *change* + provenance but does not today mint or verify the actor identity carried on a + sibling's write. Verified authorship is a deferred item in the governance + story, not a current guarantee. (Surfaced in the 2026-06 lacuna dogfood as + finding C3; tracked federation-side under the residual-friction tail.) + ## Near-term scope The initial repository is documentation-first. It should make the intended role reviewable before runtime implementation starts. diff --git a/src/legis/cli.py b/src/legis/cli.py index 57d4983..e2dcc31 100644 --- a/src/legis/cli.py +++ b/src/legis/cli.py @@ -6,6 +6,7 @@ import uvicorn +from legis import __version__ from legis.clock import SystemClock from legis.governance.sei_backfill import run_pre_sei_backfill from legis.identity.loomweave_client import HttpLoomweaveIdentity, loomweave_hmac_key_from_env @@ -34,6 +35,12 @@ def _add_judge_flags(parser: argparse.ArgumentParser) -> None: def build_parser() -> argparse.ArgumentParser: parser = argparse.ArgumentParser(prog="legis", description="Legis CLI") + parser.add_argument( + "--version", + action="version", + version=f"legis {__version__}", + help="Print the legis version and exit", + ) subparsers = parser.add_subparsers(dest="command") serve = subparsers.add_parser("serve", help="Run the Legis API server") diff --git a/src/legis/mcp.py b/src/legis/mcp.py index eac4086..25c0070 100644 --- a/src/legis/mcp.py +++ b/src/legis/mcp.py @@ -374,7 +374,13 @@ def _recovery_for(code: str) -> dict[str, Any]: next_actions = { 
"INVALID_ARGUMENT": "Correct the tool arguments and retry.", "INVALID_CELL_SPEC": "Use server-owned routing or a valid cell configuration.", - "CELL_NOT_ENABLED": "Ask the operator to enable the required governance cell.", + "CELL_NOT_ENABLED": ( + "Enable the cell by wiring its backing store: set LEGIS_HMAC_KEY " + "(enables the binding ledger + protected/structured gates), and " + "configure the policy cells via LEGIS_POLICY_CELLS or policy/cells.toml " + "(LEGIS_DEV_DEFAULT_CELLS=1 for the dev posture). The error message " + "names which cell is unenabled." + ), "NO_SUCH_REQUEST": "Poll a known sign-off sequence returned by override_submit.", "NOT_FOUND": "Refresh the target identifier and retry.", "UNKNOWN_TOOL": "Call tools/list and use one of the advertised tool names.", diff --git a/tests/mcp/test_server.py b/tests/mcp/test_server.py index c2a83da..15b0411 100644 --- a/tests/mcp/test_server.py +++ b/tests/mcp/test_server.py @@ -1582,6 +1582,11 @@ def test_filigree_closure_gate_get_not_enabled_without_ledger(monkeypatch): # NotEnabledError is mapped to an error envelope, not raised. assert result["isError"] is True assert result["structuredContent"]["error_code"] == "CELL_NOT_ENABLED" + # Le1 (weft-f506e5f845): the recovery hint must name the concrete + # enablement path, not a vague "ask the operator". Every governance cell + # is wired behind LEGIS_HMAC_KEY in build_runtime. 
+ next_action = result["structuredContent"]["next_action"] + assert "LEGIS_HMAC_KEY" in next_action def test_filigree_closure_gate_get_surfaces_integrity_failure(monkeypatch, tmp_path): diff --git a/tests/test_cli.py b/tests/test_cli.py index a23a153..95e092f 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -39,6 +39,20 @@ def test_main_no_command_returns_2(): assert rc == 2 +def test_version_flag_prints_version_and_exits_zero(capsys): + import pytest + + from legis import __version__ + + with pytest.raises(SystemExit) as excinfo: + build_parser().parse_args(["--version"]) + # argparse's version action exits 0 after printing. + assert excinfo.value.code == 0 + out = capsys.readouterr().out + assert __version__ in out + assert "legis" in out + + def test_check_override_rate_exits_1_on_fail(tmp_path, capsys): from legis.clock import FixedClock from legis.enforcement.engine import EnforcementEngine