Arch-analysis 2026-06-06 remediation: P1/P2 governance & transport hardening#4
Conversation
Architect-Ready codebase analysis (6 parallel explorer passes along architectural seams + synthesis + independent validation gate). Deliverables (00-06): - discovery, subsystem catalog (13 subsystems, edge-cited), C4/dependency diagrams, final report (4 cross-subsystem flow traces), quality assessment (live tooling + Q-H1..Q-L8 inventory), architect handover (3-tier roadmap). - temp/: 6 cluster catalog partials + validation report (evidence base). Key findings: - Clean dependency DAG, no cycles; fail-closed defaults; 90% coverage; mypy clean. - All 6 prior MCP adapter-drift findings (C2,C3,H1,M9,M10,M11) RESOLVED in tree. - Remaining work is seam discipline (service layer is a partial seam) + input authentication (single-secret scope split, unverified source binding, unauthenticated check/PR facts, unsigned Filigree transport). - Nothing blocks the rc. Findings tracked as 18 issues (label arch-analysis-2026-06-06). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…/Q-L2)
gaps._stable_seis and find_lineage_integrity used
payload.get("entity_key", {}); an explicit "entity_key": null returns None
and raised AttributeError. Guard with isinstance(dict), matching
sei_backfill._entity_key.
enforcement.lifecycle.decay_sweep had no per-record guard, so one malformed
row aborted the whole sweep. Wrap OverrideRecord construction in try/except,
log+skip the bad row, keep re-judging the rest.
Regression tests: explicit-null entity_key for both gaps functions, and a
3-row decay sweep where the middle row is malformed and the trailing stale
row is still flagged.
Closes legis-62ac47b09f
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… (Q-M3)
read_all's decode was guarded but the loop body content_hash(rec.payload)
was not. json.loads accepts Infinity/NaN, so a directly-tampered payload
survived the decode guard and then made canonical_json(allow_nan=False)
raise ValueError out of verify_integrity — crashing the exact tamper case it
defends against, and propagating as an uncaught crash into
sei_backfill / binding_ledger.verify / the cli integrity check.
Guard content_hash per record: ValueError/TypeError -> return False. The
existing `if not verify_integrity()` guards in those callers now engage
instead of being bypassed by the raise.
Regression: directly write `{"k": Infinity}` past the append-only triggers
and assert verify_integrity() is False.
Closes legis-beff02eb40
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…col (Q-L3) binding_ledger, sei_backfill, and gaps still typed against the concrete AuditStore/AuditRecord, so they could not be unit-tested against a protocol fake — the unfinished half of the M12 migration. Retype: - binding_ledger: store: AppendOnlyStore - sei_backfill: store: AppendOnlyStore; records/rec: AuditRecordLike - gaps: records: Sequence[AuditRecordLike] Concrete construction stays at the composition roots (api/cli/mcp). mypy clean across 63 files. Tests: structural import-discipline test (mirrors the enforcement one) plus a proof test driving BindingLedger end-to-end against an in-memory fake store that does not derive from AuditStore. Closes legis-2f557a9a24 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ubject (Q-M8) The policy co-occurrence check walked the whole assert node for both boundary evidence and the policy reference. A test asserting something unrelated, with the boundary result and the policy name dropped into the assert *message* string, satisfied the gate while asserting nothing about the boundary. Narrow has_boundary_evidence to the assert's `.test` condition: the boundary call or its result name must be the assertion subject. The policy reference may still live in the message (the established honesty pattern names the policy there, e.g. `assert result == "ok", "PY-WL-101"`). Shared evaluator, so both the static boundary_scan and the runtime decorator gate tighten together. Tests: message-only boundary result -> policy_not_ asserted; condition-subject + policy-in-message -> ok. Closes legis-230515503e Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
route_findings performed N sequential appends to one append-only store with no surrounding transaction; a mid-loop runtime failure left earlier findings permanently persisted -> partial governance picture mistaken for complete. A valid batch always targets a single store (cross-store mixing is already rejected), so wrap the appends in that store's transaction: - AuditStore.transaction(): a context manager that groups appends onto one connection (single BEGIN IMMEDIATE), committing together on clean exit and rolling back the whole batch on any exception. The ambient connection is stored thread-locally, so a batch never leaks its open connection into another thread's append — concurrency-safe (two concurrent batches serialize at the SQLite writer, as they already did). append() uses the ambient connection when present, else opens its own per-call transaction as before. - AppendOnlyStore protocol + engine/signoff passthroughs expose it. - route_findings resolves all entities first (no Loomweave I/O inside the write transaction), then appends under the single target component's transaction. Regression: a 3-finding batch where finding 2's append raises asserts the trail is empty (finding 1 rolled back); a success case asserts all 3 commit. Closes legis-b1ae681f09 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dit H6) default_policy_cells() returns default_cell=chill, the least-governed self-clear cell, and mcp's _load_policy_cell_registry fell back to it whenever no LEGIS_POLICY_CELLS and no policy/cells.toml were found. A typo, a missing registry entry, or an incomplete deployment therefore silently downgraded governance to self-clear. - Add fail_closed_policy_cells() -> structured (block+escalate, human sign-off): the production default for absent config. An unmatched policy escalates to a human instead of self-clearing. - _load_policy_cell_registry now fails closed to structured when no config is found, falling back to the chill dev posture only under an explicit LEGIS_DEV_DEFAULT_CELLS=1 opt-in (no bare flip — chill stays available for local work). - _registry()'s defensive fallback also fails closed. default_policy_cells() itself is unchanged (still chill) so existing dev/test construction is unaffected; it is now documented as dev-only. Tests: absent config -> structured; dev opt-in -> chill; explicit config still wins; fail_closed helper is structured. Closes legis-16b3a7e864 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ons (Q-H2) Three governance decisions were duplicated or bypassed the transport-agnostic service layer; the override-rate decision in particular existed in three implementations and had already caused a divergent fix (07cf54e). - Config coupling: DEFAULT_GOVERNANCE_DB / DEFAULT_CHECK_DB move to a new legis.config module. mcp no longer imports them from the HTTP layer; api re-exports them so existing `from legis.api.app import DEFAULT_*_DB` callers keep working. - api sign-off: post_signoff_request and post_signoff_sign now route through service.request_signoff / a new service.sign_off (NotEnabledError -> 404) instead of reaching past the service to SignoffGate directly. bind_issue's inline trail-verify is replaced by service.verified_records (the same fail-closed integrity + HMAC tamper decision). - cli gate: _check_override_rate no longer hand-rolls protected detection, key-required fail-closed, trail verification, and override-rate scoring. That decision moves into service.evaluate_override_rate_gate (new), preserving the 07cf54e fail-closed semantics (protected records + no LEGIS_HMAC_KEY -> fail). The cli keeps only its missing-db handling, integrity check, and exit-code shell. New ProtectedKeyRequiredError domain error. Tests: service-layer unit tests for evaluate_override_rate_gate (fail-closed without key; scores with key) and sign_off (NotEnabledError when absent). The existing cli fail-closed and api sign-off tests now exercise the service path. Closes legis-0fe0ac07a7 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
_verify_secret returned the actor on a LEGIS_API_SECRET match without
consulting required_scope, so operator-only routes
(/protected/operator-override, /signoff/{seq}/sign) were satisfied by any
holder of the single secret — inconsistent with the scoped-token model, which
denies operator authority to any actor that has not explicitly declared the
operator scope.
A single shared secret cannot intrinsically represent a writer/operator split.
Per the Q-H1 decision (scope-gated, opt-in operator): single-secret mode now
declares its authority via LEGIS_API_SECRET_SCOPE (pipe-separated), defaulting
to writer-only. Operator routes fail closed unless the deployment explicitly
grants the operator scope — a single-operator deployment opts in with one env
var. Writer routes are unaffected by the default.
Tests: single secret -> writer route 201, operator route 403 (default);
LEGIS_API_SECRET_SCOPE="writer|operator" -> operator route 201. The
test_api_admin_auth regression (which encoded the old bypass) now grants the
operator scope explicitly.
Closes legis-0adeef62ae
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… status
Investigated Q-M1 (protected records for non-.py entities sign
source_binding: unverified). The blanket-fail-closed and resolver-signal
("source-backed => require verified") interpretations were both rejected:
- A non-path entity (python:function:... qualname, opaque SEI, service:
target) has no local bytes to verify, and verify_current_source_binding can
only ever reach `verified` for a .py PATH locator. Requiring verification for
resolved/source-backed entities therefore rejects the qualname/SEI protected
tier, which is a first-class feature (test_sei_api: 2 tests assert 201 for an
SEI-keyed protected override).
- The locator-shape concern is not an exploitable write-side forgery: dropping
the .py yields a DIFFERENT entity_key, and source_binding_status is folded
into the signed HMAC fields, so a verified record is always distinguishable
from an unverified one. The .py path locator is already strictly fail-closed
(missing file / unconfigured root / stale fingerprint all rejected).
Q-M1 is a read-side conflation risk: "protected" (HMAC-signed) != "source
verified". Resolution: document the contract in require_verified_source_binding,
and prove the anti-conflation guarantee with a test that source_binding_status
is bound into the signature (flipping the recorded status breaks verification).
The signed-field format is unchanged (many fixtures carry precomputed
signatures; altering signed fields would break them and any persisted records).
Closes legis-aadb43f660
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tract (Q-M6) bind_signoff_to_issue requires a stable identity (SEI) and rejects locator keys to avoid rename-orphaned bindings. Because SEIs come from Loomweave, a degraded Loomweave means a sign-off can be recorded but not bound — Q-M6 asked that this coupling be decided and documented rather than left implicit. Decision (ADR-0003): the contract is (b)-then-(a) — resolve a locator through a SEI_BACKFILL event at bind time (recovery), and otherwise fail closed (HTTP 409) rather than recording a rename-fragile placeholder. A deferred-binding state (c) is explicitly rejected. The sign-off is always recorded; only the Filigree pointer waits for a stable identity, and a binding-requiring policy inherits the fail-closed posture for free. Both branches are already implemented and tested (test_bind_issue_endpoint_uses_resolved_backfill_for_locator_keyed_request; test_locator_keyed_signoff_is_rejected_as_unstable). This change records the contract: new ADR-0003 + a docstring pointer at the call site. Closes legis-66f9c1df58 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…2/Q-M4) POST /checks and POST /git/pulls record writer-supplied facts with only recorded_by=actor — no signature or forge provenance. A writer could record a fake passing CI run or rewrite PR metadata, and a consumer had no signal that the fact was merely writer-asserted rather than forge-verified. No governance decision currently trusts these facts (filigree_gate.evaluate_ issue_closure does not read checks; all consumers are read-only display/audit endpoints), so the honest, proportionate fix is provenance labeling rather than a forge-reporter authority split: - CheckRun / PullRequest gain provenance: str = "unauthenticated". - It is server-controlled, not an input field, so a writer cannot forge "authenticated"; it flows through asdict to every response and is stored (additive column + _ensure_schema migration, mirroring recorded_by). Pre- migration rows read back as unauthenticated. Tests: a writer-recorded check and PR carry provenance: unauthenticated even when the body tries to set "authenticated". Re: legis-78d66fb4d6 (part 1 of 2) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
HttpFiligreeClient sent unsigned requests, unlike the HMAC-signed Loomweave client. The attach `signature` is an app-level attestation about WHAT is bound; it is not transport authentication of WHO is calling. Add sign_filigree_request (mirrors sign_loomweave_request): X-Weft-Component: filigree:<hmac> + timestamp + nonce, with timestamp/nonce injected so the signature is deterministically testable. filigree_hmac_key_from_env resolves LEGIS_FILIGREE_HMAC_KEY (or LEGIS_HMAC_KEY); absent key -> unsigned, backward compatible. The real transport signs in HttpFiligreeClient._signing_fetch; the injected Fetch contract is unchanged (no test-fake ripple), and _urllib_fetch gains an optional headers param used only by the default path. Tests: deterministic/namespaced signature sensitive to the body; env key resolution with channel-specific precedence; the real transport attaches the Weft headers when a key is present and none when it is not. Closes legis-78d66fb4d6 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ProtectedGate.submit() embeds the agent's rationale in the judge prompt and recorded the model's ACCEPTED verdict as the gate decision, HMAC-signing it as authoritative evidence. A prompt-injected rationale that fools the judge into ACCEPTED on a protected policy therefore cleared a tamper-bound gate on the model's word alone. Routing confirms the exposure is on the agent path: a protected-cell policy routes through submit_protected_override -> ProtectedGate.submit (mcp.py / api /protected/overrides), not operator_override. Fix — treat the model as advisory for ACCEPTED on protected policies: - ProtectedGate gains protected_policies + an optional non-LLM validator seam (both keyword, defaults preserve prior behaviour for non-protected policies). - On a protected policy, a judge ACCEPTED stands only if a deterministic validator confirms it; with no validator (the default) it is downgraded to a signed BLOCKED, recording judge_advisory_verdict=ACCEPTED for audit, and the agent must escalate to operator sign-off (operator_override). - api and mcp thread LEGIS_PROTECTED_POLICIES into the gate, so production protected policies get the advisory posture; no validator is wired yet, so the path is operator sign-off. The rationale was already strictly data (build_prompt wraps it as request_json labelled "untrusted input, not instructions"); unchanged. Regression: a simulated prompt injection (judge returns ACCEPTED off an injected rationale) does NOT clear a protected policy without a validator — the signed verdict is BLOCKED; a confirming validator lets ACCEPTED stand; a non-protected policy is unaffected. Closes legis-072abdbb0e Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Weft-component HMAC signs _json_body_bytes (sorted keys, compact separators) but _urllib_fetch was transmitting default json.dumps bytes, so a Filigree verifier checking the body hash against the actual request bytes would reject every signed POST (e.g. attach). Send the exact signed bytes on the wire, mirroring loomweave_client; add a regression test that drives the real transport and verifies the captured body against the captured signature. Also resolve the review's minor/cosmetic items: - Document transaction()'s appends-only contract on AppendOnlyStore and AuditStore: reads inside the batch see a pre-batch snapshot and can hit SQLITE_BUSY against the held BEGIN IMMEDIATE; resolve reads first. - Note that wardline cells_needed reflects the cell_map's full reach, not the present findings, so the cross-store guard / txn_owner are conservative by design. - Make the Filigree canonicalization contract explicit in sign_filigree_request and ADR-0003. - Rename verified_records' protected_gate param to trail_owner (it also takes the sign-off gate) and clarify the docstring. - Resolve the Filigree HMAC key from env only when the real signing transport is used, not when a fetch is injected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
rc2 is already published and immutable on PyPI, so the arch-analysis remediation (incl. the Q-M4 transport fix) ships as rc3. - Declare pydantic>=2 as a direct dependency: api/app.py imports it directly; it was only present transitively via fastapi. - Bump version 1.0.0rc2 -> 1.0.0rc3 (pyproject, __version__, uv.lock). - Bind the /health version assertion to legis.__version__ so it tracks future bumps instead of drifting. Build verified: uv build + twine check pass; 522 tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Added release-prep commit ( |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8883bef1dc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| # Model is advisory on a protected policy: its ACCEPTED is recorded | ||
| # for audit but does NOT clear the gate (Q-H3). Downgrade the signed | ||
| # verdict to BLOCKED; the agent must escalate to operator sign-off. | ||
| record_ext["judge_advisory_verdict"] = Verdict.ACCEPTED.value |
There was a problem hiding this comment.
Bind advisory judge verdicts into the signature
When a protected policy's judge returns ACCEPTED but no deterministic validator confirms it, this new audit field records the model's advisory verdict, but signing_fields() does not include judge_advisory_verdict. In that scenario, anyone who can edit and rechain the SQLite audit log can change or remove the advisory ACCEPTED evidence while TrailVerifier still accepts the HMAC, which undermines the audit trail's purpose of preserving the prompt-injected/model-cleared attempt; include this field in the signed metadata or avoid storing it as unsigned audit evidence.
Useful? React with 👍 / 👎.
| with batch_txn: | ||
| for f, cell, entity_key, loomweave_ext in prepared: | ||
| _route_one(f, cell, entity_key, loomweave_ext) |
There was a problem hiding this comment.
Keep judge calls out of the audit transaction
For surface_override batches routed through a judge-enabled EnforcementEngine, _route_one() calls engine.submit_override(), which evaluates the judge before appending. Since this new loop runs under AuditStore.transaction() (BEGIN IMMEDIATE on SQLite), a live LLM judge call now holds the governance write lock for the duration of each network request, blocking concurrent API/MCP governance writes and risking SQLITE_BUSY; compute judge decisions before opening the write transaction or limit the transaction to the actual appends.
Useful? React with 👍 / 👎.
… layer The scan-route request-routing decision — "is request-side routing allowed, and is the cell-spec well-formed?" — was hand-copied into both the HTTP (/wardline/scan-results) and MCP (scan_route) adapters, alongside the cell-spec parse and a byte-for-byte _parse_wardline_cell_map helper. The copies had already drifted: HTTP rejected an empty cell_by_severity (422) while MCP silently accepted an empty severity_map and routed nothing — the "check added to one transport not the other" failure mode the layering exists to prevent. Extract service.resolve_scan_routing (+ ResolvedRouting, _parse_cell_map_env) as the single decision. It raises WardlineRoutingError carrying a `kind` discriminator; each adapter maps kind to its own taxonomy — HTTP 500/403/422 (_WARDLINE_ROUTING_STATUS), MCP collapses all three to INVALID_CELL_SPEC (via _service_error, before the generic ServiceError case). Both dead _parse_wardline_cell_map copies are removed; env reads stay in the adapters. Behavior-preserving for every pinned case (HTTP 403/"server-owned" + 422; MCP INVALID_CELL_SPEC; the malformed-finding path still raises INVALID_ARGUMENT from inside route_wardline_scan, kept distinct from routing errors). One intended change closes the drift: an empty per-severity map is now rejected up front on both transports — no silent governance skip. TDD: tests/service/test_wardline.py (14 cases) + a new MCP regression test_scan_route_rejects_empty_severity_map. Full suite 699 passed, mypy + ruff clean, coverage floors hold, policy-boundary-check + governance-gate + SEI oracle green. rc4 review finding #4 (legis-604ddb8dd4). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes the P1/P2 findings from the 2026-06-06 architecture analysis (
docs/arch-analysis-2026-06-06-0158/). Fail-closed posture throughout; every change cites its finding ID.Findings addressed
High (Q-H)
LEGIS_API_SECRET_SCOPEgrantsoperator.service/is now the single decision path for governance (override-rate gate, verified-trail read, sign-off); CLI/API/MCP stop re-implementing it inline. Default store URLs moved to transport-agnosticlegis.config.ACCEPTEDwith no deterministic validator is downgraded toBLOCKED(recorded asjudge_advisory_verdict); the agent must escalate to operator sign-off.Medium (Q-M)
provenance="unauthenticated"(writer-asserted, not forge-verified).verify_integrityreturnsFalseon non-finite-float tamper instead of crashing.LEGIS_DEV_DEFAULT_CELLS=1.assert's test, not just its message).Low (Q-L)
entity_key;decay_sweepskips (and logs) one malformed row instead of aborting the whole sweep.AppendOnlyStoreprotocol rather than the concreteAuditStore.Review fixes (final commit)
Post-review cleanup, headlined by a P2 transport correctness fix:
_urllib_fetchwas transmitting defaultjson.dumpsbytes while the signature committed to canonical bytes; every signed POST (e.g.attach) would fail verification. Now sends the exact signed bytes; added a regression test that drives the real transport and verifies body-against-signature.transaction()'s appends-only contract (reads see a pre-batch snapshot; SQLite can hitSQLITE_BUSY) onAppendOnlyStoreandAuditStore.cells_neededreflects the cell_map's full reach (conservative by design).sign_filigree_requestand ADR-0003.verified_records(protected_gate=...)→trail_owner(also takes the sign-off gate).Verification
pytest tests/→ 522 passed, 2 skipped.Known scope limit
The legis↔Filigree signed handshake is verified against legis's own transport (it signs what it sends) and is byte-consistent with the proven loomweave channel, but there is no live integration test against Filigree's actual verifier in this repo (same as loomweave↔Loomweave). The fix is maximally compatible — it works whether Filigree hashes raw received bytes or re-canonicalizes first.
🤖 Generated with Claude Code