
Release 1.0.0 — final (drop the rc)#9

Open
tachyon-beep wants to merge 22 commits into main from rc4

Conversation

@tachyon-beep

Collaborator

Brings main from the rc4-release state (PRs #7/#8) to 1.0.0 final. 22 commits; merge is clean (origin/main is an ancestor of rc4, 0 conflicts). No behavior change in the release commit itself — it is the version cut + release-prep docs.

What's in it

Version cut (64208dd): 1.0.0rc4 → 1.0.0 across pyproject, legis.__version__ (MCP serverInfo / /health / legis --version), uv.lock; CHANGELOG [Unreleased] → [1.0.0].

Security / honesty — two adversarial review passes, all findings closed:

  • Second pre-ship review (01382d5): JUDGE-3 protected cell now fail-closed unconditionally; GOV-2 identity-gaps no longer reports a false all-clear; F1 TrailVerifier docstring corrected.
  • First risk audit (5076170, b36939d, 98c9f5c, 0a9cfe9, acdbff0+691e838+cf42727, 41e0b20, 0dabc8b): GOV-1 lineage divergence surfaced at the posture root; POLICY-1 disabled-evidence-test detection; AUD-1 delete-and-rechain forgery closed (v3 seq-binding + head anchor); AUD-3 synchronous=FULL; INSTALL-1 split-brain detection; ID-3 signed SEI capability probe; JUDGE-1 prompt-stuffing cap; AUTH-1 / POLICY-2 / CRYPTO-THRESHOLD lows.

The full adversarial threat model ships public: docs/release-1.0-risk-audit.md + docs/release-1.0-pre-ship-review.md (reproduced attack recipes and all), linked from the README. It is a "forced me to do the right thing" discipline, not a hardened security boundary; the residual tiers (raw DB-file write, model-robustness, response-integrity-rests-on-TLS) are named honestly.

Operator surface: legis doctor --fix (canonical flag) with [auto-fixable]/[operator] repairability tagging + filigree-install-gated scope check (84a8047, a11378e); operator config + output-interpretation guides (d5a7580, b975567).

Agent MCP surface: dogfood LEG-1/2/3 closed — policy_list discoverability, matched_rule, scan_route cell-trap message, envelope next_action (f5f5a8b); scan-level artifact posture echoed at the scan_route root (18c3a11).

Federation contracts: adopted Wardline's suppression_state key (fbdf949, W3); honest unconfigured-governance seams N3/N4 + C-8 key confinement preserved (f921562).

Verification

  • Full suite 825 passed, 2 skipped; ruff + mypy clean.
  • See CHANGELOG.md [1.0.0] for the authoritative notes.

Not done in this PR (release follow-ups, operator's call)

  • git tag v1.0.0 (the changelog compare link assumes it).
  • PyPI publish.
  • Post-1.0 backlog + conceptual extensions are tracked in Filigree (label post-1.0), not here.

🤖 Generated with Claude Code

tachyon-beep and others added 22 commits June 8, 2026 01:32
…oc; C-8 preserved

Dogfood-#2 governance honesty (convention C-10), branch-local — merge/release
gated on the filigree-first propagation. Capability confinement (proposed C-8)
preserved throughout: operator signing keys stay out of agent reach, nothing is
auto-provisioned/relocated, no MCP tool enables a cell or self-grants authority.

N3 (weft-df8d2ef454, C-10(c)) — legis no longer ships dark and quiet:
- mcp.py _recovery_for: INVALID_CELL_SPEC names LEGIS_WARDLINE_CELL /
  LEGIS_WARDLINE_CELL_BY_SEVERITY (covers all WardlineRoutingError kinds, incl.
  those str(exc) misses); CELL_NOT_ENABLED split into the keyless simple tier
  (policy/cells.toml / LEGIS_POLICY_CELLS / LEGIS_DEV_DEFAULT_CELLS) and the
  complex tier (LEGIS_HMAC_KEY, operator out-of-band + relaunch). Subsumes Le1.
- doctor.py: two report-only checks (check_policy_cells, check_wardline_routing)
  naming the enablement path when unwired — presence-only, no repair param,
  write nothing, never render a key value. Fail-closed preserved (no auto-open).

N4 (weft-a7a92a40dd, C-10(d)) — honest dirty-tree skip:
- WardlineDirtyTreeError.to_payload() is the single source both transports
  (mcp.py scan_route + api/app.py) serialize: structured reason/posture/cause/
  remediation, routed==[] (governs nothing). No scan_route call argument added;
  the LEGIS_WARDLINE_ALLOW_DIRTY dirty-snapshot opt-in stays an env-only
  operator switch.

C3 (weft-f506e5f845) — charter now documents that legis's OWN audit records
carry a self-asserted agent_id/operator_id (launch-bound + HMAC-tamper-evident,
not authenticated); verified_author:null maps to those fields.

Guards: test_c8_no_agent_reachable_enablement_or_signing_surface (no enable/sign
tool; scan_route schema locked) + doctor checks write-nothing/render-no-key test.
762 passed; ruff + mypy clean; coverage 92.30%; per-package floors hold;
policy-boundary-check PASS; SEI oracle PASS. Designed + adversarially red-teamed
(C-8 verdict: safe) and implementation-reviewed via multi-agent workflows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ntime

Acceptance branch 1 of N3 (weft-df8d2ef454) — "a fresh stdio launch CAN reach a
configured non-secret surface" — was only proven via injected-engine unit tests;
the CHANGELOG and ticket comments assert "chill/coached reachable keyless" as
fact. Add a test that exercises the REAL launch path: build_runtime() with no
LEGIS_HMAC_KEY + the LEGIS_DEV_DEFAULT_CELLS=1 chill posture, then override_submit
-> ACCEPTED_SELF via the lazy keyless _engine. A future change making _engine
require a key now fails here instead of silently falsifying the promise.
(Scan-route axis already pinned by test_scan_route_uses_server_owned_cell.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…48eb2)

Wardline renamed the per-finding output key `suppressed` -> `suppression_state`
across all surfaces incl. the SIGNED legis scan artifact, changing the canonical
signed bytes and breaking the Wardline->legis hop (wardline's opt-in legis_e2e
oracle red by design). legis adopts the new key.

- ingest: WardlineFinding.from_wire reads `suppression_state`; the dataclass
  field, error message, and active_defects branches follow. Values unchanged
  (active/waived/suppressed/baselined/judged); the `Suppressed` enum (value
  vocabulary) and SUPPRESSION_PROOF_KEYS are untouched.
- clean break: a finding carrying only the legacy `suppressed` key reads as
  `active` and OVER-gates — fail-safe (never silently drops a real defect),
  pinned by test_legacy_suppressed_key_is_ignored_clean_break.
- NO signing/canonical change: legis's signer already reproduces Wardline's
  rekeyed golden byte-for-byte. Added the legis-side cross-impl golden MIRROR
  legis was missing: sign(_GOLDEN_FIELDS, _GOLDEN_KEY) == hmac-sha256:v2:2b2cf09…
  over `suppression_state`, so the hop self-verifies on both ends.
- intake fixtures: ~40 `suppressed` test fixtures across tests/wardline,
  tests/api, tests/mcp, tests/store renamed to `suppression_state` (a sweep
  flagged these to avoid vacuously-green suppression-path assertions).
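
The clean-break behaviour in the list above can be sketched as follows. This is a minimal illustration, not legis's code: `Finding` is a trimmed-down stand-in for `WardlineFinding`, the `rule_id` field is hypothetical, and defaulting a missing key to `active` is assumed from the commit's description of the fail-safe.

```python
from dataclasses import dataclass

# Value vocabulary quoted from the commit message.
VALID_STATES = {"active", "waived", "suppressed", "baselined", "judged"}


@dataclass(frozen=True)
class Finding:
    """Hypothetical, trimmed-down stand-in for WardlineFinding."""

    rule_id: str
    suppression_state: str

    @classmethod
    def from_wire(cls, payload: dict) -> "Finding":
        # Only the new key is read. A payload that carries just the legacy
        # `suppressed` key falls through to the default and reads as active.
        state = payload.get("suppression_state", "active")
        if state not in VALID_STATES:
            raise ValueError(f"unknown suppression_state: {state!r}")
        return cls(rule_id=payload["rule_id"], suppression_state=state)


def active_defects(findings: list) -> list:
    # Only active findings gate; every other state was explicitly dispositioned.
    return [f for f in findings if f.suppression_state == "active"]


legacy = Finding.from_wire({"rule_id": "W001", "suppressed": "waived"})
current = Finding.from_wire({"rule_id": "W002", "suppression_state": "waived"})
gating = active_defects([legacy, current])
```

Here the legacy-keyed finding still gates while the correctly keyed waiver does not, which is the over-gating direction the commit calls fail-safe.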

Acceptance: legis 767 tests green; golden byte-agreement pinned; the live signed
hop verifies — wardline's `-m legis_e2e` test_legis_accepts_signed_artifact
PASSES against the reinstalled legis (real build_legis_artifact -> signed
suppression_state artifact -> legis verifies + routes). Branch-only; ship via the
filigree-gated rc4->main merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… resolution

check_policy_cells claimed to "mirror mcp._load_policy_cell_registry" but the
root fallback differs: the resolver uses os.getcwd() when LEGIS_SOURCE_ROOT is
unset, while doctor uses its passed-in root. The env precedence is faithfully
mirrored; the root resolution is a deliberate difference (they coincide when
doctor runs from the server's launch CWD). Tighten the docstring to say so.

Docstring-only; no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…oute root (opp #6)

scan_route returned `{outcome: ROUTED, routed:[...]}` with no top-level posture
field, so an agent relaying "governance passed" could not tell a keyless
dev-grade pass (unverified/dirty) from a CI-signed `verified` pass — the posture
was only buried in each routed record's provenance, and absent entirely when
nothing routed. Same vacuous-green fidelity gap as wardline W2.

- `route_wardline_scan` now returns `RoutedScan(routed, artifact_status)`
  instead of a bare list, surfacing the scan-level `artifact_status` that
  `verify_wardline_artifact` already computes
- both surfaces echo it at the response root: the MCP `scan_route` tool and the
  HTTP `/scan-route` adapter (identical contract)
- new MCP test asserts a keyless unsigned scan echoes `artifact_status:
  "unverified"` at the top level; the exact-shape routing test gains the field

Closes gap-analysis opp #6.
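
A rough sketch of the return-shape change described above. The names `RoutedScan`, `routed` and `artifact_status` come from the commit; modelling it as a `NamedTuple` and the `to_response` helper are assumptions made for illustration, and other response fields are omitted.

```python
from typing import NamedTuple


class RoutedScan(NamedTuple):
    routed: list          # per-finding routing records
    artifact_status: str  # scan-level posture, e.g. "verified" or "unverified"


def to_response(scan: RoutedScan) -> dict:
    # Hypothetical serializer: echo the posture at the response root so it
    # stays visible even when nothing routed.
    return {"routed": scan.routed, "artifact_status": scan.artifact_status}


response = to_response(RoutedScan(routed=[], artifact_status="unverified"))
```

With a bare list, an empty `routed` would have carried no posture at all; the tuple keeps the scan-level status alongside it.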

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…filigree-scope check (N1)

Close two release-1.0 risk-audit gaps:

POLICY-1 — a pinned, running evidence test could be disabled after the
fact with @pytest.mark.skip / skipif / xfail. The fingerprint is blind to
decorators (Q-L5 parity), so the drift check is byte-identical and cannot
see the disablement. Add a highest-priority disabled-evidence judgement in
the shared evaluate_test_evidence so both the runtime gate and the static
boundary scanner reject it identically (new POLICY_BOUNDARY_TEST_DISABLED).
Marker match is terminal-name based, so it catches the import-alias form
(`from pytest import mark; @mark.skip`) whose only tell lives outside the
function source the fingerprint sees.
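
One way to picture the terminal-name match is with the standard `ast` module. This is a sketch under assumptions: the helper names are hypothetical and legis's actual `evaluate_test_evidence` logic may differ.

```python
import ast
from typing import Optional

DISABLING_MARKERS = {"skip", "skipif", "xfail"}


def _terminal_name(node: ast.expr) -> Optional[str]:
    """Return the last dotted component of a decorator expression."""
    if isinstance(node, ast.Call):       # @pytest.mark.skip(reason="...")
        node = node.func
    if isinstance(node, ast.Attribute):  # pytest.mark.skip or mark.skip
        return node.attr
    if isinstance(node, ast.Name):       # bare @skip
        return node.id
    return None


def disabling_markers(source: str, test_name: str) -> list:
    """List the disabling markers that decorate `test_name` in `source`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name == test_name:
            names = (_terminal_name(d) for d in node.decorator_list)
            return [n for n in names if n in DISABLING_MARKERS]
    return []


SAMPLE = '''
from pytest import mark

@mark.skip(reason="temporarily off")
def test_boundary():
    assert True

def test_live():
    assert True
'''
```

Because only the final attribute name is compared, `@pytest.mark.skip` and the import-alias form `@mark.skip` are treated the same way.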

N1 — add report-only check_filigree_binding_scope to doctor: an unscoped
federation-write binding in .mcp.json (/api/weft/… etc.) is fail-closed
with HTTP 400 by a filigree server-mode daemon, so scans silently
non-emit. Warn (not error — harmless against single-project/stdio) and
name the offending URL + the scoped form to use.

/governance/lineage-integrity computed status as "unverified" if
unavailable else "verified", ignoring integrity.divergences. A
confirmed external tamper (divergence list populated) reported
status="verified" — a false green at the top-level posture while the
same payload carried the divergence.

Three-way precedence: any divergence -> "diverged" (most severe,
confirmed tamper) over "unverified" (can't check) over "verified".

The existing divergence test pinned the divergences list but pointedly
omitted the status assertion; pin status="diverged" so the false green
cannot regress.
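
The three-way precedence can be written as a small function. The function and parameter names here are hypothetical; only the three status strings and their ordering come from the commit.

```python
def lineage_status(divergences: list, unavailable: bool) -> str:
    """Most severe state wins: diverged > unverified > verified."""
    if divergences:
        return "diverged"    # confirmed tamper outranks everything
    if unavailable:
        return "unverified"  # could not check
    return "verified"
```

The old behaviour corresponds to skipping the first branch, which is how a populated divergence list could still report `verified`.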

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… anchor (AUD-1)

An attacker with DB-file write access could delete an audit record and
re-chain the survivors undetectably: the hash chain is plain SHA (keyless,
recomputable) and the HMAC bound record *content* but never its chain
*position*, so every surviving signature still verified and the chain stayed
internally consistent. service/governance.py already documented that whole-
trail verify catches mutation but not deletion.

Two complementary, isolated mechanisms now close it:

* seq-binding (v3) + contiguity — interior delete and reorder. verify_integrity
  gains an expected-seq counter (a re-chained gap is now a tamper), and
  protected + sign-off verdicts sign at v3, folding the chain seq into the
  HMAC. A renumber-to-hide-a-deletion then fails to verify at the new
  position. seq is taken from the column at verify time, never a payload field.
  Resolved the sign-before-seq ordering with a store-mediated append_signed:
  the store reserves seq + prev_hash under its BEGIN IMMEDIATE lock and hands
  them to a signer callback, so the bound seq is provably the row's seq with no
  race. The store stays key-agnostic (the callback closes over the gate's key).

* HeadAnchor (opt-in) — tail-truncation, the one thing seq-binding structurally
  cannot catch (a truncated head is legitimately last). A small HMAC-signed
  sidecar remembers the last (seq, chain_hash); a missing anchor on an anchored
  store fails closed. Wired as optional gate/verifier params, off by default —
  conceded-capability hardening that does not touch the 1.0 core.
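
A minimal sketch of the seq-binding and contiguity ideas, assuming a simplified row of `(seq, content, signature)`. The hash chain, the store-mediated `append_signed`, and the real v3 canonical form are all omitted; `sign_with_seq` and `verify_trail` are hypothetical names.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical; a real key is operator-held


def sign_with_seq(key: bytes, seq: int, content: bytes) -> str:
    # Fold the chain position into the MAC so a signature is only valid
    # at the seq it was issued for.
    message = seq.to_bytes(8, "big") + content
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_trail(key: bytes, rows: list) -> list:
    """rows are (seq, content, signature); returns a list of problems."""
    problems = []
    for expected_seq, (seq, content, signature) in enumerate(rows, start=1):
        if seq != expected_seq:
            problems.append(f"gap: expected seq {expected_seq}, found {seq}")
            break  # positions past a gap are no longer trustworthy
        # seq is the stored row position, never a field of the signed payload.
        if not hmac.compare_digest(signature, sign_with_seq(key, seq, content)):
            problems.append(f"bad signature at seq {seq}")
    return problems


trail = [
    (i, f"record-{i}".encode(), sign_with_seq(KEY, i, f"record-{i}".encode()))
    for i in range(1, 4)
]

# Attack 1: delete the middle row and leave the gap.
gapped = [trail[0], trail[2]]
# Attack 2: delete the middle row and renumber the survivor to hide the gap.
renumbered = [trail[0], (2, trail[2][1], trail[2][2])]
```

The gap is caught by the expected-seq counter alone; the renumbered row passes the counter but fails its MAC, because the signature was issued for seq 3. Tail truncation passes both checks, which is why the commit adds the separate head anchor.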

The shared sign()/verify() primitive keeps its v2 default, so the cross-tool
Wardline artifact contract and the binding ledger are byte-for-byte untouched.
Binding ledger stays v2 (separate, homogeneous store) but is covered by the new
contiguity check; renumber-within that store is a documented residual, as is the
inherent renumber-vulnerability of an all-unsigned (chill/coached) run.

Tests: three attack PoCs, each isolating one mechanism (interior-delete-gap →
contiguity; delete-and-renumber → v3 seq-HMAC; tail-truncate → anchor), plus
HeadAnchor unit coverage (forged/missing/reappend/no-op) and a v3 signing pin.
Full suite 793 passed, 2 skipped.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ail-loss (AUD-3)

The audit store ran synchronous=NORMAL under WAL. NORMAL only fsyncs the WAL at
a checkpoint, so a committed-but-not-yet-checkpointed append is lost on a
power-cut while the database stays consistent. The survivors form a contiguous,
fully-signed hash chain — a valid-looking SHORTENED trail indistinguishable
from "nothing more was ever written". For an audit-integrity store that silent
tail-loss is precisely the harm.

Set synchronous=FULL: each commit is fsynced, so a committed governance record
survives power loss; throughput is the correct thing to trade here. The floor
is intentionally not configurable — an audit store's durability must not be
lowerable back to the bug. SQLite's default wal_autocheckpoint still bounds WAL
growth, so no separate checkpoint lifecycle is needed.

This is the prevention half of the shortened-trail problem; AUD-1's out-of-band
head anchor is the detection half (it flags a trail that shrank below its
recorded head, whether by malice or by lost-tail).

Pinned by reading PRAGMA synchronous (==2 FULL) on a listener connection,
mirroring the existing WAL/busy_timeout pragma tests. Full suite 795 passed.
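
For reference, the pragma pin can be reproduced with the standard `sqlite3` module. This is a standalone sketch, not the legis store: `open_audit_db` and the temporary file path are illustrative.

```python
import os
import sqlite3
import tempfile


def open_audit_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    # FULL syncs the WAL on every commit, trading throughput for durability.
    conn.execute("PRAGMA synchronous=FULL")
    return conn


with tempfile.TemporaryDirectory() as tmp:
    conn = open_audit_db(os.path.join(tmp, "audit.db"))
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
    conn.close()
```

`PRAGMA synchronous` reports the level as an integer, so the pin compares against 2 (FULL) rather than the string.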

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ed limit (AUD-1 red-team)

An adversarial review of the AUD-1 anchor (5 red-team lanes, executed PoCs)
refuted every interior-delete / reorder / renumber / version-downgrade /
seq-soundness attack and confirmed the Wardline v2 contract is byte-for-byte
intact (201-test regression sweep green). It found one genuine residual: the
anchor's HMAC stops forgery but not REPLAY. The anchor is a single mutable
sidecar, so a snapshotting attacker can save a genuinely-signed early anchor
(head=1), let the trail grow, truncate the DB back to seq=1, and restore the
saved anchor — it verifies (real signature, consistent seq + chain_hash) and the
rollback goes undetected.

This is inherent to local same-filesystem storage: nothing on disk is beyond a
file-write attacker's rollback, so no purely-local check (counter, timestamp,
extra copy) closes it — that would be honesty theatre. The fix is a deployment
property: store the anchor on append-only/WORM or remote storage, or run an
external monitor on the anchored head's monotonicity.

The prior docstring over-claimed it detects "a rollback to an earlier consistent
prefix" — false under replay. Corrected to state precisely what it catches
(forgery; truncation by a late/non-snapshotting attacker) and the replay
limitation + its real mitigation. Pinned the boundary with an executable
known-limitation test so the over-claim cannot silently drift back.
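
The replay limitation can be shown with a toy anchor. The sidecar format below (a JSON body plus an HMAC) is an assumption made for illustration and is not the real HeadAnchor layout; `make_anchor` and `check_head` are hypothetical names.

```python
import hashlib
import hmac
import json

KEY = b"anchor-key"  # hypothetical


def make_anchor(key: bytes, seq: int, chain_hash: str) -> dict:
    body = json.dumps({"seq": seq, "chain_hash": chain_hash}, sort_keys=True)
    mac = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "mac": mac}


def check_head(key: bytes, anchor: dict, head_seq: int, head_hash: str) -> bool:
    """True when the anchor is authentic and matches the trail's current head."""
    expected = hmac.new(key, anchor["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(anchor["mac"], expected):
        return False  # forged or corrupted anchor
    recorded = json.loads(anchor["body"])
    return recorded == {"seq": head_seq, "chain_hash": head_hash}


early = make_anchor(KEY, 1, "h1")   # attacker snapshots this genuine anchor
latest = make_anchor(KEY, 3, "h3")  # the trail later grows to seq 3

# Truncating to seq 1 is caught while the latest anchor is still in place...
truncation_detected = not check_head(KEY, latest, 1, "h1")
# ...but restoring the snapshotted anchor makes the rollback verify.
replay_accepted = check_head(KEY, early, 1, "h1")
# A forged anchor (old body, new MAC) is still rejected.
forgery_rejected = not check_head(KEY, {"body": early["body"], "mac": latest["mac"]}, 1, "h1")
```

Nothing in the check can tell `early` from a legitimately current anchor, which is why the commit places the mitigation in deployment (append-only or remote storage, or an external monotonicity monitor) rather than in code.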

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…st-marker-only (INSTALL-1)

The injector deliberately tolerates a split brain: when a second legis
instruction block sits beyond a sibling tool's block, it cannot canonicalise
across the foreign block, so it rewrites the first block fresh, warns, and
leaves the stale second copy in place (foreign-safety wins over own-dedup). The
doctor's freshness probe, though, read the token off the FIRST marker only
(_MARKER_TOKEN_RE.search → first match) — so a fresh first block masked a stale
second block and the doctor reported "healthy" on exactly the conflicting-
guidance state it exists to catch.

Freshness now requires EXACTLY ONE legis block at the current token, via a new
foreign-aware walk (_own_open_marker_tokens) that reuses the injector's own
fence-tracking — a legis marker quoted inside a sibling block is not counted, so
the probe never miscounts a documented example as a real block. check_instruction
_block surfaces a split brain (>1 block) with an actionable hand-resolution
message and, since the injector cannot collapse it, does not falsely claim
repair fixed it.
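
The difference between the two probes can be sketched with a regular expression. The marker syntax and token values below are invented for the example, and the foreign-aware fence tracking that the real `_own_open_marker_tokens` walk performs is deliberately left out.

```python
import re

# Hypothetical marker syntax, used only for this illustration.
MARKER_RE = re.compile(r"<!-- legis:begin token=(\w+) -->")
CURRENT_TOKEN = "v2"

DOC = """\
<!-- legis:begin token=v2 -->
fresh guidance
<!-- legis:end -->
<!-- othertool:begin -->
sibling tool block
<!-- othertool:end -->
<!-- legis:begin token=v1 -->
stale guidance
<!-- legis:end -->
"""


def fresh_first_marker_only(text: str) -> bool:
    # The buggy probe: a fresh first block masks any later stale block.
    match = MARKER_RE.search(text)
    return match is not None and match.group(1) == CURRENT_TOKEN


def fresh_exactly_one(text: str) -> bool:
    # The corrected rule: exactly one block, and it carries the current token.
    return MARKER_RE.findall(text) == [CURRENT_TOKEN]
```

On the split-brain document above, the first probe reports fresh while the second does not.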

This is the same honesty discipline as GOV-1/POLICY-1: a gate must not report
green on the condition it exists to detect.

RED test pinned the false-"ok" first; both CLAUDE.md and AGENTS.md get the fix
via the shared check. Full suite 797 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
HttpLoomweaveIdentity.capability() probed GET /api/v1/_capabilities with an
explicit signed=False, so the request went out unsigned even when an HMAC key
was provisioned — the lone unsigned exception among the SEI routes, and the very
one that establishes whether legis trusts the provider as SEI-capable. On a
keyed deployment that left the trust-establishing handshake unauthenticated,
spoofable to capability=supported.

Sign it like every other route (the default path already no-ops signing when no
key is set, so loopback/trusted deployments are unchanged). Removed the per-call
`signed` knob from _request entirely: an unsigned opt-out is exactly the
affordance that caused this, and no other caller used it — so it cannot
reintroduce the gap. Wire integrity against an on-path response rewrite
remains TLS's job, which _validate_base_url already enforces for any non-loopback
(keyed) host.

RED-pinned the unsigned probe ({} headers when keyed) before the fix; added a
companion test that the keyless probe stays bare. Full suite 799 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… guard (JUDGE-1)

In the coached cell a model ACCEPTED maps straight to accepted=True, and the
agent-controlled rationale (and, on the degraded-to-locator branch, the entity
locator) flowed into the judge prompt with no length bound — so a prompt-stuffing
payload could bury the instruction or smuggle an injection into the model.

LLMJudge.evaluate now bounds the SERIALIZED request — {policy, entity, rationale}
exactly as build_prompt embeds it — at MAX_JUDGE_REQUEST_CHARS (8192) before the
model is consulted; over-cap is rejected as BLOCKED by a deterministic guard that
never calls the model (stamped with a self-documenting sentinel model id, not an
LLM identity). Measuring the serialized request (not the raw rationale) bounds
every agent-settable field in one check — rationale, entity locator, and the
ensure_ascii unicode-expansion variant (each non-ASCII char → 6-char \uXXXX, so a
raw-char cap would be 6x loose). Reject, never truncate: truncation would mutate
the rationale that is recorded and (protected cell) signed, and could pass a
front-loaded injection. The full over-cap rationale is still written to the
BLOCKED record, so the attempt stays attributable.
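
To see why the cap is measured on the serialized request, consider this sketch. The 8192 figure and the `{policy, entity, rationale}` shape come from the commit; the helper names are hypothetical and the exact serialization `build_prompt` uses is assumed to be plain `json.dumps` with `ensure_ascii=True`.

```python
import json

MAX_JUDGE_REQUEST_CHARS = 8192  # cap quoted in the commit message


def serialized_request(policy: str, entity: str, rationale: str) -> str:
    # ensure_ascii=True (the json default) escapes each non-ASCII BMP
    # character as a 6-character \uXXXX sequence.
    return json.dumps(
        {"policy": policy, "entity": entity, "rationale": rationale},
        ensure_ascii=True,
    )


def within_cap(policy: str, entity: str, rationale: str) -> bool:
    """Reject (never truncate) when the serialized request exceeds the cap."""
    request = serialized_request(policy, entity, rationale)
    return len(request) <= MAX_JUDGE_REQUEST_CHARS


ascii_rationale = "a" * 2000
unicode_rationale = "\u00e9" * 2000  # 2000 raw characters, 12000 once escaped
```

Both rationales are the same raw length, but only the ASCII one fits: a cap on raw characters would let the escaped variant through at several times the intended size.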

build_prompt's serialization (the structural-escape defense — a forged sibling
{"verdict":"ACCEPTED"} survives only as an escaped string value) is now pinned by
a round-trip test covering rationale AND entity injection (JUDGE-2). The module
docstring documents the residual honestly: a SEMANTIC injection that persuades
the model is a model-robustness property, not a code fail-open — mitigated by
attribution and, in the protected cell, by Q-H3's deterministic validator.
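
The structural-escape property is easy to check with a round trip. This sketch assumes the request is embedded with `json.dumps`; the policy and entity values are placeholders.

```python
import json

# An injection attempt: the rationale tries to close its own string and
# append a forged sibling verdict field.
rationale = 'fine", "verdict": "ACCEPTED'

embedded = json.dumps({"policy": "p", "entity": "e", "rationale": rationale})
parsed = json.loads(embedded)
```

After the round trip the forged text is still just the value of `rationale`, and no `verdict` key exists at the top level. This guards the structure only; a rationale that persuades the model semantically is the residual the docstring names.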

TDD: RED-pinned both stuffing vectors (rationale + entity reaching an accepting
model) and the model-never-consulted property before the guard; added an
in-cap boundary test so a thorough justification is not falsely blocked. Full
suite 803 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-2, CRYPTO-THRESHOLD-001)

Closes the last three low/post-1.0 items from docs/release-1.0-risk-audit.md.

POLICY-2 (this session) — remove the exemption-rescue mechanism outright.
PolicyGrammar had a VIOLATION->CLEAR exemption-rescue branch wired to an
agent-writable YAML loader (ExemptionAllowlist.from_file) with zero src
consumers — the latent bypass trap the finding names. Full removal: delete
policy/exemptions.py + tests/policy/test_exemptions.py, drop the exemptions
ctor param / _exemptions / rescue branch from grammar.py, and remove the 3
rescue-branch tests. New regression guard test_grammar_has_no_exemption_rescue
_mechanism pins that no exemption seam can be re-introduced by accident. This
supersedes the earlier conservative document-only closure of legis-e512e97bfc
(see ticket history): documenting around the loader left the trap in the tree.

AUTH-1 (doc) — app.py comment telegraphs that LEGIS_ALLOW_UNSCOPED_API_TOKENS=1
grants unscoped tokens operator authority (not renamed: the var already fits
the LEGIS_ALLOW_<bad-thing> family; audit remedy was "rename OR document").

CRYPTO-THRESHOLD-001 (doc) — README scopes the "cryptographic layer" to
intra-suite HMAC tamper-evidence with a self-asserted actor, not third-party
cryptographic proof; names RFC-8785 as the upgrade path.

Full suite green (792 passed, 2 skipped), ruff clean on changed files.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolve the 6 standing lint errors (default ruff E4/E7/E9/F ruleset):
- test_doctor.py: 5x E402 (module-level imports placed under mid-file section
  headers) — consolidated into the top import block; section comments kept.
- test_install.py: 1x F401 — dropped the unused `_legis_mcp_entry` import.

No behaviour change. Full suite green (792 passed, 2 skipped), ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Second adversarial pre-ship review (docs/release-1.0-pre-ship-review.md)
re-attacked the prior audit's self-verified fixes. Crypto-threshold held;
these gaps it surfaced are now closed, each independently re-verified.

- JUDGE-3 (protected-cell fail-open): the Q-H3 advisory-downgrade was gated on
  exact-match `protected_policies`, which diverges from the glob-capable cell
  routing — a protected-cell policy outside the set (incl. any glob route and
  the empty-set default) had its model ACCEPTED signed authoritative. The cell
  is now fail-closed UNCONDITIONALLY: it clears only on a validator-confirmed
  ACCEPTED. Independent re-attack then caught a second variant — a fooled model
  emitting the operator-only OVERRIDDEN_BY_OPERATOR (which _record_signed also
  counts as accepted) cleared the gate even for a declared protected policy.
  Closed at two layers: the judge JSON parser now restricts verdicts to
  {ACCEPTED, BLOCKED}, and submit() downgrades the whole accepted-set.
  Behavior change: with no validator wired (default prod), protected overrides
  now require operator sign-off. Regression tests at parser and gate levels.

- GOV-2: /governance/identity-gaps now returns a {status, gaps} envelope
  ("unavailable" vs "checked") so a can't-check state is not a false all-clear,
  matching the GOV-1 fix on the sibling lineage-integrity endpoint.

- F1: TrailVerifier docstring corrected — no longer claims modify-to-unsigned is
  caught; the modify-to-unsigned / tail-truncation residuals of the conceded
  raw-file-write tier are documented honestly (code hardening tracked post-1.0).

- POLICY-1: aliased-marker (`skipper = pytest.mark.skip; @skipper`) and
  fixture-skip vectors documented as residuals in _disabling_marker (zero live
  @policy_boundary sites; name-heuristic hardening tracked post-1.0).

- ID-SEI-1: LEGIS_ALLOW_INSECURE_REMOTE_HTTP now warns on a remote-plaintext
  bypass (loomweave + filigree clients); documented in README + federation doc.

- ID-SEI-2: resolver `alive` is now strict-bool; a non-bool truthy value
  degrades fail-closed instead of promoting to a stable SEI identity.

- README "Known security limitations" section + CHANGELOG entries.
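
As an illustration of the JUDGE-3 parser restriction in the list above, a verdict parser might look roughly like this. It is a sketch under assumptions: `parse_judge_verdict` is a hypothetical name, and mapping every disallowed or malformed reply to `BLOCKED` (rather than raising) is a guess at the fail-closed behaviour.

```python
import json

JUDGE_VERDICTS = {"ACCEPTED", "BLOCKED"}  # the only verdicts a model may emit


def parse_judge_verdict(raw: str) -> str:
    """Hypothetical parser: anything outside the allowed set fails closed."""
    try:
        verdict = json.loads(raw).get("verdict")
    except (json.JSONDecodeError, AttributeError):
        return "BLOCKED"  # not JSON, or not a JSON object
    if isinstance(verdict, str) and verdict in JUDGE_VERDICTS:
        return verdict
    return "BLOCKED"
```

Under this shape, a fooled model that emits the operator-only `OVERRIDDEN_BY_OPERATOR` is read as `BLOCKED` instead of clearing the gate.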

Suite 801 passed / 2 skipped; ruff + mypy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
doctor:
- `--fix` is now the canonical repair flag; `--repair` stays a working
  alias (argparse dest `fix`), so no script breaks.
- DoctorCheck gains a `repairable` bit; text view tags each problem
  `[fixed]` / `[auto-fixable]` / `[operator]` with footers that point
  auto-fixable items at `legis doctor --fix` and tell the operator that
  `[operator]` items need out-of-band config + a relaunch. JSON checks
  carry `repairable` additively.
- `install.filigree_scope` is gated on filigree actually being installed
  (file-existence probe, no filigree import): the unscoped-binding warning
  only fail-closes against a server-mode filigree daemon, so it is noise
  when filigree is absent. When it fires, the message names it operator-
  owned (the `--filigree-url` is operator-pinned in wardline's `.mcp.json`)
  and stays repairable=False.
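
The flag aliasing and tagging described above can be approximated with `argparse`. The `dest="fix"` detail is from the commit; `build_parser`, `tag` and the help text are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="legis doctor")
    # Both spellings set the same destination, so existing scripts keep working.
    parser.add_argument(
        "--fix", "--repair", dest="fix", action="store_true",
        help="apply auto-fixable repairs",
    )
    return parser


def tag(problem_fixed: bool, repairable: bool) -> str:
    # Hypothetical rendering of the text-view tags named in the commit.
    if problem_fixed:
        return "[fixed]"
    return "[auto-fixable]" if repairable else "[operator]"
```

Listing `--fix` first also makes it the spelling shown in the generated help output.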

tidy for 1.0 (version held at rc4 per the live-e2e gate):
- README + doctor docstring use the canonical `--fix` spelling.
- CHANGELOG [Unreleased] records the above.
- .gitignore ignores `.claude/*.lock` (transient scheduled-tasks lock).
- removed stray build artifacts (.coverage, coverage.json).

Full suite green (813 passed, 2 skipped), ruff + mypy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The README covers the *why* (the 2×2 concept) and the legis-workflow skill
covers the *agent-call* surface, but there was no human-operator guide for
"how do I configure this" and "what am I seeing when an agent does X". Adds
docs/guide/:

- configuration.md — the operator's governance-control reference: reconciles
  "zero human config" (the agent's experience) with the operator's two acts
  (choose the cell, hold the key); per-cell cost/buys table; the fail-closed
  routing default + resolution order; full LEGIS_* / OPENROUTER_* env-var
  reference grouped by purpose; and a separate, warning-carrying "dev-only /
  escape hatches" section for the LEGIS_UNSAFE_* / LEGIS_ALLOW_* flags.
- reading-legis-output.md — organized by "where it surfaces / what it means /
  do I act": keeps the recorded Verdict (ACCEPTED/BLOCKED/OVERRIDDEN_BY_OPERATOR)
  distinct from the override_submit outcome envelope (ACCEPTED_SELF /
  ACCEPTED_BY_JUDGE / BLOCKED / ESCALATED_PENDING / NEED_INPUTS); covers scan
  outcomes, artifact/identity/lineage statuses, the override-rate gate, CI exit
  codes, doctor tags, and flags the only signals that need a human in real time.
- README.md (index) + links from the top-level README.

Every flag/enum/command cited was verified against source (e.g. dropped a
spurious OPENROUTER_BASE_URL row that was a grep artifact of the
DEFAULT_OPENROUTER_BASE_URL constant, not a real env var).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The reference tables answer "what does signal Y mean / do I act"; a single
compact narrative (agent hits a coached policy → BLOCKED → revise →
ACCEPTED_BY_JUDGE → async review, with the structured ESCALATED_PENDING
contrast) converts the reference into the mental model behind the user's
literal question, "what am I seeing when an agent does X".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…gree's install predicate

Two corrections to the doctor checks landed in 84a8047:

- **Split-brain instruction block is not auto-fixable.** `--fix` returns before
  the repair branch for the >1-block split-brain case (the injector won't splice
  across a sibling tool's block), so tagging it `repairable=True` rendered a false
  `[auto-fixable]` signal that re-creates the very --fix loop the design
  eliminates. Now `repairable=False` → `[operator]`, matching the check's own
  "resolve it by hand" message. (Corrects the tag shipped in 84a8047.)

- **`_filigree_installed` now mirrors filigree's real install predicate.** It was
  an AND requiring `.filigree.conf` AND a `config.json`; filigree's
  `find_filigree_anchor` (core.py:1046-1064) treats a project as installed if ANY
  of three markers is present: `.filigree.conf` (file), `.weft/filigree/` (dir),
  or `.filigree/` (dir) — never AND, and the store/legacy checks are `.is_dir()`,
  not a `config.json` `.is_file()`. The old AND would return "not installed" for
  confless / legacy / conf-only installs and SILENTLY DROP a real unscoped-binding
  warning where filigree genuinely is installed — the false-green the governance
  honesty discipline forbids. Tests updated to cover conf-only, confless-weft, and
  confless-legacy installs (the last is the live federation-legacy-path case).
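
The corrected predicate amounts to an OR over three filesystem markers. The sketch below uses only the marker names given in the commit; `filigree_installed` is a hypothetical stand-in for `_filigree_installed`.

```python
import tempfile
from pathlib import Path


def filigree_installed(root: Path) -> bool:
    """Installed if ANY of the three markers named in the commit is present."""
    return (
        (root / ".filigree.conf").is_file()
        or (root / ".weft" / "filigree").is_dir()
        or (root / ".filigree").is_dir()
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    absent = filigree_installed(root)
    (root / ".filigree").mkdir()  # confless legacy install
    legacy = filigree_installed(root)
```

The confless legacy layout is exactly the case the old AND predicate reported as "not installed".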

Full suite green (815 passed, 2 skipped), ruff + mypy clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…te cell trap, envelope next_action

LEG-1: add the policy_list tool (routing table + each cell's honest enabled state, computed via a shared explain_cell so it can never disagree with policy_explain) and an additive matched_rule field on policy_explain (a configured policy reports its rule pattern; an unconfigured/hallucinated name reports null). cell_for now delegates to a new rule_for() so routing and discovery cannot drift.
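
The "cannot drift" property of LEG-1 follows from having one lookup that both routing and discovery call. A rough sketch, assuming glob patterns matched with `fnmatchcase`, a first-match-wins table, and `None` for unmatched names; the routing table and the `explain` helper are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional, Tuple

# Hypothetical routing table: (pattern, cell) pairs, first match wins.
ROUTES = [("payments.*", "protected"), ("docs.*", "chill")]


def rule_for(policy: str) -> Optional[Tuple[str, str]]:
    """Return the (pattern, cell) rule that matches, or None."""
    for pattern, cell in ROUTES:
        if fnmatchcase(policy, pattern):
            return pattern, cell
    return None


def cell_for(policy: str) -> Optional[str]:
    # Routing delegates to rule_for, so it cannot disagree with discovery.
    rule = rule_for(policy)
    return rule[1] if rule else None


def explain(policy: str) -> dict:
    rule = rule_for(policy)
    return {"policy": policy, "matched_rule": rule[0] if rule else None}
```

An unconfigured or hallucinated policy name yields `matched_rule: None`, matching the null the commit describes.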

LEG-2: the error envelope already carries next_action/recoverable for every code (_recovery_for); reconcile the SKILL.md error table to it verbatim and add one drift-lock test asserting every emitted code yields a non-empty next_action. No new abstraction.

LEG-3: scan_route's server-owned rejection now names the rejected request-side arg(s) (cell/severity_map/fail_on) while retaining the literal 'server-owned' substring; the cell/severity_map/fail_on schema descriptions state the LEGIS_UNSAFE_WARDLINE_REQUEST_ROUTING gating.

Additive only; no routing/enablement/tiering semantics changed. ruff + mypy clean; full suite 825 passed, 2 skipped (+10 tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Version 1.0.0rc4 -> 1.0.0 across pyproject, legis.__version__ (feeds the
MCP serverInfo, /health, and `legis --version`), and uv.lock. CHANGELOG
[Unreleased] -> [1.0.0] (2026-06-09) with refreshed compare links.

1.0 release-prep hygiene (same pass):
- README points to the now-public adversarial threat model — the risk
  audit and the independent pre-ship review, attack recipes and all —
  framed as the "forced me to do the right thing" discipline it is.
- Dropped the rc1 "Known limitations" list from the changelog: the MCP
  item was superseded at rc2; the live sibling-gated items moved to the
  Filigree tracker (outstanding work belongs in the tracker, not the log).

No code behavior change — version strings + docs only. Full suite green
(825 passed, 2 skipped; ruff + mypy clean).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 9, 2026 11:09

Copilot AI left a comment

Pull request overview

This PR cuts the 1.0.0 final release (from 1.0.0rc4) and includes the release-prep hardening and documentation that accompanied the adversarial reviews described in the PR metadata. It also updates several governance/attestation invariants (audit-trail tamper evidence, Wardline schema interop, MCP/operator surfaces) and expands tests/docs to pin the intended fail-closed behaviors.

Changes:

  • Bump versioning and release notes to 1.0.0 across package metadata and changelog.
  • Strengthen governance/audit integrity and posture reporting (v3 signatures with chain_seq, head anchor support, WAL durability pragma, structured skip payloads, more explicit routing errors).
  • Update Wardline ingest to the suppression_state wire key and add new/expanded tests and operator/agent-facing documentation.
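The seq-binding idea in the second bullet can be illustrated with a minimal sketch. It assumes an HMAC-SHA256 signature over the record fields plus the chain position; the names `sign_v3`, `verify_v3`, the `v3:` prefix string, and the JSON canonicalization are hypothetical stand-ins for illustration, not the project's actual wire format.

```python
import hashlib
import hmac
import json

V3_PREFIX = "v3:"  # hypothetical prefix; the real signature format may differ


def canonical(fields: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(fields, sort_keys=True, separators=(",", ":")).encode()


def sign_v3(fields: dict, chain_seq: int, key: bytes) -> str:
    """Sign the record fields together with their position in the chain."""
    payload = canonical({"chain_seq": chain_seq, "fields": fields})
    return V3_PREFIX + hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_v3(fields: dict, chain_seq: int, sig: str, key: bytes) -> bool:
    """Recompute the signature at the claimed position and compare in constant time."""
    if not sig.startswith(V3_PREFIX):
        return False
    return hmac.compare_digest(sig, sign_v3(fields, chain_seq, key))


key = b"demo-key-not-for-production"
record = {"policy": "protected", "verdict": "ACCEPTED"}
sig = sign_v3(record, chain_seq=7, key=key)

ok_in_place = verify_v3(record, 7, sig, key)    # record still at its signed position
ok_renumbered = verify_v3(record, 6, sig, key)  # record shifted down after a deletion
```

Because the position is part of the signed payload, a record that is moved or renumbered no longer verifies, even though its fields are untouched. A design like this still depends on the key staying out of the attacker's reach.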

Reviewed changes

Copilot reviewed 67 out of 69 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `uv.lock` | Update editable package version to 1.0.0. |
| `tests/wardline/test_policy.py` | Update test fixtures to use suppression_state. |
| `tests/wardline/test_ingest.py` | Update ingest tests for suppression_state and add golden/skip payload tests. |
| `tests/wardline/test_governor.py` | Update governor tests for suppression_state. |
| `tests/wardline/test_coached_routing.py` | Update coached routing tests for suppression_state. |
| `tests/test_install.py` | Remove unused import in MCP registration test. |
| `tests/store/test_head_anchor.py` | Add tests for the out-of-band head anchor behavior and limitations. |
| `tests/store/test_batch_read_free_invariant.py` | Update finding fixtures to use suppression_state. |
| `tests/store/test_audit_store.py` | Add/expand tests for synchronous=FULL and contiguity integrity checks. |
| `tests/service/test_wardline.py` | Add tests pinning improved Wardline routing error messages. |
| `tests/service/test_governance.py` | Update signing fields tests to include seq binding. |
| `tests/service/test_explain.py` | Pin matched_rule reporting in explain payloads. |
| `tests/policy/test_honesty_gate.py` | Add tests ensuring disabled evidence tests are rejected (POLICY-1). |
| `tests/policy/test_grammar.py` | Pin removal of exemptions seam and update related expectations. |
| `tests/policy/test_exemptions.py` | Remove exemptions tests (feature removed). |
| `tests/policy/test_evidence.py` | Add evaluator tests for disabled-marker detection (skip/xfail/skipif). |
| `tests/policy/test_boundary_scan.py` | Add end-to-end scan tests for disabled evidence tests. |
| `tests/mcp/test_server.py` | Add policy_list tool coverage, posture echoing, and next_action invariants. |
| `tests/identity/test_resolver.py` | Add test ensuring non-bool alive does not promote stable identity. |
| `tests/identity/test_loomweave_client.py` | Add tests for signed capability probe and insecure-HTTP warnings. |
| `tests/filigree/test_client.py` | Add tests for insecure-HTTP warning and enforcement behavior. |
| `tests/enforcement/test_trail_verify.py` | Add tests for seq-binding and anchored tail-truncation detection. |
| `tests/enforcement/test_signoff.py` | Update expected signature prefix to v3 for sign-offs. |
| `tests/enforcement/test_signing.py` | Add v3 signing/verification primitive tests. |
| `tests/enforcement/test_regressions.py` | Remove exemptions regression test (feature removed). |
| `tests/enforcement/test_protected_submit.py` | Update protected submit tests for fail-closed behavior + v3 binding. |
| `tests/enforcement/test_protected_override.py` | Update operator override signature expectations to v3. |
| `tests/enforcement/test_protected_extensions.py` | Update signature verification reconstruction to use seq. |
| `tests/enforcement/test_judge.py` | Add tests for operator-only verdict rejection and prompt-size cap. |
| `tests/api/test_sei_api.py` | Update API tests for new identity-gaps envelope and protected validator wiring. |
| `tests/api/test_complex_api.py` | Update API tests for fail-closed protected behavior + identity-gaps envelope. |
| `tests/api/test_combinations_api.py` | Update API tests to use suppression_state and pin structured dirty-skip fields. |
| `src/legis/wardline/ingest.py` | Implement suppression_state, structured dirty-tree skip payload, and updated active-defects logic. |
| `src/legis/store/protocol.py` | Extend store protocol with append_signed and head query support. |
| `src/legis/store/head_anchor.py` | Add new HeadAnchor implementation for tail-truncation detection. |
| `src/legis/store/audit_store.py` | Add append_signed, contiguity checks, synchronous=FULL, and head query helper. |
| `src/legis/service/wardline.py` | Improve routing error messages and return scan-level posture in routing result. |
| `src/legis/service/explain.py` | Add matched_rule and refactor explain plumbing (explain_cell). |
| `src/legis/policy/grammar.py` | Remove exemptions seam from policy grammar. |
| `src/legis/policy/exemptions.py` | Remove exemptions implementation (feature removed). |
| `src/legis/policy/evidence.py` | Add disabling-marker detection and return disabled evidence results. |
| `src/legis/policy/cells.py` | Add rule_for and expose rule list for routing introspection. |
| `src/legis/policy/boundary_scan.py` | Map disabled evidence outcome to a dedicated rule id. |
| `src/legis/mcp.py` | Add policy_list, improve scan_route output posture, and enrich recovery hints. |
| `src/legis/install.py` | Add split-brain detection helper for multiple legis instruction blocks. |
| `src/legis/identity/resolver.py` | Fail-closed alive handling requiring strict boolean True. |
| `src/legis/identity/loomweave_client.py` | Sign capability probe when keyed and warn on insecure remote HTTP. |
| `src/legis/filigree/client.py` | Warn on insecure remote HTTP when bypass flag is set. |
| `src/legis/enforcement/signoff.py` | Bind sign-off signatures to seq (v3) and optionally advance head anchor. |
| `src/legis/enforcement/signing.py` | Add v3 signature prefix support and verification dispatch. |
| `src/legis/enforcement/protected.py` | Add v3 seq-binding, head-anchor checking, and protected fail-closed logic. |
| `src/legis/enforcement/judge.py` | Add prompt-size cap guard and restrict allowed judge verdicts. |
| `src/legis/data/skills/legis-workflow/SKILL.md` | Document policy_list and updated error recovery hints. |
| `src/legis/cli.py` | Add canonical legis doctor --fix flag (keep --repair alias). |
| `src/legis/api/app.py` | Improve identity-gaps honesty envelope, lineage-integrity status, and Wardline responses. |
| `src/legis/__init__.py` | Bump `__version__` to 1.0.0. |
| `README.md` | Update release status and add security limitation + operator docs sections. |
| `pyproject.toml` | Bump project version to 1.0.0. |
| `docs/release-1.0-risk-audit.md` | Add published pre-release adversarial audit doc. |
| `docs/release-1.0-pre-ship-review.md` | Add published second-pass adversarial review doc. |
| `docs/guide/README.md` | Add operator guide index. |
| `docs/guide/reading-legis-output.md` | Add operator guide for interpreting outcomes/verdicts/statuses. |
| `docs/guide/configuration.md` | Add operator configuration guide and env var reference. |
| `docs/federation/sei-conformance.md` | Document TLS custody seal dependency and insecure-HTTP bypass implications. |
| `docs/design/legis-charter.md` | Expand charter note about self-asserted actor identity in records and federation writes. |
| `CHANGELOG.md` | Add 1.0.0 entry summarizing security/honesty closures and surface changes. |
| `.gitignore` | Ignore Claude Code transient lock files. |

Comment on lines +116 to +120
anchored_seq = body.get("head_seq")
anchored_chain = body.get("head_chain_hash")
if not sig or anchored_seq is None or anchored_chain is None:
    raise AnchorError(f"head anchor {self._path} is structurally malformed")
if not verify(_anchor_fields(anchored_seq, anchored_chain), sig, self._key):
Comment thread README.md

- **The coached cell is a model-robustness wall, not a cryptographic one.** A blocked agent clears the coached gate by convincing the LLM judge; a *malicious prompt injection* that persuades the model will likewise clear it. Structural injection (forging a verdict key) is closed and any transport/parse failure is fail-closed to `BLOCKED`, but the coached cell has no defense-in-depth against a model that is genuinely fooled. For verdicts that must not rest on the model's word, use the **protected** cell, where a judge `ACCEPTED` is advisory only and is downgraded to require operator sign-off (unless a deterministic, non-LLM validator confirms it).
- **Tamper-evidence assumes the signing key is out of the attacker's reach, and is not absolute against raw DB-file writes.** v3 signing binds each record's chain position, so in-place edits, reordering, and renumbering are detected. A holder of raw write access to the governance `.db` can still *delete* a record and re-chain, or rewrite a record's policy to a non-protected value and strip its protected markers ("modify-to-unsigned"), or truncate the tail — these are residuals of the conceded raw-file-write threat tier. The opt-in `HeadAnchor` mitigates truncation/rewind (with a documented anchor-replay caveat). Keep the governance store on storage only the operator controls.
- **Durability tier.** The audit store runs `synchronous=FULL`, but a power loss can still drop the most recent un-checkpointed appends; the trail stays internally consistent (a shortened-but-valid tail), it does not corrupt.
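The truncation/rewind mitigation mentioned in the tamper-evidence bullet can be sketched as follows. This is a simplified model, assuming an HMAC-signed anchor kept outside the database that records the last known head position; the `head_seq` and `head_chain_hash` keys mirror the review snippet above, while the `sig` key, `write_anchor`, `check_trail`, and the payload encoding are hypothetical.

```python
import hashlib
import hmac


class AnchorError(Exception):
    """Raised when the anchor is malformed or the trail does not reach it."""


def _anchor_payload(head_seq: int, head_chain_hash: str) -> bytes:
    # Hypothetical encoding; the real anchor fields may be serialized differently.
    return f"{head_seq}:{head_chain_hash}".encode()


def write_anchor(head_seq: int, head_chain_hash: str, key: bytes) -> dict:
    """Produce a signed record of the current head of the trail."""
    sig = hmac.new(key, _anchor_payload(head_seq, head_chain_hash), hashlib.sha256).hexdigest()
    return {"head_seq": head_seq, "head_chain_hash": head_chain_hash, "sig": sig}


def check_trail(trail: list[tuple[int, str]], anchor: dict, key: bytes) -> None:
    """Fail if the anchor is invalid or the anchored head is missing from the trail."""
    seq = anchor.get("head_seq")
    chain = anchor.get("head_chain_hash")
    sig = anchor.get("sig")
    if not sig or seq is None or chain is None:
        raise AnchorError("head anchor is structurally malformed")
    expected = hmac.new(key, _anchor_payload(seq, chain), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise AnchorError("head anchor signature does not verify")
    if (seq, chain) not in trail:
        raise AnchorError("trail does not contain the anchored head (truncated or rewound)")


key = b"demo-key-not-for-production"
trail = [(1, "aa"), (2, "bb"), (3, "cc")]  # (seq, chain_hash) pairs
anchor = write_anchor(3, "cc", key)

check_trail(trail, anchor, key)  # passes: the anchored head is present
```

A trail cut back to `trail[:2]` would raise `AnchorError` here. The sketch also makes the anchor-replay caveat visible: an attacker who can swap in an older, still validly signed anchor alongside the truncated trail would pass this check.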