Problem
Exactly the shape of #123 (dangling_relation articulated in health.lint, enforced by no writer) and #81 (Claim.evidence non-empty), but on the four graph-reference fields the Claim model points at other artifacts with: entities, supersedes, superseded_by, contradicts (src/vouch/models.py:204-210).
#124 closed the dangling-reference gap for Relations and Pages — and the codebase says so out loud:
# src/vouch/storage.py:312-320
# Closes the structural counterpart of the #81 fix: every graph artifact
# (Relation source/target/evidence, Page entities/sources) must resolve
# to a known artifact in the KB before it lands on disk. ...
Note what is not in that list: the Claim's own entities, supersedes, superseded_by, contradicts. put_claim validates only claim.evidence:
# src/vouch/storage.py:348-357
def put_claim(self, claim: Claim) -> Claim:
    # Evidence entries can be Source IDs or Evidence IDs -- accept either.
    for cid_or_sid in claim.evidence:
        if (self._source_dir(cid_or_sid) / "meta.yaml").exists():
            continue
        if self._evidence_path(cid_or_sid).exists():
            continue
        raise ValueError(
            f"claim {claim.id} cites unknown source/evidence {cid_or_sid}"
        )
    # ... writes the file. entities / supersedes / superseded_by /
    # contradicts are never looked at.
update_claim is worse — it re-validates the model (the #82 fix) but performs zero reference-existence checks:
# src/vouch/storage.py:383-392
def update_claim(self, claim: Claim) -> Claim:
    if not self._claim_path(claim.id).exists():
        raise ArtifactNotFoundError(f"claim {claim.id}")
    Claim.model_validate(claim.model_dump(mode="json"))  # model only; no ref check
    self._claim_path(claim.id).write_text(...)
Compare _validate_relation_refs (src/vouch/storage.py:486-503) and put_page's claim/entity/source loop (src/vouch/storage.py:417-425) — both reject dangling endpoints before writing. The Claim write path has no equivalent.
The most damning part: fsck._check_lifecycle_chains already declares three of these four fields as error-severity invariants (src/vouch/health.py:251-288):
# dangling_supersedes (error): claim supersedes a missing claim
# dangling_superseded_by (error): claim.superseded_by points at a missing claim
# dangling_contradicts (error): claim contradicts a missing claim
So the system contract is explicit — these references must resolve — but it is enforced only after the fact by vouch fsck, never at any write boundary. This is the identical "invariant articulated at one boundary, enforced by zero writers" pattern that #81 / #123 / #125 / #149 each closed. (Claim.entities is the worst of the four: nothing checks it anywhere — not put_claim, not update_claim, not even fsck — so a claim pointing at a non-existent entity is completely invisible.)
Reach paths that land a dangling Claim reference
Direct construction — store.put_claim(Claim(id="c1", text="t", evidence=[src.id], contradicts=["ghost"], entities=["ghost"])). The evidence cite resolves, so put_claim's only loop passes; contradicts / entities are never inspected; the YAML lands on disk citing claims/entities that do not exist.
In-place mutation + update — c = store.get_claim("real"); c.supersedes = ["ghost"]; store.update_claim(c). Round-trips through Claim.model_validate (no ref validator) and persists.
Bundle / sync import: bundle.import_check's _check_dangling_refs (src/vouch/bundle.py:423-430) validates only claim.evidence for claims. It does check page.entities / page.sources just above, at :411-422, so the asymmetry sits in the same function. A manifest-consistent bundle whose claim YAML carries contradicts: ["../never-existed"] or entities: ["ghost"] passes import_check, and import_apply writes it straight to disk (dest.write_bytes(body), src/vouch/bundle.py:619; no put_claim, no ref guard). sync shares the same _validate_content path (src/vouch/sync.py:212,321).
Proposal approve — proposals.approve calls store.put_claim(claim) (src/vouch/proposals.py:376), inheriting the gap. An approved claim proposal whose payload has entities: ["ghost"] lands a dangling edge through the review-gated surface.
Reproducer (direct — no bundle needed)
from vouch.models import Claim
from vouch.storage import KBStore

# tmp: an empty directory for the KB (assumption: e.g. pytest's tmp_path)
store = KBStore.init(tmp)
src = store.put_source(b"e")

# evidence resolves, so put_claim's only check passes;
# contradicts + entities are never validated.
store.put_claim(Claim(
    id="c1", text="t", evidence=[src.id],
    contradicts=["does-not-exist"],
    entities=["also-missing"],
))

# The claim is on disk pointing at two artifacts that do not exist.
from vouch import health

report = health.fsck(store)
codes = {f.code for f in report.findings}
assert "dangling_contradicts" in codes  # fsck flags it AFTER the fact
assert report.ok is False               # but the write already succeeded
# `entities=["also-missing"]` produces NO finding at all: invisible.
Why this matters
The #124 guard comment (storage.py:314) names Relation and Page while silently omitting Claim.
Graph traversal silently breaks. relations_from / relations_to, kb.neighbors (#184 / #185), supersede/contradict chains, and any future graph-expansion context pack walk these inline links. A dangling superseded_by makes a claim look retired in favour of a claim that does not exist; a dangling contradicts marks a claim contested against a ghost (and lint's contested_no_contradiction check can't catch it because contradicts is non-empty; it just points at nothing).
fsck already promises these are errors. A KB that imported one poisoned bundle now fails vouch fsck with dangling_supersedes / dangling_superseded_by / dangling_contradicts: an operator-visible error for data that should never have been writable in the first place. The fix turns a post-hoc fsck error into a write-time rejection, exactly as #124 did for relations.
Claim.entities has zero coverage anywhere — not at write time, not in lint, not in fsck. It is the one reference field with no safety net at all.
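To make the traversal failure concrete, here is a minimal sketch that uses an in-memory dict as a stand-in for the claim store. The `claims` dict and the `resolve_current` helper are hypothetical names chosen for illustration; they are not vouch APIs, and the real traversal code may differ.

```python
from typing import Optional

# Hypothetical in-memory stand-in for the claim store; not the vouch API.
claims = {
    "c1": {"text": "old claim", "superseded_by": "c2"},
    "c2": {"text": "new claim", "superseded_by": None},
    "c3": {"text": "poisoned claim", "superseded_by": "ghost"},  # dangling
}


def resolve_current(claim_id: str) -> Optional[str]:
    """Follow superseded_by links until a claim with no successor is reached.

    Returns None when the chain runs into an ID that does not exist
    (or loops back on itself).
    """
    seen = set()
    while claim_id in claims and claim_id not in seen:
        seen.add(claim_id)
        successor = claims[claim_id]["superseded_by"]
        if successor is None:
            return claim_id  # end of chain: this claim is current
        claim_id = successor
    return None  # dangling link or cycle: no current claim found


print(resolve_current("c1"))  # follows c1 -> c2
print(resolve_current("c3"))  # c3 looks retired, but its successor is missing
```

In this toy model, `c3` is treated as retired, yet nothing can be returned in its place, which is the "retired in favour of a claim that does not exist" situation described above.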
Suggested fix
Mirror _validate_relation_refs / the #124 structure exactly — a storage-layer reference check on every Claim write path:
KBStore._validate_claim_refs(claim) (new, next to _validate_relation_refs in src/vouch/storage.py):
claim.entities → each must resolve via self._entity_path(eid).exists().
claim.supersedes, claim.contradicts, and claim.superseded_by (if not None) → each must resolve via self._claim_path(cid).exists().
Call it at the top of both put_claim (alongside the existing evidence loop) and update_claim (alongside the existing Claim.model_validate). The update_claim call is what closes the in-place-mutation reach path, by the same reasoning as the #82 re-validation fix.
Note on self-reference during lifecycle: supersede / contradict always load both ends via get_claim before linking (src/vouch/lifecycle.py:35-36,74-75), so honest lifecycle writes reference existing claims and stay green. The only writes this rejects are ones that should never have landed.
bundle.import_check symmetry (src/vouch/bundle.py:423-430): extend the claim branch of _check_dangling_refs to check claim.entities (against ids["entity"]) and claim.supersedes / claim.contradicts / claim.superseded_by (against ids["claim"]), exactly as the page branch just above already checks page.entities / page.sources. This keeps bundle rejection at import_check time (a clean schema validation failed / dangling reference issue) rather than relying solely on the storage guard, and closes the import_apply direct-write path which never calls put_claim.
Regression tests, matching the existing per-path layout:
tests/test_storage.py: put_claim with entities=["ghost"], supersedes=["ghost"], contradicts=["ghost"], superseded_by="ghost" each raise ValueError before any file is written; update_claim re-validates an in-place mutation to a dangling ref; a claim whose refs all resolve still lands (no regression). A supersede / contradict round-trip stays green (proves honest lifecycle writes are unaffected).
tests/test_bundle.py: a manifest-consistent bundle whose claim YAML has contradicts: ["ghost"] (and one with entities: ["ghost"]) is rejected by import_check with a dangling reference issue and import_apply refuses; a bundle whose claim graph refs all resolve round-trips cleanly.
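The storage-layer guard described above might look roughly like the following. This is a self-contained sketch, not the actual patch: `SketchStore`, its directory layout, and the simplified `Claim` dataclass are assumptions made so the example runs without vouch installed. Only the method names `_validate_claim_refs`, `_entity_path`, `_claim_path`, and the field names come from the issue text.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class Claim:
    # Simplified stand-in for vouch.models.Claim (assumption: field names only).
    id: str
    entities: List[str] = field(default_factory=list)
    supersedes: List[str] = field(default_factory=list)
    contradicts: List[str] = field(default_factory=list)
    superseded_by: Optional[str] = None


class SketchStore:
    """Toy store; the on-disk layout here is an assumption for illustration."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def _entity_path(self, eid: str) -> Path:
        return self.root / "entities" / f"{eid}.yaml"

    def _claim_path(self, cid: str) -> Path:
        return self.root / "claims" / f"{cid}.yaml"

    def _validate_claim_refs(self, claim: Claim) -> None:
        # Entity references must resolve to an existing entity file.
        for eid in claim.entities:
            if not self._entity_path(eid).exists():
                raise ValueError(
                    f"claim {claim.id} references unknown entity {eid!r}"
                )
        # Claim-to-claim links must resolve to an existing claim file.
        claim_refs = [*claim.supersedes, *claim.contradicts]
        if claim.superseded_by is not None:
            claim_refs.append(claim.superseded_by)
        for cid in claim_refs:
            if not self._claim_path(cid).exists():
                raise ValueError(
                    f"claim {claim.id} references unknown claim {cid!r}"
                )

    def put_claim(self, claim: Claim) -> Claim:
        self._validate_claim_refs(claim)  # reject before anything is written
        path = self._claim_path(claim.id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(f"id: {claim.id}\n")
        return claim
```

Running the check before any file operation means a rejected claim leaves no partial state behind, which is the property the proposed tests/test_storage.py cases ("raise ValueError before any file is written") would assert.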
No on-disk-layout, schema, bundle-format, MCP/JSONL surface, or audit-log change — strictly additive write-time validation. Existing KBs are unaffected on read; a legacy KB that already contains a poisoned reference surfaces via vouch fsck exactly as it does today. The migration story is identical to #81 / #123.
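For the bundle side, the claim branch of the dangling-reference check could be sketched as below. The function name `check_claim_refs`, the dict-shaped claim record, and the issue-string format are all hypothetical. The only detail taken from the issue is that references are compared against `ids["entity"]` and `ids["claim"]`, assumed here to be sets of known IDs.

```python
from typing import Dict, List, Set


def check_claim_refs(claim: dict, ids: Dict[str, Set[str]]) -> List[str]:
    """Return one issue string per dangling graph reference in a claim record."""
    issues: List[str] = []
    cid = claim.get("id", "<unknown>")

    # Entity references are checked against the known entity IDs.
    for eid in claim.get("entities") or []:
        if eid not in ids["entity"]:
            issues.append(f"dangling reference: claim {cid} -> entity {eid}")

    # supersedes / contradicts / superseded_by are checked against claim IDs.
    claim_refs = list(claim.get("supersedes") or [])
    claim_refs += list(claim.get("contradicts") or [])
    if claim.get("superseded_by") is not None:
        claim_refs.append(claim["superseded_by"])
    for ref in claim_refs:
        if ref not in ids["claim"]:
            issues.append(f"dangling reference: claim {cid} -> claim {ref}")

    return issues


ids = {"entity": {"e1"}, "claim": {"c1", "c2"}}
clean = {"id": "c2", "entities": ["e1"], "supersedes": ["c1"]}
poisoned = {"id": "c1", "contradicts": ["../never-existed"], "entities": ["ghost"]}

print(check_claim_refs(clean, ids))     # no issues expected
print(check_claim_refs(poisoned, ids))  # one issue per dangling reference
```

Collecting issues rather than raising on the first one follows the way the issue describes import_check reporting a "dangling reference issue"; whether the real function accumulates or short-circuits is not stated there.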
Error shape for _validate_claim_refs: on a missing reference, raise ValueError(f"claim {claim.id} references unknown <kind> {ref!r}"), matching the message shape put_page / _validate_relation_refs already use.
tests/test_health.py: a legacy on-disk claim YAML with a dangling contradicts (predating the guard) still surfaces as the existing dangling_contradicts fsck finding rather than crashing, which confirms the migration path mirrors #81 / #82.
Will follow with the patch.