-
Notifications
You must be signed in to change notification settings - Fork 26
fix(storage): reject dangling Claim graph references on every write path #201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: test
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -354,6 +354,36 @@ def _evidence_ref_exists(self, ref_id: str) -> bool: | |
|
|
||
| # --- claims ------------------------------------------------------------ | ||
|
|
||
| def _validate_claim_refs(self, claim: Claim) -> None: | ||
| """Reject dangling graph references on a Claim before it lands. | ||
|
|
||
| The #124 graph-integrity fix closed `Relation.source/target/evidence` | ||
| and `Page.entities/sources` (see the note above `_node_exists`) but | ||
| left the Claim's own four reference fields — `entities`, | ||
| `supersedes`, `superseded_by`, `contradicts` — unchecked on every | ||
| write path. `fsck._check_lifecycle_chains` already declares three of | ||
| them as `error`-severity findings (`dangling_supersedes`, | ||
| `dangling_superseded_by`, `dangling_contradicts`), so the invariant | ||
| was articulated but enforced by no writer. Enforce it here, the same | ||
| way `_validate_relation_refs` guards relation endpoints. | ||
|
|
||
| `evidence` is validated separately by `put_claim` (it accepts either | ||
| a Source id or an Evidence id, a different resolution surface). | ||
| """ | ||
| for eid in claim.entities: | ||
| if not self._entity_path(eid).exists(): | ||
| raise ValueError( | ||
| f"claim {claim.id} references unknown entity {eid!r}" | ||
| ) | ||
| claim_refs = [*claim.supersedes, *claim.contradicts] | ||
| if claim.superseded_by is not None: | ||
| claim_refs.append(claim.superseded_by) | ||
| for cid in claim_refs: | ||
| if not self._claim_path(cid).exists(): | ||
| raise ValueError( | ||
| f"claim {claim.id} references unknown claim {cid!r}" | ||
| ) | ||
|
|
||
| def put_claim(self, claim: Claim) -> Claim: | ||
| # Evidence entries can be Source IDs or Evidence IDs -- accept either. | ||
| for cid_or_sid in claim.evidence: | ||
|
|
@@ -364,6 +394,7 @@ def put_claim(self, claim: Claim) -> Claim: | |
| raise ValueError( | ||
| f"claim {claim.id} cites unknown source/evidence {cid_or_sid}" | ||
| ) | ||
| self._validate_claim_refs(claim) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
With this new write-time guard, a claim proposal whose Useful? React with 👍 / 👎. |
||
| try: | ||
| with self._claim_path(claim.id).open("x") as f: | ||
| f.write(_yaml_dump(claim.model_dump(mode="json"))) | ||
|
|
@@ -398,6 +429,11 @@ def update_claim(self, claim: Claim) -> Claim: | |
| # The Claim model's field validators only run at construction | ||
| # time; mutation alone bypasses them unless we round-trip. | ||
| Claim.model_validate(claim.model_dump(mode="json")) | ||
| # Re-check graph references too: an in-place mutation can introduce a | ||
| # dangling entities/supersedes/superseded_by/contradicts link that the | ||
| # model validator can't catch (it has no KB access). Mirrors the | ||
| # put_claim guard so the update path can't reintroduce the gap. | ||
| self._validate_claim_refs(claim) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Because Useful? React with 👍 / 👎. |
||
| self._claim_path(claim.id).write_text(_yaml_dump(claim.model_dump(mode="json"))) | ||
| self._embed_and_store(kind="claim", id=claim.id, text=claim.text) | ||
| # Keep the FTS5 row in sync with the on-disk claim so lifecycle | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this starts rejecting
claim.entities, any existing KB that already has a claim pointing at a deleted or misspelled entity can no longer archive/confirm/update that claim, butvouch fsckstill only checks the claim lifecycle fields (supersedes,superseded_by,contradicts) and never reports dangling claim entities. Please add a lint/fsck finding for these entity refs alongside the new write-time gate so users have a preflight repair path instead of discovering the blocker only when an unrelated update fails.Useful? React with 👍 / 👎.