fix(storage): reject dangling Claim graph references on every write path by galuis116 · Pull Request #201 · vouchdev/vouch

galuis116 · 2026-06-10T15:43:01Z

What changed

Adds a storage-layer reference guard, KBStore._validate_claim_refs (src/vouch/storage.py), that rejects a Claim whose entities, supersedes, superseded_by, or contradicts point at an artifact not present in the KB. It's called from both put_claim (alongside the existing evidence loop) and update_claim (alongside the existing Claim.model_validate re-check). bundle.import_check gains the matching check in the claim branch of its graph-integrity pass, mirroring the page checks already there. No model, on-disk-layout, schema, bundle-format, MCP/JSONL, or audit-log change.

Why

Fixes #196. Claim's four graph-reference fields were validated by no write path:

put_claim (src/vouch/storage.py:348) checked only claim.evidence.
update_claim (src/vouch/storage.py:383) re-validated the model (the fix(models): require Claim.evidence to be non-empty at the model layer #82 fix) but the model validator has no KB access, so dangling refs sailed through.
bundle.import_apply writes claim YAML straight to disk (dest.write_bytes, no put_claim); import_check validated only claim.evidence for claims.
proposals.approve → put_claim, inheriting the gap.

The #124 graph-integrity fix closed Relation.source/target/evidence and Page.entities/sources — and its own comment (src/vouch/storage.py:314) lists exactly those while silently omitting the Claim's own fields. Meanwhile fsck._check_lifecycle_chains (src/vouch/health.py:251-288) already declares dangling_supersedes / dangling_superseded_by / dangling_contradicts as error-severity findings. So the contract was articulated but enforced by no writer — the identical shape as #81 (Claim.evidence non-empty) and #123/#124 (Relation/Page dangling refs). Claim.entities had zero coverage anywhere — not at write time, not in fsck.

What might break

Strictly additive write-time validation. Honest writes are unaffected; only Claims whose graph refs point at non-existent artifacts start being rejected.

store.put_claim / store.update_claim raise ValueError("claim <id> references unknown entity/claim …") for a dangling entities / supersedes / superseded_by / contradicts.
bundle.import_check reports a dangling reference: … claim entity … / … claim graph ref … issue and import_apply refuses (it raises on the first import_check issue).
Lifecycle ops stay green: supersede / contradict (src/vouch/lifecycle.py:35-36,74-75) load both ends via get_claim before linking, so their refs always resolve.
Existing KBs are unaffected on read; a legacy KB already holding a poisoned ref still surfaces via vouch fsck exactly as before. The migration story matches bug: Claim model has no min-evidence validator — uncited claims land via bundle import, put_claim, update_claim #81 / validation gap: every write path lands Relations/Pages with dangling foreign-id references #123.

VEP

Not required — tightens write-time validation to match an invariant fsck already documents; no surface change.

Tests

Affected modules green locally — tests/test_storage.py, tests/test_health.py, tests/test_bundle.py, tests/test_cli.py (the touched test). ruff check src tests clean; mypy src → Success: no issues found. (The single failing test_bundle.py::test_import_apply_rejects_absolute_path is pre-existing on test — a Windows-specific path-message assertion unrelated to this change; verified it fails identically with these changes stashed.)
New / changed behaviour has tests:
- tests/test_storage.py (7): test_put_claim_rejects_unknown_{entity,supersedes,superseded_by,contradicts}_ref, test_put_claim_accepts_resolvable_graph_refs, test_update_claim_rejects_in_place_mutation_to_dangling_ref, test_lifecycle_contradict_round_trips_after_guard.
- tests/test_bundle.py (2): test_import_check_rejects_claim_with_dangling_graph_refs, test_import_check_accepts_claim_with_resolvable_graph_refs (honest round-trip guard).
- tests/test_health.py / tests/test_cli.py: the existing fsck dangling-chain tests now write poisoned claim YAML directly to disk (via a _write_claim_direct helper) so they reproduce the legacy on-disk state fsck must still surface, rather than relying on put_claim to plant data it now correctly rejects; the asymmetric-contradicts test creates its claims in dependency order.
CHANGELOG.md updated under ## [Unreleased] ### Fixed.

Closes #196.

🤖 Generated with Claude Code

Claim's four graph-reference fields — entities, supersedes, superseded_by, contradicts — were validated by no write path. put_claim checked only claim.evidence; update_claim re-validated the model (no KB access, so no ref check); bundle.import_apply writes claim YAML straight to disk. The vouchdev#124 graph-integrity fix closed Relation.source/target/evidence and Page.entities/sources but, as its own storage.py comment shows, skipped the Claim's own reference fields — even though fsck already declares dangling_supersedes / dangling_superseded_by / dangling_contradicts as error-severity findings. The invariant was articulated but enforced by no writer (same shape as vouchdev#81 / vouchdev#123). - storage.KBStore._validate_claim_refs: entities -> entity ids, supersedes/contradicts/superseded_by -> claim ids. Called from both put_claim and update_claim (the latter closes the in-place-mutation reach path, mirroring the vouchdev#82 re-validation fix). - bundle.import_check: extend the claim branch of the graph-integrity pass with the same checks, matching the existing page-ref checks, so a bundle can't land a dangling claim ref through import_apply's direct write. - Honest lifecycle writes are unaffected: supersede/contradict load both ends via get_claim before linking, so their refs always resolve. Regression tests in test_storage.py, test_bundle.py, and test_health.py; the fsck/CLI dangling-chain tests now write poisoned YAML directly to disk to reproduce the legacy on-disk state fsck must still surface. Closes vouchdev#196.

coderabbitai · 2026-06-15T17:31:12Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8548a9d8-1e00-45a6-82af-8b28e226bd39

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

The `test` merge restored this test to the put_claim(Claim(supersedes= ["ghost"])) form, which now trips the new _validate_claim_refs guard and fails CI. Write the dangling-ref claim YAML directly to disk instead — the same approach already used by the fsck dangling tests in test_health.py — so it reproduces the legacy/poisoned on-disk state fsck must still surface, without relying on a write path that now (correctly) rejects it.

plind-junior · 2026-06-17T10:27:43Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d3a134bdbc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-17T10:36:11Z

            raise ValueError(
                f"claim {claim.id} cites unknown source/evidence {cid_or_sid}"
            )
+        self._validate_claim_refs(claim)


Precheck claim refs before batch approval

With this new write-time guard, a claim proposal whose entities payload contains a typo (possible through MCP/JSONL kb.propose_claim, since propose_claim only validates evidence) now fails only when approve() reaches store.put_claim. check_approvable() still returns None for that proposal, so the default vouch approve a b path passes the all-or-nothing precheck and then catches this ValueError while continuing to approve later IDs, contradicting the documented “nothing was approved” semantics. Please validate claim entity refs before filing/prechecking proposals, or include this guard in check_approvable().

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-17T10:36:11Z

+        for eid in claim.entities:
+            if not self._entity_path(eid).exists():
+                raise ValueError(


Surface legacy claim entity refs in fsck

When this starts rejecting claim.entities, any existing KB that already has a claim pointing at a deleted or misspelled entity can no longer archive/confirm/update that claim, but vouch fsck still only checks the claim lifecycle fields (supersedes, superseded_by, contradicts) and never reports dangling claim entities. Please add a lint/fsck finding for these entity refs alongside the new write-time gate so users have a preflight repair path instead of discovering the blocker only when an unrelated update fails.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-17T10:36:11Z

+        # dangling entities/supersedes/superseded_by/contradicts link that the
+        # model validator can't catch (it has no KB access). Mirrors the
+        # put_claim guard so the update path can't reintroduce the gap.
+        self._validate_claim_refs(claim)


Avoid partial lifecycle writes on legacy refs

Because update_claim() now raises on any existing dangling graph ref, lifecycle operations can leave a half-applied state when the second claim is legacy/poisoned. For example, vouch supersede old new writes old.status = superseded first, then store.update_claim(new) raises if new.entities or another graph field points at a missing artifact, so the KB records old.superseded_by = new without the reciprocal new.supersedes, relation, or audit event. Please pre-validate all touched claims before the first write or make the lifecycle update atomic.

Useful? React with 👍 / 👎.

galuis116 and others added 2 commits June 10, 2026 08:34

Merge branch 'test' into fix/claim-dangling-graph-refs

a487fe5

galuis116 force-pushed the fix/claim-dangling-graph-refs branch from c82a5f7 to d3a134b Compare June 16, 2026 09:11

chatgpt-codex-connector Bot reviewed Jun 17, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(storage): reject dangling Claim graph references on every write path#201

fix(storage): reject dangling Claim graph references on every write path#201
galuis116 wants to merge 3 commits into
vouchdev:testfrom
galuis116:fix/claim-dangling-graph-refs

galuis116 commented Jun 10, 2026

Uh oh!

coderabbitai Bot commented Jun 15, 2026 •

edited

Loading

Review skipped

Uh oh!

plind-junior commented Jun 17, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 17, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 17, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

galuis116 commented Jun 10, 2026

What changed

Why

What might break

VEP

Tests

Uh oh!

coderabbitai Bot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

plind-junior commented Jun 17, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 17, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented Jun 15, 2026 •

edited

Loading