feat(synthesize): kb.synthesize answer-mode retrieval over the review-gated KB by dripsmvcp · Pull Request #238 · vouchdev/vouch

dripsmvcp · 2026-06-17T01:52:09Z

feat(synthesize): `kb.synthesize` answer-mode retrieval over the review-gated KB

What changed

Adds kb.synthesize — an answer-mode counterpart to kb.context. Where
kb.context returns a ranked list of relevant items, kb.synthesize
answers a query in prose, but strictly from approved (durable) claims,
with an inline [claim_id] citation behind every sentence.

New surface, wired across all three transports that the capabilities test
keeps in sync:

- `src/vouch/synthesize.py`: `synthesize(store, *, query, depth=3, max_chars=4000, llm=False)`. Walks `build_context_pack(... limit=depth)`,
  keeps only claim items that resolve to a durable claim via
  `store.get_claim`, and composes a deterministic answer: one short,
  single-clause sentence per claim, each carrying at least one `[claim_id]`
  citation. No sentence is emitted that isn't traceable to a claim id.
  `max_chars` truncates by dropping trailing claims (never by cutting a
  citation). Returns
  `{"query", "answer", "claims", "gaps", "_meta": {"synthesis_confidence"}}`.
  `gaps` lists the query's salient terms for which no approved claim was
  found (and is the whole answer when nothing matched). `synthesis_confidence`
  is high when every cited claim is stable, medium when any is
  working/actionable, low when any is contested. `llm=True` raises
  (reserved for an opt-in generative backend; deterministic synthesis is the
  v1 default).
- `src/vouch/capabilities.py`: `kb.synthesize` appended to `METHODS`.
- `src/vouch/jsonl_server.py`: `_h_synthesize` handler + `HANDLERS` entry.
- `src/vouch/server.py`: `@mcp.tool() kb_synthesize(query, depth=3, max_chars=4000)`.
- `src/vouch/cli.py`: `vouch synthesize "<query>" [--depth N] [--max-chars N]`.
- `CHANGELOG.md`: `### Added` bullet under `## [Unreleased]`.
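
For readers skimming the description, the "one cited sentence per claim, truncate by dropping trailing claims" rule can be sketched roughly as below. This is an illustration only, not the code in `src/vouch/synthesize.py`: the `Claim` dataclass and `compose_answer` are hypothetical names, and the real module works on claims resolved through `store.get_claim`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical stand-in for an approved claim; not vouch's real model."""
    id: str
    text: str


def compose_answer(claims: list[Claim], max_chars: int = 4000) -> tuple[str, list[str]]:
    """Build one cited sentence per claim, dropping trailing claims that do not fit."""
    sentences: list[str] = []
    cited: list[str] = []
    used = 0
    for claim in claims:
        # Every sentence ends with its own [claim_id] citation.
        sentence = f"{claim.text.rstrip('. ')} [{claim.id}]."
        # +1 accounts for the space joining this sentence to the previous one.
        cost = len(sentence) + (1 if sentences else 0)
        if used + cost > max_chars:
            break  # drop this claim and all later ones; never cut a citation
        sentences.append(sentence)
        cited.append(claim.id)
        used += cost
    return " ".join(sentences), cited
```

Because the budget check happens per whole sentence, the number of citations in the answer always equals the number of cited claims, which is the invariant the `max_chars` test in the test plan checks.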

Why / root cause

kb.context is a retrieval primitive: it ranks and budgets items but leaves
answer composition (and the discipline of only using approved knowledge) to
the caller. There was no first-class way to ask the KB a question and get a
prose answer whose every clause is provably backed by a reviewed claim, with
the uncovered parts of the question surfaced rather than silently dropped.
kb.synthesize fills that gap deterministically — citation-gated by
construction, so it cannot fabricate an unbacked sentence — and grades its own
confidence from the lifecycle status of the claims it actually cited.
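
A minimal sketch of the confidence grading described above, assuming lifecycle statuses are plain strings. The function name `grade_confidence` is hypothetical, and the behaviour for an empty list of cited claims is an assumption; the description does not specify it.

```python
def grade_confidence(statuses: list[str]) -> str:
    """Grade confidence from the statuses of the claims that were actually cited."""
    if not statuses:
        return "low"  # assumption: nothing cited means nothing to be confident about
    if "contested" in statuses:
        return "low"  # any contested claim pulls the grade down
    if all(status == "stable" for status in statuses):
        return "high"  # every cited claim is stable
    return "medium"  # at least one working/actionable claim, none contested
```

The ordering matters in this sketch: the contested check runs first so that a mix of stable and contested claims grades as low rather than medium.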

Test plan

tests/test_synthesize.py covers:

- 3 approved auth claims → non-empty answer citing all 3 ids by `[id]`,
  confidence high.
- A query the KB doesn't cover → `answer == ""`, `claims == []`, `gaps`
  populated with the query's salient terms.
- Fuzz/traceability: every sentence in a non-empty answer carries at least one
  `[id]` citation whose id is in `claims` and resolves via `store.get_claim`.
- `max_chars` drops trailing claims without cutting a citation
  (citation count == cited-claim count).
- Confidence reflects claim status (working → medium, contested → low).
- `llm=True` raises the reserved-backend `ValueError`.
- `kb.synthesize` is in `capabilities().methods` and in the JSONL `HANDLERS`,
  and is callable via `handle_request` end-to-end.
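
The traceability property in the list above could be checked along these lines. This is a sketch, not the contents of `tests/test_synthesize.py`: `uncited_sentences` is a hypothetical helper, it assumes sentences are separated by a period followed by whitespace and that claim text contains no inner periods, and it checks ids against a plain set instead of resolving them via `store.get_claim`.

```python
import re

# Matches an inline citation such as [c1] and captures the id.
CITATION = re.compile(r"\[([^\[\]]+)\]")


def uncited_sentences(answer: str, claim_ids: set[str]) -> list[str]:
    """Return the sentences that lack a citation or cite an id not in claim_ids."""
    sentences = [s for s in re.split(r"(?<=\.)\s+", answer.strip()) if s]
    bad: list[str] = []
    for sentence in sentences:
        ids = CITATION.findall(sentence)
        if not ids or any(cid not in claim_ids for cid in ids):
            bad.append(sentence)
    return bad
```

A test would then assert that `uncited_sentences(result["answer"], ids)` is empty, where `ids` is the set of ids taken from the returned `claims`.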

Verification gate (fresh venv, editable install of this worktree):

$ ./.venv/bin/ruff check src tests
All checks passed!

$ ./.venv/bin/mypy src
Success: no issues found in 30 source files

$ ./.venv/bin/python -m pytest -q
94 passed, 6 skipped in 0.81s

(The 6 skips are pre-existing numpy/embedding-optional tests, unrelated to this
change.)

Closes #222

Summary by CodeRabbit

New Features
- Answer synthesis capability now available over approved knowledge base claims with inline citations.
- Gap reporting identifies uncovered query topics; confidence grading reflects claim stability.
- New vouch synthesize CLI command and corresponding API/MCP interfaces.
Documentation
- Updated changelog and release documentation for synthesis feature.

…#222) Add deterministic, citation-gated synthesis over approved claims with an explicit gaps block and synthesis_confidence. Wired across CLI, MCP, and JSONL; capabilities lists kb.synthesize. Tests cover citation traceability and the no-coverage gaps path.

coderabbitai · 2026-06-17T01:52:21Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c50bcd2f-a8da-4388-a1cd-19dbce4cf5b9

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


📝 Walkthrough

Walkthrough

Adds kb.synthesize, a new deterministic answer-mode retrieval feature. A new synthesize.py module builds citation-bearing prose exclusively from approved KB claims, reporting gaps for uncovered query terms and a confidence grade. The feature is wired into the capabilities list, JSONL server, MCP server, and CLI, and validated by a new test suite.

Changes

kb.synthesize answer-mode retrieval

| Layer / File(s) | Summary |
| --- | --- |
| Core synthesis implementation: `src/vouch/synthesize.py` | New module with `_salient_terms`, `_clause`, `_covers`, `_confidence` helpers and the top-level `synthesize()` function. Builds citation-bearing prose from approved claims, stops at `max_chars`, computes `gaps` from uncovered query terms, and assigns `synthesis_confidence` from claim lifecycle statuses. Raises `ValueError` for `llm=True`. |
| Transport wiring: `src/vouch/capabilities.py`, `src/vouch/jsonl_server.py`, `src/vouch/server.py`, `src/vouch/cli.py` | Adds `"kb.synthesize"` to `METHODS`; introduces `_h_synthesize` handler and `HANDLERS["kb.synthesize"]` in the JSONL server; adds `kb_synthesize` MCP tool in `server.py`; registers `vouch synthesize` Click command with `--depth` and `--max-chars` options in `cli.py`. |
| Tests and documentation: `tests/test_synthesize.py`, `CHANGELOG.md`, `PR_BODY.md` | Full test suite covering citation completeness, gap reporting, per-sentence citation traceability, `max_chars` truncation, confidence grading, `llm=True` error, capabilities registration, and the JSONL handler. Changelog and PR body document the feature, output shape, and transport surfaces. |

Sequence Diagram

sequenceDiagram
  participant CLI as vouch synthesize
  participant MCP as kb_synthesize
  participant JSONL as kb.synthesize handler
  participant synthesize as synthesize module
  participant KBStore

  CLI->>synthesize: synthesize(store, query, depth, max_chars)
  MCP->>synthesize: synthesize(_store(), query, depth, max_chars)
  JSONL->>synthesize: synthesize(store, query, depth, max_chars, llm)
  synthesize->>KBStore: context_pack(query, depth)
  KBStore-->>synthesize: ranked claim items
  synthesize->>KBStore: load claim artifacts
  KBStore-->>synthesize: claim objects with lifecycle status
  synthesize-->>CLI: query, answer with citations, claims, gaps, synthesis_confidence
  synthesize-->>MCP: query, answer with citations, claims, gaps, synthesis_confidence
  synthesize-->>JSONL: query, answer with citations, claims, gaps, synthesis_confidence

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 A query arrives, the burrow hums with thought,
Only approved claims — the freshest that were caught!
Each sentence wears a [claim_id] badge with pride,
And gaps are named for topics that weren't inside.
No hallucination here, just carrots, crisp and true —
synthesis_confidence: high 🥕 for me and you!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: introducing kb.synthesize for answer-mode retrieval over the review-gated KB, which is the core feature of this PR. |
| Linked Issues check | ✅ Passed | The PR implementation addresses all core objectives from `#222`: deterministic answer-mode synthesis, inline [claim_id] citations, gaps reporting, synthesis_confidence grading, integration across CLI/MCP/JSONL transports, and llm=True error handling. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to `#222`: new synthesize module, integration points (CLI/MCP/JSONL/capabilities), tests, and documentation; no extraneous changes detected. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@PR_BODY.md`:
- Line 65: Add a language tag to the fenced code block that contains the ruff
check command output. Change the opening fence from an unlabeled ``` to ```bash
to properly indicate the code block contains bash shell commands, which
satisfies the markdownlint MD040 requirement for labeled code blocks.

In `@src/vouch/synthesize.py`:
- Around line 128-132: The gaps computation currently only checks coverage
against cited_claims, which may exclude approved claims that were dropped due to
truncation. Modify the condition in the gaps list comprehension where _covers is
called to also check coverage against the approved claims in addition to
cited_claims, so that terms are not marked as gaps if they are covered by any
discovered approved claim, regardless of whether that claim was included in the
final cited_claims output.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 492b7f02-ccad-4554-88cf-e152ff2d3ba3

📥 Commits

Reviewing files that changed from the base of the PR and between 3beb821 and 07d8376.

📒 Files selected for processing (8)

CHANGELOG.md
PR_BODY.md
src/vouch/capabilities.py
src/vouch/cli.py
src/vouch/jsonl_server.py
src/vouch/server.py
src/vouch/synthesize.py
tests/test_synthesize.py

coderabbitai · 2026-06-17T01:57:49Z

+
+Verification gate (fresh venv, editable install of this worktree):
+
+```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add a language tag to the fenced verification block.

Line 65 uses an unlabeled fenced block; markdownlint MD040 flags this.

📝 Suggested fix

-```
+```bash
 $ ./.venv/bin/ruff check src tests
 All checks passed!
 ...
-```
+```

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 65-65: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


Source: Linters/SAST tools

coderabbitai · 2026-06-17T01:57:49Z

+    gaps = [
+        term
+        for term in _salient_terms(query)
+        if not (cited_claims and _covers(term, *cited_claims))
+    ]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Compute gaps from discovered approved claims, not only emitted sentences.

Line 128–132 currently checks coverage only against cited_claims (claims that fit max_chars). If truncation drops a covering claim, gaps incorrectly reports missing knowledge even though an approved claim was found.

💡 Suggested fix

-    cited_claims = [c for c in approved if c.id in set(cited)]
     gaps = [
         term
         for term in _salient_terms(query)
-        if not (cited_claims and _covers(term, *cited_claims))
+        if not _covers(term, *approved)
     ]
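
To make the difference concrete, here is a small self-contained sketch of the two ways of computing `gaps`. The `covers` helper is a hypothetical substring match standing in for `_covers`, whose real matching rule is not shown in this thread, and the claim texts and query terms are made up for illustration.

```python
def covers(term: str, *claim_texts: str) -> bool:
    """Hypothetical coverage test: case-insensitive substring match."""
    needle = term.lower()
    return any(needle in text.lower() for text in claim_texts)


def compute_gaps(terms: list[str], claim_texts: list[str]) -> list[str]:
    """Return the terms that none of the given claim texts cover."""
    return [term for term in terms if not covers(term, *claim_texts)]


approved = ["Sessions expire after one hour", "Refresh tokens rotate on use"]
cited = approved[:1]  # pretend max_chars dropped the second claim
terms = ["sessions", "refresh", "billing"]

# Checking only cited claims reports "refresh" as a gap even though an
# approved claim covers it; checking all approved claims does not.
gaps_from_cited = compute_gaps(terms, cited)
gaps_from_approved = compute_gaps(terms, approved)
```

In this sketch `any()` over an empty argument list is already `False`, so no separate guard for the "nothing matched" case is needed; whether the same holds for the real `_covers` depends on its implementation.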


The test-branch merge into feat/222-synthesize left jsonl_server._load_cfg referencing yaml and sessions.session_end referencing salience without their imports, and an unsorted cli import block — ruff F821/I001 failed the CI lint step. Add import yaml to jsonl_server, add salience to the sessions package import, and sort the cli imports. ruff, mypy, and pytest all pass.

coderabbitai Bot reviewed Jun 17, 2026


plind-junior changed the base branch from main to test June 17, 2026 04:48

plind-junior and others added 2 commits June 16, 2026 21:48

Merge branch 'test' into feat/222-synthesize

e14ee9b

plind-junior merged commit a661bd2 into vouchdev:test Jun 17, 2026
5 checks passed

plind-junior merged 3 commits into vouchdev:test from dripsmvcp:feat/222-synthesize
