Version: 2026-05-21 · Status: draft for 0.x. Pinned format expected at 1.0.
Audience: people writing alternative implementations of the kb.*
surface, people building adapters, people auditing what lands on disk.
Dated snapshots of this document live under spec/ — e.g. spec/2026-05-21/SPEC.md — so the protocol can evolve while older versions stay readable. The file you are reading is the latest; treat dated copies as immutable history.
This is the canonical written description of:
- The on-disk layout of a
.vouch/repository. - The object model (Source, Evidence, Claim, Entity, Relation, Page, Session, Proposal, AuditEvent).
- The
kb.*method surface, with parameter and result shapes. - The review-gate state machine (proposal → decision → durable artifact).
- The audit log, bundle format, and capabilities descriptor.
For implementation-level documentation see docs/. For the pydantic source of truth see src/vouch/models.py. For JSON Schema versions see schemas/.
This document uses RFC 2119 keywords (MUST / SHOULD / MAY).
A vouch knowledge base is a directory named .vouch/ at any path the
host chooses (typically the repo root). Every file inside is plain text
(YAML, markdown, or JSONL) except state.db.
.vouch/
├── config.yaml # KB settings
├── .gitignore # ignores proposed/, state.db
├── audit.log.jsonl # append-only audit log (committed)
├── state.db # SQLite FTS5 index (derived; not committed)
├── claims/<id>.yaml # durable, reviewed claims
├── pages/<id>.md # markdown with YAML frontmatter
├── sources/<sha>/meta.yaml # source metadata
├── sources/<sha>/content # captured bytes (optional)
├── entities/<id>.yaml # graph nodes
├── relations/<id>.yaml # graph edges
├── evidence/<id>.yaml # span pointers into sources
├── sessions/<id>.yaml # agent session records
├── proposed/<id>.yaml # pending proposals (gitignored, local)
└── decided/<id>.yaml # accepted/rejected (committed for audit)
Implementations MUST treat files on disk as the source of truth.
state.db is a derivable cache; any operation MUST be reproducible by
deleting state.db and rebuilding it from the files.
version: "0.1"
kb_name: my-project
agent: claude-code # default actor if VOUCH_AGENT is unset
retrieval:
backend: fts5 # fts5 | substring
fts5_porter: true
review:
require_citations: true # claims without citations fail validation
approver_role: human # human | trusted-agent- Source ids MUST be the lowercase hex sha256 of the captured bytes (64 hex chars). Re-registering identical bytes MUST be idempotent.
- Claim / Page / Entity / Relation / Evidence ids are kebab-case slugs unique within their kind. Implementations MAY derive them from user-provided titles but MUST guarantee uniqueness.
- Session ids SHOULD be ULIDs or timestamp-prefixed slugs so that sort order matches chronology.
- Proposal ids MUST be unique across
proposed/anddecided/combined — a proposal keeps its id when it moves between directories.
The authoritative pydantic definitions are in src/vouch/models.py. The shapes below are informative.
Immutable input material. Content-addressed by sha256.
| field | type | required | notes |
|---|---|---|---|
id |
hex sha256 | yes | 64 lowercase hex chars |
type |
enum (see §2.1.1) | yes | default file |
locator |
string | yes | path/URL/commit/etc. |
title |
string | no | |
hash |
hex sha256 | no | mirrors id |
immutable |
bool | no | default true |
scope |
enum | no | private/project/team/public |
byte_size |
int | no | |
media_type |
string | no | default text/plain |
created_at |
ISO-8601 datetime | yes | UTC |
metadata |
object | no | implementation-defined |
tags |
array of string | no |
file, url, transcript, message, commit, issue, screenshot,
pdf, audio, video, folder.
A span pointer into a Source. Claims cite Evidence ids or Source ids; both are valid citations.
| field | type | required | notes |
|---|---|---|---|
id |
slug | yes | |
source_id |
sha256 | yes | must reference an existing source |
source_type |
enum | no | mirrored from source for fast filter |
locator |
string | yes | L10-L20, t=00:14:23, #sec-3 |
quote |
string | no | extracted span text |
hash |
sha256 | no | hash of the quoted span |
created_at |
datetime | yes | UTC |
The smallest assertion that can be cited, contradicted, superseded, or archived.
| field | type | required | notes |
|---|---|---|---|
id |
slug | yes | |
text |
string | yes | one-sentence assertion |
type |
enum (§2.3.1) | yes | |
status |
enum (§2.3.2) | yes | |
confidence |
float 0..1 |
yes | default 0.7 |
evidence |
array of id | yes | MUST be non-empty (see §3) |
entities |
array of id | no | |
supersedes |
array of id | no | |
superseded_by |
id | no | set when superseded |
contradicts |
array of id | no | |
scope |
enum | no | |
tags |
array of str | no | |
created_at |
datetime | yes | |
updated_at |
datetime | yes | |
last_confirmed_at |
datetime | no | |
approved_by |
string | no | actor who ran kb.approve |
fact, decision, preference, workflow, observation, question,
warning.
working, actionable, stable, contested, superseded, archived,
redacted.
Entities are typed named things; Relations are typed edges.
EntityType: person, project, repo, company, concept,
decision, workflow, file, api, incident, source, agent,
tool, team, system.
RelationType: uses, depends_on, contradicts, supersedes,
supports, caused_by, owned_by, derived_from, similar_to,
blocks, implements, references.
Maintained markdown with YAML frontmatter on disk. Body is GFM markdown.
PageType: entity, concept, decision, workflow, session,
index, log, report, source-summary.
PageStatus: draft, active, stale, archived.
A work block. Bundles the proposals an agent produced within one session
so kb.crystallize can promote the durable parts.
vouch's review-gate primitive. See §4.
One line in audit.log.jsonl per mutation. event is a dotted verb:
source.register, proposal.create, proposal.approve,
proposal.reject, claim.supersede, claim.archive, bundle.import,
etc. The complete vocabulary is defined in
spec/audit-vocabulary.md.
These rules MUST be enforced server-side before a proposal can be created and before any approved artifact is written to disk:
- Every Claim MUST carry at least one citation (Source id or
Evidence id). Claims without citations are a validation error, not
a warning. Implementations MAY relax this for
working-status claims ifreview.require_citations: falseinconfig.yaml. - Every Evidence MUST reference an existing Source. Dangling evidence is rejected.
- Source ids MUST be the sha256 of their content. If the captured bytes don't hash to the id, the source is rejected.
- Supersession is directional. If A
supersedesB then B's status becomessupersededandB.superseded_byis set to A. Cycles are rejected. - Contradiction is symmetric. If A
contradictsB then B also gains A in itscontradictslist. - Approval is once-only. A proposal in
decided/cannot be moved back toproposed/. Re-proposing creates a new proposal id.
kb.propose_* kb.approve
agent ────────────────────────► proposed/ ──────────► durable artifact
│ (claims/, pages/, …)
│ kb.reject
└───────────────► decided/<id>.yaml
(status: rejected)
kb.propose_*writes the proposal toproposed/<id>.yamland emits an audit eventproposal.create.kb.approve <id>movesproposed/<id>.yamltodecided/<id>.yaml(status:approved) and writes the corresponding durable artifact to its kind-specific directory.kb.reject <id>moves the proposal todecided/<id>.yaml(status:rejected) and does not write a durable artifact.proposed/MUST be gitignored.decided/MUST be committed.- Proposals cannot mutate existing durable artifacts; they can only
create. Mutating durable artifacts uses the lifecycle methods
(
kb.supersede,kb.contradict,kb.archive,kb.confirm,kb.cite), each of which is directly audited.
The full state diagram and edge cases are in spec/review-gate.md.
A conforming implementation MUST expose the following methods. The
canonical names appear in kb.capabilities. Parameter and result
shapes are defined in spec/methods.md.
kb.capabilities, kb.status, kb.search, kb.context,
kb.read_page, kb.read_claim, kb.read_entity, kb.read_relation,
kb.list_pages, kb.list_claims, kb.list_entities,
kb.list_relations, kb.list_sources, kb.list_pending.
kb.register_source, kb.register_source_from_path, kb.source_verify.
Evidence is harmless and de-duplicates by content hash; gating it would just slow agents down without protecting anything.
kb.propose_claim, kb.propose_page, kb.propose_entity,
kb.propose_relation. All accept dry_run: true for validation-only.
kb.approve, kb.reject. Implementations MUST require an approver
identity distinct from proposed_by unless explicitly configured for
trusted-agent self-approval.
kb.supersede, kb.contradict, kb.archive, kb.confirm, kb.cite.
These mutate already-reviewed knowledge — they're metadata about
durable artifacts, not new assertions, so they don't need the gate.
kb.session_start, kb.session_end, kb.crystallize.
kb.index_rebuild, kb.lint, kb.doctor, kb.audit, kb.export,
kb.export_check, kb.import_check, kb.import_apply.
A conforming server MUST speak one of:
- MCP over stdio. JSON-RPC 2.0 framed per the MCP spec. Methods
appear as MCP tools named
kb_<method>(underscore for tool-naming compatibility); the canonical name in capabilities is stillkb.<method>. - JSONL over stdin/stdout. One JSON object per line. Request shape:
{"id": str, "method": str, "params": object}. Response shape:{"id": str, "ok": bool, "result"?: any, "error"?: {"code": str, "message": str}}. Error codes:method_not_found,missing_param,invalid_request,internal_error.
Servers MAY support both. See spec/transports.md.
audit.log.jsonl is one AuditEvent per line, appended on every
mutation. The log is committed and is the durable history of the KB.
- Events MUST include
id,event,actor,created_at,object_ids. - Events MAY include
dry_run,reversible, and andataobject for event-specific payload. - Events MUST NOT be deleted or rewritten. Rotation (compressing old segments into separate files) is permitted; in-place edits are not.
Full event vocabulary: spec/audit-vocabulary.md.
vouch export produces a tar.gz containing:
manifest.json # version, generated_at, files: [{path, sha256, bytes}]
claims/...
pages/...
sources/.../meta.yaml
sources/.../content
entities/...
relations/...
evidence/...
sessions/...
decided/...
config.yaml
Excluded from bundles: proposed/, state.db, audit.log.jsonl.
Proposals are draft-state; the audit log is local to one KB instance;
the index is derivable.
manifest.json carries a sha256 for every file. Import is gated by
vouch import-check, which produces a diff (new / conflict /
identical) before any destructive action. vouch import-apply
defaults to --on-conflict skip; overwriting requires explicit opt-in.
Bundle schema: schemas/bundle.manifest.schema.json.
Returned by kb.capabilities:
{
"name": "vouch",
"version": "0.0.1",
"spec": "vouch-0.1",
"methods": ["kb.capabilities", "kb.status", ...],
"retrieval": ["fts5", "substring"],
"review_gated": true,
"transports": ["mcp", "jsonl"],
"knowledge_capability": {
"kind": "local-cited-review-gated-kb",
"stores_evidence": true,
"audit_log": true
}
}Implementations claiming to speak this spec MUST set
"review_gated": true. Implementations without a review gate are not
conforming vouch servers — they're something else.
This spec follows the project version. Pre-1.0:
- The on-disk layout may change between minor versions; a migration tool is provided.
- The
kb.*surface may add methods at any time. Removing or reshaping a method is a breaking change and must appear inCHANGELOG.md.
At 1.0:
- The on-disk layout is frozen. Breaking changes require a major bump and a migration.
- The
kb.*surface is frozen for removal/reshape. Additions are always permitted.
A conforming implementation:
- Implements every method in §5 with the documented shapes.
- Enforces every validation rule in §3.
- Implements the review-gate state machine in §4.
- Writes audit events for every mutation per §7.
- Produces and consumes bundles matching §8.
- Returns a capabilities descriptor matching §9.
A test pack measuring this is planned for 0.2 — see ROADMAP.md.