diff --git a/docs/architecture.md b/docs/architecture.md index b7ad43c..ab3cd30 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -1,16 +1,17 @@ # Witnex Architecture -> Phase 1. Types and end-to-end flow only — no proving logic is implemented yet. +> Status: Prompt 1 (scaffold) and Prompt 2 (working slice) are implemented. The +> Risc0 guest builds and proves in **dev mode**; a full real-STARK, +> receipt-checked end-to-end path is the remaining work (see Status). Witnex turns each agent execution into a **tamper-evident commitment** and a **ZK proof** that the commitment is well-formed, so a third party can verify -*what the agent did* without re-running it and without seeing the underlying -plaintext. +*what the agent did* without re-running it. ## What is proven (and what is not) -| Property | Proven in Phase 1? | -|----------|--------------------| +| Property | Proven? | +|----------|---------| | The agent committed to a specific input and output (input/output integrity) | ✅ | | Tool calls occurred in the claimed order with claimed parameters (trace integrity) | ✅ | | The committed inputs/outputs/tool data form a consistent hash chain | ✅ | @@ -24,12 +25,25 @@ It does **not** say the output is the right answer. | Component | Crate / package | Role | |-----------|-----------------|------| -| Agent runtime + trace recorder | `witnex-core` | Defines `Agent`, `ExecutionTrace`, `ToolCall` and supporting commitment types. | -| Prover | `witnex-prover` | Risc0 zkVM host + guest; proves a trace well-formed. | -| Verifier | `witnex-verifier` | Standalone verification of a `ProofBundle`. | -| Demo CLI | `witnex-cli` | `witnex demo summarize` / `witnex verify`. | +| Runtime + commitment types | `witnex-core` | `Agent`, `ExecutionTrace`, `ToolCall`, `Digest`; `ExecutionTrace::commit`/`commitment` (the canonical SHA-256 hash chain); the `LlmBackend` trait + `MockBackend`. | +| LLM backend | `witnex-anthropic` | `AnthropicBackend` — `LlmBackend` over the Claude Messages API. | +| Proof bundle types | `witnex-prover` | `ProofBundle { trace, commitment, proof }`. | +| Structural verifier | `witnex-verifier` | `StructuralVerifier` — recomputes the commitment (Phase-1, needs the full trace). | +| Demo CLI | `witnex-cli` | `witnex demo summarize` (mock or Anthropic, env-gated) / `witnex verify` (structural). | +| zkVM (separate workspace) | `zkvm/` | Risc0 guest (recompute commitment → journal), host `prove`/`verify`, and the `witnex-zkvm` CLI (`prove`/`verify` with a real receipt). | | SDK | `@witnex/sdk` | TypeScript mirror of the trace/proof types. | +## Two verification modes + +| Mode | Where | Checks | Needs | +|------|-------|--------|-------| +| **Structural** (Phase 1) | `witnex verify` (`witnex-verifier`) | recompute `trace.commitment()` == `bundle.commitment` | the full trace; no toolchain | +| **ZK receipt** | `witnex-zkvm verify` (`zkvm/host`) | Risc0 receipt verifies against the guest image id; journal binds to the trace | the rzup toolchain (Linux/macOS/WSL) | + +Both check the *same property* — that the commitment was correctly derived. The +ZK receipt additionally proves it cryptographically (the right guest ran), and +is the path toward succinct verification without re-execution. + ## End-to-end sequence ```mermaid @@ -37,41 +51,40 @@ sequenceDiagram autonumber actor Caller as Caller / Developer participant Agent as Agent (witnex-core) - participant LLM as LLM backend (Anthropic) - participant Prover as Prover (Risc0 host + guest) + participant LLM as Backend (Anthropic / Mock) + participant Prover as zkVM host + guest (zkvm/) participant Verifier as Verifier (third party) Note over Caller,Agent: 1) Agent invocation - Caller->>Agent: execute(input) - Agent->>Agent: input_hash = SHA256(input) - Agent->>Agent: prompt_template_hash = SHA256(template) + Caller->>Agent: summarize(input) + Agent->>Agent: input_hash, prompt_template_hash = SHA256(...) Note over Agent,LLM: 2) Trace generation - Agent->>LLM: inference(prompt(template, input)) + Agent->>LLM: complete(model, system, messages, max_tokens) LLM-->>Agent: output - Agent->>Agent: output_hash = SHA256(output) - Agent->>Agent: record tool_calls (name, in/out hashes), timestamp, nonce - Agent-->>Caller: ExecutionTrace + Agent->>Agent: output_hash; record tool_calls, timestamp, nonce + Agent->>Agent: commitment = SHA256 hash chain over all fields + Agent-->>Caller: ProofBundle { trace, commitment, proof:empty } - Note over Agent,Prover: 3) Proof generation + Note over Caller,Prover: 3) Proof generation (witnex-zkvm prove) Caller->>Prover: prove(trace) - Prover->>Prover: guest re-derives & checks the hash chain (well-formedness) - Prover-->>Caller: ProofBundle { trace, proof } as JSON + Prover->>Prover: guest recomputes the commitment, commits it to the journal + Prover-->>Caller: ProofBundle with embedded Risc0 receipt Note over Caller,Verifier: 4) Verification (no re-execution) - Caller->>Verifier: proof.json - Verifier->>Verifier: verify receipt vs guest image id + public journal - alt valid receipt & well-formed trace - Verifier-->>Caller: VERIFIED - else - Verifier-->>Caller: INVALID + Caller->>Verifier: bundle.json + alt structural (witnex verify) + Verifier->>Verifier: recompute commitment == bundle.commitment + else ZK receipt (witnex-zkvm verify) + Verifier->>Verifier: receipt.verify(image id); journal == commitment end + Verifier-->>Caller: VERIFIED / INVALID ``` ## Trust model -- The verifier trusts the **Risc0 proof system** and the published **guest image - id**, nothing else about the prover. +- The ZK verifier trusts the **Risc0 proof system** and the published **guest + image id**, nothing else about the prover. - The verifier does **not** trust the agent's host, network, or LLM provider. - Timestamps and the model id are *committed*, not *proven honest* — a dishonest agent can still claim a wrong clock or a wrong model; what it cannot do is @@ -79,6 +92,13 @@ sequenceDiagram ## Status & next steps -- [x] Prompt 1 — workspace scaffold, core types, this document. -- [ ] Prompt 2 — implement the `witnex demo summarize` / `witnex verify` slice - with a minimal Risc0 hash-chain guest and a mocked-LLM integration test. +- [x] **Prompt 1** — workspace scaffold, core types, this document. +- [x] **Prompt 2** — `witnex demo summarize` / `witnex verify`, canonical trace + commitment, mocked + real (Anthropic) backends, and an end-to-end + integration test with a mocked LLM. +- [x] **Risc0 guest** — recomputes the commitment in the zkVM; host + `prove`/`verify`; `witnex-zkvm` CLI; receipt embedded in `ProofBundle`. + Built and dev-mode-tested (CI + a Linux box). +- [ ] **Real STARK end-to-end** — a full non-dev proof run, and wiring real + receipt verification into the main `witnex verify` (currently structural). +- [ ] **zkML** — proving the inference itself (out of Phase 1 scope).