Add `mcp_writes` audit log table for queryable MCP write history (Phase 3 #2269 Q4)

> *Original issue #2536 by @justinhsu1477 on 2026-05-23T14:18:50Z*

---

## Summary

Add a persistent `mcp_writes` audit log table that records every MCP write tool invocation (`memory.store`, `memory.note`, `tree.tag`), successful or failed, so users can review what LLMs connected via MCP have written (or tried to write) to their Memory Tree. This replaces the current `tracing::info!`-only audit trail with a queryable surface.

This closes out **Q4 from the Phase 3 RFC (#2269)** — the only one of the four RFC questions that doesn't have an implementation answer yet:

- ✅ **Q1** multi-process write path → Liohtml's #2306 picked routing through `doc_put` (daemon-sole-writer)
- ✅ **Q2** provenance → `source_type = "mcp:<client>"` (#2306 placeholder + YOMXXX's #2332 `clientInfo.name` capture)
- ✅ **Q3** `SecurityPolicy` → distinct `enforce_write_policy()` gating on `ToolOperation::Act` (#2306)
- ⏳ **Q4** audit log → 🟡 **this issue**

## Problem

After #2306 / #2316 / #2332 landed, the MCP write surface is functionally complete but only emits ephemeral `tracing::info!` lines for write events. Quoting @graycyrus's #2306 review verbatim:

> "The `tracing::info!` audit log exists but only shows in logs. A UI-side notification ... would give users visibility."
>
> "The real residual risk isn't injection or corruption (that's well-covered), it's a rogue LLM silently filling memory with noise and the user not knowing."

Concrete user-impact gaps without a queryable audit log:

1. **No accountability** — a user can't answer "what did Claude Desktop write into my memory this week?" without `grep`-ing log files.
2. **No compliance story** — enterprise users with audit-trail requirements can't satisfy them with a tracing-only sink.
3. **No abuse detection** — a misbehaving client filling memory at the per-hour rate limit shows up only as log spam, not a queryable signal.
4. **No undo path** — even if a user spots a problem write in the log, there's no structured handle to address it.

## Proposed solution

Add a new SQLite table next to the existing memory tree DB (same SQLite handle to avoid a second connection):

```sql
CREATE TABLE mcp_writes (
    id                  INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp_ms        INTEGER NOT NULL,
    client_info         TEXT    NOT NULL,   -- "mcp:claude-desktop" / "mcp:cursor" / fallback "mcp"
    tool_name           TEXT    NOT NULL,   -- "memory.store" / "memory.note" / "tree.tag"
    args_summary        TEXT,               -- JSON object with non-PII identifying args (see below)
    resulting_chunk_id  TEXT,               -- document_id returned by memory_doc_put
    success             INTEGER NOT NULL,   -- 1 success, 0 failure
    error_message       TEXT                -- populated only when success == 0
);

CREATE INDEX idx_mcp_writes_timestamp ON mcp_writes (timestamp_ms DESC);
CREATE INDEX idx_mcp_writes_client    ON mcp_writes (client_info);
CREATE INDEX idx_mcp_writes_tool      ON mcp_writes (tool_name);
```
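For reference, here is a minimal sketch of the Rust side of one row. The field names come from the schema above and from the `McpWriteRecord` literal used in the write flow; the concrete Rust types, the `is_consistent` check, and the body of `now_ms` are assumptions for illustration, not existing code.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// One row of `mcp_writes`. Types are assumed: SQLite INTEGER maps to i64,
/// nullable TEXT columns map to Option<String>.
#[derive(Debug, Clone, PartialEq)]
pub struct McpWriteRecord {
    pub timestamp_ms: i64,
    pub client_info: String,
    pub tool_name: String,
    pub args_summary: Option<String>,
    pub resulting_chunk_id: Option<String>,
    pub success: bool,
    pub error_message: Option<String>,
}

impl McpWriteRecord {
    /// Hypothetical invariant check: successful rows carry no error message,
    /// failed rows carry one.
    pub fn is_consistent(&self) -> bool {
        self.success == self.error_message.is_none()
    }
}

/// Milliseconds since the Unix epoch; falls back to 0 if the system clock
/// reports a time before the epoch.
pub fn now_ms() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as i64)
        .unwrap_or(0)
}
```

Keeping `success` as a `bool` in Rust and converting to `0` / `1` only at the SQL boundary avoids leaking the integer encoding into callers.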

### What goes into `args_summary`

Per-tool slim JSON that captures the **identifying** args without duplicating the document content (which is already persisted via `memory_doc_put`):

| Tool | `args_summary` shape |
| --- | --- |
| `memory.store` | `{ "title": "<first 128 chars>", "namespace": "...", "tag_count": N }` |
| `memory.note` | `{ "chunk_id": "...", "note_text_length": N }` |
| `tree.tag` | `{ "chunk_id": "...", "tags": [...] }` |

`args_summary` deliberately avoids storing raw note/store content twice. The actual content lives in the memory tree itself; the audit table records just enough metadata to identify the write and join back to the chunk.
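The shapes in the table could be modelled roughly as below. This is a sketch under assumptions: `ArgsSummary`, `truncate_title`, `summarize_store`, and `summarize_note` are hypothetical names, JSON serialization is left out, and counting `note_text_length` in characters rather than bytes is a guess at the intended unit.

```rust
/// Maximum number of characters of the title kept in the audit row.
const TITLE_MAX_CHARS: usize = 128;

/// Per-tool summary mirroring the `args_summary` table (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ArgsSummary {
    Store { title: String, namespace: String, tag_count: usize },
    Note { chunk_id: String, note_text_length: usize },
    Tag { chunk_id: String, tags: Vec<String> },
}

/// Truncate by characters, not bytes, so a multi-byte UTF-8 title is never
/// cut in the middle of a character.
pub fn truncate_title(title: &str) -> String {
    title.chars().take(TITLE_MAX_CHARS).collect()
}

/// Summary for `memory.store`: the truncated title plus counts, never the body.
pub fn summarize_store(title: &str, namespace: &str, tags: &[String]) -> ArgsSummary {
    ArgsSummary::Store {
        title: truncate_title(title),
        namespace: namespace.to_string(),
        tag_count: tags.len(),
    }
}

/// Summary for `memory.note`: only the length of the note text is kept.
pub fn summarize_note(chunk_id: &str, note_text: &str) -> ArgsSummary {
    ArgsSummary::Note {
        chunk_id: chunk_id.to_string(),
        note_text_length: note_text.chars().count(),
    }
}
```

Slicing the title with `&title[..128]` would panic whenever byte 128 falls inside a multi-byte character, which is why the sketch iterates over `chars()` instead.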

### Write flow

Inside `dispatch_write_tool` in `src/openhuman/mcp_server/tools.rs`:

```rust
// Existing: dispatch to doc_put
match all::try_invoke_registered_rpc(rpc_method, params.clone()).await {
    Some(Ok(value)) => {
        let document_id = value.get("document_id").and_then(Value::as_str);
        // NEW: audit record after successful write
        let _ = mcp_audit::record_write(McpWriteRecord {
            timestamp_ms: now_ms(),
            client_info: session.source_type(),    // from McpSession
            tool_name: tool_name.to_string(),
            args_summary: summarize_args(tool_name, params),
            resulting_chunk_id: document_id.map(str::to_string),
            success: true,
            error_message: None,
        }).await;
        tracing::info!(tool = tool_name, chunk_id = document_id, "write success");
        Ok(tool_success(value))
    }
    Some(Err(message)) => {
        // NEW: audit failed writes too — abuse detection signal
        let _ = mcp_audit::record_write(McpWriteRecord {
            success: false,
            error_message: Some(message.clone()),
            ...
        }).await;
        ...
    }
    ...
}
```

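The audit insert uses `let _ = ... .await`, which discards the result, so an audit failure can never fail the write itself (best-effort). Any logging of that failure would have to happen inside `record_write`. See Q2 below.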
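One way to keep the best-effort behaviour while still surfacing audit failures is a small wrapper around the insert. The sketch below is synchronous and std-only to stay self-contained; `AuditError` and `record_best_effort` are hypothetical names, and the real code would presumably log through `tracing` rather than stderr.

```rust
/// Hypothetical error type standing in for whatever `record_write` returns.
#[derive(Debug)]
pub struct AuditError(pub String);

/// Run an audit insert and swallow its failure, logging it instead of
/// propagating it. Returns true when the audit row was recorded.
pub fn record_best_effort<F>(insert: F) -> bool
where
    F: FnOnce() -> Result<(), AuditError>,
{
    match insert() {
        Ok(()) => true,
        Err(AuditError(reason)) => {
            // The write has already succeeded at this point; only report.
            eprintln!("mcp audit insert failed (write still succeeds): {}", reason);
            false
        }
    }
}
```

The boolean return value is only there so callers or tests can count dropped audit rows; the write path itself would ignore it.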

### Query surface

New RPC: `openhuman.mcp_audit_list` (read-only):

```rust
struct McpAuditListParams {
    limit: u32,                        // default 50, max 500
    offset: u32,                       // default 0
    since_ms: Option<u64>,             // optional time-window filter
    client_filter: Option<String>,     // optional "mcp:claude-desktop" filter
    tool_filter: Option<String>,       // optional "memory.store" filter
    success_only: Option<bool>,        // default None (both); Some(false) supports the UI's "show failures"
}
```

Returns `Vec<McpWriteRecord>` ordered by `timestamp_ms DESC`.
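The filters above could translate into a parameterized query along these lines. This is a sketch, not the proposed implementation: `ListFilters`, `SqlParam`, and `build_list_query` are hypothetical, the parameters are modelled as optional so the defaults are explicit, and reading `success_only: Some(false)` as "failures only" is one interpretation of the field.

```rust
pub const DEFAULT_LIMIT: u32 = 50;
pub const MAX_LIMIT: u32 = 500;

/// Hypothetical filter set mirroring the RPC params.
#[derive(Debug, Default)]
pub struct ListFilters {
    pub limit: Option<u32>,
    pub offset: Option<u32>,
    pub since_ms: Option<u64>,
    pub client_filter: Option<String>,
    pub tool_filter: Option<String>,
    pub success_only: Option<bool>,
}

/// Bound value for a `?` placeholder (stand-in for a real driver's type).
#[derive(Debug, PartialEq)]
pub enum SqlParam {
    Int(i64),
    Text(String),
}

/// Build the SELECT and its bound parameters; user input never reaches the SQL text.
pub fn build_list_query(f: &ListFilters) -> (String, Vec<SqlParam>) {
    let mut clauses: Vec<&str> = Vec::new();
    let mut params: Vec<SqlParam> = Vec::new();

    if let Some(since) = f.since_ms {
        clauses.push("timestamp_ms >= ?");
        params.push(SqlParam::Int(i64::try_from(since).unwrap_or(i64::MAX)));
    }
    if let Some(client) = &f.client_filter {
        clauses.push("client_info = ?");
        params.push(SqlParam::Text(client.clone()));
    }
    if let Some(tool) = &f.tool_filter {
        clauses.push("tool_name = ?");
        params.push(SqlParam::Text(tool.clone()));
    }
    if let Some(success) = f.success_only {
        clauses.push("success = ?");
        params.push(SqlParam::Int(i64::from(success)));
    }

    let mut sql = String::from("SELECT * FROM mcp_writes");
    if !clauses.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&clauses.join(" AND "));
    }
    sql.push_str(" ORDER BY timestamp_ms DESC LIMIT ? OFFSET ?");

    // Clamp the limit so a caller cannot request an unbounded page.
    let limit = f.limit.unwrap_or(DEFAULT_LIMIT).min(MAX_LIMIT);
    params.push(SqlParam::Int(i64::from(limit)));
    params.push(SqlParam::Int(i64::from(f.offset.unwrap_or(0))));
    (sql, params)
}
```

Each filter maps onto one of the three proposed indexes or the `success` column, and the `ORDER BY timestamp_ms DESC` matches `idx_mcp_writes_timestamp`.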

## Open design questions (need maintainer direction)

### Q1 — Storage backend

**A.** Add the table to the **existing memory tree SQLite** (single handle, single migration, transactional with the write itself). Easiest deployment.

**B.** Separate `mcp_audit.sqlite` file (isolated; survives memory tree corruption / reset; can be wiped independently for privacy). Bit more plumbing.

**C.** Reuse one of the existing KV stores or event_bus persistence layers if there's a natural home I haven't found.

The implementation sketch assumes **A**, but `mcp_writes` is a write-heavy append-only log with very different access patterns from the chunk tree — splitting may be cleaner.

### Q2 — Audit-write coupling on failure

**A. Best-effort audit (write succeeds even if audit fails)** — the sketch above. Audit failure is logged but not propagated. Pro: write availability never degrades; con: theoretical race where the chunk lands but the audit row doesn't, breaking the "every write is auditable" guarantee.

**B. Transactional (audit-then-write inside one SQLite tx)** — strict guarantee, but couples write latency to the audit table and complicates the `memory_doc_put` RPC boundary.

**C. Write-then-audit, abort-on-audit-failure** — between A and B; would require rolling back the just-completed write, which `doc_put` doesn't currently support cleanly.

Preference: **A** unless maintainers want the stronger guarantee for compliance reasons.

### Q3 — Retention policy

**A.** No automatic prune (table grows forever). Simplest; trusts user disk space.

**B.** Rolling window — e.g. delete rows older than 90 days during a daily job (could piggyback on the existing memory tree scheduler).

**C.** Size-bounded — drop oldest rows when the table exceeds N MB.

**D.** Per-user opt-in retention setting (under `config.mcp.audit_retention_days`).

For v1, leaning **A** (no prune) with an explicit "retention is a follow-up" note — simpler scope, doesn't lock us into a policy before we see real usage volume.
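If option B or D is picked up later, the prune itself is small. A sketch of the cutoff arithmetic, assuming a hypothetical `retention_cutoff_ms` helper and a `PRUNE_SQL` statement bound to its result:

```rust
const MS_PER_DAY: i64 = 24 * 60 * 60 * 1000;

/// Rows with `timestamp_ms` strictly below this value fall outside the window.
/// Saturating arithmetic keeps extreme inputs from overflowing.
pub fn retention_cutoff_ms(now_ms: i64, retention_days: u32) -> i64 {
    now_ms.saturating_sub(i64::from(retention_days).saturating_mul(MS_PER_DAY))
}

/// Statement a daily job could run, binding the cutoff as the single parameter.
pub const PRUNE_SQL: &str = "DELETE FROM mcp_writes WHERE timestamp_ms < ?";
```

For a 90-day window the cutoff sits 7,776,000,000 ms before `now_ms`, and the delete is a range scan over the `timestamp_ms` index the schema already proposes.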

### Q4 — Query surface

**A. Internal RPC only** (`openhuman.mcp_audit_list`) — OpenHuman's own UI or CLI can read; MCP clients themselves cannot see their audit history. Tightest blast radius.

**B. Also expose as an MCP tool** (e.g. `audit.list_writes`) — lets a client like Claude Desktop reflect on its own writes, useful for "show me what you stored last session" UX. But also means an MCP client can see what other clients wrote to the same user's memory tree.

**C. Hybrid** — MCP tool exposes only the current client's own writes (filtered by `client_info == session.source_type()`), while the internal RPC sees all.

Strong preference for **A** in v1 (smallest surface, fewest privacy decisions). **C** is the right long-term shape; **B** is too permissive.
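For comparison, the scoping rule in option C amounts to a single predicate. The sketch below uses hypothetical `AuditRow`, `Caller`, and `visible_rows` names and filters in memory; a real implementation would more likely push the condition into the SQL `WHERE` clause.

```rust
/// Minimal stand-in for an audit row; only the fields needed for scoping.
#[derive(Debug, PartialEq)]
pub struct AuditRow {
    pub client_info: String,
    pub tool_name: String,
}

/// Who is asking: the internal RPC sees everything, an MCP client only its own rows.
pub enum Caller<'a> {
    InternalRpc,
    McpClient { source_type: &'a str },
}

/// Return the rows the caller is allowed to see under the hybrid model.
pub fn visible_rows<'r>(rows: &'r [AuditRow], caller: &Caller<'_>) -> Vec<&'r AuditRow> {
    rows.iter()
        .filter(|row| match caller {
            Caller::InternalRpc => true,
            Caller::McpClient { source_type } => row.client_info == *source_type,
        })
        .collect()
}
```

One caveat with this rule: clients that fall back to the bare `"mcp"` source type would share a scope and could see each other's rows, so the hybrid model probably needs a decision on how unnamed clients are treated.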

## Acceptance criteria

- [ ] `mcp_writes` table created via SQLite migration; existing users migrate cleanly on first launch after upgrade.
- [ ] Every successful `dispatch_write_tool` call inserts a row with the correct `client_info` (from `McpSession`), `tool_name`, `args_summary`, and `resulting_chunk_id`.
- [ ] Failed writes are also recorded (`success = 0`, `error_message` populated) as an abuse-detection signal.
- [ ] `args_summary` does not duplicate the chunk content (only identifying metadata).
- [ ] `openhuman.mcp_audit_list` RPC registered and returns records ordered by `timestamp_ms DESC`, with `limit` / `offset` / filter support.
- [ ] Unit tests cover: insert success, insert failure, query with each filter, ordering, limit + offset.
- [ ] **Diff coverage ≥ 80%**.
- [ ] No new UI surface — that's the follow-up.

## Out of scope (follow-ups)

- **UI surface** for browsing the audit log — Channels tab? Settings panel? Dedicated "Memory Activity" view? Belongs to a separate issue once the data layer is in place. @graycyrus already flagged the UI notification angle on #2306; an audit list view is the natural companion.
- **Per-write user confirmation** — the MCP spec doesn't have a confirm primitive (also discussed on #2269 Q3); confirmation is an OS-level out-of-band concern.
- **Audit log for read tools** (`memory.search`, `tree.read_chunk`, etc.) — debatable whether reads need audit; defer until we see a request.
- **Retention policy** — see Q3; punted to "follow-up after we have usage data".

## Related

- #2269 (Phase 3 RFC) — this issue is Q4 of that RFC.
- #2306 (Liohtml's `memory.store` / `memory.note`) — answered Q1, Q2 placeholder, Q3.
- #2316 (`tree.tag`) — third write tool; same audit hook applies uniformly.
- #2332 (YOMXXX's `clientInfo.name` capture) — provides the `client_info` field this audit table records.
- #2317 (closed) — companion follow-up that #2332 implemented.
- @graycyrus #2306 review: "if maintainers later want queryable history this is the easiest place to swap in".
