From 67d46227650596226a9127665d3e1ed410680883 Mon Sep 17 00:00:00 2001 From: Rich Bodo Date: Sun, 31 May 2026 18:59:33 +1200 Subject: [PATCH 1/3] spec: add Exceptions concept + EX-CLOUD-LLM registry entry New spec/exceptions.md defines Exceptions (stable EX-* conditions under which a PNA deliberately departs from a baseline guarantee): the raise/catch/handle model, PNA mode vs non-PNA mode, the RFC-2119 handler contract (EX-H1..EX-H8), the Relaxes:/Reversible:/Stresses: header conventions, the per-dimension strength-profile vocabulary, and the first registry entry EX-CLOUD-LLM (demonstrated by fellows_local_db). Reference-driven per CONTRIBUTING; the lint extension, PNA_Spec pointer, SKILL evaluate-flow step, and design-record backfill follow in this PR. Co-Authored-By: Claude Opus 4.8 (1M context) --- spec/exceptions.md | 185 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 185 insertions(+) create mode 100644 spec/exceptions.md diff --git a/spec/exceptions.md b/spec/exceptions.md new file mode 100644 index 0000000..1e55713 --- /dev/null +++ b/spec/exceptions.md @@ -0,0 +1,185 @@ +# PNA Exceptions + +> **Spec-Version:** tracks the PNA Spec version in [`PNA_Spec.md`](PNA_Spec.md). +> +> This file defines **Exceptions**: stable-ID'd conditions (`EX-*`) under which a PNA deliberately +> departs from a baseline guarantee — a named AC, or the core PNA definition ("runs local-only, +> never as SaaS"; see [`PNA_Spec.md` § Vocabulary](PNA_Spec.md), `vocab-pna`). + +## Concept + +An Exception is modeled on a software exception. It is **raised** by a specific user action, must +be **caught** (never raised silently), and must be **handled** by a defined **solution**. + +| Software exception | PNA Exception | +|---|---| +| A condition that interrupts normal control flow | A condition under which a PNA departs from a baseline guarantee | +| `raise` / `throw` | **Raised** by a specific user action (e.g. 
connecting a cloud MCP client) | +| Uncaught exception crashes / leaks | A *silent* deviation is the failure mode — exceptions MUST be **caught** | +| `try/except` handler | A defined **solution** (consent + signal + explainer + reversal path) | +| Stack trace identifies the exception | Every exception has a stable `EX-*` ID and a registry entry | + +**An app is in PNA mode when no exceptions are active.** Raising any exception exits PNA mode. A PNA +is spec-conformant in non-PNA mode **iff** every active exception is handled to the [handler +contract](#handler-contract) below. This reframes conformance: + +- **Old framing:** conformant = *never deviates from any AC or the PNA definition.* +- **New framing:** conformant = *in PNA mode, honors every applicable AC; in non-PNA mode, catches + and handles every active deviation honestly.* + +A tool that lets a user point a hosted model at their Private DB is neither "non-conformant garbage" +nor "secretly fine" — it is **a conformant PNA operating in a declared non-PNA mode**, and the +[evaluate flow](pna-build-eval-contrib/SKILL.md) can say exactly that, by `EX-*` ID. + +### Validation, not certification + +PNT **validates behaviors against the Goals; it does not certify.** There is no pass/fail badge and +no certifying body (see `CONTRIBUTING.md` and the skill's § Principles, "Conformance is checked, not +awarded"). The evaluate flow *detects* exceptions and *verifies how each is handled*, reporting by +`EX-*` ID. "This app raises `EX-CLOUD-LLM` and handles it to contract" is a finding, not a grade. + +### Scope discipline + +Exceptions are bounded so they stay a PNA-class mechanism rather than a general deviation framework: + +- **Goal-anchored.** Every exception MUST name, via `Relaxes:`, the specific `AC-*` (or + `PNA-DEFINITION`) it departs from. PNT defines exceptions ONLY for deviations from its own + Goals/ACs — not for other application classes. 
A proposed "exception" that relaxes no named PNA + guarantee is not a PNA exception. +- **Composition, not enumeration.** Non-PNA mode is binary (in / out). The active-set explainer + (EX-H4) renders the *currently-active* exceptions at runtime, each with its own entry and strength + profile. PNT never pre-enumerates combinations; cost scales linearly in the number of defined + exceptions, not combinatorially. + +## Handler contract + +Normative language uses RFC 2119 / RFC 8174 keywords (MUST, MUST NOT, SHOULD, MAY) only when +capitalized, consistent with [`PNA_Spec.md` § Universal architectural commitments](PNA_Spec.md). For +each exception it can raise, a conforming PNA: + +- **EX-H1 — Stable identity.** MUST define and reference the exception by its stable `EX-*` ID. +- **EX-H2 — Consent before raise.** MUST obtain explicit informed consent BEFORE raising the + exception (no silent raise). The consent surface MUST link to an explanation of that specific + exception. +- **EX-H3 — Persistent non-PNA-mode signal.** While the exception is active, MUST present a + persistent user-facing signal that the app is not in PNA mode. The signal MUST name the active + exception and MUST link to an explanation of the active exception set. The signal MAY be + dismissable, but dismissal MUST NOT clear the exception (dismissal acknowledges; it does not + resolve). +- **EX-H4 — Active-set explainer.** MUST provide a user-reachable explanation of the + CURRENTLY-ACTIVE exception set. Because active combinations are installation-specific and cannot + be enumerated in a static document, this explainer MUST be generated at runtime from the active + set and MUST link out to each active exception's registry entry in this file. +- **EX-H5 — Declared reversibility.** MUST declare whether returning to PNA mode is supported + (reversible) or not (irreversible). 
If it declares reversible, it MUST provide a practical, + user-reachable path back to PNA mode that the validation flow can confirm from code/UX. + Reversibility refers to **MODE ONLY**: a handler MUST NOT imply that returning to PNA mode undoes + consequences already incurred (e.g. data already disclosed to a third party). +- **EX-H6 — Recommended solution.** SHOULD name a recommended solution in its registry entry, + demonstrated by a reference design. +- **EX-H7 — Consent reaches the ultimate human.** Where the consuming actor is an agent/proxy + rather than the ultimate human (e.g. an orchestrator agent invoking the PNA on a person's behalf), + the handler MUST make a best-effort attempt to propagate the notice and acceptance (EX-H2) to the + ultimate human interface, and MUST NOT treat an intermediary agent's acceptance as the human's. + This clause is **best-effort by nature**: a PNA can instruct a cooperating client to surface the + notice but cannot compel a non-cooperating one (its strength is `best-effort`, EX-H8). It sharpens + AC-MCP-A — the required consent is the *human's*, and a middleman must not manufacture it. (Cf. + macaroon attenuation: delegated authority only narrows down a chain, never amplifies.) +- **EX-H8 — Per-dimension strength disclosure.** MUST publish a **strength profile** for the + exception (see [Strength profiles](#strength-profiles)): for each dimension of the guarantee, the + *kind* of assurance offered, drawn from the fixed vocabulary. The profile MUST be user-reachable + from the active-set explainer (EX-H4). A single collapsed "assurance level" MUST NOT be used in + place of the per-dimension profile. + +> **Sub-contract IDs.** `EX-H1..EX-H8` follow PNT's existing sub-contract convention +> (`-`). They are deliberately prose/list items, not `| EX-… |` registry rows, so +> the lint collects them as handler clauses, not as registry exceptions. The evaluate flow cites +> them ("fails EX-H3 — no persistent signal"). 
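The handler contract above can be read as a small state machine over the set of active exceptions. The following is a minimal sketch under stated assumptions, not the reference design's implementation: the class `ExceptionState` and its method names are hypothetical, consent is modeled as a boolean supplied by the caller, and the explainer returns plain strings rather than real links.

```python
from dataclasses import dataclass, field


@dataclass
class ExceptionState:
    """Hypothetical in-memory model of PNA mode; all names are illustrative."""

    active: set[str] = field(default_factory=set)
    signal_dismissed: bool = False

    @property
    def in_pna_mode(self) -> bool:
        # The app is in PNA mode exactly when no exceptions are active.
        return not self.active

    def raise_exception(self, ex_id: str, consent_given: bool) -> None:
        # EX-H2: consent is obtained before the raise; no silent raise.
        if not consent_given:
            raise PermissionError(f"{ex_id}: consent required before raising")
        self.active.add(ex_id)

    def dismiss_signal(self) -> None:
        # EX-H3: dismissal acknowledges the signal; it does not clear the exception.
        self.signal_dismissed = True

    def return_to_pna_mode(self, ex_id: str) -> None:
        # EX-H5: this reverses the mode only; prior disclosure is not undone.
        self.active.discard(ex_id)

    def explainer(self) -> list[str]:
        # EX-H4: built at runtime from the currently active set.
        return [
            f"{ex_id}: see its registry entry in spec/exceptions.md"
            for ex_id in sorted(self.active)
        ]
```

In this sketch, dismissing the signal leaves `in_pna_mode` false, and only `return_to_pna_mode` restores it. That mirrors the distinction EX-H3 draws between acknowledging and resolving.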
+ +## Header conventions + +These mirror the `Realizes: AC-…` header that contract files carry (see `tools/lint-spec-ids.py`). +They appear in an exception's registry entry and in a reference design's handler declaration. + +- **`Relaxes:`** — the baseline guarantee(s) the exception departs from; the inverse of `Realizes:`. + Each token is an `AC-*` ID or the literal `PNA-DEFINITION` (for departures from "local-only, never + SaaS"). Comma-separated. Example: `Relaxes: PNA-DEFINITION, AC-MCP-A`. +- **`Reversible:`** — whether returning to PNA mode is supported. `yes` or `no`. If `yes`, a + `Reversal:` field MUST follow naming the mechanism (a route, control, or code reference the + validation flow can confirm). See EX-H5. +- **`Stresses:`** *(optional, non-normative)* — a Goal the exception puts under pressure without + strictly relaxing a single AC. Example: `Stresses: Goal 1`. + +## Strength profiles + +The strength of an exception's handling is disclosed **per dimension, not as a single graded level.** +A collapsed level (à la Common Criteria EAL or OWASP ASVS L1/2/3) would fold "the boundary is +enforced" together with "the provider's data handling is unverifiable" into one misleading number. +The honest frame: **once data crosses to a third party, the PNA can guarantee nothing about the data +itself; every real guarantee is about the *boundary* (consent, signaling, reversibility, +auditability) and *local recoverability*.** + +Each dimension's class is one of the fixed vocabulary (lint-checked for membership; the evaluate +flow judges accuracy): + +| Class | Meaning | +|---|---| +| `enforced` | The app's own code makes it true; locally testable/auditable. | +| `verifiable` | A claim an auditor can confirm from open code (not enforced at runtime, but checkable). | +| `best-effort` | The app requests it of an untrusted party; cannot compel. | +| `provider-asserted` | Depends entirely on a third party's own, app-unverifiable policy. 
| +| `recoverable-only` | The app cannot prevent the harm but can undo/restore it. | +| `none` | No guarantee is possible. | + +## Exception registry + +| EX | Name | Relaxes | Stresses | Reversible | Recommended solution | +|---|---|---|---|---|---| +| EX-CLOUD-LLM | Cloud-hosted AI over PNA data | PNA-DEFINITION, AC-MCP-A | Goal 1 | yes (mode only) | consent gate + persistent dismissable "not a PNA" signal + active-set explainer + return-to-PNA-mode — demonstrated by `fellows_local_db` | + +### EX-CLOUD-LLM — Cloud-hosted AI over PNA data + +**Relaxes:** PNA-DEFINITION, AC-MCP-A +**Stresses:** Goal 1 +**Reversible:** yes +**Reversal:** mode only — the user disconnects the cloud MCP client and returns to PNA mode. +Returning to PNA mode does NOT undo any disclosure already made to the cloud provider (EX-H5). + +**Raised when:** the user connects a cloud-hosted MCP client (e.g. Claude Desktop on a hosted +model, a desktop AI app on a hosted API) to a PNA's MCP servers that can return Private DB rows. The +canonical trigger is the Private Data Ops server (see [`PNA_Spec.md` § Vocabulary](PNA_Spec.md), MCP +server). + +**Recommended solution:** pre-raise consent gate (EX-H2) naming the exception and linking the +explainer; persistent dismissable "not a PNA right now" signal (EX-H3); runtime active-set explainer +(EX-H4) surfacing the strength profile below; declared, reversible return-to-PNA-mode path (EX-H5); +best-effort consent-propagation notice to cloud clients via the MCP `instructions` handshake (EX-H7). +Demonstrated by `fellows_local_db` (`reference_designs/fellows_local_db/`). + +**Strength profile (EX-H8):** + +| Dimension | Strength | Why | +|---|---|---| +| Consent precedes the raise | enforced | Setup is blocked until the user accepts the agreement. | +| Non-PNA-mode signal while active | enforced | A persistent banner shows until the user returns to PNA mode. | +| Mode is reversible | enforced | A return-to-PNA-mode control clears the exception. 
| +| Servers read-only, two files only | verifiable | Databases opened `mode=ro`; auditable in open source. | +| Local data damage from a bad AI step | recoverable-only | Not prevented — restorable from a backup/export. | +| Consent reaches the human, not a proxy | best-effort | EX-H7 — cloud clients are asked to relay it; cannot be compelled. | +| Provider won't train on / retain the data | provider-asserted | The provider's policy; unverifiable by the app. | +| Data already sent to the provider | none | Irreversible once it has crossed the boundary. | + +#### What it does and does not relax + +`EX-CLOUD-LLM` relaxes the *delivery* guarantee (data leaves the device to a cloud model) and +AC-MCP-A's consent posture. It does **not** relax AC-MCP-B (the workspace still launches +transports), AC-1, or any other AC. Keeping the `Relaxes:` set tight is part of honest handling — an +exception names the *minimum* set of guarantees it actually departs from. + +## Origin + +The Exceptions concept was distilled from operating the `fellows_local_db` reference design with a +~500-user base: users wanted cloud-LLM integration, local models were impractical for them, and the +spec had no first-class way to deviate *honestly*. See that design's `docs/architectural_findings.md` +(upstream) and `reference_designs/fellows_local_db/`. Per PNT's reference-driven model, this concept +ships alongside the working design that demonstrates it. 
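The `EX-CLOUD-LLM` entry in `spec/exceptions.md` writes its header fields in bold Markdown (`**Relaxes:** PNA-DEFINITION, AC-MCP-A`), so the closing `**` sits between the colon and the value. A pattern that expects the value directly after `Relaxes:` and optional whitespace matches the inline example (`Relaxes: PNA-DEFINITION, AC-MCP-A`) but not the bold form. The sketch below shows one tolerant way to read both forms. `HEADER_RE`, `parse_headers`, and `relaxes_tokens` are hypothetical names chosen for illustration and are not part of the toolkit; the sketch also assumes a single registry entry per input text.

```python
import re

# Hypothetical helper, not part of tools/lint-spec-ids.py.
# Accepts optional "**" before the field name and after the colon, so
# "Relaxes: X" and "**Relaxes:** X" are read the same way. Only lines that
# begin with the field are matched, which skips inline mentions in prose.
HEADER_RE = re.compile(
    r"^\**(Relaxes|Stresses|Reversible|Reversal):\**[ \t]*(\S.*)$",
    re.MULTILINE,
)


def parse_headers(text: str) -> dict[str, str]:
    """Return {field: value} for header lines; later duplicates overwrite."""
    return {m.group(1): m.group(2).strip() for m in HEADER_RE.finditer(text)}


def relaxes_tokens(text: str) -> list[str]:
    """Split the Relaxes: value into its comma-separated tokens."""
    value = parse_headers(text).get("Relaxes", "")
    return [tok.strip() for tok in value.split(",") if tok.strip()]
```

Anchoring at the start of a line is a deliberate choice in this sketch: it reads the declarations in a registry entry and ignores the backticked field names in the Header conventions list.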
From b8b762b42f0e8056677c7ef1578678326080761f Mon Sep 17 00:00:00 2001 From: Rich Bodo Date: Sun, 31 May 2026 20:45:56 +1200 Subject: [PATCH 2/3] tools: extend lint-spec-ids for EX/Relaxes/Reversible/strength traceability MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Collects EX-* registry IDs from spec/exceptions.md; verifies every 'Relaxes:' token resolves to a known AC, EX, or the PNA-DEFINITION sentinel; checks 'Reversible:' is well-formed (yes|no, with a 'Reversal:' field when yes); and validates any strength-profile column carries only the fixed strength classes (EX-H8). Shape/presence only — behavioral correctness stays the LLM evaluate flow's job. No-ops cleanly when exceptions.md is absent. Co-Authored-By: Claude Opus 4.8 (1M context) --- tools/lint-spec-ids.py | 121 +++++++++++++++++++++++++++++++++++------ 1 file changed, 105 insertions(+), 16 deletions(-) diff --git a/tools/lint-spec-ids.py b/tools/lint-spec-ids.py index 01356e3..867a5bb 100755 --- a/tools/lint-spec-ids.py +++ b/tools/lint-spec-ids.py @@ -1,5 +1,5 @@ #!/usr/bin/env python3 -"""Lint the AC ↔ contract bidirectional traceability invariants. +"""Lint the AC/EX ↔ contract bidirectional traceability invariants. Checks: 1. Every AC in spec/PNA_Spec.md and spec/axes.md carries a stable ID @@ -8,6 +8,17 @@ in its head (within the first ~25 lines), naming at least one valid AC. 3. Every AC named in a contract's "Realizes:" header actually exists in the spec. + 4. Every Exception in spec/exceptions.md carries a stable ID (EX-*) in a + registry table row. + 5. Every "Relaxes:" header (in exceptions.md) names only valid AC IDs, EX + IDs, or the PNA-DEFINITION sentinel. + 6. Every "Reversible:" field is well-formed (yes|no); a "yes" requires a + "Reversal:" field. Every value in a strength-profile column is one of the + fixed strength classes (EX-H8). 
+ +The lint validates the *shape* of declarations (presence + ID/vocabulary +resolution), not their behavioral correctness — that is the LLM evaluate +flow's job (see pna-build-eval-contrib/SKILL.md § Evaluate flow). Exits 0 if clean, 1 if any violation found. Designed to be CI-friendly. """ @@ -22,6 +33,24 @@ AC_RE = re.compile(r"^\| (AC-[A-Z0-9-]+?)(?=\s|\*|\|)", re.MULTILINE) REALIZES_RE = re.compile(r"Realizes:\s*((?:AC-[A-Z0-9-]+(?:\s*,\s*)?)+)", re.IGNORECASE) +# Exception registry IDs live in `| EX-... |` table rows (mirrors AC_RE). The +# handler-clause IDs (EX-H1..EX-H8) are list items, not table rows, so they are +# deliberately NOT collected here as registry exceptions. +EX_RE = re.compile(r"^\| (EX-[A-Z0-9-]+?)(?=\s|\*|\|)", re.MULTILINE) +# Inverse of REALIZES_RE. Tokens may be AC-*, EX-*, or the PNA-DEFINITION +# sentinel (the PNA definition is prose in vocab-pna, not an `| AC-X |` row). +RELAXES_RE = re.compile( + r"Relaxes:\s*((?:(?:AC-[A-Z0-9-]+|EX-[A-Z0-9-]+|PNA-DEFINITION)(?:\s*,\s*)?)+)", + re.IGNORECASE, +) +REVERSIBLE_RE = re.compile(r"Reversible:\s*([A-Za-z]+)", re.IGNORECASE) +STRENGTH_CLASSES = { + "enforced", "verifiable", "best-effort", + "provider-asserted", "recoverable-only", "none", +} + +EXCEPTIONS_PATH = REPO / "spec" / "exceptions.md" + def collect_spec_ac_ids() -> set[str]: """All AC IDs from the AC tables in the spec.""" @@ -30,8 +59,7 @@ def collect_spec_ac_ids() -> set[str]: if not path.exists(): print(f"FAIL: spec file missing: {path}") sys.exit(1) - text = path.read_text() - ids.update(AC_RE.findall(text)) + ids.update(AC_RE.findall(path.read_text())) return ids @@ -44,22 +72,64 @@ def collect_contract_realizes() -> dict[Path, list[str]]: continue head = "\n".join(f.read_text().splitlines()[:25]) m = REALIZES_RE.search(head) - if not m: - out[f] = [] - continue - ids = [s.strip() for s in m.group(1).split(",")] - out[f] = ids + out[f] = [s.strip() for s in m.group(1).split(",")] if m else [] return out +def 
collect_exception_ids() -> set[str]: + """EX-* registry IDs from spec/exceptions.md. Empty if the file is absent + (so the lint stays green on toolkit versions that predate Exceptions).""" + if not EXCEPTIONS_PATH.exists(): + return set() + return set(EX_RE.findall(EXCEPTIONS_PATH.read_text())) + + +def collect_relaxes() -> list[str]: + """All tokens from every 'Relaxes:' header in exceptions.md (flattened).""" + if not EXCEPTIONS_PATH.exists(): + return [] + tokens: list[str] = [] + for m in RELAXES_RE.finditer(EXCEPTIONS_PATH.read_text()): + tokens.extend(t.strip() for t in m.group(1).split(",")) + return tokens + + +def collect_strength_violations(text: str) -> list[str]: + """Validate strength-profile tables: any markdown table with a column + headed 'Strength' must carry only fixed strength-class values in that + column. Keyed on the header name so unrelated tables are untouched.""" + violations: list[str] = [] + lines = text.splitlines() + i = 0 + while i < len(lines): + line = lines[i] + if line.lstrip().startswith("|") and "strength" in line.lower(): + cells = [c.strip().lower() for c in line.strip().strip("|").split("|")] + if "strength" in cells: + col = cells.index("strength") + j = i + 2 # skip header + |---| separator + while j < len(lines) and lines[j].lstrip().startswith("|"): + row = [c.strip() for c in lines[j].strip().strip("|").split("|")] + if len(row) > col and row[col]: + val = row[col] + if val not in STRENGTH_CLASSES: + violations.append( + f"strength-profile names unknown class '{val}' " + f"(allowed: {', '.join(sorted(STRENGTH_CLASSES))})" + ) + j += 1 + i = j + continue + i += 1 + return violations + + def main() -> int: failures: list[str] = [] spec_ids = collect_spec_ac_ids() if not spec_ids: - failures.append("No AC IDs found in spec/. Check that AC tables follow the `| AC-X |` row format.") - for line in failures: - print(f"FAIL: {line}") + print("FAIL: No AC IDs found in spec/. 
Check the `| AC-X |` row format.") return 1 contract_realizes = collect_contract_realizes() @@ -75,18 +145,37 @@ def main() -> int: if ac not in spec_ids: failures.append(f"{rel}: claims to realize {ac}, but {ac} is not defined in spec/.") + # --- Exceptions (spec/exceptions.md) --- + exception_ids = collect_exception_ids() + known = spec_ids | exception_ids | {"PNA-DEFINITION"} + for tok in collect_relaxes(): + if tok not in known: + failures.append( + f"exceptions.md: Relaxes names {tok}, which is not a known AC, EX, " + "or PNA-DEFINITION." + ) + + if EXCEPTIONS_PATH.exists(): + ex_text = EXCEPTIONS_PATH.read_text() + rev_values = [m.group(1).lower() for m in REVERSIBLE_RE.finditer(ex_text)] + for v in rev_values: + if v not in ("yes", "no"): + failures.append(f"exceptions.md: malformed 'Reversible: {v}' (want yes|no).") + if "yes" in rev_values and "Reversal:" not in ex_text: + failures.append("exceptions.md: 'Reversible: yes' present but no 'Reversal:' field.") + failures.extend(f"exceptions.md: {v}" for v in collect_strength_violations(ex_text)) + if failures: print(f"lint-spec-ids: {len(failures)} violation(s) found.") for line in failures: print(f" - {line}") return 1 - n_contracts = len(contract_realizes) - n_acs = len(spec_ids) n_realizing = sum(1 for v in contract_realizes.values() if v) - print(f"lint-spec-ids: OK") - print(f" spec defines {n_acs} AC IDs") - print(f" {n_realizing}/{n_contracts} contract files declare a 'Realizes:' header") + print("lint-spec-ids: OK") + print(f" spec defines {len(spec_ids)} AC IDs") + print(f" {n_realizing}/{len(contract_realizes)} contract files declare a 'Realizes:' header") + print(f" exceptions.md defines {len(exception_ids)} exception ID(s)") return 0 From 0f0081a6d52e7696e7159300978fc0c4827ccda5 Mon Sep 17 00:00:00 2001 From: Rich Bodo Date: Sun, 31 May 2026 20:48:10 +1200 Subject: [PATCH 3/3] spec+skill+design: wire Exceptions into spec pointer, evaluate flow, fellows record - PNA_Spec.md vocab-pna: one-line 
pointer to exceptions.md (raising an Exception exits PNA mode; conformant only while handled to contract). - SKILL.md Evaluate flow: new 'Detect and verify Exceptions' step (caught/handled, reversibility, consent-to-human, strength accuracy, undeclared-deviation backstop); report keyed by AC or EX ID. - reference_designs/fellows_local_db: backfill the Exceptions originating contribution + bring in the Architecture.md copy (AC attestation table + EX-CLOUD-LLM exception attestation). SWHID stays pending (maintainer, post-merge). Lint green: 25 AC IDs, 12/12 contracts, 1 exception ID. Co-Authored-By: Claude Opus 4.8 (1M context) --- pna-build-eval-contrib/SKILL.md | 10 +- .../fellows_local_db/Architecture.md | 382 ++++++++++++++++++ reference_designs/fellows_local_db/README.md | 6 +- spec/PNA_Spec.md | 2 +- 4 files changed, 395 insertions(+), 5 deletions(-) create mode 100644 reference_designs/fellows_local_db/Architecture.md diff --git a/pna-build-eval-contrib/SKILL.md b/pna-build-eval-contrib/SKILL.md index bf26c88..82b9a77 100644 --- a/pna-build-eval-contrib/SKILL.md +++ b/pna-build-eval-contrib/SKILL.md @@ -39,14 +39,20 @@ Inputs: a candidate PNA's source tree (or a description sufficient to read its b - If the candidate has an Architecture document with an AC attestation table, check that the declared verification mechanism actually runs and passes. 2. **For each flavor-derived AC in `spec/axes.md`** triggered by the candidate's axis picks, do the same. 3. **For each typed contract relevant to the candidate's axis picks**, check that the candidate implements the contract correctly. Contract headers (`Realizes: AC-X, AC-Y`) tell you which ACs the contract serves. -4. **Produce a structured report keyed by AC ID.** The canonical form is the typed artifact at `tools/evaluate-report.schema.json` (JSON Schema). 
Emit an instance of that schema as the source of truth, then render the human-readable report as a *view* over it — don't hand-write the prose report and skip the artifact. Emitting the typed form is what makes two runs on the same candidate diffable (which ACs changed status). Per-AC status is one of: +4. **Detect and verify Exceptions** (see `spec/exceptions.md`). For each Exception the candidate can raise — declared in its Architecture document's exception attestation, or inferred from the source where undeclared: + - **Caught & handled?** Confirm consent is obtained *before* the raise (EX-H2), a persistent non-PNA-mode signal is shown while active (EX-H3), and a runtime active-set explainer exists (EX-H4). Cite code/UX for each. + - **Reversibility?** Read the `Reversible:` declaration; if `yes`, trace the `Reversal:` mechanism and decide whether the code/UX delivers a practical path back to PNA mode. Mode only — do not credit a handler that implies returning to PNA mode undoes prior disclosure (EX-H5). + - **Consent reaches the human?** Where an agent/proxy can drive the app, check the handler makes a best-effort attempt to propagate consent to the ultimate human and does not let an intermediary self-consent (EX-H7). + - **Strength profile accurate?** Check each dimension's class (EX-H8) against the code/UX; the lint already confirmed the classes are valid vocabulary — you judge whether they're truthful (e.g. nothing about the provider's behavior is claimed above `provider-asserted`). + - **Undeclared deviations.** You are the backstop: if the candidate departs from an AC or the PNA definition WITHOUT declaring an Exception, that is a silent (uncaught) deviation — a conformance failure. Flag it and name the `EX-*` it should have raised. +5. **Produce a structured report keyed by AC or EX ID.** The canonical form is the typed artifact at `tools/evaluate-report.schema.json` (JSON Schema). 
Emit an instance of that schema as the source of truth, then render the human-readable report as a *view* over it — don't hand-write the prose report and skip the artifact. Emitting the typed form is what makes two runs on the same candidate diffable (which ACs changed status). Per-AC status is one of: - `conformant` — with cited code locations. - `non-conformant` — with cited code locations showing the violation and the AC's stated requirement. - `not-applicable` — with reason (typically: the candidate's flavor doesn't trigger this AC). - `unable-to-determine` — with explanation; defaults to flagging for human review. Each finding may also carry `evidence` entries tagged by `source` (`deterministic` / `llm` / `human`). When a deterministic check in `tools/` (e.g. the egress lint) has run against the candidate, fold its output in as a `source: deterministic` evidence entry on the AC it bears on, so the deterministic and LLM layers land on one finding. -5. **Summarize at the top** (the artifact's `summary` object): overall posture and the most concerning non-conformances. Goals 1–5 are the load-bearing user-facing concerns — anything compromising private-data sovereignty (Goal 1), source-mirroring honesty (Goal 2), transport security (Goal 3), durability (Goal 4), or local diagnosability (Goal 5) leads the summary. +6. **Summarize at the top** (the artifact's `summary` object): overall posture and the most concerning non-conformances. Goals 1–5 are the load-bearing user-facing concerns — anything compromising private-data sovereignty (Goal 1), source-mirroring honesty (Goal 2), transport security (Goal 3), durability (Goal 4), or local diagnosability (Goal 5) leads the summary. Callers may ask you to emphasize specific Goals or axes at runtime (e.g., "focus on private-data sovereignty"). Treat that as a hint for the summary, not a structural variation. 
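The evaluate flow above says the typed report is what makes two runs on the same candidate diffable. A minimal sketch of that comparison follows. The report shape used here (a `findings` list whose items carry `id` and `status`) is an assumption for illustration only; the authoritative shape is `tools/evaluate-report.schema.json`, which this patch does not reproduce. The function names are likewise hypothetical.

```python
from typing import Optional

# The four status values named in the evaluate flow.
STATUSES = {"conformant", "non-conformant", "not-applicable", "unable-to-determine"}


def index_by_id(report: dict) -> dict[str, str]:
    """Map each finding's AC-* or EX-* ID to its status (assumed field names)."""
    out: dict[str, str] = {}
    for finding in report["findings"]:
        if finding["status"] not in STATUSES:
            raise ValueError(f"{finding['id']}: unknown status {finding['status']!r}")
        out[finding["id"]] = finding["status"]
    return out


def changed_statuses(
    old: dict, new: dict
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Return {id: (old_status, new_status)} for every ID whose status differs.

    An ID present in only one report appears with None on the missing side.
    """
    before, after = index_by_id(old), index_by_id(new)
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(before.keys() | after.keys())
        if before.get(key) != after.get(key)
    }
```

Keying on the stable ID rather than on position or prose is what lets a finding such as `EX-CLOUD-LLM` be tracked across runs even when the rendered report text changes.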
diff --git a/reference_designs/fellows_local_db/Architecture.md b/reference_designs/fellows_local_db/Architecture.md new file mode 100644 index 0000000..b646184 --- /dev/null +++ b/reference_designs/fellows_local_db/Architecture.md @@ -0,0 +1,382 @@ +# Architecture (fellows_local_db) + +This document is fellows_local_db's **specialization-and-conformance layer**: it declares which version of the PNA Spec this repo conforms to, names the axis picks fellows has made, and catalogs the fellows-specific values that the spec leaves to each implementation (HTTP routes, schema, worker constants, debug placeholders, distribution tunables). + +Universal PNA architecture — vocabulary, goals, the two-store ownership split, the worker-owned-OPFS rule, the version-handshake contract, the universal ACs — lives in the [PNA Spec](https://github.com/richbodo/personal_network_toolkit/blob/main/PNA_Spec.md) at the [personal_network_toolkit](https://github.com/richbodo/personal_network_toolkit) repo. This file does not restate it. + +--- + +## Spec conformance + +**Spec-Version:** [0.1 (draft)](https://github.com/richbodo/personal_network_toolkit/blob/main/CHANGELOG.md) +**Use case:** [Directory Archive](https://github.com/richbodo/personal_network_toolkit/blob/main/use_cases.md#directory-archive) + +### Flavor — fellows's six axis picks + +| Axis | Pick | Why | +|---|---|---| +| [Distribution](https://github.com/richbodo/personal_network_toolkit/blob/main/axes.md#distribution) | `web-bundle-with-magic-link` | EHF-allowlisted PWA; multiple fellows install from one origin behind a magic-link gate. | +| [Storage substrate](https://github.com/richbodo/personal_network_toolkit/blob/main/axes.md#storage-substrate) | `opfs-sqlite-wasm` | Browser-only deployment; sqlite3.wasm in a dedicated worker with OPFS-SAH-Pool VFS. 
| +| [Ingestion shape](https://github.com/richbodo/personal_network_toolkit/blob/main/axes.md#ingestion-shape) | `single-source-static-mirror` | One source (Knack JSON dump); no dedup; opt-in user-driven re-import. | +| [Workspace shell](https://github.com/richbodo/personal_network_toolkit/blob/main/axes.md#workspace-shell) | `vanilla-js-spa` | Single-IIFE `app/static/app.js`; hash routing; no framework, no bundler. | +| [Comms transport set](https://github.com/richbodo/personal_network_toolkit/blob/main/axes.md#comms-transport-set) | `mailto-only` | `mailto:` (+ `tel:`) today; Signal planned. | +| [MCP-exposure](https://github.com/richbodo/personal_network_toolkit/blob/main/axes.md#mcp-exposure) | `shared+private+comms` | `mcp_servers/` ships three stdio MCP servers for Claude Desktop and similar clients. | + +This section is fellows's **AC attestation table** — the Security Target role from the toolkit's +[`ARCHITECTURE_TEMPLATE.md`](https://github.com/richbodo/personal_network_toolkit/blob/main/reference_designs/templates/ARCHITECTURE_TEMPLATE.md). +Every applicable AC carries a **Realization** (how the code satisfies it), a **Verification** (the +test, rubric, or human-review note that proves it), and a **Status**. Verification refs are +`file::test_function` where a deterministic test exists; otherwise an LLM rubric or a human-review +note is named (both acceptable per the template). Status is `conformant` / `partial-conformance` / +`not-applicable`. Partial rows state honestly what is and isn't covered. + +### Universal ACs + +| AC | Realization | Verification | Status | +|---|---|---|---| +| AC-1 (two-store ownership split) | Two SQLite DBs: `fellows.db` (read-only contact data) + `relationships.db` (read-write, user-owned), separate OPFS files. `app/relationships.py:open_db()` ATTACHes fellows as `f` with `?mode=ro`; the worker owns the pair. 
| `tests/test_relationships.py::test_attach_fellows_readonly_allows_select`, `::test_attach_fellows_readonly_denies_write`; `tests/test_database.py` | conformant | +| AC-4 (versioned cross-boundary handshake) | `WORKER_RPC_VERSION`/`RELATIONSHIPS_SCHEMA_VERSION` in `vendor/sqlite-worker.js`; `EXPECTED_WORKER_RPC_VERSION` in `app.js`; `refuseIfVersionSkew()` gates mutating RPCs on mismatch, reads still pass; build label is not the gate. | `tests/e2e/test_version_handshake.py::test_version_skew_refuses_mutations_but_allows_reads`; `tests/e2e/test_worker_rpc.py` | conformant | +| AC-6 (always-reachable diagnostic escape) | `?gate=1` forces the email gate regardless of stuck state; Reset Everything + Clear App Cache `POST /api/logout` and reload. | `tests/e2e/test_email_gate.py`; `tests/e2e/test_reset_everything.py`; `tests/e2e/test_clear_app_cache.py` | conformant | +| AC-7 (self-service field-debug substrate) | Build label (AC-15), `?diag=1` state-dump, sanitized error capture (`deploy/client_error_sanitizer.py`), bug-report flow, `?gate=1` escape, boot watchdog with named phase marks, slow-boot persistence. | `tests/e2e/test_diagnostics_panel.py`; `test_boot_watchdog.py`; `test_boot_error_panel.py`; `test_bug_report.py`; `test_boot_beacon.py` | conformant | +| AC-9 (auto-backup of private data) | `vendor/sqlite-worker.js:maybeBackupRelationshipsDb()` — per-boot debounced (`BACKUP_DEBOUNCE_MS`), 5-slot rotation ring; folder mode writes the ring into the user folder. | `tests/e2e/test_user_folder_storage.py::test_snapshot_lands_in_folder_when_folder_mode_active`, `::test_opfs_to_folder_backup_migration_on_folder_boot`; restore via `tests/e2e/test_settings.py`. Debounce cadence by code inspection (LLM rubric). | conformant | +| AC-10 (opt-in non-destructive re-imports) | About-page *Update directory data*; `previewFellowsDbSwap()`/`applyFellowsDbSwap()` preview `group_members` orphaned by the swap before commit; one-shot soft scan. 
| `tests/e2e/test_directory_data_update_flow.py::test_apply_with_group_impact_shows_dialog_and_can_cancel`, `::test_apply_with_group_impact_confirm_completes_swap`; `test_orphan_soft_scan.py`; `test_versioned_fellows_db.py` | conformant | +| AC-11 (concurrent-access detection) | Worker `isOwnershipConflictError()` → `OWNERSHIP_CONFLICT` with a specific "another tab/window of this app is already open" message; Web Lock `fellows-relationships-folder-write` guards folder writes. | `tests/e2e/test_user_folder_storage.py::TestPhase2WriteLock::test_lock_held_during_write_surfaces_failure_then_recovers`; `test_worker_spawn_failure.py` | conformant | +| AC-15 (build label tied to source revision) | `build/build_pwa.py:compute_build_label()` → `-`, stamped into `app.js`/`sw.js`/`vendor/sqlite-worker.js` at build time; `app/server.py` substitutes the same at serve time. | `tests/test_build_pwa.py`; `tests/e2e/test_update_check.py`; `test_bug_report.py` (asserts `app: -`); `test_boot_beacon.py` | conformant | +| AC-16 (user-driven transport selection) | Group/fellow export surfaces `mailto:` (+ `tel:`); the user picks per outreach; no transport is hardcoded as the sole option. Axis pick is `comms-transport-set: mailto-only` (Signal planned). | `tests/e2e/test_groups_export.py`; `tests/test_comms.py` | partial-conformance (conformant to the `mailto-only` axis pick; richer transports planned) | +| AC-17 (mirrored data is sourced) | `build/restore_from_knack_scrapefile.py` maps every column to a Knack `field_*` (raw_dump fallback); no contact data introduced beyond the configured Knack source. | `tests/test_database.py`; `build/diff_fellows_db.py` (bytewise vs `fellows.db.backup.2026-04-08`, via `just db-verify`); [`./data_provenance.md`](./data_provenance.md) (human review) | conformant | +| AC-18 (transports cannot read message contents) | Only `mailto:` / `tel:` offered — no centralized SaaS message broker (Slack/Discord). 
`mailto:` hands to the user's client; MCP comms only stages a `mailto:` URL. | Architectural / human-review; `tests/test_comms.py` (stage-only, returns `mailto:` URL); `tests/test_private_data_ops.py` (`mode=ro`) | conformant | +| AC-19 (user-visible payload before send) | Group export panel shows recipients + subject + body + merged data before launch, editable/cancelable; bulk shows recipient count + warning. | `tests/e2e/test_groups_export.py`; `tests/e2e/test_groups_compose.py` | conformant | +| AC-PRM-A (LLM calls over user data are transports) | Cloud-LLM use is opt-in via the `EX-CLOUD-LLM` exception (consent gate → non-PNA mode); a local model is the default green path. Per-call prompt + merged-data visibility lives in the cloud client's own UI (the user drives Claude Desktop). | `tests/e2e/test_pna_exception_mode.py`; `test_mcpb_settings.py`; see Exception attestation below | partial-conformance (cloud opt-in via per-install consent; per-call prompt visibility is the cloud client's UI, not the workspace's) | +| AC-PRM-D (re-ingestion is user-initiated) | Directory-data refresh is an explicit About-page button only; boot is install-only and never background-polls. | `tests/e2e/test_directory_data_update_flow.py`; `test_versioned_fellows_db.py::test_install_only_does_not_refetch_on_sha_mismatch` | conformant | +| AC-MCP-A (cloud AI clients require consent for Private DB) | Realized as the `EX-CLOUD-LLM` exception: a workspace consent gate before the user wires up a cloud client + a persistent non-PNA-mode signal; `mcp_servers/private_data_ops.py` opens both DBs `mode=ro`. The stdio servers are not per-call gated by design (out-of-band from the workspace — see [`../plans/pna_toolkit_exceptions_contribution.md`](../plans/pna_toolkit_exceptions_contribution.md) open question). 
| `tests/e2e/test_pna_exception_mode.py`; `tests/test_private_data_ops.py` (`mode=ro`) | partial-conformance (per-session/per-install opt-in via `EX-CLOUD-LLM`; not per-call server-side gating) | +| AC-MCP-B (MCP Communications stages; workspace launches) | `mcp_servers/comms.py:stage_email()` returns a `mailto:` URL + payload preview and never fires a transport; the user's mail client launches it. | `tests/test_comms.py::test_stage_email_basic_to`, `::test_stage_email_bcc_group_send` | conformant | + +### Flavor-derived ACs triggered by fellows's picks + +Cross-referenced to the toolkit's [axes.md](https://github.com/richbodo/personal_network_toolkit/blob/main/axes.md): + +| AC | Triggered by | Realization | Verification | Status | +|---|---|---|---|---| +| AC-2 (no SaaS surface) | `dist:web-bundle-with-magic-link` | `deploy/server.py` ships no per-user RW endpoints; the dev server's retired `/api/groups` and `/api/settings` were the only ones that ever existed (Phase 1 cutover). | `tests/test_deploy_auth_round_trip.py::test_directory_api_is_403_without_session`; `test_deploy_sqlite_api.py`; `test_deploy_mcpb_routes.py` | conformant | +| AC-3 (single OPFS owner) | `storage:opfs-sqlite-wasm` | `app/static/vendor/sqlite-worker.js` is the sole context that calls `navigator.storage.getDirectory` or opens a `FileSystemSyncAccessHandle`. | `tests/e2e/test_worker_rpc.py`; `test_worker_cold_start.py`; `test_local_first_boot.py` | conformant | +| AC-5 (stale session never locks users out of cache) | `dist:web-bundle-with-magic-link` (auth-gated) | Three-tier `window.__dataProvider` hot-swaps `worker` → `api+idb` on 401/403 mid-boot; the cached directory stays readable. 
| `tests/e2e/test_offline_only_mode.py::test_returning_visit_renders_from_local_opfs_when_network_down`; `test_search_offline_fallback.py`; `test_local_first_boot.py` | conformant | +| AC-8 (anti-enumeration + abuse-bounded analytics) | `dist:web-bundle-with-magic-link` + `debug:has-error-sink` | `deploy/server.py` auth endpoints return neutral payloads with per-IP rate limits (`deploy/magic_link_auth.py:check_rate_limit`); the `/api/client-errors` sink is sanitized (`deploy/client_error_sanitizer.py`). See [`./email_gate.md`](./email_gate.md). | `tests/test_magic_link_auth.py`; `test_deploy_auth_round_trip.py`; `test_deploy_client_errors.py`; `test_client_error_sanitizer.py` | conformant | +| AC-12 (capability detection inside worker) | `storage:opfs-sqlite-wasm` | Worker `init` reports `opfsCapable`; the main thread reads the field and renders the unsupported-browser panel rather than UA-sniffing. | `tests/e2e/test_unsupported_browser.py::test_no_sah_falls_back_to_api_idb_provider`; `test_worker_cold_start.py` | conformant | +| AC-13 (COOP/COEP required) | `storage:opfs-sqlite-wasm` + `dist:web-served` | Both dev (`app/server.py:Handler.end_headers`) and prod (`deploy/server.py`) send `Cross-Origin-Opener-Policy: same-origin` + `Cross-Origin-Embedder-Policy: require-corp` and a strict CSP. Caddy preserves them at the edge. | `tests/test_api.py::TestSecurityHeaders::test_coop_coep_present`, `::test_strict_csp_present`, `::test_other_hardening_headers_present` | conformant | +| AC-14 (SW never owns SQLite) | `dist:web-bundle-with-magic-link` (PWA) | `app/static/sw.js` is app-shell + update/signature only; `/fellows.db` is explicitly bypassed in the fetch handler. | `tests/e2e/test_sw_post_caching.py`; `test_image_cache_no_bust.py` | conformant | + +### ACs that are not applicable in fellows's flavor + +| AC | Reason | +|---|---| +| AC-PRM-B | Applies to `ingestion:multi-source-merge-with-dedup`; fellows is single-source (`single-source-static-mirror`). 
| +| AC-PRM-C | Applies to `storage:native-sqlite-via-filesystem`; fellows uses `opfs-sqlite-wasm`. | + +Picks fellows did not take on other axes carry their own flavor-derived ACs in [axes.md](https://github.com/richbodo/personal_network_toolkit/blob/main/axes.md); none fire here. + +### Exception attestation (non-PNA mode) + +fellows raises one PNA **Exception** — `EX-CLOUD-LLM` — when the user wires the directory to a +cloud LLM (Claude Desktop MCP). See [`./architectural_findings.md`](./architectural_findings.md) for +the discovery and [`../plans/pna_toolkit_exceptions_contribution.md`](../plans/pna_toolkit_exceptions_contribution.md) +for the upstream-contribution plan (the handler contract `EX-H1..EX-H8`). + +| EX | Relaxes | Handled? | Realization | Verification | Status | +|---|---|---|---|---|---| +| EX-CLOUD-LLM | PNA-DEFINITION (local-only / never-SaaS), AC-MCP-A; stresses Goal 1 | yes; reversible (mode only) | Workspace-side handler: consent gate in Settings before the user wires up the cloud client (`recordMcpbConsent()`); persistent dismissable "Going rogue — not a PNA" banner naming the exception (`syncNotAPnaBanner()`, `index.html`); in-app explainer `#/exception/EX-CLOUD-LLM` rendering the **per-dimension strength profile** (`PNA_EXCEPTION_STRENGTH` → `renderExceptionPage()`, EX-H8); "Return to PNA mode" control (`returnToPnaMode()`); `` machine-readable marker. **EX-H7** consent-to-human propagation is surfaced best-effort via the MCP `instructions` handshake on the data-returning servers (`CLOUD_LLM_PROPAGATION_NOTICE` in `mcp_servers/private_data_ops.py` + `mcp_servers/shared_data_ops.py`); servers stay `mode=ro`, not per-call gated. Code: `app/static/app.js`, `app/static/index.html`, `mcp_servers/`. 
| `tests/e2e/test_pna_exception_mode.py` (raise/dismiss/persist, banner→explainer, return-to-PNA from explainer + Settings, explainer active/inactive/unknown, `test_explainer_shows_per_dimension_strength_profile`); `tests/e2e/test_mcpb_settings.py`; `tests/test_private_data_ops.py::test_instructions_carry_cloud_llm_propagation_notice`; `tests/test_shared_data_ops.py::test_instructions_carry_cloud_llm_propagation_notice` | conformant for EX-H1–H6 and EX-H8; EX-H7 (consent-to-human) surfaced best-effort via MCP `instructions` | + +--- + +## HTTP API + +Read-only fellow data (served from `app/fellows.db`): + +| Method | Path | Purpose | +|---|---|---| +| GET | `/api/fellows` | Minimal list (`record_id`, `slug`, `name`, `has_contact_email`) for instant directory render. | +| GET | `/api/fellows?full=1` | Full fellow rows (phase 2 of the two-phase load). | +| GET | `/api/fellows/` | One fellow by `slug` or `record_id`. | +| GET | `/api/search?q=…` | FTS5 search across name / bio / cohort / fellow_type / search_tags / key_links. | +| GET | `/api/stats` | Aggregates for the About page: total, breakdowns by fellow_type / cohort / region, field completeness. | +| GET | `/fellows.db` | Raw SQLite snapshot for the PWA's OPFS bootstrap. | +| GET | `/images/.{jpg,png}` | Profile image; alphanumeric-fuzzy filename fallback. | +| GET | `/` and other static paths | App shell from `app/static/`. | + +Production-only routes (added by `deploy/server.py`; conform to the Distribution slot's auth contract [`distribution-auth.openapi.yaml`](https://github.com/richbodo/personal_network_toolkit/blob/main/spec/contracts/distribution-auth.openapi.yaml)): + +| Method | Path | Purpose | +|---|---|---| +| GET | `/api/auth/status` | Never gated; returns `{authEnabled, authenticated, hasSessionCookie, installRecentlyAllowed, build, buildGitSha}`. | +| POST | `/api/send-unlock` | Anti-enum, always 200; rate-limited per email-hash. 
| +| POST | `/api/verify-token` | 200 + Set-Cookie on success; 401 with distinct `expired`/`invalid` strings otherwise. | +| POST | `/api/logout` | Idempotent, always 200. | +| POST | `/api/client-errors` | Unauthenticated client-error sink. Always 204. Sanitized + rate-limited; logs `event=client_error` to journald. Dev stub mirrors prod for round-trip. Schema: [`client-errors-payload.schema.json`](https://github.com/richbodo/personal_network_toolkit/blob/main/spec/contracts/client-errors-payload.schema.json); privacy boundary detailed in [`./email_gate.md` § Client error reporting](./email_gate.md#client-error-reporting). | +| GET | `/healthz` | Liveness probe. | +| GET | `/build-meta.json` | Build label + git SHA + `fellows_db_sha` for SW drift-check. | +| GET | `/api/debug/diagnostics` | Operator diagnostics blob. | + +The server opens a new SQLite connection per request (no pool; unnecessary at local scale). `/api/stats` is heavier than a simple row fetch (region split + field-completeness pass over `extra_json` via `json_extract`); still fine at local scale. + +The Two-Phase Load pattern (`/api/fellows` then `/api/fellows?full=1` in the background, falling back to `/api/fellows/` if the user clicks before phase 2 completes) is a Workspace concern; the route names are fellows specifics. + +--- + +## Cross-origin and CSP headers + +Both servers send (must be preserved by Caddy at the edge): + +- `Cross-Origin-Opener-Policy: same-origin` + `Cross-Origin-Embedder-Policy: require-corp` — AC-13 prerequisite for OPFS-SAH-Pool. +- `Cross-Origin-Resource-Policy: same-origin`, `Referrer-Policy: strict-origin-when-cross-origin`, `X-Content-Type-Options: nosniff`. 
+- A strict Content-Security-Policy: `default-src 'self'; script-src 'self' 'wasm-unsafe-eval'; worker-src 'self'; connect-src 'self'; img-src 'self' data:; style-src 'self'; font-src 'self'; object-src 'none'; base-uri 'self'; frame-ancestors 'none';` +- A locked-down `Permissions-Policy` (geolocation / camera / microphone / payment / sensors / USB / etc. all `=()`). + +HSTS is added by Caddy (`ansible/roles/caddy/templates/Caddyfile.j2`), not the Python server. + +**Subresource Integrity (SRI):** `index.html` carries SHA-384 `integrity=` attributes on both `