diff --git a/.github/workflows/spec-lint.yml b/.github/workflows/spec-lint.yml
index 5800b44..a4088b3 100644
--- a/.github/workflows/spec-lint.yml
+++ b/.github/workflows/spec-lint.yml
@@ -14,3 +14,19 @@ jobs:
         with:
           python-version: "3.11"
       - run: python3 tools/lint-spec-ids.py
+
+  egress-lint-selftest:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+      - name: clean fixture must pass (exit 0)
+        run: python3 tools/egress-lint.py tools/egress-lint-fixtures/clean
+      - name: dirty fixture must fail (exit 1)
+        run: |
+          if python3 tools/egress-lint.py tools/egress-lint-fixtures/dirty; then
+            echo "::error::egress-lint did not flag the dirty fixture"
+            exit 1
+          fi
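
Note on the self-test steps in the workflow above: the shell `if` accepts any non-zero exit from the dirty fixture, so a linter crash would also count as a pass. The sketch below is one way to run the same two checks locally with a stricter comparison. It is not part of this commit: `selftest` and its injectable `run` parameter are hypothetical names, and it assumes the linter signals findings with exit code 1, as the step name suggests.

```python
import subprocess
import sys
from typing import Callable, List

LINT = "tools/egress-lint.py"            # path taken from the workflow
FIXTURES = "tools/egress-lint-fixtures"  # path taken from the workflow


def run_lint(target: str) -> int:
    """Run the linter on one fixture directory and return its exit code."""
    return subprocess.run([sys.executable, LINT, target]).returncode


def selftest(run: Callable[[str], int] = run_lint) -> List[str]:
    """Return a list of problems; an empty list means the self-test passed."""
    problems = []
    clean = run(f"{FIXTURES}/clean")
    if clean != 0:
        problems.append(f"clean fixture exited {clean}, expected 0")
    dirty = run(f"{FIXTURES}/dirty")
    # Assumption: exit 1 means "violations found"; any other code is a crash.
    if dirty != 1:
        problems.append(f"dirty fixture exited {dirty}, expected 1")
    return problems
```

Calling `selftest()` from the repository root runs the real linter; passing a stub for `run` exercises the comparison logic without touching the filesystem.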
diff --git a/README.md b/README.md
index 44e2bdf..2fdd0c7 100644
--- a/README.md
+++ b/README.md
@@ -10,10 +10,17 @@ Three deliverables, in dependency order:
2. **Production-ready reference applications.** Working PNAs you can install, study, and adapt. — first reference design is a distributed directory archive (lives at [richbodo/fellows_local_db](https://github.com/richbodo/fellows_local_db)).
3. **AI tooling — skill + MCP (Model Context Protocol) servers.** How AI agents work with PNT. The skill at [`pna-build-eval-contrib/SKILL.md`](pna-build-eval-contrib/SKILL.md) is what an agent reads to consume the spec at design time. The MCP servers (typed contracts in [`contracts/`](contracts/); three v1 stdio implementations in `fellows_local_db/mcp_servers/`) expose an already-built PNA's capabilities at runtime so AI clients (Claude Desktop, Cursor, local Ollama agents) can drive a PNA on the user's behalf.
-PNT supports three modes of use, all packaged in the [skill](pna-build-eval-contrib/SKILL.md):
+PNT supports three modes of use, all packaged in the [skill](pna-build-eval-contrib/SKILL.md). **Install it once** so your agent auto-discovers it — symlink the skill into your skills directory (run from your PNT working directory):
+```bash
+mkdir -p ~/.claude/skills
+ln -s "$(pwd)/pna-build-eval-contrib" ~/.claude/skills/pna-build-eval-contrib
+```
+
+A `git pull` here then updates the skill everywhere it's used. See [`docs/users-guide.md` § Install the skill](docs/users-guide.md#install-the-skill) for copy-instead-of-symlink, project-scoped, and no-install alternatives. With the skill installed, drive any mode in natural language:
+
+- **Evaluate.** *Audit any contact app for safety before you install it.* An AI agent reads the candidate's source, checks it against every applicable AC (Architectural Commitment), and returns a structured report flagging anything that would put your data at risk. The lowest-friction way in — and it doubles as a self-check on your own in-progress design.
- **Build.** An AI agent reads the spec and helps you compose a conformant PNA against the typed contracts, adapting from a reference design that shares your axis picks.
-- **Evaluate.** An AI agent audits a candidate PNA's source against every applicable AC (Architectural Commitment) and returns a structured report — useful for deciding whether someone else's PNA is safe to install, or for self-checking your own in-progress design.
- **Contribute.** When you find a spec gap or have a design that adds ecosystem value, the skill walks you through preflight validation (Architecture document + AC attestation table) and then opens the PR back to PNT.
See [`docs/users-guide.md`](docs/users-guide.md) for step-by-step instructions for each.
diff --git a/docs/users-guide.md b/docs/users-guide.md
index 98d85ba..2039240 100644
--- a/docs/users-guide.md
+++ b/docs/users-guide.md
@@ -4,7 +4,17 @@ The PNA (Personal Network Application) Spec is the canonical specification; this
PNT (Personal Network Toolkit) is built to be consumed by AI coding agents. Most of this guide assumes you have an agent (Claude Code, Cursor, an equivalent) you can ask things like *"use the PNT skill to validate my design."* The skill at [`pna-build-eval-contrib/SKILL.md`](../pna-build-eval-contrib/SKILL.md) is the agent-consumption view of everything in this guide.
-> **Status note (May 2026).** The three skill flows below — build, evaluate, contribute — haven't been exercised end-to-end yet. The materials are in place; Phase 5 of the reorganization plan validates them against `fellows_local_db` as the first reference design. The agent prompts and output shapes below describe the intended behavior per [`pna-build-eval-contrib/SKILL.md`](../pna-build-eval-contrib/SKILL.md); expect refinement as the skill gets dogfooded.
+**The fastest way in is auditing.** If you just want to know whether a contact app is safe before you install it — without building or contributing anything — go straight to [Goal 2](#goal-2--audit-a-candidate-pna-before-installing-it). It's the lowest-friction front door to PNT: point an agent at the app's source and get back an AC-keyed safety report.
+
+> **Status note (May 2026).** PNT's deterministic tooling is now tested; the agent-driven flows are not yet exercised end-to-end.
+>
+> **Tested / CI-enforced:**
+> - [`tools/egress-lint.py`](../tools/egress-lint.py) — the deterministic AC-1 egress check, with clean/dirty self-test fixtures run in CI.
+> - [`tools/lint-spec-ids.py`](../tools/lint-spec-ids.py) — AC ↔ contract traceability lint, run in CI.
+> - [`tools/evaluate-report.schema.json`](../tools/evaluate-report.schema.json) — the audit-report schema, validated against its meta-schema and conditional rules.
+>
+> **Not yet exercised end-to-end:**
+> - The **build**, **audit**, and **contribute** skill flows. The materials are in place; Phase 5 of the reorganization plan validates them against `fellows_local_db` as the first reference design. The agent prompts and output shapes below describe the *intended* behavior per [`pna-build-eval-contrib/SKILL.md`](../pna-build-eval-contrib/SKILL.md); expect refinement as the skill gets dogfooded.
---
@@ -68,7 +78,7 @@ You're starting (or extending) a personal network application.
The Verification field is load-bearing for Goal 3 (Contribute). See Goal 6 for what makes a good Verification entry.
-**7. Self-check.** Run Goal 2 (Audit) on your own in-progress code before declaring the design done. The agent walks every applicable AC and flags non-conformances.
+**7. Self-check.** Run Goal 2 (Audit) on your own in-progress code before declaring the design done. The agent walks every applicable AC and flags non-conformances. For the AC-1 (private-data-sovereignty) row in particular, add an `egress-allow.json` to your repo listing the remote origins your flavor legitimately uses, and run [`tools/egress-lint.py`](../tools/egress-lint.py) against your source — it's the deterministic half of that check and makes a ready-made Verification entry (see Goal 6). Wire it into your own CI so a future change can't silently introduce an off-device data path.
---
@@ -96,12 +106,16 @@ You have a PNA in front of you (someone else's, or your own in-progress one) and
If the candidate ships its own Architecture document with an AC attestation table, the agent validates the document against the code (do cited code locations match the claimed realization? do declared verification mechanisms actually pass?). If there's no Architecture document, the agent infers axis picks from the source and walks every applicable AC from scratch.
-**4. Read the AC-keyed report.** The agent produces a structured report keyed by AC ID:
+ As part of the audit the agent also runs the deterministic checks in `tools/`, notably [`tools/egress-lint.py`](../tools/egress-lint.py), which scans for off-device data leaks (the AC-1 sovereignty concern), and folds their results into the matching AC findings as `source: deterministic` evidence, alongside its own reading. The deterministic layer catches the single stray egress call that is easy to miss in a large tree; the LLM layer reasons about everything the lint can't.
+
+**4. Read the AC-keyed report.** The agent produces a structured report keyed by AC ID, emitted as a typed artifact ([`tools/evaluate-report.schema.json`](../tools/evaluate-report.schema.json)) with a human-readable rendering over it. Per-AC status is one of:
- `conformant` — design honors this AC; cited code locations included
- `non-conformant` — design violates this AC; report names the AC requirement and the offending code
- `not-applicable` — design's flavor doesn't trigger this AC
- `unable-to-determine` — needs human review
+ Because the report is typed, two runs over the same candidate are diffable. Ask the agent to save the artifact (e.g. `eval-report.json`); when the app ships an update, re-audit and diff the two JSON files. Any per-AC status change is your drift/regression signal: the "did anything quietly stop conforming?" check.
+
**5. Decide.** Goals 1–5 are the load-bearing user-facing concerns — private-data sovereignty (Goal 1), source-mirroring honesty (Goal 2), transport security (Goal 3), durability (Goal 4), local diagnosability (Goal 5). If any of those are non-conformant, the design is not safe to trust with your data. Non-conformances against architectural details that don't touch Goals 1–5 are still worth fixing but aren't immediate red flags.
**Optional: emphasize a specific concern.** E.g.: *"Focus on Goal 1 — make sure my Private DB rows can't leave my device."* This shapes the summary, not the underlying check.
@@ -196,7 +210,7 @@ Your job as a contributor: fill in the **AC attestation table** in your Architec
The Verification field is load-bearing. Three kinds are acceptable:
-1. **Deterministic test** — a script or test file decides conformance mechanically. Example: a script that scans the codebase for any `fetch(...)` call to a non-localhost URL on the Private DB code path.
+1. **Deterministic test** — a script or test file decides conformance mechanically. Example: [`tools/egress-lint.py`](../tools/egress-lint.py) scans the source for unsanctioned off-device egress vectors (`fetch`/`sendBeacon`/remote `src`/etc.) against an allow-list of the origins your flavor legitimately uses, and its `--json` output folds straight into the AC-1 finding of an evaluate report.
2. **LLM evaluation rubric** — a prompt or rubric describing what an LLM should look for. Useful for posture/intent ACs that mechanical tests can't reach. Example: *"Read every code path that reads from Private DB and decide whether any of them sends data off-device. Cite specific call sites."*
3. **Human-review note** — a short note explaining why no automated test is feasible, with the review record itself archived in the design's repo (e.g., `docs/conformance-review-2026-05.md`).
@@ -228,5 +242,8 @@ The skill description triggers on natural-language requests fitting any of these
- [`reference_designs/`](../reference_designs/) — accepted designs + templates
- [`pna-build-eval-contrib/SKILL.md`](../pna-build-eval-contrib/SKILL.md) — the agent-consumption view (what you're invoking through the agent above)
- [`CONTRIBUTING.md`](../CONTRIBUTING.md) — full contribution rules
-- [`tools/`](../tools/) — validators
+- [`tools/`](../tools/) — validators and the audit-report schema:
+ - [`tools/egress-lint.py`](../tools/egress-lint.py) — deterministic AC-1 check for off-device data leaks (Goals 1, 2, 6)
+ - [`tools/evaluate-report.schema.json`](../tools/evaluate-report.schema.json) — typed schema for the audit report (Goal 2)
+ - [`tools/lint-spec-ids.py`](../tools/lint-spec-ids.py) — AC ↔ contract traceability lint
- [`plans/reorganization-plan.md`](../plans/reorganization-plan.md) — the live plan tracking PNT's own evolution
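
Note on the "diff the two JSON files" workflow described in Goal 2 above: a minimal sketch of the per-AC comparison, under stated assumptions. It assumes `findings` is a JSON object keyed by AC ID whose values carry a `status` field; the guide says the report is AC-keyed but does not show the exact shape, so check it against `tools/evaluate-report.schema.json`. The function names and the `"absent"` placeholder are hypothetical.

```python
import json
from typing import Dict, Tuple


def read_report(path: str) -> dict:
    """Load a saved evaluate report (e.g. eval-report.json) from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def load_statuses(report: dict) -> Dict[str, str]:
    """Map AC ID -> status. Assumes findings is an object keyed by AC ID."""
    return {ac: f["status"] for ac, f in report.get("findings", {}).items()}


def status_changes(old: dict, new: dict) -> Dict[str, Tuple[str, str]]:
    """Return {AC ID: (old status, new status)} for every AC that changed."""
    before, after = load_statuses(old), load_statuses(new)
    changes = {}
    for ac in sorted(before.keys() | after.keys()):
        a = before.get(ac, "absent")  # AC missing from the older report
        b = after.get(ac, "absent")   # AC missing from the newer report
        if a != b:
            changes[ac] = (a, b)
    return changes
```

An empty result means no AC changed status between the two audits; any entry that moves away from `conformant` is the regression to look at first.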
diff --git a/llms.txt b/llms.txt
index b14aecf..a361967 100644
--- a/llms.txt
+++ b/llms.txt
@@ -2,6 +2,8 @@
> Universal specification for personal network applications (PNAs). Defines vocabulary, goals, use cases, axes, composition (how PNAs get built), architectural commitments, and slot contracts. Targeted at AI agents and humans building PNAs together.
+**New here? Start with the skill.** If you are an AI agent that landed here cold, read [`pna-build-eval-contrib/SKILL.md`](pna-build-eval-contrib/SKILL.md) first. It is the entry point: it routes you into the spec, contracts, and reference designs for whichever flow you need — **evaluate** ("is this app safe to install?"), **build**, or **contribute**.
+
## Skill (agent entry point)
- [pna-build-eval-contrib/SKILL.md](pna-build-eval-contrib/SKILL.md) — canonical PNT skill for AI agents. Three flows: build a conformant PNA from the spec, evaluate a candidate PNA's conformance against the spec, contribute a design back to PNT
diff --git a/plans/pnt-next-steps-plan.md b/plans/pnt-next-steps-plan.md
new file mode 100644
index 0000000..2bf1d46
--- /dev/null
+++ b/plans/pnt-next-steps-plan.md
@@ -0,0 +1,89 @@
+# PNT Next Steps — High-Level Plan
+
+Ordered as requested: **1 → 4 → 5 → 3 → 6 → 2**. High-level only; work the details with Claude Code. Each item notes how it rides the existing reorg phases and `tools/` conventions rather than starting a parallel track.
+
+Sequencing logic: a cheap README win first; then formalize the evaluate *output* (4) so later checks have a place to land; then a real design to test against (5); then the deterministic check (3) whose findings flow into that output; then a reading-gated architecture decision (6); then the skill split (2) last, once Evaluate has earned its place as the front door.
+
+---
+
+## 1. Install signpost + promote Evaluate (quick win) — ✅ DONE (2026-05-29)
+
+**Goal.** Close the two remaining README gaps now that the skill is already surfaced and linked.
+
+> **Status:** Done. README now leads the three modes with Evaluate ("audit any contact app for safety before you install it") and carries a concrete symlink install snippet pointing to `docs/users-guide.md § Install the skill`; `llms.txt` opens with a "Start with the skill" line routing a cold agent to `SKILL.md` as the entry point.
+
+- Add a concrete **install/activation** snippet: how an agent picks up the skill (copy `pna-build-eval-contrib/` into `.claude/skills/`, or the equivalent one-liner), so it auto-discovers rather than relying on a human pasting the path.
+- Reorder the "three modes" so **Evaluate leads** for the average reader — frame it as "audit any contact app for safety before you install it," with Build/Contribute following. Evaluate is the lowest-friction front door and the one a non-builder actually wants.
+- Make `llms.txt` route a cold agent to the SKILL.md as the build/eval entry point.
+
+**Done when.** A new person can read the README and get an agent running Evaluate without asking how.
+
+---
+
+## 4. Typed evaluate-report artifact — ✅ DONE (2026-05-29)
+
+**Goal.** Turn the evaluate flow's existing structured report into a typed artifact so results are machine-comparable and drift becomes a diff.
+
+> **Status:** Done. JSON Schema at `tools/evaluate-report.schema.json` (Draft 2020-12, validated): AC-keyed `findings` with per-AC `status` (`conformant`/`non-conformant`/`not-applicable`/`unable-to-determine`), code-location citations, a `summary` posture, and an `evidence` array tagged by `source` (`deterministic`/`llm`/`human`), which is the seam item 3's egress lint feeds into. Conditional rules require citations when a finding is `conformant` or `non-conformant`, and a rationale when it is `not-applicable` or `unable-to-determine`. The schema lives in `tools/`, not `contracts/`, because it realizes no AC (it would fail `lint-spec-ids.py`). `SKILL.md` § Evaluate flow now emits the artifact as the source of truth with the prose report as a view; `docs/users-guide.md` Goal 2 and the skill's Key resources are updated.
+
+- Define a JSON Schema for the AC-keyed report (per-AC status: `conformant` / `non-conformant` / `not-applicable` / `unable-to-determine`, plus cited code locations and the summary posture).
+- Have the evaluate flow emit to this schema; keep the human-readable rendering as a view over it.
+- Reinforces README Goal 6 (AC as unit of identity) and gives the "occasionally re-check we didn't drift" workflow a concrete regression signal.
+
+**Done when.** Two eval runs on the same design can be diffed to show exactly which ACs changed status.
+
+---
+
+## 5. Attest the mutual-aid / community-care use case
+
+**Goal.** Add the use case closest to your social-network-health origin — surface who in a personal network needs help and who can offer it, then communicate — alongside the existing Directory Archive / PRM / Multi-PNA entries in `use_cases.md`.
+
+- You're already building a reference design for this; let the design drive the use-case attestation rather than writing it speculatively.
+- Treat it as the hardest privacy stress test: a "needs help" field is health/vulnerability-adjacent, so it should exercise your sovereignty and consent ACs harder than any prior design. Note any new flavor-derived ACs it surfaces (candidate Contribute-flow spec diff).
+
+**Done when.** The use case is attested in `use_cases.md` and backed by a working reference design with a filled AC attestation table.
+
+---
+
+## 3. Egress lint (deterministic sovereignty check) — ✅ DONE (2026-05-29)
+
+**Goal.** One deterministic check guarding Goal 1 (private-data sovereignty): does any code path send private data off-device?
+
+> **Status:** Done. `tools/egress-lint.py` statically scans a PNA source tree for egress vectors (`fetch`/XHR/`sendBeacon`/`WebSocket`/`EventSource`/`import()`/`importScripts`/axios/jQuery and HTML `src`/`action`/`object data`/``/`