Merged
16 changes: 16 additions & 0 deletions .github/workflows/spec-lint.yml
@@ -14,3 +14,19 @@ jobs:
        with:
          python-version: "3.11"
      - run: python3 tools/lint-spec-ids.py

  egress-lint-selftest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: clean fixture must pass (exit 0)
        run: python3 tools/egress-lint.py tools/egress-lint-fixtures/clean
      - name: dirty fixture must fail (exit 1)
        run: |
          if python3 tools/egress-lint.py tools/egress-lint-fixtures/dirty; then
            echo "::error::egress-lint did not flag the dirty fixture"
            exit 1
          fi
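
The self-test job above asserts on exit codes only: the clean fixture must exit 0 and the dirty fixture must exit non-zero. For running the same check locally, here is a minimal Python sketch of that logic. The helper names `exits_zero`, `lint_cmd`, and `selftest` are hypothetical and not part of this PR; only the script and fixture paths come from the workflow.

```python
import subprocess
import sys


def exits_zero(cmd: list) -> bool:
    """Run cmd and report whether it exited with status 0."""
    return subprocess.run(cmd, capture_output=True).returncode == 0


def lint_cmd(fixture: str) -> list:
    """Build the linter invocation used by the workflow for one fixture directory."""
    return [sys.executable, "tools/egress-lint.py", fixture]


def selftest(clean_cmd: list, dirty_cmd: list) -> list:
    """Return failure messages; an empty list means the self-test passed."""
    failures = []
    if not exits_zero(clean_cmd):
        failures.append("linter rejected the clean fixture")
    if exits_zero(dirty_cmd):
        # Mirrors the workflow: any non-zero exit counts as "flagged".
        failures.append("linter did not flag the dirty fixture")
    return failures
```

From the repository root, `selftest(lint_cmd("tools/egress-lint-fixtures/clean"), lint_cmd("tools/egress-lint-fixtures/dirty"))` should return an empty list if the linter behaves as the workflow expects. Like the shell step, this treats any non-zero exit as a failure of the dirty fixture, so a crash in the linter would also count as "flagged".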
11 changes: 9 additions & 2 deletions README.md
@@ -10,10 +10,17 @@ Three deliverables, in dependency order:
2. **Production-ready reference applications.** Working PNAs you can install, study, and adapt. The first reference design is a distributed directory archive, which lives at [richbodo/fellows_local_db](https://github.com/richbodo/fellows_local_db).
3. **AI tooling — skill + MCP (Model Context Protocol) servers.** How AI agents work with PNT. The skill at [`pna-build-eval-contrib/SKILL.md`](pna-build-eval-contrib/SKILL.md) is what an agent reads to consume the spec at design time. The MCP servers (typed contracts in [`contracts/`](contracts/); three v1 stdio implementations in `fellows_local_db/mcp_servers/`) expose an already-built PNA's capabilities at runtime so AI clients (Claude Desktop, Cursor, local Ollama agents) can drive a PNA on the user's behalf.

PNT supports three modes of use, all packaged in the [skill](pna-build-eval-contrib/SKILL.md):
PNT supports three modes of use, all packaged in the [skill](pna-build-eval-contrib/SKILL.md). **Install it once** so your agent auto-discovers it — symlink the skill into your skills directory (run from your PNT working directory):

```bash
mkdir -p ~/.claude/skills
ln -s "$(pwd)/pna-build-eval-contrib" ~/.claude/skills/pna-build-eval-contrib
```

A `git pull` here then updates the skill everywhere it's used. See [`docs/users-guide.md` § Install the skill](docs/users-guide.md#install-the-skill) for copy-instead-of-symlink, project-scoped, and no-install alternatives. With the skill installed, drive any mode in natural language:

- **Evaluate.** *Audit any contact app for safety before you install it.* An AI agent reads the candidate's source, checks it against every applicable AC (Architectural Commitment), and returns a structured report flagging anything that would put your data at risk. The lowest-friction way in — and it doubles as a self-check on your own in-progress design.
- **Build.** An AI agent reads the spec and helps you compose a conformant PNA against the typed contracts, adapting from a reference design that shares your axis picks.
- **Evaluate.** An AI agent audits a candidate PNA's source against every applicable AC (Architectural Commitment) and returns a structured report — useful for deciding whether someone else's PNA is safe to install, or for self-checking your own in-progress design.
- **Contribute.** When you find a spec gap or have a design that adds ecosystem value, the skill walks you through preflight validation (Architecture document + AC attestation table) and then opens the PR back to PNT.

See [`docs/users-guide.md`](docs/users-guide.md) for step-by-step instructions for each.
27 changes: 22 additions & 5 deletions docs/users-guide.md
@@ -4,7 +4,17 @@ The PNA (Personal Network Application) Spec is the canonical specification; this

PNT (Personal Network Toolkit) is built to be consumed by AI coding agents. Most of this guide assumes you have an agent (Claude Code, Cursor, or an equivalent) you can ask things like *"use the PNT skill to validate my design."* The skill at [`pna-build-eval-contrib/SKILL.md`](../pna-build-eval-contrib/SKILL.md) is the agent-consumption view of everything in this guide.

> **Status note (May 2026).** The three skill flows below — build, evaluate, contribute — haven't been exercised end-to-end yet. The materials are in place; Phase 5 of the reorganization plan validates them against `fellows_local_db` as the first reference design. The agent prompts and output shapes below describe the intended behavior per [`pna-build-eval-contrib/SKILL.md`](../pna-build-eval-contrib/SKILL.md); expect refinement as the skill gets dogfooded.
**The fastest way in is auditing.** If you just want to know whether a contact app is safe before you install it — without building or contributing anything — go straight to [Goal 2](#goal-2--audit-a-candidate-pna-before-installing-it). It's the lowest-friction front door to PNT: point an agent at the app's source and get back an AC-keyed safety report.

> **Status note (May 2026).** PNT's deterministic tooling is now tested; the agent-driven flows are not yet exercised end-to-end.
>
> **Tested / CI-enforced:**
> - [`tools/egress-lint.py`](../tools/egress-lint.py) — the deterministic AC-1 egress check, with clean/dirty self-test fixtures run in CI.
> - [`tools/lint-spec-ids.py`](../tools/lint-spec-ids.py) — AC ↔ contract traceability lint, run in CI.
> - [`tools/evaluate-report.schema.json`](../tools/evaluate-report.schema.json) — the audit-report schema, validated against its meta-schema and conditional rules.
>
> **Not yet exercised end-to-end:**
> - The **build**, **audit**, and **contribute** skill flows. The materials are in place; Phase 5 of the reorganization plan validates them against `fellows_local_db` as the first reference design. The agent prompts and output shapes below describe the *intended* behavior per [`pna-build-eval-contrib/SKILL.md`](../pna-build-eval-contrib/SKILL.md); expect refinement as the skill gets dogfooded.

---

@@ -68,7 +78,7 @@ You're starting (or extending) a personal network application.

The Verification field is load-bearing for Goal 3 (Contribute). See Goal 6 for what makes a good Verification entry.

**7. Self-check.** Run Goal 2 (Audit) on your own in-progress code before declaring the design done. The agent walks every applicable AC and flags non-conformances.
**7. Self-check.** Run Goal 2 (Audit) on your own in-progress code before declaring the design done. The agent walks every applicable AC and flags non-conformances. For the AC-1 (private-data-sovereignty) row in particular, add an `egress-allow.json` to your repo listing the remote origins your flavor legitimately uses, and run [`tools/egress-lint.py`](../tools/egress-lint.py) against your source — it's the deterministic half of that check and makes a ready-made Verification entry (see Goal 6). Wire it into your own CI so a future change can't silently introduce an off-device data path.
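
To make the allow-list idea concrete, here is a minimal sketch of an origin check in Python. It is not the implementation of `tools/egress-lint.py`: the `allowed_origins` key, the `find_unsanctioned` helper, and the example URLs are all assumptions for illustration, and the real `egress-allow.json` format may differ. It flags only literal `http(s)` URLs whose origin is missing from the allow-list.

```python
import json
import re
from urllib.parse import urlsplit

# Assumed allow-list shape (hypothetical): {"allowed_origins": ["https://api.example.org"]}
URL_PATTERN = re.compile(r"""https?://[^\s"'`)<>]+""")


def origin_of(url: str) -> str:
    """Reduce a URL to its scheme://host[:port] origin, lowercased."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}".lower()


def find_unsanctioned(source: str, allow_json: str) -> list:
    """Return (line number, origin) pairs for URLs whose origin is not allow-listed."""
    allowed = {o.lower().rstrip("/") for o in json.loads(allow_json)["allowed_origins"]}
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in URL_PATTERN.finditer(line):
            origin = origin_of(match.group())
            if origin not in allowed:
                findings.append((lineno, origin))
    return findings
```

A literal-URL scan like this misses origins assembled at runtime (string concatenation, config lookups), which is one reason the guide pairs the deterministic check with the agent's own reading of the code.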

---

@@ -96,12 +106,16 @@ You have a PNA in front of you (someone else's, or your own in-progress one) and

If the candidate ships its own Architecture document with an AC attestation table, the agent validates the document against the code (do cited code locations match the claimed realization? do declared verification mechanisms actually pass?). If there's no Architecture document, the agent infers axis picks from the source and walks every applicable AC from scratch.

**4. Read the AC-keyed report.** The agent produces a structured report keyed by AC ID:
As part of the audit the agent also runs the deterministic checks in `tools/`, notably [`tools/egress-lint.py`](../tools/egress-lint.py), which scans for off-device data leaks (the AC-1 sovereignty concern), and folds their results into the matching AC findings as `source: deterministic` evidence, alongside its own reading. The deterministic layer catches the single stray egress call that is easy to miss in a large source tree; the LLM layer reasons about everything the lint can't.

**4. Read the AC-keyed report.** The agent produces a structured report keyed by AC ID, emitted as a typed artifact ([`tools/evaluate-report.schema.json`](../tools/evaluate-report.schema.json)) with a human-readable rendering over it. Per-AC status is one of:
- `conformant` — design honors this AC; cited code locations included
- `non-conformant` — design violates this AC; report names the AC requirement and the offending code
- `not-applicable` — design's flavor doesn't trigger this AC
- `unable-to-determine` — needs human review

Because the report is typed, two runs over the same candidate are diffable. Ask the agent to save the artifact (e.g. `eval-report.json`); when the app ships an update, re-audit and diff the two JSON files — the per-AC status changes are your drift/regression signal (the "did anything quietly stop conforming?" check). The human-readable summary you read is just a rendering over this artifact.
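
The re-audit diff described above can be sketched in a few lines of Python. The report shape used here, a top-level `findings` list whose entries carry `ac` and `status` keys, is an assumption made for illustration; check [`tools/evaluate-report.schema.json`](../tools/evaluate-report.schema.json) for the real field names. The helper names are hypothetical.

```python
from typing import Optional

# Assumed report shape (hypothetical, not taken from the schema):
# {"findings": [{"ac": "AC-1", "status": "conformant"}, ...]}


def status_by_ac(report: dict) -> dict:
    """Index a report's findings as {AC ID: status}."""
    return {finding["ac"]: finding["status"] for finding in report["findings"]}


def diff_reports(old: dict, new: dict) -> dict:
    """Return {AC ID: (old status, new status)} for every AC whose status changed.

    An AC present in only one report shows None on the missing side.
    """
    before, after = status_by_ac(old), status_by_ac(new)
    changes = {}
    for ac in sorted(before.keys() | after.keys()):
        old_status: Optional[str] = before.get(ac)
        new_status: Optional[str] = after.get(ac)
        if old_status != new_status:
            changes[ac] = (old_status, new_status)
    return changes
```

Load the two saved artifacts with `json.load` and pass them in; any entry that moves from `conformant` to `non-conformant` or `unable-to-determine` is the regression signal worth reading first.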

**5. Decide.** Goals 1–5 are the load-bearing user-facing concerns — private-data sovereignty (Goal 1), source-mirroring honesty (Goal 2), transport security (Goal 3), durability (Goal 4), local diagnosability (Goal 5). If any of those are non-conformant, the design is not safe to trust with your data. Non-conformances against architectural details that don't touch Goals 1–5 are still worth fixing but aren't immediate red flags.

**Optional: emphasize a specific concern.** E.g.: *"Focus on Goal 1 — make sure my Private DB rows can't leave my device."* This shapes the summary, not the underlying check.
@@ -196,7 +210,7 @@ Your job as a contributor: fill in the **AC attestation table** in your Architec

The Verification field is load-bearing. Three kinds are acceptable:

1. **Deterministic test** — a script or test file decides conformance mechanically. Example: a script that scans the codebase for any `fetch(...)` call to a non-localhost URL on the Private DB code path.
1. **Deterministic test** — a script or test file decides conformance mechanically. Example: [`tools/egress-lint.py`](../tools/egress-lint.py) scans the source for unsanctioned off-device egress vectors (`fetch`/`sendBeacon`/remote `src`/etc.) against an allow-list of the origins your flavor legitimately uses, and its `--json` output folds straight into the AC-1 finding of an evaluate report.
2. **LLM evaluation rubric** — a prompt or rubric describing what an LLM should look for. Useful for posture/intent ACs that mechanical tests can't reach. Example: *"Read every code path that reads from Private DB and decide whether any of them sends data off-device. Cite specific call sites."*
3. **Human-review note** — a short note explaining why no automated test is feasible, with the review record itself archived in the design's repo (e.g., `docs/conformance-review-2026-05.md`).

@@ -228,5 +242,8 @@ The skill description triggers on natural-language requests fitting any of these
- [`reference_designs/`](../reference_designs/) — accepted designs + templates
- [`pna-build-eval-contrib/SKILL.md`](../pna-build-eval-contrib/SKILL.md) — the agent-consumption view (what you're invoking through the agent above)
- [`CONTRIBUTING.md`](../CONTRIBUTING.md) — full contribution rules
- [`tools/`](../tools/) — validators
- [`tools/`](../tools/) — validators and the audit-report schema:
- [`tools/egress-lint.py`](../tools/egress-lint.py) — deterministic AC-1 check for off-device data leaks (Goals 1, 2, 6)
- [`tools/evaluate-report.schema.json`](../tools/evaluate-report.schema.json) — typed schema for the audit report (Goal 2)
- [`tools/lint-spec-ids.py`](../tools/lint-spec-ids.py) — AC ↔ contract traceability lint
- [`plans/reorganization-plan.md`](../plans/reorganization-plan.md) — the live plan tracking PNT's own evolution
2 changes: 2 additions & 0 deletions llms.txt
@@ -2,6 +2,8 @@

> Universal specification for personal network applications (PNAs). Defines vocabulary, goals, use cases, axes, composition (how PNAs get built), architectural commitments, and slot contracts. Targeted at AI agents and humans building PNAs together.

**New here? Start with the skill.** If you are an AI agent that landed here cold, read [`pna-build-eval-contrib/SKILL.md`](pna-build-eval-contrib/SKILL.md) first. It is the entry point: it routes you into the spec, contracts, and reference designs for whichever flow you need — **evaluate** ("is this app safe to install?"), **build**, or **contribute**.

## Skill (agent entry point)

- [pna-build-eval-contrib/SKILL.md](pna-build-eval-contrib/SKILL.md) — canonical PNT skill for AI agents. Three flows: build a conformant PNA from the spec, evaluate a candidate PNA's conformance against the spec, contribute a design back to PNT