From 393bd0871a345edb52960b5e0b4945181f74e4de Mon Sep 17 00:00:00 2001
From: Rich Bodo
Date: Sun, 31 May 2026 20:56:12 +1200
Subject: [PATCH] =?UTF-8?q?docs:=20toolkit=20DX=20+=20framing=20=E2=80=94?=
 =?UTF-8?q?=20per-repo=20install,=20validation-not-cert,=20exceptions=20pr?=
 =?UTF-8?q?ior=20art?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- users-guide.md: expand per-repo skill install (symlink for dev vs vendored-copy-with-pinned-commit for a contributing design) + the 'skills load at session start, restart to invoke' caveat.
- PNA_Spec.md § Building a PNA: promote 'validation, not certification' to a first-class framing statement (cross-links CONTRIBUTING + SKILL; ties in exceptions reported by AC/EX ID, not graded).
- prior_art.md: new § 9 — behavioral exceptions, consent propagation (TCF/UMA/Kantara/macaroons), graded assurance (EAL/ASVS/SLSA/EARL), and legible labeling (nutrition labels/model cards/datasheets); finding that the mechanics have precedent but the exception-as-class-concept is novel.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 docs/PriorArt.md           | 13 +++++++++++++
 docs/PriorArtReferences.md | 15 +++++++++++++++
 docs/users-guide.md        | 20 ++++++++++++++++++--
 spec/PNA_Spec.md           |  2 ++
 4 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/docs/PriorArt.md b/docs/PriorArt.md
index d430e60..5eb4dad 100644
--- a/docs/PriorArt.md
+++ b/docs/PriorArt.md
@@ -116,6 +116,19 @@ Background research on the application class PNT targets.
 
 **Proximity to PNT:** Background. Defines the domain but not the class-blueprint approach.
 
+### 9. Behavioral exceptions, consent propagation, and graded assurance
+
+Prior art for the [Exceptions](../spec/exceptions.md) concept — a PNA *deliberately and honestly* departing from a guarantee — and for its per-dimension strength profiles. Surveyed when designing `EX-CLOUD-LLM`.
+
+- **Graded assurance levels.** Common Criteria EAL, [OWASP ASVS](https://owasp.org/www-project-application-security-verification-standard/) levels, and [SLSA](https://slsa.dev/) levels (all surveyed in § 4–5) establish the *grade-the-strength* precedent. PNT's strength profile borrows the idea but **rejects a single collapsed level** in favor of per-dimension classes — one number would hide that "the boundary is enforced" and "the provider's data handling is unverifiable" are different *kinds* of assurance.
+- **Machine-readable conformance reporting.** [W3C EARL](https://www.w3.org/TR/EARL10-Schema/)'s pass / fail / cannot-tell vocabulary is the model for the evaluate flow reporting exception handling by ID (including "unable-to-determine").
+- **Consent propagation through delegation chains.** The IAB Europe [Transparency & Consent Framework](https://iabeurope.eu/transparency-consent-framework/) (a consent string propagated down an ad-tech chain), [User-Managed Access and Consent Receipts](https://kantarainitiative.org/) (Kantara), and **macaroons** (Birgisson et al., Google Research, 2014 — bearer credentials with *attenuating* caveats). Macaroons are the closest match to handler clause EX-H7: delegated authority only **narrows** as it passes through intermediaries, never amplifies — exactly the property "consent must reach the ultimate human; a proxy can't manufacture it" requires.
+- **Legible strength/limitation labeling.** Apple privacy "nutrition labels", [model cards](https://arxiv.org/abs/1810.03993) (Mitchell et al., 2019), and [datasheets for datasets](https://arxiv.org/abs/1803.09010) (Gebru et al., 2018) — precedent for surfacing strengths *and* limitations in a fixed, user-readable structure. The per-dimension strength profile is this idea applied to a behavioral exception.
+
+**Finding:** the *mechanics* — grade the strength, attest it, propagate attenuated consent, report cannot-tell — have solid precedent, and PNT borrows them rather than inventing. What appears genuinely new is **framing a deliberate behavioral deviation as a first-class, stable-ID'd, caught-and-handled "exception" for an application class**; no surveyed artifact offers a description language for that. The human-AI-team development context is why it surfaces now.
+
+**Proximity to PNT:** Informative (mechanism). The grading, reporting, and attenuation patterns are adopted; the exception-as-class-concept is the novel part.
+
 ## Proximity matrix
 
 | Artifact | Generative? | Class-scoped? | Multi-flavor? | Machine-checkable? | Overall proximity |
diff --git a/docs/PriorArtReferences.md b/docs/PriorArtReferences.md
index 6c58acd..c12d9c0 100644
--- a/docs/PriorArtReferences.md
+++ b/docs/PriorArtReferences.md
@@ -95,6 +95,21 @@ Companion reference list to [`PriorArt.md`](./PriorArt.md), the analytical surve
 - **[vCard (draft-ietf-vcarddav-vcardrev-02)](https://www.ietf.org/archive/id/draft-ietf-vcarddav-vcardrev-02.html)** — The contact-data format underlying CardDAV and most contact-exchange systems.
 - **[Defensics vCard Test Suite (Black Duck)](https://www.blackduck.com/fuzz-testing/defensics/protocols/vcard.html)** — Commercial fuzz/robustness conformance suite for vCard implementations.
 
+## Behavioral exceptions & consent propagation
+
+Sources for [`PriorArt.md` § 9](./PriorArt.md) — prior art for the Exceptions concept (a PNA deliberately, honestly departing from a guarantee) and its per-dimension strength profiles.
+
+- **[Macaroons: Cookies with Contextual Caveats for Decentralized Authorization](https://research.google/pubs/pub41892/)** — Birgisson, Politz, Erlingsson, Taly, Vrable, Lentczner (Google, 2014). Bearer credentials whose authority can only be *attenuated* (narrowed) by adding caveats as they pass through intermediaries, never amplified. The closest formal analog to handler clause EX-H7 — consent must reach the ultimate human; a proxy cannot manufacture it.
+- **[IAB Europe Transparency & Consent Framework (TCF)](https://iabeurope.eu/transparency-consent-framework/)** — A consent string propagated down a multi-party (ad-tech) delegation chain; prior art — cautionary as much as exemplary — for carrying a consent signal across actors.
+- **[Kantara Initiative — User-Managed Access (UMA) & Consent Receipts](https://kantarainitiative.org/)** — Standards for user-controlled delegated authorization and for issuing a machine-readable receipt that a specific consent was given; relevant to recording and propagating the human's consent under an exception.
+- **[Common Criteria — Evaluation Assurance Levels (ISO/IEC 15408)](https://www.commoncriteriaportal.org/)** — Graded assurance levels (EAL1–7); the canonical "grade the strength" precedent the strength profile borrows from while rejecting a single collapsed level.
+- **[OWASP ASVS](https://owasp.org/www-project-application-security-verification-standard/)** — Application Security Verification Standard; level-based (L1–L3) assurance — another graded-strength precedent.
+- **[SLSA](https://slsa.dev/)** — Supply-chain Levels for Software Artifacts; graded build-integrity assurance levels.
+- **[W3C EARL — Evaluation and Report Language](https://www.w3.org/TR/EARL10-Schema/)** — pass / fail / cannot-tell reporting vocabulary; the model for reporting exception handling by ID, including "unable-to-determine".
+- **[Model Cards for Model Reporting](https://arxiv.org/abs/1810.03993)** — Mitchell et al., 2019. Fixed-structure disclosure of a model's intended use, performance, and limitations; precedent for surfacing strengths *and* limitations legibly (the per-dimension strength profile applies this to a behavioral exception).
+- **[Datasheets for Datasets](https://arxiv.org/abs/1803.09010)** — Gebru et al., 2018. Standardized dataset documentation (provenance, composition, limitations); same legible-disclosure lineage as model cards.
+- **Apple App Store privacy "nutrition labels"** — Per-app privacy summaries in a fixed structure; a consumer-facing instance of legible strength/limitation labeling.
+
 ---
 
 *Reference list compiled from a conversation with Claude (Anthropic), May 2026. See [`PriorArt.md`](./PriorArt.md) for the analytical survey that uses these sources.*
diff --git a/docs/users-guide.md b/docs/users-guide.md
index 2039240..a5473e6 100644
--- a/docs/users-guide.md
+++ b/docs/users-guide.md
@@ -34,10 +34,26 @@ Symlinking keeps the skill in sync with your PNT clone — a `git pull` here upd
 **Alternatives:**
 
 - **Copy instead of symlink** — replace `ln -s` with `cp -r` to pin the skill to a specific version. You'll re-copy when you want updates.
-- **Project-level install** — replace `~/.claude/skills` with `/.claude/skills` to scope the skill to one project (useful if you want different skill versions in different projects).
 - **Run Claude Code from PNT itself** — no install required; the skill is discoverable from this directory. Adequate for one-off auditing.
 
-**Verify.** Start Claude Code in any directory and try one of the prompts below; if the skill triggers, you're set. You can also ask the agent something like *"what PNT skills do you have available?"* to check.
+### Per-repo install (for a design that contributes to PNT)
+
+When a PNA repo actively contributes back (it's a reference design, or you drive the contribute flow from it), scope the skill to that repo at `/.claude/skills/pna-build-eval-contrib` so collaborators on it pick the skill up. Two forms:
+
+- **Symlink** (dev convenience, no drift):
+  ```bash
+  ln -s /pna-build-eval-contrib /.claude/skills/pna-build-eval-contrib
+  ```
+  Stays in sync with your PNT clone, but the absolute path is machine-specific — **don't commit a machine-specific symlink.**
+- **Vendored copy** (portable, committable):
+  ```bash
+  cp -r /pna-build-eval-contrib /.claude/skills/pna-build-eval-contrib
+  ```
+  Commit it **with a provenance note pinning the PNT commit** it was copied from (e.g. an `INSTALLED_FROM.md` beside `SKILL.md`). Collaborator-friendly and reproducible, but it **drifts** from upstream — re-sync (re-copy + bump the pinned commit) before relying on it for a contribution.
+
+Pick the symlink for local iteration; pick the vendored copy when the design repo should carry the skill for everyone working on it.
+
+**Verify.** Start Claude Code and try one of the prompts below; if the skill triggers, you're set. You can also ask *"what PNT skills do you have available?"*. **Note:** skills load at **session start** — if you just installed the skill, restart Claude Code (or open a fresh session) before it becomes invocable; a mid-session install is not picked up.
 
 ---
 
diff --git a/spec/PNA_Spec.md b/spec/PNA_Spec.md
index 34c7c15..6b8a12f 100644
--- a/spec/PNA_Spec.md
+++ b/spec/PNA_Spec.md
@@ -24,6 +24,8 @@ Without PNAs, or something like them, we often go to a list of contacts in linke
 
 When an AI is asked to build a PNA, it is required to follow the contracts of the PNA on the user's behalf, and those contracts are written so the AI can pick them up and check its own work. The user's confidence comes from the spec being clear enough that both they and the AI can read it. As long as the contracts hold, an AI can rewrite a PNA from scratch while the user is still talking to it without changing the user's sovereignty, durability, or privacy posture. The goals below are user-facing needs; the [architectural commitments (ACs)](#vocab-universal-ac) after them are the choices that make those needs achievable. Check out the specs of any reference design to see the output of this process.
 
+> **Validation, not certification.** PNT validates behaviors against the Goals; it does not certify. There is no pass/fail badge and no certifying body (see [`CONTRIBUTING.md`](../CONTRIBUTING.md) § "Acceptance is not certification" and the skill's § Principles, "Conformance is checked, not awarded"). Conformance is *checked* — by the user, or by an AI running the evaluate flow — against this spec. Where a PNA deliberately departs from a guarantee it raises an [Exception](exceptions.md); the evaluate flow then detects each exception and verifies how it is handled, **reporting by `AC-*`/`EX-*` ID rather than awarding a grade.**
+
 ---