From bd63d1a23db4ad8df1fe1e91585b15fca256550d Mon Sep 17 00:00:00 2001 From: Rich Bodo Date: Fri, 29 May 2026 21:44:12 +1200 Subject: [PATCH 1/2] docs: split prior art into prose + references, drop Plans section - Rename docs/prior_art.md -> docs/PriorArt.md (the prose survey), add a pointer to the companion reference list. - Add docs/PriorArtReferences.md: the annotated source list, merged from the two dropped reference files (prior_art_2.md + pnt-prior-art-references.md), deduplicated and organized by category. Drops the "Karpathy joining Anthropic" section (out of scope for PNT prior art). - README: repoint the blueprint link, replace the Plans section with a Prior Art section linking both docs. - SKILL.md: repoint the prior-art resource and name the references file. Co-Authored-By: Claude Opus 4.8 (1M context) --- README.md | 6 +- docs/{prior_art.md => PriorArt.md} | 2 + docs/PriorArtReferences.md | 100 +++++++++++++++++++++++++++++ pna-build-eval-contrib/SKILL.md | 2 +- 4 files changed, 106 insertions(+), 4 deletions(-) rename docs/{prior_art.md => PriorArt.md} (99%) create mode 100644 docs/PriorArtReferences.md diff --git a/README.md b/README.md index 44e2bdf..3e3b3e5 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # Personal Network Toolkit -The Personal Network Toolkit (PNT) is a [generative application-class blueprint](docs/prior_art.md) for building and validating **[personal network applications (PNAs)](spec/PNA_Spec.md#goals)** — local-only apps for viewing contact data and working on relationship data over a firewalled private data layer. PNAs run on the user's device, never as SaaS, and bridge SaaS-held contact data into a private workspace. **[Why do this?](spec/PNA_Spec.md#preamble)** +The Personal Network Toolkit (PNT) is a [generative application-class blueprint](docs/PriorArt.md) for building and validating **[personal network applications (PNAs)](spec/PNA_Spec.md#goals)** — local-only apps for viewing contact data and working on relationship data over a firewalled private data layer. PNAs run on the user's device, never as SaaS, and bridge SaaS-held contact data into a private workspace. **[Why do this?](spec/PNA_Spec.md#preamble)** When building a PNA, specs are foundational because users will increasingly compose software by prompting AI agents, and success is measured by adherence to them. @@ -77,9 +77,9 @@ A reference design is a working, deployed PNA that demonstrates one valid combin When a second reference design (e.g. a Personal Relationship Manager) lands, it'll be linked from here. -## Plans +## Prior Art -`plans/` holds long-form proposals for PNT itself. Today: [`reorganization-plan.md`](plans/reorganization-plan.md) — the unified reorg plan covering repo restructure, spec formalization (RFC-style normative language, stable IDs, bidirectional traceability), per-design Architecture documents with AC attestation, Software Heritage archival, and a single skill that lets AI agents build, evaluate, and contribute PNAs. +PNT is a *generative application-class blueprint*: a machine-readable spec, written for AI agents, that defines what makes an instance of a whole *class* of apps correct and lets many implementations attest conformance. [`docs/PriorArt.md`](docs/PriorArt.md) surveys the landscape — spec-driven-development tooling, agent-interface standards, conformance test suites, Common Criteria Protection Profiles, and local-first infrastructure — and finds no published artifact occupies the same quadrant: *generative + class-scoped + multi-flavor + machine-checkable*. The full annotated source list is in [`docs/PriorArtReferences.md`](docs/PriorArtReferences.md). ## License diff --git a/docs/prior_art.md b/docs/PriorArt.md similarity index 99% rename from docs/prior_art.md rename to docs/PriorArt.md index 7e25dcb..d430e60 100644 --- a/docs/prior_art.md +++ b/docs/PriorArt.md @@ -6,6 +6,8 @@ The survey was conducted as a conversation with Claude (Anthropic) in May 2026, The finding is that **no published artifact occupies the same position as PNT**. The closest analogs are either *evaluative* rather than *generative*, *per-project* rather than *per-application-class*, or *stack-coupled* rather than *architecture-coupled*. This justifies continued investment in PNT as a distinct category of artifact. +> **References:** the full annotated source list for this survey lives in [`PriorArtReferences.md`](./PriorArtReferences.md). + ## Status First version - May 23 2026 - [Rich and Claude learn from prior art](https://claude.ai/share/df001d9f-e700-4b8e-be4e-b28470f7f5cf) - define the class of work. diff --git a/docs/PriorArtReferences.md b/docs/PriorArtReferences.md new file mode 100644 index 0000000..6c58acd --- /dev/null +++ b/docs/PriorArtReferences.md @@ -0,0 +1,100 @@ +# Prior Art References + +Companion reference list to [`PriorArt.md`](./PriorArt.md), the analytical survey. This file catalogs the sources uncovered during the prior-art conversation (Claude / Anthropic, May 2026), organized by category, each with a one-line note on relevance to PNT. + +## Primary sources + +- **[Contract-Driven AI Development (C-DAD)](https://medium.com/software-architecture-in-the-age-of-ai)** — Enrico Piovesan (Autodesk), Oct 2025 white paper. The maximalist, enterprise-scale statement of PNT's core bet: typed, versioned contracts as the load-bearing artifact for human–AI build teams. +- **[Why Spec-Driven Development Has Reached Its Limit](https://medium.com/software-architecture-in-the-age-of-ai/why-spec-driven-development-has-reached-its-limit-6e9bfed9ee13)** — Enrico Piovesan, "Software Architecture in the Age of AI" (Medium), Nov 2025. Argues specs must become living *runtime* contracts; the position PNT diverges from (PNT keeps contracts design-time and file-based). + +## Spec-Driven Development tooling + +- **[GitHub Spec Kit](https://github.com/github/spec-kit)** — The most prominent SDD tool: a four-phase workflow (specify, plan, task, implement) plus a `constitution.md` of immutable principles governing generation. ([design doc](https://github.com/github/spec-kit/blob/main/spec-driven.md).) +- **[OpenSpec](https://github.com/Fission-AI/OpenSpec)** — Fission-AI (MIT). Lightweight, vendor-neutral SDD layer closest in spirit to PNT; propose→apply→archive delta loop with ADRs kept alongside specs. +- **[AWS Kiro](https://kiro.dev)** — Spec-driven agentic IDE (requirements→design→tasks→code); the enterprise/IDE incarnation of the workflow PNT collapses into axis selection. +- **[BMAD-METHOD](https://github.com/bmad-code-org/BMAD-METHOD)** — Multi-agent "agile AI" orchestration (PM/architect/dev/QA roles); the heaviest end of the spectrum, a contrast to PNT's solo/community-builder target. +- **[Tessl (Framework + Spec Registry)](https://tessl.io)** — Guy Podjarny / Tessl. Spec-as-source-of-truth platform with a registry of pre-built library specs; the most C-DAD-aligned commercial effort and a model for "publish specs others' agents consume." +- **[Google Antigravity](https://antigravity.google)** — Google DeepMind, Nov 2025. Agent-first IDE (VS Code fork) where agents plan/implement/verify from objectives. +- **[Spec-Driven Development: From Code to Contract in the Age of AI Coding Assistants](https://arxiv.org/abs/2602.00180)** — Deepak Babu Piskala, arXiv 2602.00180, Jan 2026. Canonical academic framing of SDD as executable contracts that constrain what AI agents generate. +- **[BCMS — Spec-Driven Development definitive guide (2026)](https://thebcms.com/blog/spec-driven-development)** — Industry survey of the 2026 SDD landscape (Spec Kit, Kiro, BMAD-METHOD, OpenSpec, Tessl, Antigravity). +- **[Augment Code — What is Spec-Driven Development](https://www.augmentcode.com/guides/what-is-spec-driven-development)** — Industry-oriented framing of SDD as "executable contracts that constrain what AI agents generate." +- **[EARS (Easy Approach to Requirements Syntax)](https://en.wikipedia.org/wiki/Easy_Approach_to_Requirements_Syntax)** — Alistair Mavin et al., Rolls-Royce, RE'09 (2009). A constrained natural-language requirement grammar that parses cleanly for agents; a candidate format for PNT's human-readable goals layer. (Primary source: alistairmavin.com/ears.) + +## Agent-interface standards (what an agent reads) + +- **[Agent Skills / SKILL.md](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview)** — Anthropic (open standard; community hub at [agentskills.io](https://agentskills.io)). The skill format PNT already ships in; progressive disclosure (name + description first, body on demand) is how PNT's spec and slot map are meant to be consumed. +- **[AGENTS.md](https://agents.md)** — Open convention; the project-level "agent README" emitted by Spec-Kit/OpenSpec. Worth having at the root of each PNT reference design. +- **[llms.txt](https://llmstxt.org)** — Jeremy Howard / Answer.AI. Discovery-file convention for pointing agents at canonical docs; PNT ships one and routes it to the SKILL.md. + +## AI app-builder system prompts & architectures + +- **[x1xhlol/system-prompts-and-models-of-ai-tools](https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools)** — Canonical leaked-prompts collection with extracted system prompts for v0, Cursor, Lovable, Replit Agent, Bolt, Devin, Windsurf, Claude Code, and others. +- **[elder-plinius/CL4R1T4S](https://github.com/elder-plinius/CL4R1T4S)** — Mirror collection of AI system prompts framed as a transparency project. +- **[Lovable Agent Prompt](https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools/blob/main/Lovable/Agent%20Prompt.txt)** — Lovable's production system prompt, notable for rigorous design-system enforcement and a "PERFECT ARCHITECTURE" refactoring check on a fixed React + Vite + Tailwind + shadcn + Supabase stack. +- **[Replit Agent breakoutagents case study (LangChain)](https://www.langchain.com/breakoutagents/replit)** — Architectural breakdown of Replit Agent's multi-agent ReAct-style design with initial-codegen and follow-up agents. +- **[Pliny the Liberator — Replit Agent leak (X)](https://x.com/elder_plinius/status/1914774065937891398)** — April 2025 disclosure revealing Replit Agent uses two cooperating agents rather than one monolithic prompt. +- **[Replit Enterprise Custom Templates](https://docs.replit.com/teams/custom-templates)** — The strongest published official analog to a generative blueprint: customers create `custom_instruction/instructions.md` files injected into Agent's system prompt. The mechanism would accept a PNT-shaped artifact without modification. +- **[Lovable Prompting Handbook (Jan 2025)](https://lovable.dev/blog/2025-01-16-lovable-prompting-handbook)** — Lovable's official guidance for project-level constraints, design tokens, and backend structure documents. +- **[ML6 — The Anatomy of a Lovable App](https://www.ml6.eu/en/blog/the-anatomy-of-a-lovable-app-and-its-boundaries-in-enterprise-software)** — Third-party but accurate description of what a stack-coupled application archetype looks like in production. + +## Conformance test suites & standards + +- **[Solid Conformance Test Harness](https://github.com/solid-contrib/conformance-test-harness)** — The strongest direct analog to PNT by domain (decentralized personal data) and intent (ecosystem interop), though evaluative rather than generative. +- **[Inrupt — Interoperability tests for Solid developers](https://www.inrupt.com/blog/interoperability-tests-for-solid-developers)** — Plain-English overview of how the Solid community uses the CTH. +- **[xAPI LRS Conformance Test Suite](https://xapi.com/conformance-test/)** — 1,300-criterion conformance suite for Learning Record Stores with a public registry of certified implementations. +- **[HL7 FHIR CapabilityStatement](https://www.hl7.org/fhir/capabilitystatement.html)** — Healthcare-app conformance resource where each implementation publishes a structured declaration of which profiles and operations it supports; the closest "axis-attestation" model in a mature standard. +- **[W3C EARL (Evaluation and Report Language) Schema](https://www.w3.org/TR/EARL10-Schema/)** — Standard vocabulary for machine-readable conformance pass/fail reports, used by Solid CTH and other W3C-adjacent conformance work. +- **[EU Interoperability Test Bed (ITB)](https://www.itb.ec.europa.eu/docs/guides/latest/definingYourTestConfiguration/index.html)** — Generic harness with the test-subject + actor + test-suite model that Solid's CTH borrowed from. +- **[NIST — What is this thing called Conformance?](https://www.nist.gov/itl/ssd/information-systems-group/what-thing-called-conformance)** — Foundational framing of conformance testing, validation, and certification. + +## Contract testing & interface specification + +- **[Pact](https://pact.io)** — Pact Foundation / SmartBear. Consumer-driven contract testing; the decades-proven "verify an implementation honors its contract" pattern PNT's evaluate flow re-creates for ACs. +- **[OpenAPI Specification](https://www.openapis.org)** — Standard for typed HTTP interfaces (PNT uses it for distribution auth); mature conformance-validation tooling can wire straight into a reference design's CI. +- **[AsyncAPI](https://www.asyncapi.com)** — The event/message-driven analogue to OpenAPI; relevant if a PNA's comms transport becomes event-based. +- **[JSON Schema](https://json-schema.org)** — Underpins PNT's worker-handshake and MCP tool-surface contracts; the schemas double as machine-checkable conformance fixtures. + +## Architecture standards & verification frameworks + +- **[OWASP Application Security Verification Standard (ASVS)](https://owasp.org/www-project-application-security-verification-standard/)** — Level-based verification framework where applications attest to a specific conformance level rather than passing a binary check. +- **[SLSA — Supply chain Levels for Software Artifacts](https://slsa.dev/)** — Level-based attestation framework for build provenance; the same per-implementation attestation pattern at a different problem. + +## Coding-agent benchmarks & evals + +- **[Anthropic — Demystifying Evals for AI Agents](https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents)** — Anthropic's published methodology for designing capability evals that graduate into regression suites for agent reliability. +- **[SWE-bench Verified](https://www.swebench.com/verified.html)** — Human-validated subset of 500 GitHub-issue tasks used to evaluate whether coding agents can patch real repos. +- **[Terminal-Bench](https://www.tbench.ai/)** — Benchmark suite measuring tool-using agents on realistic terminal tasks. +- **[SWE-bench Pro / SWE-Bench-CL](https://arxiv.org/pdf/2507.00014)** — Harder and continual-learning evaluation variants beyond SWE-bench Verified. +- **[Anthropic's $20K C compiler experiment (WebProNews)](https://www.webpronews.com/anthropics-20000-experiment-how-16-parallel-ai-agents-built-a-100000-line-c-compiler-from-scratch-in-rust/)** — Coverage of Anthropic's 16-parallel-agent C compiler build, which reached for the *existing* C conformance test suite rather than authoring its own. + +## MCP conformance + +- **[Issue #1990 — Add an official MCP conformance test suite](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1990)** — Open request for an official MCP conformance test suite, still pending as of this survey. +- **[MCP roadmap — Conformance Test Suites](https://modelcontextprotocol.io/development/roadmap)** — Roadmap item: "automated verification that clients, servers, and SDKs correctly implement the specification." +- **[DinCoder MCP — Conformance documentation](https://glama.ai/mcp/servers/@flight505/MCP_DinCoder/blob/6c67715293c9863a9246c0e3a094e2d184e74572/docs/conformance.md)** — Example of an individual MCP implementation publishing its own per-section conformance checklist against the spec. + +## Local-first software & sync infrastructure + +- **[Local-first software — the seven ideals](https://www.inkandswitch.com/local-first/)** — Kleppmann, Wiggins, van Hardenberg, McGranaghan (Ink & Switch, 2019). The seminal essay defining the ideals PNT's ACs operationalize for the contact-data domain. ([Wikipedia overview](https://en.wikipedia.org/wiki/Local-first_software).) +- **[Home-Cooked Software and Barefoot Developers](https://maggieappleton.com/home-cooked-software)** — Maggie Appleton, Local-First Conf 2024. Names PNT's exact audience (community-embedded "barefoot developers" building "home-cooked" apps with AI); references Robin Sloan's "An app can be a home-cooked meal." +- **[Local-first, forever](https://tonsky.me/blog/crdt-filesync/)** — Nikita Prokopov (Tonsky). Proposes sync over commodity file-syncing (append-only per-client op logs + CRDT merge underneath), a candidate PNA axis pick for multi-device sharing without a SaaS root. +- **[Automerge](https://automerge.org)** — Ink & Switch (Automerge 3.0, May 2025). Mature CRDT library (Rust core, ~10× lower memory) underpinning the file-sync pattern above; relevant if a PNA needs conflict-free multi-writer state. +- **[Yjs](https://yjs.dev)** — Kevin Jahns. Dominant CRDT for collaborative text editing; reference point if a PNA ever needs shared rich-text notes. +- **[Loro](https://loro.dev)** — Loro (1.0, 2024). Newer CRDT library with movable-tree and rich-text types targeting gaps Automerge/Yjs leave. +- **[ElectricSQL](https://electric-sql.com)** — Server-streams-Postgres-to-client sync engine ("Electric Next"); a sync-boundary alternative to PNT's static-mirror ingestion. +- **[PowerSync](https://www.powersync.com)** — Cross-platform SQLite sync with native SDKs; relevant comparison for the storage/sync axes. +- **[LiveStore](https://livestore.dev)** — Johannes Schickling. Event-sourced reactive-SQLite local-first data layer (Riffle lineage); an architectural cousin to PNT's OPFS-SQLite reference design. +- **[PGlite](https://pglite.dev)** — Electric. Full Postgres compiled to WASM in the browser; a substrate option if a PNA outgrows SQLite. +- **[Beehive](https://www.inkandswitch.com)** — Ink & Switch, 2024. Local-first, capability-based decentralized access control; relevant if PNT's magic-link distribution ever needs principled sharing/revocation. (Link is the lab root; confirm the current project page.) +- **[awesome-local-first](https://github.com/alexanderop/awesome-local-first)** — Curated index of local-first essays, libraries, and sync engines; a maintained map of this whole space. +- **[Behavioural Types for Local-First Software](https://arxiv.org/pdf/2305.04848)** — Academic work on formal verification of local-first systems; the most rigorous treatment of correctness for this app class. + +## Contact-data standards + +- **[CardDAV (RFC 6352)](https://datatracker.ietf.org/doc/rfc6352/)** — IETF protocol standard for contact-data servers; directly domain-relevant to PNT's application class. +- **[vCard (draft-ietf-vcarddav-vcardrev-02)](https://www.ietf.org/archive/id/draft-ietf-vcarddav-vcardrev-02.html)** — The contact-data format underlying CardDAV and most contact-exchange systems. +- **[Defensics vCard Test Suite (Black Duck)](https://www.blackduck.com/fuzz-testing/defensics/protocols/vcard.html)** — Commercial fuzz/robustness conformance suite for vCard implementations. + +--- + +*Reference list compiled from a conversation with Claude (Anthropic), May 2026. See [`PriorArt.md`](./PriorArt.md) for the analytical survey that uses these sources.* diff --git a/pna-build-eval-contrib/SKILL.md b/pna-build-eval-contrib/SKILL.md index 4dd7170..aa0f29d 100644 --- a/pna-build-eval-contrib/SKILL.md +++ b/pna-build-eval-contrib/SKILL.md @@ -112,4 +112,4 @@ A builder using Claude Code can drive both preflight and PR authoring end-to-end - `reference_designs/templates/` — the per-design and Architecture templates - `tools/lint-spec-ids.py` — checks AC ↔ contract traceability invariants - `CONTRIBUTING.md` — full contribution rules -- `docs/prior_art.md` — survey of related work +- `docs/PriorArt.md` — survey of related work (annotated source list in `docs/PriorArtReferences.md`) From 50ba05b29bebee5f80c4fc5999213589648c59c2 Mon Sep 17 00:00:00 2001 From: Rich Bodo Date: Fri, 29 May 2026 21:56:13 +1200 Subject: [PATCH 2/2] plans: tick Phase 4.5 checkboxes; point prior-art refs at the split files MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Phase 4.5 items (User's Guide, contribute-flow preflight, README link) are done — tick their checkboxes. - Update the target-layout diagram and the Phase 1 relocation note to point at docs/PriorArt.md + docs/PriorArtReferences.md so the survey is findable from the plan. Co-Authored-By: Claude Opus 4.8 (1M context) --- plans/reorganization-plan.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/plans/reorganization-plan.md b/plans/reorganization-plan.md index 93ebd68..46ba99b 100644 --- a/plans/reorganization-plan.md +++ b/plans/reorganization-plan.md @@ -145,7 +145,8 @@ personal_network_toolkit/ │ ├── validate-architecture.py # Lint Architecture documents against the template │ └── lint-spec-ids.py # Check ID rigor across spec and contracts ├── docs/ -│ └── prior_art.md # Was: research/prior_art_survey.md +│ ├── PriorArt.md # Prior-art survey (was: research/prior_art_survey.md) +│ └── PriorArtReferences.md # Annotated source list for the survey ├── plans/ # This file lives here └── archive/ # Optional: forked snapshots of high-signal designs └── / # Git submodule pointing at pnt-archive/ @@ -347,7 +348,7 @@ Phases are sized for Claude Code sessions. Each ends in a committable, working s - [x] Create new directory layout: `spec/`, `contracts/`, `reference_designs/`, `reference_designs/templates/`, `pna-build-eval-contrib/`, `tools/`, `docs/`, `archive/`. - [x] Move existing spec documents into `spec/`. - [x] Move existing typed contracts from `spec/contracts/` into top-level `contracts/`. -- [x] Relocate `research/prior_art_survey.md` to `docs/prior_art.md`. +- [x] Relocate `research/prior_art_survey.md` to `docs/prior_art.md` (later split into `docs/PriorArt.md` (prose survey) + `docs/PriorArtReferences.md` (annotated source list)). - [x] Create `reference_designs/fellows_local_db/` with a placeholder README pointing to the external repo (SWHID added in Phase 5). - [x] Update root `README.md` to reflect the new layout (the `## Status` section is the home for success criteria). @@ -375,9 +376,9 @@ Phases are sized for Claude Code sessions. Each ends in a committable, working s ### Phase 4.5 — User's Guide and contribute-flow preflight -- [ ] Write `docs/users-guide.md` — concise step-by-step instructions organized around the six success criteria in `README.md § Status`. The reference-design submission workflow gets the most space (since `fellows_local_db` is the first user of it in Phase 5). -- [ ] Enhance `pna-build-eval-contrib/SKILL.md`'s contribute flow with an explicit **preflight validation** step (does the design have all required files? is the Architecture document complete and accurate against the code?), the **"what's interesting architecturally?"** prompt — with three valid patterns documented (new AC, existing pattern on a new platform, ecosystem-value-add) — and explicit **interactive** guidance for creating the Architecture document when it doesn't yet exist. -- [ ] Link `docs/users-guide.md` from `README.md`. +- [x] Write `docs/users-guide.md` — concise step-by-step instructions organized around the six success criteria in `README.md § Status`. The reference-design submission workflow gets the most space (since `fellows_local_db` is the first user of it in Phase 5). +- [x] Enhance `pna-build-eval-contrib/SKILL.md`'s contribute flow with an explicit **preflight validation** step (does the design have all required files? is the Architecture document complete and accurate against the code?), the **"what's interesting architecturally?"** prompt — with three valid patterns documented (new AC, existing pattern on a new platform, ecosystem-value-add) — and explicit **interactive** guidance for creating the Architecture document when it doesn't yet exist. +- [x] Link `docs/users-guide.md` from `README.md`. Phase 4.5 lands in the same PR as Phases 1–4 so the User's Guide is exercisable against `fellows_local_db` immediately in Phase 5.