diff --git a/README.md b/README.md index 786fa4e..0883130 100644 --- a/README.md +++ b/README.md @@ -2,91 +2,86 @@ Open CoT banner -### Cognitive Control Plane for Governed Agent Execution +### Schemas for Cognitive Artifacts, Capabilities, and Reconciliation -**Open CoT** — an open standard and reference implementation for model-agnostic governed agent execution. +**Open CoT** — an open standard for portable cognitive artifacts, capability snapshots, execution intent, observations, policy boundaries, receipts, and reconciliation results. Open CoT License: MIT Contributions welcome -JSON Schema +JSON Schema --- -## Why this exists +## Why This Exists -Agents need to reach tools, data, and services, but every stack reinvents authorization, safety boundaries, and audit. Models are often treated as if their natural-language output were both intent and permission. There is no **portable contract** between what a model *proposes* and what a deployment *allows*. +Modern AI systems need a stable contract between fuzzy cognition and concrete capability. The model-like component can interpret, summarize, propose, and produce typed artifacts, but it must not own runtime authority or side effects. -> Open-CoT is a model-agnostic cognitive control plane that standardizes the trusted contract between model output, harness/runtime enforcement, policy, delegation, tool execution, provenance, and audit. +Open CoT defines the portable interface layer for that boundary: -Open-CoT separates those layers: typed schemas for proposals and artifacts, a **normative governed execution model**, and a reference harness that enforces the contract end to end. The same envelopes and state machine can sit behind different models and runtimes because the control plane is explicit, not inferred from free-form text. 
+- what cognitive artifacts look like, +- how available capabilities are represented, +- how execution intent binds to an immutable capability snapshot, +- how policy, delegation, budget, and receipts are recorded, +- how observations and final reconciliation results are serialized. -That matters wherever you need **comparable audit**, **shared tooling across vendors**, or **defensible denial** when a model asks for something unsafe. The goal is not prettier logs; it is a **trusted contract** between proposal and execution. +Earlier runtime-governance language was useful while the project was searching for the right security shape. The standard is now moving toward a sharper inversion: **cognition emits structured artifacts; runtimes reconcile those artifacts against capability, policy, budget, and evidence**. -## The core insight +## The Core Insight -If reasoning, tool intent, provenance, budgets, state transitions, and delegation are carried in **stable typed schemas**, then a harness or runtime can be **portable** across models: it does not have to reverse-engineer each vendor’s behavior. +The LLM is not the runtime, orchestrator, or authority boundary. It is a non-deterministic cognitive function. A runtime can use its output only after validation and reconciliation. -**Models propose.** Schemas **express**. The harness **validates**. Policy **decides**. The auth broker **narrows** scope and issues receipts. Tools **execute** only under granted authority. Receipts and audit artifacts **prove** what ran and who allowed it. +**Cognition emits.** Schemas express. Capability snapshots bound what may be requested. Policy gates authorize or refuse. Runtimes execute through explicit endpoints. Observations, receipts, and reconciliation results prove what happened. -## What this repo contains +This makes Open CoT useful beyond any one framework. An implementation can use Restate, Temporal, a queue worker, a local process, MCP, HTTP, or a custom executor. 
The portable layer is the schema contract, not the implementation stack. + +## What This Repo Contains | Area | Role | |------|------| -| [`rfcs/`](./rfcs/) | **51 RFCs** — normative definitions for reasoning traces, tool invocation, the governed FSM, sandboxing, budgets, permissions, policy, delegation, provenance, identity, org governance, receipts, audit, and capability manifests | -| [`schemas/`](./schemas/) | Versioned JSON Schemas per RFC (`registry.json`, `rfc-*-*.json`) | -| [`harness/`](./harness/) | **Reference harness** (TypeScript) — governed FSM, validation, tools, budgets, trace emission | +| [`rfcs/`](./rfcs/) | **53 RFCs** covering reasoning traces, tool invocation, governed execution, policy, delegation, receipts, capability manifests, cognitive artifacts, and reconciliation results | +| [`schemas/`](./schemas/) | Versioned JSON Schemas per RFC, including `registry.json` | +| [`harness/`](./harness/) | Reference TypeScript harness that exercises earlier governed execution RFCs | | [`examples/`](./examples/) | Validated instance fixtures keyed by registry shortname | | [`reference/python/`](./reference/python/) | Reference Python tooling | -| [`tools/`](./tools/) | Schema and fixture validation (`validate.py`, sync helpers) | -| [`standards/`](./standards/) | Human-readable patterns, metrics, narrative docs | +| [`tools/`](./tools/) | Schema and fixture validation, registry sync, and RFC helpers | +| [`standards/`](./standards/) | Human-readable reasoning patterns and evaluation metrics | | [`datasets/`](./datasets/) | Conventions and converters for training-ready data | | [`benchmarks/`](./benchmarks/) | Tasks, scoring, leaderboards | | [`conformance/`](./conformance/) | Conformance and interoperability material | -| [`tests/`](./tests/) | Shared Python tests for validation and tooling | -| [`docs/`](./docs/) | Contributing, architecture, philosophy, ELI5 guide, experiment cards | - -For a concise layout of control plane vs data plane, see 
[`docs/architecture.md`](./docs/architecture.md). - -**If you are evaluating quickly:** (1) read [`docs/eli5_guide.md`](./docs/eli5_guide.md), (2) run the harness tests above, (3) run `python tools/validate.py`, (4) skim RFC 0007 plus RFCs 0041, 0042, 0047, 0048, and 0051 for the governance and temporal spine. - -## The governed execution model - -RFC 0007 defines a **fourteen-state** finite state machine. A compliant run starts in **`receive`** and ends in **`audit_seal`**. Along the main path: +| [`docs/`](./docs/) | Architecture, philosophy, contributing, experiments, and launch notes | -`receive` → `frame` → `plan` → `request_authority` → `validate_authority` → `delegate_narrow` → `execute_tool` → `observe_result` → `critique_verify` → `finalize` → `audit_seal` +For the current architecture framing, see [`docs/architecture.md`](./docs/architecture.md). -Authority and failure routing adds **`deny`**, **`escalate`**, and **`fail_safe`**, each terminating into a sealed audit according to policy. +## Forward Spine -**The model cannot self-authorize.** It may only request capabilities; the harness, policy engine, and broker decide, narrow, and record grants. **Tool side effects occur only in `execute_tool`**, with explicit permission or a documented standing authorization cited on the execution receipt (RFC 0048). +The newer reconciliation-oriented spine is: -RFC 0007 also allows a **pre-authorized shortcut** from `plan` to `execute_tool` when a deployment holds a **standing grant** (for example, sandbox allowlists): the shortcut must still be cited on the receipt so auditors can see why delegation states were skipped. +- **RFC 0052** — cognitive artifacts, execution intent, observations, and immutable capability snapshots. +- **RFC 0053** — reconciliation result envelope and structured error taxonomy. +- **RFC 0049** — capability manifests, now a predecessor to more precise capability snapshots. +- **RFC 0041** — policy documents and policy gate semantics. 
+- **RFC 0047** — delegation requests, decisions, and authority receipts. +- **RFC 0048** — execution receipts and audit envelopes. +- **RFC 0051** — temporal semantics for validity, replay, and ordering. -## Design principles +Older RFCs still matter. RFC 0001, 0003, and 0007 define foundational reasoning, tool invocation, and governed execution concepts. The new RFCs clarify how those ideas become a portable schema layer for reconciliation runtimes. -- **Typed schemas over ambiguous prose** — contracts are JSON Schema, not instructions embedded in model copy. -- **The model is an untrusted proposer** — output is validated input, not implicit command. -- **Portable harness semantics** — the same FSM and envelopes apply across models and adapters. -- **Explicit provenance and evidence** — receipts, delegation records, and audit envelopes close the loop. -- **Permission-aware tool execution** — grants are scoped, consumable, and auditable. -- **Delegation as a bounded request** — narrow, time-bounded authority; no self-issued power of attorney. -- **Policy-enforced narrowing and auditability** — policy consults at defined boundaries; runs seal into tamper-evident audit material. +## Design Principles -Values behind these bullets are expanded in [`docs/philosophy.md`](./docs/philosophy.md). Token efficiency and context management strategies are covered in [`docs/token-efficiency.md`](./docs/token-efficiency.md). - -## Quick start - -**Reference harness** (mock backend, no API keys required): - -```bash -cd harness && npm install && npm test -``` +- **Typed artifacts over prompt contracts** — model output is structured input, not authority. +- **Capability snapshots over ambient tools** — cognition sees an explicit inventory and cannot invent endpoints. +- **Execution intent over direct execution** — proposed work is reconciled before side effects. +- **Policy gates over schema-only safety** — valid shape is not permission. 
+- **Observations and receipts over logs alone** — every side effect should leave replayable evidence. +- **Implementation neutrality** — Open CoT should not require Restate, MCP, Vercel AI SDK, Open Lagrange, or any specific runtime. +- **Spec gaps become RFC work** — if an implementation needs a general interface, it belongs here. -Optional demos: `npx tsx examples/chat-demo.ts` and `npx tsx examples/coder-demo.ts` (see [`harness/README.md`](./harness/README.md)). +## Quick Start -**Python validation** (schemas + examples): +Validate schemas and examples: ```bash python3 -m venv .venv && source .venv/bin/activate @@ -94,61 +89,30 @@ pip install -r requirements-tools.txt python tools/validate.py ``` -**New to the project?** Start with [`docs/eli5_guide.md`](./docs/eli5_guide.md). - -**Optional:** CPU-friendly smoke path `bash scripts/quickstart_experiment.sh`; after installing `requirements-tools.txt`, run `pytest -q` for repo tests. For local OSS train/eval, see [`experiments/local_oss_runbook.md`](./experiments/local_oss_runbook.md). 
- -**Live LLM** (OpenAI-compatible endpoint such as Ollama): +Run the reference harness: ```bash -cd harness && OPENAI_BASE_URL=http://localhost:11434/v1 npx tsx examples/chat-demo.ts "Explain recursion" +cd harness && npm install && npm test ``` -## What the harness covers - -| Capability | RFC | Harness touchpoints | -|------------|-----|---------------------| -| Governed execution FSM | RFC 0007 | `src/schemas/agent-loop.ts`, `src/core/transitions.ts` | -| Permission system | RFC 0042 | `src/schemas/permission.ts` (grants; model cannot mint) | -| Policy enforcement (consult hooks + loop guardrails) | RFC 0041 | `src/core/loop-policy.ts`, schema alignment with RFC 0041 / 0043 | -| Delegation flow | RFC 0047 | `src/schemas/delegation.ts`, `AgentState` delegation fields | -| Execution receipts | RFC 0048 | `src/schemas/receipt.ts` | -| Audit envelopes | RFC 0043 | `src/schemas/audit-envelope.ts` | -| Budget enforcement | RFC 0038 | `src/core/budget-tracker.ts` | -| Tool contracts | RFC 0003 (+ 0018 errors) | `src/core/tool-registry.ts`, `src/tools/` | -| Safety sandboxing | RFC 0017 | `src/schemas/sandbox.ts`, enforcement in tool registry | -| Capability manifests | RFC 0049 | `src/governance/manifest-builder.ts`, injected at `frame` / `critique_verify` | -| Observability telemetry | RFC 0031 | `src/schemas/telemetry.ts`, metrics on `AgentState` | +## Open Lagrange Relationship -The harness and schemas **mutually stress-test** each other: invalid transitions or shapes fail fast in CI; gaps in the spec show up as harness friction. +Open Lagrange is the opinionated TypeScript proving ground for this standard. It uses Restate for durable reconciliation, Zod for runtime boundaries, Vercel AI SDK for structured cognitive artifact generation, and MCP-shaped endpoints for side effects. 
-Reasoning **patterns** (plan–verify, debate, and similar) remain documented for datasets and evaluation in [`standards/reasoning-patterns.md`](./standards/reasoning-patterns.md); they sit alongside the control plane, not instead of it. +That implementation pressure-tests Open CoT. If Open Lagrange needs a portable structure, this repo should receive the RFC/schema update instead of letting a private dialect grow elsewhere. -## Current status +## Current Status -- **51 RFCs** and a versioned JSON Schema registry with CI validation. -- Reference harness implements the governed FSM, delegation and receipt types, budgets, sandboxed tools, and trace validation (see table above). -- Cross-language checks: TypeScript-emitted traces validate under Python tooling. -- Tiered examples, synthetic seed data, and experiment runbooks under [`experiments/`](./experiments/). -- Breaking temporal normalization landed in RFC 0051 (`observed_at` / `decided_at` / `effective_at` / `completed_at`); migration guide: [`docs/temporal-migration-rfc0051.md`](./docs/temporal-migration-rfc0051.md). - -## Experiment cards - -For focused scenarios (hidden reasoning, runaway loops, token budgets, policy hooks), see [`docs/experiments/`](./docs/experiments/README.md). Launch packaging notes live in [`docs/public-launch.md`](./docs/public-launch.md). +- **53 RFCs** and a versioned JSON Schema registry. +- New draft schemas for cognitive artifacts and reconciliation results. +- Reference harness coverage for governed execution, policy, delegation, receipts, budgets, and capability manifests. +- Cross-language validation tooling for schemas and examples. +- Experiment cards and local runbooks under [`docs/experiments/`](./docs/experiments/). ## Contributing -See [`docs/contributing.md`](./docs/contributing.md). Improvements to schema clarity, harness coverage, examples, and benchmarks are especially welcome. 
- -Normative changes belong in RFCs first; reference code should follow the spec, not the other way around. Small harness fixes that clarify an already-intended RFC are welcome when they include a pointer to the RFC section they implement. - -### RFC feedback process - -- Use each RFC’s linked GitHub **Discussion** for normative debate. -- Use **Issues** for actionable implementation work. -- RFC changes in PRs should link the RFC file and its Discussion thread. -- Index of discussions: [`docs/rfc-discussions.md`](./docs/rfc-discussions.md). +See [`docs/contributing.md`](./docs/contributing.md). Normative changes belong in RFCs first; implementations should follow the spec and feed gaps back into it. ## License -This project is licensed under the **MIT License**. See [`LICENSE`](./LICENSE) for the full text. +This project is licensed under the **MIT License**. See [`LICENSE`](./LICENSE). diff --git a/docs/architecture.md b/docs/architecture.md index addf829..765104e 100644 --- a/docs/architecture.md +++ b/docs/architecture.md @@ -1,121 +1,82 @@ # Architecture -Open-CoT is a **cognitive control plane**: normative schemas, a governed finite-state machine, and audit artifacts define how proposals become permitted actions. Everything that interprets vendor-specific protocols, calls remote tools, or wraps a particular model sits in the **data plane** and must conform to the control plane contract. +Open CoT is a portable schema layer for reconciling non-deterministic cognition with concrete capability. -## Control plane vs data plane +It does not require a particular runtime, model provider, endpoint protocol, storage backend, or workflow engine. It defines the artifacts that let those systems coordinate safely. -| Control plane (normative) | Data plane (pluggable) | -|---------------------------|-------------------------| -| RFCs and JSON Schemas | Model adapters (OpenAI-compatible, local OSS, etc.) 
| -| Governed execution FSM (RFC 0007) | Concrete tool implementations | -| Delegation, permission, receipt, audit schemas | Network and storage drivers | -| Policy consultation *points* and required artifacts | Organization-specific policy rule bodies | -| Validation rules and registry | Human review UX for `escalate` | +## Cognitive Layer vs Runtime Layer -The control plane answers: *what states may we be in, what documents must exist at each step, and what may cross a trust boundary.* The data plane answers: *which LLM, which HTTP endpoint, which filesystem.* +| Cognitive layer | Runtime layer | +|-----------------|---------------| +| Emits a typed cognitive artifact | Validates and reconciles that artifact | +| Sees a capability snapshot | Discovers and signs capability inventory | +| Proposes execution intent | Applies policy, budget, and preconditions | +| Produces explanatory reasoning trace | Treats trace as audit material, not proof | +| Consumes observations | Executes endpoints and records receipts | -Keeping the split sharp avoids a common failure mode: shipping a “policy layer” that only watches logs **after** tools already ran. Here, policy consultation **points** are part of the FSM; skipping them is non-conformant, not an optimization. +The boundary is intentionally asymmetric. Cognition may propose. Runtime reconciles. -## Textual component diagram +## Core Data Flow -Picture data moving left to right on the **happy path**: +The current forward path is: -`Model adapter` → **Reasoning envelope** (typed proposal) → **Governed FSM** (phase gate) → **Policy engine** (decision) → **Permission store** (grants) → **Auth broker** (narrowed `AuthorityReceipt`) → **Tool executor** (`execute_tool` only) → **Observation path** (`observe_result` / `critique_verify`) → **Audit engine** (`finalize` / `audit_seal`). 
+`Capability discovery` → **Capability Snapshot** → `Cognitive step` → **Cognitive Artifact** → `Runtime validation` → `Policy gate` → `Endpoint execution` → **Observation** → **Reconciliation Result** → `Receipts / audit`. -Side channels include **budget** enforcement (RFC 0038) and **sandbox** allow/deny lists (RFC 0017), which can pre-empt a transition or force `fail_safe` without giving unsafe payloads back to the model. +The cognitive step receives only the capability snapshot and prior observations. It does not receive ambient authority, live tool handles, filesystem access, credentials, or transport configuration. -Temporal validity and ordering semantics (RFC 0051) are now part of the control-plane contract: governance artifacts use canonical fields (`observed_at`, `decided_at`, `effective_at`, `expires_at`, `started_at`, `completed_at`, `superseded_at`) and SHOULD carry non-wall-clock ordering metadata for replay stability. +## Primary Artifacts -For streamed decoding, deployments should treat budget control as an active circuit-breaker (preflight budget gate + mid-stream cancellation), not a post-hoc accounting report. The reference harness now supports this runtime pattern; see [Model adaptation for budget control](./model-adaptation-budget-control.md). +1. **Capability Snapshot** — Immutable inventory of available endpoints. Each capability carries server name, capability name, JSON-schema-compatible input shape, optional output shape, risk level, approval requirement, and stable digest. +2. **Cognitive Artifact** — Structured proposal emitted by a model-like component. It includes intent verification, assumptions, reasoning trace, execution intent, uncertainty, observations, and optional yield reason. +3. **Execution Intent** — A requested endpoint action bound to a specific snapshot ID and capability digest. +4. **Policy Gate Result** — Runtime authorization result. Shape validation does not imply permission. +5. 
**Observation** — Structured record of endpoint output, skipped work, validation failure, or policy refusal. +6. **Reconciliation Result** — Final envelope describing completed, yielded, approval-required, failed, or completed-with-errors outcomes. +7. **Receipts and Audit Envelopes** — Integrity-backed execution and lifecycle evidence from RFC 0048 and related RFCs. -## Major components +## Trust Boundaries -These names describe responsibilities; a single deployment may fold multiple roles into one service, but the boundaries stay conceptually distinct. +| Source | May supply | Must not supply | +|--------|------------|-----------------| +| Cognitive function | Structured artifact, execution intent, assumptions, explanation | Authority, forged receipts, endpoints outside the snapshot | +| Runtime | Validation, reconciliation, policy gates, endpoint dispatch, observations | Silent policy bypass, hidden side effects | +| Policy layer | Allow, deny, narrow, approval, yield semantics | Direct endpoint side effects | +| Endpoint executor | Endpoint output, errors, metadata | Expanded authority or altered snapshot semantics | +| Audit layer | Integrity and replay evidence | Retroactive mutation of prior artifacts | -1. **Reasoning envelope** — Structured proposal from the model (intent, constraints, requested capabilities). It is **input** to validation, not authority to act. -2. **Governed FSM** — The fourteen-state machine from RFC 0007. Only **`execute_tool`** may perform tool side effects; transitions are schema-checked and logged. -3. **Policy engine** — Evaluates rules at mandated consultation boundaries (e.g. `frame`, `plan`, `validate_authority`, `observe_result`, `critique_verify`, `finalize`). Outcomes include allow, deny, narrow, or escalate. -4. **Permission system** — Holds and revokes **grants** (RFC 0042). Grants are issued by the harness or policy stack, never minted by the model. -5. 
**Auth broker** — Takes approved or narrowed delegation and produces **`AuthorityReceipt`** with `granted_scope ≤ requested_scope` (RFC 0047 / 0048). This is where scope is tightened, not where the model freelances. -6. **Tool executor** — Dispatches a registered tool only inside **`execute_tool`**, consuming or citing permission, emitting **`ToolExecutionReceipt`**. -7. **Audit engine** — Assembles integrity-backed **audit envelopes** (RFC 0043 / 0048) and drives the terminal **`audit_seal`** state so a run has a durable, reviewable closure. -8. **Capability manifest** (RFC 0049) — A harness-compiled briefing injected before every LLM call (the **manifest heartbeat**). Tells the model what tools are available, what is blocked, what constraints apply, and how much budget remains. The model never builds this — only the harness does. The heartbeat counters context decay: as the conversation grows, the model progressively forgets earlier instructions. Re-injecting at every turn keeps the truth visible. Cost is under 200 tokens per injection; savings from prevented hallucinated tool calls and avoided denial cycles are significantly higher. +## Validation Order -The **harness** in this repository implements the FSM gate, schema validation (via Ajv against repo JSON Schemas), loop-level policy checks, budgets, sandbox enforcement for registered tools, capability manifest heartbeat, and trace emission. A production deployment might split policy and broker onto separate services, but the **same artifacts** (requests, decisions, receipts, envelopes) should still appear in the trace. +A conforming reconciliation runtime should evaluate execution intent in this order: -## Data flow: one tool call +1. Validate the cognitive artifact shape. +2. Confirm the referenced snapshot ID. +3. Confirm endpoint server and capability names exist in the snapshot. +4. Confirm the capability digest. +5. Validate arguments against the original capability input schema. +6. Apply policy gates. 
+7. Check approval requirements, risk, budget, and preconditions. +8. Execute the endpoint through the runtime boundary. +9. Validate endpoint result shape when available. +10. Record observation, receipt, and reconciliation result. -1. The model emits a **ReasoningEnvelope** and plan data; the harness validates against schema (`frame` / `plan`). +## Normative vs Reference - At `frame`, the adapter’s job ends at **serialization**: turning model output into JSON that matches schema. The harness’s job begins at **validation** and **policy consult**—there is no “helpful” bypass that trusts pretty-printed prose. +- **Normative:** RFC text and JSON Schemas under `schemas/`. +- **Reference:** TypeScript harness, Python helpers, examples, and downstream implementations such as Open Lagrange. -2. The harness records a **`DelegationRequest`** when new capability is needed (`request_authority`). +Open Lagrange is an opinionated implementation: Restate for durable execution, Zod for runtime validation, Vercel AI SDK for structured generation, and MCP-shaped endpoint execution. Those choices prove the standard under pressure, but they are not required by Open CoT. - The request carries justification and requested scope; it is evidence for auditors, not a self-signed certificate. +## RFC Map -3. **Policy** evaluates the request in **`validate_authority`** → **`DelegationDecision`** (approved, denied, narrowed, escalated). +- **RFC 0052** — Cognitive artifact and capability snapshot. +- **RFC 0053** — Reconciliation result and error taxonomy. +- **RFC 0049** — Capability manifest precursor and model-facing capability projection. +- **RFC 0041** — Policy documents and policy gate inputs. +- **RFC 0047** — Delegation and authority material. +- **RFC 0048** — Execution receipts and audit envelopes. +- **RFC 0051** — Temporal semantics, validity, replay, and ordering. +- **RFC 0001 / 0003 / 0007** — Foundational reasoning, tool invocation, and governed execution lineage. 
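
Reviewer sketch (not part of the diff): the validation order added to `docs/architecture.md` could be exercised roughly as below. This is a minimal illustration under stated assumptions, not the reference implementation. Every field name (`snapshot_id`, `capabilities`, `input_schema`, `requires_approval`) and every error string is hypothetical; the normative shapes and error taxonomy are whatever RFC 0052 and RFC 0053 define. Argument validation is reduced to a required-keys check as a stand-in for full JSON Schema validation, and only steps 1 to 7 (everything before endpoint execution) are covered.

```python
from __future__ import annotations

import hashlib
import json

# NOTE: all field names and error strings here are hypothetical illustrations.


def capability_digest(capability: dict) -> str:
    """Digest over the capability's canonical JSON, excluding the digest itself."""
    body = {k: v for k, v in capability.items() if k != "digest"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_intent(intent: dict, snapshot: dict, policy_allows) -> tuple[bool, str]:
    """Run the pre-execution checks in order and stop at the first failure."""
    # Step 1: validate the artifact shape (stand-in: required top-level keys).
    required = {"snapshot_id", "server", "capability", "digest", "arguments"}
    if not required.issubset(intent):
        return False, "invalid_shape"
    # Step 2: confirm the referenced snapshot ID.
    if intent["snapshot_id"] != snapshot["snapshot_id"]:
        return False, "snapshot_mismatch"
    # Step 3: confirm server and capability names exist in the snapshot.
    capability = snapshot["capabilities"].get((intent["server"], intent["capability"]))
    if capability is None:
        return False, "unknown_capability"
    # Step 4: confirm the capability digest.
    if intent["digest"] != capability["digest"]:
        return False, "digest_mismatch"
    # Step 5: validate arguments (stand-in: required keys only).
    needed = set(capability["input_schema"].get("required", []))
    if needed.difference(intent["arguments"]):
        return False, "invalid_arguments"
    # Step 6: apply the policy gate; valid shape is not permission.
    if not policy_allows(intent, capability):
        return False, "policy_denied"
    # Step 7: approval requirements (risk, budget, preconditions omitted here).
    if capability.get("requires_approval"):
        return False, "approval_required"
    return True, "ready_to_execute"
```

The ordering matters in this sketch: cheap structural checks run first, and the policy gate is consulted only after the intent is known to reference a real capability with a matching digest, so a policy decision is never made about an endpoint the snapshot does not contain.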
- Each capability in the request should be evaluated independently, per RFC 0007, so a partial approval becomes an explicit **narrowed** outcome rather than silent truncation. +## Closing Note -4. On approval path, the **auth broker** narrows scope and attaches **`AuthorityReceipt`** (`delegate_narrow`). - -5. The harness enters **`execute_tool`** with a valid grant or documented standing authorization, runs the tool, and writes **`ToolExecutionReceipt`**. - -6. Results flow through **`observe_result`** and **`critique_verify`** under policy; **`finalize`** revokes outstanding grants as required; **`audit_seal`** seals the trace and audit material. - -If policy denies or a hard failure occurs, control routes through **`deny`**, **`escalate`**, or **`fail_safe`** into **`audit_seal`** per RFC 0007. - -**Standing authorization path:** When `plan` transitions directly to `execute_tool`, the harness must still record **why** execution was legal—typically a cited standing grant or sandbox allowlist entry on the **`ToolExecutionReceipt`**. Auditors should never have to guess that tools ran “because the model said so.” - -## Schema registry and validation path - -The `schemas/` directory is the **machine-readable** anchor: each RFC with schema sections publishes JSON Schema files; `registry.json` lists shortnames and versions used by examples and CI. - -Downstream flows typically look like: **author** instance JSON → **validate** with `tools/validate.py` (Python) or harness-side Ajv (TypeScript) → **store or replay** the validated trace. The control plane stays identical even when the storage backend or transport changes. 
- -## Trust boundaries - -| Source | What it may supply | What it must not supply | -|--------|--------------------|-------------------------| -| Model | Proposals, plans, justifications, interpretations of observations | Self-granted permissions, forged receipts, out-of-band tool execution | -| Harness | State transitions, validation, delegation records, dispatch, trace steps | Silent policy bypass; executing tools outside `execute_tool` | -| Policy engine | Allow / deny / narrow / escalate decisions | Direct tool side effects (policy decides; executor acts) | -| Auth broker | Narrowed **`AuthorityReceipt`** | Broader scope than policy approved | -| Tool executor | Tool outputs, errors, telemetry | Authority to expand requested scope | - -**Humans in `escalate`** supply judgment, not schema: the FSM only requires that escalation eventually resolves to `delegate_narrow`, `deny`, or `audit_seal` (for example on timeout). UX for reviewers is data-plane concern; the **control-plane obligation** is to record the resolution and receipts. - -## Normative vs reference - -- **Normative:** RFC text, JSON Schemas under `schemas/`, and the FSM / artifact rules they define. A conforming implementation must satisfy these without contradicting the specs. -- **Reference:** The TypeScript **`harness/`** package, Python helpers under **`reference/python/`** and **`tools/`**, and **`examples/`** fixtures. They demonstrate one correct reading of the specs and are expected to evolve as RFCs tighten; when reference and RFC disagree, the RFC wins and the reference should be updated. - -Treat the harness as a **credible PoC** of the control plane, not as the only permissible runtime architecture. - -## RFC map (minimal spine) - -Implementers usually traverse these in order after RFC 0001 (reasoning / trace): - -- **RFC 0007** — Governed FSM; owns which phase may call tools and where policy must be consulted. 
-- **RFC 0003** (+ **RFC 0018** tool errors) — Tool invocation payloads and structured failure. -- **RFC 0017** — Sandbox surfaces that constrain *which* tools or scopes are even eligible. -- **RFC 0038** — Budget objects that can terminate a run without pretending the model “agreed” to stop. -- **RFC 0041** — Policy evaluation schema and attachment semantics at FSM hooks. -- **RFC 0042** — Permissions / ACL material the harness may attach to a run. -- **RFC 0047** — Delegation requests and decisions that precede narrowed authority. -- **RFC 0048** — Execution receipts plus integrity linkage into audit envelopes. -- **RFC 0043** — Auditing and compliance log shapes that consume the above identifiers. -- **RFC 0051** — Cross-cutting temporal semantics, validity windows, freshness, replay handling, and supersession behavior. - -Additional RFCs cover identity (**RFC 0026**), provenance (**RFC 0035**), org governance (**RFC 0044**), federation, and economics; they extend the same spine rather than replacing it. - -## Operational deployment patterns - -- **Single binary / edge agent:** All components may run in-process; still emit the same receipts so a central collector can verify them later. -- **Split trust domains:** Policy engine and auth broker on hardened hosts; model adapter on GPU nodes; tool executor in a network-restricted VPC. The control plane stays coherent as long as **artifact IDs** chain correctly in the trace. -- **Human escalation:** `escalate` pauses the FSM; reviewers act through a workflow UI; the resumed transition must record the human decision alongside broker output. -- **Replay and forensics:** Because phases and receipts are serialized, investigators can replay a run without re-executing tools, comparing recorded policy decisions to recorded grants. 
- -## Closing note - -Adopters should map their existing “agent framework” onto this diagram explicitly: identify where proposals become JSON, where policy can still say **no** before side effects, and where receipts land for compliance. If those three answers are fuzzy, the system is still trusting the model more than the contract. +The standard should make implementations interchangeable at the artifact boundary. If a runtime discovers a missing portable concept, that concept should become an Open CoT RFC/schema change rather than a private extension. diff --git a/docs/philosophy.md b/docs/philosophy.md index 21f9bce..ab7ad8d 100644 --- a/docs/philosophy.md +++ b/docs/philosophy.md @@ -1,47 +1,39 @@ # Philosophy -Open-CoT exists to make **governed agent execution** interoperable: a shared, machine-verifiable contract between model output, runtime enforcement, policy, delegation, tools, provenance, and audit. +Open CoT exists to make the boundary between cognition and capability portable. -We are deliberately opinionated about **where trust lives**. Capability flows through grants, policy, and brokers—not through persuasive text in the model channel. That can feel heavier than ad-hoc agent scripts; the bet is that regulated and multi-vendor environments will prefer **one inspectable contract** over many implicit ones. +The model-like component is useful because it can interpret, compress, explain, and propose. It is not useful as an authority boundary. Open CoT treats its output as a cognitive artifact: structured, inspectable, and untrusted until a runtime reconciles it against capability, policy, budget, and evidence. ## Principles -### Typed schemas over ambiguous prose +### Typed artifacts over ambiguous prose -Every serious boundary—reasoning steps, tool intent, FSM phase, delegation, permission, receipt, audit envelope—is expressed as **JSON Schema** and validated by the harness. 
Natural language may explain *why* a human approved something; it does not define *whether* a transition is legal. +Every serious boundary should be expressed as JSON Schema: cognitive artifact, capability snapshot, execution intent, policy material, observation, receipt, and reconciliation result. Natural language can explain context; it cannot grant permission. -### The model is an untrusted proposer +### Capability snapshots over ambient access -Models suggest plans and capability needs. They **never** authorize themselves. The harness treats model output like any other untrusted input: parse, validate, consult policy, then either advance the FSM or refuse. Obedience to the model is a bug. +Cognition receives an explicit snapshot of available endpoints. The snapshot binds endpoint names, input shape, risk, approval requirement, and digest. Requests outside that snapshot are invalid. -### Portable harness semantics +### Execution intent is not execution -The governed FSM and artifact shapes are **model-agnostic**. Any model that can emit structured output (or sit behind an adapter that shapes output) can participate. Portability comes from shared control-plane semantics, not from standardizing hidden chain-of-thought prose. +An execution intent is a proposal. A runtime must validate shape, snapshot identity, capability digest, arguments, policy, risk, approval, budget, and preconditions before side effects occur. -### Explicit provenance and evidence +### Policy is separate from validation -Side effects and decisions leave **receipts** and audit-linked records. A completed run should answer: what was requested, what was allowed, what ran, and how integrity was sealed. Silence is not accountability. +Zod, JSON Schema, or any other validator can prove shape. They cannot prove permission. Policy gates are separate artifacts and should leave their own evidence. 
-### Permission-aware execution +### Observations over transcript trust -Tool calls require **explicit authority**—grants with scope and lifetime, or a documented standing authorization cited on the execution receipt. Permission is a runtime object managed by the trust stack, not a vibe inferred from the prompt. +Endpoint output becomes an observation. Observations are structured runtime records, not loose transcript text. They can carry result data, skipped work, validation failures, policy refusals, and reconciliation errors. -### Delegation as a bounded request +### Reconciliation over orchestration by text -When more capability is needed, the agent issues a formal **delegation request**. Policy approves, narrows, denies, or escalates. The auth broker issues a narrowed receipt. Delegation is not a model-signed blank check. +The runtime owns progression. The cognitive step emits an artifact, then yields to the runtime boundary. This keeps retries, crash recovery, endpoint execution, and audit in deterministic code. -### Fail-closed safety +### Implementation pressure should improve the standard -If validation fails, **deny**. If observation violates policy, **quarantine** and route toward **`fail_safe`** rather than leaking unsafe material back to the model. If budgets exhaust, **stop**. Uncertainty defaults to refusal, not optimism. +Open Lagrange is a proving ground, not a competing dialect. If it needs a portable structure, Open CoT should gain or refine an RFC/schema. Runtime-specific choices stay local; reusable interfaces belong here. -Designs that “try the tool call and roll back” still owe the same receipts: optimism belongs in the training loss, not in the authorization boundary. +### Backward compatibility without freezing vocabulary -### Token-aware by design - -Structure costs tokens, and tokens cost money and context. The control plane should not burn the model's budget on bureaucracy. 
Capability manifests (RFC 0049) tell the model what it can do upfront so it does not waste tokens guessing. Compact text serialization keeps the overhead under 200 tokens. Context compilation — summarizing observations, windowing traces, stripping harness metadata — keeps the model focused on the task, not the plumbing. See [`docs/token-efficiency.md`](./token-efficiency.md) for active research on wire formats and small-model strategies. - -### Small credible proofs before ambitious claims - -The reference harness exists to show that the **contract works**: one end-to-end path that respects the FSM, receipts, and validation. We aim for a narrow, correct slice of the ecosystem—not a declaration that every framework must adopt this stack tomorrow. - -If a behavior cannot be expressed in schema and FSM transitions yet, we treat that as a **spec gap** to fix, not as encouragement to bypass the harness with bespoke glue code. +Earlier RFCs use historical terms from the project’s transitional period. Those documents remain part of the record. New work should prefer cognition, capability, execution intent, observation, policy gate, runtime boundary, and reconciliation terminology. 
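The principles added above (capability snapshots over ambient access, execution intent as a proposal, policy separate from validation) can be sketched as a small pre-execution check. This is a minimal sketch under stated assumptions, not the reference implementation: the function name `check_intent`, the `policy_allows` callback, the `approved` flag, and the order of the checks are all hypothetical. Field names and error codes are taken from the RFC 0052 and RFC 0053 schemas that appear later in this change. Argument validation against `input_schema` is left out to keep the sketch dependency-free.

```python
from typing import Any, Callable, Optional


def check_intent(
    intent: dict[str, Any],
    snapshot: dict[str, Any],
    policy_allows: Callable[[dict[str, Any]], bool],
    approved: bool = False,
) -> Optional[str]:
    """Return an RFC 0053 error code, or None if the intent may proceed.

    Hypothetical helper: name, signature, and check order are assumptions.
    """
    # The intent must reference the exact snapshot it was generated against.
    if intent["snapshot_id"] != snapshot["snapshot_id"]:
        return "SNAPSHOT_MISMATCH"

    # Requests outside the snapshot are invalid: look up server, then capability.
    on_server = [
        cap
        for cap in snapshot["capabilities"]
        if cap["mcp_server_name"] == intent["target_mcp_server"]
    ]
    if not on_server:
        return "UNKNOWN_MCP_SERVER"

    match = next(
        (cap for cap in on_server if cap["capability_name"] == intent["capability_name"]),
        None,
    )
    if match is None:
        return "UNKNOWN_CAPABILITY"

    # The digest binds the intent to the capability definition in the snapshot.
    if match["capability_digest"] != intent["capability_digest"]:
        return "CAPABILITY_DIGEST_MISMATCH"

    # Policy is a separate gate: a well-formed intent is still not permission.
    if not policy_allows(intent):
        return "POLICY_DENIED"

    # The approval requirement is read from the snapshot, not from the
    # intent's own (untrusted) requires_approval field.
    if match["requires_approval"] and not approved:
        return "APPROVAL_REQUIRED"

    return None


snapshot = {
    "snapshot_id": "snap-1",
    "capabilities": [
        {
            "mcp_server_name": "files",
            "capability_name": "read_file",
            "capability_digest": "a" * 64,
            "requires_approval": False,
        }
    ],
}
intent = {
    "snapshot_id": "snap-1",
    "target_mcp_server": "files",
    "capability_name": "read_file",
    "capability_digest": "a" * 64,
}
outcome = check_intent(intent, snapshot, policy_allows=lambda _intent: True)
```

Returning a code instead of raising keeps refusals as data, which is one way a runtime could record them as structured observations rather than as control flow.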
diff --git a/docs/rfc-discussion-index.json b/docs/rfc-discussion-index.json index 3176029..7100fe2 100644 --- a/docs/rfc-discussion-index.json +++ b/docs/rfc-discussion-index.json @@ -308,6 +308,18 @@ "rfc_path": "rfcs/0051-temporal-semantics-validity-extension.md", "discussion_title": "RFC 0051 \u2014 Temporal Semantics & Validity Extension", "discussion_url": "https://github.com/supernovae/open-cot/discussions/51" + }, + "0052": { + "rfc_title": "Cognitive Artifact & Capability Snapshot", + "rfc_path": "rfcs/0052-cognitive-artifact-and-capability-snapshot.md", + "discussion_title": "RFC 0052 \u2014 Cognitive Artifact & Capability Snapshot", + "discussion_url": "https://github.com/supernovae/open-cot/discussions/52" + }, + "0053": { + "rfc_title": "Reconciliation Result & Error Taxonomy", + "rfc_path": "rfcs/0053-reconciliation-result.md", + "discussion_title": "RFC 0053 \u2014 Reconciliation Result & Error Taxonomy", + "discussion_url": "https://github.com/supernovae/open-cot/discussions/53" } } } diff --git a/docs/rfc-discussions.md b/docs/rfc-discussions.md index 525adc3..c1d5bff 100644 --- a/docs/rfc-discussions.md +++ b/docs/rfc-discussions.md @@ -58,4 +58,5 @@ Canonical discussion threads for all Open CoT RFCs. 
Use these threads for normat | [`RFC 0049`](../rfcs/0049-capability-manifest.md) | Capability Manifest | [Open thread](https://github.com/supernovae/open-cot/discussions/49) | | [`RFC 0050`](../rfcs/0050-toon-adapter.md) | TOON Adapter: Token-Oriented Object Notation | [Open thread](https://github.com/supernovae/open-cot/discussions/50) | | [`RFC 0051`](../rfcs/0051-temporal-semantics-validity-extension.md) | Temporal Semantics & Validity Extension | [Open thread](https://github.com/supernovae/open-cot/discussions/51) | - +| [`RFC 0052`](../rfcs/0052-cognitive-artifact-and-capability-snapshot.md) | Cognitive Artifact & Capability Snapshot | [Open thread](https://github.com/supernovae/open-cot/discussions/52) | +| [`RFC 0053`](../rfcs/0053-reconciliation-result.md) | Reconciliation Result & Error Taxonomy | [Open thread](https://github.com/supernovae/open-cot/discussions/53) | diff --git a/rfcs/0052-cognitive-artifact-and-capability-snapshot.md b/rfcs/0052-cognitive-artifact-and-capability-snapshot.md new file mode 100644 index 0000000..24d6e4d --- /dev/null +++ b/rfcs/0052-cognitive-artifact-and-capability-snapshot.md @@ -0,0 +1,48 @@ +# RFC 0052 — Cognitive Artifact & Capability Snapshot (v0.1) + +**Status:** Draft +**Author:** Open CoT Community +**Created:** 2026-04-27 +**Target Version:** Schema v0.10 +**Discussion:** https://github.com/supernovae/open-cot/discussions/52 + +--- + +## 1. Summary + +This RFC defines portable structures for runtimes that wrap non-deterministic +cognitive functions with deterministic validation and execution boundaries. + +The core structure is a **Cognitive Artifact**: a typed proposal emitted by a +model or model-like system. It is untrusted input. A runtime validates and +reconciles it against an immutable **Capability Snapshot** before performing +any side effect. + +## 2. Core concepts + +- `capability_snapshot`: immutable inventory of endpoints available to the + cognitive step. 
+- `cognitive_artifact`: typed proposal emitted from the cognitive step. +- `execution_intent`: requested endpoint execution tied to a snapshot and + capability digest. +- `observation`: structured evidence recorded during reconciliation. + +## 3. Normative requirements + +- A cognitive artifact MUST NOT be treated as authorization. +- Every execution intent MUST reference the exact snapshot used for generation. +- A runtime MUST verify endpoint name, capability name, and capability digest + before execution. +- A runtime MUST validate arguments against the original capability input + schema. +- Reasoning traces are explanatory audit material only. They are not proof, + authorization, or trusted state. + +## 4. Runtime neutrality + +This RFC does not require a specific durable execution engine, MCP transport, +model provider, or TypeScript implementation. Those are implementation choices. + +## 5. Schema + +Machine-readable schema: `schemas/rfc-0052-cognitive-artifact.json`. diff --git a/rfcs/0053-reconciliation-result.md b/rfcs/0053-reconciliation-result.md new file mode 100644 index 0000000..486ed65 --- /dev/null +++ b/rfcs/0053-reconciliation-result.md @@ -0,0 +1,57 @@ +# RFC 0053 — Reconciliation Result & Error Taxonomy (v0.1) + +**Status:** Draft +**Author:** Open CoT Community +**Created:** 2026-04-27 +**Target Version:** Schema v0.10 +**Discussion:** https://github.com/supernovae/open-cot/discussions/53 + +--- + +## 1. Summary + +This RFC defines a portable result envelope for runtimes that reconcile typed +cognitive artifacts against capability snapshots, policy gates, execution +bounds, endpoint results, and observations. + +The result envelope records what executed, what was skipped, what errors were +observed, and the final reconciliation status. + +## 2. Status values + +- `completed` +- `completed_with_errors` +- `yielded` +- `requires_approval` +- `failed` + +## 3. 
Error taxonomy + +The portable taxonomy includes: + +- `INVALID_ARTIFACT` +- `SNAPSHOT_MISMATCH` +- `UNKNOWN_MCP_SERVER` +- `UNKNOWN_CAPABILITY` +- `CAPABILITY_DIGEST_MISMATCH` +- `SCHEMA_VALIDATION_FAILED` +- `POLICY_DENIED` +- `APPROVAL_REQUIRED` +- `PRECONDITION_FAILED` +- `BUDGET_EXCEEDED` +- `MCP_EXECUTION_FAILED` +- `RESULT_VALIDATION_FAILED` +- `YIELDED` + +## 4. Normative requirements + +- Shape validation MUST NOT be treated as permission. +- Permission and policy gates MUST be represented separately from schema + validation. +- Errors SHOULD be recorded as structured observations when possible. +- A reconciliation result SHOULD preserve enough evidence for replay and audit + without requiring endpoint re-execution. + +## 5. Schema + +Machine-readable schema: `schemas/rfc-0053-reconciliation-result.json`. diff --git a/schemas/registry.json b/schemas/registry.json index d683617..6a0bcbd 100644 --- a/schemas/registry.json +++ b/schemas/registry.json @@ -52,6 +52,8 @@ "execution_receipts_audit_envelopes": "schemas/rfc-0048-execution-receipts-audit-envelopes.json", "capability_manifest": "schemas/rfc-0049-capability-manifest.json", "toon_adapter": "schemas/rfc-0050-toon-adapter.json", - "temporal_semantics": "schemas/rfc-0051-temporal-semantics.json" + "temporal_semantics": "schemas/rfc-0051-temporal-semantics.json", + "cognitive_artifact": "schemas/rfc-0052-cognitive-artifact.json", + "reconciliation_result": "schemas/rfc-0053-reconciliation-result.json" } } diff --git a/schemas/rfc-0052-cognitive-artifact.json b/schemas/rfc-0052-cognitive-artifact.json new file mode 100644 index 0000000..6a8814d --- /dev/null +++ b/schemas/rfc-0052-cognitive-artifact.json @@ -0,0 +1,354 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://opencot.dev/schema/v0.10/cognitive-artifact.json", + "title": "Open CoT RFC 0052 — Cognitive Artifact and Capability Snapshot", + "type": "object", + "oneOf": [ + { + "$ref": "#/$defs/cognitive_artifact" + }, + { 
+ "$ref": "#/$defs/capability_snapshot" + } + ], + "$defs": { + "json_schema_like": { + "type": "object", + "additionalProperties": true + }, + "risk_level": { + "type": "string", + "enum": [ + "read", + "write", + "destructive", + "external_side_effect" + ] + }, + "capability_descriptor": { + "type": "object", + "additionalProperties": false, + "required": [ + "mcp_server_name", + "capability_name", + "description", + "input_schema", + "risk_level", + "requires_approval", + "capability_digest" + ], + "properties": { + "mcp_server_name": { + "type": "string", + "minLength": 1 + }, + "capability_name": { + "type": "string", + "minLength": 1 + }, + "description": { + "type": "string" + }, + "input_schema": { + "$ref": "#/$defs/json_schema_like" + }, + "output_schema": { + "$ref": "#/$defs/json_schema_like" + }, + "risk_level": { + "$ref": "#/$defs/risk_level" + }, + "requires_approval": { + "type": "boolean" + }, + "capability_digest": { + "type": "string", + "pattern": "^[a-f0-9]{64}$" + } + } + }, + "capability_snapshot": { + "type": "object", + "additionalProperties": false, + "required": [ + "snapshot_id", + "discovered_at", + "capabilities_hash", + "capabilities" + ], + "properties": { + "snapshot_id": { + "type": "string", + "minLength": 1 + }, + "discovered_at": { + "type": "string", + "format": "date-time" + }, + "capabilities_hash": { + "type": "string", + "pattern": "^[a-f0-9]{64}$" + }, + "capabilities": { + "type": "array", + "items": { + "$ref": "#/$defs/capability_descriptor" + } + } + } + }, + "intent_verification": { + "type": "object", + "additionalProperties": false, + "required": [ + "interpreted_user_objective", + "request_boundaries", + "believed_allowed_requests", + "prohibited_requests" + ], + "properties": { + "interpreted_user_objective": { + "type": "string" + }, + "request_boundaries": { + "type": "array", + "items": { + "type": "string" + } + }, + "believed_allowed_requests": { + "type": "array", + "items": { + "type": "string" + } + }, + 
"prohibited_requests": { + "type": "array", + "items": { + "type": "string" + } + } + } + }, + "reasoning_trace_step": { + "type": "object", + "additionalProperties": false, + "required": [ + "step_id", + "kind", + "content" + ], + "properties": { + "step_id": { + "type": "string", + "minLength": 1 + }, + "kind": { + "type": "string", + "enum": [ + "interpretation", + "constraint", + "hypothesis", + "verification", + "yield" + ] + }, + "content": { + "type": "string" + }, + "confidence": { + "type": "number", + "minimum": 0, + "maximum": 1 + } + } + }, + "execution_intent": { + "type": "object", + "additionalProperties": false, + "required": [ + "intent_id", + "snapshot_id", + "target_mcp_server", + "capability_name", + "capability_digest", + "risk_level", + "requires_approval", + "idempotency_key", + "arguments" + ], + "properties": { + "intent_id": { + "type": "string", + "minLength": 1 + }, + "snapshot_id": { + "type": "string", + "minLength": 1 + }, + "target_mcp_server": { + "type": "string", + "minLength": 1 + }, + "capability_name": { + "type": "string", + "minLength": 1 + }, + "capability_digest": { + "type": "string", + "pattern": "^[a-f0-9]{64}$" + }, + "risk_level": { + "$ref": "#/$defs/risk_level" + }, + "requires_approval": { + "type": "boolean" + }, + "idempotency_key": { + "type": "string", + "minLength": 1 + }, + "arguments": { + "type": "object" + }, + "preconditions": { + "type": "array", + "items": { + "type": "string" + } + }, + "expected_result_shape": { + "$ref": "#/$defs/json_schema_like" + }, + "postconditions": { + "type": "array", + "items": { + "type": "string" + } + } + } + }, + "observation": { + "type": "object", + "additionalProperties": false, + "required": [ + "observation_id", + "status", + "summary", + "observed_at" + ], + "properties": { + "observation_id": { + "type": "string", + "minLength": 1 + }, + "intent_id": { + "type": "string" + }, + "status": { + "type": "string", + "enum": [ + "recorded", + "skipped", + "error" + ] + 
}, + "summary": { + "type": "string" + }, + "output": {}, + "observed_at": { + "type": "string", + "format": "date-time" + } + } + }, + "cognitive_artifact": { + "type": "object", + "additionalProperties": false, + "required": [ + "artifact_id", + "schema_version", + "capability_snapshot_id", + "intent_verification", + "observations", + "assumptions", + "reasoning_trace", + "execution_intent", + "uncertainty" + ], + "properties": { + "artifact_id": { + "type": "string", + "minLength": 1 + }, + "schema_version": { + "type": "string", + "enum": [ + "open-cot.reconciliation.v0.1" + ] + }, + "capability_snapshot_id": { + "type": "string", + "minLength": 1 + }, + "intent_verification": { + "$ref": "#/$defs/intent_verification" + }, + "observations": { + "type": "array", + "items": { + "$ref": "#/$defs/observation" + } + }, + "assumptions": { + "type": "array", + "items": { + "type": "string" + } + }, + "reasoning_trace": { + "type": "array", + "items": { + "$ref": "#/$defs/reasoning_trace_step" + } + }, + "execution_intent": { + "type": "array", + "items": { + "$ref": "#/$defs/execution_intent" + } + }, + "uncertainty": { + "type": "object", + "additionalProperties": false, + "required": [ + "level", + "explanation" + ], + "properties": { + "level": { + "type": "string", + "enum": [ + "low", + "medium", + "high" + ] + }, + "explanation": { + "type": "string" + } + } + }, + "yield_reason": { + "type": "string" + } + } + } + }, + "x-opencot": { + "rfc": "0052", + "shortname": "cognitive_artifact", + "source_rfc": "rfcs/0052-cognitive-artifact-and-capability-snapshot.md" + } +} diff --git a/schemas/rfc-0053-reconciliation-result.json b/schemas/rfc-0053-reconciliation-result.json new file mode 100644 index 0000000..2cef960 --- /dev/null +++ b/schemas/rfc-0053-reconciliation-result.json @@ -0,0 +1,119 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://opencot.dev/schema/v0.10/reconciliation-result.json", + "title": "Open CoT RFC 0053 — 
Reconciliation Result and Error Taxonomy", + "type": "object", + "additionalProperties": false, + "required": [ + "reconciliation_id", + "status", + "capability_snapshot", + "executed_intents", + "skipped_intents", + "observations", + "errors", + "final_message" + ], + "properties": { + "reconciliation_id": { + "type": "string", + "minLength": 1 + }, + "status": { + "type": "string", + "enum": [ + "completed", + "completed_with_errors", + "yielded", + "requires_approval", + "failed" + ] + }, + "capability_snapshot": { + "$ref": "rfc-0052-cognitive-artifact.json#/$defs/capability_snapshot" + }, + "artifact": { + "$ref": "rfc-0052-cognitive-artifact.json#/$defs/cognitive_artifact" + }, + "executed_intents": { + "type": "array", + "items": { + "$ref": "rfc-0052-cognitive-artifact.json#/$defs/execution_intent" + } + }, + "skipped_intents": { + "type": "array", + "items": { + "$ref": "rfc-0052-cognitive-artifact.json#/$defs/execution_intent" + } + }, + "observations": { + "type": "array", + "items": { + "$ref": "rfc-0052-cognitive-artifact.json#/$defs/observation" + } + }, + "errors": { + "type": "array", + "items": { + "$ref": "#/$defs/reconciliation_error" + } + }, + "final_message": { + "type": "string" + } + }, + "$defs": { + "error_code": { + "type": "string", + "enum": [ + "INVALID_ARTIFACT", + "SNAPSHOT_MISMATCH", + "UNKNOWN_MCP_SERVER", + "UNKNOWN_CAPABILITY", + "CAPABILITY_DIGEST_MISMATCH", + "SCHEMA_VALIDATION_FAILED", + "POLICY_DENIED", + "APPROVAL_REQUIRED", + "PRECONDITION_FAILED", + "BUDGET_EXCEEDED", + "MCP_EXECUTION_FAILED", + "RESULT_VALIDATION_FAILED", + "YIELDED" + ] + }, + "reconciliation_error": { + "type": "object", + "additionalProperties": false, + "required": [ + "code", + "message", + "observed_at" + ], + "properties": { + "code": { + "$ref": "#/$defs/error_code" + }, + "message": { + "type": "string" + }, + "intent_id": { + "type": "string" + }, + "observed_at": { + "type": "string", + "format": "date-time" + }, + "details": { + "type": 
"object", + "additionalProperties": true + } + } + } + }, + "x-opencot": { + "rfc": "0053", + "shortname": "reconciliation_result", + "source_rfc": "rfcs/0053-reconciliation-result.md" + } +} diff --git a/tools/schema_lib.py b/tools/schema_lib.py index 1ecd138..1028379 100644 --- a/tools/schema_lib.py +++ b/tools/schema_lib.py @@ -67,6 +67,8 @@ "0049": "capability_manifest", "0050": "toon_adapter", "0051": "temporal_semantics", + "0052": "cognitive_artifact", + "0053": "reconciliation_result", } # RFC ids where extraction must use explicit markers. diff --git a/tools/sync_schemas_from_rfcs.py b/tools/sync_schemas_from_rfcs.py index 3a54238..2441011 100644 --- a/tools/sync_schemas_from_rfcs.py +++ b/tools/sync_schemas_from_rfcs.py @@ -158,6 +158,16 @@ def main() -> int: rendered = ", ".join(f"{rfc_id}({len(paths)})" for rfc_id, paths in sorted(dups.items())) raise RuntimeError(f"Duplicate RFC ids detected: {rendered}") + manually_authored_schema_ids = {"0052", "0053"} + manually_authored_schemas: dict[str, dict[str, Any]] = {} + for rfc_id in manually_authored_schema_ids: + shortname = RFC_SHORTNAME.get(rfc_id) + if not shortname: + continue + path = SCHEMAS_DIR / schema_filename(rfc_id, shortname) + if path.is_file(): + manually_authored_schemas[rfc_id] = json.loads(path.read_text(encoding="utf-8")) + SCHEMAS_DIR.mkdir(parents=True, exist_ok=True) for stale in SCHEMAS_DIR.glob("rfc-*.json"): stale.unlink() @@ -170,7 +180,9 @@ def main() -> int: data: dict[str, Any] | None = None - if rfc_id == "0004": + if rfc_id in manually_authored_schemas: + data = manually_authored_schemas[rfc_id] + elif rfc_id == "0004": data = build_branching_schema(rfc_id) elif rfc_id == "0007": data = build_agent_loop_schema()
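The `tools/sync_schemas_from_rfcs.py` hunk above reads the manually authored schemas (RFC 0052 and RFC 0053) into memory before the `rfc-*.json` glob is unlinked, so they survive regeneration. A minimal sketch of that ordering, assuming a hypothetical `resync` helper, a throwaway temporary directory instead of the repository's real `SCHEMAS_DIR`, and placeholder file names for the generated and stale schemas:

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def resync(
    schemas_dir: Path,
    keep: set[str],
    generated: dict[str, dict[str, Any]],
) -> None:
    """Regenerate rfc-*.json files while preserving hand-written ones.

    Hypothetical helper for illustration; `keep` holds file names to preserve.
    """
    # 1. Load the hand-written schemas before anything is deleted.
    preserved = {
        name: json.loads((schemas_dir / name).read_text(encoding="utf-8"))
        for name in keep
        if (schemas_dir / name).is_file()
    }
    # 2. Remove every existing schema file, including the preserved ones on disk.
    for stale in schemas_dir.glob("rfc-*.json"):
        stale.unlink()
    # 3. Write everything back; preserved content wins over generated content.
    for name, data in {**generated, **preserved}.items():
        (schemas_dir / name).write_text(json.dumps(data, indent=2), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    hand_written = root / "rfc-0052-cognitive-artifact.json"
    hand_written.write_text('{"title": "hand-written"}', encoding="utf-8")
    # Placeholder stale file that no RFC produces any more.
    (root / "rfc-0001-stale-example.json").write_text("{}", encoding="utf-8")

    resync(
        root,
        keep={"rfc-0052-cognitive-artifact.json"},
        generated={"rfc-0004-generated-example.json": {"title": "generated"}},
    )

    remaining = sorted(path.name for path in root.iterdir())
    kept_title = json.loads(hand_written.read_text(encoding="utf-8"))["title"]
```

Reading first and deleting second is the important part: if the order were reversed, the hand-written files would be gone before they could be preserved.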