Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
77 changes: 77 additions & 0 deletions Taskfile.yml
Original file line number Diff line number Diff line change
Expand Up @@ -202,3 +202,80 @@ tasks:
--keyring-append melange.rsa.pub \
--arch {{.ARCH}}
- 'echo "Built packages/notme.tar — load with: docker load < packages/notme.tar"'

# schema-bridge tasks
#
# capnp → zod TypeScript codegen via the Rust crate at
# packages/schema-bridge/. Replaces the older `schema:ts` TS-based
# codegen which silently emitted `z.unknown()` for unrecognised
# constructs; schema-bridge fails fast on any unmapped capnp shape.
#
# NOT wired into `task verify` yet — schema-bridge doesn't yet map
# the `$Go.package` / `$Go.import` annotations our identity.capnp
# uses for Go codegen. Filing as a follow-on bead. Until that lands,
# `task gen:zod` against identity.capnp will fail with a clear
# `unmapped capnp construct annotation` diagnostic.
schema-bridge:build:
desc: Build the capnpc-schema-bridge plugin (release profile)
dir: packages/schema-bridge
sources:
- src/**/*.rs
- Cargo.toml
- Cargo.lock
generates:
- target/release/capnpc-schema-bridge
cmds:
- cargo build --locked --release

gen:zod:
desc: Regenerate gen/ts/*.zod.ts from schema/*.capnp via schema-bridge
deps: [schema-bridge:build]
cmds:
# `capnp compile` exits 0 even when the plugin errors and writes
# nothing — its stderr carries the real diagnostic. Generate into
# a sibling tmpdir under gen/ts/ (same filesystem as destination)
# so the final `mv` is a same-fs rename — atomic on POSIX. A
# plain `mktemp -d` lands in $TMPDIR (often /tmp, a separate fs),
# which silently degrades the rename to a cross-fs copy+unlink.
- |
set -eu
mkdir -p gen/ts
TMPDIR=$(mktemp -d -p gen/ts)
trap 'rm -rf "$TMPDIR"' EXIT
capnp compile \
-o./packages/schema-bridge/target/release/capnpc-schema-bridge:"$TMPDIR" \
-I "$HOME/go/pkg/mod/capnproto.org/go/capnp/v3@v3.1.0-alpha.2/std" \
--src-prefix=schema schema/identity.capnp
if [ ! -f "$TMPDIR/identity.zod.ts" ]; then
echo "FAIL — schema-bridge produced no identity.zod.ts; check capnp stderr above" >&2
exit 1
fi
mv "$TMPDIR/identity.zod.ts" gen/ts/identity.zod.ts
Comment thread
jamestexas marked this conversation as resolved.

gen:zod:check-drift:
desc: Fail if committed gen/ts/*.zod.ts has drifted from a fresh regen
deps: [schema-bridge:build]
cmds:
- |
set -eu
# Baseline must exist before we can diff against it. Until the
# schema-bridge Go-annotation gap (rosary-8d2c78) is closed,
# `task gen:zod` against identity.capnp errors out, so no
# baseline is committed. Emit a clearer error than the raw
# `diff: gen/ts/identity.zod.ts: No such file or directory`.
if [ ! -f gen/ts/identity.zod.ts ]; then
echo "FAIL — gen/ts/identity.zod.ts baseline missing; run \`task gen:zod\` and commit the output first (blocked on rosary-8d2c78)." >&2
exit 1
fi
mkdir -p gen/ts
TMPDIR=$(mktemp -d -p gen/ts)
trap 'rm -rf "$TMPDIR"' EXIT
capnp compile \
-o./packages/schema-bridge/target/release/capnpc-schema-bridge:"$TMPDIR" \
-I "$HOME/go/pkg/mod/capnproto.org/go/capnp/v3@v3.1.0-alpha.2/std" \
--src-prefix=schema schema/identity.capnp
if [ ! -f "$TMPDIR/identity.zod.ts" ]; then
echo "FAIL — schema-bridge produced no identity.zod.ts; check capnp stderr above" >&2
exit 1
fi
Comment thread
jamestexas marked this conversation as resolved.
diff -u gen/ts/identity.zod.ts "$TMPDIR/identity.zod.ts"
1 change: 1 addition & 0 deletions packages/schema-bridge/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
target/
81 changes: 81 additions & 0 deletions packages/schema-bridge/Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

19 changes: 19 additions & 0 deletions packages/schema-bridge/Cargo.toml
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
[package]
name = "schema-bridge"
version = "0.0.1"
edition = "2021"
description = "Capnp + JSON-extension schemas → zod / TS / JSON Schema. Fail-fast codegen."
license = "Apache-2.0"
publish = false

[[bin]]
name = "capnpc-schema-bridge"
path = "src/main.rs"

[lib]
name = "schema_bridge"
path = "src/lib.rs"

[dependencies]
capnp = "0.21"
thiserror = "2.0"
7 changes: 7 additions & 0 deletions packages/schema-bridge/NOTICE
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
packages/schema-bridge — Apache-2.0

Lifted from cloister/tools/schema-bridge/ (AGPL-3.0) on 2026-05-18 and
re-licensed under Apache-2.0 by the sole author. No third-party
contributions existed in the cloister copy (verified via git log).

See vault/NOTICE for the precedent of this round-trip pattern.
189 changes: 189 additions & 0 deletions packages/schema-bridge/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,189 @@
# schema-bridge

Capnp + JSON-extension schemas → zod / TS / (future: JSON Schema).
Single source of truth, fail-fast codegen, designed to be extracted to
its own crate once it stabilises.

## Why

cloister had two parallel schema pipelines: capnp→TS for the manifest,
zod→JSON Schema for tool I/O. Adding a third source (the `.cloister.json`
CLI config) would have meant hand-mirroring a capnp struct against a
zod schema, with ADR-0004's append-only / monotonic-ordinal guarantees
dropped on the floor. Bad shape, deferred forever.

schema-bridge is the missing piece: read a capnp schema, lower into a
small intermediate representation (IR), emit every downstream target
from that IR. capnp's own ordinal rules carry through; new fields land
in one place; nothing drifts.

## Self-maintenance invariant

The point of this tool is that it stays correct without anyone
remembering to update it. The mechanism: **any capnp construct without
a complete IR-and-emit mapping is a hard error.**

```text
unmapped capnp construct `list` at node id=aaaa (Foo.items):
add a mapping for `list` in schema-bridge, or open an issue
```

This means the codegen is *intentionally incomplete*, but every gap is
loud. notme's older `capnp-to-ts.ts` (which this tool replaces in
spirit) silently emitted `z.unknown()` for unrecognised constructs;
that's the precise failure mode schema-bridge exists to prevent.

**Today the codegen is opt-in** — `task gen:zod` regenerates
`gen/ts/identity.zod.ts` and `task gen:zod:check-drift`
verifies the committed copy matches. Neither task is wired into
`task lint` or `task verify` yet, so an unmapped capnp construct
won't break CI automatically; it WILL break the moment a developer
runs the regen or drift-check task locally. The plan is to wire
`gen:zod:check-drift` into `task verify` once the schema-bridge
mapping coverage stabilises (tracked separately) — at that point
unmapped constructs become a hard CI failure. No silent fallbacks
regardless.

## What's mapped today

| capnp construct | IR | zod emit |
|----------------------------------------|-----------------------------|-------------------------------------------------|
| `struct` | `Struct { fields, union }` | `z.lazy(() => z.object({…}))` |
| scalar fields | `Scalar(_)` | `z.string()` / `z.number()` / etc. |
| struct refs | `StructRef(name)` | `{Name}Schema` |
| enum refs | `EnumRef(name)` | `{Name}Schema` (where `{Name}Schema = z.enum`) |
| `List(T)` | `List(Box<FieldType>)` | `z.array(T)` (recurses) |
| top-level `enum` | `Enum { name, variants }` | `z.enum([…])` + `type X = "a" \| "b"` |
| `name :union { … }` (group form) | `Struct.union: Some(Union)` | `z.union([z.object({ <variant>: <T> }).strict(), …])` — one strict single-key object per variant |
| Void union variants | `UnionVariant.ty = Void` | `z.object({ <variant>: z.null() }).strict()` inside the union |
| union-only structs (no base fields) | empty `fields`, `Some(union)` | same `z.union([…])` shape (no intersect wrapper) |

Verified end-to-end (run `capnp compile -oschema-bridge:<dir>` against
each):

- `manifest/cluster.capnp` → 136 lines clean zod TS (1 enum, 2 named
unions including all-Void `Wire.transport`)
- `manifest/cloister.capnp` → 246 lines clean zod TS (13 structs,
`Backend.kind` 6-variant union, `Route.kind` 10-variant mostly-Void
union)

| Deliberately unmapped (errors today)| reason |
|-------------------------------------|----------------------------------------------|
| `interface` | RPC types — out of scope for now |
| `const`, `annotation` (top-level) | not used at the schema surfaces we care about |
| `anyPointer` | typed-erasure escape hatch; unmapped |
| generics (`$Foo(T)`) | needs IR generics representation |
| anonymous inline union | unused in cloister; the `name :union {…}` sugar covers all current use|
| non-union group (field namespacing) | unused in cloister |
| group variant inside a union | legal capnp, unused in cloister |
| any annotation on a node/field | including `$Json.flatten`, `$Json.discriminator`, `$Json.name`, `$Json.base64`, `$Json.hex`, `$Json.notification` (ids from `capnp/compat/json.capnp`) — affect JSON encoding and so MUST be handled or fail loudly; cloister capnp files use no annotations today |

Adding any of these is a focused change: extend the IR variant, add
the emit in `outputs/zod.rs`, add one golden test + leave one
fail-case test for the still-unmapped neighbour. The fail-case tests
stay forever as regression guards — they catch a future construct
that silently slips through because it looks "close enough" to
something that IS supported.

## Visibility of known gaps

Every unmapped construct above is paired with two tests:

1. **A regression-guard fail-fast test** — must throw
`UnmappedConstruct`. Stays active forever; catches a future
construct that silently slips through.
2. **An `#[ignore]`'d aspirational stub** (where the emit shape is
already clear) — documents what success will look like. `cargo
test` prints `<name> ... ignored, schema-bridge does not yet …`
on every run, so the gap is visible in CI output without breaking
the build. Activation gesture: remove `#[ignore]`, implement, fill
in the assertions. The paired regression-guard stays.

Today's `#[ignore]`'d stubs (search for them in
`tests/integration.rs`):

- `flat_union_emit_under_json_flatten` — emit when `$Json.flatten`
is on a union field
- `anonymous_inline_union_emits_flat` — emit for
`struct Foo { union { … } }`
- `non_union_group_emits_nested_object` — emit for
`field :group { x; y; }` (field namespacing without discriminator)

Constructs without aspirational stubs (`interface`, generics,
`anyPointer`) are deferred indefinitely — they're non-goals for the
zod-validation surface today, not just "not yet."

## How it runs

```sh
# As a capnp plugin (the supported invocation):
capnp compile \
-o./target/release/capnpc-schema-bridge:./gen \
manifest/cli-config.capnp
```

`capnp compile` invokes the binary with the parsed `CodeGeneratorRequest`
on stdin. The binary writes `<output-dir>/<schema-basename>.zod.ts`
(e.g. `cluster.zod.ts` from `manifest/cluster.capnp`) — zod schemas
plus TS interface declarations in one file. One emit per invocation
today; per-file splitting is on the follow-on list.

For development the library is also drivable directly — see
`tests/integration.rs` for examples of building a `CodeGeneratorRequest`
by hand. That's how the test suite stays hermetic (no capnp CLI
needed in CI).

## Layout

```
packages/schema-bridge/
├── Cargo.toml standalone workspace; depends only on capnp + thiserror
├── README.md this file
├── src/
│ ├── lib.rs public API for tests
│ ├── main.rs capnp plugin entry — stdin → emit → file
│ ├── error.rs SchemaBridgeError + UnmappedConstruct
│ ├── ir/ the intermediate representation
│ ├── inputs/ capnp → IR (future: json-extension/ for aggregation)
│ └── outputs/ IR → zod (future: ts.rs, json_schema.rs)
└── tests/
└── integration.rs golden + fail-case suite
```
Comment thread
jamestexas marked this conversation as resolved.

## Follow-on work

Tracked separately from this initial drop. In rough priority order:

1. Wire into `task manifest` + `task verify` — codegen step alongside
the existing capnp→TS pipeline. Decide whether the output replaces
`src/generated/cluster.ts` or sits beside it as
`src/generated/cluster.zod.ts`.
2. JSON-extension input adapter for the aggregation pattern (capnp
defines the structural backbone, JSON files supply per-variant
field extensions). Where the polymorphism for skill / mcp / agent
actually lands.
3. JSON Schema output adapter (`outputs/json_schema.rs`) — drives the
`$schema` field in `.cloister.json` for editor autocomplete.
4. TS-types-only output adapter, separated from the zod emit, so
consumers can pick one or both.
5. End-to-end fixture tests against `manifest/*.capnp` — currently
verified manually (see README "What's mapped today"); locking that
in as a golden-output test in CI prevents silent regressions.
6. License — this crate ships as Apache-2.0 (see `Cargo.toml`,
source headers, and `NOTICE`). Revisit if extraction to a
standalone repo lands.

## Non-goals (the helm comparison)

The aggregation pattern this tool serves looks superficially like
helm — multiple inputs composing into one output — but the design
explicitly avoids helm's failure modes:

- ❌ No string templating (no `{{ … }}` substitution anywhere)
- ❌ No runtime value substitution
- ❌ No values.yaml-style override layers chained 4-deep
- ✅ All aggregation is at the IR level, statically resolved
- ✅ Output is plain emitted source code, reviewable and diffable

If a feature looks like it might pull this toward helm-shaped
templating, reject the feature.
Loading
Loading