feat(epcis): async capture, private-by-default, partition-aware events #411
Open
Extend POST /api/epcis/capture so callers can target a context graph
(and optional sub-graph) per request instead of being pinned to the
node's epcis.contextGraphId config. Body shape:
{ contextGraphId?, subGraphName?, epcisDocument, publishOptions? }
- contextGraphId is optional: per-request value takes precedence,
with fallback to epcis.contextGraphId then legacy epcis.paranetId.
When neither yields a value, route returns 400 InvalidContent
naming both options instead of the previous 503 plugin-misconfigured
message.
- subGraphName is optional with no fallback (sub-graphs are inherently
per-payload). Validated with validateSubGraphName when present and
threaded into the publisher opts so it reaches agent.publishAsync.
- contextGraphId is validated with validateContextGraphId.
- handleCaptureAsync gains optional contextGraphId/subGraphName on
CaptureRequest; the publisher-facing opts type is split out as
PublisherCaptureOpts (CaptureOptions + subGraphName) so the wire
publishOptions stays unchanged.
Tests cover handler-level override / threading / back-compat and the
full daemon-route fallback chain plus 400s on invalid CG/sub-graph.
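The resolution order described above (per-request value, then `epcis.contextGraphId`, then legacy `epcis.paranetId`, else 400) can be sketched as a small pure function. This is an illustrative sketch, not the route's actual code: `resolveCaptureTarget`, `reject`, and the type names are hypothetical, and the empty-string check stands in for `validateContextGraphId`, whose real rules are not shown in this PR. Only the reserved `_` prefix and the empty/non-string rejections come from the commit message.

```typescript
// Hypothetical sketch of the capture-target resolution chain.
interface EpcisConfig {
  contextGraphId?: string;
  paranetId?: string; // legacy key, lowest precedence
}

interface CaptureBody {
  contextGraphId?: unknown;
  subGraphName?: unknown;
  epcisDocument: unknown;
  publishOptions?: unknown;
}

type CaptureTarget =
  | { ok: true; contextGraphId: string; subGraphName?: string }
  | { ok: false; status: 400; error: "InvalidContent"; message: string };

function reject(message: string): CaptureTarget {
  return { ok: false, status: 400, error: "InvalidContent", message };
}

function resolveCaptureTarget(body: CaptureBody, config: EpcisConfig): CaptureTarget {
  // A non-string per-request value is rejected outright, even if config has a value.
  const rawCg = body.contextGraphId;
  let requested: string | undefined;
  if (typeof rawCg === "string") {
    requested = rawCg;
  } else if (rawCg !== undefined) {
    return reject("contextGraphId must be a string");
  }

  // Per-request value wins, then config, then the legacy key.
  const contextGraphId = requested ?? config.contextGraphId ?? config.paranetId;
  if (contextGraphId === undefined) {
    return reject(
      "contextGraphId is required: pass it in the request body or set epcis.contextGraphId",
    );
  }
  // Placeholder syntax rule standing in for validateContextGraphId.
  if (contextGraphId.trim() === "") {
    return reject("contextGraphId must not be empty");
  }

  // subGraphName has no fallback: it is validated only when present.
  const rawSub = body.subGraphName;
  if (rawSub === undefined) {
    return { ok: true, contextGraphId };
  }
  if (typeof rawSub !== "string" || rawSub === "") {
    return reject("subGraphName must be a non-empty string");
  }
  if (rawSub.startsWith("_")) {
    return reject('subGraphName must not start with the reserved "_" prefix');
  }
  return { ok: true, contextGraphId, subGraphName: rawSub };
}
```

Returning a tagged result rather than throwing keeps the 400 payload shape in one place, which is one way to make the route and handler tests assert against the same object.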
Exercises the new POST /api/epcis/capture wire fields against a local
devnet node:
1. missing contextGraphId everywhere → 400 InvalidContent
2. invalid contextGraphId syntax → 400 with validator reason
3. invalid subGraphName (reserved "_" prefix) → 400 with reason
4. empty subGraphName → 400
5. non-string contextGraphId → 400
6. subGraphName threads to publisher (unregistered sub-graph
surfaces as 503 EnqueueFailed naming the sub-graph) — proves
route → handler → agent.publishAsync opts wiring end-to-end
7. valid per-request contextGraphId only → 202 + captureID
Idempotent against any running devnet (defaults to node 1 on :9201,
auth token from .devnet/node1/auth.token). 16/16 assertions green on
the slice-02 worktree.
Captures the domain language used across the EPCIS feature work: EPCIS Document, Capture, Capture ID, Context Graph, Shared Working Memory, Finalized partition, Private partition, Privacy envelope. Useful for any agent picking up an EPCIS slice in a fresh session.
Mirrors the slice-02 capture pattern on the GET /api/epcis/events route.
The route now reads `contextGraphId` and `subGraphName` from the query
string, validates each with the same helpers used by the capture path,
and falls back to `config.epcis.contextGraphId ?? config.epcis.paranetId`
only for `contextGraphId`. Both fail with the canonical
`{ "error": "InvalidContent", "message": ... }` shape when invalid or
missing.
`EventsQueryConfig` gains an optional `subGraphName` so the handler can
thread the resolved value into `buildEpcisQuery` without repurposing the
URLSearchParams shape.
Slice-03 follow-up: the anchor⇄payload join in the `finalized=true`
branch used `FILTER(?event = ?root)` across two GRAPH clauses, which
returns zero rows on the live triplestore even when both subjects are
byte-equal. Replaced with a shared `?event` variable across both graphs
(SPARQL bind-by-name), which is what makes the live devnet block in
docs/epcis/devnet-s4-e2e-2026-05-05.md actually return events.
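To make the join change concrete, the two query shapes can be compared side by side. The sketch below is a reconstruction under assumptions: the function names are hypothetical, the `dkg:` namespace IRI is a placeholder (the real prefix is not given in this PR), and the "old" shape is inferred from the `FILTER(?event = ?root)` description rather than copied from the builder.

```typescript
// Placeholder namespace: the real dkg: prefix IRI is an assumption here.
const DKG_PREFIX = "PREFIX dkg: <http://example.org/dkg#>";

// Old shape (reconstructed): anchor and payload use different variables and
// are reconciled afterwards with FILTER, which returned zero rows on the live store.
function joinWithFilter(anchorGraph: string, payloadGraph: string): string {
  return [
    DKG_PREFIX,
    "SELECT ?event ?p ?o WHERE {",
    `  GRAPH <${anchorGraph}> { ?root dkg:privateDataAnchor "true" }`,
    `  GRAPH <${payloadGraph}> { ?event ?p ?o }`,
    "  FILTER(?event = ?root)",
    "}",
  ].join("\n");
}

// New shape: the same ?event variable appears in both GRAPH clauses, so the
// join happens by variable binding rather than by a post-hoc comparison.
function joinWithSharedVariable(anchorGraph: string, payloadGraph: string): string {
  return [
    DKG_PREFIX,
    "SELECT ?event ?p ?o WHERE {",
    `  GRAPH <${anchorGraph}> { ?event dkg:privateDataAnchor "true" }`,
    `  GRAPH <${payloadGraph}> { ?event ?p ?o }`,
    "}",
  ].join("\n");
}
```

The shared-variable form also doubles as the orphan guard: a payload subject with no anchor triple in the chosen public graph simply never binds.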
Tests:
- handler-level: subGraphName reaches the SPARQL builder for both
canonical and SWM partitions; root partition stays root when
subGraphName is omitted; date-range validation regressions.
- route-level: per-request CG overrides config; subGraphName picks
the right graph URIs in emitted SPARQL; legacy paranetId fallback;
400 surface for missing/invalid CG, invalid subGraphName, no
agent.query call when validation fails.
- query-builder unit: orphan exclusion now pinned to the
`?event dkg:privateDataAnchor "true"` shape (no FILTER).
`scripts/slice-04-e2e.sh` drives the GET /api/epcis/events route
end-to-end on a 6-node devnet:
- Per-request `contextGraphId` carries through to canonical-partition
SPARQL and surfaces the captured event with full private payload
(eventTime, bizStep, epcList, eventType).
- Per-request CG isolation: querying a different CG returns nothing.
- Per-request `subGraphName` routes to <cg>/<sub>; root-graph queries
do not bleed sub-graph events.
- Privacy: an unauthorised observer node sees the public anchor but
the `/_private` payload stays absent and the EPCIS query for that
event surfaces nothing on that node.
- 400 surface: invalid `contextGraphId` and reserved-prefix
`subGraphName` over the live route, mirroring the unit-level
validation symmetry with slice-02 capture.
`docs/epcis/devnet-s4-e2e-2026-05-05.md` records the run (36 / 36
passed) and the pre-existing devnet limitations the slice surfaced
but does not own — publisher-wallet authority not on the on-chain CG
publish list, SWM anchor↔private-payload subject drift, and the
authorised-peer sync gating on chain finalization. None of those
block the slice's stated criteria.
Per-request contextGraphId + subGraphName on GET /api/epcis/events. SPARQL builder fix bundled in: anchor-payload join now uses a shared ?event variable across both GRAPH clauses (the prior FILTER(?event = ?root) didn't bind on Oxigraph). Devnet e2e covers the route, private partition merge, and non-allowed-node leak check. Outstanding (follow-up): SWM anchor subject (urn:uuid:<eventID>) does not match _private payload subject (dkg:...async-publish:...), so ?finalized=false returns empty on live data. Tracked in a separate diagnostic task.
The async-lift validation step rewrote root subjects to a synthetic `dkg:<cg>:<ns>:<scope>/<tail>-<hash>` form for both public and private quads. The SWM anchor was committed earlier in `agent.publishAsync` under the source IRI (e.g. `urn:uuid:<eventID>`) and never went through that rewrite, so the anchor in `<cg>/_shared_memory` and the payload in `<cg>/_private` ended up under different subjects. The slice-04 EPCIS query joins anchor and payload by subject, so `?finalized=false` returned empty whenever a private event was captured via the async-lift path.

`canonicalRootIri` is now identity. The lift's `canonicalRootMap` becomes a self-map, `canonicalizeQuads` is a no-op, and SWM, canonical CG data graph, and `<cg>/_private` all agree on the source root IRI for the same logical event. The `assertNoCanonicalRootCollisions` guard still works under identity (distinct sources stay distinct).

Test updates:
- `async-lift-validation.test.ts`: renamed and rewritten to assert identity behavior; deleted the sha256-based canonical-form helper.
- `async-lift-publisher.test.ts`: regression guard added on the end-to-end `processNext` test: `canonicalRootMap['urn:local:/rihana']` must be `'urn:local:/rihana'`, and the SWM anchor in `<cg>/_shared_memory` must use the same source IRI as the `<cg>/_private` payload. The two finalized-state-already-published tests had to flip share/publish ordering to avoid the SWM Rule 4 collision they previously avoided by relying on canonical-form divergence.
Single-node single-scenario probe that captures a private bare EPCIS doc on N1, asserts the SWM anchor and `<cg>/_private` payload share the source root IRI (no `dkg:<cg>:async-publish:…` leak), and verifies `GET /api/epcis/events?finalized=false` returns the event with full payload (`eventTime`, `bizStep`, `epcList`, `eventType`). Includes a `?finalized=true` regression guard for slice 04. Result on the slice/03b branch: 13 passed / 0 failed against `devnet-test` on the standard 6-node devnet topology. Verification appended to `docs/epcis/devnet-s4-e2e-2026-05-05.md` retiring caveat #2 (the SWM-anchor↔`/_private` subject mismatch); caveats #1 and #3 remain pre-existing devnet limitations outside this slice's scope.
Fix the SWM anchor / private payload IRI mismatch surfaced by slice/04's devnet e2e. The lift validator was rewriting public+private quad subjects to a synthesized canonical IRI (dkg:<cg>:<ns>:<scope>/<tail>-<hash>) before broadcast and private-staging promotion, while the agent's publishAsync already committed the SWM anchor under the source IRI (urn:uuid:<eventID>). Reader (events query) joined on shared subject IRI, so the mismatch produced zero rows for ?finalized=false. Fix (option A from spec): canonicalRootIri is now identity. SWM anchor, canonical <cg> data graph, and <cg>/_private all agree on the source IRI for the same logical event. Removed unreachable canonical-form helpers. End-to-end devnet probe (scripts/slice-03b-finalized-false-probe.sh) — 13/13 PASS, including ?finalized=false returning the captured event with full payload.
New `dkg epcis` subcommand tree wraps the daemon's /api/epcis/*
contract:
- `capture <document>`: reads either a raw EPCIS 2.0 JSON-LD doc or
an envelope (`{ epcisDocument, publishOptions, contextGraphId,
subGraphName }`), threads CLI flags through (`--context-graph-id`,
`--sub-graph-name`, `--access-policy`, repeated `--allowed-peer`),
POSTs to /api/epcis/capture, prints the 202 body.
- `status <captureID>`: GETs /api/epcis/capture/:id and prints the
job state JSON.
- `query [...flags]`: GETs /api/epcis/events with a query string built
from filter flags. Without `--all`, prints the first page plus
`nextPageUrl` (parsed from `Link: rel="next"`) so callers can step
manually. With `--all`, follows the next-page links and merges every
page's `eventList` into the first page's response.
HTTP statuses map to the documented exit-code table:
- 2xx → 0; 503 → 3 (publisher unavailable); 404 → 4 (not found);
other 4xx → 2 (client error); everything else → 1.
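The exit-code table above reduces to a short ordered check. The function name `exitCodeForEpcisHttpStatus` appears elsewhere in this PR, but the body below is a sketch written from the table, not the CLI's actual implementation.

```typescript
// Sketch of the documented mapping; the order of checks matters because
// 404 must be tested before the generic 4xx branch.
function exitCodeForEpcisHttpStatus(status: number): number {
  if (status >= 200 && status < 300) return 0; // success
  if (status === 503) return 3; // publisher unavailable
  if (status === 404) return 4; // not found
  if (status >= 400 && status < 500) return 2; // other client errors
  return 1; // everything else (other 5xx, 3xx, unexpected values)
}
```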
ApiClient gains `captureEpcis`, `getEpcisCapture`, `queryEpcisEvents`,
and `queryEpcisEventsByPath`. The query helpers surface the parsed
`nextPageUrl` so the `--all` walk doesn't re-parse Link headers.
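A minimal sketch of the pagination walk, assuming a simplified page shape: `parseNextPageUrl`, `collectAllPages`, and the `Page` type are hypothetical names, and the flat `body.eventList` stands in for wherever the list actually sits inside the `EPCISQueryDocument`. The parser handles the common `<url>; rel="next"` form only; it splits on commas, so a target URL containing a comma would need a stricter parser.

```typescript
// Extract the rel="next" target from a Link header, if any.
function parseNextPageUrl(linkHeader: string | null): string | undefined {
  if (!linkHeader) return undefined;
  for (const part of linkHeader.split(",")) {
    const match = part.trim().match(/^<([^>]*)>\s*;(.*)$/);
    if (!match) continue;
    const url = match[1];
    const params = match[2] ?? "";
    if (/\brel\s*=\s*"?next\b/i.test(params)) return url;
  }
  return undefined;
}

// Simplified page shape (assumption): the real response nests eventList deeper.
interface Page {
  body: { eventList: unknown[] };
  nextPageUrl?: string;
}

// Follow next-page links and merge every page's eventList into a copy of the
// first page's body. fetchPage is injected so the walk can be tested offline.
async function collectAllPages(
  first: Page,
  fetchPage: (url: string) => Promise<Page>,
): Promise<Page["body"]> {
  const merged = { ...first.body, eventList: [...first.body.eventList] };
  let next = first.nextPageUrl;
  while (next) {
    const page = await fetchPage(next);
    merged.eventList.push(...page.body.eventList);
    next = page.nextPageUrl;
  }
  return merged;
}
```

Carrying the already-parsed `nextPageUrl` on each page is what lets the walk avoid touching headers again, matching the intent described above.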
Tests: 13 ApiClient unit tests (mocked fetch) + 19 CLI smoke tests
(spawn the compiled CLI against an in-process http stub) covering
flag→body translation, exit-code mapping, --all pagination, envelope
parsing, and CLI-flag-overrides-envelope precedence.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `scripts/slice-05-cli-e2e.sh`: a 13-step probe that exercises
`dkg epcis {capture,status,query}` end-to-end against a 6-node devnet,
including the privacy contract (allow-list capture on N1, query on N2
allowed peer, query + SPARQL probe on N3 unauthorised observer).
Verified on a freshly-booted local devnet (publishers enabled): 20/20
PASS. Full results, per-step assertions, and the pre-existing devnet
limitations encountered are written up in
`docs/epcis/devnet-cli-e2e-2026-05-05.md`. The doc cross-references
slice-04's e2e doc for caveats #1 and #3, both of which apply here
unchanged — capture terminates in `failed` rather than `finalized`
because the publisher wallet has no on-chain CG-publish authority on
this devnet, and authorised-peer private sync to N2 needs that same
finalization to fire. Privacy is verified positively on N3 via the
public anchor + empty `_private` ASK probe instead.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dkg epcis {capture,status,query} CLI subcommands wrapping the HTTP API,
plus ApiClient additions (captureEpcis, getEpcisCapture, queryEpcisEvents,
queryEpcisEventsByPath). --all follows Link: rel='next' pagination and
merges all pages into a single response. Exit codes mapped per spec
(0/1/2/3/4) via exitCodeForEpcisHttpStatus + reportEpcisError.
13 unit tests on the ApiClient additions; 19 spawn tests on the CLI
subcommands covering happy path, flag-to-body translation, envelope
parsing, --all pagination, and the full exit-code matrix.
Live devnet e2e (scripts/slice-05-cli-e2e.sh) — 20/20 PASS, with
documented caveats (capture state ends in failed because the publisher
wallet is not on the bootstrap CG auth list — closed by slice/06's
curated-CG setup with explicit publisher-wallet registration).
scripts/epcis-smoke-test.sh boots a 6-node devnet (or reuses one running) and runs eleven scenarios that empirically verify the privacy + on-chain authorization contract end-to-end:
- N1 = publisher / curator (sole on-chain authorized publisher in EOA-curated mode)
- N2 = allowed peer (recipient of allow-list private payload sync)
- N3 = unauthorized observer (subscribed to public partition only; publish attempts must be rejected)

Setup creates a curated context graph `<N1.agentAddr>/epcis-test` with `accessPolicy: 1, allowedAgents: [N1, N2]` and registers it on-chain. Pre-flight verifies the on-chain auth list before scenarios run:
- `getPublishPolicy(cgId).policy == 0` (curated)
- `getPublishPolicy(cgId).authority == N1.publisherWallet`
- `isAuthorizedPublisher(cgId, N1) == true`
- `isAuthorizedPublisher(cgId, N3) == false`

Per-scenario PASS/FAIL with diagnostics goes to stdout and to `docs/epcis/devnet-results-<YYYY-MM-DD>.md`. Script exits 0 only when all scenarios pass; on failure it leaves the devnet running and preserves the test artifacts under /tmp for inspection.

Idempotent: re-runs against an existing devnet by detecting the CG via `/api/context-graph/list` before attempting create.

Empirical findings recorded in the report (these are observations about the integration branch, not regressions introduced by the smoke test):
1. Allow-list payload auto-pull is unimplemented (scenario 8 is informational, mirroring slice-04 caveat #3): the receiver-side `AccessClient.requestAccess` flow is not auto-triggered when an event arrives with `allowedPeers` containing the receiver's peerId. Privacy on N3 is still positively verified (5, 6, 9, 10).
2. Curator mode is EOA only (CLI does not expose PCA). In EOA mode `participantAgents` is metadata for CG-level sync gating; it does not grant on-chain publish rights. Only the single `storedAuthority` (N1) is on-chain authorized.
3. Scenario 11 is satisfied by the network-layer gate, not the chain gate: the curator denies N3's CG-meta sync request, so `/api/epcis/capture` 404s before any chain interaction. The chain gate is independently verified at preflight. Both gates fire as designed; the script accepts whichever is observed.

Verification (live):
    ./scripts/devnet.sh clean
    DEVNET_ENABLE_PUBLISHER=1 ./scripts/devnet.sh start 6
    ./scripts/epcis-smoke-test.sh
    # → 11 passed (incl. 1 informational) / 0 failed

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Multi-node devnet privacy + auth-gate smoke test (scripts/epcis-smoke-test.sh). Curated CG setup with explicit publisher-wallet authorization, on-chain auth preflight, 11 scenarios covering capture, finalization, partition queries (?finalized=true|false), private partition merge, non-allowed-node leak check, and the chain auth gate. Idempotent; rerunnable on a live devnet.

Result on a fresh devnet boot: 11/11 PASS (1 informational).

Notable caveats (tracked for follow-up, not blocking):
- Scenario 8 (allow-list payload visible on N2) demoted to informational: receiver-side auto-pull is missing in the publisher access machinery. Tracked in #409.
- N1+N2 simultaneous chain auth not verifiable in EOA curator mode (CLI exposes EOA only; PCA mode would allow it). N1=true / N3=false on-chain is verified; N2's role is exercised at the P2P CG-meta-sync layer.
- Privacy gate ends up double-layered: chain (preflight verified separately) + network (curator denies N3 CG-meta sync, so N3 has no local CG view). Stronger than the spec asked for; both gates fire as designed.
Post-rename cleanup. v10 renamed paranet → context graph, but a few legacy refs survived in the EPCIS surface:
- drop the `epcis.paranetId` config back-compat fallback in both the capture and events-query routes; users must now configure `epcis.contextGraphId` (or pass it per-request).
- drop the deprecated `paranetId?` field from the `epcis?` config type.
- delete the two route tests asserting the legacy paranetId fallback.
- rename the `'test-paranet'` test fixture string to `'test-cg'` across events-query, handlers, query-builder tests.

Out of scope: the broader `config.paranets[]` subscription field (used by daemon /status response and cli config writers) and ~hundreds of paranet refs in dkg-agent / ccl-* / other route files. Those need a separate cross-cutting cleanup PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Revert slice/03b's identity passthrough in async-lift-validation.ts so roots are once again rewritten to `dkg:<cg>:<ns>:<scope>/<name>-<hash>` form by the lift validator (the colleague's original design). Adapt the canonicalization to the SWM partition by stamping a matching `<canonical> dkg:privateDataAnchor "true"` triple into `<cg>/_shared_memory` from inside the lift's `lift()` method, gated on private staging being present for that root. EPCIS partition-aware queries can now JOIN the public anchor in SWM with the canonical payload that lands in `<cg>/_private` once the chain publish completes, which fixes `?finalized=false` returning empty for private captures.

Other changes:
- async-lift-validation.ts is byte-for-byte the colleague's original apart from one keyword (`function` -> `export function` on canonicalRootIri), so the publisher impl can reuse it without a duplicate copy.
- The async-lift validation/publisher tests are restored to their pre-slice/03b assertions (synthesized canonical IRIs in SWM and `<cg>/_private`).
- slice-03b probe rewritten to match the canonical-restored behavior: asserts the canonical anchor lands in SWM, the canonical payload lands in `<cg>/_private`, and the EPCIS join surfaces the event for both `?finalized=false` and `?finalized=true`.

Verified: 86/86 publisher unit tests pass; devnet probe 11/11 pass; slice-04 multi-node e2e 36/36 pass; slice-05 CLI e2e 20/20 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Rewires `POST /api/epcis/capture` and `GET /api/epcis/events` to be private-by-default, per-request context graph, partition-aware, async-only. Bare EPCIS documents now publish into the private partition by default; only a `dkg:privateDataAnchor "true"` triple per root entity is left in the public partition. Captures and queries accept `contextGraphId` + `subGraphName` per request (with a config fallback for single-CG deployments). The events query exposes a `?finalized=true|false` boolean to switch between the canonical (durable) and shared-working-memory (in-flight) public partitions; in both modes the response merges the matching private partition for events the local node can see.

A new CLI tree `dkg epcis {capture,status,query}` wraps the HTTP API. A multi-node devnet smoke-test (`scripts/epcis-smoke-test.sh`) verifies the full privacy and authorization contract end-to-end.

Why private-by-default
EPCIS supply-chain events carry commercially sensitive data: bizSteps reveal operational stages, EPC lists reveal materials in transit, partner identifiers live in `bizTransactionList`. The previous async capture path wrapped a bare document as `{ public: doc }` before publishing, forcing every captured event to be visible to all CG subscribers. Operators capturing real supply-chain data had to choose between leaking everything or skipping EPCIS.

Bare document → private partition is the smallest, hardest-to-misuse default that preserves discoverability (anchors leak) while protecting payload (full triples stay local). Callers who want a fully public capture send `{ public: <doc> }`; hybrid captures send `{ public, private }`. The agent layer's existing `defaultVisibility: 'private'` is now honored end-to-end.

Alternatives considered:
- `{ public, private }` always). Rejected: hostile to the public-only and private-only common cases; breaks every existing client that posts bare EPCIS documents.
- `private=true` query param. Rejected: out-of-band metadata for what is fundamentally a payload-shape decision.

Wire contract
Capture
Bare documents wrap as `{ private: doc }` internally. The validator runs against the document the publisher will actually publish (bare doc, envelope's `public`, or envelope's `private`, whichever is present).

Status
`state` reflects the lift queue's machine: `accepted | claimed | validated | broadcast | included | finalized | failed`.

Events query
Response stays a flat GS1 EPCIS 2.0-conformant `EPCISQueryDocument`. The SPARQL builder unions:
- `finalized=true`, SWM when `finalized=false`).
- `dkg:privateDataAnchor "true"` in the chosen public partition AND the matching event's payload in `<cg>/_private`.

Orphan private payloads (no matching anchor in the chosen public partition) are excluded as an integrity guard.
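The union and orphan-exclusion semantics can be modelled in memory, independent of SPARQL. This is an illustrative model only: `mergePartitions` and `EventRow` are hypothetical names, and the real builder expresses the same rule as a graph-pattern join rather than a filter over arrays.

```typescript
// One event as seen in a partition: its subject IRI plus whatever fields are visible.
interface EventRow {
  subject: string;
  payload: Record<string, unknown>;
}

// publicEvents: events fully present in the chosen public partition.
// anchors: subjects carrying dkg:privateDataAnchor "true" in that same partition.
// privatePayloads: rows found in <cg>/_private on the local node.
function mergePartitions(
  publicEvents: EventRow[],
  anchors: Set<string>,
  privatePayloads: EventRow[],
): EventRow[] {
  // Orphan exclusion: a private payload is returned only when its subject
  // has a matching anchor in the chosen public partition.
  const joined = privatePayloads.filter((row) => anchors.has(row.subject));
  return [...publicEvents, ...joined];
}
```

For example, with an anchor for `urn:uuid:a` only, a private payload under `urn:uuid:b` is dropped from the response even though it exists locally.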
CLI
Exit codes: 0 success, 1 unexpected error, 2 client error (4xx), 3 service unavailable (503), 4 not found (404).
Slices
This branch is the integration of seven slices. Each was developed on its own slice branch, devnet-verified where applicable, and merged here.
- `{ private: doc }`; sync `handleCapture` removed
- `contextGraphId` + `subGraphName`
- `contextGraphId`
- `?finalized` partition selector + private merge
- `contextGraphId` + `subGraphName`
- `_private` payload disagreed and never joined. Fix: keep source IRI through validation.
- `dkg epcis {capture,status,query}` subcommands
- `--all` follows `Link: rel="next"`; full exit-code matrix

Testing
Three layers of verification, each addressing a different failure mode.

Unit (per-package)
- `partition` × `subGraphName` × representative filter sets.
- `{ private }` wrap; envelope passthrough; envelope rejection.

Handler / route (in-process HTTP stub)
- `{ private }`; envelope passthrough; access-policy + allowed-peers propagation; per-request CG + fallback; CG/subgraph validation.
- `?finalized=true|false` selects the right public graph URI; private payload merging; orphan exclusion; subgraph variant; back-compat for no-`contextGraphId` callers.
- `--all` pagination, full exit-code matrix.

Suite results on this branch:
- `pnpm -F @origintrail-official/dkg-epcis test`: 142/142 pass
- `pnpm -F @origintrail-official/dkg-cli test` (epcis subset): 49/49 pass
- `npx tsc --noEmit`: clean across `packages/{epcis,publisher,cli,agent}`

Devnet end-to-end (multi-node)
`scripts/epcis-smoke-test.sh` boots a 6-node devnet and runs an 11-scenario verification on top of a freshly created curated context graph with the publisher wallet explicitly registered as authorized. Scenarios cover:
`202 + captureID`.
`finalized` (proves wallet auth + on-chain publish lifecycle).
3-4. Events query in both `?finalized=` modes returns the captured event with full private payload populated.
5-6. Non-allowed observer node (N3) sees an empty event list and a verifiably-empty `<cg>/_private` partition (privacy contract honored at the API and at raw SPARQL).
7-8. Allow-list capture (informational; see Known gaps).

Latest devnet result on this branch: 11/11 PASS (1 informational). Report committed at `docs/epcis/devnet-results-2026-05-05.md`. Per-slice live e2e reports also committed under `docs/epcis/`.

Known gaps tracked separately
- #409 (EPCIS: allow-list private payload propagation has no receiver-side pull). The publisher correctly authorizes peers in `allowedPeers` (server-side `AccessHandler` enforces it), but no code on the receiver's side ever calls `AccessClient.requestAccess`. The PRD's "allowed peers receive private payload" promise is structurally incomplete in this branch. The privacy gate is verifiably correct (non-allowed peers cannot fetch); the delivery requires a follow-up. Scenario 8 in the smoke test is therefore informational.
- Three pre-existing `publishJsonLd` agent tests assert that `agent.publishAsync` populates `<cg>/_private` immediately on return; in the actual model `publishAsync` writes to staging only, and the lift runner moves to `<cg>/_private` later. The contract is exercised live by slice/03b's devnet probe and by slice/06's smoke test. The stale unit tests should be either rewritten to drive a real lift runner or moved to integration tests in a separate cleanup.

Out of scope
- `dkg:privateDataAnchor`, which already existed.

Test plan
- `pnpm install && pnpm typecheck` clean
- `pnpm -F @origintrail-official/dkg-epcis test` passes
- `pnpm -F @origintrail-official/dkg-cli test` passes (epcis subset green; pre-existing PROD-BUG audit failures are unrelated)
- `./scripts/devnet.sh start && ./scripts/epcis-smoke-test.sh` exits 0 (11/11 PASS)
- `dkg epcis capture <doc> --context-graph-id <cg>` and `dkg epcis query --context-graph-id <cg>` smoke-tested manually against the devnet

🤖 Generated with Claude Code