feat: source-file linkage triples + devnet hardening (chain onto PR #120) #121

Merged
Jurij89 merged 2 commits into test/devnet-e2e-sections-18-24 from feat/source-file-linkage-triples
Apr 11, 2026

Jurij89 commented Apr 11, 2026

Summary

Chain PR onto test/devnet-e2e-sections-18-24 (the base for PR #120). Two changes bundled:

  1. Phase B: Source-file linkage triples. Implements 19_MARKDOWN_CONTENT_TYPE.md §10.1 and §10.2, the full 20-row source-file linkage contract that was missing from PR #113 (feat: wire import-file endpoint and Phase 2 markdown extraction (#77, #79, #80)). After a daemon restart, the assertion graph can now rediscover the original file blob by querying _meta via SPARQL for dkg:sourceFileHash. Previously the only linkage was the in-memory extractionStatus map, which vanished on restart. Hash format is keccak256 per 03_PROTOCOL_CORE.md §2.1:658.

  2. Phase D: Devnet test hardening. Hardens scripts/devnet-test.sh sections 18-24 (introduced in PR #120, test: add devnet e2e sections 18-24 covering V10 feature gaps) against 4 P0 + 11 P1 + 5 of 7 P2 findings from a review pass. §21 now actually issues SPARQL queries for the new linkage predicates, §23a no longer silently masks auth regressions under DEVNET_NO_AUTH=1, §23c/§23g no longer false-pass on 500s, §18a no longer accepts pre-catchup idle as success, and a hung node can no longer stall CI for ~40 min because of a missing --max-time.

Companion spec PR

Three spec cleanups surfaced during implementation — see OriginTrail/dkgv10-spec#86:

  • Add dkg:mdIntermediateHash row to 19_MARKDOWN_CONTENT_TYPE.md §10.2 _meta layout
  • Fix dkg:rootEntity literal→IRI typo at §10.2:601
  • Add normative guidance on fileUri URN shape (urn:dkg:file:keccak256:<hex>)

The spec PR is independent and can merge in either order.
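
For illustration, the fileUri shape named in the third bullet could be minted and checked roughly as below. This is a minimal sketch, not the daemon's code: `toFileUri` and `isFileUri` are hypothetical names, and the lowercase-hex and 64-character requirements are assumptions based on keccak256 producing a 32-byte digest.

```typescript
// Hypothetical helpers; names are illustrative and not taken from the codebase.
const FILE_URN_PREFIX = "urn:dkg:file:keccak256:";

// Assumption: the hex part is lowercase and 64 characters (a 32-byte digest).
const FILE_URN_PATTERN = /^urn:dkg:file:keccak256:[0-9a-f]{64}$/;

/** Builds a file URN from a keccak256 digest given as hex, with or without "0x". */
function toFileUri(keccakHex: string): string {
  const hex = keccakHex.replace(/^0x/, "").toLowerCase();
  if (!/^[0-9a-f]{64}$/.test(hex)) {
    throw new Error(`expected 64 hex characters, got: ${keccakHex}`);
  }
  return FILE_URN_PREFIX + hex;
}

/** True when the value matches the assumed URN shape. */
function isFileUri(value: string): boolean {
  return FILE_URN_PATTERN.test(value);
}
```

Stripping a leading "0x" matters in practice because ethers.keccak256 returns a 0x-prefixed hex string.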

Test plan

  • packages/cli/test/extraction-markdown.test.ts — 41/41 passing (was 36)
  • packages/cli/test/import-file-integration.test.ts — 31/31 passing (was 25)
  • packages/cli/test/multipart.test.ts — 25/25 passing (unchanged)
  • packages/core full suite — 415/415 passing (unchanged)
  • bash -n scripts/devnet-test.sh — syntax OK
  • Linux CI — gating signal
  • Live devnet smoke run — to be verified by reviewer / after merge

Pre-existing Windows-hostility failures (slot-helpers, migration, rollback, auto-update, blue-green, publisher-wallets, publisher-cli-smoke, install-script, indexer) are unrelated to this PR — none of the modified files have failures.

Follow-up: GET /api/file/:hash endpoint

This PR establishes the in-graph linkage (<assertionUal> dkg:sourceFileHash "keccak256:<hex>" in CG root _meta) that lets a SPARQL client rediscover the source file hash after daemon restart. Actually retrieving the file bytes over HTTP is deferred — the daemon currently has no GET /api/file/:hash route, so the round-trip is only exercisable in-process via FileStore.get() (which import-file-integration.test.ts verifies). Exposing the file store over HTTP requires a separate design for access control semantics (private CGs per 19_MARKDOWN_CONTENT_TYPE.md §4.1), content-type preservation, and spec-side language. Tracked as a follow-up.
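
As a rough sketch of what a SPARQL client would send to recover the hash after a restart: the helper below builds the query text only. `buildSourceFileHashQuery` and `iriRef` are hypothetical names, and the expansion of `dkg:` to `http://dkg.io/ontology/` is an assumption inferred from the rootEntity predicate IRI quoted in the review threads.

```typescript
// Assumption: dkg: expands to http://dkg.io/ontology/ (inferred, not confirmed here).
const DKG = "http://dkg.io/ontology/";

/** Wraps an IRI in <...>, rejecting characters SPARQL forbids inside an IRI reference. */
function iriRef(iri: string): string {
  if (/[<>"{}|^`\\\s]/.test(iri)) {
    throw new Error(`unsafe IRI: ${iri}`);
  }
  return `<${iri}>`;
}

/** Builds a SELECT that reads dkg:sourceFileHash for one assertion from the CG root _meta graph. */
function buildSourceFileHashQuery(assertionUal: string, metaGraphUri: string): string {
  return [
    "SELECT ?hash WHERE {",
    `  GRAPH ${iriRef(metaGraphUri)} {`,
    `    ${iriRef(assertionUal)} ${iriRef(DKG + "sourceFileHash")} ?hash .`,
    "  }",
    "}",
  ].join("\n");
}
```

The returned `?hash` binding would be the `"keccak256:<hex>"` literal described above; turning it back into bytes still needs FileStore.get() until an HTTP route exists.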

Closes / references

Not closing any issues — this is a chain PR onto the PR #120 base, and the linkage work was surfaced post-merge on PR #113 rather than from a dedicated issue.

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com

Comment thread packages/cli/src/daemon.ts Outdated
Comment thread packages/cli/src/daemon.ts Outdated
Comment thread packages/cli/src/extraction/markdown-extractor.ts Outdated
Comment thread scripts/devnet-test.sh Outdated
Jurij89 force-pushed the feat/source-file-linkage-triples branch from cac9544 to a90d433 on April 11, 2026 12:22
Comment thread packages/cli/src/daemon.ts
Comment thread scripts/devnet-test.sh Outdated
Jurij89 force-pushed the feat/source-file-linkage-triples branch from a90d433 to 9a6772b on April 11, 2026 12:39
Comment thread packages/cli/src/daemon.ts Outdated
Comment thread packages/cli/src/daemon.ts
Comment thread packages/cli/src/file-store.ts Outdated
Jurij89 force-pushed the feat/source-file-linkage-triples branch from 9a6772b to aecb6b8 on April 11, 2026 13:12
agentDid: `did:dkg:agent:${agent.peerId}`,
agentDid,
ontologyRef,
documentIri: assertionUri,

🔴 Bug: pinning documentIri to assertionUri means the new rootEntity override only changes rows 3/14 metadata. assertion.promote() partitions by quad subject, so a document with rootEntity: urn:parent will still promote under assertionUri while _meta says the root is urn:parent. Either rewrite the extracted subjects under the resolved root, or keep the metadata aligned with the actual promoted root.

Contributor Author

Deferred to a follow-up PR. Filed as #122.

The issue is real but the fix is architectural — the current behavior pins documentIri: assertionUri in the import-file route, which means all structural content triples (frontmatter properties, H1→schema:name, wikilinks, hashtags, dataview fields, section headings) emit with the assertion UAL as their subject, regardless of whether frontmatter has rootEntity set. That makes _meta row 14 an annotation rather than a content-partition statement.

The three options discussed in the follow-up issue are:

  1. 10A — extractor rewrites content-triple subjects to resolvedRootEntity. Has semantic hazards: if two assertions both claim rootEntity: urn:note:parent, they'd both write content triples on <urn:note:parent>, re-introducing the cross-assertion contention problem the Bug 8 promote-filter fix just solved.
  2. 10B — remove the rootEntity override feature entirely. Drops a spec-defined feature (§19.10.1:508).
  3. 10C — document the current behavior as a metadata HINT, not retargeting. Zero code, just a spec clarification that row 14 is an annotation.

The Round 4 consensus is that 10C is the smallest defensible path, but it needs spec-engineer and broader architectural discussion before committing. Not blocking for PR #121 — users who hit the mismatch can work around it by leaving rootEntity unset (the reflexive default is consistent) or by accepting that the _meta annotation is informational.

Tracking in #122. Not resolving this thread — leaving it open as a deferred marker.
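
To make the mismatch concrete, here is a toy model of subject-based partitioning. It is a sketch only: `partitionBySubject` is a hypothetical stand-in and not the real autoPartition(), and the example IRIs are invented.

```typescript
type Quad = { subject: string; predicate: string; object: string };

const DKG_ROOT_ENTITY = "http://dkg.io/ontology/rootEntity";

/** Toy stand-in for subject-based partitioning; NOT the real autoPartition(). */
function partitionBySubject(quads: Quad[]): Map<string, Quad[]> {
  const parts = new Map<string, Quad[]>();
  for (const q of quads) {
    const bucket = parts.get(q.subject) ?? [];
    bucket.push(q);
    parts.set(q.subject, bucket);
  }
  return parts;
}

/** Returns roots declared via dkg:rootEntity that never appear as a partition key. */
function findRootMismatches(quads: Quad[]): string[] {
  const partitionKeys = new Set(partitionBySubject(quads).keys());
  return quads
    .filter((q) => q.predicate === DKG_ROOT_ENTITY && !partitionKeys.has(q.object))
    .map((q) => q.object);
}

const assertionUri = "urn:example:assertion:1"; // hypothetical assertion UAL
const overridden: Quad[] = [
  { subject: assertionUri, predicate: "http://schema.org/name", object: '"My note"' },
  { subject: assertionUri, predicate: DKG_ROOT_ENTITY, object: "urn:note:parent" },
];
```

With the override in place, the only partition key is the assertion URI, so `findRootMismatches(overridden)` reports `urn:note:parent`. In the reflexive default the declared root equals the subject and nothing is reported.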

Comment thread packages/cli/src/daemon.ts Outdated
Comment thread packages/publisher/src/dkg-publisher.ts Outdated
Comment thread packages/cli/src/extraction/markdown-extractor.ts Outdated
Comment thread packages/cli/src/daemon.ts Outdated
Comment thread packages/cli/src/daemon.ts Outdated
// Row 1 — points at the content-addressed file URN
{ subject: args.subject, predicate: DKG_SOURCE_FILE, object: args.sourceFileIri },
// Row 3 — resolved root entity (reflexive or frontmatter/explicit override)
{ subject: args.subject, predicate: DKG_ROOT_ENTITY, object: resolvedRootEntity },

🔴 Bug: the new rootEntity override only changes the emitted metadata/linkage quad; it does not change any subject IRIs. assertionPromote still derives roots from subjects via autoPartition, so promoting a document with rootEntity: ... will still publish under assertionUri/section subjects while _meta now claims a different root. Either reject this override here until the downstream promote path can honor it, or plumb the resolved root through partitioning/promotion.

Contributor Author

Deferred — this is the same issue as the daemon.ts:2613 thread, already filed as #122. The rootEntity override / promote-partitioning mismatch is architectural and requires either subject rewriting (with cross-assertion contention hazards), removing the feature (contradicts spec §19.10.1:508), or a documentIri plumbing rework. Tracked in #122 for follow-up. Leaving this thread open as a deferred marker, same as the original thread.

Jurij89 force-pushed the feat/source-file-linkage-triples branch from 4cb2df3 to 9a3f2da on April 11, 2026 14:09
Comment thread packages/cli/src/daemon.ts
const metaQuads: Array<{ subject: string; predicate: string; object: string; graph: string }> = [
// Row 14 — rootEntity comes from the extractor's resolved value so
// the data-graph row 3 and `_meta` row 14 point at the same IRI.
{ subject: assertionUri, predicate: 'http://dkg.io/ontology/rootEntity', object: resolvedRootEntity, graph: metaGraph },

🔴 Bug: dkg:rootEntity in _meta can now disagree with the actual root entity the rest of the pipeline uses. assertionPromote still derives KA roots from subjects via autoPartition, and this import path still makes assertionUri the named subject, so a frontmatter rootEntity override only changes bookkeeping here. That leaves downstream consumers seeing one root in _meta and another during promotion/update. Either make the imported document root follow resolvedRootEntity, or keep this row reflexive until the publisher honors the override end-to-end.

Contributor Author

Deferred — this is the third location Codex has flagged the rootEntity override / promote-partitioning mismatch. See the original daemon.ts:2613 thread and the markdown-extractor.ts:529 thread, both tracked in #122. The architectural fix (documentIri plumbing end-to-end) is out of scope for this PR. Leaving this thread open as a deferred marker consistent with the other two.

Comment thread packages/cli/src/daemon.ts
Jurij89 force-pushed the feat/source-file-linkage-triples branch from 9a3f2da to 51eb21d on April 11, 2026 14:46
Comment thread packages/cli/src/extraction/markdown-extractor.ts
Comment thread packages/cli/src/daemon.ts
Comment thread packages/cli/src/daemon.ts
Jurij89 force-pushed the feat/source-file-linkage-triples branch from 51eb21d to ee61822 on April 11, 2026 15:23
Comment thread packages/cli/src/daemon.ts
Comment thread scripts/devnet-test.sh Outdated
Jurij89 force-pushed the feat/source-file-linkage-triples branch from ee61822 to aea36de on April 11, 2026 15:42
Comment thread packages/publisher/src/dkg-publisher.ts Outdated
Comment thread packages/cli/src/daemon.ts
Comment thread packages/cli/src/daemon.ts Outdated
Jurij89 force-pushed the feat/source-file-linkage-triples branch 2 times, most recently from 7d87d09 to 8bb41e0, on April 11, 2026 16:24
// that would otherwise pass the prefix check and blow up at the RDF
// layer with a cryptic error (Codex Bug 13). Non-IRI values fall
// through to slugification as before.
let resolvedRootEntity: string = args.rootEntityIri ?? args.subject;

🔴 Bug: This PR now allows rootEntity to diverge from the document subject, but the downstream promote/publish path still partitions KAs by quad subject only (autoPartition() never looks at dkg:rootEntity). A document with rootEntity: urn:dkg:md:parent-root will still be promoted/owned under its subject URI while rows 3/14 claim the root is parent-root. Either keep resolvedRootEntity reflexive until partitioning honors dkg:rootEntity, or update the promote/publish path in the same PR.

Contributor Author

Deferred — duplicate of the autoPartition vs dkg:rootEntity partition divergence already tracked in #122 (also surfaced on this PR as threads PRRT_kwDORwbl8c56TLcv / PRRT_kwDORwbl8c56TWbZ / PRRT_kwDORwbl8c56Tbh8 — this is now the fourth location Codex has flagged the same root-cause pattern). The architectural fix requires plumbing the resolved documentIri / rootEntityIri through partitioning and promotion end-to-end, which is out of scope for this PR. Leaving this thread open as a visible deferred marker pointing at #122, consistent with the other three.

Comment thread packages/cli/src/daemon.ts Outdated
Comment thread packages/cli/src/extraction/markdown-extractor.ts Outdated
Jurij89 force-pushed the feat/source-file-linkage-triples branch from 8bb41e0 to 5257a17 on April 11, 2026 16:44
Comment thread packages/cli/src/daemon.ts
const metaQuads: Array<{ subject: string; predicate: string; object: string; graph: string }> = [
// Row 14 — rootEntity comes from the extractor's resolved value so
// the data-graph row 3 and `_meta` row 14 point at the same IRI.
{ subject: assertionUri, predicate: 'http://dkg.io/ontology/rootEntity', object: resolvedRootEntity, graph: metaGraph },

🔴 Bug: resolvedRootEntity can now differ from assertionUri, but the import path still extracts all content under the assertion UAL subject. assertionPromote/autoPartition derive KA roots from subjects, not from this _meta triple, so a frontmatter rootEntity override will make _meta advertise one root while promotion still publishes the assertion under the UAL. Either apply the override to the extracted subject as well, or keep row 14 reflexive until downstream promotion understands dkg:rootEntity overrides.

Contributor Author

Deferred: same autoPartition vs dkg:rootEntity partition divergence pattern tracked in #122, already surfaced on this PR as threads PRRT_kwDORwbl8c56TLcv / PRRT_kwDORwbl8c56TWbZ / PRRT_kwDORwbl8c56Tbh8 / PRRT_kwDORwbl8c56T3jv. This is now the fifth location where Codex has flagged the same architectural root cause. The fix requires plumbing the resolved documentIri / rootEntityIri through autoPartition / partitioning / promotion end-to-end, which is out of scope for this PR. Leaving this thread open as a visible deferred marker alongside the four prior markers.

Comment thread packages/cli/src/extraction/markdown-extractor.ts Outdated
Jurij89 force-pushed the feat/source-file-linkage-triples branch from 5257a17 to 592f7b3 on April 11, 2026 17:06
Comment thread packages/publisher/src/dkg-publisher.ts Outdated
Comment thread packages/publisher/src/dkg-publisher.ts Outdated
Comment thread packages/cli/src/extraction/markdown-extractor.ts Outdated
Comment thread packages/cli/src/daemon.ts
Comment thread packages/cli/src/daemon.ts
Comment thread packages/cli/src/extraction/markdown-extractor.ts Outdated
Jurij89 force-pushed the feat/source-file-linkage-triples branch from 5c5d25e to 7f6aa97 on April 11, 2026 17:44
Comment thread packages/publisher/src/dkg-publisher.ts Outdated
Comment thread packages/cli/src/daemon.ts Outdated
Comment thread packages/cli/src/daemon.ts
const metaQuads: Array<{ subject: string; predicate: string; object: string; graph: string }> = [
// Row 14 — rootEntity comes from the extractor's resolved value so
// the data-graph row 3 and `_meta` row 14 point at the same IRI.
{ subject: assertionUri, predicate: 'http://dkg.io/ontology/rootEntity', object: resolvedRootEntity, graph: metaGraph },

🔴 Bug: resolvedRootEntity is now persisted here, but assertionPromote() still partitions promoted assertions by subject via autoPartition() and ignores these dkg:rootEntity rows. An import whose frontmatter sets rootEntity will therefore advertise one root in WM/_meta and publish under a different root later. Either thread this override through the promote/publish path or keep row 14 reflexive until downstream promotion honors it.

Contributor Author

Deferred — same autoPartition vs dkg:rootEntity partition divergence pattern tracked in #122. Already flagged in this PR at threads PRRT_kwDORwbl8c56TLcv (Bug 10) / PRRT_kwDORwbl8c56TWbZ (Bug 16) / PRRT_kwDORwbl8c56Tbh8 (Bug 18) / PRRT_kwDORwbl8c56T3jv (Bug 28) / PRRT_kwDORwbl8c56T7e6 (Bug 32).

This is the sixth marker for the same architectural issue; three of them (Bug 18, Bug 32, and now this one) are at daemon.ts:2791 specifically. Three flags at the same line across three review rounds reinforce that this is systemic to the documentIri pinning decision rather than a local patch opportunity. The architectural fix requires plumbing the resolved root through autoPartition / partitioning / promotion end-to-end, which is out of scope for this PR.

Leaving this thread open as a visible deferred marker alongside the five prior markers, consistent with the established pattern. See #122 for the tracking issue (already updated to reflect the 6-marker count).

Comment thread packages/cli/skills/dkg-node/SKILL.md Outdated
Jurij89 force-pushed the feat/source-file-linkage-triples branch from a5c168c to 54e5b09 on April 11, 2026 18:28
….1/§10.2)

The markdown-extraction pipeline in packages/cli was emitting structural
content triples but not the source-file linkage triples required by
19_MARKDOWN_CONTENT_TYPE.md §10.1 and §10.2. Without those triples, the
assertion graph had no way to rediscover the original file blob after a
daemon restart — the in-memory extractionStatus map was the only surviving
linkage, and it vanished on restart.

This commit implements the full 20-row spec-mandated source-file linkage
contract from §10.1 (data-graph entity linkage) and §10.2 (_meta graph
metadata), split between the extractor and the import-file route handler:

**Extractor (packages/cli/src/extraction/markdown-extractor.ts)**
- Removes the non-spec dkg:derivedFrom / prov:wasGeneratedBy provenance
  block entirely — daemon owns all provenance emission now.
- Emits rows 1-3 on the document subject IRI: dkg:sourceFile (to fileUri),
  dkg:sourceContentType "text/markdown" (always the extractor input type,
  even for PDF where the MD intermediate is what the extractor processed),
  dkg:rootEntity (reflexive by default, frontmatter override supported).
- Frontmatter rootEntity key is consumed so it doesn't leak through as
  schema:rootEntity via the generic frontmatter-to-predicate fallthrough.
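
A minimal sketch of the resolution order described here (frontmatter, then explicit input, then the reflexive default), including consuming the frontmatter key. The `rootEntityIri` and `subject` field names mirror the snippet quoted in review; `resolveRootEntity` and the `frontmatter` shape are hypothetical, and the IRI validation and slugification the extractor performs are deliberately left out.

```typescript
interface ResolveArgs {
  subject: string; // document subject IRI
  rootEntityIri?: string; // explicit caller-supplied override
  frontmatter?: Record<string, unknown>;
}

interface Resolved {
  rootEntity: string;
  // Frontmatter with the rootEntity key consumed, so it cannot reach the
  // generic frontmatter-to-predicate fallthrough.
  remainingFrontmatter: Record<string, unknown>;
}

/** Precedence: frontmatter > explicit input > reflexive (the subject itself). */
function resolveRootEntity(args: ResolveArgs): Resolved {
  const fm: Record<string, unknown> = args.frontmatter ?? {};
  const { rootEntity: fmRoot, ...remainingFrontmatter } = fm;
  const fromFrontmatter =
    typeof fmRoot === "string" && fmRoot.trim() !== "" ? fmRoot.trim() : undefined;
  return {
    rootEntity: fromFrontmatter ?? args.rootEntityIri ?? args.subject,
    remainingFrontmatter,
  };
}
```
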

**Daemon (packages/cli/src/daemon.ts import-file route handler)**
- Computes keccak256 via ethers.keccak256 alongside sha256. FileStoreEntry
  now exposes both hashes; ImportFileResponse.fileHash + ExtractionStatusRecord
  + mdIntermediateHash all switched to keccak256 per spec §2.1:658 (file store
  is keccak256-addressed).
- Mints fileUri as urn:dkg:file:keccak256:<hex> and passes it into the extractor.
- After Phase 2, builds rows 4-8 (file descriptor block: rdf:type dkg:File,
  dkg:contentHash, dkg:fileName, dkg:contentType, dkg:size).
- Mints one fresh urn:dkg:extraction:<uuidv4> per import and emits rows 9-13
  (ExtractionProvenance block with dkg:extractedFrom <fileUri>, dkg:extractedBy,
  dkg:extractedAt, dkg:extractionMethod "structural").
- After assertion.write, writes rows 14-19 (always) and row 20 (conditionally,
  only when Phase 1 actually ran for PDF/DOCX uploads) into the CG ROOT _meta
  graph via agent.store.insert with explicit contextGraphMetaUri(contextGraphId)
  — NEVER the sub-graph _meta graph, per §16.2.1:3449-3466.
- Row 15 (_meta sourceContentType) uses the ORIGINAL upload content type
  (e.g. "application/pdf"), distinct from row 2 (data-graph sourceContentType)
  which uses the extractor input "text/markdown". This row-2-vs-row-15 split
  is explicitly tested in both directions.
- _meta insert failures flow through recordFailedExtraction so
  /extraction-status doesn't get stuck at in_progress on partial writes.
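
The row-2-vs-row-15 split can be sketched as follows. This is illustrative only: `contentTypeRows` is a hypothetical helper, the predicate IRI assumes `dkg:` expands to `http://dkg.io/ontology/`, and encoding literals as quoted strings is an assumption about the quad format.

```typescript
type Quad = { subject: string; predicate: string; object: string; graph: string };

// Assumed expansion of dkg:sourceContentType.
const DKG_SOURCE_CONTENT_TYPE = "http://dkg.io/ontology/sourceContentType";
const EXTRACTOR_INPUT_TYPE = "text/markdown";

// Assumption: literals are carried as quoted strings; the daemon's encoding may differ.
const literal = (value: string): string => JSON.stringify(value);

function contentTypeRows(args: {
  subject: string;
  dataGraph: string;
  metaGraph: string;
  uploadContentType: string;
}): Quad[] {
  return [
    // Row 2 (data graph): what the extractor actually consumed.
    {
      subject: args.subject,
      predicate: DKG_SOURCE_CONTENT_TYPE,
      object: literal(EXTRACTOR_INPUT_TYPE),
      graph: args.dataGraph,
    },
    // Row 15 (_meta graph): what the user originally uploaded.
    {
      subject: args.subject,
      predicate: DKG_SOURCE_CONTENT_TYPE,
      object: literal(args.uploadContentType),
      graph: args.metaGraph,
    },
  ];
}
```

For a PDF upload the two rows share subject and predicate but differ in object and graph, which is the property the integration test checks in both directions.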

**FileStore (packages/cli/src/file-store.ts)**
- Dual-hash store: sha256 remains the on-disk primary for back-compat; a
  keccak256 pointer file is written under keccak256/<shard>/<hex> containing
  the sha256 hex so FileStore.get() accepts either prefix and resolves to
  the same blob.
- Non-breaking for any existing sha256 callers.
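
The dual-prefix lookup can be modelled in memory as below. This is a sketch under assumptions, not the FileStore implementation: the real store writes blobs and pointer files to disk, `DualHashStore` is a hypothetical name, the `sha256:<hex>` prefix form is assumed by analogy with `keccak256:<hex>`, and hashes are passed in by the caller to avoid a crypto dependency.

```typescript
/** In-memory model of the dual-hash layout; the real FileStore persists to disk. */
class DualHashStore {
  // sha256 hex -> bytes (the primary copy)
  private blobs = new Map<string, Uint8Array>();
  // keccak256 hex -> sha256 hex (stands in for the pointer file)
  private pointers = new Map<string, string>();

  /** Stores the blob once under sha256 and records a keccak256 pointer to it. */
  put(bytes: Uint8Array, sha256Hex: string, keccak256Hex: string): void {
    this.blobs.set(sha256Hex, bytes);
    this.pointers.set(keccak256Hex, sha256Hex);
  }

  /** Accepts "sha256:<hex>" or "keccak256:<hex>" and resolves to the same blob. */
  get(hash: string): Uint8Array | undefined {
    const sep = hash.indexOf(":");
    if (sep < 0) return undefined;
    const algo = hash.slice(0, sep);
    const hex = hash.slice(sep + 1);
    if (algo === "sha256") return this.blobs.get(hex);
    if (algo === "keccak256") {
      const sha = this.pointers.get(hex);
      return sha === undefined ? undefined : this.blobs.get(sha);
    }
    return undefined;
  }
}
```

Keeping sha256 as the only place bytes live means existing sha256 callers are unaffected, and the keccak256 path costs one extra lookup.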

**Tests**
- extraction-markdown.test.ts grows from 36 to 41 tests: explicit assertions
  that dkg:derivedFrom is NEVER emitted; rows 1-3 coverage with and without
  sourceFileIri; rootEntity override precedence (frontmatter > explicit input
  > reflexive); rootEntity frontmatter key no longer leaks through the
  generic predicate fallthrough.
- import-file-integration.test.ts grows from 25 to 31 tests: row 4-13 data-graph
  descriptor and provenance block verification with fresh UUIDv4 per import;
  row 14-19 _meta graph verification; PDF content-type split (row 2 = text/markdown
  AND row 15 = application/pdf on the same import); sub-graph routing with
  explicit assertion that _meta quads always land in CG root meta, never
  sub-graph meta; daemon-restart recovery (clear extractionStatus map, recover
  hash from captured _meta quads, re-fetch blob via FileStore.get with byte
  equality); FileStore dual-prefix acceptance.

Refs: 19_MARKDOWN_CONTENT_TYPE.md §10.1, §10.2, §3.2, §4
      05_PROTOCOL_EXTENSIONS.md §6.3, §6.5
      03_PROTOCOL_CORE.md §2.1 (file store)
      EXAMPLE_FULL_FLOW.md §2
Companion spec PR: OriginTrail/dkgv10-spec#86

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Jurij89 force-pushed the feat/source-file-linkage-triples branch from 54e5b09 to adb6180 on April 11, 2026 18:40
Comment thread scripts/devnet-test.sh
Comment thread packages/cli/src/daemon.ts
 review

Hardens scripts/devnet-test.sh sections 18-24 (introduced in PR #120) against
a review pass that surfaced 4 P0 (blocking), 11 P1 (fix-in-PR), and 7 P2
(nice-to-have) findings. All P0 and P1 addressed; 5 of 7 P2 addressed.

**P0 (all 4)**
- §21 now SPARQLs the assertion data graph for the new source-file linkage
  triples (dkg:sourceFile, dkg:sourceContentType, dkg:rootEntity) AND the
  CG root _meta graph for dkg:sourceFileHash with keccak256 format regex
  and a drift-check that the _meta hash equals the wire fileHash from the
  import response. §21h asserts dkg:mdIntermediateHash is ABSENT for
  markdown uploads (row 20 is Phase-1-only). Previously §21 only checked
  the in-memory ImportFileResponse fileHash — a daemon that wrote zero
  linkage triples would have silently passed.
- §23a (no-auth 401) now detects DEVNET_NO_AUTH=1 explicitly and emits
  [SKIP] cleanly; hard-fails if a real auth regression returns 200.
  Previously silently degraded to WARN under DEVNET_NO_AUTH=1, masking
  real regressions.
- §23c / §23g switched from substring-grep-on-body to new http_post_capture
  helper that captures both body AND HTTP status; now requires 4xx AND
  error token. A 500 with body {"error":"internal"} no longer false-passes.
- §18a catchup polling removed "idle" from success markers (idle is the
  INITIAL pre-catchup state, not a completion). Only completed|synced|done
  break the poll loop.
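
The §18a and §23c/§23g rules are implemented in shell in scripts/devnet-test.sh; the TypeScript below only restates the two predicates for clarity. Both function names are hypothetical, and the error token is a parameter because the tokens the script expects are not listed here.

```typescript
/** Statuses that end the §18a catchup poll; "idle" is deliberately excluded. */
const CATCHUP_DONE = new Set(["completed", "synced", "done"]);

function isCatchupComplete(status: string): boolean {
  return CATCHUP_DONE.has(status);
}

/** §23c/§23g rule: a rejection counts only if it is a 4xx AND the body carries the expected token. */
function isExpectedRejection(httpStatus: number, body: string, errorToken: string): boolean {
  const isClientError = httpStatus >= 400 && httpStatus < 500;
  return isClientError && body.includes(errorToken);
}
```

Under the old substring-only check, a 500 with body {"error":"internal"} would have matched on the body alone; here the status test rejects it.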

**P1 (all 11)**
- New c() helper bounds curl with --max-time 30 --connect-timeout 5
  (env-overridable via DEVNET_CURL_TIMEOUT / DEVNET_CURL_CONNECT_TIMEOUT).
  A hung node can no longer stall CI ~40 min per polling section.
- json_get normalizes Python booleans to lowercase true/false; all
  check "..." "True" callsites flipped to "true".
- New safe_bindings_count and safe_quads_count helpers emit a PARSE_ERR
  sentinel on schema drift instead of silently returning "0". ~20 call
  sites converted.
- §20d/§20f now route through c -X PUT for consistent timeout + auth.
- §22c/§22d fragile inline python ternary replaced with explicit
  try/except that surfaces __ERR__ / __MISSING__ sentinels distinctly
  from legitimate status values.
- New §21i: PNG upload graceful-degrade negative test asserts
  extraction.status == "skipped" AND tripleCount == 0 AND pipelineUsed
  null (spec §6.5 graceful-degrade path had zero coverage).
- New §24g: write-to-unregistered-sub-graph negative test with nanosecond-
  suffixed name, requires 4xx response.
- §23b no longer conflates PARSE_ERR with legitimate empty results.
- §22a asserts triplesWritten >= 2 before 22b enqueues; a silent zero-
  write can no longer hide a broken publisher queue.
- §21e promote check excludes __ERR__ in addition to __NONE__ and "0".
- §18b/§18c hard-fail (not warn) when sync claimed completion but VM/SWM
  data is missing ("catchup reported complete but data missing — bug").
  Paired with P0-4 this closes the two-layer cover that catchup bugs
  previously had.
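
The sentinel idea behind safe_bindings_count can be restated in TypeScript as below. This is a sketch, not the shell helper: it assumes the response follows the standard SPARQL JSON results shape (`results.bindings`), which may not match the daemon's actual payload.

```typescript
const PARSE_ERR = "PARSE_ERR" as const;

/** Returns the number of bindings, or PARSE_ERR when the shape is not what we expect. */
function safeBindingsCount(responseText: string): number | typeof PARSE_ERR {
  try {
    const parsed: unknown = JSON.parse(responseText);
    const bindings = (parsed as { results?: { bindings?: unknown } } | null)?.results?.bindings;
    // A missing or non-array field is schema drift, not "zero results".
    return Array.isArray(bindings) ? bindings.length : PARSE_ERR;
  } catch {
    return PARSE_ERR;
  }
}
```

Returning a distinct sentinel keeps a legitimately empty result (0) separate from a response the test could not interpret, which is the distinction §23b relies on.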

**P2 (5 of 7)**
- GOSSIP_WAIT_S env var for the §24b settle window.
- Explicit [SKIP] log when §22c-f skip due to missing jobId.
- DEVNET_TMPDIR honors $TMPDIR for Windows/WSL developers.
- shareOperationId / workspaceOperationId dual-lookup documented as
  intentional legacy alias (not an API-rename drift).
- §24f gossip single sleep replaced with 5×1s poll-and-break loop.

Not addressed (scope): P2-5 (positive auth control — adds new coverage
rather than hardening existing) and P2-7 (stable test IDs — pure refactor).

New helpers added: c(), ok/fail/warn/skip, json_get, check,
safe_bindings_count, safe_quads_count, http_post_capture.

Refs: spec §10.1, §10.2, §6.5

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Jurij89 force-pushed the feat/source-file-linkage-triples branch from adb6180 to fe63671 on April 11, 2026 18:54
Jurij89 merged commit 2a4b123 into test/devnet-e2e-sections-18-24 on Apr 11, 2026
1 check passed
const metaQuads: Array<{ subject: string; predicate: string; object: string; graph: string }> = [
// Row 14 — rootEntity comes from the extractor's resolved value so
// the data-graph row 3 and `_meta` row 14 point at the same IRI.
{ subject: assertionUri, predicate: 'http://dkg.io/ontology/rootEntity', object: resolvedRootEntity, graph: metaGraph },

🔴 Bug: This only records the rootEntity override in _meta; the imported quads themselves still live under assertionUri, and assertionPromote/autoPartition derive KA roots from subjects, not from dkg:rootEntity. A later promote will still publish the assertion URI as the root entity, so the new override becomes informational only. If the override is meant to affect publish/update identity, rewrite the imported document/section subjects to the resolved root entity or teach partitioning to honor dkg:rootEntity.

// based IRI" without restricting schemes; the only exclusions are
// blank nodes (RDF 1.1 §3.4 — not IRIs) and reserved protocol
// namespaces (§19.10.2:708-723). `isSafeIri` matches that contract.
if (isSafeIri(fmId)) return fmId;

🟡 Issue: isSafeIri() only checks SPARQL-safe syntax; it does not enforce the reserved urn:dkg:file: / urn:dkg:extraction: namespaces mentioned in this comment. With this change, id: urn:dkg:file:... is accepted here and only rejected later at the publisher boundary (or written if a caller bypasses that boundary). Add the same reserved-prefix guard here so subject resolution and write-time validation stay consistent.
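
One possible shape for the guard suggested here, as a sketch only. `hasReservedPrefix` and `isAllowedSubjectIri` are hypothetical names; `isSafeIri` is passed in as a parameter because its implementation is not shown in this thread, and the case-insensitive comparison is a conservative assumption.

```typescript
// Reserved namespaces named in the review comment.
const RESERVED_PREFIXES = ["urn:dkg:file:", "urn:dkg:extraction:"];

/** Assumption: match case-insensitively so "URN:DKG:FILE:..." is also caught. */
function hasReservedPrefix(iri: string): boolean {
  const lower = iri.toLowerCase();
  return RESERVED_PREFIXES.some((prefix) => lower.startsWith(prefix));
}

/** Hypothetical wrapper combining the syntax check with the reserved-namespace check. */
function isAllowedSubjectIri(iri: string, isSafeIri: (value: string) => boolean): boolean {
  return isSafeIri(iri) && !hasReservedPrefix(iri);
}
```

Applying the same predicate at subject resolution and at the publisher boundary would keep the two checks from drifting apart.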
