feat(dtpr-ai): v2 schema reshape — Element.context, subchains, PII, AIAAIC research#274

Merged
pichot merged 6 commits into main from feat/dtpr-ai-v2-taxonomy on Apr 27, 2026

pichot commented Apr 27, 2026

Summary

  • JSON Schema additions for the v2 shape: Element.context override (full replace, not merge), nullable color on context values (null = tag rendering), DatachainType.subchains[], and per-instance subchain_instances/subchain_instance_id/actions/sources/linked_instance_ids/instance_version/updated_at.
  • ai@2026-04-27-beta lands the structural reshape: bare-slug category ids (drop the ai__ prefix), decision → functional_modes (catalog emptied for the element-design pass), accountable shape fix (rounded-square → hexagon) + Role context, processing shape (hexagon → circle) + technical-detail framing, PII context on input_dataset/output_dataset (none/anonymized/identifiable/biometric), citizen-priority display order, and subchains: [data_flow] on the datachain-type.
  • Concepts docs added for the shape contract (advisory) and subchains, plus the AIAAIC + research-corpus material from the prior commit on this branch.
  • Validates and builds cleanly; element-level authoring (six functional-mode elements, nine AIAAIC risk elements, consolidated accountable, family-typed processing) follows in subsequent dtpr-element-design / dtpr-category-audit / dtpr-comprehension-audit sessions.

Test plan

  • `pnpm --filter ./api schema:validate ai@2026-04-27-beta` clean
  • `pnpm --filter ./api schema:build ai@2026-04-27-beta` emits 303 files; content_hash regenerated
  • `pnpm --filter ./api schema:validate ai@2026-04-16-beta` still validates (Category.id pattern relaxed to keep older beta loadable)
  • `pnpm --filter ./api typecheck` clean
  • `pnpm --filter ./api test` 388/388 passing

🤖 Generated with Claude Code

pichot and others added 5 commits April 27, 2026 14:44
Adds four research corpus entries informing the next AI schema beta:
MIT AI Risk Repository v4 (April 2026, 1,725 risks across 7 domains),
Atlas of AI Risks (citizen-facing AI city register prior art),
AIAAIC Collaborative Harms Taxonomy (the chosen organizing cut for
v2's risks_mitigation, victim-oriented over cause-oriented), and
AIR 2024 (policy-grounded reference, surveyed and rejected for
citizen audience).

Adds dtpr-ai/content/9.attribution.md documenting AIAAIC's CC BY-SA
4.0 license, full citation, and the mixed-license boundary between
DTPR (CC BY 4.0) and AIAAIC-derived risk content. Establishes the
per-element citation pattern via the existing Element.citation field.

Stress-test of the v2 structural shape lives in
.context/dtpr-ai-v2-structural-proposal.md (gitignored). The
schema:new run for ai@2026-04-27-beta follows this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lays the structural seams that the new ai@2026-04-27-beta needs.

- Element.context (optional override; fully replaces Category.context)
- Nullable color on context values (null = renderer treats value as a
  tag instead of a colored dot; baked icons skip null variants)
- DatachainType.subchains[] (named, ordered groups of categories)
- DatachainInstance.subchain_instances[], elements[].subchain_instance_id,
  elements[].actions[], instance_version, updated_at, sources[],
  linked_instance_ids[]
- Validator rules extended to cover element-level context, null colors,
  subchain refs/uniqueness, subchain locale strings
- json-emitter filters null-color variants from icon_variants and uses
  effective context for variant resolution

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
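
The override and null-color rules in this commit can be sketched as follows. This is a minimal illustration, not the repository's validator or json-emitter: the interface shapes and the names `ContextDef`, `DtprElement`, `effectiveContext`, and `iconVariantIds` are assumptions made for the example. The sample values are the PII context this PR lists for `input_dataset`.

```typescript
// Hypothetical, simplified shapes; the real schema carries more fields.
interface ContextValue {
  id: string;
  color: string | null; // null = renderer shows a tag instead of a colored dot
}

interface ContextDef {
  id: string;
  values: ContextValue[];
}

interface Category {
  id: string;
  context?: ContextDef;
}

interface DtprElement {
  id: string;
  category_id: string;
  context?: ContextDef; // optional override
}

// Full replace, not merge: when the element declares a context,
// the category context is ignored entirely.
function effectiveContext(
  category: Category,
  element: DtprElement,
): ContextDef | undefined {
  return element.context ?? category.context;
}

// Only values with a concrete color produce a baked icon variant;
// null-color values are skipped.
function iconVariantIds(category: Category, element: DtprElement): string[] {
  const context = effectiveContext(category, element);
  if (!context) return [];
  return context.values.filter((v) => v.color !== null).map((v) => v.id);
}

const inputDataset: Category = {
  id: "input_dataset",
  context: {
    id: "pii",
    values: [
      { id: "none", color: null },
      { id: "anonymized", color: "#4A90D9" },
      { id: "identifiable", color: "#FFD700" },
      { id: "biometric", color: "#D9342F" },
    ],
  },
};

const sampleElement: DtprElement = { id: "example_element", category_id: "input_dataset" };
console.log(iconVariantIds(inputDataset, sampleElement));
```

Because the override replaces rather than merges, an element that declares its own context with only null-color values ends up with no icon variants at all, even if its category defines colored ones.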
Adds the same `^[a-zA-Z0-9_-]+$` whitelist Element.id has used since
the port. The new convention drops the legacy `<datachain_type>__<slug>`
prefix in favor of bare slugs (`accountable`, `functional_modes`, …);
the description is updated accordingly.

The forbid-`__` refinement from the v2 proposal is intentionally NOT
enforced at the structural layer so the existing ai@2026-04-16-beta
(which still uses `ai__` ids) continues to validate. The new
ai@2026-04-27-beta uses bare slugs by content; the convention can be
hardened once the older beta is retired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
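
A small sketch of the two id checks discussed in this commit. Only the regular expression is taken from the commit message; the function names are hypothetical, and `isBareSlug` models the forbid-`__` refinement that is deliberately not enforced yet.

```typescript
// Pattern quoted in the commit message for Category.id and Element.id.
const ID_PATTERN = /^[a-zA-Z0-9_-]+$/;

// Structural check: what the schema layer accepts today.
function isStructurallyValidId(id: string): boolean {
  return ID_PATTERN.test(id);
}

// Hypothetical stricter check (NOT enforced in this PR): also reject the
// legacy `<datachain_type>__<slug>` form by forbidding a double underscore.
function isBareSlug(id: string): boolean {
  return isStructurallyValidId(id) && !id.includes("__");
}

console.log(isStructurallyValidId("ai__decision")); // legacy id still loads
console.log(isBareSlug("ai__decision")); // would fail once hardened
console.log(isBareSlug("functional_modes")); // new convention
```

Keeping the stricter rule out of the structural layer is what lets ai@2026-04-16-beta and ai@2026-04-27-beta validate side by side.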
Lands the schema-level edits captured in the v2 structural proposal.
Element-level authoring (the six functional-mode elements, the nine
AIAAIC risk elements, the consolidated accountable element with
logo support, and the family-typed processing catalog) follows in
subsequent dtpr-element-design / dtpr-category-audit passes.

- Drop the `ai__` prefix on category directories and IDs (eleven
  categories renamed to bare slugs).
- Rename `ai__decision` → `functional_modes`; empty its element
  catalog (the six retired decision elements are deleted in beta).
- Reorder categories by citizen priority: purpose, accountable,
  functional_modes, risks_mitigation, rights, input_dataset,
  processing, output_dataset, access, retention, storage.
- accountable: rounded-square → hexagon (port-bug correction); add
  Role context (vendor/deployer, color: null = tag rendering).
- processing: hexagon → circle (matches data-in-motion shape
  contract); description rewritten to clarify the technical-detail
  framing.
- input_dataset: description rewritten to clarify runtime input
  (not training data); add PII context (none color: null,
  anonymized #4A90D9, identifiable #FFD700, biometric #D9342F).
- output_dataset: same PII context.
- datachain-type.yaml: subchains[data_flow = input_dataset →
  processing → output_dataset]; pt added to localized name.

Validates and builds cleanly via schema:validate / schema:build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
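
The `data_flow` subchain and the reference checks mentioned in this PR might look roughly like this. The field names (`category_ids`, `subchains[].id`) and the function `validateSubchains` are assumptions for illustration, and reading "uniqueness" as "unique subchain ids" is my interpretation rather than something the PR states.

```typescript
// Hypothetical shapes; field names are illustrative, not the real schema.
interface Subchain {
  id: string;
  category_ids: string[]; // ordered
}

interface DatachainTypeSketch {
  category_ids: string[]; // display order
  subchains: Subchain[];
}

// Returns a list of problems; an empty list means the subchains are consistent.
function validateSubchains(type: DatachainTypeSketch): string[] {
  const errors: string[] = [];
  const known = new Set(type.category_ids);
  const seen = new Set<string>();
  for (const sub of type.subchains) {
    if (seen.has(sub.id)) errors.push(`duplicate subchain id: ${sub.id}`);
    seen.add(sub.id);
    for (const categoryId of sub.category_ids) {
      if (!known.has(categoryId)) {
        errors.push(`subchain ${sub.id} references unknown category: ${categoryId}`);
      }
    }
  }
  return errors;
}

// Category order and the data_flow subchain as listed in the commit message.
const aiType: DatachainTypeSketch = {
  category_ids: [
    "purpose", "accountable", "functional_modes", "risks_mitigation", "rights",
    "input_dataset", "processing", "output_dataset", "access", "retention", "storage",
  ],
  subchains: [
    { id: "data_flow", category_ids: ["input_dataset", "processing", "output_dataset"] },
  ],
};

console.log(validateSubchains(aiType));
```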
Glosses the two new constructs introduced in ai@2026-04-27-beta so
authors and reviewers reach for the same vocabulary.

- 5.shape-contract.md: advisory mapping (hexagon = who/why/what,
  circle = data in motion, rounded-square = data at rest, octagon =
  context/guardrails). Documents the convention without making the
  schema enforce it.
- 6.subchains.md: glosses datachain-type-level `subchains` and the
  per-instance `subchain_instances` realization, with the
  smart-intersection example from the proposal.
- 0.index.md: links the two new pages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
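
Since the shape contract is advisory, a lint-style helper is one way to picture it: it reports a mismatch but never fails validation. The mapping below is copied from the commit message; the `Shape` type and the `shapeAdvisory` function are hypothetical names for this sketch.

```typescript
type Shape = "hexagon" | "circle" | "rounded-square" | "octagon";

// Advisory mapping as described in 5.shape-contract.md.
const SHAPE_MEANING: Record<Shape, string> = {
  hexagon: "who/why/what",
  circle: "data in motion",
  "rounded-square": "data at rest",
  octagon: "context/guardrails",
};

// Advisory only: returns a warning string or null, and never throws,
// mirroring the decision not to enforce the convention in the schema.
function shapeAdvisory(categoryId: string, actual: Shape, expected: Shape): string | null {
  if (actual === expected) return null;
  return (
    `${categoryId}: shape "${actual}" (${SHAPE_MEANING[actual]}) ` +
    `differs from advisory "${expected}" (${SHAPE_MEANING[expected]})`
  );
}

// The accountable port bug fixed in this PR would have been flagged like this.
console.log(shapeAdvisory("accountable", "rounded-square", "hexagon"));
console.log(shapeAdvisory("processing", "circle", "circle"));
```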
greptile-apps Bot commented Apr 27, 2026

Too many files changed for review. (174 files found, 100 file limit)

verify.mjs's CORPUS_SLUG_RE expects the slug to end in `.md` (matching
the on-disk filenames); the four AIAAIC/AI-risk rows in INDEX.md were
missing the suffix, failing the offline plugin conformance check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
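
The failure mode described here can be reproduced with a stand-in pattern. The actual `CORPUS_SLUG_RE` in verify.mjs is not shown in this PR, so `HYPOTHETICAL_SLUG_RE` below is an assumption that only captures the one property the commit message states: the slug must end in `.md`.

```typescript
// Assumed stand-in for CORPUS_SLUG_RE; the real pattern may be stricter.
const HYPOTHETICAL_SLUG_RE = /^[A-Za-z0-9][A-Za-z0-9._-]*\.md$/;

// Returns the index rows whose slug would fail the conformance check.
function nonConformingSlugs(slugs: string[]): string[] {
  return slugs.filter((slug) => !HYPOTHETICAL_SLUG_RE.test(slug));
}

console.log(
  nonConformingSlugs([
    "2026-05-06T1443-narain-jashanmal-ai-taxonomy.md",
    "2026-05-06T1443-narain-jashanmal-ai-taxonomy", // suffix missing
  ]),
);
```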
pichot merged commit 7c9b930 into main on Apr 27, 2026
6 checks passed
pichot deleted the feat/dtpr-ai-v2-taxonomy branch on April 27, 2026 at 19:43
pichot added a commit that referenced this pull request May 5, 2026
Three fixes to make the previous commit deploy.

1. Drop content.config.ts + app/pages/index.vue
   c12 loads our user-layer content.config.ts via jiti, which
   transitively requires @nuxt/content -> @nuxt/kit. jiti evals
   @nuxt/kit/dist/index.mjs:182's `import.meta.dev` via new
   vm.Script (CJS context), throwing SyntaxError. Docus's own
   content.config.ts has the same imports but loads inside a Nuxt
   context where jiti is configured differently.

   The homepage UContainer override is a polish item; restoring it
   needs a docus-friendly extension that doesn't trigger this load
   chain. Following up separately.

2. Update @dtpr/ui test fixtures for v2 schema reshape
   The v2 reshape (#274) made `actions` and `subchains` required
   defaults via z.ZodDefault, which Zod 4 surfaces as required in
   the inferred output type. vite-plugin-dts type-checks tests
   during declaration generation, so missing fields blocked dts
   emission. Add `actions: []` to InstanceElement fixtures, plus
   `subchains: []`, `subchain_instances: []`, `sources: []`,
   `linked_instance_ids: []` to DatachainType / DatachainInstance.

3. DtprIcon: reset failed state when isDark changes
   Watch already reset `failed.value` when src/darkSrc changed.
   isDark wasn't in the deps, so a dark-mode toggle could leave a
   stale 404 marker, blocking retry against the new url.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
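
The fixture breakage in item 2 comes down to the difference between a schema's input type and its parsed output type. The sketch below does not call Zod; it is a hand-written stand-in with hypothetical type and function names that shows why a fixture typed as the output shape has to spell out `actions: []`.

```typescript
// What a caller may pass in: the defaulted field is optional.
interface InstanceElementInput {
  element_id: string;
  actions?: unknown[];
}

// What parsing hands back: the default has been applied, so the field is required.
interface InstanceElementOutput {
  element_id: string;
  actions: unknown[];
}

// Hand-written equivalent of applying a `[]` default during parsing.
function parseInstanceElement(input: InstanceElementInput): InstanceElementOutput {
  return { element_id: input.element_id, actions: input.actions ?? [] };
}

// A test fixture typed as the OUTPUT shape must include the field explicitly;
// omitting `actions` here would be a type error, which is what blocked dts emission.
const fixture: InstanceElementOutput = { element_id: "example_element", actions: [] };

console.log(parseInstanceElement({ element_id: "example_element" }), fixture);
```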
pichot added a commit that referenced this pull request May 5, 2026
* feat: all-versions schema deploy + @dtpr/ui dark mode

Three independent changes bundled per request.

1. Schema deploy enumerates every version in the source tree
   - .github/workflows/api-deploy.yaml: drop the hard-coded
     ai@2026-04-16-beta default, walk api/schemas/<type>/*/, and
     run schema:build + r2-upload for each version. Validation gates
     the worker deploy. workflow_dispatch.version remains as an
     optional single-version override.
   - api/scripts/r2-prune.ts: new step that drops index entries
     whose source dir was deleted (e.g. after schema:promote
     rename). Worker reads gate through schemas/index.json
     (api/src/rest/version-resolver.ts), so index-only prune is
     sufficient — no R2 object deletion required.

2. @dtpr/ui dark mode
   - packages/ui/src/vue/styles.css: dark-mode CSS tokens, two
     triggers (html.dark for @nuxt/color-mode hosts,
     prefers-color-scheme for standalone SSR consumers).
   - packages/ui/src/vue/use-dark-mode.ts: reactive dark-mode flag
     mirroring the host page's color mode.
   - packages/ui/src/core/types.ts: ElementDisplayIcon.urlDark so
     callers can supply a dark-variant icon source.
   - DtprIcon swaps src based on the resolved mode; tests cover
     the resolution order and SSR fallback.

3. dtpr-ai docs polish
   - dtpr-ai/app/pages/index.vue + content.config.ts: override
     docus's default landing template so the homepage is wrapped
     in the same UContainer max-width as the rest of the site.
   - Taxonomy pages and Playground pick up dark-mode-aware icon
     URLs from the new @dtpr/ui surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
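
The enumerate-then-prune logic from item 1 can be sketched with two pure functions. This is not the workflow or `r2-prune.ts` itself: the function names and the `{ version }` index-entry shape are assumptions, and the directory paths are passed in as strings so the example stays free of filesystem and R2 calls.

```typescript
// Turn source directories such as "api/schemas/ai/2026-04-27-beta"
// into version identifiers such as "ai@2026-04-27-beta".
function versionsFromDirs(dirs: string[]): string[] {
  const versions: string[] = [];
  for (const dir of dirs) {
    const match = /^api\/schemas\/([^\/]+)\/([^\/]+)\/?$/.exec(dir);
    if (match) versions.push(`${match[1]}@${match[2]}`);
  }
  return versions;
}

// Index-only prune: keep entries whose version still has a source directory.
// Nothing is deleted from storage; dropped versions simply stop being listed.
function pruneIndex<T extends { version: string }>(entries: T[], sourceVersions: string[]): T[] {
  const live = new Set(sourceVersions);
  return entries.filter((entry) => live.has(entry.version));
}

const sourceVersions = versionsFromDirs([
  "api/schemas/ai/2026-04-16-beta",
  "api/schemas/ai/2026-04-27-beta/",
]);
const index = [
  { version: "ai@2026-04-16-beta" },
  { version: "ai@2026-04-27-beta" },
  { version: "ai@2026-01-01-beta" }, // hypothetical entry whose source dir was deleted
];
console.log(pruneIndex(index, sourceVersions));
```

Pruning only the index fits the stated design: worker reads are gated through schemas/index.json, so an unlisted version is unreachable even if its objects remain in R2.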

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pichot added a commit that referenced this pull request May 6, 2026
…families, semantic input/output + pii ramp (#278)

* feat(dtpr-ai): functional_modes — six modes (Focus 1)

Authors the six functional-mode elements that PR #274 (7c9b930) deliberately
left empty after renaming ai__decision → functional_modes. Each element
carries a verb-forward title (Analytical — decides), a plain-language
description with examples, and a citation to Narain Jashanmal's *AI Taxonomy*
(v1.1, January 2026) — the source DTPR adopted by name for the six-mode
partition.

Elements added (category_id: functional_modes, all en/es/fr/pt; km + tl
deferred to a translator pass):

- analytical_mode  — decides
- semantic_mode    — understands and remembers
- generative_mode  — creates
- agentic_mode     — acts
- perceptive_mode  — senses
- physical_mode    — moves

Each element references a first-draft symbol stub at
api/schemas/ai/2026-04-27-beta/symbols/mode_<verb>.svg — minimal verb-glyph
geometry pending a designer pass for sign-scale legibility.

Source captured durably at
plugin/dtpr/research/2026-05-06T1443-narain-jashanmal-ai-taxonomy.md, with
boundary cues (perceptive↔analytical input shape, semantic↔generative
new-content test, agentic↔physical action target, analytical↔semantic input
modality) recorded so future dtpr-element-design / dtpr-comprehension-audit
runs over functional_modes pick up the same partition.

A multi-PR plan tracker lives at
docs/plans/2026-05-05-002-feat-ai-2026-04-27-beta-element-authoring-plan.md
and covers the four other categories PR #274 deferred (risks_mitigation
AIAAIC expansion, accountable consolidation + logo, processing family
typing, output_dataset first-pass authoring). Focus 1 is marked drafted
in the tracker; Focus 2 (AIAAIC harms) is in flight in a parallel session
and its files will land in a separate commit.

schema:validate and schema:build pass; composed icons emit hexagon × symbol
correctly under dist/.../icons/<mode>_mode/{default,dark}.svg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dtpr-ai): risks_mitigation — replace 6 mechanism elements with AIAAIC's 9 victim-oriented harms (Focus 2)

Retire the cause-oriented elements (compromise_of_privacy, function_creep,
opaque_decision_making, overreliance_automation_bias, system_drift,
unequal_performance) and their symbols. Replace with the 9 victim-oriented
harm types from the AIAAIC Collaborative Harms Taxonomy: autonomy_loss,
physical_harm, psychological_harm, reputational_harm, financial_harm,
civil_liberties_harm, societal_cultural_harm, political_economic_harm,
environmental_harm. Each carries the AIAAIC citation per locale; CC BY-SA 4.0
flows from the citation field per dtpr-ai/content/9.attribution.md.

Mechanism and outcome framings overlapped awkwardly (e.g. compromise_of_privacy
collided with autonomy_loss + civil_liberties_harm), forcing authors into
mechanism-vs-outcome coin flips for the same scenario. Beta is mid-flight —
remap downstream datachain instances; 2026-04-16-beta keeps the old IDs.

Symbols are placeholder geometry pending designer pass. Non-en locale strings
drafted by implementer; translator review pending. Keeping functional_modes
and risks_mitigation orthogonal — the mode↔harm cross-product belongs in the
renderer, not in element context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dtpr-ai): processing — replace 7 technique elements with 12 family-typed (Focus 4)

Retire the technique-flavored elements (llm, text-to-speech, sentiment-analysis,
time-series-forecasting, recommendation-systems, optimization, privacy-preserving)
and replace with a family-typed catalog where each element id IS a family:
language_models, computer_vision, biometric_recognition, speech_audio,
classification_prediction, affect_emotion_analysis, anomaly_detection,
optimization, recommendation_ranking, search_retrieval, clustering_segmentation,
privacy_transformation.

Family encoding: bare slugs on Element.id — not Element.context override and not
a category-level context axis. Context fits per-instance selection (PII on
input_dataset); families are catalog-level facts that should not require the
author to re-pick at instance time. Naming-prefix conventions were declined as
fragile.

Coverage rationale: 12 families span the public-space AI surface DTPR targets —
chatbots/kiosks, occupancy and crowd vision, accessibility audio, demand and
traffic forecasting, equipment and fraud monitoring, signal/transit
optimization, wayfinding ranking, document QA, audience analytics, de-identified
camera pipelines. Biometric Recognition and Affect & Emotion Analysis are split
out from Computer Vision and Classification & Prediction respectively because
both carry distinct regulatory weight (EU AI Act prohibitions, ICO guidance) and
high public-trust signal that disclosure should surface independently.

Symbols: ten reuse purpose-designed processing/decision-making/data-shape icons
from the existing symbol set. Two (search_retrieval → connectivity,
clustering_segmentation → social) borrow closest-fit existing icons and need
designer follow-up before stable promotion.

Comprehension check (rubric 2026-04-20):
- Audience fit: pass — commuter-grade verbs throughout.
- Plain-language: partial — RAG, k-anonymization, differential privacy
  un-glossed in search_retrieval and privacy_transformation; copy pass before
  stable.
- Symbol legibility: partial — two reused symbols flagged for design.
- Ambiguity flags: pass — explicit boundary callouts (computer_vision ↔
  biometric_recognition, speech_audio ↔ biometric_recognition,
  clustering_segmentation flags inferred-vs-authored cluster labels).
- Locale coverage: partial — all six locales drafted, technical anglicisms in
  Khmer and Tagalog need translator review.
- Variable-substitution clarity: n/a — no variables.
- Overlap and distinctness: pass.
- Overall: partial — ship-blockers are the two symbol stand-ins and the copy
  pass for un-glossed jargon.

`pnpm --filter ./api schema:validate ai@2026-04-27-beta` passes (11 categories,
88 elements). Earlier 2026-04-16-beta is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dtpr-ai): input_dataset + output_dataset — replace 7 format elements with 11 semantic categories × 2 sides (Focus 5)

Reframes both dataset categories around what the data is *about*, not
how it's *encoded*. Eight independently-developed citizen-facing
transparency frameworks (Apple Privacy Nutrition Labels, Google Play
Data Safety, W3C DPV PD v2.3, GDPR Art. 9, EU AI Act Art. 3(1) +
Annex III, TILT, DaPIS, original DTPR's own semantic three) all
categorize this way; "binary" / "tabular" / "boolean" read as
programmer documentation to a non-technical commuter. Captured in
corpus at plugin/dtpr/research/2026-05-06T1515-semantic-data-categories-public-disclosure.md.

11 bidirectional categories with 22 element files (input_* + output_*
sharing one symbol_id each):

  About a person / body / place / behaviour / measurement
  Sensitive personal information
  Operational data (school occupancy, routes, budgets, public records)
  A decision about you
  A recommendation or prediction
  Generated content
  A physical action

PII context dropped from both categories — once "About a person",
"About a body", and "Sensitive personal" are explicit elements, the
context is double-bookkeeping. output_dataset description and prompt
broadened beyond "data products" to cover decisions, content, and
physical actions per EU AI Act Art. 3(1).

Six new symbol stubs (about_a_body, about_behaviour, sensitive_personal,
operational_data, generated_content, physical_action) are minimal
geometric placeholders flagged for designer pass. Existing five
symbol_ids reused (personal, spatial, values_time, dm_accept-or-deny,
dm_priority-ranking).

Authored across 4 locales (en/es/fr/pt) matching Focus 1's precedent;
km/tl translation deferred to native-speaker pass.

schema:validate ✓ 11 categories, 103 elements
schema:build    ✓ 400 dist files
test            ✓ 388 tests (340 workers + 48 cli)

Beta-stage breaking change for any datachain instance pinned to
ai@2026-04-27-beta and using old element ids (personal, tabular,
binary, etc.); migration table in the corpus entry. In-policy per
the plan's Out-of-scope section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(dtpr-ai): restore pii context on input_dataset + output_dataset (Focus 5 amendment)

The first Focus 5 cut dropped the `pii` context dimension on a "double-bookkeeping"
argument that only held for one of four old values (`biometric` ↔ `*_about_a_body`).
Restore the dimension as a pure identifiability ramp — orthogonal to the semantic
element catalog — to bring back original DTPR's at-a-glance colour-band signal.

PII context (both categories, six locales):
- `de_identified` (#4A90D9 blue) — about people, link to who is broken
- `pseudonymous`  (#9575CD purple, new) — linked via opaque token, name not revealed
- `identifiable`  (#FFD700 yellow) — directly identified
- absence of value carries the "no PII claim" meaning (matches Role on `accountable`)

Old `none` and `biometric` retired. Schema validates: 11 categories, 103 elements,
466 dist files (up from 400 for new per-value icon variants). 388 tests pass.

Drive-by typo fixes (longstanding):
- `dm_anomay-detection` → `dm_anomaly-detection` (symbol + element symbol_id)
- `processing_privacy-preservoing-transformation` → `…-preserving-transformation`

Plan (`docs/plans/2026-05-05-002-…`):
- Reflect refined PII context.
- Drop Focus 3 (`accountable` consolidation + logo support) — not this version.
- Drop km/tl translator-pass follow-ups (out of scope this version).
- Correct Focus 4 status line (was `ai@2026-05-06-beta`; actually `ai@2026-04-27-beta`).

Corpus (`plugin/dtpr/research/2026-05-06T1515-…`):
- Replace "Why the PII context becomes redundant" with "Why the PII context stays —
  refined to a pure identifiability ramp," documenting per-value rationale.
- Refresh `content_hash` to current schema build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
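
The refined PII ramp can be sketched as a small lookup. The value ids and hex colors are the ones listed in this commit message; the `PiiValue` type and `piiBandColor` function are hypothetical names, and returning `null` for a missing value is my way of modeling "absence of value carries the no-PII-claim meaning".

```typescript
type PiiValue = "de_identified" | "pseudonymous" | "identifiable";

// Colors as listed in the commit message.
const PII_COLORS: Record<PiiValue, string> = {
  de_identified: "#4A90D9", // blue
  pseudonymous: "#9575CD", // purple
  identifiable: "#FFD700", // yellow
};

// No value set means no PII claim, so there is no color band to draw.
function piiBandColor(pii?: PiiValue): string | null {
  return pii ? PII_COLORS[pii] : null;
}

console.log(piiBandColor("pseudonymous"));
console.log(piiBandColor());
```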

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>