141 changes: 141 additions & 0 deletions apps/web/src/content/docs/docs/guides/enterprise-governance.mdx
@@ -0,0 +1,141 @@
---
title: Enterprise Governance
description: A Git-native pattern for inventorying and reviewing the AI systems in your organisation, using a `.ai-register.yaml` per repo and a GitHub Action to aggregate them.
sidebar:
  order: 9
---

This guide describes a lightweight convention for keeping a documented
**AI system inventory** — the thing every modern AI-governance framework
asks for — without adopting a governance platform.

You should be able to read this in under ten minutes and have something
running by the end.

## Why a manifest

Every modern AI-governance framework expects a documented inventory of AI
systems:

- **NIST AI RMF GOVERN-1.3** — documented AI system inventory.
- **ISO/IEC 42001:2023 Clause 7** — AI system documentation.
- **EU AI Act Annex IV** — technical documentation per high-risk system.

Large enterprises typically answer this with governance platforms (Credo AI,
OneTrust AI Governance, ServiceNow AI Control Tower, IBM watsonx.governance).
Smaller teams, open-source projects, or orgs that haven't invested in a
platform need a lighter pattern that still satisfies an auditor.

A Git-native manifest per repo, aggregated nightly via a GitHub Action,
gets you audit-grade inventory at zero infra cost. If you later adopt a
governance platform, **the same manifests become its import source** —
nothing has to be re-keyed.

## What it looks like

In the **repo root** of each AI system, commit a `.ai-register.yaml`:

```yaml
system:
  id: example-support-agent
  name: Example Customer Support Agent
  owner: support-platform-team
  risk_tier: high # EU AI Act vocabulary
  deployment: production
  data_classification: restricted
  description: Answers customer-support questions over chat.
  models:
    - provider: anthropic
      model: claude-opus-4-7
  evals:
    path: evals/
    runs_in_ci: true
  controls: # <FRAMEWORK>-<VERSION>:<ID>
    - NIST-AI-RMF-1.0:GOVERN-1.3
    - ISO-42001-2023:Clause-7
    - EU-AI-ACT-2024:Art.55
    - INTERNAL-AI-POLICY-1.0:CTRL-CUSTOMER-ISOLATION
  last_reviewed: 2026-04-24
```

The full example, including comments, is in the agentv repo at
`examples/governance/ai-register/.ai-register.yaml`.

### Why these fields

- **`risk_tier`** — EU AI Act vocabulary (`prohibited | high | limited | minimal`).
Other vocabularies (e.g. NIST 800-30) work too; pick one and stick with it.
- **`controls`** — same string format as the eval-level `governance` schema
documented in [governance metadata]. That overlap is intentional: a
control declared on a system can be cross-referenced against the controls
exercised by its evals.
- **`last_reviewed`** — a date. Aggregators flag entries older than
whatever cadence your governance team works to.
- **`evals.path`** — a pointer to the agentv evals that exercise this
system. The aggregator does not run them; it just records that they exist.

## Aggregating across the org

In a dedicated `ai-register` repo (or your existing governance repo), drop
`.github/workflows/aggregate.yml` from `examples/governance/ai-register/`.
The workflow:

1. Searches the org via `gh api search/code` for every `.ai-register.yaml`.
2. Fetches each one via `gh api repos/.../contents`.
3. Aggregates them with a small Python script into `register.csv` and a
self-contained `register.html` table.
4. Surfaces stale entries (`last_reviewed` > 90 days) on the workflow
summary and uploads the CSV + HTML as workflow artifacts.

Required secret: **`GH_AGGREGATE_TOKEN`** with `repo` scope (add
`read:org` for org-wide enumeration), scoped to the org you want to
scan. For public repos the default `GITHUB_TOKEN` is sufficient.

The workflow is under 175 lines of YAML, runs in a single job, and
has no third-party dependencies beyond `gh` (preinstalled on
`ubuntu-latest`) and `PyYAML`.

## Day-2 operations

A useful starting cadence:

- Engineers update `.ai-register.yaml` whenever a system enters or leaves
production, or its model / scope changes materially.
- The aggregator runs weekly via cron.
- The workflow summary is the source of truth for stale entries; if your
team prefers a Slack ping, add one extra step that posts to a webhook.
- Quarterly, the governance team walks the CSV and updates `last_reviewed`
on the systems they signed off on.

That's the whole loop.

## Relationship to evaluation

agentv does not parse `.ai-register.yaml`. The convention is **orthogonal**:

- The manifest documents **which AI systems exist**, who owns them, and
which controls they are accountable for.
- The eval YAML documents **which behaviour a given system was tested
against**.

Both files use the same `<FRAMEWORK>-<VERSION>:<ID>` control format, so a
script can intersect "manifest claims this system is covered by
NIST-AI-RMF-1.0:MEASURE-2.7" with "eval results show 14 cases tagged
NIST-AI-RMF-1.0:MEASURE-2.7 ran this quarter."
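That intersection is a few lines of Python. Both input shapes here are assumptions for illustration — how you count tagged eval cases depends on your agentv output format:

```python
# Sketch: cross-reference the controls a manifest claims against the
# control tags observed on eval runs. `eval_tags` maps control ID ->
# number of eval cases tagged with it (an assumed, illustrative shape).
def coverage(manifest_controls: list[str], eval_tags: dict[str, int]) -> dict[str, dict]:
    report = {}
    for control in manifest_controls:
        runs = eval_tags.get(control, 0)
        report[control] = {"eval_cases": runs, "covered": runs > 0}
    return report
```

Any control with `covered: False` is a claim on the manifest that no eval exercised this quarter — exactly the gap an auditor will ask about.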

## Migration to a governance platform

When and if your org adopts Credo AI / OneTrust AI Governance /
ServiceNow AI Control Tower / IBM watsonx.governance:

- Each platform accepts CSV / JSON imports keyed on system identifiers.
- Your `register.csv` artifact already has the per-system row each
importer expects.
- The `controls` column maps directly onto the framework-control fields
the platform exposes — there is nothing to re-key.

You don't have to rip out the manifest convention either. Most teams keep
the Git-native artifact as the **canonical source** and the platform as
the **operations surface**, syncing one direction.
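A one-direction sync can start from `register.csv` directly. The sketch below reshapes its rows into generic per-system records; the column names match the aggregator workflow, but the output shape is an assumption, not any vendor's real import schema:

```python
import csv
import io

# Sketch: turn register.csv rows into per-system import records.
# Column names follow the aggregator workflow; the record shape is a
# generic assumption, not a specific platform's importer format.
def to_import_records(csv_text: str) -> list[dict]:
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        records.append({
            "system_id": row["id"],
            "owner": row["owner"],
            "risk_tier": row["risk_tier"],
            # The controls column is ";"-joined by the aggregator.
            "controls": [c.strip() for c in row["controls"].split(";") if c.strip()],
        })
    return records
```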

[governance metadata]: ./agent-eval-layers/
37 changes: 37 additions & 0 deletions examples/governance/ai-register/.ai-register.yaml
@@ -0,0 +1,37 @@
system:
  id: example-support-agent
  name: Example Customer Support Agent
  owner: support-platform-team
  risk_tier: high # EU AI Act vocabulary: prohibited | high | limited | minimal
  deployment: production
  data_classification: restricted
  description: >-
    Answers customer support questions over chat. Routes to humans when the
    user requests a refund or asks anything outside the documented FAQ.

  # Models actually in use. Versioned strings so an auditor a year from now
  # can see which family + model identifier was running, not just "Anthropic".
  models:
    - provider: anthropic
      model: claude-opus-4-7
    - provider: openai
      model: gpt-4o-mini

  # Pointer to the agentv evals that exercise this system. Aggregator scripts
  # can `git ls-tree HEAD evals/` to count cases, or run `agentv eval` on demand.
  evals:
    path: evals/
    runs_in_ci: true

  # Cross-framework controls. Format follows the convention in #1161:
  # <FRAMEWORK>-<VERSION>:<ID>. Custom prefixes are explicitly supported.
  controls:
    - NIST-AI-RMF-1.0:GOVERN-1.3
    - NIST-AI-RMF-1.0:MEASURE-2.7
    - ISO-42001-2023:Clause-7
    - EU-AI-ACT-2024:Art.55
    - INTERNAL-AI-POLICY-1.0:CTRL-CUSTOMER-ISOLATION

  # When the system was last reviewed by the owning team. Aggregators flag
  # entries older than your governance cadence (quarterly is typical).
  last_reviewed: 2026-04-24
174 changes: 174 additions & 0 deletions examples/governance/ai-register/.github/workflows/aggregate.yml
@@ -0,0 +1,174 @@
name: aggregate-ai-register

# Sweeps every repo the bot can see for an `.ai-register.yaml`, merges them
# into a CSV + a static HTML dashboard, and surfaces stale entries (those
# whose `last_reviewed` is older than STALE_DAYS) on the workflow summary.
#
# Drop this file into a dedicated `ai-register` repo (or your governance
# repo). Each AI-system repo just commits a `.ai-register.yaml` at its
# root — there is nothing to install on those repos.
#
# Required: GH_AGGREGATE_TOKEN secret with `repo` (or `read:org` + `repo`)
# scope, scoped to the GitHub org or user you want to enumerate. The
# default GITHUB_TOKEN works for public repos; a PAT is needed for
# private ones.

on:
  workflow_dispatch:
    inputs:
      org:
        description: GitHub org or user to scan (defaults to current repo owner)
        required: false
  schedule:
    - cron: "0 6 * * 1" # every Monday 06:00 UTC

env:
  STALE_DAYS: "90"

permissions:
  contents: read
  issues: write # only needed if you add a step that posts stale entries as an issue

jobs:
  aggregate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Locate .ai-register.yaml across the org
        id: locate
        env:
          GH_TOKEN: ${{ secrets.GH_AGGREGATE_TOKEN || secrets.GITHUB_TOKEN }}
          ORG: ${{ inputs.org || github.repository_owner }}
        run: |
          set -euo pipefail
          mkdir -p out
          # GitHub code search returns up to 1000 hits; use REST search/code
          # filtered to filename. Page-based pagination via per_page + page.
          : > out/repos.txt
          page=1
          while :; do
            resp=$(gh api -X GET search/code \
              -f q="filename:.ai-register.yaml org:${ORG}" \
              -F per_page=100 -F page=$page)
            count=$(jq -r '.items | length' <<<"$resp")
            jq -r '.items[] | "\(.repository.full_name)\t\(.path)"' <<<"$resp" >> out/repos.txt
            [ "$count" -lt 100 ] && break
            page=$((page+1))
          done
          wc -l out/repos.txt

      - name: Fetch each manifest
        env:
          GH_TOKEN: ${{ secrets.GH_AGGREGATE_TOKEN || secrets.GITHUB_TOKEN }}
        run: |
          set -euo pipefail
          mkdir -p out/manifests
          while IFS=$'\t' read -r repo path; do
            [ -z "$repo" ] && continue
            slug=$(echo "$repo" | tr '/' '_')
            gh api "repos/${repo}/contents/${path}" --jq '.content' \
              | base64 -d > "out/manifests/${slug}.yaml" || echo "skip $repo"
          done < out/repos.txt

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install --quiet "PyYAML==6.0.2"

      - name: Aggregate to CSV + HTML
        run: |
          python3 - <<'PY'
          import csv, datetime, glob, html, json, os, pathlib, sys, yaml

          stale_days = int(os.environ.get("STALE_DAYS", "90"))
          today = datetime.date.today()
          rows = []
          stale = []
          for fn in sorted(glob.glob("out/manifests/*.yaml")):
              try:
                  data = yaml.safe_load(pathlib.Path(fn).read_text())
              except yaml.YAMLError as e:
                  rows.append({"id": fn, "error": str(e)})
                  continue
              sys_ = (data or {}).get("system", {})
              row = {
                  "id": sys_.get("id", ""),
                  "name": sys_.get("name", ""),
                  "owner": sys_.get("owner", ""),
                  "risk_tier": sys_.get("risk_tier", ""),
                  "deployment": sys_.get("deployment", ""),
                  "last_reviewed": str(sys_.get("last_reviewed", "")),
                  "controls": "; ".join(sys_.get("controls", []) or []),
                  "models": "; ".join(
                      f"{m.get('provider','?')}:{m.get('model','?')}"
                      for m in sys_.get("models", []) or []
                  ),
                  "source_file": pathlib.Path(fn).name,
              }
              rows.append(row)
              try:
                  reviewed = datetime.date.fromisoformat(row["last_reviewed"])
                  if (today - reviewed).days > stale_days:
                      stale.append(row)
              except (TypeError, ValueError):
                  pass

          out = pathlib.Path("out")
          with (out / "register.csv").open("w", newline="") as f:
              writer = csv.DictWriter(
                  f,
                  fieldnames=["id", "name", "owner", "risk_tier", "deployment",
                              "last_reviewed", "controls", "models",
                              "source_file"],
              )
              writer.writeheader()
              for r in rows:
                  writer.writerow({k: r.get(k, "") for k in writer.fieldnames})

          def esc(v): return html.escape(str(v))
          th = lambda s: f"<th>{esc(s)}</th>"
          td = lambda s: f"<td>{esc(s)}</td>"
          headers = ["id", "name", "owner", "risk_tier", "deployment",
                     "last_reviewed", "controls", "models", "source_file"]
          tbl = ["<table>", "<thead><tr>" + "".join(th(h) for h in headers) + "</tr></thead>", "<tbody>"]
          for r in rows:
              tbl.append("<tr>" + "".join(td(r.get(h, "")) for h in headers) + "</tr>")
          tbl += ["</tbody>", "</table>"]
          stale_html = ""
          if stale:
              stale_html = "<h2>Stale entries (>" + str(stale_days) + " days)</h2><ul>"
              for r in stale:
                  stale_html += f"<li>{esc(r['id'])} ({esc(r['owner'])}) — last_reviewed={esc(r['last_reviewed'])}</li>"
              stale_html += "</ul>"
          (out / "register.html").write_text(
              "<!doctype html><meta charset=utf-8>"
              "<title>AI System Register</title>"
              "<style>body{font-family:system-ui;margin:2rem}"
              "table{border-collapse:collapse}td,th{padding:.4em .8em;border:1px solid #ccc}"
              "th{background:#f4f4f4;text-align:left}</style>"
              "<h1>AI System Register</h1>"
              + stale_html
              + "\n".join(tbl)
          )

          # Workflow summary
          summary = pathlib.Path(os.environ.get("GITHUB_STEP_SUMMARY", "/dev/null"))
          with summary.open("a") as f:
              f.write(f"## AI System Register\n\nFound **{len(rows)}** systems.\n\n")
              if stale:
                  f.write(f"### Stale (>{stale_days} days)\n\n")
                  for r in stale:
                      f.write(f"- `{r['id']}` ({r['owner']}) last_reviewed={r['last_reviewed']}\n")
              else:
                  f.write(f"All entries reviewed within the last {stale_days} days.\n")
          print(json.dumps({"systems": len(rows), "stale": len(stale)}))
          PY

      - uses: actions/upload-artifact@v4
        with:
          name: ai-register
          path: |
            out/register.csv
            out/register.html