diff --git a/apps/web/src/content/docs/docs/guides/enterprise-governance.mdx b/apps/web/src/content/docs/docs/guides/enterprise-governance.mdx
new file mode 100644
index 00000000..217a237c
--- /dev/null
+++ b/apps/web/src/content/docs/docs/guides/enterprise-governance.mdx
@@ -0,0 +1,141 @@
+---
+title: Enterprise Governance
+description: A Git-native pattern for inventorying and reviewing the AI systems in your organisation, using a `.ai-register.yaml` per repo and a GitHub Action to aggregate them.
+sidebar:
+  order: 9
+---
+
+This guide describes a lightweight convention for keeping a documented
+**AI system inventory** — the thing every modern AI-governance framework
+asks for — without adopting a governance platform.
+
+You should be able to read this in under ten minutes and have something
+running by the end.
+
+## Why a manifest
+
+Every modern AI-governance framework expects a documented inventory of AI
+systems:
+
+- **NIST AI RMF GOVERN-1.3** — documented AI system inventory.
+- **ISO/IEC 42001:2023 Clause 7** — AI system documentation.
+- **EU AI Act Annex IV** — technical documentation per high-risk system.
+
+Large enterprises typically answer this with governance platforms (Credo AI,
+OneTrust AI Governance, ServiceNow AI Control Tower, IBM watsonx.governance).
+Smaller teams, open-source projects, or orgs that haven't invested in a
+platform need a lighter pattern that still satisfies an auditor.
+
+A Git-native manifest per repo, aggregated on a schedule by a GitHub
+Action, gets you audit-grade inventory at zero infra cost. If you later
+adopt a governance platform, **the same manifests become its import
+source** — nothing has to be re-keyed.
+
+## What it looks like
+
+In the **repo root** of each AI system, commit a `.ai-register.yaml`:
+
+```yaml
+system:
+  id: example-support-agent
+  name: Example Customer Support Agent
+  owner: support-platform-team
+  risk_tier: high # EU AI Act vocabulary
+  deployment: production
+  data_classification: restricted
+  description: Answers customer-support questions over chat.
+  models:
+    - provider: anthropic
+      model: claude-opus-4-7
+  evals:
+    path: evals/
+    runs_in_ci: true
+  controls: # <framework>-<version>:<control-id>
+    - NIST-AI-RMF-1.0:GOVERN-1.3
+    - ISO-42001-2023:Clause-7
+    - EU-AI-ACT-2024:Art.55
+    - INTERNAL-AI-POLICY-1.0:CTRL-CUSTOMER-ISOLATION
+  last_reviewed: 2026-04-24
+```
+
+The full example, including comments, is in the agentv repo at
+`examples/governance/ai-register/.ai-register.yaml`.
+
+### Why these fields
+
+- **`risk_tier`** — EU AI Act vocabulary (`prohibited | high | limited | minimal`).
+  Other vocabularies (e.g. NIST 800-30) work too; pick one and stick with it.
+- **`controls`** — same string format as the eval-level `governance` schema
+  documented in [governance metadata]. That overlap is intentional: a
+  control declared on a system can be cross-referenced against the controls
+  exercised by its evals.
+- **`last_reviewed`** — a date. Aggregators flag entries older than
+  whatever cadence your governance team works to.
+- **`evals.path`** — a pointer to the agentv evals that exercise this
+  system. The aggregator does not run them; it just records that they exist.
+
+## Aggregating across the org
+
+In a dedicated `ai-register` repo (or your existing governance repo), drop
+`.github/workflows/aggregate.yml` from `examples/governance/ai-register/`.
+The workflow:
+
+1. Searches the org via `gh api search/code` for every `.ai-register.yaml`.
+2. Fetches each one via `gh api repos/.../contents`.
+3. Aggregates them with a small Python script into `register.csv` and a
+   self-contained `register.html` table.
+4.
+   Surfaces stale entries (`last_reviewed` older than 90 days) on the
+   workflow summary and uploads the CSV + HTML as workflow artifacts.
+
+Required secret: **`GH_AGGREGATE_TOKEN`** with `repo` scope (add
+`read:org` for org-wide search), scoped to the org you want to
+enumerate. For public repos the default `GITHUB_TOKEN` is sufficient.
+
+The workflow is about 175 lines of YAML, runs in a single job, and
+has no third-party dependencies beyond `gh` (preinstalled on
+`ubuntu-latest`) and `PyYAML`.
+
+## Day-2 operations
+
+A useful starting cadence:
+
+- Engineers update `.ai-register.yaml` whenever a system enters or leaves
+  production, or its model / scope changes materially.
+- The aggregator runs weekly via cron.
+- The workflow summary is the source of truth for stale entries; if your
+  team prefers a Slack ping, add one extra step that posts to a webhook.
+- Quarterly, the governance team walks the CSV and updates `last_reviewed`
+  on the systems they signed off on.
+
+That's the whole loop.
+
+## Relationship to evaluation
+
+agentv does not parse `.ai-register.yaml`. The convention is **orthogonal**:
+
+- The manifest documents **which AI systems exist**, who owns them, and
+  which controls they are accountable for.
+- The eval YAML documents **which behaviour a given system was tested
+  against**.
+
+Both files use the same `<framework>-<version>:<control-id>` control
+format, so a script can intersect "manifest claims this system is covered
+by NIST-AI-RMF-1.0:MEASURE-2.7" with "eval results show 14 cases tagged
+NIST-AI-RMF-1.0:MEASURE-2.7 ran this quarter."
+
+## Migration to a governance platform
+
+When and if your org adopts Credo AI / OneTrust AI Governance /
+ServiceNow AI Control Tower / IBM watsonx.governance:
+
+- Each platform accepts CSV / JSON imports keyed on system identifiers.
+- Your `register.csv` artifact already has the per-system row each
+  importer expects.
+- The `controls` column maps directly onto the framework-control fields
+  the platform exposes — there is nothing to re-key.
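If you do need to massage `register.csv` into an importer's shape, the `controls` column splits mechanically. A minimal sketch (the target record layout here is hypothetical, since every platform's import schema differs, but the split follows the `<framework>-<version>:<control-id>` convention used throughout this guide):

```python
import csv
import io

def parse_controls(cell: str) -> list[dict]:
    """Split a register.csv `controls` cell ("A; B") into
    {framework, version, control} dicts, per the
    <framework>-<version>:<control-id> convention."""
    out = []
    for token in filter(None, (t.strip() for t in cell.split(";"))):
        prefix, _, control = token.partition(":")
        framework, _, version = prefix.rpartition("-")
        out.append({"framework": framework, "version": version, "control": control})
    return out

# One register.csv row, as produced by the aggregator (columns trimmed).
sample = io.StringIO(
    "id,owner,controls\n"
    'example-support-agent,support-platform-team,'
    '"NIST-AI-RMF-1.0:GOVERN-1.3; ISO-42001-2023:Clause-7"\n'
)
for row in csv.DictReader(sample):
    record = {
        "system_id": row["id"],       # hypothetical importer field names
        "owner": row["owner"],
        "controls": parse_controls(row["controls"]),
    }
    print(record["controls"][0])
    # → {'framework': 'NIST-AI-RMF', 'version': '1.0', 'control': 'GOVERN-1.3'}
```

Point this at the `register.csv` artifact and serialise `record` however the target importer expects.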
+
+You don't have to rip out the manifest convention either. Most teams keep
+the Git-native artifact as the **canonical source** and the platform as
+the **operations surface**, syncing one direction.
+
+[governance metadata]: ./agent-eval-layers/
diff --git a/examples/governance/ai-register/.ai-register.yaml b/examples/governance/ai-register/.ai-register.yaml
new file mode 100644
index 00000000..785770c0
--- /dev/null
+++ b/examples/governance/ai-register/.ai-register.yaml
@@ -0,0 +1,37 @@
+system:
+  id: example-support-agent
+  name: Example Customer Support Agent
+  owner: support-platform-team
+  risk_tier: high # EU AI Act vocabulary: prohibited | high | limited | minimal
+  deployment: production
+  data_classification: restricted
+  description: >-
+    Answers customer support questions over chat. Routes to humans when the
+    user requests a refund or asks anything outside the documented FAQ.
+
+  # Models actually in use. Versioned strings so an auditor a year from now
+  # can see which family + model identifier was running, not just "Anthropic".
+  models:
+    - provider: anthropic
+      model: claude-opus-4-7
+    - provider: openai
+      model: gpt-4o-mini
+
+  # Pointer to the agentv evals that exercise this system. Aggregator scripts
+  # can `git ls-tree HEAD evals/` to count cases, or run `agentv eval` on demand.
+  evals:
+    path: evals/
+    runs_in_ci: true
+
+  # Cross-framework controls. Format follows the convention in #1161:
+  # <framework>-<version>:<control-id>. Custom prefixes are explicitly supported.
+  controls:
+    - NIST-AI-RMF-1.0:GOVERN-1.3
+    - NIST-AI-RMF-1.0:MEASURE-2.7
+    - ISO-42001-2023:Clause-7
+    - EU-AI-ACT-2024:Art.55
+    - INTERNAL-AI-POLICY-1.0:CTRL-CUSTOMER-ISOLATION
+
+  # When the system was last reviewed by the owning team. Aggregators flag
+  # entries older than your governance cadence (quarterly is typical).
+  last_reviewed: 2026-04-24
diff --git a/examples/governance/ai-register/.github/workflows/aggregate.yml b/examples/governance/ai-register/.github/workflows/aggregate.yml
new file mode 100644
index 00000000..daebfb25
--- /dev/null
+++ b/examples/governance/ai-register/.github/workflows/aggregate.yml
@@ -0,0 +1,174 @@
+name: aggregate-ai-register
+
+# Sweeps every repo the bot can see for an `.ai-register.yaml`, merges them
+# into a CSV + a static HTML dashboard, and surfaces stale entries (those
+# whose `last_reviewed` is older than STALE_DAYS) on the workflow summary.
+#
+# Drop this file into a dedicated `ai-register` repo (or your governance
+# repo). Each AI-system repo just commits a `.ai-register.yaml` at its
+# root — there is nothing to install on those repos.
+#
+# Required: GH_AGGREGATE_TOKEN secret with `repo` (or `read:org` + `repo`)
+# scope, scoped to the GitHub org or user you want to enumerate. The
+# default GITHUB_TOKEN works for public repos; a PAT is needed for
+# private ones.
+
+on:
+  workflow_dispatch:
+    inputs:
+      org:
+        description: GitHub org or user to scan (defaults to current repo owner)
+        required: false
+  schedule:
+    - cron: "0 6 * * 1" # every Monday 06:00 UTC
+
+env:
+  STALE_DAYS: "90"
+
+permissions:
+  contents: read
+  issues: write # optional: lets a follow-up step post the stale list as an issue comment
+
+jobs:
+  aggregate:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Locate .ai-register.yaml across the org
+        id: locate
+        env:
+          GH_TOKEN: ${{ secrets.GH_AGGREGATE_TOKEN || secrets.GITHUB_TOKEN }}
+          ORG: ${{ inputs.org || github.repository_owner }}
+        run: |
+          set -euo pipefail
+          mkdir -p out
+          # GitHub code search returns up to 1000 hits; use REST search/code
+          # filtered to filename, paging with per_page + page.
+          : > out/repos.txt
+          page=1
+          while :; do
+            resp=$(gh api -X GET search/code \
+              -f q="filename:.ai-register.yaml org:${ORG}" \
+              -F per_page=100 -F page=$page)
+            count=$(jq -r '.items | length' <<<"$resp")
+            jq -r '.items[] | "\(.repository.full_name)\t\(.path)"' <<<"$resp" >> out/repos.txt
+            [ "$count" -lt 100 ] && break
+            page=$((page+1))
+          done
+          wc -l out/repos.txt
+
+      - name: Fetch each manifest
+        env:
+          GH_TOKEN: ${{ secrets.GH_AGGREGATE_TOKEN || secrets.GITHUB_TOKEN }}
+        run: |
+          set -euo pipefail
+          mkdir -p out/manifests
+          while IFS=$'\t' read -r repo path; do
+            [ -z "$repo" ] && continue
+            slug=$(echo "$repo" | tr '/' '_')
+            gh api "repos/${repo}/contents/${path}" --jq '.content' \
+              | base64 -d > "out/manifests/${slug}.yaml" || echo "skip $repo"
+          done < out/repos.txt
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - run: pip install --quiet "PyYAML==6.0.2"
+
+      - name: Aggregate to CSV + HTML
+        run: |
+          python3 - <<'PY'
+          import csv, datetime, glob, html, json, os, pathlib, yaml
+
+          stale_days = int(os.environ.get("STALE_DAYS", "90"))
+          today = datetime.date.today()
+          rows = []
+          stale = []
+          for fn in sorted(glob.glob("out/manifests/*.yaml")):
+              try:
+                  data = yaml.safe_load(pathlib.Path(fn).read_text())
+              except yaml.YAMLError as e:
+                  rows.append({"id": fn, "error": str(e)})
+                  continue
+              sys_ = (data or {}).get("system", {})
+              row = {
+                  "id": sys_.get("id", ""),
+                  "name": sys_.get("name", ""),
+                  "owner": sys_.get("owner", ""),
+                  "risk_tier": sys_.get("risk_tier", ""),
+                  "deployment": sys_.get("deployment", ""),
+                  "last_reviewed": str(sys_.get("last_reviewed", "")),
+                  "controls": "; ".join(sys_.get("controls", []) or []),
+                  "models": "; ".join(
+                      f"{m.get('provider', '?')}:{m.get('model', '?')}"
+                      for m in sys_.get("models", []) or []
+                  ),
+                  "source_file": pathlib.Path(fn).name,
+              }
+              rows.append(row)
+              try:
+                  reviewed = datetime.date.fromisoformat(row["last_reviewed"])
+                  if (today - reviewed).days > stale_days:
+                      stale.append(row)
+              except (TypeError, ValueError):
+                  pass
+
+          out = pathlib.Path("out")
+          with (out / "register.csv").open("w", newline="") as f:
+              writer = csv.DictWriter(
+                  f,
+                  fieldnames=["id", "name", "owner", "risk_tier", "deployment",
+                              "last_reviewed", "controls", "models",
+                              "source_file"],
+              )
+              writer.writeheader()
+              for r in rows:
+                  writer.writerow({k: r.get(k, "") for k in writer.fieldnames})
+
+          def esc(v): return html.escape(str(v))
+          th = lambda s: f"<th>{esc(s)}</th>"
+          td = lambda s: f"<td>{esc(s)}</td>"
+          headers = ["id", "name", "owner", "risk_tier", "deployment",
+                     "last_reviewed", "controls", "models", "source_file"]
+          tbl = ["<table>",
+                 "<thead><tr>" + "".join(th(h) for h in headers) + "</tr></thead>",
+                 "<tbody>"]
+          for r in rows:
+              tbl.append("<tr>" + "".join(td(r.get(h, "")) for h in headers) + "</tr>")
+          tbl += ["</tbody>", "</table>"]
+          stale_html = ""
+          if stale:
+              stale_html = "<h2>Stale entries (&gt;" + str(stale_days) + " days)</h2><ul>"
+              for r in stale:
+                  stale_html += f"<li>{esc(r['id'])} ({esc(r['owner'])}) — last_reviewed={esc(r['last_reviewed'])}</li>"
+              stale_html += "</ul>"
+          (out / "register.html").write_text(
+              "<!DOCTYPE html><html><head><meta charset='utf-8'>"
+              "<title>AI System Register</title></head><body>"
+              "<h1>AI System Register</h1>"
+              + stale_html
+              + "\n".join(tbl)
+              + "</body></html>"
+          )
+
+          # Workflow summary
+          summary = pathlib.Path(os.environ.get("GITHUB_STEP_SUMMARY", "/dev/null"))
+          with summary.open("a") as f:
+              f.write(f"## AI System Register\n\nFound **{len(rows)}** systems.\n\n")
+              if stale:
+                  f.write(f"### Stale (>{stale_days} days)\n\n")
+                  for r in stale:
+                      f.write(f"- `{r['id']}` ({r['owner']}) last_reviewed={r['last_reviewed']}\n")
+              else:
+                  f.write(f"All entries reviewed within the last {stale_days} days.\n")
+          print(json.dumps({"systems": len(rows), "stale": len(stale)}))
+          PY
+
+      - uses: actions/upload-artifact@v4
+        with:
+          name: ai-register
+          path: |
+            out/register.csv
+            out/register.html
diff --git a/examples/governance/ai-register/README.md b/examples/governance/ai-register/README.md
new file mode 100644
index 00000000..8a7becef
--- /dev/null
+++ b/examples/governance/ai-register/README.md
@@ -0,0 +1,45 @@
+# `.ai-register.yaml` — Git-native AI system register
+
+A two-file pattern for documenting your AI systems against the governance
+frameworks every Year-1 auditor will ask about (NIST AI RMF GOVERN-1.3,
+ISO/IEC 42001 Clause 7, EU AI Act Annex IV).
+
+```
+your-org/
+├── service-a/.ai-register.yaml          # one per AI-system repo
+├── service-b/.ai-register.yaml
+├── …
+└── ai-register/                         # one aggregator repo
+    └── .github/workflows/aggregate.yml  # walks the org, builds CSV + HTML
+```
+
+The full pattern, motivation, and migration notes are documented in the
+agentv.dev guide: **Enterprise governance** at
+`/docs/guides/enterprise-governance/`. This directory ships the example
+manifest and the aggregator workflow file.
+
+## Contents
+
+- **`.ai-register.yaml`** — example manifest. Drop a copy at the **repo root**
+  of each AI system you want to inventory, and edit the fields. `controls`
+  uses the same `<framework>-<version>:<control-id>` shape as the eval-level
+  governance schema in #1161, so the same string appears in the manifest and
+  in eval result JSONL — that's the correlation point.
+
+- **`.github/workflows/aggregate.yml`** — copy this into a dedicated
+  governance repo (commonly named `ai-register`). It runs weekly (and on
+  manual dispatch), walks the org for every `.ai-register.yaml`, and uploads
+  a CSV + static HTML dashboard as a workflow artifact. Stale entries
+  (`last_reviewed` older than `STALE_DAYS`, default 90) surface on the
+  workflow summary and can be wired to an issue comment, Slack webhook, or
+  whatever notification channel you already use.
+
+## Why this stays out of agentv core
+
+agentv does not parse `.ai-register.yaml`. The convention is deliberately
+free-standing: if you later adopt a governance platform (Credo AI, OneTrust,
+ServiceNow AI Control Tower, IBM watsonx.governance), these manifests are
+your import source — not a thing you need to migrate away from.
+
+If the convention grows, that growth happens in conversation between teams
+adopting it; agentv stays lightweight.
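The "correlation point" the README mentions can be sketched in a few lines. This is an illustrative sketch, not part of the shipped example: the manifest side comes straight from `controls`, while the `controls` field name in the eval-result JSONL is an assumption you should adapt to whatever your eval schema actually emits.

```python
import json

# Controls the manifest claims (from .ai-register.yaml `controls`).
manifest_controls = {
    "NIST-AI-RMF-1.0:GOVERN-1.3",
    "NIST-AI-RMF-1.0:MEASURE-2.7",
}

# Eval result records; the `controls` field name here is an assumption.
results_jsonl = """\
{"case": "refund-escalation", "controls": ["NIST-AI-RMF-1.0:MEASURE-2.7"]}
{"case": "faq-grounding", "controls": ["NIST-AI-RMF-1.0:MEASURE-2.7"]}
"""

exercised = set()
for line in results_jsonl.splitlines():
    exercised.update(json.loads(line).get("controls", []))

covered = manifest_controls & exercised        # claimed AND tested
unexercised = manifest_controls - exercised    # claimed but never tested
print(sorted(covered))      # → ['NIST-AI-RMF-1.0:MEASURE-2.7']
print(sorted(unexercised))  # → ['NIST-AI-RMF-1.0:GOVERN-1.3']
```

Because both files carry identical `<framework>-<version>:<control-id>` strings, the intersection is plain set arithmetic; `unexercised` is the list a governance reviewer would chase down.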