Pure-function policy matrix for AI coding agents. Maps
`(repo, capability, context)` to one of three modes: `deny` / `require_approval` / `auto_allow`.
Status: 0.1.4 alpha. The public API is frozen for v0.1; examples and
hook/wrapper recipes will grow in v0.2.
AI coding agents (Claude Code, Codex, Aider, and friends) need a single place to answer one question, the same way, every time:
"The agent wants to do X in repo Y — should I let it?"
agent-policy is that single place. It is deliberately tiny:
- One pure function: `evaluate(policy, repo, capability, context)`.
- No I/O, no logging, no global state. The evaluator does not touch disk, network, or clocks. It is safe to call from a hook, a test, or a long-running daemon.
- Fail-closed defaults. A missing `default_mode` is treated as `require_approval`, unknown fields in policy files are rejected, and hard guardrails cannot be overridden by repo policy.
It does not parse shell commands, manage state, or send messages.
Those belong in the wrapper layer that calls evaluate.

    pip install yui-agent-policy

From a source checkout, install the package in editable mode so both the
library and `examples/check.py` can resolve `import agent_policy`:

    pip install -e .

Requires Python 3.11+ (uses stdlib `tomllib`). The only runtime dependency
is `pydantic >= 2`.

    from agent_policy import evaluate, PolicyMatrix, RepoPolicy

    policy = PolicyMatrix(
        default_mode="require_approval",
        repo_policy=[
            RepoPolicy(
                repo="acme/app",
                ownership_class="internal",
                capabilities={
                    "read": "auto_allow",
                    "commit": "auto_allow",
                    "push": "auto_allow",
                    "shell": "require_approval",
                },
            ),
        ],
    )

    decision = evaluate(
        policy,
        repo="acme/app",
        capability="commit",
        context={"ownership_class": "internal"},
    )
    print(decision.mode)          # "auto_allow"
    print(decision.reason)        # "repo_policy"
    print(decision.matched_repo)  # "acme/app"

Load the same policy from a TOML file:

    from agent_policy import evaluate, load_policy_file

    policy = load_policy_file("policy.toml")
    decision = evaluate(policy, repo="acme/app", capability="commit")

`evaluate` also accepts a plain dict in the same shape as `PolicyMatrix`,
which is convenient for tests and one-off scripts.
Every call returns a frozen PolicyDecision with three fields:
| Field | Type | Meaning |
|---|---|---|
| `mode` | `"deny"` \| `"require_approval"` \| `"auto_allow"` | What the caller should do. |
| `reason` | `"hard_guardrail"` \| `"repo_policy"` \| `"default_mode"` \| ... | Which rule produced the decision. |
| `matched_repo` | `str \| None` | The repo string that matched, or `None`. |

Decisions are evaluated in this order:
1. Hard guardrails: cannot be overridden by repo policy.
   - `push.force` → always `deny`.
   - `merge.pr` → always `require_approval`.
   - External `first_write_to_repo` on a mutating capability → `require_approval`. Read is not blocked.
2. Repo policy match: every `[[repo_policy]]` entry for the requested repo is scanned (optionally gated by `ownership_class`). The first entry that declares the capability wins. Splitting a repo's policy across multiple entries is supported.
3. `default_mode` fallback: used when no repo policy declares the capability. Defaults to `require_approval` if unset.

HARD_GUARDRAILS is exported as a constant so tooling can assert against
it without importing private symbols.
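The evaluation order above can be illustrated with a small stdlib-only sketch. It is not the package's implementation: `sketch_evaluate`, `SKETCH_GUARDRAILS`, and `MUTATING` are hypothetical names, the set of mutating capabilities is a guess, and both the context keys used for the external first-write check and the way an `ownership_class` gate is matched are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical stand-ins; the real package exports its own HARD_GUARDRAILS,
# whose exact shape is not shown in this README.
SKETCH_GUARDRAILS = {"push.force": "deny", "merge.pr": "require_approval"}
MUTATING = {"write", "commit", "push"}  # assumption: which capabilities mutate


def sketch_evaluate(policy: dict, repo: str, capability: str,
                    context: Optional[dict] = None) -> tuple:
    """Return (mode, reason, matched_repo) following the documented order."""
    context = context or {}

    # 1. Hard guardrails run first and never consult repo policy.
    if capability in SKETCH_GUARDRAILS:
        return SKETCH_GUARDRAILS[capability], "hard_guardrail", None
    if (context.get("ownership_class") == "external"
            and context.get("first_write_to_repo")
            and capability in MUTATING):
        return "require_approval", "hard_guardrail", None

    # 2. Scan every entry for this repo; the first entry that declares
    #    the capability wins.
    for entry in policy.get("repo_policy", []):
        if entry.get("repo") != repo:
            continue
        gate = entry.get("ownership_class")
        if gate is not None and gate != context.get("ownership_class"):
            continue  # assumption: a gated entry needs a matching context
        mode = entry.get("capabilities", {}).get(capability)
        if mode is not None:
            return mode, "repo_policy", repo

    # 3. Fall back to default_mode, which itself defaults to require_approval.
    return policy.get("default_mode", "require_approval"), "default_mode", None
```

Because the guardrail check comes first, a repo entry that tries to set `push.force` to `auto_allow` has no effect in this sketch, which mirrors the "cannot be overridden" rule.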

    # policy.toml
    default_mode = "require_approval"

    [[repo_policy]]
    repo = "acme/app"
    ownership_class = "internal"

    [repo_policy.capabilities]
    read = "auto_allow"
    commit = "auto_allow"
    push = "auto_allow"

    [[repo_policy]]
    repo = "acme/app"        # same repo, extra constraint

    [repo_policy.capabilities]
    shell = "require_approval"

Unknown top-level fields or typos inside `[[repo_policy]]` fail loudly
with a `pydantic.ValidationError`; there is no silent degradation.
agent-policy deliberately does not know how to parse git push --force
or a shell command line. The intended shape is:

               ┌────────────────────────┐
    agent ───▶ │ wrapper (hook / CLI)   │ ──▶ agent-policy.evaluate()
               │  - normalize capability│                │
               │  - build context       │                ▼
               │  - act on decision     │         PolicyDecision
               └────────────────────────┘

The wrapper owns: parsing the agent's intent, mapping it to one of the
MVP capabilities (read, write, commit, push, push.force,
merge.pr, shell), and executing whatever side effect the decision
implies (block, prompt for approval, log and allow).
A runnable minimal wrapper lives in examples/check.py.
For require_approval decisions, keep the approval layer outside
agent-policy but make the wrapper contract explicit. Production wrappers
should:
- Bind approval records to the exact capability, session, path, and command being executed. A command change after approval should fail closed.
- Record the source decision event or audit hash in the approval record, then verify that the event still exists before running the approved command.
- Treat approvals as single-use for side-effecting operations such as `artifact.publish`; reserve a local use marker before executing the command so retry races cannot reuse the same approval.
- Keep bypass corpora, private logs, `.env*` files, and red-team transcripts outside tracked paths, and add an independent scanner such as `yui-agent-guard` to CI.

These checks belong in the wrapper/admission layer rather than the pure
evaluator. The ai_resilience_policy.toml example shows the capability
vocabulary; downstream repositories can combine it with their own approval
record schema and CI gates.
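One way a wrapper might implement the binding and single-use rules above is sketched below. Everything here is hypothetical and lives outside agent-policy: the record layout (a dict with a `digest` key), the function names, and the use of marker files are assumptions. The step that verifies the source decision event still exists is left out.

```python
import hashlib
import json
import os
import tempfile


def approval_digest(capability: str, session: str, path: str, command: str) -> str:
    """Hash the exact tuple an approval covers; any change yields a new digest."""
    payload = json.dumps([capability, session, path, command])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def reserve_use(marker_dir: str, digest: str) -> bool:
    """Atomically claim an approval. Returns False if it was already used."""
    marker = os.path.join(marker_dir, digest + ".used")
    try:
        # O_EXCL makes creation fail when the marker already exists, so two
        # racing retries cannot both succeed.
        fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def may_run(record: dict, capability: str, session: str, path: str,
            command: str, marker_dir: str) -> bool:
    """Fail closed unless the record matches exactly and is still unused."""
    digest = approval_digest(capability, session, path, command)
    if record.get("digest") != digest:
        return False  # the command (or another bound field) changed
    return reserve_use(marker_dir, digest)


with tempfile.TemporaryDirectory() as tmp:
    record = {"digest": approval_digest("artifact.publish", "s1", "/repo", "make publish")}
    first = may_run(record, "artifact.publish", "s1", "/repo", "make publish", tmp)
    retry = may_run(record, "artifact.publish", "s1", "/repo", "make publish", tmp)
    changed = may_run(record, "artifact.publish", "s1", "/repo", "make publish -f", tmp)
    print(first, retry, changed)  # True False False
```

The marker is reserved before the command runs, so a crash between reservation and execution burns the approval instead of leaving it reusable.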
See examples/. Runnable after installing the package
(pip install yui-agent-policy, or pip install -e . from a source checkout):
- `policy.toml`: a minimal fail-closed policy with two repos.
- `ai_resilience_policy.toml`: a safety-oriented vocabulary example for publication, constitution, audit, secret-materialization, and scanner policy changes. These remain repo-policy capabilities rather than hard guardrails until downstream wrappers prove they are universal invariants.
- `check.py`: a tiny CLI wrapper that maps `PolicyDecision` to JSON on stdout and a process exit code, suitable for `PreToolUse` hooks.
- `claude_code_hook.sh`: a Claude Code `PreToolUse` hook that reads the hook payload from stdin, maps the tool to a capability, and shells out to `check.py`. Set `AGENT_POLICY_FILE` and `AGENT_POLICY_REPO` in the hook's environment, then point `~/.claude/settings.json` at it.
- `codex_hook.sh`: a Codex CLI `PreToolUse` hook (shell guardrail pilot). Codex hooks currently intercept Bash commands only; read, write, and edit operations are not covered. Maps `git push --force` to `push.force`, `gh pr merge` to `merge.pr`, and everything else to `shell`. Requires `features.codex_hooks = true` in your Codex config and a `hooks.json` in `~/.codex/` or `<repo>/.codex/`.
- `capability_map.py`: stdlib-only helper that turns a raw Bash command into one of `push.force` / `merge.pr` / `shell`. Both hook wrappers shell out to it instead of doing substring matching, so quoted literals like `printf '%s\n' 'git push --force'` no longer produce a false `push.force` classification. See the file header for the exact algorithm (heredoc stripping → `shlex` tokenization → scan-anywhere → recursive `bash -c` / `eval`).

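The difference between substring matching and token-based classification can be sketched as follows. This is a simplified illustration, not the contents of `capability_map.py`: heredoc stripping is omitted, the function names are hypothetical, and treating `-f` as a force flag alongside `--force` is an assumption.

```python
import shlex

FORCE_FLAGS = {"--force", "-f"}  # assumption: flags treated as a force push
SHELLS = {"bash", "sh"}


def _is_operator(token: str) -> bool:
    """True for tokens made only of shell control characters (;, &&, |, ...)."""
    return bool(token) and all(ch in ";&|()" for ch in token)


def _classify_segment(segment: list, depth: int) -> str:
    if not segment:
        return "shell"
    # Recurse into `bash -c '<script>'` and `eval <script>` wrappers.
    if segment[0] in SHELLS and "-c" in segment:
        idx = segment.index("-c")
        if idx + 1 < len(segment):
            return classify(segment[idx + 1], depth + 1)
    if segment[0] == "eval":
        return classify(" ".join(segment[1:]), depth + 1)
    # Scan anywhere in the segment so prefixes like `sudo` do not hide a match.
    for i in range(len(segment) - 1):
        if segment[i:i + 2] == ["git", "push"] and FORCE_FLAGS & set(segment[i + 2:]):
            return "push.force"
    for i in range(len(segment) - 2):
        if segment[i:i + 3] == ["gh", "pr", "merge"]:
            return "merge.pr"
    return "shell"


def classify(command: str, depth: int = 0) -> str:
    """Map a raw command line to push.force, merge.pr, or shell."""
    if depth > 5:
        return "shell"  # give up on deeply nested wrappers
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    lexer.commenters = ""  # do not let '#' hide the rest of the line
    try:
        tokens = list(lexer)
    except ValueError:
        return "shell"  # unbalanced quotes: fall back to the generic label

    found, segment = set(), []
    for token in tokens + [";"]:
        if _is_operator(token):
            found.add(_classify_segment(segment, depth))
            segment = []
        else:
            segment.append(token)
    for label in ("push.force", "merge.pr"):
        if label in found:
            return label
    return "shell"
```

In this sketch a quoted string such as `'git push --force'` stays a single token, so it never matches the two-token sequence `git`, `push`; only a nested `bash -c` or `eval` causes it to be re-tokenized.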
The Codex CLI hook feature is marked "Under development" in upstream
docs. The following gaps affect how agent-policy presents decisions through
it, and they are worth knowing before you enable it:

- `require_approval` degrades to block. Codex hook events accept only `allow` or `deny`; there is no `permissionDecision: "ask"` yet. `examples/codex_hook.sh` therefore exits `2` for both `deny` and `require_approval`, and the only UX signal distinguishing the two is the stderr line (`DENY ...` vs `require_approval ...`). Users must retry after manual approval rather than being prompted inline.
- Bash-only scope. Codex hooks intercept shell commands and nothing else. Read, write, and edit tool calls are invisible; if you need capability gating on those, use the Claude Code hook.
- Heuristic command parsing. `capability_map.py` is `shlex`-based, not a full shell. It handles quoted literals, heredocs, compound statements, and the common `bash -c '...'` / `eval` wrappers, but exotic forms such as `git --git-dir=/path push --force`, process substitution, or function definitions are not modeled. The fail-closed default is `shell`, which policy can still flag as `require_approval` or `deny`.

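The first gap can be pictured as a mapping from decision mode to exit code and stderr line. The sketch below is an assumption-laden illustration rather than the logic of `examples/codex_hook.sh`: `codex_exit` and its `detail` argument are hypothetical, exit code `0` for allow is assumed, and only the exit code `2` and the two stderr prefixes come from the description above.

```python
def codex_exit(mode: str, detail: str = "") -> tuple:
    """Map a decision mode to (exit_code, stderr_line) for an allow/deny-only hook."""
    if mode == "auto_allow":
        return 0, ""  # assumption: 0 means "let the command run"
    # require_approval cannot prompt inline, so it blocks like deny.
    # Any unrecognized mode is also treated as a block (fail closed).
    prefix = "require_approval" if mode == "require_approval" else "DENY"
    return 2, f"{prefix} {detail}".strip()


for mode in ("auto_allow", "require_approval", "deny"):
    print(mode, codex_exit(mode, "push to acme/app"))
```

A caller that needs to tell the two blocking outcomes apart has only the stderr prefix to go on, which is why the retry after manual approval happens outside the hook.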
Tag-driven. Pushing a vX.Y.Z annotated tag triggers .github/workflows/release.yml, which first verifies that the tag matches [project].version in pyproject.toml, checks that the version is not already present on PyPI, then builds the sdist + wheel and publishes to PyPI via Trusted Publishing (OIDC). No maintainer-side credentials are required. Manual workflow_dispatch with publish=false is a build-only dry run; it skips the publish job. Manual publish=true must be run against a v* tag ref; running it from a branch fails before build.
MIT.