An agent-policy drift detector for reproducible AI-assisted development. PolicyMesh finds contradictory instructions and configuration before stale repo rules make agents behave differently from run to run.
Agent policy now lives across CLAUDE.md, AGENTS.md, Cursor rules, Copilot instructions, MCP configs, Codex config, VS Code settings, Windsurf, Aider, and repo docs. PolicyMesh reads that whole surface in the checked-out repository, compares the rules that different tools see, and turns contradiction drift into a reviewable report.
```mermaid
flowchart LR
    Instructions["Instruction files<br/>CLAUDE.md · AGENTS.md · Cursor rules<br/>Copilot instructions"] --> Mesh
    Configs["Agent configs<br/>MCP · Codex · VS Code<br/>Windsurf · Aider"] --> Mesh
    PR["Pull request diff<br/>base vs head"] --> Mesh
    Mesh[("PolicyMesh<br/>contradiction + drift audit")] --> Report["Review output<br/>PR annotations · Markdown<br/>JSON · SARIF"]
    Report --> Decision["Reviewable policy decisions"]
    classDef input fill:#1e293b,stroke:#334155,color:#e2e8f0
    classDef engine fill:#0f172a,stroke:#1e293b,color:#e2e8f0,stroke-width:2px
    classDef output fill:#0c4a6e,stroke:#0369a1,color:#e0f2fe
    class Instructions,Configs,PR input
    class Mesh engine
    class Report,Decision output
```
Ships as a local-only TypeScript CLI and GitHub Action. One audit pass can produce terminal text, Markdown step summaries, PR annotations, JSON for GovVerdict, or SARIF for SAST consumers.
See also: agent-gov-core for the shared report schema · agent-gov-demo for an end-to-end sample PR · ScopeTrail for PR-level permission drift.
AI-agent behavior is now controlled by a pile of repo-local policy files. They are edited by different people, at different times, for different tools. A contradiction does not fail CI. It just changes what the agent believes it is allowed to do.
Most drift is not malicious. It is a teammate editing one agent file weeks after another file already defined a different rule. Reviewers see one file at a time in a PR. The next agent session just receives conflicting policy and silently chooses which rule to follow.
PolicyMesh exists to make those contradictions visible before they turn into nondeterministic agent behavior.
| Drift class | Example |
|---|---|
| Instruction drift | CLAUDE.md fences off sensitive paths while Cursor rules say agents may edit any file. |
| MCP drift | The same MCP server is pinned in one config, @latest in another, and disabled somewhere else. |
| Permission drift | Claude denies a sensitive read while Codex is trusted with network access and a broader sandbox. |
| Operational hazards | A config launches an MCP server through sudo, points at a broken local script, or hides a risky imperative in review noise. |
```yaml
name: PolicyMesh
on: pull_request
permissions:
  contents: read
jobs:
  policymesh:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0 # required for diff mode
      - uses: Conalh/PolicyMesh@v0.5.2
        with:
          fail-on: high
          diff: true # gate only on findings this PR introduces or worsens
```

The Action writes a Markdown report to the Actions step summary and emits PR-visible `::warning` annotations on the exact conflicting config lines.
```shell
git clone https://github.com/Conalh/PolicyMesh
cd PolicyMesh
npm install
npm run build

# Audit the bundled conflicted fixture
node dist/index.js audit --repo test/fixtures/conflicted --format markdown

# Or audit a real repo
node dist/index.js audit --repo /path/to/your/repo --format text
```

Real output from `test/fixtures/conflicted`, `--format text`:
```text
PolicyMesh agent policy review: HIGH
Effective capability union:
- 1 MCP server configured
- 3 unpinned MCP packages
- bash wildcards allowed (Claude)
- broad read paths allowed (Claude)
- network enabled (Codex)
- Codex project trusted
- Codex sandbox: workspace-write
- Strictest: Claude (1 sensitive deny rule) · Loosest: Codex (trusted + network)
[HIGH] github: MCP server "github" has different launch commands across surfaces:
"npx -y @modelcontextprotocol/server-github@1.2.3" vs "@latest" vs "@2.0.0".
Surfaces: Root MCP, Cursor MCP, VS Code MCP, Windsurf MCP, Codex.
[MEDIUM] github: unpinned command across 3 surfaces (@latest). Surfaces: Cursor, VS Code, Windsurf.
[MEDIUM] Read(.env): Claude denies Read(.env) but has broad allow rules Bash(npm *), Read(~/**).
[MEDIUM] network_access: Codex network access enabled alongside other configured surfaces.
[HIGH] github: Codex project trusted while MCP servers are unpinned and inconsistent.
```
`--format json` emits the canonical agent-gov-core Report envelope, the same shape every tool in the suite emits, so GovVerdict can merge them:

```json
{
  "schemaVersion": "1.0",
  "tool": "policy_mesh",
  "rating": "high",
  "findings": [
    {
      "tool": "policy_mesh",
      "kind": "policy_mesh.mcp_command_mismatch",
      "severity": "high",
      "message": "MCP server \"github\" has different launch commands across surfaces…",
      "location": { "file": ".mcp.json", "line": 3 },
      "salientKey": "github",
      "data": {
        "subject": "github",
        "recommendation": "Use the same pinned MCP server definition in every MCP config file.",
        "surfaces": ["root_mcp", "cursor_mcp", "vscode_mcp", "windsurf_mcp", "codex"],
        "signature": "d0bb4972fd9e855d"
      },
      "fingerprint": "ce65620cb8140af3"
    }
  ]
}
```

`--format sarif` is also supported for the GitHub Security tab and other SAST consumers.
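Because each finding in the JSON report carries a `location`, a downstream script can turn findings into GitHub workflow-command annotations on its own. The following is a minimal sketch, not PolicyMesh's renderer: the `Finding` type is inferred from the sample report above and may omit fields, and `toAnnotation`, `escapeData`, and `escapeProperty` are hypothetical helper names. The escaping follows GitHub's documented workflow-command rules.

```typescript
// Hypothetical type inferred from the sample report; the real
// agent-gov-core schema may carry more fields.
interface Finding {
  kind: string;
  severity: string;
  message: string;
  location?: { file: string; line?: number };
}

// Escape the message part of a workflow command.
function escapeData(s: string): string {
  return s.replace(/%/g, "%25").replace(/\r/g, "%0D").replace(/\n/g, "%0A");
}

// Property values additionally escape ':' and ','.
function escapeProperty(s: string): string {
  return escapeData(s).replace(/:/g, "%3A").replace(/,/g, "%2C");
}

// Render one finding as a ::warning annotation line.
function toAnnotation(f: Finding): string {
  const props: string[] = [];
  if (f.location) {
    props.push(`file=${escapeProperty(f.location.file)}`);
    if (f.location.line !== undefined) props.push(`line=${f.location.line}`);
  }
  const head = props.length > 0 ? `::warning ${props.join(",")}` : "::warning";
  return `${head}::${escapeData(`[${f.severity}] ${f.message}`)}`;
}

const sample: Finding = {
  kind: "policy_mesh.mcp_command_mismatch",
  severity: "high",
  message: 'MCP server "github" has different launch commands across surfaces',
  location: { file: ".mcp.json", line: 3 },
};

console.log(toAnnotation(sample));
```

Printing such a line from a workflow step is what makes the annotation appear on the referenced file and line in the PR.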
- Runs against the checked-out repo: no upload, no hosted scanner, no telemetry. The GitHub Action writes a Markdown report to the step summary and emits PR-visible annotations; pass `github-token` to additionally post a sticky PR comment that updates in place.
- One audit pass renders five output formats: `text` for terminals, `markdown` for step summaries and PR comments, `json` for piping to GovVerdict, `github` for `::warning` annotations on the exact conflicting line, `sarif` for the GitHub Security tab.
- Detectors group by canonical identity (for example, MCP command normalization ignores neutral flag reordering / `-y` vs `--yes` / `.cmd` vs `.exe`) and fire only when two or more surfaces actually disagree.
- Diff mode (`diff: true`) audits the PR base in a temporary worktree, audits HEAD, and gates only on new or worsened findings, so a PR does not fail on pre-existing conflicts. Findings resolved by the PR are surfaced separately as green-check signal.
- `fix` / `fix pin` can auto-align MCP enabled-state or `command` / `args` drift to a canonical surface you nominate. Always dry-run first; `--write` does line-targeted edits that preserve comments and indentation.
- Baselines. `.policymesh-exceptions.json` suppresses known-and-documented findings, optionally locked to a content signature so the suppression breaks if the violation later changes. `.policymesh-baseline.json` encodes the positive state the team requires and fires HIGH on drift.
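The idea of grouping by canonical identity and firing only on real disagreement can be sketched as below. This is an illustration under stated assumptions, not PolicyMesh's detector: `McpLaunch`, `canonicalLaunch`, and `findDisagreement` are hypothetical names, and the normalization covers only the three neutral differences listed above (flag order, `-y` vs `--yes`, `.cmd` / `.exe` suffixes). Flags are treated as valueless for simplicity.

```typescript
// Hypothetical record of how one surface launches an MCP server.
interface McpLaunch {
  surface: string;
  command: string;
  args: string[];
}

// Canonical form of a launch line: strip a .cmd/.exe suffix, map --yes
// to -y, and sort flags so neutral reordering does not matter.
// Positional arguments keep their order.
function canonicalLaunch(l: McpLaunch): string {
  const command = l.command.replace(/\.(cmd|exe)$/i, "");
  const flags: string[] = [];
  const positional: string[] = [];
  for (const arg of l.args) {
    const a = arg === "--yes" ? "-y" : arg;
    (a.charAt(0) === "-" ? flags : positional).push(a);
  }
  flags.sort();
  return [command, ...flags, ...positional].join(" ");
}

// Group surfaces by canonical form; return the groups only when at least
// two distinct forms exist, i.e. the surfaces actually disagree.
function findDisagreement(launches: McpLaunch[]): Record<string, string[]> | null {
  const groups: Record<string, string[]> = {};
  for (const l of launches) {
    const key = canonicalLaunch(l);
    (groups[key] = groups[key] || []).push(l.surface);
  }
  return Object.keys(groups).length >= 2 ? groups : null;
}

const pinned = "@modelcontextprotocol/server-github@1.2.3";
const floating = "@modelcontextprotocol/server-github@latest";

// Same launch, written differently: no finding.
const neutral = findDisagreement([
  { surface: "root_mcp", command: "npx", args: ["-y", pinned] },
  { surface: "cursor_mcp", command: "npx.cmd", args: ["--yes", pinned] },
]);

// Different package versions: a real disagreement.
const drift = findDisagreement([
  { surface: "root_mcp", command: "npx", args: ["-y", pinned] },
  { surface: "vscode_mcp", command: "npx", args: ["-y", floating] },
]);

console.log(neutral === null, drift !== null);
```

Normalizing before comparing is what keeps cosmetic differences between config files from producing noise.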
- Cross-surface by default. A single policy file is rarely the full truth; the useful signal is where two tools expose incompatible rules to the same agent workflow.
- Adoptable in messy repos. Diff mode gates on new or worsened findings, which lets teams start advisory and tighten later without fixing all historical drift first.
- One report model. CLI text, Markdown, GitHub annotations, JSON, and SARIF all render from the same report object, so the machine-readable and human-readable surfaces stay aligned.
- Narrow fixes. The `fix` commands are dry-run-first and line-targeted because policy edits should remain reviewable instead of becoming a silent rewrite of repo governance.
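Gating on new or worsened findings amounts to a set difference between two audits. The sketch below assumes findings are matched across base and head by their `fingerprint` and that "worsened" means the same fingerprint reappears at a higher severity; both are assumptions for illustration, and `diffFindings` is a hypothetical name rather than PolicyMesh's implementation.

```typescript
type Severity = "none" | "low" | "medium" | "high" | "critical";
const ORDER: Severity[] = ["none", "low", "medium", "high", "critical"];

// Hypothetical minimal finding; only the fields the diff needs.
interface Finding {
  fingerprint: string;
  severity: Severity;
}

interface Delta {
  introduced: Finding[];
  worsened: Finding[];
  resolved: Finding[];
}

function indexByFingerprint(findings: Finding[]): Record<string, Finding> {
  const out: Record<string, Finding> = Object.create(null);
  for (const f of findings) out[f.fingerprint] = f;
  return out;
}

// Compare a base audit with a head audit.
function diffFindings(base: Finding[], head: Finding[]): Delta {
  const baseBy = indexByFingerprint(base);
  const headBy = indexByFingerprint(head);
  const delta: Delta = { introduced: [], worsened: [], resolved: [] };
  for (const f of head) {
    const prev = baseBy[f.fingerprint];
    if (prev === undefined) {
      delta.introduced.push(f);
    } else if (ORDER.indexOf(f.severity) > ORDER.indexOf(prev.severity)) {
      delta.worsened.push(f);
    }
  }
  for (const f of base) {
    if (headBy[f.fingerprint] === undefined) delta.resolved.push(f);
  }
  return delta;
}

const baseAudit: Finding[] = [
  { fingerprint: "a", severity: "medium" },
  { fingerprint: "b", severity: "high" },
  { fingerprint: "c", severity: "low" },
];
const headAudit: Finding[] = [
  { fingerprint: "a", severity: "high" },   // worsened
  { fingerprint: "c", severity: "low" },    // unchanged, not gated
  { fingerprint: "d", severity: "medium" }, // introduced
];

const delta = diffFindings(baseAudit, headAudit);
// Only these findings would count toward the gate.
const gated = delta.introduced.concat(delta.worsened);
console.log(gated.length, delta.resolved.length);
```

Unchanged pre-existing findings fall out of the gate entirely, which is what lets a team adopt the check without first cleaning up historical drift.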
| Command | What it does |
|---|---|
| `policymesh audit --repo <path>` | Full repo audit. `--format text\|markdown\|json\|github\|sarif`. `--recursive` for monorepos. |
| `policymesh diff --base-ref <git-ref>` | Audit a base ref in a temp worktree, audit working tree, print the delta. |
| `policymesh diff --base-report a.json --head-report b.json` | Diff two saved JSON audits. |
| `policymesh fix --canonical <surface> [--write]` | Align MCP enabled / disabled state to a canonical surface. |
| `policymesh fix pin --canonical <surface> [--write]` | Align MCP command / args to a canonical surface. |
| `policymesh render --input <json> --format <fmt>` | Re-render a saved audit in another format. |
PolicyMesh audits nine surfaces: root_mcp, cursor_mcp, vscode_mcp, codeium_mcp, windsurf_mcp, claude, codex, aider, and instructions.
For fix --canonical <surface> and fix pin --canonical <surface>, <surface> must currently be one of the five JSON MCP surfaces: root_mcp, cursor_mcp, vscode_mcp, codeium_mcp, or windsurf_mcp. The line-targeted JSON editor preserves comments and indentation on exactly these formats; aligning the Codex TOML, Claude, Aider, or instruction surfaces is not yet supported, and fix rejects them with an error rather than risk a lossy rewrite.
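The restriction on `--canonical` can be expressed as a narrowing check over the surface names. This is a sketch only: the surface lists come from the text above, while `parseCanonical`, `isFixable`, and the error wording are hypothetical and do not reflect the CLI's actual messages.

```typescript
// All nine audited surfaces, as listed above.
const ALL_SURFACES = [
  "root_mcp", "cursor_mcp", "vscode_mcp", "codeium_mcp", "windsurf_mcp",
  "claude", "codex", "aider", "instructions",
] as const;

// The five JSON MCP surfaces that fix can currently target.
const FIXABLE_SURFACES = [
  "root_mcp", "cursor_mcp", "vscode_mcp", "codeium_mcp", "windsurf_mcp",
] as const;

type FixableSurface = typeof FIXABLE_SURFACES[number];

function isFixable(value: string): value is FixableSurface {
  return (FIXABLE_SURFACES as readonly string[]).indexOf(value) !== -1;
}

// Validate a --canonical argument; reject rather than risk a lossy rewrite.
function parseCanonical(value: string): FixableSurface {
  if (isFixable(value)) return value;
  const audited = (ALL_SURFACES as readonly string[]).indexOf(value) !== -1;
  throw new Error(
    audited
      ? `surface "${value}" is audited but cannot be a fix target yet`
      : `unknown surface "${value}"`
  );
}

console.log(parseCanonical("cursor_mcp"));
```

Separating "audited" from "fixable" in the types keeps an unsupported surface from reaching the editor code path at all.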
| Input | Default | Purpose |
|---|---|---|
| `repo` | `$GITHUB_WORKSPACE` | Checkout path to inspect. |
| `fail-on` | `none` | Severity that fails the step: `none`, `low`, `medium`, `high`, `critical`. Start advisory, raise later. |
| `diff` | `false` | On `pull_request`, gate only on findings introduced or worsened by this PR. |
| `recursive` | `false` | Monorepo mode: audit every sub-project with its own agent config independently. |
| `github-token` | (unset) | Optional `GITHUB_TOKEN` with `pull-requests: write` to post a sticky PR comment that updates in place. |
Action outputs: `rating` (`none` / `low` / `medium` / `high` / `critical`), `finding-count`, `surface-count`.
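A severity gate of the kind `fail-on` describes can be sketched as an ordered comparison. The sketch assumes the threshold is inclusive (a `high` rating fails a `fail-on: high` gate) and that `none` never fails; `shouldFail` is a hypothetical name, not the Action's code.

```typescript
type Severity = "none" | "low" | "medium" | "high" | "critical";

const RANK: Record<Severity, number> = {
  none: 0,
  low: 1,
  medium: 2,
  high: 3,
  critical: 4,
};

// True when the step should fail. fail-on "none" is advisory mode.
// Assumption: the threshold is inclusive.
function shouldFail(rating: Severity, failOn: Severity): boolean {
  if (failOn === "none") return false;
  return RANK[rating] >= RANK[failOn];
}

console.log(shouldFail("high", "high"), shouldFail("medium", "high"));
```

Starting at `none` and raising the threshold later changes only the second argument, which matches the advisory-first rollout the table suggests.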
PolicyMesh v0.5 detects:
- Instruction drift: risky imperatives in `AGENTS.md`, `CLAUDE.md`, `.cursor/rules/*.md`, and `.github/copilot-instructions.md` such as "ignore deny rules", "edit any file", or "auto-commit".
- MCP drift: command mismatches, missing-server gaps, enabled-state drift, env / header drift without echoing secret values, unpinned `@latest` packages, and hardcoded API credentials in MCP launch lines.
- Permission drift: Claude broad-allow vs narrow-deny contradictions, Claude broad allows without a `PreToolUse` hook, Claude MCP grants for servers that are not configured, Codex network-access + trusted-project + risky-MCP combinations, Codex sandbox gaps relative to Claude denies, and Aider `dangerously-allow-non-git`.
- Operational hazards: MCP servers launched via elevation utilities (`sudo`, `pkexec`, `runas`…), broken local script paths, and config parser differences that normally hide in review noise.
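As one concrete case from the list above, checking whether an npm package spec in an MCP launch line is pinned might look like the following. This is a sketch under assumptions: it treats only an exact `major.minor.patch` version (with optional pre-release or build suffix) as pinned, so a missing version, a dist-tag such as `latest`, and a range such as `^1.2.0` all count as unpinned. The text above only names `@latest`; the broader rule and the names `splitSpec` and `isPinned` are hypothetical.

```typescript
// Split an npm package spec into name and version/tag, handling scopes.
function splitSpec(spec: string): { name: string; version: string | null } {
  const at = spec.lastIndexOf("@");
  // An '@' at index 0 is the scope marker, not a version separator.
  if (at <= 0) return { name: spec, version: null };
  return { name: spec.slice(0, at), version: spec.slice(at + 1) };
}

// Exact version: digits.digits.digits with optional -prerelease / +build.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

// Assumption: anything other than an exact version counts as unpinned.
function isPinned(spec: string): boolean {
  const { version } = splitSpec(spec);
  return version !== null && EXACT_VERSION.test(version);
}

const specs = [
  "@modelcontextprotocol/server-github@1.2.3",
  "@modelcontextprotocol/server-github@latest",
  "@modelcontextprotocol/server-github",
];
for (const s of specs) {
  console.log(s, isPinned(s) ? "pinned" : "unpinned");
}
```

Only the first spec is reported as pinned; the other two can resolve to a different package version from one run to the next.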
VS Code and Cursor configs are parsed as JSONC, so comments and trailing commas are accepted.
PolicyMesh ships a labeled precision/recall benchmark over 31 fixture repos (24 with planted drift, 7 clean) spanning 24 detector kinds. Ground truth is fixed by fixture design; the harness scores the audit engine against it. Reproduce with npm run build && node benchmark/run-benchmark.mjs.
| Metric | Result |
|---|---|
| Detection (any finding) — recall | 100% (24/24 rogue repos flagged) |
| Detection — false-positive rate | 0% (0/7 clean repos flagged) |
| Detection — precision | 100% |
| Correct primary finding kind | 24/24 rogue repos |
| All expected finding kinds | 24/24 rogue repos |
| Exact consolidated rating | 21/21 repos where the label pins one |
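The detection rows in the table follow from a standard confusion matrix. The sketch below recomputes them from the counts stated above (24 of 24 rogue repos flagged, 0 of 7 clean repos flagged); the `Confusion` type and `metrics` function are illustrative names and not part of the benchmark harness.

```typescript
interface Confusion {
  tp: number; // rogue repos flagged
  fn: number; // rogue repos missed
  fp: number; // clean repos flagged
  tn: number; // clean repos left alone
}

function metrics(c: Confusion) {
  return {
    recall: c.tp / (c.tp + c.fn),
    precision: c.tp / (c.tp + c.fp),
    falsePositiveRate: c.fp / (c.fp + c.tn),
  };
}

// Counts taken from the table: 24/24 rogue flagged, 0/7 clean flagged.
const anyFinding: Confusion = { tp: 24, fn: 0, fp: 0, tn: 7 };
const result = metrics(anyFinding);
console.log(result.recall, result.precision, result.falsePositiveRate);
```

With these counts, recall and precision are both 1 and the false-positive rate is 0, matching the 100% / 100% / 0% rows.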
The 7 clean repos include four engineered false-positive traps — neutral npx -y vs npx, multi-line Codex TOML args, Aider dangerously-allow-non-git: false, and benign "Always/Never" prose — plus a baseline-satisfied repo and an active exception. None produce a finding.
Severity is calibrated, not maximized. At a strict fail-on: high gate, recall is 54% — by design: env / header / enabled-state drift and missing servers are medium-or-low because they are reproducibility hazards, not exploits. The critical band is reserved for hardcoded secrets. Full confusion matrix at every gate, per-category and per-case breakdowns: benchmark/RESULTS.md. Methodology and labels: benchmark/labels.json.
Local-only OSS tools that review AI-agent PRs and coding sessions for config drift, policy mismatches, and scope creep. Each tool covers an orthogonal failure mode; they share a canonical Finding schema and can be merged into a single verdict.
| Repo | What it catches |
|---|---|
| ScopeTrail | Diffs agent config files between PR base and head — permission drift. |
| PolicyMesh (this repo) | Finds contradictory agent instructions and config drift that make behavior non-reproducible. |
| CapabilityEcho | Network, subprocess, eval, lifecycle, and workflow-permission signals in code diffs. |
| TaskBound | Compares the stated task to the actual diff — scope creep. |
| SessionTrail | Parses Cursor / Claude / Codex JSONL session transcripts for runtime behavior. |
| GovVerdict | Merges JSON reports from the tools above into a single verdict. |
| agent-gov-core | Shared parsers, the canonical Finding schema, mergeFindings. |
| agent-gov-demo | Sandbox repo with a rogue PR that exercises all five tools end-to-end. |
Demo PR exercising the full stack: agent-gov-demo#1
MIT. Bug reports and false-positive reports welcome via Issues.