Merged
4 changes: 2 additions & 2 deletions README.md
@@ -138,7 +138,7 @@ ScopeTrail is **local-only**. It reads the checked-out repository, materializes

The detectors cover the surfaces an AI agent can actually escalate through:

- **MCP** — `.mcp.json`, `.cursor/mcp.json`, `.vscode/mcp.json`, `.codeium/windsurf/mcp_config.json`, sample/template/disabled variants, and prefixed sample files such as `claude_mcp_config.json`.
- **MCP** — `.mcp.json`, `.cursor/mcp.json`, `.vscode/mcp.json`, `.codeium/windsurf/mcp_config.json`. Sample/template/disabled variants (and prefixed sample files such as `claude_mcp_config.json`) are reviewed only with `--include-samples` / the Action's `include-samples: true` — those files never load into an agent, so a change to one is not permission drift.
- **Claude Code settings** — `.claude/settings.json`, including widened allow rules, removed deny rules, and added / removed / command-swapped hooks.
- **Codex** — `.codex/config.toml`, including sandbox elevation, weakened approval policy, network access, trusted-project changes, and `[mcp_servers.NAME]` additions / unpinned commands.

@@ -159,7 +159,7 @@ ScopeTrail ships a labeled precision/recall benchmark over **35 fixture PRs** (2

The 8 benign cases include seven engineered **false-positive traps** — narrowly-scoped Claude grants (a textual diff sees new `allow` lines), an all-tightening Codex posture, network access that was *already* on, a brand-new Codex config pinned to the narrowest posture, a dropped MCP `env` var, a removed MCP server, and a `.mcp.json` with reordered keys but an identical launch command — plus one byte-identical snapshot. None produce a finding, because the detectors compare semantics and flag only *widening*.

**Severity is calibrated, not maximized.** At a strict `fail-on: high` gate, recall is 85% — by design: sample/template MCP additions, pinned version bumps, broad `Read` allows, and newly-enabled Codex network access sit at `low`/`medium` because they widen the surface without being directly exploitable. The `high`/`critical` band is reserved for executable or secret-facing changes — a bare `Bash` grant, a removed `Read(.env)` deny, a `danger-full-access` sandbox, an unencrypted remote MCP endpoint. Full confusion matrix at every gate, per-category and per-case breakdowns: [benchmark/RESULTS.md](benchmark/RESULTS.md). Methodology and labels: [benchmark/labels.json](benchmark/labels.json).
**Severity is calibrated, not maximized.** At a strict `fail-on: high` gate, recall is 85% — by design: opt-in sample/template MCP additions, pinned version bumps, broad `Read` allows, and newly-enabled Codex network access sit at `low`/`medium` because they widen the surface without being directly exploitable. The `high`/`critical` band is reserved for executable or secret-facing changes — a bare `Bash` grant, a removed `Read(.env)` deny, a `danger-full-access` sandbox, an unencrypted remote MCP endpoint. Full confusion matrix at every gate, per-category and per-case breakdowns: [benchmark/RESULTS.md](benchmark/RESULTS.md). Methodology and labels: [benchmark/labels.json](benchmark/labels.json).

## Design choices worth flagging

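Reviewer note: the `fail-on` gate described in the README hunk above can be pictured as a rank comparison over severities. The following is a minimal sketch under assumptions, not ScopeTrail's actual implementation: the `Finding` shape, the `SEVERITY_RANK` table, and the `findingsAtOrAbove` helper are all hypothetical names introduced for illustration. Only the severity labels (`none`, `low`, `medium`, `high`, `critical`) come from the diff.

```typescript
// Hypothetical sketch: none of these names are taken from ScopeTrail's source.
type Severity = 'low' | 'medium' | 'high' | 'critical';
type FailOn = 'none' | Severity;

interface Finding {
  id: string;
  severity: Severity;
}

// Higher rank means more severe. The ordering is the one implied by the README.
const SEVERITY_RANK: Record<Severity, number> = {
  low: 1,
  medium: 2,
  high: 3,
  critical: 4
};

// Returns the findings that meet or exceed the gate.
// An empty result means the gate passes; 'none' never blocks.
function findingsAtOrAbove(findings: Finding[], failOn: FailOn): Finding[] {
  if (failOn === 'none') {
    return [];
  }
  const threshold = SEVERITY_RANK[failOn];
  return findings.filter((finding) => SEVERITY_RANK[finding.severity] >= threshold);
}

// Illustrative findings only; the ids are made up for this example.
const findings: Finding[] = [
  { id: 'broad-read-allow', severity: 'medium' },
  { id: 'bare-bash-grant', severity: 'high' }
];

const blocking = findingsAtOrAbove(findings, 'high');
```

With a `high` gate only the Bash grant blocks, which is the trade-off the README calls "calibrated, not maximized": the `medium` finding is still reported but does not fail the run.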
17 changes: 16 additions & 1 deletion action.yml
@@ -23,6 +23,12 @@ inputs:
description: Severity that fails the action. Use none, low, medium, high, or critical.
required: false
default: none
include-samples:
description: >-
Also review sample/template/disabled MCP configs (.mcp.json.template, .sample, prefixed examples).
Off by default because those files never load into an agent, so changes to them are not permission drift.
required: false
default: 'false'

outputs:
rating:
@@ -47,6 +53,7 @@ runs:
SCOPE_BASE: ${{ inputs.base }}
SCOPE_HEAD: ${{ inputs.head }}
SCOPE_FAIL_ON: ${{ inputs.fail-on }}
SCOPE_INCLUDE_SAMPLES: ${{ inputs.include-samples }}
DEFAULT_BASE: ${{ github.event.pull_request.base.sha || github.event.before }}
DEFAULT_HEAD: ${{ github.event.pull_request.head.sha || github.sha }}
run: |
@@ -57,6 +64,13 @@ runs:
head="${SCOPE_HEAD:-$DEFAULT_HEAD}"
fail_on="${SCOPE_FAIL_ON:-none}"

# Optional opt-in: review sample/template/disabled MCP configs too.
# Empty unless requested, so the unquoted expansion below adds no arg.
include_samples=""
if [ "${SCOPE_INCLUDE_SAMPLES:-false}" = "true" ]; then
include_samples="--include-samples"
fi

if [ -z "$base" ] || [ -z "$head" ]; then
echo "::error::ScopeTrail needs base and head refs. Pass base/head inputs or run on pull_request with actions/checkout fetch-depth: 0."
exit 2
@@ -77,7 +91,8 @@ runs:
--format github \
--out-markdown "$report_file" \
--out-json "$json_file" \
--fail-on "$fail_on"
--fail-on "$fail_on" \
$include_samples
cli_status=$?
set -e

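Reviewer note: the shell in `action.yml` relies on an empty, unquoted variable so that no extra argument is passed unless the input is exactly `true`. The same idea can be sketched in TypeScript with a conditional spread. This is an illustration under assumptions: `buildDiffArgs` and `DiffArgsInput` are hypothetical names, and only the flags themselves (`--base`, `--head`, `--fail-on`, `--include-samples`) appear in the diff.

```typescript
// Hypothetical helper: it mirrors the intent of the action.yml shell step, not its code.
interface DiffArgsInput {
  base: string;
  head: string;
  failOn: string;
  // Raw input value. GitHub Actions inputs arrive as strings, possibly unset.
  includeSamples: string | undefined;
}

function buildDiffArgs(input: DiffArgsInput): string[] {
  return [
    'diff',
    '--base', input.base,
    '--head', input.head,
    '--fail-on', input.failOn,
    // Only the exact string "true" opts in, matching the shell test in action.yml.
    // Anything else contributes zero elements, so no empty argument is ever passed.
    ...(input.includeSamples === 'true' ? ['--include-samples'] : [])
  ];
}

const defaultArgs = buildDiffArgs({
  base: 'main',
  head: 'feature',
  failOn: 'none',
  includeSamples: undefined
});
```

Building an array avoids the word-splitting subtlety that the shell comment has to explain, at the cost of not being usable directly inside a composite action's `run:` step.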
7 changes: 6 additions & 1 deletion benchmark/run-benchmark.mjs
@@ -44,7 +44,12 @@ function runDiff(fixture) {
'--new',
join(caseDir, 'new'),
'--format',
'json'
'json',
// The corpus labels sample/template MCP fixtures (category `mcp-sample`) as
// expected detections, so the benchmark opts into that surface explicitly.
// The default CLI report leaves it off — those files never load into an
// agent, so a change to one is not permission drift.
'--include-samples'
];
const stdout = execFileSync('node', [cli, ...args], {
encoding: 'utf8',
16 changes: 15 additions & 1 deletion dist/detectors/mcp.js
@@ -47,7 +47,7 @@ const IGNORED_SAMPLE_SCAN_DIRS = new Set([
// reads. Keeping the source of truth in the detector prevents the
// snapshot list and the detector list from drifting (they did, before).
export const MCP_TARGET_PATHS = MCP_CONFIGS.map((config) => config.path);
export async function detectMcpDrift(oldRoot, newRoot) {
export async function detectMcpDrift(oldRoot, newRoot, options = {}) {
const findings = [];
for (const config of MCP_CONFIGS) {
// Surface invalid JSON as a finding instead of silently producing
@@ -139,6 +139,20 @@ export async function detectMcpDrift(oldRoot, newRoot) {
}
}
}
// Sample/template configs are an opt-in surface — see McpDriftOptions for why
// a file that no agent loads can't be drift. Off by default keeps the report
// scoped to live configuration.
if (options.includeSamples) {
findings.push(...(await detectMcpSampleDrift(oldRoot, newRoot)));
}
return findings;
}
// Diff sample/template/disabled MCP configs on their own low-severity track so
// a noisy template change can be reviewed for copy-paste hygiene without ever
// being mistaken for a change to what an agent can actually do. Only runs when
// the caller opts in via McpDriftOptions.includeSamples.
async function detectMcpSampleDrift(oldRoot, newRoot) {
const findings = [];
for (const path of await listMcpSampleConfigPaths(oldRoot, newRoot)) {
const config = { path, serverKeys: ['mcpServers', 'servers'] };
const newSource = await readJsonObjectWithSource(configPath(newRoot, path));
18 changes: 12 additions & 6 deletions dist/git-snapshot.js
@@ -18,12 +18,12 @@ export const SNAPSHOT_PATHS = [
...CLAUDE_TARGET_PATHS,
...CODEX_TARGET_PATHS
];
export async function materializeGitSnapshot(repo, ref) {
export async function materializeGitSnapshot(repo, ref, options = {}) {
await verifyGitRef(repo, ref);
const root = await mkdtemp(join(tmpdir(), 'scopetrail-snapshot-'));
let completed = false;
try {
for (const relativePath of await snapshotPathsForRef(repo, ref)) {
for (const relativePath of await snapshotPathsForRef(repo, ref, options.includeSamples ?? false)) {
const content = await readPathAtRef(repo, ref, relativePath);
if (content === null) {
continue;
@@ -53,11 +53,17 @@ export async function materializeGitSnapshot(repo, ref) {
}
}
}
async function snapshotPathsForRef(repo, ref) {
async function snapshotPathsForRef(repo, ref, includeSamples) {
const paths = new Set(SNAPSHOT_PATHS);
for (const relativePath of await listPathsAtRef(repo, ref)) {
if (isMcpSampleConfigPath(relativePath)) {
paths.add(relativePath);
// Sample/template configs are opt-in (see McpDriftOptions). When the caller
// hasn't asked for them, skip the full `git ls-tree -r` walk entirely — the
// detector would ignore the materialized files anyway, so listing every
// tracked path on every PR is wasted work.
if (includeSamples) {
for (const relativePath of await listPathsAtRef(repo, ref)) {
if (isMcpSampleConfigPath(relativePath)) {
paths.add(relativePath);
}
}
}
return [...paths].sort();
28 changes: 21 additions & 7 deletions dist/index.js
@@ -1,3 +1,3 @@
#!/usr/bin/env node
import { realpathSync } from 'node:fs';
import { writeFile } from 'node:fs/promises';
@@ -33,12 +33,16 @@
}
else {
try {
const baseSnapshot = await materializeGitSnapshot(parsed.repo, parsed.base);
const baseSnapshot = await materializeGitSnapshot(parsed.repo, parsed.base, {
includeSamples: parsed.includeSamples
});
// `cleanup` is only assigned once BOTH snapshots exist, so if head
// materialization fails (an unresolvable head ref, a max-buffer error)
// the base snapshot's temp dir would leak. Clean it explicitly before
// the error propagates to the handler below.
const headSnapshot = await materializeGitSnapshot(parsed.repo, parsed.head).catch(async (headError) => {
const headSnapshot = await materializeGitSnapshot(parsed.repo, parsed.head, {
includeSamples: parsed.includeSamples
}).catch(async (headError) => {
await baseSnapshot.cleanup();
throw headError;
});
@@ -62,7 +66,7 @@
// the CLI three times for markdown/json/github, which repeated
// git snapshot materialization and detector work on each call.
const findings = [
...(await detectMcpDrift(oldRoot, newRoot)),
...(await detectMcpDrift(oldRoot, newRoot, { includeSamples: parsed.includeSamples })),
...(await detectClaudeSettingsDrift(oldRoot, newRoot)),
...(await detectCodexConfigDrift(oldRoot, newRoot))
];
@@ -94,6 +98,7 @@
let outMarkdown;
let outJson;
let failOn = 'none';
let includeSamples = false;
for (let index = 0; index < argv.length; index += 1) {
const arg = argv[index];
const value = argv[index + 1];
@@ -145,6 +150,12 @@
failOn = value;
index += 1;
}
else if (arg === '--include-samples') {
// Boolean flag — opts into reviewing sample/template/disabled MCP configs
// (`.mcp.json.template`, prefixed examples, ...). Off by default because
// those files never load into an agent, so a change to one is not drift.
includeSamples = true;
}
else {
return { ok: false, error: `Unknown argument: ${arg}` };
}
@@ -161,15 +172,15 @@
if (!head) {
return { ok: false, error: 'Missing required --head <ref> argument.' };
}
return { ok: true, mode: 'git', repo, base, head, format, outMarkdown, outJson, failOn };
return { ok: true, mode: 'git', repo, base, head, format, outMarkdown, outJson, failOn, includeSamples };
}
if (!oldRoot) {
return { ok: false, error: 'Missing required --old <dir> argument or --base <ref> argument.' };
}
if (!newRoot) {
return { ok: false, error: 'Missing required --new <dir> argument.' };
}
return { ok: true, mode: 'directories', oldRoot, newRoot, format, outMarkdown, outJson, failOn };
return { ok: true, mode: 'directories', oldRoot, newRoot, format, outMarkdown, outJson, failOn, includeSamples };
}
function isReportFormat(value) {
return value === 'text' || value === 'markdown' || value === 'json' || value === 'github';
@@ -198,7 +209,10 @@
function usage() {
return [
'Usage:',
' scopetrail diff --old <dir> --new <dir> [--format text|markdown|json|github] [--out-markdown PATH] [--out-json PATH] [--fail-on none|low|medium|high|critical]',
' scopetrail diff --repo <repo> --base <ref> --head <ref> [--format text|markdown|json|github] [--out-markdown PATH] [--out-json PATH] [--fail-on none|low|medium|high|critical]'
' scopetrail diff --old <dir> --new <dir> [--format text|markdown|json|github] [--out-markdown PATH] [--out-json PATH] [--fail-on none|low|medium|high|critical] [--include-samples]',
' scopetrail diff --repo <repo> --base <ref> --head <ref> [--format text|markdown|json|github] [--out-markdown PATH] [--out-json PATH] [--fail-on none|low|medium|high|critical] [--include-samples]',
'',
' --include-samples Also review sample/template/disabled MCP configs (.mcp.json.template, .sample, prefixed examples).',
' Off by default: those files never load into an agent, so changes to them are not permission drift.'
].join('\n');
}
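Reviewer note: the parser change above distinguishes a boolean flag, which consumes no value, from value flags such as `--fail-on`, which advance the index past the token they read. A reduced sketch of that distinction follows. It is not the real `parseArgs` from `dist/index.js`: `parseFlags`, `ParsedFlags`, and the "Missing value" error text are hypothetical, and only two flags are handled.

```typescript
// Illustrative only: a cut-down parser covering two flags, not ScopeTrail's parseArgs.
interface ParsedFlags {
  failOn: string;
  includeSamples: boolean;
}

type ParseResult = { ok: true; flags: ParsedFlags } | { ok: false; error: string };

function parseFlags(argv: string[]): ParseResult {
  let failOn = 'none';
  let includeSamples = false;

  for (let index = 0; index < argv.length; index += 1) {
    const arg = argv[index];
    if (arg === '--fail-on') {
      const value = argv[index + 1];
      if (value === undefined) {
        return { ok: false, error: 'Missing value for --fail-on.' };
      }
      failOn = value;
      index += 1; // Value flag: skip the token it consumed.
    } else if (arg === '--include-samples') {
      includeSamples = true; // Boolean flag: consumes nothing, so no index bump.
    } else {
      return { ok: false, error: `Unknown argument: ${arg}` };
    }
  }

  return { ok: true, flags: { failOn, includeSamples } };
}

// The boolean flag can sit before a value flag without swallowing its name.
const parsed = parseFlags(['--include-samples', '--fail-on', 'high']);
```

Because the boolean branch never bumps the index, flag order does not matter, which is worth a regression test if the real parser gains more boolean flags.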
2 changes: 1 addition & 1 deletion docs/PILOT.md
@@ -36,7 +36,7 @@ Useful checks during the trial:

- Did ScopeTrail catch real permission drift?
- Did any warning feel noisy or too broad?
- Did sample/template/disabled MCP config findings, including platform-suffixed and prefixed MCP config examples, correctly stay separate from active MCP server drift?
- If you opted in with `include-samples`, did sample/template/disabled MCP config findings, including platform-suffixed and prefixed MCP config examples, correctly stay separate from active MCP server drift?
- Did it miss an agent config surface your repository uses?
- Would a team workflow need cross-repo visibility, policy ownership, exception workflow, or reporting?

2 changes: 1 addition & 1 deletion docs/TRUST.md
@@ -6,7 +6,7 @@ ScopeTrail is a local-only GitHub Action and CLI for reviewing AI-agent permissi

ScopeTrail reads the checked-out repository and compares supported agent configuration files between the pull request base and head refs. Supported active files include `.mcp.json`, `.cursor/mcp.json`, `.vscode/mcp.json`, `.codeium/windsurf/mcp_config.json`, `.claude/settings.json`, and `.codex/config.toml`.

ScopeTrail also reviews sample/template/disabled MCP config files such as `.mcp.json.sample`, `.mcp.json.template`, `.mcp.json.disabled`, `.mcp.json.example`, platform-suffixed MCP example files such as `.mcp.json.windows.example` and `.mcp.json.example.mac`, nested `mcp_config.json.example` variants, and prefixed MCP config example files such as `example_mcp_config.json`, `claude_mcp_config.json`, `cursor_mcp_config.json`, and `vscode_mcp_config.json`. Those findings are reported separately from active MCP server drift so copied examples can be reviewed without implying they are live configuration.
ScopeTrail can also review sample/template/disabled MCP config files such as `.mcp.json.sample`, `.mcp.json.template`, `.mcp.json.disabled`, `.mcp.json.example`, platform-suffixed MCP example files such as `.mcp.json.windows.example` and `.mcp.json.example.mac`, nested `mcp_config.json.example` variants, and prefixed MCP config example files such as `example_mcp_config.json`, `claude_mcp_config.json`, `cursor_mcp_config.json`, and `vscode_mcp_config.json`. This is **off by default**: those files never load into an agent runtime, so a change to one cannot alter what an agent is allowed to do, and reporting it as drift would be noise. Pass `--include-samples` (CLI) or set `include-samples: true` (Action) to opt in. When enabled, those findings are reported separately from active MCP server drift so copied examples can be reviewed without implying they are live configuration.

In GitHub Actions, `fetch-depth: 0` is required so ScopeTrail can compare the pull request base and head commits instead of only seeing the latest checkout.

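Reviewer note: TRUST.md lists the sample file shapes in prose (suffix variants such as `.mcp.json.windows.example`, nested `mcp_config.json.example`, and prefixed files such as `claude_mcp_config.json`). The body of `isMcpSampleConfigPath` is not part of this diff, so the classifier below is a guess at one way to cover the listed names. `looksLikeMcpSamplePath`, `SAMPLE_MARKERS`, and `ACTIVE_BASENAMES` are hypothetical, and the real matcher may be stricter or looser.

```typescript
// Hypothetical classifier built only from the file names quoted in TRUST.md.
// It is NOT the real isMcpSampleConfigPath, whose body is not shown in this diff.
const SAMPLE_MARKERS = new Set(['sample', 'template', 'disabled', 'example']);
const ACTIVE_BASENAMES = ['.mcp.json', 'mcp_config.json'];

function looksLikeMcpSamplePath(relativePath: string): boolean {
  const basename = relativePath.split('/').pop() ?? '';

  // Suffix variants: an active basename followed by dot-separated parts,
  // at least one of which is a sample marker (e.g. ".mcp.json.example.mac").
  for (const active of ACTIVE_BASENAMES) {
    if (basename.startsWith(`${active}.`)) {
      const suffixParts = basename.slice(active.length + 1).split('.');
      if (suffixParts.some((part) => SAMPLE_MARKERS.has(part))) {
        return true;
      }
    }
  }

  // Prefixed variants: "<prefix>_mcp_config.json", e.g. "claude_mcp_config.json".
  // The bare "mcp_config.json" has no prefix, so it does not match.
  return /^[a-z0-9]+_mcp_config\.json$/.test(basename);
}

const classified = [
  '.mcp.json',
  '.mcp.json.windows.example',
  'claude_mcp_config.json'
].map((path) => ({ path, sample: looksLikeMcpSamplePath(path) }));
```

Whatever the real rules are, the property that matters for this PR is that live paths such as `.mcp.json` and `.codeium/windsurf/mcp_config.json` are never classified as samples, since samples are now skipped by default.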
34 changes: 33 additions & 1 deletion src/detectors/mcp.ts
@@ -74,7 +74,22 @@ interface McpServerModel extends McpServerConfig {
sourceText?: string;
}

export async function detectMcpDrift(oldRoot: string, newRoot: string): Promise<Finding[]> {
export interface McpDriftOptions {
// Sample/template/disabled MCP configs (`.mcp.json.template`, `.sample`,
// prefixed examples, ...) never load into an agent runtime — only the live
// `.mcp.json` and editor configs do. A change to a template therefore can't
// alter what an agent is actually allowed to do, so for a *drift* detector it
// is noise, not drift. Scanning them is opt-in: teams who want copy-paste
// hygiene on shipped examples ask for it explicitly; the default report stays
// scoped to surfaces an agent actually loads.
includeSamples?: boolean;
}

export async function detectMcpDrift(
oldRoot: string,
newRoot: string,
options: McpDriftOptions = {}
): Promise<Finding[]> {
const findings: Finding[] = [];

for (const config of MCP_CONFIGS) {
@@ -173,6 +188,23 @@ export async function detectMcpDrift(oldRoot: string, newRoot: string): Promise<
}
}

// Sample/template configs are an opt-in surface — see McpDriftOptions for why
// a file that no agent loads can't be drift. Off by default keeps the report
// scoped to live configuration.
if (options.includeSamples) {
findings.push(...(await detectMcpSampleDrift(oldRoot, newRoot)));
}

return findings;
}

// Diff sample/template/disabled MCP configs on their own low-severity track so
// a noisy template change can be reviewed for copy-paste hygiene without ever
// being mistaken for a change to what an agent can actually do. Only runs when
// the caller opts in via McpDriftOptions.includeSamples.
async function detectMcpSampleDrift(oldRoot: string, newRoot: string): Promise<Finding[]> {
const findings: Finding[] = [];

for (const path of await listMcpSampleConfigPaths(oldRoot, newRoot)) {
const config = { path, serverKeys: ['mcpServers', 'servers'] };
const newSource = await readJsonObjectWithSource(configPath(newRoot, path));
28 changes: 22 additions & 6 deletions src/git-snapshot.ts
@@ -26,13 +26,23 @@ export interface GitSnapshot {
cleanup: () => Promise<void>;
}

export async function materializeGitSnapshot(repo: string, ref: string): Promise<GitSnapshot> {
export interface GitSnapshotOptions {
// Mirror of McpDriftOptions.includeSamples: when the detector won't look at
// sample/template configs, don't walk the tree to materialize them either.
includeSamples?: boolean;
}

export async function materializeGitSnapshot(
repo: string,
ref: string,
options: GitSnapshotOptions = {}
): Promise<GitSnapshot> {
await verifyGitRef(repo, ref);

const root = await mkdtemp(join(tmpdir(), 'scopetrail-snapshot-'));
let completed = false;
try {
for (const relativePath of await snapshotPathsForRef(repo, ref)) {
for (const relativePath of await snapshotPathsForRef(repo, ref, options.includeSamples ?? false)) {
const content = await readPathAtRef(repo, ref, relativePath);
if (content === null) {
continue;
@@ -65,11 +75,17 @@ export async function materializeGitSnapshot(repo: string, ref: string): Promise
}
}

async function snapshotPathsForRef(repo: string, ref: string): Promise<string[]> {
async function snapshotPathsForRef(repo: string, ref: string, includeSamples: boolean): Promise<string[]> {
const paths = new Set(SNAPSHOT_PATHS);
for (const relativePath of await listPathsAtRef(repo, ref)) {
if (isMcpSampleConfigPath(relativePath)) {
paths.add(relativePath);
// Sample/template configs are opt-in (see McpDriftOptions). When the caller
// hasn't asked for them, skip the full `git ls-tree -r` walk entirely — the
// detector would ignore the materialized files anyway, so listing every
// tracked path on every PR is wasted work.
if (includeSamples) {
for (const relativePath of await listPathsAtRef(repo, ref)) {
if (isMcpSampleConfigPath(relativePath)) {
paths.add(relativePath);
}
}
}
