feat(sarif,detector): describe instruction/baseline kinds; stop over-flagging SHA-pinned GitHub URLs #53
Merged
…flagging SHA-pinned GitHub URLs

SARIF (report.ts):
- shortDescriptionForKind covered the MCP / Claude / Codex / Aider kinds but fell back to the raw kind string for instruction and baseline-drift findings. Add descriptions for instructions_skip_confirmation / _override_safety / _broad_write / _auto_version_control and baseline_rating_drift / _version_drift / _parse_error.

Detector precision (parsers/mcp.ts):
- isUnpinnedCommand flagged every github.com URL as unpinned, including one pinned to an immutable 40-char commit SHA. A SHA makes the install reproducible, so only branch / tag / HEAD URLs are now flagged.

Benchmark (finding 11): add two benign false-positive-trap fixtures + labels: a SHA-pinned GitHub URL (must not be mcp_unpinned) and an absolute local script path outside the repo (must not be missing_local_script). Regenerate RESULTS.md (33 cases, 0 false positives).

Note: the diff/fix-pin/instruction-coverage adversarial cases from the review belong with PRs 1/2/4/5: the benchmark harness only runs `audit`, and those fixtures/behaviours live on those branches.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
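The SARIF fallback described in this commit message can be illustrated with a minimal sketch. It is not the code from report.ts: the lookup table shape and every description string below are placeholders invented for illustration; only the function name and the kind identifiers come from the commit message.

```typescript
// Hypothetical sketch of a kind -> sentence lookup with a raw-kind fallback.
// The description wording is a placeholder, not the text added in report.ts.
const SHORT_DESCRIPTIONS = new Map<string, string>([
  ["instructions_skip_confirmation", "Instructions tell the agent to skip confirmation prompts."],
  ["instructions_override_safety", "Instructions tell the agent to override safety settings."],
  ["instructions_broad_write", "Instructions grant broad write access."],
  ["instructions_auto_version_control", "Instructions enable automatic version-control actions."],
  ["baseline_rating_drift", "Rating differs from the recorded baseline."],
  ["baseline_version_drift", "Version differs from the recorded baseline."],
  ["baseline_parse_error", "Baseline file could not be parsed."],
]);

function shortDescriptionForKind(kind: string): string {
  // Kinds without an entry fall back to the raw kind string, which is
  // the behaviour that surfaced bare identifiers before this change.
  return SHORT_DESCRIPTIONS.get(kind) ?? kind;
}

console.log(shortDescriptionForKind("instructions_override_safety"));
console.log(shortDescriptionForKind("some_unlisted_kind"));
```

With a table like this, a kind that has no entry still produces a usable (if terse) rule description, so adding a new finding kind without a sentence degrades the SARIF output rather than breaking it.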
# Conflicts: # test/heuristics.test.mjs
Addresses review findings #10 and #11.

#10: SARIF descriptions

`shortDescriptionForKind` covered the MCP / Claude / Codex / Aider kinds but fell back to the raw kind string for instruction and baseline-drift findings, so those rules showed up in the GitHub Security tab as `policy_mesh.instructions_override_safety` instead of a sentence. Added descriptions for:

- `instructions_skip_confirmation`, `instructions_override_safety`, `instructions_broad_write`, `instructions_auto_version_control`
- `baseline_rating_drift`, `baseline_version_drift`, `baseline_parse_error`

(`exceptions_parse_error` was already present.)

#11: Detector precision + benchmark

Stop over-flagging SHA-pinned GitHub URLs. `isUnpinnedCommand` flagged every `github.com` URL as unpinned, including one pinned to an immutable 40-char commit SHA, which is reproducible. Now only branch / tag / HEAD URLs are flagged; a SHA-pinned URL is treated as pinned.

Benchmark expansion. Added two benign false-positive-trap fixtures + labels and regenerated `RESULTS.md` (now 33 cases, 9 benign, 0 false positives, 100% detection recall):

- `mcp-github-sha-pinned`: SHA-pinned GitHub URL must not be `mcp_unpinned`.
- `mcp-absolute-script-path`: absolute local script path outside the repo must not be `missing_local_script` (the detector already excludes absolute paths; this locks it in).

Scope note

The other adversarial cases the review lists (same-severity diff change, fix-pin inline args, recursive instruction-only package, `.cursorrules`, fenced-code example) belong with PRs #47 / #48 / #50 / #51: this benchmark harness only runs `audit`, and those fixtures/behaviours live on those branches. They're covered by unit/CLI tests in their own PRs.

Tests

Unit test for `isUnpinnedCommand` (SHA-pinned vs branch vs `@latest`); CLI test asserting instruction & baseline SARIF rules carry real descriptions. All 120 tests pass; `dist/` rebuilt and committed.

🤖 Generated with Claude Code
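For readers who want to see the SHA-pinning rule from #11 in isolation, here is a rough sketch. It is not the implementation in parsers/mcp.ts: the function names `githubRef` and `isUnpinnedGithubCommand` are hypothetical, and it assumes the ref follows an `@` or `#` after the repository path (for example `git+https://github.com/owner/repo@<ref>`). It covers only the GitHub URL case, not package specifiers such as `@latest`.

```typescript
// Hypothetical sketch of the "40-char SHA counts as pinned" rule.
// Assumption: the ref is whatever follows the last "@" or "#" in the URL token.
const FULL_SHA = /^[0-9a-f]{40}$/i;

// Returns the ref of a github.com URL token, "" if the URL has no ref,
// or null if the token is not a github.com URL at all.
function githubRef(token: string): string | null {
  const start = token.indexOf("github.com/");
  if (start === -1) return null;
  const path = token.slice(start);
  const sep = Math.max(path.lastIndexOf("@"), path.lastIndexOf("#"));
  return sep === -1 ? "" : path.slice(sep + 1);
}

// A command is unpinned if any github.com URL in it lacks a full commit SHA.
// Branch names, tags, and ref-less URLs (implicit HEAD) are all flagged.
function isUnpinnedGithubCommand(command: string): boolean {
  return command.split(/\s+/).some((token) => {
    const ref = githubRef(token);
    return ref !== null && !FULL_SHA.test(ref);
  });
}

const sha = "0123456789abcdef0123456789abcdef01234567";
console.log(isUnpinnedGithubCommand(`uvx --from git+https://github.com/acme/tool@${sha} tool`)); // false
console.log(isUnpinnedGithubCommand("uvx --from git+https://github.com/acme/tool@main tool")); // true
console.log(isUnpinnedGithubCommand("uvx --from git+https://github.com/acme/tool tool")); // true
```

Requiring the full 40 hex characters, rather than accepting any hex-looking ref, keeps abbreviated SHAs and hex-named branches on the flagged side, which matches the conservative intent of the change.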