feat(analyzer): implement MCP rug-pull detection (RP1-RP3)#161
feat(analyzer): implement MCP rug-pull detection (RP1-RP3)#161mimran-khan wants to merge 1 commit into
Conversation
Replaces the non-functional stub with a real analyzer that detects MCP rug-pull attack indicators: Manifest comparison (when previous_manifest available): - RP1: New parameters added (potential data capture) - RP2: Description changed (potential prompt injection via description) - RP3: Parameters removed (behavior divergence) Static risk analysis (always runs): - RP1: Wildcard permissions that amplify rug-pull impact - RP2: Dynamic tool loading patterns (fetch_tools, runtime discovery) Key behavior changes: - Logs WARNING when previous_manifest unavailable (was silently returning "0 findings" indistinguishable from "checked and found nothing") - Actually detects rug-pull indicators instead of being a no-op - Returns actionable findings with remediations Fixes NVIDIA#150
4c9130a to
88b912f
Compare
rng1995
left a comment
There was a problem hiding this comment.
Verdict: Request changes — the analyzer is a real improvement over the stub and the tests are good, but there's a likely duplicate PR, inconsistent rule-ID semantics, and the headline manifest-diff path is unreachable.
Blocking / please clarify
- Possible duplicate of the still-open #125 (identical title, "implement MCP rug-pull detection (RP1-RP3)"). Only one should proceed — please confirm which supersedes so we don't merge both.
- Inconsistent rule-ID semantics. In
_compare_manifests(src/skillspector/nodes/analyzers/mcp_rug_pull.py): RP1 = new parameter (~L63), RP3 = removed parameter (~L92), RP2 = description change (~L121). But in_static_risk_analysis: RP1 = wildcard permission (~L168) and RP2 = dynamic loading (~L202). The samerule_idcarries two different meanings, which produces inconsistent SARIF rule descriptors — note therules[]dedup added in #158 keeps only the first message seen per id, so one of the two meanings will be silently mislabeled. Give each distinct concept its own rule id. - Manifest-comparison path is unreachable.
node()readsstate.get("previous_manifest")(~L240) but nothing in this PR populates it — there's no--previous-manifestCLI flag or state plumbing. So in practiceprevious_manifestis alwaysNone, only_static_risk_analysisruns, and the headline RP1/RP2/RP3 manifest-diffing never fires (confirmed bytest_without_previous_manifest_warns). Please add the plumbing or clarify the follow-up that wires it.
Tests
Unit tests for both _compare_manifests and _static_risk_analysis are good. They don't catch (2)/(3) because they call the helpers directly / inject previous_manifest into state by hand, bypassing the missing CLI/state wiring.
|
Thanks for the review @rng1995 — all three blocking points are valid. After checking, PR #125 by @tcconnally covers the same scope with cleaner rule-ID semantics (each RP1/RP2/RP3 has a single consistent meaning) and was opened first. Our manifest-comparison path here is unreachable without CLI plumbing that doesn't exist yet. Closing this in favor of #125. If the manifest-diffing logic is useful down the road, it can be proposed as a follow-up once #125 lands and the state/CLI plumbing is in place. |
Summary
The
mcp_rug_pullanalyzer was a non-functional stub that unconditionally returned empty findings — indistinguishable from "checked and found nothing" in scan output. This gave users a false sense of security when scanning MCP-enabled skills.This PR implements actual rug-pull detection with two analysis modes:
Manifest Comparison (when
previous_manifestis supplied)Static Risk Analysis (always runs)
*) permissionsKey Behavior Changes
previous_manifestis unavailable, the analyzer now logs aWARNINGinstead of silently producing "0 findings"How Rug-Pull Detection Works
Static analysis runs on every scan regardless, catching:
Testing
15 new tests covering:
Fixes #150