feat(behavioral): canonical dangerous-callable resolver + shared evasion corpus (implements #181)#182
Conversation
…vasions
The import-alias normalization rewrites `from builtins import exec` and `import builtins; builtins.exec(...)` to the qualified `builtins.exec`, which the bare-name sink checks (call_name == "exec", _EXEC_SINKS) then miss. Collapse `builtins.<x>` back to the bare builtin so these re-enter the existing detection, covering exec/eval/compile/__import__. Add a dynamic-import resolver so `importlib.import_module('os').system(...)` (sibling of __import__) resolves to os.system / the subprocess family and hits the sink ladders.
Baseline detection unchanged; composes with the getattr branch (NVIDIA#166) with no double-fire; FP-neighbors stay clean.
Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
…ness gate
Introduce resolve_to_canonical_sink, a single chokepoint reducing every spelling
of a dangerous callable -- bare/alias/from/as, builtins.*,
importlib.import_module, reflective getattr("lit"), subscript __builtins__["exec"]
and <mod>.__dict__["lit"], plus importlib/runpy/code sibling machinery -- to one
canonical sink id. Wire it into behavioral_ast and behavioral_taint_tracking so
the per-idiom ladders stay simple, with a shared parametrized evasion corpus
asserting the invariant as a fitness gate: a future missed spelling fails a
shared test instead of shipping as a silent blind spot.
Reflective subprocess invocations grade AST9-HIGH for parity with reflective
os.system; direct subprocess.* keeps baseline AST4-MEDIUM; AST9 and the canonical
fallback stay complementary (the _covered_by_ast9 guard prevents double-firing).
Benign neighbours never canonicalize to a sink.
Stacked on the builtins/importlib sink-evasion fix (reuses
resolve_dynamic_import_call); to be rebased onto main once that lands.
108 evasion-corpus tests pass; full suite +108 with zero regressions
(ruff 0.15.2 + mypy clean).
Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
| return None | ||
|
|
||
|
|
||
| def _strip_builtins_prefix(name: str) -> str: |
There was a problem hiding this comment.
[Automated SkillSpector Review]
This helper (and the other common.py changes plus the duplicated test_behavioral_ast.py / test_behavioral_taint_tracking.py additions in this PR) come from #180, which this branch is stacked on. Once #180 merges, please rebase onto main so this PR's diff contains only the new canonical-sink layer. No change needed to the code itself.
rng1995
left a comment
There was a problem hiding this comment.
[Automated SkillSpector Review]
Requesting changes — scope/rebase, not the implementation.
To be clear up front: the canonical-sink layer itself is excellent and I have no correctness or security objection to canonical_sink.py. The single-chokepoint design (resolve_to_canonical_sink covering bare/alias/from/as, builtins.*, dynamic import, reflective getattr, subscript __builtins__["exec"]/os.__dict__[...], and importlib/runpy/code sibling machinery with receiver-gated instance methods), the AST9-parity for reflective subprocess, the _covered_by_ast9 anti-double-fire guard, and the shared evasion-corpus fitness gate are all well-reasoned and thoroughly tested.
The blocker is scope/mergeability: this PR is stacked on #180 and its diff currently includes #180's commit 8bdc3bc (the common.py helpers and the test_behavioral_ast.py / test_behavioral_taint_tracking.py additions are duplicated from #180). As written, this can't be merged independently without pulling in / conflicting with #180.
Requested change (matches your own stated plan): land #180 first, then rebase this onto main so the diff contains only the canonical-sink layer (canonical_sink.py, the behavioral_ast/behavioral_taint_tracking wiring, and tests/evasion_corpus/). I'll re-review and approve once the diff is isolated to that.
Implements the proposal in #181: a single canonicalization chokepoint that collapses every spelling of a dangerous callable to one canonical sink id, paired with a shared evasion corpus wired as a fitness gate.
I've opened this as a concrete PR (rather than only waiting on a yes/no in #181) so the approach can be judged on real, tested code — happy to adjust the direction or close if you'd prefer.
This branch is stacked on #180 (
fix: detect builtins.* and importlib.import_module sink evasions): the canonical resolver reusesresolve_dynamic_import_call/_strip_builtins_prefixintroduced there. Becausemaindoes not yet contain #180, this PR's diff currently includes #180's commit8bdc3bc(thecommon.pyhelpers + its behavioral tests). Once #180 lands I'll rebase ontomainso only the canonical layer remains. Suggested merge order: #180 first, then this.What this PR adds on top of #180
canonical_sink.py—resolve_to_canonical_sink(node, aliases, type_map): one chokepoint reducing bare / alias /from/as,builtins.*,importlib.import_module, reflectivegetattr("lit"), subscript__builtins__["exec"]/<mod>.__dict__["lit"], and importlib / runpy / code sibling machinery to a single canonical sink id, reusing the existing alias / type-map / dynamic-import machinery.behavioral_ast+behavioral_taint_trackingwired to call it. Reflectivesubprocessinvocations grade AST9-HIGH (parity with reflectiveos.system); a directsubprocess.*call keeps baseline AST4-MEDIUM; a_covered_by_ast9guard keeps AST9 and the canonical fallback complementary so they never double-fire. Benign neighbours never canonicalize to a sink.tests/evasion_corpus/— shared parametrized corpus as a fitness gate: every spelling → its canonical id and a real production sink rule; FP-neighbours → no sink. Adding a sink becomes "add corpus rows", not "add a detector branch + hope it's complete".Scope / non-goals
In scope: statically-resolvable spellings. Out of scope (documented, left to the generic reflective-access AST7-LOW rule): genuinely dynamic idioms — non-literal
getattr(os, name), variable subscript key, computed attribute string, module name from a variable, local-variable indirection (f = os.system; f(x)). These have no static canonical id and would need def-use / taint tracking; kept explicitly as non-goals so the layer doesn't oversell completeness.Validation (real analyzers + repo toolchain, isolated container)
test_static_patterns::TestRunStaticPatternsAgentSnooping::test_no_same_line_duplicatefailure is unrelated and present on the base tooruff==0.15.2check +ruff format+mypyall cleanAI disclosure: prepared with AI assistance; every line was reviewed and validated by me, and the results above come from running the repo's real analyzers + test toolchain.