feat(behavioral): canonical dangerous-callable resolver + shared evasion corpus (implements #181) by zied-jlassi · Pull Request #182 · NVIDIA/SkillSpector

zied-jlassi · 2026-06-23T21:53:57Z

Implements the proposal in #181: a single canonicalization chokepoint that collapses every spelling of a dangerous callable to one canonical sink id, paired with a shared evasion corpus wired as a fitness gate.

I've opened this as a concrete PR (rather than only waiting on a yes/no in #181) so the approach can be judged on real, tested code — happy to adjust the direction or close if you'd prefer.

⚠️ Depends on / stacked on #180 — please read first

This branch is stacked on #180 (fix: detect builtins.* and importlib.import_module sink evasions): the canonical resolver reuses resolve_dynamic_import_call / _strip_builtins_prefix introduced there. Because main does not yet contain #180, this PR's diff currently includes #180's commit 8bdc3bc (the common.py helpers + its behavioral tests). Once #180 lands I'll rebase onto main so only the canonical layer remains. Suggested merge order: #180 first, then this.

What this PR adds on top of #180

canonical_sink.py — resolve_to_canonical_sink(node, aliases, type_map): one chokepoint reducing bare / alias / from / as, builtins.*, importlib.import_module, reflective getattr("lit"), subscript __builtins__["exec"] / <mod>.__dict__["lit"], and importlib / runpy / code sibling machinery to a single canonical sink id, reusing the existing alias / type-map / dynamic-import machinery.
behavioral_ast + behavioral_taint_tracking wired to call it. Reflective subprocess invocations grade AST9-HIGH (parity with reflective os.system); a direct subprocess.* call keeps baseline AST4-MEDIUM; a _covered_by_ast9 guard keeps AST9 and the canonical fallback complementary so they never double-fire. Benign neighbours never canonicalize to a sink.
tests/evasion_corpus/ — shared parametrized corpus as a fitness gate: every spelling → its canonical id and a real production sink rule; FP-neighbours → no sink. Adding a sink becomes "add corpus rows", not "add a detector branch + hope it's complete".

Scope / non-goals

In scope: statically-resolvable spellings. Out of scope (documented, left to the generic reflective-access AST7-LOW rule): genuinely dynamic idioms — non-literal getattr(os, name), variable subscript key, computed attribute string, module name from a variable, local-variable indirection (f = os.system; f(x)). These have no static canonical id and would need def-use / taint tracking; kept explicitly as non-goals so the layer doesn't oversell completeness.

Validation (real analyzers + repo toolchain, isolated container)

evasion corpus: 108/108 pass
full suite: +108 tests, zero regressions — 1000 passed with this diff vs 892 on the base branch; the single pre-existing test_static_patterns::TestRunStaticPatternsAgentSnooping::test_no_same_line_duplicate failure is unrelated and present on the base too
ruff==0.15.2 check + ruff format + mypy all clean

AI disclosure: prepared with AI assistance; every line was reviewed and validated by me, and the results above come from running the repo's real analyzers + test toolchain.

…vasions The import-alias normalization rewrites `from builtins import exec` and `import builtins; builtins.exec(...)` to the qualified `builtins.exec`, which the bare-name sink checks (call_name == "exec", _EXEC_SINKS) then miss. Collapse `builtins.<x>` back to the bare builtin so these re-enter the existing detection, covering exec/eval/compile/__import__. Add a dynamic-import resolver so `importlib.import_module('os').system(...)` (sibling of __import__) resolves to os.system / the subprocess family and hits the sink ladders. Baseline detection unchanged; composes with the getattr branch (NVIDIA#166) with no double-fire; FP-neighbors stay clean. Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

…ness gate Introduce resolve_to_canonical_sink, a single chokepoint reducing every spelling of a dangerous callable -- bare/alias/from/as, builtins.*, importlib.import_module, reflective getattr("lit"), subscript __builtins__["exec"] and <mod>.__dict__["lit"], plus importlib/runpy/code sibling machinery -- to one canonical sink id. Wire it into behavioral_ast and behavioral_taint_tracking so the per-idiom ladders stay simple, with a shared parametrized evasion corpus asserting the invariant as a fitness gate: a future missed spelling fails a shared test instead of shipping as a silent blind spot. Reflective subprocess invocations grade AST9-HIGH for parity with reflective os.system; direct subprocess.* keeps baseline AST4-MEDIUM; AST9 and the canonical fallback stay complementary (the _covered_by_ast9 guard prevents double-firing). Benign neighbours never canonicalize to a sink. Stacked on the builtins/importlib sink-evasion fix (reuses resolve_dynamic_import_call); to be rebased onto main once that lands. 108 evasion-corpus tests pass; full suite +108 with zero regressions (ruff 0.15.2 + mypy clean). Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>

rng1995 · 2026-06-24T08:30:26Z

    return None


+def _strip_builtins_prefix(name: str) -> str:


[Automated SkillSpector Review]

This helper (and the other common.py changes plus the duplicated test_behavioral_ast.py / test_behavioral_taint_tracking.py additions in this PR) come from #180, which this branch is stacked on. Once #180 merges, please rebase onto main so this PR's diff contains only the new canonical-sink layer. No change needed to the code itself.

rng1995

[Automated SkillSpector Review]

Requesting changes — scope/rebase, not the implementation.

To be clear up front: the canonical-sink layer itself is excellent and I have no correctness or security objection to canonical_sink.py. The single-chokepoint design (resolve_to_canonical_sink covering bare/alias/from/as, builtins.*, dynamic import, reflective getattr, subscript __builtins__["exec"]/os.__dict__[...], and importlib/runpy/code sibling machinery with receiver-gated instance methods), the AST9-parity for reflective subprocess, the _covered_by_ast9 anti-double-fire guard, and the shared evasion-corpus fitness gate are all well-reasoned and thoroughly tested.

The blocker is scope/mergeability: this PR is stacked on #180 and its diff currently includes #180's commit 8bdc3bc (the common.py helpers and the test_behavioral_ast.py / test_behavioral_taint_tracking.py additions are duplicated from #180). As written, this can't be merged independently without pulling in / conflicting with #180.

Requested change (matches your own stated plan): land #180 first, then rebase this onto main so the diff contains only the canonical-sink layer (canonical_sink.py, the behavioral_ast/behavioral_taint_tracking wiring, and tests/evasion_corpus/). I'll re-review and approve once the diff is isolated to that.

zied-jlassi added 2 commits June 23, 2026 21:20

zied-jlassi mentioned this pull request Jun 23, 2026

Proposal: canonical dangerous-callable resolver + shared evasion corpus as a fitness gate #181

Open

rng1995 reviewed Jun 24, 2026

View reviewed changes

rng1995 requested changes Jun 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(behavioral): canonical dangerous-callable resolver + shared evasion corpus (implements #181)#182

feat(behavioral): canonical dangerous-callable resolver + shared evasion corpus (implements #181)#182
zied-jlassi wants to merge 2 commits into
NVIDIA:mainfrom
zied-jlassi:feature/canonical-sink-evasion-corpus

zied-jlassi commented Jun 23, 2026

Uh oh!

rng1995 Jun 24, 2026

Uh oh!

rng1995 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

zied-jlassi commented Jun 23, 2026

⚠️ Depends on / stacked on #180 — please read first

What this PR adds on top of #180

Scope / non-goals

Validation (real analyzers + repo toolchain, isolated container)

Uh oh!

rng1995 Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants