Skip to content

feat(behavioral): canonical dangerous-callable resolver + shared evasion corpus (implements #181)#182

Open
zied-jlassi wants to merge 2 commits into
NVIDIA:mainfrom
zied-jlassi:feature/canonical-sink-evasion-corpus
Open

feat(behavioral): canonical dangerous-callable resolver + shared evasion corpus (implements #181)#182
zied-jlassi wants to merge 2 commits into
NVIDIA:mainfrom
zied-jlassi:feature/canonical-sink-evasion-corpus

Conversation

@zied-jlassi

Copy link
Copy Markdown
Contributor

Implements the proposal in #181: a single canonicalization chokepoint that collapses every spelling of a dangerous callable to one canonical sink id, paired with a shared evasion corpus wired as a fitness gate.

I've opened this as a concrete PR (rather than only waiting on a yes/no in #181) so the approach can be judged on real, tested code — happy to adjust the direction or close if you'd prefer.

⚠️ Depends on / stacked on #180 — please read first

This branch is stacked on #180 (fix: detect builtins.* and importlib.import_module sink evasions): the canonical resolver reuses resolve_dynamic_import_call / _strip_builtins_prefix introduced there. Because main does not yet contain #180, this PR's diff currently includes #180's commit 8bdc3bc (the common.py helpers + its behavioral tests). Once #180 lands I'll rebase onto main so only the canonical layer remains. Suggested merge order: #180 first, then this.

What this PR adds on top of #180

  • canonical_sink.pyresolve_to_canonical_sink(node, aliases, type_map): one chokepoint reducing bare / alias / from / as, builtins.*, importlib.import_module, reflective getattr("lit"), subscript __builtins__["exec"] / <mod>.__dict__["lit"], and importlib / runpy / code sibling machinery to a single canonical sink id, reusing the existing alias / type-map / dynamic-import machinery.
  • behavioral_ast + behavioral_taint_tracking wired to call it. Reflective subprocess invocations grade AST9-HIGH (parity with reflective os.system); a direct subprocess.* call keeps baseline AST4-MEDIUM; a _covered_by_ast9 guard keeps AST9 and the canonical fallback complementary so they never double-fire. Benign neighbours never canonicalize to a sink.
  • tests/evasion_corpus/ — shared parametrized corpus as a fitness gate: every spelling → its canonical id and a real production sink rule; FP-neighbours → no sink. Adding a sink becomes "add corpus rows", not "add a detector branch + hope it's complete".

Scope / non-goals

In scope: statically-resolvable spellings. Out of scope (documented, left to the generic reflective-access AST7-LOW rule): genuinely dynamic idioms — non-literal getattr(os, name), variable subscript key, computed attribute string, module name from a variable, local-variable indirection (f = os.system; f(x)). These have no static canonical id and would need def-use / taint tracking; kept explicitly as non-goals so the layer doesn't oversell completeness.

Validation (real analyzers + repo toolchain, isolated container)

  • evasion corpus: 108/108 pass
  • full suite: +108 tests, zero regressions — 1000 passed with this diff vs 892 on the base branch; the single pre-existing test_static_patterns::TestRunStaticPatternsAgentSnooping::test_no_same_line_duplicate failure is unrelated and present on the base too
  • ruff==0.15.2 check + ruff format + mypy all clean

AI disclosure: prepared with AI assistance; every line was reviewed and validated by me, and the results above come from running the repo's real analyzers + test toolchain.

…vasions

The import-alias normalization rewrites `from builtins import exec` and `import builtins; builtins.exec(...)` to the qualified `builtins.exec`, which the bare-name sink checks (call_name == "exec", _EXEC_SINKS) then miss. Collapse `builtins.<x>` back to the bare builtin so these re-enter the existing detection, covering exec/eval/compile/__import__. Add a dynamic-import resolver so `importlib.import_module('os').system(...)` (sibling of __import__) resolves to os.system / the subprocess family and hits the sink ladders.

Baseline detection unchanged; composes with the getattr branch (NVIDIA#166) with no double-fire; FP-neighbors stay clean.

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
…ness gate

Introduce resolve_to_canonical_sink, a single chokepoint reducing every spelling
of a dangerous callable -- bare/alias/from/as, builtins.*,
importlib.import_module, reflective getattr("lit"), subscript __builtins__["exec"]
and <mod>.__dict__["lit"], plus importlib/runpy/code sibling machinery -- to one
canonical sink id. Wire it into behavioral_ast and behavioral_taint_tracking so
the per-idiom ladders stay simple, with a shared parametrized evasion corpus
asserting the invariant as a fitness gate: a future missed spelling fails a
shared test instead of shipping as a silent blind spot.

Reflective subprocess invocations grade AST9-HIGH for parity with reflective
os.system; direct subprocess.* keeps baseline AST4-MEDIUM; AST9 and the canonical
fallback stay complementary (the _covered_by_ast9 guard prevents double-firing).
Benign neighbours never canonicalize to a sink.

Stacked on the builtins/importlib sink-evasion fix (reuses
resolve_dynamic_import_call); to be rebased onto main once that lands.

108 evasion-corpus tests pass; full suite +108 with zero regressions
(ruff 0.15.2 + mypy clean).

Signed-off-by: Zied Jlassi <6190550+zied-jlassi@users.noreply.github.com>
return None


def _strip_builtins_prefix(name: str) -> str:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Automated SkillSpector Review]

This helper (and the other common.py changes plus the duplicated test_behavioral_ast.py / test_behavioral_taint_tracking.py additions in this PR) come from #180, which this branch is stacked on. Once #180 merges, please rebase onto main so this PR's diff contains only the new canonical-sink layer. No change needed to the code itself.

@rng1995 rng1995 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Automated SkillSpector Review]

Requesting changes — scope/rebase, not the implementation.

To be clear up front: the canonical-sink layer itself is excellent and I have no correctness or security objection to canonical_sink.py. The single-chokepoint design (resolve_to_canonical_sink covering bare/alias/from/as, builtins.*, dynamic import, reflective getattr, subscript __builtins__["exec"]/os.__dict__[...], and importlib/runpy/code sibling machinery with receiver-gated instance methods), the AST9-parity for reflective subprocess, the _covered_by_ast9 anti-double-fire guard, and the shared evasion-corpus fitness gate are all well-reasoned and thoroughly tested.

The blocker is scope/mergeability: this PR is stacked on #180 and its diff currently includes #180's commit 8bdc3bc (the common.py helpers and the test_behavioral_ast.py / test_behavioral_taint_tracking.py additions are duplicated from #180). As written, this can't be merged independently without pulling in / conflicting with #180.

Requested change (matches your own stated plan): land #180 first, then rebase this onto main so the diff contains only the canonical-sink layer (canonical_sink.py, the behavioral_ast/behavioral_taint_tracking wiring, and tests/evasion_corpus/). I'll re-review and approve once the diff is isolated to that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants