fix: ignore structural markdown comments in static engine by Rachitrajvaishkiyar · Pull Request #49 · NVIDIA/SkillSpector

Rachitrajvaishkiyar · 2026-06-14T16:49:44Z

Description

Addresses the false positives reported in #37 where the static P2 prompt injection analyzer flags benign structural elements inside Markdown files.

Changes Introduced

Added an exemption gate inside the P2 regex match loop to ignore harmless structural markdown headers, templates, and machine metadata stamps (template:, theme:, coalmine:, revalidate).
Implemented a check to safely bypass standard HTML/Markdown comment blocks (``) unless they explicitly contain active adversarial or override instructions (e.g., ignore previous, `system prompt`, `override instructions`).

This ensures the scanner remains highly effective against actual prompt injections while eliminating noisy flags on purely structural text layout elements.

rng1995

Review: REQUEST_CHANGES

Thanks for tackling the markdown false positives from the linked issue — reducing noise on benign structural content is worthwhile. However, as written this change introduces serious detection bypasses in the prompt-injection analyzer, so it can't merge in its current form.

Blocking — substring exemption is a trivial bypass

if any(p in matched_str.lower() for p in ["template:","theme:","coalmine:","revalidate"]): continue

This drops any P2 match that merely contains one of those substrings anywhere in the matched text. An attacker can neutralize the rule by embedding the token in the payload, e.g.:

contains send (so it matched P2) but is now silently suppressed because it also contains template:. The exemption isn't anchored to frontmatter or line start, so it applies to active injection content too.

Blocking — the comment branch guts existing detection

if matched_str.startswith("<!--"):
    if not any(d in matched_str.lower() for d in ["ignore previous","system prompt","override instructions","you must","respond as"]): continue

 (no listed phrase)
 (system: is not the substring system prompt)
 (the check is the contiguous substring ignore previous, but this text reads ignore all previous)

So even a textbook "ignore all previous instructions" injection slips through. Replacing a keyword set with a 5-phrase literal denylist is easily defeated by rewording and is a major coverage regression for a primary injection vector (comments are invisible in rendered markdown but read by the model).

Blocking — no tests
This is suppression logic in a security detector with no test coverage. It needs tests that (a) reproduce the specific false positives from the issue and show they're gone, and (b) prove real injections are still caught — including reworded payloads inside comments and payloads that contain the exempted substrings. The analyzer suite elsewhere in this repo is thorough; this change should match that bar.

Code quality (please also address)

chr(60)+chr(33)+chr(45)+chr(45) is an obfuscated way to write the literal "<!--"; please use the string literal — the current form is hard to read and review for no benefit.
Inline if ...: continue compound statements and the missing spaces after commas in the list literals will fail the project's ruff check / ruff format --check gate.
The substring checks are unanchored (p in matched_str.lower()), which is what makes them bypassable; if an exemption is needed it should be anchored to where the benign construct legitimately appears.
coalmine: isn't a standard markdown/frontmatter element and isn't explained — it reads as tuned to one specific sample. Please justify it or drop it.

Suggested direction
Target the benign construct narrowly rather than suppressing injection signals. For example: detect a YAML frontmatter block delimited by --- at the top of the file and exempt key lines within that block only; or exempt specific known-benign comment directives by exact, anchored match. Crucially, never suppress a match that still contains genuine exfil/override signals (send/transmit/upload/ignore/system/instructions/POST/GET). For ambiguous cases, prefer lowering confidence over dropping the finding entirely. Pairing that with the tests above would let this land safely.

Happy to re-review once the suppression is anchored and can't be bypassed by embedding a token or rewording a comment.

fix: ignore structural markdown comments in static engine

9da0bb7

rng1995 requested changes Jun 21, 2026

View reviewed changes

mimran-khan mentioned this pull request Jun 22, 2026

[Bug] Static pattern analyzers fire on markdown documentation and code blocks, not just executable skill logic #135

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: ignore structural markdown comments in static engine#49

fix: ignore structural markdown comments in static engine#49
Rachitrajvaishkiyar wants to merge 1 commit into
NVIDIA:mainfrom
Rachitrajvaishkiyar:fix/issue-37-static-false-positives

Rachitrajvaishkiyar commented Jun 14, 2026

Uh oh!

rng1995 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Rachitrajvaishkiyar commented Jun 14, 2026

Description

Changes Introduced

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants