Summary
If you write a cookbook and include the sentence "be careful with knives — they can cut you," a safety scanner should not flag your cookbook as a weapon. But that's exactly what SkillSpector does with documentation.
Consider a skill that has a docs/deployment.md file showing users how to manually check a service:
This is a documentation example: it is not executable code and not something the agent will run, just a reference for humans reading the skill. Yet SkillSpector's static pattern analyzers flag it as "Tool Misuse: insecure network call" with the same severity as if the skill's actual Python code were running curl -k.
The problem: 10 of 12 static analyzers have zero awareness of whether they're scanning executable code or markdown documentation. Only 2 analyzers (excessive_agency and memory_poisoning) check if a match is inside a code example. The rest fire indiscriminately on any text that matches a regex pattern — including fenced code blocks, usage examples, and reference documentation that will never be executed by an agent.
Relation to Existing Issues
This problem has been reported from the user-symptom side by multiple contributors:
What this issue adds:
Root cause identification: the exact code-level reason is that is_code_example() exists in common.py but is called by only 2 of 12 static analyzers. This is the specific integration gap.
Comprehensive fix scope: not just one analyzer (P2) or one scoring formula, but a unified filtering layer in run_static_patterns() that applies to all 12 analyzers before findings are emitted.
File-type-aware strategy: findings in non-executable file types (.md, .json, .yaml) are hard-dropped, while findings in executable files are only downweighted in confidence, since a code-example context there might be real but deserves lower confidence. This avoids the security hole where an attacker could suppress findings in .py files by salting them with code-example indicators.
Single integration point: filtering at the static_runner.py level with _NON_EXECUTABLE_FILE_TYPES, _DOCUMENTATION_CONFIDENCE_FACTOR, and _CODE_EXAMPLE_CONFIDENCE_FACTOR, rather than per-analyzer patches.
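The file-type-aware strategy can be sketched roughly as follows. This is a minimal illustration, not SkillSpector's actual code: the Finding record, the filter_finding helper, and the constant values are assumptions made for the example. The issue also names a _DOCUMENTATION_CONFIDENCE_FACTOR, but its exact role is not spelled out here, so the sketch leaves it out.

```python
from dataclasses import dataclass, replace
from pathlib import PurePosixPath
from typing import Optional

# Assumed values, chosen only for illustration.
_NON_EXECUTABLE_FILE_TYPES = {".md", ".json", ".yaml"}
_CODE_EXAMPLE_CONFIDENCE_FACTOR = 0.5


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; the real structure may differ."""
    rule_id: str
    path: str
    confidence: float


def filter_finding(finding: Finding, in_code_example: bool) -> Optional[Finding]:
    """Drop, downweight, or pass through a single finding."""
    suffix = PurePosixPath(finding.path).suffix.lower()
    if suffix in _NON_EXECUTABLE_FILE_TYPES:
        # Non-executable file type: hard-drop the finding.
        return None
    if in_code_example:
        # Executable file: keep the finding at lower confidence, so salting
        # a .py file with example markers cannot hide it entirely.
        lowered = finding.confidence * _CODE_EXAMPLE_CONFIDENCE_FACTOR
        return replace(finding, confidence=lowered)
    return finding


doc = filter_finding(Finding("TM1", "docs/deployment.md", 0.9), in_code_example=True)
code = filter_finding(Finding("TM1", "tool.py", 0.9), in_code_example=True)
plain = filter_finding(Finding("TM1", "tool.py", 0.9), in_code_example=False)
```

Under these assumptions the documentation finding disappears, the finding in tool.py survives at reduced confidence when it sits in an example context, and an ordinary finding in tool.py is left unchanged.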
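Reproduction
Create a skill with documentation that references common shell patterns, made up of SKILL.md, tool.py, docs/usage.md, and docs/deployment.md, then scan it:
skillspector scan ./docs-skill/ --no-llm --format json
# Multiple TM1, EA2, SQP findings, ALL from the .md documentation files
# tool.py (the only executable code) has zero findings
# Score: HIGH or CRITICAL due to documentation examples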
Root Cause
1. is_code_example() only used in 2 of 12 analyzers
The static_patterns_tool_misuse analyzer (TM1) has zero documentation filtering:
# static_patterns_tool_misuse.py: no call to is_code_example anywhere
for pattern, confidence in TM1_PATTERNS:
    for match in re.finditer(pattern, content, re.IGNORECASE | re.MULTILINE):
        # Fires on ANY text match, including markdown code blocks
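To make the effect of that loop concrete, here is a small self-contained sketch. The single pattern in TM1_PATTERNS and its confidence are hypothetical stand-ins (the real pattern list is not shown in this issue), and scan is an illustrative helper rather than the analyzer's real entry point. It shows that a markdown example and real Python code produce the same hit.

```python
import re

# Hypothetical stand-in for one TM1 pattern and its confidence.
TM1_PATTERNS = [(r"\bcurl\b[^\n]*\s(?:-k|--insecure)\b", 0.8)]

FENCE = "`" * 3  # built indirectly so this example can sit inside a fence

# A documentation snippet: prose plus a fenced shell example.
markdown_doc = "\n".join([
    "To check the service manually:",
    FENCE + "bash",
    "curl -k https://localhost:8443/health",
    FENCE,
])

# Executable code that really performs the insecure call.
python_src = 'import os\nos.system("curl -k https://localhost:8443/health")\n'


def scan(content):
    """Collect (matched text, confidence) pairs with no context filtering."""
    hits = []
    for pattern, confidence in TM1_PATTERNS:
        for match in re.finditer(pattern, content, re.IGNORECASE | re.MULTILINE):
            hits.append((match.group(0), confidence))
    return hits


doc_hits = scan(markdown_doc)
code_hits = scan(python_src)
```

Both calls return one hit with the same confidence, because nothing in the loop looks at where the match occurred.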
2. is_code_example() itself is too narrow
It only checks for the presence of backticks in a ±3 line context window. It doesn't:
Parse markdown structure (fenced code blocks have start/end boundaries)
Distinguish "this skill will execute X" from "this document describes X"
Account for documentation files in docs/, procedures/, references/ subdirectories
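One way to address the first gap is to track fence boundaries instead of looking for nearby backticks. The sketch below is only an illustration under simplifying assumptions: fenced_line_numbers is a hypothetical helper, it handles plain backtick and tilde fences, and it is not a full CommonMark parser (for instance, it ignores indented code blocks and fences nested in lists).

```python
import re
from typing import Set

# An opening or closing fence: up to 3 spaces, then 3+ backticks or tildes.
_FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def fenced_line_numbers(markdown: str) -> Set[int]:
    """Return 1-based numbers of lines inside fenced code blocks.

    The fence lines themselves are not included.
    """
    inside: Set[int] = set()
    open_char = ""
    open_len = 0
    for number, line in enumerate(markdown.splitlines(), start=1):
        m = _FENCE.match(line)
        if not open_char:
            if m:
                # Opening fence: remember its character and length.
                open_char, open_len = m.group(1)[0], len(m.group(1))
            continue
        closes = (
            m is not None
            and m.group(1)[0] == open_char
            and len(m.group(1)) >= open_len
            and line.strip() == m.group(1)  # closing fence has no info string
        )
        if closes:
            open_char, open_len = "", 0
        else:
            inside.add(number)
    return inside


FENCE = "`" * 3  # built indirectly so this example can sit inside a fence
doc = "\n".join([
    "Run the check by hand:",
    FENCE + "bash",
    "curl -k https://localhost:8443/health",
    FENCE,
    "Then restart the service.",
])
fenced = fenced_line_numbers(doc)
```

Here only line 3, the curl command, is reported as fenced. A match on that line could then be treated as an example, while the surrounding prose lines are not.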
Impact
Skills with deployment/procedure documentation are systematically flagged as CRITICAL
Developers documenting shell commands in their skill get penalized
The scanner cannot distinguish "skill instructs the agent to run curl --insecure" (genuine risk) from "documentation describes a manual procedure that uses curl --insecure" (informational only)
Makes the tool unreliable for any real-world skill that includes usage examples
Affected Version
SkillSpector v2.2.3