Summary
Imagine a toll booth that normally processes cars in 2 seconds each. One day a truck arrives with a license plate in a specific unusual format. The toll booth computer tries to read it and freezes — not for 2 seconds, but for 2 hours. Meanwhile, the entire highway backs up because one license plate crashed the system.
SkillSpector's MP2 "Context Window Stuffing" static pattern contains a regex that is vulnerable to catastrophic backtracking (ReDoS — Regular Expression Denial of Service). When this regex encounters certain input patterns, the regex engine's execution time grows exponentially, causing the scanner to hang indefinitely on a single file. A malicious skill can exploit this to DoS the scanning pipeline.
Why This Matters — Real-World Scenario
Scenario: Automated scanning pipeline in CI
An organization runs SkillSpector in their CI/CD pipeline with a 5-minute timeout per skill scan. A malicious contributor discovers that certain text patterns cause the MP2 regex to take exponential time. They craft a skill that looks normal except for one file containing:
abababababababababababababababababababababab
(a specific repeating pattern that triggers the catastrophic backtracking)
The CI job hits the regex, and the scanner process hangs. Because the regex engine is stuck in a tight CPU loop (not waiting on I/O), the process doesn't respond to graceful shutdown signals quickly. The CI job times out after 5 minutes and is marked as failed.
If the organization processes skills in a queue, this creates a denial-of-service: legitimate skills behind the malicious one wait indefinitely. If the scanner runs with limited concurrency (e.g., 4 parallel scans), 4 carefully-crafted submissions can block the entire pipeline.
Worse, if there's no timeout and the process runs in a serverless function (AWS Lambda, Cloud Run), the function runs until memory/time limits kill it — consuming billable compute time.
Reproduction
import re
import time
# The vulnerable regex from MP2 patterns
pattern = re.compile(r"((\S)(?!\2).{1,19}?)\1{20,}")
# Craft input that triggers catastrophic backtracking
# The regex tries to match a repeating pattern of 2+ unique chars
# When it can't find a full match, it backtracks exponentially
evil_input = "ab" * 30 + "c" # 61 chars — should be instant but isn't
start = time.time()
result = pattern.search(evil_input)
elapsed = time.time() - start
print(f"Time: {elapsed:.2f}s") # Expected: >10s (exponential growth with input length)
print(f"Match: {result}")
# Create a skill with the trigger pattern
mkdir -p /tmp/redos-skill
cat > /tmp/redos-skill/SKILL.md << 'SKILLEOF'
---
name: redos-test
description: Tests regex performance
---
# ReDoS Test
SKILLEOF
# Create file with pattern that triggers backtracking
python -c "print('ab' * 50 + 'c')" > /tmp/redos-skill/trigger.txt
# This scan will hang on the MP2 pattern
timeout 30 skillspector scan /tmp/redos-skill/ --no-llm
# Expected: timeout after 30s (scanner hangs on regex)
Root Cause
In src/skillspector/nodes/analyzers/static_patterns_memory_poisoning.py, line 84:
MP2_PATTERNS = [
# ... other patterns ...
(r"((\S)(?!\2).{1,19}?)\1{20,}", 0.8), # "Context Window Stuffing"
]
This regex attempts to detect repeated text patterns (potential context window stuffing). Breaking it down:
(\S) — capture a non-whitespace character
(?!\2) — followed by a character that is NOT the same (negative lookahead)
.{1,19}? — followed by 1-19 characters (lazy quantifier)
\1{20,} — the entire group repeated 20+ times
The catastrophic backtracking occurs because:
.{1,19}? is a lazy quantifier inside a group that must repeat 20+ times
- When the full match fails, the engine backtracks through all possible lengths of
.{1,19}? for each of the 20+ repetitions
- The combination of nested quantifiers (
{1,19}? inside \1{20,}) creates exponential state space
- Input that almost matches but doesn't (e.g., repeating patterns with a slight break) forces maximum backtracking
The time complexity is O(2^n) where n is the input length, rather than the expected O(n).
Impact
- Denial of Service: A single crafted file can hang the scanner process indefinitely
- CI/CD pipeline blocking: Scanning queues stall when one scan hangs
- Resource exhaustion: CPU pegged at 100% during backtracking with no progress
- No timeout protection: The regex engine does not support timeouts in Python's
re module
- Billable compute waste: Cloud-hosted scans (Lambda, Cloud Run) burn compute budget on stuck processes
- Trivial to exploit: The trigger pattern is simple to construct and doesn't look suspicious in a code review
Affected Version
SkillSpector v2.2.3
Summary
Imagine a toll booth that normally processes cars in 2 seconds each. One day a truck arrives with a license plate in a specific unusual format. The toll booth computer tries to read it and freezes — not for 2 seconds, but for 2 hours. Meanwhile, the entire highway backs up because one license plate crashed the system.
SkillSpector's MP2 "Context Window Stuffing" static pattern contains a regex that is vulnerable to catastrophic backtracking (ReDoS — Regular Expression Denial of Service). When this regex encounters certain input patterns, the regex engine's execution time grows exponentially, causing the scanner to hang indefinitely on a single file. A malicious skill can exploit this to DoS the scanning pipeline.
Why This Matters — Real-World Scenario
Scenario: Automated scanning pipeline in CI
An organization runs SkillSpector in their CI/CD pipeline with a 5-minute timeout per skill scan. A malicious contributor discovers that certain text patterns cause the MP2 regex to take exponential time. They craft a skill that looks normal except for one file containing:
(a specific repeating pattern that triggers the catastrophic backtracking)
The CI job hits the regex, and the scanner process hangs. Because the regex engine is stuck in a tight CPU loop (not waiting on I/O), the process doesn't respond to graceful shutdown signals quickly. The CI job times out after 5 minutes and is marked as failed.
If the organization processes skills in a queue, this creates a denial-of-service: legitimate skills behind the malicious one wait indefinitely. If the scanner runs with limited concurrency (e.g., 4 parallel scans), 4 carefully-crafted submissions can block the entire pipeline.
Worse, if there's no timeout and the process runs in a serverless function (AWS Lambda, Cloud Run), the function runs until memory/time limits kill it — consuming billable compute time.
Reproduction
Root Cause
In
src/skillspector/nodes/analyzers/static_patterns_memory_poisoning.py, line 84:This regex attempts to detect repeated text patterns (potential context window stuffing). Breaking it down:
(\S)— capture a non-whitespace character(?!\2)— followed by a character that is NOT the same (negative lookahead).{1,19}?— followed by 1-19 characters (lazy quantifier)\1{20,}— the entire group repeated 20+ timesThe catastrophic backtracking occurs because:
.{1,19}?is a lazy quantifier inside a group that must repeat 20+ times.{1,19}?for each of the 20+ repetitions{1,19}?inside\1{20,}) creates exponential state spaceThe time complexity is O(2^n) where n is the input length, rather than the expected O(n).
Impact
remoduleAffected Version
SkillSpector v2.2.3