Skip to content

[Phase 0.2.3] Create adversarial test framework#37

Open
richard-devbot wants to merge 3 commits intoCursorTouch:mainfrom
richard-devbot:richardson/phase0-adversarial-tests
Open

[Phase 0.2.3] Create adversarial test framework#37
richard-devbot wants to merge 3 commits intoCursorTouch:mainfrom
richard-devbot:richardson/phase0-adversarial-tests

Conversation

@richard-devbot
Copy link
Copy Markdown

Closes #9

What was implemented

tests/adversarial/ directory structure

  • __init__.py — package marker
  • conftest.py — three pytest fixtures:
    • injection_payloads — parametrized fixture loading all 55 prompt injection patterns from prompt_injection.yaml (one test invocation per pattern, keyed by ID)
    • mock_llm_with_injection — builds a MagicMock LLM client whose .complete() / .acomplete() return the current injection payload as response content, simulating a compromised or adversarially-controlled model
    • attack_scenario — parametrized fixture over 5 multi-step attack chains (roleplay escalation, tool-chain exfiltration, indirect web injection, authority escalation, context poisoning via memory)

Payload library (tests/adversarial/payloads/)

File Patterns Categories
prompt_injection.yaml 55 instruction override, data exfiltration, jailbreak, prompt leakage, context poisoning, obfuscation, multi-turn, tool abuse, social engineering, nested injection
indirect_injection.yaml 33 web content, document (PDF/CSV/JSON/YAML), email, API response, database, code repository, calendar, search results, media metadata, cross-context
resource_exhaustion.yaml 28 context flood, recursive prompts, tool call abuse, memory exhaustion, computation abuse, malformed input, rate/session abuse

Test file (test_adversarial.py)

  • TestPromptInjectionPayloads — 5 parametrized test methods run against every injection payload (schema validation, null byte stripping, INST delimiter removal, <system> tag stripping, mock LLM response sanitization)
  • TestAttackScenarios — 5 test methods validate each attack scenario's schema and ordering invariants
  • 5 @given hypothesis tests: sanitizer never crashes on arbitrary text, sanitizer is idempotent, handles control/surrogate characters, response safety check handles any dict, injection delimiters stripped in all contexts

Dev dependencies added

  • hypothesis>=6.100.0
  • pyyaml>=6.0.0

Test count

With 55 injection payloads × 5 test methods + 5 scenarios × 5 test methods + 5 hypothesis tests = 305+ test invocations from this framework.

…brary

Creates tests/adversarial/ with conftest fixtures (injection_payloads,
mock_llm_with_injection, attack_scenario), 55+ prompt injection patterns,
33 indirect injection patterns, 28 resource exhaustion patterns across three
YAML payload files, and property-based fuzz tests via hypothesis. Adds
hypothesis and pyyaml to dev dependencies.

Closes CursorTouch#9

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@richard-devbot richard-devbot force-pushed the richardson/phase0-adversarial-tests branch from 942fb76 to 80461ec Compare April 19, 2026 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Phase 0.2.3] Create adversarial test framework

1 participant