[Phase 0.2.3] Create adversarial test framework by richard-devbot · Pull Request #37 · CursorTouch/Operator-Use

richard-devbot · 2026-04-13T06:50:10Z

Closes #9

What was implemented

`tests/adversarial/` directory structure

__init__.py — package marker
conftest.py — three pytest fixtures:
- injection_payloads — parametrized fixture loading all 55 prompt injection patterns from prompt_injection.yaml (one test invocation per pattern, keyed by ID)
- mock_llm_with_injection — builds a MagicMock LLM client whose .complete() / .acomplete() return the current injection payload as response content, simulating a compromised or adversarially-controlled model
- attack_scenario — parametrized fixture over 5 multi-step attack chains (roleplay escalation, tool-chain exfiltration, indirect web injection, authority escalation, context poisoning via memory)

Payload library (`tests/adversarial/payloads/`)

File	Patterns	Categories
`prompt_injection.yaml`	55	instruction override, data exfiltration, jailbreak, prompt leakage, context poisoning, obfuscation, multi-turn, tool abuse, social engineering, nested injection
`indirect_injection.yaml`	33	web content, document (PDF/CSV/JSON/YAML), email, API response, database, code repository, calendar, search results, media metadata, cross-context
`resource_exhaustion.yaml`	28	context flood, recursive prompts, tool call abuse, memory exhaustion, computation abuse, malformed input, rate/session abuse

Test file (`test_adversarial.py`)

TestPromptInjectionPayloads — 5 parametrized test methods run against every injection payload (schema validation, null byte stripping, INST delimiter removal, <system> tag stripping, mock LLM response sanitization)
TestAttackScenarios — 5 test methods validate each attack scenario's schema and ordering invariants
5 @given hypothesis tests: sanitizer never crashes on arbitrary text, sanitizer is idempotent, handles control/surrogate characters, response safety check handles any dict, injection delimiters stripped in all contexts

Dev dependencies added

hypothesis>=6.100.0
pyyaml>=6.0.0

Test count

With 55 injection payloads × 5 test methods + 5 scenarios × 5 test methods + 5 hypothesis tests = 305+ test invocations from this framework.

…brary Creates tests/adversarial/ with conftest fixtures (injection_payloads, mock_llm_with_injection, attack_scenario), 55+ prompt injection patterns, 33 indirect injection patterns, 28 resource exhaustion patterns across three YAML payload files, and property-based fuzz tests via hypothesis. Adds hypothesis and pyyaml to dev dependencies. Closes CursorTouch#9 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

… [ci]

richard-devbot force-pushed the richardson/phase0-adversarial-tests branch from 942fb76 to 80461ec Compare April 19, 2026 16:13

Richardson Gunde added 2 commits April 19, 2026 22:01

fix: update test imports for refactored tools paths [ci]

36242e1

fix: fix remaining test_agent.py and e2e imports for refactored tools…

c311c28

… [ci]

richard-devbot mentioned this pull request Apr 26, 2026

PR Merge Roadmap — reviewer guide for Joe #47

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Phase 0.2.3] Create adversarial test framework#37

[Phase 0.2.3] Create adversarial test framework#37
richard-devbot wants to merge 3 commits intoCursorTouch:mainfrom
richard-devbot:richardson/phase0-adversarial-tests

richard-devbot commented Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

richard-devbot commented Apr 13, 2026

What was implemented

tests/adversarial/ directory structure

Payload library (tests/adversarial/payloads/)

Test file (test_adversarial.py)

Dev dependencies added

Test count

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`tests/adversarial/` directory structure

Payload library (`tests/adversarial/payloads/`)

Test file (`test_adversarial.py`)