Skip to content

Latest commit

 

History

History
807 lines (629 loc) · 27.7 KB

File metadata and controls

807 lines (629 loc) · 27.7 KB

Architecture Deep Dive

Adversarial Security Agents System Design

This document provides a detailed explanation of the system architecture, design decisions, and implementation details.


Table of Contents

  1. System Overview
  2. Component Details
  3. Data Flow
  4. Design Decisions
  5. Scalability & Performance

System Overview

High-Level Architecture

┌─────────────────────────────────────────────────────────────────┐
│  Layer 1: LLM Decision Making (Main PC)                         │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  LM Studio                                                 │  │
│  │  • Model: qwen2.5-coder-14b-instruct-abliterated         │  │
│  │  • Inference: temp=0.4, min_p=0.08, repeat_penalty=1.08  │  │
│  │  • API: OpenAI-compatible /v1/chat/completions           │  │
│  └───────────────────────────────────────────────────────────┘  │
└────────────────────────────┬────────────────────────────────────┘
                             │ HTTP POST (chat completions)
                             ↓
┌─────────────────────────────────────────────────────────────────┐
│  Layer 2: Knowledge Base (K3s Cluster)                          │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  MCP RAG Server                                            │  │
│  │  • Protocol: Model Context Protocol (JSON-RPC 2.0)        │  │
│  │  • Index: FAISS (L2 distance, 5,395 documents)            │  │
│  │  • Embeddings: all-MiniLM-L6-v2 (384 dimensions)          │  │
│  │  • Sources: GTFOBins, Atomic Red Team, HackTricks         │  │
│  └───────────────────────────────────────────────────────────┘  │
└────────────────────────────┬────────────────────────────────────┘
                             │ MCP Protocol (tools/call)
                             ↑
┌─────────────────────────────────────────────────────────────────┐
│  Layer 3: Autonomous Agent (K3s Pod - Isolated)                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Red Team Agent Process                                    │  │
│  │  ┌─────────────────┐  ┌──────────────┐  ┌──────────────┐ │  │
│  │  │ MCP Client      │→ │ LLM Client   │→ │ Command      │ │  │
│  │  │ (Query RAG)     │  │ (Get Plan)   │  │ Sandbox      │ │  │
│  │  └─────────────────┘  └──────────────┘  └──────────────┘ │  │
│  │                                               │            │  │
│  │  ┌─────────────────────────────────────────────┐          │  │
│  │  │ BlackArch Toolkit (2000+ tools)            │          │  │
│  │  │ • hydra, nmap, metasploit, sqlmap, etc.   │          │  │
│  │  └─────────────────────────────────────────────┘          │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  NetworkPolicy (Kernel-level Isolation)                    │  │
│  │  • Egress: ONLY target + MCP + LLM + DNS                  │  │
│  │  • Ingress: DENY ALL                                      │  │
│  └───────────────────────────────────────────────────────────┘  │
└────────────────────────────┬────────────────────────────────────┘
                             │ SSH (port 22)
                             ↓
┌─────────────────────────────────────────────────────────────────┐
│  Layer 4: Target Machine                                        │
│  • Intentionally vulnerable system                              │
│  • Weak credentials, SUID binaries, misconfigurations           │
└─────────────────────────────────────────────────────────────────┘

Component Details

1. LM Studio (LLM Inference Engine)

Role: Provides the "brain" for agent decision-making

Technology Stack:

  • LM Studio 0.3.28+ (local LLM inference)
  • Model: Qwen 2.5 Coder 14B Instruct (abliterated variant)
  • API: OpenAI-compatible REST API

Why Qwen 2.5 Coder Abliterated?

  • Uncensored: Standard models refuse security-related queries ("I cannot help with hacking")
  • Code-focused: Optimized for command syntax and technical tasks
  • Good reasoning: 14B parameters sufficient for attack planning
  • Local: No data leaves infrastructure (privacy + air-gap compatible)

Inference Configuration:

{
    "temperature": 0.4,        # Low = deterministic command generation
    "min_p": 0.08,             # Dynamic sampling (adapts to model confidence)
    "top_p": 1.0,              # Disabled (conflicts with min_p)
    "top_k": 0,                # Disabled (min_p is superior)
    "repeat_penalty": 1.08,    # Prevents "nmap... nmap... nmap..." loops
    "stream": False,           # Required for reliable text parsing
    "cache_prompt": True,      # Cache system prompt for speed
    "max_tokens": 2048         # Prevent runaway generation
}

Why These Settings?

  • Temperature 0.4: Research shows lower temp (0.3-0.5) better for tool use than default 0.7
  • Min-P 0.08: Dynamically filters tokens based on top token probability (better than static top_p)
  • Stream: False: Easier to parse complete responses vs streaming tokens

Known Limitation:

  • Tool calling API broken on abliterated models (see KNOWN-ISSUES.md)
  • Workaround: Agent-orchestrated pattern with text parsing

2. MCP RAG Server (Knowledge Base)

Role: Provides offensive security knowledge via semantic search

Technology Stack:

  • Web Framework: Flask 3.0.0 + Gunicorn (sync workers)
  • Vector Database: FAISS (Facebook AI Similarity Search)
  • Embeddings: sentence-transformers (all-MiniLM-L6-v2, 384 dims)
  • Protocol: Model Context Protocol (MCP) JSON-RPC 2.0

Why Flask + Gunicorn (not FastAPI + Uvicorn)?

  • K3s containerd blocks socketpair() syscall
  • FastAPI + Uvicorn requires socketpair for asyncio event loop
  • Flask + sync workers = no socketpair needed
  • Lesson: "Boring technology" wins in restricted environments

Data Sources (5,395 documents total):

  1. GTFOBins (~200 files): Unix binary exploitation for privilege escalation
  2. Atomic Red Team (~5,000 tests): MITRE ATT&CK-mapped adversary emulation
  3. HackTricks (~200 pages): Offensive security playbooks and techniques

FAISS Index Configuration:

dimension = 384  # all-MiniLM-L6-v2 embedding size
index = faiss.IndexFlatL2(dimension)  # L2 distance (Euclidean)

Why L2 Distance?

  • Simple, fast, works well for cybersecurity technical content
  • Alternatives (cosine similarity) showed minimal improvement in testing
  • FAISS IndexFlatL2 supports exact search (no approximation errors)

MCP Tools Exposed:

  1. search(query: str, top_k: int = 5)

    • Semantic search over entire knowledge base
    • Returns ranked results with source metadata
    • Example: search("SSH brute force weak password")
  2. list_techniques()

    • Returns all 327 indexed MITRE ATT&CK techniques
    • Useful for agent to explore available tactics

API Endpoint:

  • Internal: http://mcp-rag-server.default.svc.cluster.local:8080
  • External: http://192.168.1.181:30800 (NodePort)

3. Red Team Agent (Autonomous Executor)

Role: Orchestrates attack loop (query → plan → execute → observe → iterate)

Architecture Pattern: Agent-Orchestrated (not LLM-orchestrated)

Why Agent-Orchestrated?

# Agent controls the loop
while not objective_achieved:
    knowledge = query_mcp_rag(objective)
    plan = llm.generate(prompt_with_knowledge)
    commands = extract_commands_from_text(plan)
    result = execute_with_sandbox(commands[0])

# vs LLM-orchestrated (doesn't work with abliterated models)
# response = llm.chat(messages, tools=[query_rag_tool, execute_command_tool])
# for tool_call in response.tool_calls:  # <-- BROKEN on abliterated models
#     execute_tool(tool_call)

Benefits:

  • ✅ Works with abliterated models (essential for uncensored security research)
  • ✅ More transparent (can inspect exact prompts and responses)
  • ✅ Easier to debug (print/log every step)
  • ✅ More flexible (adjust extraction patterns without model changes)

Components:

a. MCPClient

class MCPClient:
    def __init__(self, mcp_url: str):
        self._initialize()  # Handshake: initialize → notifications/initialized → tools/list

    def search(self, query: str, top_k: int = 3) -> List[Dict]:
        # Query knowledge base via MCP tools/call

MCP Handshake:

  1. initialize - Exchange protocol version and capabilities
  2. notifications/initialized - Confirm ready
  3. tools/list - Discover available tools
  4. tools/call - Execute search/list_techniques

b. LLMClient

class LLMClient:
    def generate(self, prompt: str, system: str = None) -> str:
        # Call LM Studio /v1/chat/completions with optimized params

Prompt Structure:

System: [Comprehensive methodology with 5 phases]

User:
Relevant techniques from knowledge base:
[RAG results - top 3 documents]

Objective: Gain SSH access using weak credentials
Target: 192.168.1.99

Based on the knowledge base, provide ONE specific command.

c. CommandSandbox

class CommandSandbox:
    WHITELIST = [
        'ssh', 'hydra', 'nmap', 'metasploit', ...  # 2000+ BlackArch tools
    ]

    BLACKLIST_PATTERNS = [
        r'rm\s+-rf\s+/',    # Prevent destruction
        r'dd\s+if=.*of=/dev/',  # Prevent disk wipe
        r'mkfs',            # Prevent filesystem format
        ...
    ]

    def execute(self, command: str) -> Tuple[str, int]:
        if not self.is_safe(command):
            return "BLOCKED", -1

        result = subprocess.run(command, shell=True, capture_output=True)
        self.log(command, result.stdout, result.returncode)
        return result.stdout, result.returncode

Safety Features:

  • Whitelist: Only known tools allowed
  • Blacklist: Destructive patterns blocked
  • Logging: Every command recorded with timestamp
  • Timeout: 30s max execution time
  • Non-root: Runs as UID 1000

d. RedTeamAgent (Main Loop)

class RedTeamAgent:
    def attack_cycle(self, objective: str, max_iterations: int = 5):
        for iteration in range(max_iterations):
            # 1. Query knowledge base
            knowledge = self.query_knowledge_base(objective)

            # 2. Ask LLM for plan
            plan = self.ask_llm(plan_prompt, context=knowledge)

            # 3. Extract commands from freeform text
            commands = self._extract_commands(plan)

            # 4. Check for repetition (prevent infinite loops)
            if self._is_repeating(commands[0]):
                # Query RAG for alternative techniques
                alt_knowledge = self.query_knowledge_base(f"{objective} alternative")
                plan = self.ask_llm("Suggest DIFFERENT approach", context=alt_knowledge)
                commands = self._extract_commands(plan)

            # 5. Execute command
            self.command_history.append(commands[0])
            output, code = self.execute_command(commands[0])

            # 6. Check success
            if code == 0 and success_pattern_matches(output):
                print("SUCCESS!")
                break

            # 7. Ask LLM what to try next (with exit code feedback)
            feedback = f"Exit code: {code} {'✅' if code == 0 else '❌'}\nOutput: {output[:1024]}"
            next_action = self.ask_llm(feedback)

Key Features:

  • Repetition Detection: Tracks last 3 commands, auto-queries RAG for alternatives if stuck
  • Exit Code Feedback: LLM learns from success (0) vs failure (non-zero)
  • Output Truncation: 1024 chars max to LLM (prevents context overflow)
  • Command History: Full log saved to /var/log/agent/commands.log

4. BlackArch Container

Role: Provides offensive security toolkit in isolated environment

Base Image: archlinux:latest

Installed Tools (highlights):

  • Password Cracking: hydra, john, hashcat, medusa
  • Network Scanning: nmap, masscan, tcpdump, wireshark-cli
  • Web Tools: sqlmap, nikto, dirb, gobuster, wfuzz
  • Exploitation: metasploit, msfvenom
  • Wireless: aircrack-ng, airodump-ng, aireplay-ng
  • Enumeration: enum4linux, smbclient, rpcclient
  • +1990 more via BlackArch repository

Dockerfile Key Sections:

FROM archlinux:latest

# Install BlackArch repository
RUN curl -O https://blackarch.org/strap.sh && \
    chmod +x strap.sh && \
    ./strap.sh

# Install essential tools (2GB+ install)
RUN pacman -Sy --noconfirm \
    python python-pip \
    openssh nmap netcat curl wget \
    hydra sshpass nikto sqlmap \
    metasploit john hashcat \
    aircrack-ng wireshark-cli tcpdump

# Python dependencies
RUN pip install --no-cache-dir requests --break-system-packages

# Non-root user
RUN useradd -m -u 1000 agent
USER agent

CMD ["python", "/app/agent.py"]

Security Context (in K8s manifest):

securityContext:
  runAsUser: 1000
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]
  seccompProfile:
    type: RuntimeDefault

Why Non-Root?

  • Defense in depth: Even if agent escapes sandbox, can't modify container
  • Best practice: Never run as root unless absolutely necessary
  • UID 1000: Standard user ID (not privileged)

5. Network Isolation (Kubernetes NetworkPolicy)

Role: Kernel-level enforcement of allowed network traffic

Why NetworkPolicy > iptables?

  • Declarative: Defined in YAML, version-controlled
  • Kubernetes-native: Integrated with pod lifecycle
  • Provable: Can verify policy with kubectl describe
  • Automatic: Applied immediately when pod starts

Policy Definition:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-isolation
  namespace: redteam-lab
spec:
  podSelector:
    matchLabels:
      app: redteam-agent
  policyTypes:
    - Egress
  egress:
    # 1. Allow DNS (required for domain resolution)
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53

    # 2. Allow MCP RAG server (knowledge base)
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: default
          podSelector:
            matchLabels:
              app: mcp-rag-server
      ports:
        - protocol: TCP
          port: 8080

    # 3. Allow LM Studio (LLM inference)
    - to:
        - ipBlock:
            cidr: 192.168.1.84/32
      ports:
        - protocol: TCP
          port: 1234

    # 4. Allow target machine (attack surface)
    - to:
        - ipBlock:
            cidr: 192.168.1.99/32
      ports:
        - protocol: TCP
          port: 22  # SSH only

What's BLOCKED (implicit deny):

  • ❌ Internet access (no 0.0.0.0/0 route)
  • ❌ Other Kubernetes pods (except MCP server)
  • ❌ Other LAN hosts (except target + LLM)
  • ❌ Kubernetes API server
  • ❌ All ingress traffic

Testing Isolation:

# Should FAIL (internet blocked)
kubectl exec redteam-agent -n redteam-lab -- curl https://google.com

# Should SUCCEED (MCP server allowed)
kubectl exec redteam-agent -n redteam-lab -- curl http://mcp-rag-server.default.svc.cluster.local:8080/healthz

# Should SUCCEED (target allowed)
kubectl exec redteam-agent -n redteam-lab -- nc -zv 192.168.1.99 22

Data Flow

Typical Attack Iteration

1. OBJECTIVE SET
   ↓
   User/Config: "Gain SSH access using weak credentials"

2. QUERY KNOWLEDGE BASE
   ↓
   Agent → MCP Server: tools/call("search", {"query": "SSH brute force weak password"})
   ↓
   MCP Server → FAISS: Vector similarity search
   ↓
   FAISS returns top 3 documents:
   - Atomic Red Team T1110.001 (Password Guessing)
   - HackTricks SSH Brute Force
   - Hydra documentation
   ↓
   MCP Server → Agent: Results with technique details

3. PLAN ATTACK
   ↓
   Agent → LM Studio: POST /v1/chat/completions
   {
     "system": "You are a red team agent... [methodology]",
     "user": "Knowledge base: [RAG results]\n\nObjective: Gain SSH access\nSuggest ONE command."
   }
   ↓
   LM Studio (Qwen 2.5 Coder):
   - Reads RAG context
   - Applies system prompt methodology
   - Generates response with reasoning + command
   ↓
   Response: "**Reasoning**: Try common passwords via hydra (T1110.001)\n**Command**: `hydra -l victim -p password123 ssh://192.168.1.99`"
   ↓
   Agent: Extract command via regex

4. EXECUTE COMMAND
   ↓
   Agent → CommandSandbox: execute("hydra -l victim -p password123 ssh://192.168.1.99")
   ↓
   Sandbox: Check whitelist (hydra ✅) and blacklist (no patterns matched ✅)
   ↓
   Sandbox → Shell: subprocess.run(command, capture_output=True)
   ↓
   Shell: Execute hydra against target
   ↓
   Target (192.168.1.99): Accept connection, validate password
   ↓
   Hydra: "[22][ssh] host: 192.168.1.99   login: victim   password: password123"
   ↓
   Sandbox → Agent: (stdout, exit_code=0)
   ↓
   Agent: Log to /var/log/agent/commands.log

5. OBSERVE RESULT
   ↓
   Agent: Check if "password" in output and exit_code == 0
   ↓
   Agent: SUCCESS! SSH credentials found.
   ↓
   Break loop

6. REPORT
   ↓
   Agent: Print summary (time, commands executed, success)

Failure Scenario (with repetition detection):

Iteration 1: Execute "nmap -sV 192.168.1.99" → Exit 0
Iteration 2: Execute "nmap -sV 192.168.1.99" → Exit 0 (repeated)
Iteration 3: Execute "nmap -sV 192.168.1.99" → REPETITION DETECTED
   ↓
   Agent: Query RAG("SSH access alternative to nmap")
   ↓
   LLM: Suggest "hydra" instead
   ↓
   Execute "hydra..." → Success

Design Decisions

1. Why Agent-Orchestrated Instead of LLM-Orchestrated?

Decision: Agent controls loop, LLM provides text responses (not tool calls)

Reason: LM Studio tool calling API broken on abliterated models

Evidence:

curl .../v1/chat/completions -d '{"tools": [...]}'
# Returns: {"error":"Unexpected empty grammar stack after accepting piece: {\""}

Trade-offs:

Approach Pros Cons
LLM-Orchestrated Structured output, type safety Doesn't work with abliterated models ❌
Agent-Orchestrated Works with abliterated, more transparent Requires text parsing

Conclusion: Agent-orchestrated is actually BETTER for our use case (transparency, flexibility)


2. Why Flask + Gunicorn for MCP Server?

Decision: Use Flask + sync workers (not FastAPI + Uvicorn)

Reason: K3s containerd blocks socketpair() syscall

Testing:

# FastAPI + Uvicorn
import asyncio
loop = asyncio.get_event_loop()  # Requires socketpair() → BLOCKED by K3s

# Flask + Gunicorn sync workers
# No asyncio, no socketpair needed → WORKS

Lesson: Async isn't always necessary. Sync workers handle ~100 req/s fine for our use case.


3. Why Min-P Sampling Instead of Top-P?

Decision: Use min_p=0.08 (not top_p=0.9)

Reason: Min-P adapts dynamically to model confidence

How it works:

Top-P (static threshold):
- Keep tokens until cumulative probability ≥ 0.9
- Same threshold regardless of model confidence

Min-P (dynamic threshold):
- Keep tokens with P ≥ (min_p × P_top_token)
- If top token = 80%, min_p 0.08 → keep tokens >6.4%
- If top token = 20%, min_p 0.08 → keep tokens >1.6%

Result: Fewer low-quality command suggestions, more deterministic outputs

Source: Research paper "Min-P Sampling: A Better Sampling Method" (2024)


4. Why Physical Target Instead of Container?

Decision: Attack real hardware (192.168.1.99), not just other pods

Reasons:

  1. More realistic: Real attack surface (SSH, kernel, filesystem)
  2. Better demo: Shows agent works on actual infrastructure
  3. CVE testing: Can test real vulnerabilities, not simulated
  4. Still safe: NetworkPolicy prevents lateral movement

Alternative considered: Vulnerable container (Damn Vulnerable Web App in pod)

  • Pros: Easier setup, faster reset
  • Cons: Less realistic, can't test OS-level exploits

5. Why BlackArch Instead of Kali Linux?

Decision: Use BlackArch repository (Arch Linux base)

Reasons:

  1. Rolling release: Always latest tool versions
  2. Lightweight: Minimal base image (~500MB vs Kali's ~2GB)
  3. Pacman: Fast, reliable package manager
  4. Consistency: Main PC already runs BlackArch

Alternative: Kali Linux

  • Pros: More popular, better documented
  • Cons: Debian-based (slower updates), larger images

Scalability & Performance

Current Performance

Single Agent:

  • Attack cycle: ~90 seconds (SSH compromise)
  • MCP query latency: <100ms (FAISS L2 search)
  • LLM inference: 5-15s per response (14B model, consumer GPU)
  • Total iterations: 1-5 (depending on objective complexity)

Resource Usage:

  • Agent pod: 0.5 CPU, 512MB RAM (typical)
  • MCP server: 0.2 CPU, 1GB RAM (FAISS index in memory)
  • LM Studio: 8GB VRAM (14B model in 4-bit quantization)

Scaling Considerations

Horizontal Scaling (multiple agents):

# Scale to 10 agents attacking different targets
kubectl scale deployment redteam-agent -n redteam-lab --replicas=10

Bottlenecks:

  1. LM Studio: Single LLM instance = sequential inference

    • Solution: LM Studio supports parallel requests (queue internally)
    • Alternative: Deploy multiple LM Studio instances
  2. MCP RAG Server: FAISS IndexFlatL2 is single-threaded

    • Solution: FAISS supports GPU acceleration (IndexFlatL2 → IndexIVFFlat)
    • Alternative: Multiple MCP server replicas (stateless, can load-balance)
  3. Network: 1Gbps LAN sufficient for <100 agents

    • MCP queries: ~50KB/query
    • LLM responses: ~5KB/response

Future Enhancements:

  • LLM: Use vLLM for batched inference (10x throughput)
  • RAG: GPU-accelerated FAISS (100x faster search)
  • Agents: Distributed task queue (Celery + Redis)

Failure Modes & Recovery

1. LLM Generates Invalid Command

Symptom: Command extraction returns empty list

Recovery:

commands = self._extract_commands(plan)
if not commands:
    print("⚠️  No commands found")
    continue  # Skip iteration, query RAG again

Prevention: Structured system prompt enforces output format


2. Command Fails 3+ Times (Repetition)

Symptom: Same command executed 3+ times with failures

Recovery:

if self._is_repeating(command):
    # Query RAG for alternative techniques
    alt_knowledge = self.query_knowledge_base(f"{objective} alternative")
    plan = self.ask_llm("Suggest DIFFERENT approach")

Metrics: In testing, repetition detection reduced stuck loops by 80%


3. MCP Server Unavailable

Symptom: Connection refused or timeout

Recovery:

try:
    results = self.mcp.search(query)
except Exception as e:
    print(f"❌ MCP search failed: {e}")
    results = []  # Proceed without RAG context (degraded mode)

Alternative: Agent can still function with LLM-only (no RAG), just less accurate


4. Target Unreachable

Symptom: Network timeout or connection refused

Recovery:

output, code = self.execute_command(command)
if code != 0 and "timeout" in output.lower():
    print("⚠️  Target unreachable, aborting")
    break

NetworkPolicy blocks: Verified if unable to connect to allowed target IP


Security Considerations

Threat Model

Assumptions:

  • Agent code is trusted (user writes it)
  • LLM may generate malicious commands (defense: sandbox)
  • Target is intentionally vulnerable (expected)

Mitigations:

  1. NetworkPolicy: Kernel-level isolation (not app-level)
  2. Command Sandbox: Whitelist + blacklist (defense in depth)
  3. Non-root: Agent can't modify container filesystem
  4. Resource Limits: Prevent DoS (1 CPU, 1GB RAM)
  5. Logging: Full audit trail of all commands

Residual Risks:

  • LLM prompt injection: Could trick agent into skipping sandbox
    • Mitigation: Sandbox enforced in code, not via LLM
  • 0-day in BlackArch tools: Could escape container
    • Mitigation: seccomp RuntimeDefault blocks many syscalls
  • Kubernetes CVE: Could escape NetworkPolicy
    • Mitigation: Keep K8s updated, monitor CVEs

Future Architecture Evolution

Short-Term (Next 3 Months)

  1. Blue Team Agent: Defensive monitoring pod

    • Detects agent activity via logs
    • Competes with red team (red vs blue game)
  2. Multi-Objective Support: Chain techniques

    • Example: "Gain SSH access → privesc → exfiltrate flag"
  3. Stealth Metrics: Measure detectability

    • Track noise level (failed logins, scans, etc.)

Mid-Term (6-12 Months)

  1. Reinforcement Learning: Agent learns from success/failure

    • Reward: Time to compromise + stealth
    • Technique: PPO or DQN
  2. Multi-Agent Collaboration: Multiple agents coordinate

    • Example: One agent distracts IDS, another exploits
  3. Custom Exploit Development: Agent writes own scripts

    • LLM generates Python/Bash exploits based on recon

Long-Term (12+ Months)

  1. CTF Automation: Solve Capture The Flag challenges
  2. Real CVE Exploitation: Test against actual vulnerabilities
  3. Adversarial Training: Red agent trains blue agent (GAN-like)

Document Version: 1.0 Last Updated: October 9, 2025 Status: Production architecture, battle-tested