feat: introduce multi-agent red team orchestration framework #36
Merged
l50 merged 9 commits on Jan 13, 2026
…am capabilities

**Added:**

- RedTeamDispatcher class for centralized multi-agent task coordination and message routing (`src/ares/core/dispatcher.py`)
- KubernetesPodExecutor for executing commands in ephemeral Kubernetes pods, with pod discovery and retry (`src/ares/core/k8s_executor.py`)
- Inter-agent message protocol with Pydantic models and enums for agent communication (`src/ares/core/messages.py`)
- OperationRecoveryManager for checkpoint/restore of cluster-wide state in Redis to support pod crash recovery (`src/ares/core/recovery.py`)
- Multi-agent models: AgentInfo, AgentRole, SharedRedTeamState, TaskInfo, TaskResult, VulnerabilityInfo, AgentLocalState (`src/ares/core/models.py`)
- Factories for creating specialized red team agents by role, including agent registration and ensemble creation (`src/ares/core/factories/red_agents.py`)
- Orchestrator, Cracker, and Lateral callback toolsets for agent-specific multi-agent workflows (`src/ares/tools/red/orchestrator.py`)
- Dedicated agent instruction templates for orchestrator, cracker, lateral, privesc, acl_exploiter, poisoner, atomic roles (`templates/redteam/agents/*.md.jinja`)

**Changed:**

- `__init__` files in core and factories updated to expose new dispatcher, recovery, Kubernetes executor, multi-agent factories, and models
- RedFactory and RedTeamState expanded to support multi-agent features and new attack chains
- Red team toolsets extended with CredentialDiscoveryTools, expanded ACL, ADCS, delegation, and MSSQL attack support (`src/ares/tools/red/network.py`)
- System instructions template updated with multi-agent, low-hanging fruit, and advanced attack path guidance (`templates/redteam/agents/system_instructions.md.jinja`)
- Tests for red_factory updated for new event/message model, more robust event handling, and ToolEnd checks (`tests/test_red_factory.py`)
- Exposed new orchestration and callback toolsets in `tools/red/__init__.py`

**Removed:**

- No files removed in this change.
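As a rough illustration of the typed message routing this commit describes, the sketch below uses stdlib dataclasses as a stand-in for the Pydantic models. `MessageType`, `AgentMessage`, `MiniDispatcher`, and all of their fields are hypothetical names chosen for the example; they are not the actual contents of `messages.py` or `dispatcher.py`.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum


class MessageType(str, Enum):
    # Hypothetical message kinds; the real enum may define different members.
    TASK_REQUEST = "task_request"
    CREDENTIAL_FOUND = "credential_found"


@dataclass
class AgentMessage:
    # Stand-in for a Pydantic model describing one inter-agent message.
    type: MessageType
    sender: str
    payload: dict = field(default_factory=dict)


class MiniDispatcher:
    """Routes each published message to every handler subscribed to its type."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, msg_type, handler):
        self._subscribers[msg_type].append(handler)

    def publish(self, message):
        # .get() avoids creating an empty entry for types nobody subscribed to.
        handlers = self._subscribers.get(message.type, [])
        for handler in handlers:
            handler(message)
        return len(handlers)  # number of handlers that received the message
```

A central dispatcher of this shape keeps agents decoupled: a sender only needs to know the message type, not which roles are listening.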
**Changed:**

- Renamed `AgentRole.ORCHESTRATOR` to `AgentRole.ENUM` across all logic, including dispatcher, agent creation, and configuration to better reflect enumeration role
- Renamed `AgentRole.ACL_EXPLOITER` to `AgentRole.ACL` for brevity and clarity
- Renamed `AgentRole.POISONER` to `AgentRole.POISONING` for naming consistency
- Updated all role-based configuration dictionaries, instruction templates, capabilities, and factory logic to use new role names
- Changed default multi-agent ensemble roles to use new names
- Updated all references and logic in dispatcher to use new role names for subscriptions and routing

**Removed:**

- Deprecated old template files and replaced them by renaming:
  - `templates/redteam/agents/orchestrator.md.jinja` → `enum.md.jinja`
  - `templates/redteam/agents/acl_exploiter.md.jinja` → `acl.md.jinja`
  - `templates/redteam/agents/poisoner.md.jinja` → `poisoning.md.jinja`
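The relationship between the renamed roles and the renamed template files can be sketched as follows. This is an assumption-laden illustration: only the three renamed members are shown, the enum values are guessed from the new template file names, and `template_for` is a hypothetical helper rather than a function from the codebase.

```python
from enum import Enum


class AgentRole(str, Enum):
    # Only the members renamed in this commit; values are assumed, not confirmed.
    ENUM = "enum"            # formerly ORCHESTRATOR
    ACL = "acl"              # formerly ACL_EXPLOITER
    POISONING = "poisoning"  # formerly POISONER


def template_for(role: AgentRole) -> str:
    # Hypothetical helper: assumes each template file is named after the role value.
    return f"templates/redteam/agents/{role.value}.md.jinja"
```

Keeping the role value and the template file name identical means a rename touches one string per role instead of a separate lookup table.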
**Changed:**

- Relocated all template files from top-level templates directory to `src/ares/templates` to improve project organization and align with standard source code structure. No content changes were made to the templates.
**Added:**

- Introduced multi-agent operation orchestration via new `ares.core.orchestrator` with workflow automation, agent ensemble creation, and dispatcher integration
- Implemented worker agent loop (`ares.core.worker`) for specialized agent task processing, heartbeat monitoring, and dispatcher task completion reporting
- Added workflow automation utilities (`ares.core.workflows`) for credential expansion and exploitation coordination in multi-agent operations
- Provided production YAML configuration for multi-agent operations (`config/multi-agent-production.yaml`) supporting agent roles, timeouts, priorities, and resource/security settings
- Added integration tests for end-to-end multi-agent workflow orchestration, vulnerability queue, credential expansion, and dispatcher message flow (`tests/integration/test_multi_agent_workflow.py`)

**Changed:**

- Updated `Taskfile.yaml` with new multi-agent red team tasks, status, checkpoint, and Kubernetes infrastructure checks for streamlined multi-agent management
- Extended `pyproject.toml` to include YAML, Jinja, and Markdown files in build artifacts and added conditional dependency for `importlib_resources`
- Updated `src/ares/core/__init__.py` to expose new orchestration, config, worker, and workflow modules in package exports
- Refactored dispatcher (`src/ares/core/dispatcher.py`) to add priority-based vulnerability queue, exploitation tracking, and async task completion handling
- Enhanced orchestrator toolset (`src/ares/tools/red/orchestrator.py`) with credential expansion, vulnerability queueing, and queue status reporting tools
- Improved template resource loading (`src/ares/core/templates.py`) for compatibility with package installations using `importlib_resources`
- Updated red team engines (`src/ares/core/engines.py`) to use new template resource loading for attack chain and detection recipe YAMLs
- Updated main CLI entrypoint (`src/ares/main.py`) to support multi-agent orchestration and worker agent invocation with config-driven argument parsing
- Extended recovery manager (`src/ares/core/recovery.py`) with periodic checkpointing using dispatcher state
- Improved remote execution module (`src/ares/core/remote.py`) to support both Kubernetes subprocess and AWS SSM execution modes

**Added:**

- Added new configuration module (`src/ares/core/config.py`) for YAML-driven, environment-variable-overrideable multi-agent operation settings

**Changed:**

- Updated README generator to use the new template directory location

**Removed:**

- No removals in this change set; all changes are additive or enhancements to support multi-agent red team workflows and orchestration
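The priority-based vulnerability queue with exploitation tracking mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions: `VulnerabilityQueue`, its method names, and the "lower number means higher priority" convention are all hypothetical and may not match the dispatcher's actual design.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class _Entry:
    priority: int                        # assumed: lower value is served first
    seq: int                             # insertion counter, breaks priority ties FIFO
    vuln_id: str = field(compare=False)  # excluded from ordering comparisons


class VulnerabilityQueue:
    """Min-heap of pending vulnerabilities that skips already-exploited ones."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()
        self._exploited = set()

    def push(self, vuln_id, priority):
        heapq.heappush(self._heap, _Entry(priority, next(self._counter), vuln_id))

    def mark_exploited(self, vuln_id):
        self._exploited.add(vuln_id)

    def pop(self):
        # Discard entries that were exploited after being queued.
        while self._heap:
            entry = heapq.heappop(self._heap)
            if entry.vuln_id not in self._exploited:
                return entry.vuln_id
        return None
```

Filtering exploited entries lazily at `pop` time avoids having to search and re-heapify the queue whenever an agent reports a success.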
…rkers

**Added:**

- Added `redis` as a required dependency for worker operation coordination
- Implemented `discover_active_operation` async utility to scan Redis for the most recently checkpointed operation, enabling workers to auto-discover which operation to join if not specified
- Enhanced CLI and worker startup to support optional operation ID with auto-discovery logic

**Changed:**

- Updated worker launch flow to allow empty or missing operation IDs; workers now attempt Redis-based discovery before failing
- Improved CLI documentation and parameter handling to reflect new auto-discovery behavior, including updated usage examples and argument descriptions
- Adjusted handling of empty string operation IDs (e.g., from k8s configmaps) to trigger auto-discovery logic rather than error
- Updated lock and dependency files to include `redis` and the correct marker for `importlib-resources` based on Python version

**Removed:**

- Removed strict requirement for an explicit operation ID when starting a worker; this is now optional due to discovery logic
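The discovery fallback described in this commit can be sketched as below. A plain dict of operation ID to checkpoint timestamp stands in for whatever the real utility reads from Redis, so the signature of `discover_active_operation` here is an assumption, and `resolve_operation_id` is a hypothetical wrapper added for illustration.

```python
import asyncio
from typing import Optional


async def discover_active_operation(checkpoints: dict) -> Optional[str]:
    """Return the operation ID with the newest checkpoint timestamp, if any.

    `checkpoints` maps operation ID -> checkpoint time; in the real worker
    this data would come from Redis rather than an in-memory dict.
    """
    if not checkpoints:
        return None
    return max(checkpoints, key=checkpoints.get)


async def resolve_operation_id(requested: Optional[str], checkpoints: dict) -> str:
    # Both None and "" (e.g. an unset ConfigMap value) trigger discovery.
    if requested:
        return requested
    discovered = await discover_active_operation(checkpoints)
    if discovered is None:
        raise RuntimeError("no operation ID given and none could be discovered")
    return discovered


# An empty operation ID falls back to the most recently checkpointed operation.
print(asyncio.run(resolve_operation_id("", {"op-a": 100.0, "op-b": 250.0})))  # prints "op-b"
```

Treating the empty string the same as a missing value matters in Kubernetes, where an unset ConfigMap key typically surfaces as `""` rather than as an absent variable.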
…oss-pod messaging

**Added:**

- Introduced `RedisTaskQueue` in `src/ares/core/task_queue.py` for cross-pod task and result messaging via Redis, supporting multi-agent workflows in Kubernetes
- Implemented `TaskMessage` and `TaskResult` Pydantic models for structured task/result exchange
- Added `RedisWorkerAgent` to `src/ares/core/worker.py` for polling Redis and reporting results in Kubernetes deployments
- Added `kubernetes>=29.0.0` as a dependency in `pyproject.toml` and `uv.lock` for direct K8s pod interactions
- Created `tests/integration/test_redis_task_queue_integration.py` for end-to-end Redis queue testing
- Created `tests/test_task_queue.py` for unit testing `RedisTaskQueue` behavior and models

**Changed:**

- Updated `RedTeamDispatcher` to use `RedisTaskQueue` when `redis_url` is set, falling back to in-memory queues otherwise
- Refactored all major dispatcher task routing methods (`request_crack`, `request_lateral_movement`, etc.) to support Redis queueing for cross-pod communication
- Extended dispatcher with `dispatch_and_wait` and `wait_for_redis_result` for synchronous-style orchestration via Redis
- Updated worker startup to prefer Redis-based polling in Kubernetes deployments and fall back to in-memory dispatcher in single-process mode
- Improved prompt generation for Redis task consumption in `generate_prompt_from_task`
- Updated orchestrator tools to support `wait_for_result` and timeout parameters, enabling synchronous workflows with Redis-backed workers
- Enhanced BloodHound output parsing and host registration for improved state sharing between agents
- Marked `KubernetesPodExecutor` as deprecated for task dispatch, recommending Redis-based task queue usage
- Updated integration tests and fixtures to mock or patch Redis for reliable test execution
- Updated `uv.lock` to reflect new dependencies: `kubernetes`, `durationpy`, `google-auth`, `pyasn1`, `pyasn1-modules`, `requests-oauthlib`, `rsa`, `websocket-client`

**Removed:**

- Removed direct K8s port-forward and local subprocess-based task routing from main orchestrator workflow in favor of in-cluster execution and Redis-based coordination
- Deprecated direct in-process pod execution for agent communication in favor of Redis queue mechanisms
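The synchronous-style `dispatch_and_wait` flow can be illustrated with an in-memory queue, which is roughly the role the single-process fallback plays when `redis_url` is not set. This is only a sketch: `InMemoryTaskQueue`, its methods, the `worker` coroutine, and the `dispatch_and_wait` signature are hypothetical, and a Redis-backed queue would replace the `asyncio` primitives with network calls.

```python
import asyncio
import uuid


class InMemoryTaskQueue:
    """Single-process stand-in; a Redis-backed queue could expose the same methods."""

    def __init__(self):
        self._tasks = asyncio.Queue()
        self._results = {}  # task_id -> Future resolved by the worker

    async def submit(self, role, payload):
        task_id = uuid.uuid4().hex
        self._results[task_id] = asyncio.get_running_loop().create_future()
        await self._tasks.put((task_id, role, payload))
        return task_id

    async def next_task(self):
        return await self._tasks.get()

    def complete(self, task_id, result):
        self._results[task_id].set_result(result)

    async def wait_for_result(self, task_id, timeout):
        try:
            # Raises asyncio.TimeoutError if no worker answers in time.
            return await asyncio.wait_for(self._results[task_id], timeout)
        finally:
            self._results.pop(task_id, None)


async def dispatch_and_wait(queue, role, payload, timeout=5.0):
    # Submit a task, then block (asynchronously) until its result arrives.
    task_id = await queue.submit(role, payload)
    return await queue.wait_for_result(task_id, timeout)


async def worker(queue):
    # Handles exactly one task by echoing it back as the result.
    task_id, role, payload = await queue.next_task()
    queue.complete(task_id, {"role": role, "echo": payload})
```

Pairing each task ID with its own pending result lets the orchestrator await one specific answer while other tasks remain in flight, and the timeout keeps a crashed worker from stalling the workflow indefinitely.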
CAP-847 Implement Multi-Agent Red Team Architecture in Kubernetes
Description:
Objective: Design and implement a Kubernetes-based multi-agent red team system with specialized agents, centralized task dispatch, and resilient shared state, supporting parallelized attack workflows and coordinated operations.
Scope of Work:
Dependencies:
Acceptance Criteria:
Additional Notes:
**Changed:**

- Adjust temporary project structure in test to create the templates directory at `src/ares/templates` with parent directories, aligning with the path expected by the script in `test_generate_readme.py`
Key Changes:

Added:

- `config/multi-agent-production.yaml` defines operational parameters, agent roles, timeouts, recovery, priorities, and security for Kubernetes deployments
- `src/ares/core/config.py`: YAML/env config loader with agent and operation schemas, supporting overrides and caching
- `src/ares/core/dispatcher.py`: Central `RedTeamDispatcher` for agent registration, message routing, task management, and shared state
- `src/ares/core/messages.py`: Typed inter-agent protocol for tasks, discovery, and coordination
- `src/ares/core/task_queue.py`: Redis-based cross-pod task queue for orchestrator <-> worker communication with heartbeat and queue stats
- `src/ares/core/worker.py`: Worker agent loop for polling tasks from Redis or dispatcher and reporting results
- `src/ares/core/workflows.py`: Automated credential expansion and exploitation workflows for recursive attack loops
- `src/ares/core/orchestrator.py`: Main entrypoint for running multi-agent operations and coordination logic
- `src/ares/core/k8s_executor.py`: (Deprecated) Kubernetes pod executor for direct pod command execution (retained for debugging/logging)
- `src/ares/core/recovery.py`: OperationRecoveryManager for checkpointing, restoring, and cleaning up operation state in Redis
- `src/ares/core/factories/red_agents.py`: Factories for creating role-specific agents, hooks, and toolsets with detailed orchestration logic
- `src/ares/templates/redteam/agents/`
- `src/ares/tools/red/orchestrator.py`: Tools for dispatching tasks, monitoring state, and reporting results for orchestrator, cracker, and lateral agents
- `tests/integration/test_multi_agent_workflow.py`, `tests/integration/test_redis_task_queue_integration.py`, and `tests/test_task_queue.py` for orchestrator, queue, and workflow validation

Changed:
- `pyproject.toml`, `poetry.lock`, and `uv.lock`: `redis`, `kubernetes`, and `importlib_resources` dependencies for distributed coordination and resource loading
- `Taskfile.yaml`: operations in Kubernetes, plus infrastructure checks
- `src/ares/core/__init__.py`, `src/ares/core/factories/__init__.py`:
- `src/ares/core/engines.py`, `src/ares/core/templates.py`: `importlib_resources` for reliable access in installed packages and containers
- `src/ares/core/models.py`: `AgentInfo`, `AgentRole`, `TaskInfo`, `TaskResult`, `SharedRedTeamState`, and cross-agent compatibility with single-agent state
- `src/ares/core/remote.py`: orchestrator/worker/EC2 with Redis-based command dispatch in K8s
- `src/ares/tools/red/network.py`: ACL, and delegation attack methods, and improved BloodHound host parsing
- `src/ares/core/factories/red_factory.py`: reporting, including new low-hanging fruit tool prioritization
- `src/ares/main.py`: `multi-agent` and `worker` CLI commands for launching orchestrator and specialized agent pods with config-driven options

Removed:

- resources for portability and containerization