feat: enhance red team automation with exploitation toolsets and forced exploitation logic #35

Merged: l50 merged 3 commits into main from jayson/cap-838-fix-red-agent-behavioral-gaps-missing-tool-wrappers on Jan 12, 2026

l50 commented Jan 12, 2026

Key Changes:

  • Added new toolsets for coercion, MSSQL, ACL abuse, CVE exploits, trust attacks, and lateral movement
  • Introduced a vulnerability discovery hook that forces exploitation when vulnerabilities are found
  • Overhauled BloodHound and Certipy tools to provide structured, actionable outputs
  • Updated agent instructions and system prompt to require exploitation of all discovered paths before summarizing

Added:

  • New toolsets for exploitation, including:
    • CoercionTools (PetitPotam, Coercer) for authentication coercion attacks
    • MSSQLTools for SQL Server exploitation and impersonation chains
    • ACLExploitTools for exploiting Active Directory ACL misconfigurations (pywhisker, bloodyAD)
    • CVEExploitTools for known vulnerabilities (noPac, PrintNightmare)
    • TrustAttackTools for cross-domain/forest escalation (raiseChild)
    • LateralMovementTools for remote access (evil-winrm, psexec)
  • Vulnerability discovery hook that monitors tool output and injects guidance to immediately exploit discovered vulnerabilities (e.g., ESC1, ACL abuse, delegation, MSSQL impersonation)
  • Structured output parsing for BloodHound and Certipy tools, including JSON summaries of attack paths, vulnerable templates, and prioritized recommended actions
  • Expanded import and export logic in ares/tools/red/__init__.py to register all new toolsets

Changed:

  • BloodHound toolset now parses output for actionable attack paths, ACL abuse, delegation, and high-value targets, returning structured data and step-by-step recommendations
  • Certipy toolset parses ADCS enumeration output for ESC vulnerabilities and provides structured exploitation steps and parameters
  • create_redteam_agent now includes all new exploitation toolsets and registers the vulnerability discovery hook to enforce exploitation-first workflows
  • System instructions for red team agents now explicitly forbid summarizing operations until all discovered vulnerabilities/paths have been exploited, with detailed anti-summarization and fallback chain rules
  • Example commands and error recovery logic included in system prompt to guide agent behavior

Removed:

  • Redundant static tool references and outdated instructions; all tool import/export logic is now consolidated to support the new exploitation automation
  • Legacy examples in docstrings that did not reflect new exploitation toolsets or workflows
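
The structured output parsing described above can be illustrated with a minimal sketch. This is not the code from this PR: the function name `summarize_adcs_output`, the JSON keys, and the sample text are all hypothetical, and real Certipy output may be laid out differently. It only shows the general idea of turning enumeration text into a JSON summary that names the CA and the flagged templates.

```python
import json
import re

# Hypothetical sample of ADCS enumeration text (assumption: real tool
# output may use a different layout).
SAMPLE_OUTPUT = """\
Certificate Authorities
  0
    CA Name                             : CORP-CA
Certificate Templates
  0
    Template Name                       : UserAuth
    [!] Vulnerabilities
      ESC1                              : Enrollee supplies subject
  1
    Template Name                       : Machine
"""


def summarize_adcs_output(text: str) -> dict:
    """Collect CA names and ESC findings per template into a plain dict."""
    ca_names = re.findall(r"CA Name\s*:\s*(\S+)", text)
    findings = []
    current_template = None
    for line in text.splitlines():
        template_match = re.search(r"Template Name\s*:\s*(\S+)", line)
        if template_match:
            # Remember which template the following lines belong to.
            current_template = template_match.group(1)
            continue
        esc_match = re.search(r"\b(ESC\d+)\s*:\s*(.+)", line)
        if esc_match and current_template:
            findings.append(
                {
                    "template": current_template,
                    "esc": esc_match.group(1),
                    "detail": esc_match.group(2).strip(),
                    "ca": ca_names[0] if ca_names else None,
                }
            )
    return {
        "certificate_authorities": ca_names,
        "vulnerable_templates": findings,
        "finding_count": len(findings),
    }


if __name__ == "__main__":
    print(json.dumps(summarize_adcs_output(SAMPLE_OUTPUT), indent=2))
```

Returning a dict rather than raw text lets a caller read the CA and template names directly instead of re-parsing free-form output.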

…irst workflow

**Added:**

- Introduced new exploitation toolsets: CoercionTools (petitpotam, coercer), MSSQLTools
  (mssqlclient), ACLExploitTools (pywhisker, bloodyAD), CVEExploitTools (noPac,
  PrintNightmare), TrustAttackTools (raiseChild), and LateralMovementTools
  (evil-winrm, psexec) to the red team arsenal
- Implemented vulnerability_discovery_hook to automatically inject exploit
  guidance when critical vulnerabilities or exploitation paths are detected
  during tool execution
- Added detailed error recovery and attack fallback chain guidance to agent
  system instructions to ensure persistent exploitation attempts

**Changed:**

- Refactored BloodHoundTools and CertipyTools to return structured JSON output,
  including actionable exploitation steps, discovered attack paths, and
  prioritized next actions for automation
- Updated tool import and export structure to expose all new toolsets in the
  red team tools module
- Enhanced create_redteam_agent to register and initialize all new exploitation
  toolsets, ensuring the agent can leverage the full range of attacks in automated
  workflows
- Expanded agent instructions to explicitly forbid summarization or completion
  until all discovered vulnerabilities and attack paths are exploited,
  establishing exploitation-first operational logic
- Improved docstrings and usage examples for exploitation tool methods to guide
  correct tool usage and chaining

**Removed:**

- Removed outdated usage examples and replaced them with updated, more concise
  demonstrations matching the new toolsets and output structures
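
As a rough illustration of the hook idea in this commit, the sketch below scans tool output for a few patterns and returns a guidance string when any match. It is an assumption-laden stand-in, not the PR's `vulnerability_discovery_hook`: the pattern table, the labels, the function name `guidance_for_output`, and the message wording are all hypothetical, and it does not use the agent framework's hook API.

```python
import re
from typing import Optional

# Hypothetical pattern table: label -> regex matched against tool output.
PATTERNS = [
    ("adcs_esc", re.compile(r"\bESC\d+\b")),
    ("acl_abuse", re.compile(r"\b(GenericAll|WriteDacl|WriteOwner)\b")),
    ("delegation", re.compile(r"delegation", re.IGNORECASE)),
]


def guidance_for_output(tool_output: str) -> Optional[str]:
    """Return a guidance message if the output matches any pattern, else None."""
    hits = [label for label, pattern in PATTERNS if pattern.search(tool_output)]
    if not hits:
        return None
    return (
        "Unaddressed findings detected: "
        + ", ".join(hits)
        + ". Act on these before summarizing."
    )
```

A real hook would presumably receive a tool-result event and append the message to the agent's context; here the function simply returns the string so the matching logic can be tested in isolation.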

linear Bot commented Jan 12, 2026

CAP-838 Fix Red Agent: Behavioral Gaps & Missing Tool Wrappers

Description:
The Red Agent currently fails to exploit most discovered vulnerabilities due to behavioral shortcomings and lack of tool wrappers for key attack techniques. This task aims to address these gaps by improving the agent’s exploitation behavior and exposing installed tools through new wrappers, thereby enabling comprehensive automated exploitation.


Objective:

Ensure the Red Agent reliably exploits all discovered attack paths by introducing structured output parsing, proactive exploitation triggers, anti-summarization directives, and wrappers for critical exploitation tools.


Scope of Work:

  • Refactor discovery tools (e.g., certipy_find, BloodHound) to parse output and return structured, actionable data
  • Implement a vulnerability discovery hook to trigger immediate exploitation actions when vulnerabilities are detected
  • Update system instructions to include an anti-summarization directive and require exploitation before summary or completion
  • Add tool wrappers for installed binaries: petitpotam, coercer, mssqlclient.py, pywhisker, bloodyAD, noPac, PrintNightmare, raiseChild.py, evil-winrm
  • Test new wrappers to ensure the agent can invoke and process output from each tool
  • (Optional, P1) Implement a periodic priority check hook to remind the agent of unexploited discoveries

Dependencies:

  • Access to source code for src/ares/tools/red/network.py and related agent logic
  • Existing installations of exploitation tools (as listed)
  • Ability to update agent system instructions/templates
  • Test environment with representative AD infrastructure

Acceptance Criteria:

  1. Discovery tools (certipy_find, BloodHound) return structured output with explicit exploit parameters (e.g., CA/template names, attack path details)
  2. When a vulnerability is found, the agent immediately attempts exploitation without manual intervention
  3. Agent does not prematurely summarize or complete operations until all exploitable paths have been attempted
  4. Wrappers for all listed tools are implemented and accessible to the agent via @dn.tool_method
  5. Agent successfully invokes at least one exploitation command for each new wrapper in end-to-end tests
  6. (If implemented) The agent receives periodic reminders to exploit all discovered vulnerabilities

Additional Notes:

  • Prioritize behavioral fixes (structured output, exploitation triggers, anti-summarization) before adding tool wrappers for maximum impact
  • Reference the implementation priority and quick wins section for incremental delivery
  • Test with varied AD environments to ensure robust parsing and exploitation logic
  • For structured output, consider JSON or dicts to encapsulate actionable findings
  • See logs and gap analysis in the original plan for specific examples of behavioral failure

dreadnode-renovate-bot Bot added the area/templates (Changes made to warpgate template configurations) and area/src labels Jan 12, 2026
**Added:**

- Added test class for `CoercionTools` covering initialization, state management,
  PetitPotam (authenticated/unauthenticated/failure), and Coercer scenarios
- Added test class for `MSSQLTools` including login, xp_cmdshell, and exception
  handling tests
- Added test class for `ACLExploitTools` covering pywhisker, bloodyAD group member
  addition, password reset, and exception paths
- Added test class for `CVEExploitTools` testing noPac and PrintNightmare
  exploitation and error handling
- Added test class for `TrustAttackTools` including trust escalation, target
  domain handling, and failure cases
- Added test class for `LateralMovementTools` for evil-winrm (password/hash/no
  creds/exception), psexec (password/hash/exception), and state management
- All new tests mock remote execution and verify both positive and negative paths
- Increased coverage for error handling and credential edge cases in network tools
…eminders

**Added:**

- Implemented tracking of discovered and exploited vulnerabilities using in-memory
  state, with helpers for registering new discoveries and exploitation attempts
- Added `track_vulnerability_discoveries` async hook to monitor tool results and
  update vulnerability state based on tool output patterns
- Introduced `periodic_priority_check` async hook to periodically remind the agent
  about unexploited vulnerabilities and inject warnings every N steps
- Comprehensive unit tests for discovery, exploitation, and periodic reminder
  logic in the red team agent factory, including edge cases and idempotency

**Changed:**

- Extended `reset_event_tracking` to clear vulnerability tracking state for clean
  agent runs
- Updated agent creation to register the new hooks for vulnerability tracking and
  periodic reminders

**Removed:**

- No removals
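
The tracking and reminder logic in this commit can be sketched roughly as follows. This is a simplified, synchronous illustration only: the PR describes `track_vulnerability_discoveries` and `periodic_priority_check` as async hooks, while the helper names `register_discovery`, `register_exploitation`, and `reset_tracking`, the `every_n` parameter, and the message text below are assumptions made for the example.

```python
from typing import Optional, Set

# In-memory state (assumption: module-level sets, cleared between runs).
_discovered: Set[str] = set()
_exploited: Set[str] = set()


def register_discovery(vuln_id: str) -> bool:
    """Record a finding; return False if it was already known (idempotent)."""
    if vuln_id in _discovered:
        return False
    _discovered.add(vuln_id)
    return True


def register_exploitation(vuln_id: str) -> None:
    """Mark a finding as having been acted on."""
    _exploited.add(vuln_id)


def reset_tracking() -> None:
    """Clear all state so a new agent run starts clean."""
    _discovered.clear()
    _exploited.clear()


def periodic_priority_check(step: int, every_n: int = 5) -> Optional[str]:
    """Every `every_n` steps, return a reminder listing pending findings."""
    if step == 0 or step % every_n != 0:
        return None
    pending = sorted(_discovered - _exploited)
    if not pending:
        return None
    return f"Reminder: {len(pending)} finding(s) still pending: {', '.join(pending)}"
```

Keeping discovered and acted-on findings in two sets makes the pending list a simple set difference, and registering the same finding twice has no effect.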
l50 merged commit dfec8e3 into main Jan 12, 2026
8 checks passed
l50 deleted the jayson/cap-838-fix-red-agent-behavioral-gaps-missing-tool-wrappers branch January 12, 2026 04:24