refactor: improve domain/credential logic and result processing for multi-domain ops by l50 · Pull Request #322 · dreadnode/ares

l50 · 2026-05-15T21:02:56Z

Key Changes:

Unified parent-child domain handling and credential selection logic across orchestrator modules
Hardened result processing to avoid trusting legacy scalar outputs from LLM workers
Improved detection and handling of tool outputs for signals like lockout, seimpersonate, NTLMv1, and ccache
Extended metadata for operation completion to record red team/blue team boundaries

Added:

red_completed_at, red_completion_reason, and red_blocked_on_blue fields to operation metadata (in both state and reports)
Direct ACL enumeration step after inter-realm ticket forging to accelerate SID-filtered trust analysis
Dedicated functions to build and dispatch direct krbtgt extraction with confirmation parsing
Test coverage for new domain/credential selection and evidence parsing policies

Changed:

Parent-child domain logic: replaced references to north.contoso.local with child.contoso.local for consistency in tests, documentation, and code comments
Credential selection: consistently allow child-domain creds for parent operations and vice versa, including proper fallback and forest-level matching
Result processing:
- Only trust tool output arrays for critical signals (e.g. golden ticket, ccache, seimpersonate, lockout)
- Ignore agent-generated summary/output fields from LLM workers (rust-llm-runner) for all evidence detection
- New include_legacy_scalar_outputs policy flag to gate trust of legacy output fields
- Hardened extract_locked_usernames_from_result, result_has_seimpersonate_signal, result_has_ccache_evidence, and related functions to avoid LLM hallucination
Task result shape: added optional worker_pod provenance field for downstream policy decisions
Hash publishing: accept both NT and LM:NT pairs for NTLM hash values, rejecting malformed entries
Secretsdump krbtgt extraction: dispatch direct tool call and only mark dedup if krbtgt hash is confirmed in parser output
Orchestrator completion: persist red/blue completion markers separately and expose to reporting and display
Trust automation: direct ACL enumeration with correct Kerberos context after ticket forging
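
The result-processing policy described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `TaskResult`, `ParsePolicy`, `collect_text_parts`, and `has_seimpersonate_signal` are hypothetical names and shapes, and the rule that unknown provenance fails closed is an assumption. Only `include_legacy_scalar_outputs`, `worker_pod`, `tool_outputs`, and `rust-llm-runner` come from the description.

```rust
/// Hypothetical, simplified task result. The real shape in ares is not
/// shown in this PR description.
#[derive(Debug, Default)]
pub struct TaskResult {
    /// Raw output emitted by tools; the only field trusted unconditionally.
    pub tool_outputs: Vec<String>,
    /// Legacy scalar output field, which may be model-authored.
    pub output: Option<String>,
    /// Agent-generated summary; never consulted for evidence detection.
    pub summary: Option<String>,
    /// Optional provenance of the worker that produced the result.
    pub worker_pod: Option<String>,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ParsePolicy {
    pub include_legacy_scalar_outputs: bool,
}

impl ParsePolicy {
    /// Assumption: legacy scalar output is trusted only when the worker is
    /// known and is not an LLM runner. Unknown provenance fails closed.
    pub fn for_result(result: &TaskResult) -> Self {
        let include = match result.worker_pod.as_deref() {
            Some(pod) => !pod.contains("rust-llm-runner"),
            None => false,
        };
        ParsePolicy { include_legacy_scalar_outputs: include }
    }
}

/// Gather the text that evidence detectors are allowed to inspect.
pub fn collect_text_parts<'a>(result: &'a TaskResult, policy: ParsePolicy) -> Vec<&'a str> {
    let mut parts: Vec<&str> = result.tool_outputs.iter().map(String::as_str).collect();
    if policy.include_legacy_scalar_outputs {
        if let Some(out) = result.output.as_deref() {
            parts.push(out);
        }
    }
    // `summary` is intentionally never added.
    parts
}

/// Example detector built on the shared collector.
pub fn has_seimpersonate_signal(result: &TaskResult, policy: ParsePolicy) -> bool {
    collect_text_parts(result, policy)
        .iter()
        .any(|text| text.contains("SeImpersonatePrivilege"))
}
```

Routing every detector through one collector keeps the trust decision in a single place, so a new signal cannot accidentally read a summary field.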

Removed:

report_cracked_credential callback and agent tool definition; all cracked credentials must now be extracted from structured tool output, not LLM summaries
Legacy fallback logic that trusted LLM-provided scalar fields for evidence detection or credential publishing
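
The commit notes in this PR also mention trapping removed callback names so that a hallucinated call gets a deterministic "tool removed" response. A minimal sketch of that idea, assuming a hypothetical `dispatch_callback` function and `CallbackOutcome` enum (neither is taken from the ares codebase, and `task_complete` is a made-up surviving callback used only for contrast):

```rust
/// Hypothetical outcome type for a callback dispatch.
#[derive(Debug, PartialEq, Eq)]
pub enum CallbackOutcome {
    /// The callback exists and was handled.
    Handled(String),
    /// The callback was removed; the message is fixed and deterministic.
    Removed(String),
    /// The name was never a registered callback.
    Unknown(String),
}

/// Names that used to be callbacks. `report_cracked_credential` is the one
/// named in this PR; the list structure itself is an assumption.
const REMOVED_CALLBACKS: &[&str] = &["report_cracked_credential"];

pub fn dispatch_callback(name: &str) -> CallbackOutcome {
    // Check the trap list first so a removed name can never reach a handler.
    if REMOVED_CALLBACKS.iter().any(|removed| *removed == name) {
        return CallbackOutcome::Removed(format!("tool removed: {}", name));
    }
    match name {
        // Hypothetical surviving callback, for illustration only.
        "task_complete" => CallbackOutcome::Handled(name.to_string()),
        other => CallbackOutcome::Unknown(other.to_string()),
    }
}
```

Because the trapped response carries no state, repeated hallucinated calls always yield the same result and cannot publish credentials.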

… domain logic

**Added:**

- New test cases for child domain preservation when the parent is valid, and for domain/hostname overlap scenarios in `normalize_state_domains` logic
- Tests for correct lockout detection filtering based on worker type and output field in result processing
- Tests for privilege and NTLMv1/lockout signals extracted only from trusted tool outputs

**Changed:**

- `normalize_state_domains` now retains child domains if their parent is valid, and confirms real domains by checking FQDN suffixes from hosts
- Only removes trusted evidence keys in `merge_result_extras`, expanding the list of dropped keys to avoid agent-supplied shadowing
- Evidence detection (e.g., SeImpersonate, NTLMv1, ccache, lockouts, locked usernames) now only considers trusted tool outputs and ignores summary fields
- `payload_contains_golden_ticket_marker` ignores summary and explicit flag fields, using only trusted tool outputs
- `check_golden_ticket_completion` now prioritizes the provided task domain over the payload domain field
- Test suite and payload construction updated throughout to use `tool_outputs`/trusted fields instead of summary or legacy output fields

**Removed:**

- Detection logic that previously relied on summary or explicit agent fields for evidence or privilege signals
- Legacy test payloads and field usage in favor of trusted output arrays
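
The parent/child and FQDN-suffix checks described in this commit might look something like the sketch below. All function names here are hypothetical, and the sketch deliberately omits forest-level matching (sibling domains under one root are treated as unrelated); it only shows the label-boundary suffix comparison.

```rust
/// Lowercase a DNS name and drop any trailing dot.
fn normalize(name: &str) -> String {
    name.trim().trim_end_matches('.').to_ascii_lowercase()
}

/// True when `candidate` is strictly below `parent` in the DNS hierarchy.
/// The leading dot prevents `notcontoso.local` from matching `contoso.local`.
pub fn is_strict_child_of(candidate: &str, parent: &str) -> bool {
    let c = normalize(candidate);
    let p = normalize(parent);
    !p.is_empty() && c.ends_with(&format!(".{}", p))
}

/// True when the two domains are equal or one is an ancestor of the other.
pub fn domains_related(a: &str, b: &str) -> bool {
    let (na, nb) = (normalize(a), normalize(b));
    if na.is_empty() || nb.is_empty() {
        return false;
    }
    na == nb || is_strict_child_of(a, b) || is_strict_child_of(b, a)
}

/// A domain counts as confirmed when at least one known host FQDN sits
/// directly or indirectly under it.
pub fn domain_confirmed_by_hosts(domain: &str, host_fqdns: &[&str]) -> bool {
    host_fqdns.iter().any(|host| is_strict_child_of(host, domain))
}
```

With these helpers, a credential for `child.contoso.local` would be considered for an operation against `contoso.local` and vice versa, while a credential for an unrelated forest such as `fabrikam.local` would be skipped.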

…ardening

codecov · 2026-05-15T21:06:50Z

Codecov Report

❌ Patch coverage is 76.88022% with 249 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.81%. Comparing base (3b4d36e) to head (a161dae).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ares-cli/src/orchestrator/result_processing/mod.rs | 38.09% | 65 Missing ⚠️ |
| ares-cli/src/orchestrator/automation/trust.rs | 0.00% | 58 Missing ⚠️ |
| ...res-cli/src/orchestrator/automation/secretsdump.rs | 67.22% | 39 Missing ⚠️ |
| ares-cli/src/orchestrator/completion.rs | 16.66% | 35 Missing ⚠️ |
| ...src/orchestrator/result_processing/admin_checks.rs | 63.23% | 25 Missing ⚠️ |
| ares-cli/src/ops/loot/format/display.rs | 76.59% | 11 Missing ⚠️ |
| ...li/src/orchestrator/automation/mssql_link_pivot.rs | 68.42% | 6 Missing ⚠️ |
| ares-cli/src/ops/loot/format/json.rs | 0.00% | 3 Missing ⚠️ |
| ...rchestrator/result_processing/impacket_recovery.rs | 92.10% | 3 Missing ⚠️ |
| ...rchestrator/result_processing/discovery_polling.rs | 97.43% | 2 Missing ⚠️ |

... and 2 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #322      +/-   ##
==========================================
+ Coverage   78.78%   78.81%   +0.02%     
==========================================
  Files         439      439              
  Lines      124726   125356     +630     
==========================================
+ Hits        98271    98799     +528     
- Misses      26455    26557     +102

| Files with missing lines | Coverage Δ | |
|---|---|---|
| ares-cli/src/dedup/tests.rs | `100.00% <100.00%> (ø)` | |
| ares-cli/src/orchestrator/automation/crack.rs | `71.07% <100.00%> (ø)` | |
| ares-cli/src/orchestrator/automation/gpp_sysvol.rs | `84.13% <100.00%> (+2.77%)` | ⬆️ |
| ...i/src/orchestrator/automation/group_enumeration.rs | `78.85% <100.00%> (ø)` | |
| ...li/src/orchestrator/automation/ntlmv1_downgrade.rs | `74.83% <100.00%> (+3.67%)` | ⬆️ |
| ...cli/src/orchestrator/automation/password_policy.rs | `85.35% <100.00%> (+2.25%)` | ⬆️ |
| ares-cli/src/orchestrator/automation/s4u.rs | `89.56% <100.00%> (+0.90%)` | ⬆️ |
| ...-cli/src/orchestrator/callback_handler/dispatch.rs | `27.80% <ø> (-6.25%)` | ⬇️ |
| ares-cli/src/orchestrator/callback_handler/mod.rs | `40.90% <ø> (ø)` | |
| ...res-cli/src/orchestrator/callback_handler/tests.rs | `100.00% <100.00%> (ø)` | |

... and 34 more

... and 1 file with indirect coverage changes


…controls

**Added:**

- New test cases for policy-based exclusion of scalar output fields in result parsing, ensuring that only tool-emitted data is consumed for automations
- Tests verifying LM:NT pair acceptance in hash publishing, and that krbtgt LM:NT hashes grant domain admin status
- Tests for red completion metadata population and operation meta parsing
- Documentation and comments clarifying the rationale for parsing policies and the removal of the cracked credential callback

**Changed:**

- Replaced test and documentation references from `north.contoso.local` to `child.contoso.local` for consistency in domain hierarchy examples across all modules and test fixtures
- Updated all domain trust, host, and parsing logic to use `child.contoso.local` as the canonical child domain example, including in test data, assertions, and string construction
- Introduced new policies to control parsing of legacy scalar output fields in orchestrator result processing, with explicit toggles based on worker provenance (e.g., excluding LLM-runner model-authored narrative from tool output parsing)
- Refactored result parsing helpers (e.g., NTLMv1 detection, impersonation, ccache evidence, lockout extraction) to centralize text part collection and support the new policy controls for legacy output
- Updated domain credential selection in orchestrator automations to allow matching on parent/child domain relationships, skipping cross-forest creds when unrelated
- Adjusted secretsdump krbtgt extraction automation to dispatch the tool directly, only marking dedup after successful krbtgt hash parsing
- Modified MSSQL link pivot automation to always set `windows_auth` true when a credential domain is present, and to support impersonation hints in tool args
- Improved operation completion metadata: now records `red_completed_at`, `red_completion_reason`, and `red_blocked_on_blue` to Redis and operation state, with new display and JSON output in loot reporting
- Enhanced domain SID and golden ticket marker extraction to prefer trusted task context over payload fields, and made result processing robust against LLM-generated summaries and legacy payload shapes
- Hardened NTLM hash value validation to accept both standard 32-hex and LM:NT pairs, rejecting malformed relay artifacts
- Removed the `report_cracked_credential` callback and tool definition, centralizing cracked credential extraction to automated stdout parsing only; hallucinated calls are now deterministically ignored
- Updated tool registry and callback handler logic to trap removed callback names, returning a deterministic "tool removed" response for hallucinations

**Removed:**

- Legacy `report_cracked_credential` callback handler, tool definition, and associated tests, as cracked credentials are now reliably extracted from stdout without LLM summarization fallback
- Callback tool references for removed/unsupported reporting callbacks from agent tool registry, with trap logic for deterministic handling
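
The NTLM hash validation rule in this commit (accept a bare 32-hex NT hash or an LM:NT pair, reject anything else) can be illustrated with a short sketch. The function names are hypothetical and the actual validator in ares may apply further checks; this only demonstrates the shape rule stated above.

```rust
/// True when `s` is exactly 32 ASCII hex digits.
fn is_hex32(s: &str) -> bool {
    s.len() == 32 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Accepts either `NT` (32 hex) or `LM:NT` (32 hex, colon, 32 hex).
/// Extra colons, short halves, and non-hex characters are rejected.
pub fn is_valid_ntlm_hash(value: &str) -> bool {
    match value.split_once(':') {
        None => is_hex32(value),
        // With more than one colon, the remainder still contains ':' and
        // fails the hex check, so three-part values are rejected.
        Some((lm, nt)) => is_hex32(lm) && is_hex32(nt),
    }
}
```

For example, `31d6cfe0d16ae931b73c59d7e0c089c0` and `aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0` both pass, while a truncated half or a value with trailing fields does not.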

l50 added 2 commits May 15, 2026 14:56

Merge remote-tracking branch 'origin/main' into fix/llm-state-taint-h…

520b71c

…ardening

l50 changed the title ~~refactor: restrict evidence parsing to trusted tool outputs and standardize result payloads~~ refactor: improve domain/credential logic and result processing for multi-domain ops May 15, 2026

l50 merged commit 716599a into main May 15, 2026
14 checks passed

l50 deleted the fix/llm-state-taint-hardening branch May 15, 2026 22:32

l50 merged 3 commits into main from fix/llm-state-taint-hardening
