refactor: improve domain/credential logic and result processing for multi-domain ops #322
Merged
… domain logic

**Added:**
- New test cases for child domain preservation when the parent is valid, and for domain/hostname overlap scenarios in the `normalize_state_domains` logic
- Tests for correct lockout detection filtering based on worker type and output field in result processing
- Tests confirming that privilege and NTLMv1/lockout signals are extracted only from trusted tool outputs

**Changed:**
- `normalize_state_domains` now retains child domains if their parent is valid, and confirms real domains by checking FQDN suffixes from hosts
- `merge_result_extras` only removes trusted evidence keys, and the list of dropped keys is expanded to avoid agent-supplied shadowing
- Evidence detection (e.g., SeImpersonate, NTLMv1, ccache, lockouts, locked usernames) now only considers trusted tool outputs and ignores summary fields
- `payload_contains_golden_ticket_marker` ignores summary and explicit flag fields, using only trusted tool outputs
- `check_golden_ticket_completion` now prioritizes the provided task domain over the payload domain field
- Test suite and payload construction updated throughout to use `tool_outputs`/trusted fields instead of summary or legacy output fields

**Removed:**
- Detection logic that previously relied on summary or explicit agent fields for evidence or privilege signals
- Legacy test payloads and field usage, in favor of trusted output arrays
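The domain retention rule described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the function name `normalize_domains_sketch`, its parameters, and the rule that a candidate equal to a host FQDN is treated as a hostname are all assumptions made for the example.

```python
def is_child_of(candidate: str, parent: str) -> bool:
    """Return True when `candidate` is a strict subdomain of `parent`."""
    return candidate != parent and candidate.endswith("." + parent)


def normalize_domains_sketch(candidates, valid_domains, host_fqdns):
    """Hypothetical stand-in for the behavior described for normalize_state_domains.

    A candidate is kept when it is already known to be valid, when some host
    FQDN ends with it as a suffix, or when it is a child of a valid domain.
    """
    valid = {d.lower().rstrip(".") for d in valid_domains}
    hosts = {h.lower().rstrip(".") for h in host_fqdns}
    kept = []
    for raw in candidates:
        domain = raw.lower().rstrip(".")
        if domain in kept:
            continue  # already retained, skip duplicates
        if domain in hosts:
            # Assumption: a candidate that is exactly a host FQDN is a
            # hostname that leaked into the domain list, not a domain.
            continue
        confirmed_by_host = any(is_child_of(h, domain) for h in hosts)
        child_of_valid = any(is_child_of(domain, p) for p in valid)
        if domain in valid or confirmed_by_host or child_of_valid:
            kept.append(domain)
    return kept
```

With `contoso.local` valid and hosts `dc01.contoso.local` and `srv.child.contoso.local`, the sketch keeps `contoso.local` and `child.contoso.local` and drops both the hostname and an unrelated candidate.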
…controls

**Added:**
- New test cases for policy-based exclusion of scalar output fields in result parsing, ensuring that only tool-emitted data is consumed for automations
- Tests verifying LM:NT pair acceptance in hash publishing, and that krbtgt LM:NT hashes grant domain admin status
- Tests for red completion metadata population and operation meta parsing
- Documentation and comments clarifying the rationale for parsing policies and the removal of the cracked credential callback

**Changed:**
- Replaced test and documentation references from `north.contoso.local` to `child.contoso.local` for consistency in domain hierarchy examples across all modules and test fixtures
- Updated all domain trust, host, and parsing logic to use `child.contoso.local` as the canonical child domain example, including in test data, assertions, and string construction
- Introduced new policies to control parsing of legacy scalar output fields in orchestrator result processing, with explicit toggles based on worker provenance (e.g., excluding LLM-runner model-authored narrative from tool output parsing)
- Refactored result parsing helpers (e.g., NTLMv1 detection, impersonation, ccache evidence, lockout extraction) to centralize text part collection and support the new policy controls for legacy output
- Updated domain credential selection in orchestrator automations to allow matching on parent/child domain relationships, skipping cross-forest creds when unrelated
- Adjusted secretsdump krbtgt extraction automation to dispatch the tool directly, only marking dedup after successful krbtgt hash parsing
- Modified MSSQL link pivot automation to always set `windows_auth` true when a credential domain is present, and to support impersonation hints in tool args
- Improved operation completion metadata: now records `red_completed_at`, `red_completion_reason`, and `red_blocked_on_blue` to Redis and operation state, with new display and JSON output in loot reporting
- Enhanced domain SID and golden ticket marker extraction to prefer trusted task context over payload fields, and made result processing robust against LLM-generated summaries and legacy payload shapes
- Hardened NTLM hash value validation to accept both standard 32-hex and LM:NT pairs, rejecting malformed relay artifacts
- Removed the `report_cracked_credential` callback and tool definition, centralizing cracked credential extraction to automated stdout parsing only; hallucinated calls are now deterministically ignored
- Updated tool registry and callback handler logic to trap removed callback names, returning a deterministic "tool removed" response for hallucinations

**Removed:**
- Legacy `report_cracked_credential` callback handler, tool definition, and associated tests, as cracked credentials are now reliably extracted from stdout without LLM summarization fallback
- Callback tool references for removed/unsupported reporting callbacks from agent tool registry, with trap logic for deterministic handling
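The hash validation rule described above (accept a bare 32-hex NT hash or an LM:NT pair, reject anything else) might look like the following sketch. The function name `is_valid_ntlm_hash_value` is hypothetical, and treating any string with more than two colon-separated fields as a malformed relay artifact is an assumption about what the PR rejects.

```python
import re

# One NTLM hash component: exactly 32 hexadecimal characters.
_HEX32 = re.compile(r"[0-9a-fA-F]{32}")


def is_valid_ntlm_hash_value(value: str) -> bool:
    """Accept `<32 hex>` or `<32 hex>:<32 hex>` (LM:NT); reject everything else."""
    parts = value.strip().split(":")
    if len(parts) not in (1, 2):
        # Assumption: multi-field strings (for example captured NetNTLM
        # challenge/response lines) are not publishable hash values.
        return False
    return all(_HEX32.fullmatch(part) is not None for part in parts)
```

Under these assumptions a value such as `aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0` passes, while a truncated hash or a five-field relay capture does not.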
Key Changes:

Added:
- `red_completed_at`, `red_completion_reason`, and `red_blocked_on_blue` fields to operation metadata (in both state and reports)

Changed:
- Replaced `north.contoso.local` with `child.contoso.local` for consistency in tests, documentation, and code comments
- … `rust-llm-runner`) for all evidence detection
- `include_legacy_scalar_outputs` policy flag to gate trust of legacy output fields
- … `extract_locked_usernames_from_result`, `result_has_seimpersonate_signal`, `result_has_ccache_evidence`, and related functions to avoid LLM hallucination
- `worker_pod` provenance field for downstream policy decisions

Removed:
- `report_cracked_credential` callback and agent tool definition; all cracked credentials must now be extracted from structured tool output, not LLM summaries
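To illustrate how a provenance-based policy can gate which text reaches evidence detection, here is a minimal sketch. Only `include_legacy_scalar_outputs`, `worker_pod`, `tool_outputs`, and the `rust-llm-runner` name come from the description above; the `ParsePolicy` type, the helper names, the `output` and `summary` field names, the substring check on the pod name, and the assumption that `tool_outputs` is a list of strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsePolicy:
    # When False, the legacy scalar output field is not parsed.
    include_legacy_scalar_outputs: bool


def policy_for_worker(worker_pod: str) -> ParsePolicy:
    """Assumption: results from LLM-runner pods carry model-authored narrative
    in their legacy scalar field, so that field is excluded for them."""
    return ParsePolicy(include_legacy_scalar_outputs="rust-llm-runner" not in worker_pod)


def collect_text_parts(result: dict, policy: ParsePolicy) -> list[str]:
    """Gather the text that detectors are allowed to inspect."""
    parts = [p for p in result.get("tool_outputs", []) if isinstance(p, str)]
    if policy.include_legacy_scalar_outputs:
        legacy = result.get("output")  # hypothetical legacy field name
        if isinstance(legacy, str):
            parts.append(legacy)
    # The summary field is never consulted, whatever the policy says.
    return parts


def has_seimpersonate_signal(result: dict, policy: ParsePolicy) -> bool:
    """Example detector built on the shared text collection."""
    return any("SeImpersonatePrivilege" in part for part in collect_text_parts(result, policy))
```

Routing every detector through one collection helper means a policy change applies to all of them at once, which appears to be the motivation for centralizing text part collection.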