17 changes: 9 additions & 8 deletions TODO.md
@@ -2,7 +2,7 @@

**Status:** Phase 8.5 (Dual src/dst HostContext) COMPLETE; Pre-MVP quality fixes ongoing
**Started:** 2026-03-11
**Last Updated:** 2026-04-29
**Last Updated:** 2026-05-14

See [CHANGELOG.md](CHANGELOG.md) for detailed development history of completed phases.

@@ -241,7 +241,7 @@ Replaced manual per-emitter field coordination with SecurityEvent intermediate r
- [x] **P1** Web application response/session realism follow-up — Added data-driven inbound `web_server` visitor profiles so human visitors consume `traffic_rates.web` as top-level actions, then fan out into required page assets/API calls through `site_maps.yaml`; crawler, health-check, API-client, and opportunistic-probe traffic now uses source-native configured request/status/User-Agent profiles. Static resource sizes are stable per host/path, human navigation and render fanout timing use `timing_profiles.yaml`, and docs/skill references now explain the budget and config ownership. Verification passed: focused web/timing/baseline tests (`107 passed, 1 skipped`), config-related tests (`64 passed`), `uv run eforge validate-config`, repo-wide Ruff checks/format checks, full normal `uv run pytest -q` (`3012 passed, 15 skipped`), and `git diff --check`.
- [x] **P1** Well-synced network sensor timing follow-up — Replaced hardcoded multi-sensor Zeek +/-400ms skew plus broad path delay with a validated `network_sensor_observation` timing profile. The default `well_synced` profile keeps stable per-sensor clock skew within +/-1.5ms and per-flow capture/path delay within 50-2000us while preserving canonical packet/byte truth unless source-native observation variance is explicitly enabled. Verification passed with focused Zeek/timing tests, `uv run eforge validate-config`, repo-wide Ruff checks/format checks, full normal `uv run pytest -q` (`3012 passed, 15 skipped`), and `git diff --check`.
- [x] **P1** Source identity and endpoint baseline realism sprint — completed TLS/X.509 issuer-compatible chain signatures, Sysmon Event 7 native third-party module identity, config-driven Windows scheduled-process timing, and DHCP registry emission policy tied to lease activity. Verified with `uv run eforge validate-config`, focused regressions, Ruff, normal pytest, and slow-inclusive pytest.
- [ ] **P2** Endpoint/eCAR baseline variance follow-up — Loop 96 found workstation eCAR category volumes and Linux process lifecycle evidence too uniform and complete. The realistic endpoint observation-gap portion is now handled by named observation profiles; remaining work should focus on host/persona-specific volume variance, long-lived process state, and benign unmatched endpoint artifacts.
- [x] **P2** Endpoint/eCAR baseline variance follow-up — addressed through the host/activity profile realism layer. Host family, role, persona, and stable per-host multipliers now shape endpoint, process, registry, scheduled-task, syslog, bash, eCAR, Windows, Zeek, firewall, IDS, web, and proxy rates; config-driven encoded PowerShell variants and benign endpoint texture reduce repeated per-host artifacts. Verification passed with focused host-activity/config/ASA/baseline tests, `uv run eforge validate-config`, Ruff checks/format checks, full normal `uv run pytest -v`, and slow-inclusive `uv run pytest -v --include-slow --no-cov` (`3057 passed, 1 skipped`).
- [x] **Later architectural sprint: imperfect observation and source coverage** — implemented a training-friendly `complete` default plus overlay-compatible named observation profiles that apply deterministic source-level drop/delay/coverage semantics without modeling contradictions. The policy covers endpoint, network, proxy/web, firewall, IDS, Windows, Sysmon, Zeek, syslog, bash history, and eCAR source families, while ground truth preserves canonical truth and records source evidence status. Verification passed: focused observation/config/ground-truth tests, `uv run eforge validate-config`, Ruff checks/format checks, full normal `uv run pytest -v` (`3036 passed, 15 skipped`), and slow-inclusive `uv run pytest -v --include-slow` (`3050 passed, 1 skipped`).
- [x] Full slow-suite regression cleanup after loop-65 merge — explicit-proxy storyline beacons now preserve authored hostname+destination IP pairs only when the storyline marks that pair as intentional, normal proxy-origin DNS resolution remains intact, and the parallel-generation LogonID assertion treats Type 7 unlock reuse as valid slice-of-time Windows behavior. Verified with targeted proxy/parallel tests, `uv run ruff check .`, `uv run ruff format --check .`, and `uv run pytest -v --include-slow` (`2875 passed, 23 skipped`).
Detection Engineer blind review completed for the regenerated Loop 61 dataset at `scenarios/iteration-test/data`; reviewer verdict: Synthetic, 63/100 confidence. Main findings: one PROXY-01 sshd accepted-login lifecycle gap/self-source artifact and Windows 4648 explicit-credential caller PID/image provenance ambiguity around `WS-MCHEN-01`.
@@ -437,7 +437,7 @@ Data works but experienced analysts spot tells. Grouped by format for efficient
- [x] Event 10 source/target pairs too narrow — fixed by widening `process_access_patterns.yaml` and seeded long-lived process actors. Verification audit output: 950 Event 10 records used 16 source/target pairs.
- [x] Registry writer processes too narrow — fixed with key-family-aware writer selection. Verification audit output: Event 12/13 records used 12 writer process images and 1,968 unique TargetObject paths with 0 template artifacts.
- [x] Event 7 residual attribution issues — tightened generic module/process matching and retained process-aware DLL materialization. Verification audit output: 380 Event 7 records used 42 unique ImageLoaded paths.
- [ ] Cross-source distribution realism layer — defer until data-source reviews are complete. Independent Sysmon reviews found that field-level realism improved, but per-host event volumes and recipe selection remain too uniform. Design a deterministic host/activity profile layer derived from scenario facts (host type, roles, assigned_user, persona, services, stable seed) and use it to shape Sysmon, Windows Security, Zeek, syslog, firewall, web, proxy, and eCAR/EDR rates. Avoid implementing Sysmon-only profile logic unless needed as a narrow bug fix.
- [x] Cross-source distribution realism layer — implemented a deterministic, overlay-capable host/activity profile layer derived from host family, roles, persona/risk, services, and stable per-host variance. Baseline generation now uses these profiles to scale Windows Security/Sysmon/eCAR, Zeek/network/web/proxy, Linux syslog/bash, firewall/ASA, IDS, auth, endpoint registry, scheduled process, and service-noise volumes without requiring scenario YAML changes.

**Zeek:**
- [x] Zeek DNS / network support log review — fixed DNS/TLS PTR coherence, added realistic TXT lookup variety, prevented CDN-hostname MX artifacts, increased file-server SMB target coverage, and made SSH pivot UIDs respect sensor visibility. Tests, docs, skills, and skill references updated where needed.
@@ -583,8 +583,8 @@ Data works but experienced analysts spot tells. Grouped by format for efficient
- [x] Security: bound threat-detection deny timestamp tracking window to prevent unbounded memory/CPU growth
- [x] ASA imperfect-observation realism — addressed by the general observation profile layer. `complete` preserves paired training-friendly firewall evidence, while non-default profiles can apply deterministic ASA source-family gaps that create realistic missing/partial firewall evidence without rewriting canonical truth.
- [ ] ASA message type diversity limited to 106023/302013-16/305011-12 — missing 111008, 113004, 733100, 106001, 725001, 304001
- [ ] ASA deny baseline burstiness/profile variance — defer to a general per-source activity profile rather than a one-off ASA fix. Current deny events are uniformly spaced (3-7s); real scans should have configurable burst/quiet periods, campaign-level cadence, and source-specific variance.
- [ ] ASA deny metadata diversity — defer to a general field-distribution realism layer. Current deny events use `[0x0, 0x0]` hash values uniformly; a later profile should model when hashes remain zero vs vary by platform/message/context.
- [x] ASA deny baseline burstiness/profile variance — fixed through host activity profiles and firewall-deny burst scheduling. Baseline denies now use deterministic burst/quiet periods and host/profile variance instead of uniform 3-7 second spacing.
- [x] ASA deny metadata diversity — fixed by carrying deny hash metadata on canonical firewall context and rendering stable varied ASA hash values where appropriate instead of hardcoded `[0x0, 0x0]`.
- [ ] Recognizable 45.33.32.x public IPs remain in built-in scan/attacker pools — the original `45.33.32.1` NAT PAT finding is stale, but code still uses `45.33.32.156` in scan/attacker pools. Move these values into data/config or replace them with less recognizable public-looking lab addresses during the broader public-IP/profile cleanup.

**eCAR:**
@@ -598,10 +598,11 @@ Data works but experienced analysts spot tells. Grouped by format for efficient
**Cross-Source / General:**
- [x] Configurable cross-source evidence disagreement — implemented as named observation profiles with `complete` as the default. Non-default profiles can introduce deterministic dropped/delayed/filtered/out-of-window evidence across Zeek, web, proxy, firewall, IDS, Windows, Sysmon, syslog, bash history, and eCAR without contradictions or ambiguous rewrites; ground truth retains source evidence status for traceability.
- [x] Cross-sensor timestamp precision identical to 15+ decimal places — microsecond jitter added in snort.py, windows.py, and storyline.py
- [ ] **P2** Per-host-type event rate multiplier — Domain controllers generate ~50 events/hr but real DCs running AD/DNS/DFS/GPO produce thousands/hr. `system.type` is used for routing but never for volume scaling. Need `event_rate_multiplier` on System model (or implicit per-type defaults) applied in `_calculate_events_for_hour()` and `_generate_system_traffic()`. DCs should be 3-5x workstation baseline; file servers and web servers similarly elevated.
- [ ] Configurable per-entity artifact variation — deferred to the general host/activity profile layer. Encoded PowerShell baseline noise is currently identical across hosts (same Get-Service blob); later profiles should derive stable per-host command variants, encoded payloads, tool versions, and operator habits.
- [ ] Configurable per-host volume variance — deferred to the general host/activity profile layer. Workstation connection counts are suspiciously uniform (808-1068 range); later profiles should widen variance by role, persona, weekday, installed apps, and stable host-specific multipliers.
- [x] **P2** Per-host-type event rate multiplier — implemented as implicit host/activity profile defaults rather than scenario YAML fields. Domain controllers, file servers, web servers, proxies, Linux servers, and workstations now receive role/family/persona-specific multipliers across baseline activity, auth, endpoint, network, and source-specific noise.
- [x] Configurable per-entity artifact variation — implemented in the host/activity profile layer for baseline artifact texture, including stable per-host encoded PowerShell variants and profile-owned endpoint activity scaling.
- [x] Configurable per-host volume variance — implemented via stable host/persona/role multipliers applied across major activity families so hosts no longer share narrow uniform volume bands by construction.
- [ ] Configurable per-host/source log deployment coverage — observation profiles now support source-family gaps and host-scoped missingness multipliers, but explicit per-host source enablement/disablement remains future work. A later setting should model named host groups, disabled sensors, partial deployments, and collection windows when users need topology-level telemetry coverage differences rather than event-level missingness.
- [ ] **P2** Generation speed and efficiency follow-up — Sprint 4 host/activity realism is functionally verified, but the slow-inclusive suite exposed that `pytest-cov` plus `tracemalloc` can make the medium dataset memory test pathological. A future sprint should profile generation without instrumentation noise, identify hot paths introduced by richer host activity/web fanout/firewall texture, and decide whether to optimize generation, mark the memory test `--no-cov`, or relax/update stale performance assertions.
- [x] DNS IP pool reuse causes cross-provider resolution (CloudFront→Microsoft IPs, etc.) — domain-first selection ensures consistent domain→IP mapping via FORWARD_DNS
- [x] AWS region mismatch between DNS PTR and SSL SNI for same IP — AWS hostname/PTR generation now derives a stable per-IP region/edge identity and PTR generation respects known forward hostname context.
- [x] TLS volume clustering design — added data-driven TLS destination profiles with overlay support and `eforge validate-config` schema/tag checks. Auto-generated external TLS now uses weighted enterprise, certificate-infra, package-update, developer-tool, and long-tail browsing profiles with stable per-host preferences. Smoke output had 28,544 TLS SNI rows, 116 distinct names, top SNI share 5.5%, and top-5 share 18.0%.
1 change: 1 addition & 0 deletions commands/eforge/config.md
@@ -70,6 +70,7 @@ When writing to the overlay, files are partial — they contain ONLY the user's
| Modify Windows auth realism | `windows_auth_realism.yaml` | (standalone — Security log auth timing and failed-logon profile knobs) |
| Modify baseline auth noise | `auth_noise.yaml` | (standalone — stale scheduled-credential accounts and irregular recurrence timing) |
| Modify endpoint background noise | `endpoint_noise.yaml` | (standalone — scheduled-process timing and DHCP registry emission policy) |
| Modify host activity distribution | `host_activity_profiles.yaml` | (standalone — host/persona/role rate-family multipliers, firewall deny bursts, and artifact variants) |
| Modify source observation coverage | `observation_profiles.yaml` | Scenario `observation_profile` selects the named profile; keep `complete` as the default training profile |
| Modify causal/source timing | `timing_profiles.yaml` | (standalone — causal prerequisite, source latency, teardown, and Windows/Sysmon collision-spacing knobs) |
| ~~Format definitions~~ | Not user-customizable | Engine internals — requires code changes |
8 changes: 8 additions & 0 deletions commands/eforge/references/config-dependency-graph.md
@@ -49,6 +49,14 @@ Each row is a file; columns show what it depends on and what depends on it.
| depends on | nothing | Standalone rate table |
| **depended on by** | Engine (runtime) | Drives all baseline traffic rate calculations (user activity, web top-level actions, DNS, SMB, Kerberos, LDAP, persona connections) |

### host_activity_profiles.yaml
| Direction | File | Relationship |
|-----------|------|-------------|
| depends on | scenario host metadata | Uses system type, roles, assigned users, primary systems, and user personas to resolve coarse activity multipliers |
| depends on | `traffic_rates.yaml` | Multiplies resolved baseline rates after global intensity and scenario `baseline_activity.traffic_rates` overrides are applied |
| **depended on by** | Engine (runtime) | Shapes host/persona/role baseline volume, endpoint noise, Linux/syslog shell activity, firewall deny bursts, IDS/ICMP rates, and encoded PowerShell artifact variation |
| validated by | `eforge validate-config` | Enforces known rate-family names, ordered positive bounds, core host types, firewall deny burst settings, and artifact variant pools |

### web_session_profiles.yaml
| Direction | File | Relationship |
|-----------|------|-------------|
56 changes: 53 additions & 3 deletions commands/eforge/references/config-host-activity.md
@@ -15,9 +15,10 @@ Schema documentation for host-level activity config files. User customizations g
5. [windows_auth_realism.yaml](#windows_auth_realismyaml)
6. [auth_noise.yaml](#auth-noise-auth_noiseyaml)
7. [endpoint_noise.yaml](#endpoint-noise-endpoint_noiseyaml)
8. [observation_profiles.yaml](#observation-profiles-observation_profilesyaml)
9. [timing_profiles.yaml](#timing_profilesyaml)
10. [Domain Controller Baseline Activity](#domain-controller-baseline-activity)
8. [host_activity_profiles.yaml](#host-activity-profiles-host_activity_profilesyaml)
9. [observation_profiles.yaml](#observation-profiles-observation_profilesyaml)
10. [timing_profiles.yaml](#timing_profilesyaml)
11. [Domain Controller Baseline Activity](#domain-controller-baseline-activity)

---

@@ -350,6 +351,55 @@ registry_noise:

---

## Host Activity Profiles (`host_activity_profiles.yaml`)

Controls coarse host/persona/role volume multipliers for baseline realism. This layer is intentionally rate-family based rather than event-type based: it keeps scenario authors from managing per-emitter matrices while still making domain controllers, servers, workstations, sysadmins, developers, and exposed roles produce distinct volumes.

```yaml
rate_families:
  default_bounds: [0.25, 6.0]
  bounds:
    windows_machine_auth: [0.5, 8.0]
    firewall_deny: [0.4, 5.0]

host_types:
  workstation:
    base_multiplier: 1.0
    variance: [0.75, 1.35]
    families:
      inbound_network: 0.65
  server:
    base_multiplier: 1.8
    variance: [0.85, 1.45]
    families:
      windows_service_process: 1.15
  domain_controller:
    base_multiplier: 4.0
    variance: [0.9, 1.3]
    families:
      dc_kerberos: 1.5

role_profiles:
  web_server:
    families:
      inbound_network: 2.0
      firewall_deny: 1.35

persona_profiles:
  sysadmin:
    families:
      linux_remote_admin: 1.45
      windows_remote_admin: 1.35
```

Resolved multipliers apply after global intensity defaults and scenario `baseline_activity.traffic_rates` overrides. Use `traffic_rates.yaml` for global low/medium/high defaults; use `host_activity_profiles.yaml` when the rate should differ by host type, role, persona, or deterministic per-host variance.
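
The following is a minimal sketch of how a multiplier for one rate family might be resolved from these profiles; it is not the engine's implementation. The multiplicative combination of host-type, role, and persona factors, the SHA-256-based per-host variance, the final clamp to the family bounds, and the names `resolve_multiplier`, `stable_unit`, and `PROFILES` are all assumptions made for illustration.

```python
import hashlib

# Hypothetical, trimmed-down profile data mirroring the YAML example above.
PROFILES = {
    "rate_families": {
        "default_bounds": [0.25, 6.0],
        "bounds": {"firewall_deny": [0.4, 5.0]},
    },
    "host_types": {
        "workstation": {"base_multiplier": 1.0, "variance": [0.75, 1.35],
                        "families": {"inbound_network": 0.65}},
        "domain_controller": {"base_multiplier": 4.0, "variance": [0.9, 1.3],
                              "families": {"dc_kerberos": 1.5}},
    },
    "role_profiles": {
        "web_server": {"families": {"inbound_network": 2.0, "firewall_deny": 1.35}},
    },
    "persona_profiles": {
        "sysadmin": {"families": {"windows_remote_admin": 1.35}},
    },
}


def stable_unit(hostname: str, seed: int) -> float:
    """Map (hostname, seed) to a repeatable float in [0, 1]."""
    digest = hashlib.sha256(f"{seed}:{hostname}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def resolve_multiplier(family, hostname, host_type, roles=(), persona=None,
                       seed=0, profiles=PROFILES):
    """Combine host-type, role, persona, and per-host variance factors.

    Assumed rule: factors multiply, missing entries count as 1.0, and the
    result is clamped to the family's bounds (or the default bounds).
    """
    host = profiles["host_types"][host_type]
    value = host["base_multiplier"] * host["families"].get(family, 1.0)
    for role in roles:
        role_families = profiles["role_profiles"].get(role, {}).get("families", {})
        value *= role_families.get(family, 1.0)
    if persona is not None:
        persona_families = profiles["persona_profiles"].get(persona, {}).get("families", {})
        value *= persona_families.get(family, 1.0)
    # Deterministic per-host variance: same hostname and seed, same factor.
    low, high = host["variance"]
    value *= low + (high - low) * stable_unit(hostname, seed)
    rate_families = profiles["rate_families"]
    min_bound, max_bound = rate_families["bounds"].get(
        family, rate_families["default_bounds"])
    return min(max(value, min_bound), max_bound)
```

Under these assumptions, the resolved value would scale a rate that `traffic_rates.yaml` and any scenario overrides have already produced, for example `rate * resolve_multiplier("dc_kerberos", "DC-01", "domain_controller")`, where `DC-01` is a made-up hostname. Hashing the hostname rather than drawing from a shared random stream keeps each host's variance stable across runs and independent of generation order.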

Valid rate families are: `user_activity`, `web`, `dns_interval`, `ntp`, `smb_interval`, `kerberos`, `ldap`, `persona_connections`, `role_network`, `inbound_network`, `windows_service_process`, `windows_registry`, `windows_scheduled_task`, `windows_remote_thread`, `windows_process_access`, `windows_module_load`, `windows_remote_admin`, `windows_service_logon`, `windows_machine_auth`, `dc_kerberos`, `linux_syslog`, `linux_remote_admin`, `linux_shell`, `firewall_deny`, `ids_alert`, and `icmp_monitoring`.

`artifact_variants.powershell_encoded` provides data-driven benign encoded PowerShell payload templates and parameter pools. The `firewall_deny` settings control ASA deny burst windows, quiet periods, and how often deny metadata hashes stay zero (the common case). Run `eforge validate-config` after overlay changes; it rejects unknown rate-family names, missing core host types, inverted ranges, invalid probabilities, and empty artifact pools.
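
As a rough illustration of the kinds of checks described above, the sketch below validates a parsed profile dictionary for unknown rate-family names, missing core host types, and inverted or non-positive ranges. It is not the `eforge validate-config` implementation: the function names, the error message wording, and the choice of `workstation`, `server`, and `domain_controller` as the core host types are assumptions, and probability and artifact-pool checks are left out.

```python
VALID_FAMILIES = {
    "user_activity", "web", "dns_interval", "ntp", "smb_interval", "kerberos",
    "ldap", "persona_connections", "role_network", "inbound_network",
    "windows_service_process", "windows_registry", "windows_scheduled_task",
    "windows_remote_thread", "windows_process_access", "windows_module_load",
    "windows_remote_admin", "windows_service_logon", "windows_machine_auth",
    "dc_kerberos", "linux_syslog", "linux_remote_admin", "linux_shell",
    "firewall_deny", "ids_alert", "icmp_monitoring",
}
# Assumption: the real set of required host types may differ.
CORE_HOST_TYPES = {"workstation", "server", "domain_controller"}


def _check_range(label, value, errors):
    """Record an error unless value is an ordered, positive [low, high] pair."""
    if not isinstance(value, (list, tuple)) or len(value) != 2:
        errors.append(f"{label}: expected [low, high]")
        return
    low, high = value
    if low <= 0 or high < low:
        errors.append(f"{label}: range must be positive and ordered, got {list(value)}")


def validate_profiles(config):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    rate_families = config.get("rate_families", {})
    if "default_bounds" in rate_families:
        _check_range("rate_families.default_bounds",
                     rate_families["default_bounds"], errors)
    for family, bounds in rate_families.get("bounds", {}).items():
        if family not in VALID_FAMILIES:
            errors.append(f"rate_families.bounds: unknown rate family '{family}'")
        _check_range(f"rate_families.bounds.{family}", bounds, errors)

    host_types = config.get("host_types", {})
    for missing in sorted(CORE_HOST_TYPES - set(host_types)):
        errors.append(f"host_types: missing core host type '{missing}'")

    for section in ("host_types", "role_profiles", "persona_profiles"):
        for name, entry in config.get(section, {}).items():
            if "variance" in entry:
                _check_range(f"{section}.{name}.variance", entry["variance"], errors)
            for family in entry.get("families", {}):
                if family not in VALID_FAMILIES:
                    errors.append(f"{section}.{name}: unknown rate family '{family}'")
    return errors
```

Collecting every problem into a list, rather than raising on the first one, lets an overlay author fix all issues in a single pass.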

---

## Observation Profiles (`observation_profiles.yaml`)

Defines named source-observation profiles selected by scenario `observation_profile`. Keep `complete` as the default for training-friendly perfect source coverage and correlation. Use non-default profiles only when a scenario intentionally needs realistic source gaps or ingestion delays.