Merged
1 change: 1 addition & 0 deletions TODO.md
@@ -245,6 +245,7 @@ Replaced manual per-emitter field coordination with SecurityEvent intermediate r
- [x] **Later architectural sprint: imperfect observation and source coverage** — implemented a training-friendly `complete` default plus overlay-compatible named observation profiles that apply deterministic source-level drop/delay/coverage semantics without modeling contradictions. The policy covers endpoint, network, proxy/web, firewall, IDS, Windows, Sysmon, Zeek, syslog, bash history, and eCAR source families, while ground truth preserves canonical truth and records source evidence status. Verification passed: focused observation/config/ground-truth tests, `uv run eforge validate-config`, Ruff checks/format checks, full normal `uv run pytest -v` (`3036 passed, 15 skipped`), and slow-inclusive `uv run pytest -v --include-slow` (`3050 passed, 1 skipped`).
- [x] Observation-aware automated eval and manifest — generation now writes `OBSERVATION_MANIFEST.json` beside ground truth, `eforge eval` loads it when present, coverage-style causality metrics report raw and observation-adjusted scores for expected non-visible evidence, and correctness/contradiction checks remain strict. Verification passed with config validation, Ruff checks/format checks, focused eval/manifest tests, and full normal `uv run pytest -v` (`3047 passed, 15 skipped`).
- [x] Post-host-activity score check — synced `dev`, cleaned up stale TODOs, regenerated/evaluated `scenarios/iteration-test` from the current iteration-test prompt with `enterprise_standard` observation, and ran one blind expert-panel review without entering another fix loop. Automated eval passed at `92.39` over `108,858` records; blind synthetic-confidence averaged `82.75`. Highest-leverage follow-ups are Linux SSH/syslog lifecycle ordering, Zeek observation-tree consistency, X.509 metadata coherence, Windows OS-build/local-SID identity, and static web asset manifests.
- [x] Current-dev calibration pass — regenerated and evaluated `scenarios/iteration-test` from current `dev`, fixed actionable cleanliness issues in OCSP optional-field rendering, observation-manifest accounting for sensor-filtered network evidence, Kerberos/domain-logon causal ordering, storyline event timing, storyline trace matching, temporal trace comparison, and visible Windows logon-before-process ordering. Verification passed with `uv run eforge validate-config`, scenario validation with only expected sensor/observation/pivot-linkability warnings, quantitative eval at `94.64` with all hard gates passing, Ruff checks, focused regressions (`164 passed`), and full normal `uv run pytest -v` (`3075 passed, 15 skipped`).
- [x] Full slow-suite regression cleanup after loop-65 merge — explicit-proxy storyline beacons now preserve authored hostname+destination IP pairs only when the storyline marks that pair as intentional, normal proxy-origin DNS resolution remains intact, and the parallel-generation LogonID assertion treats Type 7 unlock reuse as valid slice-of-time Windows behavior. Verified with targeted proxy/parallel tests, `uv run ruff check .`, `uv run ruff format --check .`, and `uv run pytest -v --include-slow` (`2875 passed, 23 skipped`).
Detection Engineer blind review completed for the regenerated Loop 61 dataset at `scenarios/iteration-test/data`; reviewer verdict: Synthetic, 63/100 confidence. Main findings: one PROXY-01 sshd accepted-login lifecycle gap/self-source artifact and Windows 4648 explicit-credential caller PID/image provenance ambiguity around `WS-MCHEN-01`.
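
The observation-aware eval entry above describes coverage-style metrics that report both a raw and an observation-adjusted score. A minimal sketch of that idea follows, assuming the adjustment simply excludes evidence that the observation manifest marks as expected-but-not-visible from the denominator. The `ExpectedEvidence` shape and the `coverage_scores` helper are hypothetical and are not taken from the `eforge` codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpectedEvidence:
    """One piece of evidence ground truth says should exist (hypothetical shape)."""

    key: str
    visible: bool  # False when the observation profile dropped or filtered it


def coverage_scores(expected, found_keys):
    """Return (raw, adjusted) coverage as fractions in [0, 1].

    raw      counts every expected item, visible or not.
    adjusted ignores items that were expected to be non-visible (assumption).
    """
    found = set(found_keys)

    hits = sum(1 for e in expected if e.key in found)
    raw = hits / len(expected) if expected else 1.0

    visible = [e for e in expected if e.visible]
    visible_hits = sum(1 for e in visible if e.key in found)
    adjusted = visible_hits / len(visible) if visible else 1.0

    return raw, adjusted


expected = [
    ExpectedEvidence("conn-1", True),
    ExpectedEvidence("conn-2", True),
    ExpectedEvidence("dns-1", True),
    ExpectedEvidence("proxy-1", False),  # dropped by the observation profile
]
raw, adjusted = coverage_scores(expected, {"conn-1", "conn-2", "dns-1"})
```

Under these assumptions the raw score is penalized for the dropped proxy record while the adjusted score is not, which is the distinction the entry draws. Correctness and contradiction checks would stay outside this adjustment.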

11 changes: 6 additions & 5 deletions scenarios/ITERATION-TEST-PROMPT.md
@@ -95,12 +95,13 @@
`src_ip`. Produces ASA 106023 denies + Zeek S0 conn entries on external-facing
sensors only (not internal sensors).

-2. Web Scan (+0h30m): External attacker runs web vulnerability scanning against WEB-EXT-01.
+2. Web Scan (+0h31m): External attacker runs web vulnerability scanning against WEB-EXT-01.
Use a `web_scan` event with `source_ip: "185.70.41.45"`, `dst_ip: "10.10.3.10"`,
`dst_port: 443`, `hostname: "ehr-portal.meridianhcs.com"`, `preset: nikto`,
`rate: 10`, and exactly one termination field: `duration: "20m"`. Do not use
-`src_ip`. Run concurrently with the port scan. Expect 733100 threat-detection
-alerts during this phase.
+`src_ip`. Start one minute after the port scan so timing checks do not see
+identical step timestamps, while still overlapping the scan activity. Expect
+733100 threat-detection alerts during this phase.

3. Rogue Device (+0h45m): Attacker plugs rogue laptop into network, obtains IP via DHCP.
Use a `dhcp_lease` event on the parent storyline `system` for the rogue device.
@@ -172,7 +173,7 @@
interval: "10m", duration: "1h30m", jitter: 0.3, hostname, user_agent, method: GET,
orig_bytes/resp_bytes for realistic sizing).

-18. Blocked C2 (+4h30m): Attacker malware on DC-01 also attempts to beacon directly to
+18. Blocked C2 (+4h31m): Attacker malware on DC-01 also attempts to beacon directly to
45.33.32.30:443 — blocked by firewall (server_vlan → external not in policy). Use beacon
event with action: deny, interval: "30m", duration: "1h30m". Denied attempts visible to
internal sensors only.
@@ -185,7 +186,7 @@
length_range: [10, 18], interval: "30s", duration: "45m",
rcode_distribution for mostly NXDOMAIN).

-21. Collection (+5h): Authenticate to FILE-SRV-01 with backdoor account svc_mhsync
+21. Collection (+5h01m): Authenticate to FILE-SRV-01 with backdoor account svc_mhsync
(logon event, type 3), enumerate shares, stage financial and patient data, compress
with PowerShell Compress-Archive.
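
The offset changes in this file all move a step by one minute so that no two steps share a start time. A small sketch of the kind of check that motivates this is shown here: it parses `+XhYYm` offsets and reports steps that collide. The helper names are hypothetical, and the `+0h30m` port-scan offset in the example is inferred from the "one minute after the port scan" wording rather than shown in the diff.

```python
import re
from collections import defaultdict
from datetime import timedelta

# Matches "+0h30m", "+4h31m", "+5h01m", and the minute-less form "+5h".
_OFFSET = re.compile(r"^\+(\d+)h(?:(\d+)m)?$")


def parse_offset(text):
    """Convert a prompt offset such as '+5h01m' into a timedelta."""
    match = _OFFSET.match(text)
    if match is None:
        raise ValueError(f"unrecognized step offset: {text!r}")
    hours = int(match.group(1))
    minutes = int(match.group(2) or 0)
    return timedelta(hours=hours, minutes=minutes)


def colliding_steps(steps):
    """Return groups of step names that share an identical start offset."""
    by_offset = defaultdict(list)
    for name, text in steps.items():
        by_offset[parse_offset(text)].append(name)
    return [sorted(names) for _, names in sorted(by_offset.items()) if len(names) > 1]


# Port-scan offset is an assumption; the web-scan values come from the diff.
before = {"Port Scan": "+0h30m", "Web Scan": "+0h30m"}
after = {"Port Scan": "+0h30m", "Web Scan": "+0h31m"}
```

With these inputs, `colliding_steps(before)` reports the port scan and web scan as one colliding group, while `colliding_steps(after)` reports none.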

4 changes: 4 additions & 0 deletions src/evidenceforge/config/evaluation/causal_pairs.yaml
@@ -163,3 +163,7 @@ pairs:
match_fields:
before: TargetUserName
after: TargetUserName
+# 4624 rows occur on target systems while 4768 rows occur on DCs, and the
+# shared username key is weak. A later matching TGT is not proof that a
+# target-host logon inverted Kerberos causality.
+allow_missing_prior: true
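
One way to read the new flag: when `allow_missing_prior` is set, an "after" event (4624) with no earlier matching "before" event (4768) is not counted as an ordering violation. The sketch below illustrates that reading only. The function name, the event dict shape, and the exact semantics are assumptions, not the evaluator's actual implementation.

```python
def ordering_violations(events, before_id, after_id, key, allow_missing_prior=False):
    """Return 'after' events that have no earlier 'before' event with the same key.

    events is a list of dicts with 'ts', 'event_id', and the match field
    (hypothetical shape). With allow_missing_prior=True, a missing prior
    event is tolerated and nothing is reported for it.
    """
    seen_before = set()
    violations = []
    for event in sorted(events, key=lambda e: e["ts"]):
        match_value = event[key]
        if event["event_id"] == before_id:
            seen_before.add(match_value)
        elif event["event_id"] == after_id and match_value not in seen_before:
            if not allow_missing_prior:
                violations.append(event)
    return violations


# A target-host logon followed later by a TGT request for the same username.
events = [
    {"ts": 10, "event_id": 4624, "TargetUserName": "mchen"},
    {"ts": 20, "event_id": 4768, "TargetUserName": "mchen"},
]
strict = ordering_violations(events, 4768, 4624, "TargetUserName")
relaxed = ordering_violations(
    events, 4768, 4624, "TargetUserName", allow_missing_prior=True
)
```

In strict mode the logon is flagged because the only matching TGT comes later; in relaxed mode it is not, which matches the rationale in the YAML comment that a username-only match across hosts is too weak to prove inverted causality.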
6 changes: 3 additions & 3 deletions src/evidenceforge/config/formats/zeek_ocsp.yaml
@@ -81,7 +81,7 @@ output:
"serialNumber": {{ serialNumber | tojson }},
"certStatus": {{ certStatus | tojson }},
"thisUpdate": {{ thisUpdate | tojson }},
-"nextUpdate": {{ nextUpdate | tojson }},
-"revoketime": {{ revoketime | tojson }},
-"revokereason": {{ revokereason | tojson }}
+"nextUpdate": {{ nextUpdate | tojson }}{% if revoketime is not none %},
+"revoketime": {{ revoketime | tojson }}{% endif %}{% if revokereason is not none %},
+"revokereason": {{ revokereason | tojson }}{% endif %}
}
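
The template change moves each comma inside its `{% if %}` guard so that unset optional fields are omitted entirely instead of being rendered as `null`. A plain-Python equivalent of the resulting behavior for these three fields is sketched below; it does not use the project's renderer, and the helper name and sample values are hypothetical.

```python
import json

# Fields the updated template emits only when they are not None.
OPTIONAL_FIELDS = ("revoketime", "revokereason")


def render_ocsp_tail(record):
    """Serialize nextUpdate plus any optional revocation fields that are set."""
    fields = {"nextUpdate": record["nextUpdate"]}
    for name in OPTIONAL_FIELDS:
        if record.get(name) is not None:
            fields[name] = record[name]
    return json.dumps(fields)


# Illustrative values only.
good_cert = {"nextUpdate": "2024-01-08T00:00:00Z", "revoketime": None, "revokereason": None}
revoked_cert = {
    "nextUpdate": "2024-01-08T00:00:00Z",
    "revoketime": "2024-01-02T12:00:00Z",
    "revokereason": "keyCompromise",
}
```

For `good_cert` the output contains only `nextUpdate` with no trailing comma, and for `revoked_cert` all three keys appear. Keeping the comma inside each guard is what prevents a dangling separator when a field is skipped.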