Merged
61 commits
7caf4b2
fix: improve observation coherence and TLS realism
DavidJBianco May 15, 2026
b476c16
fix: improve web and kerberos baseline realism
DavidJBianco May 15, 2026
7e6c7ea
fix: repair service logon and linux telemetry realism
DavidJBianco May 15, 2026
02bd4bd
fix: scope bash tool affinity by role pool
DavidJBianco May 15, 2026
67ac00d
fix: prefer host services in bash templates
DavidJBianco May 15, 2026
caf06a7
fix: align cli http and analyzer timing realism
DavidJBianco May 15, 2026
508501c
fix: preserve cli http network effect context
DavidJBianco May 15, 2026
811b2f1
fix: bind process http commands to proxy flows
DavidJBianco May 15, 2026
380f38c
docs: record loop 5 blind review results
DavidJBianco May 15, 2026
ac30094
fix: improve loop 6 session and web realism
DavidJBianco May 15, 2026
5702bbf
docs: record loop 6 blind review results
DavidJBianco May 15, 2026
3f66d06
fix: improve loop 7 service and cli realism
DavidJBianco May 15, 2026
6374303
fix: close tracked foreground processes at finalize
DavidJBianco May 15, 2026
a477484
docs: record loop 7 blind review results
DavidJBianco May 16, 2026
e768f4c
fix: correct service install start semantics
DavidJBianco May 16, 2026
e21a25f
docs: record loop 8 assessment results
DavidJBianco May 16, 2026
46bd9d2
fix: diversify web and proxy status outcomes
DavidJBianco May 16, 2026
454edf0
docs: record loop 9 assessment results
DavidJBianco May 16, 2026
f0f5c3d
fix: reduce rare admin tool background noise
DavidJBianco May 16, 2026
34731ff
docs: record loop 10 assessment results
DavidJBianco May 16, 2026
e9ff69c
fix: diversify linux command texture
DavidJBianco May 16, 2026
4993829
docs: record loop 11 assessment results
DavidJBianco May 16, 2026
b7c8a70
fix: repair source-native host contradictions
DavidJBianco May 16, 2026
9ae822c
docs: record loop 12 assessment results
DavidJBianco May 16, 2026
93463e4
fix: preserve source-native web response semantics
DavidJBianco May 16, 2026
af301b9
docs: record loop 13 assessment results
DavidJBianco May 16, 2026
7a82449
fix: align source-native command and DNS semantics
DavidJBianco May 16, 2026
16740cc
docs: record loop 14 assessment results
DavidJBianco May 16, 2026
9c9dcef
fix: enforce auth and network source semantics
DavidJBianco May 16, 2026
024db1a
docs: record loop 15 assessment results
DavidJBianco May 16, 2026
21f3a79
fix: repair source-native process and zeek texture
DavidJBianco May 16, 2026
aeb457b
docs: record loop 16 assessment results
DavidJBianco May 16, 2026
eaf090a
fix: repair network and session source semantics
DavidJBianco May 16, 2026
4484c50
docs: record loop 17 assessment results
DavidJBianco May 16, 2026
bc738f2
fix: vary completed TLS duration floors
DavidJBianco May 16, 2026
e98f744
docs: record loop 18 assessment results
DavidJBianco May 16, 2026
dc4616c
fix: repair proxy http response semantics
DavidJBianco May 16, 2026
b4c99b1
fix: preserve redirect response mime semantics
DavidJBianco May 16, 2026
6b207bd
docs: record loop 19 assessment results
DavidJBianco May 16, 2026
76bc107
fix: bind shell helpers to user sessions
DavidJBianco May 16, 2026
f991a77
docs: record loop 20 assessment results
DavidJBianco May 16, 2026
c9a7b72
fix: attribute browser http flows to user processes
DavidJBianco May 16, 2026
6b589ec
docs: record loop 21 assessment results
DavidJBianco May 16, 2026
ecc45ef
fix: reduce linux endpoint cadence fingerprints
DavidJBianco May 16, 2026
f8c19f0
docs: record loop 22 assessment results
DavidJBianco May 16, 2026
91546d7
fix: vary zeek multi-sensor timing offsets
DavidJBianco May 16, 2026
a097f30
docs: record loop 23 assessment results
DavidJBianco May 16, 2026
999a20e
fix: diversify public dns and certificate profiles
DavidJBianco May 16, 2026
09076c1
docs: record loop 24 assessment results
DavidJBianco May 16, 2026
dd56f08
fix: model persistent zeek http transactions
DavidJBianco May 16, 2026
ebc2d42
docs: record loop 25 assessment results
DavidJBianco May 16, 2026
4f92a11
fix: align persistent http flow accounting
DavidJBianco May 16, 2026
c13e429
docs: record loop 26 assessment results
DavidJBianco May 16, 2026
38e431d
fix: diversify linux syslog daemon noise
DavidJBianco May 16, 2026
3e78053
docs: record loop 27 assessment results
DavidJBianco May 16, 2026
350b0f5
fix: loosen dns tunnel and c2 cadence
DavidJBianco May 16, 2026
5994b26
docs: record loop 28 assessment results
DavidJBianco May 16, 2026
e37a5f3
fix: vary linux syslog timer texture
DavidJBianco May 16, 2026
73e123e
docs: record loop 29 assessment results
DavidJBianco May 16, 2026
bc3772d
fix: mix ecar flow principal attribution
DavidJBianco May 16, 2026
044b097
docs: record loop 30 assessment results
DavidJBianco May 16, 2026
157 changes: 156 additions & 1 deletion TODO.md

Large diffs are not rendered by default.

20 changes: 18 additions & 2 deletions commands/eforge/references/config-host-activity.md
@@ -112,6 +112,8 @@ schedules:
jitter_minutes: 60
distro: all
role: web_server
services_any: [nginx, apache2]
slot_skip_probability: 0.08
cron_user: root
cron_commands:
debian: "/usr/bin/certbot renew --quiet --deploy-hook 'systemctl reload nginx'"
@@ -130,6 +132,12 @@ schedules:
| `jitter_minutes` | int | yes | Max jitter offset (per-host deterministic) |
| `distro` | string | yes | `all`, `debian`, or `rhel` |
| `role` | string | no | Host role filter (e.g., `web_server`) |
| `roles` | list[string] | no | Host role filter where any role may match |
| `exclude_roles` | list[string] | no | Host roles that suppress this schedule |
| `services_any` | list[string] | no | Required host service/package signals where any service may match |
| `host_probability` | float | no | Deterministic per-host enable probability between `0.0` and `1.0` |
| `slot_skip_probability` | float | no | Deterministic per-slot skip probability for frequent timers |
| `slot_jitter_seconds` | int | no | Extra runtime jitter for frequent timer slots |
| `process_path` | string | no | Path to service binary for process create events |
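
The "deterministic" probabilities in the table above (`host_probability`, `slot_skip_probability`) can be pictured as stable hash-based draws instead of calls to a live random generator. The following is a minimal sketch under that assumption, not the project's actual implementation: the helper names `stable_unit_interval`, `host_enabled`, and `slot_skipped`, the key layout, and the use of SHA-256 are all hypothetical.

```python
import hashlib


def stable_unit_interval(*parts: str) -> float:
    """Map the given strings to a repeatable float in [0.0, 1.0).

    Hypothetical helper: the same inputs always give the same value,
    so a decision survives re-runs without storing any state.
    """
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).digest()
    # Keep 53 bits so the quotient is exactly representable and always < 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def host_enabled(schedule_name: str, hostname: str, host_probability: float) -> bool:
    """Decide once per (schedule, host) whether the schedule is enabled."""
    return stable_unit_interval("host", schedule_name, hostname) < host_probability


def slot_skipped(
    schedule_name: str,
    hostname: str,
    slot_index: int,
    slot_skip_probability: float,
) -> bool:
    """Decide per timer slot whether this particular run is skipped."""
    draw = stable_unit_interval("slot", schedule_name, hostname, str(slot_index))
    return draw < slot_skip_probability
```

With this shape, a probability of `0.0` never fires and `1.0` always fires, and the same host makes the same choice on every run, which is one plausible reading of "deterministic per-host" in the field descriptions.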

**Systemd timer additional fields:**
@@ -322,7 +330,7 @@ scheduled_stale_credentials:

## Endpoint Noise (`endpoint_noise.yaml`)

Controls endpoint background timing and registry-emission policies that are too source-specific for scenario YAML. Use it to tune routine Windows scheduled-process spacing and whether DHCP interface registry values appear as ambient Sysmon/EDR noise.
Controls endpoint background timing, registry-emission, and EDR attribution policies that are too source-specific for scenario YAML. Use it to tune routine Windows scheduled-process spacing, whether DHCP interface registry values appear as ambient Sysmon/EDR noise, and how often eCAR FLOW rows expose process/user principal context.

```yaml
windows_scheduled_processes:
@@ -343,11 +351,19 @@ registry_noise:
emit_on_lease_events: true
suppress_system_types: [server, domain_controller]
suppress_roles: [domain_controller, dns_server, file_server, web_server]

ecar_flow_identity:
user_process_probability: 0.88
service_process_probability: 0.48
root_process_probability: 0.42
inbound_listener_probability: 0.36
```

`windows_scheduled_processes` replaces hour-end clamping with profile-driven trigger windows, per-host phase offsets, jitter, and skips. Keep `trigger_window_end_seconds` comfortably below 3599 to avoid synthetic `xx:59:59` clusters.

`registry_noise.dhcp_interface_values` reserves DHCP interface registry writes for actual DHCP lease/reconfigure activity. Static infrastructure roles should stay in `suppress_system_types` or `suppress_roles` so they do not repeatedly rewrite DHCP values as ambient registry noise. Run `eforge validate-config` after overlay changes; it rejects inverted ranges, empty value-name lists, and invalid probabilities.
`registry_noise.dhcp_interface_values` reserves DHCP interface registry writes for actual DHCP lease/reconfigure activity. Static infrastructure roles should stay in `suppress_system_types` or `suppress_roles` so they do not repeatedly rewrite DHCP values as ambient registry noise.

`ecar_flow_identity` controls mixed FLOW principal attribution. User-owned process flows usually carry `principal`, service/root flows carry it less often, inbound listener flows carry it occasionally, and unknown or rejected flows remain unattributed. Run `eforge validate-config` after overlay changes; it rejects inverted ranges, empty value-name lists, and invalid probabilities.
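
As a rough illustration of the mixed attribution described above, the sketch below looks up a probability by flow kind and keeps or drops the `principal` accordingly. It is an assumption-laden example, not the generator's code: the function names, the `flow_kind` strings, and the use of `random.Random` are hypothetical, and only the four probability keys and their values come from the YAML in this change.

```python
import random
from typing import Mapping, Optional

# Values copied from the ecar_flow_identity block in this change.
ECAR_FLOW_IDENTITY = {
    "user_process_probability": 0.88,
    "service_process_probability": 0.48,
    "root_process_probability": 0.42,
    "inbound_listener_probability": 0.36,
}


def principal_probability(flow_kind: str, config: Mapping[str, float]) -> float:
    """Return the configured probability for a flow kind (hypothetical names).

    Kinds without a matching key, such as unknown or rejected flows,
    fall back to 0.0 and therefore stay unattributed.
    """
    return config.get(f"{flow_kind}_probability", 0.0)


def attach_principal(
    flow_kind: str,
    principal: Optional[str],
    rng: random.Random,
    config: Mapping[str, float] = ECAR_FLOW_IDENTITY,
) -> Optional[str]:
    """Keep the principal on a FLOW row with the configured probability."""
    if principal is None:
        return None
    if rng.random() < principal_probability(flow_kind, config):
        return principal
    return None
```

Passing a seeded `random.Random` keeps the outcome repeatable for a given scenario; whether the real generator seeds per host, per flow, or per run is not stated in this change.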

---

31 changes: 30 additions & 1 deletion src/evidenceforge/cli/validate_config.py
@@ -197,6 +197,9 @@ def validate_config() -> ValidationResult:
"activity/tls_realism.yaml": {
"dict_fields": {"san", "serial_numbers", "ocsp", "certificate_chains", "destinations"},
},
"activity/public_dns_profiles.yaml": {
"list_fields": {"nameserver_profiles": "name", "mail_profiles": "name"},
},
"activity/smb_file_transfers.yaml": {
"list_fields": {"mime_types": None, "analyzer_sets": None},
},
@@ -231,7 +234,11 @@ def validate_config() -> ValidationResult:
},
},
"activity/endpoint_noise.yaml": {
"dict_fields": {"windows_scheduled_processes", "registry_noise"},
"dict_fields": {
"windows_scheduled_processes",
"registry_noise",
"ecar_flow_identity",
},
},
"activity/host_activity_profiles.yaml": {
"dict_fields": {
@@ -470,6 +477,7 @@ def validate_config() -> ValidationResult:
from evidenceforge.generation.activity.process_network import load_process_network_map
from evidenceforge.generation.activity.proxy_uri import load_proxy_uri_templates
from evidenceforge.generation.activity.proxy_user_agents import load_proxy_user_agents
from evidenceforge.generation.activity.public_dns_profiles import load_public_dns_profiles
from evidenceforge.generation.activity.site_maps import load_site_maps
from evidenceforge.generation.activity.spawn_rules import load_spawn_rules
from evidenceforge.generation.activity.system_processes import load_system_processes
@@ -480,6 +488,7 @@ def validate_config() -> ValidationResult:
from evidenceforge.generation.activity.windows_auth_realism import load_windows_auth_realism

dns_data = load_dns_registry()
public_dns_profiles_data = load_public_dns_profiles()
ids_data = load_ids_signatures()
catalog_data = load_catalog()
traffic_data = load_traffic_profiles()
@@ -1719,6 +1728,7 @@ def _record_ids_rule_identity(
ProcessAccessPatternEntry,
ProcessNetworkEntry,
ProxyUserAgentOverrideEntry,
PublicDnsProfilesConfig,
PublicNtpServerEntry,
RemoteThreadStartLocationEntry,
ScheduledTaskEntry,
@@ -1892,6 +1902,12 @@ def _record_ids_rule_identity(
if tls_realism_data:
_SCHEMA_CHECKS.append(([tls_realism_data], TlsRealismConfig, "tls_realism.yaml"))

# public_dns_profiles.yaml
if public_dns_profiles_data:
_SCHEMA_CHECKS.append(
([public_dns_profiles_data], PublicDnsProfilesConfig, "public_dns_profiles.yaml")
)

# kerberos_realism.yaml
from evidenceforge.generation.activity.kerberos_realism import load_kerberos_realism

@@ -1920,6 +1936,19 @@ def _record_ids_rule_identity(
if not isinstance(entry, dict):
continue
app = str(entry.get("app") or "<unknown>")
_VALID_SYSLOG_SYSTEM_TYPES = {"workstation", "server", "domain_controller"}
for system_type in entry.get("system_types", []):
if system_type not in _VALID_SYSLOG_SYSTEM_TYPES:
result.issues.append(
Issue(
"ERROR",
"extra_syslog_messages.yaml",
(
f'App "{app}" has invalid system_type "{system_type}" '
f"(valid: {sorted(_VALID_SYSLOG_SYSTEM_TYPES)})"
),
)
)
for message in entry.get("messages", []):
if not isinstance(message, str):
continue
2 changes: 1 addition & 1 deletion src/evidenceforge/config/activity/README.md
@@ -22,7 +22,7 @@ caches data after first load. Two files (`network_params.yaml`,
| `kerberos_realism.yaml` | `kerberos_realism.py` | Kerberos 4768 TGT PreAuthType, TicketOptions, encryption, and PKINIT certificate field distributions with overlay support. |
| `windows_auth_realism.yaml` | `windows_auth_realism.py` | Windows Security authentication realism knobs such as minimum 4800→4801 lock/unlock gap, failed-logon validation paths, companion network evidence, and 4672 privilege profiles. |
| `auth_noise.yaml` | `auth_noise.py` | Baseline authentication-noise profiles such as stale scheduled-credential account pools and irregular recurrence timing. |
| `endpoint_noise.yaml` | `endpoint_noise.py` | Endpoint background timing and registry-emission policies for Windows scheduled processes and DHCP interface registry writes. |
| `endpoint_noise.yaml` | `endpoint_noise.py` | Endpoint background timing, registry-emission, and EDR attribution policies for Windows scheduled processes, DHCP interface registry writes, and eCAR FLOW principal context. |
| `host_activity_profiles.yaml` | `host_activity_profiles.py` | Coarse host/persona/role rate multipliers for baseline volume, endpoint noise, firewall deny bursts, and data-driven artifact variation. |
| `observation_profiles.yaml` | `config/observation_profiles.py` | Named source-observation profiles for optional source-level missingness and delays. Scenario `observation_profile` defaults to `complete`; generation records status in `OBSERVATION_MANIFEST.json` for eval. |
| `proxy_uri_templates.yaml` | `proxy_uri.py` | Per-domain URI path templates for proxy logs (Windows Update, CRL, OCSP, Azure AD, etc.). |
5 changes: 4 additions & 1 deletion src/evidenceforge/config/activity/application_catalog.yaml
@@ -541,7 +541,7 @@ applications:
linux:
image_path: "/usr/bin/curl"
command_templates:
- "curl -s https://api.example.com/status"
- "curl -s {external_api_url}"
- "curl -sS -o /dev/null -w '%{http_code}' {internal_url}"
- "curl -X GET {internal_url} -H 'Accept: application/json'"
categories: [user_app]
@@ -802,6 +802,7 @@ applications:

- id: dsquery
display_name: "Directory Service Query"
selection_weight: 2
platforms:
windows:
image_path: "C:\\Windows\\System32\\dsquery.exe"
@@ -820,6 +821,7 @@ applications:
- 'dsquery.exe group -samid "*admin*" -limit {ad_limit}'
categories: [query]
personas: [sysadmin, help_desk]
system_types: [domain_controller]

- id: ldapsearch
display_name: "LDAP Search"
@@ -868,6 +870,7 @@ applications:

- id: ntdsutil
display_name: "NTDS Utility"
selection_weight: 1
platforms:
windows:
image_path: 'C:\Windows\System32\ntdsutil.exe'
47 changes: 47 additions & 0 deletions src/evidenceforge/config/activity/bash_commands.yaml
@@ -27,29 +27,76 @@ common:
- "w"
- "whoami"
- "uname -a"
- "uname -sr"
- "uname -mrs"
- "cat /proc/version | cut -d' ' -f1-3"
- "uptime"
- "df -h"
- "df -h /"
- "df -h /var"
- "df -h /tmp"
- "free -m"
- "free -h"
- "ps aux"
- "ps -ef"
- "ps -ef | head"
- "ps aux --sort=-%mem | head"
- "ps aux | grep {service}"
- "clear"
- "hostname -f"
- "hostname"
- "date"
- "date -u"
- "history"
- "cat /etc/hostname"
- "cat /etc/os-release"
- "cat /etc/issue"
- "cat /etc/passwd | head"
- "ll"
- "exit"
- "cat /etc/resolv.conf"
- "cat /proc/cpuinfo | grep 'model name' | head -1"
- "grep -m1 'model name' /proc/cpuinfo"
- "echo $SHELL"
- "locale"
- "umask"
- "ulimit -n"
- "env | head -20"
- "env | sort | head"
- "which python3"
- "command -v python3"
- "file /usr/bin/ls"
- "stat /etc/passwd"
- "getent hosts localhost"
- "getent passwd $(whoami)"
- "groups"
- "users"
- "last -5"
- "who -a"
- "ip -br addr"
- "ip route"
- "ip route get 8.8.8.8"
- "ss -tan | head"
- "ss -s"
- "ls -ltr /var/log/ | tail -10"
- "ls -lt /var/log | head"
- "ls -ltr /var/log | tail"
- "ls -ld /var/log"
- "ls -lah /tmp | head"
- "find /tmp -maxdepth 1 -type f | head"
- "du -sh /var/log"
- "du -sh /home/* 2>/dev/null | head"
- "journalctl --no-pager -n 5"
- "journalctl -p err --no-pager -n 10"
- "journalctl --since '10 min ago' --no-pager -n 20"
- "journalctl -xe --no-pager | tail -20"
- "systemctl --failed --no-pager"
- "tail -20 /var/log/syslog"
- "tail -50 /var/log/auth.log"
- "grep -i error /var/log/syslog | tail"
- "grep -i failed /var/log/auth.log | tail"
- "lsmod | head"
- "dmesg --ctime | tail -20"

sysadmin:
- "systemctl status sshd"
2 changes: 1 addition & 1 deletion src/evidenceforge/config/activity/dns_registry.yaml
@@ -186,7 +186,7 @@ domains:
tags: [saas, background, outlook, teams, onedrive]
- domain: packages.microsoft.com
ips: ["13.107.246.52", "13.107.246.53"]
tags: [background, windows, linux]
tags: [background, linux]
- domain: res.cdn.office.net
ips: ["13.107.6.171", "13.107.9.171"]
tags: [cdn, outlook, teams, onedrive]
6 changes: 6 additions & 0 deletions src/evidenceforge/config/activity/endpoint_noise.yaml
@@ -35,3 +35,9 @@ registry_noise:
- forward_proxy
- app_server
- database

ecar_flow_identity:
user_process_probability: 0.88
service_process_probability: 0.48
root_process_probability: 0.42
inbound_listener_probability: 0.36
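
The accompanying documentation says `eforge validate-config` rejects invalid probabilities. A check of that kind could look roughly like the sketch below. This is an illustration only: `probability_issues` is a hypothetical helper, the message wording is invented, and the real validator in `validate_config.py` may be structured quite differently (for example, through its schema models).

```python
from typing import Any, List, Mapping


def probability_issues(section_name: str, section: Mapping[str, Any]) -> List[str]:
    """Collect messages for *_probability keys outside the range [0.0, 1.0].

    Hypothetical helper: booleans and non-numeric values are treated as
    invalid, since YAML can silently load `yes` or a quoted string there.
    """
    issues: List[str] = []
    for key, value in section.items():
        if not key.endswith("_probability"):
            continue
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0.0 <= value <= 1.0:
            issues.append(
                f"{section_name}.{key}: expected a number between 0.0 and 1.0, "
                f"got {value!r}"
            )
    return issues
```

Applied to the `ecar_flow_identity` values added here, such a check would report nothing; an overlay that set one of them to `1.2` would produce a single issue naming that key.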