
[codex] Clean calibration eval warnings #162

Merged

DavidJBianco merged 1 commit into dev from codex/calibration-cleanliness-fixes on May 15, 2026
Conversation

@DavidJBianco (Collaborator)

Summary

  • Fix actionable calibration cleanliness issues found while regenerating and evaluating scenarios/iteration-test from current dev.
  • Tighten evaluator matching so observation-aware/pivot-related expected gaps do not turn into false positives, while real contradictions still fail.
  • Fix OCSP optional-field rendering, correct network observation-manifest accounting for sensor-filtered evidence, and enforce visible Windows logon-before-process event ordering.

Validation

  • uv run eforge validate-config
  • uv run eforge validate scenarios/iteration-test/scenario.yaml
  • uv run eforge generate scenarios/iteration-test/scenario.yaml --verbose --force
  • uv run eforge eval scenarios/iteration-test/data --scenario scenarios/iteration-test/scenario.yaml --format json --verbose -> overall 94.64, all hard gates passing
  • uv run ruff check .
  • uv run ruff format --check .
  • Focused regressions: 164 passed
  • uv run pytest -v -> 3075 passed, 15 skipped

Notes

Pivot-linkability misses that are realistic or explained by the observation profile are intentionally left as calibration signals rather than cleanup blockers.

@DavidJBianco DavidJBianco marked this pull request as ready for review May 15, 2026 18:06
@DavidJBianco DavidJBianco merged commit 87ac753 into dev May 15, 2026
4 checks passed
