Add evolution_plan.md: audit, roadmap, and prioritized reliability tasks #2

Merged
arty-kk merged 1 commit into main from codex/run-deep-static-audit-and-plan-evolution
Mar 30, 2026

@arty-kk (Owner) commented Mar 30, 2026

Motivation

  • The repository's evaluator and acceptance flow are inconsistent with the project's test conventions, which leads to false-negative candidate rejections and leaves rollback paths untested.
  • A consolidated plan is needed to align runtime test execution, enforce evaluator-selector contracts, and add tests for critical evolution/rollback behaviors.
  • The document organizes findings, goals, and a prioritized incremental roadmap to stabilize core reliability and developer UX.

Description

  • Add evolution_plan.md, which captures a baseline architecture map, critical flows, prioritized pain points (P0/P1/P2), and a three-phase roadmap to address them.
  • Introduce atomic task specs (EVO-001..EVO-005) that recommend replacing unittest discover with pytest -q, adding evaluator→selector contract tests, adding deterministic tests for rollback and integration tests for ExperimentManager.run_async, and normalizing LLM task contracts.
  • Include concrete steps, acceptance criteria, validation commands, and rollback strategies for each task to guide implementation and verification.
  • Enumerate explicit non-goals to constrain scope and a stopping rule to ensure the audit's actionable outcomes are addressed.
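
As an illustration of what an evaluator→selector contract test could look like, here is a minimal sketch. It is not taken from evolution_plan.md or the repository: `EvaluationResult`, its fields, and `select_candidate` are hypothetical names, and the contract shown (a candidate whose tests failed is never selected, and an empty acceptance set yields no selection) is an assumption about what such a contract might enforce.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EvaluationResult:
    """Hypothetical evaluator output; the field names are assumptions."""
    candidate_id: str
    tests_passed: bool
    score: float


def select_candidate(results: Sequence[EvaluationResult]) -> Optional[EvaluationResult]:
    """Hypothetical selector: best-scoring candidate among those whose tests passed."""
    accepted = [r for r in results if r.tests_passed]
    if not accepted:
        # Nothing acceptable: the caller would keep (or roll back to) the baseline.
        return None
    return max(accepted, key=lambda r: r.score)


def test_selector_ignores_failed_candidates():
    # A high score must not rescue a candidate whose tests failed.
    results = [
        EvaluationResult("a", tests_passed=False, score=0.99),
        EvaluationResult("b", tests_passed=True, score=0.42),
    ]
    chosen = select_candidate(results)
    assert chosen is not None and chosen.candidate_id == "b"


def test_selector_returns_none_when_all_fail():
    # With no passing candidate there is nothing to promote.
    results = [EvaluationResult("a", tests_passed=False, score=0.9)]
    assert select_candidate(results) is None
```

Tests written in this style are plain functions, so pytest collects them without a unittest.TestCase wrapper.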

Testing

  • No automated tests were added in this change, and no CI jobs were executed as part of this PR.
  • The plan recommends running pytest -q and python -m compileall -q src as validation commands for subsequent code changes.
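
The two validation commands could be wrapped in a small helper so that follow-up changes run them the same way each time. This is a sketch under assumptions, not part of the plan: `run_validation` and `VALIDATION_COMMANDS` are hypothetical names, and pytest is invoked as `python -m pytest -q`, which behaves like `pytest -q` except that it also adds the current directory to `sys.path`.

```python
import subprocess
import sys
from typing import Callable, List, Sequence

# The commands named in the plan, run through the current interpreter.
VALIDATION_COMMANDS: List[List[str]] = [
    [sys.executable, "-m", "pytest", "-q"],
    [sys.executable, "-m", "compileall", "-q", "src"],
]


def run_validation(
    commands: Sequence[Sequence[str]] = VALIDATION_COMMANDS,
    run: Callable[[Sequence[str]], int] = subprocess.call,
) -> bool:
    """Run each command in order; stop and return False on the first non-zero exit."""
    for cmd in commands:
        exit_code = run(cmd)
        if exit_code != 0:
            # Any non-zero code counts as failure here, including pytest's
            # exit code 5 ("no tests collected").
            print(f"validation failed ({exit_code}): {' '.join(cmd)}", file=sys.stderr)
            return False
    return True
```

Calling `run_validation()` from the repository root runs both commands. The `run` parameter exists only so the helper can be exercised without spawning processes.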

Codex Task

@arty-kk merged commit 4f30d80 into main on Mar 30, 2026
2 checks passed