fix loop: no post-merge validation — bad fixes are undetected until a human notices #55

@ooloth

Problem

The fix loop's responsibility ends when a PR is opened. If the PR is merged and introduces a regression — a test that now fails on a different code path, a type error that CI catches in a downstream build, a behavior change that breaks a dependent service — nothing in this system detects it. The only safety nets are the target project's CI and human reviewers.

This matters because:

  • The fix loop is calibrated for minimal, targeted changes, but it can still be wrong
  • A regression from a merged agent fix may not be immediately visible (e.g. it only shows up as flaky tests or integration failures)
  • There's no audit trail connecting a regression back to the agent fix that caused it

What's needed

A lightweight post-merge check that runs some time after a PR is merged and records whether the fix held:

  1. CI status check: After merge, poll the target repo's CI for the merge commit. If CI fails, file a follow-up issue or comment on the original issue linking the failure.
  2. Groom integration: The groom loop already checks whether issues are resolved. Extend it to also check whether any merged agent-fix PR has left CI failing on its merge commit.
  3. Failure attribution: If a CI failure is detected post-merge, label the original issue needs-human-review and comment with the failing check URL.
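
A minimal sketch of the status check and attribution steps above, assuming the input has the shape of the `check_runs` array returned by GitHub's REST API (`status`, `conclusion`, `html_url`). The names `summarize_checks`, `CheckSummary`, and the `FAILING` set are hypothetical, and fetching the check runs is left out:

```python
from dataclasses import dataclass, field

# Conclusions counted as failures. This set is an assumption, not project policy.
FAILING = {"failure", "timed_out", "cancelled", "action_required"}


@dataclass
class CheckSummary:
    state: str  # "passing", "failing", "pending", or "unknown"
    failing_urls: list = field(default_factory=list)


def summarize_checks(check_runs):
    """Reduce a merge commit's check runs to one state plus failing check URLs."""
    if not check_runs:
        return CheckSummary("unknown")  # no CI reported for this commit
    failing = [
        run["html_url"]
        for run in check_runs
        if run.get("status") == "completed" and run.get("conclusion") in FAILING
    ]
    if failing:
        return CheckSummary("failing", failing)
    if any(run.get("status") != "completed" for run in check_runs):
        return CheckSummary("pending")  # poll again later
    return CheckSummary("passing")
```

A "pending" result would be retried on the next poll, and a "failing" result carries the URLs that the follow-up comment needs.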

Minimal version

A command run.py check-merged <project-id> that:

  • Finds agent-opened PRs merged in the last N days
  • Checks CI status of each merge commit
  • Comments on the original issue if CI is failing

This doesn't require loop integration — it can be run as a separate cron step.
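
As a rough sketch of the first bullet, the helpers below select merged agent PRs from a list of PR objects and recover the original issue number. They assume PR dicts shaped like GitHub's REST API responses (`merged_at`, `labels`, `body`), that agent PRs carry a label (`agent-fix` here is a hypothetical name), and that the PR body links the issue with a closing keyword such as "Fixes #12":

```python
import re
from datetime import datetime, timedelta, timezone

AGENT_LABEL = "agent-fix"  # hypothetical; the real marker for agent PRs may differ
CLOSES_RE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)", re.IGNORECASE
)


def recently_merged_agent_prs(prs, days, now=None):
    """Return agent-labelled PRs whose merged_at falls within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    selected = []
    for pr in prs:
        merged_at = pr.get("merged_at")
        if not merged_at:
            continue  # still open, or closed without merging
        # GitHub timestamps look like 2024-05-01T12:00:00Z
        merged = datetime.strptime(merged_at, "%Y-%m-%dT%H:%M:%SZ").replace(
            tzinfo=timezone.utc
        )
        labels = {label["name"] for label in pr.get("labels", [])}
        if AGENT_LABEL in labels and merged >= cutoff:
            selected.append(pr)
    return selected


def linked_issue_number(pr_body):
    """Extract the first issue number that follows a closing keyword, if any."""
    match = CLOSES_RE.search(pr_body or "")
    return int(match.group(1)) if match else None
```

Passing `now` explicitly keeps the cutoff deterministic, which makes the N-day window easy to test from a cron-driven command.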

Definition of Done

  • run.py check-merged <project-id> outputs a table of merged agent PRs and their CI status
  • For each PR with failing CI, the command comments on the original issue and adds the needs-human-review label to it
  • The command is documented in docs/playbooks/
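
One possible shape for the output and the follow-up action, sketched under assumptions: the function names, the table columns, and the comment wording are all hypothetical, and actually posting the comment and label is left to whatever GitHub client the project already uses.

```python
NEEDS_REVIEW_LABEL = "needs-human-review"


def render_table(rows):
    """Format (pr_number, issue_number, ci_state) tuples as a plain-text table."""
    cells = [("PR", "ISSUE", "CI")]
    for pr, issue, state in rows:
        cells.append((f"#{pr}", f"#{issue}" if issue else "-", state))
    widths = [max(len(row[i]) for row in cells) for i in range(3)]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in cells
    ]
    return "\n".join(lines)


def failure_action(issue_number, pr_number, failing_urls):
    """Describe the comment and label to apply to the original issue."""
    body = [f"CI is failing on the merge commit of agent PR #{pr_number}.", ""]
    body.append("Failing checks:")
    body += [f"- {url}" for url in failing_urls]
    return {
        "issue": issue_number,
        "comment": "\n".join(body),
        "add_labels": [NEEDS_REVIEW_LABEL],
    }
```

Returning the action as data, rather than calling the API directly, would let the command print a dry run before anything is posted.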

Out of Scope

  • Automated rollback (too risky to automate)
  • Monitoring for behavioral regressions not caught by CI

Metadata

    Labels

    enhancement (New feature or request)
    scope:fix (Fix loop, implement, review, PR opening)
    scope:observability (Logging, cost tracking, run history)
