Automate stage reset with opt-out label to reduce coordination overhead #8837

## problem

Resetting the shared `stage` branch is currently manual and disruptive. Engineers have to ask in a channel before resetting to avoid interrupting active testing, which slows recovery when stage is broken.

In practice, most resets could proceed immediately, but there is no lightweight mechanism to protect the minority of PRs that are under active testing on stage.

## solution

Introduce an automated stage-reset workflow with default-allow behavior and explicit opt-out:

* Add a new PR label: `stage-reset-blocked`
* When a stage reset is requested, automation checks open PRs labeled `on stage`
* If **no** PR has `stage-reset-blocked`, automatically reset `stage` to `main`, then re-merge `on stage` PR branches in deterministic order
* If one or more PRs have `stage-reset-blocked`, skip the reset and notify the PR owners and Slack with the list of blocked PRs
* Always post a completion/status summary to Slack and keep a full audit trail in the Actions logs

This reduces coordination cost in the common case while preserving safety for active testers.
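The default-allow rule above can be sketched as a small pure function. This is an illustration only, not the merge-bot's actual code: the `PullRequest` and `ResetDecision` types, the `decide_reset` name, and the choice of ascending PR number as the "deterministic order" are all assumptions. The `override` flag mirrors the maintainer override mentioned under the reset execution rules.

```python
from dataclasses import dataclass

ON_STAGE = "on stage"
RESET_BLOCKED = "stage-reset-blocked"


@dataclass(frozen=True)
class PullRequest:
    # Hypothetical minimal view of an open PR.
    number: int
    branch: str
    labels: frozenset[str]


@dataclass
class ResetDecision:
    proceed: bool
    on_stage: list[PullRequest]
    blocked: list[PullRequest]


def decide_reset(open_prs: list[PullRequest], override: bool = False) -> ResetDecision:
    # Only PRs labeled `on stage` take part in the decision.
    # Sorting by PR number is an assumed ordering; the issue only
    # requires that the order be deterministic.
    on_stage = sorted(
        (pr for pr in open_prs if ON_STAGE in pr.labels),
        key=lambda pr: pr.number,
    )
    blocked = [pr for pr in on_stage if RESET_BLOCKED in pr.labels]
    # Default-allow: proceed unless at least one on-stage PR opted out,
    # or a maintainer explicitly overrides the block.
    return ResetDecision(
        proceed=override or not blocked,
        on_stage=on_stage,
        blocked=blocked,
    )
```

Note that a PR carrying `stage-reset-blocked` without `on stage` does not block the reset in this sketch, since only on-stage PRs are inspected.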

## technical

Implement via GitHub Actions + existing merge-bot conventions:

1. **Trigger paths**
   * Manual `workflow_dispatch` for reset requests (initial rollout)
   * Optional automatic trigger on merge-bot conflict / failed stage deploy (later)
2. **Data collection**
   * Query open PRs with label `on stage`
   * Partition into blocked/unblocked based on `stage-reset-blocked`
3. **Reset execution rules**
   * Proceed only when blocked list is empty (or when explicit maintainer override input is set)
   * Reset branch: align `stage` to `main`
   * Re-merge each `on stage` branch back to `stage`
   * Push branch and report per-PR merge outcomes
4. **Notifications**
   * Pre-action Slack message (reason, on-stage PRs, blocked PRs)
   * Post-action Slack message (success/failure, re-applied PRs, failures requiring manual follow-up)
5. **Guardrails**
   * Dry-run mode for initial rollout
   * Protected release-window bypass rules
   * Full run logs and deterministic ordering for repeatability
6. **Docs alignment**
   * Keep `apps/docs/docs/04-engineering-practices/02-deployment/index.md` in sync with workflow behavior
   * Include policy examples for when to apply/remove `stage-reset-blocked`
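As a rough sketch of the reset execution and dry-run guardrail described above, the function below aligns `stage` to `main`, re-merges each branch, and records a per-branch outcome. The function names, the `origin` remote, the injectable `run` callable, and the specific git flags (`checkout -B`, `merge --no-ff`, `push --force-with-lease`) are assumptions chosen for illustration; the real workflow may do this differently.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes git arguments and returns True on success.
Runner = Callable[[Sequence[str]], bool]


def git(args: Sequence[str]) -> bool:
    # Default runner: shell out to git in the current checkout.
    return subprocess.run(["git", *args]).returncode == 0


def reset_stage(
    branches: Sequence[str],
    run: Runner = git,
    remote: str = "origin",
    dry_run: bool = True,
) -> dict[str, str]:
    # Dry-run mode touches nothing and only reports what would happen.
    if dry_run:
        return {branch: "would merge" for branch in branches}

    # Align the local `stage` branch to the remote `main`.
    if not (run(["fetch", remote]) and run(["checkout", "-B", "stage", f"{remote}/main"])):
        raise RuntimeError("could not align stage to main")

    outcomes: dict[str, str] = {}
    for branch in branches:  # caller supplies the deterministic order
        if run(["merge", "--no-ff", "--no-edit", f"{remote}/{branch}"]):
            outcomes[branch] = "merged"
        else:
            # Leave the tree clean so later branches can still be merged.
            run(["merge", "--abort"])
            outcomes[branch] = "failed"

    if not run(["push", "--force-with-lease", remote, "stage"]):
        raise RuntimeError("push failed")
    return outcomes
```

Passing the runner as a parameter keeps the sequencing testable without a repository, and the returned mapping is the kind of per-PR result the post-action Slack message could report.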

Related context: [ENG-3570](https://linear.app/jesus-film-project/issue/ENG-3570/ci-cd-optimisation-cancel-in-progress-builds-on-update)
