Add GitHub-hosted deploy fallback#682
Conversation
|
Controller classification after worker handoff:
Classification: |
|
Controller review after CI green on current head
Verdict: implementation is CI-green and review-ready, but I am holding this as a human/infra merge gate because it changes the production deploy workflow and still needs the real no-op fallback deploy exercise recorded before issue #645 can be fully closed. Note: authenticated GitHub user is the PR author, so I cannot leave a formal approving review from this account; recording the maintainer classification here as a normal PR comment instead. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bb4b1f08a6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - name: Configure AWS credentials from OIDC | ||
| uses: aws-actions/configure-aws-credentials@v4 | ||
| with: | ||
| role-to-assume: ${{ vars.AWS_DEPLOY_ROLE_ARN }} | ||
| aws-region: ${{ env.AWS_REGION }} |
There was a problem hiding this comment.
Set role-duration-seconds for long fallback deploys
This fallback job can run for 75 minutes, but aws-actions/configure-aws-credentials defaults assumed-role credentials to 1 hour (docs). If Docker builds, migrations, or ECS waits pass 60 minutes, later aws calls in scripts/deploy.sh all can fail with expired credentials after ECR/ECS mutations have already started; set role-duration-seconds and the IAM role max session to cover the job timeout.
Useful? React with 👍 / 👎.
| deploy-github-hosted-oidc: | ||
| name: Deploy via GitHub-hosted OIDC fallback | ||
| if: ${{ github.event_name == 'workflow_dispatch' && inputs.deploy_path == 'github-hosted-oidc' }} | ||
| runs-on: ubuntu-latest |
There was a problem hiding this comment.
Let the fallback supersede a stuck Mac deploy
Because this new fallback job remains under the existing workflow-level concurrency: deploy-prod, it can be blocked when a push deploy is already waiting on the offline [self-hosted, opensend-deploy] runner; GitHub concurrency allows only one running workflow and one pending workflow per group by default (docs). In the exact Mac-runner-outage scenario this is meant to handle, the manual OIDC run may just sit pending unless an operator first cancels the stuck run, so the fallback needs an explicit safe supersede/cancel policy or that cancellation step should be codified.
Useful? React with 👍 / 👎.
|
Controller reclassification after fresh Codex review on current head
|
|
Controller refresh after current supervisor tick:
Next safe action: restore Discord thread creation credentials, then launch the narrow |
Summary
github-hosted-oidcdeploy path to the existing Deploy workflow while keeping the Mac mini as the push/default path.Verification
python3YAML parse/assertions for.github/workflows/deploy.ymlbun run test tests/deploy.test.tsmake checkmake testFollow-up to fully close #645
deploy_path=github-hosted-oidcafter the OIDC role/repo variables are configured and record the run evidence in issue [P2] Deploy pipeline single point of failure: Mac mini runner #645/runbook.Refs #645