Merged
1 change: 1 addition & 0 deletions .github/actions/just/README.md
@@ -7,6 +7,7 @@ This GitHub Action sets up [`just`](https://github.com/casey/just) and runs a sp
## 🚀 Features

- Installs a specific version of [`just`](https://github.com/casey/just)
+- Installs `just` through `extractions/setup-crate@v2` in the same minimal composite-action shape used by `extractions/setup-just`
- Uses AWS credentials already configured earlier in the same job when needed
- Executes any `just` command (recipe)
- Captures and returns the final line of output as an action output
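
As a rough illustration of the last feature above, capturing the final line of a command's output can be sketched in shell like this. It is a minimal sketch and not the action's actual step: the helper name `capture_last_line` is hypothetical, and the real implementation may treat failures and empty output differently.

```shell
# Hypothetical helper (not taken from the action): run a command, copy its
# full output to stderr so it still shows up in the job log, and print only
# the final line on stdout.
capture_last_line() {
  # Propagate the command's exit status if it fails.
  output=$("$@") || return $?
  printf '%s\n' "$output" >&2
  printf '%s\n' "$output" | tail -n 1
}

# Example: prints "3" on stdout, while all three lines go to stderr.
capture_last_line printf 'building\ndone\n3\n'
```

In a workflow step, the captured value could then be appended to `$GITHUB_OUTPUT`, for example under the `just_outputs` name that `destroy.yml` reads.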
9 changes: 7 additions & 2 deletions .github/actions/just/action.yml
@@ -30,9 +30,14 @@ runs:
   using: "composite"
   steps:
     - name: Install Just
-      uses: extractions/setup-just@v4
+      uses: extractions/setup-crate@v2.0.1
       with:
-        just-version: ${{ inputs.just_version }}
+        repo: casey/just@${{ inputs.just_version }}
+        github-token: ${{ github.token }}
+
+    - name: Verify Just installation
+      shell: bash
+      run: just --version

     - name: Run just action (try/catch + capture)
       id: capture
1 change: 1 addition & 0 deletions .github/actions/terragrunt/README.md
@@ -5,6 +5,7 @@ This GitHub Action sets up **Terraform** and **Terragrunt** and runs a specified
## Features

- Installs pinned versions of Terraform and Terragrunt
+- Installs Terragrunt through `gruntwork-io/terragrunt-action@v3`
- Uses AWS credentials already configured earlier in the same job when needed
- Optionally passes Terragrunt variables via JSON tfvars
- Supports `plan` mode for producing local saved plan files
5 changes: 3 additions & 2 deletions .github/actions/terragrunt/action.yml
@@ -40,9 +40,10 @@ runs:
         terraform_wrapper: false

     - name: Install Terragrunt
-      uses: autero1/action-terragrunt@v3.0.2
+      uses: gruntwork-io/terragrunt-action@v3
       with:
-        terragrunt-version: ${{ inputs.tg_version }}
+        tg_version: ${{ inputs.tg_version }}
+        tf_path: terraform

     - name: Normalize and write override_tg_vars
       if: inputs.tg_action == 'apply' || inputs.tg_action == 'plan' || inputs.tg_action == 'destroy'
10 changes: 8 additions & 2 deletions .github/docs/README.md
@@ -51,7 +51,7 @@ If you are unsure, the live `aws/oidc` stack in the target environment is the so
- `release.yml`
Creates release tags, prepares shared CI artifacts, builds release outputs, and publishes the GitHub release. Version bumps come from a repo-local action that scans commit subjects since the latest semver tag and matches configurable major/minor/patch prefixes.
- `pull_request.yml`
-  Provides fast validation for workflow syntax, Terraform formatting/linting, changed runtime builds, and a direct execution check of the repo-local `get-next-version` Docker action. The version preview job classifies the PR title, so it reflects the version that would be implied if that PR title lands on `main`. Its `check` job runs the repo-local `get-changes` Docker action directly, using the PR base SHA for a PR-style `base...HEAD` diff. When `.github/actions/**` changed, the workflow reuses `shared_directories_get.yml` to discover action directories with `Dockerfile`s and runs a Docker unit-test matrix for them after the GitHub formatting job. The Lambda naming check only runs when Lambda sources changed, and the ECS task/service pair check runs when container sources or Terragrunt live-stack directories changed; each is an explicit prerequisite for the corresponding build job.
+  Provides fast validation for workflow syntax, Terraform formatting/linting, changed runtime builds, and a direct execution check of the repo-local `get-next-version` Docker action. The version preview job classifies the PR title, so it reflects the version that would be implied if that PR title lands on `main`. Its `check` job runs the repo-local `get-changes` Docker action directly, using the PR base SHA for a PR-style `base...HEAD` diff. When `.github/actions/**` changed, the workflow reuses `shared_directories_get.yml` to discover action directories with `Dockerfile`s and runs a Docker unit-test matrix for them after the GitHub formatting job. The Lambda naming check only runs when Lambda sources changed, and the ECS task/service pair check runs when container sources or Terragrunt live-stack directories changed; each is an explicit prerequisite for the corresponding build job. Terragrunt installation in that workflow now uses `gruntwork-io/terragrunt-action@v3`.
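
The version-bump rule described for `release.yml` (scan commit subjects since the latest semver tag, then match major/minor/patch prefixes) might look roughly like the shell sketch below. The prefix values `breaking:`, `feat:`, and `fix:` and both function names are assumptions for illustration only. The repo-local action is described as configurable and has its own Python entrypoint, so this shows the rule, not its implementation.

```shell
# Hypothetical prefixes; the source only says they are configurable.
MAJOR_PREFIX="${MAJOR_PREFIX:-breaking:}"
MINOR_PREFIX="${MINOR_PREFIX:-feat:}"
PATCH_PREFIX="${PATCH_PREFIX:-fix:}"

# Read commit subjects on stdin and print the highest bump level seen:
# major > minor > patch > none.
highest_bump() {
  level=none
  while IFS= read -r subject; do
    case "$subject" in
      "$MAJOR_PREFIX"*)
        level=major ;;
      "$MINOR_PREFIX"*)
        if [ "$level" != major ]; then level=minor; fi ;;
      "$PATCH_PREFIX"*)
        if [ "$level" = none ]; then level=patch; fi ;;
    esac
  done
  printf '%s\n' "$level"
}

# Apply a bump level to a MAJOR.MINOR.PATCH version (a leading "v" is dropped).
bump_version() {
  version=${1#v}
  bump=$2
  IFS=. read -r major minor patch <<EOF
$version
EOF
  case "$bump" in
    major) major=$((major + 1)); minor=0; patch=0 ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    patch) patch=$((patch + 1)) ;;
  esac
  printf '%s.%s.%s\n' "$major" "$minor" "$patch"
}
```

A caller could feed it real history with something like `git log --format=%s "$LATEST_TAG..HEAD" | highest_bump`, where `LATEST_TAG` is a placeholder for however the latest semver tag is resolved.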

The local version action can also be tested outside GitHub Actions, either by running the Python entrypoint directly or through its dedicated Docker image.

@@ -128,7 +128,7 @@ flowchart LR
### Cleanup And Discovery

- `destroy.yml`
-  Tears down app layers before shared dependencies, including the shared observability dashboard and any environment-owned shared artifact stacks such as the `dev` code bucket.
+  Tears down app layers before shared dependencies, including the shared observability dashboard and any environment-owned shared artifact stacks such as the `dev` code bucket. The workflow-dispatch input `allow_prod_cleanup` now gates every cleanup or destroy job that is normally skipped for `prod`, including the `Code Bucket`, `ECR`, and final tagged-resource cleanup jobs. After the main graph completes, the workflow first counts tagged leftovers through `justfile.destroy`, prints a warning only when any remain, and then runs the cleanup recipe. That cleanup currently deletes leaked Cognito user pools, deregisters and then deletes leaked ECS task-definition revisions, deletes leftover ECS clusters, and force-deletes leftover Secrets Manager secrets, then validates the remaining tagged ARNs against the underlying service APIs rather than treating the tagging index as the source of truth. Already-removed Cognito pools, ECS task-definition revisions, ECS clusters, or Secrets Manager secrets are treated as successful no-ops so stale tagging API results do not fail cleanup. `prod` runs that same path only when `allow_prod_cleanup` is enabled, and the workflow prints a conspicuous warning first.
- `shared_directories_get.yml`
Derives the directory-based matrices used by wrapper workflows and PR action-test discovery.
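
The "already removed counts as a successful no-op" behaviour described for `destroy.yml` can be sketched as a small shell wrapper. This is an assumption-laden illustration, not the contents of `justfile.destroy`: the function name `delete_if_present` and the `NOT_FOUND_PATTERN` list of error names are hypothetical, and the real recipe may detect missing resources differently.

```shell
# Hypothetical list of error names treated as "resource is already gone".
NOT_FOUND_PATTERN="${NOT_FOUND_PATTERN:-ResourceNotFoundException|ClusterNotFoundException}"

# Run a delete command. Succeed if it succeeds, or if it fails only because
# the resource no longer exists; fail on any other error.
delete_if_present() {
  # Capture stderr only; the command's stdout is discarded.
  if err=$("$@" 2>&1 >/dev/null); then
    return 0
  fi
  if printf '%s\n' "$err" | grep -Eq "$NOT_FOUND_PATTERN"; then
    printf 'already removed, skipping: %s\n' "$*" >&2
    return 0
  fi
  printf '%s\n' "$err" >&2
  return 1
}

# Example call shape (needs AWS credentials, so it is left commented out):
# delete_if_present aws secretsmanager delete-secret \
#   --secret-id "$SECRET_ARN" --force-delete-without-recovery
```

Wrapping each service-specific deletion this way keeps a stale entry in the tagging index from failing the whole cleanup, while real errors such as missing permissions still surface.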

@@ -144,6 +144,8 @@ Run these checks on every CI, workflow, or deploy-contract change.
- the repo-local `./.github/actions/terragrunt` action supports `tg_action: plan` for producing the binary plan locally; it renders `terragrunt.plan.txt` and writes `terragrunt.plan.meta.json` via `justfile.tg` (`terragrunt-plan-render`)
- `./.github/actions/terragrunt` always uploads per-stack plan artifacts on `plan` and always downloads them on `apply_plan`, using the caller-provided `PLAN_ARTIFACT_S3_PREFIX` environment variable, so graph executors like `shared_infra.yml` do not need separate `./.github/actions/just` steps for those transfers
- both repo-local composite actions, `./.github/actions/just` and `./.github/actions/terragrunt`, now assume AWS credentials are already configured in the current job when they need AWS access. The repo pattern is to run `aws-actions/configure-aws-credentials` at the top of each AWS-using job and then call the local actions without extra auth inputs
+- `./.github/actions/just` installs the requested `just` version through `extractions/setup-crate@v2` in the same minimal composite-action shape as `extractions/setup-just`, rather than depending on `extractions/setup-just` itself
+- `./.github/actions/terragrunt` installs the requested Terragrunt version through `gruntwork-io/terragrunt-action@v3`, passing `tf_path: terraform` so the repo keeps using the separately pinned Terraform binary from `hashicorp/setup-terraform`
- saved infra-plan storage is intentionally split into two levels:
- one run-level metadata file at `<plan_artifact_s3_prefix>/infra-plan-metadata/plan-metadata.json`
- one per-stack plan bundle under `<plan_artifact_s3_prefix>/terragrunt-plan-<sanitized-tg-directory>/`
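
To make the two-level layout above concrete, here is a small shell sketch that derives both locations from a prefix and a Terragrunt directory. The sanitizing rule (replace every non-alphanumeric character with `-`) is purely an assumption, since the source does not show how `<sanitized-tg-directory>` is produced, and the function names are hypothetical.

```shell
# Hypothetical sanitizer: the real rule is not shown in the source.
sanitize_tg_directory() {
  printf '%s\n' "$1" | sed 's/[^A-Za-z0-9]/-/g'
}

# Run-level metadata file: one per run.
plan_metadata_key() {
  printf '%s/infra-plan-metadata/plan-metadata.json\n' "${1%/}"
}

# Per-stack plan bundle prefix: one per Terragrunt directory.
plan_bundle_prefix() {
  printf '%s/terragrunt-plan-%s/\n' "${1%/}" "$(sanitize_tg_directory "$2")"
}

# Example with a made-up prefix:
plan_metadata_key "plans/run-42"
plan_bundle_prefix "plans/run-42" "infra/live/dev/aws/cluster"
```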
@@ -160,6 +162,7 @@ Run these checks on every CI, workflow, or deploy-contract change.
- `justfile.ci` for read-only CI helpers
- `justfile.tg` for Terragrunt plan artifact helpers (render/upload/download)
- `justfile.deploy` for mutating CI build and deploy steps
+- `justfile.destroy` for explicit teardown and post-destroy cleanup steps

### Release Tagging Checks

@@ -206,6 +209,9 @@ Run these checks on every CI, workflow, or deploy-contract change.
- confirm destroy ordering still removes downstream consumers before shared stacks
- check required Terraform variables on destroy as well as apply
- prefer depending on real downstream consumers rather than serializing unrelated shared stacks
+- when a module creates manual backup artifacts outside Terraform ownership, decide explicitly whether destroy should delete or retain them by environment
+- if destroy relies on a final tagged-resource sweep, keep both the scan/count step and the cleanup step in `justfile.destroy`, and fail the workflow on unsupported tagged leftovers so new leak classes are visible
+- if destroy relies on a final tagged-resource sweep, make sure the deploy OIDC role also allows `tag:GetResources`; the cleanup path uses the Resource Groups Tagging API before running service-specific deletions
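
One way to make "fail on unsupported tagged leftovers" concrete is to classify each leftover ARN by service and resource type, and return a non-zero status when any ARN falls outside the classes the cleanup handles. The sketch below assumes the four classes named in the `destroy.yml` description (Cognito user pools, ECS task definitions, ECS clusters, Secrets Manager secrets). The function names and labels are hypothetical and are not taken from `justfile.destroy`.

```shell
# Map an ARN to a cleanup class, or "unsupported" if no handler is known.
classify_arn() {
  case "$1" in
    arn:*:cognito-idp:*:*:userpool/*)  echo cognito-user-pool ;;
    arn:*:ecs:*:*:task-definition/*)   echo ecs-task-definition ;;
    arn:*:ecs:*:*:cluster/*)           echo ecs-cluster ;;
    arn:*:secretsmanager:*:*:secret:*) echo secret ;;
    *)                                 echo unsupported ;;
  esac
}

# Read ARNs on stdin, report unsupported ones on stderr, and return non-zero
# if at least one was found so the calling workflow step fails.
report_unsupported() {
  status=0
  while IFS= read -r arn; do
    [ -n "$arn" ] || continue
    if [ "$(classify_arn "$arn")" = unsupported ]; then
      printf 'unsupported tagged leftover: %s\n' "$arn" >&2
      status=1
    fi
  done
  return "$status"
}
```

Failing loudly on the fall-through case is what makes a new leak class visible instead of being silently skipped.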

## Wrapper Workflow Summary

61 changes: 59 additions & 2 deletions .github/workflows/destroy.yml
@@ -10,6 +10,11 @@ on:
         options:
           - dev
           - prod
+      allow_prod_cleanup:
+        description: "Also run prod-only cleanup and destroy jobs that are normally skipped for prod"
+        required: true
+        type: boolean
+        default: false

 concurrency: # only run destroy when no other deploy/destroy is running for the same environment
   group: deploy-${{ inputs.environment }}
@@ -258,7 +263,7 @@ jobs:

   build-bucket:
     name: Code Bucket
-    if: inputs.environment != 'prod'
+    if: inputs.environment != 'prod' || inputs.allow_prod_cleanup
     needs:
       - lambdas
     runs-on: ubuntu-latest
@@ -278,7 +283,7 @@ jobs:

   ecr:
     name: ECR
-    if: inputs.environment != 'prod'
+    if: inputs.environment != 'prod' || inputs.allow_prod_cleanup
     needs:
       - network
     runs-on: ubuntu-latest
@@ -316,3 +321,55 @@ jobs:
         with:
           tg_directory: infra/live/${{ inputs.environment }}/aws/cluster
           tg_action: destroy
+
+  cleanup:
+    name: Cleanup
+    if: inputs.environment != 'prod' || inputs.allow_prod_cleanup
+    needs:
+      - observability
+      - cognito
+      - security
+      - build-bucket
+      - ecr
+      - cluster
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v6
+
+      - uses: aws-actions/configure-aws-credentials@v6
+        with:
+          role-to-assume: ${{ env.AWS_OIDC_ROLE_ARN }}
+          aws-region: ${{ env.AWS_REGION }}
+
+      - name: Warn before prod tagged cleanup
+        if: inputs.environment == 'prod'
+        run: |
+          printf '\033[1;31m%s\033[0m\n' 'WARNING: running retained tagged-resource cleanup for prod.'
+          printf '%s\n' 'This may delete leaked Cognito user pools and deregister tagged ECS task-definition revisions.'
+
+      - name: Count tagged resources
+        id: count-tagged
+        uses: ./.github/actions/just
+        env:
+          ENVIRONMENT: ${{ inputs.environment }}
+          PROJECT_NAME: ${{ vars.PROJECT_NAME }}
+          ALLOW_PROD_CLEANUP: ${{ inputs.allow_prod_cleanup }}
+        with:
+          justfile_path: justfile.destroy
+          just_action: count-tagged-resources
+
+      - name: Warn when tagged resources remain
+        if: steps.count-tagged.outputs.just_outputs != '0'
+        run: |
+          printf '\033[1;33m%s\033[0m\n' "WARNING: found ${{ steps.count-tagged.outputs.just_outputs }} tagged resources after destroy."
+
+      - name: Cleanup tagged resources
+        if: steps.count-tagged.outputs.just_outputs != '0'
+        uses: ./.github/actions/just
+        env:
+          ENVIRONMENT: ${{ inputs.environment }}
+          PROJECT_NAME: ${{ vars.PROJECT_NAME }}
+          ALLOW_PROD_CLEANUP: ${{ inputs.allow_prod_cleanup }}
+        with:
+          justfile_path: justfile.destroy
+          just_action: cleanup-tagged-resources
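
The two conditions that drive the new `cleanup` job can be restated as a small shell truth table, which may help when reviewing the gating. This is only a paraphrase of the `if:` expressions in the workflow, with hypothetical function names. It does not model how GitHub Actions evaluates expressions or skipped `needs` jobs.

```shell
# Mirrors: if: inputs.environment != 'prod' || inputs.allow_prod_cleanup
should_run_cleanup() {
  environment=$1
  allow_prod_cleanup=$2
  [ "$environment" != prod ] || [ "$allow_prod_cleanup" = true ]
}

# Mirrors: if: steps.count-tagged.outputs.just_outputs != '0'
# Anything other than the literal string 0 counts as "leftovers remain",
# including an empty value.
has_leftovers() {
  [ "$1" != 0 ]
}

for env in dev prod; do
  for allow in false true; do
    if should_run_cleanup "$env" "$allow"; then result=run; else result=skip; fi
    printf '%s allow_prod_cleanup=%s -> %s\n' "$env" "$allow" "$result"
  done
done
```

Under this reading, only the `prod` plus `allow_prod_cleanup=false` combination skips the job. The count check compares against the string `0`, so an empty count output would also trigger the warning and cleanup steps.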
5 changes: 3 additions & 2 deletions .github/workflows/pull_request.yml
@@ -152,9 +152,10 @@ jobs:
     steps:
       - uses: actions/checkout@v6
       - uses: hashicorp/setup-terraform@v4
-      - uses: autero1/action-terragrunt@v3.0.2
+      - uses: gruntwork-io/terragrunt-action@v3
         with:
-          terragrunt-version: 0.45.10
+          tg_version: 0.45.10
+          tf_path: terraform

       - name: Terraform fmt check
         run: terraform fmt -check -recursive
1 change: 1 addition & 0 deletions infra/live/global_vars.hcl
@@ -26,6 +26,7 @@ locals {
     "acm:*",
     "route53:*",
     "cognito-idp:*",
+    "tag:GetResources",
   ]
   code_artifact_expiration_days = 0
   infra_plan_artifact_expiration_days = 30
2 changes: 2 additions & 0 deletions justfile
@@ -7,6 +7,8 @@ _default:
     @just --justfile justfile.tg --list
     @printf '\nDeploy recipes (`just --justfile justfile.deploy --list`):\n'
     @just --justfile justfile.deploy --list
+    @printf '\nDestroy recipes (`just --justfile justfile.destroy --list`):\n'
+    @just --justfile justfile.destroy --list


 PROJECT_DIR := justfile_directory()
1 change: 1 addition & 0 deletions justfile.deploy
@@ -667,3 +667,4 @@ ecs-rolling-deploy:
         --services "$SERVICE_NAME"

     echo "✅ ECS rolling deployment completed for $SERVICE_NAME"
+