fix: Tag platform images via ECR API to fix wrapper bug in private ECR#759
fix: Tag platform images via ECR API to fix wrapper bug in private ECR#759wangzlei wants to merge 1 commit into
Conversation
The release workflow was producing $VERSION-amd64 and $VERSION-arm64 tags that pointed at single-entry image indexes (wrappers) instead of the platform image manifests, because docker/build-push-action's default buildx behavior bundles each push with provenance attestations inside an index. This change: - Removes the per-platform build/push steps (the source of the wrapper tags) - Adds an ECR-API retag step that reads the multi-arch manifest index and applies $VERSION-amd64 / $VERSION-arm64 directly to the platform digests via aws ecr put-image - Tags attestation manifests by their sha hex Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| echo "ERROR: failed to fetch ${arch} platform manifest at ${DIGEST}" | ||
| exit 1 | ||
| fi | ||
| aws ecr put-image \ |
There was a problem hiding this comment.
aws ecr put-image will fail with ImageAlreadyExistsException if this workflow is re-run for the same version (e.g., after a transient failure in a later step like publish-layer-prod). Consider adding || true or catching this specific error to make the step idempotent:
aws ecr put-image \
--repository-name "${REPO_NAME}" \
--image-tag "v${VERSION}-${arch}" \
--image-manifest "$MANIFEST" \
--region "${REGION}" > /dev/null 2>&1 || {
echo "Tag v${VERSION}-${arch} may already exist (expected on re-runs)"
}The same applies to the attestation put-image call below.
| ${{ env.RELEASE_PUBLIC_REPOSITORY }}:v${{ env.VERSION }} | ||
| ${{ env.RELEASE_PRIVATE_REPOSITORY }}:v${{ env.VERSION }} | ||
|
|
||
| - name: Tag platform images in private ECR |
There was a problem hiding this comment.
The repository name is hardcoded here, duplicating the suffix of RELEASE_PRIVATE_REPOSITORY (020628701572.dkr.ecr.us-west-2.amazonaws.com/adot-autoinstrumentation-python). If the repository name in that env var ever changes, this would silently diverge.
Consider deriving it from the existing env var to keep a single source of truth:
REPO_NAME: ${{ env.RELEASE_PRIVATE_REPOSITORY && ... }}Or extract it in the script:
REPO_NAME="${RELEASE_PRIVATE_REPOSITORY##*/}"| exit 1 | ||
| fi | ||
|
|
||
| # Tag platform images (amd64, arm64) with v$VERSION-$arch |
There was a problem hiding this comment.
This jq filter could return multiple lines if a manifest index contains more than one entry for the same architecture (e.g., image variants or platform features). If that happens, DIGEST would be a multi-line string and the subsequent batch-get-image call would fail with a confusing error.
Consider adding | first (jq 1.6+) or | limit(1; ...) to ensure a single value, or validating the result:
DIGEST=$(echo "$MANIFEST_INDEX" | jq -r --arg a "$arch" '[.manifests[] | select(.platform.architecture == $a)] | first | .digest')Low risk for this project in practice, but worth hardening.
| REPO_NAME: adot-autoinstrumentation-python | ||
| VERSION: ${{ env.VERSION }} | ||
| REGION: ${{ env.AWS_PRIVATE_ECR_REGION }} | ||
| run: | |
There was a problem hiding this comment.
The AWS credentials were last configured for public ECR (region us-east-1, line 119-122) before this step runs. The batch-get-image and put-image calls target private ECR in us-west-2. While this step passes --region explicitly to each AWS CLI call (which handles the region correctly), the IAM role assumed at line 119 is for public ECR access.
The docker/build-push-action step likely succeeds because Docker auth tokens from both login steps are still cached. However, the ECR API calls here need IAM permissions on the private registry (020628701572.dkr.ecr.us-west-2).
Please verify that the role assumed at "Configure AWS credentials for public ECR" (line 119) has ecr:BatchGetImage and ecr:PutImage permissions on the private repository. If it does not, you will need to re-assume the private ECR role before this step or reorder the credential configuration steps so the private ECR credentials are active when this step runs.
Summary
docker/build-push-actionpush to both private and public ECRv$VERSION-amd64/v$VERSION-arm64directly to the platform digests viaaws ecr put-imageThis is the same fix applied in:
Test plan
v$VERSION-amd64/v$VERSION-arm64tags in private ECR point to platform manifests (not wrapper indexes)v$VERSIONmulti-arch tag still works correctly in both public and private ECR🤖 Generated with Claude Code