BUILD-10724: Import GitHub cache to S3 (migration fallback) #45

Merged
julien-carsique-sonarsource merged 1 commit into master from feat/jcarsique/BUILD-10724-migrationGh2s3
Mar 24, 2026

@julien-carsique-sonarsource julien-carsique-sonarsource commented Mar 18, 2026

Summary

When migrating from GitHub cache to S3, the S3 bucket starts empty. This causes all runners to re-download dependencies from scratch until their first S3 cache save completes. This PR addresses that by automatically importing existing GitHub cache entries into S3 during the migration window.

Changes

action.yml — new import-github-cache input + migration steps

New input: import-github-cache (default: enabled)

Resolution order (mirrors the existing CACHE_BACKEND / backend pattern):

  1. Action input import-github-cache
  2. Environment variable CACHE_IMPORT_GITHUB (can be set from a repo variable via ${{ vars.CACHE_IMPORT_GITHUB }})
  3. Default: true
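
The resolution order above can be sketched as a small function. This is only an illustration in Python, not the action's actual implementation: the function name is hypothetical, and it assumes that an unset or empty value means "not specified" and that only the literal string `false` (case-insensitive) disables the import.

```python
from typing import Optional


def resolve_import_mode(action_input: Optional[str], env_value: Optional[str]) -> bool:
    """Return True when the GitHub import fallback should be active.

    Precedence: action input, then CACHE_IMPORT_GITHUB, then the default (True).
    Assumption: empty or missing values fall through to the next source.
    """
    for candidate in (action_input, env_value):
        if candidate is not None and candidate.strip() != "":
            # Assumption: anything other than "false" counts as enabled.
            return candidate.strip().lower() != "false"
    return True
```

Under these assumptions, an explicit input always wins over the environment variable, and the environment variable is only consulted when the input is left unset.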

New steps (S3 path only):

| Step | Description |
| --- | --- |
| Determine GitHub cache import mode | Resolves the effective import mode from input → env var → default |
| Import GitHub cache to S3 (migration fallback) | Runs actions/cache/restore with the original unprefixed key when S3 misses and import mode is active. The S3 post-job save then persists the restored content to S3. |
| Enforce fail-on-cache-miss after GitHub import fallback | Fails explicitly only after both S3 and GitHub have been tried, so fail-on-cache-miss: true is still respected. |

fail-on-cache-miss and lookup-only are correctly propagated to the GitHub fallback step.

.github/workflows/check-cache-migration.yml — migration completion detector

Manually triggered (workflow_dispatch) workflow to determine when the migration is complete and automatically opt out of the import fallback.

It:

  1. Lists GitHub cache entries (paginated), filtering to long-lived branches (main, master, branch-*, dogfood-on-*, feature/long/*) and excluding transient keys (build-number-*, mise-*)
  2. Lists S3 objects (paginated via AWS CLI)
  3. Compares: for each included GitHub entry, checks for {ref}/{key} in S3
  4. If 100% of entries are present in S3: creates or updates repository variable CACHE_IMPORT_GITHUB=false, disabling the fallback automatically
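
The filtering and comparison steps above can be sketched roughly as follows. This is a simplified Python illustration, not the workflow's code: the function names are hypothetical, the entries are assumed to arrive as already-paginated `(ref, key)` pairs with `ref` being a short branch name, and the S3 listing is assumed to be a set of object keys.

```python
from fnmatch import fnmatchcase

# Patterns taken from the PR description.
BRANCH_PATTERNS = ("main", "master", "branch-*", "dogfood-on-*", "feature/long/*")
TRANSIENT_KEY_PATTERNS = ("build-number-*", "mise-*")


def is_tracked(ref, key):
    """True for entries on long-lived branches whose key is not transient."""
    on_long_lived = any(fnmatchcase(ref, p) for p in BRANCH_PATTERNS)
    transient = any(fnmatchcase(key, p) for p in TRANSIENT_KEY_PATTERNS)
    return on_long_lived and not transient


def missing_in_s3(github_entries, s3_keys):
    """Return tracked (ref, key) pairs that have no {ref}/{key} object in S3."""
    return [
        (ref, key)
        for ref, key in github_entries
        if is_tracked(ref, key) and f"{ref}/{key}" not in s3_keys
    ]


def migration_complete(github_entries, s3_keys):
    """True when every tracked GitHub entry is present in S3."""
    return not missing_in_s3(github_entries, s3_keys)
```

In this sketch an empty list of tracked entries counts as complete; the PR description does not say whether the real workflow treats that case the same way.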

Behaviour matrix

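| S3 hit | Import mode | GH hit | Result | fail-on-cache-miss |
| --- | --- | --- | --- | --- |
| yes | active | n/a | S3 content used, no GH attempt | per input |
| no | active | yes | GH content restored → saved to S3 post-job | pass |
| no | active | no | miss | fail if flag set |
| yes | inactive | n/a | S3 content used | per input |
| no | inactive | n/a | miss | per input (unchanged) |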
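
One way to read the matrix above is as a single decision function. The following Python sketch is only an illustration of that reading, with a hypothetical function name; it returns which backend supplied the cache and whether the job should fail.

```python
def restore_outcome(s3_hit, import_active, gh_hit, fail_on_cache_miss):
    """Return (cache_source, job_fails) for one restore attempt.

    cache_source is "s3", "github", or None on a miss in every backend tried.
    """
    if s3_hit:
        # S3 wins outright; GitHub is never consulted.
        return ("s3", False)
    if import_active and gh_hit:
        # Fallback hit: the post-job S3 save then persists this content.
        return ("github", False)
    # Miss everywhere that was tried: fail only if the flag asks for it.
    return (None, fail_on_cache_miss)
```

Note that `gh_hit` is ignored when import mode is inactive, which matches the rows where no GitHub attempt is made.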

Testing

Jira

BUILD-10724 — child of BUILD-10684

@julien-carsique-sonarsource julien-carsique-sonarsource requested a review from a team as a code owner March 18, 2026 17:13

sonar-review-alpha bot commented Mar 18, 2026

Summary

Implements a migration bridge for switching from GitHub cache to S3. When S3 backend is active but a cache entry doesn't exist in S3, the action falls back to restoring from GitHub Actions cache (if import mode is enabled) and then persists that to S3 for future runs. Includes a detection workflow that automatically disables the fallback once migration is complete. This prevents cache invalidation during the migration window while providing a way to track progress and opt out when ready.

What reviewers should know

Start here:

  1. action.yml lines 126–168: the core logic — note how fail-on-cache-miss is suppressed on the S3 step (line 168) when import mode is active, then enforced later only if both backends miss (lines 172–186)
  2. .github/workflows/test-cache-migration-gh2s3.yml: covers the four key scenarios (import enabled, import disabled via input, import disabled via env var, S3 hit with import enabled)
  3. .github/workflows/check-cache-migration.yml: the gradual-rollout mechanism — this must be copied into each consuming repo to track migration status

Non-obvious decisions:

  • fail-on-cache-miss is intentionally suppressed on the S3 step to avoid failing before GitHub is tried. The explicit fail step (lines 172–186) happens only after both backends have been attempted.
  • S3 keys can be stored in two forms (refs/heads/main/key vs main/key) depending on the event type; the migration detector checks both (lines 121–124)
  • The test workflow provisions caches in specific order: GitHub entries first, then an S3 entry that shadows one of the GitHub entries — this tests that S3 hits bypass GitHub entirely.
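
The two S3 key forms mentioned above can be illustrated with a short Python sketch. The helper names are hypothetical, and the sketch only covers branch refs under `refs/heads/`; how the real detector handles other ref types is not described here.

```python
REF_PREFIX = "refs/heads/"


def s3_key_candidates(ref, key):
    """Return both possible S3 object keys for a branch ref and cache key."""
    short = ref[len(REF_PREFIX):] if ref.startswith(REF_PREFIX) else ref
    return [f"{REF_PREFIX}{short}/{key}", f"{short}/{key}"]


def present_in_s3(ref, key, s3_keys):
    """True if either key form exists in the set of S3 object keys."""
    return any(candidate in s3_keys for candidate in s3_key_candidates(ref, key))
```

Checking both forms means an entry saved under `main/key` and one saved under `refs/heads/main/key` are both counted as migrated.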

Watch for:

  • The migration detector workflow is disabled by default and must be manually triggered or integrated into consuming repos via copy-paste (documented in README.md lines 406–440)
  • Resolution order for import mode (import-github-cache input → CACHE_IMPORT_GITHUB env var → true) mirrors the existing backend resolution pattern

@julien-carsique-sonarsource julien-carsique-sonarsource marked this pull request as draft March 18, 2026 17:13

@sonar-review-alpha sonar-review-alpha bot left a comment

Conclusion: The migration logic is sound and fits the existing patterns well, but there is one security issue that needs fixing before merge.

SonarCloud recommendations: Even though action.yml:191 is not failing the quality gate, I strongly recommend fixing it because ${{ inputs.key }} is interpolated directly into the shell run: block inside double quotes — a key value containing a " character can break out of the string and execute arbitrary shell commands. Fix by passing the key through an environment variable:

env:
  CACHE_KEY: ${{ inputs.key }}
run: |
  echo "::error::Cache miss: no cache found in S3 or GitHub for key '${CACHE_KEY}'"
  exit 1
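
To see why the environment-variable form is safer, the following Python sketch simulates the text substitution only; it never runs a shell. The function names and the malicious key are made up for illustration, and the template is modelled on the error message in the fix above.

```python
# Script text as it would look with the expression placed inline.
INLINE_TEMPLATE = (
    'echo "::error::Cache miss: no cache found in S3 or GitHub for key \'{key}\'"'
)
# Script text with the key read from the environment at run time.
ENV_SCRIPT = (
    'echo "::error::Cache miss: no cache found in S3 or GitHub for key \'${CACHE_KEY}\'"'
)


def render_inline(key):
    """Mimic inline expansion: the key becomes part of the script source."""
    return INLINE_TEMPLATE.replace("{key}", key)


def render_with_env(key):
    """Mimic the fix: the script source is constant, the key travels as data."""
    return ENV_SCRIPT, {"CACHE_KEY": key}


malicious_key = 'x"; touch /tmp/pwned; echo "'
print(render_inline(malicious_key))
print(render_with_env(malicious_key)[0])
```

With inline expansion the attacker-controlled text, including the closing quote and the extra command, ends up in the script that the shell parses. With the environment variable the script text is the same for every key, and the shell only sees the key as the value of `CACHE_KEY`.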

@julien-carsique-sonarsource julien-carsique-sonarsource force-pushed the feat/jcarsique/BUILD-10724-migrationGh2s3 branch 3 times, most recently from 74563fd to 261d1d0 Compare March 23, 2026 18:04
@julien-carsique-sonarsource julien-carsique-sonarsource marked this pull request as ready for review March 23, 2026 18:08

@sonar-review-alpha sonar-review-alpha bot left a comment

Conclusion: The migration logic in action.yml is sound and handles the fail-on-cache-miss edge cases correctly, but the completion-detection workflow has a pagination bug that would prevent it from ever auto-disabling the fallback in production.
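
The review does not spell out the bug here, so the following is only a generic Python sketch of draining a token-paginated listing, not the workflow's code. `fetch_page` is a hypothetical callable that returns one page of items plus the token for the next page, or `None` when there are no more pages. A listing that stops after the first page would make entries on later pages look missing, which is one way a completion check could end up never reaching 100%.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# One page of results: (items, next_token); next_token is None on the last page.
Page = Tuple[List[str], Optional[str]]


def list_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every item across all pages, following continuation tokens."""
    token: Optional[str] = None
    while True:
        items, token = fetch_page(token)
        yield from items
        if token is None:
            break
```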

@julien-carsique-sonarsource julien-carsique-sonarsource force-pushed the feat/jcarsique/BUILD-10724-migrationGh2s3 branch from 261d1d0 to c0f18eb Compare March 23, 2026 18:20

@sonar-review-alpha sonar-review-alpha bot left a comment

Conclusion: The new commit adds solid test coverage (content verification, the S3-hit scenario, and key fixes like dropping ${{ runner.os }} from keys). However, the S3 pagination bug flagged in the previous review remains unaddressed.

@julien-carsique-sonarsource julien-carsique-sonarsource force-pushed the feat/jcarsique/BUILD-10724-migrationGh2s3 branch from c0f18eb to 2afad67 Compare March 23, 2026 18:34

@sonar-review-alpha sonar-review-alpha bot left a comment

LGTM! ✅

@julien-carsique-sonarsource julien-carsique-sonarsource force-pushed the feat/jcarsique/BUILD-10724-migrationGh2s3 branch from 2afad67 to 0d796ab Compare March 24, 2026 09:13

@sonar-review-alpha sonar-review-alpha bot left a comment

Conclusion: The action.yml logic is solid and the fail-on-cache-miss deferral pattern is correctly implemented. However, the migration detector has a key-format mismatch that will cause it to permanently misreport migration status for any caches saved from PR-triggered workflows on dogfood-on-* or feature/long/* branches.

@julien-carsique-sonarsource julien-carsique-sonarsource force-pushed the feat/jcarsique/BUILD-10724-migrationGh2s3 branch from 0d796ab to bfd7572 Compare March 24, 2026 10:28

@sonar-review-alpha sonar-review-alpha bot left a comment

Conclusion: One real bug in the fail-on-cache-miss enforcement path that needs fixing before merge. The rest of the implementation — the three-step restore flow, credential handling, migration checker, and test coverage — is solid.

@sonar-review-alpha sonar-review-alpha bot left a comment

Conclusion: Both previously raised bugs are resolved. The PR is clean and ready to merge.

@bwalsh434 bwalsh434 left a comment

LGTM, this is super clever, nice job!

…tion mode)

When the S3 backend is used and the S3 cache misses, automatically attempt to restore the cache from GitHub using the original unprefixed key.
The S3 post-job step will then save the restored content to S3, pre-provisioning it for subsequent runs.

The feature is enabled by default. Resolution order to disable it:
  1. Action input `import-github-cache: 'false'`
  2. Environment variable `CACHE_IMPORT_GITHUB=false`
  3. Default: true

`fail-on-cache-miss` and `lookup-only` are propagated to the GitHub fallback step.
When `fail-on-cache-miss` is set and import mode is active, failure is deferred until both S3 and GitHub have been tried.

Also adds `.github/workflows/check-cache-migration.yml`: a manually-triggered workflow that compares GitHub cache entries to S3 objects across
target branches (main, master, branch-*, dogfood-on-*, feature/long/*), ignoring transient keys (build-number-*, mise-*). When 100% of entries
are found in S3, it automatically sets the CACHE_IMPORT_GITHUB=false repository variable to disable the import fallback (this requires the
`CACHE_IMPORT_GITHUB` environment variable to be set from the repository variable via `${{ vars.CACHE_IMPORT_GITHUB }}`).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@julien-carsique-sonarsource julien-carsique-sonarsource force-pushed the feat/jcarsique/BUILD-10724-migrationGh2s3 branch from 0e17479 to 762278a Compare March 24, 2026 15:33

@sonar-review-alpha sonar-review-alpha bot left a comment

Both previously flagged bugs are fixed. No new issues found.

@julien-carsique-sonarsource julien-carsique-sonarsource merged commit 6a7c382 into master Mar 24, 2026
20 checks passed
@julien-carsique-sonarsource julien-carsique-sonarsource deleted the feat/jcarsique/BUILD-10724-migrationGh2s3 branch March 24, 2026 15:38