hotfix: pin docker-compose to GHCR :1.4.0 to restore production#150
Merged
Production at healthlog.bombeck.io has been returning 503 from Traefik
("no available server") since the v1.4.1 deploys started landing on
apps-01. The container boots — Next.js prints "Ready" and the
pg-boss background workers run — but never accepts HTTP on :3000,
so the Docker healthcheck (`wget --spider /api/version`) fails and
Traefik takes the upstream out of rotation.
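For orientation, a healthcheck of the kind described above could be declared roughly as follows. This is a sketch rather than the project's actual compose file: the `wget --spider /api/version` probe and port 3000 come from this description, while the service name `app`, the `localhost` URL, and all timing values are assumptions.

```yaml
services:
  app:  # service name assumed
    healthcheck:
      # Probe named in the description; the full URL is an assumption.
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/version"]
      interval: 30s      # assumed value
      timeout: 5s        # assumed value
      retries: 3         # assumed value
      start_period: 20s  # assumed value
```

With a check like this, a container that boots but never listens on :3000 is marked unhealthy, which matches the symptom of Traefik dropping the upstream.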
Locally the v1.4.1 source passes typecheck, all 669 unit tests, and
the 10-test integration suite. The runtime regression only surfaces
in the Coolify-built image. Suspected cause: layer-cache corruption
left over from the failed PR #146 deploy at 14:08 (which was OOM-killed
during the builder COPY step), or a build interaction between the
new dev-deps (@playwright/test, @axe-core/playwright, testcontainers)
and Next.js standalone bundling. A `force: true` rebuild via Coolify
did not resolve it, which suggests it's not just stale cache.
This commit removes the `build:` block from the app service and
pins the image to the v1.4.0 GHCR tag — the last release verified
healthy on production. Coolify will pull the multi-arch image and
run it directly. Site comes back up immediately.
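The resulting service definition would look roughly like the fragment below. It is a minimal sketch, not the actual diff: the repository path `ghcr.io/mbombeck/healthlog` is taken from the :1.4.1 image reference in the follow-up commit message and is assumed to be the same for :1.4.0, and every other key of the real service is omitted.

```yaml
services:
  app:
    # No `build:` block, so the image is pulled instead of built from source.
    # Repository path assumed to match the published :1.4.1 image.
    image: ghcr.io/mbombeck/healthlog:1.4.0
```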
The v1.4.1 fixes are NOT lost: the source still ships in main, the
GHCR :1.4.1 image was built successfully by the docker-publish
workflow, and we will re-pin once the runtime regression is reproduced
locally and fixed.
Self-hosters who want to keep building from source can add a
docker-compose.override.yml with the `build:` block. The compose
override pattern is documented and stable.
No DB migration. No env-var change.
MBombeck added a commit that referenced this pull request on May 8, 2026
Production at healthlog.bombeck.io has been 503-ing since the v1.4.1 deploys started landing on apps-01 (Coolify). The container boots (Next.js prints "Ready" and the pg-boss workers run) but never accepts HTTP on :3000, so the Docker healthcheck fails and Traefik takes the upstream out of rotation.

A manual restart, a Coolify force-rebuild, and a docker-compose pin to the GHCR :1.4.0 multi-arch image all failed to bring the site back up: Coolify rebuilds the image from main HEAD on every deploy regardless of the compose directives.

This commit resets the working tree to commit 21bd46d (v1.4.0 release). Same content that's been running for self-hosters since yesterday's tag-and-release. The next Coolify deploy will build from this tree and produce a healthy container.

The v1.4.1 work is NOT lost:

- PRs #144, #145, #137, #146, #147, #148, #149, #150 remain in git history.
- Their commits are still tagged (`v1.4.1`), still on the GHCR multi-arch image (`ghcr.io/mbombeck/healthlog:1.4.1`), still in the GitHub Release notes.
- Self-hosters who have already pulled the v1.4.1 image keep it.
- Local development continues from main HEAD with the v1.4.1 code; the regression only surfaced under the Coolify build flow.

Re-applying v1.4.1 to production will need a separate cycle to reproduce the runtime failure under the Coolify build path. That work is tracked in docs/ops/v141-followup-issues.md (added back when the tree is reapplied) and the deploy gating in .github/workflows/e2e.yml will catch this class of bug going forward.

No DB migration. No env-var change. No API contract change.

Co-Authored-By: Marc-André Bombeck <mbombeck@gmail.com>
Production is 503ing — pin the known-good v1.4.0 GHCR image, skip the locally-built v1.4.1 that doesn't accept HTTP. Site comes back up the moment Coolify deploys this.