diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 16ca10b..109e69c 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -87,3 +87,19 @@ jobs: - name: Run rollback integration test run: bash tests/rollback-integration.sh + + node-tests: + # Behavioral unit/integration tests for the template's own scripts and + # example app: scripts/bump-version.js validation + app/server.js + # /health and 404 behavior. Runs the real code via node:test. + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v6 + + - name: Set up Node.js + uses: actions/setup-node@v6 + with: + node-version: 22 + + - name: Run JS tests + run: npm test diff --git a/AGENTS.md b/AGENTS.md index 48ceab0..05b3e22 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -10,13 +10,16 @@ Dockerfile → Example Node.js (swap for your language, see docs/DOCKERFI docker-compose.yml → Local dev + VPS deployment .env.example → Environment variables template VERSION → Single source of truth for version (1.0.0) -scripts/bump-version.js → Version bumping (patch/minor/major) +scripts/bump-version.js → Version bumping (patch/minor/major); validates VERSION, fails on malformed +scripts/deploy-with-rollback.sh → Health-checked deploy with auto rollback (shared by cd.yml + ci.yml) +tests/ → node:test suites (bump-version, server /health) + rollback-integration.sh +package.json → Root test runner (`npm test` → node --test) docs/ → Setup guides (VPS, GHCR, HTTPS, Dockerfile examples) ``` ## CI/CD Pipeline -- **ci.yml**: Runs on push/PR to main. Hadolint lint + docker-compose validate + Docker build test + Trivy CVE scan (CRITICAL/HIGH). No secrets needed. +- **ci.yml**: Runs on push/PR to main. Hadolint lint + docker-compose validate + Docker build test + Trivy CVE scan (CRITICAL) + rollback integration test + `node:test` JS suites (`npm test`). No secrets needed. - **cd.yml**: Manual trigger OR tag push (v*). 
Builds image (Buildx + GHA cache) → pushes to GHCR → deploys to VPS via SSH → cleans old images → creates GitHub Release. Concurrency controlled (no parallel deploys). - **setup.yml**: First push only. Auto-creates GitHub Issue with setup checklist. @@ -48,6 +51,8 @@ docs/ → Setup guides (VPS, GHCR, HTTPS, Dockerfile examples) - **Why**: 같은 버전을 두 번 배포하면 GHCR 태그 충돌 + GitHub Release 중복 생성. 이 guard가 없으면 CI 통과해도 CD에서 조용히 깨짐. - Health check pattern in Dockerfile and docker-compose.yml - **Why**: `docker compose up -d --wait`가 health check 통과를 기다림. health check 없으면 컨테이너 시작 = 배포 성공으로 판단해서 깨진 앱이 배포될 수 있음. +- `/health` must stay able to FAIL (`app/server.js` readiness checks → 503) + - **Why**: 롤백은 unhealthy 컨테이너 감지에 의존. `/health`를 항상 200으로 고정하면 깨진 배포도 정상으로 보여 롤백이 무력화됨. 의존성 프로브는 `createApp({ readinessChecks })`로 연결. - Concurrency control in cd.yml - **Why**: 동시에 두 배포가 실행되면 SSH에서 race condition 발생. `cancel-in-progress: false`로 순서대로 실행. diff --git a/README.ko.md b/README.ko.md index e2cf640..0a79140 100644 --- a/README.ko.md +++ b/README.ko.md @@ -71,7 +71,7 @@ docker compose up ├── docker-compose.yml # 로컬 개발용 ├── .github/ │ ├── workflows/ -│ │ ├── ci.yml # Dockerfile 린트, compose 검증, 빌드 테스트 +│ │ ├── ci.yml # 린트, compose 검증, 빌드 테스트, JS 테스트 │ │ ├── cd.yml # 빌드 → GHCR 푸시 → VPS SSH 배포 │ │ └── setup.yml # 첫 사용 시 셋업 체크리스트 자동 생성 │ └── PULL_REQUEST_TEMPLATE.md @@ -81,22 +81,59 @@ docker compose up │ ├── HTTPS_SETUP.md # Caddy 리버스 프록시 + 자동 HTTPS │ └── VPS_DEPLOY.md # VPS SSH 배포 가이드 ├── scripts/ -│ └── bump-version.js # 버전 범프 +│ ├── bump-version.js # 버전 범프 (VERSION 검증) +│ └── deploy-with-rollback.sh # 헬스체크 배포 + 자동 롤백 +├── tests/ # node:test 스위트 + 롤백 통합 테스트 +├── package.json # `npm test` 러너 └── VERSION # 현재 버전 ``` ## 기능 - **언어 무관** — Dockerfile만 바꾸면 Node, Python, Go, Rust, Java, 정적 사이트 모두 가능 -- **CI 파이프라인** — Dockerfile 린트 (hadolint), docker-compose 검증, 빌드 테스트 +- **CI 파이프라인** — Dockerfile 린트 (hadolint), docker-compose 검증, 빌드 테스트, 그리고 매 푸시마다 `node:test` 스위트 (버전 범프 + `/health`) 실행 - **CD 
파이프라인** — 빌드 → GHCR 푸시 → docker compose 헬스체크 기반 VPS 배포 + GitHub Release 자동 생성 +- **실제 헬스체크** — `/health`가 준비 상태(readiness)를 반영하며 `503`을 반환할 수 있습니다. 배포 실패 시 실제로 롤백되도록 의존성 프로브(DB, 캐시 등)를 직접 연결하십시오 - **Dockerfile 예시** — Node, Python, Go, Rust, Java용 멀티스테이지 빌드 docs 제공 -- **버전 관리** — `node scripts/bump-version.js patch/minor/major` +- **버전 관리** — `node scripts/bump-version.js patch/minor/major` (`VERSION`을 검증하며, 파일이 손상된 경우 쓰레기 값을 쓰지 않고 명확히 실패합니다) - **로컬 개발** — `docker compose up`으로 볼륨 마운트 + 라이브 리로드 - **HTTPS 가이드** — Caddy 리버스 프록시 + 자동 TLS - **배포 가이드** — GHCR, VPS 설정 단계별 문서 - **템플릿 셋업** — 첫 사용 시 체크리스트 이슈 자동 생성 +## 헬스체크 (`/health`) + +배포 파이프라인은 새 컨테이너가 헬스체크에 실패하면 롤백합니다 +(`docker compose up -d --wait`). 이 안전장치는 `/health`가 실제로 실패를 +보고할 수 있을 때만 동작합니다. 항상 `200`을 반환하는 `/health`는 모든 +배포를 정상으로 보이게 만들어 롤백을 조용히 무력화합니다. + +- **현재 구현됨** — `app/server.js`는 비동기 readiness 체크 목록을 기반으로 + `/health`를 제공합니다. 모든 체크 통과 시 `200 {"status":"ok"}`, 하나라도 + falsy를 반환하거나 예외를 던지면 `503 {"status":"unavailable"}`을 + 반환합니다. 알 수 없는 경로는 `404`를 반환합니다(예시 서버는 모든 경로를 + 받는 catch-all이 아닙니다). 기본 체크는 HTTP 리스너 바인딩만 확인합니다. +- **설계 의도** — fail-closed 원칙: 의존성 장애는 컨테이너를 unhealthy 상태로 + 드러내어 오케스트레이터가 트래픽 라우팅을 멈추고 CD 롤백이 작동하도록 해야 + 하며, 정상 체크 뒤에서 고장난 앱을 계속 서빙해서는 안 됩니다. +- **실제 체크는 직접 연결하셔야 합니다.** 예시 앱을 교체하고, 앱이 실제로 + 필요로 하는 의존성에 대한 프로브를 등록하십시오: + + ```js + const { createApp } = require('./server.js'); + const { server } = createApp({ + readinessChecks: [ + async () => { await db.query('SELECT 1'); return true; }, + async () => (await redis.ping()) === 'PONG', + ], + }); + server.listen(process.env.PORT || 3000); + ``` + +- **비목표(Non-goals)** — 이것은 메트릭/liveness 프레임워크가 아닙니다. 롤백 + 로직이 의존하는 최소한의 readiness 계약일 뿐이며, 더 필요하면 사용하시는 + 스택의 헬스 라이브러리로 교체하십시오. 
+ ## CI/CD ### CI (PR + main 푸시마다) @@ -153,12 +190,22 @@ docker compose up # Dockerfile 변경 후 재빌드 docker compose up --build -# 버전 범프 +# 버전 범프 (VERSION이 손상되면 1.2.NaN을 쓰지 않고 명확히 실패합니다) node scripts/bump-version.js patch # 1.0.0 → 1.0.1 node scripts/bump-version.js minor # 1.0.0 → 1.1.0 node scripts/bump-version.js major # 1.0.0 → 2.0.0 ``` +### 테스트 + +```bash +# Node 테스트: 버전 범프 검증 + /health (200) 및 알 수 없는 경로 (404) +npm test + +# 롤백 통합 테스트 (Docker 필요, CI에서도 실행됨) +bash tests/rollback-integration.sh +``` + ## 언어 변경 1. `app/`을 내 앱 코드로 교체 diff --git a/README.md b/README.md index b74dff5..1d111b4 100644 --- a/README.md +++ b/README.md @@ -71,7 +71,7 @@ docker compose up ├── docker-compose.yml # Local development ├── .github/ │ ├── workflows/ -│ │ ├── ci.yml # Dockerfile lint, compose validate, build test +│ │ ├── ci.yml # Lint, compose validate, build test, JS tests │ │ ├── cd.yml # Build → GHCR push → VPS deploy via SSH │ │ └── setup.yml # Auto setup checklist on first use │ └── PULL_REQUEST_TEMPLATE.md @@ -81,22 +81,59 @@ docker compose up │ ├── HTTPS_SETUP.md # HTTPS with Caddy reverse proxy │ └── VPS_DEPLOY.md # VPS SSH deployment guide ├── scripts/ -│ └── bump-version.js # Version bump utility +│ ├── bump-version.js # Version bump utility (validates VERSION) +│ └── deploy-with-rollback.sh # Health-checked deploy + auto rollback +├── tests/ # node:test suites + rollback integration test +├── package.json # `npm test` runner └── VERSION # Current version ``` ## Features - **Language agnostic** — Swap the Dockerfile for any language (Node, Python, Go, Rust, Java, static) -- **CI Pipeline** — Dockerfile lint (hadolint), docker-compose validation, build verification, Trivy CVE scan on every push +- **CI Pipeline** — Dockerfile lint (hadolint), docker-compose validation, build verification, Trivy CVE scan, plus `node:test` suites (version-bump + `/health`) on every push - **CD Pipeline** — Build → push to GHCR → health-checked deploy to VPS via docker compose + auto GitHub 
Release +- **Real health checks** — `/health` reflects a readiness signal and can return `503`; wire your own dependency probes (DB, cache, …) so failed deploys actually roll back - **Dockerfile examples** — Multi-stage builds for Node, Python, Go, Rust, Java in docs -- **Version management** — `node scripts/bump-version.js patch/minor/major` +- **Version management** — `node scripts/bump-version.js patch/minor/major` (validates `VERSION`, fails loudly on a malformed file instead of writing garbage) - **Local dev** — `docker compose up` with volume mounts for live reload - **HTTPS guide** — Caddy reverse proxy with automatic TLS - **Deploy guides** — Step-by-step docs for GHCR and VPS setup - **Template setup** — Auto-creates setup checklist issue on first use +## Health checks (`/health`) + +The deploy pipeline rolls back when the new container fails its health check +(`docker compose up -d --wait`). That safety net only works if `/health` can +actually report failure — a `/health` that always returns `200` makes every +deploy look healthy and silently disables rollback. + +- **Currently implemented** — `app/server.js` exposes `/health` backed by a + list of async readiness checks. All checks passing → `200 {"status":"ok"}`. + Any check returning falsy or throwing → `503 {"status":"unavailable"}`. + Unknown paths return `404` (the example server is not a catch-all). The + default check only confirms the HTTP listener is bound. +- **Design intent** — fail-closed: a dependency outage should surface as an + unhealthy container so the orchestrator stops routing traffic and the CD + rollback triggers, rather than serving a broken app behind a green check. 
+- **You must wire real checks.** Replace the example app and register probes + for the dependencies your app actually needs: + + ```js + const { createApp } = require('./server.js'); + const { server } = createApp({ + readinessChecks: [ + async () => { await db.query('SELECT 1'); return true; }, + async () => (await redis.ping()) === 'PONG', + ], + }); + server.listen(process.env.PORT || 3000); + ``` + +- **Non-goals** — this is not a metrics/liveness framework. It is the minimal + readiness contract the rollback logic depends on; swap in your stack's + health library if you need more. + ## CI/CD ### CI (every PR + push to main) @@ -154,12 +191,22 @@ docker compose up # Rebuild after Dockerfile changes docker compose up --build -# Bump version +# Bump version (fails loudly if VERSION is malformed — never writes 1.2.NaN) node scripts/bump-version.js patch # 1.0.0 → 1.0.1 node scripts/bump-version.js minor # 1.0.0 → 1.1.0 node scripts/bump-version.js major # 1.0.0 → 2.0.0 ``` +### Tests + +```bash +# Node tests: version-bump validation + /health (200) and unknown path (404) +npm test + +# Rollback integration test (needs Docker; also run in CI) +bash tests/rollback-integration.sh +``` + ## Switching Languages 1. 
Replace `app/` with your application code diff --git a/app/server.js b/app/server.js index d93a4e6..1859617 100644 --- a/app/server.js +++ b/app/server.js @@ -2,16 +2,79 @@ const http = require('http'); const port = process.env.PORT || 3000; -const server = http.createServer((req, res) => { - if (req.url === '/health') { - res.writeHead(200, { 'Content-Type': 'application/json' }); - res.end(JSON.stringify({ status: 'ok' })); - return; +// --- Readiness checks ------------------------------------------------------- +// +// /health is only useful if it can actually report FAILURE — a health check +// that always returns 200 tells your load balancer / orchestrator nothing and +// defeats the rollback logic in scripts/deploy-with-rollback.sh (an unhealthy +// container would look healthy and never roll back). +// +// `readinessChecks` is a list of async functions. Each must resolve truthy +// when its dependency is reachable and reject / resolve falsy otherwise. The +// process records `started` once `listen` fires, which is a real (if minimal) +// readiness signal: before the listener is up, /health reports 503. +// +// TODO(you): wire your real dependencies here. Examples: +// readinessChecks.push(async () => { await db.query('SELECT 1'); return true; }); +// readinessChecks.push(async () => (await redis.ping()) === 'PONG'); +// A check that throws or returns falsy flips /health to 503 so deploys roll +// back and orchestrators stop routing traffic. +function createApp(options = {}) { + const state = { started: false }; + const readinessChecks = options.readinessChecks || [ + // Default real signal: the HTTP listener must be bound. This is replaced + // /augmented by callers with real dependency probes. 
+ async () => state.started, + ]; + + async function isReady() { + const results = await Promise.allSettled( + readinessChecks.map((check) => Promise.resolve().then(check)) + ); + return results.every( + (r) => r.status === 'fulfilled' && Boolean(r.value) + ); } - res.writeHead(200, { 'Content-Type': 'application/json' }); - res.end(JSON.stringify({ status: 'ok', message: 'Hello from Docker!' })); -}); -server.listen(port, () => { - console.log(`Server running on port ${port}`); -}); + const server = http.createServer((req, res) => { + if (req.url === '/health') { + isReady() + .then((ready) => { + const code = ready ? 200 : 503; + res.writeHead(code, { 'Content-Type': 'application/json' }); + res.end(JSON.stringify({ status: ready ? 'ok' : 'unavailable' })); + }) + .catch(() => { + res.writeHead(503, { 'Content-Type': 'application/json' }); + res.end(JSON.stringify({ status: 'unavailable' })); + }); + return; + } + + if (req.url === '/') { + res.writeHead(200, { 'Content-Type': 'application/json' }); + res.end(JSON.stringify({ status: 'ok', message: 'Hello from Docker!' })); + return; + } + + // Unknown path: a real server must not answer 200 for everything. + res.writeHead(404, { 'Content-Type': 'application/json' }); + res.end(JSON.stringify({ status: 'not_found' })); + }); + + // Mark ready only once the listener is actually bound. 
+ server.on('listening', () => { + state.started = true; + }); + + return { server, state, isReady, readinessChecks }; +} + +if (require.main === module) { + const { server } = createApp(); + server.listen(port, () => { + console.log(`Server running on port ${port}`); + }); +} + +module.exports = { createApp }; diff --git a/package.json b/package.json new file mode 100644 index 0000000..b028510 --- /dev/null +++ b/package.json @@ -0,0 +1,9 @@ +{ + "name": "docker-deploy-starter", + "version": "0.0.0", + "private": true, + "description": "Test harness for the docker-deploy-starter template (scripts + app).", + "scripts": { + "test": "node --test \"tests/**/*.test.js\"" + } +} diff --git a/scripts/bump-version.js b/scripts/bump-version.js index b996f28..3551392 100644 --- a/scripts/bump-version.js +++ b/scripts/bump-version.js @@ -2,27 +2,73 @@ const fs = require('fs'); const path = require('path'); -const versionFile = path.join(__dirname, '..', 'VERSION'); -const version = fs.readFileSync(versionFile, 'utf-8').trim(); -const [major, minor, patch] = version.split('.').map(Number); - -const type = process.argv[2] || 'patch'; -let newVersion; - -switch (type) { - case 'major': - newVersion = `${major + 1}.0.0`; - break; - case 'minor': - newVersion = `${major}.${minor + 1}.0`; - break; - case 'patch': - newVersion = `${major}.${minor}.${patch + 1}`; - break; - default: +const SEMVER_RE = /^\d+\.\d+\.\d+$/; + +// Parse and validate a raw VERSION string into [major, minor, patch]. +// Returns null (does not throw) if the string is not a strict X.Y.Z of +// non-negative integers, so callers can fail loudly instead of writing +// garbage like "1.2.NaN" back to disk. +function parseVersion(raw) { + if (!SEMVER_RE.test(raw)) { + return null; + } + const parts = raw.split('.').map((n) => Number(n)); + if (!parts.every((n) => Number.isInteger(n) && n >= 0)) { + return null; + } + return parts; +} + +// Compute the next version for a given bump type. 
Exported alongside +// parseVersion so the test suite drives the real logic, not a copy. +function bump(raw, type) { + const parsed = parseVersion(raw); + if (parsed === null) { + throw new Error( + `invalid VERSION "${raw}": expected MAJOR.MINOR.PATCH of non-negative integers` + ); + } + const [major, minor, patch] = parsed; + switch (type) { + case 'major': + return `${major + 1}.0.0`; + case 'minor': + return `${major}.${minor + 1}.0`; + case 'patch': + return `${major}.${minor}.${patch + 1}`; + default: + throw new Error(`unknown bump type "${type}": expected major|minor|patch`); + } +} + +function main(argv) { + const versionFile = path.join(__dirname, '..', 'VERSION'); + const version = fs.readFileSync(versionFile, 'utf-8').trim(); + const type = argv[2] || 'patch'; + + if (!['major', 'minor', 'patch'].includes(type)) { console.error('Usage: node bump-version.js [major|minor|patch]'); process.exit(1); + } + + // Validate BEFORE writing. A corrupt VERSION must never be overwritten + // with a malformed value, and the process must exit non-zero so callers + // (CI, release scripts) see the failure. + if (parseVersion(version) === null) { + console.error( + `Error: VERSION file is malformed: "${version}". ` + + 'Expected MAJOR.MINOR.PATCH of non-negative integers (e.g. 1.2.3).' 
+ ); + process.exit(1); + } + + const newVersion = bump(version, type); + fs.writeFileSync(versionFile, newVersion + '\n'); + console.log(`Bumped version: ${version} → ${newVersion}`); +} + +if (require.main === module) { + main(process.argv); } -fs.writeFileSync(versionFile, newVersion + '\n'); -console.log(`Bumped version: ${version} → ${newVersion}`); +module.exports = { parseVersion, bump }; diff --git a/scripts/deploy-with-rollback.sh b/scripts/deploy-with-rollback.sh index c0b6f0d..843957a 100755 --- a/scripts/deploy-with-rollback.sh +++ b/scripts/deploy-with-rollback.sh @@ -89,17 +89,23 @@ if [ -n "$PREV_IMAGE" ] && [ "$PREV_IMAGE" != "$IMAGE" ]; then if docker compose up -d --wait; then echo "Rollback succeeded." else - # Rollback failed too — remove the broken compose file so the next - # deploy starts from a clean slate (PREV_IMAGE detection on a broken - # container otherwise loops the cascade). Save a copy for forensics. + # Rollback failed too — tear the broken container down BEFORE moving the + # compose file (otherwise `docker compose down` has no compose file to act + # on and would leak the unhealthy container), then remove the broken + # compose so the next deploy starts from a clean slate (PREV_IMAGE + # detection on a broken container otherwise loops the cascade). Save a + # copy for forensics. echo "::error::Rollback also failed — clearing compose file. Saved as docker-compose.failed.yml for investigation." - mv docker-compose.yml docker-compose.failed.yml 2>/dev/null || true docker compose down --remove-orphans >/dev/null 2>&1 || true + mv docker-compose.yml docker-compose.failed.yml 2>/dev/null || true fi else echo "No previous image available to roll back to." - # First deploy of a bad image — same cleanup so we don't carry the - # broken compose into the next attempt. 
+ # First deploy of a bad image — tear the failed container down (BEFORE + # moving the compose file, so `docker compose down` can find it) so we + # don't leak an unhealthy container, then move the broken compose aside so + # we don't carry it into the next attempt. + docker compose down --remove-orphans >/dev/null 2>&1 || true mv docker-compose.yml docker-compose.failed.yml 2>/dev/null || true fi exit 1 diff --git a/tests/bump-version.test.js b/tests/bump-version.test.js new file mode 100644 index 0000000..54e5cc1 --- /dev/null +++ b/tests/bump-version.test.js @@ -0,0 +1,213 @@ +'use strict'; + +// Behavioral tests for scripts/bump-version.js. +// +// Two layers: +// 1. Table-driven unit tests over the exported pure logic (parseVersion, +// bump) — exact increment semantics for patch/minor/major and rejection +// of malformed input. +// 2. End-to-end CLI tests that spawn the real script against a throwaway +// VERSION file and assert the file is rewritten on success AND left +// untouched (with a non-zero exit) on a malformed VERSION. +// +// These FAIL if the validation regression is reintroduced (e.g. writing +// "1.2.NaN" or exiting 0 on corrupt input): the CLI tests check both the +// exit code and the post-run file contents, so a script that writes garbage +// or swallows the error breaks them. 
+ +const test = require('node:test'); +const assert = require('node:assert/strict'); +const { execFileSync } = require('node:child_process'); +const fs = require('node:fs'); +const os = require('node:os'); +const path = require('node:path'); + +const SCRIPT = path.join(__dirname, '..', 'scripts', 'bump-version.js'); +const { parseVersion, bump } = require(SCRIPT); + +// --- Layer 1: pure logic -------------------------------------------------- + +test('bump() computes correct increments (table-driven)', () => { + const cases = [ + // [input, type, expected] + ['1.0.0', 'patch', '1.0.1'], + ['1.0.0', 'minor', '1.1.0'], + ['1.0.0', 'major', '2.0.0'], + ['1.2.3', 'patch', '1.2.4'], + ['1.2.3', 'minor', '1.3.0'], + ['1.2.3', 'major', '2.0.0'], + ['0.0.9', 'patch', '0.0.10'], + ['0.9.9', 'minor', '0.10.0'], + ['9.9.9', 'major', '10.0.0'], + ['10.20.30', 'patch', '10.20.31'], + ]; + for (const [input, type, expected] of cases) { + assert.equal( + bump(input, type), + expected, + `bump("${input}", "${type}") should be ${expected}` + ); + } +}); + +test('minor/major reset lower components to zero', () => { + assert.equal(bump('3.7.5', 'minor'), '3.8.0'); + assert.equal(bump('3.7.5', 'major'), '4.0.0'); +}); + +test('parseVersion accepts strict X.Y.Z of non-negative integers', () => { + assert.deepEqual(parseVersion('1.2.3'), [1, 2, 3]); + assert.deepEqual(parseVersion('0.0.0'), [0, 0, 0]); + assert.deepEqual(parseVersion('12.34.56'), [12, 34, 56]); +}); + +test('parseVersion rejects malformed versions (returns null)', () => { + const bad = [ + 'not-a-version', + '1.2', // too few components + '1.2.3.4', // too many components + '1.2.3-beta', // pre-release suffix + 'v1.2.3', // leading v + '1.2.x', + '1..3', + '1.2.', + '', // empty + '1.2.03a', + ' 1.2.3', // surrounding space (caller trims, but parser is strict) + ]; + for (const v of bad) { + assert.equal(parseVersion(v), null, `parseVersion("${v}") should be null`); + } +}); + +test('bump() throws (does not return 
garbage) on malformed VERSION', () => { + for (const v of ['not-a-version', '1.2', '1.2.3-beta', '']) { + assert.throws( + () => bump(v, 'patch'), + /invalid VERSION/, + `bump("${v}", "patch") must throw` + ); + } +}); + +test('bump() throws on unknown bump type', () => { + assert.throws(() => bump('1.2.3', 'banana'), /unknown bump type/); +}); + +// --- Layer 2: real CLI behavior ------------------------------------------ + +// Build a throwaway repo layout (a temp dir holding VERSION + scripts/bump-version.js) +// so the script resolves VERSION via its own __dirname/.. logic. We copy the +// script rather than symlink so require() inside it still works. +function makeSandbox(versionContents) { + const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'bump-version-')); + fs.mkdirSync(path.join(dir, 'scripts')); + fs.copyFileSync(SCRIPT, path.join(dir, 'scripts', 'bump-version.js')); + fs.writeFileSync(path.join(dir, 'VERSION'), versionContents); + return dir; +} + +function runCli(dir, arg) { + const scriptInSandbox = path.join(dir, 'scripts', 'bump-version.js'); + try { + const stdout = execFileSync( + process.execPath, + [scriptInSandbox, arg].filter(Boolean), + { encoding: 'utf-8', stdio: ['ignore', 'pipe', 'pipe'] } + ); + return { code: 0, stdout, stderr: '' }; + } catch (err) { + return { + code: err.status === null || err.status === undefined ? 1 : err.status, + stdout: err.stdout ? err.stdout.toString() : '', + stderr: err.stderr ? 
err.stderr.toString() : '', + }; + } +} + +test('CLI rewrites VERSION file on a valid patch bump', () => { + const dir = makeSandbox('1.2.3\n'); + try { + const res = runCli(dir, 'patch'); + assert.equal(res.code, 0, `expected exit 0, got ${res.code} (${res.stderr})`); + const after = fs.readFileSync(path.join(dir, 'VERSION'), 'utf-8'); + assert.equal(after, '1.2.4\n'); + } finally { + fs.rmSync(dir, { recursive: true, force: true }); + } +}); + +test('CLI rewrites VERSION file on minor and major bumps', () => { + for (const [arg, expected] of [ + ['minor', '1.3.0\n'], + ['major', '2.0.0\n'], + ]) { + const dir = makeSandbox('1.2.3\n'); + try { + const res = runCli(dir, arg); + assert.equal(res.code, 0, res.stderr); + assert.equal(fs.readFileSync(path.join(dir, 'VERSION'), 'utf-8'), expected); + } finally { + fs.rmSync(dir, { recursive: true, force: true }); + } + } +}); + +test('CLI defaults to patch when no argument is given', () => { + const dir = makeSandbox('4.5.6\n'); + try { + const res = runCli(dir, undefined); + assert.equal(res.code, 0, res.stderr); + assert.equal(fs.readFileSync(path.join(dir, 'VERSION'), 'utf-8'), '4.5.7\n'); + } finally { + fs.rmSync(dir, { recursive: true, force: true }); + } +}); + +test('CLI fails loudly and does NOT corrupt a malformed VERSION', () => { + // This is the core regression guard for the CRITICAL finding: a malformed + // VERSION must cause exit(1) to stderr and must NOT be overwritten with + // "1.2.NaN" / "NaN.undefined.NaN". 
+ const malformedInputs = ['not-a-version\n', '1.2\n', '1.2.3-beta\n']; + for (const original of malformedInputs) { + const dir = makeSandbox(original); + try { + const res = runCli(dir, 'patch'); + assert.equal( + res.code, + 1, + `malformed "${original.trim()}" must exit 1, got ${res.code}` + ); + assert.match( + res.stderr, + /malformed/i, + 'must print a malformed-VERSION error to stderr' + ); + const after = fs.readFileSync(path.join(dir, 'VERSION'), 'utf-8'); + assert.equal( + after, + original, + `VERSION must be left untouched on failure (was "${after.trim()}")` + ); + assert.doesNotMatch( + after, + /NaN|undefined/, + 'VERSION must never contain NaN/undefined' + ); + } finally { + fs.rmSync(dir, { recursive: true, force: true }); + } + } +}); + +test('CLI rejects an unknown bump type with a non-zero exit', () => { + const dir = makeSandbox('1.2.3\n'); + try { + const res = runCli(dir, 'sideways'); + assert.equal(res.code, 1); + // VERSION untouched. + assert.equal(fs.readFileSync(path.join(dir, 'VERSION'), 'utf-8'), '1.2.3\n'); + } finally { + fs.rmSync(dir, { recursive: true, force: true }); + } +}); diff --git a/tests/rollback-integration.sh b/tests/rollback-integration.sh index 92f0cac..aa457a9 100755 --- a/tests/rollback-integration.sh +++ b/tests/rollback-integration.sh @@ -11,8 +11,14 @@ # rollback restores the good image. After the attempt, /health must # still return 200 from the good image. # 3. First deploy of a bad image (no previous image) fails with a -# non-zero exit and leaves nothing healthy — the script must not -# silently swallow the failure. +# non-zero exit and leaves the system quiesced: docker-compose.yml is +# gone, docker-compose.failed.yml is preserved for forensics, and no +# container is left running. The script must not silently swallow the +# failure nor leak an unhealthy container. +# 4. 
Both images unhealthy: a previous (unhealthy) container exists and the +# new deploy is also unhealthy, so rollback to the previous fails too. +# The script must end with the system quiesced (compose moved aside, no +# container running) rather than cascading or leaking containers. set -euo pipefail @@ -30,8 +36,11 @@ cleanup() { if [ -f "$WORK_DIR/docker-compose.yml" ]; then (cd "$WORK_DIR" && docker compose down -v --remove-orphans >/dev/null 2>&1 || true) fi + if [ -f "$WORK_DIR/docker-compose.failed.yml" ]; then + (cd "$WORK_DIR" && docker compose -f docker-compose.failed.yml down -v --remove-orphans >/dev/null 2>&1 || true) + fi rm -rf "$WORK_DIR" - docker rmi -f "$GOOD_IMAGE" "$BAD_IMAGE" >/dev/null 2>&1 || true + docker rmi -f "$GOOD_IMAGE" "$BAD_IMAGE" "rollback-test/bad:prev" >/dev/null 2>&1 || true } trap cleanup EXIT @@ -48,6 +57,43 @@ check_health() { return 1 } +# Count app containers still tracked by either the live or the failed compose +# file in WORK_DIR. After a failed deploy with no successful rollback, this +# must be zero (the script is responsible for tearing the container down). +running_app_containers() { + local count=0 ids + if [ -f "$WORK_DIR/docker-compose.yml" ]; then + ids="$(cd "$WORK_DIR" && docker compose ps -q app 2>/dev/null || true)" + [ -n "$ids" ] && count=$((count + $(echo "$ids" | grep -c .))) + fi + if [ -f "$WORK_DIR/docker-compose.failed.yml" ]; then + ids="$(cd "$WORK_DIR" && docker compose -f docker-compose.failed.yml ps -q app 2>/dev/null || true)" + [ -n "$ids" ] && count=$((count + $(echo "$ids" | grep -c .))) + fi + echo "$count" +} + +# Assert the deployment is fully quiesced after a terminal failure: +# - docker-compose.yml has been moved aside (gone) +# - docker-compose.failed.yml is preserved for forensics +# - nothing the deploy created is still running +# - nothing answers /health on the port +assert_quiesced() { + local label="$1" + [ ! 
-f "$WORK_DIR/docker-compose.yml" ] \ + || fail "$label: docker-compose.yml should be gone (moved to .failed.yml)" + [ -f "$WORK_DIR/docker-compose.failed.yml" ] \ + || fail "$label: docker-compose.failed.yml should be preserved for forensics" + local n + n="$(running_app_containers)" + [ "$n" -eq 0 ] \ + || fail "$label: expected no running app containers, found $n" + if curl -sf "http://127.0.0.1:${PORT}/health" >/dev/null 2>&1; then + fail "$label: something is still serving /health on port ${PORT}" + fi + pass "$label: system quiesced (no compose, .failed.yml saved, nothing running)" +} + echo "==> Building fixture images" docker build -t "$GOOD_IMAGE" "$REPO_ROOT/tests/fixtures/good-app" >/dev/null docker build -t "$BAD_IMAGE" "$REPO_ROOT/tests/fixtures/bad-app" >/dev/null @@ -96,9 +142,10 @@ fi rm -f "$WORK_DIR/docker-compose.yml" # --------------------------------------------------------------------------- -# Scenario 3: first deploy of a bad image (no previous) must fail loudly. +# Scenario 3: first deploy of a bad image (no previous) must fail loudly AND +# leave the system quiesced. # --------------------------------------------------------------------------- -echo "==> Scenario 3: first deploy of bad image (no previous) — expect failure" +echo "==> Scenario 3: first deploy of bad image (no previous) — expect failure + quiesce" set +e IMAGE="$BAD_IMAGE" PORT="$PORT" DEPLOY_DIR="$WORK_DIR" SKIP_PULL=1 \ bash "$SCRIPT" @@ -109,5 +156,69 @@ if [ "$rc" -eq 0 ]; then fi pass "bad image first deploy returned non-zero ($rc)" +# The failed first deploy must have moved the compose aside, kept the +# forensic copy, and left nothing running. +assert_quiesced "scenario 3" + +# Fresh state before scenario 4. 
+(cd "$WORK_DIR" && docker compose down -v --remove-orphans >/dev/null 2>&1 || true) +if [ -f "$WORK_DIR/docker-compose.failed.yml" ]; then + (cd "$WORK_DIR" && docker compose -f docker-compose.failed.yml down -v --remove-orphans >/dev/null 2>&1 || true) +fi +rm -f "$WORK_DIR/docker-compose.yml" "$WORK_DIR/docker-compose.failed.yml" + +# --------------------------------------------------------------------------- +# Scenario 4: both images unhealthy. A previous (unhealthy) container is +# already running and tracked by a compose file, then a new bad deploy is +# attempted. Rollback targets the previous image, which is ALSO unhealthy, so +# rollback fails too. The script must end with the system quiesced — not +# cascading restarts, not leaking the unhealthy container. +# --------------------------------------------------------------------------- +echo "==> Scenario 4: both images unhealthy — expect failure + quiesce" + +# Pre-seed a running-but-unhealthy "previous" container so PREV_IMAGE +# detection finds a bad image to (fail to) roll back to. We bring it up +# WITHOUT --wait so the unhealthy container stays running and is recorded in a +# compose file the deploy script will discover. +BAD_PREV_IMAGE="rollback-test/bad:prev" +docker tag "$BAD_IMAGE" "$BAD_PREV_IMAGE" >/dev/null 2>&1 +cat > "$WORK_DIR/docker-compose.yml" </dev/null 2>&1) +# Sanity: a previous container is actually running before we attempt the deploy. +prev_running="$(running_app_containers)" +[ "$prev_running" -ge 1 ] || fail "scenario 4 setup: previous unhealthy container is not running" +pass "scenario 4 setup: previous unhealthy container is running (PREV will be detected)" + +set +e +IMAGE="$BAD_IMAGE" PORT="$PORT" DEPLOY_DIR="$WORK_DIR" SKIP_PULL=1 \ + bash "$SCRIPT" +rc=$? 
+set -e
+if [ "$rc" -eq 0 ]; then
+  fail "both-unhealthy deploy returned 0 — failure was swallowed"
+fi
+pass "both-unhealthy deploy returned non-zero ($rc)"
+
+# Even though rollback was attempted and also failed, the system must be
+# quiesced: compose moved aside, forensic copy kept, nothing running.
+assert_quiesced "scenario 4"
+
+docker rmi -f "$BAD_PREV_IMAGE" >/dev/null 2>&1 || true
+
 echo ""
 echo "All rollback integration scenarios passed."
diff --git a/tests/server.test.js b/tests/server.test.js
new file mode 100644
index 0000000..b1bbca0
--- /dev/null
+++ b/tests/server.test.js
@@ -0,0 +1,128 @@
+'use strict';
+
+// Behavioral HTTP tests for app/server.js.
+//
+// Starts the real server on an ephemeral port and makes real requests:
+//   - GET /health -> 200 + {"status":"ok"} when ready
+//   - GET / -> 200 + greeting JSON
+//   - GET unknown path -> 404 (NOT 200-for-everything)
+//   - GET /health (failing readiness check injected) -> 503
+//
+// The 404 assertion fails if the "200 for every path" regression returns.
+// The 503 assertion fails if /health is hardcoded to 200 and cannot express
+// failure (the MAJOR finding) — it drives the real isReady() path.
+
+const test = require('node:test');
+const assert = require('node:assert/strict');
+const { createApp } = require('../app/server.js');
+
+// Start a server on an ephemeral port (0) and return { port, close }.
+function start(options) {
+  const { server } = createApp(options);
+  return new Promise((resolve, reject) => {
+    server.once('error', reject);
+    server.listen(0, '127.0.0.1', () => {
+      const { port } = server.address();
+      resolve({
+        port,
+        close: () =>
+          new Promise((res) => server.close(() => res())),
+      });
+    });
+  });
+}
+
+async function get(port, urlPath) {
+  const res = await fetch(`http://127.0.0.1:${port}${urlPath}`);
+  let body = null;
+  const text = await res.text();
+  try {
+    body = JSON.parse(text);
+  } catch {
+    body = text;
+  }
+  return { status: res.status, body };
+}
+
+test('GET /health returns 200 and {status:"ok"} when ready', async () => {
+  const srv = await start();
+  try {
+    const res = await get(srv.port, '/health');
+    assert.equal(res.status, 200);
+    assert.deepEqual(res.body, { status: 'ok' });
+  } finally {
+    await srv.close();
+  }
+});
+
+test('GET / returns 200 with greeting JSON', async () => {
+  const srv = await start();
+  try {
+    const res = await get(srv.port, '/');
+    assert.equal(res.status, 200);
+    assert.equal(res.body.status, 'ok');
+    assert.ok(
+      typeof res.body.message === 'string' && res.body.message.length > 0,
+      'expected a non-empty message'
+    );
+  } finally {
+    await srv.close();
+  }
+});
+
+test('GET unknown path returns 404 (not 200-for-everything)', async () => {
+  const srv = await start();
+  try {
+    for (const p of ['/totally-unknown', '/api/nope', '/health/extra', '/favicon.ico']) {
+      const res = await get(srv.port, p);
+      assert.equal(res.status, 404, `${p} should be 404, got ${res.status}`);
+    }
+  } finally {
+    await srv.close();
+  }
+});
+
+test('GET /health returns 503 when a readiness check fails', async () => {
+  // Inject a failing dependency probe — this proves /health reflects a real
+  // readiness signal and can report failure, instead of always 200.
+  const srv = await start({
+    readinessChecks: [async () => false],
+  });
+  try {
+    const res = await get(srv.port, '/health');
+    assert.equal(res.status, 503, 'failing readiness must yield 503');
+    assert.equal(res.body.status, 'unavailable');
+  } finally {
+    await srv.close();
+  }
+});
+
+test('GET /health returns 503 when a readiness check throws', async () => {
+  const srv = await start({
+    readinessChecks: [
+      async () => {
+        throw new Error('db unreachable');
+      },
+    ],
+  });
+  try {
+    const res = await get(srv.port, '/health');
+    assert.equal(res.status, 503, 'throwing readiness must yield 503');
+    assert.equal(res.body.status, 'unavailable');
+  } finally {
+    await srv.close();
+  }
+});
+
+test('GET /health returns 200 only when every readiness check passes', async () => {
+  const srv = await start({
+    readinessChecks: [async () => true, async () => true],
+  });
+  try {
+    const res = await get(srv.port, '/health');
+    assert.equal(res.status, 200);
+    assert.deepEqual(res.body, { status: 'ok' });
+  } finally {
+    await srv.close();
+  }
+});
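The suite above pins down a contract for `createApp` without showing `app/server.js` itself. As a reading aid, here is a minimal sketch of a server that would satisfy those assertions. It is inferred only from the tests and is not the file from this change: the greeting text, the 404 body, and the decision to return `isReady` alongside `server` are all assumptions made for illustration.

```javascript
'use strict';

// Hypothetical sketch of app/server.js, inferred from tests/server.test.js.
// The real implementation is not part of this diff excerpt.
const http = require('node:http');

function createApp({ readinessChecks = [] } = {}) {
  // Ready only if every check resolves to a truthy value.
  // A check that throws (or rejects) counts as a failure.
  async function isReady() {
    for (const check of readinessChecks) {
      try {
        if (!(await check())) return false;
      } catch {
        return false;
      }
    }
    return true;
  }

  function sendJson(res, status, payload) {
    const body = JSON.stringify(payload);
    res.writeHead(status, {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(body),
    });
    res.end(body);
  }

  const server = http.createServer(async (req, res) => {
    const { pathname } = new URL(req.url, 'http://localhost');
    if (req.method === 'GET' && pathname === '/health') {
      // /health must be able to fail, otherwise rollback never triggers.
      const ready = await isReady();
      sendJson(res, ready ? 200 : 503, { status: ready ? 'ok' : 'unavailable' });
    } else if (req.method === 'GET' && pathname === '/') {
      // Greeting text is a placeholder (assumption).
      sendJson(res, 200, { status: 'ok', message: 'Hello from the example app' });
    } else {
      // Exact-match routing: /health/extra and other paths are 404.
      sendJson(res, 404, { status: 'not_found' });
    }
  });

  // Exposing isReady is an assumption; the tests only destructure { server }.
  return { server, isReady };
}

module.exports = { createApp };
```

Routing on an exact `pathname` match is what makes `/health/extra` return 404, and treating a thrown check as "not ready" is what lets a broken dependency surface as 503 so the health-checked deploy can roll back.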