- Root cause: `nginx:1.25-alpine` does not include `curl`, so the Docker healthcheck command `curl -sf ...` always failed with `sh: curl: not found`, keeping the container permanently in the "unhealthy" state regardless of whether Zammad was actually serving traffic
- Fix: replaced healthcheck test command with `wget -q -O /dev/null http://localhost:80/ && echo OK || exit 1` (wget ships in Alpine by default)
- Increased healthcheck `retries` 20 → 40 (gives full 800s window at 20s interval)
- Increased healthcheck `start_period` 60s → 120s (accounts for Elasticsearch and `zammad-init` finishing before nginx needs to respond)
- Increased `wait_healthy` polling in test loop: 20×30=600s → 30×30=900s (15-minute cap)
- Root cause: `tiredofit/freepbx:latest` performs a full module install (>100 FreePBX modules via `fwconsole ma upgradeall`) on first run, which takes 10–30 minutes on local Docker Desktop vs. 8–12 minutes on Azure D4s_v4; the 20-minute cap was insufficient and the fallback did a single immediate HTTP check before the web stack was ready
- Added `wait_http` helper function (existed in phase4 but was missing from phase3)
- Extended `wait_healthy` hard cap: 40×30=1200s → 60×30=1800s (30 minutes)
- Replaced immediate-fail HTTP fallback with a `wait_http "http://localhost:8301/" 20 30` retry loop — 10 additional minutes of HTTP polling before declaring failure (total 40-minute cap)
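
The polling behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in the lab scripts: the `wait_http <url> <attempts> <interval>` argument order is taken from the call shown above, while the `wait_healthy` signature with a leading container name and the `health_status` and `http_probe` helpers are assumptions made for this sketch. The early exit on "unhealthy" mirrors the behaviour the notes describe.

```shell
# Hypothetical sketch; the real helpers in the lab scripts may differ.

# Probes live in their own functions so they can be replaced or stubbed.
health_status() {
  # Prints starting, healthy, or unhealthy for a container that has a healthcheck.
  docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null
}

http_probe() {
  # Succeeds only if the URL answers with a non-error HTTP status.
  curl -sf -o /dev/null "$1"
}

# wait_healthy <container> <attempts> <interval_seconds>  (assumed signature)
wait_healthy() {
  local name=$1 attempts=$2 interval=$3 i=0 status
  while [ "$i" -lt "$attempts" ]; do
    status=$(health_status "$name") || status=""
    case "$status" in
      healthy)   return 0 ;;
      unhealthy) return 1 ;;  # Docker has given up; let the caller fall back
    esac
    sleep "$interval"
    i=$((i + 1))
  done
  return 1
}

# wait_http <url> <attempts> <interval_seconds>
wait_http() {
  local url=$1 attempts=$2 interval=$3 i=0
  while [ "$i" -lt "$attempts" ]; do
    http_probe "$url" && return 0
    sleep "$interval"
    i=$((i + 1))
  done
  return 1
}
```

With these definitions, a call such as `wait_healthy freepbx 60 30 || wait_http "http://localhost:8301/" 20 30` (the container name `freepbx` is a placeholder) polls Docker health for up to 30 minutes and then allows 10 more minutes of HTTP retries.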
**`it-stack-dev/scripts/testing/lab-phase4.sh` — Snipe-IT and Graylog healthchecks:**
- **Snipe-IT** root cause: Docker healthcheck `retries: 20` at 20s interval = 400s hard cap; first-run Laravel migrations + asset compilation take 6–8 minutes on local Docker Desktop; both the Docker healthcheck and `wait_healthy 24 10` (240s) timed out before migrations completed
- Increased healthcheck `retries` 20 → 30 (600s hard cap)
- **Graylog** root cause: default `GRAYLOG_MESSAGE_JOURNAL_MAX_SIZE` is 5 GB; local Docker Desktop has limited disk I/O throughput, causing journal segment creation to take >720s; Docker healthcheck `retries: 24` at 20s (480s) plus `start_period: 150s` gave a 630s cap, so Docker marked the container "unhealthy" before it was ready, causing `wait_healthy` to immediately exit false
- Increased healthcheck `retries` 24 → 36 (870s hard cap, consistent with `start_period: 150s` + 720s window)
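
The caps quoted in these notes follow from simple arithmetic, which the sketch below reproduces. The helper name `healthcheck_window` is hypothetical, and the formula is an approximation: it treats the window as `retries × interval + start_period` and ignores the per-check `timeout` and how Docker counts checks during the start period.

```shell
# Hypothetical helper: approximate seconds before Docker can mark a
# container "unhealthy", given retries, interval, and optional start_period.
healthcheck_window() {
  local retries=$1 interval=$2 start_period=${3:-0}
  echo $((retries * interval + start_period))
}

healthcheck_window 40 20        # Zammad nginx after the change: 800
healthcheck_window 30 20        # Snipe-IT after the change: 600
healthcheck_window 24 20 150    # Graylog before the change: 630
healthcheck_window 36 20 150    # Graylog after the change: 870
```

The same calculation with `start_period` omitted gives the `wait_healthy` budgets, for example 60 attempts at 30s is 1800s.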