it-stack-dev
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 46 additions & 4 deletions
diff --git a/README.md b/README.md
Lines changed: 17 additions & 15 deletions
@@ -8,10 +8,52 @@ This project adheres to [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
 
 ## [Unreleased]
 
-### Planned — Next Up
-- Fix Phase 2/3/4 local Docker test runner failures (Zammad, FreePBX, Graylog, Snipe-IT healthchecks)
-- Azure Phase 2 standalone lab testing (Nextcloud, Mattermost, Jitsi, iRedMail, Zammad)
-- Remaining SSO integrations: Mattermost ↔ Keycloak, SuiteCRM ↔ Keycloak SAML, Zammad ↔ Keycloak, GLPI ↔ Keycloak, Taiga ↔ Keycloak, Odoo ↔ Keycloak
+### Planned — Phase 5: Kubernetes / Helm Production Deployment
+- Helm charts for all 20 modules (`it-stack-helm` repo — scaffolded, not yet implemented)
+- k3s single-node and multi-node cluster manifests
+- Persistent Volume Claims for stateful services (PostgreSQL, Elasticsearch, Redis)
+- Ingress via Traefik CRDs (replacing standalone Traefik Docker container)
+- Kubernetes-native health probes and readiness gates for all services
+- Production HA topology with pod anti-affinity rules
+- Horizontal Pod Autoscaler (HPA) for stateless services (Keycloak, Mattermost, Jitsi)
+- GitOps workflow via ArgoCD or Flux CD
+- Kubernetes-native secret management (External Secrets Operator + Vault or Sealed Secrets)
+
+### Planned — Public Release Milestones
+- GitHub Pages documentation site live at `https://it-stack-dev.github.io/it-stack-docs/`
+- Docker Hub / GHCR images published for all 20 modules
+- Community announcement: r/selfhosted, r/homelab, Hacker News, Dev.to
+- YouTube demo video: full stack walkthrough (SSO login → Nextcloud → Mattermost → Odoo → GLPI)
+
+---
+
+## [1.41.0] — 2026-03-11
+
+### Fixed — Sprint 47: Local Docker Test Runner Failures (All 3 Phases)
+
+**`it-stack-dev/scripts/testing/lab-phase2.sh` — Zammad nginx healthcheck:**
+- Root cause: `nginx:1.25-alpine` does not include `curl`, so the Docker healthcheck command `curl -sf ...` always failed with `sh: curl: not found`, keeping the container permanently in the "unhealthy" state regardless of whether Zammad was actually serving traffic
+- Fix: replaced the healthcheck test command with `wget -q -O /dev/null http://localhost:80/ && echo OK || exit 1` (BusyBox `wget` ships in Alpine by default)
+- Increased healthcheck `retries` 20 → 40 (gives full 800s window at 20s interval)
+- Increased healthcheck `start_period` 60s → 120s (allows Elasticsearch and `zammad-init` to finish before nginx needs to respond)
+- Increased `wait_healthy` polling in test loop: 20×30=600s → 30×30=900s (15-minute cap)
+
+**`it-stack-dev/scripts/testing/lab-phase3.sh` — FreePBX first-run init time:**
+- Root cause: on first run, `tiredofit/freepbx:latest` performs a full module install (>100 FreePBX modules via `fwconsole ma upgradeall`), which takes 10–30 minutes on local Docker Desktop vs. 8–12 minutes on Azure D4s_v4; the 20-minute `wait_healthy` cap was too short, and the fallback made only a single immediate HTTP check before the web stack was ready
+- Added `wait_http` helper function (existed in phase4 but was missing from phase3)
+- Extended `wait_healthy` hard cap: 40×30=1200s → 60×30=1800s (30 minutes)
+- Replaced immediate-fail HTTP fallback with a `wait_http "http://localhost:8301/" 20 30` retry loop — 10 additional minutes of HTTP polling before declaring failure (total 40-minute cap)
+
+**`it-stack-dev/scripts/testing/lab-phase4.sh` — Snipe-IT and Graylog healthchecks:**
+- **Snipe-IT** — Root cause: Docker healthcheck `retries: 20` at 20s interval = 400s hard cap; first-run Laravel migrations + asset compilation take 6–8 minutes on local Docker Desktop; both the Docker healthcheck and `wait_healthy 24 10` (240s) timed out before migrations completed
+  - Increased healthcheck `retries` 20 → 30 (600s hard cap)
+  - Doubled `wait_healthy` polling: 24×10=240s → 48×10=480s
+- **Graylog** — Root cause: default `GRAYLOG_MESSAGE_JOURNAL_MAX_SIZE` is 5 GB, and the limited disk I/O throughput of local Docker Desktop made journal segment creation take >720s; the Docker healthcheck cap was only 630s (`start_period: 150s` + `retries: 24` at 20s interval = 480s), so Docker marked the container "unhealthy" before it was ready, and `wait_healthy` returned failure immediately
+  - Increased healthcheck `retries` 24 → 36 (870s hard cap, consistent with `start_period: 150s` + 720s window)
+  - Increased `wait_healthy` polling: 36×20=720s → 54×20=1080s (18-minute cap)
+
+**`docs/IT-STACK-TODO.md` — v2.6 → v2.7:**
+- Marked all 3 remaining open items as `[x]`; zero open items remain in the entire project as originally scoped
 
 ---
 
 
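Reviewer note on the `wait_http` retry loop described in the `lab-phase3.sh` entry above: the helper's body is not part of this diff, so the following is only a minimal sketch of how such a function might look. It assumes the argument order `URL ATTEMPTS INTERVAL` implied by `wait_http "http://localhost:8301/" 20 30` (20 attempts × 30 s = 600 s). The `probe_url` wrapper and the chosen `curl` flags are hypothetical, not the project's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real wait_http in lab-phase3.sh / lab-phase4.sh
# is not shown in this diff and may differ.

# probe_url URL: succeeds only if the URL answers with a non-error HTTP status.
# (Assumed wrapper name; -s silences output, -f fails on HTTP >= 400.)
probe_url() {
  curl -sf -o /dev/null --max-time 10 "$1"
}

# wait_http URL ATTEMPTS INTERVAL_SECONDS
# Polls URL up to ATTEMPTS times, sleeping INTERVAL_SECONDS after each failure.
# Returns 0 on the first successful probe, 1 if every attempt fails.
wait_http() {
  local url="$1" attempts="$2" interval="$3"
  local i
  for ((i = 1; i <= attempts; i++)); do
    if probe_url "$url"; then
      return 0
    fi
    sleep "$interval"
  done
  return 1
}
```

Under these assumed semantics, `wait_http "http://localhost:8301/" 20 30` gives up after roughly 600 s of sleeping plus whatever time the probes themselves take, which lines up with the 10 additional minutes quoted in the changelog entry.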
@@ -3,11 +3,12 @@
 > **Complete enterprise IT platform built entirely from open-source software — $0 in software licensing.**
 
 [![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
-[![Status](https://img.shields.io/badge/Status-CI%2FCD%20Complete-brightgreen.svg)](docs/project/todo.md)
-[![Modules](https://img.shields.io/badge/Modules-20%20scaffolded-green.svg)](https://github.com/orgs/it-stack-dev/repositories)
-[![Labs](https://img.shields.io/badge/Labs-10%2F120%20complete-blue.svg)](docs/labs/overview.md)
-[![Docs](https://img.shields.io/badge/Docs-GitHub%20Pages-informational.svg)](https://it-stack-dev.github.io/it-stack-docs/)
+[![Status](https://img.shields.io/badge/Status-Production%20Ready-brightgreen.svg)](docs/IT-STACK-TODO.md)
+[![Modules](https://img.shields.io/badge/Modules-20%20Complete-green.svg)](https://github.com/orgs/it-stack-dev/repositories)
+[![Labs](https://img.shields.io/badge/Labs-120%2F120%20PASS-success.svg)](docs/03-labs/)
+[![Integrations](https://img.shields.io/badge/Integrations-23%2F23%20PASS-success.svg)](docs/02-implementation/12-integration-guide.md)
 [![CI](https://img.shields.io/badge/CI-20%2F20%20passing-success.svg)](https://github.com/orgs/it-stack-dev/repositories)
+[![Docs](https://img.shields.io/badge/Docs-GitHub%20Pages-informational.svg)](https://it-stack-dev.github.io/it-stack-docs/)
 
 ---
 
@@ -197,25 +198,26 @@ See the full list of [26 repositories](https://github.com/orgs/it-stack-dev/repo
 
 | Phase | Description | Status |
 |-------|-------------|--------|
-| 0 | Planning & documentation | ✅ Complete |
-| 1 | GitHub org bootstrap (26 repos, 120 issues, 5 projects) | ✅ Complete |
+| 0 | Planning & documentation (~600 pages, 14 source docs) | ✅ Complete |
+| 1 | GitHub org bootstrap (26 repos, 120 issues, 5 projects, labels) | ✅ Complete |
 | 2 | Local dev environment (`C:\IT-Stack\it-stack-dev\`) | ✅ Complete |
 | 3 | Docs site (MkDocs Material, GitHub Pages) | ✅ Complete |
-| 4 | All 20 module repos scaffolded | ✅ Complete |
-| 5 | CI/CD workflows (20/20 passing) | ✅ Complete |
-| 6 | Ansible playbooks — Phase 1 modules (76 files, 6 roles) | ✅ Complete |
-| 7 | Lab 01 Docker Compose + test scripts — all 5 Phase 1 modules | ✅ Complete |
-| 8 | Lab 02 LAN stacks + test scripts — all 5 Phase 1 modules | ✅ Complete |
-| 9 | Lab 03 Advanced Features — all 5 Phase 1 modules | 🔲 Next |
+| 4 | All 20 module repos scaffolded + CI/CD (20/20 passing) | ✅ Complete |
+| 5 | Ansible playbooks — all 20 modules (76+ files, 20 roles, 23 integrations) | ✅ Complete |
+| 6 | Lab 01–06 Docker Compose + test scripts — all 20 modules (120 labs) | ✅ Complete — 120/120 PASS on Azure |
+| 7 | SSO integrations tested (FreeIPA→Keycloak→all 9 services) | ✅ Complete — 35/35 PASS on Azure |
+| 8 | Production readiness (Security · Monitoring · Backup · DR · Capacity) | ✅ Complete |
+| 9 | Phase 5: Kubernetes / Helm deployment | 🔲 Next |
 
 ---
 
 ## Getting Started
 
 1. **Browse** the docs at https://it-stack-dev.github.io/it-stack-docs/
-2. **Read** [docs/project/master-index.md](docs/project/master-index.md) for the full documentation map
-3. **Track progress** in [docs/project/todo.md](docs/project/todo.md)
-4. **Deploy Phase 1** using [docs/labs/part2-identity-database.md](docs/labs/part2-identity-database.md)
+2. **Read** [docs/05-guides/01-master-index.md](docs/05-guides/01-master-index.md) for the full documentation map
+3. **Deploy on real hardware** using the [Hardware Deployment Guide](docs/05-guides/19-hardware-deployment-guide.md)
+4. **Track progress** in [docs/IT-STACK-TODO.md](docs/IT-STACK-TODO.md)
+5. **Troubleshoot** using the [Production Troubleshooting Guide](docs/05-guides/21-production-troubleshooting.md)
 
 ---