Commit 8868495

docs: cloud lab deployment, GUI walkthrough, quick-start guide, full quality pass
Deployment guides:
- Add 00-quick-start-deploy.md: primary entry point for cloud and on-prem setup
- Add Current Live Deployment section to 18-azure-lab-deployment.md (VM specs, 13-service table, access URLs, NSG setup, disk expansion)

New guides:
- Add 22-gui-walkthrough.md: full accuracy pass for live cloud environment (live service state, correct credentials, cleaned URLs, MailHog removed)
- Add 23-thunderbird-integration.md: email/calendar/contacts desktop setup

Documentation updates:
- Update 17-admin-runbook.md: Deployment Context table, Cloud Lab VM Operations
- Update 01-master-index.md: Path 0 (cloud lab), v2.1 versioning section
- Update 21-production-troubleshooting.md: cloud vs on-prem context header
- Update network-topology.md: Cloud Single-VM Topology section (Azure, live)
- Update IT-STACK-TODO.md: Phase: Cloud Lab Deployment checklists
- Update docs/README.md: complete 05-guides table (00-23), numbering key
- Update README.md: quick-start first in Getting Started, GUI walkthrough link
- Update CHANGELOG.md: v2.1.0 entry with all changed files

Scripts:
- Add apply-fixes.sh, demo-start.sh, run-tests.sh
1 parent af13a3b commit 8868495

16 files changed, 4091 additions and 24 deletions

CHANGELOG.md

Lines changed: 58 additions & 0 deletions
@@ -27,6 +27,64 @@ This project adheres to [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
 
 ---
 
+## [2.1.0] - 2026-03-12
+
+### Added - Cloud Lab Deployment (Azure Single-VM)
+
+This release documents the live hands-on deployment of IT-Stack services on a single Azure VM (`lab-single`, West US 2), establishing the first public-IP-accessible demo environment for IT-Stack.
+
+**Infrastructure**
+- Azure VM `lab-single`: Standard_D4s_v4 (4 vCPU / 16 GB RAM), Ubuntu 24.04 LTS, 30 GB Premium SSD
+- Static public IP `4.154.17.25` with NSG ports opened for all service UIs and protocols
+- Azure Private DNS zone `lab.it-stack.local` mapped to private IP
+- Auto-shutdown policy: 22:00 UTC daily (saves ~33% of compute cost)
+- docker-mailserver `mail-demo` deployed as real SMTP/IMAP relay (domain: itstack.local)
+
+**New Services Deployed**
+- Jitsi Meet (`jitsi-web-lab01`): port 8880; 4-container stack (web, prosody, jicofo, JVB)
+- Taiga (`taiga-front-s01` + `taiga-back-s01`): ports 9001/9000; project management
+- Zabbix (`zabbix-web-s01`): port 8307; full monitoring stack + Zabbix Agent 2 on host
+- Graylog (`graylog-s01`): port 9002; GELF UDP :12201 + Syslog UDP :1514 inputs live
+
+**Existing Services Configured / Fixed**
+- Nextcloud: 57 apps installed and enabled (collaboration, security, field operations, storage, integration)
+- SuiteCRM: SMTP configured via `config_override.php` (Bitnami image path corrected)
+- Odoo: SMTP configured via direct SQL into `ir_mail_server` + `ir_config_parameter`
+- Mattermost: SMTP wired to `mail-demo`
+- Snipe-IT: 500 Internal Server Error root-caused and fixed. The duplicate migration `2018_05_14_223646_add_indexes_to_assets` was marked as completed in the `migrations` table, the remaining migrations were applied, and the admin user was created via `artisan snipeit:create-admin`
+- Snipe-IT: SMTP settings injected via container ENV vars (not DB config)
+- Keycloak: Nginx reverse proxy sidecar added for subdirectory routing
+
+### Fixed
+- Snipe-IT 500 error caused by pre-existing `assets_created_at_index` conflicting with migration attempt
+- SuiteCRM SMTP path: Bitnami container stores config in `/bitnami/suitecrm/public/legacy/`, not `/var/www/html/`
+- Odoo SMTP: initial DB created with name `testdb` (not `odoo`); all DB commands must target `testdb`
+- Disk space: removed orphaned `elasticsearch:8.17.3` image (2 GB) and unused `mailhog/mailhog` (572 MB)
+
+### Documentation
+- `docs/05-guides/18-azure-lab-deployment.md`: New major section "Current Live Deployment (March 2026)" covering VM specs, all 12+ services and ports, access URLs, NSG rules, SMTP configs, Snipe-IT fix procedure, Graylog/Zabbix setup commands, disk expansion guide, and Azure cost breakdown
+- `docs/07-architecture/network-topology.md`: New major section "Cloud Single-VM Topology" with full container diagram, port map table, and limitations comparison vs. 8-server on-prem
+- `README.md`: Added "Cloud Lab Deployment (Live, March 2026)" callout; added Cloud row to Project Status table; updated Getting Started steps
+- `docs/IT-STACK-TODO.md`: Added "Phase: Cloud Lab Deployment" section tracking all completed, pending, and blocked items
+- `docs/05-guides/22-gui-walkthrough.md`: Full accuracy pass: Service Directory table split into Active/Pending, all module sections annotated with ✅ Already running or ⏳ Pending, NSG/SSH commands updated to active ports only, corrected credentials (Snipe-IT, Zabbix, Odoo), removed stale MailHog entry, updated Zammad with disk expansion prerequisite
+- `docs/05-guides/01-master-index.md`: Added Path 0 (cloud lab, zero-setup), updated Documentation Versioning to v2.1
+- `docs/05-guides/17-admin-runbook.md`: Added Deployment Context table and Cloud Lab VM operations section (health check, container management, Keycloak user management)
+- `docs/05-guides/21-production-troubleshooting.md`: Added deployment context header clarifying single-VM vs multi-server command translation
+
+### Security / Cost
+- Deleted idle Bastion `workspace-1-vnet-bastion` (was billing ~$140/month while unused)
+- Queued deletion of second Bastion `rg-stack-test1`
+- Identified 2 unattached static IPs for deletion (saves ~$7.44/month)
+- Estimated monthly cost at 16 hrs/day runtime: ~$105/month (within Azure Students $100/month credit with recommended optimizations)
+
+### Known Issues / Pending
+- **Zammad** not deployed: the 30 GB OS disk reached 100% capacity during JS asset write; requires disk expansion to 64 GB before retry
+- Docker GELF log driver not yet enabled globally on the host (`/etc/docker/daemon.json`); once enabled, all container logs will route to Graylog automatically
+- Zabbix → Mattermost webhook alert channel (`#ops-alerts`) not yet configured
+- Keycloak OIDC realm clients for Nextcloud and Mattermost not yet wired end-to-end
+
+---
+
 ## [1.41.0] - 2026-03-11
 
 ### Fixed - Sprint 47: Local Docker Test Runner Failures (All 3 Phases)
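Reviewer note on the Snipe-IT fix recorded in this changelog: the entry says the duplicate migration was marked as completed in the `migrations` table so that the remaining migrations could run. The following is a minimal sketch of that idea, not the procedure actually used. It assumes the standard Laravel bookkeeping columns (`id`, `migration`, `batch`) and uses an in-memory SQLite database as a stand-in for the real Snipe-IT database; the helper name `mark_migration_done` and the earlier migration row are hypothetical.

```python
import sqlite3

# Migration name taken from the changelog entry above.
MIGRATION = "2018_05_14_223646_add_indexes_to_assets"


def mark_migration_done(conn: sqlite3.Connection, name: str) -> bool:
    """Record `name` as applied; return False if it was already recorded."""
    already = conn.execute(
        "SELECT 1 FROM migrations WHERE migration = ?", (name,)
    ).fetchone()
    if already:
        return False
    # Laravel groups migrations into batches; use the next batch number.
    next_batch = conn.execute(
        "SELECT COALESCE(MAX(batch), 0) + 1 FROM migrations"
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO migrations (migration, batch) VALUES (?, ?)",
        (name, next_batch),
    )
    conn.commit()
    return True


def demo() -> list:
    """Build a stand-in table, mark the migration twice, return the results."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE migrations ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, migration TEXT, batch INTEGER)"
    )
    # Hypothetical earlier migration so the table is not empty.
    conn.execute(
        "INSERT INTO migrations (migration, batch) VALUES (?, ?)",
        ("2018_01_01_000000_hypothetical_earlier_migration", 1),
    )
    first = mark_migration_done(conn, MIGRATION)
    second = mark_migration_done(conn, MIGRATION)
    rows = conn.execute(
        "SELECT migration, batch FROM migrations ORDER BY id"
    ).fetchall()
    return [first, second, rows]


if __name__ == "__main__":
    print(demo())
```

The existence check makes the step safe to repeat. Once the row exists, Laravel's `php artisan migrate` skips that migration and applies only the ones still missing, which matches the "remaining migrations applied" step in the entry.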

README.md

Lines changed: 24 additions & 5 deletions
@@ -194,6 +194,23 @@ See the full list of [26 repositories](https://github.com/orgs/it-stack-dev/repo
 
 ---
 
+## Cloud Lab Deployment (Live, March 2026)
+
+> **12 services are currently running** on a single Azure VM as a live demo/lab environment.
+
+| Property | Value |
+|----------|-------|
+| VM | `lab-single`, Standard_D4s_v4 (4 vCPU / 16 GB RAM) |
+| Public IP | `4.154.17.25` |
+| Region | West US 2 |
+| Status | ✅ Running (auto-shuts down at 22:00 UTC) |
+
+**Live services:** Keycloak · Nextcloud (57 apps) · Mattermost · SuiteCRM · Odoo · Snipe-IT · Jitsi Meet · Taiga · Zabbix · Graylog · Traefik · docker-mailserver
+
+See the **Current Live Deployment** section of [docs/05-guides/18-azure-lab-deployment.md](docs/05-guides/18-azure-lab-deployment.md) for all ports, credentials, compose commands, and cost breakdown.
+
+---
+
 ## Project Status
 
 | Phase | Description | Status |
@@ -207,17 +224,19 @@ See the full list of [26 repositories](https://github.com/orgs/it-stack-dev/repo
 | 6 | Lab 01–06 Docker Compose + test scripts (all 20 modules, 120 labs) | ✅ Complete: 120/120 PASS on Azure |
 | 7 | SSO integrations tested (FreeIPA→Keycloak→all 9 services) | ✅ Complete: 35/35 PASS on Azure |
 | 8 | Production readiness (Security · Monitoring · Backup · DR · Capacity) | ✅ Complete |
+| Cloud | Single-VM Azure lab: 12/20 services live on 4.154.17.25 | ✅ Live (March 2026) |
 | 9 | Phase 5: Kubernetes / Helm deployment | 🔲 Next |
 
 ---
 
 ## Getting Started
 
-1. **Browse** the docs at https://it-stack-dev.github.io/it-stack-docs/
-2. **Read** [docs/05-guides/01-master-index.md](docs/05-guides/01-master-index.md) for the full documentation map
-3. **Deploy on real hardware** using the [Hardware Deployment Guide](docs/05-guides/19-hardware-deployment-guide.md)
-4. **Track progress** in [docs/IT-STACK-TODO.md](docs/IT-STACK-TODO.md)
-5. **Troubleshoot** using the [Production Troubleshooting Guide](docs/05-guides/21-production-troubleshooting.md)
+1. **Deploy now**: follow [docs/05-guides/00-quick-start-deploy.md](docs/05-guides/00-quick-start-deploy.md) for cloud (Azure) or on-prem step-by-step setup
+2. **Browse** the docs at https://it-stack-dev.github.io/it-stack-docs/
+3. **Read** [docs/05-guides/01-master-index.md](docs/05-guides/01-master-index.md) for the full documentation map
+4. **Walk through the UI**: see [docs/05-guides/22-gui-walkthrough.md](docs/05-guides/22-gui-walkthrough.md) for every service with credentials
+5. **Track progress** in [docs/IT-STACK-TODO.md](docs/IT-STACK-TODO.md)
+6. **Troubleshoot** using the [Production Troubleshooting Guide](docs/05-guides/21-production-troubleshooting.md)
 
 ---
 
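Reviewer note on the auto-shutdown figure: the README status row gives a 22:00 UTC shutdown, and the changelog in this commit cites roughly 33% compute savings at 16 hrs/day of runtime. The arithmetic behind those two numbers can be checked with a minimal sketch. The function names are hypothetical, and the 06:00 UTC start hour is only inferred from the other two figures; the commit does not state it.

```python
HOURS_PER_DAY = 24


def savings_fraction(runtime_hours: float) -> float:
    """Fraction of always-on compute hours avoided at this daily runtime."""
    return 1 - runtime_hours / HOURS_PER_DAY


def start_hour_utc(shutdown_hour: int, runtime_hours: int) -> int:
    """Start hour (UTC) needed to reach the runtime before the shutdown hour."""
    return (shutdown_hour - runtime_hours) % HOURS_PER_DAY


if __name__ == "__main__":
    # 16 hrs/day runtime and 22:00 UTC shutdown are taken from the commit.
    print(f"compute hours saved: {savings_fraction(16):.1%}")
    print(f"implied start hour (UTC): {start_hour_utc(22, 16)}")
```

This covers compute hours only. Charges that accrue while the VM is stopped, such as the managed disk and the static public IP, are not reduced by the shutdown schedule, so the saving on the total bill would likely be smaller than the compute figure.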

apply-fixes.sh

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
+#!/usr/bin/env bash
+# Apply the three Docker runner fixes to lab scripts on Azure VM
+set -e
+
+echo "=== Fix 1: Zammad nginx: curl → wget ==="
+python3 - << 'PYEOF'
+import re
+
+with open('/home/itstack/lab-phase2.sh', 'r') as f:
+    content = f.read()
+
+old = 'test: ["CMD-SHELL", "curl -sf -o /dev/null -w \'%{http_code}\' http://localhost:80/ | grep -qE \'^[23]\'"]'
+new = 'test: ["CMD-SHELL", "wget -q -O /dev/null http://localhost:80/ && echo OK || exit 1"]'
+
+if old in content:
+    content = content.replace(old, new)
+    with open('/home/itstack/lab-phase2.sh', 'w') as f:
+        f.write(content)
+    print("FIXED: curl replaced with wget in Zammad nginx healthcheck")
+elif 'wget -q -O /dev/null http://localhost:80/' in content:
+    print("ALREADY FIXED: wget already present in Zammad nginx healthcheck")
+else:
+    # Try regex approach
+    pattern = r'test: \["CMD-SHELL", "curl[^"]*localhost:80/[^"]*"\]'
+    match = re.search(pattern, content)
+    if match:
+        content = re.sub(pattern, 'test: ["CMD-SHELL", "wget -q -O /dev/null http://localhost:80/ && echo OK || exit 1"]', content)
+        with open('/home/itstack/lab-phase2.sh', 'w') as f:
+            f.write(content)
+        print(f"FIXED via regex: replaced '{match.group()}'")
+    else:
+        print("WARNING: Could not locate curl healthcheck. Current healthcheck lines:")
+        for i, line in enumerate(content.split('\n')):
+            if 'localhost:80' in line or ('zammad' in line.lower() and 'health' in line.lower()):
+                print(f"  line {i+1}: {line.strip()}")
+PYEOF
+
+echo ""
+echo "=== Fix 2: FreePBX: wait_healthy 40x30 → 60x30 (20min → 30min) ==="
+python3 - << 'PYEOF'
+with open('/home/itstack/lab-phase3.sh', 'r') as f:
+    content = f.read()
+
+old = 'wait_healthy "$app" 40 30'
+new = 'wait_healthy "$app" 60 30'
+
+if old in content:
+    count = content.count(old)
+    content = content.replace(old, new)
+    with open('/home/itstack/lab-phase3.sh', 'w') as f:
+        f.write(content)
+    print(f"FIXED: FreePBX wait extended to 30min ({count} occurrence(s) replaced)")
+elif 'wait_healthy "$app" 60 30' in content:
+    print("ALREADY FIXED: FreePBX wait is already 60x30")
+else:
+    print("WARNING: wait_healthy pattern not found. Searching for FreePBX wait...")
+    for i, line in enumerate(content.split('\n')):
+        if 'wait_healthy' in line and ('freepbx' in line.lower() or 'app' in line):
+            print(f"  line {i+1}: {line.strip()}")
+PYEOF
+
+echo ""
+echo "=== Fix 3: Snipe-IT: wait_healthy 24x10 → 48x10 (4min → 8min) ==="
+python3 - << 'PYEOF'
+with open('/home/itstack/lab-phase4.sh', 'r') as f:
+    content = f.read()
+
+old = 'wait_healthy "$app" 24 10'
+new = 'wait_healthy "$app" 48 10'
+
+if old in content:
+    count = content.count(old)
+    content = content.replace(old, new)
+    with open('/home/itstack/lab-phase4.sh', 'w') as f:
+        f.write(content)
+    print(f"FIXED: Snipe-IT wait extended to 8min ({count} occurrence(s) replaced)")
+elif 'wait_healthy "$app" 48 10' in content:
+    print("ALREADY FIXED: Snipe-IT wait is already 48x10")
+else:
+    print("WARNING: wait_healthy pattern not found. Searching for snipeit wait...")
+    for i, line in enumerate(content.split('\n')):
+        if 'wait_healthy' in line and ('snipe' in line.lower() or 'app' in line):
+            print(f"  line {i+1}: {line.strip()}")
+PYEOF
+
+echo ""
+echo "=== Verification ==="
+echo "phase2 nginx healthcheck:"
+grep -n 'localhost:80' ~/lab-phase2.sh | head -3
+
+echo "phase3 freepbx wait:"
+grep -n 'wait_healthy.*app' ~/lab-phase3.sh | head -3
+
+echo "phase4 snipeit wait:"
+grep -n 'wait_healthy.*app' ~/lab-phase4.sh | grep -i snipe | head -3
+# Fallback: show all app waits in snipeit section
+grep -n 'wait_healthy.*app' ~/lab-phase4.sh | head -5
+
+echo ""
+echo "DONE: check the output above for WARNING lines before rerunning the labs"
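Reviewer note on `apply-fixes.sh`: the three heredoc blocks repeat the same replace-if-present, report-if-already-fixed pattern, and a WARNING branch does not change the script's exit status. One possible refactor is sketched below, under the assumption that a plain string replacement is enough (the regex fallback from Fix 1 is left out). The names `patch_file` and `run_fixes` are hypothetical and are not part of this commit.

```python
import tempfile
from pathlib import Path


def patch_file(path: Path, old: str, new: str) -> str:
    """Replace every `old` with `new` in `path`.

    Returns "fixed", "already-fixed", or "not-found" so the caller can
    decide whether the overall run succeeded.
    """
    content = path.read_text()
    if old in content:
        path.write_text(content.replace(old, new))
        return "fixed"
    if new in content:
        return "already-fixed"
    return "not-found"


def run_fixes(fixes) -> int:
    """Apply (path, old, new) tuples; return a shell-style exit code."""
    failed = 0
    for path, old, new in fixes:
        status = patch_file(path, old, new)
        print(f"{status.upper()}: {path.name}")
        if status == "not-found":
            failed += 1
    return 1 if failed else 0


if __name__ == "__main__":
    # Demonstrate on a throwaway copy rather than the real lab scripts.
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "lab-phase3.sh"
        script.write_text('wait_healthy "$app" 40 30\n')
        fix = (script, 'wait_healthy "$app" 40 30', 'wait_healthy "$app" 60 30')
        print(run_fixes([fix]))  # first run applies the fix
        print(run_fixes([fix]))  # second run reports already-fixed
        print(run_fixes([(script, "no such text", "replacement")]))
```

Returning a status instead of only printing lets a wrapper exit non-zero when a pattern is missing, so a final success message is printed only when every fix was either applied or already present.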
