docs(plan): step-by-step go-live session walkthrough (You + Claude) #53

Merged
JacobHaig merged 2 commits into main from docs/uhhcraft-go-live-walkthrough on Jun 4, 2026

@JacobHaig
Member

@JacobHaig JacobHaig commented Jun 4, 2026

Summary

A sequential operator script for running the UhhCraft go-live as a live session — the how-we-execute-together companion to the comprehensive UHHCRAFT-GO-LIVE-PLAN.md (the what).

Each step is labelled [YOU] / [CLAUDE] / [BOTH] with an explicit hand-off cue (the exact thing to say/paste to move on), so you can follow it top-to-bottom in chat.

The flow

  0. Decisions (in chat) → 1. Reachability handshake (I test what I can reach via the site-config creds: Semaphore API / OpenBao / Proxmox; the result decides whether each later step is Claude-driven or you-driven) → 2. VMs → 3. GPU passthrough → 4. DNS/Stripe → 5. Secrets → 6. Policies/SSH/Podman → 7. Ordered deploy (+ triage loop) → 8. Smoke → 9. Rollback drill → 10. Sign-off.

Key design points

  • Honest division of labor: I trigger/poll Semaphore jobs and run checks where I can reach the service (confirmed in Step 1, with your authorization to use the site-config creds); you do host/dashboard/decision work; I triage every failure into a fix.
  • Keeps the Semaphore-only deploy rule — "I deploy" = trigger a Semaphore template via API, never a manual SSH.
  • Re-flags the open Caddy-host gap at the exact step it matters (deploy step 7).
  • Ends with a who-owns-what table.
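
To make the Semaphore-only rule concrete, here is a minimal sketch of what "trigger a Semaphore template via API" could look like. It only builds the HTTP request and does not send it. The endpoint path (`/api/project/{project_id}/tasks`), the `template_id` body field, and bearer-token auth are assumptions based on Semaphore's public REST API and should be checked against the deployed version; the base URL, IDs, and token below are placeholders, and `build_trigger_request` is a hypothetical helper that is not part of this PR.

```python
import json
import urllib.request


def build_trigger_request(base_url: str, project_id: int, template_id: int,
                          token: str) -> urllib.request.Request:
    """Build (but do not send) a request asking Semaphore to run a template.

    Assumed API shape: POST {base_url}/api/project/{project_id}/tasks
    with a JSON body of {"template_id": <int>} and a bearer token.
    """
    url = f"{base_url.rstrip('/')}/api/project/{project_id}/tasks"
    body = json.dumps({"template_id": template_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values only; nothing is sent over the network here.
    req = build_trigger_request("https://semaphore.example", 1, 42, "PLACEHOLDER")
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it easy to review exactly what would be triggered before anything runs against the live cluster.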

Cross-linked from the go-live plan + architecture-reference.md. Docs only; links resolve, plan/ ASCII-art gate + secret/IP audit clean.

AI Layer

  • Adds a new operator-facing runbook: UHHCRAFT-GO-LIVE-WALKTHROUGH.md — a top-to-bottom, chat-friendly stepwise script for a live “You + Claude” go-live session with explicit [YOU]/[CLAUDE]/[BOTH] ownership and exact hand-off cues.
  • Introduces a reachability handshake (Step 1) that gates whether Claude triggers/polls Semaphore jobs or the human operator runs UI actions; documents which steps Claude may poll/triage vs operator-only actions.
  • Cross-links the walkthrough as the execution companion to UHHCRAFT-GO-LIVE-PLAN.md and registers the walkthrough in the architecture index (architecture-reference.md) with status: ACTIVE.
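
The reachability gate can be pictured as a small lookup. This is a sketch under stated assumptions, not the walkthrough's actual logic: the step-to-service mapping in `STEP_REQUIRES` is hypothetical (the real walkthrough defines ownership per step), and `assign_owner` is an illustrative name.

```python
from typing import Dict

# Hypothetical mapping: which service Claude would need to reach to drive a step.
# The real walkthrough defines ownership per step; these entries are illustrative.
STEP_REQUIRES: Dict[str, str] = {
    "2 VMs": "proxmox",
    "5 secrets": "openbao",
    "7 ordered deploy": "semaphore",
}


def assign_owner(step: str, reachable: Dict[str, bool]) -> str:
    """Return "[CLAUDE]" only when the service the step needs was confirmed reachable."""
    service = STEP_REQUIRES.get(step)
    if service is None:
        # Steps with no automatable service (decisions, dashboards) stay with the operator.
        return "[YOU]"
    return "[CLAUDE]" if reachable.get(service, False) else "[YOU]"


if __name__ == "__main__":
    # Example handshake result: Semaphore reachable, OpenBao and Proxmox not.
    handshake = {"semaphore": True, "openbao": False, "proxmox": False}
    for step in ("2 VMs", "5 secrets", "7 ordered deploy", "4 DNS/Stripe"):
        print(step, "->", assign_owner(step, handshake))
```

Defaulting to `[YOU]` when a service is missing from the handshake result means an untested service is never treated as reachable.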

Guardrail Layer

  • Documents operator/agent boundary and an explicit Semaphore-only deploy ground rule (deploys via Semaphore template API; no manual SSH deploy.sh). (Docs only—no policy code changes.)

Automation Layer

  • References existing Semaphore templates/playbooks to run in-session (provision-vm.yml, install-podman.yml, distribute-ssh-keys.yml, harden-ssh.yml, apply-policy-*.yml, provision-orb-agent-approle.yml, deploy-* and update-* templates) and prescribes the ordered deploy + triage loop; no template code changed (docs-only).

Platform Layer

  • No code or manifests changed; the walkthrough documents the provisioning sequence for Proxmox VMs, GPU passthrough verification steps, DNS/Stripe webhook actions, secret seeding into OpenBao, and the required rollback drill.

Architectural scope & decisions (summary)

  • Role-based execution model with per-step ownership and exact hand-offs.
  • Reachability-gated automation decides whether Claude performs Semaphore API triggers/polls or remains a triage/guide.
  • Enforces Semaphore-only deploys and a defined sequential deploy order with a triage-on-failure loop.
  • Calls out an unresolved ops gap: there is no deploy-caddy playbook/template, and the Caddy host must be running before the site fragment is distributed.
  • Requires a one-time rollback drill and documents where OpenBao-seeded secrets and Stripe webhook secrets must be provided.
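
The sequential deploy with a triage-on-failure loop could be sketched as below. This is an illustration, not the runbook's implementation: the service names are placeholders (the actual deploy order lives in the walkthrough), and `run` stands in for whatever triggers a Semaphore template and reports success.

```python
from typing import Callable, List, Optional, Tuple


def run_ordered(steps: List[str],
                run: Callable[[str], bool]) -> Tuple[List[str], Optional[str]]:
    """Run steps strictly in order and stop at the first failure.

    Returns (completed_steps, failed_step). failed_step is None when every
    step succeeded; otherwise it names the step to hand to triage.
    """
    completed: List[str] = []
    for name in steps:
        if not run(name):
            return completed, name
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    # Placeholder service names; the second one is made to fail for the demo.
    order = ["service-a", "service-b", "service-c"]
    done, failed = run_ordered(order, lambda name: name != "service-b")
    print("completed:", done)
    print("needs triage:", failed)
```

Stopping at the first failure keeps later deploys from running on top of a broken dependency; after triage, the same function can be re-run on the remaining steps.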

(Notes) This PR is documentation-only; it updates plan frontmatter to status: ACTIVE and adds the walkthrough runbook and index entry. No secrets, policies, code, or manifests were changed by these commits.

Companion to UHHCRAFT-GO-LIVE-PLAN.md: a sequential operator script for
running Phase 10 as a live session. Each step is labelled [YOU] / [CLAUDE] /
[BOTH] with explicit hand-off cues, so it can be followed top-to-bottom in
chat — I run/trigger what I can reach (Semaphore API, OpenBao, Proxmox via the
site-config creds, once authorized + reachability confirmed in Step 1), you do
the host/dashboard/decision work, and I triage failures.

Steps: 0 decisions → 1 reachability handshake → 2 VMs → 3 GPU passthrough →
4 DNS/Stripe → 5 secrets → 6 policies/SSH/podman → 7 ordered deploy (+triage)
→ 8 smoke → 9 rollback drill → 10 sign-off. Includes a who-owns-what table.

Cross-linked from the go-live plan + architecture-reference index.
@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Walkthrough

This PR introduces a comprehensive go-live walkthrough runbook (plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md) prescribing a step-by-step operator–Claude live session for UhhCraft deployment, updates the Phase 10 plan frontmatter and guidance to reference the walkthrough, and adds the walkthrough to the architecture development index.

Changes

Go-Live Walkthrough Runbook

| Layer / File(s) | Summary |
|---|---|
| Walkthrough runbook creation and integration: plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md, plan/development/UHHCRAFT-GO-LIVE-PLAN.md, plan/architecture/architecture-reference.md | Adds a 10-step live-session runbook covering gating decisions, reachability checks, Proxmox VM provisioning, GPU passthrough verification, DNS/Stripe webhook setup, OpenBao-backed secret seeding, Semaphore-templated SSH/hardening/Podman/AppRole provisioning, ordered service deployments with triage-on-failure, smoke tests, rollback drill, sign-off, and ownership/cross-reference entries. Updates Phase 10 frontmatter to ACTIVE and inserts an index row referencing the walkthrough. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • uhstray-io/agent-cloud#32: Documents and changes related to provisioning Proxmox inference VMs, inventory updates, Podman install, and GPU passthrough verification.
  • uhstray-io/agent-cloud#49: Introduced the original Phase 10 go-live plan that this PR references and extends with a dedicated execution walkthrough.

Suggested labels

docs

Poem

📋 Step by step, the operator guides,
Claude replies in timed divides,
Semaphore runs, no SSH leap,
VMs rise and secrets keep,
Smoke tests pass — the launch takes flight. 🚀


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Deploy.Sh Does Not Manage Secrets | ❌ Error | Five deploy.sh scripts violate the secret boundary: n8n, nocodb, semaphore source bao-client.sh; netbox, openbao call secret functions/use port 8200. | Remove bao-client.sh imports and secret-store calls from deploy.sh scripts; migrate secret management to Ansible per architecture standards. |

✅ Passed checks (9 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a step-by-step operator session walkthrough document for the UhhCraft go-live process. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Pr Title Check | ✅ Passed | PR title follows conventional commit format: type (docs) is allowed, scope (plan) is valid, description is present, and total length (67 chars) is under the 72-char limit. |
| No Leaked Secrets Or Ips | ✅ Passed | All three modified files are documentation-only (*.md). No RFC1918 IPs, SSH keys, AWS/GitHub credentials, or other sensitive patterns detected in added/modified lines. |
| No Direct-To-Main Commits | ✅ Passed | Source branch docs/uhhcraft-go-live-walkthrough is not main and follows the required docs/* naming convention. |
| Semaphore Config Is Code Managed | ✅ Passed | PR contains only documentation changes (plan files); no Semaphore config files (templates.yml, setup-templates.yml) or infrastructure code modified. Check condition not triggered. |
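
For readers who want to reproduce the PR title check locally, the following is a rough sketch of the rule as described in the table (conventional-commit shape, allowed type, scope present, length under 72). The `ALLOWED_TYPES` set is an assumption, since the repository's real list is not shown here, and `check_title` is a hypothetical helper rather than the actual check.

```python
import re
from typing import List

# Assumed set of allowed types; the repository's actual list is not shown in this PR.
ALLOWED_TYPES = {"docs", "feat", "fix", "chore", "refactor", "test"}
MAX_TITLE_LEN = 72  # titles must be strictly shorter than this

# type(scope): description
TITLE_RE = re.compile(r"^(?P<type>[a-z]+)\((?P<scope>[^)]+)\): (?P<desc>.+)$")


def check_title(title: str) -> List[str]:
    """Return a list of problems with the title; an empty list means it passes."""
    match = TITLE_RE.match(title)
    if match is None:
        return ["title is not in 'type(scope): description' form"]
    problems: List[str] = []
    if match["type"] not in ALLOWED_TYPES:
        problems.append(f"type '{match['type']}' is not allowed")
    if len(title) >= MAX_TITLE_LEN:
        problems.append(f"title is {len(title)} chars, limit is under {MAX_TITLE_LEN}")
    return problems


if __name__ == "__main__":
    title = "docs(plan): step-by-step go-live session walkthrough (You + Claude)"
    print(len(title), check_title(title))
```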

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md (1)

10-172: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

This plan doc is missing required standard sections.

For plan/**/*.md, this should include the required sections (Problem, Design Principles, Architecture with at least one Mermaid diagram, Implementation Phases with acceptance criteria, Validation Criteria, Security Considerations). Add those sections or explicitly convert this into a non-plan doc location/type if it is intended to be a pure runbook.

As per coding guidelines: “include required sections in each plan doc … Architecture w/ at least one Mermaid diagram; Implementation Phases w/ acceptance criteria; Validation Criteria; Security Considerations.”
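
A quick local check for this finding might look like the sketch below. It assumes the required sections appear as Markdown headings whose text matches the names exactly, which may be stricter than the real guideline; `missing_sections` and `has_mermaid_block` are hypothetical helpers.

```python
import re
from typing import List

REQUIRED_SECTIONS = [
    "Problem",
    "Design Principles",
    "Architecture",
    "Implementation Phases",
    "Validation Criteria",
    "Security Considerations",
]

# Matches ATX headings such as "## Problem" and captures the heading text.
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)

# Built at runtime so this example does not contain a literal code fence.
MERMAID_FENCE = "`" * 3 + "mermaid"


def missing_sections(markdown: str) -> List[str]:
    """List required section names that have no exactly matching heading."""
    headings = {m.group(1) for m in HEADING_RE.finditer(markdown)}
    return [name for name in REQUIRED_SECTIONS if name not in headings]


def has_mermaid_block(markdown: str) -> bool:
    """True if the document opens at least one mermaid code fence."""
    return MERMAID_FENCE in markdown


if __name__ == "__main__":
    sample = "# Title\n\n## Problem\n\ntext\n\n## Architecture\n"
    print(missing_sections(sample), has_mermaid_block(sample))
```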

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md` around lines 10 - 172, The
walkthrough file "UhhCraft Go-Live — Session Walkthrough (You + Claude)" is
missing required plan sections; update this document (or move it out of plan/ if
intended as a pure runbook) by adding the top-level headings: Problem, Design
Principles, Architecture (include at least one Mermaid diagram), Implementation
Phases (break into phases with explicit acceptance criteria for each),
Validation Criteria, and Security Considerations; ensure these sections appear
before or alongside the existing step-by-step walkthrough (you can reference
existing Step 0–Step 10 headings for context) so the doc conforms to the
plan/**.md guideline.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md`:
- Around line 163-171: The Markdown fails lint rules MD022/MD058 because
headings and the revision table lack surrounding blank lines; update the
"Cross-references" and "Revision history" blocks so there is a blank line before
each "## Cross-references" and "## Revision history" heading and ensure a blank
line above and below the revision table (the pipe-delimited table under
"Revision history") so the headings and table have proper spacing for
markdownlint.
- Line 4: The frontmatter uses a non-enum value "status: ACTIVE — ready to run";
change the frontmatter to use the exact enum token by setting status: ACTIVE and
remove the trailing explanatory text from that line, then move the explanatory
phrase ("ready to run") into a separate field (e.g., add context: "ready to
run") or place it into the document body; update the frontmatter block where the
status key is defined to reflect only the enum value and ensure any consumers
read the new context field or body content instead.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: f01b4548-397d-47bd-8421-c21c1a98db4d

📥 Commits

Reviewing files that changed from the base of the PR and between 6181e98 and b3aaf80.

📒 Files selected for processing (3)
  • plan/architecture/architecture-reference.md
  • plan/development/UHHCRAFT-GO-LIVE-PLAN.md
  • plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md

…bbit)

- status: ACTIVE (bare enum per architecture-reference.md doc standard) on the
  go-live walkthrough AND the go-live plan; explanatory text lives in the
  title/intro, not the status token.
- blank lines around the Cross-references/Revision-history headings + table
  (MD022/MD058) in the walkthrough.
@JacobHaig
Member Author

Both fixed in 7e8a0a3:

  • Status enum — set status: ACTIVE (the bare enum value per architecture-reference.md's documented set: PLANNING/PROPOSED/ACTIVE/IMPLEMENTED/COMPLETE/SUPERSEDED/DEFERRED); the descriptor now lives in the title/intro. Applied the same fix to UHHCRAFT-GO-LIVE-PLAN.md (it had ACTIVE — execution pending…) so both go-live docs are consistent.
  • MD022/MD058 spacing — added blank lines around the Cross-references/Revision-history headings + table. (Note: markdownlint isn't actually a CI gate here — verified it's in no workflow — but it's cheap hygiene and you're right to flag it.)
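
As a sketch of the status fix, a bare-enum check could be as small as the following. The allowed values are taken from the list quoted above; `status_is_bare_enum` is a hypothetical helper, not an existing CI gate.

```python
# Allowed values as listed in architecture-reference.md (quoted in this thread).
ALLOWED_STATUSES = {
    "PLANNING", "PROPOSED", "ACTIVE", "IMPLEMENTED",
    "COMPLETE", "SUPERSEDED", "DEFERRED",
}


def status_is_bare_enum(line: str) -> bool:
    """True only if the line is 'status: <VALUE>' with nothing after the enum token."""
    key, sep, value = line.partition(":")
    return sep == ":" and key.strip() == "status" and value.strip() in ALLOWED_STATUSES


if __name__ == "__main__":
    print(status_is_bare_enum("status: ACTIVE"))
    # The pre-fix form, with a trailing descriptor after the token, is rejected.
    print(status_is_bare_enum("status: ACTIVE \u2014 ready to run"))
```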


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md (2)

1-8: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add required context field in frontmatter.

Frontmatter is missing the required context key for plan docs.

Suggested minimal fix
 ---
 title: UhhCraft Go-Live — Session Walkthrough (You + Claude)
 date: 2026-06-04
 status: ACTIVE
+context: Live-session execution companion runbook for Phase 10 go-live.
 audience: the operator (you) running this go-live in a live session with Claude
 companion_to: UHHCRAFT-GO-LIVE-PLAN.md
 tags: [uhhcraft, go-live, phase-10, walkthrough, runbook, operator]
 ---

As per coding guidelines: “Frontmatter (Date, Status, Context)” is required for plan docs.
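
A minimal way to test for the required keys without a YAML dependency is sketched below. It assumes simple top-level `key: value` frontmatter delimited by `---` lines, as in the snippet above; `frontmatter_keys` and `missing_keys` are hypothetical helpers, and a real check would likely use a YAML parser.

```python
from typing import List, Set

REQUIRED_KEYS = ("date", "status", "context")


def frontmatter_keys(text: str) -> Set[str]:
    """Collect top-level keys from a leading '---' delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys: Set[str] = set()
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        # Skip indented, list-item, and comment lines; keep top-level "key: value".
        if ":" in line and not line.startswith((" ", "\t", "-", "#")):
            keys.add(line.split(":", 1)[0].strip())
    return keys


def missing_keys(text: str) -> List[str]:
    """Return the required keys that the frontmatter does not define."""
    present = frontmatter_keys(text)
    return [key for key in REQUIRED_KEYS if key not in present]


if __name__ == "__main__":
    doc = "---\ntitle: Example\ndate: 2026-06-04\nstatus: ACTIVE\n---\n\n# Body\n"
    print(missing_keys(doc))
```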

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md` around lines 1 - 8, The
frontmatter block is missing the required "context" key; update the document's
frontmatter (the YAML block containing title, date, status, audience,
companion_to, tags) by adding a top-level "context" field with an appropriate
value (e.g., "go-live walkthrough" or "operator session") so the frontmatter
contains Date, Status and Context as required by the plan docs guidelines.

1-173: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Required plan template sections are missing for a plan/**/*.md document.

This runbook is operationally strong, but it does not follow the required plan-doc section contract (Problem, Design Principles, Architecture with Mermaid, Implementation Phases, Validation Criteria, Security Considerations). Please either align this file to that template or explicitly reclassify/move it outside the plan-doc template scope.

As per coding guidelines: “plan/frontmatter standards: ensure every plan doc includes … the required sections list (Problem, Design Principles, Architecture w/ mermaid, Implementation Phases, Validation Criteria, Security Considerations, Cross-references, Revision History).”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md` around lines 1 - 173, The
document titled "UhhCraft Go-Live — Session Walkthrough (You + Claude)" is
missing the required plan-template sections; either add them or explicitly
reclassify the file outside the plan-doc scope. Fix by adding the mandated
sections (Problem, Design Principles, Architecture with a Mermaid diagram,
Implementation Phases, Validation Criteria, Security Considerations,
Cross-references, Revision History) into the body of the markdown (retain the
existing step-by-step walkthrough content under Implementation Phases or an
Appendix), or move/reclassify the file out of the plan/* directory and update
its frontmatter status/tags to reflect "runbook" rather than a plan so it no
longer must conform to the plan template; ensure the frontmatter and title
"UhhCraft Go-Live — Session Walkthrough (You + Claude)" and companion_to
references remain consistent with the chosen option.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plan/development/UHHCRAFT-GO-LIVE-PLAN.md`:
- Around line 1-7: The frontmatter for the plan (the YAML block containing
title, date, status, audience, tags) is missing the required context field; add
a top-level context: "<short context string>" entry (e.g., context: "production
rollout validation for UhhCraft Phase 10") to the frontmatter so it includes
Date, Status, and Context per the plan doc standard and retains existing fields
(title, date, status, audience, tags).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0d95899c-9dbc-4d19-9062-6dc2121001e4

📥 Commits

Reviewing files that changed from the base of the PR and between b3aaf80 and 7e8a0a3.

📒 Files selected for processing (2)
  • plan/development/UHHCRAFT-GO-LIVE-PLAN.md
  • plan/development/UHHCRAFT-GO-LIVE-WALKTHROUGH.md

Comment on lines 1 to 7
 ---
 title: UhhCraft Go-Live — Phase 10 Production Validation Plan
 date: 2026-06-02
-status: ACTIVE — execution pending hardware + decisions
+status: ACTIVE
 audience: operator running Semaphore against the live cluster, and any agent preparing/triaging the rollout
 tags: [uhhcraft, inference-comfyui, inference-hunyuan3d, caddy, phase-10, go-live, validation, proxmox, openbao]
 ---
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Frontmatter still misses required context field.

status is now correct, but this plan frontmatter still lacks context, which is required by the plan doc standard.

Suggested minimal fix
 ---
 title: UhhCraft Go-Live — Phase 10 Production Validation Plan
 date: 2026-06-02
 status: ACTIVE
+context: Phase 10 production go-live checklist/reference for operator execution.
 audience: operator running Semaphore against the live cluster, and any agent preparing/triaging the rollout
 tags: [uhhcraft, inference-comfyui, inference-hunyuan3d, caddy, phase-10, go-live, validation, proxmox, openbao]
 ---

As per coding guidelines: “Frontmatter (Date, Status, Context)” is required for plan docs.


@JacobHaig JacobHaig merged commit 209fd25 into main Jun 4, 2026
13 checks passed
@JacobHaig JacobHaig deleted the docs/uhhcraft-go-live-walkthrough branch June 4, 2026 13:08