Merged
17 commits
d3948ff
docs(spec/017): add specification with stateless --source-repo scope
outergod May 5, 2026
24e5426
docs(spec/017): add Phase 0/1 plan, research, data-model, contracts, …
outergod May 5, 2026
bd20451
docs(spec/017): add tasks.md with 50-task implementation breakdown
outergod May 5, 2026
3de4bdb
docs(spec/017): /speckit.analyze remediations
outergod May 5, 2026
9ce9bf5
chore(spec/016): clear superseded example fixtures
outergod May 5, 2026
929e489
feat(cli): add stateless --source-repo flag for plan/apply/explain
outergod May 5, 2026
de40ba2
feat(examples): publish 5 real-world homelab examples (spec/017 US1)
outergod May 5, 2026
da26a35
test(spec/017): US2 author-iterate-init transition coverage
outergod May 5, 2026
3ae5102
test(spec/017): US3 stateless apply + init'd-state preservation
outergod May 5, 2026
6055c57
docs(spec/017): populate synthesis table + NFS follow-up bullet
outergod May 5, 2026
35e821b
chore(spec/017): release governance + stale-doc cleanup (Phase 7)
outergod May 5, 2026
f5cfc90
fix(spec/017): codex review — stateless explain isolation + real exit…
outergod May 5, 2026
342e5aa
fix(spec/017): codex review round 2 — EX_DATAERR + probe-failure sema…
outergod May 5, 2026
f4b3eba
fix(spec/017): codex review round 3 — audit run outcome + access errors
outergod May 5, 2026
b2c13f5
fix(spec/017): codex review round 4 — force untracked-files probe
outergod May 6, 2026
61bf963
fix(spec/017): codex review round 5 — surface git probe failures
outergod May 6, 2026
5b41b3a
fix(spec/017): codex review round 6 — state-free plan + repo-corrupti…
outergod May 6, 2026
2 changes: 1 addition & 1 deletion .specify/feature.json
@@ -1,3 +1,3 @@
{
-"feature_directory": "specs/016-source-repository-layout"
+"feature_directory": "specs/017-real-world-validation"
}
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,9 @@ versioning for public release policy decisions.
## [Unreleased]

<!-- core-ops-release:start -->
### Changed

- Add stateless `--source-repo <PATH>` flag for plan/apply/explain plus five real-world homelab examples under `examples/<NN-slug>/`.
<!-- core-ops-release:end -->

## [2.1.1] - 2026-05-04
6 changes: 4 additions & 2 deletions CLAUDE.md
@@ -1,12 +1,14 @@
# core-ops Development Guidelines

-Auto-generated from all feature plans. Last updated: 2026-05-01
+Auto-generated from all feature plans. Last updated: 2026-05-05

## Active Technologies
- Rust 2021 + clap 4, serde / serde_json, miette, thiserror, tempfile (015-controller-state-lifecycle)
- JSON state file at `/var/lib/core-ops/status.json` (atomic write via tempfile) (015-controller-state-lifecycle)
- Rust 2021 (stable toolchain), as established by the existing `core-ops` crate at v1.0.0; this feature is the trigger for the v2.0.0 major bump. + `clap` 4.5 (derive), `serde` 1.0 (derive), `serde_yaml` 0.9, `serde_json` 1.0, `miette` 7.2 (fancy diagnostics), `thiserror` 1.0, `tempfile` 3.10. No new runtime dependencies are required by this feature. (016-source-repository-layout)
- Source repository on filesystem (input); existing canonical status snapshot at `/var/lib/core-ops/status.json` (output). The status snapshot gains a `layout-version: "1"` field to record which layout produced it. (016-source-repository-layout)
- Rust 2021 (existing toolchain) + clap 4.5 (derive), serde 1.0, serde_yaml 0.9, serde_json 1.0, miette 7.2 (fancy diagnostics), thiserror 1.0, tempfile 3.10. **No new runtime dependencies.** Git invocation via `std::process::Command::new("git")` following the established pattern at `src/cli/init.rs:52`, `src/io/repo.rs:1312/1343/1372`, `src/io/release_governance.rs:367/440`, `src/cli/verification.rs:2068/2086/2090/2103`. (017-real-world-validation)
- Existing `/var/lib/core-ops/status.json` for init'd mode (unchanged). Stateless plan writes nothing under `/var/lib/`; stateless apply writes audit + status with path-based provenance (see FR-013); stateless explain writes nothing. Operator-explicit `--audit-dir` honored across both modes (see FR-012 plus 2026-05-05 clarification). (017-real-world-validation)

- Rust 2021 — clap 4, serde, miette, serde_json, serde_yaml
- GitHub Actions — ubuntu-latest runners, `gh` CLI, `rustup`
@@ -96,10 +98,10 @@ removed or renamed.
Follow standard Rust conventions. No new abstractions without justification.

## Recent Changes
- 017-real-world-validation: Added Rust 2021 (existing toolchain) + clap 4.5 (derive), serde 1.0, serde_yaml 0.9, serde_json 1.0, miette 7.2 (fancy diagnostics), thiserror 1.0, tempfile 3.10. **No new runtime dependencies.** Git invocation via `std::process::Command::new("git")` following the established pattern at `src/cli/init.rs:52`, `src/io/repo.rs:1312/1343/1372`, `src/io/release_governance.rs:367/440`, `src/cli/verification.rs:2068/2086/2090/2103`.
- 016-source-repository-layout: Added Rust 2021 (stable toolchain), as established by the existing `core-ops` crate at v1.0.0; this feature is the trigger for the v2.0.0 major bump. + `clap` 4.5 (derive), `serde` 1.0 (derive), `serde_yaml` 0.9, `serde_json` 1.0, `miette` 7.2 (fancy diagnostics), `thiserror` 1.0, `tempfile` 3.10. No new runtime dependencies are required by this feature.
- 015-controller-state-lifecycle: Added Rust 2021 + clap 4, serde / serde_json, miette, thiserror, tempfile

- 014-config-restart-fidelity: Fix planner to emit RestartUnit for config-file-dependent containers

<!-- MANUAL ADDITIONS START -->
<!-- MANUAL ADDITIONS END -->
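
The Active Technologies entry above names `std::process::Command::new("git")` as the git-invocation pattern, and the later commits in this PR mention forcing an untracked-files probe and surfacing probe failures. A minimal sketch of what such a probe could look like follows. The function names `parse_untracked` and `probe_untracked` are hypothetical and are not taken from the core-ops sources; only the `git status --porcelain --untracked-files=all` invocation is standard git.

```rust
use std::path::Path;
use std::process::Command;

/// Extract untracked paths from `git status --porcelain` (v1) output.
/// Untracked entries are the lines prefixed with "?? ".
/// Note: without `-z`, git may quote unusual path names; this sketch does not unquote them.
fn parse_untracked(porcelain: &str) -> Vec<String> {
    porcelain
        .lines()
        .filter_map(|line| line.strip_prefix("?? "))
        .map(str::to_owned)
        .collect()
}

/// Hypothetical probe: list untracked files in `repo`. Every failure mode
/// (git missing, non-zero exit) becomes an error instead of an empty list,
/// so a broken probe cannot be mistaken for a clean tree.
fn probe_untracked(repo: &Path) -> Result<Vec<String>, String> {
    let output = Command::new("git")
        .arg("-C")
        .arg(repo)
        .args(["status", "--porcelain", "--untracked-files=all"])
        .output()
        .map_err(|e| format!("failed to spawn git: {e}"))?;

    if !output.status.success() {
        return Err(format!(
            "git status exited with {}: {}",
            output.status,
            String::from_utf8_lossy(&output.stderr).trim()
        ));
    }
    Ok(parse_untracked(&String::from_utf8_lossy(&output.stdout)))
}
```

Keeping the parsing separate from the process spawn lets the parser be unit-tested without a git binary, while the probe itself stays a thin wrapper that reports failures explicitly.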
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "core-ops"
-version = "2.1.1"
+version = "2.2.0"
edition = "2021"
license = "AGPL-3.0-or-later"

26 changes: 26 additions & 0 deletions README.md
@@ -112,6 +112,32 @@ A valid installation should:

---

## Real-World Examples

Five real-world homelab setups translated into the source-repository
layout. Each is runnable via stateless `--source-repo` invocation
without `core-ops init`. See `examples/<NN-slug>/README.md` for setup
intent, sources, and known limitations.

* [`examples/01-caddy-whoami`](examples/01-caddy-whoami) — Caddy reverse proxy fronting whoami (single-Container baseline).
* [`examples/02-nextcloud`](examples/02-nextcloud) — Nextcloud + Postgres + Redis + Traefik (multi-Container, intra-service network, persistent storage).
* [`examples/03-immich`](examples/03-immich) — Immich photo server with ML worker (GPU device, multi-network).
* [`examples/04-traefik-authelia`](examples/04-traefik-authelia) — Traefik + Authelia + protected backend (cross-service ForwardAuth composition).
* [`examples/05-observability`](examples/05-observability) — Prometheus + Grafana + node-exporter + cadvisor (host-scope sidecars).

Try one without committing to anything:

```sh
core-ops plan --source-repo examples/01-caddy-whoami --host example
```

No prior `core-ops init` required; nothing is written under
`/var/lib/core-ops/`. To switch into long-lived tracking mode after
copying an example to your own setup directory, run
`git init && core-ops init <path> <ref>` once.

---

## Installation (Current Phase)

CoreOps is currently distributed as direct binaries for `x86_64` (`amd64`)
7 changes: 7 additions & 0 deletions changes/017-real-world-validation.md
@@ -0,0 +1,7 @@
---
change_id: 017-real-world-validation
release_intent: minor
summary: Add stateless `--source-repo <PATH>` flag for plan/apply/explain plus five real-world homelab examples under `examples/<NN-slug>/`.
scope: cli
release_preparation: false
---
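
The change record above declares `release_intent: minor`, and the same PR moves `Cargo.toml` from `2.1.1` to `2.2.0`. The relationship between the two can be sketched as a plain semantic-version bump. This is an illustration only: the `ReleaseIntent` enum, the `bump` function, and the assumption that `major` and `patch` are also valid intents are hypothetical, since only `minor` appears in this diff.

```rust
/// Hypothetical mirror of the `release_intent` front-matter field.
/// Only `minor` is confirmed by the change record; the others are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReleaseIntent {
    Major,
    Minor,
    Patch,
}

/// Apply a semantic-version bump to a plain `MAJOR.MINOR.PATCH` string.
/// Returns `None` for anything that is not exactly three numeric components.
fn bump(version: &str, intent: ReleaseIntent) -> Option<String> {
    let mut parts = version.split('.').map(|part| part.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some(match intent {
        ReleaseIntent::Major => format!("{}.0.0", major + 1),
        ReleaseIntent::Minor => format!("{major}.{}.0", minor + 1),
        ReleaseIntent::Patch => format!("{major}.{minor}.{}", patch + 1),
    })
}
```

With these definitions, `bump("2.1.1", ReleaseIntent::Minor)` yields `Some("2.2.0")`, which matches the version change in this PR.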
11 changes: 9 additions & 2 deletions docs/development.md
@@ -222,10 +222,17 @@ local testing. The repository layout should include:
- `hosts/<host>/host.yaml` with explicit service selection
- `hosts/<host>/overrides/` for host-specific drop-ins

-Override host selection during development with:
+Override host selection during development. Stateless (no prior `init`):

```
-CORE_OPS_HOST=<host> core-ops plan --repo <repo> --rev <rev>
+core-ops plan --source-repo <PATH> --host <host>
```

Or initialize once and let persisted state carry the repo + ref:

```
core-ops init <repo-or-path> <ref>
core-ops plan --host <host>
```

When adding or changing behavior, ensure tests and diagnostics preserve
37 changes: 21 additions & 16 deletions docs/follow-ups.md
@@ -6,26 +6,18 @@ Deferred implementation work and discoveries that should be revisited after the

### Init Command

`core-ops` currently expects every `plan`, `apply` (or `agent`) to be supplied with `repo` and `rev` arguments. At the same time, expected use through operators is to initialize `core-ops` once against a repository and a tracking branch, and keep running `plan`, `apply` etc. against that.

`repository` and `requested_ref` are already tracked through `core-ops`, so `repo` should be taken from there and `rev` be assumed to be the latest tracking `requested_ref`.

In their place, a new `init` command shall be introduced in the form of `init [repo] [ref]` that sets up `core-ops` state with the tracking repository and ref. At the same time, remove the `repo` and `rev` arguments from `plan` and `apply`, effectively making the CLI stateful and aligned with the state store.
> Historical note: the `init`-as-explicit-entry-point + remove-`repo`/`rev`
> redesign described here shipped in spec/015. Stateless `--source-repo`
> for plan/apply/explain shipped in spec/017. The remaining open items
> in this section are about argument persistence and recovery UX,
> below.

Other arguments currently taken by `plan`, `apply`, and `agent` which should persist are `quadlet-dir`, `systemd-unit-dir`, `state-file`, and `audit-dir`.

Rollbacks would then be validated against `rev`s on the tracking branch, and otherwise refuse action if pointing to a non-reachable commit from the current ref.
`rollback-plan-only` (apply option) is completely misplaced and should instead become the `rollback` option for `plan`.

There should be an explicit flow to re-initialize using `init`, e.g. using `--reinitialize` that changes the tracking repo and/or ref.

Summary:
- CoreOps already persists tracking repository/ref in controller state
- CLI UX should be aligned with that existing persistence
- init becomes the explicit operator entry point for managing this persisted desired-state configuration

Read specs 004, 006, and 007 to get the full picture.

### Reconciliation Cleanup

Investigate the contents of status.json and deterministic-state.json to see whether state is duplicated. Consider removing state from status.json if duplicated.
@@ -84,17 +76,30 @@ Instead of a warning that the user doesn't have permission to read or operate on

For now, go with option 2.

## NFS-backed library mounts in real workloads

Real homelab workloads (Immich photo library, Nextcloud data
directory, etc.) frequently back container volumes with NFS mounts
declared in `services/<svc>/systemd/*.mount` units. Spec/017's
`examples/03-immich/` uses a Podman-managed `*.volume` instead because
NFS mount declarations are orthogonal to the validation iteration's
scope. A future iteration could ship a worked example exercising
mount-aware reconciliation against an NFS source. (Spec/017 synthesis
table classification: C.)

## Source Repository UX

There is no user-facing and no agent-facing documentation for the required layout of the source Git repository. Even the naming is not aligned (`Source repository` vs `workload Git repository`).
> The `Source repository` vs `workload Git repository` naming gap and the
> authoring-tool follow-ups below remain open. Spec/016 + spec/017 closed
> the "rich, documented real-life examples" and "QnA for known
> limitations" bullets — see `examples/<NN-slug>/` and the synthesis
> table at `specs/017-real-world-validation/spec.md`.

There should ideally be:
- User facing documentation how to author valid source repositories
- Agentic documentation for the same
- An installable Agent skill that teaches agents how to deal with source repositories
- A core-ops command that creates a source repository with basic layout, README.md and AGENTS.md from scratch (maybe plus the skill)
- Important: Rich, documented real-life examples of actual source repositories with real services, overrides, mounts etc.
- QnA for source repository use cases / known limitations and workarounds

These changes should be structured around schema, patterns (conventions), and tooling.

59 changes: 59 additions & 0 deletions examples/01-caddy-whoami/README.md
@@ -0,0 +1,59 @@
# 01 — Caddy + whoami

Single-Container baseline: a Caddy reverse proxy fronting a `whoami`
HTTP echo backend over a shared Quadlet network. Default config-root
(`/etc/caddy/`). Shape coverage: one service, one Quadlet `*.container`,
plus auxiliary Quadlet `*.network` and `*.volume` units.

## Pressure axis

Single-Container baseline. Validates that the spec/016 layout supports
a minimal real-world reverse-proxy + backend pattern with persistent
state (Caddy automatically issues and stages TLS certificates into the
`caddy-data` volume).

## Sources

These references shaped the Quadlet equivalents. Upstream YAML/compose
blocks were not copied verbatim (research.md D5 license hygiene).

- Caddy quick-start: <https://caddyserver.com/docs/quick-starts/reverse-proxy>
- Caddy Docker official image: <https://hub.docker.com/_/caddy>
- traefik/whoami container README: <https://hub.docker.com/r/traefik/whoami>

## Service-by-service intent

| Service | Image | Purpose | Notes |
|---------|-------|---------|-------|
| `caddy` | `docker.io/library/caddy:2` | TLS terminator + reverse proxy | Mounts `/etc/caddy/Caddyfile` (default config-root); state in `caddy-data` volume |
| `whoami` | `docker.io/traefik/whoami` | HTTP echo backend | Joined to the same `caddy` network |

## Try it

> CLI output below is illustrative and not snapshot-tested.

```sh
core-ops plan --source-repo examples/01-caddy-whoami --host example
```

Expected: exit 0; plan lists the Caddy container, the whoami container,
the shared network, and the two Caddy volumes. No prior `core-ops init`
required; nothing written under `/var/lib/core-ops/`.

## Known limitations

None encountered during translation — this example is the spec/016
layout's narrowest shape and exercises no friction beyond the parser
contract.

## Scaffold for your own setup

```sh
cp -r examples/01-caddy-whoami ~/my-caddy
# Edit hosts/example/host.yaml → rename `example` to your host id.
# Edit services/caddy/config/Caddyfile → set your real domain + backend.
core-ops plan --source-repo ~/my-caddy --host <your-host>
```

Once happy, `git init && core-ops init ~/my-caddy main` to switch into
long-lived tracking mode.
4 changes: 4 additions & 0 deletions examples/01-caddy-whoami/hosts/example/host.yaml
@@ -0,0 +1,4 @@
host: example
services:
- caddy
- whoami
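
The `host.yaml` above selects services by name, and the file paths in this PR place each service under `services/<name>/`. A rough pre-flight check along those lines is sketched below. It assumes that every selected service must have a matching `services/<name>/` directory, which the examples suggest but this diff does not state as a rule. `parse_host_yaml` and `missing_service_dirs` are hypothetical helpers, and the reader is deliberately not a YAML parser (core-ops itself lists `serde_yaml` among its dependencies).

```rust
use std::path::Path;

/// Minimal reader for the flat `host.yaml` shape shown above.
/// This is NOT a YAML parser: it only understands a `host: <id>` line and a
/// `services:` key followed by `- <name>` items. Any other non-blank line
/// ends the services list and is otherwise ignored.
fn parse_host_yaml(text: &str) -> Option<(String, Vec<String>)> {
    let mut host: Option<String> = None;
    let mut services = Vec::new();
    let mut in_services = false;

    for raw in text.lines() {
        let line = raw.trim();
        if let Some(value) = line.strip_prefix("host:") {
            host = Some(value.trim().to_owned());
            in_services = false;
        } else if line == "services:" {
            in_services = true;
        } else if let (true, Some(item)) = (in_services, line.strip_prefix("- ")) {
            services.push(item.trim().to_owned());
        } else if !line.is_empty() {
            in_services = false;
        }
    }
    // A file without a `host:` line is rejected.
    host.map(|id| (id, services))
}

/// Hypothetical pre-flight check: which selected services lack a
/// `services/<name>/` directory under `repo`?
fn missing_service_dirs(repo: &Path, services: &[String]) -> Vec<String> {
    services
        .iter()
        .filter(|name| !repo.join("services").join(name.as_str()).is_dir())
        .cloned()
        .collect()
}
```

For the file above, `parse_host_yaml` returns the host id `example` with the services `caddy` and `whoami`; running `missing_service_dirs` against a scaffold copy would then list any service directory that was renamed or deleted by mistake.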
7 changes: 7 additions & 0 deletions examples/01-caddy-whoami/services/caddy/config/Caddyfile
@@ -0,0 +1,7 @@
# Illustrative Caddyfile; replace example.com with the operator's
# domain before applying. RFC 2606 reserved domain used here so the
# example is safe to commit.

whoami.example.com {
reverse_proxy whoami:80
}
@@ -0,0 +1,5 @@
[Unit]
Description=Caddy autosaved JSON config

[Volume]
VolumeName=caddy-config
@@ -0,0 +1,5 @@
[Unit]
Description=Persistent state for Caddy (certs, OCSP staples)

[Volume]
VolumeName=caddy-data
21 changes: 21 additions & 0 deletions examples/01-caddy-whoami/services/caddy/quadlet/caddy.container
@@ -0,0 +1,21 @@
[Unit]
Description=Caddy reverse proxy fronting whoami
After=network-online.target
Wants=network-online.target

[Container]
Image=docker.io/library/caddy:2
PublishPort=80:80
PublishPort=443:443
Volume=/etc/caddy:/etc/caddy:ro,Z
Volume=caddy-data.volume:/data:Z
Volume=caddy-config.volume:/config:Z
Network=caddy.network
Exec=caddy run --config /etc/caddy/Caddyfile --adapter caddyfile

[Service]
Restart=always
TimeoutStartSec=300

[Install]
WantedBy=multi-user.target default.target
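
The container unit above refers to sibling Quadlet units by file name: `Volume=caddy-data.volume:/data:Z` and `Network=caddy.network`. When authoring a source repository by hand it can help to list those references and confirm that each file exists. The sketch below is one way to collect them. `referenced_units` is a hypothetical helper, not part of core-ops, and it simplifies by matching keys in any section rather than only under `[Container]`.

```rust
/// Collect the Quadlet unit files (`*.volume`, `*.network`) that a
/// `.container` unit refers to through `Volume=` and `Network=` keys.
/// Simplification: keys are matched in any section, not only `[Container]`.
fn referenced_units(container_unit: &str) -> Vec<String> {
    let mut units = Vec::new();
    for line in container_unit.lines() {
        let Some((key, value)) = line.trim().split_once('=') else {
            continue; // section headers, blank lines, comments without '='
        };
        if !matches!(key.trim(), "Volume" | "Network") {
            continue;
        }
        // `Volume=SOURCE:DEST[:OPTS]` and `Network=NAME[:OPTS]`: only the
        // part before the first ':' can name a sibling unit file.
        let source = value.trim().split(':').next().unwrap_or("");
        if source.ends_with(".volume") || source.ends_with(".network") {
            units.push(source.to_owned());
        }
    }
    units
}
```

Applied to the `caddy.container` unit above, this returns `caddy-data.volume`, `caddy-config.volume`, and `caddy.network`; the bind mount `/etc/caddy:/etc/caddy:ro,Z` is skipped because its source is a host path, not a unit file.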
6 changes: 6 additions & 0 deletions examples/01-caddy-whoami/services/caddy/quadlet/caddy.network
@@ -0,0 +1,6 @@
[Unit]
Description=Shared network for caddy and whoami

[Network]
NetworkName=caddy
Subnet=192.0.2.0/24
@@ -1,17 +1,16 @@
[Unit]
-Description=whoami — minimal example service for spec 016 layout v1
+Description=whoami HTTP echo backend
After=network-online.target
Wants=network-online.target

[Container]
Image=docker.io/traefik/whoami:latest
PublishPort=8000:80
Volume=/etc/whoami/whoami.toml:/etc/whoami/whoami.toml:ro,Z
Exec=--port 80
ContainerName=whoami
Network=caddy.network

[Service]
Restart=always
-TimeoutStartSec=900
+TimeoutStartSec=180

[Install]
WantedBy=multi-user.target default.target
76 changes: 76 additions & 0 deletions examples/02-nextcloud/README.md
@@ -0,0 +1,76 @@
# 02 — Nextcloud (community multi-container)

Multi-Container homelab Nextcloud stack: Nextcloud + Postgres + Redis +
Traefik edge proxy. Intra-service Quadlet network, persistent storage
volumes, host-side TLS port drop-in. The `traefik-edge` service id
diverges from its `config-root: traefik`, exercising the
`service.yaml` redirection path.

## Pressure axis

Multi-Container, intra-service network, persistent storage. Validates
that the spec/016 layout supports a real-world four-container stack
where each container is its own service directory and the headlining
service (`nextcloud`) depends on its peers via Quadlet `Requires=`.

## Sources

These references shaped the Quadlet equivalents. Upstream YAML/compose
blocks were not copied verbatim (research.md D5 license hygiene).

- Nextcloud official Docker image: <https://hub.docker.com/_/nextcloud>
- Nextcloud community Docker examples (NOT the All-In-One container,
which manages its own sub-containers via the Docker socket and is
incompatible with external orchestration):
<https://github.com/nextcloud/docker/tree/master/.examples/docker-compose>
- Postgres official image: <https://hub.docker.com/_/postgres>
- Redis official image: <https://hub.docker.com/_/redis>
- Traefik v3 docs: <https://doc.traefik.io/traefik/>

## Service-by-service intent

| Service | Image | Purpose | Notes |
|---------|-------|---------|-------|
| `nextcloud` | `docker.io/library/nextcloud:30` | Headlining Nextcloud app server | Mounts `nextcloud-data` volume; declares `Requires=` on db + redis |
| `nextcloud-db` | `docker.io/library/postgres:16` | Postgres backing store | Persistent `nextcloud-db-data` volume; password sourced via Podman secret |
| `nextcloud-redis` | `docker.io/library/redis:7-alpine` | In-memory cache | Save disabled (cache only) |
| `traefik-edge` | `docker.io/library/traefik:v3.1` | Edge reverse proxy | Service id `traefik-edge`, `config-root: traefik` (config-root divergence) |

## Try it

> CLI output below is illustrative and not snapshot-tested.

```sh
core-ops plan --source-repo examples/02-nextcloud --host example
```

Expected: exit 0; plan lists 4 containers, 1 network, 2 volumes, 1
config file (`/etc/traefik/traefik.yaml` — note the `traefik-edge` →
`traefik` config-root rewrite), and the host-side `traefik-edge.container.d/10-tls.conf`
drop-in adding the TLS port.

## Known limitations

- **Secrets are referenced, not committed**: the example declares a
Podman secret `nextcloud-db-password` but does not provide its
contents. Operators must `podman secret create nextcloud-db-password
/path/to/secret` on the host before applying. Secret bootstrap
belongs to the host, not the source-repo (FR-009: no real values).
- **Trusted domain placeholder**: `NEXTCLOUD_TRUSTED_DOMAINS` is set to
`cloud.example.com` (RFC 2606). Replace with the operator's real
domain in their own scaffold copy before applying.
- **Initial Nextcloud setup is interactive**: the first `apply`
installs files; the operator still needs to complete the install
wizard at `http://<host>/` to create the admin account. This is a
Nextcloud product behavior, not a layout limitation. (Synthesis
table classification: `B` — workaround documented here.)

## Scaffold for your own setup

```sh
cp -r examples/02-nextcloud ~/my-nextcloud
# Edit hosts/example/host.yaml → rename `example` to your host id.
# Edit services/traefik-edge/config/traefik.yaml → set your domain.
# `podman secret create nextcloud-db-password ...` on the target host.
core-ops plan --source-repo ~/my-nextcloud --host <your-host>
```