Skip to content

feat(spec/017): real-world validation, examples, and stateless --source-repo#32

Merged
outergod merged 17 commits into
masterfrom
017-real-world-validation
May 6, 2026
Merged

feat(spec/017): real-world validation, examples, and stateless --source-repo#32
outergod merged 17 commits into
masterfrom
017-real-world-validation

Conversation

@outergod
Copy link
Copy Markdown
Owner

@outergod outergod commented May 5, 2026

Summary

Closes spec/017 (real-world-validation) — the validation iteration
that translates five real-world homelab setups into the spec/016
source-repository layout, ships them under top-level examples/, and
absorbs the missing --source-repo CLI surface that was blocking all
five of them plus the four spec/016 in-tree fixtures from v2.0.0.

  • Stateless --source-repo <PATH> on core-ops plan, apply, and
    explain
    — bypasses the persisted controller config written by
    core-ops init. Path-based provenance via DesiredStateProvenance.requested_ref
    value conventions: full 40-char SHA for clean git checkouts,
    (stateless+dirty) for dirty working trees, (stateless) for
    non-git directories. No struct change; sentinels disambiguated by
    the leading ( (invalid in git ref names per git check-ref-format).
  • Five real-world examples under examples/<NN-slug>/:
    01-caddy-whoami, 02-nextcloud, 03-immich, 04-traefik-authelia,
    05-observability. Each carries a README with intent, public
    upstream sources, known limitations, "Try it" snippet, and
    "Scaffold for your own setup" instructions.
  • Stateless apply preserves init'd state byte-identical (FR-013,
    SC-009): performs zero state-file I/O. The audit chain is the
    persisted record. The init'd desired_state.repository /
    desired_state.requested_ref are never mutated by stateless apply.
  • Synthesis table in spec.md with 11 rows (1 A-classified
    absorbed self-escalation, 8 B-workaround-with-doc, 2 C-defer-to-follow-up)
    grounded in evidence from the five translations.
  • Spec/016 example fixtures removed as superseded; spec/016 spec.md
    • tasks.md carry [SUPERSEDED by spec/017] annotations preserving
      historical record.
  • Stale-doc cleanup: docs/follow-ups.md Init Command paragraphs
    collapsed to a historical note; docs/development.md:228 updated
    to use --source-repo; the closed Source Repository UX bullets
    removed; one new follow-up bullet added (NFS-backed library mounts,
    spec/017 synthesis table classification C).
  • Root README gains a ## Real-World Examples section between
    ## First Interaction and ## Installation (Current Phase).

Test plan

  • cargo build --locked --bin core-ops --bin core-ops-verify --bin core-ops-release clean.
  • cargo test — 468 passed (12 new clap+source_ref unit tests, 5
    per-example integration tests, 7 stateless-plan tests, 5
    stateless-explain tests, 2 author-iterate tests, 4 stateless-apply
    tests).
  • cargo clippy --all-targets -- -D warnings clean.
  • cargo run --bin core-ops-release -- validate --base-ref master:
    passed; classification=releasable; required=minor; declared=minor;
    CHANGELOG aligned.
  • Privacy gate: grep -rE 'not\.one|ulthar|192\.168\.1\.2|gcloud[-_]dns|gcloud\.json' examples/
    returns 0 matches.
  • RFC 2606 compliance: deployment-config FQDNs are
    *.example.com only; README citation URLs (hub.docker.com,
    docs.kernel.org, www.authelia.com, caddyserver.com,
    authelia.com) are real public upstream documentation references
    per FR-004(b) and exempt from FR-008's deployment-hostname rule.
  • RFC 5737 compliance: 192.0.2.0/24 + 198.51.100.0/24 documentation
    ranges only; the single 0.0.0.0 in Authelia config is the
    unspecified-address bind sentinel.
  • Quickstart manual validation:
    core-ops plan --source-repo examples/01-caddy-whoami --host example
    and --source-repo examples/05-observability succeed without
    prior core-ops init and without --force between examples;
    core-ops explain against container/caddy.container succeeds.
  • CI green on this PR.

Release governance

  • Cargo.toml 2.1.1 → 2.2.0.
  • changes/017-real-world-validation.md declares release_intent: minor.
  • CHANGELOG.md [Unreleased] block carries the spec/017 fragment summary.
  • tests/fixtures/provenance_state/valid-success.json controller.version
    bumped to 2.2.0 to keep the pinning test green.

🤖 Generated with Claude Code

outergod and others added 11 commits May 5, 2026 09:43
Validation iteration scoped beyond examples to also add a stateless
--source-repo flag to plan/apply/explain after the first US1 acceptance
scenario hit the now-removed --repo/--rev flags. Includes the 2026-05-05
clarifications locking the safety, coexistence, and provenance semantics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…quickstart

Implementation plan with technical context, constitution check (PASS with
narrow VM-scenario exemption), and project structure. Research records 10
technical decisions (shell-to-git for ref detection, sentinel-string fit
in DesiredStateProvenance, single-consumer impact for spec/016 example
removal, license hygiene by writing own Quadlet equivalents). Contracts
spec the --source-repo CLI surface and the friction-classification
synthesis-table invariants. Quickstart is the operator walkthrough.

Also corrects FR-021 baseline: master advanced to 2.1.1 via PR #31's
auto-promote during this session, so the bump target is 2.2.0 from
that baseline, not from the original 2.1.0 in the spec scaffold.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Decomposed into 7 phases following spec-kit conventions: Setup (clear
spec/016 examples), Foundational (--source-repo CLI surface + git ref
detection — blocks all user stories), four user-story phases (US1 P1
runnable examples, US2 P2 authoring scaffolds, US3 P2 stateless apply,
US4 P3 synthesis table), Polish (stale-doc cleanup + release governance
+ privacy gate). 18 of 50 tasks marked parallelizable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surgical fixes against the analysis report from /speckit.analyze:
nine medium/low findings resolved without changing the locked scope.

spec.md:
- FR-004: add (g) "Scaffold for your own setup" README section to
  match T032 and the US2 operator-as-author workflow
- Assumption: replace stale provenance-shape Assumption (deferred to
  /speckit.plan) with the locked statement (clarification Q3 + D2)

tasks.md:
- Add [P] to T003, T004, T005 (Phase 1 different-file parallelism)
- Add [P] to T021, T022, T029 (Phase 3 different-file parallelism)
- Remove [P] from T035 (same file as T033/T034 — sequential)
- T022: loop over all five examples to fully cover SC-011
- T034: add US3 AC3 sub-test (stateless apply -> init'd plan transition)
- T049: extend with RFC 2606/5737 compliance grep (FR-008 enforcement)
- T004: drop duplicate [X] in supersession annotation
- Phase 5 dependencies: fix T036 self-reference to T028+T035

No CRITICAL/HIGH findings; no scope change; no new tasks; renumbering
preserved (T001-T050).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes the four `specs/016-source-repository-layout/examples/`
fixture directories superseded by spec/017's real-world examples
under top-level `examples/<NN-slug>/`.

Per spec/017 FR-017–FR-019:
- T001: `git rm -r` 4 spec/016 example dirs.
- T002: drop the 4 `example_0X_*_loads` parser tests in
  `tests/integration/test_source_repo_layout.rs` (covered by
  per-example tests T016–T020 in Phase 3); rewrite the
  `repeated_load_yields_identical_workloads` invariant test to
  use `materialize_skeleton` so it does not depend on a deleted
  fixture; drop the now-unused `EXAMPLES_DIR`, `examples_root`,
  and `materialize_example` helpers from `source_repo_support.rs`.
- T003: append `[SUPERSEDED by spec/017]` note to spec/016
  spec.md FR-023; preserve historical FR text.
- T004: append `[SUPERSEDED by spec/017]` to spec/016 tasks.md
  T101–T104 (already `[X]`-marked).
- T005: confirmed `scripts/migrate-legacy-source-repo.sh` has
  zero references to spec/016 example paths (per research.md
  D10) — no follow-up action required.

`cargo check --tests` and `cargo clippy --all-targets -- -D
warnings` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the foundational `--source-repo <PATH>` CLI surface for spec/017
real-world-validation. Stateless mode bypasses the persisted controller
configuration written by `core-ops init` and sources desired state
directly from a filesystem directory. Per FR-010..FR-016 + the
2026-05-05 clarifications:

- T006-T008: `source_repo: Option<PathBuf>` with `requires("host")`
  added to PlanArgs / ApplyArgs / ExplainArgs in `src/cli/args.rs`.
  Per-command `--help` text documents the bypass relationship, the
  --host requirement, and the canonical init'd-mode workflow per
  the FR-016 contract.
- T009: `src/io/source_ref.rs::detect_provenance(&Path) ->
  Result<StatelessSource, SourceRefError>` shells out to `git -C
  <path>` per the existing pattern (research.md D1). Records full
  40-char SHA for clean checkouts, `(stateless+dirty)` for dirty
  working trees, `(stateless)` for non-git directories. Exit
  codes 64/66 per contracts/cli-flag.md.
- T010-T012: dispatched from `src/main.rs` Plan/Apply/Explain
  branches; stateless paths bypass `resolve_state_file` /
  `read_persisted_state` entirely. Stateless apply uses a new
  `apply_with_report_stateless` entry in `src/cli/apply.rs` that
  performs zero state-file I/O — the audit chain is the persisted
  record (FR-013, SC-009 trivially satisfied).
- `src/io/repo.rs::load_desired_state_from_path` is the
  pure-path counterpart to `load_desired_state` — operates on
  the path in place, no clone, no checkout.
- T013: 12 clap unit tests (acceptance per command, --host
  requirement, init/agent/status rejection, --help text contract).
- T014: 6 source_ref unit tests (stateless / dirty / SHA shapes,
  PathMissing/PathNotDirectory exit-64, sentinel-vs-ref disjointness
  via `git check-ref-format`).
- T015 checkpoint: `cargo build --tests`, `cargo test` (445 passed),
  `cargo clippy --all-targets -- -D warnings` all green.

Stateless apply is mutually exclusive with `--rollback-to` (rollback
requires the persisted retention chain set by `core-ops init`); the
combination errors with FailureClass::Apply.

The init'd-mode CLI surface is unchanged — the flag is purely
additive (Principle 9 / FR-010).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the MVP deliverables for spec/017's US1: five real-world
homelab setups translated into the spec/016 source-repository layout
under top-level `examples/<NN-slug>/`, with per-example integration
tests asserting parser load + stateless `core-ops plan` exit-0 (FR-006,
FR-016, SC-001/SC-003/SC-006/SC-008/SC-011).

Examples authored (T023-T027):

- `01-caddy-whoami`: single-Container baseline; Caddy reverse-proxy
  fronting whoami; default config-root; one host-overlay drop-in
  (`MemoryMax=256M`, `CPUQuota=50%` on caddy).
- `02-nextcloud`: multi-Container with intra-service `Network=` and
  persistent `Volume=` storage; Postgres + Redis + Nextcloud + Traefik;
  service.yaml `config-root: traefik` divergence on `traefik-edge`;
  host overlay adding TLS port; Podman secret for db password.
- `03-immich`: GPU device + multi-network; immich-server joins both
  internal and public networks; immich-ml receives `AddDevice=/dev/dri`
  via host overlay drop-in.
- `04-traefik-authelia`: cross-service ForwardAuth composition via
  Traefik labels in a host-overlay drop-in on the protected `whoami`
  backend.
- `05-observability`: Prometheus + Grafana + node-exporter + cadvisor;
  host-scope `/proc`/`/sys` bind mounts; whole-file `prometheus.yml`
  replacement under `hosts/example/prometheus/config/` exercises the
  scrape-config templating workaround documented in the README's
  `## Known limitations` (synthesis table classification: B).

Per-example integration tests (T016-T020) assert via the parser
library + a spawned `core-ops plan --source-repo X --host example`
invocation that exit code is 0 and the resolved service catalog
contains the expected unit names.

Stateless integration tests:

- T021 `test_stateless_plan.rs` — covers all six contract paths:
  non-git → `(stateless)`, clean checkout → SHA, dirty → `(stateless+dirty)`,
  missing `--host` errors, non-directory path errors, `--audit-dir`
  honored when explicitly set (clarification Q4).
- T022 `test_stateless_explain.rs` — pure-read invocation against
  **all five** examples (one sub-test per example, picking a
  deterministic object id) per SC-011 coverage.

Implementation fixes surfaced by the tests:

- `src/io/repo.rs`: `load_desired_state_from_path` now bypasses
  `resolved_head_revision` (the init'd-mode `git rev-parse HEAD` call)
  for non-git source-repos via a new `revision_override` parameter.
  FR-015 (non-git stateless support) now works end-to-end.
- `src/cli/status.rs`: `short_revision` preserves stateless sentinels
  verbatim instead of truncating `(stateless+dirty)` to `(statele`.
  Sentinels begin with `(`, which is invalid in git ref names per
  `git check-ref-format`, so the discriminator is unambiguous.

Test inventory: +12 new integration tests, 0 deletions beyond the
spec/016 fixture cleanup in commit 9ce9bf5. `cargo test` passes 462
total. `cargo clippy --all-targets -- -D warnings` clean.

Root README (T029) gains a `## Real-World Examples` section between
`## First Interaction` and `## Installation (Current Phase)` linking
the five examples and the stateless try-it command (FR-007, SC-007).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `tests/integration/test_stateless_authoring.rs` covering:

- T030 / US2 AC1: copy examples/02-nextcloud to a non-git tempdir,
  rename hosts/example → hosts/myhost, edit host.yaml, run
  `core-ops plan --source-repo <tempdir> --host myhost` → exit 0
  with `(stateless)` provenance.
- T031 / US2 AC3: parser-level equivalence of stateless vs init'd
  loads against the same source tree. Asserts the workload set
  (by systemd_unit_name), managed_config_paths, and
  managed_config_roots are byte-identical between modes.

T032 verified: all five `examples/<slug>/README.md` files carry a
`## Scaffold for your own setup` section per quickstart.md Step 6
(grep across `examples/*/README.md` returns all five paths).

`cargo test --test mod test_stateless_authoring` → 2/2 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `tests/integration/test_stateless_apply.rs` covering:

- T033: stateless apply against a synthetic source repo asserts
  exit 0, audit-bundle populated, and `result.desired.requested_repository`
  / `result.desired.requested_ref` carry path-based provenance per
  FR-013 + clarification Q3.
- T034 / SC-009: init'd-state preservation. Pre-writes init'd state
  via `persist_success_state`, runs stateless apply against an
  unrelated source path, asserts `desired_state.repository` and
  `desired_state.requested_ref` are byte-identical pre/post.
- T035: provenance-shape coverage — non-git → `(stateless)`, clean
  git checkout → 40-char SHA, dirty working tree → `(stateless+dirty)`,
  asserted against `bundle.result.desired.requested_ref`.
- US3 AC3: stateless-apply → init'd-plan transition. After stateless
  apply, the canonical state file at `state_file` does NOT exist
  (verified) and a subsequent init'd-mode `load_desired_state` against
  the same tree resolves a non-empty workload catalog.
- T036: `mod test_stateless_apply;` registered in
  `tests/integration/mod.rs`.

Stateless apply implementation (shipped in commit 929e489) writes
zero state-file I/O — `apply_with_report_stateless` skips the
`persist_in_progress_state` / `persist_finished_state` /
deterministic-state writes entirely. The audit chain is the persisted
record. SC-009 is trivially satisfied because the canonical state
file is never opened in stateless mode.

`cargo test --test mod test_stateless_apply` → 4/4 pass.
`cargo clippy --all-targets -- -D warnings` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes Phase 6 (US4):

- T037: `## Synthesis table` section added to `spec.md` between
  `## Success Criteria` and `## Assumptions` (per contracts/synthesis-table.md
  invariant 1).
- T038: 11 rows transcribed from the five examples' `## Known
  limitations` sections — 1 A-classified absorbed self-escalation
  (the canonical pre-populated row), 8 B-classified workaround-with-doc
  rows backed by the affected examples' READMEs, 2 C-classified
  defer-to-follow-up rows backed by `docs/follow-ups.md`.
- T039: invariants 1–5 verified manually:
  - Inv 1: every `## Known limitations` entry across the 5 examples
    has a backing row.
  - Inv 2: the single A row references spec/017 (this slice).
  - Inv 3: every B row's workaround text is present in at least one
    affected example's README.
  - Inv 4: the C-classified NFS row gets a new bullet
    "NFS-backed library mounts in real workloads" in
    `docs/follow-ups.md`; the C-classified Authelia secrets row
    routes to the existing "Secrets UX" follow-up.
  - Inv 5: only the stateless-mode case is A-with-absorption.

`cargo test` 468 passed; `cargo clippy --all-targets -- -D warnings`
clean (no code change in this commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes spec/017 Phase 7 (T040-T050) — the polish + merge gate phase.

Stale-doc cleanup (FR-020):
- T040: `docs/follow-ups.md` — collapsed the now-shipped Init Command
  paragraphs into a brief historical note pointing at spec/015 (the
  CLI-statefulness work) and spec/017 (the stateless `--source-repo`
  CLI). Removed the two Source Repository UX bullets that this slice
  closes ("Rich, documented real-life examples" and "QnA for known
  limitations and workarounds"). Kept all still-valid follow-ups
  (arg-persistence, `rollback-plan-only` re-homing, `--reinitialize`).
- T041: `docs/development.md:228` — replaced the
  `CORE_OPS_HOST=<host> core-ops plan --repo <repo> --rev <rev>`
  example with the stateless flow plus a brief init'd-mode section.
- T042: `infra/repo/README.md` is gitignored
  (`.gitignore:19: /infra/repo/`) — local copy was updated to the
  canonical init-then-plan flow but the change is untracked.

Release governance (FR-021):
- T043: `Cargo.toml` 2.1.1 → 2.2.0; bumped controller.version in
  `tests/fixtures/provenance_state/valid-success.json` to keep the
  `controller_version_provenance_matches_cargo_package_version` test
  green.
- T044: `changes/017-real-world-validation.md` declares
  `release_intent: minor` per the validator's verdict.
- T045: `CHANGELOG.md` re-rendered via
  `cargo run --bin core-ops-release -- changelog --write` — the
  `[Unreleased]` block now carries the spec/017 fragment summary.
  Post-merge `core-ops-release promote` will move it into a
  `## [2.2.0]` section automatically.

Merge gates (T046, T047, T049):
- `cargo build --locked` clean.
- `cargo test` 468 passed.
- `cargo clippy --all-targets -- -D warnings` clean.
- `cargo run --bin core-ops-release -- validate --base-ref master`:
  passed; classification=releasable; required=minor; declared=minor;
  CHANGELOG aligned.
- Privacy gate: zero matches for
  `not\.one|ulthar|192\.168\.1\.2|gcloud[-_]dns|gcloud\.json`.
- RFC 2606: deployment-config FQDNs are `*.example.com`. README
  citation URLs (`hub.docker.com`, `docs.kernel.org`,
  `www.authelia.com`, `caddyserver.com`, `authelia.com`) point at
  real public upstream docs per FR-004(b) and are exempt from the
  deployment-hostname rule.
- RFC 5737: 192.0.2.0/24 + 198.51.100.0/24 documentation ranges only;
  the single 0.0.0.0 in Authelia config is the unspecified-address
  bind sentinel, not an addressable literal.

Quickstart manual validation (T048): spot-checked
`core-ops plan --source-repo examples/01-caddy-whoami --host example`,
`--source-repo examples/05-observability` (switching examples without
`--force`), and `core-ops explain --source-repo examples/01-caddy-whoami
--host example container/caddy.container` all succeed end-to-end.

T050: tasks.md ticked off per phase as work shipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 35e821b67a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/main.rs Outdated
Comment thread src/main.rs Outdated
… codes

PR #32 review feedback (P1 + P2) from chatgpt-codex-connector.

P1 — Route stateless explain through a state-independent engine:
the original `core-ops explain --source-repo` branch reused
`explain_cmd::explain`, which still calls
`last_applied_revision_from_state` and `last_applied_snapshot_for_scope`
under the hood. That breaks the FR-011a / clarification Q5 contract
("stateless explain writes nothing AND reads no persisted state") in
two operationally important ways:

- under root-owned `/var/lib/core-ops/status.json` the rootless
  caller's stateless explain spuriously fails on a permission read
  it doesn't actually need;
- with init'd state from an unrelated repo present, baseline-revision
  / last-applied snapshot context bleeds into the rendered output
  even though the source-of-truth path has nothing to do with that
  init'd configuration.

Fix: add `explain_stateless` to `src/cli/explain.rs` that mirrors
`explain` minus the `last_applied_*_from_state` lookups and feeds
`None` for the baseline snapshot to
`reconcile_deterministic_plan_with_runtime`. `src/main.rs` Explain
branch now routes the `--source-repo` path through this engine.

Coverage: new
`stateless_explain_output_is_unaffected_by_prior_initd_state` test
runs the same explain invocation twice — once with
`CORE_OPS_STATE_FILE` pointing at a populated unrelated init'd state
file, once at a non-existent path — and asserts byte-identical stdout
between the two runs.

P2 — Return documented exit codes for source-repo path errors:
`map_source_ref_error` was classifying exit codes (64 / 66 per
`contracts/cli-flag.md`) into the error message string only. The
process always exited via `std::process::exit(1)`, so automation
relying on usage/data/path classification through exit status alone
got `1` regardless of the underlying cause.

Fix: extend `CoreError` with an optional `exit_code: Option<i32>`
field plus a `with_exit_code(class, message, code)` constructor.
`main()` now reads `err.exit_code.unwrap_or(1)` and passes it to
`std::process::exit`. `map_source_ref_error` uses the new
constructor, so missing-path / non-directory `--source-repo` inputs
exit 64 (`EX_USAGE`) and canonicalize failures exit 66 per the
documented contract.

Coverage: existing
`stateless_plan_against_non_directory_path_errors` and
`stateless_plan_against_missing_path_errors` integration tests gain
explicit `output.status.code() == Some(64)` assertions and are
renamed to `..._with_exit_code_64` to make the contract intent
visible.

Mechanical follow-on: 10 test files (`test_idempotence.rs`,
`test_performance.rs`, `test_status_state.rs`, `test_verification.rs`,
`test_plan.rs`, `test_reconcile_apply.rs`, `test_ordering.rs`,
`test_verification_rules.rs`, `test_reboot_recovery.rs`,
`test_quadlet_artifacts.rs`) had local `map_io_error` helpers
constructing `CoreError { class, message }` directly; converted to
the public `CoreError::new` constructor so the new field doesn't
leak into every call site.

`cargo test` 469 passed; `cargo clippy --all-targets -- -D warnings`
clean; `core-ops-release validate --base-ref master` passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f5cfc90e00

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/main.rs Outdated
Comment thread src/io/source_ref.rs Outdated
…ntics

PR #32 review feedback (P2 + P2) from chatgpt-codex-connector on
commit f5cfc90.

P2 #1 — Preserve EX_DATAERR for invalid source-repo layouts:
the previous fix wired exit code 64 (`EX_USAGE`) for path-shape
errors via `map_source_ref_error`, but layout/parser failures from
`load_desired_state_from_path` still flowed through `map_plan_error`,
which doesn't set an exit-code override. So a directory that exists
but isn't a spec/016-conformant source-repo (missing `services/`,
legacy artifacts, malformed `service.yaml`) exited 1 instead of the
documented 65 (`EX_DATAERR`), so automation couldn't distinguish
malformed input from generic runtime failures.

Fix: add `map_stateless_layout_error` in `src/main.rs` that builds
`CoreError::with_exit_code(FailureClass::Plan, msg, 65)` and use it
in the `load_desired` closures of all three stateless branches (plan,
apply, explain). For `apply_with_report_stateless`'s self-built deps
in `src/cli/apply.rs`, the same exit-code wrapper is inlined since
the stateless apply path can't reach the main.rs helper.

Coverage: new `stateless_plan_against_invalid_layout_errors_with_exit_code_65`
integration test. An empty tempdir (passes path-existence + is-dir
gates, fails the parser at `MissingServicesDir`) exits 65.

P2 #2 — Fall back to non-git sentinel on git probe failure:
`working_tree_dirty` returned `true` whenever the `git status`
subprocess errored (git missing from `$PATH`, invocation failure,
non-zero exit), which forced `(stateless+dirty)` provenance for
probe failures and conflated "actually dirty" with "couldn't
determine". This contradicts `research.md` D1 step 5 ("On any
subprocess error: log via miette diagnostic, fall back to
`(stateless)`") and broke the audit semantic that
`(stateless+dirty)` is supposed to assert.

Fix: replace `working_tree_dirty(&Path) -> bool` with
`working_tree_clean(&Path) -> Option<bool>`:
  - `Some(true)`  — `git status --porcelain` succeeded, empty output.
  - `Some(false)` — `git status --porcelain` succeeded, non-empty output.
  - `None`        — subprocess failed (probe non-verifiable).
`detect_provenance` matches on the result: clean → SHA (or
`(stateless)` if SHA fetch fails), dirty → `(stateless+dirty)`,
probe-failed → `(stateless)` per the spec's documented fallback.

`cargo test` 470 passed (was 469); `cargo clippy --all-targets -- -D
warnings` clean; `core-ops-release validate --base-ref master`
passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 342e5aa6d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/main.rs Outdated
Comment thread src/io/source_ref.rs Outdated
PR #32 review feedback (P2 + P2) from chatgpt-codex-connector on
commit 342e5aa.

P2 #1 — Populate stateless audit provenance with run outcome:
the synthetic `PersistedProvenanceState` for stateless apply
hardcoded `reconciliation.status = NeverRun` and `generation = 0`.
That object is fed to `build_audit_event`, which emits those fields
as authoritative audit metadata, so every stateless apply's journal
event reported `reconciliation_status = "never_run"` even after a
successful or failed apply — misleading for any downstream audit /
monitoring consumer that keys on reconciliation provenance.

Fix: extend `synthetic_stateless_provenance` (relocated from
`src/main.rs` to `src/cli/apply.rs` so it is unit-testable) to
accept the resolved `revision_id` plus the `RunStatus` outcome.
The audit event now reports:

  - `reconciliation_status` = `success` | `failed` (from `RunStatus`).
  - `attempted_revision` = `revision_id` (always — both success and
    failure attempted the same target).
  - `applied_revision` = `revision_id` on success, `None` on failure
    (failure means no revision became authoritative on the host).
  - `last_observed_revision` = `revision_id` so the audit event can
    correlate with the source-repo revision the apply ran against.
  - `generation` = 1 (single ad-hoc run; stateless mode has no
    prior generation context).

Coverage: new `synthetic_stateless_provenance_reflects_run_outcome`
unit test asserts both the Success and Failure matrices.

P2 #2 — Differentiate source-repo access errors from missing paths:
`detect_provenance` used `Path::exists()` / `Path::is_dir()`, which
swallow metadata I/O errors (most commonly `PermissionDenied`) into
`false`. So an inaccessible directory got classified as
`PathMissing` / `PathNotDirectory` (exit 64) instead of as an access
failure, with a misleading diagnostic for unprivileged runs against
restricted paths.

Fix: replace the existence/is-dir gates with a single
`std::fs::metadata` call. `ErrorKind::NotFound` maps to the
existing `PathMissing` (exit 64); any other I/O error maps to a
new `PathInaccessible { path, source }` variant (exit 66, alongside
`Canonicalize`). `metadata.is_dir()` continues to drive the
`PathNotDirectory` branch.

`cargo test` 471 passed (was 470); `cargo clippy --all-targets -- -D
warnings` clean; `core-ops-release validate --base-ref master`
passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f4b3eba7bc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/io/source_ref.rs Outdated
PR #32 review feedback (P2) from chatgpt-codex-connector on
commit f4b3eba.

The dirty-tree probe used `git status --porcelain -- .` without
overriding `status.showUntrackedFiles`. A repo-local or user-global
config that sets `status.showUntrackedFiles=no` (a common pattern
for keeping `git status` output terse) silently makes the porcelain
output empty for untracked-only working trees. Stateless mode would
then misclassify an authoring edit as clean and emit the parent
commit's SHA instead of `(stateless+dirty)` — exactly the operator
state stateless mode is meant to flag, lost in audit/provenance.

Fix: pass `--untracked-files=normal` explicitly to `git status` in
`working_tree_clean`. The CLI flag overrides any
`status.showUntrackedFiles` setting from gitconfig levels (system,
global, user, repo, worktree, command-line), so the probe is
independent of the operator's gitconfig. Tracked-file modifications
already surfaced regardless of this setting; the fix closes the
untracked-only case.

Coverage: new `dirty_detection_overrides_repo_show_untracked_files_no`
unit test in `src/io/source_ref.rs`. The test sets
`status.showUntrackedFiles=no` locally on the fixture repo before
adding an untracked file, then asserts the resolved `requested_ref`
is `(stateless+dirty)`. Without the `--untracked-files=normal`
override the test would (correctly) fail by reporting a clean SHA.

`cargo test --lib` 53 passed (was 52); `cargo test` integration
suite 471 passed; `cargo clippy --all-targets -- -D warnings` clean;
`core-ops-release validate --base-ref master` passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b2c13f5197

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/io/source_ref.rs Outdated
PR #32 review feedback (P2) from chatgpt-codex-connector on
commit b2c13f5.

The probe-failure → `(stateless)` fallback shipped in round 2
(commit 342e5aa) was correct semantically but silent: in
environments where `git` is missing or a subprocess fails for an
unexpected reason, operators lost the distinction between an
actually-non-git directory and a degraded probe. The audit chain
looked valid while being structurally degraded — exactly the kind
of failure mode `contracts/cli-flag.md` documents:

> Git ref detection subprocess fails (`git -C <path> ...`)
> | continues with `(stateless)` fallback; logs miette warning to
> stderr | `warning: git ref detection failed for <path>;
> recording as non-git source`

Fix: refactor the probe layer of `src/io/source_ref.rs` to
distinguish definitive answers from probe failures.

- Introduce `GitClassification { NotGit, Clean(sha), Dirty,
  ProbeFailed(reason) }` and a single `classify_git_state(path)`
  driver. `detect_provenance` matches on the result.
- Tighten the three probe helpers' return types from
  `bool` / `Option<bool>` / `Option<String>` to `Result<_, String>`
  so spawn failures, unexpected non-zero exits, and malformed
  stdout each produce a short reason string.
- `is_inside_work_tree` now returns `Ok(false)` (definitive
  non-git, the canonical case) when `git` runs and reports
  non-zero — only spawn failures escalate to `Err`. This keeps
  the no-warning happy path for everyday non-git source-repos.
- On `ProbeFailed`, `detect_provenance` emits a stderr
  `warning: git ref detection failed for <path>: <reason>;
  recording as non-git source` line that matches the
  `contracts/cli-flag.md` template, then falls back to
  `(stateless)` per `research.md` D1 step 5.

Existing tests continue to pass: the success paths all run `git`
through PATH and never hit `ProbeFailed`. A regression test for
the warning path itself was deliberately deferred — the hermetic
way to provoke it (PATH manipulation to hide `git`) requires
process-global env mutation that's hostile to cargo test's
parallel scheduler. The structural change (4-state enum + visible
warning) captures the contract intent.

`cargo test --lib` 53 passed; `cargo test` integration suite 471
passed; `cargo clippy --all-targets -- -D warnings` clean;
`core-ops-release validate --base-ref master` passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 61bf963ebb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/main.rs Outdated
Comment thread src/io/source_ref.rs Outdated
…on discriminator

PR #32 review feedback (P1 + P2) from chatgpt-codex-connector on
commit 61bf963.

P1 — Route stateless plan through state-free planner:
the stateless plan branch in main.rs still called
`plan_cmd::plan`, which unconditionally consults persisted
controller state via `detached_header_from_state`,
`last_applied_revision_from_state`, and the deterministic-snapshot
lookups in `src/cli/plan.rs`. That violated FR-012 (stateless plan
is pure read-and-render with respect to controller state) in two
ways:

- a corrupt or unreadable `/var/lib/core-ops/status.json` could
  derail a stateless run that should be independent of it;
- detached-header context from an unrelated init'd configuration
  could leak into stateless plan output.

Fix: add `plan_stateless` in `src/cli/plan.rs` that mirrors `plan`
minus the three `*_from_state` lookups and feeds `None` for the
baseline snapshot to `reconcile_deterministic_plan_with_runtime`.
This is the planner-side analogue of round-2's `explain_stateless`.
`src/main.rs` Plan branch routes the `--source-repo` path through
this engine.

Coverage: new `stateless_plan_output_is_unaffected_by_prior_initd_state`
integration test asserts byte-identical stdout between a stateless
plan run with `CORE_OPS_STATE_FILE` pointed at a populated
unrelated init'd state file and one with an empty path, plus an
explicit `[DETACHED]`-header negative assertion.

P2 — Treat rev-parse nonzero as probe failure when ambiguous:
the previous `is_inside_work_tree` collapsed every non-zero
`git rev-parse --is-inside-work-tree` exit to `Ok(false)`. That
correctly handled the canonical "not a git repository" case but
silently masked repository corruption (damaged `.git/HEAD`, locked
index, permission errors inside `.git/`, etc.) — those would
emit `(stateless)` with no warning even though the source was a
broken git repo, hiding degraded provenance detection in audit.

Fix: discriminate via stderr. If the failure stderr contains
"not a git repository", treat as `Ok(false)` (definitive non-git,
no warning). Otherwise return `Err(reason)` so the existing
`ProbeFailed` → stderr-warning path engages and operators see the
underlying git diagnostic alongside the `(stateless)` fallback.

`cargo test --lib` 53 passed; `cargo test` integration suite 472
passed (was 471); `cargo clippy --all-targets -- -D warnings`
clean; `core-ops-release validate --base-ref master` passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@outergod outergod merged commit f3aba2a into master May 6, 2026
5 of 6 checks passed
@outergod outergod deleted the 017-real-world-validation branch May 6, 2026 07:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant