feat(admin): GET /admin/drift config drift endpoint (WOR-132)#72

Merged
rickcrawford merged 1 commit into main from claude/find-onnx-classifier-HucVm
May 6, 2026
@rickcrawford
Contributor

Summary

Closes WOR-132. Adds GET /admin/drift, the proxy-side scraping endpoint K8s operators and dashboards use to flag a config that's been edited on disk but not yet hot-reloaded.

Drive-by: also fixes the workspace build under prometheus = "0.14" (heterogeneous &[&String, &str] arrays in with_label_values no longer compile because Rust unifies the array element type).

What ships

GET /admin/drift

AdminState gains a Mutex<Option<String>> content-hash baseline (12-char SHA-256 prefix of the raw YAML bytes via crate::identity::config_revision). Seeded by:

  • AdminState::with_loaded_config_content_hash(...) at startup (wired in crates/sbproxy/src/main.rs-equivalent path inside server::run).
  • The existing /admin/reload handler refreshes it on every successful swap, so the drift baseline tracks the live pipeline.
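
As a rough illustration of the baseline bookkeeping described above, the sketch below models a pared-down `AdminState` that holds only the content hash. It is not the code in this PR: the field name, `refresh_baseline`, and `baseline` are hypothetical, and `content_hash` uses std's `DefaultHasher` purely to stay dependency-free, whereas the PR uses a 12-char SHA-256 prefix.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Mutex;

/// Stand-in hash: NOT SHA-256. It only mirrors the 12-hex-char shape
/// of the real content hash so the sketch needs no external crates.
fn content_hash(raw: &[u8]) -> String {
    let mut hasher = DefaultHasher::new();
    raw.hash(&mut hasher);
    let hex = format!("{:016x}", hasher.finish());
    hex[..12].to_string()
}

/// Hypothetical, pared-down state holding only the drift baseline.
#[derive(Default)]
struct AdminState {
    loaded_config_content_hash: Mutex<Option<String>>,
}

impl AdminState {
    /// Seeds the baseline at startup (builder style).
    fn with_loaded_config_content_hash(self, hash: String) -> Self {
        *self.loaded_config_content_hash.lock().unwrap() = Some(hash);
        self
    }

    /// Hypothetical hook the reload handler would call after a successful swap.
    fn refresh_baseline(&self, raw: &[u8]) {
        *self.loaded_config_content_hash.lock().unwrap() = Some(content_hash(raw));
    }

    /// Returns a copy of the current baseline, if one has been set.
    fn baseline(&self) -> Option<String> {
        self.loaded_config_content_hash.lock().unwrap().clone()
    }
}

fn main() {
    let yaml_v1 = b"origins: {}\n";
    let yaml_v2 = b"origins: {}\nport: 8081\n";

    let state = AdminState::default()
        .with_loaded_config_content_hash(content_hash(yaml_v1));
    assert_eq!(state.baseline(), Some(content_hash(yaml_v1)));

    // After a reload the baseline follows the newly loaded bytes.
    state.refresh_baseline(yaml_v2);
    assert_eq!(state.baseline(), Some(content_hash(yaml_v2)));
}
```

The `Option` is what lets the endpoint distinguish "no baseline yet" from a real hash.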

handle_drift returns:

{
  "config_path": "/etc/sbproxy/sb.yml",
  "loaded_revision": "a3f5b1d829c4",
  "loaded_content_hash": "8e1c5d4a9f7b",
  "on_disk_content_hash": "8e1c5d4a9f7b",
  "drift": false,
  "on_disk_size_bytes": 4321,
  "checked_at": "2026-05-06T15:42:00Z"
}

Failure modes: 503 if no on-disk path or no baseline yet, 500 (with scrubbed path) on read failure, 405 on non-GET, 401 if unauthenticated.
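
The failure modes can be read as a small decision table. The sketch below is one way to express it, under stated assumptions: `DriftRequest`, `DriftReply`, and `drift_reply` are hypothetical names, the order of the checks (auth, then method, then preconditions) is a guess rather than something the PR specifies, and the read of the on-disk file is modelled as an `Option` instead of real I/O.

```rust
/// Hypothetical outcome of a drift check: a bare error status or a report.
#[derive(Debug, PartialEq)]
enum DriftReply {
    Status(u16),
    Report { drift: bool },
}

/// Hypothetical, already-extracted inputs to the check.
struct DriftRequest<'a> {
    authenticated: bool,
    method: &'a str,
    config_path: Option<&'a str>,
    baseline: Option<&'a str>,
    /// Hash of the file on disk; `None` models a read failure.
    on_disk_hash: Option<&'a str>,
}

fn drift_reply(req: &DriftRequest<'_>) -> DriftReply {
    if !req.authenticated {
        return DriftReply::Status(401);
    }
    if req.method != "GET" {
        return DriftReply::Status(405);
    }
    // Both an on-disk path and a loaded baseline are preconditions.
    let (Some(_path), Some(baseline)) = (req.config_path, req.baseline) else {
        return DriftReply::Status(503);
    };
    match req.on_disk_hash {
        None => DriftReply::Status(500),
        Some(on_disk) => DriftReply::Report { drift: on_disk != baseline },
    }
}

fn main() {
    let ok = DriftRequest {
        authenticated: true,
        method: "GET",
        config_path: Some("/etc/sbproxy/sb.yml"),
        baseline: Some("8e1c5d4a9f7b"),
        on_disk_hash: Some("8e1c5d4a9f7b"),
    };
    assert_eq!(drift_reply(&ok), DriftReply::Report { drift: false });

    let edited = DriftRequest { on_disk_hash: Some("000000000000"), ..ok };
    assert_eq!(drift_reply(&edited), DriftReply::Report { drift: true });
}
```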

The pipeline's existing config_revision (12-char origin-set identity hash) is reported alongside but not used for drift comparison — it doesn't move when only policies, transforms, or ports change, which is precisely the case operators want to detect. The raw-bytes hash is what an operator means by drift.
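
To make the distinction concrete, here is a toy model of the two hashes. It assumes an origin-set identity can be approximated by hashing only the sorted origin names, which is an illustration and not how `config_revision` is actually computed. As a dependency-free stand-in it uses std's `DefaultHasher` rather than SHA-256, and `short_hash`, `origin_set_revision`, and `content_hash` are hypothetical helpers.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in 12-hex-char digest (NOT SHA-256).
fn short_hash<T: Hash + ?Sized>(value: &T) -> String {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    let hex = format!("{:016x}", hasher.finish());
    hex[..12].to_string()
}

/// Toy origin-set identity: depends only on which origins exist.
fn origin_set_revision(origins: &[&str]) -> String {
    let mut sorted: Vec<&str> = origins.to_vec();
    sorted.sort_unstable();
    short_hash(&sorted)
}

/// Raw-bytes hash: any edit to the file changes it.
fn content_hash(raw: &[u8]) -> String {
    short_hash(raw)
}

fn main() {
    let before = "origins: [a.example, b.example]\nport: 8080\n";
    let after = "origins: [a.example, b.example]\nport: 9090\n";
    let origins = ["a.example", "b.example"];

    // Only the port changed, so the origin-set identity is unchanged...
    assert_eq!(origin_set_revision(&origins), origin_set_revision(&origins));
    // ...while the raw-bytes hash differs, which is the signal drift needs.
    assert_ne!(content_hash(before.as_bytes()), content_hash(after.as_bytes()));
}
```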

Documented in docs/configuration.md § Admin fields with a worked response example and full failure-mode table.

Build fix (sbproxy-observe + sbproxy-core)

Recent commits made origin_san a String (via sanitize_label_budget), so &[&origin_san, method, &status_str] becomes a heterogeneous array. Rust infers V as &String from the first element and then rejects the bare &str elements. Fixed every such call site by converting with .as_str() so the arrays are uniformly &[&str]. No behavioural change.
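
The compile failure can be reproduced without prometheus. In the sketch below, `join_labels` is a hypothetical stand-in for a label API that is generic over the slice element type; it is assumed to resemble the shape of `with_label_values` in 0.14 and is not that function's actual signature.

```rust
/// Hypothetical stand-in for an API generic over the label element type.
fn join_labels<V: AsRef<str>>(vals: &[V]) -> String {
    vals.iter().map(|v| v.as_ref()).collect::<Vec<&str>>().join(",")
}

fn main() {
    let origin_san: String = String::from("example.com");
    let method: &str = "GET";
    let status_str: String = 200.to_string();

    // Does not compile: the first element fixes V = &String, and a &str
    // cannot be coerced to &String.
    // let labels = join_labels(&[&origin_san, method, &status_str]);

    // Uniform &[&str]: every element is converted explicitly.
    let labels = join_labels(&[origin_san.as_str(), method, status_str.as_str()]);
    assert_eq!(labels, "example.com,GET,200");
}
```

When the parameter is a concrete `&[&str]`, each `&String` element is deref-coerced to `&str` at its own coercion site, so the mixed array compiles. With a generic `V` there is no expected element type to coerce towards, so the elements must already agree.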

Test plan

Seven new admin tests in crates/sbproxy-core/src/admin.rs::tests:

  • admin_drift_unauthorized_returns_401
  • admin_drift_rejects_post
  • admin_drift_without_config_path_returns_503
  • admin_drift_without_content_hash_baseline_returns_503
  • admin_drift_missing_file_returns_500_with_sanitised_path
  • admin_drift_after_reload_reports_no_drift — hits the full reload path so the baseline gets populated, then asserts drift: false and matching content hashes
  • admin_drift_after_file_change_reports_drift — reload, then mutate the file, then assert drift: true and differing hashes

Pre-commit gates run locally:

  • cargo fmt --all -- --check
  • cargo build --workspace --exclude sbproxy-e2e (e2e excluded because protoc isn't installed locally; CI has it)
  • cargo test -p sbproxy-core --release --tests — 271 passed, 0 failed
  • cargo clippy --workspace --all-targets --exclude sbproxy-e2e -- -D warnings
  • RUSTDOCFLAGS="-D warnings" cargo doc --workspace --no-deps --document-private-items --exclude sbproxy-e2e

CHANGELOG

[Unreleased] entries added: one in Added for the drift endpoint, one in Fixed for the prometheus 0.14 build repair. The drive-by fix is called out separately so the build-repair history is searchable.

Out of scope (follow-ups)

  • sbproxy drift -f sb.yml CLI: the ticket also lists a CLI surface. Out of this PR; a CLI wrapper around the admin endpoint is straightforward and can land next.
  • Hybrid-mode TTL'd overrides flagged distinctly from accidental drift: gated on the git+overlay work in WOR-133.
  • Structured diff (added/changed/removed origins): hash equality answers the binary "drifted yes/no" question. A structured diff is a richer follow-up that needs a YAML differ; out of scope here.

https://claude.ai/code/session_019zc6oCY6Kx2ssiuZEQdznk


Generated by Claude Code

Operators and dashboards can now scrape /admin/drift to see whether
the on-disk config has diverged from what the proxy has loaded,
without triggering a reload. Closes WOR-132.

Mechanics:

* AdminState gains a Mutex<Option<String>> baseline tracking the
  12-char SHA-256 prefix of the raw YAML bytes the proxy loaded.
  with_loaded_config_content_hash() seeds it at startup; the reload
  handler refreshes it on every successful swap.
* handle_drift compares that baseline against a fresh hash of the
  on-disk file and returns {config_path, loaded_revision,
  loaded_content_hash, on_disk_content_hash, drift, on_disk_size_bytes,
  checked_at}. 503 if no on-disk path or no baseline; 500 (with
  scrubbed path) on read failure; 405 on non-GET.
* The pipeline's existing config_revision is reported alongside but
  intentionally not used for drift comparison: it is an origin-set
  identity hash and does not move when only policies, transforms, or
  ports change. The raw-bytes hash is what an operator means by
  drift.

Seven new admin tests cover unauthorized, method, no-path, no-baseline,
missing-file (sanitised path), no-drift, and post-edit-drift paths.
docs/configuration.md gains a /admin/drift subsection under the
Admin fields table.

Drive-by build fix: prometheus 0.14 unifies the with_label_values
generic V across the array literal, which forces all elements to the
same type. Heterogeneous &[&String, &str, ...] sites in
sbproxy-observe::metrics and sbproxy-core::server failed to compile.
Coerced every such call site to uniform &[&str] via .as_str(). No
behavioural change; CHANGELOG entry under Fixed.

https://claude.ai/code/session_019zc6oCY6Kx2ssiuZEQdznk
@rickcrawford rickcrawford merged commit 0421699 into main May 6, 2026
8 checks passed
@rickcrawford rickcrawford deleted the claude/find-onnx-classifier-HucVm branch May 6, 2026 20:55