diff --git a/CHANGELOG.md b/CHANGELOG.md index 2e31e81c..fb4bbec9 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,145 +1,41 @@ -## Changelog – Orkestra v0.3.8 +# **CHANGELOG — ONCOP Integration (Orkestra Native Cross‑Operator Protocol)** -### ork doctor + ork doctor deploy — Local to production in minutes +### **Added — ONCOP v1 (Orkestra Native Cross‑Operator Protocol)** +Introduced ONCOP as the unified, typed, cross‑operator observation protocol for Orkestra. ONCOP replaces ad‑hoc HTTP integrations and hard‑coded URLs with a declarative, URL‑inferable, cache‑aware protocol used across autoscaling, status fields, and template resolution. -Developers can now deploy any project to Kubernetes with three commands, no operator knowledge required. +Key components: -#### ork doctor +- **Typed observation surfaces** + Added first‑class ONCOP types: + `metrics`, `health`, `cr`, `info`, `events` + Each type maps to a deterministic URL shape under `/katalog/`. -Examines the current directory and reports what Orkestra found: Dockerfile, Git commit, language (Go, Node.js, Java, Python, Ruby, Rust), port, `.env` variables split into Secrets and ConfigMaps, frontend detection, SMTP/Slack presence, and missing CLI tools. If SMTP or Slack credentials are present but `--notify-me` was not set, it surfaces a hint. +- **URL inference engine** + Implemented `BuildONCOPURL` to construct ONCOP URLs from `CrossCRDDeclaration` using: + `source.host`, `source.type`, `crd`, `selector.namespace`, `selector.name`. -```bash -ork doctor -ork doctor init --name my-api -ork doctor init --name my-api --notify-me --add-ingress -``` +- **Cross‑operator resolver integration** + Updated `readCross()` to support ONCOP host‑based reads as Path 2, after informer registry and before raw endpoint fallback. + Responses injected into `.cross.` for templates, autoscale conditions, and status fields. 
-`ork doctor init` generates three files: -- `.orkestra/katalog.yaml` — all Kubernetes resources Orkestra manages; edit freely -- `.orkestra/app.yaml` — the ConfigMap CR the developer owns -- `.orkestra/values.yaml` — Helm values for the Orkestra operator +- **New ONCOP type: `cr`** + Added `type: cr` for CR‑specific detail (`status`, `spec`, `children`, `metrics`). + Distinguishes CR detail from CRD‑level `info`. -#### ork doctor deploy +- **Autoscaler ONCOP support** + Autoscale conditions now resolve `cross..metrics.*` via ONCOP metrics endpoint with optional caching (`cacheFor:`). -Builds the Docker image, pushes it, runs `ork kompose` to merge all registered katalogs, generates the cluster bundle, installs or verifies the Orkestra operator via Helm, patches the image in the CR, and watches the rollout. +- **Resolver enhancements** + Added `ParseCrossField` and extraction helpers (`ExtractCrossCRD`, `ExtractCrossCategory`, `ExtractCrossFieldName`, `ExtractCrossNamespace`) to unify cross‑field parsing. -```bash -ork doctor deploy --registry ghcr.io/myorg -ork doctor deploy --registry ghcr.io/myorg --dev # spins up a local kind cluster -ork doctor deploy --registry ghcr.io/myorg --dry-run -``` +- **Fallback semantics** + Resolution priority formalised as: + `informer registry → ONCOP host → raw endpoint → empty result`. 
-Key behaviours: -- **Auto-installs `kubectl` and `helm`** if missing — developers need only Docker and the `ork` CLI -- **Auto-installs ingress controller** (nginx) when the project has a frontend -- **Multi-project kompose**: all deployed projects are registered in `~/.orkestra/deploy/komposer.yaml` by absolute local path; `ork kompose` merges them into `__runtime_katalog_do_not_edit.yml` before bundle generation — no git commit or GitHub access needed -- **Kompose errors surface first**: any malformed or unreadable katalog fails before touching the cluster -- **Deploy state** written to `~/.orkestra/deploy/state.json` before every image patch; previous image always available for instant rollback -- **Internal service URL checklist** printed after every deploy so developers can wire projects together (`export MY_API_URL=...`) -- **Control Center fallback**: when `controlCenterHost` is empty, prints the `kubectl port-forward` command for local access +- **Cross‑binary caching** + Added per‑source caching for ONCOP responses to avoid repeated remote calls. -#### ork doctor deploy rollback - -Restores the previous image by reading `~/.orkestra/deploy/state.json` (annotation fallback for backward compatibility). Swaps current and previous before patching so every rollback is reversible. - -```bash -ork doctor deploy rollback -ork doctor deploy rollback --image ghcr.io/myorg/my-api:v1.2.0 -``` - -#### Out-of-the-box developer notifications - -Every katalog generated by `ork doctor init` ships with a `notify:` block on the deployment readiness condition. When replicas are not ready within the notification interval (default 15 minutes), Orkestra sends the `developer` team the exact `kubectl logs` command and a `ork doctor deploy rollback` hint. 
- -Wire the `developer` team with `ork doctor init --notify-me`: -- Reads the developer's Git author email from `git log -1` -- Reads SMTP and Slack credentials from `.env` -- Generates a `notification:` block in `katalog.yaml` with `defaults.interval: 15m` and a `developer` team -- Creates an `orkestra-notification` Secret in `orkestra-system` during deploy — credentials never touch the Katalog YAML -- Adds `runtime.extraEnvFrom` to `values.yaml` so `pkg/konfig` reads the credentials as env vars - -#### Developer example pack - -New pack at `examples/developer/` with five progressive examples: - -| Example | What it teaches | -|---------|----------------| -| 01 — One project | First deploy, Control Center walkthrough | -| 02 — Frontend + backend | Multi-project deploy, internal service URL wiring | -| 03 — Rollback and Ingress | Breaking a deploy and restoring it, public URL | -| 04 — Notifications | SMTP/Slack wiring, triggering and observing a notification | -| 05 — Deletion protection | Default protection, correct decommission sequence | - -Registered in `embed.go`, `init_packs.go`, CI packaging, and release workflows. - ---- - -## Changelog – Orkestra v0.2.9 - -### ✨ New `ork generate katalog` – scaffold a Katalog in seconds - -Scaffold a production‑ready `katalog.yaml` with sensible defaults, optional typed‑mode placeholders, and built‑in security, notification, and provider blocks. No more memorising the schema. 
- -**Flags:** -- `--add-hook` – typed mode with a `hooks` declaration (comment) -- `--add-constructor` – typed mode with a `constructor` declaration (`default: false`) -- `--typed` – both hook and constructor sections commented; you choose one -- `--add-security` – add namespace & deletion protection stubs -- `--add-notification` – add Slack/email notification example -- `--add-provider ` – add cloud provider configuration - -**Example:** -```bash -ork generate katalog --add-hook --add-security --add-provider aws -o my-katalog.yaml -``` - -[Read the full command reference](https://docs.orkestra.io/reference/cli/generate-katalog) - ---- - -### 🚀 Complete CI/CD for typed operators (hooks & constructors) - -Two new E2E workflows now run in GitHub Actions for the **advanced pack**: - -- **09-hooks** – typed hooks for a `Database` CRD (StatefulSet + Service + optional CronJob) -- **10-constructors** – custom constructor for a `Pipeline` CRD (state machine with Jobs) - -Both workflows: -- Generate the typed registry (`ork generate registry`) -- Show the expected validation failure with the standard `ork` binary -- Build a custom `ork` binary that includes the user’s Go code -- Build, tag (with `hooks-` or `constructor-` prefix), and push a container image to `ghcr.io/orkspace/orkestra-typed-extensions` -- Deploy the image via Helm, apply the CR, and verify resource creation -- Test cleanup via owner reference garbage collection - -These workflows prove that typed operators are **fully automatable** – from `git push` to a running cluster – using the same Orkestra GitHub Action that works for dynamic operators. 
- ---- - -### 🔧 Action improvements - -- New input `generate-registry` – runs `ork generate registry` after `init` -- New output `registry_file` – path to the generated registry (for inspection) -- `namespace` input now defaults to `orkestra-system` and is passed to `generate configmap` and `generate bundle` -- Support for custom `image_repo` and `image_tag` in typed E2E workflows - ---- - -### 🐛 Fixes - -- `mode:` is now automatically inferred when `apiTypes.location` is set (no need to write `mode: typed` manually) -- Registry generation no longer requires `init=true` – works with any existing Katalog -- The stub `pkg/runtime/zz_generated_runtime_registry.go` now includes structured debug logging (`logger.Debug()`) to help diagnose registration issues - ---- - -### 📖 Documentation - -- New command reference for `ork generate katalog` -- Updated typed extensions guide (`09-hooks` and `10-constructors`) with step‑by‑step instructions and the full E2E workflow - ---- - -### Upgrading - -No upgrade required if you’re using `ork generate bundle` or `ork run`. For typed operators, simply regenerate your registry file with the new `ork generate registry` (the output format has not changed). +### **Impact** +ONCOP enables consistent, declarative, cross‑operator observation across Orkestra. +Autoscalers, status fields, and templates now consume cross‑operator data without bespoke integrations or hard‑coded URLs. +Operators implementing ONCOP become first‑class participants in the Orkestra ecosystem. diff --git a/docs/design-documents/oncop-protocol.md b/docs/design-documents/oncop-protocol.md new file mode 100644 index 00000000..a885b94d --- /dev/null +++ b/docs/design-documents/oncop-protocol.md @@ -0,0 +1,318 @@ + +# ONCOP — The Cross‑Operator Protocol of Orkestra* +*Orkestra Project — April 2026* + +--- + +## The orchestration hierarchy + +In a distributed system, no operator is an island. +A single operator manages one CRD, one concern, one bounded context. 
+But real systems are ecosystems: deployments depend on queues, queues depend on databases, databases depend on storage, and autoscalers depend on metrics from other operators.
+
+A platform is not a monolith.
+It is a **network of operators**, each responsible for its own domain, yet required to observe and react to the state of others.
+
+Orkestra formalises this with a protocol:
+
+```
+Registry — in‑process, same‑binary observation
+    ↓
+ONCOP — cross‑binary, cross‑cluster observation
+    ↓
+Resolver — unified field access (metrics, health, info, events)
+```
+
+Where Motif is the reusable unit of *construction*,
+**ONCOP is the reusable unit of *observation*.**
+
+It is the protocol that lets one operator read another operator’s state — safely, consistently, and without hard‑coded URLs.
+
+---
+
+## What ONCOP is
+
+ONCOP — the **Orkestra Native Cross‑Operator Protocol** — is a declarative, typed, URL‑inferable protocol for reading another operator’s:
+
+- **metrics** — queue depth, worker count, throughput, lag
+- **health** — state, lastError, heartbeat
+- **CR detail** — status, spec, children, conditions
+- **CRD info** — operator‑level metrics and children
+- **events** — CR‑scoped event streams
+
+It is the **observation layer** of Orkestra.
+
+ONCOP is:
+
+- **typed** — `metrics`, `health`, `cr`, `info`, `events`
+- **predictable** — URL shape is derived from type + CRD + selector
+- **declarative** — expressed in `cross:` blocks
+- **cacheable** — each source can define `cacheFor:`
+- **composable** — used by autoscale, status.fields, templates, and Motifs
+- **fallback‑aware** — registry → ONCOP → raw endpoint
+
+ONCOP is not a replacement for CRDs.
+It is the **protocol for reading CRDs managed by other operators**.
+
+---
+
+## The ONCOP schema
+
+A cross‑operator declaration in a Katalog looks like:
+
+```yaml
+operatorBox:
+  cross:
+    - crd: loader
+      selector:
+        name: my-loader
+        namespace: default
+      source:
+        host: "http://loader-runtime:8080"
+        type: cr
+        cacheFor: 10s
+      as: loaderCRInfo
+```
+
+This declares:
+
+- **what** to observe (`crd: loader`)
+- **which instance** (`selector.name: my-loader`)
+- **where** to fetch from (`source.host`)
+- **how** to interpret the endpoint (`type: cr`)
+- **how long** to cache (`cacheFor: 10s`)
+- **under what name** to expose it (`as: loaderCRInfo`)
+
+The result becomes available in templates as:
+
+```
+.cross.loaderCRInfo.status.phase
+.cross.loaderCRInfo.children.deployment.ready
+.cross.loaderCRInfo.metrics.queueDepth
+```
+
+ONCOP is the bridge between operators.
+
+---
+
+## ONCOP types
+
+ONCOP defines five observation surfaces:
+
+| Type | Meaning | URL shape |
+|------|---------|-----------|
+| **metrics** | operator‑level metrics | `/katalog/<crd>` |
+| **health** | operator health | `/katalog/<crd>/health` |
+| **cr** | CR detail (status, spec, children, metrics) | `/katalog/<crd>/cr/<namespace>/<name>` |
+| **info** | CRD‑level info (list, metrics, children) | `/katalog/<crd>` |
+| **events** | CR‑scoped events | `/katalog/<crd>/cr/<namespace>/<name>/events` |
+
+These types are **first‑class** in Orkestra:
+
+```go
+const (
+    ONCOPMetrics ONCOPType = "metrics"
+    ONCOPHealth  ONCOPType = "health"
+    ONCOPInfo    ONCOPType = "info"
+    ONCOPCR      ONCOPType = "cr"
+    ONCOPEvents  ONCOPType = "events"
+)
+```
+
+Each type corresponds to a stable, versioned endpoint in the operator runtime.
+
+---
+
+## URL inference — the heart of ONCOP
+
+ONCOP eliminates hard‑coded URLs.
+
+Given:
+
+```yaml
+source:
+  host: "http://localhost:8080"
+  type: cr
+selector:
+  name: my-loader
+  namespace: default
+crd: loader
+```
+
+ONCOP constructs:
+
+```
+http://localhost:8080/katalog/loader/cr/default/my-loader
+```
+
+No developer writes this URL.
+No operator embeds it.
+No autoscaler hard‑codes it.
+
+The protocol infers it.
+
+This is what makes ONCOP **portable**, **composable**, and **safe**.
+
+---
+
+## The resolver — unified access
+
+Once ONCOP fetches data, it is injected into the resolver under `.cross.*`.
+
+A template can reference:
+
+- `.cross.loaderCRInfo.status.phase`
+- `.cross.loaderCRInfo.children.deployment.ready`
+- `.cross.loaderHealth.state`
+- `.cross.loaderCRDInfo.metrics.queueDepth`
+
+Autoscale conditions can reference:
+
+```yaml
+when:
+  - field: cross.loader.metrics.queueDepth
+    greaterThan: "60"
+```
+
+Status fields can reference:
+
+```yaml
+loaderState: "{{ .cross.loaderCRInfo.status.phase }}"
+loaderHealthy: "{{ .cross.loaderHealth.healthy }}"
+```
+
+ONCOP makes cross‑operator data feel local.
+
+---
+
+## Resolution priority
+
+ONCOP is not the only observation path.
+It is part of a layered strategy:
+
+```
+1. Informer registry (same binary)
+2. ONCOP host (cross binary)
+3. Raw endpoint (fallback)
+4. Empty result (not found)
+```
+
+This ensures:
+
+- **fastest path first**
+- **no unnecessary HTTP calls**
+- **compatibility with non‑Orkestra operators**
+- **predictable fallback behavior**
+
+ONCOP is the middle layer — the cross‑binary, cross‑cluster path.
+
+---
+
+## Autoscaling with ONCOP
+
+Autoscalers often depend on metrics from other operators:
+
+- queue depth from a loader
+- lag from a consumer
+- throughput from a gateway
+- worker count from a processor
+
+ONCOP makes this trivial:
+
+```yaml
+when:
+  - field: cross.loader.metrics.queueDepth
+    greaterThan: "60"
+    source:
+      host: "http://localhost:8080"
+      cacheFor: 10s
+```
+
+The autoscaler does not know:
+
+- where the loader runs
+- how its metrics are exposed
+- what its URL is
+- whether it is in‑process or cross‑cluster
+
+ONCOP abstracts all of it.
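The URL inference that lets the autoscaler condition above carry only a `host` can be sketched in a few lines of Go. This is a minimal illustration, not the real `BuildONCOPURL`: the function name `inferURL`, its flat string parameters, and the choice to treat an unset type like `info` are assumptions based on the URL shapes and examples in this change.

```go
package main

import (
	"fmt"
	"strings"
)

// inferURL is a hypothetical stand-in for the URL inference step; it is not
// the real BuildONCOPURL. An empty type is treated like "info" here.
func inferURL(host, typ, crd, namespace, name string) string {
	base := strings.TrimRight(host, "/") + "/katalog/" + crd
	switch typ {
	case "health":
		return base + "/health"
	case "cr":
		return base + "/cr/" + namespace + "/" + name
	case "events":
		return base + "/cr/" + namespace + "/" + name + "/events"
	default: // "metrics", "info", or unset share the CRD-level endpoint
		return base
	}
}

func main() {
	host := "http://localhost:8080"
	fmt.Println(inferURL(host, "cr", "loader", "default", "my-loader"))
	fmt.Println(inferURL(host, "health", "loader", "", ""))
	fmt.Println(inferURL(host, "", "loader", "", ""))
}
```

The three calls print the same URLs that the example Katalog previously spelled out as raw `endpoint:` values: the CR detail URL, the health URL, and the CRD-level URL.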
+
+---
+
+## Status fields with ONCOP
+
+Operators often want to surface cross‑operator state:
+
+```yaml
+status:
+  fields:
+    - path: loaderState
+      value: "{{ .cross.loaderCRInfo.status.phase }}"
+    - path: loaderHealthy
+      value: "{{ .cross.loaderHealth.healthy }}"
+    - path: loaderQueueDepth
+      value: "{{ .cross.loaderCRDInfo.metrics.queueDepth }}"
+```
+
+This gives users a **single pane of glass** — the Processor CR shows the Loader’s health, metrics, and readiness.
+
+ONCOP makes this possible.
+
+---
+
+## Distribution — ONCOP as a protocol, not a library
+
+ONCOP is not a Go package.
+It is not a client library.
+It is a **protocol** implemented by:
+
+- the Orkestra runtime (server)
+- the Orkestra reconciler (client)
+- the Orkestra resolver (template engine)
+- the Orkestra autoscaler (evaluation engine)
+
+It is versioned, documented, and stable.
+
+Operators that implement ONCOP become **first‑class citizens** in the Orkestra ecosystem.
+
+---
+
+## The composition story
+
+```
+Operator A exposes metrics, health, CR detail via ONCOP
+    ↓
+Operator B declares cross: entries
+    ↓
+ONCOP fetches data from Operator A
+    ↓
+Resolver injects .cross.*
+    ↓
+Autoscaler evaluates conditions
+    ↓
+Status fields surface cross-operator state
+```
+
+This is the **observation pipeline** of Orkestra.
+
+Where Motif composes resources,
+**ONCOP composes operators.**
+
+---
+
+## Summary
+
+ONCOP is the missing piece that makes Orkestra a **multi‑operator platform**:
+
+- typed
+- declarative
+- URL‑inferable
+- cacheable
+- composable
+- cross‑binary
+- cross‑cluster
+
+It is the protocol that lets operators observe each other without coupling, without hard‑coded URLs, and without bespoke integrations.
+
+ONCOP is to observation what Motif is to construction:
+a reusable, declarative, versioned primitive.
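The `.cross.*` references used in the status fields of this document can be tried out with Go's standard `text/template` package. The sketch below assumes that the resolver behaves like `text/template` over nested maps, which this change does not confirm; `render` and the sample response body are invented for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strings"
	"text/template"
)

// render decodes a response body, injects it under .cross.<as>, and
// evaluates expr against it. Hypothetical helper, not part of Orkestra.
func render(as, responseJSON, expr string) (string, error) {
	var payload map[string]interface{}
	if err := json.Unmarshal([]byte(responseJSON), &payload); err != nil {
		return "", err
	}
	data := map[string]interface{}{
		"cross": map[string]interface{}{as: payload},
	}
	tmpl, err := template.New("field").Parse(expr)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Made-up body in the shape of a `type: cr` response.
	body := `{"status":{"phase":"Ready"},"children":{"deployment":{"ready":true}}}`
	for _, expr := range []string{
		"{{ .cross.loaderCRInfo.status.phase }}",
		"{{ .cross.loaderCRInfo.children.deployment.ready }}",
	} {
		got, err := render("loaderCRInfo", body, expr)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Println(got) // "Ready", then "true"
	}
}
```

Keying the injected data by the `as:` name is what lets several declarations against the same CRD (`loaderCRInfo`, `loaderHealth`, `loaderCRDInfo`) coexist under `.cross`.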
diff --git a/docs/design-documents/oncop-specs.md b/docs/design-documents/oncop-specs.md
new file mode 100644
index 00000000..e6d83afb
--- /dev/null
+++ b/docs/design-documents/oncop-specs.md
@@ -0,0 +1,346 @@
+# ONCOP — Orkestra Native Cross‑Operator Protocol
+### *Specification v1.0 — April 2026*
+### *Status: Draft Standard*
+
+---
+
+## **1. Introduction**
+
+The Orkestra Native Cross‑Operator Protocol (ONCOP) defines a standard mechanism for one operator (the *observer*) to retrieve structured state from another operator (the *subject*) across process boundaries. ONCOP provides:
+
+- a typed observation model
+- deterministic URL inference
+- consistent JSON response shapes
+- caching semantics
+- fallback behavior
+- compatibility with non‑Orkestra operators
+
+ONCOP is transport‑agnostic but defined over HTTP/1.1 and HTTP/2 for this version.
+
+This document specifies the protocol, URL structure, request/response semantics, error handling, and integration rules.
+
+---
+
+## **2. Terminology**
+
+- **Observer** — the operator performing a cross‑operator read.
+- **Subject** — the operator exposing ONCOP endpoints.
+- **CRD** — Kubernetes CustomResourceDefinition managed by the subject.
+- **CR** — a specific instance of a CRD.
+- **Cross Declaration** — a Katalog `cross:` entry requesting ONCOP data.
+- **Source** — the `source:` block specifying host, type, and caching.
+- **Type** — the ONCOP observation surface (`metrics`, `health`, `cr`, `info`, `events`).
+- **Namespace** — Kubernetes namespace of the CR.
+- **Name** — name of the CR.
+
+---
+
+## **3. Protocol Overview**
+
+ONCOP defines five observation surfaces:
+
+| Type | Description |
+|------|-------------|
+| **metrics** | Operator‑level metrics for the CRD |
+| **health** | Operator health and last error |
+| **cr** | CR‑specific detail: status, spec, children, metrics |
+| **info** | CRD‑level info: list, metrics, children |
+| **events** | CR‑scoped event stream |
+
+Each type maps to a deterministic URL shape under the subject operator’s `/katalog/` namespace.
+
+The observer constructs the URL using:
+
+- `source.host`
+- `source.type`
+- `decl.crd`
+- `decl.selector.name`
+- `decl.selector.namespace`
+
+The observer then performs an HTTP GET and injects the resulting JSON into `.cross.<as>`.
+
+---
+
+## **4. URL Construction**
+
+### **4.1 Base URL**
+
+```
+<host>/katalog/<crd>
+```
+
+Where:
+
+- `<host>` is `source.host` without trailing slash
+- `<crd>` is the CRD name from the cross declaration
+
+### **4.2 URL Shapes by Type**
+
+#### **4.2.1 metrics**
+
+```
+GET <host>/katalog/<crd>
+```
+
+Returns operator‑level metrics for the CRD.
+
+#### **4.2.2 health**
+
+```
+GET <host>/katalog/<crd>/health
+```
+
+Returns operator health state.
+
+#### **4.2.3 cr**
+
+```
+GET <host>/katalog/<crd>/cr/<namespace>/<name>
+```
+
+Returns CR‑specific detail.
+
+#### **4.2.4 info**
+
+```
+GET <host>/katalog/<crd>
+```
+
+Returns CRD‑level info (same endpoint as metrics; response includes both).
+
+#### **4.2.5 events**
+
+```
+GET <host>/katalog/<crd>/cr/<namespace>/<name>/events
+```
+
+Returns CR‑scoped events.
+
+---
+
+## **5. Request Semantics**
+
+### **5.1 HTTP Method**
+
+All ONCOP requests use:
+
+```
+GET
+```
+
+No request body is permitted.
+
+### **5.2 Headers**
+
+The observer MAY send:
+
+```
+Authorization: Bearer <token>
+Accept: application/json
+User-Agent: Orkestra/<version>
+```
+
+### **5.3 Caching**
+
+If `source.cacheFor` is specified, the observer MUST cache the response for the specified duration.
+
+Cache keys are:
+
+```
+::::
+```
+
+---
+
+## **6. Response Semantics**
+
+### **6.1 Content Type**
+
+```
+Content-Type: application/json
+```
+
+### **6.2 Response Shapes**
+
+#### **6.2.1 metrics**
+
+```json
+{
+  "metrics": {
+    "<name>": <value>,
+    ...
+  }
+}
+```
+
+#### **6.2.2 health**
+
+```json
+{
+  "state": "healthy" | "degraded" | "error",
+  "lastError": ""
+}
+```
+
+#### **6.2.3 cr**
+
+```json
+{
+  "status": { ... },
+  "spec": { ... },
+  "children": {
+    "": {
+      "": { ... }
+    }
+  },
+  "metrics": { ... }
+}
+```
+
+#### **6.2.4 info**
+
+```json
+{
+  "crd": "",
+  "metrics": { ... },
+  "children": { ... }
+}
+```
+
+#### **6.2.5 events**
+
+```json
+{
+  "events": [
+    {
+      "timestamp": "",
+      "type": "",
+      "reason": "",
+      "message": ""
+    }
+  ]
+}
+```
+
+---
+
+## **7. Error Handling**
+
+### **7.1 HTTP Errors**
+
+| Code | Meaning |
+|------|---------|
+| 404 | CR or CRD not found |
+| 401 | Unauthorized |
+| 403 | Forbidden |
+| 500 | Operator internal error |
+| 503 | Operator unavailable |
+
+### **7.2 Observer Behavior**
+
+If ONCOP returns an error:
+
+1. Observer MUST log the error
+2. Observer MUST NOT retry immediately
+3. Observer MUST fall back to:
+   - raw endpoint (if provided)
+   - empty result
+
+Empty result shape:
+
+```json
+{
+  "found": "false",
+  "status": {},
+  "spec": {},
+  "children": {}
+}
+```
+
+---
+
+## **8. Integration with Katalog**
+
+A cross declaration:
+
+```yaml
+cross:
+  - crd: loader
+    selector:
+      name: my-loader
+      namespace: default
+    source:
+      host: "http://loader:8080"
+      type: cr
+      cacheFor: 10s
+    as: loaderCRInfo
+```
+
+MUST result in:
+
+```
+.cross.loaderCRInfo → JSON response
+```
+
+The resolver MUST expose:
+
+- `.cross.<as>.status.*`
+- `.cross.<as>.spec.*`
+- `.cross.<as>.children.*`
+- `.cross.<as>.metrics.*`
+
+Autoscale conditions MUST be able to reference:
+
+```
+cross.<crd>.metrics.<field>
+```
+
+---
+
+## **9. Security Considerations**
+
+- Operators SHOULD require authentication for ONCOP endpoints.
+- Tokens SHOULD be short‑lived.
+- Operators MUST NOT expose sensitive data in ONCOP responses. +- Cross‑cluster ONCOP SHOULD use TLS. + +--- + +## **10. Versioning** + +ONCOP is versioned independently of Orkestra. + +This document defines: + +``` +ONCOP/1.0 +``` + +Future versions MAY add: + +- streaming endpoints +- PATCH‑based deltas +- typed schemas +- CRD introspection + +Backward compatibility MUST be preserved for all URL shapes. + +--- + +## **11. IANA Considerations** + +ONCOP registers the following media type: + +``` +application/vnd.orkestra.oncop+json +``` + +Used for all ONCOP responses. + +--- + +## **12. Conclusion** + +ONCOP provides a stable, typed, declarative protocol for cross‑operator observation in Orkestra. It enables autoscaling, status propagation, dependency tracking, and multi‑operator composition without coupling or hard‑coded URLs. + +It is the observation substrate of the Orkestra platform. diff --git a/examples/advanced/12-autoscale/04-sibling-in-cluster/katalog-processor.yaml b/examples/advanced/12-autoscale/04-sibling-in-cluster/katalog-processor.yaml index a310f8be..7f12b0d0 100644 --- a/examples/advanced/12-autoscale/04-sibling-in-cluster/katalog-processor.yaml +++ b/examples/advanced/12-autoscale/04-sibling-in-cluster/katalog-processor.yaml @@ -37,15 +37,32 @@ spec: - crd: loader selector: name: my-loader + namespace: default source: - endpoint: "http://localhost:8080/katalog/loader/cr/default/my-loader" - as: loaderInfo + host: "http://localhost:8080" + type: cr - # Monitor the operator itself + # Or replace host and type with endpoint + # endpoint: "http://localhost:8080/katalog/loader/cr/default/my-loader" + as: loaderCRInfo + + # Monitor the operator health and itself - crd: loader source: - endpoint: "http://localhost:8080/katalog/loader/health" + host: "http://localhost:8080" + type: health + cacheFor: 30s + + # Or replace host and type with endpoint + # endpoint: "http://localhost:8080/katalog/loader/health" as: loaderHealth + + - crd: loader + source: 
+ host: "http://localhost:8080" + cacheFor: 30s + as: loaderCRDInfo + autoscale: interval: 20s cooldown: 3m @@ -54,8 +71,8 @@ spec: - field: cross.loader.metrics.queueDepth greaterThan: "60" source: - # endpoint: "http://orkestra-runtime.loader-system:8080/katalog/loader" # > if running in a pod - endpoint: "http://localhost:8080/katalog/loader" + # host: "http://orkestra-runtime.loader-system:8080" # > if running in a pod + host: "http://localhost:8080" cacheFor: 10s do: workers: 8 @@ -75,9 +92,11 @@ spec: - path: state value: "{{ .health.state }}" - path: loaderQueueDepth - value: "{{ .cross.loaderInfo.metrics.queueDepth }}" + value: "{{ .cross.loaderCRDInfo.metrics.queueDepth }}" + - path: queueDepth + value: "{{ .metrics.queueDepth }}" - path: loaderState - value: "{{ .cross.loaderInfo.status.phase }}" + value: "{{ .cross.loaderCRInfo.status.phase }}" - path: loaderHealthy value: "{{ .cross.loaderHealth.healthy }}" - path: loaderLastError @@ -85,7 +104,7 @@ spec: - path: loaderPhase value: "{{ .cross.loaderHealth.state }}" - path: loaderDeploymentReady - value: "{{ .cross.loaderInfo.children.deployment.ready }}" + value: "{{ .cross.loaderCRInfo.children.deployment.ready }}" onCreate: deployments: diff --git a/pkg/autoscaler/autoscale_cross_metrics.go b/pkg/autoscaler/autoscale_cross_metrics.go index 4f6acbb8..652edac5 100644 --- a/pkg/autoscaler/autoscale_cross_metrics.go +++ b/pkg/autoscaler/autoscale_cross_metrics.go @@ -85,7 +85,7 @@ func (r *CrossMetricsRegistry) Get(crd string) *AutoMetrics { // field names as AutoMetrics.AsMap(). // // Returns "" when neither path finds a value. -func ResolveCrossMetric(registry *CrossMetricsRegistry, field string, source *orktypes.CrossSource) string { +func OldResolveCrossMetric(registry *CrossMetricsRegistry, field string, source *orktypes.CrossSource) string { // Expected: cross..metrics. // e.g. 
cross.managed-database.metrics.queueDepth if registry == nil && source == nil { @@ -117,6 +117,48 @@ func ResolveCrossMetric(registry *CrossMetricsRegistry, field string, source *or return "" } +func ResolveCrossMetric( + registry *CrossMetricsRegistry, + field string, + source *orktypes.CrossSource, +) string { + // Parse the cross..metrics. structure + cf := orktypes.ParseCrossField(field) + metricsType := orktypes.MetricsType().String() + if cf == nil || cf.Category != metricsType { + return "" + } + + metricField := metricsType + "." + cf.Field + + // Path 1: in-process registry (zero-hop, same-binary) + if registry != nil { + if m := registry.Get(cf.CRD); m != nil { + return m.Get(metricField) + } + } + + // Path 2: ONCOP remote fetch (cross-binary/cluster) + if source != nil { + // Raw endpoint takes precedence + if source.Endpoint != "" { + return fetchCrossMetricHTTP(source.Endpoint, source.Token, cf.Field) + } + + // ONCOP host-based URL inference + if source.Host != "" { + url := orktypes.BuildONCOPURL(orktypes.CrossCRDDeclaration{ + Source: source, + Crd: cf.CRD, + Selector: orktypes.CrossSelector{Namespace: cf.Namespace}, + }) + return fetchCrossMetricHTTP(url, source.Token, cf.Field) + } + } + + return "" +} + const crossMetricHTTPTimeout = 5 * time.Second // fetchCrossMetricHTTP calls the remote operator's /katalog/{crd} endpoint and @@ -173,9 +215,3 @@ func fetchCrossMetricHTTP(endpoint, token, metricName string) string { return fmt.Sprintf("%v", v) } } - -// IsCrossMetricField returns true when the field path refers to another -// operatorbox's runtime metrics via the cross.*.metrics.* namespace. 
-func IsCrossMetricField(field string) bool { - return strings.HasPrefix(field, "cross.") && strings.Contains(field, ".metrics.") -} diff --git a/pkg/autoscaler/autoscaler.go b/pkg/autoscaler/autoscaler.go index 0783d793..6ad9ed06 100644 --- a/pkg/autoscaler/autoscaler.go +++ b/pkg/autoscaler/autoscaler.go @@ -154,7 +154,7 @@ func (a *Autoscaler) buildConditionData() map[string]interface{} { all := append(a.spec.Conditions.AnyOf, a.spec.Conditions.When...) for _, cond := range all { // Cross-metric resolution - if IsCrossMetricField(cond.Field) { + if orktypes.IsCrossMetricField(cond.Field) { val := ResolveCrossMetric(GlobalCrossMetricsRegistry, cond.Field, cond.Source) if val != "" { injectCrossMetricValue(data, cond.Field, val) diff --git a/pkg/reconciler/generic_registry.go b/pkg/reconciler/generic_registry.go index 0be96c29..938cc4ee 100644 --- a/pkg/reconciler/generic_registry.go +++ b/pkg/reconciler/generic_registry.go @@ -31,9 +31,3 @@ type KatalogRegistry interface { // key and value. Returns nil, false when no CRD matches. GetInformerByLabel(key, value string) (cache.SharedIndexInformer, bool) } - -// HealthProvider is the minimal interface GenericReconciler needs -// to expose CRD health to templates without importing kordinator. -type HealthProvider interface { - HealthAsMap() map[string]interface{} -} diff --git a/pkg/reconciler/run_template_reconcile.go b/pkg/reconciler/run_template_reconcile.go index 323b5f27..4b4799e3 100644 --- a/pkg/reconciler/run_template_reconcile.go +++ b/pkg/reconciler/run_template_reconcile.go @@ -328,26 +328,56 @@ func (r *GenericReconciler[PTR]) readCross( } notFoundCrossBinary := false - // Path 2: HTTP endpoint fallback. // For cross-binary or cross-cluster. Uses Orkestra's CR detail endpoint. 
- if decl.Source != nil && decl.Source.Endpoint != "" { - endpointURL, _ := resolver.Resolve(decl.Source.Endpoint) - token := expandEnv(decl.Source.Token) - data := fetchCrossViaHTTP(ctx, endpointURL, token) - if data != nil { - result[as] = data - log.Debug(). + // Path 2: HTTP fallback (raw endpoint OR ONCOP host-based URL inference) + if decl.Source != nil { + + // 2a: Raw endpoint takes precedence (non-Orkestra operators) + if decl.Source.Endpoint != "" { + endpointURL, _ := resolver.Resolve(decl.Source.Endpoint) + token := expandEnv(decl.Source.Token) + + data := fetchCrossViaHTTP(ctx, endpointURL, token) + if data != nil { + result[as] = data + log.Debug(). + Str("crd", decl.Crd). + Str("as", as). + Str("endpoint", endpointURL). + Msg("cross: read via raw HTTP endpoint") + continue + } + + notFoundCrossBinary = true + log.Warn(). Str("crd", decl.Crd). - Str("as", as). Str("endpoint", endpointURL). - Msg("cross: read via HTTP endpoint") - continue + Msg("cross: raw HTTP endpoint returned nil") + } + + // 2b: ONCOP host-based URL inference (Orkestra-native operators) + if decl.Source.Host != "" { + // Build ONCOP URL from host + type + crd + ns + name + url := orktypes.BuildONCOPURL(decl) + + token := expandEnv(decl.Source.Token) + data := fetchCrossViaHTTP(ctx, url, token) + if data != nil { + result[as] = data + log.Debug(). + Str("crd", decl.Crd). + Str("as", as). + Str("endpoint", url). + Msg("cross: read via ONCOP host") + continue + } + + notFoundCrossBinary = true + log.Warn(). + Str("crd", decl.Crd). + Str("endpoint", url). + Msg("cross: ONCOP endpoint returned nil") } - notFoundCrossBinary = true - log.Warn(). - Str("crd", decl.Crd). - Str("endpoint", endpointURL). 
- Msg("cross: HTTP endpoint returned nil") } if notFoundInBianry && notFoundCrossBinary { diff --git a/pkg/types/conditions.go b/pkg/types/conditions.go index 7092b085..59b58535 100644 --- a/pkg/types/conditions.go +++ b/pkg/types/conditions.go @@ -112,7 +112,7 @@ type Condition struct { // - field: cross.managed-database.metrics.queueDepth // greaterThan: "500" // source: - // endpoint: "http://non-orkestra-database-operator:8080/katalog/managed-database" + // endpoint: "http://non-orkestra-database-operator:8080/api/managed-database/metrics" Source *CrossSource `yaml:"source,omitempty" json:"source,omitempty"` } diff --git a/pkg/types/cross.go b/pkg/types/cross.go index 3cdfd10b..15fc5f22 100644 --- a/pkg/types/cross.go +++ b/pkg/types/cross.go @@ -124,28 +124,32 @@ type CrossSelector struct { // i.e., the Orkestra CR detail endpoint format. // Namespace is optional; defaults to the CR's namespace when omitted. type CrossSource struct { - // Endpoint is a fully-qualified URL. If set, Orkestra uses it directly - // and ignores Host/Type/Namespace. Template expressions supported. - Endpoint string `yaml:"endpoint,omitempty" json:"endpoint,omitempty"` - - // Host is the base URL of a remote Orkestra runtime, e.g.: - // http://orkestra-runtime.loader-system:8080 - // Combined with Type to build the final URL. - Host string `yaml:"host,omitempty" json:"host,omitempty"` - - // Type selects which Orkestra-native endpoint to call. - // One of: "info", "metrics", "health", "events". - // Default: "info". - Type string `yaml:"type,omitempty" json:"type,omitempty"` - - // Namespace overrides the CR namespace when building info/events URLs. - // Optional — defaults to the CR's own namespace. - Namespace string `yaml:"namespace,omitempty" json:"namespace,omitempty"` - - // Token is a bearer token for the endpoint. $ENV_VAR syntax supported. 
- Token string `yaml:"token,omitempty" json:"token,omitempty"` - - // CacheFor controls how long to cache the result before calling again. - // Default: 30s — prevents hammering the endpoint on every evaluation. - CacheFor string `yaml:"cacheFor,omitempty" json:"cacheFor,omitempty"` + // // CRD short name (e.g. "loader", "processor", "managed-database"). + // // Required when Host is used. + // CRD string `yaml:"crd,omitempty" json:"crd,omitempty"` + + // Endpoint is a fully-qualified URL. If set, Orkestra uses it directly + // and ignores Host/Type/Namespace. Template expressions supported. + Endpoint string `yaml:"endpoint,omitempty" json:"endpoint,omitempty"` + + // Host is the base URL of a remote Orkestra runtime, e.g.: + // http://orkestra-runtime.loader-system:8080 + // Combined with Type to build the final URL. + Host string `yaml:"host,omitempty" json:"host,omitempty"` + + // Type selects which Orkestra-native endpoint to call. + // One of: "info", "metrics", "health", "events". + // Default: "info". + Type ONCOPType `yaml:"type,omitempty" json:"type,omitempty"` + + // Namespace overrides the CR namespace when building info/events URLs. + // Optional — defaults to the CR's own namespace. + Namespace string `yaml:"namespace,omitempty" json:"namespace,omitempty"` + + // Token is a bearer token for the endpoint. $ENV_VAR syntax supported. + Token string `yaml:"token,omitempty" json:"token,omitempty"` + + // CacheFor controls how long to cache the result before calling again. + // Default: 30s — prevents hammering the endpoint on every evaluation. 
+ CacheFor string `yaml:"cacheFor,omitempty" json:"cacheFor,omitempty"` } diff --git a/pkg/types/cross_methods.go b/pkg/types/cross_methods.go new file mode 100644 index 00000000..0e586c00 --- /dev/null +++ b/pkg/types/cross_methods.go @@ -0,0 +1,214 @@ +package types + +import "strings" + +// +// ──────────────────────────────────────────────────────────────────────────────── +// CROSS‑FIELD PARSING HELPERS +// These helpers parse field paths of the form: +// +// cross.<crd>.<category>.<field> +// +// Examples: +// cross.loader.metrics.queueDepth +// cross.db.health.lastError +// cross.api.info.status.phase +// cross.worker.events.items[0].message +// +// They are used by the resolver, autoscaler, and ONCOP router. +// ──────────────────────────────────────────────────────────────────────────────── +// + +// ExtractCrossCRD returns the <crd> portion of a cross.* field path. +// Examples: +// +// cross.loader.metrics.queueDepth → "loader" +// cross.processor.health.state → "processor" +// cross.db.info.status.phase → "db" +func ExtractCrossCRD(field string) string { + if !strings.HasPrefix(field, "cross.") { + return "" + } + rest := strings.TrimPrefix(field, "cross.") + dot := strings.Index(rest, ".") + if dot < 0 { + return "" + } + return rest[:dot] +} + +// ExtractCrossSuffix returns the suffix after cross.<crd>. in a field path. +// Examples: +// +// cross.loader.metrics.queueDepth → "metrics.queueDepth" +// cross.db.health.lastError → "health.lastError" +// cross.api.info.status.phase → "info.status.phase" +func ExtractCrossSuffix(field string) string { + if !strings.HasPrefix(field, "cross.") { + return "" + } + rest := strings.TrimPrefix(field, "cross.") + dot := strings.Index(rest, ".") + if dot < 0 { + return "" + } + return rest[dot+1:] +} + +// ExtractCrossCategory returns the category segment after cross.<crd>.
+// Examples: +// +// cross.loader.metrics.queueDepth → "metrics" +// cross.db.health.state → "health" +// cross.api.info.status.phase → "info" +func ExtractCrossCategory(field string) string { + suffix := ExtractCrossSuffix(field) + dot := strings.Index(suffix, ".") + if dot < 0 { + return "" + } + return suffix[:dot] +} + +// ExtractCrossFieldName returns the field name after the category. +// Examples: +// +// cross.loader.metrics.queueDepth → "queueDepth" +// cross.db.health.lastError → "lastError" +// cross.api.info.status.phase → "status.phase" +func ExtractCrossFieldName(field string) string { + suffix := ExtractCrossSuffix(field) + dot := strings.Index(suffix, ".") + if dot < 0 { + return "" + } + return suffix[dot+1:] +} + +// ExtractCrossNamespace returns the namespace portion for ONCOP info/events +// when encoded in a field path of the form: +// +// cross.<crd>.info.<namespace>.<name>.status.phase +// +// If no namespace is encoded, returns "". +// +// Examples: +// +// cross.db.info.default.my-db.status.phase → "default" +// cross.api.info.prod.service-a.spec.port → "prod" +// cross.loader.metrics.queueDepth → "" +// +// NOTE: Namespace is only encoded for info/events categories. Metrics/health +// do not carry namespace in the field path. +func ExtractCrossNamespace(field string) string { + category := ExtractCrossCategory(field) + if category != "info" && category != "events" { + return "" + } + + suffix := ExtractCrossSuffix(field) + parts := strings.Split(suffix, ".") + if len(parts) < 3 { + return "" + } + + // suffix = "<category>.<namespace>.<name>.<field>..." + // parts[0] = category + // parts[1] = namespace + return parts[1] +} + +// +// ──────────────────────────────────────────────────────────────────────────────── +// PARSED STRUCT +// ──────────────────────────────────────────────────────────────────────────────── +// + +// CrossField describes a parsed cross.<crd>.<category>.<field> reference. +type CrossField struct { + CRD string // loader, db, processor, etc. 
+ Category string // metrics, health, info, events, children + Namespace string // optional (info/events only) + Field string // queueDepth, lastError, status.phase, etc. +} + +// ParseCrossField parses a cross.* field path into a structured CrossField. +// +// Examples: +// +// cross.loader.metrics.queueDepth +// → CRD=loader, Category=metrics, Namespace="", Field="queueDepth" +// +// cross.db.health.lastError +// → CRD=db, Category=health, Namespace="", Field="lastError" +// +// cross.api.info.default.my-api.status.phase +// → CRD=api, Category=info, Namespace=default, Field="status.phase" +// +// Returns nil if the field is not a valid cross.* reference. +func ParseCrossField(field string) *CrossField { + if !strings.HasPrefix(field, "cross.") { + return nil + } + + crd := ExtractCrossCRD(field) + if crd == "" { + return nil + } + + category := ExtractCrossCategory(field) + if category == "" { + return nil + } + + ns := ExtractCrossNamespace(field) + name := ExtractCrossFieldName(field) + + return &CrossField{ + CRD: crd, + Category: category, + Namespace: ns, + Field: name, + } +} + +// +// ──────────────────────────────────────────────────────────────────────────────── +// CATEGORY DETECTORS +// ──────────────────────────────────────────────────────────────────────────────── +// + +// IsCrossMetricField returns true when the field path refers to another +// operatorbox's runtime metrics via the cross.*.metrics.* namespace. +func IsCrossMetricField(field string) bool { + return strings.HasPrefix(field, "cross.") && + strings.Contains(field, ".metrics.") +} + +// IsCrossHealthField returns true when the field path refers to another +// operatorbox's runtime health via the cross.*.health.* namespace. 
+func IsCrossHealthField(field string) bool { + return strings.HasPrefix(field, "cross.") && + strings.Contains(field, ".health.") +} + +// IsCrossInfoField returns true when the field path refers to another +// operatorbox's CR detail (the CRD info endpoint) via cross.*.info.*. +func IsCrossInfoField(field string) bool { + return strings.HasPrefix(field, "cross.") && + strings.Contains(field, ".info.") +} + +// IsCrossEventsField returns true when the field path refers to another +// operatorbox's CR events via the cross.*.events.* namespace. +func IsCrossEventsField(field string) bool { + return strings.HasPrefix(field, "cross.") && + strings.Contains(field, ".events.") +} + +// IsCrossChildrenField returns true when the field path refers to another +// operatorbox's managed Kubernetes children via cross.*.children.*. +func IsCrossChildrenField(field string) bool { + return strings.HasPrefix(field, "cross.") && + strings.Contains(field, ".children.") +} diff --git a/pkg/types/cross_oncop.go b/pkg/types/cross_oncop.go new file mode 100644 index 00000000..b753de5e --- /dev/null +++ b/pkg/types/cross_oncop.go @@ -0,0 +1,107 @@ +package types + +import ( + "fmt" + "strings" +) + +// ONCOPType defines the category of cross-operator data being fetched. +type ONCOPType string + +const ( + ONCOPMetrics ONCOPType = "metrics" + ONCOPHealth ONCOPType = "health" + ONCOPInfo ONCOPType = "info" + ONCOPCR ONCOPType = "cr" + ONCOPEvents ONCOPType = "events" +) + +func MetricsType() ONCOPType { return ONCOPMetrics } +func HealthType() ONCOPType { return ONCOPHealth } +func InfoType() ONCOPType { return ONCOPInfo } +func CRType() ONCOPType { return ONCOPCR } +func EventsType() ONCOPType { return ONCOPEvents } + +// String returns the string representation of the ONCOPType. +// This ensures fmt.Printf, logs, and YAML marshaling behave predictably. 
+func (t ONCOPType) String() string { + switch t { + case ONCOPMetrics: + return "metrics" + case ONCOPHealth: + return "health" + case ONCOPInfo: + return "info" + case ONCOPCR: + return "cr" + case ONCOPEvents: + return "events" + default: + return string(t) + } +} + +// +// ──────────────────────────────────────────────────────────────────────────────── +// ONCOP URL BUILDER +// ──────────────────────────────────────────────────────────────────────────────── +// + +// BuildONCOPURL constructs an Orkestra‑native cross‑operator URL using the +// Orkestra Native Cross‑Operator Protocol (ONCOP). +// +// ONCOP allows cross‑binary CRD observation without requiring callers to +// hard‑code full URLs. When a CrossSource specifies a Host (and no Endpoint), +// the final URL is inferred from: +// +// - the CRD name extracted from the field path (cross.<crd>.…) +// - the Source.Type ("info", "metrics", "health", "events") +// - the Source.Namespace override (optional) +// - the CR's own namespace/name when Type requires it +// +// URL shapes: +// +// Type: "cr" → /katalog/<crd>/cr/<ns>/<name> +// Type: "info" → /katalog/<crd> +// Type: "metrics" → /katalog/<crd> +// Type: "health" → /katalog/<crd>/health +// Type: "events" → /katalog/<crd>/cr/<ns>/<name>/events +// +// If Source.Endpoint is provided, ONCOP is bypassed entirely and the raw +// endpoint is used as‑is. This enables integration with non‑Orkestra operators +// or arbitrary JSON‑producing services. +// +// BuildONCOPURL centralises this logic so all cross‑CRD resolution paths +// (metrics, health, info, autoscale conditions, status.fields) share the same +// URL construction semantics. 
+func BuildONCOPURL(decl CrossCRDDeclaration) string { + src := decl.Source + crd := decl.Crd + name := decl.Selector.Name + ns := decl.Selector.Namespace + + host := strings.TrimSuffix(src.Host, "/") + + switch src.Type { + case ONCOPMetrics, ONCOPInfo, "": + return fmt.Sprintf("%s/katalog/%s", host, crd) + + case ONCOPHealth: + return fmt.Sprintf("%s/katalog/%s/health", host, crd) + + case ONCOPEvents: + if ns == "" { + return fmt.Sprintf("%s/katalog/%s/cr/%s/events", host, crd, name) + } + return fmt.Sprintf("%s/katalog/%s/cr/%s/%s/events", host, crd, ns, name) + + case ONCOPCR: + if ns == "" { + return fmt.Sprintf("%s/katalog/%s/cr/%s", host, crd, name) + } + return fmt.Sprintf("%s/katalog/%s/cr/%s/%s", host, crd, ns, name) + + default: + return fmt.Sprintf("%s/katalog/%s", host, crd) + } +}
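The URL shapes produced by the switch above can be checked with a small standalone sketch. It is not the code from this diff: `declaration`, `source`, `selector`, and `oncopURL` are hypothetical stand-ins for `CrossCRDDeclaration`, `CrossSource`, `CrossSelector`, and `BuildONCOPURL`, reduced to the fields the builder reads, and the CR name `my-loader` is invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the types in pkg/types; only the fields
// that the URL builder reads are reproduced here.
type source struct {
	Host string
	Type string
}

type selector struct {
	Namespace string
	Name      string
}

type declaration struct {
	Crd      string
	Source   source
	Selector selector
}

// oncopURL mirrors the branching of BuildONCOPURL as shown in the diff.
func oncopURL(d declaration) string {
	host := strings.TrimSuffix(d.Source.Host, "/")
	base := fmt.Sprintf("%s/katalog/%s", host, d.Crd)

	// CR-scoped path: the namespace segment is dropped when it is empty.
	crPath := base + "/cr/" + d.Selector.Name
	if d.Selector.Namespace != "" {
		crPath = base + "/cr/" + d.Selector.Namespace + "/" + d.Selector.Name
	}

	switch d.Source.Type {
	case "health":
		return base + "/health"
	case "events":
		return crPath + "/events"
	case "cr":
		return crPath
	default: // "metrics", "info", empty, or any unknown type
		return base
	}
}

func main() {
	d := declaration{
		Crd:      "loader",
		Source:   source{Host: "http://orkestra-runtime.loader-system:8080/"},
		Selector: selector{Namespace: "default", Name: "my-loader"},
	}
	for _, t := range []string{"", "metrics", "info", "health", "cr", "events"} {
		d.Source.Type = t
		fmt.Printf("%q -> %s\n", t, oncopURL(d))
	}
}
```

In this sketch, as in the diff, `metrics`, `info`, and an empty type all resolve to the same `/katalog/<crd>` URL, and a trailing slash on the host is trimmed before the path is appended.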