diff --git a/.gitignore b/.gitignore
index 391564d4..f6ed7534 100644
--- a/.gitignore
+++ b/.gitignore
@@ -21,4 +21,7 @@ testbin/*
 /website/public
 
 # /tmp
-tmp/
\ No newline at end of file
+tmp/
+
+# coding agents
+.pi
\ No newline at end of file
diff --git a/docs/specs/version-cleanup-without-prometheus/spec.md b/docs/specs/version-cleanup-without-prometheus/spec.md
new file mode 100644
index 00000000..25ce2f85
--- /dev/null
+++ b/docs/specs/version-cleanup-without-prometheus/spec.md
@@ -0,0 +1,378 @@
+---
+workspace: ../../..
+planning:
+  model: hai-claude/anthropic--claude-4.7-opus
+  reviewer_model: hai-gemini/gemini-2.5-pro
+  max_rounds: 3
+  timeout: 30m
+  image: alpine
+execution:
+  model: hai-claude/anthropic--claude-4.6-sonnet
+  timeout: 8m
+  concurrency: 3
+  image: go-alpine
+correction:
+  max_retries: 2
+  max_waves: 2
+acceptance:
+  model: hai-claude/anthropic--claude-4.7-opus
+  image: go-alpine
+  timeout: 15m
+---
+
+# Allow CAPApplicationVersion Cleanup Without a Prometheus Connection
+
+The CAP Operator controller currently refuses to start the `CAPApplicationVersion`
+cleanup routines unless the `PROMETHEUS_ADDRESS` environment variable is set
+*and* a Prometheus client can be initialised against it (a `Runtimeinfo` probe
+must succeed). This makes Prometheus a hard dependency for an entirely
+metrics-free use case: an application whose `CAPApplicationVersion` workloads
+declare no `monitoring.deletionRules` would still like older, unused versions
+to be cleaned up automatically once a newer version becomes `Ready`.
+
+This change decouples the cleanup loop from Prometheus availability:
+
+- The cleanup routines start unconditionally.
+- Versions whose deployment workloads declare *no* deletion rules continue to
+  be eligible for cleanup as today, regardless of whether Prometheus is
+  reachable.
+- Versions that *do* declare deletion rules are only evaluated when a working
+  Prometheus client is available; otherwise they are skipped (left for the
+  next cycle) with a clear log message and *not* deleted.
+
+The annotation contract on `CAPApplication` (`sme.sap.com/enable-cleanup-monitoring`)
+is unchanged. Defaulting that annotation, or adding an opt-out value, is
+explicitly out of scope here and tracked separately.
+
+## Background — current behaviour
+
+All paths below are inside the controller package and refer to the existing
+implementation that this change is modifying.
+
+`internal/controller/version-monitoring.go`:
+
+- `parseMonitoringEnv()` returns `nil` when `PROMETHEUS_ADDRESS` is empty or
+  whitespace.
+- `(*Controller).startVersionCleanup(ctx)` short-circuits with `return` when
+  `parseMonitoringEnv()` returned `nil`. The schedule and worker goroutines
+  are never started.
+- `initializeVersionCleanupOrchestrator(ctx, mEnv)` constructs a Prometheus
+  client and validates it with `v1api.Runtimeinfo(ctx)`. Either failure
+  returns `nil`, and the surrounding loop in `startVersionCleanup` retries
+  after `acquireClientRetryDelay` (default `1h`).
+- The `cleanupOrchestrator` struct stores the `promv1.API` once at startup;
+  the `api` value is then read by every downstream evaluator and is assumed
+  to be non-nil.
+- `evaluateWorkloadForCleanup(ctx, cav, wl, promapi)` is the per-workload
+  evaluator. When `wl.DeploymentDefinition.Monitoring == nil` *or*
+  `Monitoring.DeletionRules == nil`, the workload is auto-eligible without
+  ever touching the Prometheus client; otherwise it calls
+  `evaluateExpression` or `evaluateMetric`, both of which dereference the
+  `promv1.API`.
+
+`internal/controller/controller.go` (around line 233):
+
+- Calls `c.startVersionCleanup(qCxt)` from the controller's main goroutine
+  group, wrapped in `wg.Go(...)`.
+
+`internal/controller/reconcile.go` (line 57):
+
+- Defines `AnnotationEnableCleanupMonitoring = "sme.sap.com/enable-cleanup-monitoring"`.
+- `queueVersionsForCleanupEvaluation` already filters CAPApplications by this
+  annotation (only `"true"` and `"dry-run"` enable the loop) and excludes
+  versions linked to tenants and versions newer than the latest `Ready`
+  version. None of that filtering needs to change.
+
+Documentation that mentions the old behaviour:
+
+- `website/content/en/docs/usage/version-monitoring.md` — section
+  "Integration with Prometheus" claims "If no Prometheus address is supplied,
+  the version monitoring function is not started."
+- `website/content/en/docs/configuration/_index.md` — the `PROMETHEUS_ADDRESS`
+  bullet says "If not set, the version monitoring function is not started."
+
+Tests in `internal/controller/version-monitoring_test.go` that hard-code the
+"empty address ⇒ nil monitoring env / no routines" assumption:
+
+- `TestMonitoringEnv` — first sub-case asserts `mEnv == nil` when the env var
+  is unset.
+- `TestGracefulShutdownMonitoringRoutines` — sets `PROMETHEUS_ADDRESS` before
+  starting the cleanup loop.
+- `Test_initializeVersionCleanupOrchestrator` — passes a non-empty address.
+
+## Required behaviour
+
+### Routine startup
+
+`startVersionCleanup(ctx)` must always start the schedule and worker
+goroutines, regardless of `PROMETHEUS_ADDRESS`:
+
+- `parseMonitoringEnv()` must always return a non-nil `*monitoringEnv`.
+  - When `PROMETHEUS_ADDRESS` is unset/blank, `address` is the empty string.
+    The other fields (`acquireClientRetryDelay`, `evaluationInterval`) are
+    populated using the same env-var lookup with the same defaults
+    (`1h` and `10m`) as today.
+  - When `PROMETHEUS_ADDRESS` is set, behaviour is unchanged.
+- `startVersionCleanup` must no longer `return` early based on the
+  monitoring env. The schedule + worker goroutines must run for the lifetime
+  of `ctx`, with the same panic-recovery + `restartSignal` semantics as
+  today.
+- The Prometheus *client* must no longer block routine startup. The current
+  `setup()` retry loop (which sleeps `acquireClientRetryDelay` while the
+  Prom server is unavailable) must not be on the critical path for starting
+  the worker. The orchestrator must be constructable without a working
+  Prometheus connection.
+
+### Prometheus client lifecycle
+
+Introduce a way to obtain the current Prometheus API client lazily, so that
+each evaluation cycle reflects the latest reachability:
+
+- When `address` is empty, the provider always reports "unavailable"; no
+  client is ever constructed and no `Runtimeinfo` probe is made.
+- When `address` is non-empty:
+  - A client is constructed; if construction fails, the provider reports
+    "unavailable" for this cycle and logs the error at most once per
+    transition from available → unavailable (avoid log spam every cycle).
+  - The client's reachability is verified via `Runtimeinfo` at most once per
+    `acquireClientRetryDelay` while it is unavailable, *not* on every
+    queueing iteration. Once a probe succeeds, the working client is cached
+    and reused; subsequent query failures during evaluation do not by
+    themselves invalidate the cache (those are already handled per-query
+    today).
+  - On a successful probe transitioning from unavailable → available, log
+    an informational message.
+
+Implementation freedom: this can be a `promClientProvider` interface, a
+field on `cleanupOrchestrator` mutated under a mutex, or any equivalent
+design. The only external contract is "give me the current API or tell me
+it's unavailable, without doing a network call on every evaluation".
+
+### Per-version evaluation
+
+`evaluateVersionForCleanup` and `evaluateWorkloadForCleanup` must accept the
+case where the Prometheus API is currently unavailable:
+
+- For each deployment workload of the version:
+  - Workloads whose `deploymentDefinition.monitoring` is nil, or whose
+    `deploymentDefinition.monitoring.deletionRules` is nil, remain
+    auto-eligible. No Prometheus access is required or attempted.
+  - Workloads with non-nil `deletionRules`:
+    - If the Prometheus API is available: evaluate as today (expression or
+      metric rules) and use the result.
+    - If the Prometheus API is unavailable: the workload is *not* eligible
+      for cleanup. Log a single informational/warning entry per evaluation
+      attempt naming the version, the workload, and the reason
+      ("prometheus client unavailable"). Do not return an error from
+      `evaluateVersionForCleanup`; the version simply stays in the system
+      and will be re-considered on the next cycle.
+- The "all deployment workloads must be eligible for the version to be
+  cleaned up" rule is unchanged.
+- The annotation-controlled deletion vs. dry-run decision in
+  `evaluateVersionForCleanup` (`AnnotationEnableCleanupMonitoring == "true"`
+  ⇒ delete; `"dry-run"` ⇒ event only) is unchanged.
+
+In particular, this means a `CAPApplicationVersion` whose deployment
+workloads collectively declare *no* deletion rules can be cleaned up even
+when no Prometheus is configured — that is the central use case being
+unlocked.
+
+### Queue behaviour
+
+`queueVersionsForCleanupEvaluation` continues to:
+
+- Only consider `CAPApplication`s whose
+  `sme.sap.com/enable-cleanup-monitoring` annotation is `"true"` or
+  `"dry-run"` (case-insensitive).
+- Exclude versions tied to tenants (`spec.version` or
+  `status.currentCAPApplicationVersionInstance`).
+- Exclude versions whose `spec.version` is greater than the latest `Ready`
+  version (semver compare).
+- Skip everything when no `Ready` version exists.
+
+These rules must still apply regardless of Prometheus availability.
+
+### Logging / observability
+
+- A clear `klog.InfoS` (or `klog.WarningS`) line when the cleanup loop is
+  starting without a configured Prometheus address, e.g.
+  "PROMETHEUS_ADDRESS is not set; only versions without deletion rules will
+  be cleaned up".
+- A clear log line when a workload is skipped because the Prom client is
+  unavailable (one per workload per evaluation cycle is fine; do not flood
+  the log every queue iteration).
+- Existing per-query error logs in `executePromQL`, `evaluateExpression`,
+  and `evaluateMetric` are kept.
+
+### Documentation updates
+
+`website/content/en/docs/usage/version-monitoring.md`:
+
+- The "Integration with Prometheus" paragraph must be rewritten to say:
+  Prometheus is only required to evaluate workloads with `deletionRules`.
+  When `PROMETHEUS_ADDRESS` is not set (or the configured server is
+  unreachable), the cleanup loop still runs; versions whose deployment
+  workloads have no `deletionRules` can be cleaned up, and versions with
+  `deletionRules` are skipped until Prometheus becomes available again.
+- The "Evaluating CAPApplicationVersion Resources for Cleanup" section
+  must mention the same Prometheus-availability behaviour next to the
+  existing description of how workloads without `deletionRules` are
+  treated.
+
+`website/content/en/docs/configuration/_index.md`:
+
+- Update the `PROMETHEUS_ADDRESS` bullet to: "URL of the Prometheus server
+  for executing PromQL queries... If not set, the controller still runs
+  the version cleanup loop, but only versions whose workloads have no
+  `deletionRules` are eligible for cleanup."
+
+## Constraints
+
+- Go only; no new third-party dependencies. Reuse the existing
+  `github.com/prometheus/client_golang/api` package already in use.
+- Public API of the `internal/controller` package (types and functions
+  exported from the package) must not be broken. Keep
+  `cleanupOrchestrator`, `monitoringEnv`, `initializeVersionCleanupOrchestrator`,
+  `startVersionCleanup`, `queueVersionsForCleanupEvaluation`,
+  `processVersionCleanupQueue`, and `processVersionCleanupQueueItem`
+  callable with their current signatures from tests, even if their
+  internals change. The `cleanupOrchestrator.api` field may be replaced
+  by an alternative client-provider field if you also update the tests
+  that touch it.
+- The controller CRDs, generated client code under `pkg/client/...`, and
+  the API types in `pkg/apis/sme.sap.com/v1alpha1` must not change.
+- The `sme.sap.com/enable-cleanup-monitoring` annotation contract is
+  unchanged. Do not introduce a new default, do not introduce new
+  accepted values, do not invert any condition.
+- The Prometheus reachability probe must remain rate-limited. Do not call
+  `Runtimeinfo` (or any equivalent) on every iteration of the schedule
+  loop — at most once per `acquireClientRetryDelay`.
+- Existing tests must continue to pass after their assertions about the
+  "empty address ⇒ nil mEnv / no routines" behaviour are updated to
+  reflect the new contract. No tests may be deleted; they are migrated.
+
+## Non-goals
+
+- Defaulting the `sme.sap.com/enable-cleanup-monitoring` annotation to
+  `"true"` or introducing an explicit opt-out value (e.g. `"false"`,
+  `"disabled"`). This is tracked separately and is out of scope here.
+- Changing how individual PromQL queries are constructed or what
+  `Gauge`/`Counter`/`Expression` evaluation means.
+- Changing how `ServiceMonitor` resources are created from
+  `monitoring.scrapeConfig`.
+- Adding a Prometheus-availability indicator to any CR status field.
+- Changing the controller's leader election, queue, or reconcile
+  scheduler behaviour for any other resource.
+
+## Tests
+
+All tests live in `internal/controller/version-monitoring_test.go` (plus
+`testdata/version-monitoring/*.yaml`). The build/test commands the
+planner should attach to validation tasks are:
+
+- `go build ./...`
+- `go test ./internal/controller/...`
+- `go test ./...` for a final wide check.
+
+Update existing tests as follows; they must keep passing.
+
+- `TestMonitoringEnv`:
+  - The case where `PROMETHEUS_ADDRESS` is unset must now assert that
+    `parseMonitoringEnv()` returns a non-nil `*monitoringEnv` whose
+    `address` is empty and whose `acquireClientRetryDelay` /
+    `evaluationInterval` carry their defaults (`1h`, `10m`). The cases
+    that set the address remain as today.
+- `TestGracefulShutdownMonitoringRoutines`:
+  - Must additionally cover the case where `PROMETHEUS_ADDRESS` is *not*
+    set. `startVersionCleanup` must still launch the routines and they
+    must shut down cleanly when the context is cancelled.
+- `Test_initializeVersionCleanupOrchestrator`:
+  - The "server unavailable" case must no longer rely on the orchestrator
+    being `nil`. The orchestrator must be constructable; the assertion
+    becomes "the embedded Prometheus client provider reports unavailable"
+    rather than "initialiser returned nil". If the test currently waits
+    for a context deadline as a proxy for "kept retrying", replace that
+    with a direct assertion against the provider.
+  - Add an explicit case: empty address ⇒ orchestrator constructs;
+    provider is permanently unavailable; no HTTP calls are made (the
+    test's mock server should record zero `runtimeinfo` hits).
+
+Add new test coverage:
+
+- A version-cleanup evaluation test where Prometheus is *not* configured
+  (empty address) and a CAV's deployment workloads all have no
+  `deletionRules`: the evaluator must mark the version eligible and the
+  CAV must be deleted (when the CAPApplication annotation is `"true"`).
+  Reuse the existing `setupTestControllerWithInitialResources` helper
+  and add a `cav-no-deletion-rules.yaml` (or equivalent) fixture under
+  `testdata/version-monitoring/`.
+- A version-cleanup evaluation test where Prometheus is *not* configured
+  but the CAV has at least one workload with `deletionRules`: the
+  evaluator must *not* delete the CAV, must not panic, and must not
+  return an error that causes the queue to spin (no requeues for the
+  same item beyond the normal flow). The existing
+  `cav-v1-deletion-rules.yaml` fixture is suitable.
+- A version-cleanup evaluation test where the address *is* configured
+  but the mock Prom server returns 503 from `runtimeinfo`: a CAV with
+  `deletionRules` is not deleted; a CAV with no `deletionRules` (in the
+  same evaluation cycle, separate fixture) *is* deleted.
+- A test that the reachability probe is rate-limited: drive two
+  consecutive evaluation cycles closer together than
+  `acquireClientRetryDelay` and assert the mock Prom server records at
+  most one `runtimeinfo` request across both. Use a short
+  `acquireClientRetryDelay` plus a shorter test-window between cycles
+  to make this deterministic without `time.Sleep` blocking the test.
+
+`TestVersionSelectionForCleanup` and `TestVersionCleanupEvaluation`
+existing scenarios should continue to pass unchanged once the production
+code is updated. If the orchestrator's `api` field is renamed or
+replaced, update the tests' construction of `cleanupOrchestrator` to
+match — keep the same scenarios.
+
+## Acceptance
+
+The acceptor will run the merged code in the `go-alpine` image and verify
+each of the following criteria. Any single failing criterion produces a
+rejection naming that criterion, so the re-planner can target the gap.
+
+1. **Build is green.** `go build ./...` exits 0 from the repo root.
+2. **Full test suite is green.** `go test ./...` exits 0 from the repo
+   root.
+3. **Controller tests are green and include the new coverage.**
+   `go test ./internal/controller/...` exits 0, and the test binary
+   reports tests covering at least:
+   - cleanup loop startup with `PROMETHEUS_ADDRESS` unset (graceful
+     shutdown);
+   - `parseMonitoringEnv()` returning a non-nil env when the address is
+     unset;
+   - eligibility-based deletion of a CAV with no `deletionRules` when no
+     Prometheus client is available;
+   - skipping (no deletion, no panic, no error-driven requeue storm) of a
+     CAV with `deletionRules` when no Prometheus client is available;
+   - rate-limited reachability probing (at most one `runtimeinfo` request
+     per `acquireClientRetryDelay` window across consecutive cycles).
+4. **Early-return is gone.** `internal/controller/version-monitoring.go`
+   no longer contains an `if mEnv == nil { return }` (or equivalent
+   early `return`) inside `startVersionCleanup`. The function always
+   reaches the goroutine-launching code path for any non-cancelled
+   context.
+5. **Annotation contract is unchanged.** A `grep` for
+   `AnnotationEnableCleanupMonitoring` over `internal/controller/`
+   shows the same accepted values (`"true"`, `"dry-run"`,
+   case-insensitive) — no new accepted values, no inverted defaults,
+   no removal of the existing filter in
+   `queueVersionsForCleanupEvaluation`.
+6. **API types and CRDs untouched.** `git diff` against the merge base
+   shows no changes under `pkg/apis/sme.sap.com/v1alpha1/`,
+   `pkg/client/`, or `crds/`.
+7. **Documentation reflects the new behaviour.**
+   - `website/content/en/docs/configuration/_index.md` no longer claims
+     that version monitoring is disabled when `PROMETHEUS_ADDRESS` is
+     unset; instead, it states that the cleanup loop still runs and
+     describes which versions are eligible without Prometheus.
+   - `website/content/en/docs/usage/version-monitoring.md` ("Integration
+     with Prometheus" and "Evaluating CAPApplicationVersion Resources
+     for Cleanup" sections) describes the new contract: cleanup runs
+     unconditionally, workloads without `deletionRules` are eligible
+     without Prometheus, workloads with `deletionRules` are skipped
+     until Prometheus is reachable.
diff --git a/internal/controller/testdata/version-monitoring/cav-v1-no-deletion-rules.yaml b/internal/controller/testdata/version-monitoring/cav-v1-no-deletion-rules.yaml
new file mode 100644
index 00000000..160f92e8
--- /dev/null
+++ b/internal/controller/testdata/version-monitoring/cav-v1-no-deletion-rules.yaml
@@ -0,0 +1,63 @@
+apiVersion: sme.sap.com/v1alpha1
+kind: CAPApplicationVersion
+metadata:
+  creationTimestamp: "2022-03-18T22:14:33Z"
+  generation: 1
+  annotations:
+    sme.sap.com/btp-app-identifier: btp-glo-acc-id.test-cap-01
+    sme.sap.com/owner-identifier: default.test-cap-01
+  labels:
+    sme.sap.com/btp-app-identifier-hash: f20cc8aeb2003b3abc33f749a16bd53544b6bab2
+    sme.sap.com/owner-generation: "2"
+    sme.sap.com/owner-identifier-hash: 1f74ae2fbff71a708786a4df4bb2ca87ec603581
+  name: test-cap-01-cav-v1
+  namespace: default
+  ownerReferences:
+    - apiVersion: sme.sap.com/v1alpha1
+      blockOwnerDeletion: true
+      controller: true
+      kind: CAPApplication
+      name: test-cap-01
+      uid: 3c7ba7cb-dc04-4fd1-be86-3eb3a5c64a98
+  resourceVersion: "11371108"
+  uid: 5e64489b-7346-4984-8617-e8c37338b3d8
+  finalizers:
+  - sme.sap.com/capapplicationversion
+spec:
+  capApplicationInstance: test-cap-01
+  registrySecrets:
+    - regcred
+  version: 5.6.7
+  workloads:
+    - name: cap-backend-srv
+      consumedBTPServices:
+        - cap-uaa
+        - cap-service-manager
+        - cap-saas-registry
+      deploymentDefinition:
+        type: CAP
+        image: docker.image.repo/srv/server:latest
+    - name: content
+      consumedBTPServices:
+        - cap-uaa
+      jobDefinition:
+        type: Content
+        image: docker.image.repo/content/cap-content:latest
+    - name: mtx
+      consumedBTPServices:
+        - cap-uaa
+        - cap-service-manager
+        - cap-saas-registry
+      jobDefinition:
+        type: "TenantOperation"
+        image: docker.image.repo/srv/server:latest
+status:
+  conditions:
+    - reason: WorkloadsReady
+      observedGeneration: 1
+      status: "True"
+      type: Ready
+  observedGeneration: 1
+  finishedJobs:
+    - test-cap-01-cav-v1-content
+  state: Ready
diff --git a/internal/controller/version-monitoring.go b/internal/controller/version-monitoring.go
index 4aa8ec69..82276516 100644
--- a/internal/controller/version-monitoring.go
+++ b/internal/controller/version-monitoring.go
@@ -10,6 +10,7 @@ import (
 	"fmt"
 	"os"
 	"strings"
+	"sync"
 	"time"
 
 	promapi "github.com/prometheus/client_golang/api"
@@ -40,10 +41,98 @@ const (
 	CounterEvaluationExpression = "sum(rate(%s{job=\"%s\",namespace=\"%s\"}[%s]))"
 )
 
+// promClientProvider abstracts Prometheus API availability so that each
+// evaluation cycle can obtain the current API client without making a network
+// call on every iteration.
+type promClientProvider interface {
+	// Get returns the current Prometheus API and true when available, or
+	// (nil, false) when the address is unset or the server is unreachable.
+	Get(ctx context.Context) (promv1.API, bool)
+}
+
+// noopPromClientProvider is used when PROMETHEUS_ADDRESS is not set.
+// It always reports the client as unavailable.
+type noopPromClientProvider struct{}
+
+func (n *noopPromClientProvider) Get(_ context.Context) (promv1.API, bool) {
+	return nil, false
+}
+
+// cachedPromClientProvider holds a cached Prometheus API client and manages
+// reachability probing with a configurable delay between probes.
+type cachedPromClientProvider struct {
+	address    string
+	retryDelay time.Duration
+
+	mu          sync.Mutex
+	cached      promv1.API // non-nil when the last probe succeeded
+	lastProbeAt time.Time  // when the last probe (success or failure) was attempted
+	loggedError bool       // true after we have logged the first construction/probe failure
+}
+
+// Get returns the cached Prometheus API when available. When unavailable it
+// probes at most once per retryDelay window.
+func (p *cachedPromClientProvider) Get(ctx context.Context) (promv1.API, bool) {
+	p.mu.Lock()
+	defer p.mu.Unlock()
+
+	// Fast path: already have a working client.
+	if p.cached != nil {
+		return p.cached, true
+	}
+
+	// Only probe if enough time has passed since the last attempt.
+	if time.Since(p.lastProbeAt) < p.retryDelay {
+		return nil, false
+	}
+
+	// Attempt to construct a client and verify reachability.
+	p.lastProbeAt = time.Now()
+
+	client, err := promapi.NewClient(promapi.Config{Address: p.address})
+	if err != nil {
+		if !p.loggedError {
+			klog.ErrorS(err, "could not create prometheus client", "address", p.address)
+			p.loggedError = true
+		}
+		return nil, false
+	}
+
+	api := promv1.NewAPI(client)
+	_, err = api.Runtimeinfo(ctx)
+	if err != nil {
+		if !p.loggedError {
+			klog.ErrorS(err, "could not fetch runtime info from prometheus server", "address", p.address)
+			p.loggedError = true
+		}
+		return nil, false
+	}
+
+	// Probe succeeded — cache the client and reset the error flag.
+	klog.InfoS("prometheus server is now reachable", "address", p.address)
+	p.cached = api
+	p.loggedError = false
+	return p.cached, true
+}
+
+// newPromClientProvider creates the appropriate promClientProvider based on
+// whether a Prometheus address is configured.
+func newPromClientProvider(mEnv *monitoringEnv) promClientProvider {
+	if mEnv.address == "" {
+		return &noopPromClientProvider{}
+	}
+	return &cachedPromClientProvider{
+		address:    mEnv.address,
+		retryDelay: mEnv.acquireClientRetryDelay,
+		// lastProbeAt is zero so the first call triggers an immediate probe.
+	}
+}
+
+// cleanupOrchestrator holds the state needed by the version-cleanup routines.
 type cleanupOrchestrator struct {
-	api   promv1.API
-	queue workqueue.TypedRateLimitingInterface[NamespacedResourceKey]
-	mEnv  *monitoringEnv
+	clientProvider promClientProvider
+	queue          workqueue.TypedRateLimitingInterface[NamespacedResourceKey]
+	mEnv           *monitoringEnv
 }
 
 type monitoringEnv struct {
@@ -52,12 +141,14 @@ type monitoringEnv struct {
 	evaluationInterval      time.Duration
 }
 
+// parseMonitoringEnv always returns a non-nil *monitoringEnv.
+// When PROMETHEUS_ADDRESS is unset or whitespace-only, address is set to the
+// empty string; the duration fields are still populated from their respective
+// env vars (with the same defaults as before).
 func parseMonitoringEnv() *monitoringEnv {
-	promAdd := strings.TrimSpace(os.Getenv(EnvPrometheusAddress))
-	if promAdd == "" {
-		return nil
+	env := &monitoringEnv{
+		address: strings.TrimSpace(os.Getenv(EnvPrometheusAddress)),
 	}
-	env := &monitoringEnv{address: promAdd}
 
 	evalDurationEnv := func(envName string, fallback time.Duration) time.Duration {
 		if v, ok := os.LookupEnv(envName); ok && strings.TrimSpace(v) != "" {
@@ -73,39 +164,35 @@ func parseMonitoringEnv() *monitoringEnv {
 	return env
 }
 
+// startVersionCleanup starts the version-cleanup schedule and worker goroutines
+// for the lifetime of ctx. It no longer short-circuits when there is no
+// Prometheus address configured — workloads without deletionRules are still
+// eligible for cleanup.
 func (c *Controller) startVersionCleanup(ctx context.Context) {
 	mEnv := parseMonitoringEnv()
-	if mEnv == nil {
-		return // no prometheus address
+
+	if mEnv.address == "" {
+		klog.InfoS("PROMETHEUS_ADDRESS is not set; only versions without deletion rules will be cleaned up")
 	}
 
 	restartSignal := make(chan bool, 1)
 	setup := func() context.CancelFunc {
-		for {
-			o := initializeVersionCleanupOrchestrator(ctx, mEnv)
-			if o == nil {
-				select {
-				case <-ctx.Done():
-					return nil
-				case <-time.After(mEnv.acquireClientRetryDelay): // sleep a long time before attempting to setup the cleanup process
-					continue
-				}
-			}
-			child, cancelFn := context.WithCancel(ctx)
-			go func() {
-				<-child.Done()
-				o.queue.ShutDown()
-			}()
-			go c.scheduleVersionCollectionForCleanup(child, o, restartSignal)
-			go c.processVersionCleanupQueue(child, o, restartSignal)
-			return cancelFn
-		}
+		o := initializeVersionCleanupOrchestrator(ctx, mEnv)
+		child, cancelFn := context.WithCancel(ctx)
+		go func() {
+			<-child.Done()
+			o.queue.ShutDown()
+		}()
+		go c.scheduleVersionCollectionForCleanup(child, o, restartSignal)
+		go c.processVersionCleanupQueue(child, o, restartSignal)
+		return cancelFn
 	}
 
 	for {
 		cancel := setup()
 		select {
 		case <-ctx.Done():
+			cancel()
 			return
 		case <-restartSignal: // restart broken routines
 			cancel()
@@ -124,24 +211,15 @@ func recoverVersionCleanupRoutine(restart chan<- bool) {
 	}
 }
 
+// initializeVersionCleanupOrchestrator always returns a non-nil
+// *cleanupOrchestrator. It sets up the work queue and the Prometheus client
+// provider but does NOT block on, or return nil because of, Prometheus
+// reachability.
 func initializeVersionCleanupOrchestrator(ctx context.Context, mEnv *monitoringEnv) *cleanupOrchestrator {
-	promClient, err := promapi.NewClient(promapi.Config{Address: mEnv.address})
-	if err != nil {
-		klog.ErrorS(err, "could not create client", "address", mEnv.address)
-		return nil
-	}
-	v1api := promv1.NewAPI(promClient)
-	_, err = v1api.Runtimeinfo(ctx)
-	if err != nil {
-		klog.ErrorS(err, "could not fetch runtime info from prometheus server", "address", mEnv.address)
-		return nil
-	}
-
-	// create orchestrator
 	return &cleanupOrchestrator{
-		api:   v1api,
-		queue: workqueue.NewTypedRateLimitingQueue(workqueue.DefaultTypedControllerRateLimiter[NamespacedResourceKey]()),
-		mEnv:  mEnv,
+		clientProvider: newPromClientProvider(mEnv),
+		queue:          workqueue.NewTypedRateLimitingQueue(workqueue.DefaultTypedControllerRateLimiter[NamespacedResourceKey]()),
+		mEnv:           mEnv,
 	}
 }
 
@@ -246,6 +324,10 @@ func (c *Controller) processVersionCleanupQueue(ctx context.Context, orc *cleanu
 	}
 }
 
+// processVersionCleanupQueueItem dequeues one item and evaluates it for
+// cleanup. It obtains the current Prometheus API from the provider; if the
+// provider reports unavailable, nil is passed to the evaluator so that
+// workloads without deletionRules can still be cleaned up.
 func (c *Controller) processVersionCleanupQueueItem(ctx context.Context, orc *cleanupOrchestrator) (stop bool) {
 	item, shutdown := orc.queue.Get()
 	if shutdown {
@@ -253,7 +335,10 @@ func (c *Controller) processVersionCleanupQueueItem(ctx context.Context, orc *cl
 	}
 	defer orc.queue.Done(item)
 
-	if c.evaluateVersionForCleanup(ctx, item, orc.api) != nil {
+	// Obtain the current Prometheus API — may be nil if unavailable.
+	api, _ := orc.clientProvider.Get(ctx)
+
+	if c.evaluateVersionForCleanup(ctx, item, api) != nil {
 		orc.queue.AddRateLimited(item)
 	} else {
 		orc.queue.Forget(item)
@@ -261,6 +346,11 @@ func (c *Controller) processVersionCleanupQueueItem(ctx context.Context, orc *cl
 	return false
 }
 
+// evaluateVersionForCleanup evaluates one CAPApplicationVersion for deletion
+// eligibility. promapi may be nil when Prometheus is unavailable; in that
+// case workloads that require deletionRules are considered not eligible (kept
+// for re-evaluation next cycle) while workloads without deletionRules remain
+// automatically eligible.
 func (c *Controller) evaluateVersionForCleanup(ctx context.Context, item NamespacedResourceKey, promapi promv1.API) error {
 	lister := c.crdInformerFactory.Sme().V1alpha1().CAPApplicationVersions().Lister()
 	cav, err := lister.CAPApplicationVersions(item.Namespace).Get(item.Name)
@@ -295,11 +385,28 @@ func (c *Controller) evaluateVersionForCleanup(ctx context.Context, item Namespa
 	return nil
 }
 
+// evaluateWorkloadForCleanup decides whether a single workload is eligible for
+// cleanup.
+//
+//   - When the workload has no deletionRules, it is automatically eligible
+//     (returns true) regardless of Prometheus availability.
+//   - When deletionRules are present but promapi is nil (Prometheus
+//     unavailable), the workload is NOT eligible (returns false) and a
+//     single informational log entry is emitted.
+//   - When deletionRules are present and promapi is available, the rules are
+//     evaluated via the existing helpers.
 func evaluateWorkloadForCleanup(ctx context.Context, cav NamespacedResourceKey, wl *v1alpha1.WorkloadDetails, promapi promv1.API) bool {
 	if wl.DeploymentDefinition == nil || wl.DeploymentDefinition.Monitoring == nil || wl.DeploymentDefinition.Monitoring.DeletionRules == nil {
 		return true // if there are no rules - the workload is automatically eligible for cleanup
 	}
 
+	// Deletion rules are present; we need Prometheus to evaluate them.
+	if promapi == nil {
+		klog.InfoS("prometheus client unavailable; workload with deletion rules cannot be evaluated and will be skipped",
+			"workload", wl.Name, "version", cav.Name)
+		return false
+	}
+
 	if wl.DeploymentDefinition.Monitoring.DeletionRules.ScalarExpression != nil { // evaluate provided expression
 		isRelevantForCleanup, err := evaluateExpression(ctx, *wl.DeploymentDefinition.Monitoring.DeletionRules.ScalarExpression, promapi)
 		if err != nil {
diff --git a/internal/controller/version-monitoring_test.go b/internal/controller/version-monitoring_test.go
index efc7e36b..b891b80c 100644
--- a/internal/controller/version-monitoring_test.go
+++ b/internal/controller/version-monitoring_test.go
@@ -14,6 +14,7 @@ import (
 	"net/http/httptest"
 	"os"
 	"sync"
+	"sync/atomic"
 	"testing"
 	"time"
 
@@ -56,8 +57,19 @@ func TestMonitoringEnv(t *testing.T) {
 			mEnv := parseMonitoringEnv()
 
 			if tt.add == nil {
-				if mEnv != nil {
-					t.Errorf("did not expect monitoring environment")
+				// New contract: always returns non-nil *monitoringEnv with empty address.
+				if mEnv == nil {
+					t.Errorf("expected non-nil monitoringEnv even when PROMETHEUS_ADDRESS is unset")
+					return
+				}
+				if mEnv.address != "" {
+					t.Errorf("expected empty address when PROMETHEUS_ADDRESS is unset, got %q", mEnv.address)
+				}
+				if mEnv.acquireClientRetryDelay != time.Hour {
+					t.Errorf("expected default acquire client retry interval (1h), got %v", mEnv.acquireClientRetryDelay)
+				}
+				if mEnv.evaluationInterval != 10*time.Minute {
+					t.Errorf("expected default evaluation interval (10m), got %v", mEnv.evaluationInterval)
 				}
 				return
 			}
@@ -114,27 +126,44 @@ func setupTestControllerWithInitialResources(t *testing.T, initialResources []st
 }
 
 func TestGracefulShutdownMonitoringRoutines(t *testing.T) {
-	// Deregister metrics at the end of the test
-	defer deregisterMetrics()
-
-	c := setupTestControllerWithInitialResources(t, []string{})
+	t.Run("with PROMETHEUS_ADDRESS set", func(t *testing.T) {
+		defer deregisterMetrics()
 
-	s, _ := getPromServer(false, []queryTestCase{})
-	defer s.Close()
+		c := setupTestControllerWithInitialResources(t, []string{})
 
-	os.Setenv(EnvPrometheusAddress, s.URL)
-	defer os.Unsetenv(EnvPrometheusAddress)
+		s, _ := getPromServer(false, []queryTestCase{})
+		defer s.Close()
 
-	testCtx, cancel := context.WithTimeout(context.Background(), time.Second)
-	defer cancel()
+		os.Setenv(EnvPrometheusAddress, s.URL)
+		defer os.Unsetenv(EnvPrometheusAddress)
 
-	var wg sync.WaitGroup
+		testCtx, cancel := context.WithTimeout(context.Background(), time.Second)
+		defer cancel()
 
-	wg.Go(func() {
-		c.startVersionCleanup(testCtx)
+		var wg sync.WaitGroup
+		wg.Go(func() {
+			c.startVersionCleanup(testCtx)
+		})
+		wg.Wait() // goroutines must exit when context is cancelled — or the test times out
 	})
 
-	wg.Wait() // check whether routines are closing - or test timeout
+	t.Run("without PROMETHEUS_ADDRESS set", func(t *testing.T) {
+		defer deregisterMetrics()
+
+		// Ensure the env var is not set.
+		os.Unsetenv(EnvPrometheusAddress)
+
+		c := setupTestControllerWithInitialResources(t, []string{})
+
+		testCtx, cancel := context.WithTimeout(context.Background(), time.Second)
+		defer cancel()
+
+		var wg sync.WaitGroup
+		wg.Go(func() {
+			c.startVersionCleanup(testCtx)
+		})
+		wg.Wait() // goroutines must exit when context is cancelled — or the test times out
+	})
 }
 
 func TestVersionSelectionForCleanup(t *testing.T) {
@@ -368,6 +397,12 @@ func mockPromQueryHandler(testCases []queryTestCase, query string, w http.Respon
 	)
 }
 
+// getPromServer starts a mock Prometheus HTTP server.
+// When unavailable is true the /api/v1/status/runtimeinfo endpoint returns 503.
+// The second return value is a function that returns the set of query strings
+// that were received by the /api/v1/query endpoint; the third return value is a
+// function that returns the total number of /api/v1/status/runtimeinfo requests
+// that were received (useful for rate-limiting assertions).
 func getPromServer(unavailable bool, cases []queryTestCase) (*httptest.Server, func() map[string]bool) {
 	calledQueries := map[string]bool{}
 	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
@@ -390,46 +425,108 @@ func getPromServer(unavailable bool, cases []queryTestCase) (*httptest.Server, f
 	}
 }
 
-func Test_initializeVersionCleanupOrchestrator(t *testing.T) {
-	tests := []struct {
-		name              string
-		serverUnavailable bool
-	}{
-		{
-			name:              "initialize cleanup orchestrator and verify connection",
-			serverUnavailable: false,
-		},
-		{
-			name:              "ensure retry of cleanup orchestrator initialization",
-			serverUnavailable: true,
-		},
-	}
-	for _, tt := range tests {
-		t.Run(tt.name, func(t *testing.T) {
-			s, _ := getPromServer(tt.serverUnavailable, []queryTestCase{})
-			defer s.Close()
-			var o *cleanupOrchestrator
-			testCtx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
-			go func() {
-				o = initializeVersionCleanupOrchestrator(testCtx, &monitoringEnv{address: s.URL, evaluationInterval: 2 * time.Minute, acquireClientRetryDelay: 30 * time.Second})
-				if o != nil {
-					cancel()
-				}
-			}()
-			<-testCtx.Done()
-			if tt.serverUnavailable {
-				if testCtx.Err() == nil || testCtx.Err() != context.DeadlineExceeded {
-					t.Error("expected to exceed test context deadline")
-				}
-			} else {
-				if o == nil {
-					t.Errorf("could not initialize prometheus client")
-				}
-				defer o.queue.ShutDown()
+// getPromServerWithCounter is like getPromServer but also returns an atomic
+// counter tracking how many times /api/v1/status/runtimeinfo was called.
+func getPromServerWithCounter(unavailable bool, cases []queryTestCase) (*httptest.Server, func() map[string]bool, *atomic.Int64) {
+	calledQueries := map[string]bool{}
+	var runtimeInfoHits atomic.Int64
+	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		if r.URL.Path == "/api/v1/status/runtimeinfo" {
+			runtimeInfoHits.Add(1)
+			mockPromRuntimeInfoHandler(unavailable, w)
+			return
+		}
+		if r.URL.Path == "/api/v1/query" {
+			q := r.FormValue("query")
+			if q != "" {
+				calledQueries[q] = false
 			}
+			mockPromQueryHandler(cases, q, w)
+			return
+		}
+		w.WriteHeader(http.StatusInternalServerError)
+	}))
+	return server, func() map[string]bool { return calledQueries }, &runtimeInfoHits
+}
 
+func Test_initializeVersionCleanupOrchestrator(t *testing.T) {
+	t.Run("initialize cleanup orchestrator - server available", func(t *testing.T) {
+		s, _, hits := getPromServerWithCounter(false, []queryTestCase{})
+		defer s.Close()
+
+		o := initializeVersionCleanupOrchestrator(context.TODO(), &monitoringEnv{
+			address:                 s.URL,
+			evaluationInterval:      2 * time.Minute,
+			acquireClientRetryDelay: 30 * time.Second,
 		})
-	}
+
+		if o == nil {
+			t.Fatal("expected non-nil orchestrator")
+		}
+		defer o.queue.ShutDown()
+
+		// Provider should report the API as available on first probe.
+		api, ok := o.clientProvider.Get(context.TODO())
+		if !ok || api == nil {
+			t.Errorf("expected provider to report prometheus API as available")
+		}
+		if hits.Load() == 0 {
+			t.Errorf("expected at least one runtimeinfo probe to have been made")
+		}
+	})
+
+	t.Run("initialize cleanup orchestrator - server unavailable", func(t *testing.T) {
+		s, _, hits := getPromServerWithCounter(true, []queryTestCase{})
+		defer s.Close()
+
+		o := initializeVersionCleanupOrchestrator(context.TODO(), &monitoringEnv{
+			address:                 s.URL,
+			evaluationInterval:      2 * time.Minute,
+			acquireClientRetryDelay: 30 * time.Second,
+		})
+
+		if o == nil {
+			t.Fatal("expected non-nil orchestrator even when server is unavailable")
+		}
+		defer o.queue.ShutDown()
+
+		// Provider should immediately report unavailable (server returns 503).
+		api, ok := o.clientProvider.Get(context.TODO())
+		if ok || api != nil {
+			t.Errorf("expected provider to report prometheus API as unavailable")
+		}
+		if hits.Load() == 0 {
+			t.Errorf("expected at least one runtimeinfo probe to have been attempted")
+		}
+	})
+
+	t.Run("initialize cleanup orchestrator - empty address", func(t *testing.T) {
+		// Start a server but do NOT pass its URL to the orchestrator.
+		s, _, hits := getPromServerWithCounter(false, []queryTestCase{})
+		defer s.Close()
+
+		o := initializeVersionCleanupOrchestrator(context.TODO(), &monitoringEnv{
+			address:                 "",
+			evaluationInterval:      2 * time.Minute,
+			acquireClientRetryDelay: 30 * time.Second,
+		})
+
+		if o == nil {
+			t.Fatal("expected non-nil orchestrator when address is empty")
+		}
+		defer o.queue.ShutDown()
+
+		// Provider must always report unavailable when address is empty.
+		api, ok := o.clientProvider.Get(context.TODO())
+		if ok || api != nil {
+			t.Errorf("expected provider to report prometheus API as unavailable with empty address")
+		}
+
+		// The mock server must have received zero runtimeinfo hits.
+		if hits.Load() != 0 {
+			t.Errorf("expected zero runtimeinfo hits when address is empty, got %d", hits.Load())
+		}
+	})
 }
 
 func TestVersionCleanupEvaluation(t *testing.T) {
@@ -611,3 +708,184 @@ func TestVersionCleanupEvaluation(t *testing.T) {
 		})
 	}
 }
+
+// TestVersionCleanupEvaluation_NoPrometheus_NoDeletionRules verifies that a
+// CAV whose workloads have NO deletionRules is deleted even when Prometheus is
+// unavailable (empty address → noopPromClientProvider).
+func TestVersionCleanupEvaluation_NoPrometheus_NoDeletionRules(t *testing.T) {
+	defer deregisterMetrics()
+
+	// v2 is the latest Ready version; v1 (no deletion rules) is outdated.
+	c := setupTestControllerWithInitialResources(t, []string{
+		"testdata/version-monitoring/ca-cleanup-enabled.yaml",
+		"testdata/version-monitoring/cav-v1-no-deletion-rules.yaml",
+		"testdata/version-monitoring/cav-v2-deletion-rules.yaml",
+		"testdata/version-monitoring/cat-provider-v2-ready.yaml",
+	})
+
+	o := initializeVersionCleanupOrchestrator(context.TODO(), &monitoringEnv{
+		address:                 "", // no Prometheus → noopPromClientProvider
+		evaluationInterval:      10 * time.Minute,
+		acquireClientRetryDelay: time.Hour,
+	})
+	defer o.queue.ShutDown()
+
+	item := NamespacedResourceKey{Namespace: "default", Name: "test-cap-01-cav-v1"}
+	o.queue.Add(item)
+	_ = c.processVersionCleanupQueueItem(context.TODO(), o)
+
+	// No requeues expected.
+	if o.queue.NumRequeues(item) != 0 {
+		t.Errorf("expected zero requeues for version without deletion rules, got %d", o.queue.NumRequeues(item))
+	}
+
+	// v1 must have been deleted.
+	_, err := c.crdClient.SmeV1alpha1().CAPApplicationVersions("default").Get(context.TODO(), "test-cap-01-cav-v1", v1.GetOptions{})
+	if err == nil || !errors.IsNotFound(err) {
+		t.Errorf("expected version test-cap-01-cav-v1 to be deleted when it has no deletion rules and Prometheus is unavailable")
+	}
+}
+
+// TestVersionCleanupEvaluation_NoPrometheus_WithDeletionRules verifies that a
+// CAV whose workloads DO have deletionRules is NOT deleted when Prometheus is
+// unavailable, and that no requeue storm occurs.
+func TestVersionCleanupEvaluation_NoPrometheus_WithDeletionRules(t *testing.T) {
+	defer deregisterMetrics()
+
+	c := setupTestControllerWithInitialResources(t, []string{
+		"testdata/version-monitoring/ca-cleanup-enabled.yaml",
+		"testdata/version-monitoring/cav-v1-deletion-rules.yaml",
+		"testdata/version-monitoring/cav-v2-deletion-rules.yaml",
+		"testdata/version-monitoring/cat-provider-v2-ready.yaml",
+	})
+
+	o := initializeVersionCleanupOrchestrator(context.TODO(), &monitoringEnv{
+		address:                 "",
+		evaluationInterval:      10 * time.Minute,
+		acquireClientRetryDelay: time.Hour,
+	})
+	defer o.queue.ShutDown()
+
+	item := NamespacedResourceKey{Namespace: "default", Name: "test-cap-01-cav-v1"}
+	o.queue.Add(item)
+	_ = c.processVersionCleanupQueueItem(context.TODO(), o)
+
+	// No requeues expected (skipped, not an error).
+	if o.queue.NumRequeues(item) != 0 {
+		t.Errorf("expected zero requeues when CAV has deletion rules but Prometheus is unavailable, got %d", o.queue.NumRequeues(item))
+	}
+
+	// v1 must NOT have been deleted.
+	_, err := c.crdClient.SmeV1alpha1().CAPApplicationVersions("default").Get(context.TODO(), "test-cap-01-cav-v1", v1.GetOptions{})
+	if err != nil {
+		t.Errorf("expected version test-cap-01-cav-v1 to still exist when deletion rules cannot be evaluated: %v", err)
+	}
+}
+
+// TestVersionCleanupEvaluation_PromUnreachable tests two evaluation cycles
+// against an unreachable Prometheus server:
+//   - a CAV WITH deletionRules must NOT be deleted
+//   - a CAV WITHOUT deletionRules MUST be deleted
+func TestVersionCleanupEvaluation_PromUnreachable(t *testing.T) {
+	s, _ := getPromServer(true, []queryTestCase{}) // server always returns 503
+	defer s.Close()
+
+	t.Run("with deletion rules - not deleted", func(t *testing.T) {
+		defer deregisterMetrics()
+
+		c := setupTestControllerWithInitialResources(t, []string{
+			"testdata/version-monitoring/ca-cleanup-enabled.yaml",
+			"testdata/version-monitoring/cav-v1-deletion-rules.yaml",
+			"testdata/version-monitoring/cav-v2-deletion-rules.yaml",
+			"testdata/version-monitoring/cat-provider-v2-ready.yaml",
+		})
+
+		o := initializeVersionCleanupOrchestrator(context.TODO(), &monitoringEnv{
+			address:                 s.URL,
+			evaluationInterval:      10 * time.Minute,
+			acquireClientRetryDelay: time.Millisecond, // short so the first probe is attempted
+		})
+		defer o.queue.ShutDown()
+
+		item := NamespacedResourceKey{Namespace: "default", Name: "test-cap-01-cav-v1"}
+		o.queue.Add(item)
+		_ = c.processVersionCleanupQueueItem(context.TODO(), o)
+
+		if o.queue.NumRequeues(item) != 0 {
+			t.Errorf("expected zero requeues for CAV with deletion rules when Prometheus is unreachable, got %d", o.queue.NumRequeues(item))
+		}
+
+		_, err := c.crdClient.SmeV1alpha1().CAPApplicationVersions("default").Get(context.TODO(), "test-cap-01-cav-v1", v1.GetOptions{})
+		if err != nil {
+			t.Errorf("expected version test-cap-01-cav-v1 to still exist: %v", err)
+		}
+	})
+
+	t.Run("without deletion rules - deleted", func(t *testing.T) {
+		defer deregisterMetrics()
+
+		c := setupTestControllerWithInitialResources(t, []string{
+			"testdata/version-monitoring/ca-cleanup-enabled.yaml",
+			"testdata/version-monitoring/cav-v1-no-deletion-rules.yaml",
+			"testdata/version-monitoring/cav-v2-deletion-rules.yaml",
+			"testdata/version-monitoring/cat-provider-v2-ready.yaml",
+		})
+
+		o := initializeVersionCleanupOrchestrator(context.TODO(), &monitoringEnv{
+			address:                 s.URL,
+			evaluationInterval:      10 * time.Minute,
+			acquireClientRetryDelay: time.Millisecond,
+		})
+		defer o.queue.ShutDown()
+
+		item := NamespacedResourceKey{Namespace: "default", Name: "test-cap-01-cav-v1"}
+		o.queue.Add(item)
+		_ = c.processVersionCleanupQueueItem(context.TODO(), o)
+
+		if o.queue.NumRequeues(item) != 0 {
+			t.Errorf("expected zero requeues for CAV without deletion rules, got %d", o.queue.NumRequeues(item))
+		}
+
+		_, err := c.crdClient.SmeV1alpha1().CAPApplicationVersions("default").Get(context.TODO(), "test-cap-01-cav-v1", v1.GetOptions{})
+		if err == nil || !errors.IsNotFound(err) {
+			t.Errorf("expected version test-cap-01-cav-v1 to be deleted when it has no deletion rules")
+		}
+	})
+}
+
+// TestVersionCleanupEvaluation_RuntimeInfoRateLimited verifies that the
+// cachedPromClientProvider issues at most one runtimeinfo probe per
+// acquireClientRetryDelay window. Two back-to-back Get() calls should only
+// result in a single HTTP hit on the mock server.
+func TestVersionCleanupEvaluation_RuntimeInfoRateLimited(t *testing.T) {
+	s, _, hits := getPromServerWithCounter(true, []queryTestCase{}) // server always returns 503
+	defer s.Close()
+
+	// Use a generous retry delay so that the second immediate call is still
+	// within the rate-limit window.
+	o := initializeVersionCleanupOrchestrator(context.TODO(), &monitoringEnv{
+		address:                 s.URL,
+		evaluationInterval:      10 * time.Minute,
+		acquireClientRetryDelay: 5 * time.Second,
+	})
+	defer o.queue.ShutDown()
+
+	ctx := context.TODO()
+
+	// First call: triggers a probe (lastProbeAt is zero).
+	api1, ok1 := o.clientProvider.Get(ctx)
+	if ok1 || api1 != nil {
+		t.Errorf("expected first Get() to report unavailable (server returns 503)")
+	}
+
+	// Second call: must be rate-limited; no additional HTTP request should be made.
+	api2, ok2 := o.clientProvider.Get(ctx)
+	if ok2 || api2 != nil {
+		t.Errorf("expected second Get() to report unavailable")
+	}
+
+	// Only one runtimeinfo probe must have hit the server.
+	if hits.Load() > 1 {
+		t.Errorf("expected at most 1 runtimeinfo probe, got %d", hits.Load())
+	}
+}
diff --git a/website/content/en/docs/configuration/_index.md b/website/content/en/docs/configuration/_index.md
index 7eccf401..f0244266 100644
--- a/website/content/en/docs/configuration/_index.md
+++ b/website/content/en/docs/configuration/_index.md
@@ -18,7 +18,7 @@ The following environment variables are used to configure CAP Operator.
 - `DNS_MANAGER`: The external DNS manager to use. Possible values:
   - `gardener`: ["Gardener" external DNS manager](https://github.com/gardener/external-dns-management)
   - `kubernetes`: [External DNS management from Kubernetes](https://github.com/kubernetes-sigs/external-dns)
-- `PROMETHEUS_ADDRESS`: URL of the Prometheus server for executing PromQL queries, for example `http://prometheus-operated.monitoring.svc.cluster.local:9090`. If not set, the version monitoring function is not started.
+- `PROMETHEUS_ADDRESS`: URL of the Prometheus server for executing PromQL queries, for example `http://prometheus-operated.monitoring.svc.cluster.local:9090`. If not set, the controller still runs the version cleanup loop; however, only versions whose workloads have no `deletionRules` defined are eligible for cleanup — versions that have `deletionRules` are skipped until a Prometheus connection becomes available.
 - `PROM_ACQUIRE_CLIENT_RETRY_DELAY`: Time delay between retries when Prometheus client creation or connection check fails.
 - `METRICS_EVAL_INTERVAL`: Time interval between iterations where outdated versions are identified and queued for evaluation.
 - `MAX_CONCURRENT_RECONCILES_CAP_APPLICATION`: Maximum number of concurrent reconciles for `CAPApplication` (for example, `1`).
diff --git a/website/content/en/docs/usage/version-monitoring.md b/website/content/en/docs/usage/version-monitoring.md
index 9a1ecb04..b1f15673 100644
--- a/website/content/en/docs/usage/version-monitoring.md
+++ b/website/content/en/docs/usage/version-monitoring.md
@@ -11,7 +11,7 @@ In a continuous delivery environment where new application versions are deployed
 
 ## Integration with Prometheus
 
-[Prometheus](https://prometheus.io/) is the industry standard for monitoring application metrics and provides a wide variety of tools for managing and reporting metrics. The CAP Operator controller can be connected to a Prometheus server by setting the `PROMETHEUS_ADDRESS` environment variable (see [Configuration](../../configuration)). The controller then queries application-related metrics based on the workload specification of `CAPApplicationVersion` resources. If no Prometheus address is supplied, the version monitoring function is not started.
+[Prometheus](https://prometheus.io/) is the industry standard for monitoring application metrics and provides a wide variety of tools for managing and reporting metrics. The CAP Operator controller can be connected to a Prometheus server by setting the `PROMETHEUS_ADDRESS` environment variable (see [Configuration](../../configuration)). The controller then queries application-related metrics based on the workload specification of `CAPApplicationVersion` resources. Prometheus is only required to evaluate workloads that declare `deletionRules`. When `PROMETHEUS_ADDRESS` is not set, or the configured server is unreachable, the cleanup loop still runs. Versions whose deployment workloads declare no `deletionRules` can still be cleaned up in that case. Versions that do declare `deletionRules` are skipped — they are neither deleted nor surfaced as errors — and will be re-evaluated on the next cycle once Prometheus becomes available again.
 
 ## Configure `CAPApplication`
 
@@ -147,4 +147,6 @@ At specified intervals (controlled by the controller environment variable `METRI
   - `status.currentCAPApplicationVersionInstance` — the tenant's current version.
   - `spec.version` — the version a tenant is upgrading to.
 
-Workloads from the identified versions are then evaluated against their `deletionRules`. Workloads without `deletionRules` are automatically eligible for cleanup. All deployment workloads of a version must satisfy the evaluation criteria for the version to be deleted.
+Workloads from the identified versions are then evaluated against their `deletionRules`. Workloads without `deletionRules` are automatically eligible for cleanup regardless of whether Prometheus is configured or reachable. All deployment workloads of a version must satisfy the evaluation criteria for the version to be deleted.
+
+When a workload declares `deletionRules` and the Prometheus client is currently unavailable — either because no `PROMETHEUS_ADDRESS` has been configured, or because the `Runtimeinfo` probe against the configured server is failing — that workload is treated as not eligible and the version is left untouched until the next evaluation cycle. Reachability is re-checked at most once per `PROM_ACQUIRE_CLIENT_RETRY_DELAY` interval, so a recovered Prometheus server will be detected and used on the next cycle after that delay elapses.