OCPBUGS-81521: Adapt dashboard Prometheus polling interval based on query response time #16441
stefanonardo wants to merge 1 commit into
Conversation
@stefanonardo: This pull request references Jira Issue OCPBUGS-81521, which is invalid.
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/jira refresh
@stefanonardo: This pull request references Jira Issue OCPBUGS-81521, which is invalid.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/jira refresh
@stefanonardo: This pull request references Jira Issue OCPBUGS-81521, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@stefanonardo: This pull request references Jira Issue OCPBUGS-81521, which is valid. 3 validation(s) were run on this bug.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
📝 Walkthrough
This pull request refactors the Playwright E2E test infrastructure from global setup/teardown hooks into explicit modular projects (cluster-setup, admin-auth, developer-auth, teardown). It extracts shared login logic into reusable helpers, adds a developer perspective smoke test, and updates the Playwright configuration to reflect the new dependency graph. Additionally, dashboard data fetching is enhanced with adaptive polling that adjusts delays based on response times via an exponential moving average, replacing fixed-delay retry behavior.
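For readers unfamiliar with Playwright project dependencies, here is a minimal sketch of how a setup/auth/teardown graph like the one described above is typically wired in playwright.config.ts. The project names follow the walkthrough; the testMatch globs, the smoke project, and the teardown wiring are illustrative assumptions, not the PR's actual configuration.

```ts
// Hypothetical sketch of the modular project graph; globs and names other
// than the four projects listed in the walkthrough are assumptions.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  projects: [
    // Provisions cluster-level test resources once; owns the teardown project.
    { name: 'cluster-setup', testMatch: /cluster\.setup\.ts/, teardown: 'teardown' },
    // Auth projects run after cluster setup and persist login storage state.
    { name: 'admin-auth', testMatch: /admin-auth\.setup\.ts/, dependencies: ['cluster-setup'] },
    { name: 'developer-auth', testMatch: /developer-auth\.setup\.ts/, dependencies: ['cluster-setup'] },
    // Spec files depend on the auth projects' saved sessions.
    {
      name: 'smoke',
      testMatch: /smoke\/.*\.spec\.ts/,
      dependencies: ['admin-auth', 'developer-auth'],
    },
    // Deletes test resources after all dependent projects finish.
    { name: 'teardown', testMatch: /teardown\.setup\.ts/ },
  ],
});
```

Playwright runs a project's dependencies before it and runs a declared teardown project only after everything that depends on the corresponding setup has finished.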
🚥 Pre-merge checks | ✅ 11 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
frontend/playwright.config.ts (1)
131-141: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win
Align developer project creation with the same credential checks used by setup.
Developer projects are enabled from a username-only flag, while developer-auth.setup.ts skips unless both username and password exist. This mismatch can produce developer projects without valid auth state.
Suggested patch
```diff
-const hasDeveloper = !!process.env.BRIDGE_HTPASSWD_USERNAME;
+const hasDeveloper =
+  !!process.env.BRIDGE_HTPASSWD_USERNAME && !!process.env.BRIDGE_HTPASSWD_PASSWORD;
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/playwright.config.ts` around lines 131 - 141, The developer project generation uses hasDeveloper (username-only) which can create projects without valid auth; change the conditional that yields the developer entries to require the same credentials check used in developer-auth.setup.ts (i.e., ensure both developer username and developer password exist) or reuse the shared helper/flag from developer-auth.setup.ts instead of hasDeveloper; update the condition surrounding devPackages mapping (and any references to developerStorageState) so developer projects are only created when both username and password are present.
🧹 Nitpick comments (1)
frontend/public/actions/__tests__/dashboards.spec.ts (1)
165-174: ⚡ Quick win
Strengthen the error-backoff assertion to actually prevent ceiling jumps.
This test currently allows the final delay to equal MAX_POLL_DELAY, which conflicts with its stated intent. Seed with a fast success first, then force an error, and assert that the second delay increases but stays strictly below the max.
Proposed test tightening
```diff
-  it('backs off on fetch error without jumping to MAX_POLL_DELAY', async () => {
-    const fetchMock = jest.fn().mockRejectedValueOnce(new Error('network error'));
+  it('backs off on fetch error without jumping to MAX_POLL_DELAY', async () => {
+    const fetchMock = jest
+      .fn()
+      .mockResolvedValueOnce({ data: 'test' })
+      .mockRejectedValueOnce(new Error('network error'));
     setupWatchURL(fetchMock);
     await flushPromises();

+    const firstTimeout = setTimeoutSpy.mock.calls[setTimeoutSpy.mock.calls.length - 1];
+    const firstDelay = firstTimeout[1] as number;
+
+    const nextPoll = firstTimeout[0] as (...args: unknown[]) => unknown;
+    nextPoll();
+    await flushPromises();
+
     const lastSetTimeout = setTimeoutSpy.mock.calls[setTimeoutSpy.mock.calls.length - 1];
-    expect(lastSetTimeout[1]).toBeGreaterThan(MIN_POLL_DELAY);
-    expect(lastSetTimeout[1]).toBeLessThanOrEqual(MAX_POLL_DELAY);
+    expect(lastSetTimeout[1]).toBeGreaterThan(firstDelay);
+    expect(lastSetTimeout[1]).toBeLessThan(MAX_POLL_DELAY);
   });
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@frontend/public/actions/__tests__/dashboards.spec.ts` around lines 165 - 174, The test currently allows the backoff to equal MAX_POLL_DELAY which contradicts its intent; update the test in dashboards.spec.ts to first seed a fast successful poll (call setupWatchURL with a fetch mock that resolves once quickly), then trigger a rejection (mockRejectedValueOnce) so the backoff increases from the previous delay, and assert the subsequent timeout delay (inspect setTimeoutSpy.mock.calls[...] like in the existing test) is greater than the prior delay and strictly less than MAX_POLL_DELAY (use < MAX_POLL_DELAY, not <=), while still being > MIN_POLL_DELAY; keep references to setupWatchURL, setTimeoutSpy, flushPromises, MIN_POLL_DELAY and MAX_POLL_DELAY when locating and changing the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@frontend/e2e/setup/teardown.setup.ts`:
- Around line 24-50: The teardown leaves CONFIG_FILE (.test-config.json) on
disk; wrap the config read and namespace deletion logic in a try/finally so the
finally always runs and removes CONFIG_FILE regardless of early returns or
errors. Specifically, keep reading into testNamespace/authToken/kubeConfigPath
and running KubernetesClient.deleteNamespace and waitForNamespaceDeleted as
before (referencing CONFIG_FILE, testNamespace, authToken, kubeConfigPath, and
KubernetesClient), but move the current early returns into the try and perform
fs.unlinkSync or fs.rmSync(CONFIG_FILE) inside the finally (guarded by
fs.existsSync and swallowing/logging any unlink errors) so the file is always
removed after teardown completes.
In `@frontend/public/actions/dashboards.ts`:
- Around line 86-94: The catch path currently seeds the EMA with MAX_POLL_DELAY
/ SCALE_FACTOR when responseTimeEma is zero, which can jump retries straight to
MAX_POLL_DELAY; modify the logic around computeAdaptiveDelay so that when
responseTimeEma === 0 you seed the EMA from the "floor" value (the
minimal/steady-state equivalent) instead of MAX_POLL_DELAY / SCALE_FACTOR (e.g.
use a MIN_POLL_DELAY-based seed or the floor EMA value), keeping all calls and
variables (computeAdaptiveDelay, responseTimeEma, MAX_POLL_DELAY, SCALE_FACTOR,
emaToDelay, fetchPeriodically) intact; ensure nextEma is computed from that
floor-seeded value so the first failure backs off conservatively rather than
immediately scheduling the max delay.
---
Outside diff comments:
In `@frontend/playwright.config.ts`:
- Around line 131-141: The developer project generation uses hasDeveloper
(username-only) which can create projects without valid auth; change the
conditional that yields the developer entries to require the same credentials
check used in developer-auth.setup.ts (i.e., ensure both developer username and
developer password exist) or reuse the shared helper/flag from
developer-auth.setup.ts instead of hasDeveloper; update the condition
surrounding devPackages mapping (and any references to developerStorageState) so
developer projects are only created when both username and password are present.
---
Nitpick comments:
In `@frontend/public/actions/__tests__/dashboards.spec.ts`:
- Around line 165-174: The test currently allows the backoff to equal
MAX_POLL_DELAY which contradicts its intent; update the test in
dashboards.spec.ts to first seed a fast successful poll (call setupWatchURL with
a fetch mock that resolves once quickly), then trigger a rejection
(mockRejectedValueOnce) so the backoff increases from the previous delay, and
assert the subsequent timeout delay (inspect setTimeoutSpy.mock.calls[...] like
in the existing test) is greater than the prior delay and strictly less than
MAX_POLL_DELAY (use < MAX_POLL_DELAY, not <=), while still being >
MIN_POLL_DELAY; keep references to setupWatchURL, setTimeoutSpy, flushPromises,
MIN_POLL_DELAY and MAX_POLL_DELAY when locating and changing the test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 1e3997f8-4837-4ba4-b681-3b92dd163d9c
📒 Files selected for processing (14)
- frontend/e2e/global.setup.ts
- frontend/e2e/global.teardown.ts
- frontend/e2e/setup/admin-auth.setup.ts
- frontend/e2e/setup/cluster.setup.ts
- frontend/e2e/setup/developer-auth.setup.ts
- frontend/e2e/setup/login-helper.ts
- frontend/e2e/setup/teardown.setup.ts
- frontend/e2e/tests/smoke/developer/smoke-test.spec.ts
- frontend/package.json
- frontend/playwright.config.ts
- frontend/public/actions/__tests__/dashboards.spec.ts
- frontend/public/actions/dashboards.ts
- frontend/public/components/utils/__tests__/adaptive-polling.spec.ts
- frontend/public/components/utils/adaptive-polling.ts
💤 Files with no reviewable changes (2)
- frontend/e2e/global.setup.ts
- frontend/e2e/global.teardown.ts
📜 Review details
🔇 Additional comments (9)
frontend/public/components/utils/__tests__/adaptive-polling.spec.ts (1)
1-98: LGTM!
frontend/public/components/utils/adaptive-polling.ts (1)
1-34: LGTM!
frontend/e2e/setup/cluster.setup.ts (1)
13-65: LGTM!
frontend/e2e/setup/login-helper.ts (1)
9-50: LGTM!
frontend/e2e/setup/admin-auth.setup.ts (1)
9-18: LGTM!
frontend/e2e/setup/developer-auth.setup.ts (1)
9-22: LGTM!
frontend/playwright.config.ts (1)
30-126: LGTM!
frontend/e2e/tests/smoke/developer/smoke-test.spec.ts (1)
3-8: LGTM!
frontend/package.json (1)
57-57: LGTM!
```ts
  try {
    const config = JSON.parse(fs.readFileSync(CONFIG_FILE, 'utf-8'));
    testNamespace = config.testNamespace;
    kubeConfigPath = config.kubeConfigPath;
    authToken = config.authToken;
  } catch {
    return;
  }

  if (!testNamespace) {
    return;
  }

  const client = new KubernetesClient(
    {
      clusterUrl: process.env.CLUSTER_URL || '',
      username: process.env.OPENSHIFT_USERNAME || 'kubeadmin',
      password: process.env.BRIDGE_KUBEADMIN_PASSWORD || '',
      token: authToken,
    },
    kubeConfigPath,
  );

  await client.deleteNamespace(testNamespace);
  const deleted = await client.waitForNamespaceDeleted(testNamespace, 120_000);
  expect(deleted, `Namespace ${testNamespace} should be deleted within 120s`).toBe(true);
});
```
Delete .test-config.json after teardown completes.
This file contains a bearer token and currently persists after cleanup. Remove it in a finally block to reduce secret-retention risk.
Suggested patch
```diff
teardown('delete test namespace', async () => {
@@
- let testNamespace: string | undefined;
- let kubeConfigPath: string | undefined;
- let authToken: string | undefined;
-
- try {
- const config = JSON.parse(fs.readFileSync(CONFIG_FILE, 'utf-8'));
- testNamespace = config.testNamespace;
- kubeConfigPath = config.kubeConfigPath;
- authToken = config.authToken;
- } catch {
- return;
- }
-
- if (!testNamespace) {
- return;
- }
-
- const client = new KubernetesClient(
- {
- clusterUrl: process.env.CLUSTER_URL || '',
- username: process.env.OPENSHIFT_USERNAME || 'kubeadmin',
- password: process.env.BRIDGE_KUBEADMIN_PASSWORD || '',
- token: authToken,
- },
- kubeConfigPath,
- );
-
- await client.deleteNamespace(testNamespace);
- const deleted = await client.waitForNamespaceDeleted(testNamespace, 120_000);
- expect(deleted, `Namespace ${testNamespace} should be deleted within 120s`).toBe(true);
+ try {
+ let testNamespace: string | undefined;
+ let kubeConfigPath: string | undefined;
+ let authToken: string | undefined;
+
+ try {
+ const config = JSON.parse(fs.readFileSync(CONFIG_FILE, 'utf-8'));
+ testNamespace = config.testNamespace;
+ kubeConfigPath = config.kubeConfigPath;
+ authToken = config.authToken;
+ } catch {
+ return;
+ }
+
+ if (!testNamespace) {
+ return;
+ }
+
+ const client = new KubernetesClient(
+ {
+ clusterUrl: process.env.CLUSTER_URL || '',
+ username: process.env.OPENSHIFT_USERNAME || 'kubeadmin',
+ password: process.env.BRIDGE_KUBEADMIN_PASSWORD || '',
+ token: authToken,
+ },
+ kubeConfigPath,
+ );
+
+ await client.deleteNamespace(testNamespace);
+ const deleted = await client.waitForNamespaceDeleted(testNamespace, 120_000);
+ expect(deleted, `Namespace ${testNamespace} should be deleted within 120s`).toBe(true);
+ } finally {
+ fs.rmSync(CONFIG_FILE, { force: true });
+ }
});
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@frontend/e2e/setup/teardown.setup.ts` around lines 24 - 50, The teardown
leaves CONFIG_FILE (.test-config.json) on disk; wrap the config read and
namespace deletion logic in a try/finally so the finally always runs and removes
CONFIG_FILE regardless of early returns or errors. Specifically, keep reading
into testNamespace/authToken/kubeConfigPath and running
KubernetesClient.deleteNamespace and waitForNamespaceDeleted as before
(referencing CONFIG_FILE, testNamespace, authToken, kubeConfigPath, and
KubernetesClient), but move the current early returns into the try and perform
fs.unlinkSync or fs.rmSync(CONFIG_FILE) inside the finally (guarded by
fs.existsSync and swallowing/logging any unlink errors) so the file is always
removed after teardown completes.
```diff
       // Feed a synthetic slow response into the EMA to gradually back off without jumping to max
       [, nextEma] = computeAdaptiveDelay(MAX_POLL_DELAY / SCALE_FACTOR, responseTimeEma);
       dispatch(setError(type, key, error));
       dispatch(setData(type, key, null));
     } finally {
       dispatch(updateWatchInFlight(type, key, false));
       const timeout = setTimeout(
-        () => fetchPeriodically(dispatch, type, key, getURL, getState, fetch),
-        URL_POLL_DEFAULT_DELAY,
+        () => fetchPeriodically(dispatch, type, key, getURL, getState, fetch, nextEma),
+        emaToDelay(nextEma),
```
First error can jump straight to 60s retry, causing overly aggressive backoff.
With responseTimeEma = 0, the catch path seeds EMA with MAX_POLL_DELAY / SCALE_FACTOR, which immediately schedules MAX_POLL_DELAY. That delays recovery after transient first-request failures.
Suggested fix (seed from floor-equivalent EMA when no history)
```diff
 import {
   computeAdaptiveDelay,
   emaToDelay,
+  MIN_POLL_DELAY,
   MAX_POLL_DELAY,
   SCALE_FACTOR,
 } from '../components/utils/adaptive-polling';
@@
   } catch (error) {
-    // Feed a synthetic slow response into the EMA to gradually back off without jumping to max
-    [, nextEma] = computeAdaptiveDelay(MAX_POLL_DELAY / SCALE_FACTOR, responseTimeEma);
+    // Feed a synthetic slow response into EMA; if no history, seed from floor-equivalent EMA.
+    const emaSeed =
+      responseTimeEma > 0 ? responseTimeEma : MIN_POLL_DELAY / SCALE_FACTOR;
+    [, nextEma] = computeAdaptiveDelay(MAX_POLL_DELAY / SCALE_FACTOR, emaSeed);
     dispatch(setError(type, key, error));
     dispatch(setData(type, key, null));
   } finally {
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-      // Feed a synthetic slow response into the EMA to gradually back off without jumping to max
-      [, nextEma] = computeAdaptiveDelay(MAX_POLL_DELAY / SCALE_FACTOR, responseTimeEma);
+      // Feed a synthetic slow response into EMA; if no history, seed from floor-equivalent EMA.
+      const emaSeed =
+        responseTimeEma > 0 ? responseTimeEma : MIN_POLL_DELAY / SCALE_FACTOR;
+      [, nextEma] = computeAdaptiveDelay(MAX_POLL_DELAY / SCALE_FACTOR, emaSeed);
       dispatch(setError(type, key, error));
       dispatch(setData(type, key, null));
     } finally {
       dispatch(updateWatchInFlight(type, key, false));
       const timeout = setTimeout(
-        () => fetchPeriodically(dispatch, type, key, getURL, getState, fetch),
-        URL_POLL_DEFAULT_DELAY,
+        () => fetchPeriodically(dispatch, type, key, getURL, getState, fetch, nextEma),
+        emaToDelay(nextEma),
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@frontend/public/actions/dashboards.ts` around lines 86 - 94, The catch path
currently seeds the EMA with MAX_POLL_DELAY / SCALE_FACTOR when responseTimeEma
is zero, which can jump retries straight to MAX_POLL_DELAY; modify the logic
around computeAdaptiveDelay so that when responseTimeEma === 0 you seed the EMA
from the "floor" value (the minimal/steady-state equivalent) instead of
MAX_POLL_DELAY / SCALE_FACTOR (e.g. use a MIN_POLL_DELAY-based seed or the floor
EMA value), keeping all calls and variables (computeAdaptiveDelay,
responseTimeEma, MAX_POLL_DELAY, SCALE_FACTOR, emaToDelay, fetchPeriodically)
intact; ensure nextEma is computed from that floor-seeded value so the first
failure backs off conservatively rather than immediately scheduling the max
delay.
Force-pushed from 4017ab6 to 97c74b3 (Compare)
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jhadvig, stefanonardo
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Force-pushed from 97c74b3 to 09d4b6c (Compare)
New changes are detected. LGTM label has been removed.
QA Verification Evidence
Verification Steps
Step 5: Monitoring dashboards (404 - plugin not loaded) (pass)
Warning: This verification was performed by an AI agent. Results may contain false positives or miss
Automated QA verification by Claude Code
/retest
…uery response time
Replace the hardcoded 15s polling interval in fetchPeriodically with an adaptive delay derived from an Exponential Moving Average of response times. Fast clusters stay at the 15s floor while slow/large clusters automatically back off up to 60s, reducing unnecessary Prometheus load.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 09d4b6c to 1828b2b (Compare)
/retest
@stefanonardo: all tests passed! Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
















Analysis / Root cause:
Dashboard Prometheus queries poll every 15 seconds regardless of cluster size. On large clusters,
these queries are expensive (more time series to aggregate), creating unnecessary load on the
Prometheus/Thanos monitoring stack.
Solution description:
Replace the hardcoded 15s polling delay in fetchPeriodically with an adaptive interval derived from an Exponential Moving Average (EMA) of query response times. Fast clusters (~500ms responses) stay at the 15s floor, while slow/large clusters automatically back off up to 60s.
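To make the mechanism concrete, here is a minimal sketch of what computeAdaptiveDelay() and emaToDelay() could look like under these constraints. Only the names and the 15s/60s bounds come from this PR; the SCALE_FACTOR value, the 0.3 smoothing factor, and the seeding rule are assumptions for illustration, not the actual implementation.

```ts
// Illustrative sketch only — constant values and smoothing factor are assumptions.
export const MIN_POLL_DELAY = 15_000; // 15s floor (from the PR description)
export const MAX_POLL_DELAY = 60_000; // 60s ceiling (from the PR description)
export const SCALE_FACTOR = 30; // assumed: delay is ~30x the smoothed response time
const ALPHA = 0.3; // assumed EMA smoothing factor

// Clamp the scaled EMA between the polling floor and ceiling.
export const emaToDelay = (ema: number): number =>
  Math.min(MAX_POLL_DELAY, Math.max(MIN_POLL_DELAY, ema * SCALE_FACTOR));

// Fold the latest response time into the EMA and derive the next poll delay.
// Returns [delay, nextEma]; non-finite samples are guarded, and an empty
// history (prevEma === 0) is seeded directly from the sample.
export const computeAdaptiveDelay = (
  responseTimeMs: number,
  prevEma: number,
): [number, number] => {
  const sample = Number.isFinite(responseTimeMs) ? Math.max(responseTimeMs, 0) : 0;
  const nextEma = prevEma > 0 ? ALPHA * sample + (1 - ALPHA) * prevEma : sample;
  return [emaToDelay(nextEma), nextEma];
};
```

Under these assumptions, a cluster answering in ~500ms stays clamped at the 15s floor, while one answering in 2s or more drifts toward the 60s ceiling. It also shows why seeding an empty EMA from a synthetic MAX_POLL_DELAY / SCALE_FACTOR sample jumps straight to the ceiling — the first-error behavior flagged in the review above.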
- adaptive-polling.ts with computeAdaptiveDelay() and emaToDelay()
- dashboards.ts fetchPeriodically to measure fetch duration and compute adaptive delay
Screenshots / screen recording:
N/A — no visual changes. Polling interval changes are observable in browser DevTools Network tab.
Test setup:
No special setup required.
Test cases:
- computeAdaptiveDelay and emaToDelay (boundary values, EMA smoothing, NaN guards)
- fetchPeriodically adaptive behavior (fast response, slow response, error backoff)
Browser conformance:
Additional info:
Jira: https://redhat.atlassian.net/browse/OCPBUGS-81521
Summary by CodeRabbit
New Features
Tests