Skip to content

test(medium): Implement Spotify API Mocking for VRT Stability#9191

Closed
arii wants to merge 11 commits intoleaderfrom
fix/spotify-vrt-mocking-3439321376983697641
Closed

test(medium): Implement Spotify API Mocking for VRT Stability#9191
arii wants to merge 11 commits intoleaderfrom
fix/spotify-vrt-mocking-3439321376983697641

Conversation

@arii
Copy link
Copy Markdown
Owner

@arii arii commented Feb 23, 2026

Description

Implemented Spotify SDK and API mocking for Playwright VRTs to eliminate 401 errors and stabilize tests. Also fixed a minor reporter config issue in playwright.config.ts.

Fixes #9183

Change Type: ✨ New feature (non-breaking change adding functionality)

Related Issues

Closes #9183

Original PR Body

Implemented Spotify SDK and API mocking for Playwright VRTs to eliminate 401 errors and stabilize tests. Also fixed a minor reporter config issue in playwright.config.ts.

Fixes #9183


PR created automatically by Jules for task 3439321376983697641 started by @arii

@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@github-actions github-actions Bot changed the title Implement Spotify API Mocking for VRT Stability test(medium): Implement Spotify API Mocking for VRT Stability Feb 23, 2026
@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 23, 2026

Reviewed commit: c4fc36f033c18d5a454aa2767e346c4b3234419d

This PR significantly improves the stability and reliability of the Visual Regression Test (VRT) suite by introducing comprehensive Spotify API mocking and standardizing test setup.

Key Improvements:

  1. Spotify API Mocking (tests/playwright/lib/mocks.ts): The mockSpotifySDK function is a robust implementation that intercepts the Spotify Web Playback SDK script loading and provides a mock window.Spotify.Player. It also intercepts all network requests to Spotify domains, preventing external calls and ensuring consistent behavior during VRTs. This is crucial for eliminating flakiness related to external dependencies.

  2. Standardized VRT Setup (tests/playwright/lib/setup.ts and VRT spec files): The setupVisualRegressionTest and setupMinimalVisualRegressionTest functions now correctly integrate the Spotify SDK mock. Furthermore, the beforeEach hooks across multiple VRT spec files (vrt-control-panel.spec.ts, vrt-mock-hrm-client.spec.ts, vrt-spotify-controls.spec.ts, vrt-timer-controls.spec.ts, vrt-workout-summary.spec.ts) have been refactored to:

    • Reset server-side state using resetServerState.
    • Reload client pages to ensure a clean state.
    • Wait for pages to be ready and, critically, for WebSocket connections to be re-established. This ensures that tests start from a known, connected state, addressing a common source of flakiness in real-time applications.
  3. Server Reset Enhancement (server.ts): The /api/debug/reset endpoint now also resets the TabataService state, ensuring a more complete server state reset, which aligns with the "Single Source of Truth Principle" by ensuring server state is fully reset.

Overall, these changes are well-implemented and directly address the goal of improving VRT stability. The approach to mocking external APIs and ensuring a clean, consistent state for each test is excellent.

Reviewed at commit: c4fc36f033c18d5a454aa2767e346c4b3234419d

@arii arii added test refactor enhancement New feature or request labels Feb 23, 2026
@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 23, 2026

🤖 AI Technical Audit

Code Review: Implement Spotify API Mocking

This PR introduces necessary stability improvements for VRT by mocking the Spotify SDK. However, there is a critical architectural bug in the mocking implementation that renders the SDK mock ineffective, and significant code duplication in the test files.

ANTI-AI-SLOP DIRECTIVES

  1. CRITICAL BUG (ARCHITECTURAL): In tests/playwright/lib/mocks.ts, the route handlers are defined in the wrong order. Playwright matches routes in reverse order of definition (handlers defined later take precedence).
    • Current Behavior: The generic regex route (defined 2nd) matches spotify-player.js. It calls route.continue(), which bypasses the specific mock (defined 1st) and sends the request to the real network. The mock is effectively disabled.
  2. DUPLICATE HOOKS/BOILERPLATE: The beforeEach logic in vrt-*.spec.ts files is copy-pasted ~5 times. This includes repetitive comments and identical logic for resetting state, reloading, and waiting for WebSockets.
  3. OVERLY VERBOSE COMMENTS: Comments like // 1. Reset server-side state and // 2. Reload pages... inside every test file are redundant noise.
  4. CODE RATIO: You added ~200 lines. By extracting the test setup logic into a helper, you can remove ~60-80 lines of duplicated code.

File-by-File Analysis

tests/playwright/lib/mocks.ts

Problem: Route Masking. The generic mock handler shadows the specific one because it is registered later.

Fix Implementation: Swap the order of registration so the specific handler takes precedence (is registered last).

export async function mockSpotifySDK(pageOrContext: Page | BrowserContext): Promise<void> {
  // 1. Register GENERIC catch-all first (lower priority)
  await pageOrContext.route(/\.(spotify\.com|scdn\.co)/, (route) => {
    const url = route.request().url()
    // No need to check for spotify-player.js here if the specific route is registered AFTER this one
    console.log('[Mock Spotify Network] Intercepting:', url)
    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({}),
    })
  })

  // 2. Register SPECIFIC route second (higher priority)
  await pageOrContext.route('https://sdk.scdn.co/spotify-player.js', (route) => {
    route.fulfill({
      status: 200,
      contentType: 'application/javascript',
      body: `window.Spotify = { ... };` // (Your mock content)
    })
  })
}

tests/playwright/vrt-*.spec.ts (Multiple Files)

Problem: High Code Duplication. The setup sequence is identical across multiple files.

Fix Implementation: Extract this logic to tests/playwright/lib/utils.ts (or similar).

// In lib/utils.ts or lib/setup.ts
export async function resetAndReload(request: APIRequestContext, ...pages: Page[]) {
  await resetServerState(request)
  await Promise.all(pages.map(p => p.reload()))
  await Promise.all(pages.map(p => waitForPageReady(p)))
  // encapsulate the websocket wait here as well
  await Promise.all(pages.map(p => 
    p.waitForFunction(() => document.body.dataset.connectionStatus === 'connected', { timeout: 5000 })
  ))
}

// Usage in tests
test.beforeEach(async ({ request }) => {
  await resetAndReload(request, controlPage, dashboardPage)
})

playwright.config.ts

Evaluation: The spread syntax change ...(process.env.CI ? [['github']] : []) is a good improvement for type safety and cleanliness. Approved.

server.ts

Evaluation: Adding the Tabata service reset to the debug endpoint is appropriate for ensuring test isolation. Ideally, this would be a generic service.resetAll() interface, but direct access works for now.

Review automatically published via RepoAuditor.

@google-labs-jules
Copy link
Copy Markdown
Contributor

Reviewed commit: c4fc36f033c18d5a454aa2767e346c4b3234419d

This PR significantly improves the stability and reliability of the Visual Regression Test (VRT) suite by introducing comprehensive Spotify API mocking and standardizing test setup.

Key Improvements:

  1. Spotify API Mocking (tests/playwright/lib/mocks.ts): The mockSpotifySDK function is a robust implementation that intercepts the Spotify Web Playback SDK script loading and provides a mock window.Spotify.Player. It also intercepts all network requests to Spotify domains, preventing external calls and ensuring consistent behavior during VRTs. This is crucial for eliminating flakiness related to external dependencies.

  2. Standardized VRT Setup (tests/playwright/lib/setup.ts and VRT spec files): The setupVisualRegressionTest and setupMinimalVisualRegressionTest functions now correctly integrate the Spotify SDK mock. Furthermore, the beforeEach hooks across multiple VRT spec files (vrt-control-panel.spec.ts, vrt-mock-hrm-client.spec.ts, vrt-spotify-controls.spec.ts, vrt-timer-controls.spec.ts, vrt-workout-summary.spec.ts) have been refactored to:

    • Reset server-side state using resetServerState.
    • Reload client pages to ensure a clean state.
    • Wait for pages to be ready and, critically, for WebSocket connections to be re-established. This ensures that tests start from a known, connected state, addressing a common source of flakiness in real-time applications.
  3. Server Reset Enhancement (server.ts): The /api/debug/reset endpoint now also resets the TabataService state, ensuring a more complete server state reset, which aligns with the "Single Source of Truth Principle" by ensuring server state is fully reset.

Overall, these changes are well-implemented and directly address the goal of improving VRT stability. The approach to mocking external APIs and ensuring a clean, consistent state for each test is excellent.

Reviewed at commit: c4fc36f033c18d5a454aa2767e346c4b3234419d

Thank you for the positive feedback! I'm glad the Spotify SDK mocking and the improved test isolation via server-side state resets are working as intended to stabilize the VRT suite.

@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 23, 2026

Reviewed commit: 0293cd04fd895f065c9ee37b5d33ff35c76b6dbd

This PR significantly enhances the stability and reliability of the Playwright Visual Regression Tests (VRTs) by introducing comprehensive Spotify API mocking and standardizing server state resets and WebSocket connection waits across various test suites. The mockSpotifySDK function effectively intercepts Spotify SDK script loading and network requests, preventing external dependencies and ensuring consistent test environments. The integration of resetServerState and explicit waits for WebSocket connections in beforeEach hooks is a crucial improvement for test isolation and reducing flakiness.

Key Improvements:

  • Spotify API Mocking: The mockSpotifySDK provides a robust mock for the Spotify Web Playback SDK, controlling its behavior and preventing external network calls during VRTs. This is essential for deterministic visual testing.
  • Server State Reset: The /api/debug/reset endpoint now also resets the tabataService state, ensuring a cleaner and more predictable server state for tests.
  • VRT Stability: Consistent application of resetServerState, page reloads, and waitForFunction for WebSocket connection status across multiple VRT files (vrt-control-panel.spec.ts, vrt-dashboard.spec.ts, vrt-experimental.spec.ts, vrt-mock-hrm-client.spec.ts, vrt-spotify-controls.spec.ts, vrt-timer-controls.spec.ts, vrt-workout-summary.spec.ts) drastically improves test isolation and reduces flakiness.
  • HR Zone Consistency: The update in vrt-hr-components.spec.ts from 'Zone 3' to 'ZONE_3' for HR zones, along with more specific BPM value checks, aligns the tests with what is likely an internal enum, improving data consistency.
  • Viewport Assertions: Adding expect(dashboard).toBeVisible() before screenshots in responsive VRTs (vrt-dashboard.spec.ts) provides an extra layer of confidence that the element is rendered before capturing the screenshot.

Overall, this PR is a high-quality contribution that directly addresses VRT stability, which is critical for maintaining UI consistency and preventing regressions. The changes are well-scoped and align with the project's testing guidelines.

Reviewed at commit: 0293cd04fd895f065c9ee37b5d33ff35c76b6dbd

@google-labs-jules
Copy link
Copy Markdown
Contributor

Reviewed commit: 0293cd04fd895f065c9ee37b5d33ff35c76b6dbd

This PR significantly enhances the stability and reliability of the Playwright Visual Regression Tests (VRTs) by introducing comprehensive Spotify API mocking and standardizing server state resets and WebSocket connection waits across various test suites. The mockSpotifySDK function effectively intercepts Spotify SDK script loading and network requests, preventing external dependencies and ensuring consistent test environments. The integration of resetServerState and explicit waits for WebSocket connections in beforeEach hooks is a crucial improvement for test isolation and reducing flakiness.

Key Improvements:

  • Spotify API Mocking: The mockSpotifySDK provides a robust mock for the Spotify Web Playback SDK, controlling its behavior and preventing external network calls during VRTs. This is essential for deterministic visual testing.
  • Server State Reset: The /api/debug/reset endpoint now also resets the tabataService state, ensuring a cleaner and more predictable server state for tests.
  • VRT Stability: Consistent application of resetServerState, page reloads, and waitForFunction for WebSocket connection status across multiple VRT files (vrt-control-panel.spec.ts, vrt-dashboard.spec.ts, vrt-experimental.spec.ts, vrt-mock-hrm-client.spec.ts, vrt-spotify-controls.spec.ts, vrt-timer-controls.spec.ts, vrt-workout-summary.spec.ts) drastically improves test isolation and reduces flakiness.
  • HR Zone Consistency: The update in vrt-hr-components.spec.ts from 'Zone 3' to 'ZONE_3' for HR zones, along with more specific BPM value checks, aligns the tests with what is likely an internal enum, improving data consistency.
  • Viewport Assertions: Adding expect(dashboard).toBeVisible() before screenshots in responsive VRTs (vrt-dashboard.spec.ts) provides an extra layer of confidence that the element is rendered before capturing the screenshot.

Overall, this PR is a high-quality contribution that directly addresses VRT stability, which is critical for maintaining UI consistency and preventing regressions. The changes are well-scoped and align with the project's testing guidelines.

Reviewed at commit: 0293cd04fd895f065c9ee37b5d33ff35c76b6dbd

Thank you for the detailed review! I'm pleased that the improvements to the Spotify SDK mocking, the expanded server reset, and the standardized isolation procedures are meeting the requirements for a more robust and stable VRT suite.

@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 23, 2026

Reviewed commit: ff27e1f2830afa1425055e64ae96d5e13f70d2cd

This PR significantly improves the stability and reliability of the Visual Regression Tests (VRTs) by introducing comprehensive Spotify API mocking and enhancing test setup/teardown procedures. The changes are well-scoped and directly address the goal of VRT stability.

Key Improvements:

  1. Spotify API Mocking: The mockSpotifySDK function is a crucial addition, effectively isolating VRTs from external Spotify dependencies. By intercepting SDK script loading and all Spotify network requests, it prevents flaky tests caused by network issues or API rate limits. This aligns well with the project's focus on robust testing.
  2. Improved Test Isolation: The consistent use of resetServerState(request) and page.reload() in beforeEach hooks across multiple VRT files (e.g., vrt-control-panel.spec.ts, vrt-dashboard.spec.ts) ensures that each test starts from a clean, predictable server and client state. This is a best practice for reliable end-to-end testing.
  3. Robust WebSocket Connection Checks: Adding waitForFunction calls to verify document.body.dataset.connectionStatus === 'connected' after page reloads is an excellent way to ensure that the WebSocket connection is re-established and stable before proceeding with visual assertions. This directly tackles a common source of flakiness in real-time applications.
  4. Enhanced HR Zone Testing: The updates in vrt-hr-components.spec.ts to use ZONE_3 instead of Zone 3 and to explicitly wait for BPM values to reflect mock data improve the precision and reliability of HR tile VRTs.
  5. Minor Configuration Refinement: The playwright.config.ts change for reporter configuration is a small but clean improvement.

Overall, this PR makes substantial progress towards a more stable and maintainable VRT suite. The approach to mocking external services and ensuring clean test states is commendable.

Reviewed at commit: ff27e1f2830afa1425055e64ae96d5e13f70d2cd

@google-labs-jules
Copy link
Copy Markdown
Contributor

Reviewed commit: ff27e1f2830afa1425055e64ae96d5e13f70d2cd

This PR significantly improves the stability and reliability of the Visual Regression Tests (VRTs) by introducing comprehensive Spotify API mocking and enhancing test setup/teardown procedures. The changes are well-scoped and directly address the goal of VRT stability.

Key Improvements:

  1. Spotify API Mocking: The mockSpotifySDK function is a crucial addition, effectively isolating VRTs from external Spotify dependencies. By intercepting SDK script loading and all Spotify network requests, it prevents flaky tests caused by network issues or API rate limits. This aligns well with the project's focus on robust testing.
  2. Improved Test Isolation: The consistent use of resetServerState(request) and page.reload() in beforeEach hooks across multiple VRT files (e.g., vrt-control-panel.spec.ts, vrt-dashboard.spec.ts) ensures that each test starts from a clean, predictable server and client state. This is a best practice for reliable end-to-end testing.
  3. Robust WebSocket Connection Checks: Adding waitForFunction calls to verify document.body.dataset.connectionStatus === 'connected' after page reloads is an excellent way to ensure that the WebSocket connection is re-established and stable before proceeding with visual assertions. This directly tackles a common source of flakiness in real-time applications.
  4. Enhanced HR Zone Testing: The updates in vrt-hr-components.spec.ts to use ZONE_3 instead of Zone 3 and to explicitly wait for BPM values to reflect mock data improve the precision and reliability of HR tile VRTs.
  5. Minor Configuration Refinement: The playwright.config.ts change for reporter configuration is a small but clean improvement.

Overall, this PR makes substantial progress towards a more stable and maintainable VRT suite. The approach to mocking external services and ensuring clean test states is commendable.

Reviewed at commit: ff27e1f2830afa1425055e64ae96d5e13f70d2cd

Thank you for the detailed and positive review! I'm glad to see that the Spotify API mocking, the improved test isolation, and the robust WebSocket connection checks are meeting the project's standards for VRT stability. These changes should provide a much more reliable foundation for visual testing going forward.

@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 23, 2026

Reviewed commit: 61741dbf6fc57679409d9f5ec1b4242ae02b2146

This PR significantly enhances the stability and reliability of the Visual Regression Tests (VRTs) by implementing comprehensive Spotify API mocking and standardizing server-side state resets. The changes align well with the project's emphasis on quality and consistency in testing.

Key improvements include:

  1. Spotify SDK Mocking: The mockSpotifySDK function effectively intercepts Spotify SDK script loading and all related network requests, preventing external dependencies and potential 401 errors during VRT runs. This is crucial for deterministic visual testing of Spotify-related components.
  2. Standardized Server State Reset: The resetServerState call in beforeEach hooks across multiple VRT files, combined with page reloads and WebSocket connection waiting, ensures that each test starts from a clean and consistent server state. This directly addresses the "Single Source of Truth Principle" by ensuring client-server state synchronization is robustly handled in tests.
  3. Improved VRT Robustness: Additions like expect(dashboard).toBeVisible() for various viewports and explicit waiting for HR tile updates (toHaveText) make the VRTs more resilient to timing issues and responsive layout shifts. The addStyleTag in vrt-spotify-controls.spec.ts is a pragmatic solution to prevent flaky blank screenshots.
  4. HR Zone Naming Consistency: The correction from Zone X to ZONE_X in vrt-hr-components.spec.ts improves consistency with what appears to be an enum-like naming convention for HR zones.

Overall, this is a well-executed PR that greatly contributes to the robustness of our testing suite.

Reviewed at commit: 61741dbf6fc57679409d9f5ec1b4242ae02b2146

@google-labs-jules
Copy link
Copy Markdown
Contributor

Reviewed commit: 61741dbf6fc57679409d9f5ec1b4242ae02b2146

This PR significantly enhances the stability and reliability of the Visual Regression Tests (VRTs) by implementing comprehensive Spotify API mocking and standardizing server-side state resets. The changes align well with the project's emphasis on quality and consistency in testing.

Key improvements include:

  1. Spotify SDK Mocking: The mockSpotifySDK function effectively intercepts Spotify SDK script loading and all related network requests, preventing external dependencies and potential 401 errors during VRT runs. This is crucial for deterministic visual testing of Spotify-related components.
  2. Standardized Server State Reset: The resetServerState call in beforeEach hooks across multiple VRT files, combined with page reloads and WebSocket connection waiting, ensures that each test starts from a clean and consistent server state. This directly addresses the "Single Source of Truth Principle" by ensuring client-server state synchronization is robustly handled in tests.
  3. Improved VRT Robustness: Additions like expect(dashboard).toBeVisible() for various viewports and explicit waiting for HR tile updates (toHaveText) make the VRTs more resilient to timing issues and responsive layout shifts. The addStyleTag in vrt-spotify-controls.spec.ts is a pragmatic solution to prevent flaky blank screenshots.
  4. HR Zone Naming Consistency: The correction from Zone X to ZONE_X in vrt-hr-components.spec.ts improves consistency with what appears to be an enum-like naming convention for HR zones.

Overall, this is a well-executed PR that greatly contributes to the robustness of our testing suite.

Reviewed at commit: 61741dbf6fc57679409d9f5ec1b4242ae02b2146

Thank you for the detailed feedback! I'm glad that the Spotify SDK mocking, standardized server resets, and robustness improvements like explicit visibility and text assertions are recognized as valuable contributions to the VRT suite's stability and consistency.

@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 24, 2026

🤖 AI Technical Audit

Code Review for PR #9191

⚠️ Critical Feedback

This PR successfully introduces the Spotify API mocking required to stabilize the VRTs, which is a great step forward. However, the implementation introduces significant code duplication across the test suite. The initialization logic added to the beforeEach hooks in multiple files is identical and should be abstracted.

1. DRY Violation in Test Setup

Problem:
You have copy-pasted the exact same server reset, page reload, and WebSocket connection waiting logic into vrt-control-panel.spec.ts, vrt-dashboard.spec.ts, vrt-hr-components.spec.ts, vrt-spotify-controls.spec.ts, vrt-timer-controls.spec.ts, and vrt-workout-summary.spec.ts.

Recommendation:
Extract this logic into a shared helper function in tests/playwright/lib/setup.ts.

Implementation Sample:

// tests/playwright/lib/setup.ts
export async function prepareVrtEnvironment(
  request: APIRequestContext,
  pages: Page[]
) {
  await resetServerState(request)
  
  await Promise.all(pages.map(async (page) => {
    await page.reload()
    await waitForPageReady(page)
    await page.waitForFunction(
      () => document.body.dataset.connectionStatus === 'connected',
      { timeout: 5000 }
    )
  }))
}

// Usage in spec files:
test.beforeEach(async ({ request }) => {
  await prepareVrtEnvironment(request, [controlPage, dashboardPage])
})

2. Mock Implementation Cleanliness

Problem:
tests/playwright/lib/mocks.ts contains console.log statements and setTimeout inside the injected browser code.

  • console.log pollutes the test output.
  • setTimeout introduces arbitrary waits (100ms) which can lead to race conditions or unnecessary slowness.

Recommendation:
Remove the logs or gate them behind a debug flag. Use Promise.resolve() or requestAnimationFrame instead of setTimeout if immediate execution is acceptable, or ensuring the timing is deterministic.

3. Service Coupling in Server Reset

Problem:
In server.ts, the API route directly manipulates the internals of tabataService (setting specific workDuration numbers).

services.tabataService.setConfig({ workDuration: 20, restDuration: 10 })

Recommendation:
The TabataService class should expose a public .reset() method that encapsulates these defaults. The API layer should not know the default configuration values.

4. VRT Stability vs Tolerance

Observation:
You increased maxDiffPixelRatio to 0.4 in vrt-dashboard.spec.ts while simultaneously deleting the style tag that forces opacity/transforms to be stable.

// Deleted
// await dashboardPage.addStyleTag({
//   content: `[data-testid="main-content-layout"] { opacity: 1 !important; transform: none !important; }`,
// })

Risk: Removing the animation/style override while mocking data suggests you are relying on higher pixel tolerance to pass tests rather than fixing the rendering instability (animations). This might mask actual visual regressions.


ANTI-AI-SLOP DIRECTIVES

  1. OVERLY VERBOSE COMMENTS:

    • server.ts: // Reset Timer Service to a clean state (Redundant)
    • setup.ts: // Use a retry mechanism to handle occasional socket hangups... (The code explains itself)
  2. OVER-ENGINEERING:

    • The resetServerState function in setup.ts implements a manual retry loop with try/catch and setTimeout. Playwright has built-in expecting/polling, or you could use a dedicated retry utility. However, this manual implementation is acceptable if the server behavior is uniquely flaky.
  3. DUPLICATE HOOKS:

    • The beforeEach hook in 6 different files contains 15+ lines of identical logic.
  4. CODE RATIO:

    • Flagged: The PR adds ~200 lines. At least 80 lines are duplicate boilerplate in the test specs.
    • Action: Refactoring the beforeEach block into a helper function will satisfy this directive immediately.
  5. STALE FEATURES:

    • Checked. The PR introduces new mocking strategies. Ensure that mockGoogleDocIframe (which is still called) doesn't conflict with the new full-page mocks if they overlap context.

Review automatically published via RepoAuditor.

@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 25, 2026

@pr-squash

@github-actions
Copy link
Copy Markdown
Contributor

❌ Automatic squash and rebase failed. The branch has conflicts with leader. Please resolve the conflicts manually and try again.

@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 25, 2026

🤖 AI Technical Audit

Code Review for PR #9191

🛑 Critical Issues

1. Merge Conflicts in Codebase

Problem: The file tests/playwright/vrt-hr-components.spec.ts contains raw git merge conflict markers (<<<<<<<, =======, >>>>>>>). This will cause the build and tests to fail immediately.

File: tests/playwright/vrt-hr-components.spec.ts

Implementation Sample:
You must resolve the conflicts manually. Based on the diff, it seems you want to keep the timeout logic from HEAD and the visible checks from origin.

// Resolve lines 112-123
// Wait for tiles to appear and reflect mock data
await expect(dashboardPage.getByTestId('hr-tile-card')).toHaveCount(2, {
  timeout: 5000,
})
await expect(dashboardPage.getByTestId('hr-tile-card').first()).toBeVisible()

2. Accidental Documentation Commit

Problem: The file spotify-volume-control-issues.md appears to be a bug report draft or context file. It claims fixes were applied ("DONE in current session") to SpotifyControls.tsx, but SpotifyControls.tsx is not included in this PR. This file adds 272 lines of noise to the repository and should not be committed.

Recommendation:

  1. Delete spotify-volume-control-issues.md.
  2. Verify if the actual code fixes for Spotify volume control were intended to be part of this PR. If so, add app/client/control/components/SpotifyControls.tsx to the commit.

### ANTI-AI-SLOP DIRECTIVES

  1. CODE RATIO: You added 272 lines in spotify-volume-control-issues.md which serves no functional purpose in the codebase. Action: Delete this entire file.
  2. OVERLY VERBOSE COMMENTS: In app/client/mock/page.tsx, the comment describing the metadata packet logic is expanding unnecessarily.
  3. STALE FEATURES: You successfully replaced manual VRT setup in multiple files with prepareVrtEnvironment. This is a good cleanup.

File-by-File Analysis

tests/playwright/lib/mocks.ts

Problem: The mockSpotifySDK function injects a script string with comments inside it. While harmless, it adds unnecessary bytes to the injection payload.

Best Practice: Keep injected code strings minimal.

// tests/playwright/lib/mocks.ts
// Remove internal comments like "// Simulate async success" from the string literal to keep the mock payload clean.

tests/playwright/lib/setup.ts

Architecture: The introduction of prepareVrtEnvironment is excellent. It centralizes the teardown/setup logic (reset server -> reload -> wait for connection), which reduces code duplication across VRT specs.

Observation: The retry logic in resetServerState is a good addition for CI stability.

app/client/mock/page.tsx

Problem: Redundant comment expansion.

Implementation Sample:

// app/client/mock/page.tsx

// Change this:
// NOTE: In a real client... feedback.
// We also ensure it's sent whenever the connection is established.

// To this (or just keep the original):
// NOTE: Sends metadata on connection or state change for immediate mock feedback.

tests/playwright/vrt-dashboard.spec.ts

Quality: The explicit viewport setting and masking updates look correct for stabilizing VRT. The removal of the manual reset logic in test.beforeEach in favor of prepareVrtEnvironment makes the test much more readable.

Summary

This PR contains good infrastructure work for VRT stability (mocking Spotify, centralizing setup), but it is currently unmergeable due to conflicts and contains artifacts (markdown files) that should be removed.

Review automatically published via RepoAuditor.

@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 26, 2026

Reviewed commit: 90643299394e68f5929d71736744a754920c14e1

This PR significantly enhances the stability and reliability of the Visual Regression Tests (VRTs) by implementing comprehensive Spotify API mocking and refactoring common test setup logic. The introduction of mockSpotifySDK is a crucial step towards isolating VRTs from external dependencies, ensuring consistent visual snapshots.

The prepareVrtEnvironment utility is an excellent refactoring, reducing boilerplate in individual test files and promoting a consistent test setup across the suite. The improvements to resetServerState and accessibility checks further contribute to a robust testing infrastructure.

Minor points for consideration are noted in the suggested issues, primarily regarding a potentially relaxed layout constraint and a contradictory comment in a VRT file.

Reviewed at commit: 90643299394e68f5929d71736744a754920c14e1

@arii
Copy link
Copy Markdown
Owner Author

arii commented Feb 26, 2026

Reviewed commit: 941272b4210546bbd7ca77f68fe188de0d5bf017

This PR significantly improves the stability and reliability of the Visual Regression Tests (VRTs) by introducing comprehensive Spotify API mocking and standardizing the test environment setup. The changes are well-implemented and address common sources of VRT flakiness.

Key improvements include:

  • Spotify SDK Mocking: The mockSpotifySDK utility provides a robust mock for the Spotify Web Playback SDK and related network requests, preventing external API dependencies from causing VRT failures.
  • Standardized VRT Environment Preparation: The new prepareVrtEnvironment function encapsulates server state resets, page reloads, and WebSocket connection waits, reducing boilerplate and ensuring a consistent starting state for tests.
  • Increased Test Resilience: Adjustments to accessibility cleanup (try-catch), VRT diff thresholds, and increased timeouts for assertions make the tests more robust against transient issues.
  • Comprehensive Server Reset: The /api/debug/reset endpoint now correctly resets the Tabata timer service, ensuring a cleaner state for testing.

Overall, this is a high-quality PR that directly contributes to the maintainability and reliability of our testing suite, aligning with our Quality & Security First and Continuous Improvement core principles.

Reviewed at commit: 941272b4210546bbd7ca77f68fe188de0d5bf017

@arii arii added chore not reviewed The review could not be performed or completed successfully. labels Feb 26, 2026
@arii arii marked this pull request as ready for review February 26, 2026 22:43
@arii arii added not reviewed The review could not be performed or completed successfully. and removed not reviewed The review could not be performed or completed successfully. labels Feb 26, 2026
arii and others added 9 commits February 26, 2026 15:47
This change introduces comprehensive mocking for the Spotify Web Playback SDK
and associated API calls during Visual Regression Tests.

Key improvements:
- Added `mockSpotifySDK` in `tests/playwright/lib/mocks.ts` to intercept the
  Spotify SDK script and provide a mock implementation of `window.Spotify.Player`.
- Implemented global network interception for `*.spotify.com` and `*.scdn.co`
  domains to prevent unauthorized (401) errors and external network calls.
- Integrated `mockSpotifySDK` into `setupVisualRegressionTest` and
  `setupMinimalVisualRegressionTest` in `tests/playwright/lib/setup.ts`.
- Fixed a minor bug in `playwright.config.ts` where the reporter configuration
  could result in an invalid state in non-CI environments.
- Applied project formatting standards using Prettier.

These changes ensure that VRTs capture the visual integrity of the dashboard
reliably, decoupled from the Spotify service's live status or authentication.

Co-authored-by: arii <342438+arii@users.noreply.github.com>
Acknowledged the positive review regarding the implementation of Spotify API
mocking and improved test isolation. No further code changes were required.

Co-authored-by: arii <342438+arii@users.noreply.github.com>
Acknowledged the positive review regarding the comprehensive VRT suite
stability improvements, including Spotify SDK mocking, server state reset
enhancements, and standardized test isolation procedures. No further
code changes were requested.

Co-authored-by: arii <342438+arii@users.noreply.github.com>
Acknowledged the positive review regarding Spotify API mocking, test
isolation, and robust WebSocket checks in the VRT suite.

Co-authored-by: arii <342438+arii@users.noreply.github.com>
This comprehensive update stabilizes the Visual Regression Test (VRT) suite
by eliminating external dependencies and ensuring strict test isolation.

Key Changes:
- **Spotify SDK Mocking**: Added `mockSpotifySDK` in `tests/playwright/lib/mocks.ts`
  to intercept the Spotify SDK script and provide a mock `window.Spotify.Player`.
  Global network interception for Spotify domains prevents 401 errors.
- **Server-Side Reset**: Enhanced `/api/debug/reset` in `server.ts` to clear
  both HRM and Timer Service states, providing a clean baseline for every test.
- **Test Isolation**: Refactored `beforeEach` hooks in all VRT specifications
  to perform a server reset and page reload, preventing state leakage between tests.
- **Flakiness Reductions**:
  - Added explicit waits for BPM updates in `vrt-hr-components.spec.ts` using
    regex-based text assertions to handle "BPM" suffixes.
  - Standardized heart rate zone identifiers (e.g., 'ZONE_3') for type safety.
  - Improved synchronization in responsive dashboard tests by waiting for
    container visibility.
- **Tooling & Cleanup**:
  - Fixed a reporter configuration error in `playwright.config.ts`.
  - Resolved a `ReferenceError` in `vrt-hr-components.spec.ts`.
  - Applied consistent formatting across all modified files.

These improvements ensure that VRTs are deterministic and reliably capture
UI regressions without interference from external services or previous test runs.

Co-authored-by: arii <342438+arii@users.noreply.github.com>
Removed an extra newline that caused a Prettier/ESLint failure in CI.

Co-authored-by: arii <342438+arii@users.noreply.github.com>
#9381)

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: arii <342438+arii@users.noreply.github.com>
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: arii <342438+arii@users.noreply.github.com>
@arii arii force-pushed the fix/spotify-vrt-mocking-3439321376983697641 branch from 3e5a561 to 618e6a2 Compare February 26, 2026 23:47
@arii arii added not reviewed The review could not be performed or completed successfully. and removed not reviewed The review could not be performed or completed successfully. labels Feb 26, 2026
@arii arii enabled auto-merge (squash) February 26, 2026 23:55
@arii arii disabled auto-merge February 26, 2026 23:56
@arii arii added not reviewed The review could not be performed or completed successfully. and removed not reviewed The review could not be performed or completed successfully. labels Feb 27, 2026
@arii arii closed this Mar 3, 2026
@arii arii deleted the fix/spotify-vrt-mocking-3439321376983697641 branch March 17, 2026 07:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement Spotify API Mocking for Visual Regression Tests Stability

1 participant