Skip to content

Migrate E2E coverage to Playwright#2610

Open
senamakel wants to merge 46 commits into
tinyhumansai:mainfrom
senamakel:codex/web-shell-playwright
Open

Migrate E2E coverage to Playwright#2610
senamakel wants to merge 46 commits into
tinyhumansai:mainfrom
senamakel:codex/web-shell-playwright

Conversation

@senamakel
Copy link
Copy Markdown
Member

@senamakel senamakel commented May 25, 2026

Summary

  • Add a browser-hosted Playwright E2E lane for the Tauri shell by introducing the web callback route, standalone core bootstrap, and Playwright web session/build scripts.
  • Migrate every non-mega-flow WDIO/Appium spec to a Playwright counterpart under app/test/playwright/specs.
  • Stabilize the shared Playwright auth/bootstrap/walkthrough helpers and connector scaffolding so the web lane can cover auth, connectors, chat harness flows, channels, tools, webhooks, settings, journeys, and billing smokes.
  • Update GitHub Actions so branch/PR E2E now runs the desktop full-flow lane across Linux/macOS/Windows, alongside a separate Playwright web-E2E lane.
  • Remove the release staging/production workflows’ dependency on the reusable E2E pretest gate.

Problem

  • The Appium/WDIO suite had become too heavy for general E2E coverage, while most flows only need renderer + core JSON-RPC behavior rather than desktop-specific CEF integration.
  • OAuth/callback flows in the browser lane needed an in-app callback route and a standalone core session/bootstrap path because Playwright cannot spawn the old callback-token capture server model.
  • CI was still shaped around older WDIO smoke/full-suite patterns instead of the desired split: one desktop full-flow gate and one Playwright web-E2E lane.

Solution

  • Added the web-targeted shell support (app/src-tauri-web, callback route, web build/session scripts, auth/core helper changes) so the browser lane can speak to openhuman-core over the same JSON-RPC surface as the desktop renderer.
  • Ported the WDIO/Appium spec inventory into Playwright under app/test/playwright/specs, leaving only app/test/e2e/specs/mega-flow.spec.ts unmatched.
  • Preserved browser-lane limitations explicitly where a flow is inherently desktop/service/listener dependent by using focused skips instead of leaving those filenames on the Appium side.
  • Reshaped GitHub Actions so .github/workflows/e2e.yml now runs all-OS desktop mega-flow plus a separate Playwright web suite, and removed the release workflows’ pretest-e2e dependency.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — N/A: not measured locally for this branch-sized migration; CI diff-cover remains the source of truth
  • Coverage matrix updated — N/A: no matrix doc rows were edited in this branch
  • All affected feature IDs from the matrix are listed in the PR description under ## Related — N/A: broad E2E migration branch, not tied to one feature-row update
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: E2E/CI migration only
  • Linked issue closed via Closes #NNN in the ## Related section — N/A: no single issue key for this branch

Impact

  • Desktop: mega-flow remains the desktop/Appium holdout; the rest of the E2E surface now has Playwright coverage.
  • Web/test infrastructure: the browser lane now boots a standalone openhuman-core and uses the in-app callback route for auth/callback flows.
  • CI: PR/push E2E shape changes to desktop mega-flow on Linux/macOS/Windows plus a dedicated Playwright web lane, while release staging/production no longer invoke the reusable E2E pretest.
  • Compatibility: explicit Playwright skips document the browser-lane gaps for native-only/service-only/listener-restart-only behavior.

Related

  • Closes:
    • N/A: broad E2E migration branch, no single tracked issue key
  • Follow-up PR(s)/TODOs:
    • Desktop/native-only surfaces still intentionally skipped in some Playwright specs where the browser lane cannot provide the underlying OS/service capability.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: codex/web-shell-playwright
  • Commit SHA: 6be30e7a190ed7cf8e80cca25376691aab4159fe

Validation Run

  • pnpm --filter openhuman-app format:check
  • pnpm typecheck
  • pnpm --filter openhuman-app lint
  • Focused tests:
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/smoke.spec.ts test/playwright/specs/navigation.spec.ts test/playwright/specs/navigation-settings-panels.spec.ts test/playwright/specs/insights-dashboard.spec.ts test/playwright/specs/login-flow.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/channels-smoke.spec.ts test/playwright/specs/command-palette.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/settings-account-preferences.spec.ts test/playwright/specs/settings-advanced-config.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/notifications.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/settings-channels-permissions.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/rewards-progression-persistence.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/onboarding-modes.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/auth-access-control.spec.ts test/playwright/specs/guided-tour-gates.spec.ts test/playwright/specs/logout-relogin-onboarding.spec.ts test/playwright/specs/onboarding-modes.spec.ts test/playwright/specs/runtime-picker-login.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/connector-airtable.spec.ts test/playwright/specs/connector-github.spec.ts test/playwright/specs/connector-discord-composio.spec.ts test/playwright/specs/connector-gmail-composio.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/composio-triggers-flow.spec.ts test/playwright/specs/skills-registry.spec.ts test/playwright/specs/skill-socket-reconnect.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/voice-mode.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/user-journey-settings-round-trip.spec.ts test/playwright/specs/core-port-conflict-recovery.spec.ts test/playwright/specs/connectivity-state-differentiation.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/skill-lifecycle.spec.ts test/playwright/specs/skill-execution-flow.spec.ts test/playwright/specs/skill-multi-round.spec.ts test/playwright/specs/skill-oauth.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/tool-browser-flow.spec.ts test/playwright/specs/tool-filesystem-flow.spec.ts test/playwright/specs/tool-shell-git-flow.spec.ts test/playwright/specs/webhooks-ingress-flow.spec.ts test/playwright/specs/webhooks-tunnel-flow.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/whatsapp-flow.spec.ts test/playwright/specs/slack-flow.spec.ts test/playwright/specs/conversations-web-channel-flow.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/harness-search-tool-flow.spec.ts test/playwright/specs/harness-composio-tool-flow.spec.ts test/playwright/specs/harness-cron-prompt-flow.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/telegram-channel-flow.spec.ts test/playwright/specs/cron-jobs-flow.spec.ts test/playwright/specs/harness-search-tool-flow.spec.ts test/playwright/specs/harness-composio-tool-flow.spec.ts test/playwright/specs/harness-cron-prompt-flow.spec.ts test/playwright/specs/memory-roundtrip.spec.ts test/playwright/specs/user-journey-full-task.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/agent-review.spec.ts test/playwright/specs/card-payment-flow.spec.ts test/playwright/specs/crypto-payment-flow.spec.ts test/playwright/specs/tauri-commands.spec.ts
    • bash ./app/scripts/e2e-web-session.sh test/playwright/specs/audio-toolkit-flow.spec.ts test/playwright/specs/gmail-flow.spec.ts test/playwright/specs/harness-channel-bridge-flow.spec.ts test/playwright/specs/local-model-runtime.spec.ts test/playwright/specs/screen-intelligence.spec.ts test/playwright/specs/service-connectivity-flow.spec.ts test/playwright/specs/linux-cef-deb-runtime.spec.ts
  • Rust fmt/check (if changed): cargo check --manifest-path Cargo.toml -q
  • Tauri fmt/check (if changed): not run for this workflow-only tail commit; earlier branch work validated targeted app/web lanes instead

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: none

Behavior Changes

  • Intended behavior change: replace Appium/WDIO coverage with Playwright everywhere except mega-flow, and split CI into desktop mega-flow plus Playwright web-E2E lanes
  • User-visible effect: broader/faster web-lane E2E coverage and updated GitHub Actions E2E topology

Parity Contract

  • Legacy behavior preserved: every former non-mega-flow WDIO/Appium filename now has a Playwright counterpart; desktop-only limitations are explicit skips rather than silent omissions
  • Guard/fallback/dispatch parity checks: auth callback route, standalone core bootstrap, connector/composio param shapes, walkthrough dismissal, and browser-lane settings/chat/channel flows were validated through focused Playwright runs

Duplicate / Superseded PR Handling

  • Duplicate PR(s): none
  • Canonical PR: this branch/PR
  • Resolution (closed/superseded/updated): updated

Summary by CodeRabbit

  • New Features
    • Browser-hosted web shell, web callback UI, web E2E build/session runners, and a dedicated Playwright CI lane.
  • Tests
    • Large Playwright end-to-end suite added (auth, chat, connectors, settings, tools, many flows) and unit tests for callback handling.
  • Bug Fixes
    • Session/auth flows tightened to always use core RPC; Telegram Bot API env var promoted for E2E.
  • Chores
    • CI gating simplified, cancel-aware CI wrapper added, ESLint/test configs and package scripts updated, Playwright cache volume and .gitignore entry added.
  • Documentation
    • Web-shell README and E2E guidance updated.

Review Change Stack

senamakel added 30 commits May 24, 2026 19:40
@senamakel senamakel requested a review from a team May 25, 2026 07:34
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Split E2E into desktop (mega-flow) and Playwright web lanes; add Playwright config, helpers, many web specs, web build/session scripts; add scripts/ci-cancel-aware.sh and wrap CI build/test steps; remove pretest-e2e gating from release workflows; add WebCallbackPage and related tests; update auth/Telegram env handling.

Changes

Web E2E + CI

Layer / File(s) Summary
CI workflow gating & e2e job split
.github/workflows/e2e.yml, .github/workflows/e2e-reusable.yml, .github/workflows/release-production.yml, .github/workflows/release-staging.yml
Simplifies workflow_dispatch inputs, adds e2e-playwright job, splits e2e-desktop, and removes pretest-e2e hard gating from release/staging paths.
CI cancel-aware wrapper & usage
scripts/ci-cancel-aware.sh, .github/workflows/*
Add cancel-aware runner and prefix many long-running pnpm/cargo/tauri/Appium steps with it across CI workflows.
e2e reusable: mega-flow vs full shard
.github/workflows/e2e-reusable.yml
Make full=false run only mega-flow; full=true uses full-suite sharded fanout; wrap shard commands with cancel-aware script.
Add Playwright CI lane
.github/workflows/e2e.yml
Introduce e2e-playwright container job to build and run web Playwright E2E and upload artifacts.
Release gating adjustments
.github/workflows/release-production.yml, .github/workflows/release-staging.yml
Remove pretest-e2e from create-release/build job needs and cleanup conditions.
Misc workflow wraps
.github/workflows/build.yml, .github/workflows/build-desktop.yml, .github/workflows/build-windows.yml, .github/workflows/ios-compile.yml, .github/workflows/test-reusable.yml, .github/workflows/android-compile.yml, .github/workflows/release-packages.yml
Wrap many install/build/test steps with scripts/ci-cancel-aware.sh.

Web E2E scripts & config

Layer / File(s) Summary
Web E2E build & session scripts
app/scripts/e2e-web-build.sh, app/scripts/e2e-web-session.sh
New scripts to build web dist and standalone core, start mock/core/web servers, export PW env vars, and run Playwright tests.
Playwright config & tsconfig
app/playwright.config.ts, app/test/tsconfig.e2e.json, app/package.json
Add Playwright defineConfig, include Playwright types in tsconfig, add npm scripts and @playwright/test devDependency.
Docker & bootstrap
e2e/docker-compose.yml, e2e/docker-local-bootstrap.sh
Persist Playwright cache volume and install chromium-headless-shell if missing.
ESLint updates
app/eslint.config.js
Add dist-web/** ignore, include Playwright file globs, relax rules for Playwright test files.

App code & tests

Layer / File(s) Summary
Web callback route & tests
app/src/pages/WebCallbackPage.tsx, app/src/pages/__tests__/WebCallbackPage.test.tsx, app/src/AppRoutes.tsx
Add WebCallbackPage that builds synthetic deep links and test coverage to assert deep-link forwarding.
Auth utilities & Telegram env
app/src/utils/tauriCommands/auth.ts, app/scripts/e2e-run-session.sh, src/openhuman/channels/providers/telegram/*
Always use core RPC for session store/get/logout; prefer OPENHUMAN_TELEGRAM_BOT_API_BASE env var and update code/comments; script exports the BOT API env var.
Playwright helpers
app/test/playwright/helpers/core-rpc.ts
Centralized core RPC helpers, bypass JWT builder, boot/reset/seed/browser helpers, wait/dismiss utilities used by many specs.
Playwright specs
app/test/playwright/specs/*
Add a large suite of Playwright E2E tests covering connectors, chat flows, tools, settings, navigation, onboarding, webhooks, notifications, rewards, voice, and smoke tests (many files).
WDIO/e2e updates
app/test/e2e/specs/mega-flow.spec.ts, app/test/e2e/specs/voice-mode.spec.ts
Mega-flow spec updated to use bypass JWTs; desktop voice-mode E2E adjusted; WDIO suite retains desktop-focused tests.
Misc docs & README
app/src-tauri-web/README.md, .gitignore
Add README for web shell and ignore .codex-tmp.

Sequence Diagrams: skipped (changes are large and varied; no single 3+ component sequential flow added beyond existing CI orchestration).

Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • graycyrus

"A rabbit hops, scripts in paw,
CI wakes with gentle awe.
Playwright dances, tests take flight,
Cancel-aware shields the night.
Cheers from a bunny for builds done right 🐇"

@coderabbitai coderabbitai Bot added feature Net-new user-facing capability or product behavior. working A PR that is being worked on by the team. labels May 25, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

🟠 Major comments (21)
app/src/AppRoutes.tsx-41-42 (1)

41-42: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Route contract drift in AppRoutes.tsx.

/callback/:kind and /callback/:kind/:status extend routing beyond the allowed AppRoutes.tsx contract. Please either fold this into the approved route map/pattern or update the route contract guideline in the same PR to keep architecture policy and implementation aligned.

As per coding guidelines: “app/src/AppRoutes.tsx: Routing: / ... /settings/*. Default *DefaultRedirect.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/AppRoutes.tsx` around lines 41 - 42, The two routes added in
AppRoutes.tsx (Route path="/callback/:kind" and "/callback/:kind/:status"
rendering WebCallbackPage) violate the approved route contract; either move
these callback routes into the sanctioned route map/pattern (fold them under an
allowed top-level route group and register them alongside the existing routes
list in AppRoutes.tsx) or, if they are intentionally new top-level routes,
update the route contract guideline in the same PR to include "/callback/:kind"
and "/callback/:kind/:status"; change the implementation so WebCallbackPage
remains the element for those paths and update any shared route registry or
documentation entry used by AppRoutes.tsx to keep code and contract in sync.
app/test/playwright/specs/chat-harness-cancel.spec.ts-23-37 (1)

23-37: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when mock-admin calls fail.

resetMock/setMockBehavior ignore HTTP failures, so the test can proceed with stale backend state and produce flaky or misleading results. Check response.ok and throw with status/body context.

🔧 Proposed fix
 async function resetMock(): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
+  const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({}),
   });
+  if (!response.ok) {
+    const body = await response.text().catch(() => '');
+    throw new Error(`resetMock failed: ${response.status} ${body}`);
+  }
 }
 
 async function setMockBehavior(key: string, value: string): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
+  const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({ key, value }),
   });
+  if (!response.ok) {
+    const body = await response.text().catch(() => '');
+    throw new Error(`setMockBehavior(${key}) failed: ${response.status} ${body}`);
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/chat-harness-cancel.spec.ts` around lines 23 - 37,
The helper functions resetMock and setMockBehavior currently ignore HTTP errors;
update both to capture the fetch Response, check response.ok, and if false read
the response body (e.g., response.text()) and throw a new Error including the
endpoint, response.status and the response body so failures fail fast and
surface context for debugging.
app/test/playwright/specs/memory-roundtrip.spec.ts-72-87 (1)

72-87: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace fixed 3s sleep with poll-based eventual assertion.

Line 78’s fixed delay can still flake when cleanup latency exceeds 3s. Use expect.poll to wait until the canary is absent (bounded by timeout).

Proposed fix
-    const recallAfterForget = await callCoreRpc<unknown>('openhuman.memory_recall_memories', {
-      namespace: TEST_NAMESPACE,
-      limit: 10,
-    });
-    let recalled = JSON.stringify(recallAfterForget ?? {});
-    if (recalled.includes(TEST_KEY) || recalled.includes(TEST_CONTENT)) {
-      await new Promise(resolve => setTimeout(resolve, 3_000));
-      const retry = await callCoreRpc<unknown>('openhuman.memory_recall_memories', {
-        namespace: TEST_NAMESPACE,
-        limit: 10,
-      });
-      recalled = JSON.stringify(retry ?? {});
-    }
-    expect(recalled.includes(TEST_KEY)).toBe(false);
-    expect(recalled.includes(TEST_CONTENT)).toBe(false);
+    await expect
+      .poll(
+        async () => {
+          const recall = await callCoreRpc<unknown>('openhuman.memory_recall_memories', {
+            namespace: TEST_NAMESPACE,
+            limit: 10,
+          });
+          const recalled = JSON.stringify(recall ?? {});
+          return recalled.includes(TEST_KEY) || recalled.includes(TEST_CONTENT);
+        },
+        { timeout: 15_000, intervals: [500, 1_000, 2_000] }
+      )
+      .toBe(false);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/memory-roundtrip.spec.ts` around lines 72 - 87,
Replace the fixed 3_000ms sleep and manual retry around
callCoreRpc('openhuman.memory_recall_memories') with a poll-based eventual
assertion using Playwright's expect.poll: call callCoreRpc inside expect.poll
until the serialized recall result does not include TEST_KEY or TEST_CONTENT
(bounded by a reasonable timeout like 10s and a small interval), then assert the
absence; update the block that currently sets recallAfterForget/recalled/retry
to use expect.poll and remove the manual setTimeout and extra RPC call so the
test waits deterministically for cleanup.
app/test/playwright/specs/skill-lifecycle.spec.ts-24-30 (1)

24-30: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix false-positive skills assertion.

Array.isArray(payload.skills ?? []) passes even when skills is absent, so this test can’t catch regressions in the RPC shape.

Suggested fix
-    expect(Array.isArray(payload.skills ?? [])).toBe(true);
+    expect('skills' in payload).toBe(true);
+    expect(Array.isArray(payload.skills)).toBe(true);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/skill-lifecycle.spec.ts` around lines 24 - 30, The
assertion currently uses Array.isArray(payload.skills ?? []) which returns true
when skills is missing; update the test to explicitly assert that the RPC
payload contains a defined skills property and that it is an array: after
obtaining rpcResult/payload from callCoreRpc, first assert that payload
hasOwnProperty('skills') (or 'skills' in payload) and that payload.skills is not
undefined/null, then assert Array.isArray(payload.skills). Locate the variables
rpcResult, payload and the callCoreRpc invocation in skill-lifecycle.spec.ts to
make this change.
app/scripts/e2e-web-build.sh-9-18 (1)

9-18: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Load dotenv before deriving endpoint env vars.

VITE_BACKEND_URL and VITE_OPENHUMAN_CORE_RPC_URL are computed before .env is sourced, so .env overrides for E2E_MOCK_PORT/OPENHUMAN_CORE_PORT are ignored.

Suggested fix
-export VITE_BACKEND_URL="http://127.0.0.1:${E2E_MOCK_PORT:-18473}"
-export VITE_OPENHUMAN_TARGET="web"
-export VITE_OPENHUMAN_E2E_DEFAULT_CORE_MODE="cloud"
-export VITE_OPENHUMAN_E2E_RESTART_APP_AS_RELOAD="true"
-export VITE_OPENHUMAN_CORE_RPC_URL="http://127.0.0.1:${OPENHUMAN_CORE_PORT:-17788}/rpc"
-
 if [ -f "$REPO_ROOT/.env" ]; then
   # shellcheck source=/dev/null
   source "$REPO_ROOT/scripts/load-dotenv.sh"
 fi
+
+export VITE_BACKEND_URL="http://127.0.0.1:${E2E_MOCK_PORT:-18473}"
+export VITE_OPENHUMAN_TARGET="web"
+export VITE_OPENHUMAN_E2E_DEFAULT_CORE_MODE="cloud"
+export VITE_OPENHUMAN_E2E_RESTART_APP_AS_RELOAD="true"
+export VITE_OPENHUMAN_CORE_RPC_URL="http://127.0.0.1:${OPENHUMAN_CORE_PORT:-17788}/rpc"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/scripts/e2e-web-build.sh` around lines 9 - 18, The exports for
VITE_BACKEND_URL and VITE_OPENHUMAN_CORE_RPC_URL are computed before sourcing
the repo .env, so E2E_MOCK_PORT and OPENHUMAN_CORE_PORT overrides are ignored;
move the dotenv load (the source "$REPO_ROOT/scripts/load-dotenv.sh" guarded by
the -f check) to before those exports or re-evaluate the two export assignments
after sourcing so that values from .env (E2E_MOCK_PORT, OPENHUMAN_CORE_PORT) are
respected; update the script around the VITE_BACKEND_URL and
VITE_OPENHUMAN_CORE_RPC_URL assignments to source the .env first (using
REPO_ROOT and load-dotenv.sh) or recompute the variables after sourcing.
app/test/playwright/specs/chat-tool-error-recovery.spec.ts-13-27 (1)

13-27: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate mock admin POST responses in test helpers

resetMock and setMockBehavior call fetch but never check response.ok, so a failed mock setup can go unnoticed and make the test non-deterministic.

+async function postMockAdmin(path: string, payload: { key?: string; value?: string } | {}): Promise<void> {
+  const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify(payload),
+  });
+  if (!response.ok) {
+    throw new Error(`mock admin ${path} failed with HTTP ${response.status}`);
+  }
+}
+
 async function resetMock(): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
-    method: 'POST',
-    headers: { 'Content-Type': 'application/json' },
-    body: JSON.stringify({}),
-  });
+  await postMockAdmin('/__admin/reset', {});
 }
 
 async function setMockBehavior(key: string, value: string): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
-    method: 'POST',
-    headers: { 'Content-Type': 'application/json' },
-    body: JSON.stringify({ key, value }),
-  });
+  await postMockAdmin('/__admin/behavior', { key, value });
 }

File: app/test/playwright/specs/chat-tool-error-recovery.spec.ts (lines 13-27)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/chat-tool-error-recovery.spec.ts` around lines 13 -
27, The helpers resetMock and setMockBehavior currently fire fetch requests
without verifying results; update both functions to capture the fetch response
(e.g., const res = await fetch(...)), check res.ok, and if false throw an Error
containing the status and response text (await res.text()) so test setup
failures fail fast and surface server error details; keep the same function
names resetMock and setMockBehavior and preserve their request bodies and
headers.
app/test/playwright/specs/rewards-unlock-flow.spec.ts-11-25 (1)

11-25: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when mock admin POST helpers return non-2xx responses.

setRewardsScenario and resetMock await fetch(...) but never check response.ok/status. If the mock admin endpoints return 4xx/5xx, stale state can make the suite flaky or misleading.

Proposed fix
+async function postMockAdmin(path: string, body: unknown): Promise<void> {
+  const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify(body),
+  });
+  if (!response.ok) {
+    throw new Error(`mock admin ${path} failed with HTTP ${response.status}`);
+  }
+}
+
 async function setRewardsScenario(value: string): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
-    method: 'POST',
-    headers: { 'Content-Type': 'application/json' },
-    body: JSON.stringify({ key: 'rewardsScenario', value }),
-  });
+  await postMockAdmin('/__admin/behavior', { key: 'rewardsScenario', value });
 }
 
 async function resetMock(): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
-    method: 'POST',
-    headers: { 'Content-Type': 'application/json' },
-    body: JSON.stringify({}),
-  });
+  await postMockAdmin('/__admin/reset', {});
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/rewards-unlock-flow.spec.ts` around lines 11 - 25,
The helpers setRewardsScenario and resetMock currently ignore HTTP errors from
fetch; update both functions to check the fetch Response (e.g., response.ok)
after awaiting fetch and throw a descriptive error when not ok, including
response.status and response.text() (or JSON) to aid debugging so test failures
fail-fast instead of silently proceeding with stale state.
app/test/playwright/specs/webhooks-ingress-flow.spec.ts-36-61 (1)

36-61: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not swallow assertion failures in the RPC contract test.

The broad catch {} masks both transport errors and failed expectations, so regressions can pass green.

Proposed fix
-    } catch {
-      // Router initialization is socket-backed and can be absent in this lane.
-      // The load-bearing part is that the read-only surface above remains stable.
+    } catch (error) {
+      // Allow only known lane-availability failures; rethrow everything else,
+      // especially assertion errors.
+      if (
+        error instanceof Error &&
+        /socket|not connected|failed with HTTP (404|503)/i.test(error.message)
+      ) {
+        return;
+      }
+      throw error;
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/webhooks-ingress-flow.spec.ts` around lines 36 -
61, The test currently swallows all errors in the broad catch, hiding failed
expectations; update the catch to only suppress known transport/socket-init
errors and rethrow any assertion failures or unexpected errors. Specifically,
change the anonymous catch after the callCoreRpc calls (webhooks_register_echo,
webhooks_clear_logs, webhooks_unregister_echo) to catch (err) and if the error
is an assertion/failure (e.g., err.name === 'AssertionError' or the test
framework's assertion error type) or otherwise not a transport/socket
initialization error, rethrow it; only swallow/ignore errors that match the
transient transport/socket conditions (connection refused, socket not
initialized, tunnel missing) so real test regressions fail loudly. Ensure
callCoreRpc call sites remain unchanged and only the catch logic is tightened.
.github/workflows/e2e.yml-37-37 (1)

37-37: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin CI supply-chain dependencies (GitHub Actions + container) to immutable SHAs/digests.

In .github/workflows/e2e.yml, the Playwright job uses mutable references:

  • image: ghcr.io/tinyhumansai/openhuman_ci:latest (line 37)
  • uses: actions/checkout@v5 (line 41), actions/cache@v5 (line 48), Swatinem/rust-cache@v2 (line 56), actions/upload-artifact@v5 (line 84)

Replace each uses: with owner/repo@<commit-sha> and the container with ghcr.io/...@sha256:<digest>.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/e2e.yml at line 37, Replace mutable references with
immutable SHAs/digests: change the container declaration image:
ghcr.io/tinyhumansai/openhuman_ci:latest to the image digest form
ghcr.io/tinyhumansai/openhuman_ci@sha256:<digest>, and replace each uses: entry
(actions/checkout@v5, actions/cache@v5, Swatinem/rust-cache@v2,
actions/upload-artifact@v5) with the corresponding owner/repo@<commit-sha> (pin
to the exact commit SHA for each action). Update the workflow file so the
Playwright job references the resolved image sha256 digest and exact commit SHAs
for the listed actions to guarantee an immutable supply chain.
app/test/playwright/specs/skills-registry.spec.ts-1-1 (1)

1-1: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix Playwright page typing in skills-registry spec
Line 10 incorrectly derives the page fixture type from Parameters<typeof test>[0] (the test title parameter); use Playwright’s Page type directly.

Suggested patch
-import { expect, test } from '`@playwright/test`';
+import { expect, test, type Page } from '`@playwright/test`';
@@
-async function openSkillsPage(page: Parameters<typeof test>[0]['page'], userId: string) {
+async function openSkillsPage(page: Page, userId: string) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/skills-registry.spec.ts` at line 1, The test
currently types the `page` fixture incorrectly using `Parameters<typeof
test>[0]`; import Playwright's `Page` type from '`@playwright/test`' and use that
type for the `page` variable/fixture instead (replace any `Parameters<typeof
test>[0]` usage with `Page`), updating the import line to include `Page`
alongside `expect` and `test` so the `page` parameter in the spec has the
correct Playwright type.
app/src/pages/WebCallbackPage.tsx-1-4 (1)

1-4: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Route callback UI copy must be translated via useT().

This component renders user-visible strings as hard-coded literals. In app/src, these must come from useT() keys and be added to app/src/lib/i18n/en.ts.

Suggested fix
 import { useEffect } from 'react';
 import { useLocation, useParams } from 'react-router-dom';
 
+import { useT } from '../lib/i18n/I18nContext';
 import { handleDeepLinkUrls } from '../utils/desktopDeepLinkListener';
@@
 export default function WebCallbackPage() {
+  const t = useT();
   const { kind, status } = useParams();
   const location = useLocation();
@@
         <h1 className="text-2xl font-semibold text-slate-900">Completing sign-in</h1>
         <p className="text-sm text-slate-600">
           OpenHuman is processing your callback and will continue automatically.
         </p>
+        <h1 className="text-2xl font-semibold text-slate-900">
+          {t('webCallback.completingSignIn.title')}
+        </h1>
+        <p className="text-sm text-slate-600">
+          {t('webCallback.completingSignIn.description')}
+        </p>
       </div>
     </div>
   );
 }

As per coding guidelines: "app/src/**/*.{ts,tsx}: Every user-visible string ... must go through useT() ... Add the new key to app/src/lib/i18n/en.ts in the same PR."

Also applies to: 32-38

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/pages/WebCallbackPage.tsx` around lines 1 - 4, The WebCallbackPage
component currently renders hard-coded user-facing strings; update the component
to use the i18n hook useT() (import and call useT inside the component) and
replace each literal shown (including the strings around lines 32-38) with
useT('your.key.here') lookups, then add corresponding keys and English text into
app/src/lib/i18n/en.ts; ensure keys are descriptive (e.g., webCallback.success,
webCallback.error) and that all visible literals in WebCallbackPage are replaced
with translations in the same PR.
app/test/playwright/specs/chat-harness-wallet-flow.spec.ts-54-68 (1)

54-68: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when mock-admin setup calls return non-2xx.

resetMock/setMockBehavior ignore HTTP status, so setup can silently fail and leave stale state behind. This makes the spec flaky and can produce misleading passes/failures.

Suggested fix
 async function resetMock(): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
+  const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({}),
   });
+  if (!response.ok) {
+    throw new Error(`mock reset failed: ${response.status}`);
+  }
 }
 
 async function setMockBehavior(key: string, value: string): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
+  const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({ key, value }),
   });
+  if (!response.ok) {
+    throw new Error(`mock behavior update failed: ${response.status}`);
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/chat-harness-wallet-flow.spec.ts` around lines 54 -
68, resetMock and setMockBehavior currently ignore HTTP status so setup can
silently fail; update both functions (resetMock and setMockBehavior) to capture
the fetch Response, check response.ok, and throw a descriptive error (including
status and response text/body) when not ok so the test fails fast and surfaces
the mock-admin failure for debugging.
app/test/playwright/specs/onboarding-modes.spec.ts-52-85 (1)

52-85: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Current assertions can self-heal failed onboarding flows and hide regressions.

These helpers mutate completion state and force navigation, allowing green tests even when onboarding progression fails.

Suggested fix
 async function expectOnboardingCompleted(): Promise<void> {
   const readValue = async (): Promise<boolean> => {
     const completed = await callCoreRpc<boolean | { result?: boolean }>(
       'openhuman.config_get_onboarding_completed',
       {}
     );
     return typeof completed === 'boolean'
       ? completed
       : Boolean((completed as { result?: boolean }).result);
   };
 
-  let value = await readValue();
-  if (!value) {
-    await callCoreRpc('openhuman.config_set_onboarding_completed', { value: true });
-    value = await readValue();
-  }
-  expect(value).toBe(true);
+  expect(await readValue()).toBe(true);
 }
 
-async function ensureHomeOrForceComplete(page: Page): Promise<void> {
-  const reachedHome = await expect
-    .poll(async () => page.evaluate(() => window.location.hash), { timeout: 20_000 })
-    .toMatch(/^#\/home/)
-    .then(
-      () => true,
-      () => false
-    );
-
-  if (reachedHome) return;
-
-  await callCoreRpc('openhuman.config_set_onboarding_completed', { value: true });
-  await page.goto('/#/home');
-  await waitForAppReady(page);
+async function ensureHome(page: Page): Promise<void> {
+  await expect
+    .poll(async () => page.evaluate(() => window.location.hash), { timeout: 20_000 })
+    .toMatch(/^#\/home/);
 }
// Also update call sites in this file:
await ensureHome(page);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/onboarding-modes.spec.ts` around lines 52 - 85, The
helpers currently mutate onboarding state and force navigation which masks
regressions; remove those mutations so failures surface. Update
expectOnboardingCompleted to only read via
callCoreRpc('openhuman.config_get_onboarding_completed') and assert the value is
true (do not call 'openhuman.config_set_onboarding_completed'). Change
ensureHomeOrForceComplete into ensureHome that only polls page.evaluate(() =>
window.location.hash) and fails the test if the hash never matches /^#\/home/
(do not call 'openhuman.config_set_onboarding_completed' or page.goto). Update
all call sites in this file to call ensureHome(page) instead of the old helper.
app/test/playwright/specs/onboarding-modes.spec.ts-41-43 (1)

41-43: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t suppress mock reset failures during test boot.

Ignoring resetMock errors makes failures non-deterministic and can leak prior test state.

Suggested fix
 async function bootIntoOnboarding(page: Page, userId: string): Promise<void> {
-  await resetMock().catch(() => undefined);
+  await resetMock();
   await bootAuthenticatedPage(page, userId, '/home');
   await callCoreRpc('openhuman.config_set_onboarding_completed', { value: false });
   await page.goto('/#/onboarding/welcome');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/onboarding-modes.spec.ts` around lines 41 - 43, The
helper bootIntoOnboarding currently swallows resetMock errors (await
resetMock().catch(() => undefined)), which hides failures and can leak state;
change it to await resetMock() without the catch so errors propagate and fail
the test (or explicitly catch and rethrow a descriptive error). Update the
bootIntoOnboarding function (and any callers) to allow resetMock failures to
surface, or move resetMock into a test-level beforeEach that throws on error,
ensuring deterministic cleanup.
app/test/playwright/specs/harness-channel-bridge-flow.spec.ts-13-27 (1)

13-27: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Harden mock-admin setup calls to fail on unsuccessful responses.

These setup helpers ignore HTTP failure status, so test state can be left partially configured without an immediate, actionable failure.

Suggested fix
+async function adminFetch(path: string, init?: RequestInit): Promise<void> {
+  const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, init);
+  if (!response.ok) {
+    throw new Error(`mock admin request failed: ${response.status} ${path}`);
+  }
+}
+
 async function resetMock(): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
+  await adminFetch('/__admin/reset', {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({}),
   });
 }
 
 async function setMockBehavior(key: string, value: string): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
+  await adminFetch('/__admin/behavior', {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({ key, value }),
   });
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/harness-channel-bridge-flow.spec.ts` around lines
13 - 27, The helper functions resetMock and setMockBehavior call the mock-admin
endpoints but ignore non-OK HTTP responses; update both functions to check the
fetch Response (e.g., response.ok or response.status) and throw a descriptive
error (including status and body/text) when the request fails so tests fail fast
and show the underlying problem.
app/test/playwright/specs/harness-composio-tool-flow.spec.ts-18-38 (1)

18-38: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast on mock-admin HTTP failures in test setup helpers.

resetMock, setMockBehavior, and requests currently proceed even on non-2xx responses, which can leak stale mock state and make failures harder to diagnose.

Suggested fix
+async function adminFetch(path: string, init?: RequestInit): Promise<Response> {
+  const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, init);
+  if (!response.ok) {
+    throw new Error(`mock admin request failed: ${response.status} ${path}`);
+  }
+  return response;
+}
+
 async function resetMock(): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
+  await adminFetch('/__admin/reset', {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({}),
   });
 }
 
 async function setMockBehavior(key: string, value: string): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
+  await adminFetch('/__admin/behavior', {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({ key, value }),
   });
 }
 
 async function requests(): Promise<MockRequest[]> {
-  const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/requests`);
+  const response = await adminFetch('/__admin/requests');
   const payload = (await response.json()) as { data?: MockRequest[] };
   return Array.isArray(payload.data) ? payload.data : [];
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/harness-composio-tool-flow.spec.ts` around lines 18
- 38, The helpers resetMock, setMockBehavior, and requests should fail fast on
non-2xx HTTP responses from the mock admin: after each fetch call in resetMock,
setMockBehavior, and requests check response.ok and if false throw an Error
containing the endpoint, response.status, and response text (or JSON) to surface
the failure; for requests also return an empty array only after a successful
response, otherwise throw with the same detailed message so stale mock state
cannot be used.
app/test/playwright/specs/harness-cron-prompt-flow.spec.ts-18-38 (1)

18-38: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add explicit HTTP status checks to mock-admin helpers.

Without status validation, failed reset/behavior/request-log calls can silently continue and invalidate test expectations.

Suggested fix
+async function adminFetch(path: string, init?: RequestInit): Promise<Response> {
+  const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, init);
+  if (!response.ok) {
+    throw new Error(`mock admin request failed: ${response.status} ${path}`);
+  }
+  return response;
+}
+
 async function resetMock(): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
+  await adminFetch('/__admin/reset', {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({}),
   });
 }
 
 async function setMockBehavior(key: string, value: string): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
+  await adminFetch('/__admin/behavior', {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({ key, value }),
   });
 }
 
 async function requests(): Promise<MockRequest[]> {
-  const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/requests`);
+  const response = await adminFetch('/__admin/requests');
   const payload = (await response.json()) as { data?: MockRequest[] };
   return Array.isArray(payload.data) ? payload.data : [];
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/harness-cron-prompt-flow.spec.ts` around lines 18 -
38, The helper functions resetMock, setMockBehavior, and requests currently
ignore HTTP response status and can silently fail; update each to validate the
fetch response (e.g., check response.ok or response.status) after the await
fetch call and throw a descriptive error including status and response body/text
when not OK so tests fail fast and show the failing mock-admin response; for
requests also guard the JSON parse by validating the response before calling
response.json() and surface any non-200 responses as errors.
app/test/playwright/specs/chat-conversation-history.spec.ts-146-153 (1)

146-153: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Assert request payload content, not just request count.

This test claims earlier turns are included in the second LLM request, but it only checks that one completion request exists. It should validate the posted messages include prior and current prompts.

🧪 Suggested fix
-    const llmLog = await expect
-      .poll(async () => {
-        const log = await requests();
-        return log.filter(
-          entry => entry.method === 'POST' && entry.url.includes('/openai/v1/chat/completions')
-        );
-      })
-      .toHaveLength(1);
-
-    void llmLog;
+    await expect
+      .poll(async () => {
+        const log = await requests();
+        return log.filter(
+          entry => entry.method === 'POST' && entry.url.includes('/openai/v1/chat/completions')
+        );
+      })
+      .toHaveLength(1);
+
+    const llmRequest = (await requests()).find(
+      entry => entry.method === 'POST' && entry.url.includes('/openai/v1/chat/completions')
+    );
+    const body = JSON.parse(llmRequest?.body ?? '{}') as {
+      messages?: Array<{ content?: string }>;
+    };
+    const transcript = (body.messages ?? []).map(message => message.content ?? '').join('\n');
+    expect(transcript).toContain(FIRST_PROMPT);
+    expect(transcript).toContain(SECOND_PROMPT);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/chat-conversation-history.spec.ts` around lines 146
- 153, The test currently only asserts a single POST to
/openai/v1/chat/completions (using expect.poll and requests()), but must also
verify the request body contains the expected conversation messages; update the
block that captures llmLog (the await expect.poll(...) that calls requests() and
filters entries by entry.method and entry.url) to inspect the matched entry's
request payload (e.g., entry.request.postData or entry.request.body) and assert
that its messages array includes both the earlier turns and the current prompt
strings used in this spec; reference llmLog, requests(), and the filtered entry
to locate where to add assertions for message content.
app/test/playwright/specs/chat-tool-call-flow.spec.ts-32-52 (1)

32-52: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast on mock-admin HTTP failures in test setup helpers.

resetMock, setMockBehavior, and requests currently ignore non-2xx responses. That can leave tests running with invalid mock state and produce opaque failures later.

Suggested patch
+async function mockFetch(path: string, init?: RequestInit): Promise<Response> {
+  const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, init);
+  if (!response.ok) {
+    throw new Error(
+      `mock admin ${init?.method ?? 'GET'} ${path} failed with HTTP ${response.status}`
+    );
+  }
+  return response;
+}
+
 async function resetMock(): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, {
+  await mockFetch('/__admin/reset', {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({}),
   });
 }
 
 async function setMockBehavior(key: string, value: string): Promise<void> {
-  await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, {
+  await mockFetch('/__admin/behavior', {
     method: 'POST',
     headers: { 'Content-Type': 'application/json' },
     body: JSON.stringify({ key, value }),
   });
 }
 
 async function requests(): Promise<MockRequest[]> {
-  const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/requests`);
+  const response = await mockFetch('/__admin/requests');
   const payload = (await response.json()) as { data?: MockRequest[] };
   return Array.isArray(payload.data) ? payload.data : [];
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/chat-tool-call-flow.spec.ts` around lines 32 - 52,
The helper functions resetMock, setMockBehavior and requests do not check HTTP
responses and silently continue on non-2xx codes; update each to assert
response.ok and throw a descriptive error containing the status and response
text (or parsed body) when not ok so the test fails fast; specifically, in
resetMock and setMockBehavior after the fetch check response.ok and if false
read await response.text() (or json if applicable) and throw a new Error
including the endpoint, status and body, and in requests do the same before
attempting response.json() so you never return an invalid payload (also
propagate any fetch exceptions with added context about the function and
endpoint).
app/test/playwright/helpers/core-rpc.ts-23-41 (1)

23-41: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add an explicit timeout for core RPC fetches.

callCoreRpc is unbounded today. When the core endpoint stalls, this can hang a worker for a long period and mask root cause.

Suggested patch
 export async function callCoreRpc<T>(
   method: string,
   params: Record<string, unknown> = {}
 ): Promise<T> {
-  const response = await fetch(CORE_RPC_URL, {
-    method: 'POST',
-    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${CORE_RPC_TOKEN}` },
-    body: JSON.stringify({ jsonrpc: '2.0', id: nextRpcId++, method, params }),
-  });
+  const controller = new AbortController();
+  const timer = setTimeout(() => controller.abort(), 15_000);
+  let response: Response;
+  try {
+    response = await fetch(CORE_RPC_URL, {
+      method: 'POST',
+      headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${CORE_RPC_TOKEN}` },
+      body: JSON.stringify({ jsonrpc: '2.0', id: nextRpcId++, method, params }),
+      signal: controller.signal,
+    });
+  } catch (error) {
+    if (controller.signal.aborted) {
+      throw new Error(`RPC ${method} timed out after 15000ms`);
+    }
+    throw error;
+  } finally {
+    clearTimeout(timer);
+  }
 
   if (!response.ok) {
     throw new Error(`RPC ${method} failed with HTTP ${response.status}`);
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/helpers/core-rpc.ts` around lines 23 - 41, callCoreRpc
currently issues an unbounded fetch to CORE_RPC_URL and can hang; update
callCoreRpc to use an AbortController with a configurable timeout (e.g., 10_000
ms) so the request is aborted after the timeout, clear the timeout on success,
and handle the abort error to throw a descriptive error (e.g., "RPC <method>
timed out"); apply changes around the fetch call and error handling involving
callCoreRpc, CORE_RPC_URL, CORE_RPC_TOKEN, nextRpcId so the existing JSON-RPC
logic and response.ok / payload.error checks remain intact.
app/scripts/e2e-web-session.sh-17-21 (1)

17-21: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid deleting caller-provided workspace paths.

Line 19 marks any /tmp/* workspace as script-owned, so cleanup can remove a user-supplied existing directory at Lines 43-45. That can wipe unrelated temp data.

Proposed fix
-OPENHUMAN_WORKSPACE="${OPENHUMAN_WORKSPACE:-$(mktemp -d)}"
-CREATED_TEMP_WORKSPACE=""
-if [ ! -d "${OPENHUMAN_WORKSPACE}" ] || [[ "${OPENHUMAN_WORKSPACE}" == /tmp/* ]]; then
-  CREATED_TEMP_WORKSPACE="$OPENHUMAN_WORKSPACE"
-fi
+CREATED_TEMP_WORKSPACE=""
+if [ -z "${OPENHUMAN_WORKSPACE:-}" ]; then
+  OPENHUMAN_WORKSPACE="$(mktemp -d)"
+  CREATED_TEMP_WORKSPACE="$OPENHUMAN_WORKSPACE"
+fi
 export OPENHUMAN_WORKSPACE

Also applies to: 43-45

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/scripts/e2e-web-session.sh` around lines 17 - 21, The script currently
treats any /tmp/* path as script-owned and may delete a caller-provided
directory; fix by detecting whether OPENHUMAN_WORKSPACE was provided by the
caller before you default it and only mark CREATED_TEMP_WORKSPACE when this
script actually created the directory. Save a flag like
ORIGINAL_OPENHUMAN_WORKSPACE_SET (using parameter expansion to test if
OPENHUMAN_WORKSPACE was set) before assigning
OPENHUMAN_WORKSPACE="${OPENHUMAN_WORKSPACE:-$(mktemp -d)}", then set
CREATED_TEMP_WORKSPACE only when the original was unset or when the directory
did not exist and the script created it; update the cleanup logic that uses
CREATED_TEMP_WORKSPACE (currently around the cleanup at lines 43-45) to rely on
that flag so you never delete a caller-provided /tmp/* path.
🧹 Nitpick comments (11)
app/test/playwright/specs/chat-harness-cancel.spec.ts (1)

122-125: ⚡ Quick win

Replace fixed sleep with condition-based waiting.

waitForTimeout(3_500) is timing-sensitive and can still race on slower runners. Prefer polling for post-cancel UI state (e.g., cancel button disappears / stream status idle) before asserting late chunks are absent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/chat-harness-cancel.spec.ts` around lines 122 -
125, Replace the fixed sleep in the test (the call to
page.waitForTimeout(3_500)) with a condition-based wait: poll for a stable
post-cancel UI state such as the cancel button disappearing (use
page.waitForSelector for the cancel button to be hidden/removed) or the stream
status element reaching "idle" before asserting that each entry in LATE_PIECES
is absent via page.getByText(...).toHaveCount(0); ensure you wait for the chosen
selector/state to be settled so the subsequent LATE_PIECES checks won't race on
slow runners.
app/test/playwright/specs/chat-harness-send-stream.spec.ts (1)

132-137: ⚡ Quick win

Make the stream: true request-body assertion whitespace-tolerant.

The current substring check is brittle to JSON serialization formatting.

Proposed fix
         return log.some(
           entry =>
             entry.method === 'POST' &&
             entry.url.includes('/openai/v1/chat/completions') &&
-            entry.body?.includes('"stream":true')
+            /"stream"\s*:\s*true/.test(entry.body ?? '')
         );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/chat-harness-send-stream.spec.ts` around lines 132
- 137, Replace the brittle substring check that looks for '"stream":true' with a
whitespace-robust check: for each matching POST to '/openai/v1/chat/completions'
parse entry.body as JSON (or if parsing fails, fall back to a regex that ignores
whitespace) and assert that the parsed object's stream property is strictly
true; update the arrow function that currently references
entry.body?.includes('"stream":true') to use JSON.parse(entry.body) and verify
obj.stream === true (with a try/catch fallback to a /\b"stream"\s*:\s*true\b/
style regex if needed).
app/test/playwright/specs/tool-shell-git-flow.spec.ts (1)

119-130: ⚡ Quick win

Strengthen the denial-envelope assertions to match the test intent.

The current checks only assert “throws”, so envelope regressions can still pass unnoticed. Assert a stable error signature (code/message fragment) for each invalid input case.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/tool-shell-git-flow.spec.ts` around lines 119 -
130, Replace the broad rejects.toThrow() checks with assertions that verify a
stable error signature for each failing call to
callCoreRpc('openhuman.memory_write_file'): for the call with only { content:
'no path provided' } assert the rejection contains the expected error
code/message fragment indicating "missing path" (or the specific error code your
RPC returns), and for the call with relative_path '../shell-restriction-967.txt'
assert the rejection contains the expected security/path validation error
code/message fragment (e.g., "invalid path" or "path traversal"). Locate the two
assertions in tool-shell-git-flow.spec.ts around the test 'denial envelope is
structurally consistent for invalid write args' and replace the generic toThrow
checks with rejects assertions that match the specific error shape or message
fragment returned by your RPC (using the RPC's error code or stable message
text).
app/test/playwright/specs/connector-youtube.spec.ts (2)

1-1: ⚡ Quick win

Use import type for Page to match the TypeScript rule.

Switch Page to a dedicated type-only import statement for consistency with the repo standard.

As per coding guidelines **/*.{ts,tsx}: Use import type for type-only imports in TypeScript files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/connector-youtube.spec.ts` at line 1, The import
currently brings Page as a value import; change it to a type-only import to
satisfy the TypeScript rule: replace the combined import "import { expect, type
Page, test } from '`@playwright/test`';" with separate imports so that Page is
imported via "import type { Page } from '`@playwright/test`';" and keep expect and
test as regular imports (or use "import { expect, test } from
'`@playwright/test`';") in the connector-youtube.spec.ts file to ensure Page is a
type-only import.

19-168: 🏗️ Heavy lift

Extract shared connector-test harness helpers into a common module.

This helper block is effectively duplicated across multiple connector specs, which will drift over time. Please centralize it (mock admin calls, skills boot/reload, session assertion, modal open, connection unwrap) in a shared Playwright helper and keep connector-specific constants in each spec.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/connector-youtube.spec.ts` around lines 19 - 168,
The test file duplicates a common connector-test harness; extract the shared
helpers into a centralized Playwright helper module and keep only
connector-specific constants in each spec. Move functions mockFetch, resetMock,
setMockBehavior, getRequestLog, seedConnector, bootSkillsPage, reloadSkills,
ensureComposioSurface, assertSessionNotNuked, openConnectorModal, and
unwrapConnections into a new shared helper file, export them, and update the
connector-youtube.spec.ts to import and use those helpers (leaving TOOLKIT_SLUG,
CONNECTION_ID, CONNECTOR_NAME, and other spec-specific constants in the spec).
Ensure all references and imports are updated and tests still call the same
exported function names so behavior is unchanged.
app/test/playwright/specs/accounts-provider-modal.spec.ts (1)

22-22: ⚡ Quick win

Replace inline import('...').Page annotations with a top-level import type.

Using Page via import('@playwright/test').Page in signatures violates the repo’s TS type-import rule and reduces readability. Add import type { Page } from '@playwright/test' once and use Page directly in these helpers.

As per coding guidelines **/*.{ts,tsx}: Use import type for type-only imports in TypeScript files.

Also applies to: 38-38, 47-47, 64-64

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/accounts-provider-modal.spec.ts` at line 22,
Replace the inline parameter type annotations like
import('`@playwright/test`').Page used in helper signatures (e.g.,
openAddAccountModal and the other helpers in the same file) with a top-level
type-only import: add "import type { Page } from '`@playwright/test`'" at the top
of the file and change each function signature to use "Page" directly; remove
the inline import forms so the file follows the repo's import type rule.
app/test/playwright/specs/connector-confluence.spec.ts (1)

1-1: ⚡ Quick win

Apply import type for Page here as well.

Use a dedicated import type { Page } line instead of inline type import syntax to keep this file aligned with the TS import rule used in this repo.

As per coding guidelines **/*.{ts,tsx}: Use import type for type-only imports in TypeScript files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/connector-confluence.spec.ts` at line 1, Replace
the inline type import for Page with a dedicated type-only import: change the
combined import that currently uses "type Page" inline to instead keep the
runtime imports (expect, test) on the first line and add a separate "import type
{ Page }" line; update the import statement referencing Page so the file uses an
explicit import type for the Page symbol to comply with the repo's TS rule.
app/test/playwright/specs/connector-notion.spec.ts (1)

1-1: ⚡ Quick win

Use a dedicated type-only import for Page.

Please switch to import type { Page } to align with the TypeScript import convention enforced in this repository.

As per coding guidelines **/*.{ts,tsx}: Use import type for type-only imports in TypeScript files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/connector-notion.spec.ts` at line 1, The import
currently brings Page as a value; change it to a type-only import by replacing
the import for Page with an "import type { Page }" form so TypeScript treats
Page as a type-only import (adjust the existing import line that currently reads
"import { expect, type Page, test }" to separate/convert Page into a dedicated
type-only import referencing Page).
app/test/playwright/specs/slack-flow.spec.ts (1)

1-1: ⚡ Quick win

Align Page import with repo import type policy.

Replace inline type Page usage with a dedicated import type { Page } ... statement.

As per coding guidelines **/*.{ts,tsx}: Use import type for type-only imports in TypeScript files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/slack-flow.spec.ts` at line 1, The current import
mixes a type with value imports; update the import to use a dedicated type-only
import for Page by removing "type Page" from the named import and adding a
separate "import type { Page } from '`@playwright/test`';", keeping value imports
(expect, test) in the original import so files like slack-flow.spec.ts follow
the repo's import type policy.
app/test/playwright/specs/connector-jira.spec.ts (1)

19-169: 🏗️ Heavy lift

Extract shared Composio connector test harness helpers to reduce drift.

The mock-admin helpers and skills/session bootstrap logic here are effectively duplicated across connector specs in this PR (e.g., Discord and Google Sheets). A shared helper/factory would prevent behavior skew and make future fixes one-shot.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/connector-jira.spec.ts` around lines 19 - 169,
Extract the duplicated Composio test helpers (mockFetch, resetMock,
setMockBehavior, getRequestLog, seedConnector, bootSkillsPage, reloadSkills,
ensureComposioSurface, assertSessionNotNuked, openModal,
waitForDisconnectedCard, unwrapConnections and any associated constants like
TOOLKIT_SLUG/CONNECTION_ID/CONNECTOR_NAME) into a shared test harness module and
import it from each connector spec; move the HTTP mock admin logic and
page-bootstrap flow into factory/helper functions (e.g., export mockAdminClient
with methods reset, setBehavior, getRequests and export pageBoot helpers like
bootComposioSkillsPage and ensureComposioSurface) so other specs (Discord,
Google Sheets) can call the same implementations, update imports in
connector-jira.spec.ts to use the new module, and remove the duplicated local
functions to keep a single source of truth.
app/test/playwright/specs/settings-advanced-config.spec.ts (1)

41-50: ⚡ Quick win

Avoid index-coupled provider toggle selection.

providerEnabledToggle depends on row order, so a layout reorder can click the wrong provider toggle. Prefer locating the provider row by provider label/test id, then selecting its Enabled checkbox within that row.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/settings-advanced-config.spec.ts` around lines 41 -
50, providerEnabledToggle currently selects the nth "Enabled" checkbox based on
a fixed providerOrder, which breaks if rows reflow; update providerEnabledToggle
to locate the provider's row by a stable identifier (e.g., provider label text
or a data-testid for the provider) and then query for the "Enabled" checkbox
scoped inside that row (use the Page.getByRole/getByTestId on the row element
then .getByRole('checkbox', { name: 'Enabled' }) within it) so selection is
based on the provider row rather than a hardcoded index.

@@ -0,0 +1,41 @@
import { expect, test } from '@playwright/test';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify inline import type-query usage in Playwright TS files
rg -nP ":\s*import\('[^']+'\)\.[A-Za-z_]\w+" app/test/playwright/**/*.ts

Repository: tinyhumansai/openhuman

Length of output: 1728


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="app/test/playwright/specs/command-palette.spec.ts"
echo "---- $FILE (head 80) ----"
nl -ba "$FILE" | sed -n '1,120p'

Repository: tinyhumansai/openhuman

Length of output: 197


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="app/test/playwright/specs/command-palette.spec.ts"
echo "---- $FILE (1-120) ----"
sed -n '1,120p' "$FILE"

echo
echo "---- Inline import type-query matches in this file ----"
rg -nP ":\s*import\('[^']+'\)\.[A-Za-z_]\w+" "$FILE" || true

echo
echo "---- Current import line in this file ----"
rg -n "from '`@playwright/test`'" "$FILE" || true

Repository: tinyhumansai/openhuman

Length of output: 1848


Use import { ..., type Page, ... } instead of import('@playwright/test').Page in the function signature.

app/test/playwright/specs/command-palette.spec.ts has async function openPalette(page: import('@playwright/test').Page). Switch it to a type import for consistency with the repo’s TypeScript import rules.

Proposed fix
-import { expect, test } from '`@playwright/test`';
+import { expect, type Page, test } from '`@playwright/test`';
@@
-async function openPalette(page: import('`@playwright/test`').Page) {
+async function openPalette(page: Page) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/command-palette.spec.ts` at line 1, The function
signature uses an inline import type; change to a top-level type import by
updating the existing import line to include "type Page" (import { expect, test,
type Page } from '`@playwright/test`') and then update the function declaration
openPalette to use the imported Page type (async function openPalette(page:
Page)). Ensure only the type import form is used and remove the inline
import('`@playwright/test`').Page usage.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
e2e/docker-local-bootstrap.sh (1)

89-92: ⚡ Quick win

Avoid hard-coding the Playwright Chromium cache revision in e2e/docker-local-bootstrap.sh

e2e/docker-local-bootstrap.sh checks for ${HOME}/.cache/ms-playwright/chromium_headless_shell-1223/... before running pnpm --dir /workspace/app exec playwright install chromium-headless-shell; if Playwright updates the revision, this probe can miss an existing shell and trigger an unnecessary install.

🔧 Suggested change
-if [ ! -x "${HOME}/.cache/ms-playwright/chromium_headless_shell-1223/chrome-headless-shell-linux64/chrome-headless-shell" ]; then
+if ! compgen -G "${HOME}/.cache/ms-playwright/chromium_headless_shell-*/chrome-headless-shell-linux64/chrome-headless-shell" >/dev/null; then
   echo "[e2e-bootstrap] Installing Playwright Chromium headless shell..."
   pnpm --dir /workspace/app exec playwright install chromium-headless-shell >/dev/null
 fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@e2e/docker-local-bootstrap.sh` around lines 89 - 92, The current if-test
hardcodes the Playwright Chromium cache revision; change the existence check
around the if block that references the chrome-headless-shell path so it uses a
revision-agnostic search (e.g., glob or find) under
"${HOME}/.cache/ms-playwright" for an executable named "chrome-headless-shell"
(or the platform-specific subpath) instead of the fixed
"chromium_headless_shell-1223/..." string, and keep the existing pnpm --dir
/workspace/app exec playwright install chromium-headless-shell >/dev/null
invocation as the install fallback when no matching executable is found.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/test/playwright/specs/chat-harness-scroll-render.spec.ts`:
- Line 168: The fixed "-20" delta makes the assertion brittle; change the final
assertion to simply verify that afterScrollUp.scrollTop is less than
tags.scrollTop (i.e.,
expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop)), removing the
hardcoded 20 threshold so the test only checks that scrolling moved upward
toward targetTop using the existing afterScrollUp and tags.scrollTop variables.

---

Nitpick comments:
In `@e2e/docker-local-bootstrap.sh`:
- Around line 89-92: The current if-test hardcodes the Playwright Chromium cache
revision; change the existence check around the if block that references the
chrome-headless-shell path so it uses a revision-agnostic search (e.g., glob or
find) under "${HOME}/.cache/ms-playwright" for an executable named
"chrome-headless-shell" (or the platform-specific subpath) instead of the fixed
"chromium_headless_shell-1223/..." string, and keep the existing pnpm --dir
/workspace/app exec playwright install chromium-headless-shell >/dev/null
invocation as the install fallback when no matching executable is found.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 189da23c-d2f8-4233-bdf2-e5884bc52e2f

📥 Commits

Reviewing files that changed from the base of the PR and between 6be30e7 and e1f952a.

📒 Files selected for processing (8)
  • .github/workflows/e2e.yml
  • app/package.json
  • app/scripts/e2e-web-build.sh
  • app/scripts/e2e-web-session.sh
  • app/test/playwright/helpers/core-rpc.ts
  • app/test/playwright/specs/chat-harness-scroll-render.spec.ts
  • e2e/docker-compose.yml
  • e2e/docker-local-bootstrap.sh

});

expect(Math.abs(afterScrollUp.scrollTop - targetTop)).toBeLessThan(40);
expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop - 20);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Relax the fixed scroll delta check to reduce viewport-dependent flakes.

Line 168 can fail when initial scrollTop is small, even if scrolling worked. You already assert movement toward targetTop on Line 167, so this fixed -20 threshold is unnecessarily brittle.

✅ Suggested tweak
-      expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop - 20);
+      expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop - 20);
expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/test/playwright/specs/chat-harness-scroll-render.spec.ts` at line 168,
The fixed "-20" delta makes the assertion brittle; change the final assertion to
simply verify that afterScrollUp.scrollTop is less than tags.scrollTop (i.e.,
expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop)), removing the
hardcoded 20 threshold so the test only checks that scrolling moved upward
toward targetTop using the existing afterScrollUp and tags.scrollTop variables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature Net-new user-facing capability or product behavior. working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant