Migrate E2E coverage to Playwright#2610
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughSplit E2E into desktop (mega-flow) and Playwright web lanes; add Playwright config, helpers, many web specs, web build/session scripts; add scripts/ci-cancel-aware.sh and wrap CI build/test steps; remove pretest-e2e gating from release workflows; add WebCallbackPage and related tests; update auth/Telegram env handling. ChangesWeb E2E + CI
Web E2E scripts & config
App code & tests
Sequence Diagrams: skipped (changes are large and varied; no single 3+ component sequential flow added beyond existing CI orchestration). Estimated code review effort Possibly related PRs
Suggested reviewers
|
There was a problem hiding this comment.
Actionable comments posted: 1
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
🟠 Major comments (21)
app/src/AppRoutes.tsx-41-42 (1)
41-42:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRoute contract drift in
AppRoutes.tsx.
/callback/:kindand/callback/:kind/:statusextend routing beyond the allowedAppRoutes.tsxcontract. Please either fold this into the approved route map/pattern or update the route contract guideline in the same PR to keep architecture policy and implementation aligned.As per coding guidelines: “
app/src/AppRoutes.tsx: Routing:/.../settings/*. Default*→DefaultRedirect.”🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/AppRoutes.tsx` around lines 41 - 42, The two routes added in AppRoutes.tsx (Route path="/callback/:kind" and "/callback/:kind/:status" rendering WebCallbackPage) violate the approved route contract; either move these callback routes into the sanctioned route map/pattern (fold them under an allowed top-level route group and register them alongside the existing routes list in AppRoutes.tsx) or, if they are intentionally new top-level routes, update the route contract guideline in the same PR to include "/callback/:kind" and "/callback/:kind/:status"; change the implementation so WebCallbackPage remains the element for those paths and update any shared route registry or documentation entry used by AppRoutes.tsx to keep code and contract in sync.app/test/playwright/specs/chat-harness-cancel.spec.ts-23-37 (1)
23-37:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail fast when mock-admin calls fail.
resetMock/setMockBehaviorignore HTTP failures, so the test can proceed with stale backend state and produce flaky or misleading results. Checkresponse.okand throw with status/body context.🔧 Proposed fix
async function resetMock(): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { + const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({}), }); + if (!response.ok) { + const body = await response.text().catch(() => ''); + throw new Error(`resetMock failed: ${response.status} ${body}`); + } } async function setMockBehavior(key: string, value: string): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { + const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ key, value }), }); + if (!response.ok) { + const body = await response.text().catch(() => ''); + throw new Error(`setMockBehavior(${key}) failed: ${response.status} ${body}`); + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/chat-harness-cancel.spec.ts` around lines 23 - 37, The helper functions resetMock and setMockBehavior currently ignore HTTP errors; update both to capture the fetch Response, check response.ok, and if false read the response body (e.g., response.text()) and throw a new Error including the endpoint, response.status and the response body so failures fail fast and surface context for debugging.app/test/playwright/specs/memory-roundtrip.spec.ts-72-87 (1)
72-87:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReplace fixed 3s sleep with poll-based eventual assertion.
Line 78’s fixed delay can still flake when cleanup latency exceeds 3s. Use
expect.pollto wait until the canary is absent (bounded by timeout).Proposed fix
- const recallAfterForget = await callCoreRpc<unknown>('openhuman.memory_recall_memories', { - namespace: TEST_NAMESPACE, - limit: 10, - }); - let recalled = JSON.stringify(recallAfterForget ?? {}); - if (recalled.includes(TEST_KEY) || recalled.includes(TEST_CONTENT)) { - await new Promise(resolve => setTimeout(resolve, 3_000)); - const retry = await callCoreRpc<unknown>('openhuman.memory_recall_memories', { - namespace: TEST_NAMESPACE, - limit: 10, - }); - recalled = JSON.stringify(retry ?? {}); - } - expect(recalled.includes(TEST_KEY)).toBe(false); - expect(recalled.includes(TEST_CONTENT)).toBe(false); + await expect + .poll( + async () => { + const recall = await callCoreRpc<unknown>('openhuman.memory_recall_memories', { + namespace: TEST_NAMESPACE, + limit: 10, + }); + const recalled = JSON.stringify(recall ?? {}); + return recalled.includes(TEST_KEY) || recalled.includes(TEST_CONTENT); + }, + { timeout: 15_000, intervals: [500, 1_000, 2_000] } + ) + .toBe(false);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/memory-roundtrip.spec.ts` around lines 72 - 87, Replace the fixed 3_000ms sleep and manual retry around callCoreRpc('openhuman.memory_recall_memories') with a poll-based eventual assertion using Playwright's expect.poll: call callCoreRpc inside expect.poll until the serialized recall result does not include TEST_KEY or TEST_CONTENT (bounded by a reasonable timeout like 10s and a small interval), then assert the absence; update the block that currently sets recallAfterForget/recalled/retry to use expect.poll and remove the manual setTimeout and extra RPC call so the test waits deterministically for cleanup.app/test/playwright/specs/skill-lifecycle.spec.ts-24-30 (1)
24-30:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix false-positive
skillsassertion.
Array.isArray(payload.skills ?? [])passes even whenskillsis absent, so this test can’t catch regressions in the RPC shape.Suggested fix
- expect(Array.isArray(payload.skills ?? [])).toBe(true); + expect('skills' in payload).toBe(true); + expect(Array.isArray(payload.skills)).toBe(true);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/skill-lifecycle.spec.ts` around lines 24 - 30, The assertion currently uses Array.isArray(payload.skills ?? []) which returns true when skills is missing; update the test to explicitly assert that the RPC payload contains a defined skills property and that it is an array: after obtaining rpcResult/payload from callCoreRpc, first assert that payload hasOwnProperty('skills') (or 'skills' in payload) and that payload.skills is not undefined/null, then assert Array.isArray(payload.skills). Locate the variables rpcResult, payload and the callCoreRpc invocation in skill-lifecycle.spec.ts to make this change.app/scripts/e2e-web-build.sh-9-18 (1)
9-18:⚠️ Potential issue | 🟠 Major | ⚡ Quick winLoad dotenv before deriving endpoint env vars.
VITE_BACKEND_URLandVITE_OPENHUMAN_CORE_RPC_URLare computed before.envis sourced, so.envoverrides forE2E_MOCK_PORT/OPENHUMAN_CORE_PORTare ignored.Suggested fix
-export VITE_BACKEND_URL="http://127.0.0.1:${E2E_MOCK_PORT:-18473}" -export VITE_OPENHUMAN_TARGET="web" -export VITE_OPENHUMAN_E2E_DEFAULT_CORE_MODE="cloud" -export VITE_OPENHUMAN_E2E_RESTART_APP_AS_RELOAD="true" -export VITE_OPENHUMAN_CORE_RPC_URL="http://127.0.0.1:${OPENHUMAN_CORE_PORT:-17788}/rpc" - if [ -f "$REPO_ROOT/.env" ]; then # shellcheck source=/dev/null source "$REPO_ROOT/scripts/load-dotenv.sh" fi + +export VITE_BACKEND_URL="http://127.0.0.1:${E2E_MOCK_PORT:-18473}" +export VITE_OPENHUMAN_TARGET="web" +export VITE_OPENHUMAN_E2E_DEFAULT_CORE_MODE="cloud" +export VITE_OPENHUMAN_E2E_RESTART_APP_AS_RELOAD="true" +export VITE_OPENHUMAN_CORE_RPC_URL="http://127.0.0.1:${OPENHUMAN_CORE_PORT:-17788}/rpc"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/scripts/e2e-web-build.sh` around lines 9 - 18, The exports for VITE_BACKEND_URL and VITE_OPENHUMAN_CORE_RPC_URL are computed before sourcing the repo .env, so E2E_MOCK_PORT and OPENHUMAN_CORE_PORT overrides are ignored; move the dotenv load (the source "$REPO_ROOT/scripts/load-dotenv.sh" guarded by the -f check) to before those exports or re-evaluate the two export assignments after sourcing so that values from .env (E2E_MOCK_PORT, OPENHUMAN_CORE_PORT) are respected; update the script around the VITE_BACKEND_URL and VITE_OPENHUMAN_CORE_RPC_URL assignments to source the .env first (using REPO_ROOT and load-dotenv.sh) or recompute the variables after sourcing.app/test/playwright/specs/chat-tool-error-recovery.spec.ts-13-27 (1)
13-27:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate mock admin POST responses in test helpers
resetMockandsetMockBehaviorcallfetchbut never checkresponse.ok, so a failed mock setup can go unnoticed and make the test non-deterministic.+async function postMockAdmin(path: string, payload: { key?: string; value?: string } | {}): Promise<void> { + const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify(payload), + }); + if (!response.ok) { + throw new Error(`mock admin ${path} failed with HTTP ${response.status}`); + } +} + async function resetMock(): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { - method: 'POST', - headers: { 'Content-Type': 'application/json' }, - body: JSON.stringify({}), - }); + await postMockAdmin('/__admin/reset', {}); } async function setMockBehavior(key: string, value: string): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { - method: 'POST', - headers: { 'Content-Type': 'application/json' }, - body: JSON.stringify({ key, value }), - }); + await postMockAdmin('/__admin/behavior', { key, value }); }File:
app/test/playwright/specs/chat-tool-error-recovery.spec.ts(lines 13-27)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/chat-tool-error-recovery.spec.ts` around lines 13 - 27, The helpers resetMock and setMockBehavior currently fire fetch requests without verifying results; update both functions to capture the fetch response (e.g., const res = await fetch(...)), check res.ok, and if false throw an Error containing the status and response text (await res.text()) so test setup failures fail fast and surface server error details; keep the same function names resetMock and setMockBehavior and preserve their request bodies and headers.app/test/playwright/specs/rewards-unlock-flow.spec.ts-11-25 (1)
11-25:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail fast when mock admin POST helpers return non-2xx responses.
setRewardsScenarioandresetMockawaitfetch(...)but never checkresponse.ok/status. If the mock admin endpoints return 4xx/5xx, stale state can make the suite flaky or misleading.Proposed fix
+async function postMockAdmin(path: string, body: unknown): Promise<void> { + const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify(body), + }); + if (!response.ok) { + throw new Error(`mock admin ${path} failed with HTTP ${response.status}`); + } +} + async function setRewardsScenario(value: string): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { - method: 'POST', - headers: { 'Content-Type': 'application/json' }, - body: JSON.stringify({ key: 'rewardsScenario', value }), - }); + await postMockAdmin('/__admin/behavior', { key: 'rewardsScenario', value }); } async function resetMock(): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { - method: 'POST', - headers: { 'Content-Type': 'application/json' }, - body: JSON.stringify({}), - }); + await postMockAdmin('/__admin/reset', {}); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/rewards-unlock-flow.spec.ts` around lines 11 - 25, The helpers setRewardsScenario and resetMock currently ignore HTTP errors from fetch; update both functions to check the fetch Response (e.g., response.ok) after awaiting fetch and throw a descriptive error when not ok, including response.status and response.text() (or JSON) to aid debugging so test failures fail-fast instead of silently proceeding with stale state.app/test/playwright/specs/webhooks-ingress-flow.spec.ts-36-61 (1)
36-61:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not swallow assertion failures in the RPC contract test.
The broad
catch {}masks both transport errors and failed expectations, so regressions can pass green.Proposed fix
- } catch { - // Router initialization is socket-backed and can be absent in this lane. - // The load-bearing part is that the read-only surface above remains stable. + } catch (error) { + // Allow only known lane-availability failures; rethrow everything else, + // especially assertion errors. + if ( + error instanceof Error && + /socket|not connected|failed with HTTP (404|503)/i.test(error.message) + ) { + return; + } + throw error; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/webhooks-ingress-flow.spec.ts` around lines 36 - 61, The test currently swallows all errors in the broad catch, hiding failed expectations; update the catch to only suppress known transport/socket-init errors and rethrow any assertion failures or unexpected errors. Specifically, change the anonymous catch after the callCoreRpc calls (webhooks_register_echo, webhooks_clear_logs, webhooks_unregister_echo) to catch (err) and if the error is an assertion/failure (e.g., err.name === 'AssertionError' or the test framework's assertion error type) or otherwise not a transport/socket initialization error, rethrow it; only swallow/ignore errors that match the transient transport/socket conditions (connection refused, socket not initialized, tunnel missing) so real test regressions fail loudly. Ensure callCoreRpc call sites remain unchanged and only the catch logic is tightened..github/workflows/e2e.yml-37-37 (1)
37-37:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPin CI supply-chain dependencies (GitHub Actions + container) to immutable SHAs/digests.
In
.github/workflows/e2e.yml, the Playwright job uses mutable references:
image: ghcr.io/tinyhumansai/openhuman_ci:latest(line 37)uses: actions/checkout@v5(line 41),actions/cache@v5(line 48),Swatinem/rust-cache@v2(line 56),actions/upload-artifact@v5(line 84)Replace each
uses:withowner/repo@<commit-sha>and the container withghcr.io/...@sha256:<digest>.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/e2e.yml at line 37, Replace mutable references with immutable SHAs/digests: change the container declaration image: ghcr.io/tinyhumansai/openhuman_ci:latest to the image digest form ghcr.io/tinyhumansai/openhuman_ci@sha256:<digest>, and replace each uses: entry (actions/checkout@v5, actions/cache@v5, Swatinem/rust-cache@v2, actions/upload-artifact@v5) with the corresponding owner/repo@<commit-sha> (pin to the exact commit SHA for each action). Update the workflow file so the Playwright job references the resolved image sha256 digest and exact commit SHAs for the listed actions to guarantee an immutable supply chain.app/test/playwright/specs/skills-registry.spec.ts-1-1 (1)
1-1:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix Playwright
pagetyping in skills-registry spec
Line 10 incorrectly derives thepagefixture type fromParameters<typeof test>[0](the test title parameter); use Playwright’sPagetype directly.Suggested patch
-import { expect, test } from '`@playwright/test`'; +import { expect, test, type Page } from '`@playwright/test`'; @@ -async function openSkillsPage(page: Parameters<typeof test>[0]['page'], userId: string) { +async function openSkillsPage(page: Page, userId: string) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/skills-registry.spec.ts` at line 1, The test currently types the `page` fixture incorrectly using `Parameters<typeof test>[0]`; import Playwright's `Page` type from '`@playwright/test`' and use that type for the `page` variable/fixture instead (replace any `Parameters<typeof test>[0]` usage with `Page`), updating the import line to include `Page` alongside `expect` and `test` so the `page` parameter in the spec has the correct Playwright type.app/src/pages/WebCallbackPage.tsx-1-4 (1)
1-4:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRoute callback UI copy must be translated via
useT().This component renders user-visible strings as hard-coded literals. In
app/src, these must come fromuseT()keys and be added toapp/src/lib/i18n/en.ts.Suggested fix
import { useEffect } from 'react'; import { useLocation, useParams } from 'react-router-dom'; +import { useT } from '../lib/i18n/I18nContext'; import { handleDeepLinkUrls } from '../utils/desktopDeepLinkListener'; @@ export default function WebCallbackPage() { + const t = useT(); const { kind, status } = useParams(); const location = useLocation(); @@ <h1 className="text-2xl font-semibold text-slate-900">Completing sign-in</h1> <p className="text-sm text-slate-600"> OpenHuman is processing your callback and will continue automatically. </p> + <h1 className="text-2xl font-semibold text-slate-900"> + {t('webCallback.completingSignIn.title')} + </h1> + <p className="text-sm text-slate-600"> + {t('webCallback.completingSignIn.description')} + </p> </div> </div> ); }As per coding guidelines: "
app/src/**/*.{ts,tsx}: Every user-visible string ... must go throughuseT()... Add the new key toapp/src/lib/i18n/en.tsin the same PR."Also applies to: 32-38
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/pages/WebCallbackPage.tsx` around lines 1 - 4, The WebCallbackPage component currently renders hard-coded user-facing strings; update the component to use the i18n hook useT() (import and call useT inside the component) and replace each literal shown (including the strings around lines 32-38) with useT('your.key.here') lookups, then add corresponding keys and English text into app/src/lib/i18n/en.ts; ensure keys are descriptive (e.g., webCallback.success, webCallback.error) and that all visible literals in WebCallbackPage are replaced with translations in the same PR.app/test/playwright/specs/chat-harness-wallet-flow.spec.ts-54-68 (1)
54-68:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail fast when mock-admin setup calls return non-2xx.
resetMock/setMockBehaviorignore HTTP status, so setup can silently fail and leave stale state behind. This makes the spec flaky and can produce misleading passes/failures.Suggested fix
async function resetMock(): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { + const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({}), }); + if (!response.ok) { + throw new Error(`mock reset failed: ${response.status}`); + } } async function setMockBehavior(key: string, value: string): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { + const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ key, value }), }); + if (!response.ok) { + throw new Error(`mock behavior update failed: ${response.status}`); + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/chat-harness-wallet-flow.spec.ts` around lines 54 - 68, resetMock and setMockBehavior currently ignore HTTP status so setup can silently fail; update both functions (resetMock and setMockBehavior) to capture the fetch Response, check response.ok, and throw a descriptive error (including status and response text/body) when not ok so the test fails fast and surfaces the mock-admin failure for debugging.app/test/playwright/specs/onboarding-modes.spec.ts-52-85 (1)
52-85:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCurrent assertions can self-heal failed onboarding flows and hide regressions.
These helpers mutate completion state and force navigation, allowing green tests even when onboarding progression fails.
Suggested fix
async function expectOnboardingCompleted(): Promise<void> { const readValue = async (): Promise<boolean> => { const completed = await callCoreRpc<boolean | { result?: boolean }>( 'openhuman.config_get_onboarding_completed', {} ); return typeof completed === 'boolean' ? completed : Boolean((completed as { result?: boolean }).result); }; - let value = await readValue(); - if (!value) { - await callCoreRpc('openhuman.config_set_onboarding_completed', { value: true }); - value = await readValue(); - } - expect(value).toBe(true); + expect(await readValue()).toBe(true); } -async function ensureHomeOrForceComplete(page: Page): Promise<void> { - const reachedHome = await expect - .poll(async () => page.evaluate(() => window.location.hash), { timeout: 20_000 }) - .toMatch(/^#\/home/) - .then( - () => true, - () => false - ); - - if (reachedHome) return; - - await callCoreRpc('openhuman.config_set_onboarding_completed', { value: true }); - await page.goto('/#/home'); - await waitForAppReady(page); +async function ensureHome(page: Page): Promise<void> { + await expect + .poll(async () => page.evaluate(() => window.location.hash), { timeout: 20_000 }) + .toMatch(/^#\/home/); }// Also update call sites in this file: await ensureHome(page);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/onboarding-modes.spec.ts` around lines 52 - 85, The helpers currently mutate onboarding state and force navigation which masks regressions; remove those mutations so failures surface. Update expectOnboardingCompleted to only read via callCoreRpc('openhuman.config_get_onboarding_completed') and assert the value is true (do not call 'openhuman.config_set_onboarding_completed'). Change ensureHomeOrForceComplete into ensureHome that only polls page.evaluate(() => window.location.hash) and fails the test if the hash never matches /^#\/home/ (do not call 'openhuman.config_set_onboarding_completed' or page.goto). Update all call sites in this file to call ensureHome(page) instead of the old helper.app/test/playwright/specs/onboarding-modes.spec.ts-41-43 (1)
41-43:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon’t suppress mock reset failures during test boot.
Ignoring
resetMockerrors makes failures non-deterministic and can leak prior test state.Suggested fix
async function bootIntoOnboarding(page: Page, userId: string): Promise<void> { - await resetMock().catch(() => undefined); + await resetMock(); await bootAuthenticatedPage(page, userId, '/home'); await callCoreRpc('openhuman.config_set_onboarding_completed', { value: false }); await page.goto('/#/onboarding/welcome');🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/onboarding-modes.spec.ts` around lines 41 - 43, The helper bootIntoOnboarding currently swallows resetMock errors (await resetMock().catch(() => undefined)), which hides failures and can leak state; change it to await resetMock() without the catch so errors propagate and fail the test (or explicitly catch and rethrow a descriptive error). Update the bootIntoOnboarding function (and any callers) to allow resetMock failures to surface, or move resetMock into a test-level beforeEach that throws on error, ensuring deterministic cleanup.app/test/playwright/specs/harness-channel-bridge-flow.spec.ts-13-27 (1)
13-27:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHarden mock-admin setup calls to fail on unsuccessful responses.
These setup helpers ignore HTTP failure status, so test state can be left partially configured without an immediate, actionable failure.
Suggested fix
+async function adminFetch(path: string, init?: RequestInit): Promise<void> { + const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, init); + if (!response.ok) { + throw new Error(`mock admin request failed: ${response.status} ${path}`); + } +} + async function resetMock(): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { + await adminFetch('/__admin/reset', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({}), }); } async function setMockBehavior(key: string, value: string): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { + await adminFetch('/__admin/behavior', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ key, value }), }); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/harness-channel-bridge-flow.spec.ts` around lines 13 - 27, The helper functions resetMock and setMockBehavior call the mock-admin endpoints but ignore non-OK HTTP responses; update both functions to check the fetch Response (e.g., response.ok or response.status) and throw a descriptive error (including status and body/text) when the request fails so tests fail fast and show the underlying problem.app/test/playwright/specs/harness-composio-tool-flow.spec.ts-18-38 (1)
18-38:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail fast on mock-admin HTTP failures in test setup helpers.
resetMock,setMockBehavior, andrequestscurrently proceed even on non-2xx responses, which can leak stale mock state and make failures harder to diagnose.Suggested fix
+async function adminFetch(path: string, init?: RequestInit): Promise<Response> { + const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, init); + if (!response.ok) { + throw new Error(`mock admin request failed: ${response.status} ${path}`); + } + return response; +} + async function resetMock(): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { + await adminFetch('/__admin/reset', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({}), }); } async function setMockBehavior(key: string, value: string): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { + await adminFetch('/__admin/behavior', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ key, value }), }); } async function requests(): Promise<MockRequest[]> { - const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/requests`); + const response = await adminFetch('/__admin/requests'); const payload = (await response.json()) as { data?: MockRequest[] }; return Array.isArray(payload.data) ? payload.data : []; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/harness-composio-tool-flow.spec.ts` around lines 18 - 38, The helpers resetMock, setMockBehavior, and requests should fail fast on non-2xx HTTP responses from the mock admin: after each fetch call in resetMock, setMockBehavior, and requests check response.ok and if false throw an Error containing the endpoint, response.status, and response text (or JSON) to surface the failure; for requests also return an empty array only after a successful response, otherwise throw with the same detailed message so stale mock state cannot be used.app/test/playwright/specs/harness-cron-prompt-flow.spec.ts-18-38 (1)
18-38:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd explicit HTTP status checks to mock-admin helpers.
Without status validation, failed reset/behavior/request-log calls can silently continue and invalidate test expectations.
Suggested fix
+async function adminFetch(path: string, init?: RequestInit): Promise<Response> { + const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, init); + if (!response.ok) { + throw new Error(`mock admin request failed: ${response.status} ${path}`); + } + return response; +} + async function resetMock(): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { + await adminFetch('/__admin/reset', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({}), }); } async function setMockBehavior(key: string, value: string): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { + await adminFetch('/__admin/behavior', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ key, value }), }); } async function requests(): Promise<MockRequest[]> { - const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/requests`); + const response = await adminFetch('/__admin/requests'); const payload = (await response.json()) as { data?: MockRequest[] }; return Array.isArray(payload.data) ? payload.data : []; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/harness-cron-prompt-flow.spec.ts` around lines 18 - 38, The helper functions resetMock, setMockBehavior, and requests currently ignore HTTP response status and can silently fail; update each to validate the fetch response (e.g., check response.ok or response.status) after the await fetch call and throw a descriptive error including status and response body/text when not OK so tests fail fast and show the failing mock-admin response; for requests also guard the JSON parse by validating the response before calling response.json() and surface any non-200 responses as errors.app/test/playwright/specs/chat-conversation-history.spec.ts-146-153 (1)
146-153:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAssert request payload content, not just request count.
This test claims earlier turns are included in the second LLM request, but it only checks that one completion request exists. It should validate the posted messages include prior and current prompts.
🧪 Suggested fix
- const llmLog = await expect - .poll(async () => { - const log = await requests(); - return log.filter( - entry => entry.method === 'POST' && entry.url.includes('/openai/v1/chat/completions') - ); - }) - .toHaveLength(1); - - void llmLog; + await expect + .poll(async () => { + const log = await requests(); + return log.filter( + entry => entry.method === 'POST' && entry.url.includes('/openai/v1/chat/completions') + ); + }) + .toHaveLength(1); + + const llmRequest = (await requests()).find( + entry => entry.method === 'POST' && entry.url.includes('/openai/v1/chat/completions') + ); + const body = JSON.parse(llmRequest?.body ?? '{}') as { + messages?: Array<{ content?: string }>; + }; + const transcript = (body.messages ?? []).map(message => message.content ?? '').join('\n'); + expect(transcript).toContain(FIRST_PROMPT); + expect(transcript).toContain(SECOND_PROMPT);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/chat-conversation-history.spec.ts` around lines 146 - 153, The test currently only asserts a single POST to /openai/v1/chat/completions (using expect.poll and requests()), but must also verify the request body contains the expected conversation messages; update the block that captures llmLog (the await expect.poll(...) that calls requests() and filters entries by entry.method and entry.url) to inspect the matched entry's request payload (e.g., entry.request.postData or entry.request.body) and assert that its messages array includes both the earlier turns and the current prompt strings used in this spec; reference llmLog, requests(), and the filtered entry to locate where to add assertions for message content.app/test/playwright/specs/chat-tool-call-flow.spec.ts-32-52 (1)
32-52:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail fast on mock-admin HTTP failures in test setup helpers.
resetMock,setMockBehavior, andrequestscurrently ignore non-2xx responses. That can leave tests running with invalid mock state and produce opaque failures later.Suggested patch
+async function mockFetch(path: string, init?: RequestInit): Promise<Response> { + const response = await fetch(`${MOCK_ADMIN_BASE}${path}`, init); + if (!response.ok) { + throw new Error( + `mock admin ${init?.method ?? 'GET'} ${path} failed with HTTP ${response.status}` + ); + } + return response; +} + async function resetMock(): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/reset`, { + await mockFetch('/__admin/reset', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({}), }); } async function setMockBehavior(key: string, value: string): Promise<void> { - await fetch(`${MOCK_ADMIN_BASE}/__admin/behavior`, { + await mockFetch('/__admin/behavior', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ key, value }), }); } async function requests(): Promise<MockRequest[]> { - const response = await fetch(`${MOCK_ADMIN_BASE}/__admin/requests`); + const response = await mockFetch('/__admin/requests'); const payload = (await response.json()) as { data?: MockRequest[] }; return Array.isArray(payload.data) ? payload.data : []; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/chat-tool-call-flow.spec.ts` around lines 32 - 52, The helper functions resetMock, setMockBehavior and requests do not check HTTP responses and silently continue on non-2xx codes; update each to assert response.ok and throw a descriptive error containing the status and response text (or parsed body) when not ok so the test fails fast; specifically, in resetMock and setMockBehavior after the fetch check response.ok and if false read await response.text() (or json if applicable) and throw a new Error including the endpoint, status and body, and in requests do the same before attempting response.json() so you never return an invalid payload (also propagate any fetch exceptions with added context about the function and endpoint).app/test/playwright/helpers/core-rpc.ts-23-41 (1)
23-41:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd an explicit timeout for core RPC fetches.
callCoreRpcis unbounded today. When the core endpoint stalls, this can hang a worker for a long period and mask root cause.Suggested patch
export async function callCoreRpc<T>( method: string, params: Record<string, unknown> = {} ): Promise<T> { - const response = await fetch(CORE_RPC_URL, { - method: 'POST', - headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${CORE_RPC_TOKEN}` }, - body: JSON.stringify({ jsonrpc: '2.0', id: nextRpcId++, method, params }), - }); + const controller = new AbortController(); + const timer = setTimeout(() => controller.abort(), 15_000); + let response: Response; + try { + response = await fetch(CORE_RPC_URL, { + method: 'POST', + headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${CORE_RPC_TOKEN}` }, + body: JSON.stringify({ jsonrpc: '2.0', id: nextRpcId++, method, params }), + signal: controller.signal, + }); + } catch (error) { + if (controller.signal.aborted) { + throw new Error(`RPC ${method} timed out after 15000ms`); + } + throw error; + } finally { + clearTimeout(timer); + } if (!response.ok) { throw new Error(`RPC ${method} failed with HTTP ${response.status}`); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/helpers/core-rpc.ts` around lines 23 - 41, callCoreRpc currently issues an unbounded fetch to CORE_RPC_URL and can hang; update callCoreRpc to use an AbortController with a configurable timeout (e.g., 10_000 ms) so the request is aborted after the timeout, clear the timeout on success, and handle the abort error to throw a descriptive error (e.g., "RPC <method> timed out"); apply changes around the fetch call and error handling involving callCoreRpc, CORE_RPC_URL, CORE_RPC_TOKEN, nextRpcId so the existing JSON-RPC logic and response.ok / payload.error checks remain intact.app/scripts/e2e-web-session.sh-17-21 (1)
17-21:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid deleting caller-provided workspace paths.
Line 19 marks any
/tmp/*workspace as script-owned, so cleanup can remove a user-supplied existing directory at Lines 43-45. That can wipe unrelated temp data.Proposed fix
-OPENHUMAN_WORKSPACE="${OPENHUMAN_WORKSPACE:-$(mktemp -d)}" -CREATED_TEMP_WORKSPACE="" -if [ ! -d "${OPENHUMAN_WORKSPACE}" ] || [[ "${OPENHUMAN_WORKSPACE}" == /tmp/* ]]; then - CREATED_TEMP_WORKSPACE="$OPENHUMAN_WORKSPACE" -fi +CREATED_TEMP_WORKSPACE="" +if [ -z "${OPENHUMAN_WORKSPACE:-}" ]; then + OPENHUMAN_WORKSPACE="$(mktemp -d)" + CREATED_TEMP_WORKSPACE="$OPENHUMAN_WORKSPACE" +fi export OPENHUMAN_WORKSPACEAlso applies to: 43-45
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/scripts/e2e-web-session.sh` around lines 17 - 21, The script currently treats any /tmp/* path as script-owned and may delete a caller-provided directory; fix by detecting whether OPENHUMAN_WORKSPACE was provided by the caller before you default it and only mark CREATED_TEMP_WORKSPACE when this script actually created the directory. Save a flag like ORIGINAL_OPENHUMAN_WORKSPACE_SET (using parameter expansion to test if OPENHUMAN_WORKSPACE was set) before assigning OPENHUMAN_WORKSPACE="${OPENHUMAN_WORKSPACE:-$(mktemp -d)}", then set CREATED_TEMP_WORKSPACE only when the original was unset or when the directory did not exist and the script created it; update the cleanup logic that uses CREATED_TEMP_WORKSPACE (currently around the cleanup at lines 43-45) to rely on that flag so you never delete a caller-provided /tmp/* path.
🧹 Nitpick comments (11)
app/test/playwright/specs/chat-harness-cancel.spec.ts (1)
122-125: ⚡ Quick winReplace fixed sleep with condition-based waiting.
waitForTimeout(3_500)is timing-sensitive and can still race on slower runners. Prefer polling for post-cancel UI state (e.g., cancel button disappears / stream status idle) before asserting late chunks are absent.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/chat-harness-cancel.spec.ts` around lines 122 - 125, Replace the fixed sleep in the test (the call to page.waitForTimeout(3_500)) with a condition-based wait: poll for a stable post-cancel UI state such as the cancel button disappearing (use page.waitForSelector for the cancel button to be hidden/removed) or the stream status element reaching "idle" before asserting that each entry in LATE_PIECES is absent via page.getByText(...).toHaveCount(0); ensure you wait for the chosen selector/state to be settled so the subsequent LATE_PIECES checks won't race on slow runners.app/test/playwright/specs/chat-harness-send-stream.spec.ts (1)
132-137: ⚡ Quick winMake the
stream: truerequest-body assertion whitespace-tolerant.The current substring check is brittle to JSON serialization formatting.
Proposed fix
return log.some( entry => entry.method === 'POST' && entry.url.includes('/openai/v1/chat/completions') && - entry.body?.includes('"stream":true') + /"stream"\s*:\s*true/.test(entry.body ?? '') );🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/chat-harness-send-stream.spec.ts` around lines 132 - 137, Replace the brittle substring check that looks for '"stream":true' with a whitespace-robust check: for each matching POST to '/openai/v1/chat/completions' parse entry.body as JSON (or if parsing fails, fall back to a regex that ignores whitespace) and assert that the parsed object's stream property is strictly true; update the arrow function that currently references entry.body?.includes('"stream":true') to use JSON.parse(entry.body) and verify obj.stream === true (with a try/catch fallback to a /\b"stream"\s*:\s*true\b/ style regex if needed).app/test/playwright/specs/tool-shell-git-flow.spec.ts (1)
119-130: ⚡ Quick winStrengthen the denial-envelope assertions to match the test intent.
The current checks only assert “throws”, so envelope regressions can still pass unnoticed. Assert a stable error signature (code/message fragment) for each invalid input case.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/tool-shell-git-flow.spec.ts` around lines 119 - 130, Replace the broad rejects.toThrow() checks with assertions that verify a stable error signature for each failing call to callCoreRpc('openhuman.memory_write_file'): for the call with only { content: 'no path provided' } assert the rejection contains the expected error code/message fragment indicating "missing path" (or the specific error code your RPC returns), and for the call with relative_path '../shell-restriction-967.txt' assert the rejection contains the expected security/path validation error code/message fragment (e.g., "invalid path" or "path traversal"). Locate the two assertions in tool-shell-git-flow.spec.ts around the test 'denial envelope is structurally consistent for invalid write args' and replace the generic toThrow checks with rejects assertions that match the specific error shape or message fragment returned by your RPC (using the RPC's error code or stable message text).app/test/playwright/specs/connector-youtube.spec.ts (2)
1-1: ⚡ Quick winUse
import typeforPageto match the TypeScript rule.Switch
Pageto a dedicated type-only import statement for consistency with the repo standard.As per coding guidelines
**/*.{ts,tsx}: Useimport typefor type-only imports in TypeScript files.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/connector-youtube.spec.ts` at line 1, The import currently brings Page as a value import; change it to a type-only import to satisfy the TypeScript rule: replace the combined import "import { expect, type Page, test } from '`@playwright/test`';" with separate imports so that Page is imported via "import type { Page } from '`@playwright/test`';" and keep expect and test as regular imports (or use "import { expect, test } from '`@playwright/test`';") in the connector-youtube.spec.ts file to ensure Page is a type-only import.
19-168: 🏗️ Heavy liftExtract shared connector-test harness helpers into a common module.
This helper block is effectively duplicated across multiple connector specs, which will drift over time. Please centralize it (mock admin calls, skills boot/reload, session assertion, modal open, connection unwrap) in a shared Playwright helper and keep connector-specific constants in each spec.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/connector-youtube.spec.ts` around lines 19 - 168, The test file duplicates a common connector-test harness; extract the shared helpers into a centralized Playwright helper module and keep only connector-specific constants in each spec. Move functions mockFetch, resetMock, setMockBehavior, getRequestLog, seedConnector, bootSkillsPage, reloadSkills, ensureComposioSurface, assertSessionNotNuked, openConnectorModal, and unwrapConnections into a new shared helper file, export them, and update the connector-youtube.spec.ts to import and use those helpers (leaving TOOLKIT_SLUG, CONNECTION_ID, CONNECTOR_NAME, and other spec-specific constants in the spec). Ensure all references and imports are updated and tests still call the same exported function names so behavior is unchanged.app/test/playwright/specs/accounts-provider-modal.spec.ts (1)
22-22: ⚡ Quick winReplace inline
import('...').Pageannotations with a top-levelimport type.Using
Pageviaimport('@playwright/test').Pagein signatures violates the repo’s TS type-import rule and reduces readability. Addimport type { Page } from '@playwright/test'once and usePagedirectly in these helpers.As per coding guidelines
**/*.{ts,tsx}: Useimport typefor type-only imports in TypeScript files.Also applies to: 38-38, 47-47, 64-64
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/accounts-provider-modal.spec.ts` at line 22, Replace the inline parameter type annotations like import('`@playwright/test`').Page used in helper signatures (e.g., openAddAccountModal and the other helpers in the same file) with a top-level type-only import: add "import type { Page } from '`@playwright/test`'" at the top of the file and change each function signature to use "Page" directly; remove the inline import forms so the file follows the repo's import type rule.app/test/playwright/specs/connector-confluence.spec.ts (1)
1-1: ⚡ Quick winApply
import typeforPagehere as well.Use a dedicated
import type { Page }line instead of inline type import syntax to keep this file aligned with the TS import rule used in this repo.As per coding guidelines
**/*.{ts,tsx}: Useimport typefor type-only imports in TypeScript files.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/connector-confluence.spec.ts` at line 1, Replace the inline type import for Page with a dedicated type-only import: change the combined import that currently uses "type Page" inline to instead keep the runtime imports (expect, test) on the first line and add a separate "import type { Page }" line; update the import statement referencing Page so the file uses an explicit import type for the Page symbol to comply with the repo's TS rule.app/test/playwright/specs/connector-notion.spec.ts (1)
1-1: ⚡ Quick winUse a dedicated type-only import for
Page.Please switch to
import type { Page }to align with the TypeScript import convention enforced in this repository.As per coding guidelines
**/*.{ts,tsx}: Useimport typefor type-only imports in TypeScript files.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/connector-notion.spec.ts` at line 1, The import currently brings Page as a value; change it to a type-only import by replacing the import for Page with an "import type { Page }" form so TypeScript treats Page as a type-only import (adjust the existing import line that currently reads "import { expect, type Page, test }" to separate/convert Page into a dedicated type-only import referencing Page).app/test/playwright/specs/slack-flow.spec.ts (1)
1-1: ⚡ Quick winAlign
Pageimport with repoimport typepolicy.Replace inline
type Pageusage with a dedicatedimport type { Page } ...statement.As per coding guidelines
**/*.{ts,tsx}: Useimport typefor type-only imports in TypeScript files.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/slack-flow.spec.ts` at line 1, The current import mixes a type with value imports; update the import to use a dedicated type-only import for Page by removing "type Page" from the named import and adding a separate "import type { Page } from '`@playwright/test`';", keeping value imports (expect, test) in the original import so files like slack-flow.spec.ts follow the repo's import type policy.app/test/playwright/specs/connector-jira.spec.ts (1)
19-169: 🏗️ Heavy liftExtract shared Composio connector test harness helpers to reduce drift.
The mock-admin helpers and skills/session bootstrap logic here are effectively duplicated across connector specs in this PR (e.g., Discord and Google Sheets). A shared helper/factory would prevent behavior skew and make future fixes one-shot.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/connector-jira.spec.ts` around lines 19 - 169, Extract the duplicated Composio test helpers (mockFetch, resetMock, setMockBehavior, getRequestLog, seedConnector, bootSkillsPage, reloadSkills, ensureComposioSurface, assertSessionNotNuked, openModal, waitForDisconnectedCard, unwrapConnections and any associated constants like TOOLKIT_SLUG/CONNECTION_ID/CONNECTOR_NAME) into a shared test harness module and import it from each connector spec; move the HTTP mock admin logic and page-bootstrap flow into factory/helper functions (e.g., export mockAdminClient with methods reset, setBehavior, getRequests and export pageBoot helpers like bootComposioSkillsPage and ensureComposioSurface) so other specs (Discord, Google Sheets) can call the same implementations, update imports in connector-jira.spec.ts to use the new module, and remove the duplicated local functions to keep a single source of truth.app/test/playwright/specs/settings-advanced-config.spec.ts (1)
41-50: ⚡ Quick winAvoid index-coupled provider toggle selection.
providerEnabledToggledepends on row order, so a layout reorder can click the wrong provider toggle. Prefer locating the provider row by provider label/test id, then selecting itsEnabledcheckbox within that row.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/test/playwright/specs/settings-advanced-config.spec.ts` around lines 41 - 50, providerEnabledToggle currently selects the nth "Enabled" checkbox based on a fixed providerOrder, which breaks if rows reflow; update providerEnabledToggle to locate the provider's row by a stable identifier (e.g., provider label text or a data-testid for the provider) and then query for the "Enabled" checkbox scoped inside that row (use the Page.getByRole/getByTestId on the row element then .getByRole('checkbox', { name: 'Enabled' }) within it) so selection is based on the provider row rather than a hardcoded index.
| @@ -0,0 +1,41 @@ | |||
| import { expect, test } from '@playwright/test'; | |||
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify inline import type-query usage in Playwright TS files
rg -nP ":\s*import\('[^']+'\)\.[A-Za-z_]\w+" app/test/playwright/**/*.tsRepository: tinyhumansai/openhuman
Length of output: 1728
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="app/test/playwright/specs/command-palette.spec.ts"
echo "---- $FILE (head 80) ----"
nl -ba "$FILE" | sed -n '1,120p'Repository: tinyhumansai/openhuman
Length of output: 197
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="app/test/playwright/specs/command-palette.spec.ts"
echo "---- $FILE (1-120) ----"
sed -n '1,120p' "$FILE"
echo
echo "---- Inline import type-query matches in this file ----"
rg -nP ":\s*import\('[^']+'\)\.[A-Za-z_]\w+" "$FILE" || true
echo
echo "---- Current import line in this file ----"
rg -n "from '`@playwright/test`'" "$FILE" || trueRepository: tinyhumansai/openhuman
Length of output: 1848
Use import { ..., type Page, ... } instead of import('@playwright/test').Page in the function signature.
app/test/playwright/specs/command-palette.spec.ts has async function openPalette(page: import('@playwright/test').Page). Switch it to a type import for consistency with the repo’s TypeScript import rules.
Proposed fix
-import { expect, test } from '`@playwright/test`';
+import { expect, type Page, test } from '`@playwright/test`';
@@
-async function openPalette(page: import('`@playwright/test`').Page) {
+async function openPalette(page: Page) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@app/test/playwright/specs/command-palette.spec.ts` at line 1, The function
signature uses an inline import type; change to a top-level type import by
updating the existing import line to include "type Page" (import { expect, test,
type Page } from '`@playwright/test`') and then update the function declaration
openPalette to use the imported Page type (async function openPalette(page:
Page)). Ensure only the type import form is used and remove the inline
import('`@playwright/test`').Page usage.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
e2e/docker-local-bootstrap.sh (1)
89-92: ⚡ Quick winAvoid hard-coding the Playwright Chromium cache revision in
e2e/docker-local-bootstrap.sh
e2e/docker-local-bootstrap.shchecks for${HOME}/.cache/ms-playwright/chromium_headless_shell-1223/...before runningpnpm --dir /workspace/app exec playwright install chromium-headless-shell; if Playwright updates the revision, this probe can miss an existing shell and trigger an unnecessary install.🔧 Suggested change
-if [ ! -x "${HOME}/.cache/ms-playwright/chromium_headless_shell-1223/chrome-headless-shell-linux64/chrome-headless-shell" ]; then +if ! compgen -G "${HOME}/.cache/ms-playwright/chromium_headless_shell-*/chrome-headless-shell-linux64/chrome-headless-shell" >/dev/null; then echo "[e2e-bootstrap] Installing Playwright Chromium headless shell..." pnpm --dir /workspace/app exec playwright install chromium-headless-shell >/dev/null fi🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@e2e/docker-local-bootstrap.sh` around lines 89 - 92, The current if-test hardcodes the Playwright Chromium cache revision; change the existence check around the if block that references the chrome-headless-shell path so it uses a revision-agnostic search (e.g., glob or find) under "${HOME}/.cache/ms-playwright" for an executable named "chrome-headless-shell" (or the platform-specific subpath) instead of the fixed "chromium_headless_shell-1223/..." string, and keep the existing pnpm --dir /workspace/app exec playwright install chromium-headless-shell >/dev/null invocation as the install fallback when no matching executable is found.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/test/playwright/specs/chat-harness-scroll-render.spec.ts`:
- Line 168: The fixed "-20" delta makes the assertion brittle; change the final
assertion to simply verify that afterScrollUp.scrollTop is less than
tags.scrollTop (i.e.,
expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop)), removing the
hardcoded 20 threshold so the test only checks that scrolling moved upward
toward targetTop using the existing afterScrollUp and tags.scrollTop variables.
---
Nitpick comments:
In `@e2e/docker-local-bootstrap.sh`:
- Around line 89-92: The current if-test hardcodes the Playwright Chromium cache
revision; change the existence check around the if block that references the
chrome-headless-shell path so it uses a revision-agnostic search (e.g., glob or
find) under "${HOME}/.cache/ms-playwright" for an executable named
"chrome-headless-shell" (or the platform-specific subpath) instead of the fixed
"chromium_headless_shell-1223/..." string, and keep the existing pnpm --dir
/workspace/app exec playwright install chromium-headless-shell >/dev/null
invocation as the install fallback when no matching executable is found.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 189da23c-d2f8-4233-bdf2-e5884bc52e2f
📒 Files selected for processing (8)
.github/workflows/e2e.ymlapp/package.jsonapp/scripts/e2e-web-build.shapp/scripts/e2e-web-session.shapp/test/playwright/helpers/core-rpc.tsapp/test/playwright/specs/chat-harness-scroll-render.spec.tse2e/docker-compose.ymle2e/docker-local-bootstrap.sh
| }); | ||
|
|
||
| expect(Math.abs(afterScrollUp.scrollTop - targetTop)).toBeLessThan(40); | ||
| expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop - 20); |
There was a problem hiding this comment.
Relax the fixed scroll delta check to reduce viewport-dependent flakes.
Line 168 can fail when initial scrollTop is small, even if scrolling worked. You already assert movement toward targetTop on Line 167, so this fixed -20 threshold is unnecessarily brittle.
✅ Suggested tweak
- expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop - 20);
+ expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop - 20); | |
| expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@app/test/playwright/specs/chat-harness-scroll-render.spec.ts` at line 168,
The fixed "-20" delta makes the assertion brittle; change the final assertion to
simply verify that afterScrollUp.scrollTop is less than tags.scrollTop (i.e.,
expect(afterScrollUp.scrollTop).toBeLessThan(tags.scrollTop)), removing the
hardcoded 20 threshold so the test only checks that scrolling moved upward
toward targetTop using the existing afterScrollUp and tags.scrollTop variables.
…ywright # Conflicts: # app/test/e2e/specs/mega-flow.spec.ts
Summary
mega-flowWDIO/Appium spec to a Playwright counterpart underapp/test/playwright/specs.Problem
Solution
app/src-tauri-web, callback route, web build/session scripts, auth/core helper changes) so the browser lane can speak toopenhuman-coreover the same JSON-RPC surface as the desktop renderer.app/test/playwright/specs, leaving onlyapp/test/e2e/specs/mega-flow.spec.tsunmatched..github/workflows/e2e.ymlnow runs all-OS desktop mega-flow plus a separate Playwright web suite, and removed the release workflows’pretest-e2edependency.Submission Checklist
## Related— N/A: broad E2E migration branch, not tied to one feature-row updatedocs/RELEASE-MANUAL-SMOKE.md) — N/A: E2E/CI migration onlyCloses #NNNin the## Relatedsection — N/A: no single issue key for this branchImpact
mega-flowremains the desktop/Appium holdout; the rest of the E2E surface now has Playwright coverage.openhuman-coreand uses the in-app callback route for auth/callback flows.Related
AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
codex/web-shell-playwright6be30e7a190ed7cf8e80cca25376691aab4159feValidation Run
pnpm --filter openhuman-app format:checkpnpm typecheckpnpm --filter openhuman-app lintbash ./app/scripts/e2e-web-session.sh test/playwright/specs/smoke.spec.ts test/playwright/specs/navigation.spec.ts test/playwright/specs/navigation-settings-panels.spec.ts test/playwright/specs/insights-dashboard.spec.ts test/playwright/specs/login-flow.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/channels-smoke.spec.ts test/playwright/specs/command-palette.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/settings-account-preferences.spec.ts test/playwright/specs/settings-advanced-config.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/notifications.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/settings-channels-permissions.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/rewards-progression-persistence.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/onboarding-modes.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/auth-access-control.spec.ts test/playwright/specs/guided-tour-gates.spec.ts test/playwright/specs/logout-relogin-onboarding.spec.ts test/playwright/specs/onboarding-modes.spec.ts test/playwright/specs/runtime-picker-login.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/connector-airtable.spec.ts test/playwright/specs/connector-github.spec.ts test/playwright/specs/connector-discord-composio.spec.ts test/playwright/specs/connector-gmail-composio.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/composio-triggers-flow.spec.ts test/playwright/specs/skills-registry.spec.ts test/playwright/specs/skill-socket-reconnect.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/voice-mode.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/user-journey-settings-round-trip.spec.ts test/playwright/specs/core-port-conflict-recovery.spec.ts test/playwright/specs/connectivity-state-differentiation.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/skill-lifecycle.spec.ts test/playwright/specs/skill-execution-flow.spec.ts test/playwright/specs/skill-multi-round.spec.ts test/playwright/specs/skill-oauth.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/tool-browser-flow.spec.ts test/playwright/specs/tool-filesystem-flow.spec.ts test/playwright/specs/tool-shell-git-flow.spec.ts test/playwright/specs/webhooks-ingress-flow.spec.ts test/playwright/specs/webhooks-tunnel-flow.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/whatsapp-flow.spec.ts test/playwright/specs/slack-flow.spec.ts test/playwright/specs/conversations-web-channel-flow.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/harness-search-tool-flow.spec.ts test/playwright/specs/harness-composio-tool-flow.spec.ts test/playwright/specs/harness-cron-prompt-flow.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/telegram-channel-flow.spec.ts test/playwright/specs/cron-jobs-flow.spec.ts test/playwright/specs/harness-search-tool-flow.spec.ts test/playwright/specs/harness-composio-tool-flow.spec.ts test/playwright/specs/harness-cron-prompt-flow.spec.ts test/playwright/specs/memory-roundtrip.spec.ts test/playwright/specs/user-journey-full-task.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/agent-review.spec.ts test/playwright/specs/card-payment-flow.spec.ts test/playwright/specs/crypto-payment-flow.spec.ts test/playwright/specs/tauri-commands.spec.tsbash ./app/scripts/e2e-web-session.sh test/playwright/specs/audio-toolkit-flow.spec.ts test/playwright/specs/gmail-flow.spec.ts test/playwright/specs/harness-channel-bridge-flow.spec.ts test/playwright/specs/local-model-runtime.spec.ts test/playwright/specs/screen-intelligence.spec.ts test/playwright/specs/service-connectivity-flow.spec.ts test/playwright/specs/linux-cef-deb-runtime.spec.tscargo check --manifest-path Cargo.toml -qValidation Blocked
command:N/Aerror:N/Aimpact:noneBehavior Changes
mega-flow, and split CI into desktop mega-flow plus Playwright web-E2E lanesParity Contract
mega-flowWDIO/Appium filename now has a Playwright counterpart; desktop-only limitations are explicit skips rather than silent omissionsDuplicate / Superseded PR Handling
Summary by CodeRabbit