test(app): add W1 renderer contract smoke specs (#598)#1125
Conversation
Strengthen W1 regression coverage so CI gates more of the session-view rewrite before the manual handtest. Adds two @smoke specs: - Rendered turn locks the trow chevron geometry (12px) and collapsed-right / open-down orientation, user-message / agent-prose / agent-reasoning text selectability, the trow-result-body mono-small + fg-weak typography tokens (asserted against resolved tokens, no sans leakage), and the bounded [data-scrollable] raw-output container that the nested-wheel isolation keys off. Seeds one tool+reasoning turn so groupParts yields a multi-part trow. - Thinking indicator shows while the turn is working with nothing visible, and is gone once any part renders. Scope notes vs issue #598: - Scroll behaviours A.1/A.2 and the full nested-wheel isolation A.3 stay unit-covered (scroll-controller + scroll-intents tests) plus the existing weak-wheel E2E; only the DOM precondition is locked here. - Scope B.1 user-bubble hairline is dropped: the bubble moved to a fill-only treatment (surface-interactive-base / surface-raised, no border/box-shadow), so the W1 hairline contract is stale. Verified locally: bun script/e2e-local.ts -- e2e/session/session-w1-contracts.spec.ts --repeat-each=3 (6 passed); app typecheck clean. Refs #598
|
Warning Review limit reached
More reviews will be available in 55 minutes and 43 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR adds a new Playwright E2E test spec ( ChangesW1 Session Contract Tests
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a new Playwright E2E test suite (session-w1-contracts.spec.ts) to verify W1 renderer contracts, such as chevron behavior, text selectability, typography, and scrollable raw tool output. Feedback on the changes suggests using expect.poll when retrieving the count of inner triggers to prevent potential test flakiness, as Playwright's count() does not auto-wait for elements to be rendered.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
There was a problem hiding this comment.
Suggested priority: P3 (only low-risk paths changed (packages/app/e2e/session/session-w1-contracts.spec.ts)).
P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.
The thinking-indicator spec submits with a raw Enter, bypassing project.prompt(), so the session the UI creates was never registered. With an assistant.hang() reply, teardown dropped only the project dir and could leave a non-idle session/stream alive for the rest of the Playwright worker, leaking backend state into later smoke tests. Capture the session id from the URL once the turn is working and project.trackSession() it so cleanup tears the stream down. Addresses Codex review [P2] on #1125.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/app/e2e/session/session-w1-contracts.spec.ts (1)
186-187: 💤 Low valueRedundant clear before fill.
prompt.fill(text)already replaces existing content, so the precedingprompt.fill("")is unnecessary unless it guards a specific composer quirk.♻️ Optional simplification
await prompt.click() - await prompt.fill("") await prompt.fill(text)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/app/e2e/session/session-w1-contracts.spec.ts` around lines 186 - 187, The test contains a redundant clear call: remove the unnecessary prompt.fill("") before prompt.fill(text) because prompt.fill(text) replaces existing content; update the code by deleting the prompt.fill("") invocation and keep the single prompt.fill(text) call (references: prompt.fill("") and prompt.fill(text) in the spec).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/app/e2e/session/session-w1-contracts.spec.ts`:
- Around line 182-199: The manual Enter-submit flow in the W1 thinking indicator
test creates a session but never registers it for fixture cleanup, so the
session is left behind after teardown. After the submit in the session spec,
once the URL contains a session id, call project.trackSession with
sessionIDFromUrl(page.url()) and project.directory so the new session is added
to the cleanup set; use the existing manual-submit tracking pattern from other
tests and keep the change near the prompt.click()/fill()/Enter sequence.
---
Nitpick comments:
In `@packages/app/e2e/session/session-w1-contracts.spec.ts`:
- Around line 186-187: The test contains a redundant clear call: remove the
unnecessary prompt.fill("") before prompt.fill(text) because prompt.fill(text)
replaces existing content; update the code by deleting the prompt.fill("")
invocation and keep the single prompt.fill(text) call (references:
prompt.fill("") and prompt.fill(text) in the spec).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: e909f023-39f5-4b73-a2d6-f46f541555b3
📒 Files selected for processing (1)
packages/app/e2e/session/session-w1-contracts.spec.ts
) The e2e-smoke-tagging guard pins the exact set of @smoke-tagged app E2E tests; the two new session-w1-contracts specs must be listed or the guard fails (seen on unit-windows-opencode-config-project). Add both in sorted order. Verified: bun test packages/opencode/test/config/e2e-smoke-tagging.test.ts (2 pass).
count() does not auto-wait, so reading it immediately after opening the outer <details> could see 0/1 before the inner rows mount. Poll until the count settles above 1 before clicking the rows. Addresses gemini-code-assist review on #1125.
Summary
Strengthen W1 regression coverage (issue #598) so CI gates more of the session-view rewrite before the manual handtest. Adds one new spec file with two
@smokeE2E tests inpackages/app/e2e/session/session-w1-contracts.spec.ts.Test 1 — rendered turn locks contracts. Seeds one turn (
assistant.tool("bash")+assistant.reason(..., { text })) sogroupPartsyields a multi-parttool + reasoningtrow with a<details>chevron, then asserts:rotate(-90deg)) / open-down (rotate(0deg)) orientation (DESIGN.md L412/L468);user-select: texton user-message-text, agent prose (text-part), and agent reasoning (reasoning-body);trow-result-bodyresolves the--font-family-mono/--font-size-mono-small/--fg-weaktokens with no sans leakage (asserted against resolved tokens via probe nodes, so it tracks the token contract, not literal values);[data-scrollable]container (max-height: 240px,overflow-y: auto) — the DOM precondition the nested-wheel isolation keys off;Test 2 — thinking indicator positive case. A hung reply (
assistant.hang(), submitted by hand to avoidproject.prompt()'s idle wait) keeps the turn working with nothing visible; asserts[data-slot="session-turn-thinking"]+ its shimmer are shown, then drains the LLM queue so teardown stays clean.Scope decisions vs #598
session-renderer-diagnostics.spec.ts. This spec only locks the A.3 DOM precondition, not a full wheel-isolation E2E (high flakiness, marginal added protection over the unit coverage).--surface-interactive-baselight /--surface-raiseddark, 14px radius, no border or inset box-shadow), so the W1--border-weak/--border-weakerhairline contract is stale — there is nothing to assert against.docs/, which is untracked in this checkout and cannot ship in a PR; left out of code scope.Verification
bun script/e2e-local.ts -- e2e/session/session-w1-contracts.spec.ts --repeat-each=3→ 6 passed (~7s), no flakes.typecheck(tsgo -b) clean; new spec type-clean undere2e/tsconfig.jsonbuild mode.Plan was Codex-consulted (E2E-first, seeding order, gap4 drop, gap6 scope) before implementation.
Refs #598
Summary by CodeRabbit
Tests