test(app): add W1 renderer contract smoke specs (#598) by Astro-Han · Pull Request #1125 · Astro-Han/pawwork

Astro-Han · 2026-06-03T08:43:22Z

Summary

Strengthen W1 regression coverage (issue #598) so CI gates more of the session-view rewrite before the manual handtest. Adds one new spec file with two @smoke E2E tests in packages/app/e2e/session/session-w1-contracts.spec.ts.

Test 1 — rendered turn locks contracts. Seeds one turn (assistant.tool("bash") + assistant.reason(..., { text })) so groupParts yields a multi-part tool + reasoning trow with a <details> chevron, then asserts:

chevron geometry (12px + 12px icon) and collapsed-right (rotate(-90deg)) / open-down (rotate(0deg)) orientation (DESIGN.md L412/L468);
user-select: text on user-message-text, agent prose (text-part), and agent reasoning (reasoning-body);
trow-result-body resolves the --font-family-mono / --font-size-mono-small / --fg-weak tokens with no sans leakage (asserted against resolved tokens via probe nodes, so it tracks the token contract, not literal values);
the raw tool output is a bounded [data-scrollable] container (max-height: 240px, overflow-y: auto) — the DOM precondition the nested-wheel isolation keys off;
the thinking indicator is gone once the turn has visible parts.

Test 2 — thinking indicator positive case. A hung reply (assistant.hang(), submitted by hand to avoid project.prompt()'s idle wait) keeps the turn working with nothing visible; asserts [data-slot="session-turn-thinking"] + its shimmer are shown, then drains the LLM queue so teardown stays clean.

Scope decisions vs #598

Scroll behaviours A.1/A.2 (reading_history across content/dock resize; weak-trackpad demote) and the full nested-wheel isolation A.3 stay covered by the scroll-controller / scroll-intents unit tests plus the existing weak-wheel E2E in session-renderer-diagnostics.spec.ts. This spec only locks the A.3 DOM precondition, not a full wheel-isolation E2E (high flakiness, marginal added protection over the unit coverage).
Scope B.1 (user-bubble hairline) is dropped. The bubble moved to a fill-only treatment (--surface-interactive-base light / --surface-raised dark, 14px radius, no border or inset box-shadow), so the W1 --border-weak/--border-weaker hairline contract is stale — there is nothing to assert against.
Scope C (checklist segmentation) and D (tool-failure command guidance) target docs/, which is untracked in this checkout and cannot ship in a PR; left out of code scope.

Verification

bun script/e2e-local.ts -- e2e/session/session-w1-contracts.spec.ts --repeat-each=3 → 6 passed (~7s), no flakes.
App typecheck (tsgo -b) clean; new spec type-clean under e2e/tsconfig.json build mode.
Visual surface: the assertions read computed styles from the real rendered session timeline (chevron, trow body, user/agent bubbles) driven through the LLM fixture — the changed surface is exercised by the E2E itself.

Plan was Codex-consulted (E2E-first, seeding order, gap4 drop, gap6 scope) before implementation.

Refs #598

Summary by CodeRabbit

Tests

Added E2E session contract tests for turn rendering and display behavior.
Tests validate UI interactions (expand/collapse), design token compliance, text selectability, scrollable containers, and thinking indicator visibility during processing.

@smoke

Strengthen W1 regression coverage so CI gates more of the session-view rewrite before the manual handtest. Adds two @smoke specs: - Rendered turn locks the trow chevron geometry (12px) and collapsed-right / open-down orientation, user-message / agent-prose / agent-reasoning text selectability, the trow-result-body mono-small + fg-weak typography tokens (asserted against resolved tokens, no sans leakage), and the bounded [data-scrollable] raw-output container that the nested-wheel isolation keys off. Seeds one tool+reasoning turn so groupParts yields a multi-part trow. - Thinking indicator shows while the turn is working with nothing visible, and is gone once any part renders. Scope notes vs issue #598: - Scroll behaviours A.1/A.2 and the full nested-wheel isolation A.3 stay unit-covered (scroll-controller + scroll-intents tests) plus the existing weak-wheel E2E; only the DOM precondition is locked here. - Scope B.1 user-bubble hairline is dropped: the bubble moved to a fill-only treatment (surface-interactive-base / surface-raised, no border/box-shadow), so the W1 hairline contract is stale. Verified locally: bun script/e2e-local.ts -- e2e/session/session-w1-contracts.spec.ts --repeat-each=3 (6 passed); app typecheck clean. Refs #598

coderabbitai · 2026-06-03T08:43:35Z

Warning

Review limit reached

@Astro-Han, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 55 minutes and 43 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 057a577e-a265-42cc-a03b-52993476b5d2

📥 Commits

Reviewing files that changed from the base of the PR and between e6f1695 and e83735c.

📒 Files selected for processing (2)

packages/app/e2e/session/session-w1-contracts.spec.ts
packages/opencode/test/config/e2e-smoke-tagging.test.ts

📝 Walkthrough

Walkthrough

This PR adds a new Playwright E2E test spec (session-w1-contracts.spec.ts) that defines smoke tests for W1 rendered-turn UI contracts. It covers trow chevron rotation, design tokens, text selectability, nested scrollable containers, and thinking indicator visibility during async work states.

Changes

W1 Session Contract Tests

Layer / File(s)	Summary
Test selectors and helpers `packages/app/e2e/session/session-w1-contracts.spec.ts` (lines 1–34)	Defines CSS/data attribute selectors targeting trow blocks, chevron icon, reasoning body, raw tool output, and user/agent text. Provides utilities to extract chevron rotation from CSS `transform` and to detect `<details>` open state.
W1 rendered-turn and thinking indicator contracts `packages/app/e2e/session/session-w1-contracts.spec.ts` (lines 40–199)	Two smoke tests: (1) locks trow chevron sizing, rotation behavior across collapsed/expanded states, verifies text selectability, validates mono-small + fg-weak design tokens via computed-style probing, asserts raw tool output scroll bounds and overflow behavior, and confirms thinking indicator is removed after visible parts mount; (2) ensures thinking indicator with shimmer appears while turn is hung with no visible content, using manual submission and polling for async queue drain.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

[Task] chore(e2e): strengthen W1 regression coverage #598: Directly related—implements the same W1 behavior coverage for trow chevron sizing/rotation, thinking indicator, text selectability, design token assertions, and nested scrolling to strengthen W1 regression test suite.

Poem

A rabbit hops through test assertions bright,
Chevrons spin and details open just right,
Design tokens dance in mono-small delight,
Thinking shimmer glows when work's in flight,
🐰 ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'test(app): add W1 renderer contract smoke specs (`#598`)' clearly summarizes the main change: adding E2E smoke tests for W1 renderer contracts, following conventional commits format.
Description check	✅ Passed	The PR description covers most required sections: Summary (detailed change description), Why (issue reference and regression coverage motivation), Related Issue (references `#598`), verification steps with results, and risk/scope rationale. However, the description lacks explicit Human Review Status and Review Focus sections from the template.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch claude/fix-598-w1-e2e

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request introduces a new Playwright E2E test suite (session-w1-contracts.spec.ts) to verify W1 renderer contracts, such as chevron behavior, text selectability, typography, and scrollable raw tool output. Feedback on the changes suggests using expect.poll when retrieving the count of inner triggers to prevent potential test flakiness, as Playwright's count() does not auto-wait for elements to be rendered.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

github-actions

Suggested priority: P3 (only low-risk paths changed (packages/app/e2e/session/session-w1-contracts.spec.ts)).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.

The thinking-indicator spec submits with a raw Enter, bypassing project.prompt(), so the session the UI creates was never registered. With an assistant.hang() reply, teardown dropped only the project dir and could leave a non-idle session/stream alive for the rest of the Playwright worker, leaking backend state into later smoke tests. Capture the session id from the URL once the turn is working and project.trackSession() it so cleanup tears the stream down. Addresses Codex review [P2] on #1125.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

packages/app/e2e/session/session-w1-contracts.spec.ts (1)
186-187: 💤 Low value

Redundant clear before fill. prompt.fill(text) already replaces existing content, so the preceding prompt.fill("") is unnecessary unless it guards a specific composer quirk.
♻️ Optional simplification
   await prompt.click()
-  await prompt.fill("")
   await prompt.fill(text)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/app/e2e/session/session-w1-contracts.spec.ts` around lines 186 -
187, The test contains a redundant clear call: remove the unnecessary
prompt.fill("") before prompt.fill(text) because prompt.fill(text) replaces
existing content; update the code by deleting the prompt.fill("") invocation and
keep the single prompt.fill(text) call (references: prompt.fill("") and
prompt.fill(text) in the spec).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/app/e2e/session/session-w1-contracts.spec.ts`:
- Around line 182-199: The manual Enter-submit flow in the W1 thinking indicator
test creates a session but never registers it for fixture cleanup, so the
session is left behind after teardown. After the submit in the session spec,
once the URL contains a session id, call project.trackSession with
sessionIDFromUrl(page.url()) and project.directory so the new session is added
to the cleanup set; use the existing manual-submit tracking pattern from other
tests and keep the change near the prompt.click()/fill()/Enter sequence.

---

Nitpick comments:
In `@packages/app/e2e/session/session-w1-contracts.spec.ts`:
- Around line 186-187: The test contains a redundant clear call: remove the
unnecessary prompt.fill("") before prompt.fill(text) because prompt.fill(text)
replaces existing content; update the code by deleting the prompt.fill("")
invocation and keep the single prompt.fill(text) call (references:
prompt.fill("") and prompt.fill(text) in the spec).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: e909f023-39f5-4b73-a2d6-f46f541555b3

📥 Commits

Reviewing files that changed from the base of the PR and between 2a54f92 and e6f1695.

📒 Files selected for processing (1)

packages/app/e2e/session/session-w1-contracts.spec.ts

) The e2e-smoke-tagging guard pins the exact set of @smoke-tagged app E2E tests; the two new session-w1-contracts specs must be listed or the guard fails (seen on unit-windows-opencode-config-project). Add both in sorted order. Verified: bun test packages/opencode/test/config/e2e-smoke-tagging.test.ts (2 pass).

count() does not auto-wait, so reading it immediately after opening the outer <details> could see 0/1 before the inner rows mount. Poll until the count settles above 1 before clicking the rows. Addresses gemini-code-assist review on #1125.

gemini-code-assist Bot reviewed Jun 3, 2026

View reviewed changes

Comment thread packages/app/e2e/session/session-w1-contracts.spec.ts Outdated

github-actions Bot removed the ui Design system and user interface label Jun 3, 2026

github-actions Bot reviewed Jun 3, 2026

View reviewed changes

Astro-Han added the ui Design system and user interface label Jun 3, 2026

coderabbitai Bot reviewed Jun 3, 2026

View reviewed changes

Comment thread packages/app/e2e/session/session-w1-contracts.spec.ts

github-actions Bot removed the ui Design system and user interface label Jun 3, 2026

github-actions Bot added the harness Model harness, prompts, tool descriptions, and session mechanics label Jun 3, 2026

Astro-Han merged commit d7928f8 into dev Jun 3, 2026
33 checks passed

Astro-Han deleted the claude/fix-598-w1-e2e branch June 3, 2026 10:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(app): add W1 renderer contract smoke specs (#598)#1125

test(app): add W1 renderer contract smoke specs (#598)#1125
Astro-Han merged 4 commits into
devfrom
claude/fix-598-w1-e2e

Astro-Han commented Jun 3, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 3, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Astro-Han commented Jun 3, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Scope decisions vs #598

Verification

Summary by CodeRabbit

Tests

Uh oh!

coderabbitai Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Astro-Han commented Jun 3, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 3, 2026 •

edited

Loading