feat(chat): Wave 1 polish — hide failed/closed sessions + grouped model picker#73
feat(chat): Wave 1 polish — hide failed/closed sessions + grouped model picker#73jmagar wants to merge 11 commits into
Conversation
Add `model-grouping` helper that splits codex model ids like `gpt-5.5/medium` or `GPT-5.5 (medium)` into `(base, effort)` tuples. When every option in the list parses cleanly, the picker collapses 20 rows into 5 base rows with inline low/medium/high/xhigh toggle pills; otherwise it falls back to the existing flat listbox so claude/gemini/cline pickers keep their current shape. Extract `nextNavIndex` + `useListKeyboard` to dedup the agent/model picker keyboard navigation. The pure helper is unit-tested; the hook supplies auto-reset-on-shrink so a provider switch that shrinks the option list doesn't leave activeIndex pointing past the end. Refs lab-de6yc.4
Adds a destructive `acp.session.bulk_close` action that closes every session
the caller owns matching a typed `BulkCloseSelector { states?, max_age_days?,
max_count? }`. Returns the canonical batch-result envelope
`{ closed: string[], failed: [{ id, kind, message }] }` documented in
docs/dev/ERRORS.md. Sessions owned by other principals are silently
omitted to preserve the not_found masking pattern of close_session.
Implementation notes:
- `BulkCloseSelector::validate_non_empty` blocks the delete-all shortcut.
- `max_count` (default 500) caps the fan-out — selector matches above the
cap return invalid_param without touching state.
- Per-session closes run concurrently bounded by a 5-permit semaphore.
- Dispatcher arm explicitly calls `require_confirm`, with a regression
test that proves the gate fires even if a surface bypasses destructive
elicitation.
- New `ToolError::user_message()` accessor exposes the inner message text
so `BulkCloseFailure.message` doesn't double-encode the JSON envelope.
Tests: empty-selector rejection, missing-confirm rejection, and
cross-principal isolation against fake sessions.
Refs lab-de6yc.1
Sessions in state failed, closed, or cancelled are filtered out of the sidebar by default. The provider keeps the unfiltered list in `runs` and exposes a derived `visibleRuns` with a `hiddenRunCount` and an `includeHiddenRuns` toggle through the actions context. When the hidden count is non-zero, the sidebar shows a "Show N inactive" toggle plus a "Clean up" button. Cleanup dispatches `session.bulk_close` through the existing destructive-action gate — the API auto-injects both the principal from the auth context and the `confirm: true` flag, so the client only sends the selector body. Tests cover the pure filter helper. The dispatch action itself ships with backend tests for selector validation, the confirm gate, and per-principal isolation (preceding commit). Refs lab-de6yc.1
- Hidden states are [failed, closed] only — cancelled stays visible per bead. - Default cleanup selector adds max_age_days: 7 (don't sweep recent failures). - Confirm button uses the bead-canonical 'Delete N Sessions' copy. - Toggle chip uses 'closed/failed' instead of generic 'inactive'. Refs lab-de6yc.1
📝 WalkthroughWalkthroughThis PR adds hidden run management to the chat system with keyboard-driven model selection improvements. The frontend introduces visibility filtering for failed/closed sessions, UI controls to toggle and bulk-clean them, and a refactored model picker using grouped rendering and keyboard navigation. The backend implements a concurrent bulk session closure action with partial-success reporting and principal-scoped authorization. ChangesHidden Run Management Feature
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/gateway-admin/lib/chat/use-list-keyboard.test.ts (1)
6-31: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd hook-level coverage for index reset on list shrink.
These tests validate
nextNavIndex, but they do not coveruseListKeyboard’s reset behavior whencountshrinks past the active index (Line 49-51 inapps/gateway-admin/lib/chat/use-list-keyboard.ts). Please add a test for that path to protect picker keyboard state from regressions.As per coding guidelines,
**/*.test.{js,ts,jsx,tsx}: Add unit tests for critical business logic.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/gateway-admin/lib/chat/use-list-keyboard.test.ts` around lines 6 - 31, Add a unit test in use-list-keyboard.test.ts that mounts the useListKeyboard hook (useListKeyboard) with an initial count and active index, then updates the count to a smaller value that is less than the current active index and assert the hook resets the active index (the same reset path around lines 49-51 in use-list-keyboard.ts); use the existing test style in the file (e.g., renderHook or equivalent) to simulate updating props and verify the active index becomes null (or the expected reset value) to cover the shrink/reset branch.apps/gateway-admin/components/chat/chat-input.tsx (1)
260-267:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMove focus into grouped picker on open.
Line 266 focuses
modelOptionRefs, but those refs only exist in flat mode. In grouped mode, focus stays on the trigger, so keyboard navigation does not immediately enter the toggle controls.🔧 Proposed fix
React.useEffect(() => { if (!modelPickerOpen) return const selectedIndex = Math.max(modelOptions.findIndex((model) => model.id === selectedModel?.id), 0) setActiveModelIndex(selectedIndex) const frame = window.requestAnimationFrame(() => { - modelOptionRefs.current[selectedIndex]?.focus() + if (groupedModels.kind === 'grouped') { + const pickerEl = document.getElementById(modelPickerId) + const selectedGrouped = pickerEl?.querySelector<HTMLButtonElement>('[data-state="on"]') + const firstGrouped = pickerEl?.querySelector<HTMLButtonElement>('button') + ;(selectedGrouped ?? firstGrouped)?.focus() + return + } + modelOptionRefs.current[selectedIndex]?.focus() }) @@ - }, [modelPickerOpen, modelOptions, selectedModel?.id]) + }, [modelPickerOpen, modelOptions, selectedModel?.id, groupedModels.kind, modelPickerId])Also applies to: 529-536, 562-612
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/gateway-admin/components/chat/chat-input.tsx` around lines 260 - 267, When opening the model picker (useEffect watching modelPickerOpen) the code always focuses modelOptionRefs which only exist in flat mode; change the effect to detect whether the picker is in grouped mode (the component's grouped mode flag, e.g., isGroupedView or similar) and conditionally focus the appropriate ref collection: keep setActiveModelIndex(selectedIndex) but call modelOptionRefs.current[selectedIndex]?.focus() for flat mode and call the grouped refs (e.g., groupedModelOptionRefs.current[selectedIndex] or the group's toggle control ref) for grouped mode so keyboard focus moves into the grouped picker; apply the same conditional focus fix in the other useEffect sites mentioned (around the blocks referenced at lines ~529-536 and ~562-612).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/gateway-admin/lib/chat/chat-session-provider.tsx`:
- Around line 682-687: In bulkCloseHiddenSessions, don't assume
readJsonSafe(response) returns a correctly shaped payload; validate that payload
is a non-null object with closed and failed arrays before computing
closedCount/failedCount. Change the logic in the section where payload is read
(the const payload = ... and closedCount/failedCount computation) to check
Array.isArray(payload.closed) and Array.isArray(payload.failed) (or treat
missing fields as an error), and if the shape is invalid log/report the
response/body and surface an error or treat the request as failed instead of
silently using zero counts; keep the subsequent refreshSessions() behavior but
ensure monitoring/metrics get an explicit failure path when the payload shape is
wrong.
In `@crates/lab/src/acp/registry.rs`:
- Around line 1077-1095: The loop over handles currently only handles Ok((id,
outcome)) and silently ignores Err(JoinError); update the await handling on
handles in the bulk-close aggregation to explicitly handle Err(join_err) by 1)
logging the join error (e.g., via tracing::error or your crate logger) and 2)
pushing a BulkCloseFailure into failed with a clear kind like "join_error" and
the join_err.to_string() as message (use a sensible id placeholder such as
"unknown" or extract an id if the join handle wrapper exposes it), keeping the
existing behavior for Ok((id, outcome)) and preserving
error.kind()/user_message() for outcome Err branches.
---
Outside diff comments:
In `@apps/gateway-admin/components/chat/chat-input.tsx`:
- Around line 260-267: When opening the model picker (useEffect watching
modelPickerOpen) the code always focuses modelOptionRefs which only exist in
flat mode; change the effect to detect whether the picker is in grouped mode
(the component's grouped mode flag, e.g., isGroupedView or similar) and
conditionally focus the appropriate ref collection: keep
setActiveModelIndex(selectedIndex) but call
modelOptionRefs.current[selectedIndex]?.focus() for flat mode and call the
grouped refs (e.g., groupedModelOptionRefs.current[selectedIndex] or the group's
toggle control ref) for grouped mode so keyboard focus moves into the grouped
picker; apply the same conditional focus fix in the other useEffect sites
mentioned (around the blocks referenced at lines ~529-536 and ~562-612).
In `@apps/gateway-admin/lib/chat/use-list-keyboard.test.ts`:
- Around line 6-31: Add a unit test in use-list-keyboard.test.ts that mounts the
useListKeyboard hook (useListKeyboard) with an initial count and active index,
then updates the count to a smaller value that is less than the current active
index and assert the hook resets the active index (the same reset path around
lines 49-51 in use-list-keyboard.ts); use the existing test style in the file
(e.g., renderHook or equivalent) to simulate updating props and verify the
active index becomes null (or the expected reset value) to cover the
shrink/reset branch.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 718c493d-0045-4fc9-b87e-314874914129
📒 Files selected for processing (17)
apps/gateway-admin/components/chat/chat-input.tsxapps/gateway-admin/components/chat/chat-shell.tsxapps/gateway-admin/components/chat/session-sidebar.tsxapps/gateway-admin/components/floating-chat-shell.tsxapps/gateway-admin/lib/chat/chat-session-provider.tsxapps/gateway-admin/lib/chat/model-grouping.test.tsapps/gateway-admin/lib/chat/model-grouping.tsapps/gateway-admin/lib/chat/session-filters.test.tsapps/gateway-admin/lib/chat/session-filters.tsapps/gateway-admin/lib/chat/use-list-keyboard.test.tsapps/gateway-admin/lib/chat/use-list-keyboard.tscrates/lab/src/acp/registry.rscrates/lab/src/dispatch/acp/catalog.rscrates/lab/src/dispatch/acp/dispatch.rscrates/lab/src/dispatch/acp/params.rscrates/lab/src/dispatch/error.rsdocs/dev/ERRORS.md
- Validate bulk-close response shape in `bulkCloseHiddenSessions` so malformed payloads surface as an error instead of silent zero counts. - Handle `JoinError` explicitly in `bulk_close_sessions` aggregation — record as `internal_error` failures and log via `tracing::error!` instead of dropping the result. - Move focus into the grouped model picker on open by querying the selected toggle (or first toggle) from `modelPickerId` — flat-mode refs don't exist in grouped mode. - Extract shrink-reset predicate `shouldResetActiveIndex` from `useListKeyboard` and cover it directly. Avoids pulling react-test-renderer/JSDOM into the bare-node test suite while still protecting `use-list-keyboard.ts:50` from regressions.
Replace 'New session', 'action route session', and empty titles with a derived 'provider · shortModel · timeAgo' label so the sidebar isn't flooded with indistinguishable placeholder rows. Live titles are passed through unchanged. Refs lab-de6yc.2
Add `dominantModelId` helper (strict majority across visible runs), expose via the chat session provider, and suppress the per-row model badge when a run shares the dominant model. Outliers still render the badge. Screen readers get the model in `aria-label` regardless. Refs lab-de6yc.3
Layer a regex-based pass over per-token redact_stdio_value to strip common identifying patterns (IPv4 with optional port, JWT-shaped tokens, absolute paths under /home, /Users, /root) before forwarding provider stderr to clients. Promotes regex from optional 'deploy'-gated dep to a core dep on the lab crate. Refs lab-de6yc.5 (5.1 server slice)
…rompt) Collapse the two-call `session.start` + `session.prompt` materialize path into a single atomic action. Closes the orphan-session window: when the prompt step fails, the just-created session is closed before the error returns so callers never see an orphan row in the sidebar. Response includes the SSE stream ticket so clients can subscribe to the stream without a second action call. Also drops a noisy unused-qualification warning in `sanitize_provider_error`. Refs lab-de6yc.6
- Add `lastDispatchError` transient state on the chat session provider
with an effect that clears it when the user switches providers. The
prior failure path stomped synthetic `{ ready: false }` into
`providerHealth`, making the provider look permanently down even
when only one dispatch had failed.
- Wire `lastDispatchError` through chat-shell and floating-chat-shell
banner surfaces. Provider-down still gates the input; transient
dispatch errors only surface a banner.
- Swap the two failure paths in createSession + sendPrompt from
synthetic `setProviderHealth` to `setLastDispatchError`.
- When the user sends a prompt without a selected run, issue one atomic
`session.start_and_prompt` action instead of `POST /sessions` then
`POST /sessions/<id>/prompt`. Closes the orphan-row race.
Refs lab-de6yc.5, lab-de6yc.6
Summary
Wave 1 of the chat-page polish sweep plan. Lands the two self-contained tasks that don't depend on the chat-send state-machine refactor:
lab-de6yc.1) — hide failed/closed sessions by default +session.bulk_closedispatch action with typed selector + sidebar cleanup trigger.lab-de6yc.4) — group the Codex model picker by base + effort (5 base rows with low/medium/high/xhigh toggle pills instead of 20 flat rows).Tasks 2, 3, 5, and 6 (auto-name placeholders, dominant-model badge suppression, draft-state refactor with AbortController, and the
session.start_and_promptorchestrator) are deferred to follow-up PRs — they share a hot path (chat-session-provider.tsx) where landing them piecemeal in one PR invites correctness regressions.Backend (Task 1)
BulkCloseSelector { states?, max_age_days?, max_count? }indispatch/acp/params.rs.validate_non_empty()blocks the delete-all shortcut;max_countdefaults to 500.session.bulk_closeaction in the catalog + dispatcher arm. Marked destructive; dispatcher itself callsrequire_confirm(regression test guards against gate drift if a surface bypasses elicitation).AcpSessionRegistry::bulk_close_sessionssnapshots candidate ids without holding the lock across.await, filters byprincipalper-session so other principals' sessions are silently skipped (preserves thenot_foundmasking pattern), then closes concurrently bounded by a 5-permit semaphore.ToolError::user_message()accessor lets the per-item failure shape carry the plain message instead of the full JSON envelope.{ closed: string[], failed: [{id, kind, message}] }) documented as canonical indocs/dev/ERRORS.mdso future bulk actions adopt the same pattern.Frontend (Task 1 + Task 4)
lib/chat/session-filters.ts— pure filter helper. Provider derivesvisibleRunsfromrunsand exposeshiddenRunCount+ anincludeHiddenRunstoggle through the actions context.ConfirmDialogprimitive; the API auto-injectsprincipalandconfirm: true, so the client only sends the selector. Default sweep selector:{ states: [failed, closed], max_age_days: 7, max_count: 500 }.lib/chat/model-grouping.ts— parses ids like `gpt-5.5/medium` or `GPT-5.5 (medium)` into `(base, effort)` tuples. Returns `flat` when any option fails to parse, so claude/gemini/cline pickers keep their existing shape.lib/chat/use-list-keyboard.ts— `nextNavIndex` reducer + `useListKeyboard` hook. Replaces the duplicated agent/model picker keyboard handlers; the hook also auto-resets the active index when the option list shrinks (e.g., on provider switch).Verification
Test plan
Refs
Summary by cubic
Hide failed/closed chat sessions by default with a safe bulk clean‑up, and group the Codex model picker by base with effort pills to reduce noise and improve keyboard navigation. Add atomic
session.start_and_prompt, auto‑derived labels for placeholder sessions, redundant-model badge suppression, and provider stderr sanitization. Addresses lab-de6yc.1, lab-de6yc.2, lab-de6yc.3, lab-de6yc.4, lab-de6yc.5, and lab-de6yc.6.New Features
{ states: [failed, closed], max_age_days: 7, max_count: 500 }.session.bulk_closewith typedBulkCloseSelector; per‑principal, bounded concurrency, returns{ closed: string[], failed: [{ id, kind, message }] }, confirm‑gated.session.start_and_promptatomically creates a session and queues the first prompt; returns an SSE ticket; used when sending without a selected session to avoid orphan rows.useListKeyboard/nextNavIndex; focus moves to the selected (or first) toggle on open.dominantModelIdand hide per‑row model badges when a run matches the dominant model.Bug Fixes
session.bulk_closeresponse shape; server recordsJoinErrorasinternal_errorinfailed[]and logs it; sanitize provider stderr to strip IPs/JWTs/home paths.shouldResetActiveIndex; improved focus behavior in the grouped picker.lastDispatchErrorto surface transient dispatch failures without marking the provider unhealthy; clears on provider change.Written for commit 2b9c52e. Summary will update on new commits. Review in cubic
Summary by CodeRabbit
Release Notes