perf(tui): progressive snapshot — split gh out of the critical path #241

Merged
Roberdan merged 1 commit into main from perf/tui-progressive-snapshot
May 6, 2026
Conversation

Owner

@Roberdan Roberdan commented May 6, 2026

Problem

After #240 the dashboard's first frame was instant, but every fresh
launch still spent ~3.4-3.8s before showing real data. The
in-memory PR cache resets on exit so every cold start paid the full
gh pr list cost. Tracing the shell-out exposed the actual killer:

gh pr list --state all --limit 50 --json …,statusCheckRollup,body  4.0s ← killer
gh pr list --state all --limit 50 --json …,body                    3.0s
gh pr list --state open --limit 30 --json …,statusCheckRollup      0.5s

statusCheckRollup against --state all triggers a per-PR API
round-trip on the GitHub side; on a busy repo the closed tail
dwarfs the open set we actually care about for live CI status.

The HTTP fan-out is fast (~25ms for 50 plans on loopback), so gh is
the entire critical path. As long as Client::snapshot() waits for
gh before yielding, every refresh blocks behind the slowest fetch.

Why

A dashboard that takes 3+ seconds to populate on every launch trains
operators to avoid it. Constitution P3 wants the TUI usable on a
basic terminal, and usability includes time-to-data, not just
time-to-first-paint.

What changed

  1. Progressive snapshot via SnapshotEvent. The event loop now
    consumes an enum: Core (HTTP fan-out, no gh), then Prs (open
    list), then Prs (open + closed combined). The user sees plans
    / tasks / agents within ~30-100ms while gh is still in flight.

  2. Split gh by PR state. fetch_prs_open keeps the rich field
    set (incl. statusCheckRollup for live CI on the active queue);
    fetch_prs_closed drops it because CI on a closed PR is already
    final. Open/closed split + field split collapses the gh
    critical path from ~4s to ~0.5s for the open list.

  3. PR cache split. Two memo slots (open_prs, closed_prs)
    instead of one, each at the same 30s TTL. The cache lives in a
    new client_pr_cache module so client.rs stays under the
    300-line cap.

  4. apply_snapshot no longer overwrites PRs with empty.
    snapshot_core returns prs: vec![]; without this guard the
    PR pane would flicker to empty between the Core arrival and the
    first PR payload. Adds apply_prs for the PR-only update path.
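Changes 1 and 4 can be sketched together. This is a minimal illustration, not the real implementation: the names `SnapshotEvent`, `apply_snapshot`, and `apply_prs` come from the PR, but the struct fields are stand-in `String` vectors rather than the actual plan/task/PR types.

```rust
// Sketch of the progressive-snapshot event flow described above.
// Field sets are simplified placeholders, not the real structs.

#[derive(Debug, Clone)]
struct Snapshot {
    plans: Vec<String>,
    prs: Vec<String>, // snapshot_core always leaves this empty
}

enum SnapshotEvent {
    Core(Snapshot),   // HTTP fan-out only, no gh
    Prs(Vec<String>), // open list first, then open + closed
}

#[derive(Default)]
struct AppState {
    plans: Vec<String>,
    prs: Vec<String>,
}

impl AppState {
    /// Core payloads carry `prs: vec![]`; keep the cached PR list
    /// instead of flickering the pane to empty.
    fn apply_snapshot(&mut self, snap: Snapshot) {
        self.plans = snap.plans;
        if !snap.prs.is_empty() {
            self.prs = snap.prs;
        }
    }

    /// PR-only update path for the trailing gh fetches.
    fn apply_prs(&mut self, prs: Vec<String>) {
        self.prs = prs;
    }

    fn handle(&mut self, ev: SnapshotEvent) {
        match ev {
            SnapshotEvent::Core(snap) => self.apply_snapshot(snap),
            SnapshotEvent::Prs(prs) => self.apply_prs(prs),
        }
    }
}

fn main() {
    let mut state = AppState::default();
    // First refresh: the PR payload eventually arrives.
    state.handle(SnapshotEvent::Prs(vec!["pr-1".into()]));
    // Next refresh: Core lands first with an empty PR vec...
    state.handle(SnapshotEvent::Core(Snapshot {
        plans: vec!["plan-a".into()],
        prs: vec![],
    }));
    // ...and the previously fetched PRs survive the transition.
    assert_eq!(state.prs, vec!["pr-1".to_string()]);
    assert_eq!(state.plans, vec!["plan-a".to_string()]);
    println!("PR list preserved across Core/Prs transition");
}
```

The guard in `apply_snapshot` is what makes the ordering safe: the event loop can apply `Core` and `Prs` in whatever order they arrive without the PR pane ever regressing to empty.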

Validation

Measured against the live daemon (50 plans, 351 tasks, gh enabled,
fresh process per run):

| Stage              | After #240 | After this PR |
|--------------------|------------|---------------|
| First frame        | ~50 ms     | ~30 ms        |
| Core data visible  | n/a        | ~30 ms        |
| `connected` (cold) | 3.8 s      | 0.45 s        |
| `connected` (warm) | 3.4 s      | 0.12 s        |

Local pipeline:

  • cargo fmt --all -- --check — clean
  • RUSTFLAGS="-Dwarnings" cargo clippy --workspace --all-targets -- -D warnings — clean
  • RUSTFLAGS="-Dwarnings" cargo test --workspace — 1043 passed, 0 failed
  • pre-commit: file-size, fmt, clippy, context-budget all green

Impact

  • convergio-tui only.
  • New public surface: SnapshotEvent, AppState::apply_prs,
    Client::{snapshot_core, fetch_prs_open_cached, fetch_prs_closed_cached, gh_enabled}. Client::snapshot() and
    AppState::refresh() are preserved (used by tests + the bench
    example).
  • Closed PRs still appear in the dashboard after their fetch
    returns (1-3s after open PRs); statusCheckRollup is no longer
    shown for them — the PRs pane renders ? for the CI glyph on
    closed/merged rows. Acceptable tradeoff for a 5-8× speedup on
    the open list.
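The two-memo-slot cache from change 3 can be sketched as follows. This is a minimal, assumption-laden illustration: the real `client_pr_cache` module holds typed PR records rather than strings, and the `MemoSlot`/`PrCache` names here are hypothetical.

```rust
use std::time::{Duration, Instant};

// Sketch of the split PR memo: two slots (open_prs, closed_prs)
// expiring independently on the same 30s TTL.

struct MemoSlot<T> {
    value: Option<(Instant, T)>,
    ttl: Duration,
}

impl<T: Clone> MemoSlot<T> {
    fn new(ttl: Duration) -> Self {
        Self { value: None, ttl }
    }

    /// Return the cached value if it is still fresh; otherwise run
    /// `fetch`, store the result with a timestamp, and return it.
    fn get_or_fetch(&mut self, fetch: impl FnOnce() -> T) -> T {
        if let Some((at, v)) = &self.value {
            if at.elapsed() < self.ttl {
                return v.clone();
            }
        }
        let v = fetch();
        self.value = Some((Instant::now(), v.clone()));
        v
    }
}

struct PrCache {
    open_prs: MemoSlot<Vec<String>>,
    closed_prs: MemoSlot<Vec<String>>,
}

fn main() {
    let ttl = Duration::from_secs(30);
    let mut cache = PrCache {
        open_prs: MemoSlot::new(ttl),
        closed_prs: MemoSlot::new(ttl),
    };
    // Cold start: both slots run their (stand-in) fetch closures.
    let open = cache.open_prs.get_or_fetch(|| vec!["pr-7".into()]);
    let closed = cache.closed_prs.get_or_fetch(|| vec!["pr-3".into()]);
    // Warm: within the TTL the slot answers from memory, so the
    // fetch closure is never invoked.
    let open_again = cache.open_prs.get_or_fetch(|| unreachable!());
    assert_eq!(open, open_again);
    assert_eq!(closed, vec!["pr-3".to_string()]);
    println!("warm hit served from the open_prs slot");
}
```

Keeping the slots separate is what lets the fast open fetch and the slow closed fetch refresh on their own schedules instead of invalidating each other.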

🤖 Generated with Claude Code


@Roberdan Roberdan merged commit 0ae1390 into main May 6, 2026
3 checks passed
@Roberdan Roberdan deleted the perf/tui-progressive-snapshot branch May 6, 2026 15:35