Skip to content

feat(deploy): Slice B1 — ProgressEmitter + SSE on hdb_deployment lifecycle#657

Merged
kriszyp merged 3 commits into
feat/deployment-tracking-slice-afrom
feat/deployment-tracking-slice-b1
May 22, 2026
Merged

feat(deploy): Slice B1 — ProgressEmitter + SSE on hdb_deployment lifecycle#657
kriszyp merged 3 commits into
feat/deployment-tracking-slice-afrom
feat/deployment-tracking-slice-b1

Conversation

@kriszyp
Copy link
Copy Markdown
Member

@kriszyp kriszyp commented May 21, 2026

Summary

Slice B1 of #641, stacked on #655 (Slice A). Brings the paused #531 SSE work back, but now wired through the DeploymentRecorder from Slice A — so every deploy lifecycle phase is captured both as a persistent event_log on the hdb_deployment row AND as a live SSE stream to any client requesting Accept: text/event-stream. The same content-negotiated branch serves get_deployment, letting Studio replay a deploy's history and tail in-flight events through a single endpoint.

Where to look

  • server/serverHelpers/progressEmitter.ts — ported from feat(deploy): live SSE progress for deploy_component #531 with the multi-line SSE spec fix, StringDecoder, disconnect cleanup, and the new prime write (: stream open\n\n). The prime is the result of an empirical investigation: without it, Fastify buffers the returned PassThrough until end-of-stream and only flushes the final chunk. The prime forces the response to start streaming immediately.
  • components/deploymentRecorder.ts — subscribes to a ProgressEmitter, coalesces writes (one in-flight put at a time + dirty flag), bounds event_log to 200 entries with head+tail retention so a noisy install doesn't purge the lifecycle spine. Emits a _recorder_finished sentinel on finish() before unsubscribing.
  • components/deploymentOperations.tshandleGetDeployment SSE branch subscribes to the live emitter before reading the row (closing the stitching gap between historical replay and live tail), dedupes by timestamp, and has a polling fallback so the SSE promise always resolves even on crash paths.
  • components/operations.jsdeployComponent emits phase events around prepare/load/replicate/restart/success, strips req.progress before replicateOperation (peers can't deserialize functions), skips recording on replicated executions.
  • server/serverHelpers/serverHandlers.js — adds the SSE_PROGRESS_OPERATIONS branch. Accept parsing uses .includes('text/event-stream') to play nice with RFC 7231 multi-type accepts.
  • bin/cliOperations.ts + bin/sseConsumer.ts + bin/deployRenderer.ts — CLI consumes SSE, renders phase/install/error events live.

Findings from cross-model review (gemini)

All BLOCKER/MAJOR findings addressed in this PR:

Severity Finding Resolution
BLOCKER event_log naive front-truncation loses the lifecycle spine Head+tail retention with a truncation marker
MAJOR Stitching gap between historical replay and live tail Subscribe-first, dedupe by timestamp
MAJOR SSE tail hangs if recorder finishes without an explicit success/error event _recorder_finished sentinel + 500ms polling fallback
MINOR Strict === Accept-header check fragile under multi-type accepts .includes('text/event-stream')

Test plan

Scope deliberately deferred

  • Install line forwardingnpm install stdout/stderr isn't yet forwarded into the emitter. That's Slice B2 alongside peer-side blob reads.
  • Streaming payload ingest — Slice A's interim memory buffer is still in place; the streaming variant arrives with B2.
  • Rollback + reclamation — Slice C.

Stacked on top of #655 (Slice A). Will retarget to main once #655 lands.

🤖 Generated with Claude Code

Wires the ProgressEmitter (resurrected from the paused #531) into the new
DeploymentRecorder so every deploy_component lifecycle phase is captured on
the row's event_log AND streamable live via SSE. Same content-negotiated
branch serves get_deployment, letting Studio (or any client) replay a
deploy's history and tail in-flight events through a single endpoint.

What's new
- DeploymentRecorder subscribes to a ProgressEmitter and coalesces writes:
  every emit appends to a bounded event_log (200 cap, head+tail retention so
  the lifecycle spine survives a noisy install); chained puts collapse a
  burst into one round trip. Emits a `_recorder_finished` sentinel on
  finish() so SSE tailers terminate cleanly even on crash paths.
- deployComponent emits prepare/load/replicate/restart/success phase events
  around their respective steps. Strips req.progress before replicateOperation
  so peers see a clean payload. Skips recording entirely on replicated
  (peer-side) executions — origin owns the canonical row.
- An in-memory activeEmitters Map keyed by deployment_id lets get_deployment
  SSE locate the live emitter and tail it.
- handlePostRequest gains a content-negotiated SSE branch (req.headers.accept
  includes text/event-stream + op in SSE_PROGRESS_OPERATIONS). Prime write
  on the PassThrough so Fastify starts piping immediately — empirically
  Fastify buffers a returned Readable until end-of-stream without it,
  collapsing all intermediate writes into a single flush.
- get_deployment with SSE subscribes to the live emitter BEFORE reading the
  row, then replays the historical event_log and dedupes by timestamp so no
  event is lost in the stitching gap. A polling fallback resolves the SSE
  promise even if the deploy disappears without signaling a terminal event.
- CLI sends Accept: text/event-stream for deploy_component; consumes the
  SSE response via parseSSE; renders phase/install/error events through
  DeployRenderer.
- httpRequest gains a streamResponse option that yields the raw IncomingMessage
  as a Readable instead of buffering — what the SSE consumer needs.

Ported from #531 (with the multi-line data spec fix, StringDecoder, and
disconnect cleanup already applied earlier in the session):
- server/serverHelpers/progressEmitter.ts (+ tests)
- bin/sseConsumer.ts (+ tests)
- bin/deployRenderer.ts (+ tests)

Integration coverage: integrationTests/deploy/deploy-tracking-events.test.ts
asserts event_log shape on success, SSE replay+done on get_deployment, and
the failure path emits an error event into the log.

Refs #641 (Slice B1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kriszyp kriszyp requested review from Ethan-Arrowood and heskew May 21, 2026 03:29
Comment on lines +136 to +163
if (liveEmitter && !TERMINAL_STATUSES.has(row.status)) {
await new Promise<void>((resolve) => {
resolveLive = resolve;
// Flush anything that arrived during replay, filtering to events the replay missed.
for (const buffered of liveBuffer) {
if (buffered.t > lastReplayedTs) forwardLive(buffered.event);
}
if (liveDone) resolve();
// Safety net — if the in-memory emitter is dropped (recorder finished or
// the process recycled) before signaling, poll the row's status as a
// fallback so the client never hangs indefinitely.
const pollTimer = setInterval(async () => {
if (liveDone) {
clearInterval(pollTimer);
return;
}
const live = getActiveEmitter(req.deployment_id);
if (!live || live !== liveEmitter) {
clearInterval(pollTimer);
const latest = await table.get(req.deployment_id);
if (latest && TERMINAL_STATUSES.has(latest.status) && !liveDone) {
liveDone = true;
resolve();
}
}
}, 500);
});
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for the live-tail branch.

The integration test suite (deploy-tracking-events.test.ts) only exercises the terminal/replay path — get_deployment SSE is called after the deploy has already completed, so liveEmitter is always undefined and this whole block is skipped.

The live-tail code is the most complex part of this PR: subscribe-first ordering, liveBuffer dedup by timestamp, the resolveLive closure, and the 500 ms poll fallback. Per the review layer: "if the PR introduces a runtime-shape branch, both legs need coverage — a test that only lands in the fallback gives false confidence on the production path."

A minimal integration test for this branch: start the SSE watcher before the deploy completes (e.g., start the deploy, immediately attach get_deployment with Accept: text/event-stream from a parallel request, and assert that live phase events appear in the stream before the done event). The poll fallback can be exercised by wrapping the emitter deletion and verifying the client still closes cleanly.

Comment thread components/operations.js
Comment on lines +502 to 507
emit('error', {
message: err?.message ?? String(err),
code: err?.statusCode ?? err?.code,
phase: recorder?.row.phase,
});
if (recorder) await recorder.finish('failed', err);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double error event on failed SSE deploy.

emit('error', ...) writes the error into the PassThrough stream. Then throw err causes createSSEResponseStream's .catch handler to write a second error event. The CLI renders both — the user sees the error line twice. The second event also overwrites sseError in the consumer loop, dropping the phase field from the first.

One fix: after emitting the error here, signal completion through the emitter (e.g., emit _recorder_finished) so createSSEResponseStream knows the operation's error is already accounted for, then return a sentinel instead of throwing. Alternatively, suppress the framework-level catch event when an application-level error event was already written.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 21, 2026

Reviewed; no blockers found.

@kriszyp kriszyp marked this pull request as ready for review May 21, 2026 03:58
@kriszyp kriszyp requested a review from a team as a code owner May 21, 2026 03:58
@kriszyp kriszyp removed the request for review from a team May 21, 2026 03:59
…ation test

When operations.js emits an `error` event through the ProgressEmitter before
throwing, createSSEResponseStream's .catch handler was writing a second error
SSE record — dropping the phase context from the first. Fix: track whether the
subscriber already forwarded an error event and skip the framework fallback if so.

Adds a unit test for the dedup behavior and an integration test covering the
live-tail SSE branch (liveEmitter && !TERMINAL_STATUSES) which was previously
unexercised — the new test opens get_deployment SSE against an in-flight deploy
using a sleep 3 install command.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kriszyp
Copy link
Copy Markdown
Member Author

kriszyp commented May 21, 2026

Both findings addressed in commit pushed just now:

Finding 1 (missing live-tail test): Added get_deployment SSE tails live events on an in-flight deploy in integrationTests/deploy/deploy-tracking-events.test.ts. The test starts a deploy with install_command: 'sleep 3', polls list_deployments until the in-flight row appears, then opens the SSE tail concurrently (via Promise.all) and asserts that phase events and a done event arrive.

Finding 2 (double error event): Added an errorEmitted flag in createSSEResponseStream. The subscriber sets it when it forwards an error event from the emitter; the .catch fallback skips writing a second SSE error if errorEmitted is already true. A new unit test (does not emit a second error event when the operation already emitted one before throwing) covers the exact scenario.

— Claude

@kriszyp kriszyp requested a review from dawsontoth May 21, 2026 04:07
…ires

Without a package.json Harper skips install entirely ("no package.json; skipping install"), so the deploy completes in <100ms and the polling loop
never catches an in-flight row. The other install-command tests in this file
already include package.json for the same reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@Ethan-Arrowood Ethan-Arrowood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is my favorite new feature. tests look good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants