Skip to content

Deployment tracking — Slice B2: peer-side blob reads + install-line streaming #758

@kriszyp

Description

@kriszyp

Slice B2 of the deployment-tracking redesign in #641. Builds on Slice A (#655, merged) and Slice B1 (#657, merged).

Context

Today multi-node deploy_component is broken on main:

  1. Origin's multipart handler writes the upload into hdb_deployment[id].payload_blob and sets req._deploymentId — good.
  2. Origin's deployComponent runs extract → install → load locally from the persisted blob.
  3. Origin calls server.replication.replicateOperation(req) — but at this point req.payload is an already-consumed Readable from busboy. Peers receive a useless stream.
  4. Peer-side deployComponent sees req._deploymentId is set (isReplicatedExecution = true), so it skips recording — but then tries to extract from req.payload, which is gone.

The Slice B2 plan (per #641) is to deliver the payload to peers via the already-replicated hdb_deployment row, using Harper's existing BLOB_CHUNK streamed-blob mechanism. No staging file, no direct-HTTPS relay — both jobs collapse into the table-replication path.

Scope

1. Origin-side cleanup

  • Strip req.payload (and req._packageStream, req._uploadSizeEstimate, etc.) before replicateOperation(req) so peers see a clean operation: {operation: 'deploy_component', _deploymentId, project, package?, restart, install_command?, install_timeout?, install_allow_scripts?} and nothing else.

2. Peer-side deployComponent

  • When req._deploymentId is set and req.payload is absent: look up hdb_deployment[deployment_id] (await the row arriving via table replication if not yet present), then read the payload from row.payload_blob.stream(). Harper's existing Blob.stream() API already blocks on incomplete writes; we just await it.
  • Wrap the await in the existing deployments.peerReceiveIdleTimeoutMs knob (defaults to the 60s BLOB_CHUNK idle timeout already in replicationConnection.ts); on timeout, peer records {status: 'failed', error: 'blob stream idle timeout'}.
  • Peer still runs full extract → install → load → restart lifecycle, but writes to its own per-node row state (recording is still skipped — origin owns the canonical row; peer feeds peer_results back via the operation return).

3. peer_results capture on origin

  • replicateOperation returns {message, replicated?: unknown[]}. After the call settles, origin parses the per-peer results into [{node, status, error?, started_at, completed_at}] and writes it to row.peer_results.

4. Live install-line streaming (residual from #531)

  • nonInteractiveSpawn in components/Application.ts gains an optional onLine?: (stream: 'stdout' | 'stderr', line: string) => void parameter with line-buffering so a chunk that splits mid-line doesn't fire a half-event.
  • installApplication threads an onLine closure that calls emit('install', {manager, stream, line}) through all three install code paths (custom command, devEngines packageManager, npm fallback).
  • The CLI already renders install events (renderInstall in bin/deployRenderer.ts); this finishes the wire.

5. harper-pro-side replicateOperation

  • Simplify the deploy_component path in harper-pro now that the operation no longer carries a payload — the previous chunked-relay / direct-HTTPS workarounds from harper-pro#146 are no longer needed for this operation. (Tracked in the matching harper-pro PR; closing harper-pro#146 once this lands.)

6. Integration test (3-node cluster)

  • Deploy a small fixture from node A.
  • Confirm peers B and C see the replicated hdb_deployment row + payload_blob, run their own prepare/install/load, and report back.
  • Confirm peer_results on the origin row is populated with the right per-node statuses.
  • Confirm get_deployment from node B via SSE replays the origin's event_log after the deploy lands.

Files touched

  • components/operations.js — strip payload before replicate, peer-side blob lookup, peer_results capture
  • components/Application.tsnonInteractiveSpawn gains onLine; installApplication wires the emitter
  • components/deploymentRecorder.ts (existing) — recordPeers(results) method
  • integrationTests/deploy/deploy-tracking-peer-relay.test.ts (new) — 3-node test
  • unitTests/components/applicationSpawn.test.js (new or extended) — onLine buffering
  • harper-pro: separate PR for the replicateOperation simplification

Non-goals (for B2)

  • delete_deployment_payload operation → Slice B3
  • Rollback (deploy_component {rollback_from}) → Slice C
  • onStorageReclamation blob pruning → Slice C

Verification

  • Single-node deploy continues to work end-to-end (no regression on existing Slice A + B1 tests).
  • 3-node deploy succeeds; peer_results shows three nodes with status: 'success'.
  • Killing the origin CLI mid-deploy still leaves a terminal row on every node.
  • Install-line streaming: install SSE events fire during npm install and show up in the CLI live output; recorded into event_log (subject to the 200-entry head+tail truncation already in place).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions