feat(deploy): Slice B2 — peer-side blob reads + install-line streaming #760
Draft
kriszyp wants to merge 4 commits into
Slice B2 of the deployment-tracking redesign in #641. Makes multi-node deploy_component work after Slice A's payload_blob-on-replicated-row design.

Origin side (components/operations.js):
- Strip req.payload before replicateOperation so peers receive a clean operation body (the consumed Readable would otherwise leak).
- Capture per-peer outcomes from replicateOperation's `replicated` return into the origin row's peer_results via DeploymentRecorder.recordPeers().

Peer side (components/operations.js + deploymentRecorder.ts):
- When req._deploymentId is set and there's no req.payload, look up the hdb_deployment row (await replication arrival with exponential-backoff polling, 5ms → 100ms), then extract from row.payload_blob.stream().
- awaitDeploymentRow encapsulates the polling with a 30s default timeout.

Install-line streaming (components/Application.ts):
- nonInteractiveSpawn gains an optional onLine(stream, line) callback with proper line buffering (StringDecoder so a multi-byte UTF-8 char split across chunks is reassembled instead of corrupted into U+FFFD).
- installApplication threads an onLine closure through all three install paths (custom command, devEngines packageManager, default npm).
- deployComponent wires this to emit('install', {manager, stream, line}) so the SSE channel carries `npm install` output live as it streams. Finishes the residual install-streaming work from the closed #531.

DeploymentRecorder.recordPeers swallows audit-write failures so an hdb_deployment write error doesn't fail a deploy that successfully replicated to peers (peer_results is observability, not critical path).

Tests:
- unit: 5 line-buffering scenarios including UTF-8 split-byte reassembly
- unit: 11 recordPeers / awaitDeploymentRow scenarios
- integration: peer-side branch via seeded row (single-node simulation of the operation shape origin produces for peers)

The true 3-node cluster verification lives in the matching harper-pro PR; replicateOperation is implemented there.

Cross-model reviewed: agy flagged the UTF-8 split-chunk bug, the recordPeers crash-on-DB-error, and the polling latency floor. All are addressed in this commit.

Closes #758

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
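For readers who want to see the line-buffering idea from the commit message in isolation, here is a minimal sketch. It is not the code in components/Application.ts: the shape of `createLineSplitter` shown here (a `write`/`end` pair and a single-argument callback) is an assumption, and the real `onLine` callback also receives the stream name. The only library call used is Node's `StringDecoder`, which holds back an incomplete multi-byte sequence until the rest arrives.

```javascript
const { StringDecoder } = require('node:string_decoder');

// Hypothetical helper; the PR's createLineSplitter may differ in shape.
function createLineSplitter(onLine) {
  const decoder = new StringDecoder('utf8');
  let pending = ''; // text received after the last newline seen so far

  function emitCompleteLines() {
    let newlineIndex;
    while ((newlineIndex = pending.indexOf('\n')) !== -1) {
      // Drop a trailing \r so CRLF output yields clean lines.
      onLine(pending.slice(0, newlineIndex).replace(/\r$/, ''));
      pending = pending.slice(newlineIndex + 1);
    }
  }

  return {
    // Feed one raw Buffer chunk from a child-process stream.
    write(chunk) {
      // decoder.write() withholds a partial multi-byte character
      // until its remaining bytes arrive in a later chunk.
      pending += decoder.write(chunk);
      emitCompleteLines();
    },
    // Call on stream end: emits a final line that lacks a newline.
    end() {
      pending += decoder.end();
      emitCompleteLines();
      if (pending.length > 0) {
        onLine(pending.replace(/\r$/, ''));
        pending = '';
      }
    },
  };
}

// Usage: "é" is 0xC3 0xA9 in UTF-8; split it across two chunks.
const lines = [];
const splitter = createLineSplitter((line) => lines.push(line));
splitter.write(Buffer.from([0x63, 0x61, 0x66, 0xc3])); // "caf" + first byte of "é"
splitter.write(Buffer.from([0xa9, 0x0a, 0x6f, 0x6b])); // second byte, "\n", "ok"
splitter.end();
console.log(lines);
```

Calling `chunk.toString()` on each chunk instead would decode the lone 0xC3 byte as U+FFFD, which is the corruption the review flagged.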
Contributor
Reviewed; no blockers found.
The deploy hangs after extraction when reading a Web ReadableStream from a file-backed Blob inside the same Harper process on Bun; the same code passes on Node v22/v24 across Linux and Windows. The harper-pro 3-node cluster test (HarperFast/harper-pro#221) covers the same code path end-to-end with real replication, so this skip doesn't lose coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
recordPeers was issuing its own put() concurrent with coalesced emitter-triggered puts from phase events. Each put captures `this.record` as the serializer sees it, and we were observing peer_results=[] in the persisted row even though recordPeers mutated the in-memory state to a populated array.

The race: an earlier scheduleFlush's put (called when peer_results was still []) sometimes completes AFTER recordPeers' direct put, overwriting our write with the stale snapshot.

Fix: recordPeers no longer puts. It stashes the input on the recorder and mutates the in-memory record (so live SSE readers still see the update immediately). finish() re-applies the stash right before the terminal status-transition put, bundling peer_results with status=success into one atomic write. No concurrent puts after this point.

Unit test rewritten to reflect the new contract: recordPeers populates the in-memory record but persistence happens at finish().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
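The new contract can be illustrated with a small stand-in. This is a sketch under assumptions, not the actual DeploymentRecorder: the class name `RecorderSketch`, the `stashedPeerResults` field, the initial `pending` status, the peer-result fields, and the in-memory `table.put` are all hypothetical. It shows only the ordering described above: `recordPeers` touches memory, and `finish()` performs the single write.

```javascript
// Hypothetical stand-in for the PR's DeploymentRecorder. `table.put`
// is an assumed async write that receives a snapshot of the record.
class RecorderSketch {
  constructor(table, deploymentId) {
    this.table = table;
    this.record = { deployment_id: deploymentId, status: 'pending', peer_results: [] };
    this.stashedPeerResults = undefined;
  }

  // No put here: update memory for live readers and remember the input.
  recordPeers(peerResults) {
    this.stashedPeerResults = peerResults;
    this.record.peer_results = peerResults;
  }

  // Terminal write: re-apply the stash so peer_results and the final
  // status land in the same put.
  async finish(status) {
    if (this.stashedPeerResults !== undefined) {
      this.record.peer_results = this.stashedPeerResults;
    }
    this.record.status = status;
    await this.table.put({ ...this.record });
  }
}

// Usage with an in-memory table that logs every write it receives.
const writes = [];
const table = { put: async (row) => { writes.push(row); } };
const recorder = new RecorderSketch(table, 'd1');
recorder.recordPeers([{ node: 'peer-1', status: 'success' }]); // nothing persisted yet
const writesBeforeFinish = writes.length;
const finished = recorder.finish('success'); // the only put happens here
finished.then(() => console.log(writes));
```

Because `recordPeers` never writes, there is no second put for an older snapshot to race against.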
The .catch in recordPeers' put was removed when peer_results moved into finish(). The logger import is no longer needed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Slice B2 of the deployment-tracking redesign in #641. Closes #758.
Summary
- Multi-node `deploy_component` works again after Slice A's payload_blob-on-row design. The payload now reaches peers through Harper's existing `BLOB_CHUNK` table-replication channel: no staging file, no direct-HTTPS relay (the obsoleted approaches from feat(deploy): stage streamed payloads to a temp file for replication #536 and harper-pro#146).
- `install` SSE events forward each line of `npm install` / `pnpm install` / custom-command stdout/stderr to the CLI in real time. Finishes the residual streaming work from the closed feat(deploy): live SSE progress for deploy_component #531.

What changed

Origin side (`components/operations.js`):
- Strip `req.payload` before `replicateOperation` so peers don't receive a consumed `Readable`.
- After `replicateOperation` returns, normalize the per-peer outcomes and write them to `row.peer_results` via the new `DeploymentRecorder.recordPeers()`.

Peer side (`components/operations.js` + `components/deploymentRecorder.ts`):
- `deployComponent`: when `req._deploymentId` is set and `req.payload` is absent, look up `hdb_deployment[deployment_id]` via the new `awaitDeploymentRow()` helper, then extract from `row.payload_blob.stream()`. The Blob API already handles in-flight `BLOB_CHUNK` writes by blocking until chunks arrive.
- `awaitDeploymentRow` polls with exponential backoff (5ms → 100ms) so the fast path, where replication has already caught up, sees no human-perceptible latency.

Install-line streaming (`components/Application.ts`):
- `nonInteractiveSpawn` gains an optional `onLine(stream, line)` callback with proper line buffering and `StringDecoder` so multi-byte UTF-8 characters split across chunk boundaries are reassembled, not corrupted into `U+FFFD`.
- `installApplication` threads an `onLine` closure through all three install paths (custom command, devEngines packageManager, default npm).
- `deployComponent` wires `onInstallLine` to `emit('install', {manager, stream, line})` so the SSE channel carries `npm install` output as it happens.

Defensive write (`components/deploymentRecorder.ts`):
- `recordPeers` swallows audit-write failures with a warn log; `peer_results` is observability, not critical path. A disk-full on the audit table shouldn't fail a deploy that successfully replicated.

Where to look

- `components/operations.js` lines ~420 (peer-side branch + strip) and ~480 (peer_results capture)
- `components/deploymentRecorder.ts`: `recordPeers`, `awaitDeploymentRow`, `normalizePeerResult`
- `components/Application.ts`: `createLineSplitter`, `nonInteractiveSpawn` `onLine` param, three install paths threading

Cross-model review

`agy` (Google's model) reviewed the diff and flagged three real issues that are addressed in this PR:
- UTF-8 split-chunk corruption from `chunk.toString()`. Fixed with `StringDecoder`.
- `recordPeers` would have crashed the deploy on an audit table write error. Now logged-and-swallowed.
- `awaitDeploymentRow` had a 100ms fixed polling interval that introduced a latency floor on the fast path. Now exponential backoff starting at 5ms.

Test plan

- `unitTests/components/applicationSpawn.test.js`: 6 line-buffering scenarios including UTF-8 split reassembly
- `unitTests/components/deploymentRecorder.test.js`: 12 scenarios (recordPeers normalization, finish-then-recordPeers no-op, audit-write resilience, awaitDeploymentRow timeout + fast path)
- `integrationTests/deploy/deploy-tracking-peer-branch.test.ts`: exercises the peer-side branch on a single node by seeding an `hdb_deployment` row first
- `deploy-tracking.test.ts` (11 tests) and `deploy-tracking-events.test.ts` (5 tests)
- 3-node cluster verification lives in the matching harper-pro PR; `replicateOperation` is implemented there

Follow-ups (out of scope)

- `delete_deployment_payload` operation + 3-node cluster integration test in harper-pro
- `deploy_component {rollback_from}`) + `onStorageReclamation` blob pruning
- `npm install` lines visible on origin SSE): not yet wired; peers have no emitter

🤖 Generated by Claude
Companion PR
`replicateOperation` already does generic operation forwarding (no `deploy_component`-specific path needed). The closed harper-pro#146's chunked-relay was working around a constraint the row-replication channel doesn't have.
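As an illustration of the peer-side wait described under "Peer side", the polling loop might look roughly like the sketch below. This is not the PR's `awaitDeploymentRow`: the names `awaitRow`, `lookupRow`, and `nextDelayMs`, the option names, and the doubling factor are assumptions. Only the 5ms starting delay, the 100ms cap, and the 30s default timeout come from the description and the first commit message.

```javascript
// Hypothetical names throughout; only the 5ms / 100ms / 30s figures
// are taken from the PR text. Doubling is an assumed growth factor.
function nextDelayMs(currentMs, maxMs) {
  return Math.min(currentMs * 2, maxMs);
}

// `lookupRow` stands in for however the row is actually read; it
// should resolve to the row, or to undefined while it has not arrived.
async function awaitRow(lookupRow, options = {}) {
  const { timeoutMs = 30_000, initialDelayMs = 5, maxDelayMs = 100 } = options;
  const deadline = Date.now() + timeoutMs;
  let delayMs = initialDelayMs;
  for (;;) {
    const row = await lookupRow();
    if (row) return row; // fast path: no sleep when the row is already there
    const remainingMs = deadline - Date.now();
    if (remainingMs <= 0) {
      throw new Error(`row did not arrive within ${timeoutMs}ms`);
    }
    // Never sleep past the deadline.
    await new Promise((resolve) => setTimeout(resolve, Math.min(delayMs, remainingMs)));
    delayMs = nextDelayMs(delayMs, maxDelayMs);
  }
}

// Usage: simulate a row that "replicates in" on the third lookup.
let lookups = 0;
const result = awaitRow(async () => (++lookups >= 3 ? { deployment_id: 'd1' } : undefined));
result.then((row) => console.log(row.deployment_id, 'found after', lookups, 'lookups'));
```

Starting small keeps the already-replicated case close to free, while the cap bounds how often a slow peer polls.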