Skip to content

feat(deploy): Slice A — hdb_deployment system table + audit record per deploy#655

Merged
kriszyp merged 10 commits into
mainfrom
feat/deployment-tracking-slice-a
May 23, 2026
Merged

feat(deploy): Slice A — hdb_deployment system table + audit record per deploy#655
kriszyp merged 10 commits into
mainfrom
feat/deployment-tracking-slice-a

Conversation

@kriszyp
Copy link
Copy Markdown
Member

@kriszyp kriszyp commented May 21, 2026

Summary

Slice A of #641. Every deploy_component call now writes a row to a new system.hdb_deployment table capturing what was deployed, when, by whom, and whether it succeeded — plus the payload tarball itself as a Blob attribute. The deployment_id is returned in the deploy response and is the join key Studio/CLI will use to subscribe to live progress in Slice B.

This is the foundation layer for the broader deployment tracking, replicated payload delivery, and rollback design. Slices B and C will extend the same table with live ProgressEmitter integration (replacing #531), peer-side reads from the replicated blob (replacing #536/harper-pro#146), and rollback semantics.

Changes

  • json/systemSchema.json: hdb_deployment table with deployment_id (UUID hash) plus the full lifecycle attributes
  • utility/hdbTerms.ts: table-name constant and four new operation enums
  • upgrade/directives/5-2-0.ts: provisions the table on existing installs (fresh installs get it via mount_hdb's systemSchema iteration)
  • components/deploymentRecorder.ts: lifecycle wrapper used by deployComponent — creates the pending row, ingests payload into the Blob attribute with sha256 + size, commits the terminal status
  • components/deploymentOperations.ts: list_deployments (filters: project, status, since, until, limit, offset) and get_deployment. Payload bytes are stripped from both — only payload_blob_present: boolean is exposed
  • components/operations.js: deployComponent wraps prepareApplication in a try/catch driven by the recorder; payload is re-sourced from the persisted blob so extraction reads exactly what was recorded
  • server/serverHelpers/serverUtilities.ts: registers the two new ops
  • Brittle deepStrictEqual deploy-response assertions in 4 existing tests updated to allow the new deployment_id field

Scope deliberately deferred

  • Streaming ingest: ingestPayload currently buffers the upload before persisting (race-free; multi-GB optimization comes in Slice B alongside the ProgressEmitter subscriber that also benefits from chunk-level progress events)
  • Peer-side reads: Slice B reshapes peer execution to read the payload from the replicated hdb_deployment row's blob, replacing the staging temp file (feat(deploy): stage streamed payloads to a temp file for replication #536) and the direct-HTTPS relay (harper-pro#146)
  • Rollback / reclamation: Slice C — deploy_component {rollback_from} arg + onStorageReclamation hook for per-node blob pruning

Test plan

Refs #641

🤖 Generated with Claude Code

Slice A of #641. Every deploy_component call now writes a row to a new
system.hdb_deployment table capturing the project, package identifier,
sha256 of the payload tarball, payload size, status (pending → success
or failed), error info, and the upload payload itself as a Blob
attribute. The deployment_id is returned in the deploy response and is
the join key Studio/CLI will use to subscribe to live progress in
Slice B.

Includes:
- json/systemSchema.json: hdb_deployment table definition (deployment_id
  hash, with attributes mirroring the lifecycle)
- utility/hdbTerms.ts: SYSTEM_TABLE_NAMES.DEPLOYMENT_TABLE_NAME +
  LIST_DEPLOYMENTS / GET_DEPLOYMENT / GET_DEPLOYMENT_PAYLOAD /
  DELETE_DEPLOYMENT_PAYLOAD operation enums
- upgrade/directives/5-2-0.ts: provisions the table on existing installs
  (fresh installs get it via mount_hdb's systemSchema iteration)
- components/deploymentRecorder.ts: lifecycle wrapper used by
  deployComponent — creates the row up front, ingests the payload into
  a Blob attribute with sha256 + size, then commits success or failure
- components/deploymentOperations.ts: handlers for list_deployments
  (with project/status/since/until/limit/offset filters) and
  get_deployment; payload bytes are stripped from these responses
- components/operations.js: deployComponent now wraps prepareApplication
  in a try/catch driven by the recorder; payload is re-sourced from
  the persisted blob so extraction reads exactly what was recorded
- server/serverHelpers/serverUtilities.ts: registers the two new ops
- integrationTests/deploy/deploy-tracking.test.ts: end-to-end coverage
  for the happy path, list filtering, and failure recording

Updates the brittle deepStrictEqual deploy-response assertions in 4
existing tests to allow the new deployment_id field. Slice A scope is
deliberately single-node; Slice B will replace the in-memory buffer in
ingestPayload with a streaming variant and add peer-side reads from
the replicated blob.

Refs #641

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kriszyp kriszyp requested review from a team as code owners May 21, 2026 00:37
Comment thread server/serverHelpers/serverUtilities.ts
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 21, 2026

2 blockers found (same pattern, operations.js deploy path).

1. Audit bypass via client-injected _deploymentId

File: components/operations.js:397
What: isReplicatedExecution is derived solely from typeof req._deploymentId === 'string'. A superuser can send {"operation":"deploy_component","_deploymentId":"x",...} and the recorder is never created — no row in system.hdb_deployment, no deployment_id in the response.
Why it matters: Breaks the PR's stated guarantee that every deploy_component call writes a row.
Suggested fix: Either strip _deploymentId from inbound bodies before the check, or have the replication layer mark the call out-of-band rather than relying on a user-controlled body field.

2. req.progress injection causes uncaught TypeError in deploy path

File: components/operations.js:398
What: emitter = req.progress ?? new ProgressEmitter() — if the client sends {"progress":{}}, emitter is {}. DeploymentRecorder.create({..., emitter: {}}) then calls {}.subscribe(...) → TypeError. This call is before the deploy try/catch, so the catch handler never runs and the deploy fails with a confusing 500.
Why it matters: Same pattern as the prior handleGetDeployment blocker (now correctly fixed at deploymentOperations.ts:89). The fix there — checking typeof x.emit === 'function' — needs to be mirrored here (or use instanceof ProgressEmitter since the import is available).
Suggested fix: const emitter = isReplicatedExecution ? null : (req.progress instanceof ProgressEmitter ? req.progress : new ProgressEmitter());


Prior blocker (finding #1 from last run — req.progress guard in handleGetDeployment) is resolved: deploymentOperations.ts:89 correctly gates on typeof (req.progress as any).emit === 'function', which JSON-parsed bodies can never satisfy.

@kriszyp kriszyp marked this pull request as draft May 21, 2026 00:45
- Skip recording on replicated executions: peer nodes receiving deploy_component via
  replicateOperation already have req._deploymentId set by origin, so they no longer
  spin up a fresh recorder + UUID + row. Prevents N duplicate rows per N-node cluster.
- Cap payload at 200 MiB while Slice A buffers in memory. Throws a clear ClientError
  pointing users at the package-identifier path or Slice B's streaming variant.
- Register list_deployments and get_deployment in utility/operation_authorization.ts.
  Pattern matches get_components: requires_su=true with the operation enum as the
  named exception so a role can be granted it without SU rights (per the design's
  permission model).
- Add "audit": true to hdb_deployment in systemSchema.json so fresh installs match
  the audit setting the 5-2-0 upgrade directive applies.
- Drop two now-unused imports (Transform from recorder, existsSync from test).
- Auto-format pass via npm run format:write.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kriszyp kriszyp requested review from Ethan-Arrowood, cb1kenobi, dawsontoth and heskew and removed request for a team and cb1kenobi May 21, 2026 00:52
…ap pattern

Two non-obvious findings from #641 Slice A that future agents should know:

1. createBlob(readable) + table.put() doesn't synchronously drain the source.
   The blob's saveBlob runs concurrently; calling hash.digest() after the put
   resolves can race a still-flushing Transform and throw ERR_CRYPTO_HASH_FINALIZED.
2. Adding a new system table requires three changes: systemSchema.json (fresh
   installs), SYSTEM_TABLE_NAMES (constant), and an upgrade directive registered
   in directivesController.ts. The directive shape was undocumented after the
   .ts refactor cleared old directives.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kriszyp kriszyp marked this pull request as ready for review May 21, 2026 01:14
kriszyp and others added 6 commits May 22, 2026 14:07
Wires the ProgressEmitter (resurrected from the paused #531) into the new
DeploymentRecorder so every deploy_component lifecycle phase is captured on
the row's event_log AND streamable live via SSE. Same content-negotiated
branch serves get_deployment, letting Studio (or any client) replay a
deploy's history and tail in-flight events through a single endpoint.

What's new
- DeploymentRecorder subscribes to a ProgressEmitter and coalesces writes:
  every emit appends to a bounded event_log (200 cap, head+tail retention so
  the lifecycle spine survives a noisy install); chained puts collapse a
  burst into one round trip. Emits a `_recorder_finished` sentinel on
  finish() so SSE tailers terminate cleanly even on crash paths.
- deployComponent emits prepare/load/replicate/restart/success phase events
  around their respective steps. Strips req.progress before replicateOperation
  so peers see a clean payload. Skips recording entirely on replicated
  (peer-side) executions — origin owns the canonical row.
- An in-memory activeEmitters Map keyed by deployment_id lets get_deployment
  SSE locate the live emitter and tail it.
- handlePostRequest gains a content-negotiated SSE branch (req.headers.accept
  includes text/event-stream + op in SSE_PROGRESS_OPERATIONS). Prime write
  on the PassThrough so Fastify starts piping immediately — empirically
  Fastify buffers a returned Readable until end-of-stream without it,
  collapsing all intermediate writes into a single flush.
- get_deployment with SSE subscribes to the live emitter BEFORE reading the
  row, then replays the historical event_log and dedupes by timestamp so no
  event is lost in the stitching gap. A polling fallback resolves the SSE
  promise even if the deploy disappears without signaling a terminal event.
- CLI sends Accept: text/event-stream for deploy_component; consumes the
  SSE response via parseSSE; renders phase/install/error events through
  DeployRenderer.
- httpRequest gains a streamResponse option that yields the raw IncomingMessage
  as a Readable instead of buffering — what the SSE consumer needs.

Ported from #531 (with the multi-line data spec fix, StringDecoder, and
disconnect cleanup already applied earlier in the session):
- server/serverHelpers/progressEmitter.ts (+ tests)
- bin/sseConsumer.ts (+ tests)
- bin/deployRenderer.ts (+ tests)

Integration coverage: integrationTests/deploy/deploy-tracking-events.test.ts
asserts event_log shape on success, SSE replay+done on get_deployment, and
the failure path emits an error event into the log.

Refs #641 (Slice B1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ation test

When operations.js emits an `error` event through the ProgressEmitter before
throwing, createSSEResponseStream's .catch handler was writing a second error
SSE record — dropping the phase context from the first. Fix: track whether the
subscriber already forwarded an error event and skip the framework fallback if so.

Adds a unit test for the dedup behavior and an integration test covering the
live-tail SSE branch (liveEmitter && !TERMINAL_STATUSES) which was previously
unexercised — the new test opens get_deployment SSE against an in-flight deploy
using a sleep 3 install command.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ires

Without a package.json Harper skips install entirely ("no package.json; skipping install"), so the deploy completes in <100ms and the polling loop
never catches an in-flight row. The other install-command tests in this file
already include package.json for the same reason.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…og lines

Two bugs in the upload progress display:

1. DeployRenderer was instantiated without uploadTotal, so the bar initialised
   with total=1 and immediately showed 100% on the first chunk. Fix: pre-walk
   the project directory with getPackagedDirectorySize() and pass the
   uncompressed size as the total. The bar moves as gzipped bytes are sent and
   endUpload() snaps it to 100% on completion.

2. The non-TTY path emitted a log line every 5 MiB ("Uploaded 5.0 MiB",
   "Uploaded 10.0 MiB", ...) during the upload, cluttering CI logs. Fix:
   remove intermediate lines entirely; endUpload() now prints one
   "Uploaded X MiB" line on completion.

Bar format updated to show human-readable bytes via cli-progress payload tokens
({value_fmt} / ~{total_fmt}) instead of raw byte counts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oad finished

For SSE deploys, httpRequest resolves (streamResponse: true) when response
headers arrive — which is shortly after the multipart file header is seen by
busboy, not after all file data is sent. Calling endUpload() at that point
snapped the bar to 100% with only ~50 KiB counted.

The counter Transform's flush callback in tapUploadStream already calls
endUpload() at the correct moment: when every tar.gz chunk has flowed
through the counter and into the HTTP request body.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…y to 100%

Previously the bar counted gzip-compressed wire bytes against an uncompressed
directory-size total, so for a 100 MiB component the bar would only reach ~30%
before endUpload() snapped it to 100%.

- `streamPackagedDirectory` now accepts an optional `onBytes` callback invoked
  for each raw tar chunk before gzip; both counter and total are uncompressed.
- `DeployRenderer.countUploadBytes(n)` is the new public method for this; the
  counter Transform's `transform` callback no longer counts (it only exists for
  the `flush` → `endUpload()` signal).
- `cliOperations` defers `streamPackagedDirectory` until after the renderer is
  created, so the `onBytes` callback can be wired directly without a lazy closure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread components/deploymentOperations.ts Outdated
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
@kriszyp kriszyp merged commit 3ea6d76 into main May 23, 2026
16 checks passed
@kriszyp kriszyp deleted the feat/deployment-tracking-slice-a branch May 23, 2026 03:14
Comment thread components/operations.js
// Only the origin node records — peers receiving a replicated deploy_component skip
// recording so we don't accumulate one row per node for the same deploy. The row will
// reach peers via the table's replication once Slice B has them consume it.
const isReplicatedExecution = typeof req._deploymentId === 'string';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Audit bypass via client-injected _deploymentId.

Any superuser can skip audit-row creation by including "_deploymentId": "anything" in the request body. typeof req._deploymentId === 'string' is true, isReplicatedExecution is true, the recorder is never created, and the deploy proceeds with no row written to system.hdb_deployment. The response also omits deployment_id, so the client silently gets the pre-Slice-A shape.

The peer-replication use case needs a way to distinguish "this is a replicated call" from "a client happened to send this field." One approach: strip client-supplied _deploymentId early, before the recorder decision, and only trust it if it arrives via the replication-path context rather than the user-supplied body.

Suggested change
const isReplicatedExecution = typeof req._deploymentId === 'string';
const isReplicatedExecution = typeof req._deploymentId === 'string' && req._isReplicatedOperation === true;

…or, more robustly, have the replication layer mark the call out-of-band (e.g. on the request context) so _deploymentId in the body is no longer the sole signal.

Comment thread components/operations.js
// recording so we don't accumulate one row per node for the same deploy. The row will
// reach peers via the table's replication once Slice B has them consume it.
const isReplicatedExecution = typeof req._deploymentId === 'string';
// Slice B1 of #641: an SSE-bound caller already attached a ProgressEmitter (created in
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

req.progress injection causes TypeError outside the deploy try/catch.

Same pattern as the prior blocker in handleGetDeployment (now fixed at deploymentOperations.ts:89), but here in the deploy path. A superuser sending {"operation": "deploy_component", "progress": {}} reaches this line with req.progress = {}, so emitter = {}. Then DeploymentRecorder.create({..., emitter: {}}) calls emitter.subscribe(...)TypeError: emitter.subscribe is not a function. Because DeploymentRecorder.create is called before the try { … } catch block, the deploy's catch handler never runs — no recorder.finish('failed', err) and a confusing 500 to the client.

Suggested change
// Slice B1 of #641: an SSE-bound caller already attached a ProgressEmitter (created in
const emitter = isReplicatedExecution
? null
: req.progress instanceof ProgressEmitter
? req.progress
: new ProgressEmitter();
if (!req.progress) req.progress = emitter;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant