From 9378b8d6b0ffd3585f7ae364d2debd7c4f5576b6 Mon Sep 17 00:00:00 2001 From: Kris Zyp Date: Tue, 12 May 2026 18:26:53 -0600 Subject: [PATCH 1/3] docs: expand AGENTS.md and add replication/DESIGN.md Replace the 19-line AGENTS.md with a more substantive guide: full npm command reference, a "Where should this change go? (Pro vs core)" routing table, a folder map covering every top-level Pro directory, and cross-references into core's new per-folder docs. Add replication/DESIGN.md as a section index for the replication subsystem (~4200 lines across 6 files). Covers: file-level map, key abstractions (NodeReplicationConnection, Replicator, shared status buffers, hdb_nodes schema), the binary protocol command table with verified line numbers, subsystems (connection mgmt, protocol, data propagation, latency awareness, node discovery/TLS), and non-obvious behaviors including the auth-failure-without-DISCONNECT pattern relevant to issue #135 JWT/cluster lineage. Also flags that the cluster:* npm scripts reference utility/dev/ docker-compose files that aren't present in the repo. Generated by Claude Opus 4.7 (1M context). Co-Authored-By: Claude Opus 4.7 (1M context) --- AGENTS.md | 100 ++++++++++++++++++++++++++++++-- replication/DESIGN.md | 129 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 225 insertions(+), 4 deletions(-) create mode 100644 replication/DESIGN.md diff --git a/AGENTS.md b/AGENTS.md index 5ae5695ba..bc5b2a332 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -6,7 +6,9 @@ This file provides guidance when working with code in this repository. ## What This Is -`harper-pro` is the proprietary commercial layer on top of [Harper core](https://github.com/HarperFast/harper) (Apache-2.0). Most engineering conventions, build mechanics, and runtime behaviors are inherited from core — see Harper core's [AGENTS.md](https://github.com/HarperFast/harper/blob/main/AGENTS.md) for the full picture. 
+`harper-pro` is the proprietary commercial layer on top of [Harper core](https://github.com/HarperFast/harper) (Apache-2.0). Most engineering conventions, build mechanics, and runtime behaviors are inherited from core — **read [core/AGENTS.md](core/AGENTS.md) for the substrate's full picture**. This document covers only what's different or additive in Pro. + +`core/` is a git submodule pointing at `HarperFast/harper`. Pro adds: cluster replication, license enforcement, clone-node bootstrap, CPU profiling, and the docker-compose dev workflow. --- @@ -32,9 +34,99 @@ inside `.git/modules/core/`, remove them immediately — they are corrupting the --- -## Pro-specific notes +## Commands + +```bash +# Build (Pro-only — core has its own build) +npm run build # tsc → dist/ +npm run build:watch # incremental + +# Lint / Format +npm run lint # oxlint --deny-warnings +npm run lint:fix +npm run format:write # prettier +npm run lint:required # quiet — for CI + +# Tests — only integration here +npm run test:integration +npm run test:integration:all # all *.test.ts in integrationTests/ + +# Submodule +npm run core:sync # sync core submodule to its pinned commit +npm run core:set-branch # pin core to a different branch +``` + +The `cluster:*` scripts in `package.json` reference `utility/dev/docker-compose.*.yml` files that are not present in the repository — they're likely produced by a private dev tooling step. Don't expect them to work out of the box. + +**No `test:unit` exists.** Pro relies on core's unit suite for the substrate it inherits. `test:integration` is slow — run only when the change plausibly affects integration behavior. + +--- + +## Where should this change go? (Pro vs. 
core) + +If you're not sure which repo to edit, use this rule of thumb: + +| Change concerns… | Edit in | +| ----------------------------------------------------------- | -------------------------------------------------- | +| Tables, Resources, transactions, audit, storage format | `core/` | +| HTTP/WS/MQTT/GraphQL protocol handling, middleware | `core/` | +| Schema, validation, permissions | `core/` | +| Multi-node replication, cluster status, node membership | `harper-pro/replication/` | +| Initial node clone from a leader | `harper-pro/cloneNode/` | +| License validation/enforcement | `harper-pro/licensing/` | +| CPU profiling / pprof integration | `harper-pro/analytics/` | +| TLS cert signing for cluster auth | `harper-pro/security/` | +| `bin/harper.js` CLI behavior (component registration order) | `harper-pro/bin/` | +| Build / packaging / release scripts | `harper-pro/build-tools/` or `harper-pro/scripts/` | + +When a feature spans both, prefer landing as much as possible in `core/` and gluing it together via a Pro-registered component. + +--- + +## Repository map + +### Pro source folders + +- **`bin/`** — CLI entry points. `harper.js` is the main executable; loads `cloneNode` if `HDB_LEADER_URL` is set; registers `analytics`, `licensing`, `replication` components. +- **`replication/`** — multi-node replication subsystem. **See [replication/DESIGN.md](replication/DESIGN.md)** for the section index. The big file is `replication/replicationConnection.ts` (2288 lines). +- **`cloneNode/`** — `cloneNode.ts` (~30KB). One-shot replication from a leader during init when `HDB_LEADER_URL` is set. Auth via cert or credentials. Tests: `integrationTests/cloneNode/`. +- **`licensing/`** — usage license validation and enforcement. `usageLicensing.ts` (lifecycle, usage aggregation) and `validation.ts` (EdDSA signature verification). +- **`analytics/`** — CPU profiling via Datadog pprof. `profile.ts` is the entry. 
**Not the same as core's `resources/analytics/`** (which records request-level telemetry). +- **`security/`** — Pro-specific cryptography: `certificate.ts` (TLS signing/validation), `sshKeyOperations.ts`, `keyService.ts` (JWT + private-key resolution). **Core PKI lives in `core/security/`** — don't confuse them. + +### Pro tests + +- **`integrationTests/`** — end-to-end, runs full Harper instances. `run.mjs` is the custom test harness with shard support. Subdirs mirror source (`analytics/`, `cloneNode/`, `cluster/`, `licensing/`, `security/`). +- **`unitTests/`** — small. `testUtils.js` (mock helpers, db reset) and `setupTestApp.mjs` (in-memory app scaffold). + +### Pro non-source + +- **`build-tools/`** — `build-pro.sh` orchestrates the build; `sync-core.sh` syncs the core submodule; `download-prebuilds.js` fetches native prebuilds; `set-core-branch.sh` pins core's branch. +- **`scripts/`** — `patch-release.js` (~12KB). Cherry-picks PRs labeled `patch` from `main` onto a release branch in both core and Pro, bumps the version, syncs the submodule. See `CONTRIBUTING.md` for usage. +- **`dev/`** — `sync-commits.js`. One-time repo-migration utility, not part of normal runtime. +- **`static/`** — `defaultConfig.yaml` template, `ascii_logo.txt`. + +### Submodule + +- **`core/`** — the Harper OSS core (`HarperFast/harper`). Has its own AGENTS.md, DESIGN.md, and now per-folder DESIGN.md docs. **When touching substrate behavior, edit there, not here.** + +--- + +## Pro-specific conventions -- **Linter**: oxlint with `--deny-warnings` (`npm run lint`), same as core. -- **Tests**: only `npm run test:integration` exists here. There is no `test:unit` split — Pro relies on core for unit-test coverage of the substrate it inherits. `test:integration` is slow; run only when the change plausibly affects integration behavior. +- **Linter**: `oxlint --deny-warnings`, same as core. 
- **Storage substrate**: same as core — RocksDB primary, LMDB available via `HARPER_STORAGE_ENGINE=lmdb`. - **Documentation scope**: https://docs.harperdb.io is authoritative for Harper mechanics. Pro docs describe Pro-only surface, not core behavior. +- **Submodule pointer**: when changing core, commit there first, then bump the submodule pointer in Pro in a separate commit. Don't combine core changes with submodule bumps — they need to be reviewable separately. +- **Patch releases**: PRs that should land in a stable release branch must carry the **`patch`** label. See `CONTRIBUTING.md` for the patch-release workflow. + +--- + +## Cross-references + +- **[core/AGENTS.md](core/AGENTS.md)** — substrate architecture (Resources, Server, Components, Data Layer). Read first for substrate questions. +- **[core/DESIGN.md](core/DESIGN.md)** — non-obvious internals (RecordObject prototype, getFromSource timing, blob orphan cleanup). +- **[core/resources/DESIGN.md](core/resources/DESIGN.md)** — `Table.ts` and `Resource.ts` section indexes. +- **[core/server/DESIGN.md](core/server/DESIGN.md)** — HTTP/WS/MQTT layer + middleware ordering. +- **[replication/DESIGN.md](replication/DESIGN.md)** — Pro replication subsystem. +- **[CONTRIBUTING.md](CONTRIBUTING.md)** — patch release procedure; package-lock merge driver setup. diff --git a/replication/DESIGN.md b/replication/DESIGN.md new file mode 100644 index 000000000..9b0fdc310 --- /dev/null +++ b/replication/DESIGN.md @@ -0,0 +1,129 @@ +# replication/ — Navigation Guide + +Real-time, peer-to-peer replication of table data across cluster nodes via persistent WebSocket connections. Implements eventual consistency: when a local transaction commits, the audit records are forwarded asynchronously to peers. + +**Read this when:** you're touching cluster sync, debugging missed writes, JWT/cluster auth, latency-based node selection, or blob transfer between nodes. 
+ +**Integration boundary with core:** replication hooks into core's table resource layer — a `Replicator` class is installed as a `source` of the table (`table.sourcedFrom(class Replicator extends Resource {...})`). When a local cache miss occurs, the Replicator picks the lowest-latency peer and fetches. Core's audit store (`core/resources/auditStore.ts`) and node-id mapping (`core/resources/nodeIdMapping.ts`) are the two data structures replication reads. + +--- + +## Files (6 total, ~4200 lines) + +| File | Lines | Purpose | +| -------------------------- | ----- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `replicationConnection.ts` | 2288 | The protocol engine. Defines `NodeReplicationConnection`, encodes/decodes the binary frame format, drives audit-record forwarding, manages blobs, and writes shared latency/back-pressure counters. **The big file.** | +| `replicator.ts` | 656 | Setup module: `start()`, per-database/per-table `Replicator` resource class, retrieval-connection pool, operation forwarding, mTLS config. | +| `subscriptionManager.ts` | 568 | Main-thread orchestration. Delegates subscription work to worker threads; routes around disconnects. | +| `setNode.ts` | 313 | Cluster member operations — add/remove nodes, CSR signing, TLS certificate negotiation. | +| `knownNodes.ts` | 297 | Node registry (`hdb_nodes` system table) + shared-memory `Float64Array` status buffers (latency, confirmation, back-pressure). | +| `clusterStatus.ts` | 82 | Read-only status reporting for `cluster_status` operation. | + +--- + +## Key abstractions + +### `NodeReplicationConnection` (replicationConnection.ts:197) + +A persistent connection to one remote node. Owns the WebSocket lifecycle, reconnection, latency tracking, and per-subscription state. 
**Inspect this when debugging connection drops or auth failures** (see issue #135 lineage on JWT/cluster auth). + +### `replicateOverWS(ws, options, authorization)` (replicationConnection.ts:339) + +The protocol decoder. Reads incoming binary commands (constants at lines 63–86): + +| Command | Value | Meaning | +| ------------------------------------------ | --------- | ----------------------------------------- | +| `SUBSCRIPTION_REQUEST` | 129 | Client wants to subscribe to a table | +| `RESIDENCY_LIST` | 130 | Negotiate which records each node holds | +| `TABLE_FIXED_STRUCTURE` | 132 | Schema sync | +| `GET_RECORD` / `GET_RECORD_RESPONSE` | 133 / 134 | Cache-miss fetch | +| `OPERATION_REQUEST` / `OPERATION_RESPONSE` | 136 / 137 | Forwarded operations | +| `NODE_NAME` / `NODE_NAME_TO_ID_MAP` | 140 / 141 | Identity exchange | +| `DISCONNECT` | 142 | Graceful close (not used on auth failure) | +| `SEQUENCE_ID_UPDATE` | 143 | Audit sequence cursor | +| `COMMITTED_UPDATE` | 144 | Confirm-on-commit | +| `DB_SCHEMA` | 145 | Database schema replication | +| `BLOB_CHUNK` | 146 | Blob bytes | +| `SUBSCRIPTION_UPDATE` | 147 | Audit record forwarded to subscribers | + +The authorization parameter is a **promise that may resolve asynchronously**; on rejection the socket closes without a DISCONNECT frame (relevant to JWT failure flows). + +### `Replicator extends Resource` (replicator.ts:334) + +A `Resource` class installed as a `source` of a table. Its `static async load(entry)` method (replicator.ts:376) picks the lowest-latency available node for cache-miss fetches. + +### Shared status buffers (knownNodes.ts:63–75) + +Per (database, remote_node) pair: an mmap-backed `Float64Array` shared across threads, used to avoid IPC for hot-path status updates. 
Positions are defined in `replicationConnection.ts` lines 78–84: + +| Position | Constant | +| -------- | ------------------------------ | +| 0 | `CONFIRMATION_STATUS_POSITION` | +| 1 | `RECEIVED_VERSION_POSITION` | +| 2 | `RECEIVED_TIME_POSITION` | +| 3 | `SENDING_TIME_POSITION` | +| 4 | `LATENCY_POSITION` | +| 5 | `RECEIVING_STATUS_POSITION` | +| 6 | `BACK_PRESSURE_RATIO_POSITION` | + +These are written concurrently by `replicationConnection.ts` without explicit synchronization. Don't introduce read-modify-write patterns on this buffer. + +### `hdb_nodes` system table (knownNodes.ts:18–61) + +Schema: `name` (PK), `subscriptions[]`, `system_info`, `url`, `routes`, `ca`, `ca_info`, `replicates`, `revoked_certificates`. Subscription updates trigger `subscribeToNodeUpdates` (knownNodes.ts:270), which fans out to `monitorNodeCAs` → refresh `replicationCertificateAuthorities` (replicator.ts:56). + +--- + +## Subsystems + +**Connection management** — `NodeReplicationConnection.connect()` (replicationConnection.ts:197+), `subscriptionManager.startOnMainThread()`. Dial/retry, thread-pool delegation, recovery on disconnect. + +**Binary protocol** — `replicateOverWS` (replicationConnection.ts:339+); command constants at lines 63–86; msgpack body; back-pressure ratio recomputed every 30s (line 434). + +**Data propagation** — Audit-record iteration → forwarding; blob streaming with concurrency cap `MAX_OUTSTANDING_BLOBS_BEING_SENT` (replicationConnection.ts:455); commit confirmation batched on `COMMITTED_UPDATE_DELAY = 2ms` (line 101). + +**Latency awareness** — Ping every `PING_INTERVAL = 30s` (line 102); latency captured on pong; `Replicator.load()` (replicator.ts:376) routes cache-miss fetches to the lowest-latency node. + +**Node discovery & TLS** — `hdb_nodes` subscriptions, `setNode.ts` for member ops, `buildReplicationMtlsConfig()` (replicator.ts:64), `monitorNodeCAs()` (replicator.ts:268). + +--- + +## Non-obvious behaviors + +1. 
**Auth failures don't send DISCONNECT.** When the `authorization` promise rejects in `replicateOverWS`, the connection closes with "Unauthorized" but no DISCONNECT frame is sent — the client is expected to retry. This is the lineage of JWT/cluster auth bugs (issue #135). + +2. **Origin loop prevention via delayed sequence updates.** A node receiving its own message (checked against `remoteToLocalNodeId`) skips local processing but still forwards. To avoid feedback loops, the sequence-update emit is delayed by `SKIPPED_MESSAGE_SEQUENCE_UPDATE_DELAY = 300ms` (replicationConnection.ts:97). + +3. **Blob back-pressure & timeout.** Blobs time out after `blobTimeout` (default 120s); concurrent sends are capped at `MAX_OUTSTANDING_BLOBS_BEING_SENT = 5`; back-pressure ratio (computed every 30s) tells senders to pause. If you're seeing large-data replication hangs, look here first. + +4. **Shared-buffer concurrency.** The Float64Array status buffers are touched from multiple threads with no lock. Treat them as eventually consistent; use the callback param of `subscribeToNodeUpdates` if you need notification. + +--- + +## Tests + +**Integration tests** live in `../integrationTests/cluster/`: + +| File | Purpose | +| ------------------------------------ | ------------------------------------------------ | +| `clusterShared.mjs` | Shared fixture/helper (cluster boot, node setup) | +| `fullyConnectedReplication.test.mjs` | Full-mesh topology | +| `replicationTopology.test.mjs` | Dynamic membership changes | +| `replicationLoad.test.mjs` | Concurrent-write load | + +There is no dedicated `unitTests/replication/` directory — replication is exercised entirely via integration tests that spin up multi-node clusters. + +--- + +## "Where is X" cheat sheet + +| Question | File:line | +| ----------------------------------------- | ----------------------------------------------------- | +| Where does a remote message get decoded? 
| `replicationConnection.ts:339` (`replicateOverWS`) | +| Where do cache-miss fetches pick a peer? | `replicator.ts:376` (`Replicator.load`) | +| Where is the connection retry loop? | `replicationConnection.ts:192` (`INITIAL_RETRY_TIME`) | +| Where is mTLS configured? | `replicator.ts:64` (`buildReplicationMtlsConfig`) | +| Where is a new cluster member added? | `setNode.ts:1` (whole file is one function) | +| Where are protocol message types defined? | `replicationConnection.ts:63–86` | +| Where is `hdb_nodes` schema? | `knownNodes.ts:18–61` | +| What does `cluster_status` return? | `clusterStatus.ts` (82 lines, whole file) | From fe7c1d94e58d93db9f25cede768003e52d6e0e0a Mon Sep 17 00:00:00 2001 From: Kris Zyp Date: Wed, 13 May 2026 08:54:18 -0600 Subject: [PATCH 2/3] Update replication/DESIGN.md Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com> --- replication/DESIGN.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/replication/DESIGN.md b/replication/DESIGN.md index 9b0fdc310..f169c4fb5 100644 --- a/replication/DESIGN.md +++ b/replication/DESIGN.md @@ -70,7 +70,7 @@ These are written concurrently by `replicationConnection.ts` without explicit sy ### `hdb_nodes` system table (knownNodes.ts:18–61) -Schema: `name` (PK), `subscriptions[]`, `system_info`, `url`, `routes`, `ca`, `ca_info`, `replicates`, `revoked_certificates`. Subscription updates trigger `subscribeToNodeUpdates` (knownNodes.ts:270), which fans out to `monitorNodeCAs` → refresh `replicationCertificateAuthorities` (replicator.ts:56). +Schema: `name` (PK), `subscriptions[]`, `system_info`, `url`, `routes`, `ca`, `ca_info`, `replicates`, `revoked_certificates`. Subscription updates trigger `subscribeToNodeUpdates` (knownNodes.ts:77), which fans out to `monitorNodeCAs` → refresh `replicationCertificateAuthorities` (replicator.ts:56). 
--- From 69abd5a6b2d65a563427ec32c0165b08fbd8fac3 Mon Sep 17 00:00:00 2001 From: Kris Zyp Date: Tue, 19 May 2026 06:44:56 -0600 Subject: [PATCH 3/3] docs: replace line numbers in replication/DESIGN.md with symbol anchors MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Bare line numbers drift on every edit to replicationConnection.ts. Every referenced line corresponded to a named class, function, or top-level const — so references now use the symbol name. Reader uses grep or go-to-symbol to jump. No source-file changes; doc-only. Co-Authored-By: Claude Opus 4.7 (1M context) --- replication/DESIGN.md | 76 ++++++++++++++++++++++--------------------- 1 file changed, 39 insertions(+), 37 deletions(-) diff --git a/replication/DESIGN.md b/replication/DESIGN.md index f169c4fb5..6e2d687a0 100644 --- a/replication/DESIGN.md +++ b/replication/DESIGN.md @@ -6,32 +6,34 @@ Real-time, peer-to-peer replication of table data across cluster nodes via persi **Integration boundary with core:** replication hooks into core's table resource layer — a `Replicator` class is installed as a `source` of the table (`table.sourcedFrom(class Replicator extends Resource {...})`). When a local cache miss occurs, the Replicator picks the lowest-latency peer and fetches. Core's audit store (`core/resources/auditStore.ts`) and node-id mapping (`core/resources/nodeIdMapping.ts`) are the two data structures replication reads. +> **Navigation convention.** Code is referenced by **symbol name** (class, function, exported const). Use your editor's go-to-symbol or `grep -rn '<symbol>' replication/` to jump. Line numbers drift; symbols don't. 
+ --- ## Files (6 total, ~4200 lines) -| File | Lines | Purpose | -| -------------------------- | ----- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `replicationConnection.ts` | 2288 | The protocol engine. Defines `NodeReplicationConnection`, encodes/decodes the binary frame format, drives audit-record forwarding, manages blobs, and writes shared latency/back-pressure counters. **The big file.** | -| `replicator.ts` | 656 | Setup module: `start()`, per-database/per-table `Replicator` resource class, retrieval-connection pool, operation forwarding, mTLS config. | -| `subscriptionManager.ts` | 568 | Main-thread orchestration. Delegates subscription work to worker threads; routes around disconnects. | -| `setNode.ts` | 313 | Cluster member operations — add/remove nodes, CSR signing, TLS certificate negotiation. | -| `knownNodes.ts` | 297 | Node registry (`hdb_nodes` system table) + shared-memory `Float64Array` status buffers (latency, confirmation, back-pressure). | -| `clusterStatus.ts` | 82 | Read-only status reporting for `cluster_status` operation. | +| File | Purpose | +| -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `replicationConnection.ts` | The protocol engine. Defines `NodeReplicationConnection`, encodes/decodes the binary frame format, drives audit-record forwarding, manages blobs, and writes shared latency/back-pressure counters. **The big file.** | +| `replicator.ts` | Setup module: `start()`, per-database/per-table `Replicator` resource class, retrieval-connection pool, operation forwarding, mTLS config. | +| `subscriptionManager.ts` | Main-thread orchestration. 
Delegates subscription work to worker threads; routes around disconnects. | +| `setNode.ts` | Cluster member operations — add/remove nodes, CSR signing, TLS certificate negotiation. | +| `knownNodes.ts` | Node registry (`hdb_nodes` system table) + shared-memory `Float64Array` status buffers (latency, confirmation, back-pressure). | +| `clusterStatus.ts` | Read-only status reporting for `cluster_status` operation. | --- ## Key abstractions -### `NodeReplicationConnection` (replicationConnection.ts:197) +### `NodeReplicationConnection` (`replicationConnection.ts`) -A persistent connection to one remote node. Owns the WebSocket lifecycle, reconnection, latency tracking, and per-subscription state. **Inspect this when debugging connection drops or auth failures** (see issue #135 lineage on JWT/cluster auth). +A persistent connection to one remote node. Owns the WebSocket lifecycle, reconnection (initial delay `INITIAL_RETRY_TIME`), latency tracking, and per-subscription state. **Inspect this when debugging connection drops or auth failures** (see issue #135 lineage on JWT/cluster auth). -### `replicateOverWS(ws, options, authorization)` (replicationConnection.ts:339) +### `replicateOverWS(ws, options, authorization)` (`replicationConnection.ts`) -The protocol decoder. Reads incoming binary commands (constants at lines 63–86): +The protocol decoder. Reads incoming binary commands — each is a top-level named const in the same file: -| Command | Value | Meaning | +| Command constant | Value | Meaning | | ------------------------------------------ | --------- | ----------------------------------------- | | `SUBSCRIPTION_REQUEST` | 129 | Client wants to subscribe to a table | | `RESIDENCY_LIST` | 130 | Negotiate which records each node holds | @@ -46,15 +48,15 @@ The protocol decoder. 
Reads incoming binary commands (constants at lines 63–86 | `BLOB_CHUNK` | 146 | Blob bytes | | `SUBSCRIPTION_UPDATE` | 147 | Audit record forwarded to subscribers | -The authorization parameter is a **promise that may resolve asynchronously**; on rejection the socket closes without a DISCONNECT frame (relevant to JWT failure flows). +The `authorization` parameter is a **promise that may resolve asynchronously**; on rejection the socket closes without a DISCONNECT frame (relevant to JWT failure flows). -### `Replicator extends Resource` (replicator.ts:334) +### `Replicator extends Resource` (`replicator.ts`) -A `Resource` class installed as a `source` of a table. Its `static async load(entry)` method (replicator.ts:376) picks the lowest-latency available node for cache-miss fetches. +A `Resource` class installed as a `source` of a table. Declared inside `setReplicator()` and passed to `table.sourcedFrom(...)`. Its `static async load(entry)` method picks the lowest-latency available node for cache-miss fetches. -### Shared status buffers (knownNodes.ts:63–75) +### Shared status buffers (`getReplicationSharedStatus` in `knownNodes.ts`) -Per (database, remote_node) pair: an mmap-backed `Float64Array` shared across threads, used to avoid IPC for hot-path status updates. Positions are defined in `replicationConnection.ts` lines 78–84: +Per (database, remote_node) pair: an mmap-backed `Float64Array` shared across threads, used to avoid IPC for hot-path status updates. Position constants live in `replicationConnection.ts`: | Position | Constant | | -------- | ------------------------------ | @@ -68,23 +70,23 @@ Per (database, remote_node) pair: an mmap-backed `Float64Array` shared across th These are written concurrently by `replicationConnection.ts` without explicit synchronization. Don't introduce read-modify-write patterns on this buffer. 
-### `hdb_nodes` system table (knownNodes.ts:18–61) +### `hdb_nodes` system table (`getHDBNodeTable` in `knownNodes.ts`) -Schema: `name` (PK), `subscriptions[]`, `system_info`, `url`, `routes`, `ca`, `ca_info`, `replicates`, `revoked_certificates`. Subscription updates trigger `subscribeToNodeUpdates` (knownNodes.ts:77), which fans out to `monitorNodeCAs` → refresh `replicationCertificateAuthorities` (replicator.ts:56). +Schema (defined in that function): `name` (PK), `subscriptions[]`, `system_info`, `url`, `routes`, `ca`, `ca_info`, `replicates`, `revoked_certificates`, plus `__createdtime__` / `__updatedtime__`. Subscription updates flow through `subscribeToNodeUpdates`, which fans out to `monitorNodeCAs` → refresh `replicationCertificateAuthorities` (exported from `replicator.ts`). --- ## Subsystems -**Connection management** — `NodeReplicationConnection.connect()` (replicationConnection.ts:197+), `subscriptionManager.startOnMainThread()`. Dial/retry, thread-pool delegation, recovery on disconnect. +**Connection management** — `NodeReplicationConnection.connect()` (`replicationConnection.ts`), `subscriptionManager.startOnMainThread()`. Dial/retry, thread-pool delegation, recovery on disconnect. -**Binary protocol** — `replicateOverWS` (replicationConnection.ts:339+); command constants at lines 63–86; msgpack body; back-pressure ratio recomputed every 30s (line 434). +**Binary protocol** — `replicateOverWS` (`replicationConnection.ts`); command constants are the `*_REQUEST` / `*_UPDATE` / `*_RESPONSE` consts at module top; msgpack body; back-pressure ratio recomputed on `BACK_PRESSURE_INTERVAL` (30 s). -**Data propagation** — Audit-record iteration → forwarding; blob streaming with concurrency cap `MAX_OUTSTANDING_BLOBS_BEING_SENT` (replicationConnection.ts:455); commit confirmation batched on `COMMITTED_UPDATE_DELAY = 2ms` (line 101). 
+**Data propagation** — Audit-record iteration → forwarding; blob streaming with concurrency cap `MAX_OUTSTANDING_BLOBS_BEING_SENT` (declared inside `replicateOverWS`); commit confirmation batched on `COMMITTED_UPDATE_DELAY` (2 ms). -**Latency awareness** — Ping every `PING_INTERVAL = 30s` (line 102); latency captured on pong; `Replicator.load()` (replicator.ts:376) routes cache-miss fetches to the lowest-latency node. +**Latency awareness** — Ping every `PING_INTERVAL` (30 s); latency captured on pong; `Replicator.load()` routes cache-miss fetches to the lowest-latency node. -**Node discovery & TLS** — `hdb_nodes` subscriptions, `setNode.ts` for member ops, `buildReplicationMtlsConfig()` (replicator.ts:64), `monitorNodeCAs()` (replicator.ts:268). +**Node discovery & TLS** — `hdb_nodes` subscriptions, `setNode.ts` for member ops, `buildReplicationMtlsConfig()` (`replicator.ts`), `monitorNodeCAs()` (`replicator.ts`). --- @@ -92,9 +94,9 @@ Schema: `name` (PK), `subscriptions[]`, `system_info`, `url`, `routes`, `ca`, `c 1. **Auth failures don't send DISCONNECT.** When the `authorization` promise rejects in `replicateOverWS`, the connection closes with "Unauthorized" but no DISCONNECT frame is sent — the client is expected to retry. This is the lineage of JWT/cluster auth bugs (issue #135). -2. **Origin loop prevention via delayed sequence updates.** A node receiving its own message (checked against `remoteToLocalNodeId`) skips local processing but still forwards. To avoid feedback loops, the sequence-update emit is delayed by `SKIPPED_MESSAGE_SEQUENCE_UPDATE_DELAY = 300ms` (replicationConnection.ts:97). +2. **Origin loop prevention via delayed sequence updates.** A node receiving its own message (checked against `remoteToLocalNodeId`) skips local processing but still forwards. To avoid feedback loops, the sequence-update emit is delayed by `SKIPPED_MESSAGE_SEQUENCE_UPDATE_DELAY` (300 ms; in `replicationConnection.ts`). -3. 
**Blob back-pressure & timeout.** Blobs time out after `blobTimeout` (default 120s); concurrent sends are capped at `MAX_OUTSTANDING_BLOBS_BEING_SENT = 5`; back-pressure ratio (computed every 30s) tells senders to pause. If you're seeing large-data replication hangs, look here first. +3. **Blob back-pressure & timeout.** Blobs time out after `blobTimeout` (default 120s); concurrent sends are capped at `MAX_OUTSTANDING_BLOBS_BEING_SENT = 5`; back-pressure ratio (computed every `BACK_PRESSURE_INTERVAL`) tells senders to pause. If you're seeing large-data replication hangs, look here first. 4. **Shared-buffer concurrency.** The Float64Array status buffers are touched from multiple threads with no lock. Treat them as eventually consistent; use the callback param of `subscribeToNodeUpdates` if you need notification. @@ -117,13 +119,13 @@ There is no dedicated `unitTests/replication/` directory — replication is exer ## "Where is X" cheat sheet -| Question | File:line | -| ----------------------------------------- | ----------------------------------------------------- | -| Where does a remote message get decoded? | `replicationConnection.ts:339` (`replicateOverWS`) | -| Where do cache-miss fetches pick a peer? | `replicator.ts:376` (`Replicator.load`) | -| Where is the connection retry loop? | `replicationConnection.ts:192` (`INITIAL_RETRY_TIME`) | -| Where is mTLS configured? | `replicator.ts:64` (`buildReplicationMtlsConfig`) | -| Where is a new cluster member added? | `setNode.ts:1` (whole file is one function) | -| Where are protocol message types defined? | `replicationConnection.ts:63–86` | -| Where is `hdb_nodes` schema? | `knownNodes.ts:18–61` | -| What does `cluster_status` return? | `clusterStatus.ts` (82 lines, whole file) | +| Question | Where | +| ----------------------------------------- | ---------------------------------------------------------------------------------------------- | +| Where does a remote message get decoded? 
| `replicationConnection.ts → replicateOverWS` | +| Where do cache-miss fetches pick a peer? | `replicator.ts → Replicator.load` (declared inside `setReplicator`) | +| Where is the connection retry loop? | `replicationConnection.ts → NodeReplicationConnection` (uses `INITIAL_RETRY_TIME`) | +| Where is mTLS configured? | `replicator.ts → buildReplicationMtlsConfig` | +| Where is a new cluster member added? | `setNode.ts` (the whole file is one operation) | +| Where are protocol message types defined? | `replicationConnection.ts` — top-level consts (`SUBSCRIPTION_REQUEST` … `SUBSCRIPTION_UPDATE`) | +| Where is `hdb_nodes` schema? | `knownNodes.ts → getHDBNodeTable` | +| What does `cluster_status` return? | `clusterStatus.ts` (82 lines, whole file) |
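
**Illustration (not part of the patch): shared status buffer layout.** The "Shared status buffers" section of `replication/DESIGN.md` lists seven slot positions in a `Float64Array` and warns against read-modify-write patterns. The sketch below shows the whole-value, single-slot write style that warning implies. It rests on several assumptions: a `SharedArrayBuffer` stands in for the mmap-backed buffer, `STATUS_SLOTS`, `createStatusBuffer`, `recordLatency`, and `readLatency` are hypothetical names, and computing latency as pong time minus send time is a guess at what "latency captured on pong" means. Only the position constants come from the document.

```typescript
// Slot indices copied from the "Shared status buffers" table in replication/DESIGN.md.
const CONFIRMATION_STATUS_POSITION = 0;
const RECEIVED_VERSION_POSITION = 1;
const RECEIVED_TIME_POSITION = 2;
const SENDING_TIME_POSITION = 3;
const LATENCY_POSITION = 4;
const RECEIVING_STATUS_POSITION = 5;
const BACK_PRESSURE_RATIO_POSITION = 6;

// Hypothetical: number of slots implied by the table above.
const STATUS_SLOTS = 7;

// Assumption: a SharedArrayBuffer stands in for the mmap-backed buffer the doc describes.
function createStatusBuffer(): Float64Array {
  return new Float64Array(new SharedArrayBuffer(STATUS_SLOTS * Float64Array.BYTES_PER_ELEMENT));
}

// Hypothetical helper. Each assignment overwrites one slot with a complete value;
// nothing is read from the buffer and written back, so a concurrent writer cannot
// cause a lost update the way `status[i] += delta` could.
function recordLatency(status: Float64Array, sentAtMs: number, pongAtMs: number): void {
  status[SENDING_TIME_POSITION] = sentAtMs;
  status[LATENCY_POSITION] = pongAtMs - sentAtMs;
}

// Hypothetical reader. The value may be stale relative to other threads,
// which matches the doc's "treat them as eventually consistent" guidance.
function readLatency(status: Float64Array): number {
  return status[LATENCY_POSITION];
}

const status = createStatusBuffer();
recordLatency(status, 1000, 1042);
const latencyMs = readLatency(status);
```

Readers that need to react to a change, rather than poll a possibly stale slot, should use the callback parameter of `subscribeToNodeUpdates`, as the document suggests.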