diff --git a/README.md b/README.md index 5582f00..a85ca92 100644 --- a/README.md +++ b/README.md @@ -19,8 +19,9 @@ NetCopy is **not**: bearer token, - a backup tool — there is no scheduling, snapshotting, or retention. -Tested on Linux. Runs on macOS and Windows too; see [Known issues](#known-issues) -for platform caveats. +Linux is the only platform under CI and the only one we ship release images +for. The pure-Java parts run on macOS and Windows too, but a few platform +quirks aren't tested on every commit — see [Known issues](#known-issues). ## Quick start @@ -145,40 +146,71 @@ NetCopy splits cleanly into a **control plane** and a **data plane**. | | | | data | | | | | HttpPuller TcpPuller (port 7778 server) |<------>| HttpPuller TcpPuller (port 7778 server) | | | | | | | | | -| SidecarStore (data.partial + chunks.bitmap + meta.json) | +| SidecarStore (data.partial + chunks.bitmap + chunks.hashes + meta.json) | | JsonJobStore (/jobs/.json) | +---------------------------------------------+ +---------------------------------------------+ ``` **Control plane (HTTP + WebSocket via Javalin, port 7777):** -- `GET /api/health` — liveness probe (no auth). -- `GET /api/browse` — list a directory under one of the peer's `--shared-root`s. -- `POST /api/manifest` — ask the peer to plan a transfer; returns a - `manifestId` plus a flat list of files, sizes, mtimes, and chunk plans. -- `POST /api/transfers` — start a job locally that will pull a manifest from - a remote peer. Persists a `JobState` in `/jobs/.json`. -- `GET /api/transfers/{id}` — poll job state. -- `WS /ws/progress` — live `ProgressEvent`s (subscribed per `transferId`). +| Endpoint | Auth | Purpose | +|---|---|---| +| `GET /api/health` | no | Liveness probe (open). | +| `GET /api/peer/info` | yes | Peer self-description: hostname, version, TCP blob port, root counts. | +| `GET /api/browse?root=&path=` | yes | List a directory under a `--shared-root`. 
| +| `GET /api/browse-local?root=&path=` | yes | Same shape, rooted under a `--receive-root` (UI uses it for the target panel). | +| `POST /api/browse/stats` | yes | Recursive file count + byte total per path; powers the selection-stats footer. | +| `POST /api/manifest` | yes | Plan a transfer. Returns the full manifest (entries, sizes, mtimes, chunk plans, `manifestId`). | +| `POST /api/manifest/register` | yes | Re-register a previously-issued manifest (used by the puller after a source-side restart). | +| `GET /api/blob/{manifestId}/{fileId}` | yes | HTTP data-plane: file bytes (with `Range` support, `X-Chunk-Hash` response header). | +| `GET /api/hash/{manifestId}/{fileId}` | yes | Lazy XXH3-128 of a manifest entry; returns `202` while computing. | +| `POST /api/transfers` | yes | Start a job (target host pulls from a remote source). | +| `GET /api/transfers` | yes | List status snapshots (newest first). | +| `GET /api/transfers/{id}` | yes | Single status snapshot, including per-file table and per-chunk metrics. | +| `POST /api/transfers/{id}/{pause,resume,cancel}` | yes | Lifecycle controls. | +| `DELETE /api/transfers/{id}` | yes | Dismiss a terminal-state job from the persistent store. | +| `POST /api/relay/push` | yes | "Push from here to peer" — proxies `POST /api/transfers` to the peer using its token. | +| `GET /api/metrics` | yes | Host metrics (CPU/RAM/disk/GC, top threads) + per-server serve metrics. | +| `WS /ws/progress` | yes | Live `ProgressEvent` stream (subscribe per transfer or wildcard). | **Data plane (two interchangeable protocols):** - `GET /api/blob/{manifestId}/{fileId}` with HTTP `Range` headers, served by Javalin via `FileChannel.transferTo`. - A custom binary TCP protocol on port 7778: framed `[len:u32][type:u8][payload]` - with `HELLO`/`REQUEST`/`DATA_HEAD`/`DATA`/`DATA_END`/`ERR`/`BYE`. Designed to - reuse one connection across many `pullChunk` calls and avoid HTTP parsing - overhead at the price of a more interesting wire format. 
+ with `HELLO` / `REQUEST` / `DATA_HEAD` / `DATA` / `DATA_END` / `DATA_END_V2` + (xxh3 trailer, single-pass; v0.3.0+) / `ERR` / `BYE`. Designed to reuse one + connection across many `pullChunk` calls and avoid HTTP parsing overhead at + the price of a more interesting wire format. The protocol is selected per job at start time. See -[docs/protocol-comparison.md](docs/protocol-comparison.md) for benchmarks. +[docs/protocol-comparison.md](docs/protocol-comparison.md) for design notes. **State and resume:** - Each in-progress target file owns a sidecar directory `.netcopy/` - containing `data.partial` (sparse, written at offsets), `meta.json` - (size, mtime, chunk plan), and `chunks.bitmap` (1 bit per chunk, set after - the chunk is downloaded **and** its xxh3-128 hash verified). + containing four files: + - `data.partial` — sparse, pre-allocated to the final size, written via + positional FileChannel writes; + - `meta.json` — immutable per-file descriptor (relPath, size, sourceMtime, + chunk plan, `schemaVersion`); + - `chunks.bitmap` — one bit per chunk, set after the chunk's bytes are + fsynced **and** its **xxh3-128** chunk-level hash matches what the source + advertised on the wire; + - `chunks.hashes` — fixed-size array of XXH3-128 digests (16 bytes per + chunk), positionally written as each chunk completes. Used by the + selective re-verify path on full-file hash mismatch so resume re-pulls + only the corrupted chunks instead of the whole file. +- Hashing has two layers: + - **Per-chunk** verification (and the on-the-wire `X-Chunk-Hash` / + `DATA_END_V2`) is **XXH3-128** — fast, ~10 GB/s on x86, allocates a small + per-chunk buffer. + - **Full-file finalize** is **SHA-256** in 256 KiB strides. Streaming + XXH3-128 in this codebase buffers all bytes into a `ByteArrayOutputStream` + that overflows the array-size limit on multi-GiB files — SHA-256 streams + cleanly via `MessageDigest.update`. 
The resulting digest lives in the + JSON's `hashHex` field for v0.x wire-format stability (the field name + will change in a future major bump). - After all chunks are verified, `FileFinalizer` rehashes the whole file and atomic-renames `data.partial` to the final target path. - A job's overall state lives in `/jobs/.json` (one JSON per diff --git a/docs/protocol-comparison.md b/docs/protocol-comparison.md index 8e645a1..943b2de 100644 --- a/docs/protocol-comparison.md +++ b/docs/protocol-comparison.md @@ -1,82 +1,88 @@ -# HTTP vs TCP — protocol comparison - -NetCopy ships two interchangeable data-plane protocols. This document is -the home of the quantitative comparison between them. The numbers below -are produced by task **V5 — protocol comparison** and are placeholders -until that pass runs. - -## What we are measuring - -The same workload runs back-to-back over both protocols, on the same two -hosts, with the same chunk plan. Each row in the table below should report -median and p95 of three runs. - -- **Throughput**: useful payload bytes per wall-clock second, averaged over - the whole transfer. -- **Time to first byte (TTFB)**: from `POST /api/transfers` accepting to the - first `ChunkCompleted` ProgressEvent. -- **CPU time**: server-side and client-side `getrusage` deltas, normalised - per GB transferred. -- **Connection count**: peak concurrent sockets the data plane opened. -- **Behaviour under loss**: same transfer with `tc qdisc add ... netem - loss 1%` applied to the receive interface — does the protocol recover - cleanly, and what is the throughput delta. - -## Test workloads +# HTTP vs TCP — protocol design notes + +NetCopy ships two interchangeable data-plane protocols. The user picks one +per transfer. This document explains the trade-offs and points to a manual +reproduction for benchmark numbers. + +## What's different + +Both protocols carry the same byte payload (file contents, in chunks, with +XXH3-128 chunk-level verification). 
They differ in framing and how the +hash gets onto the wire: + +- **HTTP** — `GET /api/blob/{manifestId}/{fileId}` with a `Range: + bytes=START-END` header per chunk. Connection reuse via keep-alive. Server + pre-computes the chunk's XXH3-128, sets it as `X-Chunk-Hash` response + header, then streams the body via `FileChannel.transferTo` (which on Linux + decays to `sendfile(2)`). Pro: trivial to debug with `curl`, plays well with + any HTTP-aware proxy. Con: HTTP parsing overhead per chunk, and HTTP/1.1 + connection-per-concurrent-chunk. +- **TCP** — one long-lived connection per peer, multiplexed by `reqId`. + Custom binary framing (see `tasks/contracts/data-formats.md`). Versioned + protocol: v1 is two-pass (hash → DataHead → stream → DataEnd, identical to + the HTTP path conceptually); **v2 (default since v0.3.0)** streams and + hashes in a single pass, putting the digest in a trailing `DataEndV2` + frame. Pro: fewer TCP connections (one per peer), no HTTP overhead, single + read pass on the source-side disk. Con: needs its own port (`--tcp-port`), + not curl-debuggable. + +## Where the difference matters + +- **Many small files (≤ 1 MB each).** TCP wins clearly. HTTP pays a full + request line + headers per chunk; with thousands of files this dominates. +- **One big file (multi-GB) on a fast disk.** Mostly identical. Both + protocols are CPU-bound on the hash and IO-bound on the disk; framing + overhead is in the noise. +- **One big file on a cold-cache HDD.** TCP v2 is meaningfully faster + because it does one disk read per chunk on the source instead of two. + v1's two-pass design was tractable on SSDs (the second pass came from the + page cache) but on HDD the source ended up reading the file twice with + cold seeks. v0.3.0 fixed that. +- **Lossy network.** Both rely on the kernel's TCP retransmit; the + application layers don't differ. NetCopy retries failed chunks with + exponential backoff identically. 
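The chunk retry behaviour named in the last bullet can be sketched as below. This is an illustrative sketch only, not NetCopy's actual retry code: the class name `ChunkRetry`, the doubling factor, the delay cap, and the attempt limit are all assumptions. The diff only states that failed chunks are retried with exponential backoff, identically for both protocols.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: retry one chunk pull with capped exponential backoff. */
final class ChunkRetry {
    /** Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs. */
    static long delayMs(int attempt, long baseMs, long maxMs) {
        // Clamp the shift so the multiplication cannot overflow for sane base delays.
        long factor = 1L << Math.max(0, Math.min(attempt, 30));
        return Math.min(maxMs, baseMs * factor);
    }

    /** Runs `pull` up to maxAttempts times, sleeping between failed attempts. */
    static <T> T withRetry(Callable<T> pull, int maxAttempts, long baseMs, long maxMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return pull.call();
            } catch (Exception e) {
                last = e;
                // No sleep after the final attempt; the failure is rethrown below.
                if (attempt + 1 < maxAttempts) {
                    Thread.sleep(delayMs(attempt, baseMs, maxMs));
                }
            }
        }
        throw last;
    }
}
```

Because the retry wrapper only sees a `Callable`, the same policy could sit in front of either an HTTP or a TCP chunk pull, which matches the claim that the two protocols do not differ at this layer.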
+
+In practice the user-visible bottleneck on a LAN is almost always **the
+slower of the two disks** (source HDD seek + receiver fsync), not the
+protocol. We've measured ~50–60 MB/s sustained from a single HDD source
+with 8 parallel chunks regardless of which protocol we pick.
+
+## Reproducing a comparison by hand
+
+1. Start two daemons with identical flags except `--port`, `--tcp-port`, and
+   roots. Pin the JVM with `-XX:ActiveProcessorCount=N` if you want to
+   compare across CPU budgets.
+2. Pre-generate the workload under one daemon's `--shared-root`.
+3. From the UI on the other daemon, plan a transfer, then start it twice in
+   a row — once with `protocol: "http"`, once with `"tcp"`. Record the
+   `TransferCompleted` event's `totalDurationMs` and `avgThroughputBps`,
+   and screenshot the Performance modal's "This transfer (chunks)" tile for
+   per-chunk timings.
+4. For loss runs:
+
+   ```bash
+   # <iface> is a placeholder for the receive interface name.
+   sudo tc qdisc add dev <iface> root netem loss 1%
+   # ... run the transfer ...
+   sudo tc qdisc del dev <iface> root
+   ```
+
+5. Repeat with the TCP server disabled (`--tcp-port 0`) on the source side
+   to confirm the HTTP fallback works.
+
+We deliberately don't ship a canned benchmark table here: numbers from a
+single hardware setup mislead readers comparing to their own. The
+Performance modal already exposes the per-chunk timings (source latency,
+wire time, persist time, pool acquire wait) you need to identify your own
+bottleneck.
+ +## Suggested workloads | ID | Description | |---|---| -| W1 | One 32 GB file (large-chunk path) | -| W2 | 1000 small files of ~64 KB each (small-chunk path, file-parallelism dominates) | +| W1 | One 32 GB file (large-chunk path; tests sustained throughput) | +| W2 | 1000 small files of ~64 KB each (request count dominates) | | W3 | Mixed: 4 GB ISO + 50 MB of small docs (typical real-world mix) | -| W4 | W1 again, but with `--file-parallelism=1 --chunks-per-file=1` (single-stream baseline) | - -Each workload runs once over HTTP (`--tcp-port 0` on the server side) and -once over TCP (`protocol: "tcp"` in the transfer request). - -## Results — placeholder +| W4 | W1 again with `--file-parallelism=1 --chunks-per-file=1` (single-stream baseline) | -Filled in by V5. - -| Workload | Protocol | Throughput (MB/s) | TTFB (ms) | Server CPU (s/GB) | Peak conns | Loss 1% throughput | -|---|---|---|---|---|---|---| -| W1 | HTTP | _TBD_ | _TBD_ | _TBD_ | _TBD_ | _TBD_ | -| W1 | TCP | _TBD_ | _TBD_ | _TBD_ | _TBD_ | _TBD_ | -| W2 | HTTP | _TBD_ | _TBD_ | _TBD_ | _TBD_ | _TBD_ | -| W2 | TCP | _TBD_ | _TBD_ | _TBD_ | _TBD_ | _TBD_ | -| W3 | HTTP | _TBD_ | _TBD_ | _TBD_ | _TBD_ | _TBD_ | -| W3 | TCP | _TBD_ | _TBD_ | _TBD_ | _TBD_ | _TBD_ | -| W4 | HTTP | _TBD_ | _TBD_ | _TBD_ | _TBD_ | _TBD_ | -| W4 | TCP | _TBD_ | _TBD_ | _TBD_ | _TBD_ | _TBD_ | - -## Provisional reasoning - -Until V5 produces real numbers, the design intuition is: - -- **W1 (one big file)**: the two protocols should be within a few percent. - Both are dominated by `FileChannel.transferTo` on the server and direct - `pwrite` on the client; the framing overhead is amortised across multi-MB - chunks. -- **W2 (many small files)**: TCP should win materially. HTTP pays a full - request/response round-trip per chunk, plus header parsing; TCP reuses - one connection and sends only an 8-byte `REQUEST` frame per chunk. -- **W3 (mixed)**: closer to W1 by byte count; closer to W2 by request count. 
-  Expect TCP to be modestly ahead.
-- **W4 (single stream)**: both protocols saturate one TCP flow; the
-  bottleneck is the kernel and the NIC, not the framing.
-
-## Reproducing the benchmark
-
-V5 will publish a script under `verify/V5/` that drives both daemons in
-the same JVM (or two JVMs on the same host) using a tmpfs receive root
-to factor out disk speed. Until then, reproduce by hand:
-
-1. Start two daemons with identical flags except `--port`, `--tcp-port`,
-   and roots. Pin the JVM with `-XX:ActiveProcessorCount=N` if you want
-   to compare across CPU budgets.
-2. Pre-generate the workload under one daemon's `--shared-root`.
-3. From the UI on the other daemon, plan a transfer, then start it twice
-   in a row — once with protocol HTTP, once with TCP. Record the
-   `TransferCompleted` event's `totalDurationMs` and `avgThroughputBps`.
-4. For loss runs: `sudo tc qdisc add dev root netem loss 1%`.
-   Don't forget to `tc qdisc del` afterwards.
+W2 is the workload where TCP shows its largest advantage; W4 is where the
+two protocols converge.
diff --git a/tasks/contracts/data-formats.md b/tasks/contracts/data-formats.md
index f310247..9cb0bc4 100644
--- a/tasks/contracts/data-formats.md
+++ b/tasks/contracts/data-formats.md
@@ -2,6 +2,13 @@
 All REST payloads and persisted JSON files use these structures.
 Changes must be agreed with the team lead.
 
+> **Note for v0.4.0+:** persisted state files (`/jobs/*.json` and
+> `.netcopy/meta.json`) carry a `schemaVersion` field. Readers MUST refuse
+> any file with `schemaVersion > CURRENT_SCHEMA_VERSION` to avoid
+> mis-interpreting a future format. Pre-v0.4.0 files have no field — Jackson
+> defaults to 0, treated as `1` on read. Bumping the constant signals an
+> intentional break.
+
 ---
 
 ## REST — `POST /api/manifest`
@@ -10,15 +17,22 @@ Request:
 ```json
 {
   "rootIdx": 0,
+  "baseSubdir": "Media",
   "paths": ["movies/big.mkv", "docs/"]
 }
 ```
 
+`baseSubdir` (optional, since v0.2.5) is the user's source-side breadcrumb.
Entry +`relPath` values become relative to it, so the receiver lays out +`movies/big.mkv` under its target root **without** the `Media/` prefix that the +user navigated through. + Response (`Manifest`): ```json { "manifestId": "9f6a2c8e-...", "rootIdx": 0, + "baseSubdir": "Media", "totalBytes": 5500000000, "totalFiles": 3, "entries": [ @@ -43,6 +57,19 @@ Response (`Manifest`): --- +## REST — `POST /api/manifest/register` + +Body: a complete `Manifest` (the JSON the source returned from `POST /api/manifest`). +Response: `204 No Content`. + +Used when the puller resumes a transfer whose source restarted — the in-memory +registry on the source is gone and would otherwise reject every blob request as +"manifest not found". Re-registration is gated server-side: every FILE entry +must, with `NOFOLLOW_LINKS`, resolve to a regular file whose real path is still +inside the shared root. Symlink-traversed entries are rejected with `400`. + +--- + ## REST — `GET /api/browse?root=&path=` Response: @@ -54,10 +81,39 @@ Response: "entries": [ { "name": "big.mkv", "type": "file", "size": 5368709120, "mtime": 1714000000 }, { "name": "subs", "type": "dir" } + ], + "rootFree": { "usable": 723000000000, "total": 1000000000000 } +} +``` + +`rootFree` (v0.3.1+) reflects `Files.getFileStore(root)`. May be absent on +filesystems where the JVM cannot answer. + +`GET /api/browse-local` has the same shape but takes a `--receive-root` index. + +--- + +## REST — `POST /api/browse/stats` + +Request: +```json +{ "rootIdx": 0, "kind": "shared", "baseSubdir": "Media", "paths": ["movies/", "docs/"] } +``` + +Response: +```json +{ + "totalFiles": 1234, + "totalBytes": 56789012345, + "perPath": [ + { "path": "movies/", "files": 1200, "bytes": 56000000000 }, + { "path": "docs/", "files": 34, "bytes": 789012345 } ] } ``` +Walks each path with `NOFOLLOW_LINKS` and a depth cap (`maxDepth=32`, v0.4.0+). 
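A minimal sketch of the kind of walk described for `POST /api/browse/stats`, assuming plain `java.nio.file` and nothing else. The class and method names (`PathStats`, `count`) are hypothetical and the real handler may differ. Only the two constraints stated above are taken from the diff: symlinks are not followed, and the depth is capped at 32.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumSet;

/** Hypothetical helper: recursive file count and byte total for one path. */
final class PathStats {
    static final int MAX_DEPTH = 32; // depth cap named in the contract (v0.4.0+)

    /** Returns {fileCount, byteTotal} for everything under root, without following symlinks. */
    static long[] count(Path root) throws IOException {
        long[] acc = new long[2];
        // An empty option set leaves FOLLOW_LINKS off, so symlinks are reported
        // as links (not regular files) and are never descended into.
        Files.walkFileTree(root, EnumSet.noneOf(FileVisitOption.class), MAX_DEPTH,
            new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    // Directories at the depth cap also arrive here; count regular files only.
                    if (attrs.isRegularFile()) {
                        acc[0]++;
                        acc[1] += attrs.size();
                    }
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException exc) {
                    // Assumption: unreadable entries are skipped rather than failing the walk.
                    return FileVisitResult.CONTINUE;
                }
            });
        return acc;
    }
}
```

Calling `count` once per requested path and summing the results would give the `perPath` rows and the `totalFiles` / `totalBytes` totals shown in the response example.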
+ --- ## REST — `POST /api/transfers` @@ -66,17 +122,25 @@ Request: ```json { "sourcePeerUrl": "http://192.168.1.10:7777", - "sourcePeerToken": "github_pat_xxx_or_netcopy_token", + "sourcePeerToken": "...", "sourcePeerTcpPort": 7778, - "manifestId": "9f6a2c8e-...", + "manifest": { "manifestId": "9f6a2c8e-...", "...": "as returned by POST /api/manifest" }, "protocol": "tcp", "targetRootIdx": 0, - "targetSubdir": "from-deb1/2026-04-29", + "targetSubdir": "from-host-a/2026-04-29", "conflictPolicy": "skip", + "acknowledgeOverwrite": false, "parallelism": { "files": 4, "chunksPerFile": 8 } } ``` +- `manifest` MUST be the full manifest object (the puller has no copy locally; + passing only `manifestId` is rejected with `400`). +- `conflictPolicy ∈ {"skip", "rename", "overwrite"}`. +- `acknowledgeOverwrite` (v0.4.0+) MUST be `true` when `conflictPolicy = "overwrite"`, + otherwise the request is rejected with `400`. The UI sends what its checkbox + state is; a scripted client must echo the same explicit acknowledgement. + Response: ```json { "transferId": "uuid", "state": "running" } @@ -103,13 +167,25 @@ Response: "totalDurationMs": null, "files": [ { "id": "f1", "relPath": "movies/big.mkv", "size": 5368709120, - "bytesDone": 1234567, "state": "downloading", "hash": null } - ] + "bytesDone": 1234567, "state": "downloading", "hashHex": null } + ], + "metrics": { + "sourceLatency": { "p50Ms": 1.4, "p95Ms": 5.2, "maxMs": 18.0 }, + "wireTime": { "p50Ms": 16.7, "p95Ms": 42.0, "maxMs": 110.0 }, + "persistTime": { "p50Ms": 0.6, "p95Ms": 2.0, "maxMs": 8.0 }, + "poolAcquireWait": { "p50Ms": 0.0, "p95Ms": 0.1, "maxMs": 1.2 }, + "succeeded": 1234, "retried": 2, "failed": 0, "inFlight": 8 + } } ``` -`state ∈ {"planned","running","paused","completed","failed","cancelled"}` -`files[].state ∈ {"pending","downloading","verifying","done","failed","skipped"}` +- `state ∈ {"planned", "running", "paused", "completed", "failed", "cancelled"}` — **lowercase wire format**. 
+- `files[].state ∈ {"pending", "downloading", "verifying", "done", "failed", "skipped"}`. +- `files[].hashHex` — final-file digest as 64-char SHA-256 hex (the field name retains + its historical "hash" identifier for v0.x wire compatibility; the actual algorithm + is documented in `FileFinalizer.java`). Per-chunk hashes on the wire are XXH3-128. +- `metrics` is `null` when the puller for this job is not currently in memory + (terminal jobs, reload after restart). --- @@ -118,26 +194,36 @@ Response: ```json { "id": "uuid", - "direction": "INBOUND", + "direction": "inbound", "peerUrl": "http://192.168.1.10:7777", "peerTcpPort": 7778, - "protocol": "TCP", - "manifestJson": "{...full manifest as a string...}", + "protocol": "tcp", + "manifestJson": "{\"manifestId\":\"...\",\"...\":\"...\"}", "targetRootIdx": 0, - "targetSubdir": "from-deb1/2026-04-29", - "conflictPolicy": "SKIP", + "targetSubdir": "from-host-a/2026-04-29", + "conflictPolicy": "skip", "fileParallelism": 4, "chunksPerFile": 8, - "state": "RUNNING", + "state": "running", "createdAt": 1714000000000, "updatedAt": 1714000123000, "completedAt": null, "totalDurationMs": null, - "avgThroughputBps": null + "avgThroughputBps": null, + "schemaVersion": 1 } ``` -Atomic write: `.json.tmp` → fsync → `rename(.json.tmp, .json)`. +All enum-valued fields (`direction`, `protocol`, `conflictPolicy`, `state`) are +serialised **lowercase** since v0.3.0. Pre-v0.3.0 files used UPPERCASE; the +`@JsonCreator` accepts either casing on read for backward compat. + +`schemaVersion` is `1` for new writes (v0.4.0+). Pre-v0.4.0 files have no +field — readers default to 0 and treat as schema 1. Future bumps signal an +intentional break. + +Atomic write: `.json.tmp` → `force(true)` → `Files.move(..., ATOMIC_MOVE, +REPLACE_EXISTING)`. POSIX perms (v0.4.0+): file `0600`, dir `0700`. --- @@ -153,17 +239,54 @@ Atomic write: `.json.tmp` → fsync → `rename(.json.tmp, .json)`. 
     { "idx": 0, "offset": 0, "len": 33554432 },
     { "idx": 1, "offset": 33554432, "len": 33554432 }
   ],
-  "expectedHashHex": null
+  "expectedHashHex": null,
+  "schemaVersion": 1
 }
 ```
 
+`expectedHashHex` is hex-encoded XXH3-128 if the source pre-computed it at
+planning time, else `null`. `schemaVersion` semantics same as for `JobState`.
+
+Written exactly once on sidecar creation with `CREATE_NEW + force(true)`.
+
 ---
 
 ## Sidecar — `.netcopy/chunks.bitmap`
 
-Binary format. `chunkCount` bits, padded to a whole byte. Bit i = 1 if chunk[i] is complete and its hash has been verified.
+Binary. `chunkCount` bits, padded to a whole byte. Bit i = 1 if chunk[i] is complete and
+its XXH3-128 hash has been verified.
 
-Atomic write: `chunks.bitmap.tmp` → fsync → `rename(chunks.bitmap.tmp, chunks.bitmap)`. A batch update may be done every 100 ms or after the fsync of the chunk bytes in `data.partial`, whichever comes first.
+Updated in place via positional `FileChannel.write(buf, 0)`. The bitmap is
+idempotent — a torn write at most loses bits, causing those chunks to be
+re-pulled on resume. A successful flush of the bitmap is always preceded by an
+fsync of the corresponding `data.partial` byte range and `chunks.hashes` slot.
+On v0.3.1+ the partial / hashes / bitmap fsync triple is batched (every 8
+chunks or 500 ms).
+
+---
+
+## Sidecar — `.netcopy/chunks.hashes`
+
+Binary. Fixed length = `chunkCount × 16` bytes. Each chunk's XXH3-128 digest
+lives at byte offset `idx * 16`. Zero-filled at create time; positionally
+overwritten as each chunk completes (before the bitmap bit flips).
+
+Read only by the post-pull selective-re-verify path: on full-file SHA-256
+mismatch, the puller hashes each chunk's bytes from the partial and compares
+against this file. Only the bits of chunks whose stored hash diverges are
+cleared, so resume re-pulls only the corrupted chunks.
+
+Files written by NetCopy ≤ v0.2.4 don't have this companion; the verify-fail
+path falls back to wiping the whole bitmap.
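The selective re-verify described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the class name `SelectiveReverify`, the use of in-memory byte arrays instead of file channels, `int` offsets, and the LSB-first bit order inside each bitmap byte are not taken from the diff. The hash function is passed in as a parameter because the JDK has no XXH3-128 implementation; any function that returns a 16-byte digest can stand in.

```java
import java.util.Arrays;
import java.util.function.Function;

/** Hypothetical sketch of the selective re-verify pass over chunks.hashes. */
final class SelectiveReverify {
    static final int DIGEST_LEN = 16; // one 128-bit digest per chunk

    /**
     * Clears the bitmap bit of every chunk whose recomputed digest differs from
     * the digest stored at byte offset idx * 16. Returns the number of bits cleared.
     */
    static int clearCorrupted(byte[] partial, int[] offsets, int[] lens,
                              byte[] storedHashes, byte[] bitmap,
                              Function<byte[], byte[]> hasher) {
        int cleared = 0;
        for (int idx = 0; idx < offsets.length; idx++) {
            byte[] chunk = Arrays.copyOfRange(partial, offsets[idx], offsets[idx] + lens[idx]);
            byte[] actual = hasher.apply(chunk);
            int slot = idx * DIGEST_LEN;
            byte[] stored = Arrays.copyOfRange(storedHashes, slot, slot + DIGEST_LEN);
            if (!Arrays.equals(actual, stored)) {
                // ASSUMPTION: bit i lives in byte i / 8 at position i % 8 (LSB first).
                bitmap[idx / 8] &= (byte) ~(1 << (idx % 8));
                cleared++;
            }
        }
        return cleared;
    }
}
```

Chunks whose bit stays set are left alone on resume, which is the point of keeping per-chunk digests next to the bitmap: a full-file mismatch costs a local re-hash instead of a full re-download.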
+
+---
+
+## Sidecar — `.netcopy/data.partial`
+
+Sparse, pre-allocated to the final size via `RandomAccessFile.setLength`.
+Written via positional `FileChannel.write(buf, offset)` from N parallel chunk
+workers. After all chunks are verified and the full-file SHA-256 check passes,
+`FileFinalizer` atomically renames this file onto its final target.
 
 ---
 
@@ -171,16 +294,27 @@ Atomic write: `.json.tmp` → fsync → `rename(.json.tmp, .json)`.
 
 ```json
 { "type": "Subscribe", "transferId": "uuid" }
+{ "type": "Subscribe" }
 { "type": "Unsubscribe", "transferId": "uuid" }
 { "type": "UnsubscribeAll" }
 ```
 
+`Subscribe` without a `transferId` is a wildcard: the client receives events for all transfers
+on this server. Cap (v0.4.0+): 256 distinct subscriptions per session.
+
 ## WebSocket — server → client (`ProgressEvent`)
 
-The discriminator is `type`. All events include `transferId` and `timestamp`.
+The discriminator is `type`. All events include `transferId` (or `null` for
+"global" ones) and `timestamp`.
 
 ```json
-{ "type": "ManifestReady", "transferId": "...", "timestamp": 1714000001234,
+{ "type": "TransferRegistered", "transferId": "...", "timestamp": ...,
+  "state": "running", "protocol": "tcp", "peerUrl": "...",
+  "totalBytes": 5500000000, "totalFiles": 3 }
+
+{ "type": "TransferDismissed", "transferId": "...", "timestamp": ... }
+
+{ "type": "ManifestReady", "transferId": "...", "timestamp": ...,
   "manifestId": "...", "totalBytes": 5500000000, "totalFiles": 3 }
 
 { "type": "TransferStarted", "transferId": "...", "timestamp": ..., "protocol": "tcp" }
@@ -189,7 +323,9 @@ Atomic write: `.json.tmp` → fsync → `rename(.json.tmp, .json)`.
"bytesDone": 1234567, "totalBytes": 5500000000, "filesDone": 1, "totalFiles": 3, "currentThroughputBps": 102400000, "avgThroughputBps": 95000000, - "currentFiles": [ { "fileId": "f7", "relPath": "movies/big.mkv", "bytesDone": 1234, "size": 5368709120 } ] } + "currentFiles": [ { "id": "f7", "relPath": "movies/big.mkv", + "bytesDone": 1234, "size": 5368709120, + "state": "downloading", "verifyBytesDone": 0 } ] } { "type": "ChunkCompleted", "transferId": "...", "timestamp": ..., "fileId": "f7", "chunkIdx": 5, "bytes": 33554432 } @@ -209,6 +345,13 @@ Atomic write: `.json.tmp` → fsync → `rename(.json.tmp, .json)`. "reason": "source_changed", "detail": "movies/big.mkv mtime mismatch" } ``` +`TransferProgress.currentFiles[].verifyBytesDone` (v0.3.1+) climbs from 0 → size +while the file is in the `verifying` state. Used by the UI to show a sub-progress +bar while the post-pull SHA-256 streams. + +`TransferProgress` events are throttled: at most one per (session, transferId) +every 200 ms. + --- ## TCP wire — Frame layout @@ -225,11 +368,12 @@ Atomic write: `.json.tmp` → fsync → `rename(.json.tmp, .json)`. | 0x06 | DATA_END | `reqId:u32` | | 0x07 | ERR | `reqId:u32` `code:u16` `messageLen:u16` `message:bytes(messageLen)` | | 0x08 | BYE | (no payload) | +| 0x09 | DATA_END_V2 | `reqId:u32` `xxh3:bytes(16)` | Все строки — UTF-8. Все длины — Big-Endian. UUID — 16 байт raw (8 байт MSB BE + 8 байт LSB BE). Max payload size: -- HELLO/HELLO_OK/REQUEST/DATA_HEAD/DATA_END/ERR/BYE: 64 KB +- HELLO/HELLO_OK/REQUEST/DATA_HEAD/DATA_END/DATA_END_V2/ERR/BYE: 64 KB - DATA: до 1 MB (рекомендуется 256 KB чанки внутри одного REQUEST) Если frame превышает лимит — closing connection с ERR_BAD_REQUEST. @@ -241,4 +385,23 @@ Codes для ERR: - 1020 ERR_BAD_REQUEST (out-of-range offset, etc.) - 1500 ERR_INTERNAL -Protocol version: текущая = 1. +### Protocol versions + +- `protoVer = 1` — original two-pass design. 
Source reads the requested byte + range twice: once to compute XXH3-128 (which is then placed in `DATA_HEAD` + before the body), then again to ship it as a sequence of `DATA` frames + followed by a single `DATA_END`. Receiver verifies the XXH3 from + `DATA_HEAD`. +- `protoVer = 2` (server default since v0.3.0) — single-pass. Source reads the + range once, hashing inline. `DATA_HEAD` carries an all-zero sentinel + `xxh3` (receiver ignores it). The real XXH3 trailer arrives in + `DATA_END_V2` after the last `DATA` frame. Saves one full disk read on the + source — meaningful on cold-cache HDDs. + +The negotiated version is `min(client.protoVer, server.protoVer)` and +returned in `HELLO_OK`. Older v1-only clients keep working transparently; +v2 clients downgrade gracefully against v1 servers. + +Authentication: a connection that does not deliver a valid `HELLO` within 30 +seconds is closed by the server (v0.4.0+ HELLO timeout watchdog). The server +also caps concurrent connections at 1024.
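
To make the version negotiation and the two end-of-data frames concrete, here is a small sketch. It is not the project's codec: the class name `WireSketch` and both method names are hypothetical, and it assumes the `len` field counts payload bytes only (excluding the type byte), which the contract text shown in this diff does not spell out. The type codes `0x06` and `0x09`, the big-endian integers, and the 16-byte `xxh3` trailer come from the frame table above.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of protoVer negotiation and DATA_END / DATA_END_V2 framing. */
final class WireSketch {
    static final byte TYPE_DATA_END = 0x06;
    static final byte TYPE_DATA_END_V2 = 0x09;

    /** Negotiated version as described above: the lower of the two sides. */
    static int negotiate(int clientProtoVer, int serverProtoVer) {
        return Math.min(clientProtoVer, serverProtoVer);
    }

    /** Encodes the end-of-data frame for the negotiated version as [len:u32][type:u8][payload]. */
    static byte[] encodeDataEnd(int protoVer, int reqId, byte[] xxh3) {
        boolean v2 = protoVer >= 2;
        if (v2 && (xxh3 == null || xxh3.length != 16)) {
            throw new IllegalArgumentException("v2 needs a 16-byte digest");
        }
        int payloadLen = 4 + (v2 ? 16 : 0);
        // ASSUMPTION: len covers the payload only, not the type byte.
        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + payloadLen); // big-endian by default
        buf.putInt(payloadLen);
        buf.put(v2 ? TYPE_DATA_END_V2 : TYPE_DATA_END);
        buf.putInt(reqId);
        if (v2) {
            buf.put(xxh3); // trailer digest computed while streaming the range
        }
        return buf.array();
    }
}
```

With a v1 peer the sender falls back to the plain `DATA_END` frame and the digest has to travel in `DATA_HEAD` instead, which is why v1 needs the extra read pass.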