diff --git a/src/main/java/dev/netcopy/server/tcp/TcpConnectionHandler.java b/src/main/java/dev/netcopy/server/tcp/TcpConnectionHandler.java
index 22513da..10c6184 100644
--- a/src/main/java/dev/netcopy/server/tcp/TcpConnectionHandler.java
+++ b/src/main/java/dev/netcopy/server/tcp/TcpConnectionHandler.java
@@ -25,17 +25,18 @@
*
*
Conversation:
*
- * - Read first frame, expect {@link Frame.Hello}. Validate {@code protoVer == 1} and the
- * token via {@link TokenGate#matches(String)}. On failure: write {@link Frame.Err} with
- * {@link FrameCodec#ERR_INCOMPATIBLE_VERSION} or {@link FrameCodec#ERR_UNAUTHORIZED} and
- * close the connection.
- * - Reply with {@link Frame.HelloOk}.
+ * - Read first frame, expect {@link Frame.Hello}. Reject {@code protoVer < 1} with
+ * {@link FrameCodec#ERR_INCOMPATIBLE_VERSION}; validate the token via
+ * {@link TokenGate#matches(String)}, reject mismatch with
+ * {@link FrameCodec#ERR_UNAUTHORIZED}. The negotiated version is
+ * {@code min(client.protoVer, SERVER_PROTO_VER)} and reported back in
+ * {@link Frame.HelloOk} so the client knows which trailer format to expect.
* - Loop:
*
* - {@link Frame.Bye} or EOF — close gracefully.
* - {@link Frame.Request} — resolve {@code manifestId} → {@link Manifest},
- * {@code fileId} → entry, validate {@code offset}/{@code length}, hash the byte range,
- * emit {@link Frame.DataHead} → N × {@link Frame.Data} → {@link Frame.DataEnd}.
+ * {@code fileId} → entry, validate {@code offset}/{@code length}, then stream the byte
+ * range using whichever variant the negotiated version selects (see below).
* Errors map to {@link FrameCodec#ERR_NOT_FOUND}/{@link FrameCodec#ERR_BAD_REQUEST}/
* {@link FrameCodec#ERR_INTERNAL}.
* - Anything else — {@link Frame.Err} with {@link FrameCodec#ERR_BAD_REQUEST} and
@@ -44,14 +45,21 @@
*
*
*
- * Hashing+streaming strategy: the byte range is read from disk in {@value #DATA_CHUNK_BYTES}
- * chunks; each chunk is fed into a streaming {@link Xxh3Hasher} (one full pass through the
- * bytes) to compute the canonical xxh3-128 of the range. Once the digest is known we write the
- * {@link Frame.DataHead}, then we re-read the range and ship it as a sequence of
- * {@link Frame.Data} frames followed by a single {@link Frame.DataEnd}. Two passes is the price
- * of placing the hash before the data on the wire — the alternative (buffering the
- * whole range in memory) would not scale to multi-GB files. The OS page cache absorbs the
- * second read for any range that fits in RAM.
+ *
Hashing + streaming variants
+ *
+ * v2 (default since v0.3.0) — single-pass. The byte range is read once in
+ * {@value #DATA_CHUNK_BYTES} chunks; every chunk is simultaneously written to the wire and
+ * fed through a streaming {@link Xxh3Hasher}. {@link Frame.DataHead} goes out first with an
+ * all-zero sentinel {@code xxh3} (the receiver ignores it on v2), DATA frames carry the body,
+ * and the real digest arrives in a trailing {@link Frame.DataEndV2}. One pass over the bytes
+ * — meaningful on cold-cache HDDs where the v1 second read paid full seek cost.
+ *
+ *
v1 (legacy) — two passes. Read the range once into a streaming
+ * {@link Xxh3Hasher} to compute the digest; write {@link Frame.DataHead} carrying the digest;
+ * re-read the same range (typically from page cache) and ship it as a sequence of
+ * {@link Frame.Data} frames followed by {@link Frame.DataEnd}. Hash-before-body is wire-
+ * compatible with simpler clients but doubles disk read load on the source. Selected
+ * automatically when the client speaks v1 — old clients keep working.
*/
final class TcpConnectionHandler {
diff --git a/src/main/java/dev/netcopy/transfer/Puller.java b/src/main/java/dev/netcopy/transfer/Puller.java
index bb609ee..7052913 100644
--- a/src/main/java/dev/netcopy/transfer/Puller.java
+++ b/src/main/java/dev/netcopy/transfer/Puller.java
@@ -452,9 +452,14 @@ private Outcome processManifest(JobState job, Manifest manifest, BlobPuller pull
private void pullFile(JobState job, Manifest manifest, Manifest.Entry entry,
BlobPuller puller, Path targetBase) throws IOException, InterruptedException {
- log.info("job {} pullFile: relPath={} size={} chunks={}",
- job.id(), entry.relPath(), entry.size(),
- entry.chunks() != null ? entry.chunks().size() : 0);
+ // Per-file trace at DEBUG so a 1k-file transfer doesn't drown INFO logs in
+ // 1k+ "pullFile relPath=..." lines. Job-level lifecycle events (start, pause,
+ // complete) keep INFO level so an operator can scan the log at a glance.
+ if (log.isDebugEnabled()) {
+ log.debug("job {} pullFile: relPath={} size={} chunks={}",
+ job.id(), entry.relPath(), entry.size(),
+ entry.chunks() != null ? entry.chunks().size() : 0);
+ }
Path targetAbs = targetBase.resolve(entry.relPath()).normalize();
Path parent = targetAbs.getParent();
diff --git a/src/main/resources/logback.xml b/src/main/resources/logback.xml
index 8e8413a..e10c937 100644
--- a/src/main/resources/logback.xml
+++ b/src/main/resources/logback.xml
@@ -1,7 +1,15 @@
- %d{HH:mm:ss.SSS} %-5level [%thread] %logger{20} - %msg%n
+
+ %d{yyyy-MM-dd HH:mm:ss.SSS} %-5level [%thread] %logger{20} - %msg%n
diff --git a/tasks/contracts/data-formats.md b/tasks/contracts/data-formats.md
index 9cb0bc4..5ea0072 100644
--- a/tasks/contracts/data-formats.md
+++ b/tasks/contracts/data-formats.md
@@ -1,8 +1,8 @@
-# Data formats — JSON-схемы
+# Data formats — JSON schemas
-Все REST-payload-ы и persisted JSON-файлы используют эти структуры. Изменения требуют согласования с тимлидом.
+All REST payloads and persisted JSON files use these shapes. Changes require maintainer review.
-> **Note for v0.4.0+:** persisted state files (`/jobs/*.json` и
+> **Note for v0.4.0+:** persisted state files (`/jobs/*.json` and
> `.netcopy/meta.json`) carry a `schemaVersion` field. Readers MUST refuse
> any file with `schemaVersion > CURRENT_SCHEMA_VERSION` to avoid
> mis-interpreting a future format. Pre-v0.4.0 files have no field — Jackson
@@ -53,7 +53,7 @@ Response (`Manifest`):
}
```
-`type ∈ {"file", "dir", "symlink"}`. Только `file` имеет `size`/`mtime`/`chunks`. Только `symlink` имеет `target`.
+`type ∈ {"file", "dir", "symlink"}`. Only `file` carries `size` / `mtime` / `chunks`. Only `symlink` carries `target`.
---
@@ -253,8 +253,8 @@ Written exactly once on sidecar creation with `CREATE_NEW + force(true)`.
## Sidecar — `.netcopy/chunks.bitmap`
-Binary. `chunkCount` бит, padded до байта. Bit i = 1 если chunk[i] завершён и
-его XXH3-128 hash проверен.
+Binary. `chunkCount` bits, padded up to a whole byte. Bit `i` is `1` once
+chunk `i` has been written, fsynced, and its XXH3-128 hash verified.
Updated in place via positional `FileChannel.write(buf, 0)`. The bitmap is
idempotent — a torn write at most loses bits, causing those chunks to be
@@ -290,7 +290,7 @@ workers. After all chunks are verified and the full-file SHA-256 check passes,
---
-## WebSocket — клиент → сервер
+## WebSocket — client → server
```json
{ "type": "Subscribe", "transferId": "uuid" }
@@ -299,13 +299,13 @@ workers. After all chunks are verified and the full-file SHA-256 check passes,
{ "type": "UnsubscribeAll" }
```
-`Subscribe` без `transferId` — wildcard: клиент получает события всех transfer-ов
-на этом сервере. Cap (v0.4.0+): 256 различных подписок на сессию.
+`Subscribe` without `transferId` is a wildcard: the client receives events for
+every transfer on this server. Cap (v0.4.0+): 256 distinct subscriptions per session.
-## WebSocket — сервер → клиент (`ProgressEvent`)
+## WebSocket — server → client (`ProgressEvent`)
-Дискриминатор — `type`. Все события включают `transferId` (или `null` для
-"глобальных") и `timestamp`.
+Discriminator: `type`. Every event carries `transferId` (or `null` for global
+events) and `timestamp`.
```json
{ "type": "TransferRegistered", "transferId": "...", "timestamp": ...,
@@ -356,7 +356,7 @@ every 200 ms.
## TCP wire — Frame layout
-Каждый frame: `[4 байта BE length:u32][1 байт type:u8][payload bytes...]` — `length` — длина payload (без header-а).
+Each frame: `[4 bytes BE length:u32][1 byte type:u8][payload bytes...]` — `length` is the payload size (header bytes excluded).
| Type | Name | Payload layout |
|---|---|---|
@@ -370,18 +370,19 @@ every 200 ms.
| 0x08 | BYE | (no payload) |
| 0x09 | DATA_END_V2 | `reqId:u32` `xxh3:bytes(16)` |
-Все строки — UTF-8. Все длины — Big-Endian. UUID — 16 байт raw (8 байт MSB BE + 8 байт LSB BE).
+All strings are UTF-8. All length fields are big-endian. UUID is 16 raw bytes
+(8 bytes MSB BE + 8 bytes LSB BE).
Max payload size:
-- HELLO/HELLO_OK/REQUEST/DATA_HEAD/DATA_END/DATA_END_V2/ERR/BYE: 64 KB
-- DATA: до 1 MB (рекомендуется 256 KB чанки внутри одного REQUEST)
+- HELLO / HELLO_OK / REQUEST / DATA_HEAD / DATA_END / DATA_END_V2 / ERR / BYE: 64 KB
+- DATA: up to 1 MB (256 KB chunks within one REQUEST is the typical sizing)
-Если frame превышает лимит — closing connection с ERR_BAD_REQUEST.
+A frame exceeding its limit closes the connection with ERR_BAD_REQUEST.
-Codes для ERR:
+ERR codes:
- 1001 ERR_INCOMPATIBLE_VERSION
- 1002 ERR_UNAUTHORIZED
-- 1010 ERR_NOT_FOUND (manifest или file нет)
+- 1010 ERR_NOT_FOUND (manifest or file absent)
- 1020 ERR_BAD_REQUEST (out-of-range offset, etc.)
- 1500 ERR_INTERNAL