From ab39a6d519d6373208628846e1649a5888e4b16f Mon Sep 17 00:00:00 2001
From: Isaac Cheng <47993930+IsaacCheng9@users.noreply.github.com>
Date: Mon, 11 May 2026 00:13:46 +0100
Subject: [PATCH 1/3] fix: Show `Scan` snapshot in README architecture diagram
---
README.md | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index 9e08965..55d77fb 100644
--- a/README.md
+++ b/README.md
@@ -46,10 +46,13 @@ flowchart TD
grpc -->|engine API| engine[Engine]
engine -->|writes| wal[Write-Ahead Log]
engine -->|writes / reads| memtable[Memtable<br/>sorted in-memory]
+ engine -->|scan creates| snapshot[Scan Snapshot<br/>memtable copy + SSTable readers]
memtable -->|flush when full| l0[L0 SSTables<br/>overlapping key ranges]
l0 -->|background compaction| l1[L1 SSTables<br/>merged, deduplicated]
- engine -.reads.-> l0
- engine -.reads.-> l1
+ engine -.point reads.-> l0
+ engine -.point reads.-> l1
+ snapshot -.range reads.-> l0
+ snapshot -.range reads.-> l1
wal -.replay on startup.-> memtable
```
From 3f0e45ff00e376b8497272f55acd66c9967b1f84 Mon Sep 17 00:00:00 2001
From: Isaac Cheng <47993930+IsaacCheng9@users.noreply.github.com>
Date: Mon, 11 May 2026 00:18:28 +0100
Subject: [PATCH 2/3] fix: Add performance table derived from benchmark results
---
README.md | 33 +++++++++++++++++++++++----------
1 file changed, 23 insertions(+), 10 deletions(-)
diff --git a/README.md b/README.md
index 55d77fb..d97ac2a 100644
--- a/README.md
+++ b/README.md
@@ -2,11 +2,11 @@
[](https://github.com/IsaacCheng9/kv-engine/actions/workflows/test.yml)
-A C++23 LSM-tree key-value store with crash recovery and a gRPC API
-supporting point operations and server-streaming range scans.
+A C++23 LSM-tree key-value store with crash recovery and a gRPC API supporting
+point operations and server-streaming range scans.
-Modelled after LevelDB and RocksDB, with the LSM-tree design from O'Neil
-et al. (1996).
+Modelled after LevelDB and RocksDB, with the LSM-tree design from O'Neil et al.
+(1996).
## Key Features
@@ -23,13 +23,12 @@ et al. (1996).
- **SSTable reader cache** – parsed readers stay resident for each file's
lifetime and serve concurrent `get()` callers via positioned reads,
eliminating per-lookup open and index-parse cost
-- **Per-SSTable Bloom filter** – probabilistic membership test consulted
- before the binary search on `get()`, short-circuiting lookups for keys
- guaranteed not to be in the file (no false negatives, ~1% false positive
- rate)
+- **Per-SSTable Bloom filter** – probabilistic membership test consulted before
+ the binary search on `get()`, short-circuiting lookups for keys guaranteed not
+ to be in the file (no false negatives, ~1% false positive rate)
- **Key range pruning** – cached min/max keys let `get()` skip SSTables whose
- key range cannot contain the lookup key, avoiding the Bloom check and
- binary search entirely
+ key range cannot contain the lookup key, avoiding the Bloom check and binary
+ search entirely
- **gRPC API** – `Put` / `Get` / `Delete` as unary RPCs and `Scan` as
server-streaming, with snapshot semantics isolating in-flight scans from
concurrent writes, flushes, and compactions
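
The read-path features listed above compose in a fixed order for each SSTable: the key range check runs first, then the Bloom filter, then the binary search. A minimal Python sketch of that ordering follows. It is an illustration only, not the engine's C++ implementation: `BloomFilter`, `SSTable`, `might_contain`, the `searches` counter, and the filter sizing are all hypothetical names and values chosen for this example.

```python
import bisect
import hashlib
from typing import Dict, Iterator, Optional


class BloomFilter:
    """Tiny Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 7) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key: str) -> Iterator[int]:
        # Derive k bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


class SSTable:
    """Immutable sorted run with cached min/max keys and a Bloom filter."""

    def __init__(self, entries: Dict[str, str]) -> None:
        self.keys = sorted(entries)
        self.values = [entries[k] for k in self.keys]
        self.min_key, self.max_key = self.keys[0], self.keys[-1]
        self.bloom = BloomFilter()
        for key in self.keys:
            self.bloom.add(key)
        self.searches = 0  # counts binary searches actually performed

    def get(self, key: str) -> Optional[str]:
        # 1. Key range pruning: skip the Bloom check and the search entirely.
        if key < self.min_key or key > self.max_key:
            return None
        # 2. Bloom filter: a negative answer is definitive.
        if not self.bloom.might_contain(key):
            return None
        # 3. Binary search over the sorted keys.
        self.searches += 1
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None  # the Bloom filter gave a false positive


table = SSTable({"apple": "1", "cherry": "3", "mango": "5"})
```

With this table, `table.get("zebra")` returns `None` without touching the Bloom filter or the key array, because `"zebra"` sorts after the cached maximum key `"mango"`.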
@@ -38,6 +37,20 @@ et al. (1996).
- Raft consensus for distributed replication across multiple nodes
+## Performance
+
+Measured on an M1 Max in a Release build. Full numbers in
+[`docs/2026_05_05_grpc_with_scan_baseline.txt`](docs/2026_05_05_grpc_with_scan_baseline.txt).
+
+| Workload | Throughput | Latency (p50) | Notes |
+| ------------------- | -------------: | ------------: | ---------------------------------------------------------------- |
+| Memtable read | 2.6M ops/sec | 0.33 µs | Hot in-memory path |
+| SSTable read | 114k ops/sec | 8.54 µs | Cached reader + Bloom filter + range pruning |
+| Negative lookup | 73k ops/sec | 13.58 µs | All read-path optimisations short-circuit |
+| Write (`put`) | 16k ops/sec | 42 µs | `fsync`-bound on the WAL |
+| gRPC unary read | 7.3k ops/sec | ~130 µs | Loopback overhead vs direct in-process call |
+| gRPC streaming scan | ~117k rows/sec | ~8.5 µs/row | ~15x amortisation vs unary (HTTP/2 framing paid once per stream) |
+
## Architecture
```mermaid
From 9a30e9378e7f9456843c2c5d655a266efc7c056b Mon Sep 17 00:00:00 2001
From: Isaac Cheng <47993930+IsaacCheng9@users.noreply.github.com>
Date: Mon, 11 May 2026 00:18:47 +0100
Subject: [PATCH 3/3] fix: Explain why we used server-streaming for gRPC `Scan`
---
README.md | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)
diff --git a/README.md b/README.md
index d97ac2a..13c2543 100644
--- a/README.md
+++ b/README.md
@@ -161,17 +161,12 @@ the scan don't change what it yields. Tombstones are collapsed and shadowed
older versions of a key are discarded; the caller sees only the newest live
value per key in `[start_key, end_key)` order.
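
The scan semantics described above (newest version wins, tombstones collapsed, results in `[start_key, end_key)` order) can be sketched as a merge over sources ordered newest to oldest. This is a simplified Python illustration rather than the engine's C++ code: `TOMBSTONE`, `scan_snapshot`, and the representation of each source as a plain dict are assumptions made for the example.

```python
from typing import Dict, Iterator, List, Optional, Tuple

TOMBSTONE = None  # hypothetical marker: a deleted key maps to None


def scan_snapshot(
    sources: List[Dict[str, Optional[str]]],
    start_key: str,
    end_key: str,
) -> Iterator[Tuple[str, str]]:
    """Yield the newest live value per key in [start_key, end_key).

    `sources` is ordered newest first (e.g. memtable copy, then L0, then L1).
    """
    newest: Dict[str, Optional[str]] = {}
    for source in sources:
        for key, value in source.items():
            if start_key <= key < end_key:
                # setdefault keeps the first (newest) version seen,
                # so older versions of the same key are shadowed.
                newest.setdefault(key, value)
    for key in sorted(newest):
        value = newest[key]
        if value is not TOMBSTONE:  # drop deleted keys from the output
            yield key, value


memtable_copy = {"b": TOMBSTONE, "c": "c-new"}
l0 = {"a": "a0", "b": "b0"}
l1 = {"c": "c-old", "d": "d1", "z": "z1"}
rows = list(scan_snapshot([memtable_copy, l0, l1], "a", "e"))
```

Here `rows` contains `a`, `c` (with the newer value), and `d`: key `b` is hidden by its tombstone and `z` falls outside the range. The sketch materialises the whole range for brevity; a streaming implementation would more likely use a k-way merge over sorted iterators so that rows can be sent as they are produced.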
-### Performance
+### Why Server-Streaming for `Scan`
-On loopback (no real network RTT), gRPC adds ~130 µs round-trip vs direct
-in-process engine calls – HTTP/2 framing + protobuf serialise/deserialise +
-kernel TCP loopback. See the `grpc_*` rows in
-`docs/2026_05_05_grpc_with_scan_baseline.txt` for full numbers.
-
-Streaming RPCs amortise that overhead: `grpc_scan` measures ~8.5 µs per row vs
-~130 µs per unary call. Server-streaming pays the HTTP/2 framing cost once per
-stream rather than once per row, so the per-operation gRPC tax shrinks ~15x
-for range queries. This is the argument for using server-streaming `Scan` over
+Streaming RPCs amortise gRPC overhead: `grpc_scan` measures **~8.5 µs per row vs
+~130 µs per unary call.** Server-streaming pays the HTTP/2 framing cost once per
+stream rather than once per row, so the **per-operation gRPC tax shrinks ~15x
+for range queries.** This is the argument for using server-streaming `Scan` over
a cursor-based unary API for `Scan`-shaped workloads.
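
As a rough check on the ~15x figure, the two per-operation costs quoted above can be plugged into a short calculation. The Python sketch below assumes a constant per-row cost and one row per unary call, which is the comparison those figures imply; `fetch_time_ms` and the 10,000-row example are hypothetical and not taken from the benchmark file.

```python
UNARY_US_PER_CALL = 130.0  # ~130 µs per unary call (figure quoted above)
STREAM_US_PER_ROW = 8.5    # ~8.5 µs per streamed row (figure quoted above)


def fetch_time_ms(rows: int, us_per_row: float) -> float:
    """Total time in milliseconds to fetch `rows` rows at a constant per-row cost."""
    return rows * us_per_row / 1000.0


speedup = UNARY_US_PER_CALL / STREAM_US_PER_ROW
unary_ms = fetch_time_ms(10_000, UNARY_US_PER_CALL)
stream_ms = fetch_time_ms(10_000, STREAM_US_PER_ROW)
print(
    f"speedup ~{speedup:.1f}x; 10k rows: "
    f"{unary_ms:.0f} ms unary vs {stream_ms:.0f} ms streamed"
)
```

A cursor-based unary API that returns a page of rows per call would narrow this gap, so the ratio is best read as an upper bound for the one-row-per-call case.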
## Benchmarks