
block: zstd compression (codec auto-detect) + deduplication-model docs #73

Merged
jaredLunde merged 4 commits into main from jared/zstd on Jun 6, 2026

Conversation

@jaredLunde
Contributor

Summary

Switch newly written packs from LZ4 to zstd, keeping every existing LZ4 pack readable, and document the deduplication/addressing model the codebase kept forcing people to re-derive.

Why

Measured per 128 KiB block on real blessed images, zstd output is ~27% smaller than LZ4 at level 3 and ~37% smaller at level 19. That is a direct cut to S3 storage and egress, orthogonal to dedup. Reads get faster on net: smaller packs transfer less, and zstd decode (~60µs/block) is ~0.1% of an S3 GET (measured). Bless is offline and write-once/read-many, and zstd decode speed is roughly level-independent, so the most-read data uses the max level at no read-time cost.

How (mixed-codec safe, no format change)

Codec is detected on read by sniffing the zstd frame magic. A legacy LZ4 block's size-prefix can never collide (high byte 0x00 vs zstd's 0xFD), and per-block self-describing frames survive compaction's byte-reuse (a pack may legitimately hold both codecs). So old LZ4 packs stay readable forever; content_pack_id is unchanged.
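
As an illustration of the sniffing rule above, here is a minimal sketch, not the code in `block_map`. The names `Codec`, `ZSTD_MAGIC`, and `detect_codec` are hypothetical, and the legacy LZ4 layout is assumed to be a 4-byte little-endian size prefix, which is what makes "high byte 0x00" hold for any block under the 2 MiB guard.

```rust
/// Codec of one stored block, decided from its leading bytes.
/// (Hypothetical type; the PR does not show the real one.)
#[derive(Debug, PartialEq, Eq)]
enum Codec {
    Zstd,
    Lz4SizePrefixed,
}

/// The zstd frame magic number 0xFD2FB528 as it appears on disk
/// (little-endian), so the fourth byte is 0xFD.
const ZSTD_MAGIC: [u8; 4] = [0x28, 0xB5, 0x2F, 0xFD];

/// Classify a block by sniffing the zstd frame magic.
/// Assumption: anything that does not start with the magic is a legacy
/// LZ4 block whose first four bytes are a little-endian size prefix.
fn detect_codec(block: &[u8]) -> Codec {
    if block.starts_with(&ZSTD_MAGIC) {
        Codec::Zstd
    } else {
        Codec::Lz4SizePrefixed
    }
}
```

Because the decision is made per block, a pack that compaction assembled from both old and new bytes needs no pack-level codec flag under this scheme.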

  • block_map: compress_block(data, level) / decompress_block (auto-detect, 2 MiB guard) / zstd_compress / CompressError; COMPRESSION_LZ4 sentinel, RUNTIME_DEFAULT (zstd-1), BLESS (zstd-19).
  • Default codec is zstd-1 (GLIDEFS_COMPRESSION_LEVEL overrides; 0 pins LZ4). Carried on CacheInner (atomic, set once before flush) rather than WriteCacheConfig — avoids churning 83 struct literals. bless overrides to zstd-19.
  • Read paths (read.rs ×3, compact.rs) and both compress paths (flush.rs, ext4_store::store_ext4_stream) routed through the new helpers.
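
The level selection described in the bullets above can be sketched roughly as follows. This is an assumption-laden illustration: the constant names come from the PR, but their types and values other than "0 pins LZ4", zstd-1, and zstd-19 are guesses, and `Choice`, `resolve_level`, and `choose_codec` are hypothetical helpers. Falling back to the default on an unparsable value is also an assumption, not documented behavior.

```rust
/// Sentinel level meaning "write LZ4" (value 0 follows from "0 pins LZ4").
const COMPRESSION_LZ4: i32 = 0;
/// Runtime default: zstd level 1.
const RUNTIME_DEFAULT: i32 = 1;
/// Level used by bless: zstd level 19.
const BLESS: i32 = 19;

/// Hypothetical result type: which codec a given level selects.
#[derive(Debug, PartialEq, Eq)]
enum Choice {
    Lz4,
    Zstd(i32),
}

/// Resolve the level from the raw GLIDEFS_COMPRESSION_LEVEL value, if any.
/// Assumption: a missing or unparsable value falls back to the default.
fn resolve_level(env_value: Option<&str>) -> i32 {
    env_value
        .and_then(|v| v.trim().parse::<i32>().ok())
        .unwrap_or(RUNTIME_DEFAULT)
}

/// Map a level to a codec choice: the sentinel pins LZ4, anything else is zstd.
fn choose_codec(level: i32) -> Choice {
    if level == COMPRESSION_LZ4 {
        Choice::Lz4
    } else {
        Choice::Zstd(level)
    }
}
```

A caller would pass `std::env::var("GLIDEFS_COMPRESSION_LEVEL").ok().as_deref()` to `resolve_level` once at startup, while bless would skip the environment lookup and call `choose_codec(BLESS)` directly.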

Docs

ARCHITECTURE.md gains a Deduplication Model section (three tiers / three granularities; the deliberate position-addressed-S3 vs content-addressed-cache asymmetry and its consequences) + corrected flex_bg on-disk layout + WriterOption table. README.md gains Core Properties including the addressing split as a first-class property.

Testing

  • 423 lib + 220 integration + 10 ublk zero-copy tests pass on zstd-1 by default (flipping the default means these suites exercise zstd end-to-end: snapshots, compaction, fork, crash-recovery, data-safety).
  • New: zstd roundtrip, codec auto-detect, legacy-LZ4 read, old-LZ4-rejects-zstd-frame (documents the rollback floor), mixed-codec pack, end-to-end zstd flush→S3→cold-read (levels 1/3/19, asserts the stored frame is genuinely zstd), mixed-codec-across-flushes cold read.
  • Fuzz target renamed to fuzz_decompress_block (exercises both branches).
  • compress_probe bin: per-block ratio + decode-speed measurement.

Rollout (single-shot — forward-only)

Once a zstd pack is written, a binary built before this change cannot read it. It hard-fails cleanly (the 2 MiB guard trips on zstd's 0xFD high byte) and never corrupts data silently. Deploy is forward-only; GLIDEFS_COMPRESSION_LEVEL=0 can pin LZ4 if a staged rollout is ever wanted.
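
To make the hard-fail concrete, here is a sketch of how an old reader would likely see a zstd frame. It assumes the legacy reader interprets the first four bytes as a little-endian `u32` size and rejects anything above 2 MiB; `MAX_BLOCK_BYTES` and `legacy_declared_size` are hypothetical names, not the pre-change code.

```rust
/// Upper bound on a declared block size, matching the 2 MiB guard.
const MAX_BLOCK_BYTES: u32 = 2 * 1024 * 1024;

/// Hypothetical model of the pre-change reader: trust the 4-byte
/// little-endian size prefix, but reject anything above the guard.
fn legacy_declared_size(block: &[u8]) -> Result<u32, String> {
    if block.len() < 4 {
        return Err("block shorter than its size prefix".to_string());
    }
    let size = u32::from_le_bytes([block[0], block[1], block[2], block[3]]);
    if size > MAX_BLOCK_BYTES {
        return Err(format!("declared size {} exceeds the 2 MiB guard", size));
    }
    Ok(size)
}
```

Read as a size, the zstd magic bytes `28 B5 2F FD` decode to 0xFD2FB528, roughly 4.2 billion, which is far above 2 MiB, so under these assumptions the old reader errors out before attempting any LZ4 decode.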

🤖 Generated with Claude Code

jaredLunde and others added 4 commits June 6, 2026 10:54
Capture the model the codebase kept making people re-derive: dedup happens
in three tiers at three granularities — lineage CoW (shared packs along
ancestry), the content-addressed host clean cache (per-block, host-global),
and position-addressed S3 packs (whole-pack). Explains the deliberate
asymmetry (S3 is position-addressed for range-read/request economics; the
cache is content-addressed for density — they optimize opposite things) and
its consequences: cross-lineage overlap isn't deduped in S3 except via
--layered; alignment helps the cache, not intra-rootfs S3; more S3 dedup is
only pack/layer-granular.

- ARCHITECTURE.md: new "Deduplication Model" section; corrected on-disk
  layout (flex_bg, reserved backup-superblock holes); WriterOption table.
- README.md: "Core Properties" section incl. the position-vs-content
  addressing split as a first-class property.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Switch new packs from LZ4 to zstd — ~27% smaller at level 3, ~37% at 19
(measured per-128KiB-block on real images), cutting S3 storage and egress.
Reads get net faster: smaller packs transfer less and zstd decode (~60µs/
block) is ~0.1% of an S3 GET. Orthogonal to dedup.

Codec is detected on read by sniffing the zstd frame magic — no on-disk
format change, no pack-version bump. A legacy LZ4 block's size-prefix can
never collide (high byte 0x00 vs zstd's 0xFD), and per-block self-describing
frames survive compaction's byte-reuse (a pack may legitimately hold both
codecs). So existing LZ4 packs stay readable forever; content_pack_id is
unchanged.

- block_map: compress_block(data, level) / decompress_block (auto-detect,
  2 MiB guard) / zstd_compress / CompressError; COMPRESSION_LZ4 sentinel,
  RUNTIME_DEFAULT (zstd-1), BLESS (zstd-19).
- Default codec is zstd-1 (env GLIDEFS_COMPRESSION_LEVEL overrides; 0 = pin
  LZ4). Carried on CacheInner (atomic, set once before flush) rather than
  WriteCacheConfig to avoid churning 83 literals. bless overrides to zstd-19
  (offline, write-once/read-many; decode is ~level-independent). Read paths
  (read.rs x3, compact.rs) and both compress paths (flush.rs, ext4_store
  store_ext4_stream) routed through the new helpers.

Tests: zstd roundtrip, codec auto-detect, legacy-LZ4 read, old-LZ4-rejects-
zstd-frame (documents the single-shot rollback floor), mixed-codec pack,
end-to-end zstd flush->S3->cold-read (levels 1/3/19, asserts the stored
frame is actually zstd), mixed-codec-across-flushes cold read. Fuzz target
renamed to fuzz_decompress_block (both branches). 423 lib + 220 integration
+ 10 ublk zero-copy tests pass on zstd-1 by default.

compress_probe bin: per-block LZ4-vs-zstd ratio + decode-speed measurement.

NOTE (single-shot rollout): once a zstd pack is written, a pre-this-change
binary cannot read it (clean hard-fail, never corruption). Forward-only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Two CI failures from the codec swap:
- cli/bless.rs tests decompressed bless output with raw lz4_decompress, but
  bless now writes zstd-19 → use the codec-detecting decompress_block. These
  are test-utils-gated, so a plain `cargo test --lib` missed them; CI runs
  `--features test-utils`.
- The fuzz CI step still invoked the old target `fuzz_lz4_decompress`; point
  it at the renamed `fuzz_decompress_block`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The docs predated the codec swap and still described LZ4 as *the* compressor.
Update the write/read/sync paths, the Pack term, the on-disk pack format, the
integrity/verification chain, and the Compression section to: zstd by default
(runtime zstd-1, bless zstd-19; GLIDEFS_COMPRESSION_LEVEL=0 pins LZ4), with
per-block codec auto-detection on read so legacy LZ4 packs stay readable. The
remaining LZ4 mentions are intentional (legacy/auto-detect context).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jaredLunde jaredLunde merged commit e77676a into main Jun 6, 2026
24 checks passed
@jaredLunde jaredLunde deleted the jared/zstd branch June 6, 2026 18:37