hllplus: allocate sparse delta slice and buffer lazily #22

Merged: dim merged 1 commit into bsm:main from antoxas1986:lazy-sparse-alloc on Jun 16, 2026

@antoxas1986

Contributor

Problem

The sparse HLL++ state pre-allocates its delta slice and buffer map for the worst case when a sketch is constructed. In streaming/aggregation pipelines that key HLL sketches by a high-cardinality grouping key, many low-cardinality sketches are held live simultaneously, and this worst-case preallocation dominates the heap. Most of those bytes are never touched, because each sketch only ever sees a handful of distinct values.

Change

Two allocations in newSparseState switch from worst-case sizing to lazy/minimal sizing:

  • the delta slice starts at a small capacity instead of maxDataLen,
  • the buffer map starts empty instead of being pre-sized to maxBufferLen.

The delta slice is append-grown and the buffer is a map, so both grow correctly on demand. maxDataLen/maxBufferLen are retained as flush thresholds — only their use as allocation sizes is dropped. Serialization output and cardinality estimates are unchanged; only the initial per-sketch heap footprint drops.
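
To illustrate the difference, here is a minimal sketch of the two sizing strategies. It is not the library's code: the sparseState type, its field names, the constructor names, the threshold values, and the small starting capacity of 8 are all assumptions made for the example. Only the idea comes from the description above: the thresholds stay as flush limits while the allocations start small.

```go
package main

import "fmt"

// Assumed values for illustration only; the real thresholds
// depend on the sketch's precision settings.
const (
	maxDataLen   = 4096
	maxBufferLen = 1024
)

// sparseState is a hypothetical stand-in for the sparse representation.
type sparseState struct {
	data   []byte              // delta-encoded entries, grown with append
	buffer map[uint32]struct{} // pending entries awaiting a flush
}

// newEagerState sizes both containers for the worst case up front.
func newEagerState() *sparseState {
	return &sparseState{
		data:   make([]byte, 0, maxDataLen),
		buffer: make(map[uint32]struct{}, maxBufferLen),
	}
}

// newLazyState starts small and relies on append and map growth.
func newLazyState() *sparseState {
	return &sparseState{
		data:   make([]byte, 0, 8),
		buffer: make(map[uint32]struct{}),
	}
}

// add buffers a value and reports whether the buffer has reached the
// flush threshold. The threshold is independent of how the map was sized.
func (s *sparseState) add(v uint32) bool {
	s.buffer[v] = struct{}{}
	return len(s.buffer) >= maxBufferLen
}

func main() {
	eager, lazy := newEagerState(), newLazyState()
	fmt.Println("eager cap:", cap(eager.data), "lazy cap:", cap(lazy.data))

	// The lazily sized slice still accepts any number of appends.
	for i := 0; i < 100; i++ {
		lazy.data = append(lazy.data, byte(i))
	}
	fmt.Println("lazy len after 100 appends:", len(lazy.data))
}
```

Because the flush decision compares len against the threshold rather than cap, dropping the preallocation does not change when a flush happens.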

Benchmark

Low-cardinality sparse sketch (construct + add 10 distinct values), -benchmem -count=5:

                              B/op    allocs/op
before (worst-case sizing)   22298           10
after (lazy sizing)            520            9
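
For readers who want to reproduce the shape of this comparison without the library, the following is a rough, standalone sketch. It is not the benchmark added in this PR: the state type, the build and measure helpers, and the sizes are hypothetical, and the numbers it prints will differ from the table above. It uses testing.Benchmark from the standard library to collect bytes and allocations per operation.

```go
package main

import (
	"fmt"
	"testing"
)

// Assumed worst-case sizes, for illustration only.
const (
	maxDataLen   = 4096
	maxBufferLen = 1024
)

// state is a hypothetical stand-in for a sparse sketch.
type state struct {
	data   []byte
	buffer map[uint32]struct{}
}

// sink keeps results reachable so the allocations are not optimized away.
var sink *state

// build constructs a state with the given initial sizes and adds n values.
func build(dataCap, bufHint, n int) *state {
	s := &state{
		data:   make([]byte, 0, dataCap),
		buffer: make(map[uint32]struct{}, bufHint),
	}
	for i := 0; i < n; i++ {
		s.buffer[uint32(i)] = struct{}{}
	}
	return s
}

// measure benchmarks "construct + add 10 distinct values" for one sizing.
func measure(dataCap, bufHint int) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = build(dataCap, bufHint, 10)
		}
	})
}

func main() {
	eager := measure(maxDataLen, maxBufferLen)
	lazy := measure(8, 0)
	fmt.Printf("eager: %d B/op, %d allocs/op\n", eager.AllocedBytesPerOp(), eager.AllocsPerOp())
	fmt.Printf("lazy:  %d B/op, %d allocs/op\n", lazy.AllocedBytesPerOp(), lazy.AllocsPerOp())
}
```

The package-level sink matters here: without it, the compiler may keep small allocations on the stack and the comparison would understate the difference.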

Tests

  • Existing suite passes (go test ./...).
  • Added TestLazyAllocInvariant: builds independent sketches over identical input across cardinalities {1, 10, 100, 10k, 1M} at precisions 12/17 and 15/20, asserting byte-identical serialization, identical estimates, and a serialize→deserialize round-trip that preserves the estimate.
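
The idea behind such an invariant check can be sketched with a toy type. This is not TestLazyAllocInvariant itself and does not use the hllplus API: toySketch, its methods, the varint delta encoding, and the input generator are assumptions chosen to keep the example standalone. It builds two instances that differ only in their initial sizing, feeds them identical input, and compares serialized bytes, estimates, and a round-trip.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"sort"
)

// toySketch is a hypothetical stand-in that stores the exact set of values.
type toySketch struct {
	seen map[uint32]struct{}
}

// newToySketch takes only a size hint; the hint must not affect behavior.
func newToySketch(sizeHint int) *toySketch {
	return &toySketch{seen: make(map[uint32]struct{}, sizeHint)}
}

func (s *toySketch) Add(v uint32)  { s.seen[v] = struct{}{} }
func (s *toySketch) Estimate() int { return len(s.seen) }

// Marshal writes the sorted values as varint-encoded deltas.
func (s *toySketch) Marshal() []byte {
	vals := make([]uint32, 0, len(s.seen))
	for v := range s.seen {
		vals = append(vals, v)
	}
	sort.Slice(vals, func(i, j int) bool { return vals[i] < vals[j] })

	var out []byte
	var tmp [binary.MaxVarintLen64]byte
	var prev uint32
	for _, v := range vals {
		n := binary.PutUvarint(tmp[:], uint64(v-prev))
		out = append(out, tmp[:n]...)
		prev = v
	}
	return out
}

// unmarshal reverses Marshal by accumulating the deltas.
func unmarshal(data []byte) (*toySketch, error) {
	s := newToySketch(0)
	var prev uint32
	for len(data) > 0 {
		d, n := binary.Uvarint(data)
		if n <= 0 {
			return nil, errors.New("malformed varint")
		}
		prev += uint32(d)
		s.Add(prev)
		data = data[n:]
	}
	return s, nil
}

// checkInvariant compares a pre-sized and a lazily sized sketch over the same input.
func checkInvariant(n int) error {
	eager, lazy := newToySketch(1024), newToySketch(0)
	for i := 0; i < n; i++ {
		v := uint32(i) * 2654435761 // odd multiplier, so values stay distinct
		eager.Add(v)
		lazy.Add(v)
	}
	if !bytes.Equal(eager.Marshal(), lazy.Marshal()) {
		return fmt.Errorf("n=%d: serialized bytes differ", n)
	}
	if eager.Estimate() != lazy.Estimate() {
		return fmt.Errorf("n=%d: estimates differ", n)
	}
	back, err := unmarshal(lazy.Marshal())
	if err != nil {
		return err
	}
	if back.Estimate() != lazy.Estimate() {
		return fmt.Errorf("n=%d: round-trip changed the estimate", n)
	}
	return nil
}

func main() {
	for _, n := range []int{1, 10, 100, 10000} {
		if err := checkInvariant(n); err != nil {
			fmt.Println("FAIL:", err)
			return
		}
	}
	fmt.Println("invariant holds")
}
```

Comparing serialized bytes is a stricter check than comparing estimates alone, since two states can agree on an estimate while differing internally.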

The sparse state pre-sized its delta slice and buffer map for the worst
case at construction. For workloads holding many low-cardinality sketches
live at once, that preallocation dominates heap — most of those bytes are
never used. Allocate both minimally and grow on demand; maxDataLen and
maxBufferLen are retained as flush thresholds. Serialization output and
cardinality estimates are unchanged.

Adds an invariant test asserting byte-identical serialization and
identical estimates across a 1–1M cardinality range at two precisions,
plus a low-cardinality allocation benchmark.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dim dim merged commit 2189689 into bsm:main Jun 16, 2026
1 of 3 checks passed
@dim

dim commented Jun 16, 2026

Member

thanks!!
