Parallelize PQRetrainer training-vector extraction across sources #12

Merged
eolivelli merged 1 commit into main from issue-587-pq-retrain-parallel-io
May 18, 2026

@eolivelli
Owner

Summary

  • PQRetrainer.extractVectorsSequential extracted every PQ training sample with a single-threaded, blocking getVectorInto() loop. Against remote-backed graph storage each read is a network round-trip with no OS read-ahead, so thousands of samples serialize into thousands of sequential round-trips.
  • Observed downstream as a 2+ hour stall during a 53-segment compaction in HerdDB (herddb#587, "[k3s-bench] PQRetrainer.extractVectorsSequential: 2+ hour stall due to sequential random I/O over remote file server").
  • Each source is now extracted on its own worker thread using its own OnDiskGraphIndex.View (one RandomAccessReader per View — never shared), so up to jvector.pq.retrain.io.threads (default 16) remote reads are in flight at once. Reads within a source stay ascending, preserving read-ahead friendliness.
  • Also closes the Views that the old code leaked, and emits periodic progress logs so a slow extraction can be distinguished from a hang.
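
The per-source scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `VectorSource`, `SourceReader`, `extractAll`, and `inMemory` are hypothetical stand-ins (they are not jvector's `OnDiskGraphIndex.View` or `RandomAccessReader` APIs), and the sample ordinals are assumed to be pre-sorted in ascending order per source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for one compaction source; NOT the jvector API. */
interface VectorSource {
    /** Opens a reader private to the caller; the caller must close it. */
    SourceReader openReader();
}

/** Hypothetical per-thread reader; one instance is never shared across threads. */
interface SourceReader extends AutoCloseable {
    void readInto(int ordinal, float[] dest);
    @Override void close();
}

final class ParallelExtractionSketch {
    /** Reads the sampled ordinals of every source, one worker task per source. */
    static List<float[][]> extractAll(List<VectorSource> sources, List<int[]> sortedOrdinals,
                                      int dimension, int ioThreads) {
        int threads = Math.max(1, Math.min(ioThreads, sources.size()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<float[][]>> futures = new ArrayList<>();
            for (int s = 0; s < sources.size(); s++) {
                final VectorSource source = sources.get(s);
                final int[] ordinals = sortedOrdinals.get(s);
                futures.add(pool.submit(() -> {
                    // Each task opens and closes its own reader, so nothing leaks.
                    try (SourceReader reader = source.openReader()) {
                        float[][] out = new float[ordinals.length][dimension];
                        // Ascending ordinals keep reads within a source sequential-ish.
                        for (int i = 0; i < ordinals.length; i++) {
                            reader.readInto(ordinals[i], out[i]);
                        }
                        return out;
                    }
                }));
            }
            List<float[][]> result = new ArrayList<>();
            for (Future<float[][]> f : futures) {
                try {
                    result.add(f.get()); // results stay in source order
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted during extraction", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("extraction failed", e.getCause());
                }
            }
            return result;
        } finally {
            pool.shutdownNow(); // no-op after success; cancels stragglers on failure
        }
    }

    /** Demo-only source: the vector at ordinal o starts with {sourceId, o}. */
    static VectorSource inMemory(int sourceId) {
        return () -> new SourceReader() {
            @Override public void readInto(int ordinal, float[] dest) {
                dest[0] = sourceId;
                dest[1] = ordinal;
            }
            @Override public void close() { }
        };
    }

    public static void main(String[] args) {
        List<float[][]> out = extractAll(
                List.of(inMemory(0), inMemory(1)),
                List.of(new int[] {1, 5, 9}, new int[] {0, 2}), 2, 16);
        System.out.println("sources extracted: " + out.size());
    }
}
```

Sizing the pool to the smaller of the thread setting and the source count reflects the description above: parallelism is across sources, so extra threads beyond the number of sources would sit idle in this sketch.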

Tests

  • New TestOnDiskGraphIndexCompactor#testCompactManySourcesParallelRetrain compacts 8 FusedPQ sources (exercising the parallel path) and asserts every source's inline vectors survive compaction exactly at their remapped ordinals.
  • Full TestOnDiskGraphIndexCompactor suite passes (8 tests).

🤖 Generated with Claude Code

PQRetrainer.extractVectorsSequential read every training sample with a
single-threaded blocking getVectorInto() loop. Against remote storage
each read is a network round-trip with no OS read-ahead, so thousands
of samples serialize into thousands of round-trips — observed as a 2+
hour stall during a 53-segment compaction (HerdDB issue datastax#587).

Extract each source on its own thread/View (one RandomAccessReader per
View, never shared) so up to jvector.pq.retrain.io.threads (default 16)
remote reads are in flight at once; within a source reads stay ascending
for read-ahead friendliness. Also close the previously-leaked Views and
emit periodic progress logs so a slow extraction is distinguishable from
a hang.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@eolivelli merged commit b9fbe52 into main on May 18, 2026
4 of 10 checks passed
eolivelli added a commit to eolivelli/herddb that referenced this pull request May 18, 2026
Fixes #587.

A vector-index compaction cycle that selected **53 input segments**
stalled the Indexing Service for 2+ hours. The downstream PQ-retraining
step (jvector `PQRetrainer.extractVectorsSequential`) samples training
vectors per input segment with random remote-storage reads, so its I/O
cost scales with the number of input segments.
`VectorIndexCompactor.chooseSegmentsToMerge` bounded the picked set only
by a byte cap, which never bites when many segments are individually
small — leaving the input count effectively unbounded.

This PR is the **HerdDB-side mitigation**. The complementary
jvector-side fix (parallelizing the per-source extraction so the
per-read latency is hidden) is in **eolivelli/jvector#12**. The two
changes are independent — HerdDB CI builds jvector from
`eolivelli/jvector` `main` and is unaffected by the jvector PR's merge
state — but full latency relief needs both.

## Changes
- `VectorIndexCompactor` — new 7-arg `chooseSegmentsToMerge` overload
with a `maxInputs` parameter (6-arg overload delegates with the cap
disabled). After the fire/no-fire trigger decision, the normal
byte-capped selection is truncated smallest-first to at most `maxInputs`
segments, with an INFO log when truncation happens. The micro-segment
fast path (#570) is **deliberately exempt** — those cycles must stay
fast slot-reclaiming merges and the PQ-retraining-I/O concern does not
apply to them. Added `clampMaxInputs` (`<=0` disables, `1`→`2`) and
`computeTieredMaxInputs`.
- `PersistentVectorStore` — `DEFAULT_VECTOR_INDEX_COMPACTION_MAX_INPUTS
= 16`, a `vectorIndexCompactionMaxInputs` field,
`setCompactionMaxInputs`/`getCompactionMaxInputs`. The base cap is
**tier-scaled** (2×/4×/8× at 100/300/500 segments) per cycle alongside
the byte/count caps, so the per-cycle drain rate rises with the backlog
and the cap cannot starve the tailer toward the back-pressure threshold.
The cycle still fires on the same triggers and merges leftover segments
in subsequent cycles.
- `IndexingServerConfiguration` / `IndexingServiceEngine` — new
`vector.index.compaction.maxInputs` config key (default 16), wired into
the store and the startup config log.
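
As a rough illustration of the cap arithmetic described in the list above, the sketch below mirrors the stated rules: `<=0` disables, `1` becomes `2`, the cap scales 2×/4×/8× at 100/300/500 segments, and truncation keeps the smallest segments. The method names come from this description, but the signatures, the use of `0` to represent "disabled", the inclusive thresholds, the saturating overflow handling, and modelling segments as plain byte sizes are all assumptions made for the example; the actual HerdDB code may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CompactionInputCapSketch {
    /** Assumed convention: 0 means "cap disabled". A configured 1 is raised to 2. */
    static int clampMaxInputs(int configured) {
        if (configured <= 0) {
            return 0;
        }
        return Math.max(2, configured);
    }

    /** Scales the base cap by backlog tier; thresholds assumed inclusive. */
    static int computeTieredMaxInputs(int baseCap, int segmentCount) {
        if (baseCap <= 0) {
            return 0; // a disabled cap stays disabled
        }
        int multiplier = segmentCount >= 500 ? 8
                : segmentCount >= 300 ? 4
                : segmentCount >= 100 ? 2
                : 1;
        long scaled = (long) baseCap * multiplier; // widen to avoid int overflow
        return (int) Math.min(Integer.MAX_VALUE, scaled);
    }

    /** Keeps at most maxInputs segments, smallest first; sizes stand in for segments. */
    static List<Long> truncateSmallestFirst(List<Long> pickedSizes, int maxInputs) {
        if (maxInputs <= 0 || pickedSizes.size() <= maxInputs) {
            return pickedSizes; // cap disabled or already within the cap
        }
        List<Long> sorted = new ArrayList<>(pickedSizes);
        sorted.sort(Comparator.naturalOrder());
        return new ArrayList<>(sorted.subList(0, maxInputs));
    }

    public static void main(String[] args) {
        int cap = computeTieredMaxInputs(clampMaxInputs(16), 53);
        List<Long> picked = new ArrayList<>();
        for (long size = 53; size >= 1; size--) {
            picked.add(size);
        }
        System.out.println("inputs after cap: " + truncateSmallestFirst(picked, cap).size());
    }
}
```

In this sketch the truncation runs only on an already chosen set, which matches the statement that the cap never changes the fire/no-fire decision; the micro-segment fast path would simply bypass `truncateSmallestFirst`.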

## Tests
- `VectorIndexCompactorChooseTest` — new cases: a 53-segment pick is
truncated to the 16 smallest in order; `maxInputs=0` disables the cap;
the cap never changes the fire/no-fire trigger decision; a picked set
within the cap is returned untruncated; the micro-segment fast-path
result is **not** capped; `clampMaxInputs` normalisation.
- `Issue587CompactionInputCapTest` — new end-to-end test: builds a
50-segment backlog with the cap enabled at its default, drives multiple
compaction cycles, and asserts every cycle merges at most the cap and
the segment count strictly converges (no starvation).
- `Issue354TieredCompactionTest` — new `computeTieredMaxInputs` unit
tests (scaling, overflow, disabled-cap); the two end-to-end tiered tests
disable the orthogonal input cap so their "drain the whole backlog in
one cycle" premise still holds.
- Pre-PR validation green: `spotless:check apache-rat:check install
-DskipTests spotbugs:check -Pci` (the exact CI gate).
- Hammer suite green (twice):
`DirectMultipleConcurrentUpdatesSuite{NoIndexes,WithNonUniqueIndexes,WithUniqueIndexes}Test`,
`BLinkConcurrentSearchInsertTest`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>