
Implement Panorama into IndexIVFPQPanorama #4970

Open
AlSchlo wants to merge 45 commits into facebookresearch:main from AlSchlo:main

Conversation


@AlSchlo (Contributor) commented Mar 21, 2026

This PR implements Panorama on IVFPQ, achieving up to 18× speedups at high recall.

(Benchmark figure: bench_ivfpq_recall_qps_allM — recall vs. QPS across M values.)

We observe that the speedup is roughly 2× larger than the pruning ratio. This is due to two main factors: (1) vertical LUT lookups, which are faster because they avoid horizontal additions across SIMD lanes, and (2) improved LUT locality. Specifically, we keep a single level of the LUT resident in cache while processing an entire batch, analogous to loop tiling in matrix computations.
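To make the vertical-vs-horizontal distinction concrete, here is a minimal numpy sketch (sizes and data are illustrative, not from the PR): the vertical pass keeps a single LUT level hot in cache while it scans the entire batch, whereas the horizontal pass gathers across all M levels for each point.

```python
import numpy as np

rng = np.random.default_rng(0)
M, ksub, batch = 8, 256, 64   # subquantizers (levels), centroids per level, batch size

lut = rng.random((M, ksub)).astype(np.float32)                  # per-query lookup table
codes = rng.integers(0, ksub, size=(batch, M), dtype=np.uint8)  # PQ codes, one row per point

# Horizontal: per point, gather its M LUT entries, then reduce across levels.
horizontal = np.array(
    [lut[np.arange(M), codes[i]].sum() for i in range(batch)], dtype=np.float32
)

# Vertical (Panorama-style): per level m, look up the whole batch against lut[m],
# which stays resident in cache, and accumulate into the running distances.
vertical = np.zeros(batch, dtype=np.float32)
for m in range(M):
    vertical += lut[m][codes[:, m]]
```

Both orders compute the same distances; only the memory-access pattern differs, which is where the loop-tiling analogy comes from.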

Unlike flat indexes, we rely on vertical kernels to keep SIMD lanes fully utilized. This requires a vertical data layout within each batch. We provide kernels for both AVX-512 and AVX2. When BMI2 is available (the -mbmi2 compiler flag), we leverage the PEXT instruction to compress the current batch prior to filtering.
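As a rough scalar analogue of that compaction step (the real kernels do this with PEXT and SIMD registers; the threshold and sizes below are made up for illustration): surviving lanes are packed into a dense array so that later levels scan only live candidates.

```python
import numpy as np

rng = np.random.default_rng(1)
batch = 16
partial = rng.random(batch).astype(np.float32)  # partial distances after some levels
bound = np.float32(0.5)                         # hypothetical pruning threshold

keep = partial < bound                # survivor mask for this batch
compact_idx = np.flatnonzero(keep)    # densely packed surviving slots
                                      # (what PEXT does at the bit level)
compact_partial = partial[keep]       # later levels scan only this dense array
```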

Because PCA degrades PQ performance, we instead redistribute excess energy across dimensions using localized random projections. Additionally, we apply a random projection within each level to better equalize energy across dimensions.
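A small numpy sketch of the per-level random rotation (the skewed input and sizes are invented for illustration): applying a random orthogonal matrix within a level spreads the energy more evenly across its dimensions while preserving total energy and all distances.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 1000, 8                      # points, dimensions within one level

x = rng.standard_normal((n, d)).astype(np.float32)
x[:, 0] *= 10.0                     # energy concentrated in one dimension

# Random orthogonal projection: QR decomposition of a Gaussian matrix.
q, _ = np.linalg.qr(rng.standard_normal((d, d)))
y = x @ q.astype(np.float32)

# max/min variance ratio across dimensions, before and after the rotation
spread_before = x.var(axis=0).max() / x.var(axis=0).min()
spread_after = y.var(axis=0).max() / y.var(axis=0).min()
```

Because q is orthogonal, inner products and norms are unchanged, so the PQ codebooks train on better-conditioned sub-vectors without distorting the metric.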

The additional memory overhead consists of nlevels + 1 floats per point. This is acceptable at 4× compression, but becomes more noticeable at higher compression rates. Scalar quantization of these coefficients appears to be a reasonable approach to reduce this metadata footprint by approximately 4×.
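A sketch of the suggested mitigation (not part of this PR; column-wise min/max 8-bit scalar quantization is one plausible scheme among several): storing the nlevels + 1 float32 coefficients as uint8 codes plus a small per-column affine transform cuts the metadata footprint by roughly 4×.

```python
import numpy as np

rng = np.random.default_rng(3)
n, nlevels = 1000, 8

# Per-point metadata: nlevels + 1 float32 coefficients per point.
coeffs = rng.random((n, nlevels + 1)).astype(np.float32)

# 8-bit scalar quantization: per-column min and scale, plus uint8 codes.
lo = coeffs.min(axis=0)
scale = (coeffs.max(axis=0) - lo) / 255.0
q8 = np.round((coeffs - lo) / scale).astype(np.uint8)

# Dequantization for use at search time.
deq = lo + q8.astype(np.float32) * scale
```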

Cosine similarity (and other metrics) is deferred to a future PR.

Many thanks to @aknayar for the help on this PR.

cc: @alexanderguzhva @mdouze @mnorris11

@meta-cla meta-cla bot added the CLA Signed label Mar 21, 2026
@AlSchlo AlSchlo changed the title Implement Panorama into IVFPQPanorama Implement Panorama into IndexIVFPQPanorama Mar 21, 2026
@AlSchlo AlSchlo marked this pull request as draft March 21, 2026 08:40
@AlSchlo AlSchlo marked this pull request as ready for review March 22, 2026 00:46
@meta-codesync
Contributor

meta-codesync bot commented Mar 26, 2026

@zoeyeye has imported this pull request. If you are a Meta employee, you can view this in D98230581.

@mnorris11
Contributor

Hi @AlSchlo @aknayar, should we start reviewing this one?

@AlSchlo
Contributor Author

AlSchlo commented Mar 26, 2026

@mnorris11 Yes please, it should be ready :)


@mdouze (Contributor) left a comment


At a high level, could you explain the level-based data layout of the PQ Panorama storage?

There is already a block-oriented format used in FastScan indices that stores data per column; see `struct CodePacker` and `struct CodePackerPQ4 : CodePacker`.

The difference here seems to be that Panorama does not have a fixed block size.

In any case, it would be better to avoid integrating Panorama adaptations into the InvertedLists object. Please inherit from it (like BlockInvertedLists does).

@AlSchlo
Contributor Author

AlSchlo commented Mar 27, 2026

Thanks @mdouze

The storage layout is as follows:
Batch of 3 points with M = 3 (one row per level, one column per point):

  • P0M0 P1M0 P2M0
  • P0M1 P1M1 P2M1
  • P0M2 P1M2 P2M2
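The layout above is simply a per-batch transpose of the usual point-major code array. A minimal numpy sketch (with illustrative sizes, not the actual Faiss code):

```python
import numpy as np

batch, M = 3, 3
# Point-major PQ codes: row i holds point Pi's sub-codes (PiM0, PiM1, PiM2).
point_major = np.arange(batch * M, dtype=np.uint8).reshape(batch, M)

# Vertical (level-major) layout within the batch: row m holds level m's
# code for every point in the batch, i.e. P0Mm P1Mm P2Mm.
vertical = np.ascontiguousarray(point_major.T)
```

With this layout, scanning level m touches one contiguous row of the batch, which is what the vertical SIMD kernels rely on.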

Agreed, this does seem to be duplicated; we will figure out a way to reuse the existing class. It should not be too difficult.

@aknayar
Contributor

aknayar commented Apr 7, 2026

Thanks @mdouze,

We will begin the BlockInvertedLists refactor once #5041 goes through, as there is significant overlap with that PR.
