Add byte-packed IVF RaBitQ fast-scan layout for 4+ bit indexes #5084
Open
lyang24 wants to merge 1 commit into facebookresearch:main
Introduce a new Iwrp serialization format for IndexIVFRaBitQFastScan and a byte-packed ex-code path for nb_bits >= 4. This switches multibit refinement to dedicated byte-packed SIMD kernels while leaving the existing bit-packed layout in place for lower-bit configurations. The new path preserves recall, keeps nbits=2 behavior unchanged, and continues to read older Iwrn/Iwrf indexes. It also guards unsupported distance_to_code usage on byte-packed indexes so the optimization stays confined to the FastScan batch search path.
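The trade-off between the two layouts can be illustrated with a small, self-contained sketch. This is not the Faiss implementation: the helper names (`pack_bits`, `unpack_bits`, `pack_bytes`, `unpack_bytes`) and the LSB-first bit order are assumptions chosen only for illustration. It shows that a bit-packed buffer is smaller but needs shift-and-mask work per value, while a byte-packed buffer spends one byte per dimension and can be read directly.

```python
def pack_bits(codes, bits):
    """Bit-packed layout: store each code in `bits` bits, back to back, LSB first."""
    out = bytearray((len(codes) * bits + 7) // 8)
    for i, c in enumerate(codes):
        if not 0 <= c < (1 << bits):
            raise ValueError(f"code {c} does not fit in {bits} bits")
        for b in range(bits):
            if (c >> b) & 1:
                pos = i * bits + b
                out[pos >> 3] |= 1 << (pos & 7)
    return bytes(out)


def unpack_bits(buf, bits, n):
    """Decode n codes from a bit-packed buffer (shift and mask per bit)."""
    codes = []
    for i in range(n):
        c = 0
        for b in range(bits):
            pos = i * bits + b
            c |= ((buf[pos >> 3] >> (pos & 7)) & 1) << b
        codes.append(c)
    return codes


def pack_bytes(codes):
    """Byte-packed layout: one byte per dimension, no bit arithmetic."""
    return bytes(codes)


def unpack_bytes(buf):
    """Decode a byte-packed buffer: each byte is already one code."""
    return list(buf)


codes = [5, 0, 7, 3, 1, 6, 2, 4]      # eight illustrative 3-bit values
bit_packed = pack_bits(codes, 3)       # 8 * 3 bits = 3 bytes
byte_packed = pack_bytes(codes)        # 8 bytes, one per dimension
print(len(bit_packed), len(byte_packed))
```

The byte-packed form costs more memory per vector, but every dimension sits on a byte boundary, which is the property that makes it a natural fit for wide SIMD loads in a refinement kernel.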
Summary
This PR adds an optional byte-packed multibit layout for `IndexIVFRaBitQFastScan` when `nb_bits >= 4`.

The new layout stores `ex_code` as one byte per dimension and uses dedicated byte-packed multibit distance kernels in the FastScan refinement path. Lower-bit configurations keep the existing bit-packed layout, so `nbits=1/2/3` behavior is unchanged.

Format / compatibility
- `Iwrn` and `Iwrf` continue to read through the old bit-packed path
- `Iwrp` fourcc
- `Iwrp` roundtrip serialization was verified

Performance
On Cohere 1M, single-threaded search, `nbits=4`:

Recall is unchanged:
- `nprobe=32`: 0.8847
- `nprobe=64`: 0.9281
- `nprobe=128`: 0.9501

`nbits=2` is unchanged.

Notes
- `nbits >= 4`
- `distance_to_code()` is explicitly rejected for byte-packed IVF FastScan indexes, rather than silently using the wrong multibit decoder
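
The compatibility and guard behavior described above can be sketched as follows. This is a toy model, not Faiss code: `ToyIndex`, `LAYOUT_BY_TAG`, and the placeholder distance are hypothetical, and the sketch assumes that the four-character tag alone selects the layout and that tags are encoded as little-endian 32-bit integers. The point is only to show the pattern of failing loudly instead of decoding byte-packed data with a bit-packed decoder.

```python
import struct


def fourcc(tag: str) -> int:
    """Encode a 4-character ASCII tag as a little-endian 32-bit integer."""
    if len(tag) != 4:
        raise ValueError("fourcc tags must be exactly 4 characters")
    (value,) = struct.unpack("<I", tag.encode("ascii"))
    return value


# Assumption for this sketch: the tag alone determines the ex-code layout.
LAYOUT_BY_TAG = {
    fourcc("Iwrn"): "bit-packed",   # older format, read through the old path
    fourcc("Iwrf"): "bit-packed",   # older format, read through the old path
    fourcc("Iwrp"): "byte-packed",  # new format introduced by this PR
}


class ToyIndex:
    """Hypothetical stand-in for an index object; not the Faiss class."""

    def __init__(self, tag: str):
        key = fourcc(tag)
        if key not in LAYOUT_BY_TAG:
            raise ValueError(f"unknown fourcc {tag!r}")
        self.layout = LAYOUT_BY_TAG[key]

    def distance_to_code(self, code: bytes) -> float:
        # Guard: reject the call rather than run a decoder that
        # expects a different layout and would return wrong distances.
        if self.layout == "byte-packed":
            raise NotImplementedError(
                "distance_to_code is not supported for byte-packed indexes"
            )
        # Placeholder value only; a real index would decode the code here.
        return float(sum(code))


legacy = ToyIndex("Iwrn")
print(legacy.layout, legacy.distance_to_code(b"\x01\x02"))
```

Raising an explicit error keeps the optimization confined to the batch search path: a caller that reaches the per-code entry point on a byte-packed index gets a clear failure instead of plausible-looking but incorrect distances.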