
Add AVX512_BF16 fast path for BF16 inner product #1

Merged
xtangxtang merged 2 commits into main from feature/avx512_bf16_fast_path
Mar 5, 2026
Conversation

@xtangxtang

Summary

This PR adds a fast path for BF16 inner product computation using AVX512_BF16 instructions.

  • Implements DCBF16IPDpbf16 distance computer that leverages VDPBF16PS instructions
  • Quantizes query to BF16 once in set_query(), then computes inner products against BF16-coded vectors
  • Only enabled when __AVX512BF16__ is available (e.g., -march=sapphirerapids)
  • Requires d % 32 == 0 (32 bf16 elements per dpbf16 operation)
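
For reference, quantizing a query to BF16 amounts to truncating each float32 to its top 16 bits. A minimal sketch with hypothetical helper names, assuming plain truncation (production quantizers often round to nearest even instead):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Truncate a float32 to bf16: keep the sign, exponent, and top 7 mantissa bits.
static inline uint16_t float_to_bf16(float x) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    return static_cast<uint16_t>(bits >> 16);
}

// Widen a bf16 back to float32 by zero-filling the low 16 mantissa bits.
static inline float bf16_to_float(uint16_t h) {
    uint32_t bits = static_cast<uint32_t>(h) << 16;
    float x;
    std::memcpy(&x, &bits, sizeof(x));
    return x;
}
```

Doing this once per query in set_query() amortizes the conversion across every coded vector scanned.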

Key features:

  • Performance optimization for ScalarQuantizer with QT_bf16 + IP
  • Targets CPUs with AVX512_BF16 support (Sapphire Rapids and newer)
  • Falls back to the standard DCTemplate<QuantizerBF16> when these conditions are not met
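
The kernel could look roughly like the following; this is a sketch under the PR's stated constraints, not its actual code. With AVX512_BF16, each VDPBF16PS (`_mm512_dpbf16_ps`) consumes 32 bf16 pairs and accumulates into 16 float32 lanes, which is why `d % 32 == 0` is required; without it, a scalar widening loop serves as the fallback:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__AVX512BF16__)
#include <immintrin.h>
#endif

// Hypothetical sketch of the fast path: inner product between a BF16 query
// and one BF16-coded database vector. Assumes d is a multiple of 32.
float bf16_inner_product(const uint16_t* q, const uint16_t* code, size_t d) {
#if defined(__AVX512BF16__)
    __m512 acc = _mm512_setzero_ps();
    for (size_t i = 0; i < d; i += 32) {
        // One VDPBF16PS: 32 bf16 products accumulated into 16 float32 lanes.
        __m512bh a = (__m512bh)_mm512_loadu_si512(q + i);
        __m512bh b = (__m512bh)_mm512_loadu_si512(code + i);
        acc = _mm512_dpbf16_ps(acc, a, b);
    }
    return _mm512_reduce_add_ps(acc);
#else
    // Scalar fallback: widen each bf16 to float32 and accumulate.
    float sum = 0.0f;
    for (size_t i = 0; i < d; ++i) {
        uint32_t qa = static_cast<uint32_t>(q[i]) << 16;
        uint32_t qb = static_cast<uint32_t>(code[i]) << 16;
        float fa, fb;
        std::memcpy(&fa, &qa, sizeof(fa));
        std::memcpy(&fb, &qb, sizeof(fb));
        sum += fa * fb;
    }
    return sum;
#endif
}
```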

Test plan

  • Existing tests should pass
  • Performance testing recommended on AVX512_BF16 capable hardware (Sapphire Rapids+)

Implement DCBF16IPDpbf16 distance computer that leverages AVX512_BF16
instructions (VDPBF16PS) for accelerated BF16 inner product computation.

Key features:
- Quantizes query to BF16 once in set_query()
- Computes inner products using VDPBF16PS against BF16-coded vectors
- Only enabled when __AVX512BF16__ is available (e.g., -march=sapphirerapids)
- Requires d % 32 == 0 to use dpbf16 cleanly (32 bf16 elements per op)

Performance optimization for ScalarQuantizer with QT_bf16 + IP on CPUs
with AVX512_BF16 support (Sapphire Rapids and newer).
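
The fallback described above (fast path only when AVX512_BF16 is compiled in and `d % 32 == 0`, otherwise the standard template) reduces to a simple selection. A sketch using a stub type in place of the real SQDistanceComputer hierarchy:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Stub standing in for the real distance computer classes named in the PR.
struct DC { std::string name; };

// Hypothetical selection logic mirroring the conditions in the commit message.
std::unique_ptr<DC> select_bf16_ip_dc(size_t d, bool has_avx512bf16) {
    if (has_avx512bf16 && d % 32 == 0) {
        return std::make_unique<DC>(DC{"DCBF16IPDpbf16"});
    }
    return std::make_unique<DC>(DC{"DCTemplate<QuantizerBF16>"});
}
```
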
@xtangxtang xtangxtang merged commit b705fab into main Mar 5, 2026
1 of 3 checks passed
