[Bug]: SQ8 quantization causes significant recall drop on OpenAI Performance1536D50K dataset #328

@JoeJRW

Description

When benchmarking zvec v0.2.1 with SQ8 quantization on the OpenAI Performance1536D50K dataset using VectorDBBench, I observed a significant recall drop to 0.7377, which is much lower than expected.

Command used:

vectordbbench zvec --path Performance1536D50K --db-label 16c64g-v0.1 \
    --case-type Performance1536D50K --num-concurrency 1 \
    --quantize-type int8 --m 15 --ef-search 180

After investigating, I traced the root cause to src/core/quantizer/record_quantizer.h. The metadata fields sum (extras[2]) and squared_sum (extras[3]) are currently computed from the pre-rounded float values to preserve precision. On datasets with asymmetric value ranges this hurts recall: in OpenAI embeddings, |x_min| ≈ 0.64 is much larger than x_max ≈ 0.21, so most values are compressed into the right half of the quantization interval [−127, 127]. For such data, computing these metadata fields from the rounded int8 values instead yields significantly better recall.
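To make the "right half" observation concrete, here is a small sketch of where a value lands when [x_min, x_max] is mapped linearly onto [−127, 127]. The min-max mapping and the helper name code_of are assumptions for illustration; they are not taken from record_quantizer.h.

```python
def code_of(value, x_min, x_max, lo=-127, hi=127):
    """Map a float to an integer code under an assumed min-max linear mapping."""
    step = (x_max - x_min) / (hi - lo)   # float width of one integer code
    code = round((value - x_min) / step) + lo
    return max(lo, min(hi, code))        # clamp to the representable range


# With the range quoted above, the float value 0.0 maps to a code well
# to the right of the interval's midpoint (code 0).
zero_code = code_of(0.0, x_min=-0.64, x_max=0.21)
print(zero_code)  # → 64
```

Since embedding components cluster around 0.0, under this assumed mapping most stored codes sit near 64 rather than near 0.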

By changing the computation to use rounded values — consistent with what is actually stored — recall improves from 0.7377 to 0.9588 on the same benchmark, with no regression observed on other datasets (Cohere 768D, BIOASQ 1024D).
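The effect described above can be reproduced in a simplified model. The sketch below assumes a per-vector min-max quantizer and an inner-product estimate that combines the int8 dot product with a stored per-vector sum. It models only the sum field, not squared_sum, and it uses synthetic Gaussian data with two fixed outlier components rather than real embeddings. All function names are hypothetical, and the actual formulas in record_quantizer.h may differ.

```python
import random


def quantize(x, lo=-127, hi=127):
    """Assumed min-max scalar quantization: x[i] ~ step * code[i] + offset."""
    x_min, x_max = min(x), max(x)
    step = (x_max - x_min) / (hi - lo)
    offset = x_min - lo * step                 # float value represented by code 0
    pre = [(v - offset) / step for v in x]     # pre-rounded (float) codes
    codes = [max(lo, min(hi, round(u))) for u in pre]
    return codes, pre, step, offset


def approx_ip(cx, cy, sum_x, sum_y, ax, bx, ay, by):
    """Estimate <x, y> from int codes plus stored sums, expanding
    (ax*cx + bx) . (ay*cy + by) term by term."""
    dot = sum(p * q for p, q in zip(cx, cy))
    return ax * ay * dot + ax * by * sum_x + ay * bx * sum_y + len(cx) * bx * by


def make_vector(rng, dim=1536):
    """Synthetic vector: small Gaussian components plus asymmetric outliers."""
    v = [rng.gauss(0.0, 0.025) for _ in range(dim)]
    v[0], v[1] = -0.64, 0.21                   # range quoted in this report
    return v


def compare(trials=20, seed=0):
    """Mean absolute inner-product error for both ways of computing `sum`."""
    rng = random.Random(seed)
    err_float = err_round = 0.0
    for _ in range(trials):
        x, y = make_vector(rng), make_vector(rng)
        cx, px, ax, bx = quantize(x)
        cy, py, ay, by = quantize(y)
        exact = sum(p * q for p, q in zip(x, y))
        # sum taken from pre-rounded float codes (current behaviour, as described)
        err_float += abs(approx_ip(cx, cy, sum(px), sum(py), ax, bx, ay, by) - exact)
        # sum taken from the rounded codes that are actually stored
        err_round += abs(approx_ip(cx, cy, sum(cx), sum(cy), ax, bx, ay, by) - exact)
    return err_float / trials, err_round / trials


if __name__ == "__main__":
    float_err, round_err = compare()
    print(f"sum from floats:  {float_err:.5f}")
    print(f"sum from rounded: {round_err:.5f}")
```

In this model, using rounded sums makes the estimate equal to the exact inner product of the two dequantized vectors, so the rounding error is weighted by the small original components. With float sums, the rounding error in the integer dot product is weighted by the large code offset and is not cancelled by the sum terms.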

A detailed mathematical analysis and experimental validation are included below. The document was collaboratively authored by me and Claude Code, and is written in Chinese.

sq8_recall_analysis.md

Steps to Reproduce

1. pip install zvec==v0.2.1
2. vectordbbench zvec --path Performance1536D50K --db-label 16c64g-v0.1 \
    --case-type Performance1536D50K --num-concurrency 1 \
    --quantize-type int8 --m 15 --ef-search 180

Logs / Stack Trace

Operating System

Ubuntu 22.04

Build & Runtime Environment

Python 3.11

Additional Context

  • I've checked git status — no uncommitted submodule changes
  • I built with CMAKE_BUILD_TYPE=Debug
  • This occurs with or without COVERAGE=ON
  • The issue involves Python ↔ C++ integration (pybind11)

Metadata

Labels

bug

Status

In progress