Fix race condition, memory management, debug output, and hashtable lookup in sorting.cpp (#5078) by alibeklfc · Pull Request #5078 · facebookresearch/faiss

alibeklfc · 2026-04-10T17:57:25Z

Summary:

Four fixes in faiss/utils/sorting.cpp:

1. OpenMP directive fix in fvec_argsort_parallel

The initialization loop used #pragma omp parallel without the for directive, so every thread executed the entire loop independently instead of sharing the iterations. With nt threads, each permA[i] was written by all nt threads concurrently. That is a data race under the C++ memory model (multiple unsynchronized writes to the same non-atomic location), and it costs O(n * nt) total work instead of O(n). Fixed by changing the directive to #pragma omp parallel for.

In practice, all threads write the same value (permA[i] = i), so the output was correct despite the undefined behavior. The fix eliminates both the undefined behavior and the redundant work.
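
For illustration only, here is a minimal sketch of the corrected pattern. The function name init_identity_perm is hypothetical and this is not the Faiss source; it only shows a loop whose iterations are divided among threads, so that each element is written by exactly one thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Faiss): returns the identity
// permutation 0..n-1.
std::vector<size_t> init_identity_perm(size_t n) {
    std::vector<size_t> perm(n);
    // `parallel for` splits the iteration space across the team.
    // A bare `#pragma omp parallel` here would instead make every
    // thread run all n iterations and write every perm[i].
#pragma omp parallel for
    for (int64_t i = 0; i < static_cast<int64_t>(n); i++) {
        perm[i] = static_cast<size_t>(i);
    }
    return perm;
}
```

When compiled without OpenMP support the pragma is ignored and the loop runs serially with the same result.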

2. RAII memory management in fvec_argsort_parallel

Replaced new size_t[n] / delete[] perm2 with std::vector<size_t>. The old code had no realistic exception path between allocation and deallocation (all intermediate code is either C functions or non-throwing OpenMP regions), but the manual new/delete pattern is fragile against future edits that might introduce a throwing path. The std::vector provides RAII lifetime management with no behavioral change.
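
As a rough sketch of the pattern (not the actual Faiss code), the snippet below passes a vector-owned scratch buffer to a C-style routine through data(). The names argsort_with_scratch and argsort_raii are hypothetical, and the sort itself is a simple stand-in for the real parallel merge.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a C-style routine that needs a raw
// scratch buffer of n elements.
void argsort_with_scratch(
        size_t n,
        const float* vals,
        size_t* perm,
        size_t* scratch) {
    std::iota(scratch, scratch + n, size_t(0));
    std::stable_sort(scratch, scratch + n, [vals](size_t a, size_t b) {
        return vals[a] < vals[b];
    });
    std::copy(scratch, scratch + n, perm);
}

// The scratch buffer is owned by a std::vector, so it is released on
// every exit path, including an exception thrown by a future edit.
void argsort_raii(size_t n, const float* vals, size_t* perm) {
    std::vector<size_t> perm2(n); // replaces new size_t[n] / delete[]
    argsort_with_scratch(n, vals, perm, perm2.data());
}
```

One difference worth keeping in mind: std::vector<size_t>(n) zero-initializes its elements, while new size_t[n] leaves them uninitialized, so the vector version does one extra O(n) pass that does not affect the result.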

3. Removed debug printf in fvec_argsort_parallel

A leftover printf("merge %d %d, %d threads\n", ...) in the parallel merge loop wrote to stdout during normal operation. Removed.

4. Missing early termination in hashtable_int64_to_int64_lookup

The linear probing loop did not check for empty slots (tab[slot * 2] == -1). In an open-addressing hash table with no deletion support, an empty slot proves that the key was never inserted, because the insert function would have placed it in that slot or an earlier one. Without this check, lookups for absent keys probed every slot in the bucket before the wrap-around termination at slot == hk_i. The fix adds the standard empty-slot check, matching the structure of the insert function (hashtable_int64_to_int64_add). This is a performance optimization: the old code always returned the correct result (-1 after a full bucket scan), only more slowly.
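
The early-termination argument can be sketched with a toy table. Everything here is an assumption made for illustration: the ToyTable type, its hash function, and the single-bucket layout are hypothetical and simpler than the Faiss implementation. Only the (key, value) pair layout and the -1 empty marker follow the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy open-addressing table with linear probing and no deletion.
// tab[2 * slot] is the key, tab[2 * slot + 1] the value; key -1 marks
// an empty slot, so keys are assumed to be non-negative.
struct ToyTable {
    size_t capacity; // must be a power of two
    std::vector<int64_t> tab;

    explicit ToyTable(size_t cap) : capacity(cap), tab(2 * cap, -1) {}

    // Hypothetical hash, chosen only for this example.
    size_t home_slot(int64_t key) const {
        return (static_cast<uint64_t>(key) * 1000003ULL) & (capacity - 1);
    }

    // Returns false if the table is full.
    bool add(int64_t key, int64_t val) {
        size_t start = home_slot(key);
        size_t slot = start;
        for (;;) {
            if (tab[slot * 2] == -1 || tab[slot * 2] == key) {
                tab[slot * 2] = key;
                tab[slot * 2 + 1] = val;
                return true;
            }
            slot = (slot + 1) & (capacity - 1);
            if (slot == start) {
                return false; // wrapped around: no free slot
            }
        }
    }

    // Returns the stored value, or -1 if the key is absent.
    int64_t lookup(int64_t key) const {
        size_t start = home_slot(key);
        size_t slot = start;
        for (;;) {
            if (tab[slot * 2] == -1) {
                return -1; // empty slot: add() would have used it
            }
            if (tab[slot * 2] == key) {
                return tab[slot * 2 + 1];
            }
            slot = (slot + 1) & (capacity - 1);
            if (slot == start) {
                return -1; // full scan without a match
            }
        }
    }
};
```

Dropping the empty-slot branch from lookup would still return -1 for absent keys, but only after the probe wraps all the way around to its starting slot.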

Differential Revision: D100317917

meta-codesync · 2026-04-10T17:57:37Z

@alibeklfc has exported this pull request. If you are a Meta employee, you can view the originating Diff in D100317917.


meta-codesync · 2026-04-11T01:02:21Z

This pull request has been merged in aa3ce37.

meta-cla bot added the CLA Signed label Apr 10, 2026

meta-codesync bot added fb-exported meta-exported labels Apr 10, 2026

meta-codesync bot changed the title ~~Fix race condition, memory management, debug output, and hashtable lookup in sorting.cpp~~ Fix race condition, memory management, debug output, and hashtable lookup in sorting.cpp (#5078) Apr 10, 2026

alibeklfc force-pushed the export-D100317917 branch from afca08b to 12bae86 Compare April 10, 2026 21:51

meta-codesync bot closed this in aa3ce37 Apr 11, 2026

facebook-github-tools bot added the Merged label Apr 11, 2026

alibeklfc wants to merge 1 commit into facebookresearch:main from alibeklfc:export-D100317917
