Read the full write-up and benchmark analysis: A Cache-Conscious Hash Join for 19× Faster Binary Descriptor Matching
Benchmarks 1:1 inner hash joins between two multisets A and B. Unlike standard hash joins that compute cartesian products on duplicate keys, this implementation excludes keys that have duplicates in either multiset. Only pairs with equal keys that have cardinality 1 in each multiset are emitted.
This problem appears e.g. in the hot loop of feature matching and pose estimation of computer vision pipelines, where the goal is to find unique correspondences between two sets of (binary) features/embeddings. While this problem can also be solved with ANN, the exclusive match criterion is interesting because it allows to both find a matching and exclude weak matches through the exclusion of duplicates in O(N).
In such feature matching contexts, the hash table is only retained to match one image's features to as little as a single other image's feature. As a result, table construction time is equally important as efficient lookups.
The benchmarks in this repo compare optimized implementations against unordered_map baselines as well as against more specialized hashmap implementations from Google's abseil(absl) and Martin Ankerl's ankrl.
Note: All of the optimized implementations only leverage cache effects and ILP. In particular, no explicit SIMD or multi-threading is used.
Unzip the two dataset files: gunzip {statesSrcLarge,statesTarLarge}.txt.gz.
# Build
mkdir build
cd build
cmake ..
make
# Test
ctest
# Benchmark
./run_and_plot_benchmarkTo add and compare new implementations:
- add implementation in new
.cppand.hpptosrc - include and register the new implementation in
registry.cpp/.hpp - add the new .cpp to the
CMakeLists.txtand build - Tests and benchmarks will automatically include the new implementation.