Skip to content

Comments

Fix non-deterministic lidar odometry#368

Merged
JanuszBedkowski merged 1 commit intoMapsHD:mainfrom
bloom256:fix/deterministic-lidar-odometry
Feb 20, 2026
Merged

Fix non-deterministic lidar odometry#368
JanuszBedkowski merged 1 commit intoMapsHD:mainfrom
bloom256:fix/deterministic-lidar-odometry

Conversation

@bloom256
Copy link
Collaborator

Problem

Lidar odometry produces different results across runs, in both single-threaded and
multithreaded modes.

Fix: Zero-initialize NDT bucket structs

Root cause

Eigen's default constructors intentionally leave matrix/vector members uninitialized
for performance. When NDT bucket structs (Bucket, BucketCoef, Bucket2) are
created, fields like normal_vector, cov, mean contain whatever was in memory.

In update_rgd(), when a new bucket is created, mean, cov, cov_inverse, and
number_of_points are explicitly set -- but normal_vector is not. It only gets
computed when number_of_points reaches 3 (via PCA eigenvectors + viewport-based
orientation).

Meanwhile, compute_hessian() has no point count guard -- it uses any bucket found
in the map, including those with only 1-2 points. The viewpoint check in
add_indoor_hessian_contribution() and add_outdoor_hessian_contribution() reads
normal_vector unconditionally:

if (ablation_study_use_view_point_and_normal_vectors)
{
    if (bucket.normal_vector.dot(viewport - bucket.mean) < 0)
        return;  // skip this bucket's contribution
}

With garbage in normal_vector, the dot product produces random values, causing
valid buckets to be randomly skipped or included across runs. This is both a
correctness bug (valid observations randomly rejected) and a source of
non-determinism.

Fix

Added default member initializers to all NDT bucket structs in core/include/ndt.h:

  • Bucket: cov, cov_inverse, mean, normal_vector = Zero();
    number_of_points, index_begin, index_end = 0
  • BucketCoef: mean, cov, normal_vector = Zero()
  • Bucket2: index_begin_inclusive, index_end_exclusive = 0

With normal_vector initialized to zero, Zero().dot(anything) = 0, which is not
< 0, so the viewpoint check consistently passes for buckets with < 3 points.
This matches the intended behavior -- the viewpoint check should only reject buckets
that have a valid surface normal pointing away from the sensor.

Added comments in add_indoor_hessian_contribution() and
add_outdoor_hessian_contribution() documenting this behavior.

Files changed

  • core/include/ndt.h
  • apps/lidar_odometry_step_1/lidar_odometry_utils_optimizers.cpp (comments only)

Note on multithreaded Hessian accumulation

The multithreaded Hessian accumulation uses tbb::combinable + combine_each,
which sums floating-point matrices in an order that varies by thread scheduling.
Since floating-point addition is not associative, this could in theory produce
slightly different results across runs. However, testing showed no noticeable effect
on determinism after the bucket initialization fix above -- the rounding-level
differences from summation order do not propagate into observable output differences.
No changes were made to the accumulation code.

Testing

Tested on a 5.5 km dataset in both single-threaded and multithreaded modes.
The resulting trajectories are absolutely identical across runs.

Origin

It is difficult to pinpoint when this bug was introduced from the git log, as the
normal_vector field has existed uninitialized in Bucket since 2023, and the
viewpoint dot-product check was added, commented out, and uncommented across several
commits. My best guess is 19ebcb0 ("hybhrid approach for LO", Mar 13, 2025) which
uncommented the nv.dot(viewport - this_bucket.mean) < 0 check in the hessian path
without adding a point count guard.

@bloom256
Copy link
Collaborator Author

bloom256 commented Feb 20, 2026

It seems that the Clang-Format 21 check is not working correctly. But I'm sure that the code is formatted correctly. I ran a formatting script before committing the code.

@JanuszBedkowski JanuszBedkowski merged commit a160d6f into MapsHD:main Feb 20, 2026
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants