GMM-SLAM is a lidar / depth–driven SLAM stack that represents geometry with Gaussian mixture models (GMMs), runs a fixed-lag smoother for high-rate poses, and maintains a global submap pose graph optimized with GTSAM iSAM2. Loop closure and submap alignment use distribution-to-distribution (D2D) GMM registration from the GIRA3D ecosystem.
High-level data flow:
- Input — Depth point clouds (ROS), optional ground truth pose, optional IMU.
- Preprocessing — Range filtering, voxel downsampling.
- Local odometry / smoothing — A fixed-lag factor graph (
FixedLagBackend) integrates relative motion (e.g. noisy GT–based odometry) at a high rate and keeps a sliding window of poses and keyframes. - Keyframe GMM fitting — On selected keyframes, a self-organizing GMM (SOGMM) is fit asynchronously (GPU path where available) and stored per keyframe for registration and map building.
- Registration — Sequential and loop-closure requests use SOLiD-style descriptors for candidates where configured, and D2D GMM registration to estimate relative transforms and scores for factors.
- Global pose graph — Submaps are anchored on the smoother; when a submap closes, its keyframe GMMs are merged in the submap frame, pruned (R-tree candidate search, Bhattacharyya gate, then keeping the Gaussian with lower associated pose uncertainty), saved to disk, and used for overlap-based submap registration. iSAM2 refines submap poses when new between / loop factors arrive.
- Visualization — RViz markers for trajectory, latest GMM, global per-submap GMM map, submap graph, and loop-closure debug markers.
The Dockerfile targets Linux x86_64 with an NVIDIA GPU and the NVIDIA Container Toolkit (GPU passthrough). The image is based on:
nvidia/cuda:11.8.0-devel-ubuntu20.04— CUDA 11.8 development image on Ubuntu 20.04- ROS 1 Noetic
- GTSAM 4.2, Eigen 3.4, GIRA3D reconstruction & registration (Open3D, SOGMM, D2D), Webots +
webots_rosfor simulation-style workflows
It is not intended for ARM-only or CPU-only production (the default build enables CUDA paths in the reconstruction stack). CPU-only or other arches would require Dockerfile changes.
After a full build, the Docker image is on the order of ~24.2 GB (layers include CUDA, ROS, Webots, colcon-built GIRA3D + Open3D, and tooling). Plan disk space accordingly.
The following host setup was used for builds and runs (container OS remains Ubuntu 20.04 inside the image):
| Item | Value |
|---|---|
| Host OS | Ubuntu 22.04.5 Desktop |
| CPU | Intel Core Ultra 9 275HX |
| GPU | NVIDIA GeForce RTX 5070 Ti (12 GB) |
| Driver / CUDA (host) | CUDA 13.2 (driver stack on host; container ships CUDA 11.8 from the base image) |
| RAM | 32 GB |
Other Linux hosts with a supported NVIDIA driver and Docker + Compose should work, but only the configuration above is explicitly documented as tested.
From the gmmslam/docker directory (compose build context is the gmmslam package root ..):
cd docker
docker compose up -d --buildAttach a shell:
docker exec -it disal_slam bashStop the container:
docker stop disal_slamRebuild without cache (heavy):
docker compose build --no-cacheThe compose file (docker/compose.yaml) mounts the repo into the workspace and passes DISPLAY and GPU devices for RViz and CUDA workloads.
docker exec -it disal_slam run_gmmslamIn a separate terminal:
docker exec -it disal_slam run_rvizRViz will show:
- DepthCloud: Live depth point cloud from drone
- GmmMap: Per-submap GMM map (3D ellipsoids, color-coded per submap, re-anchored on every loop closure)
- GmmLatest (off by default): GMM fit of the latest keyframe in white
- SubmapGraph: Submap nodes (green) and loop-closure edges (magenta)
- KeyframeNodes (off by default): Per-keyframe pose markers (red)
- LoopClosures: Detected loop-closure correspondences
- Path: Trajectory estimate
- Odometry: Current pose
- TF: Transform tree (world → m500_1_base_link)
Subscribed:
/m500_1/mpa/depth/points— Depth point cloud
Published (visualization):
/gmmslam_node/path— Trajectory/gmmslam_node/odom— Odometry/gmmslam_node/gmm_global_markers— The map: per-submap GMMs as 3D ellipsoids, globally aligned via iSAM2, refreshed every loop closure/gmmslam_node/gmm_markers— Latest-keyframe GMM/gmmslam_node/global_graph_markers— Submap pose graph + loop edges/gmmslam_node/graph_nodes— Per-keyframe pose markers/gmmslam_node/loop_closure_markers— Loop-closure correspondences/gmmslam_node/latest_frame_cloud— Most recent depth scan in world frame/gmmslam_node/map_cloud— Accumulated depth scans (one chunk per smoother frame), reprojected with currentpose_by_idxso fixed-lag and keyframe-level loop updates move the cloud; latched; rate set bymap_cloud_publish_hz(default 0.5 Hz). Pure submap–submap graph edges alone do not move these points (they follow the smoother, not the global submap graph).
The map is stored as one merged GMM per submap on disk (gmm_dir/submap_XXXX.gmm). Near-duplicate components produced by overlapping observations are pruned at submap finalization using optional R-tree spatial indexing, a Bhattacharyya-distance gate, and a keep-lower-pose-uncertainty rule; see the map: block in config/params.yaml to tune.
Edit the SOGMM bandwidth in python/gmmslam.py (line 375):
self.sogmm_bandwidth = rospy.get_param("~sogmm_bandwidth", 0.2)- Larger bandwidth (0.3–0.5) → more stable D2D registration
- Smaller bandwidth (0.1) → more precise GMM fitting
This repository vendors or depends on the following (versions may follow the Docker image or your host install):
| Component | Role | Upstream / license |
|---|---|---|
| ROS Noetic | Middleware, messages, RViz, tf, Catkin |
Open Robotics |
| GTSAM | Smoothing and iSAM2 global submap graph | borglab/gtsam |
| Eigen | Linear algebra | libeigen/eigen |
| yaml-cpp | YAML configuration loading | jbeder/yaml-cpp |
| nlohmann/json | JSON in C++ (FetchContent if not found on system) | nlohmann/json |
| dlib | Pulled in via GIRA3D / registration stack | davisking/dlib |
| GIRA3D reconstruction | SOGMM / Open3D colcon workspace (self_organizing_gmm, etc.) |
gira3d/gira3d-reconstruction |
| GIRA3D registration | GMM I/O, D2D registration (gmm, gmm_d2d_registration) |
gira3d/gira3d-registration |
| SOLiD | Place recognition for loop closure — spatially organized LiDAR global descriptor (cosine gate on radius candidates, optional yaw prior for D2D init); see Kim et al., IEEE RA-L 2024 | sparolab/SOLiD (official implementation; BSD-3-Clause) |
| Open3D | Built as part of the GIRA3D reconstruction image flow | isl-org/Open3D |
| nanoflann | KD-tree headers in the GIRA3D dry install | jlblancoc/nanoflann |
| NVIDIA CUDA (base image) | GPU builds for reconstruction / SOGMM path | NVIDIA EULA |
| CMake (Kitware tarball in Docker) | Build system for newer Open3D | Kitware/CMake |
| Webots + webots_ros | Simulator integration in the Docker image | cyberbotics/webots |
R-tree (src/util/rtree.h) |
Spatial indexing for GMM pruning candidates | Vendored from nushoin/RTree (header-only RTree.h lineage: Guttman & Stonebraker’s algorithm, C++ templated port by Greg Douglas, community forks; see that repo’s README and LICENSE for authorship and licensing) |
Design ideas were informed by public systems such as:
- ROS 1 Noetic — Primary path (Docker + Catkin).
- ROS 2 Humble — TODO.