Anisotropic #55

Merged

davidackerman merged 37 commits into main from anisotropic
Mar 17, 2026
Conversation

@davidackerman
Collaborator

@stuarteberg
This pull request introduces comprehensive support for anisotropic voxel sizes throughout the codebase, improving the accuracy and flexibility of physical-unit calculations and operations. The changes ensure that all tools and algorithms correctly handle datasets with non-uniform voxel dimensions, including Gaussian filtering, volume thresholds, and center-of-mass computations. The documentation has also been updated to reflect these enhancements.

Anisotropic voxel support and physical-unit handling:

  • All tools now handle anisotropic voxel sizes, and physical-unit parameters are automatically converted to per-axis voxel units. When datasets have mismatched voxel sizes, resampling is performed to a common resolution. (README.md)
  • Volume threshold calculations in connected_components.py and clean_connected_components.py now use the correct scalar voxel volume for both minimum and maximum volume thresholds, ensuring accurate filtering for anisotropic data.
  • Gaussian smoothing and block padding in connected_components.py are now performed with per-axis sigma and padding, allowing for anisotropic filtering and precise block alignment.
  • The offset calculation for global ID assignment in connected_components.py now divides by the per-axis voxel size, ensuring correct spatial mapping for anisotropic blocks.
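
The scalar-voxel-volume conversion described above can be sketched as follows (the helper name here is illustrative, not the package's actual API):

```python
import numpy as np

def volume_threshold_voxels(volume_nm3, voxel_size_nm):
    """Convert a physical volume threshold (nm^3) to a voxel count.

    The scalar voxel volume is the product of the per-axis sizes, so a
    [32, 16, 8] nm voxel occupies 4096 nm^3 regardless of anisotropy.
    """
    voxel_volume_nm3 = float(np.prod(voxel_size_nm))
    return volume_nm3 / voxel_volume_nm3
```

For isotropic data this reduces to dividing by the cube of the edge length, which is why the bug only surfaced with anisotropic voxel sizes.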

Algorithm and utility updates:

  • The center-of-mass computation in the C++ backend (centers.hpp and centers.pyx) now accepts and applies per-axis voxel sizes, with backward compatibility for isotropic cases. This change propagates through Python wrappers and ensures all center and radius calculations are in physical units.
  • The block padding and measurement functions in measure.py now use per-axis voxel sizes, improving the accuracy of blockwise measurements and neighbor block handling.
  • The line fitting algorithm in fit_lines_to_segmentations.py now operates in physical space, multiplying coordinates by voxel size before SVD and endpoint calculations.
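
A minimal sketch of the physical-space fit (the function name is hypothetical; the package's actual implementation lives in fit_lines_to_segmentations.py): scale voxel indices by the per-axis voxel size before the SVD, then project onto the principal direction to get endpoints in nm.

```python
import numpy as np

def fit_line_physical(voxel_coords, voxel_size_nm):
    # Convert voxel indices to physical coordinates before fitting.
    pts = np.asarray(voxel_coords, dtype=float) * np.asarray(voxel_size_nm, dtype=float)
    centroid = pts.mean(axis=0)
    # Principal direction = first right singular vector of the centered points.
    _, _, vt = np.linalg.svd(pts - centroid)
    direction = vt[0]
    # Project onto the direction to find the physical-space endpoints.
    t = (pts - centroid) @ direction
    return centroid + t.min() * direction, centroid + t.max() * direction
```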

Documentation and CLI improvements:

  • The README.md has been updated to clarify anisotropic support, describe physical-unit parameter handling, and document changes to tool descriptions, including new measurement capabilities and resampling behavior.

Codebase maintenance:

  • Minor changes to CLI entry points and imports to support updated utilities and maintain consistency.

These changes collectively ensure robust and accurate processing for datasets with anisotropic voxel sizes, aligning all operations and measurements to physical units and improving usability and correctness across the package.

This commit addresses critical issues with anisotropic data processing
where voxel sizes differ across dimensions (e.g., [32, 16, 8] nm).

Key fixes:
- Added trim_array_anisotropic() utility function to handle per-axis
  padding calculations based on physical units (nm) rather than uniform
  voxel counts
- Fixed TypeError in contact_sites.py where min() on voxel_size tuples
  returned arrays instead of scalars
- Fixed IndexError in watershed_segmentation.py caused by mismatched
  padding/trimming with anisotropic data
- Fixed ValueError in connected_components.py from array truthiness
  checks on volume thresholds and gaussian smoothing parameters
- Updated mask_util.py to calculate anisotropic padding correctly
- Fixed test fixtures to properly handle anisotropic voxel volumes

Results:
- Reduced failing tests from 67 to 47 (20 tests fixed)
- Increased passing tests from 157 to 177
- Core anisotropic processing now functional

Additional fixes for array scalar conversion issues:
- Fixed contact_sites.py to ensure voxel_size is a Coordinate object
  for proper arithmetic operations with chunk_shape
- Fixed connected_components.py to handle gaussian_smoothing_sigma_nm
  as arrays with proper truthiness checking
- Updated test fixtures to convert voxel-based parameters to physical
  units correctly for anisotropic data
- Ensured all volume and distance calculations use scalar values

Test improvements:
- 40 failed, 184 passed (was 47 failed, 177 passed)
- 7 more tests now passing

Key fixes:
- Fixed morphological_operations.py to use uniform voxel-based trimming
  since erosion/dilation operate uniformly in voxel space
- Fixed clean_connected_components.py to ensure volume thresholds are
  scalar values for array comparisons
- Updated test fixtures in test_clean_connected_components.py to
  calculate volumes correctly for anisotropic data

Test results:
- 30 failed, 194 passed (was 40 failed, 184 passed)
- 10 additional tests now passing
- Morphological operations fully working with anisotropic data
- Clean connected components fully working

- Fixed label_with_mask.py to use uniform voxel-based trimming since
  erosion operates uniformly in voxel space
- Updated image_data_interface tests to ensure voxel_size is converted
  to Coordinate type for ROI multiplication operations

Test results:
- 29 failed, 195 passed (was 30 failed, 194 passed)
- All label_with_mask tests now passing with anisotropic data

The key issue was that Gaussian smoothing requires per-axis sigma values
when working with anisotropic voxel sizes. The test was using a uniform
scalar sigma value, but it should calculate per-axis sigma by dividing
the sigma in nm by each axis's voxel size.

This matches the implementation in connected_components.py which already
correctly computes per-axis sigma values (line 196-198).
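
The conversion it describes can be sketched with scipy (the voxel sizes here are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

voxel_size_nm = (32, 16, 8)  # example anisotropic spacing
sigma_nm = 32.0

# Dividing a physical sigma by each axis's voxel size yields per-axis
# voxel-unit sigmas, giving the same physical blur on every axis.
sigma_voxels = tuple(sigma_nm / s for s in voxel_size_nm)

data = np.random.default_rng(0).random((16, 16, 16))
smoothed = gaussian_filter(data, sigma=sigma_voxels)
```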

Test results:
- 25 failed, 199 passed (was 29 failed, 195 passed)
- All 4 Gaussian smoothing tests now passing for anisotropic data

Added missing Coordinate import to test_image_data_interface.py which
was needed for the voxel_size type checking and ROI multiplication
operations added in previous commits.

Test results:
- 17 failed, 207 passed (was 25 failed, 199 passed)
- All 10 image_data_interface tests now passing

Fixed tolerance calculations in skeletonize tests that were creating
array-valued tolerances when multiplying by anisotropic voxel_size
arrays. Now uses max(voxel_size) to get a scalar tolerance value
for the most generous bounds checking.

This allows the assertions comparing coordinates to bounds to work
properly with both isotropic and anisotropic voxel sizes.

Test results:
- 13 failed, 211 passed (was 17 failed, 207 passed)
- All 7 anisotropic skeletonize tests now passing
- Both isotropic skeletonize tests also fixed

Note: EDT already properly handles anisotropy via the anisotropy
parameter (line 187 in watershed_segmentation.py), returning
distances in physical units.
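
The same idea can be illustrated with scipy's `distance_transform_edt`, whose `sampling` argument plays the role of the `edt` package's `anisotropy` parameter (the mask and values below are illustrative, not from the pipeline):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

voxel_size_nm = (32, 16, 8)
mask = np.ones((1, 1, 5), dtype=bool)
mask[0, 0, 0] = False  # a single background voxel

# `sampling` gives the per-axis spacing, so distances come back in nm:
# along z, neighboring voxels are 8 nm apart.
dist_nm = distance_transform_edt(mask, sampling=voxel_size_nm)
```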

- KDTree now operates in physical space by scaling coordinates with voxel_size
- Contact distance threshold properly uses nm units for anisotropic data
- Calculate actual padding from array shapes to handle ROI rounding correctly
- Use per-axis trimming based on actual padding applied

This ensures contact detection uses consistent physical distances regardless
of voxel anisotropy. Fixes 1 of 6 anisotropic contact_sites test failures.
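
A sketch of the physical-space KDTree query (the coordinates and threshold are illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

voxel_size_nm = np.array([32, 16, 8])
surface_a = np.array([[0, 0, 0]])  # voxel indices on object A's surface
surface_b = np.array([[0, 0, 2]])  # 2 voxels away along z = 16 nm

# Scale voxel indices to nm before building and querying the tree, so the
# contact threshold is a true physical distance.
tree = cKDTree(surface_b * voxel_size_nm)
dist_nm, _ = tree.query(surface_a * voxel_size_nm)
in_contact = dist_nm <= 16.0  # contact_distance_nm
```

Without the scaling, a 2-voxel gap along z would be treated the same as a 2-voxel gap along x, even though they differ 4x in physical distance.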

Calculate actual padding from array shapes instead of estimating from nm.
This accounts for rounding differences when ROIs are converted to voxel coordinates.

Fixes test_masks for anisotropic data.

- Updated contact_sites fixtures to compute ground truth using actual contact detection
- This ensures ground truth matches expected behavior for both isotropic and anisotropic data
- Physical distance calculations now consistent across test and ground truth
- Fixes anisotropic contact_sites whole array tests

Remaining: shape mismatch issues in blockwise processing and distance=3 isotropic tests

- Use original isotropic ground truth for isotropic data
- Only generate dynamically for anisotropic data
- This preserves the original test expectations for isotropic cases

Still investigating: 3 isotropic distance=3 tests failing, 2 anisotropic shape mismatches

- Removed dynamic ground truth generation using the function itself
- Created separate fixtures for isotropic and anisotropic ground truth
- Anisotropic ground truth correctly reflects that no contacts exist
  with small contact distances due to large Z-axis spacing
- Whole tests now use legacy behavior (voxel-space only)

Remaining: 3 isotropic distance=3 failures, 3 anisotropic test failures

For anisotropic data, when segmentation_1 is downsampled 2x to create
segmentation_1_downsampled, the array shape changes from (11,11,11) to
(6,6,6) due to integer division. When voxel_size also doubles, this
results in different physical coverage:
- Original: (11,11,11) at [32,16,8] nm = [352,176,88] nm
- Downsampled: (6,6,6) at [64,32,16] nm = [384,192,96] nm (larger!)

This mismatch causes issues in contact_sites processing when trying to
upsample and align the data. The fix ensures that when writing the
downsampled zarr datasets, we use the same total_roi as the original
data, so both datasets represent the same physical region.
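
The coverage mismatch can be reproduced with plain slicing-based 2x downsampling:

```python
import numpy as np

voxel_size_nm = np.array([32, 16, 8])
seg = np.zeros((11, 11, 11))

seg_ds = seg[::2, ::2, ::2]  # 11 samples at step 2 -> 6 per axis, not 5

orig_extent_nm = np.array(seg.shape) * voxel_size_nm         # [352 176  88]
ds_extent_nm = np.array(seg_ds.shape) * (voxel_size_nm * 2)  # [384 192  96]
# The downsampled array covers a larger physical region unless it is
# written with the original dataset's total_roi.
```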

When processing blocks with different voxel sizes, arrays are upsampled
to the finer output_voxel_size. Due to snap_to_grid expanding ROIs and
integer upsampling factors, the resulting array can be slightly larger
than expected (off by 1 voxel in some dimensions).

Changes:
- Use np.round() when converting physical ROI to voxel coordinates for
  more accurate rounding
- Calculate total_padding_voxels instead of assuming symmetric padding
- Make trim_with_padding more robust by:
  - Attempting to center the crop
  - Verifying the result shape matches expected
  - Falling back to crop from origin if needed

This handles edge cases where upsampling creates arrays that are 1 voxel
larger than the target write_roi expects.
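
A simplified sketch of the centering-with-fallback crop (this is not the package's actual trim_with_padding, just the strategy the commit describes):

```python
import numpy as np

def center_crop_with_fallback(arr, target_shape):
    # Try a centered crop first; if any axis would need a negative start,
    # or the result does not match, fall back to cropping from the origin.
    arr = np.asarray(arr)
    starts = [(a - t) // 2 for a, t in zip(arr.shape, target_shape)]
    if all(s >= 0 for s in starts):
        out = arr[tuple(slice(s, s + t) for s, t in zip(starts, target_shape))]
        if out.shape == tuple(target_shape):
            return out
    return arr[tuple(slice(0, t) for t in target_shape)]
```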

The WatershedSegmentation class was broken for anisotropic data and
redundant given mutex watershed. This removes the module, its tests,
CLI entry point, and documentation references.

Use per-axis padding so each axis gets the correct number of context
voxels, and update dask_util to check np.any(padding) for array padding.

Perform SVD in physical space so line fitting accounts for per-axis
voxel sizes. Update test cylinder construction to use physical-space
distance checks, producing true circular cross-sections regardless of
voxel anisotropy.

Merge separate isotropic/anisotropic contact site fixtures into unified
fixtures that branch on voxel_size internally. Remove redundant
anisotropic-specific fixtures and simplify test_image_dict. Skip legacy
contact site tests for anisotropic data.

…round truth helper

Replace hardcoded contact sites ground truth arrays with dynamic computation
via a shared helper module, and rename fixtures/parameters from distance_1/2/3
(voxel units) to distance_8nm/16nm/24nm to make the physical units explicit.

Test initialize_contact_site_array (surface detection, overlap detection,
masking behavior) and bresenham_3D_lines (axis-aligned, diagonal, mask
blocking, multiple pairs) independently from the full contact sites pipeline.

Mermaid-based diagrams covering the high-level pipeline, blockwise processing
pattern, data flow through ImageDataInterface, and CLI command mapping.

Scalar padding_nm produced non-integer voxel counts per axis when
voxel sizes differed, causing shape mismatches during trimming.
Use per-axis Coordinate padding (integer multiples of each voxel size)
in both connected components Gaussian smoothing and contact sites
blockwise processing. Update trim_array_anisotropic to accept
per-axis padding_nm.
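
The per-axis padding computation can be sketched as follows (the helper name is hypothetical):

```python
import math

def per_axis_padding_voxels(padding_nm, voxel_size_nm):
    # Round the physical padding up to a whole number of voxels on each
    # axis, so padding and the later trimming stay shape-consistent.
    return tuple(math.ceil(padding_nm / s) for s in voxel_size_nm)
```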

- Change anisotropic voxel_size from (32,16,8) to (3,7,5) so test
  segmentations produce non-trivial contacts at 8/16/24nm distances
- Rewrite ground truth helper with independent NumPy implementation
  (pure NumPy surface detection + KDTree pairing) to avoid circular
  testing against get_ndarray_contact_sites
- Update whole tests to use the modern API (voxel_size +
  contact_distance_nm), removing the 2 pytest.skip calls
- Rename tests from _whole_1/2/3 to _whole_8nm/16nm/24nm
- Increase fit_lines tolerance from 0.5 to 1.0nm for thin cylinders
  at smaller anisotropic voxel sizes

Add support for non-integer voxel size ratios (e.g., 8nm→5nm = 1.6x) using
scipy interpolation methods while maintaining fast paths for integer scaling.

Key changes:
- Add is_close_to_integer() and requires_interpolation() helpers to detect
  non-integer scale factors within 1% tolerance
- Add interpolation_order parameter to control interpolation method:
  * order=0 (default): nearest-neighbor for segmentations
  * order=1: linear interpolation for raw data
- Use map_coordinates() for ROI-based non-integer resampling to ensure
  global grid alignment across datasets with different native resolutions
- Use zoom() for non-ROI non-integer resampling
- Replace block_reduce with simple slicing for integer downsampling to
  preserve exact label values
- Maintain fast paths (repeat/slicing) for integer or close-to-integer
  factors (within 1% tolerance)

This enables blockwise operations on datasets with mismatched voxel sizes
where the scale factors are not integers, which previously caused shape
mismatches and grid misalignment.
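
A sketch of the fast-path/interpolation split (the function names echo the commit message, but the bodies here are illustrative, not the package's implementation):

```python
import numpy as np
from scipy.ndimage import zoom

def is_close_to_integer(x, tol=0.01):
    # Factors within 1% of an integer take the exact fast path.
    return abs(x - round(x)) <= tol * max(abs(x), 1.0)

def upsample(arr, factors, order=0):
    if all(is_close_to_integer(f) for f in factors):
        out = arr
        for axis, f in enumerate(factors):
            out = np.repeat(out, int(round(f)), axis=axis)  # exact labels
        return out
    # Non-integer factor: scipy interpolation; order=0 (nearest-neighbor)
    # preserves segmentation label values, order=1 suits raw data.
    return zoom(arr, factors, order=order)
```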

Add test coverage for non-integer voxel size resampling including the
critical cross-dataset alignment requirement.

Test coverage includes:
- Helper function validation (is_close_to_integer, requires_interpolation)
- Non-integer upsampling and downsampling (1.6x, 0.4x, mixed anisotropic)
- Label preservation with interpolation_order=0
- ROI extraction with non-integer scaling
- Integer fast path verification (repeat for up, slicing for down)
- Close-to-integer factor handling (1.999 → 2)
- Physical dimension preservation through resampling
- Cross-dataset alignment: datasets with different input voxel sizes
  must return identical output shapes when queried with the same ROI
  and output_voxel_size (critical for blockwise operations)

Add non_integer_voxel_sizes fixture to conftest.py for parameterized
testing of various non-integer scale factor combinations.

Use per-axis checks instead of first-axis-only for rescaling decisions,
handle mixed up/down factors via zoom fallback, and add test for datasets
with independently different anisotropic voxel sizes.

Resolve conflicts in conftest.py (keep both raw_intensity and anisotropic
contact site fixtures). Fix voxel_edge_length -> voxel_size in raw stats
test, and make skeleton metric assertions voxel-size-aware.

Upsample binary mask to isotropic resolution (using min voxel size)
before skeletonizing so Lee's thinning algorithm operates uniformly
in physical space. Update tests to account for Lee's known limitation
of producing empty skeletons for certain block cross-sections.
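
The upsample-to-isotropic step can be sketched as integer repetition per axis (assuming, as in this example, that each voxel size is an integer multiple of the minimum):

```python
import numpy as np

voxel_size_nm = (32, 16, 8)             # example anisotropic spacing
iso_nm = min(voxel_size_nm)             # target isotropic resolution: 8 nm
factors = tuple(s // iso_nm for s in voxel_size_nm)  # (4, 2, 1)

mask = np.ones((2, 2, 2), dtype=bool)
up = mask
for axis, f in enumerate(factors):
    up = np.repeat(up, f, axis=axis)
# `up` now samples the same physical region at 8 nm on every axis,
# so Lee's thinning sees uniform spacing.
```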

- Slim global conftest.py (714 -> 45 lines): only core parametrization
  fixtures (voxel_size, image_shape, chunk_size, shared_tmpdir)
- Move all data fixtures, zarr writing, and CSV setup to
  tests/operations/conftest.py
- Centralize test helper functions in tests/test_utils.py, fixing the
  circular import (conftest importing from test_measure.py)
- Remove ~250 lines of duplicated helper code from test_measure.py
@davidackerman davidackerman merged commit 7cd67b3 into main Mar 17, 2026
2 checks passed
@davidackerman davidackerman deleted the anisotropic branch March 17, 2026 17:43

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 82.03593% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.67%. Comparing base (e09a692) to head (a59cdc7).
⚠️ Report is 38 commits behind head on main.

Files with missing lines Patch % Lines
src/cellmap_analyze/util/image_data_interface.py 66.66% 17 Missing ⚠️
src/cellmap_analyze/process/contact_sites.py 78.57% 9 Missing ⚠️
src/cellmap_analyze/util/measure_util.py 87.50% 3 Missing ⚠️
src/cellmap_analyze/util/mask_util.py 92.30% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #55      +/-   ##
==========================================
- Coverage   81.92%   81.67%   -0.25%     
==========================================
  Files          24       23       -1     
  Lines        3076     3040      -36     
==========================================
- Hits         2520     2483      -37     
- Misses        556      557       +1     
