docs: clarify volumetric array axis order is WHD (X,Y,Z), not HWD#8826
docs: clarify volumetric array axis order is WHD (X,Y,Z), not HWD#8826aymuos15 wants to merge 2 commits intoProject-MONAI:devfrom
Conversation
…oject-MONAI#8337) MONAI's NIfTI/ITK readers return ndarrays whose axis 0 is columns/Width (X), axis 1 is rows/Height (Y), and axis 2 is Depth/slices (Z) — see `ITKReader._get_array_data` in `monai/data/image_reader.py` which applies `.T` to the ITK-native (Z,Y,X) array to produce (X,Y,Z), and nibabel which already serves (X,Y,Z). Several docstrings and one error message described this layout as `HWD` / `CHWD` / `NCHWD`, which swaps the spatial meanings of H and W and is misleading to users writing custom loaders or comparing array shapes against DICOM/NIfTI metadata. Update the user-facing docstrings to use `WHD` / `CWHD` / `NCWHD` and add a brief note on each describing the per-axis meaning so the convention is unambiguous when read in isolation: - monai/data/image_writer.py: `H, HW, HWD` → `W, WH, WHD` in the `resample_if_needed` docstring (the canonical NIfTI-semantics description in the writer hierarchy). - monai/visualize/utils.py: `matshow3d` volume shape doc. - monai/visualize/img2tensorboard.py: docstrings + the `AssertionError` message in `_image3_animated_gif`. The TensorBoard API arguments `dataformats="HW"/"CHW"/"NCHWT"` and the user-visible GIF tag suffix `f"{tag}_HWD"` are intentionally left unchanged — those are TensorBoard-side conventions / public log keys. - monai/handlers/tensorboard_handlers.py: `frame_dim` docstring on `TensorBoardImageHandler`. - monai/apps/vista3d/inferer.py: `point_based_window_inferer` input shape doc. This change is documentation-only; no runtime behavior is affected. Out of scope for this commit (tracked separately): - `monai/apps/deepedit/transforms.py` uses the `CHWD` label in 15 places that are tightly coupled to click-coordinate construction (`[g[0], g[-2], g[-1], sid]`). Renaming the label without verifying the click semantics would create a doc/code mismatch and may mask a real coordinate-order bug; this needs a separate code-level review. - `monai/apps/deepgrow/transforms.py` uses `CDHW` (depth-first), which is a different convention question from the H/W swap addressed here. - Tensor-axis docstrings in losses/metrics/inferers (NCHW, NCHWD, BCDHW, BC[HWD], etc.) follow the established PyTorch placeholder convention and are dimension-agnostic; not touched. - Detection box-mode enums (`XYZWHD`, `CCCWHD`) and anchor flatten counts (`HWA`, `HWDA`) define H/W/D explicitly within their own subsystems and are internally consistent; not touched. Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR updates documentation and assertion messages in five files (monai/apps/vista3d/inferer.py, monai/data/image_writer.py, monai/handlers/tensorboard_handlers.py, monai/visualize/img2tensorboard.py, monai/visualize/utils.py) to correct spatial axis conventions from HWD/H... ordering to WHD/XYX-style ordering (Width, Height, Depth) consistent with NIfTI/ITK reader outputs. Docstrings, parameter descriptions, example shapes, and one assertion message were changed. No function signatures, executable logic, or control flow were modified. Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@monai/apps/vista3d/inferer.py`:
- Around line 48-50: The docstring for the parameter `inputs` incorrectly
describes axis 0/1/2 as Width/Height/Depth; update the `inputs` shape
description (tensor shape `[1, C, W, H, D]`) to state that axis 0 is the batch
dimension, axis 1 is the channel dimension, and spatial axes are axes 2–4
(Width, Height, Depth) to match MONAI/NumPy ordering; edit the `inputs` entry in
inferer.py so it explicitly calls out spatial axes and their meanings (e.g.,
"axes 2–4 are spatial: W, H, D") and ensure wording aligns with other docstrings
in the file.
In `@monai/visualize/utils.py`:
- Around line 68-73: The matshow3d docstring contains minor typos and unclear
phrasing: change the phrase "dim and reshape to (-1, spatial0, spatial1) shape
to construct frames" to a clearer verb phrase such as "transpose and reshape to
(-1, spatial0, spatial1) to construct frames" and fix "dimensionas" to
"dimension as" in the channel_dim description; update the channel_dim sentence
so it reads something like "if not None, explicitly specify the channel
dimension to be transposed to the last dimension as shape (-1, spatial0,
spatial1, C)" and ensure references to frame_dim and batch_dim remain accurate
(function: matshow3d, params: channel_dim, frame_dim, batch_dim).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9c4bcd84-f190-43d4-873e-3f20a6124d99
📒 Files selected for processing (5)
monai/apps/vista3d/inferer.pymonai/data/image_writer.pymonai/handlers/tensorboard_handlers.pymonai/visualize/img2tensorboard.pymonai/visualize/utils.py
monai/apps/vista3d/inferer.py
Outdated
| inputs: [1CWHD], input image to be processed (axis 0 is columns/Width, | ||
| axis 1 is rows/Height, axis 2 is Depth, matching arrays returned by | ||
| MONAI's NIfTI/ITK readers). |
There was a problem hiding this comment.
Correct axis-index wording for inputs shape [1, C, W, H, D].
Line 48-Line 50 currently state axis 0/1/2 as Width/Height/Depth, but for this tensor axis 0 is batch and axis 1 is channel. Please describe spatial axes explicitly.
Proposed doc fix
- inputs: [1CWHD], input image to be processed (axis 0 is columns/Width,
- axis 1 is rows/Height, axis 2 is Depth, matching arrays returned by
- MONAI's NIfTI/ITK readers).
+ inputs: [1CWHD], input image to be processed (spatial axes are in
+ Width, Height, Depth order; i.e., for [N, C, W, H, D], Width is
+ axis 2, Height is axis 3, Depth is axis 4).As per coding guidelines, "Docstrings should be present for all definition which describe each variable...," which includes correct axis semantics.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@monai/apps/vista3d/inferer.py` around lines 48 - 50, The docstring for the
parameter `inputs` incorrectly describes axis 0/1/2 as Width/Height/Depth;
update the `inputs` shape description (tensor shape `[1, C, W, H, D]`) to state
that axis 0 is the batch dimension, axis 1 is the channel dimension, and spatial
axes are axes 2–4 (Width, Height, Depth) to match MONAI/NumPy ordering; edit the
`inputs` entry in inferer.py so it explicitly calls out spatial axes and their
meanings (e.g., "axes 2–4 are spatial: W, H, D") and ensure wording aligns with
other docstrings in the file.
monai/visualize/utils.py
Outdated
| the `-3` dimension. dim and reshape to (-1, spatial0, spatial1) shape to construct frames, | ||
| default to `-3`. | ||
| channel_dim: if not None, explicitly specify the channel dimension to be transposed to the | ||
| last dimensionas shape (-1, H, W, C). this can be used to plot RGB color image. | ||
| if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as shape (-1, H, W). | ||
| last dimensionas shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image. | ||
| if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as | ||
| shape (-1, spatial0, spatial1). |
There was a problem hiding this comment.
Fix docstring typos in matshow3d args.
Line 68 and Line 71 have wording issues ("dim and reshape" and "dimensionas"), which hurt readability.
Proposed doc fix
- frame_dim: for higher dimensional arrays, which dimension from (`-1`, `-2`, `-3`) is moved to
- the `-3` dimension. dim and reshape to (-1, spatial0, spatial1) shape to construct frames,
+ frame_dim: for higher dimensional arrays, which dimension from (`-1`, `-2`, `-3`) is moved to
+ the `-3` dimension, then reshaped to (-1, spatial0, spatial1) to construct frames,
default to `-3`.
channel_dim: if not None, explicitly specify the channel dimension to be transposed to the
- last dimensionas shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image.
+ last dimension as shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image.As per coding guidelines, "Review the Python code for quality and correctness... Suggest any enhancements for code improving ... comprehensibility."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| the `-3` dimension. dim and reshape to (-1, spatial0, spatial1) shape to construct frames, | |
| default to `-3`. | |
| channel_dim: if not None, explicitly specify the channel dimension to be transposed to the | |
| last dimensionas shape (-1, H, W, C). this can be used to plot RGB color image. | |
| if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as shape (-1, H, W). | |
| last dimensionas shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image. | |
| if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as | |
| shape (-1, spatial0, spatial1). | |
| the `-3` dimension, then reshaped to (-1, spatial0, spatial1) to construct frames, | |
| default to `-3`. | |
| channel_dim: if not None, explicitly specify the channel dimension to be transposed to the | |
| last dimension as shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image. | |
| if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as | |
| shape (-1, spatial0, spatial1). |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@monai/visualize/utils.py` around lines 68 - 73, The matshow3d docstring
contains minor typos and unclear phrasing: change the phrase "dim and reshape to
(-1, spatial0, spatial1) shape to construct frames" to a clearer verb phrase
such as "transpose and reshape to (-1, spatial0, spatial1) to construct frames"
and fix "dimensionas" to "dimension as" in the channel_dim description; update
the channel_dim sentence so it reads something like "if not None, explicitly
specify the channel dimension to be transposed to the last dimension as shape
(-1, spatial0, spatial1, C)" and ensure references to frame_dim and batch_dim
remain accurate (function: matshow3d, params: channel_dim, frame_dim,
batch_dim).
- vista3d/inferer.py: clarify spatial-axis indices for [N,C,W,H,D] tensor
(axes 2-4, not 0-2) in point_based_window_inferer inputs docstring.
- visualize/utils.py: fix matshow3d docstring typos ('dim and reshape' -> 'then
reshaped'; 'dimensionas' -> 'dimension as').
Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
Fixes #8337
Summary
MONAI's NIfTI/ITK readers return ndarrays whose axis 0 is columns/Width (X), axis 1 is rows/Height (Y), and axis 2 is Depth/slices (Z) — see
ITKReader._get_array_datainmonai/data/image_reader.pywhich applies.Tto the ITK-native (Z,Y,X) array to produce (X,Y,Z); nibabel already serves (X,Y,Z). Several docstrings and one error message described this layout asHWD/CHWD/NCHWD, swapping the spatial meanings of H and W.This PR updates the user-facing docstrings to use
WHD/CWHD/NCWHDand adds a brief per-axis note on each so the convention is unambiguous when read in isolation:monai/data/image_writer.py:H, HW, HWD→W, WH, WHDin theresample_if_neededdocstring (canonical NIfTI-semantics description in the writer hierarchy).monai/visualize/utils.py:matshow3dvolume shape doc.monai/visualize/img2tensorboard.py: docstrings + theAssertionErrormessage in_image3_animated_gif. The TensorBoard API argumentsdataformats="HW"/"CHW"/"NCHWT"and the user-visible GIF tag suffixf"{tag}_HWD"are intentionally left unchanged — those are TensorBoard-side conventions / public log keys.monai/handlers/tensorboard_handlers.py:frame_dimdocstring onTensorBoardImageHandler.monai/apps/vista3d/inferer.py:point_based_window_infererinput shape doc.Documentation-only; no runtime behavior is affected.
Out of scope (tracked separately)
monai/apps/deepedit/transforms.pyuses theCHWDlabel in 15 places that are tightly coupled to click-coordinate construction ([g[0], g[-2], g[-1], sid]). Renaming the label without verifying the click semantics would create a doc/code mismatch and may mask a real coordinate-order bug; needs a separate code-level review.monai/apps/deepgrow/transforms.pyusesCDHW(depth-first), which is a different convention question from the H/W swap addressed here.XYZWHD,CCCWHD) and anchor flatten counts (HWA,HWDA) define H/W/D explicitly within their own subsystems and are internally consistent; not touched.