Skip to content

docs: clarify volumetric array axis order is WHD (X,Y,Z), not HWD#8826

Open
aymuos15 wants to merge 2 commits intoProject-MONAI:devfrom
aymuos15:fix-8337-whd-docs
Open

docs: clarify volumetric array axis order is WHD (X,Y,Z), not HWD#8826
aymuos15 wants to merge 2 commits intoProject-MONAI:devfrom
aymuos15:fix-8337-whd-docs

Conversation

@aymuos15
Copy link
Copy Markdown
Contributor

@aymuos15 aymuos15 commented Apr 13, 2026

Fixes #8337

Summary

MONAI's NIfTI/ITK readers return ndarrays whose axis 0 is columns/Width (X), axis 1 is rows/Height (Y), and axis 2 is Depth/slices (Z) — see ITKReader._get_array_data in monai/data/image_reader.py which applies .T to the ITK-native (Z,Y,X) array to produce (X,Y,Z); nibabel already serves (X,Y,Z). Several docstrings and one error message described this layout as HWD / CHWD / NCHWD, swapping the spatial meanings of H and W.

This PR updates the user-facing docstrings to use WHD / CWHD / NCWHD and adds a brief per-axis note on each so the convention is unambiguous when read in isolation:

  • monai/data/image_writer.py: H, HW, HWDW, WH, WHD in the resample_if_needed docstring (canonical NIfTI-semantics description in the writer hierarchy).
  • monai/visualize/utils.py: matshow3d volume shape doc.
  • monai/visualize/img2tensorboard.py: docstrings + the AssertionError message in _image3_animated_gif. The TensorBoard API arguments dataformats="HW"/"CHW"/"NCHWT" and the user-visible GIF tag suffix f"{tag}_HWD" are intentionally left unchanged — those are TensorBoard-side conventions / public log keys.
  • monai/handlers/tensorboard_handlers.py: frame_dim docstring on TensorBoardImageHandler.
  • monai/apps/vista3d/inferer.py: point_based_window_inferer input shape doc.

Documentation-only; no runtime behavior is affected.

Out of scope (tracked separately)

  • monai/apps/deepedit/transforms.py uses the CHWD label in 15 places that are tightly coupled to click-coordinate construction ([g[0], g[-2], g[-1], sid]). Renaming the label without verifying the click semantics would create a doc/code mismatch and may mask a real coordinate-order bug; needs a separate code-level review.
  • monai/apps/deepgrow/transforms.py uses CDHW (depth-first), which is a different convention question from the H/W swap addressed here.
  • Tensor-axis docstrings in losses/metrics/inferers (NCHW, NCHWD, BCDHW, BC[HWD], etc.) follow the established PyTorch placeholder convention and are dimension-agnostic; not touched.
  • Detection box-mode enums (XYZWHD, CCCWHD) and anchor flatten counts (HWA, HWDA) define H/W/D explicitly within their own subsystems and are internally consistent; not touched.

…oject-MONAI#8337)

MONAI's NIfTI/ITK readers return ndarrays whose axis 0 is columns/Width
(X), axis 1 is rows/Height (Y), and axis 2 is Depth/slices (Z) — see
`ITKReader._get_array_data` in `monai/data/image_reader.py` which
applies `.T` to the ITK-native (Z,Y,X) array to produce (X,Y,Z), and
nibabel which already serves (X,Y,Z). Several docstrings and one error
message described this layout as `HWD` / `CHWD` / `NCHWD`, which swaps
the spatial meanings of H and W and is misleading to users writing
custom loaders or comparing array shapes against DICOM/NIfTI metadata.

Update the user-facing docstrings to use `WHD` / `CWHD` / `NCWHD` and
add a brief note on each describing the per-axis meaning so the
convention is unambiguous when read in isolation:

- monai/data/image_writer.py: `H, HW, HWD` → `W, WH, WHD` in the
  `resample_if_needed` docstring (the canonical NIfTI-semantics
  description in the writer hierarchy).
- monai/visualize/utils.py: `matshow3d` volume shape doc.
- monai/visualize/img2tensorboard.py: docstrings + the
  `AssertionError` message in `_image3_animated_gif`. The TensorBoard
  API arguments `dataformats="HW"/"CHW"/"NCHWT"` and the user-visible
  GIF tag suffix `f"{tag}_HWD"` are intentionally left unchanged —
  those are TensorBoard-side conventions / public log keys.
- monai/handlers/tensorboard_handlers.py: `frame_dim` docstring on
  `TensorBoardImageHandler`.
- monai/apps/vista3d/inferer.py: `point_based_window_inferer` input
  shape doc.

This change is documentation-only; no runtime behavior is affected.

Out of scope for this commit (tracked separately):

- `monai/apps/deepedit/transforms.py` uses the `CHWD` label in 15
  places that are tightly coupled to click-coordinate construction
  (`[g[0], g[-2], g[-1], sid]`). Renaming the label without verifying
  the click semantics would create a doc/code mismatch and may mask a
  real coordinate-order bug; this needs a separate code-level review.
- `monai/apps/deepgrow/transforms.py` uses `CDHW` (depth-first), which
  is a different convention question from the H/W swap addressed here.
- Tensor-axis docstrings in losses/metrics/inferers (NCHW, NCHWD,
  BCDHW, BC[HWD], etc.) follow the established PyTorch placeholder
  convention and are dimension-agnostic; not touched.
- Detection box-mode enums (`XYZWHD`, `CCCWHD`) and anchor flatten
  counts (`HWA`, `HWDA`) define H/W/D explicitly within their own
  subsystems and are internally consistent; not touched.

Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a1f115f0-dcf9-4127-b7dd-bb6cbc7bb79d

📥 Commits

Reviewing files that changed from the base of the PR and between 4cb8977 and 7c9e273.

📒 Files selected for processing (2)
  • monai/apps/vista3d/inferer.py
  • monai/visualize/utils.py
✅ Files skipped from review due to trivial changes (1)
  • monai/visualize/utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • monai/apps/vista3d/inferer.py

📝 Walkthrough

Walkthrough

This PR updates documentation and assertion messages in five files (monai/apps/vista3d/inferer.py, monai/data/image_writer.py, monai/handlers/tensorboard_handlers.py, monai/visualize/img2tensorboard.py, monai/visualize/utils.py) to correct spatial axis conventions from HWD/H... ordering to WHD/XYX-style ordering (Width, Height, Depth) consistent with NIfTI/ITK reader outputs. Docstrings, parameter descriptions, example shapes, and one assertion message were changed. No function signatures, executable logic, or control flow were modified.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed Title accurately summarizes the main change: correcting volumetric array axis order documentation from HWD to WHD across multiple modules.
Description check ✅ Passed Description meets template requirements with issue reference, detailed summary of changes, types of changes checkbox, and clear rationale for axis convention correction.
Linked Issues check ✅ Passed PR fully addresses issue #8337 by updating docstrings across five modules to correctly describe volumetric array axis order as WHD instead of HWD, with semantic per-axis clarification.
Out of Scope Changes check ✅ Passed All changes are within scope; they are documentation-only updates to clarify axis conventions. Out-of-scope items (deepedit/deepgrow label usages, detection enums, PyTorch-style placeholders) are explicitly enumerated and intentionally excluded.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@monai/apps/vista3d/inferer.py`:
- Around line 48-50: The docstring for the parameter `inputs` incorrectly
describes axis 0/1/2 as Width/Height/Depth; update the `inputs` shape
description (tensor shape `[1, C, W, H, D]`) to state that axis 0 is the batch
dimension, axis 1 is the channel dimension, and spatial axes are axes 2–4
(Width, Height, Depth) to match MONAI/NumPy ordering; edit the `inputs` entry in
inferer.py so it explicitly calls out spatial axes and their meanings (e.g.,
"axes 2–4 are spatial: W, H, D") and ensure wording aligns with other docstrings
in the file.

In `@monai/visualize/utils.py`:
- Around line 68-73: The matshow3d docstring contains minor typos and unclear
phrasing: change the phrase "dim and reshape to (-1, spatial0, spatial1) shape
to construct frames" to a clearer verb phrase such as "transpose and reshape to
(-1, spatial0, spatial1) to construct frames" and fix "dimensionas" to
"dimension as" in the channel_dim description; update the channel_dim sentence
so it reads something like "if not None, explicitly specify the channel
dimension to be transposed to the last dimension as shape (-1, spatial0,
spatial1, C)" and ensure references to frame_dim and batch_dim remain accurate
(function: matshow3d, params: channel_dim, frame_dim, batch_dim).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9c4bcd84-f190-43d4-873e-3f20a6124d99

📥 Commits

Reviewing files that changed from the base of the PR and between 5a2d0a7 and 4cb8977.

📒 Files selected for processing (5)
  • monai/apps/vista3d/inferer.py
  • monai/data/image_writer.py
  • monai/handlers/tensorboard_handlers.py
  • monai/visualize/img2tensorboard.py
  • monai/visualize/utils.py

Comment on lines +48 to +50
inputs: [1CWHD], input image to be processed (axis 0 is columns/Width,
axis 1 is rows/Height, axis 2 is Depth, matching arrays returned by
MONAI's NIfTI/ITK readers).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Correct axis-index wording for inputs shape [1, C, W, H, D].

Line 48-Line 50 currently state axis 0/1/2 as Width/Height/Depth, but for this tensor axis 0 is batch and axis 1 is channel. Please describe spatial axes explicitly.

Proposed doc fix
-        inputs: [1CWHD], input image to be processed (axis 0 is columns/Width,
-            axis 1 is rows/Height, axis 2 is Depth, matching arrays returned by
-            MONAI's NIfTI/ITK readers).
+        inputs: [1CWHD], input image to be processed (spatial axes are in
+            Width, Height, Depth order; i.e., for [N, C, W, H, D], Width is
+            axis 2, Height is axis 3, Depth is axis 4).

As per coding guidelines, "Docstrings should be present for all definition which describe each variable...," which includes correct axis semantics.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@monai/apps/vista3d/inferer.py` around lines 48 - 50, The docstring for the
parameter `inputs` incorrectly describes axis 0/1/2 as Width/Height/Depth;
update the `inputs` shape description (tensor shape `[1, C, W, H, D]`) to state
that axis 0 is the batch dimension, axis 1 is the channel dimension, and spatial
axes are axes 2–4 (Width, Height, Depth) to match MONAI/NumPy ordering; edit the
`inputs` entry in inferer.py so it explicitly calls out spatial axes and their
meanings (e.g., "axes 2–4 are spatial: W, H, D") and ensure wording aligns with
other docstrings in the file.

Comment on lines +68 to +73
the `-3` dimension. dim and reshape to (-1, spatial0, spatial1) shape to construct frames,
default to `-3`.
channel_dim: if not None, explicitly specify the channel dimension to be transposed to the
last dimensionas shape (-1, H, W, C). this can be used to plot RGB color image.
if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as shape (-1, H, W).
last dimensionas shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image.
if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as
shape (-1, spatial0, spatial1).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix docstring typos in matshow3d args.

Line 68 and Line 71 have wording issues ("dim and reshape" and "dimensionas"), which hurt readability.

Proposed doc fix
-        frame_dim: for higher dimensional arrays, which dimension from (`-1`, `-2`, `-3`) is moved to
-            the `-3` dimension. dim and reshape to (-1, spatial0, spatial1) shape to construct frames,
+        frame_dim: for higher dimensional arrays, which dimension from (`-1`, `-2`, `-3`) is moved to
+            the `-3` dimension, then reshaped to (-1, spatial0, spatial1) to construct frames,
             default to `-3`.
         channel_dim: if not None, explicitly specify the channel dimension to be transposed to the
-            last dimensionas shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image.
+            last dimension as shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image.

As per coding guidelines, "Review the Python code for quality and correctness... Suggest any enhancements for code improving ... comprehensibility."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
the `-3` dimension. dim and reshape to (-1, spatial0, spatial1) shape to construct frames,
default to `-3`.
channel_dim: if not None, explicitly specify the channel dimension to be transposed to the
last dimensionas shape (-1, H, W, C). this can be used to plot RGB color image.
if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as shape (-1, H, W).
last dimensionas shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image.
if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as
shape (-1, spatial0, spatial1).
the `-3` dimension, then reshaped to (-1, spatial0, spatial1) to construct frames,
default to `-3`.
channel_dim: if not None, explicitly specify the channel dimension to be transposed to the
last dimension as shape (-1, spatial0, spatial1, C). this can be used to plot RGB color image.
if None, the channel dimension will be flattened with `frame_dim` and `batch_dim` as
shape (-1, spatial0, spatial1).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@monai/visualize/utils.py` around lines 68 - 73, The matshow3d docstring
contains minor typos and unclear phrasing: change the phrase "dim and reshape to
(-1, spatial0, spatial1) shape to construct frames" to a clearer verb phrase
such as "transpose and reshape to (-1, spatial0, spatial1) to construct frames"
and fix "dimensionas" to "dimension as" in the channel_dim description; update
the channel_dim sentence so it reads something like "if not None, explicitly
specify the channel dimension to be transposed to the last dimension as shape
(-1, spatial0, spatial1, C)" and ensure references to frame_dim and batch_dim
remain accurate (function: matshow3d, params: channel_dim, frame_dim,
batch_dim).

- vista3d/inferer.py: clarify spatial-axis indices for [N,C,W,H,D] tensor
  (axes 2-4, not 0-2) in point_based_window_inferer inputs docstring.
- visualize/utils.py: fix matshow3d docstring typos ('dim and reshape' -> 'then
  reshaped'; 'dimensionas' -> 'dimension as').

Signed-off-by: Soumya Snigdha Kundu <soumya_snigdha.kundu@kcl.ac.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MONAI inline comments and documentation mistakenly claim loaded volumetric images are of HWD format, but in fact WHD

1 participant