Modern Data Ecosystem: Support Cloud-Native OME-Zarr I/O #218

@jameslehoux

Labels: ecosystem, io, performance, phase:4-hpc
Priority: Medium (Future-proofing)

Description

While Parallel HDF5 and TIFF support are functional, the global microscopy and tomography communities are rapidly migrating toward chunked, cloud-native storage formats like OME-Zarr.

Zarr's chunked layout maps naturally onto AMReX's internal BoxArray domain decomposition: each chunk, like each box, is a rectangular block of the index space that can be read independently. Supporting OME-Zarr would let users stream multi-billion-voxel datasets directly from cloud object storage (e.g., S3) or distributed file systems, without the bottleneck of a single monolithic file. It would also provide native, zero-copy compatibility with our planned napari integration.
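To make the chunk-to-box correspondence concrete, here is a minimal pure-Python sketch, not a proposed implementation. The helper names `chunk_boxes` and `assign_round_robin` are hypothetical, and the round-robin ownership is an assumption chosen for simplicity rather than the distribution mapping the project would actually use. Bounds are written with an inclusive upper corner, the convention AMReX boxes use.

```python
from itertools import product
from math import ceil


def chunk_boxes(shape, chunks):
    """Yield (lo, hi) index bounds, hi inclusive, for each chunk of a regular grid."""
    counts = [ceil(n / c) for n, c in zip(shape, chunks)]
    for idx in product(*(range(k) for k in counts)):
        lo = tuple(i * c for i, c in zip(idx, chunks))
        # Chunks on the upper edge are clipped to the array extent.
        hi = tuple(min((i + 1) * c, n) - 1 for i, c, n in zip(idx, chunks, shape))
        yield lo, hi


def assign_round_robin(boxes, nranks):
    """Map each rank to the boxes it would read (hypothetical ownership policy)."""
    owners = {rank: [] for rank in range(nranks)}
    for k, box in enumerate(boxes):
        owners[k % nranks].append(box)
    return owners


# Example: a 100 x 64 x 64 volume stored in 64 x 32 x 32 chunks, read by 4 ranks.
boxes = list(chunk_boxes(shape=(100, 64, 64), chunks=(64, 32, 32)))
owners = assign_round_robin(boxes, nranks=4)
```

Each rank would then request only its own index ranges from the store, so no process ever needs the whole volume in memory. Axis-order conversion between the stored array and AMReX is deliberately left out of this sketch.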

Acceptance Criteria

  • Add zarr and ome-zarr to the optional dependencies.
  • Implement a VoxelImage.from_zarr() method that efficiently reads chunked data into the AMReX iMultiFab.
  • Implement a VoxelImage.to_zarr() export function for saving output property fields (e.g., flux maps) efficiently.
  • Benchmark and document the time to load a 10 GB+ dataset via TIFF, Parallel HDF5, and Zarr.
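For the benchmarking criterion, a small timing harness along these lines might be enough to produce the comparison table. This is a sketch under stated assumptions: `benchmark` and `format_table` are hypothetical helpers, and the three loaders are sleep-based stand-ins, not measurements. Real runs would replace them with calls into the existing TIFF and Parallel HDF5 readers and the new Zarr reader.

```python
import time
from statistics import median


def benchmark(loaders, repeats=3):
    """Return {name: median wall-clock seconds} for each zero-argument loader."""
    results = {}
    for name, load in loaders.items():
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            load()
            timings.append(time.perf_counter() - start)
        results[name] = median(timings)
    return results


def format_table(results):
    """Render the results as a plain-text table, fastest loader first."""
    fastest = min(results.values())
    lines = [f"{'format':<16}{'median [s]':>12}{'vs fastest':>12}"]
    for name, secs in sorted(results.items(), key=lambda kv: kv[1]):
        lines.append(f"{name:<16}{secs:>12.3f}{secs / fastest:>11.1f}x")
    return "\n".join(lines)


# Stand-in loaders (assumption): each just sleeps briefly. Swap in the real
# readers before drawing any conclusions from the numbers.
loaders = {
    "TIFF": lambda: time.sleep(0.001),
    "Parallel HDF5": lambda: time.sleep(0.001),
    "Zarr": lambda: time.sleep(0.001),
}

results = benchmark(loaders, repeats=3)
table = format_table(results)
```

When running this against real data, note that repeated reads of the same file are often served from the operating system's page cache, so cold and warm timings should be reported separately.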
