
Releases: ICICLE-ai/ArrayMorph

v0.2.0-beta.4

27 Feb 23:18
69e06db


v0.2.0-beta.4 Pre-release

What's Changed

Full Changelog: v0.2.0-beta.3...v0.2.0-beta.4

v0.2.0-beta.3

27 Feb 22:29
2ba6860


v0.2.0-beta.3 Pre-release

What's Changed

Full Changelog: v0.2.0-beta.2...v0.2.0-beta.3

v0.2.0-beta.2

27 Feb 17:51
2fd5089


v0.2.0-beta.2 Pre-release

What's Changed

Full Changelog: v0.2.0-beta.1...v0.2.0-beta.2

v0.2.0-beta.1

26 Feb 17:56
08970dd


v0.2.0-beta.1 Pre-release

⚠️ Pre-release — API may change. Feedback welcome via [GitHub Issues](https://github.com/ICICLE-ai/ArrayMorph/issues).

What is ArrayMorph?

ArrayMorph is an HDF5 Virtual Object Layer (VOL) connector that lets you read and write HDF5 datasets directly to cloud object storage — AWS S3 and Azure Blob Storage — without changing your existing h5py code. Configure your storage backend, enable the plugin, and your existing h5py.File() calls transparently go to the cloud.

Highlights

  • Drop-in h5py integration — arraymorph.enable() registers the VOL connector. All existing h5py code works as-is.
  • AWS S3 + Azure Blob — read and write HDF5 datasets to either provider.
  • Pre-built wheels — pip install on Linux (x86_64, aarch64) and macOS (arm64) for Python 3.9–3.14.
  • Standalone native binary — download lib_arraymorph.so / .dylib from the release page and use directly with any HDF5 application via HDF5_PLUGIN_PATH.

Install

Via pip (recommended)

```shell
pip install arraymorph
```

Wheels are available for:

| Platform | Architecture | Python |
| --- | --- | --- |
| Linux | x86_64 | 3.9 – 3.14 |
| Linux | aarch64 | 3.9 – 3.14 |
| macOS | arm64 (Apple Silicon) | 3.9 – 3.14 |

Standalone native library

Download the binary for your platform from the [release assets](#assets) below:

| File | Platform |
| --- | --- |
| lib_arraymorph-linux-x86_64.so | Linux x86_64 |
| lib_arraymorph-linux-aarch64.so | Linux ARM64 |
| lib_arraymorph-macos-arm64.dylib | macOS Apple Silicon |
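If you script the download, the table above maps one-to-one onto Python's `platform` module values. A small sketch (the helper name and the mapping dict are illustrative, not part of ArrayMorph):

```python
import platform

# Release assets from the table above, keyed by (platform.system(), platform.machine())
ASSETS = {
    ("Linux", "x86_64"): "lib_arraymorph-linux-x86_64.so",
    ("Linux", "aarch64"): "lib_arraymorph-linux-aarch64.so",
    ("Darwin", "arm64"): "lib_arraymorph-macos-arm64.dylib",
}

def asset_name(system=None, machine=None):
    """Return the release asset matching a (system, machine) pair."""
    system = system or platform.system()
    machine = machine or platform.machine()
    try:
        return ASSETS[(system, machine)]
    except KeyError:
        raise RuntimeError(f"no prebuilt binary for {system}/{machine}")
```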

Then tell HDF5 where to find it:

```shell
export HDF5_PLUGIN_PATH=/path/to/directory/containing/lib_arraymorph
export HDF5_VOL_CONNECTOR="arraymorph"
```

Works with any HDF5 application — not just h5py.

Build Pipeline

This release introduces a fully automated CI/CD pipeline:

  • 18 wheel builds — 3 platforms × 6 Python versions, all built in parallel
  • Conan for C++ dependency management (AWS SDK, Azure SDK, OpenSSL, curl)
  • scikit-build-core drives CMake from pyproject.toml
  • auditwheel / delocate repairs wheels for PyPI compliance (manylinux_2_28)
  • TestPyPI gate — wheels are published and smoke-tested on TestPyPI before reaching real PyPI
  • Dynamic versioning via setuptools-scm — version derived from git tags, no manual syncing
  • Conan cache shared across Python versions per platform for faster CI runs
  • Standalone binaries with corrected HDF5 rpaths attached to each GitHub release
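To illustrate the tag-derived versioning: a tag like v0.2.0-beta.4 normalizes to the PEP 440 form 0.2.0b4. A simplified sketch of that mapping (not the actual setuptools-scm logic, which also encodes commit distance and hash for untagged builds):

```python
import re

def version_from_tag(tag):
    """Normalize a release tag like 'v0.2.0-beta.4' to PEP 440 ('0.2.0b4')."""
    m = re.fullmatch(r"v(\d+\.\d+\.\d+)(?:-beta\.(\d+))?", tag)
    if not m:
        raise ValueError(f"unrecognized tag: {tag}")
    base, beta = m.groups()
    # A plain release tag ('v0.2.0') maps straight to its base version
    return f"{base}b{beta}" if beta else base
```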

Dependencies

Runtime

  • h5py >= 3.11.0 — provides the HDF5 shared library at runtime

Build (handled automatically)

  • AWS SDK for C++ (S3 only)
  • Azure SDK for C++ (Storage Blobs only)
  • OpenSSL
  • libcurl
  • HDF5 headers

Usage

AWS S3

```python
import arraymorph
import h5py

arraymorph.configure_s3(
    bucket="my-bucket",
    access_key="my-access-key",
    secret_key="my-secret-key",
    region="us-east-2",
)
arraymorph.enable()

# Just use the filename — bucket is configured above
with h5py.File("data.h5", "r") as f:
    data = f["dataset"][:]
```

S3-compatible stores (MinIO, Ceph, Garage, etc.)

```python
arraymorph.configure_s3(
    bucket="playgrounds",
    access_key="my-access-key",
    secret_key="my-secret-key",
    endpoint="http://localhost:3900",
    region="garage",
    use_tls=True,
    addressing_style=True,       # path-style: endpoint/bucket/key
    use_signed_payloads=True,    # some stores require this
)
arraymorph.enable()
```
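Path-style addressing (the addressing_style comment above) puts the bucket in the URL path rather than in the hostname, which is what most self-hosted stores expect. A quick plain-Python illustration (the URLs and helper are hypothetical, not ArrayMorph internals):

```python
def object_url(endpoint, bucket, key, path_style=True):
    """Build the request URL for an object under each S3 addressing style."""
    if path_style:
        # path-style: endpoint/bucket/key
        return f"{endpoint}/{bucket}/{key}"
    # virtual-hosted style: the bucket becomes part of the hostname
    scheme, host = endpoint.split("://", 1)
    return f"{scheme}://{bucket}.{host}/{key}"
```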

Azure Blob Storage

```python
arraymorph.configure_azure(
    container="my-container",
    connection_string="DefaultEndpointsProtocol=https;AccountName=...",
)
arraymorph.enable()
```
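Azure connection strings are semicolon-separated key=value pairs. A small sketch of how one breaks down (the account name and key below are made up for illustration):

```python
def parse_connection_string(cs):
    """Split an Azure-style connection string into a dict of its fields."""
    parts = {}
    for field in cs.strip(";").split(";"):
        # partition on the first '=' so base64 values with '=' padding survive
        key, _, value = field.partition("=")
        parts[key] = value
    return parts

conn = ("DefaultEndpointsProtocol=https;"
        "AccountName=example;"
        "AccountKey=abc123==;"
        "EndpointSuffix=core.windows.net")
```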

Environment variables (standalone binary or non-Python)

If you're using the native library outside of Python, configure via environment variables directly:

```shell
# S3 example
export STORAGE_PLATFORM=S3
export BUCKET_NAME=playgrounds
export AWS_ENDPOINT_URL_S3=http://localhost:3900
export AWS_S3_ADDRESSING_STYLE=path
export AWS_ACCESS_KEY_ID=my-access-key
export AWS_SECRET_ACCESS_KEY=my-secret-key
export AWS_REGION=us-east-2
export AWS_USE_TLS=true

# Point HDF5 to the plugin
export HDF5_VOL_CONNECTOR=arraymorph
export HDF5_PLUGIN_PATH=/path/to/directory/containing/lib_arraymorph

# Linux: make HDF5 discoverable
export LD_LIBRARY_PATH=/path/to/hdf5/lib
# macOS: make HDF5 discoverable
# export DYLD_LIBRARY_PATH=/path/to/hdf5/lib

python my_script.py
```

An example environment file is included in the repo as a starting point — see env-example.txt.
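If you want to load such a file from Python before importing h5py, a minimal hedged loader (assuming simple KEY=value lines, the usual .env convention; comments and blank lines are skipped):

```python
import os

def load_env_file(path):
    """Load KEY=value lines from a .env-style file into os.environ."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()
```

Note this must run before the HDF5 library is loaded, since HDF5_VOL_CONNECTOR and HDF5_PLUGIN_PATH are read at that point.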

Known Limitations

  • The write path is functional but less tested than the read path
  • The standalone .so/.dylib requires HDF5 to be discoverable via LD_LIBRARY_PATH / DYLD_LIBRARY_PATH

What's Next

  • Performance benchmarks
  • Expanded test coverage for write operations

Full Changelog: https://github.com/ICICLE-ai/ArrayMorph/commits/v0.2.0-beta.1

Update CMake minimum version

18 Jun 19:51


Pre-release

Updates the CMake minimum version from 3.0 to 3.5 to address compatibility issues with newer CMake releases.

Initial Release

18 Jun 16:35


Initial Release Pre-release

ArrayMorph First Release

ArrayMorph is a tool for efficiently managing array data stored on cloud object storage. It supports both the HDF5 C++ API and the h5py API; data read through the h5py API is returned as NumPy arrays, so users can access array data stored in the cloud and feed it into machine learning pipelines seamlessly.