Releases: ICICLE-ai/ArrayMorph
v0.2.0-beta.4
v0.2.0-beta.3
v0.2.0-beta.2
What's Changed
- Migrating to VCPKG to support by @guzman109 in #6
Full Changelog: v0.2.0-beta.1...v0.2.0-beta.2
v0.2.0-beta.1
⚠️ Pre-release — API may change. Feedback welcome via [GitHub Issues](https://github.com/ICICLE-ai/ArrayMorph/issues).
What is ArrayMorph?
ArrayMorph is an HDF5 Virtual Object Layer (VOL) connector that lets you read and write HDF5 datasets directly to cloud object storage — AWS S3 and Azure Blob Storage — without changing your existing h5py code. Configure your storage backend, enable the plugin, and your existing h5py.File() calls transparently go to the cloud.
Highlights
- Drop-in h5py integration: `arraymorph.enable()` registers the VOL connector. All existing h5py code works as-is.
- AWS S3 + Azure Blob: read and write HDF5 datasets to either provider.
- Pre-built wheels: `pip install` on Linux (x86_64, aarch64) and macOS (arm64) for Python 3.9–3.14.
- Standalone native binary: download `lib_arraymorph.so` / `.dylib` from the release page and use it directly with any HDF5 application via `HDF5_PLUGIN_PATH`.
Install
Via pip (recommended)
pip install arraymorph
Wheels are available for:
| Platform | Architecture | Python |
|---|---|---|
| Linux | x86_64 | 3.9 – 3.14 |
| Linux | aarch64 | 3.9 – 3.14 |
| macOS | arm64 (Apple Silicon) | 3.9 – 3.14 |
Standalone native library
Download the binary for your platform from the [release assets](#assets) below:
| File | Platform |
|---|---|
| `lib_arraymorph-linux-x86_64.so` | Linux x86_64 |
| `lib_arraymorph-linux-aarch64.so` | Linux ARM64 |
| `lib_arraymorph-macos-arm64.dylib` | macOS Apple Silicon |
Then tell HDF5 where to find it:
export HDF5_PLUGIN_PATH=/path/to/directory/containing/lib_arraymorph
export HDF5_VOL_CONNECTOR="arraymorph"
Works with any HDF5 application, not just h5py.
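Choosing the right asset for the current machine can be scripted. The sketch below is illustrative only: the file names come from the table above, while the `ASSETS` mapping and the `asset_for` helper are hypothetical names and not part of ArrayMorph. It assumes `platform.system()` and `platform.machine()` report `Linux`/`Darwin` and `x86_64`/`aarch64`/`arm64`, which is the usual case on these platforms.

```python
import platform

# File names copied from the release-asset table; the lookup itself is
# a hypothetical helper, not something ArrayMorph ships.
ASSETS = {
    ("Linux", "x86_64"): "lib_arraymorph-linux-x86_64.so",
    ("Linux", "aarch64"): "lib_arraymorph-linux-aarch64.so",
    ("Darwin", "arm64"): "lib_arraymorph-macos-arm64.dylib",
}


def asset_for(system=None, machine=None):
    """Return the release asset name for a platform, or None if unsupported."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return ASSETS.get((system, machine))


if __name__ == "__main__":
    # Prints the asset name for the machine running this script (or None).
    print(asset_for())
```

A `None` result means no pre-built binary is listed for that platform in this release.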
Build Pipeline
This release introduces a fully automated CI/CD pipeline:
- 18 wheel builds — 3 platforms × 6 Python versions, all built in parallel
- Conan for C++ dependency management (AWS SDK, Azure SDK, OpenSSL, curl)
- scikit-build-core drives CMake from `pyproject.toml`
- auditwheel / delocate repairs wheels for PyPI compliance (manylinux_2_28)
- TestPyPI gate — wheels are published and smoke-tested on TestPyPI before reaching real PyPI
- Dynamic versioning via setuptools-scm — version derived from git tags, no manual syncing
- Conan cache shared across Python versions per platform for faster CI runs
- Standalone binaries with corrected HDF5 rpaths attached to each GitHub release
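The figure of 18 wheel builds is the cross product of the supported platforms and Python versions. The sketch below just enumerates that matrix; the platform labels and the `build_matrix` helper are made up for illustration and do not reflect the actual CI configuration.

```python
from itertools import product

# Platform labels are illustrative; the real CI job names may differ.
PLATFORMS = ["linux-x86_64", "linux-aarch64", "macos-arm64"]
# Python 3.9 through 3.14 inclusive: six minor versions.
PYTHONS = [f"3.{minor}" for minor in range(9, 15)]


def build_matrix():
    """Return one entry per (platform, Python version) wheel build."""
    return [
        {"platform": plat, "python": py}
        for plat, py in product(PLATFORMS, PYTHONS)
    ]


if __name__ == "__main__":
    matrix = build_matrix()
    print(len(matrix))  # 18 = 3 platforms x 6 Python versions
```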
Dependencies
Runtime
- `h5py >= 3.11.0`: provides the HDF5 shared library at runtime
Build (handled automatically)
- AWS SDK for C++ (S3 only)
- Azure SDK for C++ (Storage Blobs only)
- OpenSSL
- libcurl
- HDF5 headers
Usage
AWS S3
import arraymorph
import h5py
arraymorph.configure_s3(
    bucket="my-bucket",
    access_key="my-access-key",
    secret_key="my-secret-key",
    region="us-east-2",
)
arraymorph.enable()
# Just use the filename; the bucket is configured above
with h5py.File("data.h5", "r") as f:
    data = f["dataset"][:]
S3-compatible stores (MinIO, Ceph, Garage, etc.)
arraymorph.configure_s3(
    bucket="playgrounds",
    access_key="my-access-key",
    secret_key="my-secret-key",
    endpoint="http://localhost:3900",
    region="garage",
    use_tls=True,
    addressing_style=True,  # path-style: endpoint/bucket/key
    use_signed_payloads=True,  # some stores require this
)
arraymorph.enable()
Azure Blob Storage
arraymorph.configure_azure(
    container="my-container",
    connection_string="DefaultEndpointsProtocol=https;AccountName=...",
)
arraymorph.enable()
Environment variables (standalone binary or non-Python)
If you're using the native library outside of Python, configure via environment variables directly:
# S3 example
export STORAGE_PLATFORM=S3
export BUCKET_NAME=playgrounds
export AWS_ENDPOINT_URL_S3=http://localhost:3900
export AWS_S3_ADDRESSING_STYLE=path
export AWS_ACCESS_KEY_ID=my-access-key
export AWS_SECRET_ACCESS_KEY=my-secret-key
export AWS_REGION=us-east-2
export AWS_USE_TLS=true
# Point HDF5 to the plugin
export HDF5_VOL_CONNECTOR=arraymorph
export HDF5_PLUGIN_PATH=/path/to/directory/containing/lib_arraymorph
# Linux: make HDF5 discoverable
export LD_LIBRARY_PATH=/path/to/hdf5/lib
# macOS: make HDF5 discoverable
# export DYLD_LIBRARY_PATH=/path/to/hdf5/lib
python my_script.py
A .env file is included in the repo as a starting point; see env-example.txt.
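If you would rather load such a file from Python than `source` it in a shell, a minimal parser could look like the sketch below. It assumes the file contains plain `KEY=VALUE` lines, optionally prefixed with `export`, as in the shell example above. The actual layout of env-example.txt is not shown here, so treat the format as an assumption. `parse_env_lines` and `load_env_file` are hypothetical helpers, not part of ArrayMorph.

```python
import os


def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments.

    Accepts an optional leading 'export ' and strips one pair of matching
    quotes. Deliberately minimal: no variable expansion, no multi-line values.
    """
    env = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; ignore
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        env[key.strip()] = value
    return env


def load_env_file(path, environ=os.environ):
    """Read a .env-style file and copy its variables into `environ`."""
    with open(path, encoding="utf-8") as fh:
        parsed = parse_env_lines(fh)
    environ.update(parsed)
    return parsed
```

HDF5 typically reads `HDF5_VOL_CONNECTOR` and `HDF5_PLUGIN_PATH` when the library initializes, so a call such as `load_env_file(".env")` would most likely need to run before `h5py` is imported.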
Known Limitations
- Write path is functional but less tested than read path
- The standalone `.so` / `.dylib` requires HDF5 to be discoverable via `LD_LIBRARY_PATH` / `DYLD_LIBRARY_PATH`
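For the second limitation, a quick pre-flight check can confirm that an HDF5 shared library is actually present in the directories named by those variables. The sketch below is a hypothetical helper, not something ArrayMorph provides. It only matches file names beginning with `libhdf5`, so it will also report companion libraries such as the high-level one, and it does not verify that the library version is compatible.

```python
import os
from pathlib import Path


def is_hdf5_library(name):
    """True for names like libhdf5.so, libhdf5.so.310 or libhdf5.dylib."""
    return name.startswith("libhdf5") and (".so" in name or name.endswith(".dylib"))


def find_hdf5_on_library_path(environ=os.environ):
    """Return HDF5 shared libraries found via LD_LIBRARY_PATH / DYLD_LIBRARY_PATH."""
    hits = []
    for var in ("LD_LIBRARY_PATH", "DYLD_LIBRARY_PATH"):
        for entry in environ.get(var, "").split(os.pathsep):
            if not entry:
                continue
            directory = Path(entry)
            if not directory.is_dir():
                continue
            for candidate in sorted(directory.iterdir()):
                if is_hdf5_library(candidate.name):
                    hits.append(candidate)
    return hits


if __name__ == "__main__":
    found = find_hdf5_on_library_path()
    if found:
        for path in found:
            print(path)
    else:
        print("No libhdf5 found on LD_LIBRARY_PATH / DYLD_LIBRARY_PATH")
```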
What's Next
- Performance benchmarks
- Expanded test coverage for write operations
Full Changelog: https://github.com/ICICLE-ai/ArrayMorph/commits/v0.2.0-beta.1
Update cmake minimum version
Update the CMake minimum version from 3.0 to 3.5 to address some compatibility issues.
Initial Release
ArrayMorph First Release
ArrayMorph is software for efficiently managing array data stored in cloud object storage. It supports both the HDF5 C++ API and the h5py API. The h5py API returns data as NumPy arrays, so users can read array data stored in the cloud and feed it directly into machine learning pipelines.