Releases: justapithecus/lode

v0.7.0

07 Feb 03:41
79ddbbd

Codec-agnostic per-file statistics and nil metadata coalescing

Summary

v0.7.0 adds per-file column statistics to manifests, enabling pruning workflows without opening data files, and relaxes nil metadata handling across all write paths: nil metadata now coalesces to empty metadata instead of returning an error.

Highlights

  • Per-file column statistics: New StatisticalCodec and StatisticalStreamEncoder interfaces allow any codec to report per-file column stats (min, max, null count, distinct count) persisted on FileRef
  • Parquet statistics: The Parquet codec implements StatisticalCodec, reporting column-level min/max/null count for all orderable types (int32, int64, float32, float64, string, timestamp)
  • New public types: FileStats, ColumnStats on the public API surface
  • Nil metadata coalescing: Write, StreamWrite, StreamWriteRecords, and Volume.Commit now coalesce nil metadata to Metadata{} instead of returning an error
  • Contract updates: CONTRACT_CORE, CONTRACT_WRITE_API, CONTRACT_VOLUME, and CONTRACT_PARQUET updated to reflect new semantics
  • 14 new stats tests and 4 updated coalescing tests with full traceability matrix coverage
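The new interfaces and stats types might look roughly like the following self-contained sketch. The field and method names beyond those stated in the notes (min, max, null count, distinct count; the `FileStats` and `ColumnStats` type names) are assumptions, not the actual lode API:

```go
package main

import "fmt"

// ColumnStats carries the per-column values the release notes describe:
// min, max, null count, and distinct count. Field names are assumed.
type ColumnStats struct {
	Min, Max      any
	NullCount     int64
	DistinctCount int64
}

// FileStats aggregates column stats for one data file, as persisted on
// a FileRef in the manifest.
type FileStats struct {
	Columns map[string]ColumnStats
}

// StatisticalCodec sketches the opt-in interface: a codec that can
// report stats for the file it just encoded. Method signature assumed.
type StatisticalCodec interface {
	Stats() (FileStats, bool)
}

func main() {
	s := FileStats{Columns: map[string]ColumnStats{
		"ts": {Min: int64(100), Max: int64(900), NullCount: 0, DistinctCount: 42},
	}}
	// A pruning workflow can skip a file whose max falls below a query
	// bound without ever opening the data file itself.
	c := s.Columns["ts"]
	fmt.Println(c.Max.(int64) >= 500) // true: file may contain matching rows
}
```

Because the stats live on the manifest's `FileRef` entries, a planner only needs manifest reads to decide which files to fetch.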

Upgrade Notes

  • Callers that previously passed Metadata{} solely to avoid nil errors can now pass nil safely
  • Callers that relied on nil metadata returning an error should remove that expectation
  • Per-file stats are opt-in: only codecs implementing StatisticalCodec produce them; manifests without stats remain valid
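The coalescing behavior can be pictured with a minimal sketch. The concrete `Metadata` type here (a string map) is an assumption; the notes only state that nil now coalesces to `Metadata{}`:

```go
package main

import "fmt"

// Metadata's concrete type is assumed here for illustration; the
// release notes only state that nil coalesces to Metadata{}.
type Metadata map[string]string

// coalesce mirrors the new write-path behavior: a nil value becomes an
// empty one instead of producing an error.
func coalesce(md Metadata) Metadata {
	if md == nil {
		return Metadata{}
	}
	return md
}

func main() {
	md := coalesce(nil)
	fmt.Println(md != nil, len(md)) // true 0
}
```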

References

  • Per-file statistics: #103
  • Nil metadata coalescing + housekeeping: #104

Full Changelog: v0.6.0...v0.7.0

What's Changed

  • docs(agents): 📝 enhance AGENTS.md with Go style and composition guardrails by @justapithecus in #102
  • feat(manifest): ✨ add per-file column statistics for codec-agnostic pruning by @justapithecus in #103
  • chore(api): 🩹 post-stats housekeeping and nil metadata coalescing by @justapithecus in #104
  • docs: 📝 backfill CHANGELOG for v0.6.0 and v0.7.0 by @justapithecus in #105

v0.6.0

07 Feb 00:50
1145ee3

Dual persistence: Dataset + Volume

Summary

Introduces Volume as a second first-class persistence paradigm alongside Dataset. Volume provides sparse, resumable, range-addressable byte-space persistence with incremental commits and overflow-safe arithmetic throughout.

Highlights

  • NewVolume constructor with VolumeID, TotalLength, and optional WithVolumeChecksum
  • StageWriteAt / Commit / ReadAt for incremental block-level persistence
  • Cumulative snapshot manifests with strict overlap validation
  • Latest / Snapshots / Snapshot for Volume history access
  • ErrRangeMissing and ErrOverlappingBlocks error sentinels
  • Overflow-safe bounds checks across all Volume code paths
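The overflow-safe bounds checks called out above typically avoid computing `offset + length` directly, since that sum can wrap around for large unsigned inputs. A minimal sketch of the pattern (function name hypothetical, not the lode internal):

```go
package main

import "fmt"

// rangeInBounds reports whether [offset, offset+length) fits inside a
// volume of totalLength bytes. It never computes offset+length, which
// could wrap for large uint64 inputs; comparing against
// totalLength-length is safe once length <= totalLength is established.
func rangeInBounds(offset, length, totalLength uint64) bool {
	if length > totalLength {
		return false
	}
	return offset <= totalLength-length
}

func main() {
	fmt.Println(rangeInBounds(0, 10, 10)) // true
	fmt.Println(rangeInBounds(5, 10, 10)) // false
	// a naive offset+length would wrap to 4 here and wrongly pass
	fmt.Println(rangeInBounds(^uint64(0), 5, 100)) // false
}
```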

Breaking Changes

  • Snapshot → DatasetSnapshot, SnapshotID → DatasetSnapshotID
  • Reader → DatasetReader, NewReader → NewDatasetReader
  • SegmentRef → ManifestRef (Dataset), BlockRef (Volume)
  • ListSegments → ListManifests, SegmentListOptions → ManifestListOptions
  • ErrOverlappingSegments → ErrOverlappingBlocks
  • ErrOptionNotValidForReader → ErrOptionNotValidForDatasetReader

Upgrade Notes

  • All Dataset type renames are mechanical find-and-replace in consuming code
  • Volume is entirely additive — no changes needed if you only use Dataset
  • Volume uses a fixed internal layout; the Layout abstraction remains Dataset-specific

References

  • docs/contracts/CONTRACT_VOLUME.md
  • docs/contracts/CONTRACT_WRITE_API.md (concurrency matrices)
  • docs/contracts/CONTRACT_ERRORS.md (new sentinels)
  • examples/volume_sparse/

Full Changelog: v0.5.0...v0.6.0

What's Changed

  • docs(roadmap): 📝 add v0.6 volume contract + api plan by @justapithecus in #92
  • refactor(api): ♻️ rename Dataset/Reader types and add Volume type definitions by @justapithecus in #94
  • feat(volume): ✨ implement core Volume persistence by @justapithecus in #95
  • test(volume): ✅ add comprehensive Volume test suite by @justapithecus in #96
  • feat(examples): ✨ add sparse Volume ranges example by @justapithecus in #97
  • docs(volume): 📝 finalize Volume documentation and README refresh by @justapithecus in #98
  • docs: 📝 pre-v0.6.0 documentation audit fixes by @justapithecus in #99
  • fix(volume): 🐛 use overflow-safe arithmetic in bounds checks by @justapithecus in #100
  • docs(roadmap): 📝 mark Parquet and Volume Phase 6 deliverables complete by @justapithecus in #101

v0.5.0

06 Feb 04:18
d09ea08

Parquet codec and schema‑validated columnar storage

Summary

Adds a Parquet codec with explicit schema validation and new Parquet examples, plus dedicated Parquet error sentinels.

Highlights

  • Introduces NewParquetCodec(schema, opts...) with schema validation at construction time.
  • Adds Parquet schema/types (ParquetSchema, ParquetField, ParquetType) and compression options.
  • Adds examples/parquet/ and Parquet-specific error sentinels.

Breaking Changes

  • NewParquetCodec now returns (Codec, error) instead of Codec.

Upgrade Notes

  • Parquet is non‑streaming: StreamWriteRecords returns ErrCodecNotStreamable; use Dataset.Write.
  • When using Parquet, set the Lode compressor to NewNoOpCompressor() to avoid double compression.
  • Invalid Parquet schemas now fail at construction time.

References

  • docs/contracts/CONTRACT_PARQUET.md
  • docs/contracts/CONTRACT_ERRORS.md

Full Changelog: v0.4.1...v0.5.0

v0.4.1

05 Feb 13:55
7a7fea9

S3 multipart atomicity hardening

Summary

Hardens atomic no‑overwrite semantics for large S3 multipart uploads and documents backend compatibility for multipart completion behavior.

Highlights

  • Adds a documented S3 backend compatibility matrix for multipart completion behavior.
  • Uses conditional CompleteMultipartUpload for large uploads to close the TOCTOU (time-of-check-to-time-of-use) window.

Known Limitations

  • Atomic no‑overwrite for large uploads is verified on AWS S3 and assumed, but untested, on other S3‑compatible backends.

References

  • PUBLIC_API.md
  • docs/contracts/

Full Changelog: v0.4.0...v0.4.1

v0.4.0

05 Feb 04:38
c1cafb8

Compression and example contract alignment

Summary

Adds Zstd compression support, formalizes example conventions, and updates streaming API parameter order.

Highlights

  • Adds NewZstdCompressor() for improved compression and decompression performance.
  • Adds docs/contracts/CONTRACT_EXAMPLES.md and updates agent conventions.
  • Normalizes example variable naming and reorganizes documentation.

Breaking Changes

  • StreamWriteRecords parameter order changed to (ctx, records, metadata).

Known Limitations

  • Context cancellation cleanup remains best‑effort due to storage adapter timing characteristics.

Upgrade Notes

  • Update all StreamWriteRecords callsites to ds.StreamWriteRecords(ctx, iter, metadata).

References

  • PUBLIC_API.md
  • docs/contracts/CONTRACT_EXAMPLES.md
  • docs/contracts/

Full Changelog: v0.3.0...v0.4.0

v0.3.0

05 Feb 01:55
19714f5

Docs, examples, and test coverage expansion

Summary

Major documentation, examples, and contract alignment improvements plus additional streaming failure tests.

Highlights

  • Adds a README quick start, write API decision table, and guarantees/gotchas sections.
  • Adds an examples index and option applicability matrix in PUBLIC_API.md.
  • Adds a complete sentinel error table and streaming constraints guidance.
  • Adds streaming failure tests for commit/abort/error semantics.

Known Limitations

  • Context cancellation cleanup remains best‑effort due to storage adapter timing characteristics.

Upgrade Notes

  • No API changes; documentation and test coverage improvements only.

References

  • PUBLIC_API.md
  • docs/contracts/CONTRACT_TEST_MATRIX.md
  • docs/contracts/

Full Changelog: v0.2.0...v0.3.0

v0.2.0

03 Feb 13:47
7b8ec37

Public S3 adapter and release tracking

Summary

Introduces the S3 adapter as a public API and adds S3 usage documentation.

Highlights

  • Adds the lode/s3 adapter to the public API.
  • Adds S3 examples for AWS S3, MinIO, LocalStack, and Cloudflare R2.
  • Adds CHANGELOG.md for release tracking.
  • Switches project license to Apache 2.0.

Known Limitations

  • Single‑writer semantics only.
  • Large uploads (>5GB on S3) have a TOCTOU window for no‑overwrite.
  • Cleanup of partial objects is best‑effort.

Upgrade Notes

  • Update import paths from internal/ to lode/s3 if using the experimental adapter.

References

  • PUBLIC_API.md
  • examples/s3_experimental/

Full Changelog: v0.1.0...v0.2.0

v0.1.0

03 Feb 12:57
5230f31

Initial immutable dataset release

Summary

Initial public release with immutable datasets, snapshots, and explicit metadata.

Highlights

  • Public API for datasets and readers with immutable snapshots and manifests.
  • Default, Hive, and Flat layouts with enumeration and partition pruning.
  • Filesystem and in‑memory storage adapters.

Full Changelog: https://github.com/justapithecus/lode/commits/v0.1.0