Merged
2 changes: 1 addition & 1 deletion README.md
@@ -309,7 +309,7 @@ Each example is self-contained and runnable. See the example source for detailed

## Status

-Lode is at **v0.5.0** and under active development.
+Lode is at **v0.7.0** and under active development.
APIs are stabilizing; some changes are possible before v1.0.

If you are evaluating Lode, focus on:
4 changes: 2 additions & 2 deletions docs/IMPLEMENTATION_PLAN.md
@@ -203,7 +203,7 @@ Explore new adapters or codecs without expanding the public API.
- [x] CONTRACT_STORAGE.md compliance verified
- [x] Zstd compressor added as an additive compression option
- [x] Parquet codec implemented
-- [ ] Manifest stats extensions finalized (additive)
+- [x] Manifest stats extensions finalized (additive)

### S3 Adapter

@@ -262,7 +262,7 @@ Any change that affects contract behavior must:
### Priority Track B — Format and Ecosystem

- [x] Prioritize Parquet codec delivery
-- [ ] Define additive manifest stats needed for Parquet-oriented pruning workflows
+- [x] Define additive manifest stats needed for Parquet-oriented pruning workflows
- [x] Add/refresh examples for columnar and streaming workflows

### Priority Track C — Zarr/Xarray Direction
4 changes: 2 additions & 2 deletions docs/contracts/CONTRACT_CORE.md
@@ -84,10 +84,10 @@ Manifests are immutable once written.
## Metadata Rules

- Metadata MUST be explicit on every snapshot.
-- `nil` metadata is invalid and MUST error.
+- `nil` metadata is coalesced to empty (`{}`) at the API boundary.
- Empty metadata (`{}`) is valid and MUST be persisted as an explicit object.
- Metadata values MUST be JSON-serializable.
-- Metadata MUST never be inferred or defaulted.
+- Manifests always contain a non-nil metadata map.
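
The distinction between `nil` and empty metadata matters because the two serialize differently in Go: a nil map marshals to `null`, while an empty map marshals to `{}`. A minimal sketch, assuming `Metadata` is a map type; the `coalesceMetadata` and `encode` helpers are hypothetical names used only for illustration and do not come from the Lode source:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metadata is assumed to be map-shaped; the exact underlying type is a guess.
type Metadata map[string]any

// coalesceMetadata (hypothetical helper) turns nil into an empty, non-nil map
// and returns any other value unchanged.
func coalesceMetadata(m Metadata) Metadata {
	if m == nil {
		return Metadata{}
	}
	return m
}

// encode returns the JSON form of m and panics on a marshalling error.
func encode(m Metadata) string {
	b, err := json.Marshal(m)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	var m Metadata // nil map

	fmt.Println(encode(m))                   // null
	fmt.Println(encode(coalesceMetadata(m))) // {}
}
```

Coalescing at the API boundary means the persisted manifest always carries an explicit object, which is what the "non-nil metadata map" rule asks for.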

---

1 change: 1 addition & 0 deletions docs/contracts/CONTRACT_ERRORS.md
@@ -79,6 +79,7 @@ These indicate a manifest fails structural or semantic validation.
- `ParentSnapshotID`: May be empty for first snapshot
- `MinTimestamp`, `MaxTimestamp`: May be nil when not applicable
- `Checksum` in `FileRef`: May be empty
+- `Stats` in `FileRef`: May be nil (omitted when codec does not report statistics)
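
One common way to get "omitted when nil" behavior in Go is a pointer field tagged with `omitempty`. The sketch below assumes that approach; the struct fields and JSON key names are hypothetical and are not taken from Lode's manifest schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FileStats and FileRef are simplified stand-ins with assumed field names.
type FileStats struct {
	RowCount int64 `json:"row_count"`
}

type FileRef struct {
	Path  string     `json:"path"`
	Stats *FileStats `json:"stats,omitempty"` // nil pointer is left out of the JSON
}

// encodeRef marshals a FileRef and panics on error.
func encodeRef(r FileRef) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decodeRef unmarshals a FileRef and panics on error.
func decodeRef(s string) FileRef {
	var r FileRef
	if err := json.Unmarshal([]byte(s), &r); err != nil {
		panic(err)
	}
	return r
}

func main() {
	fmt.Println(encodeRef(FileRef{Path: "a.jsonl"}))
	fmt.Println(encodeRef(FileRef{Path: "a.parquet", Stats: &FileStats{RowCount: 3}}))

	// A manifest written before the field existed decodes with Stats == nil.
	old := decodeRef(`{"path":"old.jsonl"}`)
	fmt.Println(old.Stats == nil)
}
```

With this shape, older manifests that lack the key still decode cleanly, which is the property an additive field needs.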

**File Validation**:
- Each `FileRef.Path` must be non-empty
29 changes: 24 additions & 5 deletions docs/contracts/CONTRACT_TEST_MATRIX.md
@@ -58,10 +58,13 @@ Gaps are tracked with codes indicating category and priority:
| ParentSnapshotID | First snapshot tests (no parent) |
| MinTimestamp/MaxTimestamp | `TestDataset_Write_NonTimestampedRecords_OmitsMinMax` |
| Checksum | `TestDataset_Write_WithoutChecksum_OmitsChecksum` |
+| FileRef.Stats (present) | `TestDataset_Write_ParquetCodec_StatsPopulated` |
+| FileRef.Stats (absent) | `TestDataset_Write_JSONLCodec_StatsNil`, `TestDataset_Write_RawBlob_StatsNil` |
+| FileRef.Stats (JSON round-trip) | `TestFileRef_Stats_JSONRoundTrip`, `TestFileRef_Stats_BackwardCompat`, `TestFileRef_Stats_OmittedWhenNil` |

**Metadata Rules**: All covered ✅

-- nil metadata rejected: `TestDataset_Write_NilMetadata_ReturnsError`, `TestDataset_StreamWrite_NilMetadata_ReturnsError`, `TestDataset_StreamWriteRecords_NilMetadata_ReturnsError`
+- nil metadata coalesced: `TestDataset_Write_NilMetadata_CoalescesToEmpty`, `TestDataset_StreamWrite_NilMetadata_CoalescesToEmpty`, `TestDataset_StreamWriteRecords_NilMetadata_CoalescesToEmpty`
- Empty metadata valid: `TestDataset_Write_EmptyMetadata_ValidAndPersisted`, etc.

**Immutability**: All covered ✅
@@ -78,15 +81,15 @@ Gaps are tracked with codes indicating category and priority:
| Requirement | Test |
|-------------|------|
| Creates snapshot | Multiple write tests |
-| nil metadata error | `TestDataset_Write_NilMetadata_ReturnsError` |
+| nil metadata coalesced | `TestDataset_Write_NilMetadata_CoalescesToEmpty` |
| Parent snapshot linked | `TestDataset_StreamWrite_ParentSnapshotLinked` |
| Raw blob RowCount=1 | `TestDataset_StreamWrite_Success` |

**StreamWrite**: All covered ✅

| Requirement | Test |
|-------------|------|
-| nil metadata error | `TestDataset_StreamWrite_NilMetadata_ReturnsError` |
+| nil metadata coalesced | `TestDataset_StreamWrite_NilMetadata_CoalescesToEmpty` |
| Commit writes manifest | `TestDataset_StreamWrite_Success` |
| Snapshot invisible before Commit | `TestDataset_StreamWrite_NotVisibleBeforeCommit` |
| Abort → no manifest | `TestDataset_StreamWrite_Abort_NoManifest` |
@@ -103,7 +106,7 @@ Gaps are tracked with codes indicating category and priority:

| Requirement | Test |
|-------------|------|
-| nil metadata error | `TestDataset_StreamWriteRecords_NilMetadata_ReturnsError` |
+| nil metadata coalesced | `TestDataset_StreamWriteRecords_NilMetadata_CoalescesToEmpty` |
| nil iterator error | `TestDataset_StreamWriteRecords_NilIterator_ReturnsError` |
| Non-streaming codec error | `TestDataset_StreamWriteRecords_NonStreamingCodec_ReturnsError` |
| Partitioning error | `TestDataset_StreamWriteRecords_WithPartitioner_ReturnsError` |
@@ -120,6 +123,22 @@ Gaps are tracked with codes indicating category and priority:
| Non-timestamped omits | `TestDataset_Write_NonTimestampedRecords_OmitsMinMax` |
| Raw blob omits | `TestDataset_Write_RawBlob_OmitsTimestamps` |

+**Per-File Statistics**: All covered ✅
+
+| Requirement | Test |
+|-------------|------|
+| StatisticalCodec populates stats | `TestDataset_Write_ParquetCodec_StatsPopulated` |
+| Non-statistical codec → nil stats | `TestDataset_Write_JSONLCodec_StatsNil` |
+| Raw blob → nil stats | `TestDataset_Write_RawBlob_StatsNil` |
+| StreamWriteRecords → nil stats (JSONL) | `TestDataset_StreamWriteRecords_StatsNil` |
+| Parquet basic types stats | `TestParquetCodec_FileStats_BasicTypes` |
+| Parquet nullable field stats | `TestParquetCodec_FileStats_NullableFields` |
+| Parquet all-null column stats | `TestParquetCodec_FileStats_AllNulls` |
+| Parquet single record stats | `TestParquetCodec_FileStats_SingleRecord` |
+| Parquet bool/bytes no min/max | `TestParquetCodec_FileStats_BoolAndBytes_NoMinMax` |
+| Parquet timestamp stats | `TestParquetCodec_FileStats_Timestamps` |
+| Parquet empty records stats | `TestParquetCodec_FileStats_EmptyRecords` |
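
The split between statistical and non-statistical codecs in the table above could be modeled with an optional interface checked by type assertion. This is only a sketch of that general Go pattern: the `Codec`, `StatisticalCodec`, and `FileStats` definitions, the method set, and the `statsFor` helper are all assumptions, not Lode's actual API.

```go
package main

import "fmt"

// All names below are hypothetical stand-ins for illustration.

type FileStats struct {
	RowCount int64
}

type Codec interface {
	Name() string
}

// StatisticalCodec is an optional extension: codecs that can report
// per-file statistics implement it, others simply do not.
type StatisticalCodec interface {
	Codec
	FileStats() *FileStats
}

type jsonlCodec struct{}

func (jsonlCodec) Name() string { return "jsonl" }

type parquetCodec struct{ rows int64 }

func (parquetCodec) Name() string              { return "parquet" }
func (p parquetCodec) FileStats() *FileStats { return &FileStats{RowCount: p.rows} }

// statsFor returns stats when the codec reports them and nil otherwise.
func statsFor(c Codec) *FileStats {
	if sc, ok := c.(StatisticalCodec); ok {
		return sc.FileStats()
	}
	return nil
}

func main() {
	fmt.Println(statsFor(jsonlCodec{}) == nil)            // true
	fmt.Println(statsFor(parquetCodec{rows: 3}).RowCount) // 3
}
```

Keeping the capability optional means existing codecs need no changes, which fits the "additive" framing used in the implementation plan.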

**Empty Dataset**: All covered ✅

| Requirement | Test |
@@ -257,7 +276,7 @@ All error sentinels covered ✅
|-------------|------|
| End-to-end round-trip | `TestVolume_StageCommitReadAt_EndToEnd` |
| Cumulative manifest | `TestVolume_CumulativeManifest` |
-| nil metadata rejected | `TestVolume_Commit_NilMetadata_ReturnsError` |
+| nil metadata coalesced | `TestVolume_Commit_NilMetadata_CoalescesToEmpty` |
| Empty metadata accepted | `TestVolume_Commit_EmptyMetadata_Succeeds` |
| Empty blocks rejected | `TestVolume_Commit_EmptyBlocks_ReturnsError` |
| Empty block path rejected | `TestVolume_Commit_EmptyBlockPath_ReturnsError` |
2 changes: 1 addition & 1 deletion docs/contracts/CONTRACT_VOLUME.md
@@ -237,7 +237,7 @@ Volume accepts a minimal set of options:
- Missing committed range MUST return `ErrRangeMissing`.
- Overlapping blocks at Commit MUST return `ErrOverlappingBlocks`.
- Empty block list at Commit MUST return an error.
-- Nil metadata at Commit MUST return an error.
+- `nil` metadata at Commit MUST be coalesced to empty (`Metadata{}`).
- Range reads MUST NOT return partial data without error.
- `Latest` on empty volume MUST return `ErrNoSnapshots`.
- `Snapshot` for unknown ID MUST return `ErrNotFound`.
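
As a rough illustration of the empty-list and overlap rules above, the following sketch validates a single list of blocks. The `BlockRef` fields, the `validateBlocks` helper, and the error values are assumptions; a real implementation would presumably also compare new blocks against ranges committed in earlier snapshots, which this sketch does not do.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Hypothetical sentinels mirroring the names used in the contract.
var (
	ErrOverlappingBlocks = errors.New("overlapping blocks")
	errEmptyBlocks       = errors.New("commit must include at least one block")
)

// BlockRef is a simplified stand-in: a half-open byte range [Offset, Offset+Length).
type BlockRef struct {
	Offset int64
	Length int64
}

// validateBlocks rejects an empty list and any pair of overlapping ranges.
func validateBlocks(blocks []BlockRef) error {
	if len(blocks) == 0 {
		return errEmptyBlocks
	}
	// Sort a copy by offset so only neighbors need to be compared.
	sorted := append([]BlockRef(nil), blocks...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Offset < sorted[j].Offset })
	for i := 1; i < len(sorted); i++ {
		prev := sorted[i-1]
		if prev.Offset+prev.Length > sorted[i].Offset {
			return ErrOverlappingBlocks
		}
	}
	return nil
}

func main() {
	fmt.Println(validateBlocks([]BlockRef{{0, 10}, {10, 5}})) // adjacent ranges are fine
	fmt.Println(validateBlocks([]BlockRef{{0, 10}, {5, 10}})) // second block starts inside the first
	fmt.Println(validateBlocks(nil))                          // empty list
}
```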
6 changes: 3 additions & 3 deletions docs/contracts/CONTRACT_WRITE_API.md
@@ -27,7 +27,7 @@ It is authoritative for any `Dataset` implementation.
### Required behavior

- `Write(ctx, data, metadata)` MUST create a new snapshot on success.
-- `metadata` MUST be non-nil; nil MUST return an error.
+- `nil` metadata MUST be coalesced to empty (`Metadata{}`).
- Empty metadata is valid and MUST be persisted explicitly.
- The new snapshot MUST reference the previous snapshot as its parent (if any).
- Writes MUST NOT mutate existing snapshots or manifests.
@@ -40,7 +40,7 @@ It is authoritative for any `Dataset` implementation.

### StreamWrite Semantics

-- `StreamWrite(ctx, metadata)` MUST return an error if metadata is nil.
+- `StreamWrite(ctx, metadata)` MUST coalesce `nil` metadata to empty (`Metadata{}`).
- `StreamWrite` MUST return a `StreamWriter` for a single binary data unit.
- `StreamWriter.Commit(ctx)` MUST write the manifest and return the new snapshot.
- A snapshot MUST NOT be visible before `Commit` writes the manifest.
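
The commit-visibility rule can be pictured with a toy in-memory writer that buffers data and publishes it only on `Commit`. Everything here (`store`, `streamWriter`, the error value) is a hypothetical stand-in for illustration; it omits contexts, manifests, and storage, and is not how Lode implements `StreamWriter`.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

var errWriterClosed = errors.New("stream writer already committed or aborted")

// store is a toy dataset: each entry in committed plays the role of a
// snapshot that has become visible.
type store struct {
	committed [][]byte
}

type streamWriter struct {
	st     *store
	buf    bytes.Buffer
	closed bool
}

func (s *store) StreamWrite() *streamWriter { return &streamWriter{st: s} }

// Write buffers data; nothing is visible to readers yet.
func (w *streamWriter) Write(p []byte) (int, error) {
	if w.closed {
		return 0, errWriterClosed
	}
	return w.buf.Write(p)
}

// Commit publishes the buffered data as a new visible entry.
func (w *streamWriter) Commit() error {
	if w.closed {
		return errWriterClosed
	}
	w.closed = true
	w.st.committed = append(w.st.committed, append([]byte(nil), w.buf.Bytes()...))
	return nil
}

// Abort discards the buffered data without publishing anything.
func (w *streamWriter) Abort() { w.closed = true }

func main() {
	st := &store{}

	w := st.StreamWrite()
	_, _ = w.Write([]byte("data"))
	fmt.Println(len(st.committed)) // 0: not visible before Commit
	if err := w.Commit(); err != nil {
		panic(err)
	}
	fmt.Println(len(st.committed)) // 1

	a := st.StreamWrite()
	_, _ = a.Write([]byte("discarded"))
	a.Abort()
	fmt.Println(len(st.committed)) // 1: Abort publishes nothing
}
```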
@@ -55,7 +55,7 @@ It is authoritative for any `Dataset` implementation.

### StreamWriteRecords Semantics

-- `StreamWriteRecords(ctx, records, metadata)` MUST return an error if metadata is nil.
+- `StreamWriteRecords(ctx, records, metadata)` MUST coalesce `nil` metadata to empty (`Metadata{}`).
- `StreamWriteRecords` MUST return an error if records iterator is nil.
- `StreamWriteRecords` MUST consume records via a pull-based iterator.
- `StreamWriteRecords` MUST return an error if the configured codec does not support
6 changes: 3 additions & 3 deletions lode/dataset.go
@@ -253,7 +253,7 @@ func (d *dataset) ID() DatasetID {

func (d *dataset) Write(ctx context.Context, data []any, metadata Metadata) (*DatasetSnapshot, error) {
	if metadata == nil {
-		return nil, errors.New("lode: metadata must be non-nil (use empty map {} for no metadata)")
+		metadata = Metadata{}
	}

	var parentID DatasetSnapshotID
@@ -449,7 +449,7 @@ func (d *dataset) Latest(ctx context.Context) (*DatasetSnapshot, error) {

func (d *dataset) StreamWrite(ctx context.Context, metadata Metadata) (StreamWriter, error) {
	if metadata == nil {
-		return nil, errors.New("lode: metadata must be non-nil (use empty map {} for no metadata)")
+		metadata = Metadata{}
	}
	if d.codec != nil {
		return nil, ErrCodecConfigured
@@ -513,7 +513,7 @@ func (d *dataset) StreamWrite(ctx context.Context, metadata Metadata) (StreamWri

func (d *dataset) StreamWriteRecords(ctx context.Context, records RecordIterator, metadata Metadata) (*DatasetSnapshot, error) {
	if metadata == nil {
-		return nil, errors.New("lode: metadata must be non-nil (use empty map {} for no metadata)")
+		metadata = Metadata{}
	}
	if records == nil {
		return nil, ErrNilIterator
50 changes: 32 additions & 18 deletions lode/dataset_test.go
@@ -776,18 +776,21 @@ func TestDataset_Snapshot_EmptyDataset_ReturnsErrNotFound(t *testing.T) {
// Write validation tests
// -----------------------------------------------------------------------------

-func TestDataset_Write_NilMetadata_ReturnsError(t *testing.T) {
+func TestDataset_Write_NilMetadata_CoalescesToEmpty(t *testing.T) {
	ds, err := NewDataset("test-ds", NewMemoryFactory())
	if err != nil {
		t.Fatal(err)
	}

-	_, err = ds.Write(t.Context(), []any{[]byte("data")}, nil)
-	if err == nil {
-		t.Fatal("expected error for nil metadata, got nil")
+	snap, err := ds.Write(t.Context(), []any{[]byte("data")}, nil)
+	if err != nil {
+		t.Fatalf("expected nil metadata to succeed, got: %v", err)
	}
-	if !strings.Contains(err.Error(), "metadata must be non-nil") {
-		t.Errorf("expected metadata error, got: %v", err)
+	if snap.Manifest.Metadata == nil {
+		t.Fatal("expected non-nil metadata in manifest after nil coalescing")
+	}
+	if len(snap.Manifest.Metadata) != 0 {
+		t.Errorf("expected empty metadata, got %v", snap.Manifest.Metadata)
	}
}

@@ -961,18 +964,26 @@ func TestDataset_StreamWrite_CloseWithoutCommit_BehavesAsAbort(t *testing.T) {
	}
}

-func TestDataset_StreamWrite_NilMetadata_ReturnsError(t *testing.T) {
+func TestDataset_StreamWrite_NilMetadata_CoalescesToEmpty(t *testing.T) {
	ds, err := NewDataset("test-ds", NewMemoryFactory())
	if err != nil {
		t.Fatal(err)
	}

-	_, err = ds.StreamWrite(t.Context(), nil)
-	if err == nil {
-		t.Fatal("expected error for nil metadata, got nil")
+	sw, err := ds.StreamWrite(t.Context(), nil)
+	if err != nil {
+		t.Fatalf("expected nil metadata to succeed, got: %v", err)
	}
-	if !strings.Contains(err.Error(), "metadata must be non-nil") {
-		t.Errorf("expected metadata error, got: %v", err)
+	_, _ = sw.Write([]byte("data"))
+	snap, err := sw.Commit(t.Context())
+	if err != nil {
+		t.Fatal(err)
+	}
+	if snap.Manifest.Metadata == nil {
+		t.Fatal("expected non-nil metadata in manifest after nil coalescing")
+	}
+	if len(snap.Manifest.Metadata) != 0 {
+		t.Errorf("expected empty metadata, got %v", snap.Manifest.Metadata)
	}
}

@@ -1426,19 +1437,22 @@ func TestDataset_StreamWriteRecords_NoCodec_ReturnsError(t *testing.T) {
	}
}

-func TestDataset_StreamWriteRecords_NilMetadata_ReturnsError(t *testing.T) {
+func TestDataset_StreamWriteRecords_NilMetadata_CoalescesToEmpty(t *testing.T) {
	ds, err := NewDataset("test-ds", NewMemoryFactory(), WithCodec(NewJSONLCodec()))
	if err != nil {
		t.Fatal(err)
	}

	iter := &sliceIterator{records: []any{D{"id": "1"}}}
-	_, err = ds.StreamWriteRecords(t.Context(), iter, nil)
-	if err == nil {
-		t.Fatal("expected error for nil metadata, got nil")
+	snap, err := ds.StreamWriteRecords(t.Context(), iter, nil)
+	if err != nil {
+		t.Fatalf("expected nil metadata to succeed, got: %v", err)
+	}
+	if snap.Manifest.Metadata == nil {
+		t.Fatal("expected non-nil metadata in manifest after nil coalescing")
	}
-	if !strings.Contains(err.Error(), "metadata must be non-nil") {
-		t.Errorf("expected metadata error, got: %v", err)
+	if len(snap.Manifest.Metadata) != 0 {
+		t.Errorf("expected empty metadata, got %v", snap.Manifest.Metadata)
	}
}

2 changes: 1 addition & 1 deletion lode/volume.go
@@ -129,7 +129,7 @@ func (v *volume) StageWriteAt(ctx context.Context, offset int64, r io.Reader) (B
// Commit records the provided blocks into a new immutable snapshot.
func (v *volume) Commit(ctx context.Context, blocks []BlockRef, metadata Metadata) (*VolumeSnapshot, error) {
	if metadata == nil {
-		return nil, fmt.Errorf("lode: metadata must not be nil (use empty Metadata{} for no metadata)")
+		metadata = Metadata{}
	}
	if len(blocks) == 0 {
		return nil, fmt.Errorf("lode: commit must include at least one new block")
14 changes: 10 additions & 4 deletions lode/volume_test.go
@@ -270,7 +270,7 @@ func TestVolume_ReadAt_MissingRange_ReturnsErrRangeMissing(t *testing.T) {
	}
}

-func TestVolume_Commit_NilMetadata_ReturnsError(t *testing.T) {
+func TestVolume_Commit_NilMetadata_CoalescesToEmpty(t *testing.T) {
	vol, err := NewVolume("test-vol", NewMemoryFactory(), 100)
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
@@ -282,9 +282,15 @@ func TestVolume_Commit_NilMetadata_ReturnsError(t *testing.T) {
		t.Fatalf("StageWriteAt failed: %v", err)
	}

-	_, err = vol.Commit(ctx, []BlockRef{blk}, nil)
-	if err == nil {
-		t.Fatal("expected error for nil metadata")
+	snap, err := vol.Commit(ctx, []BlockRef{blk}, nil)
+	if err != nil {
+		t.Fatalf("expected nil metadata to succeed, got: %v", err)
	}
+	if snap.Manifest.Metadata == nil {
+		t.Fatal("expected non-nil metadata in manifest after nil coalescing")
+	}
+	if len(snap.Manifest.Metadata) != 0 {
+		t.Errorf("expected empty metadata, got %v", snap.Manifest.Metadata)
+	}
}
