Journald testing; Deltalake/Datafusion/Arrow upgrade #60

Merged
jmacd merged 5 commits into main from jmacd/34
Apr 4, 2026

@jmacd jmacd commented Apr 4, 2026

Specifically, this adds a new maintenance routine based on vacuum and checkpointing.

jmacd and others added 4 commits April 2, 2026 20:50
Upgrade core dependencies to pick up deltalake 0.30 features
(log compaction, vacuum lite mode, parallel partition writers).

Version changes:
- deltalake: 0.29.4 → 0.30.2
- arrow/parquet: 56.2 → 57.3
- datafusion: 50.3 → 51.0
- serde_arrow: 0.13.7 → 0.14.0
- delta_kernel: 0.16 → 0.19.2

Code changes (all mechanical):
- DeltaOps deprecated; migrate to DeltaTable methods directly
- try_from_uri renamed to try_from_url
- Add type annotation for inference change in test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Implement automatic checkpointing and vacuuming for both the pond data
and control Delta tables, addressing the unbounded delta log growth
documented in deltalake-efficiency.md.

Automatic maintenance (best-effort, after every write commit):
- Checkpoint every 10 versions (Delta standard interval)
- Vacuum stale parquet files (lite mode)
- Log cleanup of expired delta log JSON files
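
The automatic policy above can be sketched as a decision function plus a best-effort wrapper. This is a minimal illustration, not the code in crates/steward/src/maintenance.rs: `checkpoint_due` and `run_best_effort` are hypothetical names, the closure stands in for a real Delta table operation, and treating "every 10 versions" as "version is a positive multiple of 10" is an assumption.

```rust
/// Checkpoint interval taken from the commit message ("every 10 versions").
const CHECKPOINT_INTERVAL: i64 = 10;

/// Hypothetical helper: true when `version` is a positive multiple of the
/// interval. Treating version 0 as "not due" is an assumption.
fn checkpoint_due(version: i64) -> bool {
    version > 0 && version % CHECKPOINT_INTERVAL == 0
}

/// Hypothetical best-effort wrapper: runs one maintenance step and reports a
/// failure on stderr instead of propagating it, so maintenance can never fail
/// a write that has already been committed.
fn run_best_effort<F>(step: &str, f: F) -> bool
where
    F: FnOnce() -> Result<(), String>,
{
    match f() {
        Ok(()) => true,
        Err(err) => {
            eprintln!("maintenance step '{step}' failed (ignored): {err}");
            false
        }
    }
}

fn main() {
    for version in [9, 10, 11, 20] {
        if checkpoint_due(version) {
            // Stand-in closure; a real step would call into the Delta table.
            let ok = run_best_effort("checkpoint", || Ok(()));
            println!("version {version}: checkpoint attempted, ok = {ok}");
        } else {
            println!("version {version}: no checkpoint due");
        }
    }
}
```

Returning a `bool` rather than a `Result` keeps the "best-effort" contract visible at the call site: the caller may log or count failures, but has nothing to propagate.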

Manual maintenance via 'pond maintain' CLI command:
- Always creates checkpoint (forced)
- Runs vacuum
- Optional --compact flag to merge small parquet files
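
As a rough sketch of how the manual path differs from the automatic one, the steps for 'pond maintain' could be derived from its arguments as follows. Everything here other than the `--compact` flag is hypothetical: the `Step` enum, the `manual_steps` function, the hand-rolled argument handling, and the step order (which simply follows the list above) are assumptions, not the contents of crates/cmd/src/commands/maintain.rs.

```rust
/// Hypothetical maintenance steps for the manual command.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Step {
    Checkpoint,
    Vacuum,
    Compact,
}

/// Builds the step list for a manual run. The checkpoint is always included
/// (forced), vacuum always runs, and compaction is added only when
/// `--compact` is present. The ordering is an assumption.
fn manual_steps(args: &[&str]) -> Result<Vec<Step>, String> {
    let mut steps = vec![Step::Checkpoint, Step::Vacuum];
    for arg in args {
        match *arg {
            "--compact" => {
                // Ignore a repeated flag rather than compacting twice.
                if !steps.contains(&Step::Compact) {
                    steps.push(Step::Compact);
                }
            }
            other => return Err(format!("unknown argument: {other}")),
        }
    }
    Ok(steps)
}

fn main() {
    println!("{:?}", manual_steps(&[]));
    println!("{:?}", manual_steps(&["--compact"]));
    println!("{:?}", manual_steps(&["--bogus"]));
}
```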

New files:
- crates/steward/src/maintenance.rs: checkpoint/vacuum/compact logic
- crates/cmd/src/commands/maintain.rs: CLI command

Modified:
- Ship::commit_transaction(): auto-maintenance after write commits
- Ship::maintain(): orchestrates maintenance on both tables
- ControlTable::set_table(): replace table after vacuum/optimize
- OpLogPersistence::set_table(): replace table after vacuum/optimize
- Steward dispatch: expose maintain() through enum
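
The two `set_table()` entries suggest a pattern in which a maintenance operation yields an updated table value that the owner must store back. A toy model of that pattern, assuming the operation consumes a table and returns a new one: the `Table` struct and the `vacuum` function below are stand-ins, and this `ControlTable` only borrows the name from the list above, not its real definition.

```rust
/// Stand-in for a Delta table handle (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Table {
    version: i64,
    files: Vec<String>,
}

/// Stand-in maintenance operation: consumes the table, drops the files named
/// in `stale`, and returns the updated table with the number of files removed.
fn vacuum(table: Table, stale: &[&str]) -> (Table, usize) {
    let Table { version, files } = table;
    let before = files.len();
    let files: Vec<String> = files
        .into_iter()
        .filter(|f| !stale.contains(&f.as_str()))
        .collect();
    let removed = before - files.len();
    (Table { version, files }, removed)
}

/// Toy owner of a table handle.
struct ControlTable {
    table: Table,
}

impl ControlTable {
    fn table(&self) -> &Table {
        &self.table
    }

    /// Replace the held handle with the one returned by maintenance, so
    /// later reads do not see the pre-maintenance state.
    fn set_table(&mut self, table: Table) {
        self.table = table;
    }
}

fn main() {
    let mut holder = ControlTable {
        table: Table {
            version: 12,
            files: vec!["a.parquet".to_string(), "b.parquet".to_string()],
        },
    };
    let (updated, removed) = vacuum(holder.table().clone(), &["b.parquet"]);
    holder.set_table(updated);
    println!("removed {removed} file(s); now holding {:?}", holder.table());
}
```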

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jmacd jmacd changed the title from "Journald testing; Deltalake/Datafusion/Arrpw upgrade" to "Journald testing; Deltalake/Datafusion/Arrow upgrade" Apr 4, 2026
@jmacd jmacd merged commit 6f01ce6 into main Apr 4, 2026
7 checks passed