From 63bd260c563c2bfd61016c96854c46296775a8b0 Mon Sep 17 00:00:00 2001
From: Davide Angelocola
Date: Wed, 10 Jun 2026 20:32:22 +0200
Subject: [PATCH] docs(compatibility): bump reference to v0.74.0, document Union/onpair/Variant gaps
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The Rust reference implementation moved from v0.72.0 to v0.74.0 (two
minor releases) while this doc continued to claim v0.72.0 as the tested
ceiling. Audit findings against current Java code:

- `DType::Union` (`fbs.DType.Type.Union = 12`, added in Rust 0.71.0) is
  not decoded: the `PostscriptParser.convertDType` switch has no Union
  case and falls through to the default branch, throwing
  `VortexException("unsupported DType typeType=12")`. The sealed `DType`
  interface in core has no `Union` variant either, so adding support is
  a multi-touch change.
- `vortex.onpair` (experimental in Rust 0.74.0, encodings/onpair/) is
  not registered. Files using it require `Registry.allowUnknown()` to
  open.
- `vortex.variant`: decode is complete (incl. shredded child); encode
  still throws `"encode not yet implemented"`. Rust 0.73.0 (#7945) added
  the Rust write path and parquet-variant IO tests, widening the
  asymmetry.

Adds a new "Known wire-format gaps" section above the encoding table for
visibility; adds a row for `vortex.onpair`; rewords the Variant row to
call out the 0.73+ write-side asymmetry; flags the S3 fixture matrix as
locked to v0.72.0 and in need of a re-run when the v0.74.0 fixture set
publishes.

Co-Authored-By: Claude Opus 4.7
---
 docs/compatibility.md | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/docs/compatibility.md b/docs/compatibility.md
index 82f596f6..eeb2a19c 100644
--- a/docs/compatibility.md
+++ b/docs/compatibility.md
@@ -1,8 +1,17 @@
 # Compatibility
 
-Tested against the [Rust reference implementation](https://github.com/vortex-data/vortex) v0.72.0.
+Tested against the [Rust reference implementation](https://github.com/vortex-data/vortex) v0.74.0.
 For the rest of the API surface (reader, writer, scan, CLI), see [reference.md](reference.md).
 
+## Known wire-format gaps
+
+| Item | Introduced | Java status |
+|------|------------|-------------|
+| `DType::Union` (`fbs.DType.Type.Union = 12`) | Rust 0.71.0 | ❌ Decode throws `VortexException("unsupported DType typeType=12")`. No `DType.Union` variant in Java's sealed type. |
+| `vortex.onpair` experimental string encoding | Rust 0.74.0 | ❌ Not registered. Files using it fail to decode unless `Registry.allowUnknown()` is enabled. |
+| `vortex.variant` write path | Rust 0.73.0 (`Allow writing Variant to files`, #7945) | ❌ Java decode works; Java encode throws `"encode not yet implemented"`. Java→Rust round-trip not possible for Variant columns. |
+| Arrow extension array import affecting Variant shape | Rust 0.74.0 (#8125) | Untested. Re-run integration fixtures against v0.74.0 once published. |
+
 ## Encodings
 
 | Encoding ID | Class | Decode | Encode | Notes |
@@ -39,7 +48,8 @@ For the rest of the API surface (reader, writer, scan, CLI), see [reference.md](
 | `fastlanes.for` | `FrameOfReferenceEncoding` | ✅ | ✅ | Integer PTypes |
 | `fastlanes.rle` | `RleEncoding` | ✅ | ✅ | Chunk-based RLE |
 | `vortex.patched` | `PatchedEncoding` | ✅ | ❌ | Primitive PTypes; encode not yet implemented |
-| `vortex.variant` | `VariantEncoding` | ✅ | ❌ | Encode not yet implemented |
+| `vortex.variant` | `VariantEncoding` | ✅ | ❌ | Decode (incl. shredded child); encode not yet implemented (Rust 0.73+) |
+| `vortex.onpair` | _none_ | ❌ | ❌ | Experimental in Rust 0.74.0; not yet ported |
 
 ### Unknown encodings
 
@@ -123,6 +133,10 @@ themselves.
 
 ## S3 Fixture Status (v0.72.0)
 
+> **Note:** the fixture matrix below is locked to `v0.72.0/`. The Rust reference is
+> now at `v0.74.0`; re-run the integration suite against `v0.74.0/arrays/` once
+> upstream publishes the corresponding fixture set, and refresh this section.
+
 Cross-language round-trips tested against Rust-written fixture files hosted at
 `s3://vortex-compat-fixtures/v0.72.0/arrays/`.
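
The `DType::Union` gap described in the commit message can be pictured with a small, self-contained sketch. This is not the project's code: `UnionGapSketch`, the reduced `DType` hierarchy, and every tag value other than `Union = 12` are hypothetical stand-ins. It only illustrates how a switch with no Union case falls through to a default branch that throws, and why adding support would touch both the sealed type and the switch.

```java
public class UnionGapSketch {
    // Hypothetical stand-in for the project's VortexException.
    static final class VortexException extends RuntimeException {
        VortexException(String message) {
            super(message);
        }
    }

    // Hypothetical, heavily reduced stand-in for the sealed DType interface.
    // There is no Union variant, mirroring the gap the commit message reports.
    sealed interface DType permits Bool, Utf8 {}
    record Bool() implements DType {}
    record Utf8() implements DType {}

    // Only Union = 12 comes from the commit message; the other tags are placeholders.
    static final int TAG_BOOL = 1;
    static final int TAG_UTF8 = 2;
    static final int TAG_UNION = 12;

    // Sketch of a convertDType-style switch: unknown tags hit the default branch.
    static DType convertDType(int typeType) {
        return switch (typeType) {
            case TAG_BOOL -> new Bool();
            case TAG_UTF8 -> new Utf8();
            default -> throw new VortexException("unsupported DType typeType=" + typeType);
        };
    }

    public static void main(String[] args) {
        System.out.println(convertDType(TAG_BOOL));
        try {
            convertDType(TAG_UNION);
        } catch (VortexException e) {
            // The message has the same shape as the one quoted in the commit message.
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch, supporting Union would mean adding a record to the `permits` list and a matching `case`, which is the "multi-touch change" the commit message refers to.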