feat(parity): close final 14-name gap to reach 100% addressable coverage#130
feat(parity): close final 14-name gap to reach 100% addressable coverage#130estebanzimanyi wants to merge 1 commit into
Conversation
2dc95c0 to
c53a482
Compare
577128e to
22c64c0
Compare
c53a482 to
0ae9c41
Compare
22c64c0 to
be8eb8d
Compare
Diagnosis of the macOS osx_arm64
|
| Test | Lines using hex |
|---|---|
021_tbox_wkb_hash.test |
4 |
050_geo_inventory.test |
1 |
050b_geoset_parsers.test |
3 |
051c_stbox_hash_iohex.test |
3 |
053_tgeompoint_io.test |
7 |
All of them pass on Linux/Windows but the macOS osx_arm64 build emits an odd-length hex string from one of the asHexWKB / asHexEWKB producers — which is impossible for a correctly-encoded WKB (every byte → 2 hex chars). 69 chars = 34.5 bytes ⇒ corrupted output.
This pattern most often indicates one of:
- Endianness or padding mismatch on Apple Silicon between MEOS's
temporal_as_hexwkband the consumer'stemporal_from_hexwkb/DuckDB's hex decoder. - Off-by-one length in the wrapper (
src/geo/tgeompoint.cpp:2020etc.) — e.g.size_t sz = ...; std::string ret(hex, sz - 1);truncating the last char on macOS only. - Memory layout in a struct member where macOS arm64 aligns differently than Linux x86_64, producing a different (smaller) serialized blob.
Recommended next step
A reviewer on macOS arm64 can repro by running just the failing batch:
GEN=ninja make release && ./build/release/test/unittest 'test/sql/parity/021_tbox_wkb_hash*' 'test/sql/parity/050*' 'test/sql/parity/051c*' 'test/sql/parity/053_tgeompoint_io*'…and report which of the five tests fires the 69-char hex output. From there the wrapper or MEOS-side cause is identifiable in one or two iterations.
(Posting the same comment on #126 since both PRs hit the same failure mode.)
`meosType` (lower-case) is the **pre-consolidation** MEOS type name; `MeosType` (upper-case) is the **post-consolidation** target that the upstream rename sweep has not yet reached. The current vcpkg pin (`vcpkg_ports/meos/portfile.cmake` REF f11b7443ee98…) is still pre-consolidation: `meos/include/temporal/meos_catalog.h` line 121 declares the typedef as `} meosType;` and every MEOS API uses the lower-case spelling. MobilityDuck's source code consistently uses `meosType` to match — `grep -rn '\bMeosType\b' src/` finds the name only on the alias line and its comment, nowhere else. c8cad6d added `using meosType = MeosType;` as a forward-looking bridge for the eventual consolidation bump. That bridge points at `MeosType`, which the current pin does NOT yet expose, so it breaks every PR's Linux arm64 build with: /duckdb_build_dir/src/include/tydef.hpp:18:18: error: ‘MeosType’ does not name a type; did you mean ‘meosType’? The fix is to drop the premature alias and replace the misleading comment with one that documents the pre/post-consolidation distinction and the resume path for the next pin bump — at that point a reviewer can either restore the bridge (this time it'll be valid because `MeosType` will exist) or sweep the MobilityDuck source from `meosType` to `MeosType` in a single PR. Unblocks every in-flight PR's Linux arm64 build: MobilityDB#126, MobilityDB#130, MobilityDB#149, MobilityDB#158, MobilityDB#159, MobilityDB#160, plus the entire `feat/*_port_core` extended-type stack (MobilityDB#148/MobilityDB#150/MobilityDB#151/MobilityDB#153/MobilityDB#155/MobilityDB#156).
Reviewer's quickstart — ~10 minutes (final 14-name closer)What this PR does in one sentence: closes the residual 14-name gap from #126 — adds the last UDFs to bring the headline coverage to 943/943 = 100.0% of MobilityDuck's active-addressable temporal+geo surface. Files to read (the 14 are listed in the PR title; one new file per family):
Failing CI: same macOS-arm64 hex bug as #126 (see #126 reviewer comment for the diagnostic instrumentation plan), and the Linux arm64 Why it's safe to merge: strictly additive — 14 new UDFs each backed by an already-exported MEOS function. No semantic change to any existing UDF. Quick way to evaluategh pr checkout 130
# What is `docs/parity-status.md` now claiming?
grep "Active addressable" docs/parity-status.md
# Expected: "Active addressable scope ... 943/943 covered (100.0%)" |
PR MobilityDB#129's base (feat/parity-final-batch / PR MobilityDB#130) does not yet carry the PR MobilityDB#173 fix for src/temporal/span_table_functions.cpp:47. Cherry- picking the th3index payload onto current MobilityDB#130 inherits the DuckDB 1.4.4 implicit derived->base unique_ptr conversion error. This polyfill applies the same fix as MobilityDB#173 (uses unique_ptr_cast), matching the polyfill pattern from the feat/geography-* stack.
`meosType` (lower-case) is the **pre-consolidation** MEOS type name; `MeosType` (upper-case) is the **post-consolidation** target that the upstream rename sweep has not yet reached. The current vcpkg pin (`vcpkg_ports/meos/portfile.cmake` REF f11b7443ee98…) is still pre-consolidation: `meos/include/temporal/meos_catalog.h` line 121 declares the typedef as `} meosType;` and every MEOS API uses the lower-case spelling. MobilityDuck's source code consistently uses `meosType` to match — `grep -rn '\bMeosType\b' src/` finds the name only on the alias line and its comment, nowhere else. c8cad6d added `using meosType = MeosType;` as a forward-looking bridge for the eventual consolidation bump. That bridge points at `MeosType`, which the current pin does NOT yet expose, so it breaks every PR's Linux arm64 build with: /duckdb_build_dir/src/include/tydef.hpp:18:18: error: ‘MeosType’ does not name a type; did you mean ‘meosType’? The fix is to drop the premature alias and replace the misleading comment with one that documents the pre/post-consolidation distinction and the resume path for the next pin bump — at that point a reviewer can either restore the bridge (this time it'll be valid because `MeosType` will exist) or sweep the MobilityDuck source from `meosType` to `MeosType` in a single PR. Unblocks every in-flight PR's Linux arm64 build: #126, #130, #149, stack (#148/#150/#151/#153/#155/#156).
Re-applies the binding content from PRs MobilityDB#130 (parity-final-batch) and MobilityDB#139 (trtree-index-assertion) that the MEOS-pin/symbol-rename refactor silently dropped (the later 3-way merges were no-ops because the content sat in a common ancestor): - temporal_hash(tgeompoint/tgeometry), geometry/geography(tgeompoint) with the TgeoToGeomExec measure-geometry executor, atElevation/minusElevation, transformPipeline(tgeompoint), tgeompointSeqSetGaps, bearing/eCovers surface - TRTREE index support for STBOX/TBOX/span/temporal columns + the mandatory rowid projection (fixes the CREATE INDEX internal crash) - wire TemporalParquetFunctions::Register so temporalFooter is exposed
PRs #126 and #130 have been failing the macOS osx_arm64 unittest with: Invalid hex string, length (69) has to be a multiple of two! That message is from DuckDB's hex/blob decoder rejecting an odd-length string — meaning one of the MEOS `*_as_hexwkb` producers is emitting a 69-char hex string on that platform. Hex strings of WKB-encoded MEOS values must always be even-length (each byte is encoded as 2 hex chars), so the producer is either: - producing a literal odd-length string (MEOS-side bug), or - producing an even-length string with an embedded null at an odd position (DuckDB sees that null as the string end). This commit wraps every `*_as_hexwkb` call-site with a diagnostic-grade precondition check that runs immediately after the MEOS call returns. When the producer emits odd-length output, the exception message now includes both `strlen(hex)` and the `sz` out-parameter reported by MEOS — pinpointing which side is wrong and what the truncation point was. Sites instrumented (7 producers): * src/geo/tgeompoint.cpp — TgeoAsHexWkbExec (temporal_as_hexwkb) * src/geo/stbox_functions.cpp — Stbox_as_hexwkb (stbox_as_hexwkb) * src/temporal/span_functions.cpp — Span_as_hexwkb * src/temporal/spanset_functions.cpp — Spanset_as_hexwkb * src/temporal/set_functions.cpp — Set_as_hexwkb * src/temporal/tbox_functions.cpp — Tbox_as_hexwkb * src/temporal/temporal_functions.cpp — Temporal_as_hexwkb Each site also switches `StringVector::AddString(result, hex)` (which uses `strlen(hex)` implicitly) to `AddString(result, hex, actual)` (which uses the explicit length we just measured), so any latent embedded-null bug becomes detectable at the producer rather than the consumer. The change is intentionally a no-op on green paths: hex strings that are already even-length are passed through with no extra behaviour change, just an explicit length argument. Refs: #126, #130, #158 (touches some of the same files; sequence after #158 lands).
0ae9c41 to
131e45d
Compare
7d61a0b to
8f81a39
Compare
131e45d to
0ae9c41
Compare
8f81a39 to
7d61a0b
Compare
0ae9c41 to
25546aa
Compare
7d61a0b to
3abb5ae
Compare
25546aa to
b72c0f9
Compare
3abb5ae to
fa87dea
Compare
b72c0f9 to
2057fb5
Compare
8f22ddc to
b9df0d3
Compare
a44739e to
c74aea5
Compare
b9df0d3 to
da0aece
Compare
c74aea5 to
92db319
Compare
da0aece to
4e8ea11
Compare
92db319 to
87353e3
Compare
4e8ea11 to
1dd04c1
Compare
87353e3 to
5555d7c
Compare
1dd04c1 to
727ba26
Compare
5555d7c to
99b514c
Compare
727ba26 to
595fcaf
Compare
99b514c to
13eaacb
Compare
…sts, and transform pipeline
595fcaf to
47277ef
Compare
Summary
Closes the last 14 active addressable parity gaps flagged by
scripts/parity-audit.py, bringing MobilityDuck to 100.0%coverage of the active addressable surface (943/943 names).
Stacked on top of PR #126 (which itself adds the bearing / eCovers/
tCovers / stbox dimensional constructors / SeqSetGaps batch).
Functions added
051_stbox.in.sqlperimeter,quadSplit,geography(stbox)stbox_perimeter,stbox_quad_split,stbox_to_geo076_tpoint_analytics.in.sqlgeometry(tgeompoint),geography(tgeogpoint)(×2 arities)tpoint_tfloat_to_geomeaswith NULL measure025_temporal_tile.in.sqltimeBins,valueBins,timeBoxes,valueBoxes,valueTimeBoxestemporal_time_bins,tint/tfloat_value_bins,tint/tfloat_time_boxes,tint/tfloat_value_time_boxes050_geoset.in.sql,056_t{geo,point}_spatialfuncs.in.sqltransformPipeline(7 overloads)tspatial_transform_pipeline,stbox_transform_pipeline,spatialset_transform_pipelineTest plan
051d_stbox_perimeter_quadsplit.test— 5 assertions076b_tpoint_geometry_geography.test— 4 assertions025b_temporal_tile_bins_boxes.test— 14 assertions076c_transform_pipeline.test— 7 assertionspython3 scripts/parity-audit.pyreportsActive addressable coverage: 943/943 (100.0%)exclude_archs)Notes
geographytype;geography(stbox)andgeography(tgeogpoint)both produce a GEOMETRY-aliased blob.timeBoxes(tgeometry)/timeBoxes(tgeompoint)(5-arg form withbitmatrix/borderInc) and thebins× deferred-familysurfaces remain out of scope; not flagged by the audit.
cbuffer / npoint / pose / rgeo) remain in the audit appendices.