Skip to content

fix(entities): make JSON numeric comparison backend-safe across SQL backends#270

Open
maxdubrinsky wants to merge 1 commit into
aircore-749-like-literal-escaping/mdfrom
aircore-749-json-coercion/md
Open

fix(entities): make JSON numeric comparison backend-safe across SQL backends#270
maxdubrinsky wants to merge 1 commit into
aircore-749-like-literal-escaping/mdfrom
aircore-749-json-coercion/md

Conversation

@maxdubrinsky

@maxdubrinsky maxdubrinsky commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary

Continues AIRCORE-749, addressing items (2), (4), and (5). Stacked on #264 (item (1), $like) — review/merge that first; this branches from it.

All three are verified against real PostgreSQL + SQLite (the parity suite now runs both).

(2) Numeric comparison no longer 500s on PostgreSQL (the prod crash)

_cast_json_to_numeric emitted an unconditional CAST(... AS FLOAT). On PostgreSQL a $gt/$lt against a JSON field that holds non-numeric text raised invalid input syntax for type double precision — a latent production 500 reachable from the user-facing filter param — while SQLite silently coerced the text to 0.0 (wrong matches). New dialect-dispatched _JsonNumeric construct casts only when the JSON value is actually a number (json_type IN ('integer','real') on SQLite; json_typeof = 'number' on PostgreSQL), yielding SQL NULL (no match) for non-numeric / null / absent on both backends — matching InMemoryFilterRepository's ordered-comparison contract. The path is bound as a parameter, so emitted SQL text is constant (cache-safe).

(4) $eq None matches absent JSON keys on PostgreSQL too

cast(col, String) == 'null' matched absent keys only on SQLite (which renders absent as the text 'null'); on PostgreSQL an absent key is SQL NULL, so absent keys did not match $eq None. Added an IS NULL test so both backends match absent + explicit-null, as the in-memory backend does.

(5) Postgres parity leg

test_filter_matches_sql_parity.py now runs every case against SQLite and PostgreSQL (testcontainers, skipped when Docker is unavailable) plus in-memory — the SQLite-only leg is exactly what let these divergences hide. New cases pin the numeric-guard and absent-key behavior.

Verification

  • Full entities unit suite: 323 passed (parity now 64 = 32 cases × 2 backends).
  • Pre-fix proof against real Postgres: the numeric cases raise psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type double precision: "notnum", SQLite diverges (SQL=[1,2,3,4,…] vs [1]), and eq_data_k_none gives SQL=[1] vs [1,2,4,5,…]. All pass after the change on both backends.
  • ruff + ty clean.

AIRCORE-749 item (3): won't-fix

Boolean text rendering under $like/$in/$nin (only $eq/$ne normalize JSON booleans across backends) is intentionally not addressed. Boolean filtering in practice uses $eq/$ne, which work correctly; the cross-backend divergence only exists for degenerate shapes ($like on a boolean — substring-matching a bool; $in [True] which is equivalent to $eq True) that have no real use case. The divergence stays documented in InMemoryFilterRepository's docstring and the parity suite's module docstring; no shared-filter-layer code is added for it.

@maxdubrinsky maxdubrinsky requested review from a team as code owners June 10, 2026 21:14
@github-actions github-actions Bot added the fix label Jun 10, 2026
@maxdubrinsky maxdubrinsky changed the title fix(entities): backend-safe JSON numeric comparison + absent-key $eq, +Postgres parity (AIRCORE-749) fix(entities): make JSON numeric comparison backend-safe across SQL backends Jun 10, 2026
@maxdubrinsky maxdubrinsky force-pushed the aircore-749-json-coercion/md branch from af97ac5 to eadae43 Compare June 10, 2026 21:25
@maxdubrinsky maxdubrinsky force-pushed the aircore-749-like-literal-escaping/md branch from 2c0d860 to 4be74d2 Compare June 11, 2026 03:23
… with Postgres parity leg (AIRCORE-749)

Addresses items (2), (4), and (5) of AIRCORE-749. Item (1) ($like) is in the
parent PR this stacks on.

(2) Numeric comparison no longer 500s on PostgreSQL. _cast_json_to_numeric emitted
an unconditional CAST(... AS FLOAT); on PostgreSQL a $gt/$lt against a JSON field
holding non-numeric text raised "invalid input syntax for type double precision"
(a latent prod 500), while SQLite silently coerced it to 0.0. A new dialect-
dispatched _JsonNumeric construct casts only when the JSON value is actually a
number (json_type IN ('integer','real') on SQLite; json_typeof = 'number' on
PostgreSQL), yielding SQL NULL — no match — for non-numeric, null, and absent
values on both backends, matching InMemoryFilterRepository's ordered-comparison
contract.

(4) $eq None now matches absent JSON keys on PostgreSQL too. The old
cast(col, String) == 'null' matched absent keys only on SQLite (which renders them
as the text 'null'); on PostgreSQL an absent key is SQL NULL, so it did not match.
Added an IS NULL test so both backends match absent + explicit null, as the
in-memory backend does.

(5) The SQL-parity suite now runs against real PostgreSQL via testcontainers
(skipped without Docker), not just SQLite — which is what let these cross-backend
divergences hide. Added cases pin the numeric-guard and absent-key behavior; they
error/diverge on PostgreSQL before this change and pass after.

Remaining 749 item (3) — boolean text rendering under $like/$in/$nin — is still a
documented divergence, untouched here.

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
@maxdubrinsky maxdubrinsky force-pushed the aircore-749-json-coercion/md branch from eadae43 to 32e602f Compare June 11, 2026 03:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant