fix(entities): make JSON numeric comparison backend-safe across SQL backends#270
Open
maxdubrinsky wants to merge 1 commit into
Open
fix(entities): make JSON numeric comparison backend-safe across SQL backends#270maxdubrinsky wants to merge 1 commit into
maxdubrinsky wants to merge 1 commit into
Conversation
af97ac5 to
eadae43
Compare
2c0d860 to
4be74d2
Compare
… with Postgres parity leg (AIRCORE-749)
Addresses items (2), (4), and (5) of AIRCORE-749. Item (1) ($like) is in the
parent PR this stacks on.
(2) Numeric comparison no longer 500s on PostgreSQL. _cast_json_to_numeric emitted
an unconditional CAST(... AS FLOAT); on PostgreSQL a $gt/$lt against a JSON field
holding non-numeric text raised "invalid input syntax for type double precision"
(a latent prod 500), while SQLite silently coerced it to 0.0. A new dialect-
dispatched _JsonNumeric construct casts only when the JSON value is actually a
number (json_type IN ('integer','real') on SQLite; json_typeof = 'number' on
PostgreSQL), yielding SQL NULL — no match — for non-numeric, null, and absent
values on both backends, matching InMemoryFilterRepository's ordered-comparison
contract.
(4) $eq None now matches absent JSON keys on PostgreSQL too. The old
cast(col, String) == 'null' matched absent keys only on SQLite (which renders them
as the text 'null'); on PostgreSQL an absent key is SQL NULL, so it did not match.
Added an IS NULL test so both backends match absent + explicit null, as the
in-memory backend does.
(5) The SQL-parity suite now runs against real PostgreSQL via testcontainers
(skipped without Docker), not just SQLite — which is what let these cross-backend
divergences hide. Added cases pin the numeric-guard and absent-key behavior; they
error/diverge on PostgreSQL before this change and pass after.
Remaining 749 item (3) — boolean text rendering under $like/$in/$nin — is still a
documented divergence, untouched here.
Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
eadae43 to
32e602f
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Continues AIRCORE-749, addressing items (2), (4), and (5). Stacked on #264 (item (1),
$like) — review/merge that first; this branches from it.All three are verified against real PostgreSQL + SQLite (the parity suite now runs both).
(2) Numeric comparison no longer 500s on PostgreSQL (the prod crash)
_cast_json_to_numericemitted an unconditionalCAST(... AS FLOAT). On PostgreSQL a$gt/$ltagainst a JSON field that holds non-numeric text raisedinvalid input syntax for type double precision— a latent production 500 reachable from the user-facingfilterparam — while SQLite silently coerced the text to0.0(wrong matches). New dialect-dispatched_JsonNumericconstruct casts only when the JSON value is actually a number (json_type IN ('integer','real')on SQLite;json_typeof = 'number'on PostgreSQL), yielding SQLNULL(no match) for non-numeric / null / absent on both backends — matchingInMemoryFilterRepository's ordered-comparison contract. The path is bound as a parameter, so emitted SQL text is constant (cache-safe).(4)
$eq Nonematches absent JSON keys on PostgreSQL toocast(col, String) == 'null'matched absent keys only on SQLite (which renders absent as the text'null'); on PostgreSQL an absent key is SQLNULL, so absent keys did not match$eq None. Added anIS NULLtest so both backends match absent + explicit-null, as the in-memory backend does.(5) Postgres parity leg
test_filter_matches_sql_parity.pynow runs every case against SQLite and PostgreSQL (testcontainers, skipped when Docker is unavailable) plus in-memory — the SQLite-only leg is exactly what let these divergences hide. New cases pin the numeric-guard and absent-key behavior.Verification
psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type double precision: "notnum", SQLite diverges (SQL=[1,2,3,4,…]vs[1]), andeq_data_k_nonegivesSQL=[1]vs[1,2,4,5,…]. All pass after the change on both backends.AIRCORE-749 item (3): won't-fix
Boolean text rendering under
$like/$in/$nin(only$eq/$nenormalize JSON booleans across backends) is intentionally not addressed. Boolean filtering in practice uses$eq/$ne, which work correctly; the cross-backend divergence only exists for degenerate shapes ($likeon a boolean — substring-matching a bool;$in [True]which is equivalent to$eq True) that have no real use case. The divergence stays documented inInMemoryFilterRepository's docstring and the parity suite's module docstring; no shared-filter-layer code is added for it.