fix(entities): treat $like as a literal substring, not a SQL wildcard#264
Open
maxdubrinsky wants to merge 1 commit into
Open
fix(entities): treat $like as a literal substring, not a SQL wildcard#264maxdubrinsky wants to merge 1 commit into
maxdubrinsky wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR fixes SQL LIKE wildcard injection by escaping metacharacters ( ChangesSQL LIKE Metacharacter Escaping
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Contributor
|
… (AIRCORE-749) SQLAlchemyFilterRepository.like() interpolated the user filter value into an ILIKE pattern without escaping, so % and _ in a $like value were silently treated as SQL wildcards on both SQLite and PostgreSQL. The shipped in-memory backend (InMemoryFilterRepository.like) — the documented, test-pinned canonical contract — treats $like as a case-insensitive literal substring where % and _ are ordinary characters. The two backends therefore returned different result sets for any value containing % or _ (e.g. name~"a_b" matched "axb"). Escape the LIKE metacharacters (\, %, _) and pass an explicit ESCAPE clause, which behaves identically on SQLite and PostgreSQL, so the SQL backend matches the in-memory literal-substring contract. The in-memory backend is unchanged. Extends the SQL/in-memory parity suite with rows whose name/data.tier contain _ and %, each paired with a near-identical decoy a wildcard would wrongly match (covering both the plain-column and JSON cast-to-text paths). The new cases fail before this change (SQL over-matches the decoy) and pass after. Addresses item (1) of AIRCORE-749. The cross-backend JSON-coercion foot-guns (numeric-cast Postgres 500, boolean rendering, absent-key semantics) and the Postgres-backed parity leg remain open on the ticket — they require cross-backend behavior decisions. Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
2c0d860 to
4be74d2
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
SQLAlchemyFilterRepository.like()interpolated the user filter value directly into anILIKEpattern (f"%{value}%") with no escaping, so%and_in a$likevalue were silently treated as SQL wildcards on both SQLite and PostgreSQL. The shipped in-memory backend (InMemoryFilterRepository.like) — the documented, test-pinned canonical contract — treats$likeas a case-insensitive literal substring where%and_are ordinary characters. The two backends therefore returned different result sets for any value containing%or_(e.g.name~"a_b"matched"axb";"100%"over-matched).This is item (1) of AIRCORE-749.
Fix
Escape the LIKE metacharacters (
\,%,_) and pass an explicitESCAPEclause, which behaves identically on SQLite and PostgreSQL. The SQL backend now matches the in-memory literal-substring contract. The in-memory backend is unchanged (it was already correct).Testing
Extended the SQL↔in-memory parity suite (
test_filter_matches_sql_parity.py) with rows whosename/data.tiercontain_and%, each paired with a near-identical decoy a wildcard would wrongly match — covering both the plain-column and JSON cast-to-text paths. The three new cases fail before this change (SQL=[6,7]vsin-memory=[6]) and pass after. Full entities unit suite: 289 passed, and all existing filter tests still pass (none relied on wildcard$like).Scope note — remaining AIRCORE-749 work
This PR is the decision-free, dialect-agnostic slice. The other foot-guns in the ticket are deliberately not included because they need cross-backend behavior decisions (and surface coupled failures):
$gt/$ltagainst non-numeric JSON text (SQLite silently coerces to0.0). Needs a dialect-aware numeric guard + a decision on non-numeric → no-match.like/in/nin/ordered (onlyeqnormalizes).'null'; Postgres/in-memory treat it as no-match.These remain open on the ticket; (2)/(4) in particular change current query results and warrant their own review.
Summary by CodeRabbit