-
Notifications
You must be signed in to change notification settings - Fork 40
Description
I'm building pg2sqlite, a Rust crate that translates PostgreSQL SQL to SQLite SQL. While validating compatibility, I ran ~80 PostgreSQL statements through polyglot-sql, sqlglot, and pg2sqlite, then executed each output on SQLite 3.49.1 to check runtime behavior.
This surfaced cases where polyglot-sql either returns PostgreSQL syntax unchanged (which SQLite rejects at runtime) or returns a parse error. I'm grouping them here since most share the same root pattern: the expression or statement is accepted by the PostgreSQL parser but emitted without translation.
All inputs below were tested on 2026-02-25 with the current published version of polyglot-sql.
Pass-through outputs that later fail at SQLite runtime
The following inputs are passed through unchanged and fail when executed against SQLite:
Functions with no SQLite equivalent (runtime "no such function" error)
| Input | polyglot output | SQLite error |
|---|---|---|
SELECT LEAST(a, b) FROM t |
SELECT LEAST(a, b) FROM t |
no such function: LEAST |
SELECT GREATEST(a, b) FROM t |
SELECT GREATEST(a, b) FROM t |
no such function: GREATEST |
SELECT FLOOR(x) FROM t |
SELECT FLOOR(x) FROM t |
no such function: FLOOR |
SELECT CEIL(x) FROM t |
SELECT CEIL(x) FROM t |
no such function: CEIL |
SELECT JSON_AGG(name) FROM t |
SELECT JSON_AGG(name) FROM t |
no such function: JSON_AGG |
SELECT JSONB_AGG(name) FROM t |
SELECT JSONB_AGG(name) FROM t |
no such function: JSONB_AGG |
SELECT JSON_OBJECT_AGG(k, v) FROM t |
SELECT JSON_OBJECT_AGG(k,v) FROM t |
no such function: JSON_OBJECT_AGG |
SELECT JSON_BUILD_OBJECT('id', id, 'name', name) FROM t |
SELECT JSON_BUILD_OBJECT(…) FROM t |
no such function: JSON_BUILD_OBJECT |
SELECT JSON_BUILD_ARRAY(a, b, c) FROM t |
SELECT JSON_BUILD_ARRAY(…) FROM t |
no such function: JSON_BUILD_ARRAY |
SELECT STDDEV(x) FROM t |
SELECT STDDEV(x) FROM t |
no such function: STDDEV |
SELECT BOOL_AND(active) FROM t |
SELECT BOOL_AND(active) FROM t |
no such function: BOOL_AND |
SELECT BIT_AND(x) FROM t |
SELECT BIT_AND(x) FROM t |
no such function: BIT_AND |
SELECT ROLLUP(department) FROM emp GROUP BY … |
pass-through | no such function: ROLLUP |
SELECT NEXTVAL('my_seq') |
pass-through | no such function: NEXTVAL |
SELECT * FROM generate_series(1, 10) |
pass-through | no such table: GENERATE_SERIES |
SELECT LPAD(str, 10, '0') FROM t |
pass-through | no such function: LPAD |
SELECT RPAD(str, 10, ' ') FROM t |
pass-through | no such function: RPAD |
SELECT INITCAP(str) FROM t |
pass-through | no such function: INITCAP |
SELECT TO_CHAR(ts, 'YYYY-MM-DD') FROM t |
pass-through | no such function: TO_CHAR |
SELECT DATE_PART('year', ts) FROM t |
pass-through | no such function: DATE_PART |
SELECT DATE_PART('epoch', ts) FROM t |
pass-through | no such function: DATE_PART |
SELECT DATE_TRUNC('month', ts) FROM t |
pass-through | no such function: DATE_TRUNC |
Several of these have well-known SQLite equivalents that a translator could emit rather than passing through:
| PostgreSQL | Compatible SQLite translation |
|---|---|
LEAST(a, b) |
MIN(a, b) SQLite supports multi-argument scalar MIN |
GREATEST(a, b) |
MAX(a, b) similarly |
JSON_AGG(x) / JSONB_AGG(x) |
json_group_array(x) |
JSON_OBJECT_AGG(k, v) |
json_group_object(k, v) |
DATE_PART('year', ts) |
CAST(strftime('%Y', ts) AS INTEGER) |
DATE_PART('epoch', ts) |
CAST(strftime('%s', ts) AS REAL) |
DATE_TRUNC('month', ts) |
strftime('%Y-%m-01', ts) |
Syntax unsupported in SQLite (runtime syntax error)
| Input | polyglot output | SQLite error |
|---|---|---|
SELECT EXTRACT(YEAR FROM ts) FROM t |
pass-through | syntax error near "FROM" |
SELECT EXTRACT(EPOCH FROM ts) FROM t |
pass-through | syntax error near "FROM" |
SELECT SUBSTRING(str FROM 2 FOR 5) FROM t |
pass-through | syntax error near "FROM" |
SELECT TRIM(LEADING 'x' FROM str) FROM t |
pass-through | syntax error near "'x'" |
SELECT ts AT TIME ZONE '+02:00' FROM t |
pass-through | syntax error near "TIME" |
SELECT ts AT TIME ZONE 'utc' FROM t |
pass-through | syntax error near "TIME" |
SELECT OVERLAY(str PLACING 'x' FROM 1 FOR 1) FROM t |
pass-through | syntax error near "PLACING" |
SELECT DATE '2023-01-01' |
pass-through | no such column: DATE |
CREATE TABLE public.t (id INT, name TEXT) |
emits public.t |
unknown database public |
Compatible strftime() translations exist for the EXTRACT cases:
EXTRACT(YEAR FROM ts) → CAST(strftime('%Y', ts) AS INTEGER),
EXTRACT(EPOCH FROM ts) → CAST(strftime('%s', ts) AS REAL).
SUM OVER DATE_TRUNC (window function context)
SELECT SUM(val) OVER (PARTITION BY DATE_TRUNC('month', created_at)) FROM tOutput keeps DATE_TRUNC in the window spec → runtime error.
Semantic mismatch: <=> (cosine distance) → IS NOT DISTINCT FROM
SELECT a <=> b FROM tpolyglot output: SELECT a IS NOT DISTINCT FROM b FROM t
<=> is the pgvector cosine distance operator, returning a FLOAT distance score. IS NOT DISTINCT FROM is a NULL-safe equality check returning a boolean. These operators have different semantics: <=> returns a distance value, while IS NOT DISTINCT FROM returns a boolean. The query executes without error, but the result meaning changes.
Parse error: <~> (hamming distance operator)
SELECT a <~> b FROM tpolyglot returns: ERROR: Parse error at line 1, column 13: Unexpected token: Gt
The <~> pgvector operator is not recognized by the parser.
Parse error: CAST(id AS BLOB(50))
SELECT CAST(id AS BLOB(50)) FROM tpolyglot returns: ERROR: Parse error at line 1, column 24: Expected RParen, got LParen ('(') near [id AS BLOB(50))]
SQLite and PostgreSQL both accept BLOB(n) as a valid type in CAST. The parser rejects the (n) precision suffix.
DDL: PostgreSQL-specific types not translated
CREATE TABLE t (flag BIT, label NVARCHAR(100), doc TSVECTOR)polyglot output: CREATE TABLE t (flag BIT, label TEXT(100), doc TSVECTOR)
For SQLite compatibility, BIT could map to INTEGER and TSVECTOR could map to TEXT. NVARCHAR is already normalized to TEXT; dropping the (100) suffix would keep emitted types consistent.
Similarly:
CREATE TABLE t (data JSONB, ts TIMESTAMPTZ)polyglot output: CREATE TABLE t (data JSON, ts TIMESTAMPTZ)
JSON is accepted by SQLite, though TEXT is often used as the canonical affinity in cross-dialect mappings. TIMESTAMPTZ is not a native SQLite type and likely needs dialect-specific mapping.
PL/pgSQL trigger functions emitted as-is, fail at runtime
CREATE OR REPLACE FUNCTION trg_fn() RETURNS TRIGGER LANGUAGE plpgsql AS $$
BEGIN
IF NEW.val < 0 THEN RAISE EXCEPTION 'val must be non-negative'; END IF;
RETURN NEW;
END; $$;
CREATE TRIGGER trg_check BEFORE INSERT ON t FOR EACH ROW EXECUTE FUNCTION trg_fn();polyglot emits the entire CREATE OR REPLACE FUNCTION … LANGUAGE plpgsql AS '…' body unchanged. SQLite fails with syntax error near "OR" — SQLite has no CREATE FUNCTION statement at all.
Some PL/pgSQL trigger patterns can be lowered to SQLite trigger syntax. Common patterns
that map naturally:
IF / ELSIF / ELSEblocks → guard conditions on each DML statementRAISE EXCEPTION 'msg'→SELECT RAISE(ABORT, 'msg')DECLARE v T; SELECT col INTO v FROM …→ CTE-based bindingNEW/OLDrecord references → kept as-is (SQLite supports these in triggers)
If full PL/pgSQL translation is out of scope, returning an explicit unsupported diagnostic for CREATE OR REPLACE FUNCTION would avoid invalid SQLite output. For comparison, sqlglot returns UNSUPPORTED: Unsupported property triggerproperties for this input, which at
least prevents the invalid SQL from reaching the database. Silent pass-through is harder to diagnose than a clear error.
CREATE ROLE / GRANT / REVOKE emitted as-is, fail at runtime
CREATE ROLE app_user
GRANT SELECT ON t TO app_user
REVOKE SELECT ON t FROM app_userpolyglot emits all three unchanged. SQLite fails with:
syntax error near "ROLE"onCREATE ROLEsyntax error near "GRANT"onGRANTsyntax error near "REVOKE"onREVOKE
SQLite has no built-in role/permission system. A safer default could be to omit these statements (or emit a clear unsupported warning), since SQLite access control is typically handled at the application layer.
GIN / GiST indices emitted as-is, fail at runtime
CREATE INDEX docs_fts ON docs USING GIN (to_tsvector('english', body))
CREATE INDEX docs_gist ON docs USING GIST (to_tsvector('english', body))polyglot emits both unchanged. SQLite fails with syntax error near "USING", it has no GIN or GiST index types.
A common approach for to_tsvector()-based GIN/GiST indices is an FTS5 virtual table with AFTER INSERT / DELETE / UPDATE sync triggers. I can share a minimal example if helpful.
Parse error: ALTER TABLE … ENABLE ROW LEVEL SECURITY / CREATE POLICY
ALTER TABLE t ENABLE ROW LEVEL SECURITY;
CREATE POLICY owner_policy ON t USING (owner = current_user);polyglot returns a parse error on the first statement:
ERROR: Parse error at line 1, column 78: Expected ADD, DROP, RENAME, ALTER, SET, UNSET, SWAP, CLUSTER, or REPLACE in ALTER TABLE, got Var
Both statements are valid PostgreSQL DDL. One possible approach is to emulate RLS via a renamed backing table, a view that applies the policy USING expression as a WHERE clause, and INSTEAD OF triggers for write operations. If RLS emulation is out of scope, these statements could be handled via a clear unsupported diagnostic path rather than an unexpected parse failure.
Summary
Across these examples, many unsupported PostgreSQL constructs are forwarded to the output without dialect translation, which then fails at SQLite runtime. Two things that would make behavior much clearer for users:
- Unsupported diagnostics: return an explicit error or warning for constructs that cannot be lowered to the target dialect, rather than forwarding the original syntax.
- Known-limitations documentation: a section in the README listing which PostgreSQL features are out of scope for each target dialect, so users can set expectations before running a migration.