Summary
When a connection executes a DDL statement (CREATE, DROP, ALTER) or triggers a ROLLBACK, the prepared-statement LRU cache enters a split-brain state. Once the cache subsequently fills back up to max_prepared_statements (default: 1000), the next query crashes with a KeyError inside the eviction logic. The crash also corrupts the connection so that the following rollback() call fails too, leaving the connection permanently broken.
This manifests in some dbt projects because an arbitrary number of queries can run after dbt executes at least one DDL statement, all on the same connection.
Environment
- redshift-connector version: 2.1.12
- Python: 3.x
- Cluster: Redshift (provisioned)
Steps to reproduce
Minimal inline repro:
```python
import redshift_connector
conn = redshift_connector.connect(host=..., database=..., user=..., password=...)
conn.autocommit = True
cur = conn.cursor()
# Step 1: DDL poisons the LRU tracker
cur.execute("CREATE TABLE IF NOT EXISTS repro_test (id INT)")

# Step 2: fill the prepared-statement cache to the limit (default 1000)
for i in range(1000):
    cur.execute(f"SELECT {i} AS n")
    cur.fetchall()

# Step 3: one more new query triggers eviction → KeyError
cur.execute("SELECT 'boom' AS x")  # ← crashes here
```
Expected: query succeeds.
Actual:
```
  File "redshift_connector/core.py", line 1849, in execute
    statements_to_close.append(cache["ps"][oldest_key]["statement_name_bin"])
KeyError: ('CREATE TABLE IF NOT EXISTS repro_test (id INT)', ())
```
Root cause
There are two parallel cache structures kept in sync for LRU eviction:
```
cache["ps"] – dict : key → prepared-statement metadata
cache["statement_dict"] – OrderedDict : key → None (LRU order)
```
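In healthy operation the two structures move in lockstep. A minimal standalone sketch of that invariant (names mirror core.py, but the metadata payload here is a hypothetical stand-in, not the library's real record):

```python
from collections import OrderedDict

# Minimal standalone model of the two parallel structures.
cache = {"ps": {}, "statement_dict": OrderedDict()}

def prepare(key):
    """Register a statement in both structures, as the connector does."""
    cache["ps"][key] = {"statement_name_bin": b"stmt_%d" % len(cache["ps"])}
    cache["statement_dict"][key] = None  # OrderedDict position records LRU order

for i in range(3):
    prepare((f"SELECT {i}", ()))

# In the healthy state the two structures agree key-for-key:
assert list(cache["ps"]) == list(cache["statement_dict"])
```

Every code path that mutates one structure must mutate the other; the three bugs below are all violations of this invariant.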
Bug 1 — handle_COMMAND_COMPLETE (core.py:2029–2034) clears ps on DDL/ROLLBACK but leaves statement_dict intact:
```python
if command in (b"ALTER", b"CREATE", b"DROP", b"ROLLBACK"):
for ps in pcache["ps"].values():
self.close_prepared_statement(ps["statement_name_bin"])
pcache["ps"].clear() # ← ps emptied
# statement_dict is NOT cleared ← stale keys remain
```
Bug 2 — The consistency-repair check (core.py:1842–1844) only adds entries to statement_dict; it never removes stale ones:
```python
if len(cache["ps"]) != len(cache["statement_dict"]):
for existing_key in cache["ps"]:
cache["statement_dict"][existing_key] = None # adds, never removes
```
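The blind spot is easy to demonstrate in isolation. A standalone sketch (names mirror core.py; the post-DDL state is constructed by hand):

```python
from collections import OrderedDict

cache = {"ps": {}, "statement_dict": OrderedDict()}

# Post-DDL state: one stale pre-DDL key survives in statement_dict...
cache["statement_dict"][("SELECT 'pre-ddl'", ())] = None
# ...and two new queries have since been prepared in both structures.
for i in range(2):
    cache["ps"][(f"SELECT {i}", ())] = {"statement_name_bin": b"s%d" % i}
    cache["statement_dict"][(f"SELECT {i}", ())] = None

# The repair from core.py:1842-1844: adds missing keys, removes nothing.
if len(cache["ps"]) != len(cache["statement_dict"]):
    for existing_key in cache["ps"]:
        cache["statement_dict"][existing_key] = None

# The stale key survives, so the sizes can never reconverge:
print(len(cache["ps"]), len(cache["statement_dict"]))  # 2 3
```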
Bug 3 (crash site) — Eviction blindly assumes the oldest statement_dict key exists in ps (core.py:1847–1850):
```python
if len(cache["ps"]) >= self.max_prepared_statements:
oldest_key, _ = cache["statement_dict"].popitem(last=False)
statements_to_close.append(cache["ps"][oldest_key]["statement_name_bin"])
# ↑ KeyError if oldest_key is stale
```
Failure sequence
- Connection runs N unique queries → ps and statement_dict each have N entries
- A DDL or ROLLBACK executes → ps is cleared to 0; statement_dict still has N stale keys
- N more unique queries run → ps refills to N; statement_dict now has 2N entries
- One more new query: eviction fires (len(ps) >= 1000), pops the oldest statement_dict entry (a stale pre-DDL key), and crashes with KeyError
- The crashed query left a PARSE message partially sent; the subsequent rollback() fails with ProgrammingError: prepared statement "..." already exists, making the connection permanently unusable
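The sequence can be reproduced without a cluster by simulating the three buggy code paths. This is a sketch, not the library's code: names mirror core.py, and max_prepared_statements is shrunk to 5 to keep the demo short:

```python
from collections import OrderedDict

MAX = 5  # stands in for max_prepared_statements (default 1000)
cache = {"ps": {}, "statement_dict": OrderedDict()}

def execute(key):
    # Bug 2: the "repair" only adds entries to statement_dict
    if len(cache["ps"]) != len(cache["statement_dict"]):
        for existing_key in cache["ps"]:
            cache["statement_dict"][existing_key] = None
    # Bug 3: eviction assumes the oldest statement_dict key exists in ps
    if len(cache["ps"]) >= MAX:
        oldest_key, _ = cache["statement_dict"].popitem(last=False)
        del cache["ps"][oldest_key]  # KeyError when oldest_key is stale
    cache["ps"][key] = {}
    cache["statement_dict"][key] = None

def handle_ddl_complete():
    cache["ps"].clear()  # Bug 1: statement_dict is not cleared

execute(("CREATE TABLE t", ()))  # DDL statement is prepared, then...
handle_ddl_complete()            # ...COMMAND_COMPLETE poisons the tracker
for i in range(MAX):             # refill the cache to the limit
    execute((f"SELECT {i}", ()))

try:
    execute(("SELECT 'boom'", ()))  # eviction pops the stale pre-DDL key
except KeyError as exc:
    print("KeyError:", exc.args[0])  # KeyError: ('CREATE TABLE t', ())
```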
Observed diagnostic output
```
ps size after DDL: 0 statement_dict size: 502 ← 502 stale entries remain
query 1000: ps=1000, sd=1502 ← divergence grows with each query
Step 4: KeyError: ('SELECT 1 AS query_num', ()) ← crash at eviction
```
Note: in a realistic dbt workflow the failure occurs at query ~999 rather than 1001, because DDL setup queries (DROP TABLE, CREATE TABLE) run before the main loop and pre-poison statement_dict with stale keys, reducing the effective headroom.
Proposed fix
The minimal fix is to also clear statement_dict wherever ps is cleared:
```python
# core.py, ~line 2034
pcache["ps"].clear()
pcache["statement_dict"].clear() # ← add this line
```
A more defensive fix also guards the eviction against stale keys:
```python
# core.py, ~line 1847
if len(cache["ps"]) >= self.max_prepared_statements:
while cache["statement_dict"]:
oldest_key, _ = cache["statement_dict"].popitem(last=False)
if oldest_key in cache["ps"]:
statements_to_close.append(cache["ps"][oldest_key]["statement_name_bin"])
del cache["ps"][oldest_key]
break
# else: stale key, discard and keep looking
```
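A quick standalone check that the guarded eviction skips stale keys instead of crashing (same minimal model as above; names mirror core.py, MAX shrunk to 3, and the poisoned state is constructed by hand):

```python
from collections import OrderedDict

MAX = 3
cache = {"ps": {}, "statement_dict": OrderedDict()}

# Poisoned state: one stale pre-DDL key plus MAX live entries.
cache["statement_dict"][("STALE", ())] = None
for i in range(MAX):
    key = (f"SELECT {i}", ())
    cache["ps"][key] = {"statement_name_bin": b"s%d" % i}
    cache["statement_dict"][key] = None

statements_to_close = []
# Patched eviction: discard stale keys until a live one is found.
if len(cache["ps"]) >= MAX:
    while cache["statement_dict"]:
        oldest_key, _ = cache["statement_dict"].popitem(last=False)
        if oldest_key in cache["ps"]:
            statements_to_close.append(cache["ps"][oldest_key]["statement_name_bin"])
            del cache["ps"][oldest_key]
            break
        # else: stale key, discard and keep looking

print(statements_to_close)  # [b's0'] -- stale key skipped, no KeyError
```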
Both fixes together fully resolve the issue.