test(server): characterize SQL/JS predicate parity - ne crashes handleSub, like diverges #6

Closed
grrowl wants to merge 19 commits into feat/ssr from advisor/003-predicate-parity

@grrowl grrowl commented Jun 11, 2026

Owner

Executes plan 003 from the advisor audit. Characterization tests only — and they found real divergence. The failing cases are marked it.fails (suite stays green; they'll loudly flip the moment a fix lands).

Confirmed divergences

  1. P1 — ne crashes the DO's handleSub; the client hangs forever. sql-compiler.ts accepts ne in its operator floor, but @tanstack/db's compileSingleRowExpression throws QueryCompilationError: Unknown function: ne — from compilePredicate via subs.add, outside the UnsupportedPredicateError catch. No reset is sent, no snap-end arrives.
  2. MED — like membership depends on connection timing. SQLite LIKE is ASCII-case-insensitive; @tanstack/db's like evaluator is case-sensitive. "HELLO" is in the snapshot under like "hello%" but excluded from deltas.

eq, gt, not(eq), in were verified to agree on NULL semantics across both paths (including the null-vs-omitted spelling variants).
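
To make the like divergence concrete, here is a minimal sketch that models both semantics in plain TypeScript. It is not the real SQLite or @tanstack/db code; `likeToRegExp`, `sqliteLike`, and `caseSensitiveLike` are hypothetical names introduced only for illustration, and the model ignores ESCAPE clauses.

```typescript
// Sketch only: models the two LIKE semantics described above.
// None of these helpers exist in SQLite or @tanstack/db.

// Translate a SQL LIKE pattern (% = any run, _ = any single char) to a RegExp.
function likeToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("")
    .map((ch) => {
      if (ch === "%") return ".*";
      if (ch === "_") return ".";
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // escape regex syntax
    })
    .join("");
  return new RegExp("^" + source + "$", "s");
}

// Fold only ASCII A-Z, matching SQLite's default LIKE case handling.
const foldAscii = (s: string): string =>
  s.replace(/[A-Z]/g, (c) => c.toLowerCase());

// Snapshot path model: ASCII-case-insensitive.
function sqliteLike(value: string, pattern: string): boolean {
  return likeToRegExp(foldAscii(pattern)).test(foldAscii(value));
}

// Delta path model: case-sensitive, as this PR reports for @tanstack/db.
function caseSensitiveLike(value: string, pattern: string): boolean {
  return likeToRegExp(pattern).test(value);
}

const rows = ["hello world", "HELLO"];
const inSnapshot = rows.filter((r) => sqliteLike(r, "hello%"));
const inDeltas = rows.filter((r) => caseSensitiveLike(r, "hello%"));
```

Under this model `inSnapshot` holds both rows while `inDeltas` holds only "hello world", which is the timing-dependent membership described in item 2.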

Next (maintainer decision + ADR — deliberately not in this PR)

  • Guard compilePredicate failures in handleSub (send reset).
  • Resolve the operator floor: plan 003's Maintenance notes sketch options. Option (a), an in-tree SQLite-semantics row evaluator for the supported floor, keeps SQL the source of truth.
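
The first bullet could look roughly like the sketch below. Everything here is hypothetical: the frame shapes, the `Predicate` type, the stand-in `compilePredicate`, and the `handleSub` signature are assumptions for illustration, not the project's real code. The point is only that every compile failure, not just one error class, ends in a terminal frame for the client.

```typescript
// Hypothetical shapes: the real wire frames and handleSub signature may differ.
type Frame =
  | { t: "reset"; sub: string; reason: string }
  | { t: "snap-end"; sub: string };
type Predicate = { op: string; args: unknown[] };
type RowFilter = (row: Record<string, unknown>) => boolean;

// Stand-in for the JS-side compiler, which throws on operators it does not know.
function compilePredicate(p: Predicate): RowFilter {
  if (p.op === "eq") {
    const field = String(p.args[0]);
    const value = p.args[1];
    return (row) => row[field] === value;
  }
  throw new Error("Unknown function: " + p.op);
}

function handleSub(
  sub: string,
  where: Predicate,
  send: (f: Frame) => void,
): RowFilter | null {
  let filter: RowFilter | null = null;
  try {
    filter = compilePredicate(where);
  } catch (err) {
    // Catch all compile failures so the client gets a reset instead of
    // waiting forever for a snap-end that never arrives.
    const reason = err instanceof Error ? err.message : String(err);
    send({ t: "reset", sub, reason });
    return null;
  }
  send({ t: "snap-end", sub });
  return filter;
}
```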

🤖 Generated with Claude Code

grrowl and others added 19 commits June 11, 2026 18:27
Records the SSR design against TanStack DB draft PR #1564: readSnapshot
RPC with a durable high-water cursor, SsrSnapshotTransport, syncMeta
cursor round-trip with since-on-first-sub, snapshot reconciliation, the
C1' barrier, and the on-demand transient catch-up sub — including the
adversarial-review findings that shaped them and the known limitations
(no incarnation epoch; upstream is a draft).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ater cursor

The SSR read path (ADR-0011 D1): one consistent {rows, cursor} read over
the DO binding, no WebSocket. The cursor is max(MAX(_sync_changes.seq),
drain_cursor) — bare currentSeq reads 0 once retention prunes the log
empty, which would hand SSR a bogus no-history cursor for live rows (a
delete between render and hydration would then strand a stale row).
Cursor "0" honestly means no resume point. Fails loud on unknown
collections and un-lowerable predicates.
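
The cursor rule above can be sketched as follows. This is an illustration under stated assumptions: `durableCursor` is a hypothetical helper name, and representing the two inputs as numbers and the result as a decimal string is an assumption of the sketch rather than something the commit specifies.

```typescript
// maxSeq: the result of MAX(_sync_changes.seq); null once retention has
// pruned the log empty. drainCursor: the persisted drain_cursor value.
// Both numeric representations are assumptions of this sketch.
function durableCursor(maxSeq: number | null, drainCursor: number): string {
  // Taking the max keeps the cursor from dropping to 0 when the log is empty.
  return String(Math.max(maxSeq ?? 0, drainCursor));
}
```

With a pruned log, `durableCursor(null, 42)` yields "42" instead of a misleading "0"; "0" is returned only when there is truly no resume point.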

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…e transport

SSR hydration hands the client rows it never streamed, so the first sub
must resume from the dehydrated cursor (server catch-up, not a redundant
snapshot) and the transport must claim that position: seedCursor keeps a
bootstrap-window drop from re-snapshotting over hydrated rows (snapshots
carry no tombstones), and a LATE seed — a streamed chunk after live
advance, whose stale rows upstream applies without a veto — regresses to
the shorter prefix (always safe to claim less) and resubscribes so the
idempotent catch-up replay re-freshens the clobbered window. Also
extracts the structural Transport interface the SSR snapshot transport
will share (ADR-0011 D2/D3).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…he same adapter

The same doCollectionOptions runs in a per-request DbClient on the
worker, swapped at the new structural Transport seam: each subscribe is
one readSnapshot RPC (rows + durable cursor) synthesized as
onSnap*/onSnapEnd; on-demand loadSubset works under a server-side live
query preload; the render's cursor is the MIN across reads (the safe
joint resume point — replay is idempotent, skipping is not); writes
throw SsrReadOnlyError. Predicates are flattened through the wire
tagged-value codec before the RPC — TanStack's IR is class instances,
which structured clone rejects.
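
A minimal sketch of the MIN-across-reads rule, assuming cursors are decimal strings that fit in a JavaScript number (an assumption; `jointCursor` is a hypothetical name):

```typescript
// Each read returns its own cursor; the render may only claim the smallest.
// Replaying from an earlier position is idempotent, skipping ahead is not.
function jointCursor(cursors: string[]): string {
  if (cursors.length === 0) return "0"; // no reads means no resume point
  return String(Math.min(...cursors.map(Number)));
}
```

Note that a single read with cursor "0" pulls the joint cursor to "0", so one unresumable read makes the whole render unresumable under this model.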

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…napshot reconcile

doCollectionOptions implements the draft DbClient contract (ADR-0011 D3/D4):
- exportSyncMeta/importSyncMeta/mergeSyncMeta carry {v:1, cursor} —
  opaque to TanStack, inert on older @tanstack/db. Merge takes the MIN
  cursor: a late chunk's rows are applied upstream without a veto, and
  the idempotent replay from the earlier position re-freshens them.
- A hydrated eager collection is ready immediately (stale-while-
  revalidate), resumes its first sub from the dehydrated cursor, and —
  with no resume point (cursor 0) — reconciles the fresh snapshot as
  authoritative set semantics (update-if-held, delete-unseen at the
  boundary): no flash-to-empty, no stranded deletes, no
  DuplicateKeySyncError. Synced-presence checks consult syncedData, not
  the combined view, so optimistic overlays are never steered by them.
- On-demand adds ONE transient unfiltered catch-up sub from the cursor
  (per-subset resume is unsound; always-emit covers every changed key,
  tombstones included), unsubscribing at ITS terminal — the wire's
  uptodate gains an optional sub field so a catch-up terminal is
  distinguishable from a broadcast boundary (additive).
- Round-trip tests run the real vendored PR-1564 DbClient on both sides
  with writes landing between dehydrate and hydrate.
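
The set-semantics reconcile (update-if-held, delete-unseen at the boundary) can be sketched like this. The `Row` and `Op` types and the `reconcileSnapshot` function are hypothetical and exist only for illustration; the real adapter works against the collection's synced state, not a plain Map.

```typescript
type Row = { id: string; [k: string]: unknown };
type Op =
  | { type: "insert"; row: Row }
  | { type: "update"; row: Row }
  | { type: "delete"; id: string };

// synced: rows already held from hydration. snapshot: the fresh,
// authoritative snapshot from the server.
function reconcileSnapshot(synced: Map<string, Row>, snapshot: Row[]): Op[] {
  const seen = new Set<string>();
  const ops: Op[] = [];
  for (const row of snapshot) {
    seen.add(row.id);
    // Update rows we already hold instead of inserting a duplicate key.
    const op: Op = synced.has(row.id)
      ? { type: "update", row }
      : { type: "insert", row };
    ops.push(op);
  }
  // At the snapshot boundary, anything held but unseen was deleted upstream.
  synced.forEach((_row, id) => {
    if (!seen.has(id)) ops.push({ type: "delete", id });
  });
  return ops;
}
```

An empty snapshot still reconciles: every held row is unseen, so every held row is deleted.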

BREAKING: doCollectionOptions accepts the structural Transport;
SubHandler.onUptodate gains an ownTerminal parameter.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Four findings from the post-implementation review (gpt-5.5), all real:
- An EMPTY catch-up snapshot (server wiped the table) skipped the
  reconcile entirely — the seen-set was created lazily on first snap.
  Initialize it eagerly when armed: zero rows is an authoritative set.
- A live cursor regress (late streamed chunk) resubscribed on the SAME
  socket; boundary frames already in flight re-advanced the claim past
  the repair window. A regress now forces a reconnect with advance
  suppressed until the fresh socket resubscribes from the seed.
- on-demand markReady raced the transient catch-up registration
  (connect().then ordering); ready now gates on the catch-up sub frame
  being sent, so subset snapshots always follow it on the wire.
- A changed eager where between render and hydrate made the cursor
  unsound (an unchanged out-of-filter row is invisible to catch-up).
  syncMeta now fingerprints the filter; mismatch downgrades to the
  snapshot-reconcile path.
Also changed: unresumable on-demand hydrated rows (cursor 0 / below
floor) are now TRUNCATED instead of patched by an unfiltered full
snapshot — never-subscribed whole-table rows would go permanently
stale, which is worse than a one-roundtrip subset refetch.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…→converge

One worker serves both halves (WS upgrades to the DO, everything else
the Start app). A createServerFn loader does the per-request DbClient +
SsrSnapshotTransport read and dehydrates into the route payload; the
browser hydrates a fresh DbClient, paints the rows before any socket
exists, then converges via the cursor catch-up. The worker's React
render pass uses an inert snapshot transport — useLiveQuery starts
collection sync during SSR, and hydrate() already supplied the rows.
Vendored draft-PR builds with npm overrides + vite dedupe so exactly
one @tanstack/db resolves (two copies break the Symbol-branded
collectionOptions). Verified: curl shows seeded rows in raw HTML with
syncMeta cursor; headless two-tab insert/update broadcasts converge;
zero console errors.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The base class owns the subclass author's namespace: its public surface
is consistently Sync-marked (registerSync, runSyncedWrite). A bare
readSnapshot invites collision with author methods and hides what it
reads. Pre-release rename, no alias.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…pages

The SSR round trip lifts into a pathless _db layout (one loader, one
DbClient, ONE socket per tab — per-page clients would leak a fresh
never-closed WebSocket on every client-side navigation), with two
showcase routes consuming the same hydrated collection:
- /live-query — the baseline useLiveQuery todos experience.
- /live-suspense-query — useLiveSuspenseQuery in a Suspense boundary
  with a visible fallback counter and a where-toggle (query identity =
  the structured IR). The demonstrable finding: a HYDRATED collection
  never suspends — rows are in the server HTML inside a COMPLETED
  boundary, and the fallback count stays 0 through hydration and
  identity changes. Readiness comes from this library's hydrated path
  calling markReady() synchronously (stale-while-revalidate, ADR-0011
  D3) — upstream hydrate() itself never marks ready.
Also: example now typechecks clean (Start's RequestHandler is
(request, opts?) — env rides cloudflare:workers; the dehydrated
payload is asserted serializable at the server-fn boundary, since
upstream types syncMeta as unknown). Context lives outside the route
file — Start code-splits route modules, and a context exported from
one evaluates twice (two distinct contexts; SSR falls back to client
render).

Verified: curl shows all rows in both pages' raw HTML (suspense page:
completed boundary, no fallback); headless two-page pass with filter
toggle, zero console errors.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Grill-session finding: the WS path FORCES an auth gate (parseAttachment
at upgrade) while the snapshot read had none — any worker holding the
binding could read any collection, inverting the socket path's
safe-by-default shape. An author's tenant check was free on one path
and silently bypassable on the other.

readSyncSnapshot now REQUIRES the claims-bearing Request and runs it
through parseAttachment before reading: two paths, one gate, reject by
throwing. The await precedes the synchronous SQLite reads, so rows and
cursor stay at one position. The minted claims are the seam where
uniform read-scoping would land later (neither path filters rows by
identity today; where is shaping, not security — documented).

BREAKING: readSyncSnapshot gains a required second argument and is now
async (RPC callers were already awaiting).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…s armed

Grill-session finding (Q4): importSyncMeta/mergeSyncMeta run AFTER
upstream applies the chunk's rows — a validation throw cannot veto
them. Throwing alone left applied rows with no reconcile intent (and on
on-demand, no truncate): a server-deleted hydrated row would be stale
forever, reachable by any future-versioned or corrupt syncMeta. Both
hooks now set the safe state (cursor '0' -> snapshot-reconcile /
truncate route) BEFORE throwing — loud AND recoverable, and the
de-facto gradual-upgrade path for a future v:2 without per-version
fallback logic.
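
A sketch of the "safe state before throwing" ordering, under assumptions: `SyncState` and this standalone `importSyncMeta` are hypothetical simplifications, and the real hooks carry more state than a single cursor.

```typescript
type SyncState = { cursor: string };

// The chunk's rows are already applied by the time this runs, so a throw
// alone cannot undo them. Set the recoverable state first, then throw.
function importSyncMeta(state: SyncState, meta: unknown): void {
  const m = meta as { v?: unknown; cursor?: unknown } | null;
  const valid =
    typeof m === "object" && m !== null && m.v === 1 && typeof m.cursor === "string";
  if (!valid) {
    state.cursor = "0"; // routes the next sub to snapshot-reconcile / truncate
    throw new Error("unrecognized syncMeta; falling back to cursor 0");
  }
  state.cursor = (m as { cursor: string }).cursor;
}
```

A caller that swallows the error still ends up on the reconcile route, which is what makes the failure recoverable as well as loud.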

Structurally: eager subs now ALWAYS arm snapshot reconcile — an eager
snapshot is authoritative set semantics over synced rows, period. The
normal empty-at-first-snapshot flow is a no-op (boundary-free: begin
opens only when a delete is due); any path where synced rows precede a
snapshot converges automatically, including ones we haven't imagined.
The seen-set is per-snapshot, so the invariant survives multiple
snapshots on one sub. Spy harnesses model _state.syncedData.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Grill-session finding (Q9), pre-existing: a connect() triggered on
demand -- a mutation fired within reconnectDelayMs of a drop --
established the fresh socket with the reconnecting flag still false: no
resubscribeAll, every subscription silently dead on the new socket, and
the late timer connect() early-returned, wedging the flag. On the
ADR-0011 forced-reconnect path the same race also left suppressAdvance
set (a frozen cursor). The flag now sets when the reconnect is
SCHEDULED, so whichever connect() establishes -- timer- or
demand-driven -- runs the resubscribe path. Pinned with a fake-socket
test driving the exact interleaving.
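
The interleaving can be modelled synchronously as below. `ReconnectModel` is a hypothetical toy, not the real transport: it has no sockets or timers, and `resubscribes` stands in for a call to resubscribeAll.

```typescript
class ReconnectModel {
  connected = true;
  reconnecting = false;
  resubscribes = 0;

  // A socket drop schedules a reconnect. The flag is set here, at schedule
  // time, rather than when the timer later fires.
  onDrop(): void {
    this.connected = false;
    this.reconnecting = true;
  }

  // Called by the reconnect timer or on demand (for example by a mutation).
  connect(): void {
    if (this.connected) return; // a late timer finds a live socket: no-op
    this.connected = true;
    if (this.reconnecting) {
      this.reconnecting = false;
      this.resubscribes += 1; // stands in for resubscribeAll()
    }
  }
}

const model = new ReconnectModel();
model.onDrop();
model.connect(); // demand-driven connect wins the race
model.connect(); // the late timer connect early-returns
```

Because the flag is already set when the demand-driven connect runs, the resubscribe happens exactly once and the flag is cleared.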

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…, set-semantics rationale, skew note

Records the grill-session conclusions that changed no code: ADR-0002
points forward to C1' (0011); 0011 gains the ready-as-renderable
semantics + SyncIndicator recipe, the below-floor flash acceptance with
the LRU-persistence future-scope marker, the min-cursor self-consistency
rationale, the purity-leak staleness-is-unobservable argument, the
on-demand memory-contract line, and the pre-1.0 version-skew note on
the sub-scoped uptodate terminal.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Records the exact upstream commit (132d53a9) the tarballs were built
from, the build command, the single-copy resolution gotcha, and the
exit plan (rebase tarballs out of history once upstream ships). Green
tests against stale tarballs prove nothing about the current draft.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
SSR support as a prerelease on the `dev` dist-tag, so people can try it
without it becoming the default install (`latest` stays 0.3.1). SSR is
additive, so the next minor is 0.4.0; this is its first -dev iteration and
bumps as PR #1564 evolves.

The adapter installs and imports cleanly against a released @tanstack/db
(it imports only stable exports), but end-to-end SSR is DORMANT until paired
with the PR #1564 build — dehydrate/hydrate/DbClient and the syncMeta hook
calls are upstream and unreleased. The vendored tarballs remain devDeps only;
the published package depends on @tanstack/db purely as a peer (>=0.6.0).

Publish with: npm publish --tag dev   (NOT plain publish — a prerelease still
goes to `latest` without the tag).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Two divergences confirmed (plan 003):

1. `ne` — accepted by sql-compiler.ts COMPARATORS but rejected by
   @tanstack/db compileSingleRowExpression ("Unknown function: ne").
   The subscription hangs: the client never receives snap-end.
   Operator floor is inconsistent across the two evaluators.

2. `like` — SQLite LIKE is case-insensitive for ASCII (default);
   @tanstack/db is case-sensitive. "HELLO" matches LIKE "hello%"
   in the snapshot path but not in the delta path. Clients diverge
   by connection timing.

4 cases verified passing (eq, gt, not(eq), in) — NULL three-valued
logic is mirrored by toBooleanPredicate for those operators.

Failing cases are marked it.fails() so the suite stays green while
the divergences are visible. Fix needs a maintainer decision + ADR
(see plan 003 Maintenance notes for options).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…tive cases

The helper was defined but only called from case 1 (ne), which is
unreachable because snapshotMembers times out first on the ne crash.
Wire it into cases gt, not(eq), and in — each using a third uniquely-named
room (pp-gt-delta-explicit, pp-not-delta-explicit, pp-in-delta-explicit) —
and assert the omit-spelling and explicit-null-spelling delta members agree,
naming the spelling mismatch in the failure message (plan 003 step 2).

Also soften the always-emit comment in deltaMembers from "the assertion
below surfaces it" (incorrect — a missing d frame is currently silent)
to "currently silent here; see STOP condition in plan 003".

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
grrowl commented Jun 13, 2026

Owner Author

Parked pending your ADR decision on the ne / like divergence. The 0.3.2 bug-fix batch (#11#15) landed on main without 003 — these are it.fails characterization tests pinning two real bugs (ne crashes handleSub; like membership diverges by connection timing), and we don't merge failing tests.

Next step is a focused design session on the fix + a new ADR; this PR/branch stays as the characterization deliverable until then.

grrowl commented Jun 13, 2026

Owner Author

Superseded by #17, which landed the real fix on main as v0.3.3 (ADR-0013): ne dropped from the floor + fail-loud guard, like made case-sensitive via PRAGMA case_sensitive_like. The it.fails characterizations here are replaced by passing parity assertions in tests/predicate-parity.test.ts, now also on feat/ssr via the rebase.

@grrowl grrowl closed this Jun 13, 2026