Exp 160: incremental view maintenance — the v2 stream engine #155

Open
danReynolds wants to merge 10 commits into main from exp-160-stream-delta-ivm

@danReynolds danReynolds commented Jun 10, 2026

This is an architectural change, not a tuning experiment. It introduces a second execution strategy for reactive streams — materialized results maintained incrementally from row deltas — with the existing re-query path retained as the universal, fail-closed fallback. It is the largest change to the reactive engine since its creation, and it carries the experiment-protocol evidence trail (exp 160) on top of that.

The architecture

Today's engine answers every invalidation by re-executing the query (hash suppression saves decode/emission, never execution). This PR adds the machinery for the engine to instead understand queries and maintain their results:

preupdate hook ──► bounded row deltas ──► writer reply ──► stream engine
                                                              │
                                       classifier (at registration, fail-closed)
                                          ├─ full maintenance   → patch cached rows, emit
                                          ├─ windowed (LIMIT K) → top-K cache, patch/evict
                                          ├─ skip-only          → prove misses, else re-query
                                          ├─ aggregates         → maintain exact COUNT/SUM/MIN/MAX/AVG
                                          └─ everything else    → today's re-query path, unchanged
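
The routing in the diagram can be sketched as a single fail-closed function. This is an illustrative Python sketch, not the Dart classifier in lib/src/stream_ivm.dart: the `Shape` fields and the `classify` name are hypothetical, and the real grammar checks far more than these flags.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    FULL = "full maintenance"
    WINDOWED = "windowed"
    SKIP_ONLY = "skip-only"
    AGGREGATE = "aggregates"
    REQUERY = "re-query"


@dataclass(frozen=True)
class Shape:
    """Hypothetical facts a registration-time parser might record."""
    predicate_evaluable: bool           # WHERE is a provable AND-ed conjunction
    has_aggregates: bool = False        # COUNT/SUM/MIN/MAX/AVG ... AS alias
    maintainable_order: bool = False    # ORDER BY with an explicit pk tiebreak
    limit: Optional[int] = None         # LIMIT K, if present
    has_offset_or_distinct: bool = False


def classify(shape: Optional[Shape]) -> Mode:
    # Fail closed: an unparsed query (None) or an unprovable predicate
    # keeps the existing re-query path.
    if shape is None or not shape.predicate_evaluable:
        return Mode.REQUERY
    if shape.has_aggregates:
        return Mode.AGGREGATE
    # The result cannot be maintained, but misses can still be proven.
    if shape.has_offset_or_distinct or not shape.maintainable_order:
        return Mode.SKIP_ONLY
    return Mode.WINDOWED if shape.limit is not None else Mode.FULL
```

The point of the shape is that every branch other than the last two lines is a downgrade: a wrong guess by the parser can only cost a re-query.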

New durable infrastructure (each usable beyond this PR)

  1. Row-delta channel (native/resqlite.c, lib/src/row_deltas.dart) — the preupdate hook captures per-row old/new values (bounded: 256 rows / 32 cols / 256 KB per cycle; poisoned by savepoint rollback or overflow), ships them in writer replies, decoded lazily. This is a CDC-grade primitive: diff-emitting streams, sync engines, and undo journals can all consume it later without touching native code again.
  2. Query classifier (lib/src/stream_ivm.dart) — a strict registration-time grammar (bare columns, aggregates with aliases, AND-ed comparisons, composite ORDER BY with pk tiebreak, LIMIT) producing sealed admission shapes. Fail-closed by construction: anything unparsed or unprovable costs a re-query, never correctness.
  3. Maintained state machines — full / windowed / skip-only / aggregate states with clone-on-write patching, complete-set window tracking, and exact integer aggregate maintenance.
  4. A new ordering primitive: writer-ordered state builds. All IVM baselines (caches, aggregate snapshots) are built through reads on the writer port, whose FIFO totally orders a snapshot against the write replies that carry deltas — reader results and writer replies have no cross-port happens-before (a defect class this PR discovered, fixed, and recorded as a JOURNAL lesson). Paired invariant: maintained state survives only an unbroken chain of processed cycles; any bypassed delta-bearing cycle drops it for an exact rebuild.
  5. Randomized equivalence harness (test/stream_ivm_equivalence_test.dart) — seeded write storms (rowid changes, NULLs, savepoint rollbacks, overflow batches) across every admission mode, every emission checked against a fresh select. It caught two ~1-in-5 ordering races deterministic replays never fired; it is the permanent safety net for all future IVM work.
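
A rough model of the bounded capture in item 1, using the limits quoted there (256 rows / 32 cols / 256 KB per cycle). This is a Python sketch for illustration only; the real capture is C code in the preupdate hook, and `DeltaCycle`, `capture`, `take`, and the byte-size estimate are assumptions made for the example.

```python
MAX_ROWS, MAX_COLS, MAX_BYTES = 256, 32, 256 * 1024


def _cell_size(value):
    # Crude size estimate (assumption): text/blob by length, other cells 8 bytes.
    if value is None:
        return 0
    if isinstance(value, str):
        return len(value.encode("utf-8"))
    if isinstance(value, bytes):
        return len(value)
    return 8


class DeltaCycle:
    """Collects (table, old, new) row deltas for one write cycle within a budget."""

    def __init__(self):
        self.rows = []
        self.bytes_used = 0
        self.poisoned = False

    def _poison(self):
        # A poisoned cycle carries no deltas at all: partial deltas would be unsafe.
        self.poisoned = True
        self.rows.clear()
        self.bytes_used = 0

    def capture(self, table, old, new):
        if self.poisoned:
            return
        old_cells, new_cells = old or (), new or ()
        size = sum(map(_cell_size, old_cells)) + sum(map(_cell_size, new_cells))
        if (len(self.rows) >= MAX_ROWS
                or max(len(old_cells), len(new_cells)) > MAX_COLS
                or self.bytes_used + size > MAX_BYTES):
            self._poison()
            return
        self.rows.append((table, old, new))
        self.bytes_used += size

    def savepoint_rollback(self):
        # Captured rows may describe undone work, so the whole cycle is poisoned.
        self._poison()

    def take(self):
        """Return the deltas, or None when consumers must fall back to re-query."""
        return None if self.poisoned else list(self.rows)
```

Returning `None` rather than a truncated list is what makes overflow fail closed: a consumer can never mistake a partial batch for a complete one.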

Architectural properties

  • Zero public API change: db.stream() is untouched; admission is an internal execution strategy.
  • Fail-closed at every layer — capture overflow, classifier miss, unprovable cell, cache inconsistency, schema drift all degrade to today's path. Zero bails observed across all measured workloads.
  • One-line revertible — remove the maintain check in the dirty loop and the system is byte-for-byte today's behavior.
  • Emission semantics sharpen: admitted streams deliver each write's patch instead of coalescing behind re-query latency (the semantics of an infinitely fast re-query). The exp 045 microtask-batch pattern is the pre-identified mitigation if a workload prefers coalescing.

Evidence (full record: experiments/160-stream-delta-ivm.md)

| Evidence | Result |
| --- | --- |
| Tracelite primary gate | many-streams −18.2%, formal improved verdict; the first cleared gate in this direction's history; reproduced order-flipped (+19.5% main-slower) |
| App-shaped admission audit | 7/9 stream shapes admitted (tier 1 alone: 3/8); 7,643 invalidation decisions resolved with zero reader re-queries, 0 bails; burst wall 132.9 ms vs main ≈200 ms while delivering per-write patches |
| Exp-147 audit | overlap −19% with 500 vs 27 emissions delivered; keyed-PK −14% (matched pair) |
| Honest costs | high-cardinality density +3.5–4.8% (both collection orders): O(admitted-streams) main-isolate predicate eval replaces reader-distributed suppression; v3 fix specified (per-table equality-predicate index, O(1) dispatch); keyed-PK neutral for the full stack |
| Found & fixed en route | the diagnostics×reader NOMUTEX race (landed separately as #156) and the two cross-port ordering defects above |

What this is not (yet) — the mapped ladder

Tier 4 one-hop joins, the equality-predicate dispatch index, rebuild-storm coalescing, and diff-carrying emissions are specified in signals.json with evidence trails, deliberately not built here.

Test plan

  • 315/315 tests incl. 38 classifier/state-machine units and the equivalence harness; 20/20 clean harness loops post-fix
  • dart analyze lib test benchmark clean; finalizer green
  • Tracelite gate, two passes (standard + order-flipped), both committed in the writeup
  • Release-suite record + profile aggregates committed (markdown only per CI guard)

🤖 Generated with Claude Code

@danReynolds

Added the admission-rate benchmarking + an encapsulation pass (see latest commit): tier-1 admits 30/62 distinct entries on an app-shaped chat+feed stream mix and eliminates 3,000 re-queries in the burst with 0 bails — but the rejected set holds the highest-churn DESC+LIMIT screens (~3,100 remaining reader replies), which is the quantified case for tier-2 (composite-key ordering + LIMIT K+buffer). Per-cycle delta decode also moved out of the engine into RowDeltaBatch; engine-side IVM state is now just StreamEntry.ivm + the table_info cache. 298/298 tests, analyze clean, finalizer green.

@danReynolds

Pushed three updates (branch rebased onto main with #153 merged — conflicts in the experiments metadata resolved by union):

  1. Fix for a pre-existing data race this PR surfaced (own commit, cherry-pickable if you'd rather land it separately). Database.diagnostics() calls sqlite3_db_status on reader connections, guarded only by in_use, a flag that has been dead code since exp 030's dedicated-reader assignment, so the main isolate was reading live NOMUTEX connections. SCHEMA_USED measures via the connection's pnBytesFreed dry-run, which corrupts a mid-query reader's allocation accounting (flaky reader SEGV, ~1-in-30 stream_test runs once this PR's detached admission reads made readers busy at diagnostics-poll time). Read workers now bracket each request with resqlite_reader_set_busy (two ~ns leaf FFI calls); the sacrifice path clears it before Isolate.exit. Bisection + 100/100 clean stress runs are in the experiment doc.
  2. Test hardening: the e2e IVM tests used fixed 60ms settles and flaked under full-suite load; now poll-until-count / quiet-window. 8/8 clean full-suite runs (~1-in-3 flaked before).
  3. JOURNAL: added the transferable lesson (a dead guard flag is a data race waiting for a traffic pattern).
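
The two waiting strategies from item 2 can be sketched as follows. This is a Python illustration of the pattern, not the Dart test helpers; the function names and the list-of-emissions interface are hypothetical, while the 15 s deadline and the two 50 ms quiet windows are taken from the test-hardening commit message.

```python
import time


def wait_for_count(emissions, expected, deadline_s=15.0, poll_s=0.005):
    """Poll until at least `expected` emissions arrived or the deadline passed.

    Returns the observed count so the caller can then assert it exactly.
    """
    end = time.monotonic() + deadline_s
    while len(emissions) < expected and time.monotonic() < end:
        time.sleep(poll_s)
    return len(emissions)


def wait_for_quiet(emissions, window_s=0.05, quiet_windows=2):
    """Block until `quiet_windows` consecutive windows pass with no new emission."""
    quiet, seen = 0, len(emissions)
    while quiet < quiet_windows:
        time.sleep(window_s)
        if len(emissions) == seen:
            quiet += 1
        else:
            quiet, seen = 0, len(emissions)  # activity resets the quiet streak
    return seen
```

Positive assertions use the first helper and then check the count exactly; must-not-emit assertions use the second so in-flight work drains before the check.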

Full suite 8×, analyze, and finalizer all green post-rebase.

danReynolds and others added 6 commits June 10, 2026 14:12
The writer preupdate hook now captures bounded per-row old/new values
(256 rows / 32 cols / 256 KB per cycle; overflow, OOM, or a savepoint
rollback poisons the cycle and the write falls back to plain re-query
invalidation). Deltas ride the writer reply as raw bytes and are
decoded lazily on the main isolate.

At stream registration the engine classifies the query against a
fail-closed tier-1 grammar (single table, AND-ed integer comparisons,
INTEGER PRIMARY KEY projected, ORDER BY pk or pk-equality pin) using
PRAGMA table_info. Admitted streams maintain their materialized result
from deltas: proven misses skip the reader pool entirely, in-window
changes patch clone-on-write and emit, and anything unprovable bails to
the existing re-query path. A hash sentinel after patched emissions
guarantees the next fallback re-query can never be suppressed against a
pre-patch baseline.
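
A minimal Python sketch of the tier-1 decision described above: a delta either is a proven miss (no work, no reader) or patches a copy of the cached rows. It assumes the cached rows are tuples with the INTEGER PRIMARY KEY in column 0, sorted by pk, and that `matches` is the compiled WHERE predicate; all names are hypothetical and the real engine is Dart.

```python
import bisect


def apply_delta(rows, matches, old, new):
    """Apply one row delta to a pk-ordered cached result.

    rows: tuple of row tuples sorted by pk (column 0).
    old/new: the row before/after the write (None for insert/delete).
    Returns (rows_after, emitted).
    """
    was_in = old is not None and matches(old)
    now_in = new is not None and matches(new)
    if not was_in and not now_in:
        return rows, False               # proven miss: cache untouched, no emit
    keys = [r[0] for r in rows]          # O(n) here; a real cache would keep an index
    patched = list(rows)                 # clone on write: `rows` is never mutated
    if was_in:
        i = bisect.bisect_left(keys, old[0])
        if i == len(keys) or keys[i] != old[0]:
            # Cache disagrees with the delta: bail to the re-query path.
            raise LookupError("cache inconsistent with delta")
        del keys[i]
        del patched[i]
    if now_in:
        patched.insert(bisect.bisect_left(keys, new[0]), new)
    result = tuple(patched)
    return result, result != rows
```

Because the previous tuple is never mutated, a listener still holding the last emission keeps seeing a consistent snapshot.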

Results: Tracelite stream gate in both collection orders — many-streams
-18.5% / +22.2% main-slower (CIs -122..-96 ms, +100..+113 ms), keyed-PK
-14.1% / +9.4% (CIs -60..-27 ms, +13..+37 ms), high-cardinality neutral
after order-flipped adjudication. A11c-overlap engagement: 24,500 of
25,000 invalidation decisions proven misses, 500 local patches, 0 bails,
zero reader re-queries; audit wall -29% overlap / -46% keyed-PK with
emissions 44 -> 500 (per-write patches no longer coalesce behind
re-query latency). 26 new equivalence tests; full suite green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Adds the app-shaped streaming workload the suite was missing:
benchmark/profile/ivm_admission_audit.dart runs eight reactive-UI
stream shapes (chat panes, conversation list, unread badge, user
cards, transcripts, feed page, drafts) over chat+feed schemas with a
chat-shaped write burst. Tier-1 admits 30/62 distinct entries and
resolves 3,000 invalidation decisions without reader re-query (0
bails), while the unadmitted DESC+LIMIT shapes — the highest-churn
screens — still generate ~3,100 reader replies: the quantified case
for tier-2 (composite-key ordering + LIMIT K+buffer).

Encapsulation: per-cycle delta decode moves out of the stream engine
into RowDeltaBatch (lazy, grouped by table, no cross-cycle state) —
the engine's only remaining IVM state is StreamEntry.ivm plus the
table_info admission cache. Adds ivm_admitted/rejected_total
classifier counters.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
resqlite_db_status_total skips readers marked in_use, but that flag has
been dead code since exp 030 moved workers to dedicated reader
assignment — Database.diagnostics() was calling sqlite3_db_status on
live NOMUTEX reader connections from the main isolate. SCHEMA_USED
measures memory via the connection's pnBytesFreed dry-run mechanism;
toggling it under a reader mid-query corrupts the reader's allocation
accounting (flaky SEGV in sqlite3VdbeDelete, ~1-in-30 stream_test runs
once exp 160's detached admission reads made readers reliably busy at
diagnostics-poll time).

Read workers now bracket each request with resqlite_reader_set_busy
(atomic store, two leaf FFI calls per request, ~ns), making the existing
busy guard real. The sacrifice path clears the bracket before
Isolate.exit since exit skips finally. Bisection: crash gone with
admission disabled (60/60); with the fix and admission enabled, 100/100
clean stress iterations.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The end-to-end tests used fixed 60ms settles, which flaked under full-
suite concurrency when an initial query or fallback re-query took longer
(observed: the first emission assertion seeing an empty list). Positive
emission-count assertions now poll until the expected count (15s
deadline) before asserting exactly; must-not-emit assertions wait for a
quiet window (two consecutive 50ms windows with no new emissions) so in-
flight work drains first. 8/8 clean full-suite runs after; ~1-in-3
flaked before.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The rebase onto main (which landed the diagnostics race fix via #156)
re-applied the function on top of main's copy.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@danReynolds danReynolds force-pushed the exp-160-stream-delta-ivm branch from 1b19ec2 to 124fbae Compare June 10, 2026 18:16
danReynolds and others added 4 commits June 10, 2026 15:57
…gates

Generalizes the registration-time classifier into three fail-closed
admission modes sharing one strict grammar (now with string literals,
aggregate calls, AS aliases, DESC, LIMIT/OFFSET, DISTINCT):

- Full maintenance gains composite ordering (ORDER BY intCol [DESC],
  pk [DESC] — explicit pk tiebreak required so tie order is exact) and
  LIMIT-K windows: a top-K cache with complete-set tracking; entries
  and in-window patches are O(delta), departures from a full incomplete
  window fall back (the replacement row is unknown), boundary-crossing
  moves fall back. TEXT equality predicates are admitted when the
  table's CREATE statement (from sqlite_master, cached per table)
  contains no COLLATE clause, making BINARY semantics provable; the
  delta decoder's strict UTF-8 already rejects malformed text upstream.

- Tier 1.5 skip-only: shapes whose results cannot be maintained
  (DESC without tiebreak, OFFSET, DISTINCT, unprojected keys) but whose
  WHERE is an evaluable conjunction get proven-miss elision with no
  cached state at all; any hit or unprovable cell re-queries.

- Tier 3 aggregates: COUNT(*)/COUNT/SUM/MIN/MAX/AVG(col) AS alias over
  an evaluable (possibly empty) predicate, seeded exactly by a one-time
  snapshot query at admission and maintained per delta (NULL cells
  follow SQL semantics; integer sums exact; AVG derived as sum/count; a
  departing MIN/MAX extremum bails and re-seeds asynchronously).
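
The aggregate rules in the last bullet can be illustrated with a small state object. This is a Python sketch under stated assumptions, not the Dart state machine: it tracks one integer column, shows MIN only (MAX is symmetric), and the `AggState` name and its methods are hypothetical.

```python
class AggState:
    """Maintains COUNT(*), COUNT(col), SUM(col), MIN(col), AVG(col) for one predicate."""

    def __init__(self, seed_values):
        # seed_values: the column value of every matching row (None = NULL),
        # standing in for the one-time snapshot query at admission.
        non_null = [v for v in seed_values if v is not None]
        self.count_star = len(seed_values)
        self.count = len(non_null)
        self.total = sum(non_null)           # Python ints: exact, no overflow
        self.minimum = min(non_null) if non_null else None
        self.needs_reseed = False

    def enter(self, value):
        """A row started matching the predicate."""
        self.count_star += 1
        if value is None:                    # NULL counts only toward COUNT(*)
            return
        self.count += 1
        self.total += value
        if self.minimum is None or value < self.minimum:
            self.minimum = value

    def leave(self, value):
        """A row stopped matching the predicate."""
        self.count_star -= 1
        if value is None:
            return
        self.count -= 1
        self.total -= value
        if value == self.minimum:
            self.needs_reseed = True         # the next-smallest value is unknown

    def snapshot(self):
        if self.needs_reseed:
            return None                      # caller bails and re-seeds
        empty = self.count == 0
        return {
            "count_star": self.count_star,
            "count": self.count,
            "sum": None if empty else self.total,   # SUM over no values is NULL
            "min": self.minimum,
            "avg": None if empty else self.total / self.count,
        }
```

The sketch bails even when another row shares the minimum value; that is conservative rather than wrong, which is the direction a fail-closed design should err in.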

Engine: sealed IvmState dispatch, per-table meta cache (table_info +
CREATE sql), windowed emissions compare by row identity so invisible
tail changes don't emit, aggregate re-seed scheduling after bails.
New counters: ivm_admitted_skip/agg_total, ivm_hit_fallback_total.

38 unit tests across classifier modes and all three state machines;
full suite 312/312.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Two defects found by the randomized equivalence harness under dart test
load, both invisible to deterministic runs:

1. Reader-built IVM baselines race late writer replies. An aggregate
   snapshot (or full-cache build) executed on a reader can observe a
   commit whose delta is then applied on top of it — cross-port event
   ordering gives no guarantee a reader result is processed before the
   writer reply that produced what it read. All IVM state is now built
   through writer-ordered reads (Database wires a writer.locked
   selectInTransaction hook into the engine): the writer port is FIFO
   with the writes themselves, so a snapshot's position totally orders
   it against every delta.

2. A maintained state only survives an unbroken chain of processed
   cycles. A delta-bearing write routed to the re-query fallback
   (dirty/in-flight guard, malformed or absent deltas, capture
   overflow) leaves the state's baseline permanently stale — and a
   hash-suppressed re-query validates emissions without re-syncing it
   (ledger-captured: seed at 2784, insert swallowed by an
   unchanged-hash re-query, every later apply walking the -1 forward).
   The engine now drops maintained state whenever a cycle bypasses it;
   the writer-ordered rebuild restores an exact baseline. Known trade:
   churn cycles (e.g. overflow batches) trigger rebuild storms — steady
   state is untouched.
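
The unbroken-chain invariant from defect 2 can be shown with a maintained COUNT. This Python sketch is illustrative only: `MaintainedCount` is a hypothetical name, the rebuild callback stands in for a writer-ordered snapshot read, and rebuilding lazily on the next delta-bearing cycle is an assumption about timing that the commit message does not specify.

```python
class MaintainedCount:
    """A maintained COUNT(*) that survives only an unbroken chain of cycles."""

    def __init__(self, writer_ordered_count):
        # Stand-in for a snapshot read issued on the writer port, so its
        # result is ordered after every write that preceded it.
        self._rebuild = writer_ordered_count
        self.value = None

    def on_cycle(self, deltas):
        """deltas: list of +1/-1 changes, or None when the cycle bypassed IVM
        (capture overflow, malformed or absent deltas, dirty/in-flight guard)."""
        if deltas is None:
            self.value = None            # chain broken: the baseline is now stale
            return "requery"
        if self.value is None:
            # The snapshot already reflects this cycle's write, so its deltas
            # must not be applied on top of it.
            self.value = self._rebuild()
            return "rebuilt"
        self.value += sum(deltas)
        return "patched"
```

Keeping the old value across a bypassed cycle would reproduce the recorded failure: the missed insert leaves the count one short, and every later patch carries that error forward.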

Hot path: predicate conjunctions compile to flattened primitive arrays
(IvmPredicateProgram — no string switches or Object re-checks per
delta), and full states keep a pk set for O(1) presence checks on the
proven-miss path, which is the single hottest IVM operation.
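
As a rough illustration of compiling a conjunction into flat primitive arrays, here is a Python sketch. It is not IvmPredicateProgram: the opcode numbering, the function names, and the choice to treat any non-integer cell as unprovable are assumptions made for the example.

```python
from array import array

OPS = {"=": 0, "!=": 1, "<": 2, "<=": 3, ">": 4, ">=": 5}


def compile_conjunction(terms, column_index):
    """terms: [(column_name, op, int_literal)], implicitly AND-ed.

    Returns three parallel arrays so evaluation touches no strings or objects.
    """
    cols, ops, lits = array("i"), array("b"), array("q")
    for name, op, literal in terms:
        cols.append(column_index[name])
        ops.append(OPS[op])
        lits.append(literal)             # array("q") rejects values outside int64
    return cols, ops, lits


def evaluate(program, row):
    """Return True or False, or None when a cell makes the answer unprovable."""
    cols, ops, lits = program
    for i in range(len(cols)):
        cell = row[cols[i]]
        if type(cell) is not int:
            return None                  # NULL or non-integer: caller falls back
        op, lit = ops[i], lits[i]
        if op == 0:
            ok = cell == lit
        elif op == 1:
            ok = cell != lit
        elif op == 2:
            ok = cell < lit
        elif op == 3:
            ok = cell <= lit
        elif op == 4:
            ok = cell > lit
        else:
            ok = cell >= lit
        if not ok:
            return False
    return True
```

The three-valued result matters: `None` is not a miss, so a caller that sees it must re-query instead of skipping.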

Equivalence harness: 20/20 clean full-loop runs post-fix (~1-in-5
diverged before); full suite green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
many-streams-writer-throughput clears the Tracelite primary gate with a
formal `improved` verdict (-18.2%, 546 -> 447 ms, p < 0.001) — the
first cleared gate in the stream-rerun-dispatch direction's history —
and reproduces order-flipped (+19.5% main-slower, CI +83..+94 ms).
High-cardinality carries a real ~4% cost in both orders: per-write
main-isolate predicate evaluation is O(admitted streams on the table),
replacing reader-distributed hash suppression; the identified v3 fix is
a per-table equality-predicate index for O(1) delta dispatch. Keyed-PK
is honestly neutral for the v2 stack (sign flips across passes).
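
The v3 idea is only specified, not built, so the following Python sketch is purely a guess at its shape: streams pinned by an equality term are bucketed by (column, value), and a delta looks up only the buckets its old and new cells fall into. `EqualityIndex` and its methods are hypothetical names.

```python
from collections import defaultdict


class EqualityIndex:
    """Per-table index from an equality term to the streams that pin it."""

    def __init__(self):
        self._by_value = defaultdict(set)   # (column, value) -> stream ids
        self._columns = set()               # columns that appear in some term
        self._residual = set()              # streams with no equality term

    def register(self, stream_id, equality=None):
        if equality is None:
            self._residual.add(stream_id)
            return
        column, value = equality
        self._columns.add(column)
        self._by_value[(column, value)].add(stream_id)

    def candidates(self, old, new):
        """Streams a delta could affect. old/new map column -> value, or are None."""
        hit = set(self._residual)
        for row in (old, new):
            if row is None:
                continue
            for column in self._columns:
                # .get avoids creating empty buckets for unseen values.
                hit |= self._by_value.get((column, row.get(column)), set())
        return hit
```

With 100 streams each pinned to a different value of one column, a delta touches at most two buckets plus the residual set, so the per-write cost no longer scales with the number of admitted streams.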

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@danReynolds

v2 tier stack pushed (per the goal to take this as far as it goes — all isolated on this branch). Summary of what's now here beyond tier 1:

Tiers — one fail-closed grammar, three admission modes:

  • Full maintenance extended with composite ORDER BY intCol [DESC], pk [DESC] (explicit tiebreak ⇒ exact tie order) and LIMIT K windows (top-K cache, complete-set tracking, departures-from-full-windows fall back); TEXT equality under introspected no-COLLATE tables.
  • Skip-only (tier 1.5): proven-miss elision for unmaintainable-result shapes (DESC w/o tiebreak, OFFSET, DISTINCT…) — no cached state at all.
  • Aggregates (tier 3): COUNT/SUM/MIN/MAX/AVG … AS alias, exactly seeded and maintained; extremum departures re-seed.
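
The LIMIT K window rules in the first bullet can be sketched like this. It is a simplified Python model, not the Dart state machine: rows are reduced to (sort_key, pk) tuples in ascending order, only enter and leave are modeled (updates that move a row across the boundary are left out), and `TopK` and `Fallback` are hypothetical names.

```python
import bisect


class Fallback(Exception):
    """The window cannot be patched exactly and must be rebuilt."""


class TopK:
    def __init__(self, k, rows, complete):
        assert k >= 1 and (complete or len(rows) == k)
        self.k = k
        self.rows = sorted(rows)        # (sort_key, pk): the pk breaks ties exactly
        self.complete = complete        # True when rows is the whole matching set

    def enter(self, row):
        """A row started matching. Returns True when the visible window changed."""
        if not self.complete and row > self.rows[-1]:
            return False                # sorts past a full window: invisible
        before = tuple(self.rows)
        bisect.insort(self.rows, row)
        if len(self.rows) > self.k:
            self.rows.pop()             # evicted row still matches, just unseen
            self.complete = False
        return tuple(self.rows) != before

    def leave(self, row):
        """A row stopped matching. Returns True when the visible window changed."""
        i = bisect.bisect_left(self.rows, row)
        present = i < len(self.rows) and self.rows[i] == row
        if not present:
            if self.complete:
                raise Fallback("matching row missing from a complete cache")
            return False                # it was beyond the window
        if not self.complete:
            raise Fallback("replacement row is unknown")
        del self.rows[i]
        return True
```

Complete-set tracking is what lets small result sets shrink without a re-query: while the cache is known to hold every matching row, a departure needs no replacement.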

Headline results:

| Evidence | Result |
| --- | --- |
| Tracelite gate | many-streams −18.2%, formal improved verdict; first cleared primary gate in this direction's history; reproduced order-flipped (+19.5% main-slower) |
| App-shaped admission audit | 7/9 shapes admitted (3/8 at tier 1); 7,643 invalidation decisions with zero reader re-queries, 0 bails; burst wall 132.9 ms vs main's ~200 ms while delivering per-write patches |
| Honest costs | high-cardinality +3.5–4.8% both orders (O(admitted-streams) main-isolate eval at 100-streams-one-table density; v3 fix identified: per-table equality-predicate index for O(1) dispatch); keyed-PK neutral for v2 (tier-1 alone won it) |

Two ordering defects found and fixed by the new randomized equivalence harness (3 seeds × 12 rounds × 9 streams, every emission vs fresh select; both fired ~1-in-5 under load, never in deterministic replays): reader-built IVM baselines race late writer replies → all state now builds through writer-ordered reads; and maintained state survives only an unbroken chain of processed cycles → any bypassed delta-bearing cycle drops the state (a hash-suppressed re-query validates emissions without re-syncing state). Both are recorded as a JOURNAL lesson (cross-port replies have no happens-before).
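
The core loop of such a harness (seeded random writes, every maintained result compared against a fresh select) can be sketched with Python's built-in sqlite3 module. This is a single-threaded illustration, so it cannot reproduce the cross-port races; the table, the query, and `run_storm` are made up for the example, and reading the old row before each write stands in for the preupdate hook.

```python
import random
import sqlite3


def run_storm(seed, rounds=200):
    """Return the first step at which the maintained result diverged, else None."""
    rng = random.Random(seed)            # seeded, so any failure replays exactly
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE msg (id INTEGER PRIMARY KEY, room INTEGER, seq INTEGER)")
    query = "SELECT id, seq FROM msg WHERE room = 1 ORDER BY id"
    maintained = {}                      # pk -> seq for rows matching room = 1
    for step in range(rounds):
        pk = rng.randint(1, 20)
        old = db.execute("SELECT room, seq FROM msg WHERE id = ?", (pk,)).fetchone()
        if rng.random() < 0.3:
            db.execute("DELETE FROM msg WHERE id = ?", (pk,))
            new = None
        else:
            new = (rng.choice([1, 2, None]), rng.randint(0, 9))   # NULLs included
            db.execute("INSERT OR REPLACE INTO msg (id, room, seq) VALUES (?, ?, ?)",
                       (pk, new[0], new[1]))
        # Maintain the result from the delta alone.
        if old is not None and old[0] == 1:
            del maintained[pk]
        if new is not None and new[0] == 1:
            maintained[pk] = new[1]
        # Oracle: a fresh select after every write.
        if sorted(maintained.items()) != db.execute(query).fetchall():
            return step
    return None
```

Returning the failing step together with the seed gives a deterministic reproduction for the single-threaded parts of the logic; the load-dependent races described above still need the real multi-isolate harness.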

315/315 tests, 20/20 equivalence-loop runs, analyze + finalizer green.

@danReynolds danReynolds changed the title from "Exp 160: tier-1 incremental stream maintenance (row deltas)" to "Exp 160: incremental view maintenance — the v2 stream engine" on Jun 11, 2026