Exp 160: incremental view maintenance — the v2 stream engine#155
danReynolds wants to merge 10 commits into
Added the admission-rate benchmarking plus an encapsulation pass (see latest commit): tier-1 admits 30/62 distinct entries on an app-shaped chat+feed stream mix and eliminates 3,000 re-queries in the burst with 0 bails. The rejected set, however, holds the highest-churn DESC+LIMIT screens (~3,100 remaining reader replies), which is the quantified case for tier-2 (composite-key ordering + LIMIT K+buffer). Per-cycle delta decode also moved out of the engine into
0d614af to 1b19ec2
Pushed three updates (branch rebased onto main with #153 merged — conflicts in the experiments metadata resolved by union):
Full suite 8×, analyze, and finalizer all green post-rebase.
The writer preupdate hook now captures bounded per-row old/new values (256 rows / 32 cols / 256 KB per cycle; overflow, OOM, or a savepoint rollback poisons the cycle and the write falls back to plain re-query invalidation). Deltas ride the writer reply as raw bytes and are decoded lazily on the main isolate.

At stream registration the engine classifies the query against a fail-closed tier-1 grammar (single table, AND-ed integer comparisons, INTEGER PRIMARY KEY projected, ORDER BY pk or pk-equality pin) using PRAGMA table_info. Admitted streams maintain their materialized result from deltas: proven misses skip the reader pool entirely, in-window changes patch clone-on-write and emit, and anything unprovable bails to the existing re-query path. A hash sentinel after patched emissions guarantees the next fallback re-query can never be suppressed against a pre-patch baseline.

Results (Tracelite stream gate, both collection orders): many-streams -18.5% / +22.2% main-slower (CIs -122..-96 ms, +100..+113 ms), keyed-PK -14.1% / +9.4% (CIs -60..-27 ms, +13..+37 ms), high-cardinality neutral after order-flipped adjudication. A11c-overlap engagement: 24,500 of 25,000 invalidation decisions proven misses, 500 local patches, 0 bails, zero reader re-queries; audit wall -29% overlap / -46% keyed-PK with emissions 44 -> 500 (per-write patches no longer coalesce behind re-query latency).

26 new equivalence tests; full suite green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
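The three per-delta outcomes named in this commit (proven miss, patch, bail) can be illustrated with a small sketch. This is not the PR's code: `CmpOp`, `IntCmp`, `RowDelta`, `DeltaOutcome`, `matchesAll`, and `classifyDelta` are hypothetical names, and rows are modeled as plain maps from column name to value, which is an assumption made only for readability.

```dart
/// Hypothetical operators for one term of an AND-ed integer predicate.
enum CmpOp { eq, lt, le, gt, ge }

/// One `column <op> literal` term (illustrative shape only).
class IntCmp {
  final String column;
  final CmpOp op;
  final int literal;
  const IntCmp(this.column, this.op, this.literal);

  /// true/false when provable; null when the cell is absent or not an int.
  bool? eval(Map<String, Object?> row) {
    if (!row.containsKey(column)) return null;
    final v = row[column];
    if (v == null) return false; // SQL: a NULL comparison is never true
    if (v is! int) return null; // unprovable, caller must fall back
    switch (op) {
      case CmpOp.eq:
        return v == literal;
      case CmpOp.lt:
        return v < literal;
      case CmpOp.le:
        return v <= literal;
      case CmpOp.gt:
        return v > literal;
      case CmpOp.ge:
        return v >= literal;
    }
  }
}

/// Old/new images of one changed row; a null side means INSERT or DELETE.
class RowDelta {
  final Map<String, Object?>? oldRow;
  final Map<String, Object?>? newRow;
  const RowDelta({this.oldRow, this.newRow});
}

enum DeltaOutcome { provenMiss, patch, bail }

/// Evaluates the conjunction; null means "could not be proven either way".
bool? matchesAll(List<IntCmp> terms, Map<String, Object?>? row) {
  if (row == null) return false; // an absent side cannot be in the result
  var unknown = false;
  for (final t in terms) {
    final r = t.eval(row);
    if (r == false) return false; // one false term settles an AND
    if (r == null) unknown = true;
  }
  return unknown ? null : true;
}

/// Fail-closed: any unprovable side bails to the re-query path.
DeltaOutcome classifyDelta(List<IntCmp> terms, RowDelta d) {
  final before = matchesAll(terms, d.oldRow);
  final after = matchesAll(terms, d.newRow);
  if (before == null || after == null) return DeltaOutcome.bail;
  if (!before && !after) return DeltaOutcome.provenMiss;
  return DeltaOutcome.patch;
}
```

A row that matches the predicate neither before nor after the write cannot affect the stream, which is what lets the proven-miss case skip the reader pool. The sketch leaves out ordering and window checks, which the real engine also has to pass before patching.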
Adds the app-shaped streaming workload the suite was missing: benchmark/profile/ivm_admission_audit.dart runs eight reactive-UI stream shapes (chat panes, conversation list, unread badge, user cards, transcripts, feed page, drafts) over chat+feed schemas with a chat-shaped write burst.

Tier-1 admits 30/62 distinct entries and resolves 3,000 invalidation decisions without reader re-query (0 bails), while the unadmitted DESC+LIMIT shapes (the highest-churn screens) still generate ~3,100 reader replies: the quantified case for tier-2 (composite-key ordering + LIMIT K+buffer).

Encapsulation: per-cycle delta decode moves out of the stream engine into RowDeltaBatch (lazy, grouped by table, no cross-cycle state); the engine's only remaining IVM state is StreamEntry.ivm plus the table_info admission cache. Adds ivm_admitted/rejected_total classifier counters.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
resqlite_db_status_total skips readers marked in_use, but that flag has been dead code since exp 030 moved workers to dedicated reader assignment: Database.diagnostics() was calling sqlite3_db_status on live NOMUTEX reader connections from the main isolate. SCHEMA_USED measures memory via the connection's pnBytesFreed dry-run mechanism; toggling it under a reader mid-query corrupts the reader's allocation accounting (flaky SEGV in sqlite3VdbeDelete, ~1-in-30 stream_test runs once exp 160's detached admission reads made readers reliably busy at diagnostics-poll time).

Read workers now bracket each request with resqlite_reader_set_busy (atomic store, two leaf FFI calls per request, ~ns), making the existing busy guard real. The sacrifice path clears the bracket before Isolate.exit since exit skips finally.

Bisection: crash gone with admission disabled (60/60); with the fix and admission enabled, 100/100 clean stress iterations.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
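The request bracket described in this commit can be pictured as a try/finally wrapper. The sketch below is only an illustration: `withBusyBracket` is a hypothetical helper, and the `setBusy` callback stands in for the native `resqlite_reader_set_busy` call, whose real signature is not shown in this conversation.

```dart
/// Runs [body] with the busy flag raised and lowers it again on both the
/// normal and the throwing exit. [setBusy] is a stand-in for the native
/// setter; its shape here is an assumption made for the sketch.
T withBusyBracket<T>(void Function(bool busy) setBusy, T Function() body) {
  setBusy(true);
  try {
    return body();
  } finally {
    setBusy(false);
  }
}
```

As the commit message notes, a path that ends in `Isolate.exit` never reaches the `finally` clause, so that path has to lower the flag explicitly before exiting.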
The end-to-end tests used fixed 60ms settles, which flaked under full-suite concurrency when an initial query or fallback re-query took longer (observed: the first emission assertion seeing an empty list). Positive emission-count assertions now poll until the expected count (15s deadline) before asserting exactly; must-not-emit assertions wait for a quiet window (two consecutive 50ms windows with no new emissions) so in-flight work drains first.

8/8 clean full-suite runs after; ~1-in-3 flaked before.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
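The two waiting strategies described here could look roughly like the helpers below. `waitForCount` and `waitForQuiet` are hypothetical names, and the sketch assumes the test collects emissions into a plain list; the PR's actual test helpers may be shaped differently.

```dart
import 'dart:async';

/// Polls until [emissions] holds at least [expected] items, or throws a
/// TimeoutException once [deadline] has elapsed.
Future<void> waitForCount(
  List<Object?> emissions,
  int expected, {
  Duration deadline = const Duration(seconds: 15),
  Duration step = const Duration(milliseconds: 5),
}) async {
  final clock = Stopwatch()..start();
  while (emissions.length < expected) {
    if (clock.elapsed > deadline) {
      throw TimeoutException(
          'saw ${emissions.length} of $expected emissions', deadline);
    }
    await Future<void>.delayed(step);
  }
}

/// Returns after [quietWindows] consecutive windows in which the length of
/// [emissions] did not change, so in-flight work has had a chance to drain.
Future<void> waitForQuiet(
  List<Object?> emissions, {
  Duration window = const Duration(milliseconds: 50),
  int quietWindows = 2,
}) async {
  var quiet = 0;
  var last = emissions.length;
  while (quiet < quietWindows) {
    await Future<void>.delayed(window);
    if (emissions.length == last) {
      quiet++;
    } else {
      quiet = 0; // something arrived, restart the quiet count
      last = emissions.length;
    }
  }
}
```

A positive assertion would call `waitForCount` and then assert the exact length; a must-not-emit assertion would call `waitForQuiet` and then assert that the length is unchanged.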
The rebase onto main (which landed the diagnostics race fix via #156) re-applied the function on top of main's copy. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
1b19ec2 to 124fbae
…gates

Generalizes the registration-time classifier into three fail-closed admission modes sharing one strict grammar (now with string literals, aggregate calls, AS aliases, DESC, LIMIT/OFFSET, DISTINCT):

- Full maintenance gains composite ordering (ORDER BY intCol [DESC], pk [DESC]; explicit pk tiebreak required so tie order is exact) and LIMIT-K windows: a top-K cache with complete-set tracking; entries and in-window patches are O(delta), departures from a full incomplete window fall back (the replacement row is unknown), boundary-crossing moves fall back. TEXT equality predicates are admitted when the table's CREATE statement (from sqlite_master, cached per table) contains no COLLATE clause, making BINARY semantics provable; the delta decoder's strict UTF-8 already rejects malformed text upstream.
- Tier 1.5 skip-only: shapes whose results cannot be maintained (DESC without tiebreak, OFFSET, DISTINCT, unprojected keys) but whose WHERE is an evaluable conjunction get proven-miss elision with no cached state at all; any hit or unprovable cell re-queries.
- Tier 3 aggregates: COUNT(*)/COUNT/SUM/MIN/MAX/AVG(col) AS alias over an evaluable (possibly empty) predicate, seeded exactly by a one-time snapshot query at admission and maintained per delta (NULL cells follow SQL semantics; integer sums exact; AVG derived as sum/count; a departing MIN/MAX extremum bails and re-seeds asynchronously).

Engine: sealed IvmState dispatch, per-table meta cache (table_info + CREATE sql), windowed emissions compare by row identity so invisible tail changes don't emit, aggregate re-seed scheduling after bails. New counters: ivm_admitted_skip/agg_total, ivm_hit_fallback_total.

38 unit tests across classifier modes and all three state machines; full suite 312/312.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
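The Tier 3 rules in this commit (NULL cells follow SQL semantics, AVG derived as sum/count, a departing MIN extremum bails) can be sketched for a single integer column. `AggState` and its members are hypothetical names chosen for this illustration, and the seed values are assumed to come from the one-time snapshot query the commit describes.

```dart
/// Illustrative maintained state for COUNT(*), COUNT(col), SUM(col),
/// MIN(col) and AVG(col) over one integer column.
class AggState {
  int rows; // COUNT(*)
  int nonNull; // COUNT(col): NULL cells are not counted
  int sum; // running integer sum over non-NULL cells
  int? min; // MIN(col); null while no non-NULL cell exists

  AggState({
    required this.rows,
    required this.nonNull,
    required this.sum,
    this.min,
  });

  /// SUM and AVG are NULL in SQL when no non-NULL cell contributes.
  int? get sumOrNull => nonNull == 0 ? null : sum;
  double? get avg => nonNull == 0 ? null : sum / nonNull;

  /// Applies a row entering the predicate.
  void addRow(int? cell) {
    rows++;
    if (cell == null) return;
    nonNull++;
    sum += cell;
    if (min == null || cell < min!) min = cell;
  }

  /// Applies a row leaving the predicate. Returns false when the state must
  /// be dropped and re-seeded: if the departing cell equals the current
  /// minimum, the next-smallest value is not known from the delta alone.
  bool removeRow(int? cell) {
    rows--;
    if (cell == null) return true;
    nonNull--;
    sum -= cell;
    return cell != min;
  }
}
```

An update can be modeled as `removeRow(old)` followed by `addRow(new)`. COUNT and SUM stay exact under any sequence of deltas, while MIN (and symmetrically MAX) is the only part that can force a re-seed.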
Two defects found by the randomized equivalence harness under dart test load, both invisible to deterministic runs:

1. Reader-built IVM baselines race late writer replies. An aggregate snapshot (or full-cache build) executed on a reader can observe a commit whose delta is then applied on top of it: cross-port event ordering gives no guarantee a reader result is processed before the writer reply that produced what it read. All IVM state is now built through writer-ordered reads (Database wires a writer.locked selectInTransaction hook into the engine): the writer port is FIFO with the writes themselves, so a snapshot's position totally orders it against every delta.
2. A maintained state only survives an unbroken chain of processed cycles. A delta-bearing write routed to the re-query fallback (dirty/in-flight guard, malformed or absent deltas, capture overflow) leaves the state's baseline permanently stale, and a hash-suppressed re-query validates emissions without re-syncing it (ledger-captured: seed at 2784, insert swallowed by an unchanged-hash re-query, every later apply walking the -1 forward). The engine now drops maintained state whenever a cycle bypasses it; the writer-ordered rebuild restores an exact baseline. Known trade: churn cycles (e.g. overflow batches) trigger rebuild storms; steady state is untouched.

Hot path: predicate conjunctions compile to flattened primitive arrays (IvmPredicateProgram: no string switches or Object re-checks per delta), and full states keep a pk set for O(1) presence checks on the proven-miss path, which is the single hottest IVM operation.

Equivalence harness: 20/20 clean full-loop runs post-fix (~1-in-5 diverged before); full suite green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
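The hot-path idea of compiling a conjunction into flattened primitive arrays can be sketched as below. This is not `IvmPredicateProgram` itself: `Term`, `FlatPredicate`, the opcode constants, and the positional row layout are all assumptions made for the illustration.

```dart
import 'dart:typed_data';

// Hypothetical opcodes for the sketch.
const int opEq = 0, opLt = 1, opGt = 2;

/// One `row[col] <op> literal` term before compilation.
class Term {
  final int col;
  final int op;
  final int literal;
  const Term(this.col, this.op, this.literal);
}

/// A conjunction stored as three parallel typed arrays, so evaluating a
/// delta touches no strings and allocates nothing.
class FlatPredicate {
  final Int32List cols;
  final Int32List ops;
  final Int64List lits;

  FlatPredicate(List<Term> terms)
      : cols = Int32List.fromList([for (final t in terms) t.col]),
        ops = Int32List.fromList([for (final t in terms) t.op]),
        lits = Int64List.fromList([for (final t in terms) t.literal]);

  /// Returns 1 for a match, 0 for a proven miss, -1 when unprovable.
  int eval(List<Object?> row) {
    var result = 1;
    for (var i = 0; i < cols.length; i++) {
      final v = row[cols[i]];
      if (v == null) return 0; // a NULL term can never make the AND true
      if (v is! int) {
        result = -1; // unknown so far; a later false term still wins
        continue;
      }
      final lit = lits[i];
      final op = ops[i];
      final ok = op == opEq
          ? v == lit
          : op == opLt
              ? v < lit
              : v > lit;
      if (!ok) return 0;
    }
    return result;
  }
}
```

The three-valued result keeps the fail-closed property: only a definite 0 lets the engine skip work, and -1 routes the cycle to the re-query path.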
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
many-streams-writer-throughput clears the Tracelite primary gate with a formal `improved` verdict (-18.2%, 546 -> 447 ms, p < 0.001) — the first cleared gate in the stream-rerun-dispatch direction's history — and reproduces order-flipped (+19.5% main-slower, CI +83..+94 ms). High-cardinality carries a real ~4% cost in both orders: per-write main-isolate predicate evaluation is O(admitted streams on the table), replacing reader-distributed hash suppression; the identified v3 fix is a per-table equality-predicate index for O(1) delta dispatch. Keyed-PK is honestly neutral for the v2 stack (sign flips across passes). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
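The per-table equality-predicate index named above as the v3 fix is not built in this PR. One possible shape, offered purely as a sketch, is a map from pinned value to stream ids: `EqDispatchIndex`, `register`, and `candidates` are hypothetical names, and the sketch assumes each admitted stream pins at most one integer column with an equality term.

```dart
/// Illustrative dispatch index for one table and one pinned column.
class EqDispatchIndex {
  final Map<int, Set<int>> _byValue = {}; // pinned value -> stream ids
  final Set<int> _unindexed = {}; // streams with no equality pin

  /// Registers a stream; [pinnedValue] is the literal of its `col = literal`
  /// term, or null when the stream has no such term.
  void register(int streamId, {int? pinnedValue}) {
    if (pinnedValue == null) {
      _unindexed.add(streamId);
    } else {
      _byValue.putIfAbsent(pinnedValue, () => <int>{}).add(streamId);
    }
  }

  /// Streams that could be affected by a delta whose pinned column changed
  /// from [oldValue] to [newValue] (null for a missing side or NULL cell).
  Set<int> candidates(int? oldValue, int? newValue) {
    final out = <int>{..._unindexed};
    if (oldValue != null) out.addAll(_byValue[oldValue] ?? const <int>{});
    if (newValue != null) out.addAll(_byValue[newValue] ?? const <int>{});
    return out;
  }
}
```

With such an index, per-write work would scale with the number of streams pinned to the touched values plus the unindexed ones, rather than with every admitted stream on the table.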
v2 tier stack pushed (per the goal to take this as far as it goes; all isolated on this branch). Summary of what's now here beyond tier 1:
Tiers (one fail-closed grammar, three admission modes):
Headline results:
Two ordering defects found and fixed by the new randomized equivalence harness (3 seeds × 12 rounds × 9 streams, every emission vs fresh select; both fired ~1-in-5 under load, never in deterministic replays):
- Reader-built IVM baselines race late writer replies → all state now builds through writer-ordered reads.
- Maintained state survives only an unbroken chain of processed cycles → any bypassed delta-bearing cycle drops the state (a hash-suppressed re-query validates emissions without re-syncing state).
Both are recorded as a JOURNAL lesson (cross-port replies have no happens-before). 315/315 tests, 20/20 equivalence-loop runs, analyze + finalizer green.
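The core check of an equivalence harness (compare every maintained value against a fresh recomputation under seeded random writes) can be shown in miniature. The sketch below is a single-threaded toy, so it cannot reproduce the cross-port races described above; `runEquivalence` is a hypothetical function, and the in-memory map stands in for a table queried with `SELECT COUNT(*) ... WHERE value > 50`.

```dart
import 'dart:math';

/// Applies [rounds] seeded random writes to a toy table and maintains
/// COUNT(rows with value > 50) incrementally. Returns the first round at
/// which the maintained count diverges from a fresh recount, or -1.
int runEquivalence(int seed, int rounds) {
  final rng = Random(seed);
  final table = <int, int>{}; // pk -> value
  var maintained = 0;

  for (var r = 0; r < rounds; r++) {
    final pk = rng.nextInt(20);
    final old = table[pk];
    if (rng.nextInt(4) == 0) {
      // DELETE (a no-op when the pk is absent)
      table.remove(pk);
      if (old != null && old > 50) maintained--;
    } else {
      // INSERT or UPDATE
      final value = rng.nextInt(100);
      table[pk] = value;
      if (old != null && old > 50) maintained--;
      if (value > 50) maintained++;
    }
    // The "fresh select" the maintained value is checked against.
    final fresh = table.values.where((v) => v > 50).length;
    if (fresh != maintained) return r;
  }
  return -1;
}
```

Seeding the generator makes any divergence replayable by seed and round number, which is what makes a randomized harness usable as a regression test.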
The architecture
Today's engine answers every invalidation by re-executing the query (hash suppression saves decode/emission, never execution). This PR adds the machinery for the engine to instead understand queries and maintain their results:
New durable infrastructure (each usable beyond this PR)
- (native/resqlite.c, lib/src/row_deltas.dart): the preupdate hook captures per-row old/new values (bounded: 256 rows / 32 cols / 256 KB per cycle; poisoned by savepoint rollback or overflow), ships them in writer replies, decoded lazily. This is a CDC-grade primitive: diff-emitting streams, sync engines, and undo journals can all consume it later without touching native code again.
- (lib/src/stream_ivm.dart): a strict registration-time grammar (bare columns, aggregates with aliases, AND-ed comparisons, composite ORDER BY with pk tiebreak, LIMIT) producing sealed admission shapes. Fail-closed by construction: anything unparsed or unprovable costs a re-query, never correctness.
- (test/stream_ivm_equivalence_test.dart): seeded write storms (rowid changes, NULLs, savepoint rollbacks, overflow batches) across every admission mode, every emission checked against a fresh select. It caught two ~1-in-5 ordering races deterministic replays never fired; it is the permanent safety net for all future IVM work.
Architectural properties
- db.stream() is untouched; admission is an internal execution strategy.
Evidence (full record: experiments/160-stream-delta-ivm.md)
- improved verdict: the first cleared gate in this direction's history; reproduced order-flipped (+19.5% main-slower)
What this is not (yet): the mapped ladder
Tier 4 one-hop joins, the equality-predicate dispatch index, rebuild-storm coalescing, and diff-carrying emissions are specified in signals.json with evidence trails, deliberately not built here.
Test plan
- dart analyze lib test benchmark clean; finalizer green
🤖 Generated with Claude Code