perf(datastore): enable WAL journal mode#616
Conversation
In delete mode every batch commit paid two fsyncs plus rollback-journal file create/delete churn; in WAL a commit is a single sequential WAL append+fsync. It also stops commits from blocking future reader connections, which is groundwork for serving reads in parallel with the writer. synchronous=FULL is set explicitly so each commit is durable the moment it returns, same as the delete-mode default. The overall durability window is unchanged: the worker already batches events into one transaction committed every 15 s / 100 events / bucket change, and that policy is untouched. In-memory datastores (tests, --no-db) ignore the pragma and keep journal_mode=memory. Verified on a 98 MB production snapshot: WAL active, clean checkpoint on shutdown, integrity ok, heartbeat throughput +4% (5572 -> 5815/s), reads unchanged.
The worker acked ForceCommit and Close inside the request loop and only committed the batch transaction after breaking out of it, so callers were told their data was committed before it was. Under the rollback journal this was masked by file locking, but with WAL readers never block on the writer: a caller could force_commit, reopen the database, and read a pre-commit snapshot. This is exactly how test_datastore_reload failed on the ubuntu CI runner (bucket metadata cache populated from a snapshot taken before the forced commit landed). Hold the ack for these two commands until tx.commit() returns, and propagate a commit failure to the caller as an error instead of a silent success. Heartbeats and other commands are still acked immediately, since they must not wait for the batch commit.
Greptile SummaryThis PR enables SQLite WAL journal mode on every file-based connection and fixes a pre-existing race where
Confidence Score: 5/5Safe to merge. The deferred-ack change eliminates a commit-before-ack inversion that was always a bug; WAL activation is additive and falls back gracefully. Both changes are narrowly scoped to worker.rs. The race fix is correct: deferred_ack is set atomically before breaking the inner loop, the ack is sent only after tx.commit() returns, and the error path propagates commit failures to callers rather than swallowing them. The WAL path is guarded against unsupported modes with a warn-and-continue fallback, synchronous=FULL preserves existing durability semantics, and in-memory databases are explicitly excluded from the WAL warning. The PR description documents the durability tradeoff analysis thoroughly, and the fix was stress-tested 50 iterations. No files require special attention. The single changed file has clear, well-commented logic and the control-flow changes are easy to follow. Important Files Changed
Sequence DiagramsequenceDiagram
participant C as Caller
participant W as Worker Thread
participant DB as SQLite WAL
note over C,DB: Before this PR - race condition
C->>W: ForceCommit or Close
W->>W: handle_request sets commit or quit flag
W-->>C: ACK sent immediately before commit
C->>DB: Reopen datastore reads pre-commit snapshot
W->>DB: tx.commit() runs too late
note over C,DB: After this PR - race fixed
C->>W: ForceCommit or Close
W->>W: handle_request sets commit or quit flag
W->>W: store deferred_ack and break inner loop
W->>DB: tx.commit()
DB-->>W: Ok
W-->>C: ACK sent after durable commit
C->>DB: Reopen datastore reads committed snapshot
note over C,DB: Commit failure path
W->>DB: tx.commit()
DB-->>W: Err
W-->>C: Err InternalError propagated to caller
Reviews (1): Last reviewed commit: "fix(datastore): ack ForceCommit/Close on..." | Re-trigger Greptile |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #616 +/- ##
==========================================
+ Coverage 70.81% 77.09% +6.27%
==========================================
Files 51 62 +11
Lines 2916 4951 +2035
==========================================
+ Hits 2065 3817 +1752
- Misses 851 1134 +283 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Re-lands the WAL change that was split out of #615 so that PR could merge, together with a fix for the race that made
test_datastore_reloadfail on the ubuntu runner when this was briefly part of #615 (run 27410891058).Commits
perf(datastore): enable WAL journal mode with synchronous=FULLEach batch commit becomes a single sequential WAL append+fsync where delete mode paid two fsyncs plus rollback-journal file create/delete churn, and commits no longer block readers — groundwork for serving reads from parallel read-only connections later (the existing TODO at the top of
worker.rs).journal_mode=WALis set at connection open and persists in the db header; any SQLite ≥ 3.7.0 (2010) can still open the file.--no-db) ignore it and keepjournal_mode=memory.integrity_checkok, heartbeat throughput +4% on top of the Performance improvements #615 numbers (5,572 → 5,815/s), reads unchanged.fix(datastore): ack ForceCommit/Close only after the transaction commitsThis is the actual cause of the ubuntu CI failure, a pre-existing bug that WAL unmasked rather than a WAL problem: the worker acked
ForceCommit/Closeinside the request loop, buttx.commit()only runs after breaking out of it — soforce_commit()returned while the data was still uncommitted.test_datastore_reloadcallsforce_commit(), reopens the datastore, and the new instance populates its bucket-metadata cache from a snapshot that could predate the commit. The rollback journal's locking happened to serialize the second reader behind the commit; in WAL readers never block on the writer, so the stale read became reachable (macOS won the race, ubuntu lost it).The acks for these two commands are now held until
tx.commit()returns, and a failed commit is propagated to the caller as an error instead of a silent success — this also fixesclose()acknowledging shutdown before the final commit was durable. All other commands (heartbeats in particular) are still acked immediately and never wait on the batch commit.test_datastore_reloadpassed 50/50 stress iterations with the fix; full workspace suite green.Why
synchronous=FULLand notNORMAL?Raised by @ErikBjare on #615 — the usual "WAL+NORMAL for many small writes" advice doesn't transfer here because the worker already does group commit at the application level: all writes accumulate in one transaction committed every 15 s / 100 events / on bucket change. The per-commit fsync that NORMAL exists to eliminate happens ~4×/minute, not per heartbeat — it isn't in the per-write path at all (the +4% above was measured with FULL, hammering writes ~1000× faster than real watchers).
What NORMAL would cost: the WAL is then only synced at checkpoints, and autocheckpoint triggers on page count (1000 pages), not time — at AW's trickle write rate a power loss could drop hours of activity since the last checkpoint, vs ≤15 s today. Bounding that would need a time-based
wal_checkpoint, i.e. fsyncing on a timer — which is what FULL already does, given commits are timer-driven. FULL keeps exactly the durability semantics the server has always had (delete mode's default is FULL), with strictly fewer syncs than before.Readers are unaffected either way: in WAL the commit fsync holds only the write lock, and there's a single application-level writer.