feat(reader): add RecordLogReader for lockless concurrent inspection (#5)#27
Merged
Conversation
) Adds RecordLogReader, a read-only handle that can be opened on a RecordLog directory in parallel with a live writer in the same or another process, without contending for the writer's fs2 advisory lock. The reader snapshots the segment id list at open(). Iteration is performed via the existing RecordIter and obeys the same recovery contract as the writer: a truncated tail on the last segment yields a clean end and is reflected in the RecoveryReport; a CRC mismatch or truncated frame in a sealed (non-active) segment is a hard error. What ships ---------- * src/record_log_reader.rs: new RecordLogReader { dir, snapshot_ids } with open(), dir(), segments(), and scan_iter(). open() does not create the directory and does not acquire any lock; it errors on a missing or non-directory path. scan_iter() can be called repeatedly and returns a fresh iterator over the same snapshot each time. * src/record_log.rs: RecordIter::new is now pub(crate) so the reader module can construct iterators against the snapshotted segment ids. No behaviour change for existing callers. * src/lib.rs: re-exports RecordLogReader. Public surface delta is one additive type. * tests/reader_api.rs: nine integration tests covering open semantics (missing path, non-dir path, empty dir, locked-by-writer), snapshot freeze across rotate, intentional pinning of segment-level snapshot granularity (within-snapshotted-segment appends remain visible), scan_iter idempotence across repeated calls, tolerance of unfsynced tail on the last segment, and hard error on a corrupted sealed segment. * formal/ReadWhileWrite.tla and ReadWhileWrite.cfg: a new TLA+ model for read-while-write. Writer actions append + fsync as in RecordLog.tla. Reader actions snapshot the recoverable prefix at open, step through it record-by-record, and close. Invariants: YieldedIsPrefix(snapshot), NoSpuriousYield, ReaderBoundedByOpen, ClosedReaderHasNoState, plus type and self-prefix sanity. * formal/README.md: documents the new model, removes it from the "NOT modelled" list, and adds a fourth java/TLC invocation example. * .github/workflows/ci.yml: adds TLC ReadWhileWrite as a fourth step in the formal job, mirroring the existing pattern, and uploads the log to the tlc-logs artifact. Job display name updated. What stays the same ------------------- * Wire format frozen at WIRE_VERSION = 1. Six corpus fixtures unchanged. * Single-writer single-process invariant on RecordLog. The writer's fs2 lock is unaffected; readers do not acquire any lock. * Longest-valid-prefix recovery contract. The reader reuses the existing RecordIter and inherits the same per-segment behaviour. * Snapshot-style compaction. The reader is independent of compaction. * MSRV 1.75.0. No new dependencies. Public surface delta -------------------- + pub use record_log_reader::RecordLogReader Snapshot is at segment-id granularity. Records appended to a segment that was already in the snapshot at open time will be visible to the reader when it gets to that segment. This is documented in the module-level rustdoc and pinned by a dedicated test (reader_observes_writer_appends_within_snapshot_segments), and is modelled by snapshotting durable \o buffered at open time in ReadWhileWrite.tla. Local validation ---------------- * cargo fmt --all -- --check: clean * cargo clippy --workspace --all-targets -- -D warnings: clean * cargo test --workspace --all-targets: 131 tests pass (including 9 new in tests/reader_api.rs) * RUSTDOCFLAGS='-D warnings' cargo doc --workspace --no-deps: clean * cargo bench --workspace --no-run: builds * cargo publish --dry-run -p datawal --allow-dirty: clean TLC ReadWhileWrite is exercised in CI; the workbench has no Java 21 to run it locally. Out of scope ------------ * Live tail / notification on new records. The reader is snapshot at open by design; a follow-on iterator-from-offset variant can be added without changing this contract. * Cross-process concurrent reader fuzz. The reader's lack of lock acquisition means a remote process can hold the writer lock; the in-process test exercises the same code path. * Per-byte snapshot of the active segment. Segment-id snapshot is sufficient for inspection and keeps the reader trivially cheap. Closes #5.
This was referenced May 21, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds
RecordLogReader, a read-only handle that can be opened on aRecordLogdirectory in parallel with a live writer in the same oranother process, without contending for the writer's
fs2advisory lock.
The reader snapshots the segment id list at
open(). Iteration isperformed via the existing
RecordIterand obeys the same recoverycontract as the writer.
What this is, what it isn't
The reader is intentionally cheap: no lock acquisition, no directory
creation, no fd hygiene tricks. It is suitable for local recoverable
logs where JSONL would otherwise be used.
What ships
src/record_log_reader.rs— newRecordLogReader { dir, snapshot_ids }with
open(),dir(),segments(),scan_iter(). ~165 LOC.src/record_log.rs—RecordIter::newnowpub(crate)so the newmodule can construct iterators. No behaviour change for existing
callers. (+6 lines).
src/lib.rs— re-exportsRecordLogReader. (+2 lines).tests/reader_api.rs— nine integration tests covering opensemantics, snapshot freeze across rotate, intentional pinning of
segment-level snapshot granularity,
scan_iteridempotence,unfsynced tail tolerance, and hard error on corrupted sealed
segment. ~277 LOC.
formal/ReadWhileWrite.tla+.cfg— TLA+ model forread-while-write. Six invariants. ~190 + 12 LOC.
formal/README.md— documents the new model, removes the"NOT modelled" line, adds run command.
.github/workflows/ci.yml— addsTLC ReadWhileWriteas a fourthstep in the
formaljob, mirroring the existing pattern.Total: 4 new files + 4 modified, +690 / -4 LOC.
What stays the same
WIRE_VERSION = 1. Six corpus fixturesunchanged.
RecordLog. The writer'sfs2lock is unaffected; readers do not acquire any lock.RecordIterandinherits the per-segment behaviour.
Public surface delta
One additive type. Aligned with the kit 0.1.4 release.
Snapshot semantics (pinned)
Snapshot is at segment-id granularity. Records appended to a
segment that was already in the snapshot at
open()time will bevisible to the reader when it reaches that segment. This is
documented in the module-level rustdoc and pinned by a dedicated
test (
reader_observes_writer_appends_within_snapshot_segments),and modelled in
ReadWhileWrite.tlaby snapshottingdurable \o bufferedat open.Local validation
cargo fmt --all -- --checkcargo clippy --workspace --all-targets -- -D warningscargo test --workspace --all-targetsRUSTDOCFLAGS='-D warnings' cargo doc --workspace --no-depscargo bench --workspace --no-runcargo publish --dry-run -p datawal --allow-dirtyTLC
ReadWhileWriteis exercised in CI; the workbench has no Java 21to run it locally.
TLA+ model summary
ReadWhileWrite.tlaextends the durability model with reader state:readerOpen,snapshot,yielded.DoOpenReader(snapshotsdurable \o buffered),DoReaderStep(yields the next element ofsnapshot),DoCloseReader(clears reader state).YieldedIsPrefix— the reader's yielded sequence is always aprefix of its snapshot.
NoSpuriousYield— every yielded element was appended.ReaderBoundedByOpen—Len(yielded) <= Len(snapshot).ClosedReaderHasNoState.Deliberately not re-modelled here: crash (covered by
RecordLog.tla),concurrent writers (single-writer by contract).
Out of scope
Closes #5.