fix(dump): drain pending data in FIFO writer on shutdown#71
Open
nick-youngblut wants to merge 2 commits into
Open
fix(dump): drain pending data in FIFO writer on shutdown#71nick-youngblut wants to merge 2 commits into
nick-youngblut wants to merge 2 commits into
Conversation
ThreadWriter coordinated data and shutdown through two independent mechanisms (an mpsc channel for shutdown and a Condvar over the buffer). The worker re-checked the shutdown channel right after waking from cvar.wait and could return while bytes were still buffered, silently truncating FASTA/FASTQ output (notably over named pipes). Collapse the buffer and a `closed` flag into a single mutex-guarded WriterState driven by one condition predicate. The worker now exits only when the buffer is empty AND closed, so a non-empty buffer always drains before the thread exits. Drop sets `closed` under the lock before notifying, closing the wakeup gap. Adds in-memory regression tests including a 1000-iteration stress loop. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request refactors ThreadWriter to resolve a data-loss race condition during shutdown by introducing a shared WriterState struct that manages both the write buffer and the shutdown state under a single mutex and condition variable. Additionally, new unit tests are added to verify proper draining of pending data on drop. The review feedback highlights two key areas for improvement: first, a critical issue where a mutex lock is held during a sleep call in the ingest loop, which blocks the worker thread; and second, a performance optimization to reuse the write buffer's allocated capacity instead of discarding it on every write cycle.
Updated ThreadWriter to release the lock before sleeping, allowing the worker thread to drain the buffer during backpressure. This change improves the efficiency of data handling during shutdown.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes a data-integrity race in the multi-threaded
dumpwriter where pending bytes could be silently dropped when the buffered FIFO writer shut down, truncating FASTA/FASTQ output (most visibly over named pipes).Changes
ThreadWriter's dual signaling (anmpscshutdown channel plus aCondvarover the buffer) with a single mutex-guardedWriterState { buffer, closed }driven by one condition predicate.closedis set; a non-empty buffer always takes the drain branch, so all ingested bytes are flushed before the thread exits.Dropsetsclosed = trueunder the lock before notifying, closing the wakeup gap, then joins the worker.ingest()adapted to the new state struct; backpressure behavior is unchanged.