fix: reduce optimize lock hold time with revision check #17
PhantomInTheWire wants to merge 1 commit into master
- Clone state under read lock, rebuild optimized segments outside RwLock
- Use revision counter to detect concurrent writes/checkpoint mutations
- Return clear FailedPrecondition error if optimize sees stale revision
- Serialize checkpoint file writes with checkpoint_lock to avoid races
- Add regression test for optimize persistence without extra flush
- Add regression test for optimize vs concurrent write durability

P0: optimize is a blocking full-dataset rewrite
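The flow described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `State`, `Collection`, `OptimizeError`, `snapshot`, `commit`, and `insert` are hypothetical names, the sort-and-dedup step stands in for the real segment rebuild, and the sketch assumes every write bumps the revision counter.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for the collection state; not the PR's real type.
#[derive(Clone, Debug, PartialEq)]
struct State {
    revision: u64,
    docs: Vec<u32>,
}

#[derive(Debug, PartialEq)]
enum OptimizeError {
    /// Hypothetical variant mirroring the FailedPrecondition error in the description.
    FailedPrecondition { expected: u64, found: u64 },
}

struct Collection {
    state: RwLock<State>,
}

impl Collection {
    fn new(docs: Vec<u32>) -> Self {
        Collection { state: RwLock::new(State { revision: 0, docs }) }
    }

    /// Assumption: every write bumps the revision so an in-flight optimize can detect it.
    fn insert(&self, doc: u32) {
        let mut state = self.state.write().unwrap();
        state.docs.push(doc);
        state.revision += 1;
    }

    /// Phase 1: clone the state; the read guard is dropped when the statement ends.
    fn snapshot(&self) -> State {
        self.state.read().unwrap().clone()
    }

    /// Phase 3: swap in the optimized state only if no write happened since the snapshot.
    fn commit(&self, base_revision: u64, mut optimized: State) -> Result<(), OptimizeError> {
        let mut state = self.state.write().unwrap();
        if state.revision != base_revision {
            return Err(OptimizeError::FailedPrecondition {
                expected: base_revision,
                found: state.revision,
            });
        }
        optimized.revision = base_revision + 1;
        *state = optimized;
        Ok(())
    }

    fn optimize(&self) -> Result<(), OptimizeError> {
        let mut optimized = self.snapshot();
        let base_revision = optimized.revision;
        // Phase 2: the expensive rebuild runs with no lock held.
        // Sorting and deduplicating is only a placeholder for segment optimization.
        optimized.docs.sort_unstable();
        optimized.docs.dedup();
        self.commit(base_revision, optimized)
    }
}
```

In this sketch a stale optimize loses to the concurrent write and returns an error instead of overwriting it, so the caller can retry against a fresh snapshot.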
    let mut optimized = {
        let state = self.read_state();
        ensure_collection_is_writable(&state)?;
        let mut optimized = state.clone();
        let meta = optimized.meta.clone();
        optimized.segments.optimize(
            &mut optimized.next_segment_id,
            optimized.options.segment_max_docs,
            &optimized.schema,
            |doc_id| meta.is_deleted(doc_id),
        );
        state.rebuild_indexes();
        optimized.rebuild_indexes();
        optimized
    };
🟡 RwLock read guard held during expensive optimization prevents concurrent writes
The self.read_state() guard (state) at line 192 lives until the end of the block at line 204. This means the RwLock read lock is held during the expensive segments.optimize() (lines 196-201) and rebuild_indexes() (line 202) calls. Since Rust's RwLock blocks writers while any read lock is held, all concurrent write operations (insert, upsert, update, delete, delete_by_filter, DDL, flush) are blocked for the entire optimization duration.
The revision-based conflict check at line 209 was introduced specifically to detect concurrent modifications, but the long-held read lock prevents writers from making progress during optimization, which defeats much of the stated purpose ("reduce optimize lock hold time"). After state.clone() at line 194, the read guard is no longer needed, because only the clone (optimized) is used from that point on. Dropping the read guard immediately after cloning would allow concurrent writes to proceed during the optimization computation phase.
-    let mut optimized = {
-        let state = self.read_state();
-        ensure_collection_is_writable(&state)?;
-        let mut optimized = state.clone();
-        let meta = optimized.meta.clone();
-        optimized.segments.optimize(
-            &mut optimized.next_segment_id,
-            optimized.options.segment_max_docs,
-            &optimized.schema,
-            |doc_id| meta.is_deleted(doc_id),
-        );
-        state.rebuild_indexes();
-        optimized.rebuild_indexes();
-        optimized
-    };
+    let mut optimized = {
+        let state = self.read_state();
+        ensure_collection_is_writable(&state)?;
+        state.clone()
+    };
+    let meta = optimized.meta.clone();
+    optimized.segments.optimize(
+        &mut optimized.next_segment_id,
+        optimized.options.segment_max_docs,
+        &optimized.schema,
+        |doc_id| meta.is_deleted(doc_id),
+    );
+    optimized.rebuild_indexes();
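The effect of the guard's scope can be checked in isolation with the standard library's RwLock. The sketch below is illustrative only: `clone_then_release` is a hypothetical helper, and a `Vec<u32>` stands in for the collection state. It uses the non-blocking `try_write` to probe whether a writer could get in while the read guard is alive and again after the block ends.

```rust
use std::sync::RwLock;

/// Hypothetical helper: clones the protected value inside a short block and
/// reports whether a writer was blocked during the block and free after it.
fn clone_then_release(lock: &RwLock<Vec<u32>>) -> (bool, bool, Vec<u32>) {
    let (blocked_while_reading, snapshot) = {
        let guard = lock.read().unwrap();
        // try_write never blocks; it fails here because `guard` is still alive.
        let blocked = lock.try_write().is_err();
        (blocked, guard.clone())
    }; // `guard` is dropped here, releasing the read lock.

    // Any expensive work on `snapshot` would go here, with no lock held.
    let free_after_scope = lock.try_write().is_ok();
    (blocked_while_reading, free_after_scope, snapshot)
}
```

Keeping only the clone inside the block, as in the suggested change, is what moves the expensive work into the region where writers are no longer blocked.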
Summary
P0 Issue
Optimize is a blocking full-dataset rewrite under the collection write lock.