⚡ Bolt: Optimize RAG indexing by caching file hashes based on mtime#2
Conversation
- Added Trusted Mode (alwaysAllowReadWrite) to bypass tool confirmations.
- Added fetch_url alias for web_fetch for agent compatibility.
- Added UpdateFrontmatterTool and AppendContentTool for improved Bases integration.
- Updated README.md.

Note: This code was written with Gemini 3 thinking in Google Antigravity.
Optimizes the RAG indexing process by using file modification times (mtime) to cache content hashes. This prevents re-reading and re-hashing files that have not changed since the last index, significantly reducing I/O and CPU usage during indexing.

- Modified `ObsidianVaultAdapter` to accept a `hashCacheProvider` callback.
- Updated `computeHash` to return the cached hash if the file's `mtime` matches the cache.
- Updated `RagIndexingService` to implement the `hashCacheProvider` using its internal file cache.
- Updated `RagIndexingService` to store `mtime` in the index cache.
- Added `test/services/obsidian-file-adapter.test.ts` to verify the caching logic.

Co-authored-by: ArnBdev <207385326+ArnBdev@users.noreply.github.com>
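The mtime-gated hash check described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `HashCacheEntry`, `VaultAdapterSketch`, and the constructor callbacks are all hypothetical stand-ins for the real `ObsidianVaultAdapter` and its `hashCacheProvider`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical cache-entry shape: the hash recorded at a given mtime.
interface HashCacheEntry {
  mtime: number;
  hash: string;
}

// Callback the indexing service supplies; returns a cached entry, if any.
type HashCacheProvider = (path: string) => HashCacheEntry | undefined;

class VaultAdapterSketch {
  constructor(
    private readBinary: (path: string) => Uint8Array,
    private statMtime: (path: string) => number,
    private hashCacheProvider?: HashCacheProvider,
  ) {}

  // Returns the cached hash when the file's mtime is unchanged;
  // otherwise falls back to reading and hashing the file.
  computeHash(path: string): string {
    const mtime = this.statMtime(path);
    const cached = this.hashCacheProvider?.(path);
    if (cached && cached.mtime === mtime) {
      return cached.hash; // cache hit: no disk read, no hashing
    }
    const data = this.readBinary(path);
    return createHash("sha256").update(data).digest("hex");
  }
}
```

The key property, and the one the unit tests exercise, is that a cache hit never touches `readBinary`, so unchanged files cost only a stat call.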
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me. New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task.
💡 What: Optimized RAG indexing by caching file hashes based on modification time (mtime).
🎯 Why: To avoid expensive I/O and CPU operations (reading and hashing) for files that haven't changed during indexing.
📊 Impact: Reduces disk reads and CPU usage during RAG sync, especially for large vaults with few changes.
🔬 Measurement: Verified with unit tests ensuring `computeHash` uses the cache provider and avoids `readBinary` when a valid cache entry exists.

PR created automatically by Jules for task 15498880683306253740 started by @ArnBdev
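On the service side, the change amounts to storing `mtime` next to each file's hash in the index cache and exposing a lookup callback for the adapter to call. A minimal sketch under assumed names (`IndexCacheEntry`, `IndexingServiceSketch`, `recordIndexed` are illustrative, not the plugin's real `RagIndexingService` API):

```typescript
// Hypothetical index-cache entry: mtime is stored alongside the hash
// so the adapter can validate cache entries without reading the file.
interface IndexCacheEntry {
  mtime: number;
  hash: string;
}

class IndexingServiceSketch {
  private cache = new Map<string, IndexCacheEntry>();

  // Passed to the adapter as its hashCacheProvider callback.
  getCachedHash = (path: string): IndexCacheEntry | undefined =>
    this.cache.get(path);

  // Called after a file is (re-)indexed, so the next sync can skip it
  // as long as its mtime has not changed.
  recordIndexed(path: string, mtime: number, hash: string): void {
    this.cache.set(path, { mtime, hash });
  }
}
```

With this split, the adapter owns the "is the cache still valid?" check, while the service owns the cache's lifetime, which keeps the caching policy out of the file-access layer.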