feat: Auto-heal for silent semantic search failure #8

Merged
PatrickSys merged 4 commits into master from feat/auto-heal-semantic-search
Jan 6, 2026

PatrickSys commented Jan 6, 2026

Description

This PR implements an auto-heal mechanism for the issue where Semantic Search would silently fail after a LanceDB table became corrupted. The failure degraded the MCP Server, which silently defaulted to keyword-based search.

Changes

  • Adds automatic detection and recovery when the vector index becomes corrupted
  • Implements new `IndexCorruptedError` custom error type
  • Triggers automatic re-indexing when critical index failures occur
  • Falls back to keyword search for non-critical errors
  • Updates version to v1.3.1
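
A minimal sketch of what a custom error type like this could look like. Only the class name `IndexCorruptedError` comes from this PR; the `originalError` field and the `isIndexCorrupted` helper are assumptions added for illustration, and the actual class in src/errors/index.ts may differ.

```typescript
// Hypothetical sketch, not the implementation from src/errors/index.ts.
class IndexCorruptedError extends Error {
  // Assumed field: keeps the low-level error that revealed the corruption.
  readonly originalError?: unknown;

  constructor(message: string, originalError?: unknown) {
    super(message);
    this.name = "IndexCorruptedError";
    this.originalError = originalError;
    // Keeps `instanceof` reliable when the code is compiled to older targets.
    Object.setPrototypeOf(this, IndexCorruptedError.prototype);
  }
}

// Assumed helper: a type guard so callers can branch on the error type.
function isIndexCorrupted(err: unknown): err is IndexCorruptedError {
  return err instanceof IndexCorruptedError;
}
```

A dedicated error type lets the caller tell "the index must be rebuilt" apart from every other failure with a plain `instanceof` check instead of string matching at the call site.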

Testing

  • Verified in real-world scenarios
  • New test scripts validate the auto-heal behavior


greptile-apps bot commented Jan 6, 2026

Greptile Summary

Implements an auto-heal mechanism to detect and recover from LanceDB vector index corruption by triggering automatic re-indexing when schema validation fails or the vector column is missing.

Key Changes:

  • New IndexCorruptedError custom error class for signaling index corruption
  • Detection logic in LanceDBStorageProvider checks for missing vector columns during initialization and search operations
  • Error propagation through CodebaseSearcher.initialize() and search() methods
  • Auto-heal flow in MCP server's search_codebase handler catches IndexCorruptedError, triggers full re-indexing via performIndexing(), and retries the search
  • Changed embedding model from bge-base-en-v1.5 to bge-small-en-v1.5 (smaller, faster)
  • Comprehensive test coverage for corruption detection, error propagation, and auto-heal flow
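
The missing-vector-column check described above can be sketched as follows. This is an illustration under assumptions: `TableSchemaLike` and `assertHasVectorColumn` are hypothetical names, the column name `"vector"` is assumed, and the schema shape is a simplified stand-in rather than the LanceDB or Arrow API.

```typescript
// Hypothetical sketch; the error class is redefined here so the block is self-contained.
class IndexCorruptedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "IndexCorruptedError";
    Object.setPrototypeOf(this, IndexCorruptedError.prototype);
  }
}

// Assumed minimal view of a table schema: only the column names matter here.
interface TableSchemaLike {
  fields: ReadonlyArray<{ name: string }>;
}

// Throws IndexCorruptedError when the expected vector column is absent.
function assertHasVectorColumn(
  schema: TableSchemaLike,
  vectorColumn: string = "vector", // assumed column name
): void {
  const hasVector = schema.fields.some((field) => field.name === vectorColumn);
  if (!hasVector) {
    throw new IndexCorruptedError(
      `Table schema has no "${vectorColumn}" column; the index needs to be rebuilt`,
    );
  }
}
```

Running a check like this once during initialization means corruption surfaces as a typed error before any query is attempted.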

Concern:
The error handling in src/storage/lancedb.ts:192-196 is overly broad - it converts any LanceDB or Arrow error into IndexCorruptedError, which could trigger expensive full re-indexing for transient failures like file locks, network timeouts, or memory errors.

Confidence Score: 4/5

  • Safe to merge with one logical concern in error handling that could cause unnecessary re-indexing
  • The auto-heal mechanism is well-designed with proper error propagation and comprehensive tests. However, the overly broad error classification (lines 192-196 in lancedb.ts) could trigger expensive full re-indexing on transient failures. The implementation is solid but would benefit from more precise error detection.
  • Pay attention to src/storage/lancedb.ts - the broad error handling on lines 192-196 may cause false positives

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/errors/index.ts | New custom error class for index corruption detection - clean implementation |
| src/storage/lancedb.ts | Detects schema corruption and throws IndexCorruptedError; overly broad error handling may trigger unnecessary re-indexing |
| src/core/search.ts | Propagates IndexCorruptedError correctly for auto-heal mechanism |
| src/index.ts | Implements auto-heal flow with re-indexing and retry; minor formatting changes included |

Sequence Diagram

sequenceDiagram
    participant User
    participant MCP as MCP Server (index.ts)
    participant Searcher as CodebaseSearcher
    participant Storage as LanceDBStorage
    participant Indexer as CodebaseIndexer

    User->>MCP: search_codebase(query)
    MCP->>Searcher: search(query, limit, filters)
    Searcher->>Searcher: initialize()
    Searcher->>Storage: initialize(storagePath)
    
    alt Table exists but corrupted
        Storage->>Storage: openTable('code_chunks')
        Storage->>Storage: validate schema
        Storage-->>Storage: Missing vector column detected
        Storage->>Storage: dropTable('code_chunks')
        Storage-->>Searcher: throw IndexCorruptedError
        Searcher-->>MCP: throw IndexCorruptedError
    else Search on corrupted index
        Storage->>Storage: vectorSearch(queryVector)
        Storage-->>Storage: "No vector column" error
        Storage-->>Searcher: throw IndexCorruptedError
        Searcher-->>MCP: throw IndexCorruptedError
    end

    MCP->>MCP: catch IndexCorruptedError
    MCP->>MCP: console.error('[Auto-Heal] Index corrupted')
    MCP->>Indexer: performIndexing()
    Indexer->>Indexer: index files and rebuild vector DB
    Indexer-->>MCP: indexState.status = 'ready'
    
    MCP->>MCP: Check if indexing succeeded
    alt Indexing succeeded
        MCP->>MCP: console.error('[Auto-Heal] Success')
        MCP->>Searcher: new CodebaseSearcher(ROOT_PATH)
        MCP->>Searcher: search(query, limit, filters)
        Searcher->>Storage: search with fresh index
        Storage-->>Searcher: results
        Searcher-->>MCP: results
        MCP-->>User: Success response with results
    else Indexing failed
        MCP-->>User: Error: Auto-heal failed
    end
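
The catch, re-index, retry flow in the diagram can be sketched roughly as below. The names `searchWithAutoHeal` and `AutoHealDeps` are hypothetical; `search` stands in for a fresh `CodebaseSearcher.search` call and `reindex` for `performIndexing()`, whose real signatures are not shown in this PR.

```typescript
// Hypothetical sketch of the auto-heal flow; not the handler from src/index.ts.
class IndexCorruptedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "IndexCorruptedError";
    Object.setPrototypeOf(this, IndexCorruptedError.prototype);
  }
}

interface AutoHealDeps<T> {
  // Stand-in for running a search against the current index.
  search: (query: string) => Promise<T[]>;
  // Stand-in for a full re-index; resolves to true when the index is ready.
  reindex: () => Promise<boolean>;
}

async function searchWithAutoHeal<T>(
  query: string,
  deps: AutoHealDeps<T>,
): Promise<T[]> {
  try {
    return await deps.search(query);
  } catch (err) {
    // Only index corruption triggers the expensive rebuild.
    if (!(err instanceof IndexCorruptedError)) throw err;

    console.error("[Auto-Heal] Index corrupted, re-indexing");
    const ready = await deps.reindex();
    if (!ready) {
      throw new Error("Auto-heal failed: re-indexing did not complete");
    }

    console.error("[Auto-Heal] Success, retrying search");
    // Retry exactly once; a second corruption error propagates to the caller.
    return deps.search(query);
  }
}
```

Retrying only once keeps a persistently broken index from causing a re-indexing loop.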

greptile-apps bot left a comment

11 files reviewed, 1 comment

- Keep schema validation in initialize() where it belongs
- Only trigger auto-heal for verified 'no vector column' pattern
- Remove complex verifyTableHealth() method (48 fewer lines)
- Add test for graceful degradation on transient errors
- Gracefully degrade to keyword search for unknown errors

Addresses Greptile code review feedback on PR #8
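
One way the narrowed error handling described in this commit could look, as a sketch under assumptions: `classifySearchFailure`, `FailureAction`, and the exact regular expression are hypothetical, since the commit message only names the 'no vector column' pattern.

```typescript
// Hypothetical sketch; not the code from src/storage/lancedb.ts.
type FailureAction = "reindex" | "keyword-fallback";

// Assumed pattern: the commit only refers to a 'no vector column' message.
const NO_VECTOR_COLUMN = /no vector column/i;

// Decides how a failed vector search should be handled.
function classifySearchFailure(err: unknown): FailureAction {
  const message = err instanceof Error ? err.message : String(err);
  // Only the verified corruption pattern justifies a full re-index;
  // everything else (locks, timeouts, memory errors) degrades gracefully.
  return NO_VECTOR_COLUMN.test(message) ? "reindex" : "keyword-fallback";
}
```

Treating unknown errors as "keyword-fallback" keeps transient failures from triggering an expensive rebuild, which is the concern raised in the review.
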
PatrickSys merged commit 9fde6c0 into master on Jan 6, 2026
3 checks passed