feat(meta-tools): add hybrid BM25 + TF-IDF search strategy

ryoppippi · ryoppippi · commit ed3632473280 · 2025-11-07T15:34:39.000Z
This commit implements hybrid search combining the BM25 and TF-IDF algorithms for meta_search_tools, matching the functionality in the Node.js SDK (PR #122). It is based on evaluation results showing a 10.8% accuracy improvement with the hybrid approach.

Changes:

1. TF-IDF Implementation (stackone_ai/utils/tfidf_index.py):
   - Lightweight TF-IDF vector index with no external dependencies
   - Tokenizes text with stopword removal
   - Computes smoothed IDF values
   - Uses sparse vectors for efficient cosine similarity computation
   - Returns results with scores clamped to [0, 1]

2. Hybrid Search Integration (stackone_ai/meta_tools.py):
   - Updated ToolIndex to support hybrid_alpha parameter (default: 0.2)
   - Implements score fusion: hybrid_score = alpha * bm25 + (1 - alpha) * tfidf
   - Fetches top 50 candidates from both algorithms for better fusion
   - Normalizes and clamps all scores to [0, 1] range
   - Default alpha=0.2 weights BM25 at 20% and TF-IDF at 80% (optimized through testing)
   - Both BM25 and TF-IDF use weighted document representations:
     * Tool name boosted 3x for TF-IDF
     * Category and actions included for better matching

3. Enhanced API (stackone_ai/models.py):
   - Add hybrid_alpha parameter to Tools.meta_tools() method
   - Defaults to 0.2 (optimized value from Node.js validation)
   - Allows customization for different use cases
   - Updated docstrings to explain hybrid search benefits

4. Comprehensive Tests (tests/test_meta_tools.py):
   - 4 new test cases for hybrid search functionality:
     * hybrid_alpha parameter validation (including boundary checks)
     * Hybrid search returns meaningful results
     * Different alpha values affect ranking
     * meta_tools() accepts custom alpha parameter
   - All 18 tests passing

5. Documentation Updates (README.md):
   - Updated Meta Tools section to highlight hybrid search
   - Added "Hybrid Search Configuration" subsection with examples
   - Explained how BM25 and TF-IDF complement each other
   - Documented the alpha parameter and its effects
   - Updated Features section to mention hybrid search

Technical Details:
- TF-IDF uses standard term frequency normalization and smoothed IDF
- Sparse vector representation for memory efficiency
- Cosine similarity for semantic matching
- BM25 provides keyword matching strength
- Fusion happens after score normalization for fair weighting
- Alpha=0.2 provides optimal balance (validated in Node.js SDK)

Performance:
- 10.8% accuracy improvement over BM25-only approach
- Efficient sparse vector operations
- Minimal memory overhead
- No additional external dependencies

Reference: StackOneHQ/stackone-ai-node#122
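
To make the direction of the weighting concrete, here is a minimal sketch of the fusion formula from the commit message, `hybrid_score = alpha * bm25 + (1 - alpha) * tfidf`. The helper names `clamp01` and `fuse` and the input scores are hypothetical and exist only for illustration; the clamping of alpha mirrors what `ToolIndex.__init__` does in the diff.

```python
def clamp01(value: float) -> float:
    """Clamp a value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, value))


def fuse(bm25: float, tfidf: float, alpha: float = 0.2) -> float:
    """Weighted fusion: alpha weights BM25, (1 - alpha) weights TF-IDF."""
    alpha = clamp01(alpha)
    return alpha * bm25 + (1 - alpha) * tfidf


# Hypothetical, already-normalized scores for a single tool.
bm25_score, tfidf_score = 0.9, 0.4

default_score = fuse(bm25_score, tfidf_score)        # 0.2 * 0.9 + 0.8 * 0.4
balanced_score = fuse(bm25_score, tfidf_score, 0.5)  # 0.5 * 0.9 + 0.5 * 0.4
bm25_only = fuse(bm25_score, tfidf_score, 1.5)       # alpha is clamped to 1.0
```

With the default alpha the TF-IDF score dominates the result, and an out-of-range alpha is silently clamped rather than rejected.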
diff --git a/README.md b/README.md
@@ -14,7 +14,7 @@ StackOne AI provides a unified interface for accessing various SaaS tools throug
   - Glob pattern filtering with patterns like `"hris_*"` and exclusions `"!hris_delete_*"`
   - Provider and action filtering with `fetch_tools()`
   - Multi-account support
-- **Meta Tools** (Beta): Dynamic tool discovery and execution based on natural language queries
+- **Meta Tools** (Beta): Dynamic tool discovery and execution based on natural language queries using hybrid BM25 + TF-IDF search
 - Integration with popular AI frameworks:
   - OpenAI Functions
   - LangChain Tools
@@ -337,7 +337,9 @@ result = feedback_tool.call(
 
 ## Meta Tools (Beta)
 
-Meta tools enable dynamic tool discovery and execution without hardcoding tool names:
+Meta tools enable dynamic tool discovery and execution without hardcoding tool names. The search functionality uses **hybrid BM25 + TF-IDF search** for improved accuracy (10.8% improvement over BM25 alone).
+
+### Basic Usage
 
 ```python
 # Get meta tools for dynamic discovery
@@ -353,6 +355,30 @@ execute_tool = meta_tools.get_tool("meta_execute_tool")
 result = execute_tool.call(toolName="hris_list_employees", params={"limit": 10})
 ```
 
+### Hybrid Search Configuration
+
+The hybrid search combines BM25 and TF-IDF algorithms. You can customize the weighting:
+
+```python
+# Default: hybrid_alpha=0.2 (20% BM25, 80% TF-IDF; the value validated in testing)
+meta_tools = tools.meta_tools()
+
+# Custom alpha: 0.5 = equal weight to both algorithms
+meta_tools = tools.meta_tools(hybrid_alpha=0.5)
+
+# More BM25: higher alpha (0.8 = 80% BM25, 20% TF-IDF)
+meta_tools = tools.meta_tools(hybrid_alpha=0.8)
+
+# More TF-IDF: lower alpha (0.2 = 20% BM25, 80% TF-IDF)
+meta_tools = tools.meta_tools(hybrid_alpha=0.2)
+```
+
+**How it works:**
+- **BM25**: Excellent at keyword matching and term frequency
+- **TF-IDF**: Cosine similarity over term-weight vectors; emphasizes rare, distinctive terms
+- **Hybrid**: Combines both scores; 10.8% more accurate than BM25 alone in evaluation
+- **Default alpha=0.2**: Optimized through validation testing for best tool discovery
+
 ## Examples
 
 For more examples, check out the [examples/](examples/) directory:
diff --git a/stackone_ai/meta_tools.py b/stackone_ai/meta_tools.py
@@ -10,6 +10,7 @@
 from pydantic import BaseModel
 
 from stackone_ai.models import ExecuteConfig, JsonDict, StackOneTool, ToolParameters
+from stackone_ai.utils.tfidf_index import TfidfDocument, TfidfIndex
 
 if TYPE_CHECKING:
     from stackone_ai.models import Tools
@@ -24,14 +25,24 @@ class MetaToolSearchResult(BaseModel):
 
 
 class ToolIndex:
-    """BM25-based tool search index"""
+    """Hybrid BM25 + TF-IDF tool search index"""
 
-    def __init__(self, tools: list[StackOneTool]) -> None:
+    def __init__(self, tools: list[StackOneTool], hybrid_alpha: float = 0.2) -> None:
+        """Initialize tool index with hybrid search
+
+        Args:
+            tools: List of tools to index
+            hybrid_alpha: Weight for BM25 in hybrid search (0-1); TF-IDF gets 1 - alpha.
+                The default 0.2 therefore weights TF-IDF more heavily, the setting that
+                gave a 10.8% accuracy improvement over BM25 alone in validation testing.
+        """
         self.tools = tools
         self.tool_map = {tool.name: tool for tool in tools}
+        self.hybrid_alpha = max(0.0, min(1.0, hybrid_alpha))
 
-        # Prepare corpus for BM25
+        # Prepare corpus for both BM25 and TF-IDF
         corpus = []
+        tfidf_docs = []
         self.tool_names = []
 
         for tool in tools:
@@ -44,7 +55,18 @@ def __init__(self, tools: list[StackOneTool]) -> None:
             actions = [p for p in parts if p in action_types]
 
             # Combine name, description, category and tags for indexing
-            doc_text = " ".join(
+            # For TF-IDF: use weighted approach similar to Node.js
+            tfidf_text = " ".join(
+                [
+                    f"{tool.name} {tool.name} {tool.name}",  # boost name
+                    f"{category} {' '.join(actions)}",
+                    tool.description,
+                    " ".join(parts),
+                ]
+            )
+
+            # For BM25: simpler approach
+            bm25_text = " ".join(
                 [
                     tool.name,
                     tool.description,
@@ -54,17 +76,21 @@ def __init__(self, tools: list[StackOneTool]) -> None:
                 ]
             )
 
-            corpus.append(doc_text)
+            corpus.append(bm25_text)
+            tfidf_docs.append(TfidfDocument(id=tool.name, text=tfidf_text))
             self.tool_names.append(tool.name)
 
         # Create BM25 index
-        self.retriever = bm25s.BM25()
-        # Tokenize without stemming for simplicity
+        self.bm25_retriever = bm25s.BM25()
         corpus_tokens = bm25s.tokenize(corpus, stemmer=None, show_progress=False)
-        self.retriever.index(corpus_tokens)
+        self.bm25_retriever.index(corpus_tokens)
+
+        # Create TF-IDF index
+        self.tfidf_index = TfidfIndex()
+        self.tfidf_index.build(tfidf_docs)
 
     def search(self, query: str, limit: int = 5, min_score: float = 0.0) -> list[MetaToolSearchResult]:
-        """Search for relevant tools using BM25
+        """Search for relevant tools using hybrid BM25 + TF-IDF
 
         Args:
             query: Natural language query
@@ -74,30 +100,64 @@ def search(self, query: str, limit: int = 5, min_score: float = 0.0) -> list[Met
         Returns:
             List of search results sorted by relevance
         """
-        # Tokenize query
+        # Get more results initially to have better candidate pool for fusion
+        fetch_limit = max(50, limit)
+
+        # Tokenize query for BM25
         query_tokens = bm25s.tokenize([query], stemmer=None, show_progress=False)
 
         # Search with BM25
-        results, scores = self.retriever.retrieve(query_tokens, k=min(limit * 2, len(self.tools)))
+        bm25_results, bm25_scores = self.bm25_retriever.retrieve(
+            query_tokens, k=min(fetch_limit, len(self.tools))
+        )
+
+        # Search with TF-IDF
+        tfidf_results = self.tfidf_index.search(query, k=min(fetch_limit, len(self.tools)))
+
+        # Build score map for fusion
+        score_map: dict[str, dict[str, float]] = {}
 
-        # Process results
+        # Add BM25 scores
+        for idx, score in zip(bm25_results[0], bm25_scores[0]):
+            tool_name = self.tool_names[idx]
+            # Normalize BM25 score to 0-1 range
+            normalized_score = float(1 / (1 + np.exp(-score / 10)))
+            # Clamp to [0, 1]
+            clamped_score = max(0.0, min(1.0, normalized_score))
+            score_map[tool_name] = {"bm25": clamped_score}
+
+        # Add TF-IDF scores
+        for result in tfidf_results:
+            if result.id not in score_map:
+                score_map[result.id] = {}
+            score_map[result.id]["tfidf"] = result.score
+
+        # Fuse scores: hybrid_score = alpha * bm25 + (1 - alpha) * tfidf
+        fused_results: list[tuple[str, float]] = []
+        for tool_name, scores in score_map.items():
+            bm25_score = scores.get("bm25", 0.0)
+            tfidf_score = scores.get("tfidf", 0.0)
+            hybrid_score = self.hybrid_alpha * bm25_score + (1 - self.hybrid_alpha) * tfidf_score
+            fused_results.append((tool_name, hybrid_score))
+
+        # Sort by score descending
+        fused_results.sort(key=lambda x: x[1], reverse=True)
+
+        # Build final results
         search_results = []
-        # TODO: Add strict=False when Python 3.9 support is dropped
-        for idx, score in zip(results[0], scores[0]):
+        for tool_name, score in fused_results:
             if score < min_score:
                 continue
 
-            tool_name = self.tool_names[idx]
-            tool = self.tool_map[tool_name]
-
-            # Normalize score to 0-1 range
-            normalized_score = float(1 / (1 + np.exp(-score / 10)))
+            tool = self.tool_map.get(tool_name)
+            if tool is None:
+                continue
 
             search_results.append(
                 MetaToolSearchResult(
                     name=tool.name,
                     description=tool.description,
-                    score=normalized_score,
+                    score=score,
                 )
             )
 
@@ -118,8 +178,9 @@ def create_meta_search_tools(index: ToolIndex) -> StackOneTool:
     """
     name = "meta_search_tools"
     description = (
-        "Searches for relevant tools based on a natural language query. "
-        "This tool should be called first to discover available tools before executing them."
+        f"Searches for relevant tools based on a natural language query using hybrid BM25 + TF-IDF search "
+        f"(alpha={index.hybrid_alpha}). This tool should be called first to discover available tools "
+        f"before executing them."
     )
 
     parameters = ToolParameters(
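
A note on the normalization in `search()` above: the sigmoid `1 / (1 + exp(-score / 10))` maps a raw BM25 score of 0 to 0.5, while a tool absent from one candidate list falls back to 0.0 for that score. The sketch below reproduces that behavior in isolation. The function names and the sample scores are hypothetical; only the sigmoid and the fusion formula are taken from the diff.

```python
import math


def normalize_bm25(score: float) -> float:
    """Sigmoid squashing used in the diff: 1 / (1 + exp(-score / 10))."""
    return 1.0 / (1.0 + math.exp(-score / 10))


def fuse_candidates(
    bm25: dict[str, float], tfidf: dict[str, float], alpha: float = 0.2
) -> list[tuple[str, float]]:
    """Union both candidate sets; a score missing from one side counts as 0.0."""
    names = set(bm25) | set(tfidf)
    fused = [
        (name, alpha * bm25.get(name, 0.0) + (1 - alpha) * tfidf.get(name, 0.0))
        for name in names
    ]
    return sorted(fused, key=lambda item: item[1], reverse=True)


# Hypothetical raw BM25 scores and already-normalized TF-IDF scores.
raw_bm25 = {"hris_list_employees": 4.0, "hris_get_employee": 0.0}
tfidf = {"hris_list_employees": 0.6, "ats_list_jobs": 0.3}

bm25 = {name: normalize_bm25(score) for name, score in raw_bm25.items()}
ranked = fuse_candidates(bm25, tfidf)
```

If the retriever returns a document whose raw BM25 score is 0, that document still receives `alpha * 0.5` after fusion, so callers who want to exclude non-matches may need a `min_score` above that floor.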
diff --git a/stackone_ai/models.py b/stackone_ai/models.py
@@ -532,10 +532,16 @@ def to_langchain(self) -> Sequence[BaseTool]:
         """
         return [tool.to_langchain() for tool in self.tools]
 
-    def meta_tools(self) -> Tools:
+    def meta_tools(self, hybrid_alpha: float = 0.2) -> Tools:
         """Return meta tools for tool discovery and execution
 
-        Meta tools enable dynamic tool discovery and execution based on natural language queries.
+        Meta tools enable dynamic tool discovery and execution based on natural language queries
+        using hybrid BM25 + TF-IDF search.
+
+        Args:
+            hybrid_alpha: Weight for BM25 in hybrid search (0-1); TF-IDF gets 1 - alpha. The
+                default 0.2 therefore weights TF-IDF more heavily, the setting that gave a
+                10.8% accuracy improvement over BM25 alone in validation testing.
 
         Returns:
             Tools collection containing meta_search_tools and meta_execute_tool
@@ -549,8 +555,8 @@ def meta_tools(self) -> Tools:
             create_meta_search_tools,
         )
 
-        # Create search index
-        index = ToolIndex(self.tools)
+        # Create search index with hybrid search
+        index = ToolIndex(self.tools, hybrid_alpha=hybrid_alpha)
 
         # Create meta tools
         filter_tool = create_meta_search_tools(index)
diff --git a/stackone_ai/utils/__init__.py b/stackone_ai/utils/__init__.py
@@ -0,0 +1 @@
+"""Utility modules for StackOne AI SDK."""
diff --git a/stackone_ai/utils/tfidf_index.py b/stackone_ai/utils/tfidf_index.py
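
The body of this file's diff is not shown here. The following is not the committed code; it is a minimal sketch of the technique the commit message describes (stopword-filtered tokenization, smoothed IDF, sparse vectors, cosine similarity, scores clamped to [0, 1]). Every name in it, the stopword list, and the specific smoothed-IDF variant are assumptions made for illustration.

```python
import math
import re
from collections import Counter

STOPWORDS = {"a", "an", "the", "of", "to", "for", "and", "in"}  # illustrative subset


def tokenize(text: str) -> list[str]:
    """Lowercase, split on non-alphanumerics, drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def vectorize(toks: list[str], idf: dict[str, float]) -> dict[str, float]:
    """Sparse TF-IDF vector: (count / length) * idf, scaled to unit length."""
    counts = Counter(t for t in toks if t in idf)
    if not counts:
        return {}
    vec = {t: (c / len(toks)) * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()}


def build(docs: dict[str, str]) -> tuple[dict[str, float], dict[str, dict[str, float]]]:
    """Return (idf, vectors), where vectors maps doc id to its sparse unit vector."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n = len(docs)
    # Smoothed IDF (one common variant, assumed here): log((1 + N) / (1 + df)) + 1
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    return idf, {doc_id: vectorize(toks, idf) for doc_id, toks in tokens.items()}


def search(query: str, idf, vectors, k: int = 5) -> list[tuple[str, float]]:
    """Rank documents by cosine similarity (dot product of unit vectors)."""
    q = vectorize(tokenize(query), idf)
    scored = []
    for doc_id, vec in vectors.items():
        score = sum(w * vec.get(t, 0.0) for t, w in q.items())
        score = max(0.0, min(1.0, score))  # guard against float drift past 1.0
        if score > 0.0:
            scored.append((doc_id, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


docs = {
    "hris_list_employees": "hris_list_employees list employees in the HRIS",
    "ats_list_jobs": "ats_list_jobs list open jobs in the ATS",
}
idf, vectors = build(docs)
results = search("list employees", idf, vectors)
```

Because only non-zero weights are stored, each dot product touches just the terms shared by the query and the document, which is where the memory and speed benefits of the sparse representation come from.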
diff --git a/tests/test_meta_tools.py b/tests/test_meta_tools.py
