Agent Skill Vector Search Guide
Rick Hightower edited this page Feb 1, 2026 · 2 revisions
Vector search uses semantic similarity to find documents based on meaning rather than exact word matches. It converts both your query and documents into vector embeddings, then finds the most similar vectors using mathematical distance calculations.
Choose vector search when:
- Looking for conceptual understanding or semantic similarity
- The query uses natural language descriptions
- You want to find related content even if exact terms don't match
- Working with conceptual documentation, tutorials, or explanatory content
- The query involves synonyms, related concepts, or abstract ideas
Examples of vector queries:
- "How do I authenticate users?" - Finds authentication-related content even with different terminology
- "troubleshooting connection issues" - Finds related problems and solutions
- "best practices for error handling" - Finds conceptual guidance on error management
- "understanding OAuth flow" - Finds explanations of OAuth concepts
# Basic vector search (default mode)
agent-brain query "how does authentication work"
# Explicit vector mode
agent-brain query "troubleshooting guide" --mode vector
# With custom settings
agent-brain query "error handling patterns" --mode vector --threshold 0.5 --top-k 10
# POST /query endpoint
curl -X POST http://localhost:8000/query/ \
-H "Content-Type: application/json" \
-d '{
"query": "how does authentication work",
"mode": "vector",
"threshold": 0.5,
"top_k": 8
}'
| Option | Default | Description | Use Case |
|---|---|---|---|
| --mode vector | Default | Uses semantic similarity | Conceptual queries |
| --threshold F | 0.7 | Similarity cutoff (0.0-1.0) | Higher = more relevant, fewer results |
| --top-k N | 5 | Maximum results | More results for exploration |
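How `--threshold` and `--top-k` interact can be pictured with a small sketch. This only illustrates the filtering semantics described in the table and is not Agent Brain's actual implementation: `filter_results`, the sample scores, and the choice of an inclusive cutoff (`>=`) are all assumptions made for the example.

```python
def filter_results(scored, threshold=0.7, top_k=5):
    """Keep results scoring at or above `threshold`, best first, capped at `top_k`.

    `scored` is a list of (source, score) pairs. The defaults mirror the
    options table; the function itself is a hypothetical illustration.
    """
    kept = [(src, s) for src, s in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Hypothetical scores, not output from a real index.
sample = [("a.md", 0.91), ("b.md", 0.78), ("c.md", 0.55), ("d.md", 0.32)]
strict = filter_results(sample)                 # default threshold 0.7
loose = filter_results(sample, threshold=0.5)   # lower cutoff also admits c.md
```

Lowering the threshold widens the candidate set, while `top_k` only caps how many of those candidates are returned.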
Vector Advantages:
- 🧠 Semantic Understanding: Finds meaning, not just keywords
- 🔄 Flexible Matching: Works with synonyms and related concepts
- 🌍 Language Agnostic: Works across languages and domains
- 🎯 Conceptual Search: Great for tutorials and explanations
When Vector is better than BM25:
- Natural language queries
- Conceptual or explanatory content
- When exact terminology might vary
- Cross-language or multilingual content
When Vector is better than Hybrid:
- Pure semantic understanding needed
- No exact technical terms to match
- Performance-critical applications
- When keyword matching could be misleading
Vector search uses:
- Text Embedding: Converts text to high-dimensional vectors (3072 dimensions for text-embedding-3-large)
- Cosine Similarity: Measures angle between query and document vectors
- Ranking: Sorts by similarity score (higher = more similar)
Similarity Range: 0.0 (completely dissimilar) to 1.0 (identical meaning)
Embedding Model: OpenAI text-embedding-3-large (high quality, semantic understanding)
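The embed, compare, rank pipeline above can be sketched in plain Python. The toy 3-dimensional vectors stand in for real 3072-dimensional embeddings, and `cosine_similarity` and `rank` are illustrative helpers, not functions from Agent Brain. Cosine similarity in general ranges from -1 to 1; the 0.0 to 1.0 range quoted above is the guide's own.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real ones would come from the embedding model.
docs = {
    "auth-overview": [0.9, 0.1, 0.0],
    "connection-pooling": [0.0, 0.2, 0.9],
}
ranking = rank([1.0, 0.0, 0.0], docs)  # "auth-overview" ranks first
```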
Query: agent-brain query "how does user authentication work"
Response:
{
"results": [
{
"text": "User authentication involves validating credentials against a user database. The process typically includes: 1) Username/password verification, 2) Token generation for session management, 3) Optional two-factor authentication...",
"source": "/docs/security/auth-overview.md",
"score": 0.87,
"vector_score": 0.87,
"bm25_score": null,
"chunk_id": "chunk_123",
"metadata": {
"file_name": "auth-overview.md",
"chunk_index": 0
}
},
{
"text": "OAuth 2.0 provides a secure way to authenticate users without sharing passwords. The flow involves: authorization request, user consent, token exchange...",
"source": "/docs/api/oauth-integration.md",
"score": 0.82,
"vector_score": 0.82,
"bm25_score": null,
"chunk_id": "chunk_456",
"metadata": {
"file_name": "oauth-integration.md",
"chunk_index": 1
}
}
],
"query_time_ms": 1240.5,
"total_results": 2
}
Query: agent-brain query "connection problems and solutions"
Response:
{
"results": [
{
"text": "Common connection issues: 1) Network timeouts - increase timeout values, 2) SSL certificate problems - verify certificates, 3) Firewall blocking - check port access...",
"source": "/docs/troubleshooting/network-issues.md",
"score": 0.91,
"vector_score": 0.91,
"bm25_score": null,
"chunk_id": "chunk_789",
"metadata": {
"file_name": "network-issues.md",
"chunk_index": 0
}
},
{
"text": "Database connection pooling can prevent connection exhaustion. Configure minimum and maximum pool sizes based on your application load...",
"source": "/docs/database/connection-pooling.md",
"score": 0.78,
"vector_score": 0.78,
"bm25_score": null,
"chunk_id": "chunk_101",
"metadata": {
"file_name": "connection-pooling.md",
"chunk_index": 2
}
}
],
"query_time_ms": 1180.2,
"total_results": 2
}
Performance characteristics:
- Response Time: 800-1500ms (requires API calls to OpenAI)
- CPU Usage: Medium (vector similarity calculations)
- Memory Usage: High (loads all document vectors)
- API Costs: Requires OpenAI API credits
- Scalability: Good (vectors pre-computed, similarity calculated locally)
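A rough back-of-envelope for the memory figure, assuming each vector is held as 3072 32-bit floats with no compression or index overhead. The storage format and the 100,000-chunk corpus size are assumptions for illustration; the guide does not state either.

```python
DIMENSIONS = 3072        # text-embedding-3-large, as stated in this guide
BYTES_PER_FLOAT32 = 4    # assumption: vectors are held as float32


def vector_memory_bytes(num_chunks, dimensions=DIMENSIONS,
                        bytes_per_value=BYTES_PER_FLOAT32):
    """Raw bytes needed to hold `num_chunks` embeddings, ignoring overhead."""
    return num_chunks * dimensions * bytes_per_value


per_vector = vector_memory_bytes(1)          # 12288 bytes (12 KiB)
hundred_k = vector_memory_bytes(100_000)     # 1228800000 bytes
hundred_k_gib = hundred_k / 2**30            # roughly 1.14 GiB
```

Under these assumptions memory grows linearly with the number of indexed chunks.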
Best practices:
- Use natural language: Vector search works best with conversational queries
- Adjust thresholds carefully: Start with 0.7, lower to 0.3-0.5 for more results
- Combine with domain knowledge: Understand what concepts are covered in your docs
- Use for exploration: Great for discovering related content you didn't know existed
Common issues and limitations:
- API key required: Must have valid OpenAI API key
- Slow responses: Expected due to API calls (800-1500ms typical)
- Cost considerations: Each query consumes OpenAI credits
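- No exact matches: Results are ranked by meaning, so content that depends on a literal term (for example a specific identifier or error code) may be missed; use BM25 for exact terms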
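The threshold advice above (start at 0.7, then lower toward 0.3-0.5) can be sketched as a simple fallback loop. `search` is a hypothetical callable that you would wire to the CLI or the `/query` endpoint yourself; the stub and its scores exist only so the sketch runs on its own.

```python
def query_with_fallback(search, query, thresholds=(0.7, 0.5, 0.3), min_results=1):
    """Try progressively lower thresholds until enough results come back.

    `search(query, threshold)` must return a list of results. Returns the
    last threshold tried together with the results it produced.
    """
    results = []
    used = thresholds[0]
    for used in thresholds:
        results = search(query, used)
        if len(results) >= min_results:
            break
    return used, results


# Stub standing in for a real search call (hypothetical scores).
_FAKE_INDEX = [("error-handling.md", 0.62), ("logging.md", 0.41)]


def fake_search(query, threshold):
    return [(src, score) for src, score in _FAKE_INDEX if score >= threshold]


used_threshold, hits = query_with_fallback(fake_search, "handle errors gracefully")
```

Each retry is a separate query, so the latency and API cost noted above apply to every attempt.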
#!/bin/bash
# Semantic search for troubleshooting: print the text of the top result
agent-brain query "fix $1 problem" --mode vector --json | jq '.results[0].text'
# Find related documentation: list the source path of every result
agent-brain query "best practices for $TOPIC" --mode vector --json | jq -r '.results[].source'

import requests

# Send a vector-mode query to the local server's /query endpoint
response = requests.post('http://localhost:8000/query/', json={
    'query': 'how to handle errors gracefully',
    'mode': 'vector',
    'threshold': 0.6
})
response.raise_for_status()  # surface HTTP errors before parsing the body
results = response.json()['results']

| Aspect | Vector | BM25 | Hybrid |
|---|---|---|---|
| Speed | Slow (1-2s) | Fast (10-50ms) | Medium (1-2s) |
| Precision | Semantic | Exact terms | Balanced |
| API Required | Yes | No | Yes |
| Best For | Concepts | Technical terms | General use |
| Language Support | Excellent | Good | Excellent |
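The JSON shape shown in the example responses can be consumed with a small helper. `top_sources` is a hypothetical name; the field names (`results`, `source`, `score`) are taken from the examples in this guide, and the sample payload is a trimmed, reordered copy of the second example response.

```python
import json


def top_sources(payload, min_score=0.0):
    """Return (source, score) pairs from a query response, best first."""
    results = payload.get("results", [])
    pairs = [(r["source"], r["score"]) for r in results if r["score"] >= min_score]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Trimmed sample payload using the field names from the example responses.
raw = """
{
  "results": [
    {"source": "/docs/database/connection-pooling.md", "score": 0.78},
    {"source": "/docs/troubleshooting/network-issues.md", "score": 0.91}
  ],
  "total_results": 2
}
"""
sources = top_sources(json.loads(raw), min_score=0.8)
```

Filtering on `score` client-side is a design choice for this sketch; passing `threshold` in the request achieves a similar effect on the server.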