A simple RAG (Retrieval Augmented Generation) application using RushDB for document storage and vector search capabilities.
- Document Ingestion: Load markdown documents from local directories or upload files
- Automatic Chunking: Split documents into manageable chunks for better vector search
- Vector Embeddings: Use sentence transformers to create embeddings for semantic search
- RushDB Storage: Store documents and chunks with relationships in RushDB
- Vector Search: Search for relevant chunks using cosine similarity
- FastAPI Interface: RESTful API for easy integration
- Auto-Configuration: Automatic initialization from environment variables
This project uses UV for dependency management. Make sure you have UV installed:
```bash
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
```
```bash
# Clone the repository and navigate to the project
cd packages/python-simple-rag

# Install dependencies
uv sync
```

- Copy the example environment file:
```bash
cp .env.example .env
```

- Edit `.env` and add your RushDB API token:
```bash
# Get your API token from https://app.rushdb.com/
RUSHDB_API_TOKEN=your_actual_token_here
```

- (Optional) Customize other settings in `.env`:
```bash
EMBEDDING_MODEL=all-MiniLM-L6-v2
CHUNK_SIZE=500
SIMILARITY_THRESHOLD=0.7
```

- Run the application:
```bash
uv run python run_app.py
```

- Or start the API server directly:

```bash
uv run uvicorn src.api:app --host 0.0.0.0 --port 8000 --reload
```

The application will automatically initialize from your .env configuration. The API will be available at http://localhost:8000, with interactive docs at http://localhost:8000/docs.
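Since all configuration comes from `.env`, the application only needs the process environment at startup. A minimal, hypothetical sketch of how such settings might be read (the key names come from the `.env` example above; `load_settings` and the defaults are assumptions, and the real `src/config.py` may differ):

```python
import os

def load_settings() -> dict:
    # Read each setting from the environment, falling back to the
    # documented defaults when a variable is unset.
    return {
        "embedding_model": os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        "chunk_size": int(os.getenv("CHUNK_SIZE", "500")),
        "similarity_threshold": float(os.getenv("SIMILARITY_THRESHOLD", "0.7")),
    }

settings = load_settings()
```

Keeping every setting behind `os.getenv` with a default is what makes the "no manual initialization" behavior possible: the server can boot with an empty environment and still have sane values.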
```bash
# Navigate to the project directory
cd /path/to/rushdb/packages/python-simple-rag

# Install dependencies with UV
uv sync
```

You'll need a RushDB API token. You can get one from:
- RushDB Cloud Dashboard (for cloud instance)
- Your self-hosted RushDB instance
The application provides a RESTful API for document ingestion and search. All configuration is handled through environment variables - no manual initialization required.
- Check API status and configuration:
```bash
curl http://localhost:8000/
```

- Health check:

```bash
curl http://localhost:8000/health
```

- Ingest documents from directory:
```bash
curl -X POST "http://localhost:8000/ingest/directory" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "docs_path=/path/to/your/markdown/docs"
```

- Upload and ingest files:
```bash
curl -X POST "http://localhost:8000/ingest/files" \
  -F "files=@document1.md" \
  -F "files=@document2.md"
```

- Search for relevant chunks:
```bash
curl -X POST "http://localhost:8000/search" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is RushDB?",
    "limit": 5
  }'
```

All endpoints return JSON responses. The API automatically initializes from your .env configuration on startup.
The application stores documents using the following structure in RushDB:
```python
{
    "label": "Document",
    "data": {
        "title": "document_title",
        "path": "/path/to/document.md",
        "file_hash": "md5_hash_for_change_detection",
        "content_preview": "First 200 characters...",
        "Chunk": [
            {
                "text": "chunk_content",
                "chunk_index": 0,
                "embedding": [0.1, 0.2, 0.3, ...],
                "document_title": "document_title"
            },
            # ... more chunks
        ]
    }
}
```

- Document Loading: Markdown files are loaded from the specified directory
- Content Processing: Markdown is converted to plain text
- Chunking: Documents are split into chunks of ~500 words each
- Vectorization: Each chunk is converted to a vector embedding using sentence transformers
- Storage: Documents and chunks are stored in RushDB with the `create_many` method
- Search: Vector similarity search is performed using RushDB's `$vector` operator
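The chunking and record-building steps above can be sketched in plain Python. This is a simplified, hypothetical illustration (the real implementation lives in `src/rag_engine.py` and may differ); `embed` stands in for any callable that maps text to a vector, such as a sentence-transformers model's `encode`:

```python
import hashlib

def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    # Split plain text into chunks of roughly `chunk_size` words each.
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def build_document_record(title: str, path: str, text: str, embed) -> dict:
    # Build the nested Document/Chunk structure shown in the data model above.
    return {
        "title": title,
        "path": path,
        "file_hash": hashlib.md5(text.encode()).hexdigest(),
        "content_preview": text[:200],
        "Chunk": [
            {"text": chunk, "chunk_index": i, "embedding": embed(chunk),
             "document_title": title}
            for i, chunk in enumerate(chunk_text(text))
        ],
    }
```

Nesting the chunks under a `Chunk` key is what lets a single bulk insert create both the Document record and its related Chunk records with relationships intact.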
The application uses RushDB's updated vector search capabilities with aggregation:
```python
results = db.records.find({
    "labels": ["Chunk"],
    "aggregate": {
        "text": "$record.text",
        "document_title": "$record.document_title",
        "chunk_index": "$record.chunk_index",
        "score": {
            "alias": "$record",
            "field": "embedding",
            "fn": "gds.similarity.cosine",
            "query": query_vector
        }
    },
    "orderBy": { "score": "desc" },
    "limit": limit
})
```

The project includes test documents in the test_docs/ directory. To test the system:
- Start the application:
```bash
uv run python run_app.py
```

- Ingest test documents: `POST /ingest/directory` with `docs_path=test_docs`
- Search for content: `POST /search` with queries like "vector search" or "RushDB features"
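For intuition, the `gds.similarity.cosine` scoring used in the search aggregation computes the cosine of the angle between a stored chunk embedding and the query vector. A pure-Python equivalent is shown below; the actual scoring runs inside RushDB, so this is only a reference sketch:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Identical (or parallel) vectors score 1.0 and orthogonal vectors score 0.0, which is why the API's `SIMILARITY_THRESHOLD=0.7` filters for chunks that are semantically close to the query.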
- `src/rag_engine.py`: Core RAG implementation with document processing and RushDB operations
- `src/api.py`: FastAPI application with REST endpoints
- `src/config.py`: Configuration management and environment variable handling
- `run_app.py`: Application runner with testing and server startup
- `pyproject.toml`: Project configuration and dependencies
- DocumentProcessor: Handles document loading, chunking, and vectorization
- RAGDatabase: Manages RushDB operations for storage and retrieval
- SimpleRAG: Main class that orchestrates the RAG workflow
- FastAPI App: RESTful API with automatic configuration from environment
- Embedding Model: Change `EMBEDDING_MODEL` in `.env` to use different sentence transformer models
- Chunk Size: Adjust `CHUNK_SIZE` for different chunking strategies
- Search Configuration: Modify similarity scoring in the search aggregation
- Document Formats: Extend DocumentProcessor to support other document formats beyond markdown
- `fastapi`: Web framework for the API
- `rushdb`: RushDB Python SDK
- `sentence-transformers`: For text embeddings
- `python-markdown`: Markdown processing
- `uvicorn`: ASGI server
- `numpy`: Numerical operations
- `pydantic`: Data validation
- The application automatically skips documents that haven't changed (using file hash comparison)
- Vector embeddings are stored as arrays in RushDB and searchable using aggregation with cosine similarity
- The default embedding model (`all-MiniLM-L6-v2`) provides a good balance of performance and quality
- Search results include relevance scoring and document metadata
- Configuration is entirely environment-driven - no manual API initialization required