A simple RAG (Retrieval Augmented Generation) application using RushDB for document storage and vector search capabilities.
- Document Ingestion: Load markdown documents from local directories or upload files
- Automatic Chunking: Split documents into manageable chunks for better vector search
- Vector Embeddings: Use sentence transformers to create embeddings for semantic search
- RushDB Storage: Store documents and chunks with relationships in RushDB
- Vector Search: Search for relevant chunks using cosine similarity
- FastAPI Interface: RESTful API for easy integration
- Auto-Configuration: Automatic initialization from environment variables
This project uses UV for dependency management. Make sure you have UV installed:
```bash
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh
```
```bash
# Clone the repository and navigate to the project
cd packages/python-simple-rag

# Install dependencies
uv sync
```

- Copy the example environment file:
```bash
cp .env.example .env
```

- Edit `.env` and add your RushDB API token:
```bash
# Get your API token from https://app.rushdb.com/
RUSHDB_API_TOKEN=your_actual_token_here
```

- (Optional) Customize other settings in `.env`:
```bash
EMBEDDING_MODEL=all-MiniLM-L6-v2
CHUNK_SIZE=500
SIMILARITY_THRESHOLD=0.7
```

- Run the application:
```bash
uv run python run_app.py
```

- Or start the API server directly:

```bash
uv run uvicorn src.api:app --host 0.0.0.0 --port 8000 --reload
```

The application will automatically initialize from your .env configuration. The API will be available at http://localhost:8000, with interactive docs at http://localhost:8000/docs.
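Since all configuration comes from `.env`, the application only needs the process environment at startup. A minimal, hypothetical sketch of how such settings might be read (the key names come from the `.env` example above; `load_settings` and the defaults are assumptions, and the real `src/config.py` may differ):

```python
import os

def load_settings() -> dict:
    # Read each setting from the environment, falling back to the
    # documented defaults when a variable is unset.
    return {
        "embedding_model": os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        "chunk_size": int(os.getenv("CHUNK_SIZE", "500")),
        "similarity_threshold": float(os.getenv("SIMILARITY_THRESHOLD", "0.7")),
    }

settings = load_settings()
```

Keeping every setting behind `os.getenv` with a default is what makes the "no manual initialization" behavior possible: the server can boot with an empty environment and still have sane values.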
```bash
# Navigate to the project directory
cd /path/to/rushdb/packages/python-simple-rag

# Install dependencies with UV
uv sync
```

You'll need a RushDB API token. You can get one from:
- RushDB Cloud Dashboard (for cloud instance)
- Your self-hosted RushDB instance
The application provides a RESTful API for document ingestion and search. All configuration is handled through environment variables - no manual initialization required.
- Check API status and configuration:
```bash
curl http://localhost:8000/
```

- Health check:

```bash
curl http://localhost:8000/health
```

- Ingest documents from directory:
```bash
curl -X POST "http://localhost:8000/ingest/directory" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "docs_path=/path/to/your/markdown/docs"
```

- Upload and ingest files:
```bash
curl -X POST "http://localhost:8000/ingest/files" \
  -F "files=@document1.md" \
  -F "files=@document2.md"
```

- Search for relevant chunks:
```bash
curl -X POST "http://localhost:8000/search" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is RushDB?",
    "limit": 5
  }'
```

All endpoints return JSON responses. The API automatically initializes from your .env configuration on startup.
The application stores documents using the following structure in RushDB:
```python
{
    "label": "Document",
    "data": {
        "title": "document_title",
        "path": "/path/to/document.md",
        "file_hash": "md5_hash_for_change_detection",
        "content_preview": "First 200 characters...",
        "Chunk": [
            {
                "text": "chunk_content",
                "chunk_index": 0,
                "embedding": [0.1, 0.2, 0.3, ...],
                "document_title": "document_title"
            },
            # ... more chunks
        ]
    }
}
```

- Document Loading: Markdown files are loaded from the specified directory
- Content Processing: Markdown is converted to plain text
- Chunking: Documents are split into chunks of ~500 words each
- Vectorization: Each chunk is converted to a vector embedding using sentence transformers
- Storage: Documents and chunks are stored in RushDB with the `create_many` method
- Search: Vector similarity search is performed using RushDB's `$vector` operator
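The chunking and record-building steps above can be sketched in plain Python. This is a simplified, hypothetical illustration (the real implementation lives in `src/rag_engine.py` and may differ); `embed` stands in for any callable that maps text to a vector, such as a sentence-transformers model's `encode`:

```python
import hashlib

def chunk_text(text: str, chunk_size: int = 500) -> list[str]:
    # Split plain text into chunks of roughly `chunk_size` words each.
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def build_document_record(title: str, path: str, text: str, embed) -> dict:
    # Build the nested Document/Chunk structure shown in the data model above.
    return {
        "title": title,
        "path": path,
        "file_hash": hashlib.md5(text.encode()).hexdigest(),
        "content_preview": text[:200],
        "Chunk": [
            {"text": chunk, "chunk_index": i, "embedding": embed(chunk),
             "document_title": title}
            for i, chunk in enumerate(chunk_text(text))
        ],
    }
```

Nesting the chunks under a `Chunk` key is what lets a single bulk insert create both the Document record and its related Chunk records with relationships intact.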
The application uses RushDB's updated vector search capabilities with aggregation:
```python
results = db.records.find({
    "labels": ["Chunk"],
    "aggregate": {
        "text": "$record.text",
        "document_title": "$record.document_title",
        "chunk_index": "$record.chunk_index",
        "score": {
            "alias": "$record",
            "field": "embedding",
            "fn": "gds.similarity.cosine",
            "query": query_vector
        }
    },
    "orderBy": { "score": "desc" },
    "limit": limit
})
```

The project includes test documents in the test_docs/ directory. To test the system:
- Start the application:
```bash
uv run python run_app.py
```

- Ingest test documents: `POST /ingest/directory` with `docs_path=test_docs`
- Search for content: `POST /search` with queries like "vector search" or "RushDB features"
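For intuition, the `gds.similarity.cosine` scoring used in the search aggregation computes the cosine of the angle between a stored chunk embedding and the query vector. A pure-Python equivalent is shown below; the actual scoring runs inside RushDB, so this is only a reference sketch:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Identical (or parallel) vectors score 1.0 and orthogonal vectors score 0.0, which is why the API's `SIMILARITY_THRESHOLD=0.7` filters for chunks that are semantically close to the query.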
- `src/rag_engine.py`: Core RAG implementation with document processing and RushDB operations
- `src/api.py`: FastAPI application with REST endpoints
- `src/config.py`: Configuration management and environment variable handling
- `run_app.py`: Application runner with testing and server startup
- `pyproject.toml`: Project configuration and dependencies
- DocumentProcessor: Handles document loading, chunking, and vectorization
- RAGDatabase: Manages RushDB operations for storage and retrieval
- SimpleRAG: Main class that orchestrates the RAG workflow
- FastAPI App: RESTful API with automatic configuration from environment
- Embedding Model: Change `EMBEDDING_MODEL` in `.env` to use different sentence transformer models
- Chunk Size: Adjust `CHUNK_SIZE` for different chunking strategies
- Search Configuration: Modify similarity scoring in the search aggregation
- Document Formats: Extend DocumentProcessor to support other document formats beyond markdown
- `fastapi`: Web framework for the API
- `rushdb`: RushDB Python SDK
- `sentence-transformers`: For text embeddings
- `python-markdown`: Markdown processing
- `uvicorn`: ASGI server
- `numpy`: Numerical operations
- `pydantic`: Data validation
- The application automatically skips documents that haven't changed (using file hash comparison)
- Vector embeddings are stored as arrays in RushDB and searchable using aggregation with cosine similarity
- The default embedding model (`all-MiniLM-L6-v2`) provides a good balance of performance and quality
- Search results include relevance scoring and document metadata
- Configuration is entirely environment-driven - no manual API initialization required