
bigRAG

Open-source, self-hostable RAG platform. Upload documents, auto-chunk, embed, and search — all behind a simple REST API.

Features

  • Document ingestion — PDF, DOCX, PPTX, HTML, Markdown, images, and more via Docling
  • S3 bucket ingestion — ingest from S3 or any S3-compatible service (MinIO, R2, Spaces, etc.), including public buckets
  • Embedding providers — OpenAI, Cohere, and any openai_compatible gateway (Ollama, vLLM, TEI, LiteLLM, Azure, Bedrock)
  • Embedding presets — save named provider/model configs once, reuse across collections
  • Vector search — semantic, keyword, and hybrid search modes via Milvus
  • Reranking — Cohere reranking for improved result relevance
  • Multi-collection queries — search across collections in a single request
  • Batch operations — bulk upload, delete, status checks, and queries
  • Real-time progress — SSE streaming for document processing status
  • Auth, audit, scopes — admin accounts, session cookies, bigrag_sk_… API keys with per-scope permissions and rate limits, full audit log
  • PII redaction + moderation — per-collection content filtering at ingest
  • Retrieval evaluation runner — catch recall@k / MRR / nDCG regressions against a golden set
  • Analytics — per-collection query analytics and platform-wide stats
  • Webhooks — HMAC-signed delivery, retries, circuit breaker, admin replay
  • Encrypted credentials at rest — provider API keys and webhook secrets sealed with Fernet (BIGRAG_MASTER_KEY)
  • Self-hostable — single docker compose up to run everything
  • Clients — TypeScript, Python, and Rust SDKs plus an MCP server for Claude Desktop, Cursor, and any MCP-aware runtime
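Webhook deliveries above are HMAC-signed, so receivers can verify that a payload really came from bigRAG. A minimal verification sketch in Python, assuming the signature is a hex-encoded HMAC-SHA256 of the raw request body (the exact header name and encoding are assumptions here; check the webhook docs for the actual scheme):

```python
import hashlib
import hmac

def verify_signature(secret: str, body: bytes, signature: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant time."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

# Hypothetical delivery: pass the raw (unparsed) request body plus the
# signature header value your framework extracted.
body = b'{"event": "document.processed", "document_id": "abc123"}'
sig = hmac.new(b"whsec_test", body, hashlib.sha256).hexdigest()
assert verify_signature("whsec_test", body, sig)
```

Always verify against the raw body bytes, not a re-serialized JSON object, since any whitespace or key-order change breaks the digest.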

Quick Start

docker compose up -d

This starts the bigRAG API, Postgres, Redis, and Milvus. Open http://localhost:6100/docs for the interactive API docs.

# Create a collection
curl -X POST http://localhost:6100/v1/collections \
  -H "Content-Type: application/json" \
  -d '{"name": "docs", "embedding_api_key": "sk-..."}'

# Upload a document
curl -X POST http://localhost:6100/v1/collections/docs/documents \
  -F "file=@paper.pdf"

# Ingest from a public S3 bucket
curl -X POST http://localhost:6100/v1/collections/docs/documents/s3 \
  -H "Content-Type: application/json" \
  -d '{"bucket": "indian-supreme-court-judgments", "prefix": "judgments/2025/", "region": "ap-south-1", "no_sign_request": true}'

# Query
curl -X POST http://localhost:6100/v1/collections/docs/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the main findings?"}'

Development

./dev.sh  # starts Postgres, Redis, Milvus, and the API with hot reload

Docker Images

docker pull yoginth/bigrag:latest

Architecture

graph TD
    MCP([MCP client<br/>Claude / Cursor]) -->|bigrag-mcp| API
    Studio([Studio admin UI]) -->|session cookie| API
    SDK([TS / Python / Rust SDK]) -->|bigrag_sk_… key| API
    Curl([curl / any HTTP client]) -->|bigrag_sk_… key| API

    API[bigRAG API<br/>Python / FastAPI]

    API --> Auth[Auth, scopes, audit]
    API --> Collections[Collections]
    API --> Documents[Documents]
    API --> Query[Query]
    API --> Webhooks[Webhooks]

    Documents -->|store files| Storage[(Storage<br/>Local / S3)]
    Documents -->|enqueue| Redis[(Redis<br/>Job queue + event bus)]
    Redis -->|process| Worker[Ingestion worker]

    Worker -->|parse| Docling[Docling<br/>PDF, DOCX, HTML, Images]
    Worker -->|embed| Embedding[Embedding provider<br/>OpenAI / Cohere / openai_compatible]
    Worker -->|store vectors| Milvus[(Milvus<br/>Vector DB)]

    Query -->|search| Milvus
    Query -->|embed query| Embedding
    Query -->|rerank| Reranker[Cohere Rerank]

    Auth --> Postgres
    Collections --> Postgres[(Postgres<br/>Metadata + audit + deliveries)]
    Documents --> Postgres
    Webhooks --> Postgres

API Reference

Method Endpoint Description
Health
GET /health Liveness check
GET /health/ready Readiness check (all dependencies)
Auth
GET /v1/auth/setup-status First-run setup status
POST /v1/auth/setup Create first admin
POST /v1/auth/login Session login
POST /v1/auth/logout Revoke current session
POST /v1/auth/logout-all Revoke all sessions for user
GET /v1/auth/me Current session
POST /v1/auth/password Change password
GET/PUT /v1/auth/preferences Per-user Studio UI preferences
Collections
POST /v1/collections Create collection
GET /v1/collections List collections
GET /v1/collections/{name} Get collection
PUT /v1/collections/{name} Update collection
DELETE /v1/collections/{name} Delete collection
GET /v1/collections/{name}/stats Collection stats
POST /v1/collections/{name}/reembed Re-embed all documents with a new model
POST /v1/collections/{name}/truncate Delete all documents, keep the collection
GET /v1/collections/{name}/events Stream collection events (SSE)
Documents
POST /v1/collections/{name}/documents Upload document
GET /v1/collections/{name}/documents List documents
GET /v1/collections/{name}/documents/{id} Get document
DELETE /v1/collections/{name}/documents/{id} Delete document
POST /v1/collections/{name}/documents/{id}/reprocess Reprocess document
GET /v1/collections/{name}/documents/{id}/chunks Get document chunks
GET /v1/collections/{name}/documents/{id}/file Download original file
GET /v1/collections/{name}/documents/{id}/progress Stream processing progress (SSE)
POST /v1/collections/{name}/documents/s3 Ingest from S3 bucket
POST /v1/collections/{name}/documents/batch/upload Batch upload (up to 100)
POST /v1/collections/{name}/documents/batch/status Batch status check
POST /v1/collections/{name}/documents/batch/get Batch get documents
POST /v1/collections/{name}/documents/batch/delete Batch delete
GET /v1/collections/{name}/documents/batch/progress Stream batch progress (SSE)
GET /v1/documents/{id} Cross-collection document lookup
GET /v1/documents/{id}/chunks Cross-collection chunks lookup
Query
POST /v1/collections/{name}/query Query collection
POST /v1/query Multi-collection query
POST /v1/batch/query Batch query
Vectors
POST /v1/collections/{name}/vectors/upsert Upsert raw vectors
POST /v1/collections/{name}/vectors/delete Delete vectors by ID
S3 jobs
GET /v1/collections/{name}/s3-jobs List S3 ingest jobs
GET/PATCH/DELETE /v1/collections/{name}/s3-jobs/{id} Inspect / update / cancel a job
POST /v1/collections/{name}/s3-jobs/{id}/resync Re-scan the bucket for a job
Evaluation
POST /v1/evaluation Run a golden-set eval (recall@k, MRR, nDCG)
Webhooks (admin)
GET/POST /v1/admin/webhooks List / create webhooks
GET/PUT/DELETE /v1/admin/webhooks/{id} Manage a webhook
POST /v1/admin/webhooks/{id}/test Fire a test delivery
GET /v1/admin/webhooks/{id}/deliveries Delivery history
POST /v1/admin/webhooks/{id}/deliveries/{did}/replay Replay a past delivery
Admin
GET/POST /v1/admin/users Manage admin accounts
GET/POST /v1/admin/api-keys Mint bigrag_sk_… API keys with scopes
GET /v1/admin/audit Audit log
GET/POST /v1/admin/embedding-presets Saved embedding provider configs
GET /v1/stats Platform stats
GET /v1/usage Usage analytics
GET /v1/embeddings/models List embedding models
GET /v1/collections/{name}/analytics Collection analytics

Full interactive docs are available at /docs (Swagger UI) while the server is running.
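POST /v1/evaluation scores retrieval against a golden set using recall@k, MRR, and nDCG. The server computes these for you; as an independent reference, the standard binary-relevance formulas look like this (a sketch, not bigRAG's implementation):

```python
import math

def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top-k results."""
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / len(relevant) if relevant else 0.0

def mrr(retrieved: list[str], relevant: set[str]) -> float:
    """Reciprocal rank of the first relevant result (0.0 if none retrieved)."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Binary-relevance nDCG: DCG of the actual ranking over the ideal DCG."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc_id in enumerate(retrieved[:k]) if doc_id in relevant)
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0
```

A drop in any of these between runs (e.g. after switching embedding models or chunking settings) is the regression signal the evaluation runner is built to catch.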

Embedding Models

Provider Model Dimensions
openai text-embedding-3-small (default) 1536
openai text-embedding-3-large 3072
cohere embed-english-v3.0 1024
cohere embed-multilingual-v3.0 1024
cohere embed-english-light-v3.0 384
cohere embed-multilingual-light-v3.0 384
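Note the Dimensions column: vectors from different models live in different spaces with different sizes, so a collection is tied to one model and switching requires POST /v1/collections/{name}/reembed. A small sketch of why, using cosine similarity (the usual comparison for embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors.

    Only meaningful when both vectors come from the same model; vectors of
    different dimensions (e.g. 1536 vs 1024 from the table above) cannot
    even be compared element-wise.
    """
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Even two models with the same dimension produce incompatible spaces, which is why re-embedding (not just reconfiguring) is the supported migration path.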

SDKs

TypeScript

npm install @bigrag/client
import { BigRAG } from "@bigrag/client";

const client = new BigRAG({ apiKey: "your-key", baseUrl: "http://localhost:6100" });

// Upload a document
const doc = await client.uploadDocument("docs", new File([pdf], "paper.pdf"));

// Stream processing progress
for await (const event of client.streamDocumentProgress("docs", doc.id)) {
  console.log(event.step, event.progress);
}

// Query
const { results } = await client.query("docs", { query: "What is RAG?" });

// Ingest from S3
await client.documents.ingestS3("docs", {
  bucket: "my-bucket",
  prefix: "reports/",
  no_sign_request: true,
});

Python

pip install bigrag
from bigrag import BigRAG

client = BigRAG(api_key="your-key", base_url="http://localhost:6100")

# Upload a document (the Python client is async — call these from an async function)
doc = await client.documents.upload("docs", "/path/to/paper.pdf")

# Query
result = await client.queries.query("docs", {"query": "What is RAG?"})

# Ingest from S3
result = await client.documents.ingest_s3(
    "docs",
    bucket="my-bucket",
    prefix="reports/",
    no_sign_request=True,
)

Rust

# Cargo.toml
[dependencies]
bigrag = "0.1"
use bigrag::BigRAG;

let client = BigRAG::new("http://localhost:6100").with_api_key("your-key");
let result = client.query("docs", "What is RAG?").top_k(10).send().await?;

MCP server

Expose bigRAG to Claude Desktop, Cursor, and any MCP-aware runtime:

BIGRAG_URL=https://bigrag.example.com \
BIGRAG_API_KEY=bigrag_sk_... \
bigrag-mcp

Drop this into claude_desktop_config.json:

{
  "mcpServers": {
    "bigrag": {
      "command": "bigrag-mcp",
      "env": {
        "BIGRAG_URL": "https://bigrag.example.com",
        "BIGRAG_API_KEY": "bigrag_sk_..."
      }
    }
  }
}

Six tools are exposed — list_collections, get_collection, query, list_documents, get_document, get_document_chunks. See docs/sdks/mcp for details.

Configuration

All settings are read from environment variables with the BIGRAG_ prefix, or can be set in bigrag.toml:

Variable Description Default
BIGRAG_PORT Server port 6100
BIGRAG_DATABASE_URL Postgres URL postgres://bigrag:bigrag@localhost:5433/bigrag
BIGRAG_MILVUS_URI Milvus URI http://localhost:19530
BIGRAG_REDIS_URL Redis URL redis://localhost:6380/0
BIGRAG_ENV dev or prod (prod enables startup safety checks) dev
BIGRAG_SESSION_COOKIE_SECURE HTTPS-only session cookies false
BIGRAG_EMBEDDING_API_KEY Default embedding API key
BIGRAG_MASTER_KEY Fernet key that encrypts provider credentials at rest (required in prod)
BIGRAG_STORAGE_BACKEND local or s3 for document blob storage local
BIGRAG_INGESTION_WORKERS Background workers 4
BIGRAG_MAX_UPLOAD_SIZE_MB Max upload size 1024

Supported Formats

PDF, DOCX, PPTX, XLSX, HTML, Markdown, CSV, TSV, XML, JSON, PNG, JPG, TIFF, BMP, GIF — powered by Docling with OCR support for scanned documents and images.

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Sponsor

If bigRAG is useful to you, consider sponsoring the project.

License

MIT
