VishvakR/PaperCrew

CrewAI RAG Qdrant — Production Research Pipeline

A production-grade multi-agent research pipeline using CrewAI, LangChain, Qdrant, and your choice of LLM (Ollama or OpenAI) with HuggingFace embeddings.

Architecture

src/
├── core/               # Config, logging, exceptions
├── ingestion/          # load → split → embed (HuggingFace) → upsert
├── vectorstore/        # Qdrant client factory + collection manager
├── rag/                # RetrievalQA chain (Ollama or OpenAI)
├── tools/              # CrewAI BaseTool wrapping the RAG chain
├── agents/             # Researcher & Writer agent factories
├── crew/               # Crew orchestrator
├── api/                # FastAPI server (schemas, router, app)
└── main.py             # CLI entrypoint
outputs/                # Agent-generated .md files land here

LLM Providers

| Provider | Env | Notes |
|----------|-----|-------|
| Ollama | LLM_PROVIDER=ollama | Free, runs locally (default) |
| OpenAI | LLM_PROVIDER=openai | Requires OPENAI_API_KEY |
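
The provider switch described above can be sketched as a small resolver. This is a minimal illustration, not the project's actual code: the `LLMChoice` type and `resolve_llm` function are hypothetical names, and only the variable names and defaults (`LLM_PROVIDER`, `OLLAMA_MODEL`, `OPENAI_MODEL`, `OPENAI_API_KEY`) come from this README.

```python
import os
from typing import Mapping, NamedTuple


class LLMChoice(NamedTuple):
    """Hypothetical container for the selected provider and model."""
    provider: str
    model: str


def resolve_llm(env: Mapping[str, str] = os.environ) -> LLMChoice:
    """Pick the provider and model from environment-style settings."""
    # Ollama is the documented default when LLM_PROVIDER is unset.
    provider = env.get("LLM_PROVIDER", "ollama").strip().lower()
    if provider == "ollama":
        return LLMChoice("ollama", env.get("OLLAMA_MODEL", "llama3.2"))
    if provider == "openai":
        # OpenAI is the only provider that needs an API key.
        if not env.get("OPENAI_API_KEY"):
            raise ValueError("LLM_PROVIDER=openai requires OPENAI_API_KEY")
        return LLMChoice("openai", env.get("OPENAI_MODEL", "gpt-4o"))
    raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
```

Passing a plain dict instead of `os.environ` makes the resolver easy to exercise without touching the real environment.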

Embeddings

Uses sentence-transformers/all-MiniLM-L6-v2 via HuggingFace. The model runs entirely locally, requires no API key, and outputs 384-dimensional vectors.
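
Because a Qdrant collection is created with a fixed vector size, swapping in a different embedding model without updating VECTOR_SIZE is likely to fail at upsert time. A guard along these lines could catch the mismatch earlier. It is a sketch only: `check_vectors` is a hypothetical helper, not part of this repository.

```python
EXPECTED_DIM = 384  # output size of all-MiniLM-L6-v2, as stated above


def check_vectors(vectors, expected_dim=EXPECTED_DIM):
    """Return the number of vectors, raising ValueError on a size mismatch."""
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(
                f"vector {index} has {len(vector)} dims, expected {expected_dim}"
            )
    return len(vectors)
```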

Quick Start

1. Install dependencies

uv sync

2. Configure environment

cp .env.example .env
# Fill in: QDRANT_URL, QDRANT_API_KEY
# If using Ollama: ensure Ollama is running and set OLLAMA_MODEL
# If using OpenAI: set OPENAI_API_KEY and LLM_PROVIDER=openai

3. (Ollama only) Pull your model

ollama pull llama3.2      # or mistral, gemma3, phi4, etc.

4. Ingest a PDF

make ingest PDF=attention.pdf

5. Run the research crew

make run              # uses provider from .env
make run-full PDF=attention.pdf   # ingest + run in one shot

Results → outputs/researcher_analysis.md and outputs/writer_summary.md.

6. Start the REST API

make serve        # dev with auto-reload
make serve-prod   # production

API docs → http://localhost:8000/docs

API Endpoints

| Method | Path | Description |
|--------|------|-------------|
| GET | /api/v1/health | Health check |
| POST | /api/v1/ingest | Ingest a PDF into the vector store |
| POST | /api/v1/run | Kick off the research crew |

POST /api/v1/run supports a per-request LLM override:

{
  "auto_ingest": false,
  "llm_provider": "ollama",
  "model": "mistral"
}
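
As a rough illustration, the request above could be sent from Python using only the standard library. The helper names `build_run_request` and `send_run_request` are hypothetical. The sketch also assumes that omitting `llm_provider` and `model` falls back to the values in .env, and it returns the response body as text because the response format is not documented here.

```python
import json
import urllib.request


def build_run_request(base_url, llm_provider=None, model=None, auto_ingest=False):
    """Build (but do not send) a POST request for /api/v1/run."""
    payload = {"auto_ingest": auto_ingest}
    # Assumption: the override fields are optional and may be left out.
    if llm_provider is not None:
        payload["llm_provider"] = llm_provider
    if model is not None:
        payload["model"] = model
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_run_request(request):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server from `make serve` running, `send_run_request(build_run_request("http://localhost:8000", "ollama", "mistral"))` would submit the same body as the JSON example.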

Environment Variables

See .env.example for the full list.

| Variable | Default | Description |
|----------|---------|-------------|
| LLM_PROVIDER | ollama | openai or ollama |
| OLLAMA_MODEL | llama3.2 | Any Ollama model |
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server |
| OPENAI_API_KEY | | Required only for OpenAI |
| OPENAI_MODEL | gpt-4o | OpenAI model name |
| EMBEDDING_MODEL | sentence-transformers/all-MiniLM-L6-v2 | HuggingFace model |
| QDRANT_URL | | Qdrant cluster endpoint |
| QDRANT_API_KEY | | Qdrant API key |
| VECTOR_SIZE | 384 | all-MiniLM-L6-v2 dims |
| CHUNK_SIZE | 512 | Text chunk size |
| LOG_LEVEL | INFO | Logging verbosity |
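
One way to read the vector store and ingestion variables with their documented defaults is sketched below. It is illustrative rather than the project's actual config module: `IngestionSettings` and `load_ingestion_settings` are hypothetical names, and treating both Qdrant variables as required is an assumption based on their having no default.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestionSettings:
    """Hypothetical typed view of the non-LLM variables."""
    qdrant_url: str
    qdrant_api_key: str
    embedding_model: str
    vector_size: int
    chunk_size: int
    log_level: str


def _positive_int(env, name, default):
    """Parse an integer variable, falling back to the documented default."""
    raw = env.get(name, str(default))
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")
    return value


def load_ingestion_settings(env=os.environ):
    """Build settings from a mapping such as os.environ."""
    # Assumption: variables without a default must be set explicitly.
    missing = [n for n in ("QDRANT_URL", "QDRANT_API_KEY") if not env.get(n)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return IngestionSettings(
        qdrant_url=env["QDRANT_URL"],
        qdrant_api_key=env["QDRANT_API_KEY"],
        embedding_model=env.get(
            "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        vector_size=_positive_int(env, "VECTOR_SIZE", 384),
        chunk_size=_positive_int(env, "CHUNK_SIZE", 512),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Failing fast on a missing or malformed value keeps configuration errors out of the ingestion and query paths.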

Development

make lint        # ruff check
make format      # ruff format
make smoke-test  # quick import checks (no API keys needed)

