Skip to content

anmolsharma152/CodexEngine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

144 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

CodexEngine

A self-hosted document intelligence platform that lets you upload documents, ask questions, and get answers backed by source citations.

The project currently has two tracks:

  • v4: a stable retrieval-first research engine
  • v5: an experimental workspace agent that creates and reuses persistent artifacts

Repository Structure

Stable Release (v4)

The main branch contains the stable document intelligence platform:

  • PDF and document ingestion
  • Hybrid retrieval (vector search + BM25)
  • Source citations
  • FastAPI backend
  • Next.js frontend
  • PostgreSQL + pgvector
  • Self-hosted deployment

Experimental Development (v5)

Active development is happening on the agentic branch.

The current research direction explores whether persistent artifacts can make AI assistants more useful than chat alone. Instead of relying entirely on conversation history, the agent creates, stores, and reuses workspace artifacts across sessions.

Highlights:

  • Custom agent loop (no LangGraph)
  • Provider-agnostic LLM layer
  • Workspace artifacts
  • Persistent context experiments
  • Tool-driven architecture

➑️ Experimental branch: https://github.com/anmolsharma152/CodexEngine/tree/agentic

Why This Project Exists

Most document assistants answer a question and immediately forget the work they just performed.

CodexEngine started as a retrieval-augmented research system and is evolving into an experiment around persistent AI workspaces, where analysis, reports, and findings can become reusable knowledge objects.

CodexEngine began as a retrieval-first research engine and is now being used to explore persistent AI workspaces.

Branches

Branch Status Purpose
main Stable Production-ready document intelligence platform (v4)
agentic Experimental Workspace-agent research and v5 development

Quick Start

git clone https://github.com/anmolsharma152/CodexEngine.git
cd CodexEngine

# Backend
cd codex-backend
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env   # fill in your keys
uvicorn server:app --reload --host 127.0.0.1 --port 8000

# Frontend (new terminal)
cd codex-frontend
npm install && npm run dev

Set NEXT_PUBLIC_SUPABASE_URL, NEXT_PUBLIC_SUPABASE_ANON_KEY, NEXT_PUBLIC_API_URL in codex-frontend/.env.local. Open http://localhost:3000 β€” register, upload a PDF, and start asking questions.

How It Works

When you ask a question, CodexEngine:

  1. Decides if it needs to search your documents or can answer directly
  2. Searches your indexed content using vector similarity + keyword search + optional web fallback
  3. Scores and reranks the results
  4. Generates an answer with source citations ([p. X], [r. X], [doc], [web])

All of this runs through the v4 retrieval pipeline, which currently uses LangGraph-based orchestration and self-evaluation loops (up to 3 retries if the initial answer is weak).

The experimental agentic branch replaces this architecture with a custom agent loop.

Running Modes

Feature Local / CI Production (Render 512MB)
Embeddings fastembed ONNX (bge-small-en-v1.5) Google Gemini API
Reranker CrossEncoder (ms-marco-MiniLM-L-6-v2) Score-based sort
Detection MemTotal > 1.5GB or no RENDER env RENDER=true or < 1.5GB

Both modes produce 384-dimensional vectors.

Architecture

flowchart TD
    User([User / Browser])

    subgraph Frontend [codex-frontend β€” Next.js 15]
        AuthUI[Auth UI]
        ChatUI[Chat UI / SSE]
        DocMgr[Document Manager]
        SupaSDK["@supabase/supabase-js<br>Auth JWT β†’ Bearer"]
    end

    subgraph Supabase [Supabase]
        SA[Auth<br>sign up / sign in]
        SB[Storage<br>documents bucket]
    end

    subgraph Backend [codex-backend β€” FastAPI]
        direction LR
        
        %% Graph Flow
        R[1. Router] -->|retrieval_required| C[2. Condenser]
        R -->|direct/meta| A[6. Actor]
        C --> Ret[3. Retriever]
        Ret --> E[4. Evaluator]
        E -->|retry_needed| RW[5. Rewriter]
        RW --> Ret
        E -->|sufficient: False| WS[Web Search Fallback]
        E -->|sufficient: True| A
        WS --> A
        A --> Resp[SSE Response]

        %% Ingestion Flow
        subgraph Ingestion [Background Ingestion]
            Q[(asyncio.Queue)]
            Worker[Worker Task]
            Q --> Worker
            Worker -->|Chunk & Embed| DB
        end
    end

    subgraph DB [PostgreSQL + pgvector]
        Threads[threads]
        Chunks[prose_chunks<br>384-dim vectors]
    end

    subgraph External [External APIs]
        Groq[Groq<br>LLM β€” llama-3.1]
        Gemini[Google Gemini<br>embeddings]
        FastEmbed[fastembed ONNX<br>embeddings]
    end

    User --> Frontend
    AuthUI --> SA
    SA -.->|JWT session| SupaSDK
    SupaSDK --> Backend
    ChatUI <-->|SSE stream| Backend
    DocMgr -->|upload to| SB
    DocMgr -->|enqueue| Q
    
    Backend --> DB
    
    Ret ---> FastEmbed
    Ret -.->|fallback| Gemini
    Backend ---> Groq
Loading

Testing

cd codex-backend
source .venv/bin/activate
python tests/test_golden.py       # Single golden query
python tests/test_rigorous.py     # Full sweep
python eval/ragas_eval.py         # RAGAS metrics

Learn More

Technical Highlights

  • FastAPI backend
  • Next.js frontend
  • PostgreSQL + pgvector
  • Hybrid retrieval (vector + BM25)
  • Server-sent events (SSE) streaming
  • Supabase authentication and storage
  • Provider-agnostic LLM architecture
  • Workspace-agent experimentation (v5)

Packages

 
 
 

Contributors