This project provides:
- FastAPI backend
- PostgreSQL + pgvector memory store
- LangChain-based chunking + embeddings for RAG
- Ollama local generation with llama3.1:8b
- Optional Claude cloud generation via dropdown selection
- Minimal React frontend
## Backend setup

```bash
cd /Users/vashanth/Desktop/Repos/vash/pkb/backend
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
cp .env.example .env
```

## Frontend setup

```bash
cd /Users/vashanth/Desktop/Repos/vash/pkb/frontend
npm install
```

## PostgreSQL + pgvector

Using Homebrew services:
```bash
/opt/homebrew/bin/brew install postgresql@17 pgvector
/opt/homebrew/bin/brew services stop postgresql@16 || true
/opt/homebrew/bin/brew services start postgresql@17
# Creates the DB using your local macOS username role (typical on Homebrew Postgres).
createdb pkb
```

Apply the schema:

```bash
psql -d pkb -f /Users/vashanth/Desktop/Repos/vash/pkb/backend/schema.sql
```

If `psql -d pkb` fails with a role error, set `DATABASE_URL` in `/Users/vashanth/Desktop/Repos/vash/pkb/backend/.env` to:

```
DATABASE_URL=postgresql://$USER@localhost:5432/pkb
```
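Before starting the backend, it can help to sanity-check that the URL is well-formed. The `check_database_url` helper below is a hypothetical sketch, not part of the project:

```python
from urllib.parse import urlparse

def check_database_url(url: str) -> dict:
    """Parse a DATABASE_URL and return its components for a quick sanity check."""
    parts = urlparse(url)
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # Postgres default when the URL omits a port
        "dbname": parts.path.lstrip("/"),
    }

# $USER is expanded by your shell; shown here with a placeholder user.
print(check_database_url("postgresql://vashanth@localhost:5432/pkb"))
```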
## PII redaction (optional)

Set `REDACT_PII=1` in `/Users/vashanth/Desktop/Repos/vash/pkb/backend/.env` to redact common PII patterns before embedding/storing.
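The redaction step likely amounts to regex substitution before text reaches the embedder. A minimal sketch, assuming patterns along these lines (the backend's actual patterns may differ):

```python
import re

# Illustrative PII patterns; real-world redaction usually needs a broader set.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace each PII match with a bracketed label before embedding/storing."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact_pii("Reach me at jane@example.com or 555-123-4567."))
# Reach me at [EMAIL] or [PHONE].
```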
## Ollama (local models)

```bash
ollama serve
ollama pull llama3.1:8b
ollama pull nomic-embed-text
```

## Claude (optional cloud generation)

Set these in `/Users/vashanth/Desktop/Repos/vash/pkb/backend/.env`:

```
CLAUDE_API_KEY=your_key_here
CLAUDE_MODEL=claude-3-5-sonnet-latest
CLAUDE_BASE_URL=https://api.anthropic.com
ALLOW_CLAUDE_WITHOUT_OLLAMA=1
EMBEDDING_FALLBACK_PROVIDER=simple
```
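At startup these keys would typically be read from the environment. A sketch of such a loader — the key names mirror the `.env` entries above, but the defaults and structure here are assumptions, not the project's actual settings code:

```python
import os

def load_claude_settings() -> dict:
    """Read Claude-related settings from the environment with assumed defaults."""
    return {
        "api_key": os.getenv("CLAUDE_API_KEY", ""),
        "model": os.getenv("CLAUDE_MODEL", "claude-3-5-sonnet-latest"),
        "base_url": os.getenv("CLAUDE_BASE_URL", "https://api.anthropic.com"),
        "allow_without_ollama": os.getenv("ALLOW_CLAUDE_WITHOUT_OLLAMA", "0") == "1",
        "embedding_fallback": os.getenv("EMBEDDING_FALLBACK_PROVIDER", "simple"),
    }

os.environ["ALLOW_CLAUDE_WITHOUT_OLLAMA"] = "1"
print(load_claude_settings()["allow_without_ollama"])  # True
```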
When Claude is selected in the frontend dropdown, the backend routes generation requests to the Anthropic API.
If Ollama embeddings are unavailable, Claude mode can still run semantic RAG using a deterministic, purely local "simple" embedding fallback (`EMBEDDING_FALLBACK_PROVIDER=simple`).
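One common way to build such a deterministic, model-free fallback is feature hashing: hash each token into a fixed-size vector, then L2-normalize. This is an illustrative sketch of the idea, not the backend's actual implementation:

```python
import hashlib
import math

def simple_embed(text: str, dim: int = 64) -> list[float]:
    """Deterministic bag-of-words hashing embedding; no model required."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # SHA-256 keeps the mapping stable across processes and runs.
        h = int(hashlib.sha256(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Deterministic: the same text always embeds to the same vector.
assert simple_embed("my onboarding plan") == simple_embed("my onboarding plan")
```

Vectors like these are far weaker than real embeddings, but they keep pgvector nearest-neighbor queries functional when Ollama is down.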
If Claude returns a 404, the model id is usually not available for your account or API version; set `CLAUDE_MODEL` to a model listed in your Anthropic console.
## Run

Backend:

```bash
cd /Users/vashanth/Desktop/Repos/vash/pkb/backend
source .venv/bin/activate
uvicorn app.main:app --reload --port 8000
```

Ingest notes:

```bash
cd /Users/vashanth/Desktop/Repos/vash/pkb/backend
source .venv/bin/activate
python ingest_notes.py /absolute/path/to/notes_or_files
```

Frontend:

```bash
cd /Users/vashanth/Desktop/Repos/vash/pkb/frontend
npm run dev
```

## Chat API

Request:
```json
{
  "question": "What did I say about my onboarding plan?",
  "session_id": "optional-existing-session-id",
  "top_k": 8,
  "model_provider": "llama"
}
```

Response includes:

- `answer`
- `context` (the exact injected context block)
- `retrieved_items`
- `session_id`
- `model_provider`
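A request like the one above can be sent with only the standard library. The endpoint path (`/api/chat` here) is an assumption — check the FastAPI routes in the backend for the real one:

```python
import json
import urllib.request

# Request body mirroring the example above; session_id is omitted to start a new session.
payload = {
    "question": "What did I say about my onboarding plan?",
    "top_k": 8,
    "model_provider": "llama",
}
req = urllib.request.Request(
    "http://localhost:8000/api/chat",  # assumed path; verify against app.main
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment with the backend running:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["answer"], body["session_id"])
```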
- Retrieval combines:
  - recent conversation items from the same `session_id`
  - semantic vector matches from `memory_items`
- This avoids failures where repeated short questions (for example, "what's my name") dominate nearest-neighbor results.