A fully local, modular chatbot application with strict environment isolation and clean architecture. Everything runs completely offline with no external APIs or cloud dependencies.
- Local LLM: Powered by Ollama (llama3 model)
- Vector Database: ChromaDB for semantic search
- Local Embeddings: Ollama's nomic-embed-text model
- Memory Layer: Persistent conversation memory
- Production Architecture: Modular, replaceable components
- Clean UI: Modern, responsive chat interface
- Environment Isolation: Dedicated Python virtual environment
- Zero Dependencies: No external APIs or internet required
local-chatbot/
├── backend/
│ ├── app.py # FastAPI application
│ ├── config.py # Configuration settings
│ ├── routes.py # API routes
│ ├── requirements.txt # Python dependencies
│ ├── __init__.py
│ └── services/
│ ├── __init__.py
│ ├── llm.py # LLM service (Ollama)
│ ├── embeddings.py # Embeddings service
│ ├── vector_store.py # ChromaDB operations
│ ├── memory.py # Memory service
│ └── chat_service.py # Main orchestration
│
├── frontend/
│ ├── index.html # Chat UI
│ ├── app.js # Frontend logic
│ ├── style.css # Styling
│
├── data/ # Data storage
│ └── chroma_db/ # Vector database
│
├── .venv/ # Python virtual environment (auto-created)
├── setup.ps1 # Setup script (Windows PowerShell)
├── setup.sh # Setup script (Linux/macOS)
├── .gitignore # Git ignore rules
└── README.md # This file
- Python 3.9+: Download here
- Ollama: Download here
- Modern Web Browser: Chrome, Firefox, Safari, or Edge
cd "Project Jarvis"Windows (PowerShell):
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
.\setup.ps1Linux/macOS:
chmod +x setup.sh
./setup.shIf not done automatically by setup script:
ollama pull llama3
ollama pull nomic-embed-textAfter setup, in the same terminal:
python backend/app.pyYou should see:
INFO: Uvicorn running on http://0.0.0.0:8000
Open your browser to: http://localhost:8000
Start chatting! 💬
python -m venv .venvWindows:
.venv\Scripts\activateLinux/macOS:
source .venv/bin/activatepip install -r backend/requirements.txtmkdir -p data/chroma_dbpython backend/app.py┌─────────────────────────────────────────────┐
│ Frontend (HTML/CSS/JS) │
│ - Chat UI │
│ - Message handling │
│ - Health monitoring │
└──────────────┬──────────────────────────────┘
│ HTTP REST API
┌──────────────▼──────────────────────────────┐
│ FastAPI Backend │
│ - /api/chat (POST) │
│ - /api/health (GET) │
│ - /api/clear (POST) │
└──────────────┬──────────────────────────────┘
│
┌──────────┼──────────────┬────────────────┐
│ │ │ │
▼ ▼ ▼ ▼
┌────────┐ ┌────────────┐ ┌────────────┐ ┌──────────┐
│ LLM │ │ Embeddings │ │ Vector DB │ │ Memory │
│ (llm) │ │ (embeddings)│ │ (chroma_db)│ │ (json) │
└────────┘ └────────────┘ └────────────┘ └──────────┘
↓ ↓ ↓ ↓
┌────────────────────────────────────────────────────┐
│ Ollama (Local) │
│ - llama3 (LLM) │
│ - nomic-embed-text (Embeddings) │
└────────────────────────────────────────────────────┘
Each service is standalone and replaceable:
llm.py: LLM interactions via Ollamaembeddings.py: Text embedding generationvector_store.py: ChromaDB operationsmemory.py: Conversation history and factschat_service.py: Orchestration layer
POST /api/chat
Request:
{
"message": "Hello, how are you?",
"system_prompt": null // Optional
}
Response:
{
"status": "success",
"message": "I'm doing well, thank you!",
"context_used": true,
"num_context_docs": 2
}
GET /api/health
Response:
{
"status": "healthy",
"llm": true,
"embeddings": true,
"vector_store": true,
"memory": true
}
POST /api/clear
Response:
{
"status": "success",
"message": "All data cleared"
}
Configure via environment variables:
export ENV=development # development or production
export HOST=0.0.0.0 # Server host
export PORT=8000 # Server port
export OLLAMA_URL=http://localhost:11434 # Ollama server
export LLM_MODEL=llama3 # LLM model name
export EMBEDDING_MODEL=nomic-embed-text # Embedding modelSee backend/config.py for all options.
- Vector Database:
data/chroma_db/ - Memory/Conversations:
data/memory.json
All data is stored locally on your machine.
Error: ConnectionError: Cannot connect to Ollama at localhost:11434
Solution:
- Download and install Ollama: https://ollama.ai
- Start Ollama (check your applications menu)
- Verify it's running:
curl http://localhost:11434
Error: ModuleNotFoundError: No module named 'fastapi'
Solution: Make sure virtual environment is activated:
- Windows:
.venv\Scripts\activate - Linux/macOS:
source .venv/bin/activate
Error: Address already in use: ('0.0.0.0', 8000)
Solution: Change the port:
export PORT=8001 # Or any other free port
python backend/app.pyError: Error generating response from LLM: pull access denied
Solution: Pull models manually:
ollama pull llama3
ollama pull nomic-embed-textSolution: Recreate it:
rm -rf .venv # Or: rmdir /s .venv (Windows)
python -m venv .venv
.venv\Scripts\activate # Windows
# or
source .venv/bin/activate # Linux/macOS
pip install -r backend/requirements.txtThe application logs to console with timestamps. Filter by level:
View only errors:
python backend/app.py 2>&1 | grep ERROR- All data is stored locally
- No telemetry or tracking
- No external API calls
- Ollama runs on
localhost:11434only - CORS is enabled for development (disable in production)
For production:
- Set
ENV=productionin environment - Disable CORS or configure specific origins
- Use production ASGI server (Gunicorn/Uvicorn cluster)
- Set up proper logging and monitoring
- Use environment files for sensitive config
Example:
export ENV=production
gunicorn -w 4 -k uvicorn.workers.UvicornWorker backend.app:app- FastAPI: Web framework
- Uvicorn: ASGI server
- Ollama: Local LLM integration
- ChromaDB: Vector database
- Pydantic: Data validation
See backend/requirements.txt for exact versions.
This is a starter template. Feel free to:
- Add new services
- Customize the LLM model
- Extend the frontend
- Add new capabilities
This project is provided as-is for local use.
Happy Chatting! 🚀
For issues or questions, check the troubleshooting section or consult the project structure.