This project uses a fully open-source AI stack to provide:
- Streaming AI chat replies
- Voice output (Text-to-Speech)
- Local LLM via Ollama
- FastAPI backend
Install Ollama:
- https://ollama.com
After installing, pull a model:
- ollama pull llama3
Start the Ollama server:

- ollama serve

(By default, the server runs on http://localhost:11434.)
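The backend talks to this server over Ollama's HTTP API. Below is a minimal sketch of building a streaming request body for Ollama's `POST /api/generate` endpoint; the `build_generate_payload` helper name is illustrative, not part of this project:

```python
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default address

def build_generate_payload(prompt: str, model: str = "llama3") -> dict:
    """Build the JSON body Ollama expects for a streaming generation request."""
    return {
        "model": model,   # the model pulled earlier with `ollama pull llama3`
        "prompt": prompt,
        "stream": True,   # ask Ollama to stream chunks as they are generated
    }

# With Ollama running, the backend would send this with e.g.
#   requests.post(OLLAMA_URL, json=build_generate_payload("Hello"), stream=True)
payload = build_generate_payload("Hello")
```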
Python 3.11.x is recommended (required for Coqui TTS compatibility).
Install required packages:
- pip install fastapi uvicorn requests TTS
## Running the Backend
Start FastAPI server:
- python -m uvicorn main:app --reload
The backend will run on http://127.0.0.1:8000 (uvicorn's default).
## Chat Streaming Endpoint

- POST /chat-stream

Streams the AI response from Ollama in real time.
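Ollama streams its reply as newline-delimited JSON, one chunk per token. A sketch of the parsing step such an endpoint could use, assuming Ollama's `/api/generate` chunk format with `response` and `done` fields (the `extract_tokens` helper is illustrative):

```python
import json
from typing import Iterable, Iterator

def extract_tokens(ndjson_lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text tokens from Ollama's NDJSON stream until a chunk says done."""
    for raw in ndjson_lines:
        if not raw.strip():
            continue  # skip keep-alive blank lines
        chunk = json.loads(raw)
        if chunk.get("done"):
            break  # the final bookkeeping chunk carries no text
        yield chunk.get("response", "")

# Example with canned chunks (what requests' iter_lines() would yield):
sample = [b'{"response": "Hel", "done": false}',
          b'{"response": "lo!", "done": false}',
          b'{"done": true}']
tokens = list(extract_tokens(sample))  # -> ["Hel", "lo!"]
```

The same generator can be wrapped in FastAPI's `StreamingResponse` to forward tokens to the client as they arrive.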
## Voice (Text-to-Speech)

This project uses Coqui TTS (open-source) for natural AI voice.

Example model:

- tts_models/en/ljspeech/glow-tts

Voice is generated after the full AI message has been received.
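A sketch of the synthesis step using Coqui TTS's Python API; the `speak` helper and output path are illustrative, and the import is deferred so the heavy model only loads when voice is actually requested:

```python
MODEL_NAME = "tts_models/en/ljspeech/glow-tts"  # example model from above

def speak(text: str, out_path: str = "reply.wav") -> str:
    """Synthesize `text` to a WAV file with Coqui TTS and return the file path."""
    from TTS.api import TTS  # deferred: model loading is expensive
    tts = TTS(MODEL_NAME)
    tts.tts_to_file(text=text, file_path=out_path)
    return out_path

# Called once the full AI message has been accumulated, e.g.:
#   speak("Hello, how can I help?")
```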
## Web Support (CORS Enabled)

FastAPI is configured with CORS so that web frontends can call the API from the browser.
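The CORS setup likely resembles FastAPI's standard `CORSMiddleware` configuration; a sketch (the wildcard origin is an assumption for development — restrict it to the real frontend origin in production):

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow browser-based frontends to call the API.
# NOTE: allow_origins=["*"] is a development-only assumption.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)
```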
## Stack

- FastAPI -> Backend server
- Ollama -> Local LLM (the AI brain)
- Coqui TTS -> Voice generation
- Flutter -> Frontend UI (https://github.com/darttechwala/ChatBot)
## Features

- Fully offline capable (local AI)
- Streaming chat responses
- Voice replies
- Multi-platform (Android, iOS, Web, macOS, Windows)
- Open-source stack
Note: Ollama must be running before you start the backend.