Skip to content

Rithikis14/Reminisence_AI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🧠 Reminiscence AI

A Voice Companion for Dementia Care

Python React FastAPI Ollama License

A fully local, privacy-first AI voice companion that speaks to dementia patients in a familiar, comforting voice — 24/7, with infinite patience.

FeaturesArchitectureInstallationUsageContributing


💡 The Problem

Dementia patients experience profound isolation, confusion, and anxiety — especially when human caregivers are unavailable. Repetitive questions go unanswered. Familiar faces are forgotten. The patient feels like a burden.

Reminiscence AI solves this.

By combining local speech recognition, a compassionate language model, and voice cloning technology, this system creates a tireless AI companion that:

  • Speaks in the voice of a loved one
  • Applies Validation Therapy — never correcting, always accepting
  • Uses Reminiscence Therapy — gently evoking positive memories
  • Is available 24 hours a day, 7 days a week
  • Runs 100% locally — no cloud, no subscriptions, no privacy concerns

"Humans naturally tire of repetitive questions — an AI does not."


✨ Features

Feature Description
🎙️ Voice Cloning Upload a 15–30 second audio clip of a family member — the AI speaks in that voice
🧠 Local LLM Powered by Ollama (phi3:mini / llama3) — no API keys, no usage costs
👂 Speech-to-Text Faster-Whisper for fast, accurate offline transcription
🔊 Neural TTS Coqui XTTS-v2 for high-quality, cloned voice synthesis
🛡️ 100% Private All processing happens on your machine — patient data never leaves
👴 Elder-Friendly UI Large buttons, calm colors, simple one-tap interaction
💬 Personalized Prompts Patient name, memories, loved ones, and hobbies shape every response
🔄 Fallback Chain XTTS-v2 → edge-tts → pyttsx3 — always speaks, never crashes

🏗️ Architecture

Patient speaks
      ↓
Browser records audio (WebM)
      ↓
POST /chat ──→ FastAPI (port 8000)
      ↓
Faster-Whisper transcribes speech → text
      ↓
Patient profile loaded (name, memories, loved ones)
      ↓
Personalized system prompt built
      ↓
Ollama (phi3:mini) generates compassionate response
      ↓
POST /tts ──→ Coqui XTTS-v2 (port 8001)
      ↓
If voice sample exists → clones the voice
      ↓
Audio played back to patient 🔊

Tech Stack

Layer Tool Why
STT Faster-Whisper Fast, local, no cloud
LLM Ollama + phi3:mini Runs on 4GB VRAM or CPU
TTS Coqui XTTS-v2 Real voice cloning, fully offline
TTS Fallback edge-tts / pyttsx3 Always available
Backend FastAPI (Python) Fast async API
Frontend React Clean, accessible UI

📁 Project Structure

Reminiscence_AI/
├── backend/                    # FastAPI backend (Python 3.12)
│   ├── companion_data/         # Patient profiles & voice samples (gitignored)
│   │   ├── patient.json        # Patient info & preferences
│   │   ├── voice_sample.wav    # Uploaded voice sample for cloning
│   │   └── cloned_response.wav # Generated audio output
│   ├── venv/                   # Python 3.12 virtual environment
│   ├── main.py                 # Main FastAPI server (port 8000)
│   ├── tts_server.py           # Coqui XTTS-v2 microservice (port 8001)
│   ├── requirements.txt        # Main backend dependencies
│   └── tts_requirements.txt    # TTS server dependencies (Python 3.11)
│
├── frontend/                   # React frontend
│   ├── src/
│   │   └── App.jsx             # Main UI component
│   ├── public/
│   └── package.json
│
├── tts_venv/                   # Python 3.11 venv for Coqui TTS
├── Modelfile                   # Ollama custom model config (phi3-small)
├── .gitignore
└── README.md

🚀 Installation

Prerequisites

Requirement Version Download
Python 3.12 (main) + 3.11 (TTS) python.org
Node.js 18+ nodejs.org
Ollama Latest ollama.com
Git Any git-scm.com

Step 1 — Clone the repository

git clone https://github.com/Rithikis14/Reminisence_AI.git
cd Reminisence_AI

Step 2 — Set up the main backend (Python 3.12)

cd backend
python -m venv venv

# Windows
venv\Scripts\activate

# macOS/Linux
source venv/bin/activate

pip install -r requirements.txt

Step 3 — Set up the TTS server (Python 3.11)

cd ..  # back to root

# Windows
py -3.11 -m venv tts_venv
tts_venv\Scripts\activate

# macOS/Linux
python3.11 -m venv tts_venv
source tts_venv/bin/activate

pip install -r backend/tts_requirements.txt

Note: Coqui TTS requires Python 3.11. It will not install on Python 3.12+.


Step 4 — Download the Ollama model

# Pull a lightweight model that fits in 4GB VRAM or runs on CPU
ollama pull phi3:mini

# Create a memory-optimized version
ollama create phi3-small -f Modelfile

Step 5 — Set up the frontend

cd frontend
npm install

🎮 Usage

Run these in 4 separate terminals:

Terminal 1 — Ollama (LLM):

ollama run phi3-small "say hello"
# Wait for response, then leave running

Terminal 2 — TTS Server (Voice Cloning):

# Windows
cd Reminiscence_AI\backend
..\tts_venv\Scripts\activate
uvicorn tts_server:app --host 0.0.0.0 --port 8001

# macOS/Linux
cd Reminiscence_AI/backend
source ../tts_venv/bin/activate
uvicorn tts_server:app --host 0.0.0.0 --port 8001

Terminal 3 — Main Backend:

cd Reminiscence_AI/backend
# activate venv (not tts_venv)
venv\Scripts\activate        # Windows
source venv/bin/activate     # macOS/Linux

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Terminal 4 — Frontend:

cd Reminiscence_AI/frontend
npm start

Open http://localhost:3000 in your browser.


🧭 First-Time Setup

When you first open the app, you'll be guided through a 3-step onboarding:

  1. Patient Info — Enter the patient's name and age
  2. Memories & Loved Ones — Add family member names, key life memories, hobbies
  3. Voice Sample (optional) — Upload a 15–30 second WAV audio clip of a family member's voice

Once setup is complete, the patient can hold the microphone button and speak naturally. The AI will respond in a warm, personalized way — and if a voice sample was uploaded, it will speak in that familiar voice.


🔧 Configuration

Changing the LLM model

In backend/main.py, find and update:

async def call_ollama(system_prompt: str, user_message: str, model: str = "phi3-small") -> str:

Available options:

Model VRAM/RAM Quality
phi3:mini ~2GB Good
phi3-small ~2GB (reduced context) Good
mistral ~4GB Better
llama3 ~5GB Best

Changing the TTS voice (no voice sample)

In backend/tts_server.py, update the fallback speaker name:

tts.tts_to_file(text=..., speaker="Claribel Dervla", ...)

Run tts --list_models to see available speakers.


🖥️ Hardware Requirements

Tier RAM GPU Experience
Minimum 8GB None CPU only, slower responses (~10s)
Recommended 16GB 4GB VRAM Good quality, ~3-5s responses
Ideal 16GB+ 8GB+ VRAM Fast responses, full voice cloning

Tip: Close VS Code and other heavy applications while running to free up RAM.


🔒 Privacy

  • No cloud — everything runs on your machine
  • No API keys — no Anthropic, OpenAI, or Google accounts needed
  • No data collection — patient data stays in companion_data/ on your disk
  • No internet required — works completely offline after setup

🤝 Contributing

Contributions are welcome! Here's how to get started:

# Fork the repo, then:
git clone https://github.com/YOUR_USERNAME/Reminisence_AI.git
git checkout -b feature/your-feature-name

# Make your changes, then:
git commit -m "Add: your feature description"
git push origin feature/your-feature-name
# Open a Pull Request

Ideas for contributions

  • 🌍 Multi-language support (Tamil, Hindi, Spanish, etc.)
  • 📱 Mobile-friendly PWA version
  • 📊 Caregiver dashboard with conversation logs
  • 🎵 Background music / ambient sound support
  • 🖼️ Photo display to trigger memories during conversation

🙏 Acknowledgements


Built with ❤️ for caregivers and patients everywhere

"Simply being heard makes patients feel valued rather than a burden."

About

This is an AI-driven project focused on memory preservation and enhancement. It leverages deep learning techniques such as GANs, U-Nets, and contextual agents to process, recall, and enrich user memories through text and image workflows. The project is modular, research-oriented, and designed for experimentation with advanced AI architectectures.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Contributors