A state-of-the-art AI-powered voice assistant and receptionist designed for dental clinics. Built with LangGraph, Groq, Pinecone, and FastAPI, this system provides human-like conversational experiences with full RAG-enabled knowledge recall and automated appointment scheduling.
- Ultra-Low Latency Conversational Flow: Optimized with Groq (Llama-3.1-8b and GPT-OSS-20B) to deliver response times under ~1 second.
- RAG-Enabled Knowledge Base: Context-aware retrieval from Pinecone allows the bot to answer complex FAQs about clinic procedures, hours, and pricing.
- Automated Scheduling: Direct integration with Google Calendar API to check availability and book appointments in real-time.
- Persistent State Management: Uses LangGraph's checkpointer to maintain "long-term memory" across multiple turns.
- Proactive Latency Masking: Implements filler messages and early RAG pre-fetching to eliminate silence during heavy API processing.
- Multi-Channel Delivery: Native support for high-stakes voice (Vapi) and visual web chat.
This backend is designed as a Custom LLM Provider for Vapi. It implements the Vapi Custom LLM protocol via SSE (Server-Sent Events).
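The streamed reply follows the OpenAI chat-completion chunk format that Vapi expects from a custom LLM endpoint. A minimal sketch of the SSE framing (the `sse_chunks` helper is illustrative, not the project's actual code):

```python
import json

def sse_chunks(deltas, model="custom-llm"):
    """Yield OpenAI-style chat.completion.chunk events as SSE frames."""
    for text in deltas:
        chunk = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [
                {"index": 0, "delta": {"content": text}, "finish_reason": None}
            ],
        }
        # Each SSE event is a "data: ..." line terminated by a blank line
        yield f"data: {json.dumps(chunk)}\n\n"
    # Vapi (like OpenAI clients) treats [DONE] as end-of-stream
    yield "data: [DONE]\n\n"

frames = list(sse_chunks(["Hello", " there"]))
```

In FastAPI, a generator like this would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.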
- Deploy the Backend: Use ngrok or a publicly reachable host to expose the FastAPI server.
- Vapi Configuration:
  - Set Model: `custom-llm`
  - Set URL: `https://your-domain.com/chat/completions`
- Internal Logic:
  - The `/chat/completions` endpoint maps incoming Vapi payloads to LangGraph `thread_id`s.
  - It supports the `model` parameter passed by Vapi.
  - It extracts the `call.id` for persistent checkpointing.
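The payload-to-thread mapping can be sketched as plain Python; the field names (`call.id`, `model`) follow the Vapi Custom LLM request shape, while `extract_thread_config` and the `web-default` fallback are illustrative, not the project's actual code:

```python
def extract_thread_config(payload: dict) -> dict:
    """Derive the LangGraph run config from an incoming Vapi payload.

    Vapi sends an OpenAI-style body plus a `call` object; reusing the
    stable `call.id` as the checkpointer thread id means every turn of
    the same phone call resumes the same conversation state.
    """
    call_id = payload.get("call", {}).get("id", "web-default")
    return {
        "model": payload.get("model", "custom-llm"),
        "configurable": {"thread_id": call_id},
    }

config = extract_thread_config({
    "model": "custom-llm",
    "messages": [{"role": "user", "content": "Hi"}],
    "call": {"id": "call_abc123"},
})
```

The `configurable.thread_id` dict is the shape LangGraph expects when invoking a graph compiled with a checkpointer.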
The project includes a separate, modern Frontend Web Chat UI.
- Streaming Response: Real-time SSE streaming for a "typing" effect.
- Status Indicators: Displays `[SIGNAL_PROCESSING]` when the AI is performing RAG or Calendar lookups.
- Thread Persistence: Each web session maintains its own `threadId` for consistent memory.

- Navigate to the `frontend` directory.
- Run `npm install` followed by `npm run build`.
- The FastAPI server automatically serves the build at the `/chat` endpoint.
The core of the assistant is a directed acyclic graph (DAG) managed by LangGraph.
```mermaid
graph TD
    Entry([User Message]) --> Handshake[Early Handshake]
    Handshake --> Classify[Intent Classification]
    Classify --> Router{Intent Router}

    %% Standard Flows
    Router -- Greeting --> HandleGreeting[Handle Greeting]
    Router -- FAQ Query --> RAG[RAG Retrieval]
    Router -- Cancellation --> HandleCancel[Handle Cancellation]

    %% High-Latency Voice Flows (Masked by Filler)
    Router -- Appointment Check --> StatusFiller[Provide Filler]
    Router -- Search Availability --> AvailFiller[Provide Filler]
    Router -- Confirm Booking --> ConfirmFiller[Provide Filler]

    %% Booking Logic
    Router -- Booking Request --> Capture[Capture Details]
    Capture -- Missing Info --> END
    Capture -- Has Info --> AvailFiller

    %% Transitions from Filler
    StatusFiller --> CheckStatus[Check Appointment Status]
    AvailFiller --> CheckAvail[Check Availability]
    ConfirmFiller --> FinalConfirm[Confirm Appointment]

    %% Exit Points
    HandleGreeting --> END
    RAG --> END
    HandleCancel --> END
    CheckStatus --> END
    CheckAvail --> END
    FinalConfirm --> END

    style Handshake fill:#f96,stroke:#333
    style AvailFiller fill:#bbf,stroke:#333
    style StatusFiller fill:#bbf,stroke:#333
    style ConfirmFiller fill:#bbf,stroke:#333
```
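The router's branching can be sketched as a plain dispatch table; the intent labels and node names mirror the diagram above, though the project's actual identifiers may differ:

```python
# Maps a classified intent to the next graph node (illustrative names).
ROUTES = {
    "greeting": "handle_greeting",
    "faq": "rag_retrieval",
    "cancellation": "handle_cancellation",
    # High-latency intents detour through a filler node first (latency masking)
    "appointment_check": "status_filler",
    "search_availability": "avail_filler",
    "confirm_booking": "confirm_filler",
    "booking_request": "capture_details",
}

def route_intent(state: dict) -> str:
    """Pick the next node from the classified intent; unknown intents fall back to RAG."""
    return ROUTES.get(state.get("intent"), "rag_retrieval")
```

In LangGraph terms, a function of this shape would be registered as the conditional-edge selector on the router node.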
In the `classify_intent` node, the system pre-fetches RAG results in a parallel `asyncio.create_task`. While the intent classifier (Llama 8B) is determining the turn's goal, the vector database (Pinecone) is already retrieving relevant clinic data. This cuts response latency by ~30-40%.
- Core: Python 3.10+, FastAPI
- Orchestration: LangGraph (LangChain)
- Large Language Models:
- Groq/Llama-3.1-8b-instant (Intent Classification)
- Groq/GPT-OSS-20B (Primary Brain/Reasoning)
- Vector Database: Pinecone
- Embeddings: FastEmbed (BAAI/bge-small-en-v1.5)
- External APIs: Google Calendar, Twilio
- State Persistence: SQLite (Checkpointer)
Copy `.env.example` to `.env` and fill in:
- `GROQ_API_KEY`: For ultra-fast inference.
- `OPENAI_API_KEY`: Fallback or primary brain.
- `PINECONE_API_KEY` & `PINECONE_INDEX_NAME`: Knowledge base.
- `GOOGLE_CALENDAR_TOKEN_JSON`: Encoded credentials for scheduling.
- `TWILIO_ACCOUNT_SID` & `TWILIO_AUTH_TOKEN`: For SMS notifications.
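A minimal `.env` sketch; the variable names come from the list above, all values are placeholders:

```
GROQ_API_KEY=gsk_xxx
OPENAI_API_KEY=sk-xxx
PINECONE_API_KEY=pc-xxx
PINECONE_INDEX_NAME=dental-clinic-kb
GOOGLE_CALENDAR_TOKEN_JSON=base64-encoded-token
TWILIO_ACCOUNT_SID=ACxxxxxxxx
TWILIO_AUTH_TOKEN=xxxx
```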
Built for FamilyHealth Clinic, this receptionist is engineered for Functional Purity. By leveraging LangGraph's update-based state return pattern, the assistant avoids memory wipes and maintains a robust, immutable history of the conversation, ensuring it follows the persona rules of "Daniel," our helpful clinic receptionist.
This project is licensed under the MIT License - see the LICENSE file for details.
Ben Onwurah - @onwurahben
Project Link: https://github.com/onwurahben/voice-ai-bot
Built with ❤️ using Groq, LangGraph, Pinecone, and FastAPI