A production-ready Retrieval-Augmented Generation (RAG) chatbot system that enables intelligent conversations with your documents. Upload PDFs, Word documents, or text files and ask questions about their content using advanced AI capabilities.
- Multi-format Document Processing: PDF, DOCX, DOC, and TXT files
- Intelligent Document Chunking: Semantic chunking with overlap for context preservation
- Real-time Chat Interface: Live streaming responses with SignalR
- Vector Semantic Search: Powered by Qdrant for accurate document retrieval
- Citation Support: Automatic source document citations in responses
- Background Processing: Async document processing with real-time status updates
- Real-time Updates: Live document processing status via SignalR hubs
- Multi-layer Caching: Redis caching for optimal performance
- Health Monitoring: Comprehensive health checks for all services
- Clean Architecture: Modular design with clear separation of concerns
- Production Ready: Full error handling, logging, and monitoring
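The chunking-with-overlap idea above can be sketched as follows. This is an illustrative fixed-size chunker only, not the project's actual semantic chunker, which splits on semantic boundaries rather than raw character offsets:

```typescript
// Illustrative sketch: fixed-size chunking with overlap. Each chunk shares
// `overlap` characters with the previous one so that context spanning a
// chunk boundary is preserved in at least one chunk.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= overlap) throw new Error("chunkSize must exceed overlap");
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance by chunkSize minus the overlap
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}
```

With `ChunkSize: 500` and `ChunkOverlap: 50` (the defaults shown later in the configuration), each chunk repeats the last 50 characters of its predecessor.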
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    React Web    │     │    .NET Core    │     │  Azure OpenAI   │
│    Frontend     │────►│       API       │────►│     GPT-4o      │
│                 │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         │ SignalR               │                       │
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Document     │     │   SQL Server    │     │     Qdrant      │
│   Processing    │     │    Metadata     │     │    Vector DB    │
│     Worker      │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Azure Blob    │     │   Redis Cache   │     │   Background    │
│     Storage     │     │                 │     │    Services     │
│                 │     │                 │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
```
```
RagChatbot/
├── RagChatbot.API/              # Web API & SignalR Hubs
│   ├── Controllers/             # REST API endpoints
│   ├── Hubs/                    # SignalR real-time communication
│   └── frontend/                # React TypeScript frontend
├── RagChatbot.Core/             # Domain entities & interfaces
├── RagChatbot.Application/      # Business logic & services
├── RagChatbot.Infrastructure/   # External service implementations
│   ├── Data/                    # EF Core & database
│   ├── Services/                # Azure OpenAI, Qdrant, Redis
│   └── Workers/                 # Background processing
└── Tests/                       # Unit & integration tests
```
- .NET 8 - Modern C# web framework
- ASP.NET Core - REST API with OpenAPI/Swagger
- SignalR - Real-time bidirectional communication
- Entity Framework Core - ORM with SQL Server
- Serilog - Structured logging
- React 18 - Modern component-based UI
- TypeScript - Type-safe JavaScript
- Tailwind CSS - Utility-first styling
- Zustand - Lightweight state management
- Microsoft SignalR Client - Real-time updates
- Axios - HTTP client with interceptors
- Azure OpenAI - GPT-4o for chat, text-embedding-ada-002 for embeddings
- Qdrant - Vector database for semantic search
- Redis - Multi-layer caching strategy
- Azure Blob Storage - Scalable file storage
- iText7 - PDF text extraction
- DocumentFormat.OpenXml - Word document processing
- Semantic Chunking - Intelligent text segmentation
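The vector search in the stack above ranks chunks by similarity between the query embedding and each stored chunk embedding. Qdrant computes this server-side; the sketch below only illustrates the underlying cosine-similarity measure:

```typescript
// Cosine similarity between two embedding vectors: 1 for identical
// direction, 0 for orthogonal. Shown for illustration; Qdrant performs
// this scoring internally over the stored vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```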
- .NET 8 SDK
- Node.js 18+
- Docker & Docker Compose
- Azure OpenAI API access
- SQL Server (or LocalDB)
```bash
git clone https://github.com/yourusername/rag-chatbot-poc.git
cd rag-chatbot-poc
```

Start Qdrant and Redis:

```bash
docker-compose up -d qdrant redis
```

```bash
cd RagChatbot.API
cp appsettings.json appsettings.Development.json
```

Edit `appsettings.Development.json`:

```json
{
  "ConnectionStrings": {
    "DefaultConnection": "Server=(localdb)\\mssqllocaldb;Database=RagChatbotDB;Trusted_Connection=true;",
    "Redis": "localhost:6379"
  },
  "AzureOpenAI": {
    "Endpoint": "https://your-resource.openai.azure.com/",
    "ApiKey": "your-api-key",
    "ChatDeploymentName": "gpt-4o",
    "EmbeddingDeploymentName": "text-embedding-ada-002",
    "ApiVersion": "2024-02-15-preview"
  },
  "Qdrant": {
    "Host": "localhost",
    "Port": 6333,
    "CollectionName": "documents"
  }
}
```

Apply migrations and run the API:

```bash
dotnet ef database update
dotnet run
```

API available at: https://localhost:7262

```bash
cd frontend
npm install
npm run dev
```

Frontend available at: http://localhost:3000
- Navigate to the Documents page
- Drag & drop or select files (PDF, DOCX, DOC, TXT)
- Monitor real-time processing status
- View processing statistics and chunk counts
- Go to the Chat page
- Create a new chat session
- Ask questions about your uploaded documents
- View source citations for all responses
- Use the Query page for one-off questions
- Search documents without creating a chat session
- View similarity scores and source chunks
```bash
# Database
ConnectionStrings__DefaultConnection="Server=localhost;Database=RagChatbot;..."
ConnectionStrings__Redis="localhost:6379"

# Azure OpenAI
AzureOpenAI__Endpoint="https://your-resource.openai.azure.com/"
AzureOpenAI__ApiKey="your-api-key"
AzureOpenAI__ChatDeploymentName="gpt-4o"
AzureOpenAI__EmbeddingDeploymentName="text-embedding-ada-002"

# Qdrant
Qdrant__Host="localhost"
Qdrant__Port="6333"
Qdrant__CollectionName="documents"

# Azure Storage
AzureStorage__ConnectionString="DefaultEndpointsProtocol=https;..."
AzureStorage__ContainerName="documents"
```

```json
{
  "RagSettings": {
    "ChunkSize": 500,
    "ChunkOverlap": 50,
    "MaxRetrievedChunks": 5,
    "SimilarityThreshold": 0.7
  }
}
```

```bash
docker-compose up -d
```

```bash
# Build and deploy
docker-compose -f docker-compose.prod.yml up -d
```

- `/health` - Overall application health
- `/health/ready` - Readiness probe (DB/Cache)
- `/health/live` - Liveness probe
- `/health/external` - External service status
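One way the `MaxRetrievedChunks` and `SimilarityThreshold` settings above could be applied to raw search hits is sketched below; the type names and shapes are illustrative assumptions, not the project's actual code:

```typescript
// Hedged sketch: filter hits below the similarity threshold, then keep
// the top-N by score. Shapes and names are assumptions for illustration.
interface ScoredChunk { id: string; score: number; }

function selectChunks(
  hits: ScoredChunk[],
  maxRetrievedChunks: number,   // e.g. 5
  similarityThreshold: number,  // e.g. 0.7
): ScoredChunk[] {
  return hits
    .filter(h => h.score >= similarityThreshold) // drop weak matches
    .sort((a, b) => b.score - a.score)           // best first
    .slice(0, maxRetrievedChunks);               // cap context size
}
```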
- Serilog structured logging to console and files
- Application Insights integration ready
- Health check dashboard for monitoring dependencies
```bash
# Unit tests
dotnet test RagChatbot.Tests.Unit

# Integration tests
dotnet test RagChatbot.Tests.Integration

# All tests with coverage
dotnet test --collect:"XPlat Code Coverage"
```

```bash
cd frontend
npm test
npm run test:coverage
```

- API Authentication ready for JWT integration
- CORS configured for frontend origins
- Input validation on all endpoints
- SQL injection protection via EF Core
- XSS protection via React's built-in sanitization
- Secrets management via Azure Key Vault (configurable)
- Redis caching for frequently accessed data
- Connection pooling for database and Redis
- Lazy loading for large document collections
- Streaming responses for real-time user experience
- Background processing for CPU-intensive tasks
- Horizontal scaling ready with stateless design
- Load balancing compatible
- Container orchestration ready (Kubernetes)
- CDN integration for static assets
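The caching strategy listed above follows the usual cache-aside pattern: check the cache, and on a miss load the value and store it with a TTL. A minimal in-process sketch (a real deployment would use a Redis client such as StackExchange.Redis instead):

```typescript
// Minimal cache-aside sketch with TTL expiry. Illustrative only; the
// project's caching is backed by Redis, not an in-process Map.
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  constructor(private ttlMs: number) {}

  getOrSet(key: string, load: () => V): V {
    const hit = this.store.get(key);
    if (hit && hit.expiresAt > Date.now()) return hit.value; // hit: skip load
    const value = load();                                    // miss: load source
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
    return value;
  }
}
```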
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'Add amazing feature'`)
- Push to branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Follow SOLID principles and clean architecture
- Write unit tests for new functionality
- Update documentation for API changes
- Use conventional commits for clear history
- Chat Sessions: `/api/chat/sessions`
- Documents: `/api/documents`
- Queries: `/api/query`
- Health: `/health`
- Chat Hub: `/hubs/chat` - Real-time messaging
- Document Hub: `/hubs/documents` - Processing updates
Full API documentation available at /swagger when running in development mode.
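A query against the endpoints above might look like the sketch below. The request body shape (`question`, `maxResults`) and the use of a JSON POST are assumptions for illustration; check Swagger for the actual contract:

```typescript
// Hypothetical client for the /api/query endpoint; request shape assumed.
interface QueryRequest { question: string; maxResults?: number; }

// Serialize the request body exactly as it would be sent over the wire.
function buildQueryPayload(req: QueryRequest): string {
  return JSON.stringify(req);
}

async function queryDocuments(baseUrl: string, req: QueryRequest): Promise<unknown> {
  const res = await fetch(`${baseUrl}/api/query`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildQueryPayload(req),
  });
  if (!res.ok) throw new Error(`Query failed: HTTP ${res.status}`);
  return res.json();
}
```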
- Azure OpenAI Connection: Verify endpoint and API key
- Qdrant Connection: Ensure Docker container is running
- Database Issues: Check connection string and run migrations
- Frontend CORS: Verify API URL in frontend configuration
- Create an issue for bugs or feature requests
- Check existing issues for solutions
- Review logs for detailed error information
This project is licensed under the MIT License - see the LICENSE file for details.
- Multi-tenant Support - Organization-based document isolation
- Advanced Analytics - Usage metrics and conversation insights
- Plugin System - Extensible document processors
- Mobile App - React Native companion app
- Voice Interface - Speech-to-text integration
- Collaborative Features - Shared documents and conversations
Built with ❤️ for intelligent document interaction