This project is a production-style AI backend system designed to help developers debug distributed systems by combining:
- Document-based Retrieval (RAG)
- Real-time Log Ingestion (upcoming)
- LLM-based Root Cause Analysis
The system retrieves relevant context from documents (and later logs), and uses an LLM to generate accurate, grounded debugging explanations and fix suggestions.
Debugging distributed systems is difficult because:
- Logs are noisy and scattered
- Errors lack context
- Documentation is underutilized
This system solves that by combining logs, documentation, and semantic search to explain issues and suggest fixes.
Core capabilities:
- Retrieval-Augmented Generation (RAG)
- Semantic search using embeddings
- Context-aware response generation
- FastAPI-based modular backend
- Clean separation of layers (API, retrieval, service, cache)
- Qdrant vector store (persistent via Docker)
- Rich payload metadata:
  - user_id (multi-tenancy ready)
  - chunk tracking
  - embedding versioning
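
As a rough illustration, document chunks could be written to Qdrant like this. This is a minimal sketch assuming a collection named `docs` and 1536-dimensional embeddings; the collection name, IDs, and payload values are illustrative, not the project's actual code:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

client = QdrantClient(host="localhost", port=6333)

# Collection sized for OpenAI-style embedding vectors (1536 dims is an assumption)
if not client.collection_exists("docs"):
    client.create_collection(
        collection_name="docs",
        vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    )

embedding = [0.0] * 1536  # placeholder; real code embeds the chunk text

client.upsert(
    collection_name="docs",
    points=[
        PointStruct(
            id=1,
            vector=embedding,
            payload={
                "user_id": "tenant-42",     # multi-tenancy filter key
                "doc_id": "runbook-7",      # which document the chunk came from
                "chunk_index": 3,           # chunk tracking within the document
                "embedding_version": "v1",  # supports re-embedding migrations
                "text": "original chunk text goes here",
            },
        )
    ],
)
```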
- Retry mechanism with exponential backoff
- Timeout protection
- Safe fallback responses
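
The pattern behind these three bullets, sketched in plain Python (the project's actual error types and fallback shape may differ):

```python
import time

def call_with_retries(fn, max_attempts: int = 3, base_delay: float = 1.0,
                      timeout: float = 30.0):
    """Call `fn(timeout=...)`, retrying with exponential backoff.

    Returns a safe fallback response if every attempt fails.
    """
    for attempt in range(max_attempts):
        try:
            return fn(timeout=timeout)
        except Exception:  # real code would catch specific API / timeout errors
            if attempt < max_attempts - 1:
                time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    # Safe fallback, shaped like a normal answer so callers need no special case
    return {
        "answer": "The service is temporarily unavailable. Please retry shortly.",
        "confidence": "low",
        "source": "fallback",
    }
```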
- Redis-based response caching
- Reduces latency and cost
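
A minimal sketch of the caching layer, assuming `redis-py` and JSON-serialized responses (the key naming and TTL are illustrative):

```python
import hashlib
import json

import redis

r = redis.Redis.from_url("redis://localhost:6379")

def _cache_key(query: str) -> str:
    # Hash the query so arbitrary text maps to a fixed-size Redis key
    return "answer:" + hashlib.sha256(query.encode()).hexdigest()

def cached_answer(query: str):
    """Return the cached response dict for `query`, or None on a miss."""
    hit = r.get(_cache_key(query))
    return json.loads(hit) if hit else None

def store_answer(query: str, response: dict, ttl_seconds: int = 3600) -> None:
    """Cache the response with a TTL so stale answers eventually expire."""
    r.setex(_cache_key(query), ttl_seconds, json.dumps(response))
```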
- API key-based authentication (middleware)
- Secure request validation
- Redis-based per-key rate limiting
- Prevents API abuse
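
One way to express both checks as FastAPI middleware; this is a fixed-window rate-limiter sketch, not necessarily the project's exact implementation:

```python
import os

import redis
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()
r = redis.Redis.from_url(os.getenv("REDIS_URL", "redis://localhost:6379"))

RATE_LIMIT = 60  # requests per minute per key (illustrative)

@app.middleware("http")
async def auth_and_rate_limit(request: Request, call_next):
    # Reject requests that do not carry the expected API key header
    api_key = request.headers.get("x-api-key")
    if api_key != os.getenv("APP_API_KEY"):
        return JSONResponse({"detail": "Invalid API key"}, status_code=401)

    # Fixed-window counter: one Redis key per API key, reset every 60 seconds
    window_key = f"rl:{api_key}"
    count = r.incr(window_key)
    if count == 1:
        r.expire(window_key, 60)
    if count > RATE_LIMIT:
        return JSONResponse({"detail": "Rate limit exceeded"}, status_code=429)

    return await call_next(request)
```

A fixed window is the simplest counter to reason about; a sliding window or token bucket would smooth bursts at the cost of extra Redis state.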
Client → FastAPI API Layer → Middleware Layer (Auth + Rate Limiting) → Service Layer (LLM Orchestration) → Retrieval Layer (Qdrant Vector Search) → Cache Layer (Redis) → LLM (OpenAI API)
1. Client Request
   - User sends a query to `/api/ask`
2. Middleware Layer
   - API key authentication
   - Rate limiting (Redis-based)
3. Cache Layer (Redis)
   - Checks for a cached response
   - Returns it immediately if available
4. Retrieval Layer (RAG)
   - Converts the query into an embedding
   - Searches for similar chunks in Qdrant
   - Filters results by the similarity threshold
   - Builds the contextual input
5. LLM Service Layer
   - Constructs a structured prompt (context + query)
   - Calls the OpenAI API with retry and timeout handling
6. Response Processing
   - Formats the response (answer, confidence, source)
   - Stores the result in the Redis cache
7. Response returned to the client
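
Putting the steps together, the endpoint could be wired up roughly as below, reusing the caching and retry sketches above. `embed`, `generate_answer`, and `search_similar_chunks` are hypothetical helpers (the last is sketched in the next section), not the repository's actual function names:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class AskRequest(BaseModel):
    query: str

def embed(text: str) -> list[float]:
    """Hypothetical: call the embedding model for the query text."""
    raise NotImplementedError

def generate_answer(prompt: str, timeout: float) -> dict:
    """Hypothetical: call the chat model and shape the structured response."""
    raise NotImplementedError

@app.post("/api/ask")
async def ask(req: AskRequest):
    # 1. Cache check (see the Redis caching sketch above)
    cached = cached_answer(req.query)
    if cached is not None:
        return cached

    # 2. Retrieval: embed the query, then fetch similar chunks from Qdrant
    query_vector = embed(req.query)
    context = search_similar_chunks(query_vector, user_id="tenant-42")

    # 3. LLM call, wrapped in retry / timeout / fallback (sketch above)
    prompt = f"Context:\n{context}\n\nQuestion: {req.query}"
    result = call_with_retries(lambda timeout: generate_answer(prompt, timeout))

    # 4. Cache the structured response and return it
    store_answer(req.query, result)
    return result
```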
- Qdrant (Vector Database)
  - Stores embeddings plus the metadata payload
  - Enables semantic search and filtering
- Redis
  - Caches responses
  - Holds rate-limiting counters
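
The payload metadata stored at ingest time is what makes filtered search possible. Below is a sketch of a tenant-scoped query, assuming the `docs` collection and `user_id` payload key from the earlier sketch:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue

client = QdrantClient(host="localhost", port=6333)

def search_similar_chunks(query_vector: list[float], user_id: str,
                          threshold: float = 0.75, top_k: int = 5) -> str:
    """Vector search over `docs`, scoped to one tenant via the payload filter."""
    hits = client.search(
        collection_name="docs",
        query_vector=query_vector,
        query_filter=Filter(
            must=[FieldCondition(key="user_id", match=MatchValue(value=user_id))]
        ),
        score_threshold=threshold,  # drop weak matches below the cutoff
        limit=top_k,
    )
    # Concatenate the stored chunk text into the LLM's context input
    return "\n\n".join(hit.payload["text"] for hit in hits)
```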
- Modular architecture with clear separation of concerns
- Config-driven system behavior (see the settings sketch after this list)
- Fault-tolerant LLM integration (retry + fallback)
- Scalable vector-based retrieval
- Secure and rate-limited API access
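
For the config-driven behavior mentioned above, one common approach is a pydantic-settings object loaded from the same `.env` file used in setup. The extra tuning knobs below are illustrative assumptions, not confirmed settings of this project:

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """Central, environment-driven configuration (values come from .env)."""
    model_config = SettingsConfigDict(env_file=".env")

    app_api_key: str
    openai_api_key: str
    qdrant_host: str = "localhost"
    qdrant_port: int = 6333
    redis_url: str = "redis://localhost:6379"

    # Behavior knobs read by the retrieval and cache layers (assumed names)
    similarity_threshold: float = 0.75
    cache_ttl_seconds: int = 3600
    rate_limit_per_minute: int = 60

settings = Settings()  # import this singleton wherever config is needed
```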
- Python – core language
- FastAPI – backend API framework
- Qdrant – vector database for semantic search
- Redis – caching and rate limiting
- OpenAI API – LLM inference
- Docker – containerized services
Setup:

```bash
git clone https://github.com/your-username/your-repo.git
cd your-repo
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Create a `.env` file:

```env
APP_API_KEY=your_secret_key
OPENAI_API_KEY=your_openai_key
QDRANT_HOST=localhost
QDRANT_PORT=6333
REDIS_URL=redis://localhost:6379
```

Start Qdrant and Redis (in separate terminals), then run the API:

```bash
docker run -p 6333:6333 qdrant/qdrant
redis-server
uvicorn app.main:app --reload
```

Example request:

```http
POST /api/ask
x-api-key: your_secret_key
Content-Type: application/json

{
  "query": "What backend technologies are used?"
}
```

Example response:

```json
{
  "answer": "The backend technologies include Python, FastAPI, Redis...",
  "confidence": "medium",
  "source": "rag_docs"
}
```

Roadmap:

- Log ingestion system (`/ingest-log`)
- Incident detection (error patterns)
- Multi-source retrieval (logs + docs)
- Hybrid search (vector + keyword)
- Observability (Prometheus + Grafana)
- Multi-user API key system
- Cloud deployment (AWS)
- CI/CD pipeline
Project goal: to build a production-grade AI backend system that demonstrates:
- Backend engineering expertise
- Scalable system design
- Real-world AI/LLM integration
Akash Akuthota – Backend Developer (Python | FastAPI | AI Systems)