hasanusluu/IT-Support-RAG-System-LangGraph-Pipeline

IT Support RAG System & LangGraph Pipeline 🧠

An intelligent IT Desk Assistant powered by LangChain, LangGraph, and Google Gemini. This system employs a robust LangGraph Pipeline to orchestrate the flow between specific policy retrieval (RAG), general conversation, and topic guardrails.

📸 Interface Demo

Here is the system in action, demonstrating both general chat and policy-based retrieval:

Chat Interface Demo 1: Managing General vs. Technical Queries

Chat Interface Demo 2: Retrieving Specific Policy Information

⚙️ How It Works (The Logic Flow)

The system doesn't just "guess"; it follows a strict LangGraph decision tree:

  1. Input: The user asks a question (e.g., "How do I update the VPN?").
  2. Router Node: The "Brain" analyzes the intent.
    • 👉 Case A (INTERNAL): If it's about company policy, software, or errors -> Go to PDF Store.
      • Retrieves relevant chunks from data/pdfs/.
    • 👉 Case B (GENERAL): If it's a greeting or general IT chat -> Go to QA Style Store.
      • Retrieves conversation examples from the CSV.
    • 👉 Case C (GUARDRAIL): If it's about Sports/Cooking -> Reject.
  3. Generator: Gemini combines the retrieved info (PDFs or QA Style) to generate the final response.
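
The decision tree above can be sketched in plain Python. This is only an illustration: the keyword lists, the function names, and the refusal text are assumptions, and a keyword match stands in for the router's actual intent analysis, which the project performs inside a LangGraph node.

```python
import re
from typing import Literal

Route = Literal["INTERNAL", "GENERAL", "GUARDRAIL"]

# Hypothetical keyword lists, used here only as a stand-in for the
# router's real intent analysis.
OFF_TOPIC_WORDS = {"sports", "football", "cooking", "recipe"}
INTERNAL_WORDS = {"vpn", "password", "policy", "software", "error"}


def route(question: str) -> Route:
    """Classify a question into one of the three cases (A, B, C)."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    if words & OFF_TOPIC_WORDS:
        return "GUARDRAIL"  # Case C: reject
    if words & INTERNAL_WORDS:
        return "INTERNAL"   # Case A: go to the PDF store
    return "GENERAL"        # Case B: go to the QA style store


def answer(question: str) -> str:
    """Dispatch on the route; retrieval and generation are placeholders."""
    decision = route(question)
    if decision == "GUARDRAIL":
        return "Sorry, I can only help with IT-related questions."
    store = "pdf_store" if decision == "INTERNAL" else "qa_style_store"
    return f"[{decision}] retrieve from {store}, then generate a reply"


if __name__ == "__main__":
    for q in ("How do I update VPN?", "Hello there!", "Any cooking tips?"):
        print(q, "->", answer(q))
```

Checking the guardrail first means an off-topic question never reaches a retriever, which matches the order of the cases only loosely; the real graph may evaluate them differently.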

🚀 Features

  • LangGraph Pipeline: A structured control flow that acts as the "brain," deciding which tool to use (Router -> Retriever -> Generator).
  • Dual-Store Strategy:
    • Access Policy RAG: Retrieves facts from PDFs.
    • Conversational QA: Retrieves "Persona & Style" from CSV.
  • Topic Guardrails: Politely refuses non-IT questions (e.g., sports, cooking).
  • Modern UI: A dark-themed React chat interface with real-time system logs.

🛠️ Project Structure

  • src/api.py: FastAPI Backend (The Brain).
  • src/graph.py: LangGraph Logic (Nodes & Edges).
  • src/ingestion.py: Script to load PDFs and CSV into the vector database.
  • ui/: React Frontend (Vite).
  • data/pdfs/: Place your company policy PDFs here.
  • vectorstore/: ChromaDB storage for embeddings.

🏃‍♂️ How to Run

You need two terminal windows running.

1. Backend (Python)

python src/api.py

Wait for "Application startup complete".

2. Frontend (React)

cd ui
npm run dev

Open the provided localhost link (e.g., http://localhost:5173).

🧪 Verified Scenarios

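  • "VPN bağlantım koptu" ("My VPN connection dropped"): Fetches troubleshooting steps from the PDF store.
  • "Şifremi değiştiremiyorum" ("I can't change my password"): Checks the Password Policy.
  • "Spor yapacağım" ("I'm going to do sports"): Intercepted by the guardrails and refused.
  • "Nasılsın?" ("How are you?"): Returns a polite, professional greeting.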

🔑 Requirements

  • Python 3.10+
  • Node.js
  • Google Gemini API Key (in .env)

🧠 Technical Details and RAG Strategy

This project uses a customized RAG (Retrieval-Augmented Generation) architecture to process and interpret its data. The technical details follow:

1. Chunking Strategy

Data is chunked using two different approaches:

  • PDF Documents (Policies):

    • Method: RecursiveCharacterTextSplitter.
    • Chunk Size: 1000 characters.
    • Overlap: 100 characters (so that context is not lost at chunk boundaries).
    • Metadata: Each chunk keeps the name of its source file and its page number.
  • CSV File (Conversation Style):

    • Method: "Row-Based Chunking".
    • Logic: Each CSV row is treated as a single, whole chunk.
    • Structure: The user's question (body) becomes the text that is searched in the vector database. The answer (answer) and the subject (subject) are stored as metadata (hidden information). The LLM reads the answer from this metadata.
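
A minimal sketch of both chunking approaches, using only the standard library. The function names and the dictionary layout are assumptions for illustration. The sliding-window splitter below is a simplification: RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and sentences, which this version does not attempt. The column names body, answer, and subject are taken from the description above.

```python
import csv
import io


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


def rows_to_documents(csv_text: str) -> list[dict]:
    """Turn each CSV row into one chunk: body is searched, the rest is metadata."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "page_content": row["body"],
            "metadata": {"answer": row["answer"], "subject": row["subject"]},
        }
        for row in reader
    ]


if __name__ == "__main__":
    sample_pdf_text = "x" * 2500
    print([len(c) for c in chunk_text(sample_pdf_text)])

    sample_csv = "subject,body,answer\nVPN,How do I reset VPN?,Restart the client.\n"
    print(rows_to_documents(sample_csv))
```

Keeping the answer in metadata rather than in the embedded text means similarity search matches on how users phrase questions, not on how answers are worded.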

2. RAG Architecture (LangGraph)

Unlike classic RAG chains, the system is structured as a State Machine:

  1. Router: Analyzes the incoming question.
    • If it is a technical or company question: PDF Store + Style Store (both facts and tone).
    • If it is general chat: Style Store only (tone only).
  2. Retrieval: The most similar records (Top-K) are fetched from the relevant store.
    • Embeddings: models/text-embedding-004 (Google).
    • Vector Store: ChromaDB (Local).
  3. Generation:
    • Model: gemini-2.5-flash-lite.
    • The model produces the final answer by combining the correct information (PDF) with the correct conversational style (CSV metadata).
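
The three stages can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration: the store names, the record layout, the prompt wording, and the use of cosine similarity over toy vectors in place of real embeddings and ChromaDB queries.

```python
import math


def stores_for(route: str) -> list[str]:
    """INTERNAL questions use both stores; general chat uses the style store only."""
    if route == "INTERNAL":
        return ["pdf_store", "style_store"]
    return ["style_store"]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], records: list[dict], k: int = 3) -> list[dict]:
    """Return the k records whose vectors are most similar to the query."""
    ranked = sorted(records, key=lambda r: cosine(query_vec, r["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question: str, facts: list[str], style_examples: list[str]) -> str:
    """Combine retrieved facts and style examples into one generator prompt."""
    return (
        "Facts:\n" + "\n".join(facts)
        + "\n\nStyle examples:\n" + "\n".join(style_examples)
        + "\n\nQuestion: " + question
    )


if __name__ == "__main__":
    records = [
        {"text": "VPN troubleshooting steps", "vector": [1.0, 0.0]},
        {"text": "Password rotation rules", "vector": [0.0, 1.0]},
    ]
    hits = top_k([1.0, 0.1], records, k=1)
    print(stores_for("INTERNAL"))
    print(build_prompt("My VPN dropped", [h["text"] for h in hits], ["Happy to help!"]))
```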

⚠️ Disclaimer

The PDF documents provided in the data/pdfs/ directory (e.g., VPN Policy, Password Policy) are fictional examples created solely for demonstration purposes. They do not represent the actual policies of any real company or organization.
