KrishnaNsingh/Capstone_project

A chatbot application that lets users upload PDF documents and ask questions about their content, using Retrieval-Augmented Generation (RAG) with Google Gemini and the Pinecone vector database.

RAG Chatbot Demo

🌟 Features

| Feature | Description |
| --- | --- |
| 📄 PDF Upload | Upload and process PDF documents of any size |
| 💬 AI-Powered Chat | Ask questions about your documents using Google Gemini |
| 🔍 Context Retrieval | Get relevant context snippets from your documents |
| 🌓 Dark/Light Mode | Toggle between dark and light themes |
| 📱 Responsive Design | Works on desktop, tablet, and mobile devices |
| 🗂️ Document Management | View, manage, and delete uploaded documents |

🏗️ Architecture

graph TB
    A[Frontend - React/Vite] --> B[API Gateway]
    B --> C[Backend - Express.js]
    C --> D[(Pinecone Vector DB)]
    C --> E[(In-Memory Storage)]
    C --> F[Google Gemini API]
    C --> G[PDF Processing]
    
    subgraph Vercel
        A
    end
    
    subgraph Render
        C
        D
        E
        F
        G
    end
    
    style A fill:#4F46E5,stroke:#000,color:#fff
    style C fill:#10B981,stroke:#000,color:#fff
    style D fill:#8B5CF6,stroke:#000,color:#fff
    style F fill:#F59E0B,stroke:#000,color:#fff

🔄 Data Flow

sequenceDiagram
    participant U as User
    participant F as Frontend
    participant B as Backend
    participant P as Pinecone
    participant G as Gemini
    
    U->>F: Upload PDF
    F->>B: POST /api/upload
    B->>B: Process PDF & Extract Text
    B->>B: Chunk Text Content
    B->>G: Generate Embeddings
    B->>P: Store Embeddings
    B->>F: Return Document Info
    
    U->>F: Ask Question
    F->>B: POST /api/chat
    B->>G: Generate Question Embedding
    B->>P: Retrieve Similar Context
    B->>G: Generate Answer with Context
    B->>F: Return Answer & Context Snippets
    F->>U: Display Results
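
The question-answering half of the sequence above can be sketched as a single function with its three external calls injected. This is only an illustration: the `RagDeps` interface, the `answerQuestion` name, the prompt wording, and the default `topK` are assumptions, not the code in `server/lib/rag-service.ts`.

```typescript
// Hypothetical shapes; the real services in server/lib may differ.
interface Snippet {
  text: string;
  score: number;
}

interface RagDeps {
  embed(text: string): Promise<number[]>; // stands in for the Gemini embedding call
  search(vector: number[], topK: number): Promise<Snippet[]>; // stands in for the Pinecone query
  generate(prompt: string): Promise<string>; // stands in for the Gemini answer call
}

// Mirrors the chat steps in the diagram: embed the question,
// retrieve similar context, then generate an answer from that context.
async function answerQuestion(question: string, deps: RagDeps, topK = 3) {
  const vector = await deps.embed(question);
  const snippets = await deps.search(vector, topK);

  // Number the snippets so the model (and the UI) can refer to them.
  const context = snippets.map((s, i) => `[${i + 1}] ${s.text}`).join("\n");
  const prompt =
    `Answer using only the context below.\n\n` +
    `Context:\n${context}\n\nQuestion: ${question}`;

  const answer = await deps.generate(prompt);
  return { answer, snippets };
}
```

Injecting the dependencies keeps the flow testable without network access: fakes can replace the Gemini and Pinecone clients in unit tests.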

🚀 Quick Start

Prerequisites

| Requirement | Version |
| --- | --- |
| Node.js | >= 16.0.0 |
| npm | >= 8.0.0 |
| Google Gemini API Key | - |
| Pinecone Account | - |

Environment Variables

Create a .env.local file with the following variables:

| Variable | Description | Example |
| --- | --- | --- |
| GEMINI_API_KEY | Google Gemini API Key | AIzaSyB123456789... |
| PINECONE_API_KEY | Pinecone API Key | abc123xyz... |
| PINECONE_CLOUD | Pinecone Cloud Provider | aws |
| PINECONE_REGION | Pinecone Region | us-west-2 |
| PINECONE_INDEX | Pinecone Index Name | rag-chatbot-index |
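
A missing variable is easier to diagnose at startup than on the first upload. The sketch below shows one way the server could check the five variables before doing any work; `loadConfig` and `AppConfig` are hypothetical names and are not taken from this repository.

```typescript
// Names of the variables the README asks for in .env.local.
const REQUIRED_VARS = [
  "GEMINI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_CLOUD",
  "PINECONE_REGION",
  "PINECONE_INDEX",
] as const;

type VarName = (typeof REQUIRED_VARS)[number];
type AppConfig = Record<VarName, string>;

// Hypothetical helper: validates an environment-like object and returns
// a typed config, or throws an error listing every missing variable.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  // Treat undefined, empty, and whitespace-only values as missing.
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as AppConfig;
  for (const name of REQUIRED_VARS) {
    config[name] = env[name] as string;
  }
  return config;
}
```

In a Node process this would typically be called once as `loadConfig(process.env)`, after whatever mechanism the project uses to read `.env.local` has run.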

Installation

# Clone the repository
git clone https://github.com/your-username/rag-chatbot.git
cd rag-chatbot

# Install dependencies
npm install

# Start development server
npm run dev

Deployment Architecture

| Service | Platform | URL Pattern |
| --- | --- | --- |
| Frontend | Vercel | https://your-app.vercel.app |
| Backend | Render + Uptime Robot | https://your-backend.onrender.com |
| Vector DB | Pinecone | https://your-index-1234567.svc.XYZ.pinecone.io |

🛠️ API Endpoints

Document Management

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/upload | POST | Upload and process a PDF document |
| /api/documents | GET | Retrieve all uploaded documents |
| /api/documents/:id | GET | Retrieve a specific document |
| /api/documents/:id | DELETE | Delete a document and its data |
| /api/messages/:documentId | GET | Retrieve chat messages for a document |
| /api/messages/:documentId | DELETE | Clear chat messages for a document |

Chat Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/chat | POST | Ask a question about a document |
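
The README does not document the request body for `/api/chat`. As a rough sketch, a client might build the request as shown below, assuming a JSON body with `documentId` and `question` fields. Both field names and the `buildChatRequest` helper are illustrative guesses; check `server/routes.ts` for the actual contract.

```typescript
// Assumed request body; the field names are illustrative, not confirmed.
interface ChatRequestBody {
  documentId: string;
  question: string;
}

// Hypothetical helper that prepares the path and options for a POST to /api/chat.
function buildChatRequest(documentId: string, question: string) {
  const trimmed = question.trim();
  if (trimmed.length === 0) {
    throw new Error("Question must not be empty");
  }
  const body: ChatRequestBody = { documentId, question: trimmed };
  return {
    path: "/api/chat",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Possible usage with the global fetch API (baseUrl is a placeholder):
//   const { path, init } = buildChatRequest(docId, "What is the main finding?");
//   const response = await fetch(baseUrl + path, init);
```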

📁 Project Structure

rag-chatbot/
├── client/                 # React frontend
│   ├── src/
│   │   ├── components/    # UI components
│   │   ├── hooks/         # Custom React hooks
│   │   ├── lib/           # Utility functions
│   │   ├── pages/         # Page components
│   │   └── App.tsx        # Main app component
│   └── index.html         # HTML entry point
├── server/                # Express backend
│   ├── lib/               # Core services
│   │   ├── gemini-service.ts    # Gemini API integration
│   │   ├── pdf-processor.ts     # PDF processing
│   │   ├── pinecone-service.ts  # Pinecone integration
│   │   └── rag-service.ts       # RAG logic
│   ├── index.ts           # Server entry point
│   ├── routes.ts          # API routes
│   └── storage.ts         # Data storage
├── shared/                # Shared types and schemas
├── package.json           # Project dependencies
└── README.md              # This file

🧠 How It Works

Retrieval Augmented Generation (RAG)

  1. Document Processing:

    • PDF files are uploaded and parsed
    • Text content is extracted and chunked
    • Each chunk is converted to embeddings using Google Gemini
  2. Vector Storage:

    • Embeddings are stored in Pinecone vector database
    • Each vector is associated with metadata (document ID, page number, etc.)
  3. Question Answering:

    • User questions are converted to embeddings
    • Similar context is retrieved from Pinecone
    • Gemini generates answers using the retrieved context
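
To make the chunking and retrieval steps concrete, here is a minimal, dependency-free sketch. It assumes fixed-size character windows with overlap and cosine similarity as the ranking metric. The chunk size, overlap, and metric used by this project are not stated in the README, and in the deployed app the similarity search is performed by Pinecone rather than in process. All function names are hypothetical.

```typescript
// Split text into overlapping character windows (sizes are illustrative defaults).
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("Require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    // Stop once a window has reached the end of the text.
    if (start + size >= text.length) break;
  }
  return chunks;
}

// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have equal length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

interface StoredVector {
  id: string;
  values: number[];
}

// In-memory stand-in for a vector database query: rank by similarity, keep the best k.
function topK(query: number[], vectors: StoredVector[], k: number) {
  return vectors
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.values) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.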

Technologies Used

| Category | Technology |
| --- | --- |
| Frontend | React, TypeScript, Tailwind CSS, Vite |
| Backend | Express.js, Node.js |
| AI | Google Gemini, Pinecone |
| Storage | In-Memory Storage (for demo) |
| PDF Processing | pdf-parse |
| Deployment | Vercel (Frontend), Render (Backend) |

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a pull request
