A modern chatbot application that lets users upload PDF documents and ask questions about their content, using Retrieval-Augmented Generation (RAG) with Google Gemini and the Pinecone vector database.

| Feature | Description |
|---|---|
| PDF Upload | Upload and process PDF documents of any size |
| AI-Powered Chat | Ask questions about your documents using Google Gemini |
| Context Retrieval | Get relevant context snippets from your documents |
| Dark/Light Mode | Toggle between dark and light themes |
| Responsive Design | Works on desktop, tablet, and mobile devices |
| Document Management | View, manage, and delete uploaded documents |

Architecture diagram (Mermaid source):

    graph TB
        A[Frontend - React/Vite] --> B[API Gateway]
        B --> C[Backend - Express.js]
        C --> D[(Pinecone Vector DB)]
        C --> E[(In-Memory Storage)]
        C --> F[Google Gemini API]
        C --> G[PDF Processing]
        subgraph Vercel
            A
        end
        subgraph Render
            C
            D
            E
            F
            G
        end
        style A fill:#4F46E5,stroke:#000,color:#fff
        style C fill:#10B981,stroke:#000,color:#fff
        style D fill:#8B5CF6,stroke:#000,color:#fff
        style F fill:#F59E0B,stroke:#000,color:#fff


Request flow (Mermaid source):

    sequenceDiagram
        participant U as User
        participant F as Frontend
        participant B as Backend
        participant P as Pinecone
        participant G as Gemini
        U->>F: Upload PDF
        F->>B: POST /api/upload
        B->>B: Process PDF & Extract Text
        B->>B: Chunk Text Content
        B->>G: Generate Embeddings
        B->>P: Store Embeddings
        B->>F: Return Document Info
        U->>F: Ask Question
        F->>B: POST /api/chat
        B->>G: Generate Question Embedding
        B->>P: Retrieve Similar Context
        B->>G: Generate Answer with Context
        B->>F: Return Answer & Context Snippets
        F->>U: Display Results

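
The "Generate Answer with Context" step in the flow above comes down to combining the retrieved snippets with the user's question before the model is called. The sketch below shows one plausible way to assemble such a prompt. The `ContextSnippet` shape, the `buildPrompt` name, and the instruction wording are all assumptions made for illustration; the project's actual prompt may look quite different.

```typescript
// ASSUMPTION: hypothetical shape for a retrieved chunk; the field names are illustrative.
interface ContextSnippet {
  text: string;
  page?: number;
}

// Hypothetical helper: builds a grounded prompt from the question and retrieved snippets.
function buildPrompt(question: string, snippets: ContextSnippet[]): string {
  // Number each snippet so an answer could refer back to its source.
  const context = snippets
    .map((snippet, index) => {
      const source = snippet.page !== undefined ? ` (page ${snippet.page})` : "";
      return `[${index + 1}]${source} ${snippet.text.trim()}`;
    })
    .join("\n");

  return [
    "Answer the question using only the context below.",
    "If the context does not contain the answer, say so.",
    "",
    "Context:",
    context || "(no context retrieved)",
    "",
    `Question: ${question.trim()}`,
  ].join("\n");
}
```

Keeping prompt assembly in a pure function like this makes it easy to unit-test separately from the Gemini and Pinecone calls.
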
| Requirement | Version |
|---|---|
| Node.js | >= 16.0.0 |
| npm | >= 8.0.0 |
| Google Gemini API Key | - |
| Pinecone Account | - |

Create a `.env.local` file with the following variables:

| Variable | Description | Example |
|---|---|---|
| `GEMINI_API_KEY` | Google Gemini API key | `AIzaSyB123456789...` |
| `PINECONE_API_KEY` | Pinecone API key | `abc123xyz...` |
| `PINECONE_CLOUD` | Pinecone cloud provider | `aws` |
| `PINECONE_REGION` | Pinecone region | `us-west-2` |
| `PINECONE_INDEX` | Pinecone index name | `rag-chatbot-index` |
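
Because the backend depends on all five variables, it can help to fail fast at startup when one is missing. The following is a minimal sketch of such a check, not the project's actual code: `loadConfig` and `AppConfig` are hypothetical names, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// The variable names listed in the table above.
const REQUIRED_VARS = [
  "GEMINI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_CLOUD",
  "PINECONE_REGION",
  "PINECONE_INDEX",
] as const;

type VarName = (typeof REQUIRED_VARS)[number];
type AppConfig = Record<VarName, string>;

// Hypothetical helper: returns a typed config, or throws an error
// that names every missing (or blank) variable at once.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const config = {} as AppConfig;
  for (const name of REQUIRED_VARS) {
    config[name] = env[name] as string;
  }
  return config;
}
```

On the server this would typically be called once as `loadConfig(process.env)`, after the `.env.local` file has been loaded.
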

Clone the repository, install dependencies, and start the development server:

    # Clone the repository
    git clone https://github.com/your-username/rag-chatbot.git
    cd rag-chatbot

    # Install dependencies
    npm install

    # Start development server
    npm run dev

| Service | Platform | URL Pattern |
|---|---|---|
| Frontend | Vercel | https://your-app.vercel.app |
| Backend | Render + Uptime Robot | https://your-backend.onrender.com |
| Vector DB | Pinecone | https://your-index-1234567.svc.XYZ.pinecone.io |

| Endpoint | Method | Description |
|---|---|---|
| `/api/upload` | POST | Upload and process a PDF document |
| `/api/documents` | GET | Retrieve all uploaded documents |
| `/api/documents/:id` | GET | Retrieve a specific document |
| `/api/documents/:id` | DELETE | Delete a document and its data |
| `/api/messages/:documentId` | GET | Retrieve chat messages for a document |
| `/api/messages/:documentId` | DELETE | Clear chat messages for a document |

| Endpoint | Method | Description |
|---|---|---|
| `/api/chat` | POST | Ask a question about a document |
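
As an illustration of how a client might call the chat endpoint, here is a small request-builder sketch. The README does not document the request body, so the `documentId` and `question` fields are assumptions, as are the `buildChatRequest` name and the JSON content type; check `server/routes.ts` for the real contract.

```typescript
// ASSUMPTION: the body shape is illustrative; the README does not specify it.
interface ChatRequestBody {
  documentId: string;
  question: string;
}

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: prepares the URL and options for POST /api/chat.
function buildChatRequest(baseUrl: string, body: ChatRequestBody): PreparedRequest {
  // Strip trailing slashes so the path is joined with exactly one "/".
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/api/chat`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed straight to `fetch(url, init)`. Separating request construction from the network call keeps this part testable without a running backend.
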

Project structure:

    rag-chatbot/
    ├── client/                       # React frontend
    │   ├── src/
    │   │   ├── components/           # UI components
    │   │   ├── hooks/                # Custom React hooks
    │   │   ├── lib/                  # Utility functions
    │   │   ├── pages/                # Page components
    │   │   └── App.tsx               # Main app component
    │   └── index.html                # HTML entry point
    ├── server/                       # Express backend
    │   ├── lib/                      # Core services
    │   │   ├── gemini-service.ts     # Gemini API integration
    │   │   ├── pdf-processor.ts      # PDF processing
    │   │   ├── pinecone-service.ts   # Pinecone integration
    │   │   └── rag-service.ts        # RAG logic
    │   ├── index.ts                  # Server entry point
    │   ├── routes.ts                 # API routes
    │   └── storage.ts                # Data storage
    ├── shared/                       # Shared types and schemas
    ├── package.json                  # Project dependencies
    └── README.md                     # This file

- Document Processing:
  - PDF files are uploaded and parsed
  - Text content is extracted and chunked
  - Each chunk is converted to an embedding using Google Gemini
- Vector Storage:
  - Embeddings are stored in the Pinecone vector database
  - Each vector is associated with metadata (document ID, page number, etc.)
- Question Answering:
  - User questions are converted to embeddings
  - Similar context is retrieved from Pinecone
  - Gemini generates answers using the retrieved context
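
Two of the steps above, chunking and similarity retrieval, can be sketched without any external service. The code below is an in-memory illustration only. In the real application Pinecone performs the similarity search. The chunk size, the overlap, the use of cosine similarity, and the names `chunkText`, `cosineSimilarity`, and `topK` are all assumptions rather than the project's actual settings or API.

```typescript
// Split text into fixed-size character windows that overlap, so text near a
// chunk boundary also appears in the neighbouring chunk.
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("Require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this window reached the end
  }
  return chunks;
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// ASSUMPTION: a stand-in record for a stored embedding, not Pinecone's own type.
interface StoredVector {
  id: string;
  values: number[];
}

// Return the k stored vectors most similar to the query, best match first.
function topK(query: number[], stored: StoredVector[], k: number): StoredVector[] {
  return [...stored]
    .sort((x, y) => cosineSimilarity(query, y.values) - cosineSimilarity(query, x.values))
    .slice(0, k);
}
```

For example, `chunkText("abcdefghij", 4, 2)` yields `["abcd", "cdef", "efgh", "ghij"]`: each chunk repeats the last two characters of the previous one.
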
| Category | Technology |
|---|---|
| Frontend | React, TypeScript, Tailwind CSS, Vite |
| Backend | Express.js, Node.js |
| AI | Google Gemini, Pinecone |
| Storage | In-Memory Storage (for demo) |
| PDF Processing | pdf-parse |
| Deployment | Vercel (Frontend), Render (Backend) |

- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a pull request
