📚 AI PDF Chatbot

Chat with your PDFs using the power of LangChain, OpenAI, and FAISS — all wrapped in a slick Streamlit interface.

This app lets you upload multiple PDFs and ask natural-language questions about their content. It uses semantic search to retrieve the relevant passages and a conversational OpenAI model (GPT-3.5/4) to answer from that context.

🚀 Demo

Upload PDFs → Process → Chat in natural language

🧠 How It Works

You upload one or more PDFs through the sidebar of the app. These could be reports, manuals, research papers—anything.
The app reads all the text from those PDFs using a PDF parser. This raw text might be hundreds of lines long.
It then splits the text into smaller, overlapping chunks (like paragraphs), so they’re easier for the AI to handle. Think of this as breaking a book into pages.
Each chunk is converted into a vector—a mathematical representation that captures the meaning of the text. This is done using OpenAI’s embedding model.
All these vectors are stored in a FAISS vector database, which acts like a super-fast "search engine for meaning."
Now, when you ask a question, like:

“What findings are mentioned in the scan report?”

your question is also converted into a vector.
The app searches the FAISS database for chunks that are semantically similar (even if the words don’t exactly match).
The most relevant chunks are passed to the OpenAI chat model (GPT-3.5/4), which reads them along with your question and answers from that context, as if it had read your documents.
The conversation is remembered, so you can ask follow-up questions naturally.
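
The chunk-and-retrieve steps above can be sketched in plain Python. This is a simplified illustration, not the app's code: `chunk_text`, `embed`, and `top_k` are hypothetical names, the bag-of-words `embed` is a toy stand-in for OpenAI's embedding model, and the brute-force cosine ranking stands in for FAISS. Unlike real embeddings, the toy version only matches exact words.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question (brute force, no index)."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    document = (
        "The scan report shows a small nodule in the left lung. "
        "No abnormalities were found in the right lung. "
        "The invoice for the visit is due at the end of the month."
    )
    pieces = chunk_text(document, chunk_size=60, overlap=15)
    for piece in top_k("What findings are in the scan report?", pieces, k=2):
        print(repr(piece))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which is the main reason to split with overlap at all.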

🛠️ Features

🧾 Upload and chat with multiple PDFs
⚡ Built with LangChain + FAISS + OpenAI
🧠 Remembers previous questions in the chat
🎯 Retrieves semantic matches from documents
💬 Clean Streamlit chat UI with custom templates
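
To make the "remembers previous questions" feature concrete, here is a minimal sketch of how earlier turns could be folded into each new prompt. The `ChatMemory` class and its prompt layout are hypothetical; LangChain ships its own memory components, and this is not taken from the app's source.

```python
from dataclasses import dataclass, field


@dataclass
class ChatMemory:
    """Keeps the most recent question/answer turns (hypothetical helper)."""

    max_turns: int = 5
    turns: list[tuple[str, str]] = field(default_factory=list)

    def add(self, question: str, answer: str) -> None:
        """Record a turn and drop the oldest ones beyond max_turns."""
        self.turns.append((question, answer))
        excess = len(self.turns) - self.max_turns
        if excess > 0:
            del self.turns[:excess]

    def build_prompt(self, context: list[str], question: str) -> str:
        """Combine retrieved chunks, past turns, and the new question into one prompt."""
        history = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.turns)
        context_block = "\n---\n".join(context)
        return (
            f"Context:\n{context_block}\n\n"
            f"Conversation so far:\n{history}\n\n"
            f"User: {question}\nAssistant:"
        )
```

Capping the number of stored turns keeps the prompt from growing without bound as the chat gets longer.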

📂 Project Structure

├── main.py                # Main Streamlit app
├── htmlTemplates.py       # Custom HTML & CSS for chatbot UI
├── .env                   # OpenAI API key (keep this file out of version control)
└── requirements.txt       # Python dependencies

🧑‍💻 Tech Stack

| Component | Role |
| --- | --- |
| Streamlit | Frontend framework |
| PyPDF2 | Extracts text from uploaded PDFs |
| LangChain | Orchestrates LLM + retrieval + memory |
| OpenAI | Provides embeddings + GPT model responses |
| FAISS | Fast semantic search on text embeddings |
| dotenv | Loads API keys from the .env file |

⚙️ Installation

Clone the repo

git clone https://github.com/yourusername/pdf-chat-ai.git
cd pdf-chat-ai

Create .env file

OPENAI_API_KEY=your_openai_key

Install dependencies

pip install -r requirements.txt

Run the app

streamlit run main.py

📥 Requirements

See requirements.txt or install manually:

streamlit
PyPDF2
langchain
openai
faiss-cpu
python-dotenv
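
If you install manually, a quick check like the following can confirm that everything is importable before you launch the app. It is a small sketch using only the standard library; `missing_packages` is a hypothetical helper name. Note that two packages import under a different name than they install under (`faiss-cpu` as `faiss`, `python-dotenv` as `dotenv`).

```python
from importlib.util import find_spec

# pip package name -> module name used in `import ...`
REQUIRED_MODULES = {
    "streamlit": "streamlit",
    "PyPDF2": "PyPDF2",
    "langchain": "langchain",
    "openai": "openai",
    "faiss-cpu": "faiss",
    "python-dotenv": "dotenv",
}


def missing_packages(required: dict[str, str] = REQUIRED_MODULES) -> list[str]:
    """Return the pip names of packages whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All dependencies are importable.")
```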

❓ Usage

Go to the sidebar and upload PDFs
Click Process
Ask a question about the documents, for example:
- "What abnormalities are found?"
- "Summarize the second report"
- "What is the diagnosis?"

🔐 Security

.env keeps your OpenAI key out of the source code; do not commit it (list it in .gitignore)
API calls are made server-side by the Streamlit app, so the key is not sent to the browser
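
One way to make a missing key obvious is to fail fast at startup. The sketch below assumes the key has already been loaded into the environment (for example by python-dotenv's `load_dotenv()`); `get_openai_key` is a hypothetical helper, not part of the app.

```python
import os
from collections.abc import Mapping


def get_openai_key(env: Mapping[str, str] = os.environ) -> str:
    """Return OPENAI_API_KEY, or raise if it is missing or still the placeholder."""
    # Assumption: load_dotenv() (or an export in the shell) ran before this call.
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == "your_openai_key":
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or export it.")
    return key
```

Passing the environment in as a parameter keeps the check easy to test without touching the real environment.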

📌 To-Do / Future Enhancements

Add source highlighting and chunk citations
Add document summarization button
Support scanned PDFs via OCR (e.g., with Tesseract)
Integrate Whisper to transcribe audio into text documents

📄 License

MIT License
