hamim23z/Personal_PDF_Chatbot

📚 MultiPDF - Personal Assistant

MultiPDF - Personal Assistant is an interactive document-querying application built with Python, LangChain, Streamlit, and Hugging Face.
It lets users upload multiple PDFs, ask conversational questions, and get answers accompanied by the source snippets they were drawn from.

Deployed using Streamlit Cloud: personalpdfreader.streamlit.app/


🚀 Features

  • Upload and process multiple PDF documents simultaneously.
  • Ask follow-up questions within the same conversation.
  • Advanced Document Retrieval:
    • Uses MMR (Maximal Marginal Relevance) to retrieve a broad candidate set and balance similarity with diversity, so small documents (like resumes) don't get buried by larger PDFs.
    • Stamps each text chunk with its source filename so the LLM doesn't mix up content from different documents.
  • Transparent Sourcing: Appends a clean HTML list to the bottom of each answer, showing the document snippets the bot used.
  • Custom Prompting: Overrides LangChain's default prompts so that chat memory does not confuse open-weight instruction models.
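
The MMR retrieval described above can be illustrated with a small, dependency-free sketch. This is not the app's actual code (the app presumably relies on its vector store's built-in MMR search); the function names `cosine` and `mmr` and the `lambda_mult` parameter name are assumptions chosen for illustration. Each step picks the candidate that maximizes relevance to the query minus its similarity to chunks that were already selected.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k=3, lambda_mult=0.5):
    """Return indices of up to k documents chosen by Maximal Marginal Relevance.

    lambda_mult=1.0 ranks purely by relevance; lower values favor diversity.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    candidates = list(range(len(doc_vecs)))
    selected = []

    def score(i):
        # Penalize a candidate by its closest match among already-selected docs.
        redundancy = max(
            (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
        )
        return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a low `lambda_mult`, a near-duplicate of an already-selected chunk loses out to a less similar one, which is how a short document can still surface next to a long PDF that dominates the index.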

🧩 Tech Stack

  • Python 3.12
  • LangChain
  • Streamlit
  • FAISS
  • Hugging Face Inference API
  • Sentence-Transformers
  • PyPDF2

⚙️ Installation & Setup

  1. Clone the repository
    git clone https://github.com/yourUsername/Personal_PDF_Chatbot.git
    
  2. Create and activate a virtual environment
    python3 -m venv venv
    source venv/bin/activate      (macOS/Linux)
    venv\Scripts\activate         (Windows)
    
  3. Set up your environment variables
    Create a .env file and add your Hugging Face token to it:
    HUGGINGFACE_HUB_API_TOKEN=<your-token>
    
  4. Install dependencies
    pip install -r requirements.txt
    
  5. Start the app via Streamlit
    streamlit run app.py
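
For step 3, the following is a minimal sketch of how a token could be read from the .env file at startup. It is a standard-library stand-in, not the app's actual loader (the project may well use a package such as python-dotenv instead); `load_env_file` and `get_hf_token` are hypothetical helper names.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Load KEY=VALUE lines from a .env file into os.environ.

    Variables already present in the environment are left untouched.
    Returns the key/value pairs that were parsed from the file.
    """
    env_path = Path(path)
    if not env_path.exists():
        return {}
    loaded = {}
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded


def get_hf_token():
    """Return the Hugging Face token, failing loudly if it is missing."""
    token = os.environ.get("HUGGINGFACE_HUB_API_TOKEN", "")
    if not token:
        raise RuntimeError("HUGGINGFACE_HUB_API_TOKEN is not set; add it to .env")
    return token
```

Calling `load_env_file()` before `get_hf_token()` surfaces a missing token as a clear error at startup rather than as a failed API request later.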

About

This is a Python/LangChain/Hugging Face application that lets users import multiple PDFs and ask questions about their content. Users are also shown exactly where each answer was pulled from, can ask follow-up questions, and can jump back and forth between different PDFs.
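
To make the "exact location" idea concrete, here is a rough sketch of stamping chunks with their source filename and rendering the retrieved snippets as an HTML list. It only illustrates the approach; `stamp_chunks`, `render_sources`, the dictionary keys, and the fixed-size splitting are all assumptions and not taken from the app's code.

```python
import html


def stamp_chunks(filename, text, chunk_size=200):
    """Split text into fixed-size chunks, each tagged with its source file."""
    chunks = []
    for start in range(0, len(text), chunk_size):
        snippet = text[start:start + chunk_size]
        chunks.append({
            "source": filename,
            "snippet": snippet,
            # The filename travels with the text that is sent to the LLM.
            "page_content": f"[Source: {filename}]\n{snippet}",
        })
    return chunks


def render_sources(chunks, snippet_len=80):
    """Build an HTML list of the filenames and snippets behind an answer."""
    items = "".join(
        f"<li><b>{html.escape(c['source'])}</b>: "
        f"{html.escape(c['snippet'][:snippet_len])}</li>"
        for c in chunks
    )
    return f"<ul>{items}</ul>"
```

Escaping the filename and snippet matters here because PDF text can contain characters such as `<` or `&` that would otherwise break the rendered list.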
