AI-Search
A simple semantic search engine for PDFs using embeddings + Pinecone vector database, written in Python.
Given a set of PDF documents, this project:
Loads & chunks text
Generates vector embeddings
Stores vectors in Pinecone
Queries them with natural language
Returns relevant passages, document names, and page numbers
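The steps above can be sketched end to end in memory. This is an illustration only, not the project's code: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, a plain Python list stands in for the Pinecone index, and the function names, chunk size, and sample pages are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def chunk_pages(doc_name, pages, max_words=50):
    """Split each page's text into chunks of at most max_words words."""
    chunks = []
    for page_number, text in enumerate(pages, start=1):
        words = text.split()
        for start in range(0, len(words), max_words):
            chunks.append({
                "document": doc_name,
                "page": page_number,
                "text": " ".join(words[start:start + max_words]),
            })
    return chunks


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query, top_k=3):
    """Return the top_k (score, chunk) pairs most similar to the query."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, vec), chunk) for vec, chunk in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Sample page texts (made up); a real run would extract these from PDFs.
pages = [
    "Pinecone stores vectors and serves nearest neighbour queries.",
    "Invoices are due within thirty days of receipt.",
]

# The list below plays the role of the vector index.
index = [(toy_embed(c["text"]), c) for c in chunk_pages("example.pdf", pages)]

for score, chunk in search(index, "when are invoices due", top_k=1):
    print(chunk["document"], chunk["page"], chunk["text"])
```

Each chunk carries its document name and page number as metadata, which is what lets a search result point back to its source.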
Features
Extract text from PDF files
Compute vector embeddings via your adapter (e.g., LangChain, OpenAI)
Index vectors in a Pinecone index
Run semantic search queries
Filter and dedupe results
Stream or show exact matches + context
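As a rough illustration of the "filter and dedupe" step, the sketch below drops matches under a score threshold and keeps only the best-scoring match per (document, page) pair. The match dictionary layout, the `min_score` default, and the choice to dedupe by page are assumptions for this example; the project may use different rules.

```python
def filter_and_dedupe(matches, min_score=0.5):
    """Drop low-scoring matches, then keep the best one per (document, page)."""
    best = {}
    for match in matches:
        if match["score"] < min_score:
            continue  # filter: below the relevance threshold
        key = (match["document"], match["page"])
        if key not in best or match["score"] > best[key]["score"]:
            best[key] = match  # dedupe: keep the highest score per page
    return sorted(best.values(), key=lambda m: m["score"], reverse=True)


# Hypothetical search results, shaped like the metadata described above.
matches = [
    {"document": "a.pdf", "page": 3, "score": 0.91, "text": "first passage"},
    {"document": "a.pdf", "page": 3, "score": 0.84, "text": "overlapping passage"},
    {"document": "b.pdf", "page": 1, "score": 0.62, "text": "second passage"},
    {"document": "b.pdf", "page": 7, "score": 0.20, "text": "weak match"},
]

for match in filter_and_dedupe(matches):
    print(match["document"], match["page"], match["score"])
```

Deduplicating by page is one option among several. When chunks overlap, comparing the chunk text directly may suit better.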
Quick Start Requirements
Before you begin, make sure you have:
Python 3.8 or higher
Pinecone API key
OpenAI key (or other embedding provider)
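A small check like the following can confirm these prerequisites before running anything else. The environment variable names `PINECONE_API_KEY` and `OPENAI_API_KEY` are assumptions; use whatever names your setup actually reads.

```python
import os
import sys

# Assumed variable names; adjust to match your configuration.
REQUIRED_ENV_VARS = ("PINECONE_API_KEY", "OPENAI_API_KEY")


def missing_prerequisites(environ=os.environ, version=sys.version_info):
    """Return a list of human-readable problems; empty if all is in place."""
    problems = []
    if tuple(version[:2]) < (3, 8):
        problems.append("Python 3.8 or higher is required")
    for name in REQUIRED_ENV_VARS:
        if not environ.get(name):
            problems.append(f"environment variable {name} is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("Missing:", problem)
```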