Changes from all commits (20 commits)
8347779
feat: initialize project structure and core utilities
bhavana679 Mar 15, 2026
51105b2
feat: implement Endee client and semantic retrieval
bhavana679 Mar 15, 2026
b6402b6
feat: add document chunking and vector ingestion
bhavana679 Mar 15, 2026
5d39b20
feat: implement RAG orchestration with citations
bhavana679 Mar 15, 2026
7e7305c
feat: integrate FastAPI backend and Streamlit frontend
bhavana679 Mar 15, 2026
18fd689
docs: final project documentation and execution script
bhavana679 Mar 15, 2026
b536bcf
fix: update requirements and pin Python version for cloud compatibility
bhavana679 Mar 15, 2026
e426b1c
fix: switch to OpenAI embeddings to resolve Render OOM and improve st…
bhavana679 Mar 15, 2026
91bc57a
fix: remove numpy-specific .tolist() from embedding pipeline for Open…
bhavana679 Mar 15, 2026
a43c2c3
fix: support HTTPS for cloud database connections
bhavana679 Mar 15, 2026
c2c1d49
fix: make database connection URL construction more robust
bhavana679 Mar 15, 2026
9fe2fee
fix: ultra-clean cloud database connection URL
bhavana679 Mar 15, 2026
34e88f2
Added Dev Container Folder
bhavana679 Mar 15, 2026
f243859
feat: switch to Google Gemini for 100% free embeddings and RAG genera…
bhavana679 Mar 15, 2026
d645c64
fix: use models/embedding-001 for better Gemini API compatibility
bhavana679 Mar 15, 2026
e4e0fce
fix: bump gemini sdk version and simplify embedding calls for cloud c…
bhavana679 Mar 15, 2026
c42e72c
fix: restore text-embedding-004 with explicit task_type for Gemini AP…
bhavana679 Mar 15, 2026
1268f3f
fix: remove sys.exit(1) and raise exception instead to prevent API cr…
bhavana679 Mar 15, 2026
f633d4b
fix: use plural 'embeddings' key for Gemini batch results
bhavana679 Mar 15, 2026
56912c1
fix: revert embedding key to singular and add root route for API
bhavana679 Mar 15, 2026
33 changes: 33 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,33 @@
{
  "name": "Python 3",
  // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
  "image": "mcr.microsoft.com/devcontainers/python:1-3.11-bookworm",
  "customizations": {
    "codespaces": {
      "openFiles": [
        "README.md",
        "ai-knowledge-engine/ui/app.py"
      ]
    },
    "vscode": {
      "settings": {},
      "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance"
      ]
    }
  },
  "updateContentCommand": "[ -f packages.txt ] && sudo apt update && sudo apt upgrade -y && sudo xargs apt install -y <packages.txt; [ -f requirements.txt ] && pip3 install --user -r requirements.txt; pip3 install --user streamlit; echo '✅ Packages installed and Requirements met'",
  "postAttachCommand": {
    "server": "streamlit run ai-knowledge-engine/ui/app.py --server.enableCORS false --server.enableXsrfProtection false"
  },
  "portsAttributes": {
    "8501": {
      "label": "Application",
      "onAutoForward": "openPreview"
    }
  },
  "forwardPorts": [
    8501
  ]
}
48 changes: 48 additions & 0 deletions ai-knowledge-engine/.gitignore
@@ -0,0 +1,48 @@
# Environment Variables (SENSITIVE)
.env

# Python artifacts
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
venv/
.venv/
ENV/

# Local Cache & Knowledge Engine Specific
.ingested_docs.json
*.log
server.log
streamlit.log

# Database files
db/

# OS specific
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db
117 changes: 117 additions & 0 deletions ai-knowledge-engine/README.md
@@ -0,0 +1,117 @@
# AI Engineering Knowledge Memory Engine

A high-performance Retrieval-Augmented Generation (RAG) platform designed as a "second brain" for AI Engineers. Built on the **Endee Vector Database**, this engine allows users to index specialized technical documentation and retrieve contextually accurate answers with full traceability.

---

## Architecture

The system follows a modular, scalable pipeline to ensure rapid retrieval and high-quality generation:

1. **Ingestion:** Raw documents (`.txt`) are processed, cleaned, and split into semantic chunks.
2. **Embedding:** Chunks are transformed into 768-dimensional vectors using the Gemini `text-embedding-004` model (see `config/config.py`).
3. **Storage:** Vectors and metadata are stored in the **Endee Vector Database** with cosine similarity metrics.
4. **Retrieval:** User queries are embedded and matched against the database to find the top K most relevant contexts.
5. **RAG Pipeline:** Contextual snippets are injected into a specialized prompt and sent to an LLM (Gemini 1.5 Flash) for answer generation.
6. **Backend API:** A **FastAPI** layer facilitates communication between the database, the AI model, and the frontend.
7. **Frontend UI:** A clean **Streamlit** interface provides a chat-based experience for the end-user.
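
The query path through steps 2 to 5 can be sketched in a few lines of Python. This is an editor's illustration only: `embed`, `endee_search`, and `generate` below are toy stand-ins for the project's real `embeddings/`, `retrieval/`, and `rag/` modules, which are not reproduced here.

```python
from typing import List, Tuple

# Toy stand-ins (assumptions, not the project's code): the real modules call
# the Gemini embedding API, query the Endee index, and invoke the LLM.
def embed(text: str) -> List[float]:
    return [float(ord(c)) for c in text[:4]]

def endee_search(vector: List[float], top_k: int) -> List[Tuple[str, str]]:
    # Returns (source_file, chunk_text) pairs; a real call hits Endee.
    return [("engineering_notes.txt", "We use 500-token chunks with 50-token overlap.")][:top_k]

def generate(prompt: str) -> str:
    return "Chunks are 500 tokens with a 50-token overlap [1]."

def answer(query: str, top_k: int = 3) -> str:
    query_vector = embed(query)                    # step 2: embed the query
    contexts = endee_search(query_vector, top_k)   # step 4: retrieve top-k chunks
    numbered = "\n".join(f"[{i + 1}] ({src}) {text}" for i, (src, text) in enumerate(contexts))
    prompt = f"Answer strictly from these sources:\n{numbered}\n\nQuestion: {query}"
    return generate(prompt)                        # step 5: LLM generation

print(answer("How are text chunks handled?"))
```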

---

## Features

* **Semantic Search:** Deep architectural search beyond simple keywords.
* **Vector Embeddings:** State-of-the-art dense vector representation.
* **Retrieval-Augmented Generation:** Answers are generated strictly from the indexed technical documentation.
* **Inline Source Citations:** Every answer includes bracketed references (e.g., `[1]`, `[2]`) linked to source files.
* **Retrieval Diagnostics:** Real-time visibility into similarity distances and ranking metadata.
* **Query Observability:** End-to-end tracking of response times and source utilization.
* **FastAPI Backend:** Production-ready RESTful API.
* **Knowledge Manager:** Trigger re-indexing of the entire knowledge base directly from the UI.

---

## Technology Stack

* **Language:** Python 3.11+
* **Embeddings:** Google Gemini (`text-embedding-004`)
* **Vector Database:** [Endee](https://github.com/endeeio/endee)
* **LLM Integration:** Google Gemini API (`gemini-1.5-flash`)
* **Backend:** FastAPI & Pydantic
* **Frontend:** Streamlit

---

## Setup Instructions

### 1. Start the Endee Vector Database
Ensure you have Docker installed and run:
```bash
docker run -d --ulimit nofile=100000:100000 -p 8080:8080 -v ./data:/data --name endee-server endeeio/endee-server:latest
```

### 2. Environment Configuration
Create a `.env` file in the project root:
```env
# Variable name assumed after the switch to Gemini; match what config/config.py reads
GEMINI_API_KEY=your_gemini_api_key_here
```

### 3. Install Dependencies
```bash
pip install -r requirements.txt
```

### 4. Run the Platform

**Step A: Ingest Knowledge** (One-time or when data changes)
```bash
python embeddings/embed_store.py
```

**Step B: Launch Backend API**
```bash
python3 -m uvicorn api.main:app --reload --port 8000
```

**Step C: Launch Streamlit UI**
```bash
streamlit run ui/app.py
```

---

## Folder Structure

```text
ai-knowledge-engine/
├── api/ # FastAPI application & routes
├── config/ # Environment & model configurations
├── data/ # Raw engineering documents (.txt)
├── embeddings/ # Document chunking & vector ingestion logic
├── rag/ # RAG orchestration & prompt engineering
├── retrieval/ # Endee client & semantic search logic
├── ui/ # Streamlit interface
├── utils/ # Text processing & cleaning utilities
├── requirements.txt # Project dependencies
└── README.md # Project documentation
```

---

## Example Queries

* "What is the modular architecture of the RAG engine?"
* "How are text chunks handled in the embedding pipeline?"
* "What was the fix for the vector precision bug?"
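
These questions can also be sent straight to the FastAPI backend. Below is a minimal sketch assuming the server is running locally on port 8000 as configured above; the exact response schema is defined by `RAGPipeline.answer_question`, which is not shown here, so the raw JSON is printed.

```python
import requests

# POST to the /query endpoint exposed by api/main.py (assumes the server was
# started with: python3 -m uvicorn api.main:app --reload --port 8000)
payload = {"query": "What was the fix for the vector precision bug?", "top_k": 3}
response = requests.post("http://localhost:8000/query", json=payload, timeout=30)
response.raise_for_status()

# Response shape depends on the RAG pipeline; print the raw JSON result
print(response.json())
```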

---

## Future Improvements

* **Multi-modal Support:** Indexing PDFs, Markdown, and technical diagrams.
* **Hybrid Search:** Combining semantic search with BM25 keyword matching for better precision.
* **Local LLM Support:** Integration with Ollama or vLLM for fully air-gapped operations.
* **User Authentication:** Multi-tenant support for private knowledge bases.

---

73 changes: 73 additions & 0 deletions ai-knowledge-engine/api/main.py
@@ -0,0 +1,73 @@
import os
import sys
from typing import List, Dict, Any
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

# Add the project root to the system path to allow importing internal modules
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.append(BASE_DIR)

from rag.rag_pipeline import RAGPipeline
from embeddings.embed_store import EmbeddingPipeline
from retrieval.endee_client import EndeeClient
from config.config import ENDEE_HOST, ENDEE_PORT

app = FastAPI(
    title="AI Knowledge Engine API",
    description="Backend API for Retrieval Augmented Generation using Endee Vector DB",
    version="1.0.0"
)

@app.get("/")
def root():
    return {
        "message": "Welcome to the AI Engineering Knowledge Memory Engine API!",
        "docs": "/docs",
        "health": "/health"
    }

# Initialize application dependencies
rag_pipeline = RAGPipeline()
endee_client = EndeeClient(host=ENDEE_HOST, port=int(ENDEE_PORT))

class QueryRequest(BaseModel):
    query: str
    top_k: int = 3

@app.get("/health")
def health_check():
    """Confirms API connectivity and checks the local Endee database health."""
    try:
        endee_status = endee_client.health()
        return {
            "status": "healthy",
            "api": "online",
            "endee_connected": True,
            "endee_status": endee_status
        }
    except Exception as e:
        return {
            "status": "degraded",
            "api": "online",
            "endee_connected": False,
            "error": str(e)
        }

@app.post("/ingest")
def trigger_ingestion():
    """Triggers the Embedding Ingestion pipeline for all document files."""
    try:
        pipeline = EmbeddingPipeline()
        pipeline.run_ingestion()
        return {"status": "success", "message": "Ingestion pipeline completed successfully."}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Ingestion pipeline failed: {str(e)}")

@app.post("/query")
def query_knowledge_base(request: QueryRequest):
    try:
        result = rag_pipeline.answer_question(query=request.query, top_k=request.top_k)
        return result
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
22 changes: 22 additions & 0 deletions ai-knowledge-engine/config/config.py
@@ -0,0 +1,22 @@
import os

# Base directory paths
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
DATA_DIR = os.path.join(BASE_DIR, "data")
DB_DIR = os.path.join(BASE_DIR, "db")

# Chunking settings
CHUNK_SIZE = 500
CHUNK_OVERLAP = 50

# Embedding settings (Gemini)
EMBEDDING_MODEL_NAME = "models/text-embedding-004"
EMBEDDING_DIMENSION = 768

# Endee Vector DB settings
ENDEE_HOST = os.getenv("ENDEE_HOST", "localhost")
ENDEE_PORT = os.getenv("ENDEE_PORT", "8080")

# LLM configurations (Gemini)
LLM_PROVIDER = "gemini"
LLM_MODEL = "gemini-1.5-flash"
3 changes: 3 additions & 0 deletions ai-knowledge-engine/data/architecture_decisions.txt
@@ -0,0 +1,3 @@
ADR-001: Modular Architecture for RAG Engine
Decision: We will modularize our AI Engine. Modules include data, embeddings, retrieval, rag, api, and ui.
Rationale: A modular structure reduces tight coupling, enabling the Endee vector search to be iterated upon independently of the FastAPI and Streamlit code.
3 changes: 3 additions & 0 deletions ai-knowledge-engine/data/bug_fixes.txt
@@ -0,0 +1,3 @@
Bug 1032: Vector precision loss during ingest
Root Cause: The text chunker was splitting directly in the middle of words, causing tokenization failures in the embedding model.
Fix: Adjusted chunking algorithm to only split at word boundaries by checking for spaces before creating the substring chunks.
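
As an illustration of the fix described above, a word-boundary chunker might look like the following. This is an editor's sketch, not the project's actual `embeddings/embed_store.py` code, and it measures chunks in characters for simplicity rather than the 500/50 token settings in `config/config.py`.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks, cutting only at word boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        # Back up to the last space so no word is split mid-token (the Bug 1032 fix)
        if end < len(text):
            space = text.rfind(" ", start, end)
            if space > start:
                end = space
        chunks.append(text[start:end].strip())
        if end == len(text):
            break
        start = max(end - overlap, start + 1)  # overlap preserves context continuity
    return chunks
```
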
1 change: 1 addition & 0 deletions ai-knowledge-engine/data/engineering_notes.txt
@@ -0,0 +1 @@
Knowledge Memory Systems require robust vector embeddings to capture semantic meaning. We use 500-token chunks with an overlap of 50 tokens to ensure context continuity. Semantic search leverages Cosine Similarity or Dot Product to retrieve documents against natural language queries efficiently within a high-speed database like Endee.
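
For reference, the two similarity measures named in this note compare vectors as follows. This is a self-contained illustration added by the editor, not the Endee database's internal implementation.

```python
import math

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Dot product normalized by both magnitudes: 1.0 means identical
    # direction, 0.0 orthogonal, -1.0 opposite.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # 0.0 (orthogonal)
```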