Changes from all commits (20 commits)
8347779
feat: initialize project structure and core utilities
bhavana679 Mar 15, 2026
51105b2
feat: implement Endee client and semantic retrieval
bhavana679 Mar 15, 2026
b6402b6
feat: add document chunking and vector ingestion
bhavana679 Mar 15, 2026
5d39b20
feat: implement RAG orchestration with citations
bhavana679 Mar 15, 2026
7e7305c
feat: integrate FastAPI backend and Streamlit frontend
bhavana679 Mar 15, 2026
18fd689
docs: final project documentation and execution script
bhavana679 Mar 15, 2026
b536bcf
fix: update requirements and pin Python version for cloud compatibility
bhavana679 Mar 15, 2026
e426b1c
fix: switch to OpenAI embeddings to resolve Render OOM and improve st…
bhavana679 Mar 15, 2026
91bc57a
fix: remove numpy-specific .tolist() from embedding pipeline for Open…
bhavana679 Mar 15, 2026
a43c2c3
fix: support HTTPS for cloud database connections
bhavana679 Mar 15, 2026
c2c1d49
fix: make database connection URL construction more robust
bhavana679 Mar 15, 2026
9fe2fee
fix: ultra-clean cloud database connection URL
bhavana679 Mar 15, 2026
34e88f2
Added Dev Container Folder
bhavana679 Mar 15, 2026
f243859
feat: switch to Google Gemini for 100% free embeddings and RAG genera…
bhavana679 Mar 15, 2026
d645c64
fix: use models/embedding-001 for better Gemini API compatibility
bhavana679 Mar 15, 2026
e4e0fce
fix: bump gemini sdk version and simplify embedding calls for cloud c…
bhavana679 Mar 15, 2026
c42e72c
fix: restore text-embedding-004 with explicit task_type for Gemini AP…
bhavana679 Mar 15, 2026
1268f3f
fix: remove sys.exit(1) and raise exception instead to prevent API cr…
bhavana679 Mar 15, 2026
f633d4b
fix: use plural 'embeddings' key for Gemini batch results
bhavana679 Mar 15, 2026
56912c1
fix: revert embedding key to singular and add root route for API
bhavana679 Mar 15, 2026
33 changes: 33 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,33 @@
{
  "name": "Python 3",
  // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
  "image": "mcr.microsoft.com/devcontainers/python:1-3.11-bookworm",
  "customizations": {
    "codespaces": {
      "openFiles": [
        "README.md",
        "ai-knowledge-engine/ui/app.py"
      ]
    },
    "vscode": {
      "settings": {},
      "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance"
      ]
    }
  },
  "updateContentCommand": "[ -f packages.txt ] && sudo apt update && sudo apt upgrade -y && sudo xargs apt install -y <packages.txt; [ -f requirements.txt ] && pip3 install --user -r requirements.txt; pip3 install --user streamlit; echo '✅ Packages installed and Requirements met'",
  "postAttachCommand": {
    "server": "streamlit run ai-knowledge-engine/ui/app.py --server.enableCORS false --server.enableXsrfProtection false"
  },
  "portsAttributes": {
    "8501": {
      "label": "Application",
      "onAutoForward": "openPreview"
    }
  },
  "forwardPorts": [
    8501
  ]
}
48 changes: 48 additions & 0 deletions ai-knowledge-engine/.gitignore
@@ -0,0 +1,48 @@
# Environment Variables (SENSITIVE)
.env

# Python artifacts
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
venv/
.venv/
ENV/

# Local Cache & Knowledge Engine Specific
.ingested_docs.json
*.log
server.log
streamlit.log

# Database files
db/

# OS specific
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db
117 changes: 117 additions & 0 deletions ai-knowledge-engine/README.md
@@ -0,0 +1,117 @@
# AI Engineering Knowledge Memory Engine

A high-performance Retrieval-Augmented Generation (RAG) platform designed as a "second brain" for AI Engineers. Built on the **Endee Vector Database**, this engine allows users to index specialized technical documentation and retrieve contextually accurate answers with full traceability.

---

## Architecture

The system follows a modular, scalable pipeline to ensure rapid retrieval and high-quality generation:

1. **Ingestion:** Raw documents (`.txt`) are processed, cleaned, and split into semantic chunks.
2. **Embedding:** Chunks are transformed into 768-dimensional vectors using the Gemini `text-embedding-004` model (see `config/config.py`).
3. **Storage:** Vectors and metadata are stored in the **Endee Vector Database** with cosine similarity metrics.
4. **Retrieval:** User queries are embedded and matched against the database to find the top K most relevant contexts.
5. **RAG Pipeline:** Contextual snippets are injected into a specialized prompt and sent to an LLM (Gemini 1.5 Flash) for answer generation.
6. **Backend API:** A **FastAPI** layer facilitates communication between the database, the AI model, and the frontend.
7. **Frontend UI:** A clean **Streamlit** interface provides a chat-based experience for the end-user.
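
The query path through steps 2 to 5 can be sketched in a few lines of Python. This is an editor's illustration only: `embed`, `endee_search`, and `generate` below are toy stand-ins for the project's real `embeddings/`, `retrieval/`, and `rag/` modules, which are not reproduced here.

```python
from typing import List, Tuple

# Toy stand-ins (assumptions, not the project's code): the real modules call
# the Gemini embedding API, query the Endee index, and invoke the LLM.
def embed(text: str) -> List[float]:
    return [float(ord(c)) for c in text[:4]]

def endee_search(vector: List[float], top_k: int) -> List[Tuple[str, str]]:
    # Returns (source_file, chunk_text) pairs; a real call hits Endee.
    return [("engineering_notes.txt", "We use 500-token chunks with 50-token overlap.")][:top_k]

def generate(prompt: str) -> str:
    return "Chunks are 500 tokens with a 50-token overlap [1]."

def answer(query: str, top_k: int = 3) -> str:
    query_vector = embed(query)                    # step 2: embed the query
    contexts = endee_search(query_vector, top_k)   # step 4: retrieve top-k chunks
    numbered = "\n".join(f"[{i + 1}] ({src}) {text}" for i, (src, text) in enumerate(contexts))
    prompt = f"Answer strictly from these sources:\n{numbered}\n\nQuestion: {query}"
    return generate(prompt)                        # step 5: LLM generation

print(answer("How are text chunks handled?"))
```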

---

## Features

* **Semantic Search:** Deep architectural search beyond simple keywords.
* **Vector Embeddings:** State-of-the-art dense vector representation.
* **Retrieval-Augmented Generation:** Answers are generated strictly from the indexed technical documentation.
* **Inline Source Citations:** Every answer includes bracketed references (e.g., `[1]`, `[2]`) linked to source files.
* **Retrieval Diagnostics:** Real-time visibility into similarity distances and ranking metadata.
* **Query Observability:** End-to-end tracking of response times and source utilization.
* **FastAPI Backend:** Production-ready RESTful API.
* **Knowledge Manager:** Trigger re-indexing of the entire knowledge base directly from the UI.

---

## Technology Stack

* **Language:** Python 3.11+
* **Embeddings:** Google Gemini (`text-embedding-004`)
* **Vector Database:** [Endee](https://github.com/endeeio/endee)
* **LLM Integration:** Google Gemini API (`gemini-1.5-flash`)
* **Backend:** FastAPI & Pydantic
* **Frontend:** Streamlit

---

## Setup Instructions

### 1. Start the Endee Vector Database
Ensure you have Docker installed and run:
```bash
docker run -d --ulimit nofile=100000:100000 -p 8080:8080 -v ./data:/data --name endee-server endeeio/endee-server:latest
```

### 2. Environment Configuration
Create a `.env` file in the project root:
```env
# Variable name assumed after the switch to Gemini; match what config/config.py reads
GEMINI_API_KEY=your_gemini_api_key_here
```

### 3. Install Dependencies
```bash
pip install -r requirements.txt
```

### 4. Run the Platform

**Step A: Ingest Knowledge** (One-time or when data changes)
```bash
python embeddings/embed_store.py
```

**Step B: Launch Backend API**
```bash
python3 -m uvicorn api.main:app --reload --port 8000
```

**Step C: Launch Streamlit UI**
```bash
streamlit run ui/app.py
```

---

## Folder Structure

```text
ai-knowledge-engine/
├── api/ # FastAPI application & routes
├── config/ # Environment & model configurations
├── data/ # Raw engineering documents (.txt)
├── embeddings/ # Document chunking & vector ingestion logic
├── rag/ # RAG orchestration & prompt engineering
├── retrieval/ # Endee client & semantic search logic
├── ui/ # Streamlit interface
├── utils/ # Text processing & cleaning utilities
├── requirements.txt # Project dependencies
└── README.md # Project documentation
```

---

## Example Queries

* "What is the modular architecture of the RAG engine?"
* "How are text chunks handled in the embedding pipeline?"
* "What was the fix for the vector precision bug?"
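
These questions can also be sent straight to the FastAPI backend. Below is a minimal sketch assuming the server is running locally on port 8000 as configured above; the exact response schema is defined by `RAGPipeline.answer_question`, which is not shown here, so the raw JSON is printed.

```python
import requests

# POST to the /query endpoint exposed by api/main.py (assumes the server was
# started with: python3 -m uvicorn api.main:app --reload --port 8000)
payload = {"query": "What was the fix for the vector precision bug?", "top_k": 3}
response = requests.post("http://localhost:8000/query", json=payload, timeout=30)
response.raise_for_status()

# Response shape depends on the RAG pipeline; print the raw JSON result
print(response.json())
```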

---

## Future Improvements

* **Multi-modal Support:** Indexing PDFs, Markdown, and technical diagrams.
* **Hybrid Search:** Combining semantic search with BM25 keyword matching for better precision.
* **Local LLM Support:** Integration with Ollama or vLLM for fully air-gapped operations.
* **User Authentication:** Multi-tenant support for private knowledge bases.

---

73 changes: 73 additions & 0 deletions ai-knowledge-engine/api/main.py
@@ -0,0 +1,73 @@
import os
import sys
from typing import List, Dict, Any
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

# Add the project root to the system path to allow importing internal modules
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.append(BASE_DIR)

from rag.rag_pipeline import RAGPipeline
from embeddings.embed_store import EmbeddingPipeline
from retrieval.endee_client import EndeeClient
from config.config import ENDEE_HOST, ENDEE_PORT

app = FastAPI(
    title="AI Knowledge Engine API",
    description="Backend API for Retrieval Augmented Generation using Endee Vector DB",
    version="1.0.0"
)

@app.get("/")
def root():
    return {
        "message": "Welcome to the AI Engineering Knowledge Memory Engine API!",
        "docs": "/docs",
        "health": "/health"
    }

# Initialize application dependencies
rag_pipeline = RAGPipeline()
endee_client = EndeeClient(host=ENDEE_HOST, port=int(ENDEE_PORT))

class QueryRequest(BaseModel):
    query: str
    top_k: int = 3

@app.get("/health")
def health_check():
    """Confirms API connectivity and checks the local Endee database health."""
    try:
        endee_status = endee_client.health()
        return {
            "status": "healthy",
            "api": "online",
            "endee_connected": True,
            "endee_status": endee_status
        }
    except Exception as e:
        return {
            "status": "degraded",
            "api": "online",
            "endee_connected": False,
            "error": str(e)
        }

@app.post("/ingest")
def trigger_ingestion():
    """Triggers the Embedding Ingestion pipeline for all document files."""
    try:
        pipeline = EmbeddingPipeline()
        pipeline.run_ingestion()
        return {"status": "success", "message": "Ingestion pipeline completed successfully."}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Ingestion pipeline failed: {str(e)}")

@app.post("/query")
def query_knowledge_base(request: QueryRequest):
    try:
        result = rag_pipeline.answer_question(query=request.query, top_k=request.top_k)
        return result
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
22 changes: 22 additions & 0 deletions ai-knowledge-engine/config/config.py
@@ -0,0 +1,22 @@
import os

# Base directory paths
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
DATA_DIR = os.path.join(BASE_DIR, "data")
DB_DIR = os.path.join(BASE_DIR, "db")

# Chunking settings
CHUNK_SIZE = 500
CHUNK_OVERLAP = 50

# Embedding settings (Gemini)
EMBEDDING_MODEL_NAME = "models/text-embedding-004"
EMBEDDING_DIMENSION = 768

# Endee Vector DB settings
ENDEE_HOST = os.getenv("ENDEE_HOST", "localhost")
ENDEE_PORT = os.getenv("ENDEE_PORT", "8080")

# LLM configurations (Gemini)
LLM_PROVIDER = "gemini"
LLM_MODEL = "gemini-1.5-flash"
3 changes: 3 additions & 0 deletions ai-knowledge-engine/data/architecture_decisions.txt
@@ -0,0 +1,3 @@
ADR-001: Modular Architecture for RAG Engine
Decision: We will modularize our AI Engine. Modules include data, embeddings, retrieval, rag, api, and ui.
Rationale: A modular structure reduces tight coupling, enabling the Endee vector search to be iterated upon independently of the FastAPI and Streamlit code.
3 changes: 3 additions & 0 deletions ai-knowledge-engine/data/bug_fixes.txt
@@ -0,0 +1,3 @@
Bug 1032: Vector precision loss during ingest
Root Cause: The text chunker was splitting directly in the middle of words, causing tokenization failures in the embedding model.
Fix: Adjusted chunking algorithm to only split at word boundaries by checking for spaces before creating the substring chunks.
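
As an illustration of the fix described above, a word-boundary chunker might look like the following. This is an editor's sketch, not the project's actual `embeddings/embed_store.py` code, and it measures chunks in characters for simplicity rather than the 500/50 token settings in `config/config.py`.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks, cutting only at word boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        # Back up to the last space so no word is split mid-token (the Bug 1032 fix)
        if end < len(text):
            space = text.rfind(" ", start, end)
            if space > start:
                end = space
        chunks.append(text[start:end].strip())
        if end == len(text):
            break
        start = max(end - overlap, start + 1)  # overlap preserves context continuity
    return chunks
```
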
1 change: 1 addition & 0 deletions ai-knowledge-engine/data/engineering_notes.txt
@@ -0,0 +1 @@
Knowledge Memory Systems require robust vector embeddings to capture semantic meaning. We use 500-token chunks with an overlap of 50 tokens to ensure context continuity. Semantic search leverages Cosine Similarity or Dot Product to retrieve documents against natural language queries efficiently within a high-speed database like Endee.
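
For reference, the two similarity measures named in this note compare vectors as follows. This is a self-contained illustration added by the editor, not the Endee database's internal implementation.

```python
import math

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Dot product normalized by both magnitudes: 1.0 means identical
    # direction, 0.0 orthogonal, -1.0 opposite.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # 0.0 (orthogonal)
```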