Author: Krish Nangia
A .NET 8 console application implementing a RAG pipeline for document-based question answering.
It ingests plain-text documents, generates embeddings for semantic search, and queries an LLM to produce answers grounded strictly in the provided documents.
This application is designed to demonstrate a simple RAG workflow:
- Document Ingestion – Read and chunk text documents.
- Vector Embedding – Generate embeddings for each chunk using Gemini.
- Vector Store – Store chunk embeddings with metadata for retrieval.
- Query Handling – Accept user questions via the console.
- Similarity Search – Retrieve the most relevant chunks using cosine similarity.
- LLM Answer Generation – Send retrieved context + user question to LLM to produce a grounded answer.
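Wired together, the flow might look like the following. This is a hypothetical sketch (the repository's `Program.cs` is not shown in this README): `VectorStoreIO`, `IngestionScan`, `Ingestor`, `ConsoleLoop`, and `Retrieval` are illustrative helpers sketched step by step below, and the `slice`, `embed`, and `complete` delegates are placeholders standing in for `SemanticSlicer`, `GeminiEmbeddingGenerator`, and `LLMService`.

```csharp
// Hypothetical top-level Program.cs wiring; every helper named here is an
// illustrative sketch (see the step-by-step snippets below), not the
// repository's actual API.
Func<string, IReadOnlyList<string>> slice = text => new[] { text };              // stand-in for SemanticSlicer
Func<string, Task<float[]>> embed = _ => Task.FromResult(new float[768]);        // stand-in for GeminiEmbeddingGenerator
Func<string, Task<string>> complete = p => Task.FromResult("(grounded answer)"); // stand-in for LLMService

var store = VectorStoreIO.LoadOrCreate(GlobalSettings.VECTOR_STORE_LOCATION);

foreach (var file in IngestionScan.FindNewFiles(GlobalSettings.DOCUMENTS_FOLDER_PATH, store))
    await Ingestor.IngestFileAsync(file, slice, embed, store);
VectorStoreIO.Save(GlobalSettings.VECTOR_STORE_LOCATION, store);

await ConsoleLoop.RunAsync(q => Retrieval.AnswerAsync(q, store, embed, complete));
```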
- Loads `FileVectorStore` from the configured `VECTOR_STORE_LOCATION`.
- If the folder does not exist, initializes a new store.
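The README does not document `FileVectorStore` itself, so the following is only a minimal sketch of the load-or-initialize behavior, assuming the store persists chunks as a single JSON file inside the configured folder. The `ChunkRecord` shape mirrors the metadata listed later (`SourceFile`, `ChunkIndex`).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// One stored chunk: text, embedding, and the metadata fields described
// below (SourceFile, ChunkIndex). This record is an assumption for illustration.
public record ChunkRecord(string SourceFile, int ChunkIndex, string Text, float[] Embedding);

public static class VectorStoreIO
{
    // Hypothetical persistence: a single JSON file inside the store folder.
    // The real FileVectorStore layout is not documented in this README.
    public static List<ChunkRecord> LoadOrCreate(string storeFolder)
    {
        Directory.CreateDirectory(storeFolder); // initializes a new store if missing
        var path = Path.Combine(storeFolder, "chunks.json");
        if (!File.Exists(path))
            return new List<ChunkRecord>();
        return JsonSerializer.Deserialize<List<ChunkRecord>>(File.ReadAllText(path))
               ?? new List<ChunkRecord>();
    }

    public static void Save(string storeFolder, List<ChunkRecord> chunks)
        => File.WriteAllText(Path.Combine(storeFolder, "chunks.json"),
                             JsonSerializer.Serialize(chunks));
}
```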
- Scans `DOCUMENTS_FOLDER_PATH` for `.txt` files.
- Checks whether files were previously ingested by comparing against existing vector store chunks.
- Already ingested files are skipped to avoid duplication.
- Currently, ingesting all documents in the `Stories` directory from scratch takes around 6 minutes.
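A sketch of the skip-if-ingested check, reusing the hypothetical `ChunkRecord` from above; only the filename-based comparison itself comes from this README.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class IngestionScan
{
    // Returns the .txt files under docsFolder whose filenames do not already
    // appear in the store's SourceFile metadata, so previously ingested
    // files are skipped (ChunkRecord is the hypothetical record above).
    public static List<string> FindNewFiles(string docsFolder, IEnumerable<ChunkRecord> existing)
    {
        var ingested = existing.Select(c => c.SourceFile)
                               .ToHashSet(StringComparer.OrdinalIgnoreCase);

        return Directory.EnumerateFiles(docsFolder, "*.txt")
                        .Where(p => !ingested.Contains(Path.GetFileName(p)))
                        .ToList();
    }
}
```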
- Reads each file’s content.
- Uses `SemanticSlicer` to split the document into smaller chunks (e.g., 50–100 words per chunk).
- For each chunk, calls `GeminiEmbeddingGenerator` to produce a vector embedding.
- Stores metadata for each chunk: `SourceFile` (the original filename) and `ChunkIndex` (the position of the chunk within the file).
- Adds each chunk and its embedding to `FileVectorStore`.
- Implements a delay (`delayMs`) between API calls to avoid rate limiting.
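The per-file ingestion loop might look like the sketch below. The `slice` and `embed` delegates stand in for `SemanticSlicer` and `GeminiEmbeddingGenerator`, whose actual signatures this README does not show; the `delayMs` throttle is the one mentioned above, with an assumed default.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class Ingestor
{
    // Chunks one file, embeds each chunk, and appends the results to the
    // store. 'slice' and 'embed' are placeholders for SemanticSlicer and
    // GeminiEmbeddingGenerator, whose real APIs are not shown in the README.
    public static async Task IngestFileAsync(
        string path,
        Func<string, IReadOnlyList<string>> slice,
        Func<string, Task<float[]>> embed,
        List<ChunkRecord> store,
        int delayMs = 500) // throttle between API calls to avoid rate limiting
    {
        var text = await File.ReadAllTextAsync(path);
        var pieces = slice(text);                 // e.g., 50–100 words per chunk
        var name = Path.GetFileName(path);        // becomes SourceFile metadata

        for (int i = 0; i < pieces.Count; i++)
        {
            var vector = await embed(pieces[i]);  // one embedding call per chunk
            store.Add(new ChunkRecord(name, i, pieces[i], vector));
            await Task.Delay(delayMs);            // respect provider rate limits
        }
    }
}
```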
- Waits for console input from the user.
- Accepts natural-language questions.
- Type `exit` to quit.
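A minimal sketch of this console loop with safe exit handling, assuming the retrieval-and-answer pipeline is exposed as a single `answer` callback (sketched after the next list):

```csharp
using System;
using System.Threading.Tasks;

public static class ConsoleLoop
{
    // Reads questions until the user types 'exit'; 'answer' is a placeholder
    // for the retrieval + LLM pipeline sketched below.
    public static async Task RunAsync(Func<string, Task<string>> answer)
    {
        while (true)
        {
            Console.Write("Ask a question (or type 'exit'): ");
            var input = Console.ReadLine();
            if (input is null || input.Trim() == "exit")
                break;                            // safe exit
            if (string.IsNullOrWhiteSpace(input))
                continue;                         // ignore empty input

            Console.WriteLine(await answer(input));
        }
    }
}
```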
- Computes cosine similarity between the query embedding and the stored chunk embeddings.
- Retrieves the top N most relevant chunks (default: 5).
- Combines the retrieved chunks into a single context string.
- Optionally includes metadata such as `SourceFile` and `ChunkIndex` for debugging.
- Sends the context and question as a prompt to the LLM (`LLMService`).
- Receives an answer grounded strictly in the provided context.
- Prints the answer to the console.
- Loops back to accept more queries until the user exits.
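The cosine-similarity formula is the standard dot(a, b) / (|a|·|b|); everything else below, including the prompt wording and the `embed`/`complete` delegates standing in for `GeminiEmbeddingGenerator` and `LLMService`, is an illustrative sketch rather than the repository's actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

public static class Retrieval
{
    // Standard cosine similarity: dot(a, b) / (|a| * |b|).
    public static double Cosine(float[] a, float[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.Sqrt(na) * Math.Sqrt(nb) + 1e-12);
    }

    // Ranks stored chunks against the question embedding, concatenates the
    // top N into a context string (tagged with SourceFile/ChunkIndex for
    // debugging), and asks the LLM for a grounded answer.
    public static async Task<string> AnswerAsync(
        string question,
        IReadOnlyList<ChunkRecord> store,
        Func<string, Task<float[]>> embed,     // stand-in for GeminiEmbeddingGenerator
        Func<string, Task<string>> complete,   // stand-in for LLMService
        int topN = 5)
    {
        var q = await embed(question);
        var top = store.OrderByDescending(c => Cosine(q, c.Embedding)).Take(topN);

        var context = new StringBuilder();
        foreach (var c in top)
            context.AppendLine($"[{c.SourceFile}#{c.ChunkIndex}] {c.Text}");

        var prompt = $"Answer strictly from the context below.\n\n" +
                     $"Context:\n{context}\nQuestion: {question}";
        return await complete(prompt);
    }
}
```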
- Incremental document ingestion.
- Chunk-based vector embeddings for precise semantic retrieval.
- Top-N similarity search for relevant context.
- LLM-based answer generation grounded in retrieved chunks.
- Simple console interface with safe exit handling.
- Ingestion detection is based on filenames alone: if a file is modified but its name is unchanged, the system will not re-ingest it.
- The vector store preserves all previous embeddings; old chunks remain even if files are updated.
- Works only with plain text documents; PDFs or Word documents require preprocessing into `.txt`.
- Delays (`delayMs`) are used to avoid hitting rate limits.
- Metadata such as `SourceFile` and `ChunkIndex` is optional but useful for debugging and context reference.
```bash
git clone https://github.com/knangia04/DOTNET-RAG-Document-QA.git
cd DOTNET-RAG-Document-QA
dotnet restore
dotnet build
```

- Place your documents in the `Documents` folder and set `GlobalSettings.DOCUMENTS_FOLDER_PATH` in `GlobalSettings.cs`.
- Set `GlobalSettings.VECTOR_STORE_LOCATION` for storing chunk embeddings.
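For reference, `GlobalSettings.cs` might look like this minimal sketch; only the two setting names come from this README, and the paths are placeholder examples.

```csharp
// Hypothetical shape of GlobalSettings.cs; only the two setting names come
// from the README, and these paths are placeholder examples.
public static class GlobalSettings
{
    public const string DOCUMENTS_FOLDER_PATH = @"C:\rag\Documents";
    public const string VECTOR_STORE_LOCATION = @"C:\rag\VectorStore";
}
```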
Run the application:
```bash
dotnet run
```

- The program will ingest new documents and generate embeddings.
- After ingestion, you can enter questions interactively.
- Type `exit` to quit the program safely.