gonzalogorgojo/node-rag

Small Local RAG API (Node.js + SQLite + node-llama-cpp)

A lightweight, local-only Retrieval-Augmented Generation (RAG) API.

Stack

  • Express - HTTP API layer
  • better-sqlite3 + sqlite-vector - local storage and vector similarity search
  • node-llama-cpp - local embedding + generation models (no cloud dependency)

The app reads .txt files from documents/, splits them into chunks, generates embeddings, stores them in SQLite, retrieves the most similar chunks for a query, and uses a local LLM to produce the final answer.

What this project does

  • Runs a local HTTP API on http://localhost:3000
  • Reads files from ./documents
  • Splits file content into chunks by blank lines (\n\n+)
  • Creates embeddings with model Qwen3-Embedding-0.6B
  • Stores embeddings in rag.db using cosine distance
  • Retrieves the top 8 chunks for a query (the number can be changed)
  • Generates the final answer with model Qwen3-0.6B
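
The chunking and retrieval steps above can be sketched in plain JavaScript, without SQLite or a model. This is only an illustration: the function names (splitIntoChunks, cosineDistance, topK) are hypothetical and not taken from index.js, the trimming and empty-chunk filtering are assumptions, and the tiny 3-value vectors stand in for real embeddings. In the project itself, the distance search is handled by sqlite-vector.

```javascript
// Split raw text into paragraph-style chunks on one or more blank lines.
// Trimming and dropping empty chunks is an assumption, not confirmed by the project.
function splitIntoChunks(text) {
  return text
    .split(/\n\n+/)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0);
}

// Cosine distance = 1 - cosine similarity (0 means the vectors point the same way).
// Assumes both vectors have the same length and are not all zeros.
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k rows closest to the query vector, nearest first.
function topK(queryVector, rows, k = 8) {
  return rows
    .map((row) => ({ ...row, distance: cosineDistance(queryVector, row.vector) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}

// Toy data: hand-written vectors instead of model embeddings.
const chunks = splitIntoChunks("First paragraph.\n\nSecond paragraph.\n\n\nThird paragraph.");
const rows = [
  { text: chunks[0], vector: [1, 0, 0] },
  { text: chunks[1], vector: [0, 1, 0] },
  { text: chunks[2], vector: [0.9, 0.1, 0] },
];
const nearest = topK([1, 0, 0], rows, 2);
console.log(nearest.map((row) => row.text));
```

Because /\n\n+/ matches any run of two or more newlines, the triple newline before the third paragraph counts as a single boundary rather than producing an empty chunk.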

Local-only scope

This is intentionally a small local project (not production-ready):

  • no auth
  • no multi-user isolation
  • no background jobs / queueing
  • no cloud storage

Requirements

  • Node.js (v24.14.1)

Setup

  1. Install dependencies:
npm install

The postinstall script automatically downloads the two GGUF models into ./models.

  2. Make sure this folder exists and contains your text files:
documents/
  3. Start the API:
npm start

Or with watch mode:

npm run start:watch

API

Base URL: http://localhost:3000

1) List available local documents

curl -X GET http://localhost:3000/documents

Example response:

{
  "totalFiles": 2,
  "files": [{ "name": "cv.txt" }, { "name": "projects.txt" }]
}

2) Embed one document into the vector DB

The file named in fileName must exist inside ./documents.

curl -X POST http://localhost:3000/documents/embed \
  -H "Content-Type: application/json" \
  -d '{"fileName":"example.txt"}'

Example response:

{
  "message": "Embeddings created successfully",
  "chunksStored": 12
}

3) Search and generate an answer

curl -X POST http://localhost:3000/documents/search \
  -H "Content-Type: application/json" \
  -d '{"query":"What backend experience do you have?"}'

Example response:

{
  "generatedAnswer": "..."
}
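
As described in the Notes, the answer is generated from a simple instruction plus the retrieved chunks. A minimal sketch of how such a prompt could be assembled is shown here. The buildPrompt helper, the instruction wording, and the numbered-context layout are all assumptions for illustration; the actual prompt lives in index.js and may look different.

```javascript
// Hypothetical helper: combine an instruction, the retrieved chunks, and the question.
function buildPrompt(query, chunks) {
  // Number each chunk so the model can tell the pieces of context apart.
  const context = chunks
    .map((chunk, i) => `[${i + 1}] ${chunk}`)
    .join("\n\n");

  return [
    "Answer the question using only the context below.",
    "If the context does not contain the answer, say so.",
    "",
    "Context:",
    context,
    "",
    `Question: ${query}`,
  ].join("\n");
}

// Example with two made-up chunks.
const prompt = buildPrompt("What backend experience do you have?", [
  "Built REST APIs with Express.",
  "Maintained SQLite-backed services.",
]);
console.log(prompt);
```

System instructions or few-shot examples would be added as extra entries before the "Context:" line.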

4) Remove one embedded document from DB

This removes rows from SQLite, not the physical file in documents/.

curl -X DELETE http://localhost:3000/documents/cv.txt

Example response:

{
  "message": "File deleted successfully"
}
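
The same four endpoints can also be called from Node.js instead of curl. The sketch below is a small hypothetical client (buildRequest and callApi are illustrative names, not part of this project) that uses the global fetch available in recent Node.js versions. It assumes successful responses are JSON, as in the examples above; the format of error responses is not documented here, so errors are reported as raw text.

```javascript
const BASE_URL = "http://localhost:3000";

// Describe each documented endpoint as { url, options } so it can be passed to fetch.
function buildRequest(action, value) {
  const jsonPost = (body) => ({
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });

  switch (action) {
    case "list":
      return { url: `${BASE_URL}/documents`, options: { method: "GET" } };
    case "embed":
      return { url: `${BASE_URL}/documents/embed`, options: jsonPost({ fileName: value }) };
    case "search":
      return { url: `${BASE_URL}/documents/search`, options: jsonPost({ query: value }) };
    case "remove":
      return {
        url: `${BASE_URL}/documents/${encodeURIComponent(value)}`,
        options: { method: "DELETE" },
      };
    default:
      throw new Error(`Unknown action: ${action}`);
  }
}

// Send the request and parse the JSON body. Requires the API to be running.
async function callApi(action, value) {
  const { url, options } = buildRequest(action, value);
  const response = await fetch(url, options);
  const text = await response.text();
  if (!response.ok) {
    throw new Error(`${action} failed with HTTP ${response.status}: ${text}`);
  }
  return JSON.parse(text);
}

// Example (with the server started via `npm start`):
// callApi("embed", "example.txt").then(console.log);
```

Keeping buildRequest separate from callApi makes the request shapes easy to inspect without a running server.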

Useful scripts

  • npm start - run API
  • npm run start:watch - run API with Node watch mode
  • npm run models:pull - download models manually
  • npm run models:check - open a node-llama-cpp chat session to check the models

Project files

  • index.js - API server + embedding/search flow
  • rag.db - local SQLite database (created at runtime)
  • models/ - downloaded GGUF models
  • documents/ - your local source files for indexing

Notes

  • First request that needs a model can take longer (model load).
  • Each .txt file is first read as plain text into a single string, then split into chunks on blank lines using /\n\n+/ (paragraph-style boundaries).
  • 1024 is the number of values (dimensions) in each embedding vector, not a text length. f32 means each of those 1024 values is stored as a 32-bit float (float32) in SQLite via vector_as_f32(...).
  • Embedding is currently one-file-at-a-time via API.
  • If a file is already embedded, /documents/embed returns an error until you delete it from the DB first.
  • I'm using macOS on Apple Silicon, which is why @sqliteai/sqlite-vector-darwin-arm64 is loaded; it is installed automatically. For other platforms, check the sqlite-vector docs.
  • The prompt in index.js should be adapted to your use case. Currently it is a simple instruction followed by the retrieved chunks; you can also add system instructions or few-shot examples as needed.
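
To make the 1024 / f32 note concrete, here is a small sketch of what a 1024-dimension float32 vector looks like in memory. The values are made up (a real vector comes from the embedding model), and how index.js actually hands the vector to vector_as_f32(...) is not shown here; this only illustrates the size and precision involved.

```javascript
const DIMENSIONS = 1024;

// Stand-in embedding: placeholder numbers, not real model output.
const embedding = Array.from({ length: DIMENSIONS }, (_, i) => Math.sin(i));

// Pack the numbers as 32-bit floats: 4 bytes per value.
const packed = new Float32Array(embedding);
const blob = Buffer.from(packed.buffer, packed.byteOffset, packed.byteLength);

console.log(packed.length);   // 1024 values
console.log(blob.byteLength); // 4096 bytes (1024 * 4)

// float32 keeps about 7 significant decimal digits, so values are rounded slightly.
console.log(embedding[1], packed[1]);
```

The vector size is the same for every chunk regardless of how long the chunk's text is, so storage grows with the number of chunks rather than with their length.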
