Seamless-RAG

Vector Search & Structured-Data RAG Toolkit for MariaDB

Turn any MariaDB table into a searchable vector store. Query results come back in TOON v3 tabular format — a compact wire format that saves 10-55% of tokens (vs compact JSON) when feeding structured data to LLMs or agents.

Why

LLMs and agents consume structured data as context. The standard approach — dumping JSON — wastes tokens on repeated field names and structural characters:

[{"id":1,"name":"Widget","category":"Tools","price":29.99,"stock":150,"supplier":"Acme","rating":4.5},
 {"id":2,"name":"Gadget","category":"Tools","price":19.99,"stock":300,"supplier":"Acme","rating":4.2}]

TOON's tabular form writes the field names once in a header, then each record as a compact row of values:

[2,]{id,name,category,price,stock,supplier,rating}:
  1,Widget,Tools,29.99,150,Acme,4.5
  2,Gadget,Tools,19.99,300,Acme,4.2
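
To make the layout concrete, here is a minimal sketch of an encoder that reproduces the two-row example above. It is not the project's TOONEncoder: the function name `to_toon_table` is hypothetical, the header syntax is simply copied from the example, and quoting or escaping of values (commas, newlines, nested data) is deliberately left out.

```python
def to_toon_table(rows):
    """Encode uniform dict rows in the tabular layout shown above.

    Simplified on purpose: no quoting or escaping, so values that
    contain commas or newlines are not handled.
    """
    if not rows:
        raise ValueError("need at least one row")
    fields = list(rows[0].keys())
    # Header: row count, then the field names written exactly once.
    header = f"[{len(rows)},]{{{','.join(fields)}}}:"
    lines = [header]
    for row in rows:
        # Each record becomes one indented, comma-separated line.
        lines.append("  " + ",".join(str(row[f]) for f in fields))
    return "\n".join(lines)


rows = [
    {"id": 1, "name": "Widget", "category": "Tools", "price": 29.99,
     "stock": 150, "supplier": "Acme", "rating": 4.5},
    {"id": 2, "name": "Gadget", "category": "Tools", "price": 19.99,
     "stock": 300, "supplier": "Acme", "rating": 4.2},
]
print(to_toon_table(rows))
```

The field names appear once in the header instead of once per record, which is where the savings on wide, short-valued rows come from.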

Measured on real public datasets (full benchmark):

Dataset (query type)	Rows	JSON Tokens	TOON Tokens	Savings
MovieLens — top rated movies (7 cols)	100	6,540	5,019	23.3%
MovieLens — metadata only (4 cols)	100	2,258	1,364	39.6%
SF Restaurant — violations (9 cols)	100	7,071	4,326	38.8%
SF Restaurant — high risk (9 cols)	50	3,437	2,076	39.6%
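
The Savings column is the relative reduction in token count against the JSON baseline. A short sketch of that arithmetic, using the token counts from the table (the helper name `savings_pct` is illustrative, not part of the toolkit):

```python
def savings_pct(json_tokens, toon_tokens):
    """Relative token reduction of TOON versus the JSON baseline, in percent."""
    return (json_tokens - toon_tokens) / json_tokens * 100


# (JSON tokens, TOON tokens) copied from the table above.
measurements = {
    "MovieLens, top rated movies": (6540, 5019),
    "MovieLens, metadata only": (2258, 1364),
    "SF Restaurant, violations": (7071, 4326),
    "SF Restaurant, high risk": (3437, 2076),
}

for label, (json_tokens, toon_tokens) in measurements.items():
    print(f"{label}: {savings_pct(json_tokens, toon_tokens):.1f}%")
```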

Savings scale with row count and stabilize at the dataset's natural ceiling:

Rows	MovieLens (7 cols)	Restaurant (9 cols)
10	21.7%	34.6%
50	22.0%	38.2%
100	24.1%	38.8%
500	29.0%	38.9%

TOON is not magic — it shines on structured tabular data with many columns and short values, which is exactly what comes out of database queries. All measurements use compact JSON (separators=(",",":")) as baseline.

Where It Fits

For structured database data, the industry uses two retrieval approaches. Seamless-RAG bridges both to LLMs:

"Q3 revenue by region?"           "Find products similar to X"
        │                                    │
   Text-to-SQL                        Vector Search
   (LLM generates SQL)              (cosine similarity)
        │                                    │
        └──────────┬─────────────────────────┘
                   ▼
           MariaDB executes
                   ▼
           list[dict] results
                   ▼
        Seamless-RAG → TOON format     ← saves 20-40% tokens
                   ▼
           LLM / Agent consumes

Precise queries ("revenue > 1M"): write the SQL yourself, then run seamless-rag export to convert the results to TOON.
Semantic queries ("similar products"): use seamless-rag ask for vector search over text columns.
Hybrid queries ("waterproof watches under $50"): seamless-rag ask --where "price < 50" combines a vector search with a SQL filter.

Seamless-RAG is a format + embedding bridge, not a replacement for SQL.
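
For intuition, a hybrid query can be thought of as an ordinary SELECT with a WHERE filter, ordered by cosine distance to the question's embedding. The sketch below only builds such a SQL string; it is an assumption about the general shape, not the statement Seamless-RAG actually issues. The function name `build_hybrid_query` and the `embedding` column name are hypothetical, and the identifiers and filter are assumed to have been validated already (see Security).

```python
def build_hybrid_query(table, vector_column, select_columns, where=None, top_k=5):
    """Build a MariaDB vector-search statement with an optional SQL filter.

    The query vector is left as a `?` placeholder, to be bound as a JSON
    array string such as "[0.1,0.2,...]" and parsed by VEC_FromText.
    Identifiers and the filter are trusted here; validate them first.
    """
    sql = f"SELECT {', '.join(select_columns)} FROM {table}"
    if where:
        # The structured half of the hybrid query: a plain SQL predicate.
        sql += f" WHERE {where}"
    # The semantic half: nearest rows by cosine distance come first.
    sql += (
        f" ORDER BY VEC_DISTANCE_COSINE({vector_column}, VEC_FromText(?))"
        f" LIMIT {int(top_k)}"
    )
    return sql


print(build_hybrid_query(
    "products", "embedding", ["id", "name", "price"],
    where="price < 50", top_k=5,
))
```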

Quick Start

Your data is already in MariaDB. Seamless-RAG adds vectors and TOON.

pip install -e ".[mariadb,embeddings]"         # install
docker compose up -d                            # MariaDB 11.8

seamless-rag init                               # create VECTOR columns + HNSW index
seamless-rag embed --table products --column description  # embed existing rows
seamless-rag ask "Which products are most relevant?"      # vector search → TOON → LLM
seamless-rag export "SELECT id, name, price FROM products LIMIT 20"  # SQL → TOON

No file loading and no document chunking: the data stays in MariaDB, and Seamless-RAG bridges it to vectors and LLMs.

CLI Commands

seamless-rag init              Create VECTOR columns + HNSW index
seamless-rag embed             Bulk-embed existing table rows (core workflow)
seamless-rag watch             Auto-embed new inserts in real time (Rich live)
seamless-rag ask <question>    Vector search → TOON context → LLM answer
seamless-rag export <sql>      Any SELECT → TOON format
seamless-rag benchmark         JSON vs TOON token/cost comparison
seamless-rag web               Gradio web UI (localhost-only by default)
seamless-rag demo              End-to-end demo with sample data
seamless-rag ingest <path>     Convenience: load text files for quick testing

Multi-column embedding — embed multiple columns for richer semantics:

# Single column (default)
seamless-rag embed --table products --column description

# Multi-column — values concatenated for richer vector search
seamless-rag embed --table products --columns "name,category,price,rating"
# Internally: "Widget — Tools — 29.99 — 4.5"

# Now "cheap high-rated tools" matches on price AND rating, not just description
seamless-rag ask "cheap high-rated tools" --where "price < 50"
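
The concatenation step can be pictured roughly as follows. This is a sketch based only on the "Internally" example above; the helper name `concat_columns` is hypothetical, and skipping NULL values is an assumption rather than documented behavior.

```python
# The dash separator seen in the "Internally" example above.
SEPARATOR = " \u2014 "


def concat_columns(row, columns):
    """Join the selected column values into one string to embed.

    Assumption: NULL (None) values are skipped rather than rendered.
    """
    parts = [str(row[c]) for c in columns if row.get(c) is not None]
    return SEPARATOR.join(parts)


row = {"name": "Widget", "category": "Tools", "price": 29.99, "rating": 4.5}
print(concat_columns(row, ["name", "category", "price", "rating"]))
```

Because price and rating end up in the embedded text, a question that mentions them has something to match against.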

Global options: --host, --port, --database, --provider, --model, --log-level

As Agent Tools

Seamless-RAG commands work as agent tools. An LLM agent can call these to interact with MariaDB:

# Assumes rag is an open SeamlessRAG instance (see the Python API section)
# Agent tool: search MariaDB and get compact context
result = rag.ask("quarterly revenue by region", top_k=10)
# result.context_toon → compact tabular format for next LLM call
# result.savings_pct → token savings vs compact JSON

# Agent tool: export any SQL query as TOON
toon = rag.export("SELECT region, revenue, quarter FROM sales")
# Feed to next agent step with minimal token overhead

# Agent tool: multi-column embed for richer search
rag.embed_table("products", text_column=["name", "category", "price"])
# "Widget — Tools — 29.99" → vector search matches name AND price

In a 20-step agent workflow that queries the database at each step, the per-step savings add up (measured on real data):

Dataset	JSON (20 steps)	TOON (20 steps)	Tokens Saved	Cost Saved
MovieLens (7 cols, 50 rows/step)	73,680	58,760	14,920	$0.037
Restaurant (9 cols, 50 rows/step)	69,640	42,640	27,000	$0.068
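
The Tokens Saved and Cost Saved columns follow from simple arithmetic. The sketch below assumes an input price of $2.50 per million tokens; that rate is not stated in this document and is used only because it reproduces the table's figures. The helper name `cost_saved` is illustrative.

```python
# Assumed input-token price in USD per million tokens (not stated in this README).
PRICE_PER_MILLION_INPUT_TOKENS = 2.50


def cost_saved(json_tokens, toon_tokens, price_per_million=PRICE_PER_MILLION_INPUT_TOKENS):
    """Return (tokens saved, USD saved) for one workflow."""
    saved = json_tokens - toon_tokens
    return saved, saved * price_per_million / 1_000_000


# (JSON tokens, TOON tokens) over 20 steps, copied from the table above.
workflows = {
    "MovieLens": (73_680, 58_760),
    "Restaurant": (69_640, 42_640),
}

for name, (json_tokens, toon_tokens) in workflows.items():
    tokens, usd = cost_saved(json_tokens, toon_tokens)
    print(f"{name}: {tokens:,} tokens saved, ${usd:.4f}")
```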

Python API

from seamless_rag import SeamlessRAG

with SeamlessRAG(host="localhost", database="mydb") as rag:
    rag.init()
    rag.ingest("research.txt", ["chunk1...", "chunk2..."])

    # Single-column embed (default)
    rag.embed_table("articles", text_column="content")

    # Multi-column embed — richer semantics
    rag.embed_table("products", text_column=["name", "category", "price"])

    # Semantic search with hybrid filter
    result = rag.ask("affordable tools", where="price < 50", mmr=True)
    print(result.answer)           # LLM-generated answer
    print(result.context_toon)     # compact context
    print(f"Saved {result.savings_pct:.0f}% tokens")

Pluggable Providers

Both embedding and LLM layers use typing.Protocol — no base class needed:

Layer	Providers	Default
Embedding	SentenceTransformers, Gemini, OpenAI, Ollama	SentenceTransformers (local, free)
LLM	Ollama, Gemini, OpenAI	Ollama (local, free)

Switch via env vars: EMBEDDING_PROVIDER=gemini LLM_PROVIDER=openai seamless-rag ask "..."

See Providers guide for adding custom providers.
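
As an illustration of what "no base class needed" means with typing.Protocol, here is a self-contained sketch. The protocol shape (a single `embed` method) and the toy `HashEmbedder` class are assumptions made for this example; the real method names and signatures are defined in the Providers guide.

```python
import hashlib
from typing import Protocol, runtime_checkable


@runtime_checkable
class EmbeddingProvider(Protocol):
    """Hypothetical protocol shape; the project's real one may differ."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        ...


class HashEmbedder:
    """Toy provider with no inheritance: it only has a matching embed()."""

    def __init__(self, dimension: int = 8):
        self.dimension = dimension

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            # Deterministic pseudo-embedding from a hash; not semantically useful.
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            vectors.append([b / 255.0 for b in digest[: self.dimension]])
        return vectors


provider = HashEmbedder()
# Structural typing: the class satisfies the protocol by shape alone.
print(isinstance(provider, EmbeddingProvider))
print(len(provider.embed(["hello", "world"])[0]))
```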

Architecture

seamless-rag CLI / Python API / Agent Tools
    │
    ├── EmbeddingProvider (Protocol)     ← 4 built-in, add your own
    ├── LLMProvider (Protocol)           ← 3 built-in, add your own
    ├── VectorStore (Protocol)           ← MariaDB with connection pool
    │     └── VECTOR(N) + HNSW index + VEC_DISTANCE_COSINE
    ├── AutoEmbedder                     ← batch + watch, multi-column concat
    ├── RAGEngine                        ← search → TOON → LLM (retry) → benchmark
    ├── TOONEncoder                      ← full v3 spec (166/166)
    └── TokenBenchmark                   ← tiktoken + GPT-4o cost calc

Test Results

538 tests passing (100%)
  lint:        100%
  unit:        100% (338/338)
  spec:        100% (166/166 TOON v3 conformance)
  integration: 100% (17/17)
  eval:        100%

Security

SQL injection prevention: WHERE filters and SELECT queries validated via sqlglot AST parsing — blocks writes, DDL, subqueries, and dangerous functions (SLEEP, BENCHMARK, LOAD_FILE)
Web UI: binds 127.0.0.1 by default; --share requires auth via SEAMLESS_WEB_USER / SEAMLESS_WEB_PASSWORD; error messages never leak server internals
LLM calls: context truncated to 20K chars; retry with jitter for transient errors; rate-limit detection
Identifiers: all table/column names validated against ^[A-Za-z_][A-Za-z0-9_]*$
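
The identifier rule is easy to picture in code. This sketch applies the regular expression quoted above; the function name `validate_identifier` and the error message are hypothetical, and the sqlglot-based query validation is not shown.

```python
import re

# Pattern quoted in the Security notes above.
IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def validate_identifier(name):
    """Return the name unchanged if it is a safe table or column identifier."""
    # fullmatch also rejects a trailing newline, which `$` alone would allow.
    if not IDENTIFIER_RE.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


print(validate_identifier("products"))
for bad in ("products; DROP TABLE users", "1st_column", "na-me"):
    try:
        validate_identifier(bad)
    except ValueError as exc:
        print(exc)
```

Identifiers cannot be bound as query parameters, so an allow-list check like this is the usual guard before interpolating them into SQL.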

Built for the MariaDB Ecosystem

MariaDB 11.7+ VECTOR columns, HNSW indexes, VEC_DISTANCE_COSINE
Native binary protocol via mariadb-connector-python (array.array float32)
Connection pooling with unique pool names for concurrent instances
Version validation (>= 11.7.2) on init
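
The float32 packing mentioned above can be sketched with the standard library alone. How the connector binds these bytes is not shown here; the helper names are hypothetical, and the byte order is the machine's native one (little-endian on common hardware), which is an assumption about what the server expects.

```python
from array import array


def to_float32_bytes(vector):
    """Pack a list of floats as contiguous 4-byte floats (native byte order)."""
    return array("f", vector).tobytes()


def from_float32_bytes(blob):
    """Unpack bytes produced by to_float32_bytes back into a list of floats."""
    values = array("f")
    values.frombytes(blob)
    return values.tolist()


embedding = [0.25, 0.5, 1.0]
blob = to_float32_bytes(embedding)
print(len(blob))                 # 4 bytes per dimension
print(from_float32_bytes(blob))
```

A VECTOR(N) value then occupies 4 x N bytes on the wire, instead of a much longer text representation of the same numbers.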

License

Copyright 2026 LiuWei (SunflowersLwtech)
Licensed under the Apache License, Version 2.0

See LICENSE | CONTRIBUTING | Documentation
