# Seamless-RAG: Vector Search & Structured-Data RAG Toolkit for MariaDB
Turn any MariaDB table into a searchable vector store. Query results come back in TOON v3 tabular format — a compact wire format that saves 10-55% of tokens (vs compact JSON) when feeding structured data to LLMs or agents.
LLMs and agents consume structured data as context. The standard approach — dumping JSON — wastes tokens on repeated field names and structural characters:
```json
[{"id":1,"name":"Widget","category":"Tools","price":29.99,"stock":150,"supplier":"Acme","rating":4.5},
 {"id":2,"name":"Gadget","category":"Tools","price":19.99,"stock":300,"supplier":"Acme","rating":4.2}]
```

TOON tabular writes field names once, then values as compact rows:

```
[2,]{id,name,category,price,stock,supplier,rating}:
1,Widget,Tools,29.99,150,Acme,4.5
2,Gadget,Tools,19.99,300,Acme,4.2
```
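The tabular layout above can be reproduced in a few lines of Python. This is an illustrative sketch only, not the library's `TOONEncoder` (the real encoder implements the full v3 spec, including quoting, delimiters, and nested values):

```python
def to_toon_tabular(rows):
    """Encode a uniform list of dicts in TOON-style tabular form (sketch only)."""
    fields = list(rows[0].keys())
    header = "[%d,]{%s}:" % (len(rows), ",".join(fields))
    body = [",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)

rows = [
    {"id": 1, "name": "Widget", "category": "Tools", "price": 29.99,
     "stock": 150, "supplier": "Acme", "rating": 4.5},
    {"id": 2, "name": "Gadget", "category": "Tools", "price": 19.99,
     "stock": 300, "supplier": "Acme", "rating": 4.2},
]
print(to_toon_tabular(rows))
```

Field names appear once in the header instead of once per row, which is where the token savings come from.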
Measured on real public datasets (full benchmark):
| Dataset (query type) | Rows | JSON Tokens | TOON Tokens | Savings |
|---|---|---|---|---|
| MovieLens — top rated movies (7 cols) | 100 | 6,540 | 5,019 | 23.3% |
| MovieLens — metadata only (4 cols) | 100 | 2,258 | 1,364 | 39.6% |
| SF Restaurant — violations (9 cols) | 100 | 7,071 | 4,326 | 38.8% |
| SF Restaurant — high risk (9 cols) | 50 | 3,437 | 2,076 | 39.6% |
Savings scale with row count and stabilize at the dataset's natural ceiling:
| Rows | MovieLens (7 cols) | Restaurant (9 cols) |
|---|---|---|
| 10 | 21.7% | 34.6% |
| 50 | 22.0% | 38.2% |
| 100 | 24.1% | 38.8% |
| 500 | 29.0% | 38.9% |
TOON is not magic — it shines on structured tabular data with many columns and short values, which is exactly what comes out of database queries. All measurements use compact JSON (`separators=(",",":")`) as the baseline.
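The shape of the comparison can be sketched on synthetic rows (the dataset below is made up for illustration). Real benchmarks count tokens with tiktoken; this sketch uses character counts as a rough proxy:

```python
import json

# Synthetic rows, illustrative only — not one of the benchmark datasets
rows = [
    {"id": i, "name": f"Item{i}", "category": "Tools", "price": i + 0.5}
    for i in range(50)
]

# Compact-JSON baseline, serialized exactly as the benchmarks do
compact = json.dumps(rows, separators=(",", ":"))

# TOON-style tabular: field names once, then one value row per record
fields = list(rows[0].keys())
toon = "\n".join(
    ["[%d,]{%s}:" % (len(rows), ",".join(fields))]
    + [",".join(str(r[f]) for f in fields) for r in rows]
)

# Character counts as a rough proxy for token counts
savings = 100 * (len(compact) - len(toon)) / len(compact)
print(f"{savings:.1f}% fewer characters")
```

The repeated `"id"`, `"name"`, `"category"`, `"price"` keys in the JSON baseline are what the tabular header eliminates.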
For structured database data, the industry uses two retrieval approaches. Seamless-RAG bridges both to LLMs:
```
"Q3 revenue by region?"         "Find products similar to X"
          │                                 │
     Text-to-SQL                      Vector Search
  (LLM generates SQL)              (cosine similarity)
          │                                 │
          └──────────┬──────────────────────┘
                     ▼
             MariaDB executes
                     ▼
            list[dict] results
                     ▼
   Seamless-RAG → TOON format  ← saves 20-40% tokens
                     ▼
           LLM / Agent consumes
```
- Precise queries ("revenue > 1M"): write SQL directly, use `seamless-rag export` to TOON-format the results
- Semantic queries ("similar products"): use `seamless-rag ask` for vector search on text columns
- Hybrid ("waterproof watches under $50"): `seamless-rag ask --where "price < 50"` combines both
Seamless-RAG is a format + embedding bridge, not a replacement for SQL.
Your data is already in MariaDB. Seamless-RAG adds vectors and TOON.
```shell
pip install -e ".[mariadb,embeddings]"                               # install
docker compose up -d                                                 # MariaDB 11.8
seamless-rag init                                                    # create VECTOR columns + HNSW index
seamless-rag embed --table products --column description             # embed existing rows
seamless-rag ask "Which products are most relevant?"                 # vector search → TOON → LLM
seamless-rag export "SELECT id, name, price FROM products LIMIT 20"  # SQL → TOON
```

No file loading, no document chunking — data lives in MariaDB, and Seamless-RAG bridges it to vectors and LLMs.
| Command | Description |
|---|---|
| `seamless-rag init` | Create VECTOR columns + HNSW index |
| `seamless-rag embed` | Bulk-embed existing table rows (core workflow) |
| `seamless-rag watch` | Auto-embed new inserts in real time (Rich live) |
| `seamless-rag ask <question>` | Vector search → TOON context → LLM answer |
| `seamless-rag export <sql>` | Any SELECT → TOON format |
| `seamless-rag benchmark` | JSON vs TOON token/cost comparison |
| `seamless-rag web` | Gradio web UI (localhost-only by default) |
| `seamless-rag demo` | End-to-end demo with sample data |
| `seamless-rag ingest <path>` | Convenience: load text files for quick testing |
Multi-column embedding — embed multiple columns for richer semantics:
```shell
# Single column (default)
seamless-rag embed --table products --column description

# Multi-column — values concatenated for richer vector search
seamless-rag embed --table products --columns "name,category,price,rating"
# Internally: "Widget — Tools — 29.99 — 4.5"

# Now "cheap high-rated tools" matches on price AND rating, not just description
seamless-rag ask "cheap high-rated tools" --where "price < 50"
```

Global options: `--host`, `--port`, `--database`, `--provider`, `--model`, `--log-level`
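The concatenation step itself is simple. A minimal sketch, where `concat_columns` is a hypothetical stand-in for the internal helper:

```python
def concat_columns(row, columns, sep=" — "):
    # Hypothetical helper mirroring the documented
    # "Widget — Tools — 29.99 — 4.5" concatenation
    return sep.join(str(row[col]) for col in columns)

row = {"name": "Widget", "category": "Tools", "price": 29.99, "rating": 4.5}
text = concat_columns(row, ["name", "category", "price", "rating"])
print(text)  # Widget — Tools — 29.99 — 4.5
```

The concatenated string is what gets embedded, so numeric columns like `price` and `rating` become part of the vector's semantics.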
Seamless-RAG commands work as agent tools. An LLM agent can call these to interact with MariaDB:
```python
# Agent tool: search MariaDB and get compact context
result = rag.ask("quarterly revenue by region", top_k=10)
# result.context_toon → compact tabular format for next LLM call
# result.savings_pct  → token savings vs compact JSON

# Agent tool: export any SQL query as TOON
toon = rag.export("SELECT region, revenue, quarter FROM sales")
# Feed to next agent step with minimal token overhead

# Agent tool: multi-column embed for richer search
rag.embed_table("products", text_column=["name", "category", "price"])
# "Widget — Tools — 29.99" → vector search matches name AND price
```

In a 20-step agent workflow that queries the database at each step (measured on real data):
| Dataset | JSON (20 steps) | TOON (20 steps) | Tokens Saved | Cost Saved |
|---|---|---|---|---|
| MovieLens (7 cols, 50 rows/step) | 73,680 | 58,760 | 14,920 | $0.037 |
| Restaurant (9 cols, 50 rows/step) | 69,640 | 42,640 | 27,000 | $0.068 |
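The cost column follows directly from the token counts, assuming GPT-4o input pricing of $2.50 per 1M tokens (an assumption that matches the table's figures):

```python
# Assumed GPT-4o input price: $2.50 per 1M tokens
PRICE_PER_TOKEN = 2.50 / 1_000_000

workflows = {
    "MovieLens (7 cols)": (73_680, 58_760),
    "Restaurant (9 cols)": (69_640, 42_640),
}
for name, (json_tokens, toon_tokens) in workflows.items():
    saved = json_tokens - toon_tokens
    # 14,920 and 27,000 tokens saved, ≈ $0.037 and ≈ $0.068 respectively
    print(f"{name}: {saved:,} tokens saved ≈ ${saved * PRICE_PER_TOKEN:.3f}")
```

Per-step savings compound linearly with the number of agent steps, so longer workflows save proportionally more.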
```python
from seamless_rag import SeamlessRAG

with SeamlessRAG(host="localhost", database="mydb") as rag:
    rag.init()
    rag.ingest("research.txt", ["chunk1...", "chunk2..."])

    # Single-column embed (default)
    rag.embed_table("articles", text_column="content")

    # Multi-column embed — richer semantics
    rag.embed_table("products", text_column=["name", "category", "price"])

    # Semantic search with hybrid filter
    result = rag.ask("affordable tools", where="price < 50", mmr=True)
    print(result.answer)         # LLM-generated answer
    print(result.context_toon)   # compact context
    print(f"Saved {result.savings_pct:.0f}% tokens")
```

Both embedding and LLM layers use `typing.Protocol` — no base class needed:
| Layer | Providers | Default |
|---|---|---|
| Embedding | SentenceTransformers, Gemini, OpenAI, Ollama | SentenceTransformers (local, free) |
| LLM | Ollama, Gemini, OpenAI | Ollama (local, free) |
Switch via env vars: `EMBEDDING_PROVIDER=gemini LLM_PROVIDER=openai seamless-rag ask "..."`
See Providers guide for adding custom providers.
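With `typing.Protocol`, any object with the right method satisfies the interface structurally. A toy sketch — the `embed` method name and signature here are assumptions for illustration, not the library's actual protocol:

```python
from typing import Protocol, Sequence

class EmbeddingProvider(Protocol):
    # Assumed method shape for illustration; see the Providers guide
    # for the library's actual interface
    def embed(self, texts: Sequence[str]) -> list[list[float]]: ...

class ToyHashEmbedder:
    """Satisfies EmbeddingProvider structurally — no inheritance required."""
    def embed(self, texts: Sequence[str]) -> list[list[float]]:
        # Deterministic 1-dim "embedding" for demonstration only
        return [[(hash(t) % 1000) / 1000.0] for t in texts]

def search_ready(provider: EmbeddingProvider, texts: Sequence[str]) -> list[list[float]]:
    return provider.embed(texts)

vectors = search_ready(ToyHashEmbedder(), ["widget", "gadget"])
print(len(vectors), len(vectors[0]))
```

Because the check is structural, a custom provider never imports a base class from the package — it just implements the expected methods.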
```
seamless-rag CLI / Python API / Agent Tools
        │
        ├── EmbeddingProvider (Protocol)  ← 4 built-in, add your own
        ├── LLMProvider (Protocol)        ← 3 built-in, add your own
        ├── VectorStore (Protocol)        ← MariaDB with connection pool
        │       └── VECTOR(N) + HNSW index + VEC_DISTANCE_COSINE
        ├── AutoEmbedder                  ← batch + watch, multi-column concat
        ├── RAGEngine                     ← search → TOON → LLM (retry) → benchmark
        ├── TOONEncoder                   ← full v3 spec (166/166)
        └── TokenBenchmark                ← tiktoken + GPT-4o cost calc
```
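`VEC_DISTANCE_COSINE` returns cosine distance, i.e. 1 minus cosine similarity. A pure-Python model of the same computation, for intuition only:

```python
import math

def vec_distance_cosine(a, b):
    # Mirrors the semantics of MariaDB's VEC_DISTANCE_COSINE:
    # 1 - (a · b) / (|a| |b|); 0 = same direction, 1 = orthogonal
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

print(vec_distance_cosine([1.0, 0.0], [1.0, 0.0]))  # 0.0 (identical direction)
print(vec_distance_cosine([1.0, 0.0], [0.0, 1.0]))  # 1.0 (orthogonal)
```

In the database, the HNSW index answers this nearest-neighbor query approximately rather than scanning every row.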
- 538 tests passing (100%)
- lint: 100%
- unit: 100% (338/338)
- spec: 100% (166/166 TOON v3 conformance)
- integration: 100% (17/17)
- eval: 100%
- SQL injection prevention: WHERE filters and SELECT queries validated via sqlglot AST parsing — blocks writes, DDL, subqueries, and dangerous functions (SLEEP, BENCHMARK, LOAD_FILE)
- Web UI: binds `127.0.0.1` by default; `--share` requires auth via `SEAMLESS_WEB_USER`/`SEAMLESS_WEB_PASSWORD`; error messages never leak server internals
- LLM calls: context truncated to 20K chars; retry with jitter for transient errors; rate-limit detection
- Identifiers: all table/column names validated against `^[A-Za-z_][A-Za-z0-9_]*$`
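The identifier rule is easy to model. An illustrative check (the function name is hypothetical, not the library's API):

```python
import re

IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def validate_identifier(name: str) -> str:
    # Illustrative gate matching the documented pattern: identifiers must be
    # letters, digits, and underscores only, and must not start with a digit
    if not IDENTIFIER_RE.match(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name

print(validate_identifier("products"))
# validate_identifier("products; DROP TABLE users") would raise ValueError
```

Validating identifiers separately matters because table and column names cannot be passed as bound parameters in SQL.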
- MariaDB 11.7+ VECTOR columns, HNSW indexes, `VEC_DISTANCE_COSINE`
- Native binary protocol via `mariadb-connector-python` (`array.array` float32)
- Connection pooling with unique pool names for concurrent instances
- Version validation (>= 11.7.2) on init
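"`array.array` float32" describes the wire payload shape: each vector component is 4 bytes. A sketch of what that packing looks like (illustrative, not the driver API):

```python
from array import array

# A 3-dim embedding packed as float32 — the kind of payload a binary
# protocol would send for a VECTOR(3) column
embedding = [0.25, -0.5, 1.0]
packed = array("f", embedding)
payload = packed.tobytes()
print(len(payload))  # 12 → 3 floats × 4 bytes each

# Round-trip back from bytes
restored = array("f", payload)
print(list(restored))  # [0.25, -0.5, 1.0]
```

Sending raw float32 bytes avoids the overhead of serializing each vector as a decimal string.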
Copyright 2026 LiuWei (SunflowersLwtech)
Licensed under the Apache License, Version 2.0
See LICENSE | CONTRIBUTING | Documentation