Tree-structured knowledge maps for LLM agent exploration.
Turn your document chunks into a navigable knowledge tree and let agents explore with 4 specialized tools — instead of dumping everything into a single RAG search. Library with LangChain & FastAPI integrations.
Standard RAG gives the agent one tool: "search everything." The agent can't strategize. It either retrieves too much or misses what matters.
Knowtology borrows from how skilled developers navigate codebases:
Developer exploring code:
Folder structure → Relevant directory → Open file → Read the part that matters
Agent exploring knowledge:
Tree structure → Relevant category → Check snippets → Read full text
"Don't dump everything into the LLM at once."
The result: agents that explore strategically instead of searching blindly.
| Feature | Knowtology | Standard RAG | Graph RAG | LightRAG |
|---|---|---|---|---|
| Tree-structured navigation | 3-level category tree | - | - | - |
| 4-Tool agent exploration | browse → search → keyword → read | Single search | Single search | Single search |
| Snippet-first design | Snippets for discovery, full text on demand | Full chunks always | Triples | Chunks |
| Zero chunk duplication | Maps to existing vector DB | Copies chunks | Extracts triples | Copies chunks |
| Model-aware batching | Dynamic batch size per LLM | Fixed | Fixed | Fixed |
| Adapter pattern | Swap any storage backend | Coupled | Coupled | Coupled |
| Framework integrations | LangChain, FastAPI | Varies | Varies | Varies |
Traditional RAG:
Query → vector search → top-K chunks → LLM → answer
Problem: no structure, no browsing, no overview
TreeRAG:
Documents → hierarchical tree (auto-built by LLM)
Agent browses tree → finds relevant category → checks snippets → reads originals
= Surgical retrieval, not shotgun search
Single search tool → Agent can't form a strategy
4 specialized tools → Agent creates its own exploration plan
browse_tree: See what's where (the map)
search_chunks: Find by meaning (semantic)
search_keyword: Find by exact terms (keyword)
read_chunks: Read the full original (verify)
Key constraint: search tools return SNIPPETS only.
→ Agent decides "I need to read more" → calls read_chunks
→ Prevents premature answers from partial information
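The snippet-first contract can be sketched in plain Python. This is a toy simulation of the idea, not the library's internals; `SNIPPET_LEN` and the in-memory `CHUNKS` dict are illustrative assumptions:

```python
# Toy sketch of the snippet-first contract: search returns short
# previews, and only read_chunks returns the full text.
SNIPPET_LEN = 50  # assumed snippet length, for illustration

CHUNKS = {
    "abc-123": "Refunds within 7 days of purchase, opened food excluded. " * 5,
    "def-456": "Shipping takes 2-3 business days, remote areas +1-2 days. " * 5,
}

def search_chunks(query: str) -> list[dict]:
    # A real implementation ranks by embedding similarity; here every
    # chunk "matches" and is truncated to a snippet.
    return [
        {"chunk_id": cid, "snippet": text[:SNIPPET_LEN] + "..."}
        for cid, text in CHUNKS.items()
    ]

def read_chunks(chunk_id: str) -> str:
    # Full original text, fetched only when the agent asks for it.
    return CHUNKS[chunk_id]

hits = search_chunks("return policy")
assert all(len(h["snippet"]) <= SNIPPET_LEN + 3 for h in hits)
assert len(read_chunks("abc-123")) > SNIPPET_LEN  # full text is longer
```

The point of the constraint: the agent sees cheap previews everywhere and pays the full-text cost only for the chunks it deliberately chooses to read.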
v1 failure:
Chunks → LLM extracts SPO triples → stored separately
→ LLM misses something? Lost forever. Need originals? Search again.
v2 philosophy:
Chunks already live in your vector DB. Use them as-is.
Tree + snippets = "a map that tells you where things are" (index)
read_chunks fetches originals → answer
= No duplication, no sync issues, source preserved
pip install knowtology # Core (pydantic + httpx)
pip install knowtology[openai] # + OpenAI LLM adapter
pip install knowtology[anthropic] # + Anthropic Claude adapter
pip install knowtology[langchain] # + LangChain tool integration
pip install knowtology[pg] # + PostgreSQL adapter
pip install knowtology[qdrant] # + Qdrant vector store
pip install knowtology[all] # Everything
pip install knowtology[dev] # + pytest for development

from knowtology import TreeBuilder
from knowtology.llm.openai import OpenAIClient
from knowtology.adapters.memory import InMemoryTreeStore
llm = OpenAIClient(api_key="sk-...", model="gpt-4o-mini")
tree_store = InMemoryTreeStore()
builder = TreeBuilder(llm=llm, tree_store=tree_store)
# chunks = already stored in your vector DB
result = await builder.build(
collection_id="company_docs",
chunks=[
{"text": "Refunds within 7 days of purchase...", "chunk_id": "abc-123", "chunk_index": 0},
{"text": "Shipping takes 2-3 business days...", "chunk_id": "def-456", "chunk_index": 1},
{"text": "Exchanges within 14 days...", "chunk_id": "ghi-789", "chunk_index": 2},
],
)
print(f"Tree: {result['tree_count']} nodes, Mappings: {result['mapping_count']}")

from knowtology import KnowledgeMapTools
from knowtology.adapters.memory import InMemoryTreeStore, InMemoryVectorStore, InMemoryTextStore
tools = KnowledgeMapTools(
collection_id="company_docs",
tree_store=tree_store, # from build step
vector_store=InMemoryVectorStore(),
text_store=InMemoryTextStore(),
top_k=5,
)
# Step 1: Browse the tree structure
await tools.browse_tree("root")
# [Category] Refunds (1 chunk)
# [abc-123] Refund within 7 days, opened food excluded
# [Category] Shipping (1 chunk)
# [def-456] 2-3 business days, remote areas +1-2 days
# Step 2: Semantic search → snippets only
await tools.search_chunks("return policy")
# Step 3: Exact keyword match
await tools.search_keyword("7 days, refund")
# Step 4: Read full original text
await tools.read_chunks("abc-123")
# Returns complete chunk text — the actual answer source

from knowtology import KnowledgeMapTools
from langchain.agents import create_tool_calling_agent
from langchain_openai import ChatOpenAI
tools_instance = KnowledgeMapTools(
collection_id="company_docs",
tree_store=tree_store,
vector_store=vector_store,
text_store=text_store,
)
agent = create_tool_calling_agent(
llm=ChatOpenAI(model="gpt-4o"),
tools=tools_instance.get_langchain_tools(), # 4 tools: km_browse_tree, km_search_chunks, ...
prompt=prompt,
)

from knowtology import TreeBuilder, KnowledgeMapTools
from knowtology.llm.anthropic import AnthropicClient
from knowtology.adapters.pg import PostgresTreeStore, PostgresTextStore
from knowtology.adapters.qdrant import QdrantVectorStore
# Build
builder = TreeBuilder(
llm=AnthropicClient(api_key="sk-ant-...", model="claude-sonnet-4-6"),
tree_store=PostgresTreeStore(dsn="postgresql://..."),
)
await builder.build("product_docs", chunks)
# Search
tools = KnowledgeMapTools(
collection_id="product_docs",
tree_store=PostgresTreeStore(dsn="postgresql://..."),
vector_store=QdrantVectorStore(url="http://localhost:6333"),
text_store=PostgresTextStore(dsn="postgresql://..."),
)

knowtology
│
├── TreeBuilder ──────── LLM-powered tree construction
│ ├── BatchStrategy Model-aware dynamic batching (gpt-4o: 150, claude: 200, ...)
│ └── Prompts CREATE_TREE / EXTEND_TREE templates
│
├── KnowledgeMapTools ── 4-Tool exploration factory
│ ├── browse_tree Navigate tree hierarchy (the map)
│ ├── search_chunks Semantic search → snippets only
│ ├── search_keyword Exact keyword match → snippets only
│ └── read_chunks Fetch full original text + read tracking
│
├── Integrations
│ ├── LangChain StructuredTool conversion (km_* prefix)
│ └── FastAPI REST router factory (/km/browse, /km/search, ...)
│
StorageBackend (ABC)
│
┌────┼──────────┬──────────────┐
│ │ │ │
Memory SQLite PostgreSQL Qdrant
(test) (light) (TreeStore (VectorStore)
+ TextStore)
TreeNode ChunkMapping
├── id ├── id
├── collection_id ├── collection_id
├── parent_id ├── tree_node_id → TreeNode.id
├── name "Refunds" ├── chunk_id → Vector DB point ID
├── description ├── snippet "Refund within 7 days..."
├── level 1 (max: 3) ├── chunk_index
├── path "Returns/Refunds"├── doc_id
└── chunk_count 12 └── file_name
Key insight: ChunkMapping POINTS to existing vector DB chunks.
It never copies them. Zero duplication.
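The two records above can be approximated with stdlib dataclasses. Field names follow the diagram; the concrete types are assumptions (the library itself uses Pydantic models):

```python
from dataclasses import dataclass
from typing import Optional

# Approximation of the two records using stdlib dataclasses.
# Field names follow the diagram above; types are assumptions.

@dataclass
class TreeNode:
    id: str
    collection_id: str
    parent_id: Optional[str]
    name: str            # e.g. "Refunds"
    description: str
    level: int           # 1..3 (max depth 3)
    path: str            # e.g. "Returns/Refunds"
    chunk_count: int

@dataclass
class ChunkMapping:
    id: str
    collection_id: str
    tree_node_id: str    # -> TreeNode.id
    chunk_id: str        # -> vector DB point ID; never a copy of the text
    snippet: str         # short preview, e.g. "Refund within 7 days..."
    chunk_index: int
    doc_id: str
    file_name: str

node = TreeNode("n1", "docs", None, "Refunds", "Refund policies", 1, "Returns/Refunds", 12)
mapping = ChunkMapping("m1", "docs", node.id, "abc-123", "Refund within 7 days...", 0, "d1", "policy.md")
assert mapping.tree_node_id == node.id  # the mapping points into the tree
assert mapping.chunk_id == "abc-123"    # ...and into the existing vector DB
```

Note that `ChunkMapping` holds only a pointer (`chunk_id`) and a snippet; the full chunk text stays in the vector DB.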
The tree builder dynamically adjusts batch size based on the LLM model:
| Model | Batch Size | Rationale |
|---|---|---|
| gpt-4o / gpt-4o-mini | 150 | 128K context |
| gpt-4-turbo | 120 | 128K context, slower |
| gpt-4 | 50 | 8K context |
| gpt-3.5-turbo | 20 | 16K context, lower quality |
| claude-opus-4-6 | 200 | 1M context |
| claude-sonnet-4-6 | 150 | 200K context |
| claude-haiku-4-5 | 100 | 200K context |
| Other | 50 | Safe default |
No chunk is ever dropped. Every chunk gets processed — unlike v1's fixed 40-chunk cutoff that silently lost data.
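A minimal sketch of that lookup, using the table values above. The prefix-matching rule is an assumption about how model variants are grouped, not the library's exact logic:

```python
import math

# Sketch of model-aware batch sizing using the table above.
# Prefix matching is an assumed grouping rule; order matters
# (gpt-4o and gpt-4-turbo must be checked before plain gpt-4).
BATCH_SIZES = [
    ("gpt-4o", 150),          # also matches gpt-4o-mini
    ("gpt-4-turbo", 120),
    ("gpt-4", 50),
    ("gpt-3.5-turbo", 20),
    ("claude-opus", 200),
    ("claude-sonnet", 150),
    ("claude-haiku", 100),
]

def batch_size(model: str) -> int:
    for prefix, size in BATCH_SIZES:
        if model.startswith(prefix):
            return size
    return 50  # safe default for unknown models

def num_api_calls(n_chunks: int, model: str) -> int:
    return math.ceil(n_chunks / batch_size(model))

assert batch_size("gpt-4o-mini") == 150
assert num_api_calls(1000, "gpt-4o") == 7           # vs 25 with fixed 40-chunk batches
assert num_api_calls(1000, "claude-opus-4-6") == 5
```

Since every chunk falls into exactly one batch, ceiling division guarantees nothing is dropped regardless of model.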
Swap any storage backend without changing application code:
from knowtology.adapters.base import TreeStore, VectorStore, TextStore
class TreeStore(ABC):
    async def upsert_node(self, node) -> TreeNode
    async def get_tree(self, collection_id) -> list[TreeNode]
    async def get_children(self, collection_id, parent_path) -> list[TreeNode]
    async def insert_mapping(self, mapping) -> ChunkMapping
    async def get_mapping_by_chunk_id(self, chunk_id) -> ChunkMapping | None
    ...

class VectorStore(ABC):  # Read-only — never writes to your vector DB
    async def search(self, collection_id, query, top_k) -> list[dict]
    async def get_by_ids(self, collection_id, chunk_ids) -> list[dict]

class TextStore(ABC):
    async def search_by_keywords(self, collection_id, keywords, top_k) -> list[dict]

Built-in implementations: InMemoryTreeStore, InMemoryVectorStore, InMemoryTextStore — test everything without Docker.
Standard RAG (every query):
top-5 chunks × ~500 tokens = 2,500 tokens fed to LLM
Knowtology (same query):
browse_tree → snippets only (~50 chars each)
search → snippets only
read_chunks → only the 1-2 chunks actually needed = 500-1,000 tokens
+ already-read chunks → skipped entirely
≈ 50-60% token reduction per query
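Back-of-envelope math for the comparison above. The snippet overhead (10 snippets at ~15 tokens each) and the two-chunk read are illustrative assumptions, not measurements:

```python
# Illustrative token budget per query; snippet counts/sizes are assumptions.
baseline = 5 * 500       # standard RAG: top-5 full chunks x ~500 tokens each

snippets = 10 * 15       # browse_tree + search results, snippets only
full_reads = 2 * 500     # read_chunks on the 1-2 chunks that actually matter
knowtology = snippets + full_reads

reduction = 1 - knowtology / baseline
assert 0.5 <= reduction <= 0.6   # in the ~50-60% range claimed above
```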
Multi-turn conversations: agents re-fetch the same chunks over and over.
Knowtology tracks read history per session:
First read: → full text returned
Second read: → "(already read — skip)" one line
→ Longer conversations = bigger savings
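At its core, read tracking is a per-session set of chunk IDs. A sketch (the marker string follows the example above; the class name and shape are assumptions, not the library's API):

```python
# Sketch of per-session read tracking: the first read returns full
# text, repeat reads collapse to a one-line marker.
class ReadTracker:
    def __init__(self):
        self._seen: set[str] = set()

    def read(self, chunk_id: str, full_text: str) -> str:
        if chunk_id in self._seen:
            return f"[{chunk_id}] (already read — skip)"
        self._seen.add(chunk_id)
        return full_text

session = ReadTracker()
first = session.read("abc-123", "Refunds within 7 days of purchase...")
second = session.read("abc-123", "Refunds within 7 days of purchase...")
assert first.startswith("Refunds")       # first read: full text
assert "already read" in second          # repeat read: one-line marker
```

The savings compound: the more turns a conversation runs, the more repeat fetches collapse to single-line markers.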
Graph RAG: chunks → extract triples → store in separate graph DB (2x data)
Knowtology: reuse existing vector DB, add only tree + mappings (lightweight metadata)
→ No extra vector DB storage
→ No sync problems between stores
1,000 chunks to build:
Fixed 40-chunk batches → 25 LLM API calls
gpt-4o dynamic batching → 7 LLM API calls (72% fewer calls)
claude-opus batching → 5 LLM API calls (80% fewer calls)
| # | Component | Status | Description |
|---|---|---|---|
| 1 | pyproject.toml + package structure | Done | Project scaffolding |
| 2 | core/models.py | Done | 4 Pydantic models (TreeNode, ChunkMapping, SearchResult, ChunkContent) |
| 3 | adapters/base.py | Done | 3 abstract interfaces (TreeStore, VectorStore, TextStore) |
| 4 | adapters/memory.py | Done | InMemory implementations for testing |
| 5 | llm/base.py | Done | LLMClient abstract interface |
| 6 | builder/batch.py | Done | Model-aware dynamic batching strategy |
| 7 | builder/prompts.py | Done | CREATE_TREE / EXTEND_TREE prompt templates |
| 8 | builder/tree_builder.py | Done | TreeBuilder main pipeline |
| 9 | llm/openai.py | Done | OpenAI adapter |
| 10 | llm/anthropic.py | Done | Anthropic Claude adapter |
| 11 | search/browse.py | Done | browse_tree implementation |
| 12 | search/semantic.py | Done | search_chunks implementation |
| 13 | search/keyword.py | Done | search_keyword implementation |
| 14 | search/reader.py | Done | read_chunks + read history tracking |
| 15 | search/tools.py | Done | KnowledgeMapTools 4-tool factory |
| 16 | integrations/langchain.py | Done | LangChain StructuredTool conversion |
| 17 | integrations/fastapi.py | Done | FastAPI router factory |
| 18 | tests/ | Done | 28 unit tests (all passing, InMemory-based) |
| 19 | adapters/pg.py | Not yet | PostgreSQL TreeStore + TextStore |
| 20 | adapters/qdrant.py | Not yet | Qdrant VectorStore |
| 21 | adapters/sqlite.py | Not yet | SQLite adapter (lightweight) |
| 22 | Docker integration tests | Not yet | PG + Qdrant docker-compose test suite |
| 23 | XGEN adapter + integration | Not yet | HTTP API adapter for XGEN platform |
| 24 | E2E tests | Not yet | Full-stack end-to-end validation |
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest tests/ -v # 28 tests, all in-memory, < 1s

MIT