Early alpha, battle-tested core — Built on a 25-agent production swarm. Public API may change. Feedback welcome.
A new primitive for agent systems. Most frameworks optimize retrieval and persistence. ShelfAI optimizes the source files they retrieve from.
Every agent framework treats skill files as flat text — you either load the whole thing or nothing. Retrieval systems find the right file, but nobody optimizes what's inside it. Skills accumulate with no pruning, no chunking, no internal structure for selective loading.
ShelfAI is the document-ops layer for agent context. It applies RAG architecture principles — abstracts, semantic chunking, titled sections — to agent skill files, so an LLM can skim an abstract, decide relevance, and load only the chunks it needs. An agent maintains the files over time, handling restructuring and pruning automatically.
The pattern comes from a medical RAG system we're building in partnership with the University of Coimbra — tiered retrieval, agent-written abstracts, structured chunking. We asked: why doesn't agent memory work this way? So we built ShelfAI.
```shell
pip install shelfai==0.2.0a4
```

What you get: 60%+ token reduction on agent files (observed in our swarm using heuristic chunking alone — the semantic pass increases savings further) · agent-written abstracts that improve after every session · 163 tests · $0.02/session · Apache 2.0
Your agent's memory architecture is sophisticated. The documents it remembers are not. Existing memory systems — Hermes, Honcho, SuperMemory, QMD — solve what to remember and when to retrieve it. ShelfAI solves how the documents are structured once you've decided to read one.
```
Raw skill file / knowledge document
        ↓
  [ ShelfAI ]  ← structures, chunks, titles, writes abstracts
        ↓
Structured document with abstract + semantic chunks
        ↓
QMD / SuperMemory / Hermes skills / OpenClaw ClawHub
```
ShelfAI is a preprocessing layer that makes retrieval systems work better, not a replacement for any of them.
| Layer | Tool | What It Does |
|---|---|---|
| Search | QMD | Finds the right files. BM25 + vector + LLM reranking, fully local. |
| Structure | ShelfAI | Curates what gets searched. Abstracts, chunks, learning loop. |
| Entity Memory | Honcho or Supermemory | Remembers users, projects, facts that change over time. |
Every agent framework today treats skill files as flat text:
- **Skill discovery** — You either match the YAML description or you don't. There's no abstract to help make smarter routing decisions. In Hermes, for example, Level 0 gives you a name + description index (~3k tokens) and Level 1 gives you the full skill. There is no Level 0.5.
- **Skill loading** — You read zero lines or all 500. No way to say "give me chunk 3 about error handling." Every irrelevant line burns tokens.
- **Skill creation** — The agent writes a flat markdown file. No internal structure optimized for future retrieval. Skills accumulate indefinitely with no pruning, no contradiction detection, no internal navigation.
ShelfAI adds the missing gradient: abstracts for smarter routing, semantic chunks for selective loading, and a learning loop that improves both over time.
Your agent's knowledge lives in a simple directory:
```
shelf/
├── index.md       ← One-line abstracts for everything
├── skills/        ← Agent capabilities and procedures
├── knowledge/     ← Domain-specific reference material
├── memory/        ← Learnings from past sessions
│   ├── user/      ← User preferences
│   └── agent/     ← Operational lessons and patterns
└── resources/     ← Reference materials
```
`shelf/index.md` is the only file your agent reads first:
```markdown
# ShelfAI Index

## Skills
- **skills/seo_audit.md** — Use when a client requests a site audit. Covers
  technical crawl, Core Web Vitals, internal linking, and content gaps.
- **skills/lead_nurture.md** — Use when following up with a lead. Includes
  timing rules, email templates, and the 72-hour re-engagement trigger.

## Knowledge
- **knowledge/api_docs.md** — Payments API reference. Key gotcha: staging
  returns 200 with error body, don't trust status codes.

## Memory
- **memory/agent/lessons.md** — Staging needs VPN. Screaming Frog misses
  JS-rendered pages on Client B's site.
```

The agent reads the index, matches abstracts to the task, and loads only the matching files. This is Level 0.5 — richer than a name/description pair, cheaper than loading the full file.
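That routing step needs no library at all. Here is a minimal, library-free sketch (hypothetical helper names; naive keyword-overlap scoring stands in for whatever matching your agent actually uses):

```python
import re

def parse_index(index_text: str) -> dict[str, str]:
    """Map file path -> abstract from index lines like '- **path** — abstract'."""
    entries = {}
    for path, abstract in re.findall(r"-\s+\*\*(\S+?)\*\*\s+—\s+(.+)", index_text):
        entries[path] = abstract
    return entries

def route(task: str, entries: dict[str, str]) -> list[str]:
    """Rank files by naive keyword overlap between the task and each abstract."""
    task_words = set(task.lower().split())
    scored = [
        (len(task_words & set(abstract.lower().split())), path)
        for path, abstract in entries.items()
    ]
    return [path for score, path in sorted(scored, reverse=True) if score > 0]
```

The agent then reads only the top-ranked files instead of the whole shelf; swapping the overlap score for embedding similarity or an LLM call doesn't change the shape of the loop.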
After each conversation, ShelfAI's session agent:
- Analyzes the transcript
- Extracts operational lessons, workflow patterns, preference updates
- Deduplicates against existing knowledge
- Updates memory files and refines index abstracts
- QMD re-indexes — better abstracts mean better search next time
```
Session happens
   │
   ├─→ ShelfAI session agent extracts operational lessons
   │     → Updates memory files
   │     → Refines abstracts
   └─→ QMD re-indexes the updated shelf
         → Better abstracts = better reranking

Next session
   ├─→ QMD finds more relevant files
   └─→ ShelfAI provides richer, curated context

Agent performs better → richer sessions → better extractions → loop
```
The key insight: Your agent uses your context daily. It knows which details matter, which skills get called when, which gotchas keep tripping things up. An agent that uses the context writes better retrieval abstracts than a model that just summarizes it.
Monolithic agent instruction files (150-400+ lines) waste tokens loading instructions irrelevant to the current task. ShelfAI's chunking system splits them into modular, selectively-loaded chunks — reducing per-run token cost by ~60%.
- **Heuristic pre-filter (free):** `shelfai chunk` extracts soul/rules/read-order into always-loaded chunks. Handles the ~35% of chunking that's structurally obvious. Zero LLM cost, safe to run anytime.
- **LLM semantic pass (~$0.01/agent):** The session agent groups remaining sections by deliverable/workflow. Triggered on a weekly cadence. The LLM has full latitude to say "no change needed" when a chunk's size serves the deliverable.
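The heuristic layer can be approximated in a few lines: split the file on `##` headings and flag known identity/constraint sections as always-loaded. A sketch with assumed section names (not the shipped `shelfai chunk` implementation):

```python
import re

ALWAYS = {"soul", "rules", "read-order"}  # assumed always-loaded section names

def split_sections(md: str) -> dict[str, str]:
    """Split a monolithic agent file into its '## '-titled sections."""
    parts = re.split(r"^##\s+(.+)$", md, flags=re.MULTILINE)
    # parts = [preamble, title1, body1, title2, body2, ...]
    return {parts[i].strip(): parts[i + 1].strip() for i in range(1, len(parts), 2)}

def classify(title: str) -> str:
    """Heuristic: known identity/constraint sections load on every run."""
    slug = title.lower().replace(" ", "-")
    return "always" if slug in ALWAYS else "task"
```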
```
agents/{id}/
├── AGENT.md            # Thin router (~40 lines) — maps tasks to chunks
├── MEMORY.md           # Learned patterns
└── chunks/
    ├── soul.md         # Always loaded — mission, role, identity
    ├── rules.md        # Always loaded — hard constraints
    ├── read-order.md   # Always loaded — system integration, data sources
    ├── {task-1}.md     # Loaded when task matches
    └── {task-2}.md     # Loaded when task matches
```
```shell
# Scan for agents that need chunking
shelfai chunk-scan ./agents

# Preview the pre-filter on a specific agent
shelfai chunk ./agents/18-efficiency/AGENT.md --dry-run

# Write chunk files (backs up original as AGENT.md.pre-chunk)
shelfai chunk ./agents/18-efficiency/AGENT.md --write
```

| Class | When to Load | Examples |
|---|---|---|
| `always` | Every run | soul, rules, read-order, MEMORY.md |
| `task` | Current task matches | tiktok, blog-article, daily-scorecard |
| `schedule` | Time-triggered | weekly-report (Mondays), monthly-review |
| `reference` | On demand or searched | scoring-formulas, tool-setup |
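A run-time loader over these classes might look like the following (hypothetical `classes` mapping and weekday-indexed `schedule` config for illustration; ShelfAI's own loader may differ):

```python
from datetime import date

def chunks_to_load(classes: dict[str, str], task: str, today: date,
                   schedule: dict[str, int]) -> list[str]:
    """Pick chunk names: always-chunks, task matches, and due schedules."""
    load = []
    for name, cls in classes.items():
        if cls == "always":
            load.append(name)
        elif cls == "task" and name in task.lower():
            load.append(name)  # naive substring match between chunk name and task
        elif cls == "schedule" and schedule.get(name) == today.weekday():
            load.append(name)  # schedule maps chunk name -> weekday (0 = Monday)
    return load
```

`reference` chunks are the remaining case: they stay unloaded until the agent explicitly asks for one or a search hits it.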
```shell
# Install
pip install shelfai==0.2.0a4

# Initialize a shelf
shelfai init --template agent

# Add your agent's knowledge
shelfai add ./my_playbook.md --category skills
shelfai add ./api_docs.md --category knowledge

# Build the index (manual = best quality, auto = faster)
shelfai index --manual   # You write abstracts with retrieval hints
# OR
shelfai index            # Auto-generate abstracts with LLM (~$0.01)

# Register with QMD
qmd collection add ./shelf --name shelf
qmd embed

# After each conversation, extract learnings
shelfai session ./transcript.md
qmd embed   # Re-index so QMD sees the updates
```

That's it. Your agent now has a knowledge base that improves after every conversation.
```python
from shelfai import Shelf

shelf = Shelf("./shelf")

def run_task(task: str):
    # Find relevant context
    relevant = shelf.index.search(task)
    context = "\n".join(shelf.read_file(e.file_path) for e in relevant)
    lessons = shelf.read_file("memory/agent/lessons.md", default="")
    return run_agent(f"{context}\n\n{lessons}", task)  # run_agent = your agent runner
```

```python
import subprocess

from shelfai import Shelf
from shelfai.agents.session import SessionManager
from shelfai.providers.anthropic import AnthropicProvider

shelf = Shelf("./shelf")
provider = AnthropicProvider()
manager = SessionManager(shelf, provider)

# After conversation ends
report = manager.process_file("transcript.md")
print(f"Extracted {report.extraction.total_items} learnings")

# Re-index so QMD sees the updates
subprocess.run(["qmd", "embed"])
```

Memory files grow as the session agent appends lessons after each conversation. Without compaction, they accumulate duplicates, superseded entries, and stale observations that dilute context quality.
`shelfai compact` consolidates memory files using heuristic dedup — no LLM needed, safe to run anytime.
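The core of heuristic dedup can be sketched with stdlib string similarity (illustrative threshold and logic, not the shipped `shelfai compact` implementation):

```python
from difflib import SequenceMatcher

def dedupe(entries: list[str], threshold: float = 0.9) -> list[str]:
    """Keep the first of any pair of near-duplicate memory entries."""
    kept: list[str] = []
    for entry in entries:
        if all(SequenceMatcher(None, entry.lower(), k.lower()).ratio() < threshold
               for k in kept):
            kept.append(entry)
    return kept
```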
```shell
# Scan all memory files (shelf + agent MEMORY.md)
shelfai compact --shelf ./shelf --agents ./agents --scan

# Preview compaction on a specific file
shelfai compact --file ./shelf/memory/agent/what-works.md

# Apply compaction (backs up originals as .pre-compact)
shelfai compact --shelf ./shelf --agents ./agents --write
```

Compaction removes near-duplicate entries, strips placeholders, and archives entries older than 90 days (configurable via `--stale-days`). It preserves file structure, headings, and tables, and backs up originals before writing.
ShelfAI is framework-agnostic. It manages markdown files. Works with any agent runtime.
| Framework | Integration | Status |
|---|---|---|
| Claude Code | Claude skill (`skills/claude/`) | ✅ Shipped |
| Hermes Agent | Post-run hook + skill structuring (`examples/hermes_integration.py`) | 📖 Example |
| OpenClaw | ClawHub skill packaging (`examples/openclaw_integration.py`) | 📖 Example |
| QMD | Direct — ShelfAI curates what QMD indexes | ✅ Shipped |
| Honcho | Complementary — ShelfAI handles ops knowledge, Honcho handles entity memory | Compatible |
| SuperMemory | Complementary — ShelfAI structures docs before ingestion | Compatible |
See examples/ for integration guides.
ShelfAI occupies a different layer than most tools labeled "agent memory." Here's how it maps against the landscape:
| Tool | What It Solves | ShelfAI Relationship |
|---|---|---|
| Letta Context Repositories | Git-tracked markdown files edited by memory sub-agents. Tight coupling to Letta runtime (v0.15+). | Closest analogy — but Letta repos are runtime-locked. ShelfAI is framework-agnostic and adds intra-file optimization (chunking, abstracts) that context repos don't attempt. |
| Mem0 | User-level memory (preferences, history). YC-backed, $24M raised. | Complementary. Mem0 remembers who the user is. ShelfAI structures what the agent knows how to do. |
| Zep | Temporal knowledge graphs for conversation history. | Complementary. Zep tracks conversational state over time. ShelfAI optimizes the operational docs the agent loads per task. |
| Cognee | Enterprise knowledge graphs with schema enforcement. €7.5M raised. | Different layer. Cognee builds structured knowledge graphs from unstructured data. ShelfAI structures the agent's own skill/memory files for selective loading. |
| QMD | Local search (BM25 + vector + LLM reranking) over markdown files. | Direct integration. QMD finds the right file; ShelfAI makes what's inside that file retrieval-optimized. |
| dotMD | Flattens a codebase into a single markdown file for context injection. | Different problem. dotMD is a snapshot tool. ShelfAI is an ongoing optimization layer with a learning loop. |
ShelfAI is a new primitive for agent systems. To our knowledge, no other tool optimizes inside the document: retrieval systems find the right file, and memory systems decide what to remember. ShelfAI is the missing layer that restructures agent skill and knowledge files — adding abstracts for smarter routing, semantic chunks for selective loading, and a learning loop that refines both after every session. And it works with any framework, not just one runtime.
**If I update my strict written rules, will the session agent automatically apply those updated rules?**
Yes. The session agent reads your shelf files fresh on every run — it doesn't cache old versions. If you update rules.md or any always-loaded chunk, the next shelfai session invocation picks up those changes immediately. The learning loop may also refine abstracts to better reflect your updated rules over time, but it never overwrites your explicit rule files. You own the source of truth; ShelfAI optimizes around it.
**What if all the chunks are relevant, and what is the additional cost?**
If every chunk is relevant to a task, the agent loads them all — and the total token count is roughly the same as loading the original monolithic file (plus a small overhead for chunk headers, ~2-5%). You don't pay a penalty for full loads. The savings come from the runs where only a subset of chunks are needed, which in practice is the common case. LLM cost for the chunking operation itself is a one-time ~$0.01 per agent file, and the heuristic pre-filter (shelfai chunk) is free. We're working on formal benchmarks — if you run ShelfAI on your own agents, we'd love to see your numbers.
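The arithmetic behind that answer, as a sketch with made-up chunk sizes and a 3% header overhead:

```python
def net_tokens(chunks: dict[str, int], loaded: set[str],
               header_overhead: float = 0.03) -> int:
    """Tokens actually sent: the selected chunk sizes plus ~3% header overhead."""
    subset = sum(size for name, size in chunks.items() if name in loaded)
    return round(subset * (1 + header_overhead))

# Made-up chunk sizes (tokens) for a 9,000-token monolithic agent file
chunks = {"soul": 800, "rules": 600, "blog-article": 3000,
          "tiktok": 2600, "reports": 2000}

full = net_tokens(chunks, set(chunks))                     # worst case: everything relevant
typical = net_tokens(chunks, {"soul", "rules", "tiktok"})  # common case: one task chunk
```

Under these assumed numbers, a full load costs 9,270 tokens versus 9,000 monolithic (a ~3% penalty), while the common case drops to 4,120 (over half saved), which is the shape of where the reported savings come from.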
ShelfAI was built because we needed it. We run a 25-agent content swarm (17 pipeline + 8 oversight) and hit every failure mode:
- **Memory bloat:** 37 memory files, 207 entries with 25 near-duplicates poisoning retrieval. `shelfai compact` cleaned them in one pass.
- **Monolithic configs:** Agent files exceeding 400 lines, burning tokens on irrelevant instructions every run. `shelfai chunk` split them into task-specific modules — 60%+ token reduction using heuristic chunking alone. In a 7-agent pipeline benchmark, selective chunk loading cut context from ~12,800 tokens to ~5,100 per run.
| Command | Description |
|---|---|
| `shelfai init` | Initialize a new shelf |
| `shelfai add <file>` | Add a file or URL to the shelf |
| `shelfai index` | Build/rebuild the index (generate abstracts) |
| `shelfai session <file>` | Run session agent on a transcript |
| `shelfai search <query>` | Test abstract matching |
| `shelfai status` | Show shelf health and stats |
| `shelfai prune` | Clean up stale memory entries |
| `shelfai export` | Export shelf as a single file |
| `shelfai chunk-scan <dir>` | Scan agents directory for chunking candidates |
| `shelfai chunk <file>` | Run heuristic pre-filter on a monolithic agent file |
| `shelfai compact` | Consolidate memory files (dedup, archive stale) |
| `shelfai review` | List or approve staged new-context proposals |
| Component | Cost |
|---|---|
| ShelfAI | $0 + ~$0.02/session for LLM calls |
| QMD | $0 — fully local |
| Honcho | $0 — open source (or hosted) |
| Supermemory | Free tier or $19/mo |
- **Files beat databases for human-scale knowledge.** If you can `ls` it, you understand it.
- **Agents write better indexes.** The thing that uses the context should write the retrieval abstracts.
- **Transparency beats magic.** When retrieval fails, open the file and read why.
- **Zero infrastructure is the default.** Scale up when you need to, not because your tools demand it.
v0.2.0-alpha (Experimental)
Core Features:
- Core CLI (init, add, index, session, search, status, export, prune, review)
- Session management agent (5-stage pipeline, schema validation, backups)
- Auto-indexing with LLM providers (Anthropic, OpenAI)
- Production hardening (path traversal protection, file locking, retry logic)
- Agent file chunking (chunk-scan + chunk commands, two-layer architecture)
- Memory compaction (heuristic dedup, stale archival, placeholder cleanup)
- Integrations: Claude skill, Hermes Agent (example), OpenClaw (example)
- 163 tests passing ✓ Apache 2.0 licensed
Roadmap:
- `shelfai register --qmd` (one-command QMD setup)
- MCP server implementation
- Watch mode (auto-index on file changes)
- Shelf templates (customer support, content production, analysis, sales)
- Interactive chunk classification (`shelfai chunk --interactive`) — user-guided section tagging with saved config
We welcome contributions. ShelfAI is early-stage and there's a lot of surface area.
Areas where help is especially welcome: shelf templates for specific domains, real-world case studies, integration examples for your framework, benchmarks (token cost reductions, retrieval quality), and the shelfai register --qmd CLI command.
See CONTRIBUTING.md for guidelines. If you want to contribute before formal guidelines are up, just open an issue — we're friendly.
Apache License 2.0 — see LICENSE for details.
