agentmemory v0.6.0 — Scale & Cross-Session Evaluation

Date: 2026-03-18T07:45:03.529Z

Platform: darwin arm64, Node v20.20.0

1. Scale: agentmemory vs Built-in Memory

Built-in agent memory (CLAUDE.md, .cursorrules, Cline's memory-bank) loads all stored memory into the context window at the start of every session. agentmemory instead searches the corpus and returns only the most relevant results (the top 10 in these tests).

Observations	Sessions	Index Build	BM25 Search	Hybrid Search	Heap	Context Tokens (built-in)	Context Tokens (agentmemory)	Savings	Built-in Unreachable
240	30	177ms	0.112ms	0.63ms	9MB	10,504	1,924	82%	17%
1,000	125	155ms	0.317ms	1.709ms	6MB	43,834	1,969	96%	80%
5,000	625	810ms	1.496ms	8.58ms	25MB	220,335	1,972	99%	96%
10,000	1,250	1,657ms	3.195ms	17.49ms	1MB	440,973	1,974	100%	98%
50,000	6,250	9,182ms	22.827ms	108.722ms	316MB	2,216,173	1,981	100%	100%
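To make the "BM25 Search" column concrete, here is a minimal sketch of BM25 ranking with a top-k cutoff. It is not agentmemory's implementation: the tokenizer, the `k1` and `b` defaults, the function names, and the sample observations are all assumptions chosen for illustration.

```typescript
// Minimal BM25 top-k sketch. Illustrative only, not agentmemory's code.
type Doc = { id: string; text: string };
type Hit = { id: string; score: number };

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function bm25TopK(docs: Doc[], query: string, k: number = 10, k1: number = 1.2, b: number = 0.75): Hit[] {
  const indexed = docs.map((d) => ({ id: d.id, tokens: tokenize(d.text) }));
  const totalLen = indexed.reduce((sum, d) => sum + d.tokens.length, 0);
  const avgLen = totalLen / Math.max(1, indexed.length);

  // Document frequency: how many documents contain each term at least once.
  const df: Record<string, number> = Object.create(null);
  for (const d of indexed) {
    const seen: Record<string, boolean> = Object.create(null);
    for (const term of d.tokens) {
      if (seen[term]) continue;
      seen[term] = true;
      df[term] = (df[term] ?? 0) + 1;
    }
  }

  const queryTerms = tokenize(query);
  const hits: Hit[] = indexed.map((d) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = d.tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      const n = df[term] ?? 0;
      const idf = Math.log(1 + (indexed.length - n + 0.5) / (n + 0.5));
      const lengthNorm = 1 - b + (b * d.tokens.length) / avgLen;
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return { id: d.id, score };
  });

  // Drop non-matching documents, sort best-first, keep only the top k.
  return hits.filter((h) => h.score > 0).sort((x, y) => y.score - x.score).slice(0, k);
}

// Hypothetical observations, loosely modelled on the test queries in this report.
const observations: Doc[] = [
  { id: "obs_1", text: "Configured OAuth providers for GitHub and Google login" },
  { id: "obs_2", text: "Fixed N+1 query in the orders endpoint" },
  { id: "obs_3", text: "Migrated ESLint to flat config" },
];
const results = bm25TopK(observations, "How did we set up OAuth providers?", 10);
```

Only the matching observation is returned, so the tokens handed to the agent depend on `k` rather than on how many observations are stored.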

What the numbers mean

Context Tokens (built-in): The number of tokens Claude Code, Cursor, or Cline would consume by loading all memory into the context window. At 5,000 observations this is ~220K tokens, more than a 200K-token context window can hold.

Context Tokens (agentmemory): The number of tokens consumed by the top-10 search results. This stays nearly constant (1,924 to 1,981 tokens) regardless of corpus size.

Built-in Unreachable: The percentage of memories that built-in systems cannot access because they fall beyond the 200-line MEMORY.md cap or the context window limit. At 1,000 observations, 80% of the project history is invisible.
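The report does not state the formulas behind the Savings and Built-in Unreachable columns. The sketch below is one plausible reading that reproduces every row of the table: savings as the token reduction relative to loading everything, and unreachable as the share of observations past the 200-line cap. It assumes one observation per line; the function names are hypothetical.

```typescript
// Assumed formulas (not given in the report) that reproduce the table's last two columns.
const LINE_CAP = 200; // MEMORY.md line cap cited in the report

function savingsPct(builtinTokens: number, agentmemoryTokens: number): number {
  return Math.round(100 * (1 - agentmemoryTokens / builtinTokens));
}

function unreachablePct(observationCount: number, lineCap: number = LINE_CAP): number {
  // Assumes one observation per line, so everything past the cap is invisible.
  if (observationCount <= lineCap) return 0;
  return Math.round((100 * (observationCount - lineCap)) / observationCount);
}

// Token counts copied from the scale table above.
const scaleRows = [
  { observations: 240, builtin: 10504, agentmemory: 1924 },
  { observations: 1000, builtin: 43834, agentmemory: 1969 },
  { observations: 5000, builtin: 220335, agentmemory: 1972 },
  { observations: 10000, builtin: 440973, agentmemory: 1974 },
  { observations: 50000, builtin: 2216173, agentmemory: 1981 },
];

const derived = scaleRows.map((r) => ({
  observations: r.observations,
  savings: savingsPct(r.builtin, r.agentmemory),
  unreachable: unreachablePct(r.observations),
}));
```

Because both columns are rounded to whole percentages, the 100% entries mean "above 99.5%", not that literally every token or memory is accounted for.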

Storage Costs

Observations	BM25 Index	Vector Index (d=384)	Total Storage
240	395 KB	494 KB	0.9 MB
1,000	1,599 KB	2,060 KB	3.6 MB
5,000	8,006 KB	10,298 KB	17.9 MB
10,000	16,005 KB	20,596 KB	35.7 MB
50,000	80,126 KB	102,979 KB	178.8 MB
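The storage figures grow almost linearly with corpus size, which makes a rough estimate for other sizes straightforward. The sketch below recomputes the Total Storage column from the two index columns and derives a per-observation cost. It assumes 1 MB = 1,024 KB, which is the conversion that matches the table; the helper names are hypothetical.

```typescript
// Index sizes in KB, copied from the storage table above.
const storageRows = [
  { observations: 240, bm25KB: 395, vectorKB: 494 },
  { observations: 1000, bm25KB: 1599, vectorKB: 2060 },
  { observations: 5000, bm25KB: 8006, vectorKB: 10298 },
  { observations: 10000, bm25KB: 16005, vectorKB: 20596 },
  { observations: 50000, bm25KB: 80126, vectorKB: 102979 },
];

function roundTo1(value: number): number {
  return Math.round(value * 10) / 10;
}

// Assumes 1 MB = 1024 KB, which reproduces the report's totals.
function totalMB(bm25KB: number, vectorKB: number): number {
  return roundTo1((bm25KB + vectorKB) / 1024);
}

function perObservationKB(bm25KB: number, vectorKB: number, observationCount: number): number {
  return roundTo1((bm25KB + vectorKB) / observationCount);
}

const totals = storageRows.map((r) => totalMB(r.bm25KB, r.vectorKB));
const perObs = storageRows.map((r) => perObservationKB(r.bm25KB, r.vectorKB, r.observations));
```

Every corpus size works out to about 3.7 KB per observation. The vector index alone averages about 2.06 KB per observation, more than the 1,536 bytes that 384 raw float32 values would need, so it carries some encoding or metadata overhead that the report does not break down.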

2. Cross-Session Retrieval

Can the system find relevant information from past sessions? Built-in memory cannot once the target observation falls outside the line or context cap.

Query	Target Session	Gap	BM25 Found	BM25 Rank	Hybrid Found	Hybrid Rank	Built-in Visible
How did we set up OAuth providers?	ses_005-009	24	Yes	#1	Yes	#1	Yes
What was the N+1 query fix?	ses_010-014	18	Yes	#1	Yes	#2	Yes
PostgreSQL full-text search setup	ses_010-014	17	Yes	#1	Yes	#1	Yes
bcrypt password hashing configuration	ses_005-009	20	Yes	#1	Yes	#1	Yes
Vitest unit testing setup	ses_020-024	9	Yes	#1	Yes	#1	Yes
webhook retry exponential backoff	ses_015-019	14	Yes	#1	Yes	#1	Yes
ESLint flat config migration	ses_000-004	29	Yes	#1	Yes	#1	Yes
Kubernetes HPA autoscaling configuration	ses_025-029	4	Yes	#1	Yes	#1	No
Prisma database seed script	ses_010-014	16	Yes	#1	Yes	#1	Yes
API cursor-based pagination	ses_015-019	14	Yes	#1	Yes	#1	Yes
CSRF protection double-submit cookie	ses_005-009	24	Yes	#1	Yes	#1	Yes
blue-green deployment rollback	ses_025-029	4	Yes	#1	Yes	#1	No

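Summary: agentmemory BM25 found the target for 12/12 cross-session queries, all at rank #1. Hybrid search also found 12/12, with 11 at rank #1 and one at rank #2. Built-in memory (200-line cap) could reach only 10/12; both misses target the most recent sessions (ses_025-029).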
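The Built-in Visible column can be reproduced from the 200-line cap under a few assumptions that the report does not spell out: the test uses the 240-observation corpus, observations are spread evenly across the 30 sessions (8 each), they are appended oldest-first with one observation per line, the built-in file keeps only its first 200 lines, and Gap is 29 minus the target session index. The function name is hypothetical.

```typescript
// Sketch of why the two ses_025-029 queries are invisible to a 200-line file.
// All defaults below are assumptions inferred from the tables, not stated by the report.
function builtinVisible(
  gap: number,
  totalSessions: number = 30,
  totalObservations: number = 240,
  lineCap: number = 200,
): boolean {
  const sessionIndex = totalSessions - 1 - gap; // gap 29 maps to ses_000, gap 4 to ses_025
  const perSession = totalObservations / totalSessions; // 8 observations per session
  const lastLineOfSession = (sessionIndex + 1) * perSession;
  return lastLineOfSession <= lineCap;
}

// Gap values from the cross-session table, in row order.
const gaps = [24, 18, 17, 20, 9, 14, 29, 4, 16, 14, 24, 4];
const visibleCount = gaps.filter((g) => builtinVisible(g)).length;
```

Under these assumptions sessions 0 through 24 fill lines 1 to 200 exactly, so anything recorded from session 25 onward is cut off, which matches the 10/12 result.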

3. The Context Window Problem

Agent context window: ~200K tokens
System prompt + tools:  ~20K tokens
User conversation:      ~30K tokens
Available for memory:  ~150K tokens

At 50 tokens/observation:
  200 observations  =  10,000 tokens  (fits, but 200-line cap hits first)
  1,000 observations =  50,000 tokens  (33% of available budget)
  5,000 observations = 250,000 tokens  (EXCEEDS total context window)

agentmemory top-10 results:
  Any corpus size     =  ~1,924 tokens  (~1.3% of available budget)
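The budget arithmetic above can be checked with a few lines. This is a sketch using the report's round numbers (a 200K window, 20K for system prompt and tools, 30K for conversation, 50 tokens per observation); the function names are hypothetical.

```typescript
// Tokens left for memory after fixed overheads, using the report's round figures.
function memoryBudget(
  contextWindow: number = 200000,
  systemAndTools: number = 20000,
  conversation: number = 30000,
): number {
  return contextWindow - systemAndTools - conversation;
}

// Share of the memory budget consumed, as a percentage with one decimal place.
function budgetSharePct(tokens: number, budget: number): number {
  return Math.round((1000 * tokens) / budget) / 10;
}

const TOKENS_PER_OBSERVATION = 50; // the report's estimate
const budget = memoryBudget(); // 150000

const shareAt1000 = budgetSharePct(1000 * TOKENS_PER_OBSERVATION, budget); // 33.3
const shareAt5000 = budgetSharePct(5000 * TOKENS_PER_OBSERVATION, budget); // 166.7
const shareTop10 = budgetSharePct(1924, budget); // 1.3
```

At 5,000 observations the built-in approach needs about 167% of the available budget, while the top-10 results take about 1.3% of it.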

4. What Built-in Memory Cannot Do

Capability	Built-in (CLAUDE.md)	agentmemory
Semantic search	No (keyword grep only)	BM25 + vector + graph
Scale beyond 200 lines	No (hard cap)	No hard cap (tested to 50,000 observations)
Cross-session recall	Only if in 200-line window	Full corpus search
Cross-agent sharing	No (per-agent files)	MCP + REST API
Multi-agent coordination	No	Leases, signals, actions
Temporal queries	No	Point-in-time graph
Memory lifecycle	No (manual pruning)	Ebbinghaus decay + eviction
Knowledge graph	No	Entity extraction + traversal
Query expansion	No	LLM-generated reformulations
Retention scoring	No	Time-frequency decay model
Real-time dashboard	No (read files manually)	Viewer on :3113
Concurrent access	No (file lock)	Keyed mutex + KV store
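The table names Ebbinghaus decay and a time-frequency decay model but does not describe them. As a rough illustration of the general idea only, the sketch below uses the classic forgetting curve R = exp(-t / S), where stability S grows with how often a memory is accessed. The stability formula, the 7-day base, the 0.1 eviction threshold, and all names are assumptions, not agentmemory's actual scoring.

```typescript
// Illustrative forgetting-curve sketch. Not agentmemory's retention model.
type MemoryItem = { id: string; ageDays: number; accessCount: number };

// R = exp(-t / S): retention falls with age t and falls more slowly for larger stability S.
function retention(item: MemoryItem, baseStabilityDays: number = 7): number {
  // Assumed: stability grows logarithmically with access frequency.
  const stability = baseStabilityDays * (1 + Math.log(1 + item.accessCount));
  return Math.exp(-item.ageDays / stability);
}

// Keep only items whose retention is at or above the threshold; the rest are evicted.
function keepRetained(items: MemoryItem[], threshold: number = 0.1): MemoryItem[] {
  return items.filter((item) => retention(item) >= threshold);
}

const sampleItems: MemoryItem[] = [
  { id: "fresh", ageDays: 0, accessCount: 0 },
  { id: "old-unused", ageDays: 30, accessCount: 0 },
  { id: "old-but-used", ageDays: 30, accessCount: 20 },
];
const survivors = keepRetained(sampleItems).map((item) => item.id);
```

In this toy model a 30-day-old memory that was never accessed is evicted, while one of the same age that was accessed 20 times survives, which is the behaviour a time-and-frequency model is meant to capture.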

5. When to Use What

Use built-in memory (CLAUDE.md) when:

You have < 200 items to remember
Single agent, single project
Preferences and quick facts only
Zero setup is the priority

Use agentmemory when:

Project history exceeds 200 observations
You need to recall specific incidents from weeks ago
Multiple agents work on the same codebase
You want semantic search ("how does auth work?"), not just keyword matching
You need to track memory quality, decay, and lifecycle
You want a shared memory layer across Claude Code, Cursor, Windsurf, etc.

Built-in memory is your sticky notes. agentmemory is the searchable database behind them.

Scale tests: 5 corpus sizes. Cross-session tests: 12 queries targeting specific past sessions.
