Brain

A memory database for AI agents. Stores four record types — Memory, Entity, Statement, Relation — with explicit provenance, confidence, and bi-temporal validity. Fused retrieval (semantic + lexical + entity-graph + temporal) combined with weighted rank fusion. One Rust core, one wire protocol, one schema. Apache 2.0.

$ brain-server --config config/dev.toml
─────────────────────────────────────────────────────────────────────────────
  ◉ brain-server  v0.1.0  ·  listening 127.0.0.1:9090 (wire) · :9092 (admin)
  ◉ p99 recall 4.2ms  ·  WAL synced  ·  HNSW warm
  ◉ schema: brain:core + acme:sales (2 namespaces, 14 types)
─────────────────────────────────────────────────────────────────────────────

# Brain ships no client. Any language speaks the binary wire protocol (§04)
# directly — CBOR payloads, documented per-opcode field schemas. Conceptually:

ENCODE  "Had a difficult conversation with Alex about the project"
  → ENCODED                                          LSN 1 · s1/m1/v1 ·  9 ms

RECALL  "conflicts with Alex"  top_k=5
  # → ranked by semantic similarity, edge proximity, recency, and salience
  #   — not just vector distance.

# RECALL always fuses semantic + lexical + entity-graph via RRF — one
# read path. A declared schema adds entity-anchored graph traversal and
# typed enrichment on each hit.
RECALL  "what's Priya working on?"  include_graph=true

Why Brain

Today's agent stacks duct-tape four or five storage systems: a vector database for similarity, a graph database for relationships, a full-text store for keyword matching, an LLM extraction pipeline, plus an orchestration layer that pretends to keep them consistent. Half of that orchestration is reinventing transaction semantics across systems that don't agree on what "committed" means.

Brain collapses the stack into one Rust core with one wire protocol and one schema:

Cognitive verbs, not CRUD. encode / recall / plan / reason / forget are the primitive operations.
Fused retrieval out of the box. Three retrievers (semantic / lexical / graph) combined via weighted RRF — not "top-k by cosine."
Provenance is first-class. Every typed claim carries an evidence list back to source memories, plus four bi-temporal timestamps.
Predictable tail latency. Thread-per-core (Glommio + io_uring), single-writer-per-shard, lock-free reads, group-commit WAL.
Apache 2.0, end to end. No premium edition, no SaaS lock-in. The typed graph, the extractor pipeline, the reranker, and the schema DSL are all in the open repo.

The architectural justification, the five design wedges, and the comparison with adjacent systems (Pinecone, Qdrant, Neo4j, Mem0, Letta, Zep) are in spec/01_architecture/.

What Brain stores

Four record types, one database:

Record	What it is	Example
Memory	Raw experience — text + 384-dim embedding + salience + edges + provenance	`"Alex pushed the deadline to next Friday"`
Entity	Canonical noun with a stable UUIDv7 identity, alias list, and typed attributes	`Person(canonical_name="Alex Chen", aliases=["Alex"])`
Statement	Typed claim about entities — `Fact` / `Preference` / `Event` — with confidence and bi-temporal validity	`Event(subject=alex, predicate=pushed, object="deadline to Friday", valid_from=t0, confidence=0.92)`
Relation	Typed binary edge between entities, with cardinality and evidence	`reports_to(alex, priya)`

Entities, Statements, and Relations are derived from Memories by a three-tier extractor pipeline (pattern → GLiNER classifier → LLM with prompt cache) when a schema is declared.

The full data model is in spec/02_data_model/.

Schemaless vs schema-declared

The same Brain binary serves both modes; the runtime gate is whether the per-shard SCHEMA_ACTIVE_VERSIONS_TABLE is empty.

Mode	What's active	Use it when
Schemaless (default)	Memory record only. `ENCODE`, `RECALL`, `PLAN`, `REASON`, `FORGET`, `SUBSCRIBE`, `TXN_*` over vector memory.	Prototyping; semantic memory without typed-graph overhead; small agents.
Schema-declared	All of the above + typed extraction + entity/statement/relation writes + entity-anchored retrieval over the typed graph.	Production agents that need provenance, temporal reasoning, supersession, or entity-anchored queries.

A deployment can move in either direction. Declaring a schema after months of schemaless use kicks off a backfill. Schema is not a legacy or minimal mode — both postures are first-class.

The DSL is documented in spec/03_schema/. Example:

namespace acme

define entity_type Person {
    attributes {
        email:    text optional unique
        team:     text optional
        timezone: text optional
    }
}

define predicate prefers {
    kind:   Preference
    object: Value<text>
}

define relation_type reports_to {
    from:        Person
    to:          Person
    cardinality: many-to-one
}

define extractor preferences {
    kind:                 llm
    target:               statement Preference
    trigger:              on encode where memory.kind = episodic
    model:                "gpt-4o-mini"
    confidence_threshold: 0.7
    cache:                enabled
}

Quickstart

Requires: Docker, @devcontainers/cli (npm install -g @devcontainers/cli).

git clone https://github.com/brain-db-io/brain-db
cd brain-db
just docker-up            # builds image, starts container, runs post-create
just docker-shell         # bash inside the dev container

Inside the container:

just verify                                            # fmt + build + clippy + test
cargo run --bin brain-server -- --config config/dev.toml   # the database
curl -s http://127.0.0.1:9092/v1/stats                 # admin via curl (loopback)

One binary:

brain-server — the database. Binary wire protocol on the data port; a loopback HTTP admin listener (stats, snapshots, audit, worker control) reachable with curl. Brain ships no client/SDK/CLI — speak the wire protocol from any language; a brainctl migration tool is future work.

A persistent agent identity is opt-in; a client supplies its agent_id at handshake, and by default the server mints an ephemeral one per connection if none is given.

Memory is isolated per agent. RECALL returns only the calling agent's own memories by default — one tenant never sees another's. Cross-agent reads are explicit: a request scopes to a named agent set, or opts into the shared view. Each hit carries its owning agent_id so provenance is always legible.

Cognitive operations

The verbs that drive Brain. Full semantics are in spec/05_operations/.

Verb	What it does
ENCODE	Store an experience. Embeds the text, picks a slot, writes the WAL record, updates metadata, inserts into HNSW. With a schema declared, queues extractors.
RECALL	Find memories relevant to a cue. Fans out to the semantic, lexical, and graph retrievers and fuses them by RRF — one read path. A declared schema adds entity-anchored typed-graph enrichment.
PLAN	Construct a path from one cognitive state to another. Pull-based executor with budgets (steps, wall time, branches).
REASON	Multi-hop traversal explaining why X is connected to Y. Returns the path, evidence memories, and confidence.
FORGET	Soft (mark + grace period) or hard (zero the slot) tombstoning. Cascades to derived typed-graph records when a schema is active.
LINK / UNLINK	Manually assert / retract a typed edge between two memories.
SUBSCRIBE	Stream events: memory created, statement created, extractor failed, schema updated, etc.
TXN_BEGIN / TXN_COMMIT / TXN_ABORT	Group multiple operations into one atomic unit.

One-shot mode (each invocation runs a single verb and exits):

brain encode "Alex pushed the deadline to next Friday"
brain recall "when did Alex change the deadline?" --top-k 5 --include-text
brain plan "current sprint state" "feature shipped" --max-steps 8
brain forget s1/m18/v1 --mode soft

Or the same inside the REPL — no brain prefix:

brain> encode "Alex pushed the deadline to next Friday"
brain> recall "when did Alex change the deadline?" --top-k 5
brain> reason "Alex changed the deadline" --depth 3
brain> subscribe --kind episodic --collect 10

Encoding the same content twice is a no-op by default — pass --allow-duplicate to write a fresh copy.

Architecture in 30 seconds

┌─────────────────────────────────────────────────────────────────────────────┐
│            CLIENTS (any language — speak the wire protocol directly)         │
└────────────────────────────────────┬────────────────────────────────────────┘
                                     │ custom binary protocol over TCP
                                     │ CBOR structured payloads + raw LE-f32 vectors
┌────────────────────────────────────▼────────────────────────────────────────┐
│  CONNECTION LAYER · Tokio · accept · TLS · frame validate · shard dispatch  │
└────────────────────────────────────┬────────────────────────────────────────┘
                                     │ message channels, one per shard
                ┌────────────────────┼────────────────────┐
                ▼                    ▼                    ▼
        ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
        │  Shard 0     │     │  Shard 1     │     │  Shard N     │
        │  Glommio +   │     │  Glommio +   │     │  Glommio +   │
        │  io_uring    │     │  io_uring    │     │  io_uring    │
        │              │     │              │     │              │
        │  ┌────────┐  │     │  ┌────────┐  │     │  ┌────────┐  │
        │  │ arena  │  │     │  │ arena  │  │     │  │ arena  │  │
        │  │ WAL    │  │     │  │ WAL    │  │     │  │ WAL    │  │
        │  │ redb   │  │     │  │ redb   │  │     │  │ redb   │  │
        │  │ HNSW×3 │  │     │  │ HNSW×3 │  │     │  │ HNSW×3 │  │
        │  │ tantvy │  │     │  │ tantvy │  │     │  │ tantvy │  │
        │  └────────┘  │     │  └────────┘  │     │  └────────┘  │
        │              │     │              │     │              │
        │ Single writer per shard. Lock-free reads via ArcSwap + crossbeam.   │
        └──────┬───────┘     └──────┬───────┘     └──────┬───────┘
               │                    │                    │
               └────────────────────┼────────────────────┘
                                    ▼
                    BACKGROUND WORKERS (per-shard, dedicated cores)
                    decay · consolidation · HNSW maintenance · GC
                    schema-declared: extractors · text indexer · sweepers

Two runtimes, one host. Connection layer on Tokio (many tasks, accept TCP, decode 32-byte frame, dispatch). Shard layer on Glommio (thread-per-core, io_uring, single writer per shard). The two communicate via channels carrying messages — per-shard data never crosses the boundary.

Six data structures per shard:

Structure	Role	Spec
Arena	mmap'd file of 1600-byte slots (1536 vector + 64 metadata/padding)	`spec/08_storage/01_arena.md`
WAL	Per-shard append-only log; O_DIRECT + `pwritev2(RWF_DSYNC)` group commit	`spec/08_storage/02_wal.md`
redb	Embedded ACID B-tree for metadata + typed-graph tables	`spec/10_metadata/02_table_layout.md`
HNSW × 3	Memory `M=16, ef_c=200, ef_s=64`; Entity `M=16, ef_c=100, ef_s=64`; Statement `M=32, ef_c=200, ef_s=128`	`spec/09_indexing/01_hnsw_basics.md`
tantivy × 2	BM25 over memory text + statement text	`spec/10_metadata/06_tantivy_layout.md`
LLM cache	Separate redb for extractor responses with TTL	`spec/11_extractors/06_prompt_caching.md`

Seven non-negotiable invariants (from spec/08_storage/00_purpose.md):

WAL-before-acknowledge — no operation returns success until its WAL record is fsynced.
Single writer per shard — no locks needed; the discipline enforces it.
CRC everywhere — every WAL record, every arena slot. Reads verify; mismatches halt.
Slot version on MemoryId — encoded in the ID; stale references → NotFound.
Idempotency by RequestId — same params → cached response; different params → Conflict.
Tombstone grace before reclamation — default 7 days. Hard FORGET zeroes immediately.
No silent corruption — fail-stop and alert. Never return wrong data.

Tested per spec/19_benchmarks/01_correctness_and_durability.md.

For the layered architecture diagram (seven internal layers from L1 connection through L7 sharding) and the full design wedges, see spec/01_architecture/04_layers.md and spec/01_architecture/07_wedges_and_roadmap.md.

Performance targets

Hard targets from spec/01_architecture/05_hardware_and_targets.md §7 and spec/19_benchmarks/02_performance_targets.md. Single shard, warm, reference hardware (16-core x86_64 / 64 GB RAM / NVMe SSD):

Operation	p50	p99
`ENCODE` (text, CPU embedding)	≤ 12 ms	≤ 25 ms
`ENCODE` (text, GPU embedding)	≤ 3 ms	≤ 8 ms
`ENCODE_VECTOR_DIRECT` (pre-supplied vector)	≤ 1 ms	≤ 5 ms
`RECALL` (top-k = 10, schemaless)	≤ 8 ms	≤ 20 ms
`RECALL` (top-k = 10, schema-declared, +graph)	≤ 10 ms	≤ 50 ms
`FORGET`	≤ 3 ms	≤ 10 ms
`PLAN` (simple)	≤ 50 ms	≤ 200 ms
`REASON`	≤ 100 ms	≤ 500 ms

Brain optimizes for predictable tails, not minimum averages. The combined acceptance suite at spec/19_benchmarks/06_complete_acceptance.md is the v1.0 release gate.

Status

Pre-release (v0.1.0). No external users. The wire protocol, redb tables, and schema model are still in flux. Until v1.0 ships, breaking changes happen in place without back-compat shims.

The v1.0 release ships when the combined acceptance suite passes — functional, performance, storage, operational, and schemaless mode tests, end-to-end.

For the high-level milestone index, see ROADMAP.md; the per-phase landing record is in the git history.

Documentation

Topic	Location
Specification (148 files, 20 sections, normative)	`spec/`
Spec entry point + glossary + doc map	`spec/00_overview/`
System architecture + design wedges	`spec/01_architecture/`
Data model (Memory / Entity / Statement / Relation)	`spec/02_data_model/`
Wire protocol (frames + opcodes + handshake)	`spec/04_wire_protocol/`
Schema DSL grammar	`spec/03_schema/`
Acceptance gate for v1.0	`spec/19_benchmarks/06_complete_acceptance.md`
Roadmap (milestone index)	`ROADMAP.md`
Client SDKs (Rust / Python / TS) + interactive shell	`brain-sdk`, `brain-shell` — separate repos
Evaluation, perf/scale-run, soak & acceptance harness	`brain-eval` — separate repo

Repository layout

brain/
├── crates/
│   ├── brain-core/         Shared types: MemoryId, EdgeKind, Error, EntityId, ...
│   ├── brain-protocol/     Wire protocol: frame, opcodes, codec, schema DSL parser
│   ├── brain-storage/      Arena + WAL + recovery
│   ├── brain-metadata/     redb wrapper: memory + entity + statement + relation tables
│   ├── brain-index/        HNSW × 3 + tantivy
│   ├── brain-embed/        BGE embedding service
│   ├── brain-rerank/       Cross-encoder reranker (bge-reranker-base)
│   ├── brain-planner/      Query planner + executor
│   ├── brain-ops/          One write path + retrievers + extractor writes
│   ├── brain-workers/      Background workers (decay, consolidation, extractors, …)
│   ├── brain-extractors/   Pattern + classifier extractors
│   ├── brain-llm/          LLM client + cache + budget
│   ├── brain-plugins/      Plugin surface (enricher + connector)
│   ├── brain-http/         HTTP transport for the admin listener
│   └── brain-server/       Server binary
├── spec/                   The 148-file specification (authoritative)
└── ROADMAP.md              Milestone index

Tech stack

Pinned in the workspace Cargo.toml. New dependencies require commit-message justification.

Component	Crate
Async runtime (shards)	`glommio` — thread-per-core, `io_uring`
Async runtime (connection layer)	`tokio`
Wire encoding	`ciborium` (CBOR) + raw little-endian `f32` vectors
Internal storage encoding	`rkyv` + `bytemuck`
Metadata store	`redb`
ANN index	`hnsw_rs`
Lexical index	`tantivy`
Embedding inference	`candle` + `tokenizers`
SIMD math	`matrixmultiply` + `wide`
Lock-free swap	`arc-swap`
Epoch GC	`crossbeam-epoch`
CRC	`crc32c`
UUIDs (v7)	`uuid`
Errors	`thiserror` + `anyhow`
Telemetry	`tracing` + `opentelemetry`

Platform support

Linux only. Kernel ≥ 5.15 (for stable io_uring). macOS and Windows are not supported; use the supplied dev container for local development on those platforms.

Brain depends on Linux-specific I/O facilities: io_uring, O_DIRECT, madvise(MADV_RANDOM | MADV_DONTDUMP), fallocate(FALLOC_FL_KEEP_SIZE). Abstracting these would either leak platform differences in tail latency or bloat the codebase with multiple backends. For a system whose value proposition is latency, one optimized backend wins.

CPU: x86_64 with SSE 4.2 or ARM64 with the CRC32 extension. AVX2 / NEON used opportunistically. Full hardware envelope in spec/01_architecture/05_hardware_and_targets.md.

Contributing

Brain is pre-release. The wire protocol, on-disk formats, and schema model still change without back-compat shims. Until v1.0:

Spec changes go through the project owner. Code disagreements with the spec are fixed by changing the code.
The seven invariants in spec/08_storage/00_purpose.md are non-negotiable.
New dependencies require commit-message justification; the approved set is in Cargo.toml.

CI (.github/workflows/ci.yml) is the authoritative test gate. Run just verify locally before pushing — it does fmt + build + clippy -D warnings + test.

By submitting a pull request, you agree your contribution is licensed under the Apache-2.0 terms (per Apache-2.0 §5).

License

Apache-2.0. Source code, spec, and documentation are all under the same license.

Repository: https://github.com/brain-db-io/brain-db

Name		Name	Last commit message	Last commit date
Latest commit History 585 Commits
.cargo		.cargo
.devcontainer		.devcontainer
.github/workflows		.github/workflows
assets		assets
config		config
crates		crates
fuzz		fuzz
spec		spec
.dockerignore		.dockerignore
.gitignore		.gitignore
AUTONOMY.md		AUTONOMY.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
ROADMAP.md		ROADMAP.md
SECURITY.md		SECURITY.md
clippy.toml		clippy.toml
docker-compose.yml		docker-compose.yml
justfile		justfile
rust-toolchain.toml		rust-toolchain.toml
rustfmt.toml		rustfmt.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Brain

Table of contents

Why Brain

What Brain stores

Schemaless vs schema-declared

Quickstart

Cognitive operations

Architecture in 30 seconds

Performance targets

Status

Documentation

Repository layout

Tech stack

Platform support

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Brain

Table of contents

Why Brain

What Brain stores

Schemaless vs schema-declared

Quickstart

Cognitive operations

Architecture in 30 seconds

Performance targets

Status

Documentation

Repository layout

Tech stack

Platform support

Contributing

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages