A local-first AI memory system for people who run things.
Your knowledge. Your hardware. Instantly searchable.
Getting Started | How It Thinks | Architecture | FAQ | Contributing
Echo is built like a brain. Each component handles a different cognitive function:
graph TD
GW["CORTEX<br/><b>Inference Gateway</b><br/>Thinks with your memories"]:::cortex
B["BROCA'S AREA<br/><b>Chat Adapters</b><br/>Telegram · Discord · CLI"]:::language
W["WERNICKE'S AREA<br/><b>Context Builder</b><br/>Memory search · Relevance scoring"]:::language
A["AUDITORY CORTEX<br/><b>Voice Pipeline</b><br/>Whisper → Memory"]:::perception
H["HIPPOCAMPUS<br/><b>Memory Server</b><br/>SQLite + FTS5 · Always on"]:::memory
CB["CEREBELLUM<br/><b>Health + Tracking</b><br/>Token costs · Uptime"]:::support
BS["BRAIN STEM<br/><b>Core Infrastructure</b><br/>The foundation everything plugs into"]:::support
B --> GW
W --> GW
A --> H
GW --> H
CB -.-> H
H --> BS
classDef cortex fill:#c0392b,stroke:#922b21,color:#fff
classDef memory fill:#27ae60,stroke:#1e8449,color:#fff
classDef language fill:#2980b9,stroke:#2471a3,color:#fff
classDef perception fill:#d68910,stroke:#b9770e,color:#fff
classDef support fill:#7f8c8d,stroke:#616a6b,color:#fff
| Brain Region | Echo Component | What It Does |
|---|---|---|
| Cortex | Inference Gateway | The thinking layer. Searches your memories for context, composes a prompt, sends to the best available LLM, tracks cost. |
| Hippocampus | Memory Server | Long-term memory. Stores and retrieves knowledge using SQLite full-text search. Always on. |
| Broca's Area | Chat Adapters | Language output. Telegram, Discord, CLI -- how you talk to Echo and how Echo talks back. |
| Wernicke's Area | Context Builder | Language comprehension. Figures out which memories are relevant to what you're asking right now. |
| Auditory Cortex | Voice Pipeline | Perception. Transcribes voice memos via Whisper, turns speech into searchable memory. |
| Cerebellum | Health + Tracking | Coordination. Monitors infrastructure health, tracks token costs, keeps everything running. |
| Brain Stem | Core Infrastructure | Vital functions. The always-on foundation everything plugs into. |
This maps directly to Tiago Forte's CODE framework from Building a Second Brain:
CAPTURE --> Hippocampus (Memory Server stores everything)
ORGANIZE --> Tag System (structured metadata on every memory)
DISTILL --> Cortex (Gateway searches for what's relevant)
EXPRESS --> Broca's Area (delivers informed responses)
This is what makes Echo different from a note-taking app. When you ask a question, the Inference Gateway doesn't just forward it to an AI -- it thinks with your memories first:
You: "What did we quote for the Anderson project?"
          │
          ▼
┌─ INFERENCE GATEWAY ───────────────────────────────┐
│                                                   │
│  1. Search memory: "Anderson project quote"       │
│     -> Found 3 relevant memories                  │
│                                                   │
│  2. Compose prompt:                               │
│     System: "You have access to the user's        │
│              persistent memory..."                │
│     Context: [memory #1] [memory #2] [memory #3]  │
│     Question: "What did we quote for Anderson?"   │
│                                                   │
│  3. Route to LLM (OpenAI -> Anthropic -> Local)   │
│                                                   │
│  4. Track: 847 tokens, $0.002, 1.3s latency       │
│                                                   │
└─────────────────────────┬─────────────────────────┘
                          │
                          ▼
"The Anderson project was quoted at $2,400 in Q2.
Based on your notes, that included the base rate
of $1,800 plus $200 per flight of stairs (3 flights)."
Without the gateway, you get a generic AI response. With it, you get an answer grounded in your knowledge.
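Steps 1-4 boil down to a small amount of glue code. A minimal sketch of the prompt-composition step in Python (illustrative only -- the real gateway is written in Go, and the memory shapes and wording here are assumptions):

```python
def compose_prompt(question, memories):
    """Build an LLM prompt grounded in retrieved memories.

    `memories` is a list of memory strings already ranked by relevance
    (an assumed shape, not Echo's actual data model).
    """
    system = "You have access to the user's persistent memory..."
    context = "\n".join(f"[memory #{i + 1}] {m}" for i, m in enumerate(memories))
    return f"System: {system}\nContext:\n{context}\nQuestion: {question}"

# The enriched prompt carries your knowledge into the LLM call.
prompt = compose_prompt(
    "What did we quote for Anderson?",
    ["Anderson quoted $2,400 in Q2", "Base rate $1,800", "$200 per flight of stairs"],
)
```

The point of the pattern: the LLM never sees your whole database, only the handful of memories relevant to this question.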
The fallback chain keeps you online:
Primary Provider (OpenAI)
        │ fails?
        ▼
Fallback Provider (Anthropic)
        │ fails?
        ▼
Local LLM (Ollama / LM Studio)
        │
        ▼
You always get an answer.
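The chain is just ordered try/except. A rough Python sketch of the idea (the actual gateway is Go; provider names and call shapes here are illustrative assumptions):

```python
def ask_with_fallback(prompt, providers):
    """Try each provider in order; return the first successful answer.

    `providers` is an ordered list of (name, call) pairs -- e.g. OpenAI,
    then Anthropic, then a local LLM.
    """
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception:
            continue  # this provider failed; fall through to the next
    raise RuntimeError("all providers failed")

# Hypothetical providers: the primary is down, the local model answers.
def flaky(prompt):
    raise ConnectionError("provider down")

def local_llm(prompt):
    return "answer from local model"

name, answer = ask_with_fallback("hi", [("openai", flaky), ("ollama", local_llm)])
```

Because the local LLM sits at the end of the list, an internet outage degrades quality, not availability.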
You run a business. You have knowledge trapped in:
- Your head
- Voice memos on your phone
- Scattered notes and spreadsheets
- Conversations you can never search
You forget details. You re-explain things to yourself. You lose ideas between the meeting and the desk, between Tuesday and Friday, between this app and that one.
Echo fixes that. It captures your knowledge, stores it on hardware you own, and makes it searchable through whatever you already use -- a terminal, a chat bot, or your voice.
It's not a chatbot. It's not a SaaS product. It's infrastructure -- like having a second brain that actually works.
# Store a memory
$ echo-cli store "Met with Sarah about Q2 campaign -- she wants social media first, then email. Budget is 12K."
Stored. ID: 847
# Search later
$ echo-cli search "sarah campaign"
[847] Met with Sarah about Q2 campaign -- she wants social media first, then email. Budget is 12K.
Tags: client:sarah, type:note | Stored: 2026-01-15

# The gateway searches your memories, enriches the prompt, then calls the LLM
$ curl -X POST http://localhost:3003/api/chat \
-H "Content-Type: application/json" \
-d '{"message": "What pricing decisions have I made this quarter?"}'
{
"response": "Based on your memories, you made 3 pricing decisions this quarter...",
"sources": ["pricing/decision", "strategy/note"],
"tokens": {"input": 512, "output": 234, "cost": 0.003}
}

You: What did we charge for the downtown move?
Echo: Based on your records, the downtown move was quoted at $2,400.
That's the base rate ($1,800) plus $200/flight for 3 flights of stairs.
Source: memory #412, tagged client:anderson, type:quote
You: /save Decided to raise the base rate to $2,000 starting Q3.
Echo: Saved. Tagged: decision, pricing
$ curl -X POST http://localhost:3003/api/brief \
-H "Content-Type: application/json" \
-d '{"type": "morning"}'

The gateway pulls recent decisions, tasks, and commitments from memory and generates a personalized briefing. No generic AI fluff -- it's built from your actual data.
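Under the hood, a briefing is a filtered memory query: recent entries with the right tags. A Python sketch of the idea (tag names and the memory shape are assumptions, not Echo's actual schema):

```python
from datetime import date, timedelta

def morning_brief(memories, days=7):
    """Collect recent decisions, tasks, and commitments for a briefing.

    `memories` is a list of dicts with `text`, `tags`, and `stored` keys
    (an assumed shape for illustration).
    """
    cutoff = date.today() - timedelta(days=days)
    wanted = {"decision", "task", "commitment"}
    return [
        m["text"]
        for m in memories
        if m["stored"] >= cutoff and wanted & set(m["tags"])
    ]

# Only the recent, briefing-relevant memory survives the filter.
brief = morning_brief([
    {"text": "Raise base rate to $2,000 in Q3", "tags": ["decision", "pricing"],
     "stored": date.today()},
    {"text": "Old note", "tags": ["note"], "stored": date.today()},
])
```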
| Before Echo | After Echo |
|---|---|
| Record voice memo on drive home | Record voice memo on drive home |
| Forget about it | Auto-transcribed by Whisper |
| Idea is gone forever | Auto-stored in Echo |
| | Searchable the next morning |
| | "What was that thing I said about the Wilson job?" --> instant answer |
| Directory | What It Does | Required? |
|---|---|---|
| core/memory-server/ | SQLite memory API with full-text search (the Hippocampus) | Yes |
| core/cli/ | Command-line tool to store and search memories | Yes |
| services/inference-gateway/ | The Cortex. Memory-augmented AI with multi-provider fallback and cost tracking. | Recommended |
| adapters/chat/telegram/ | Telegram bot adapter | No |
| adapters/chat/discord/ | Discord bot adapter (stub) | No |
| adapters/inference/ | AI provider interface docs | No |
| adapters/voice/ | Audio transcription pipeline docs | No |
| adapters/agent/openclaw/ | OpenClaw agent integration with workspace skills | No |
| deploy/ | Docker, Pi, and cloud setup guides | No |
The memory server is the foundation. The inference gateway is what makes it intelligent. Everything else is an adapter you plug in based on what works for you.
git clone https://github.com/DevontiaW/echo-public.git
cd echo-public
# Start the memory server
cd core/memory-server && go build -o echo-memory . && ./echo-memory &
# Start the inference gateway (optional but recommended)
cd ../services/inference-gateway
export OPENAI_API_KEY="sk-..." # or ANTHROPIC_API_KEY
cp config.example.json config.json
go build -o gateway . && ./gateway &
# Store your first memory
cd ../../core/cli && go build -o echo-cli .
./echo-cli store "This is my first memory. Echo is working."
./echo-cli search "first memory"

Path A: Interactive wizard (recommended for beginners)
python setup.py

The wizard asks you a few questions -- what kind of work you do, how you capture ideas, what devices you use -- and configures everything.
Requirements: Python 3.10+
Path B: AI-assisted setup (recommended for the best experience)
Open the project in Claude Code, Codex, Cursor, or your preferred AI coding tool and say:
"Set up Echo for me."
The assistant reads INTERLINKED.md and walks you through a personalized setup -- not a form, a conversation. By the end, Echo is configured for your specific work.
Named after the baseline test in Blade Runner 2049 -- each answer connects to the next until Echo knows you.
Path C: Manual setup (for developers)
# Copy the example environment config
cp .env.example .env
# Edit .env with your settings (host, port, optional API keys)
# Build and start the memory server
cd core/memory-server
go build -o echo-memory .
./echo-memory
# In another terminal, build the CLI
cd core/cli
go build -o echo-cli .
./echo-cli health
# Output: Echo memory server is healthy. 0 memories stored.
# (Optional) Start the inference gateway
cd services/inference-gateway
cp config.example.json config.json
# Edit config.json with your memory server URL and preferred AI provider
go build -o gateway .
./gateway

Then add adapters as needed:
- Chat bot: adapters/chat/telegram/ or adapters/chat/discord/
- Voice pipeline: adapters/voice/
| | Use Case | Echo Does This |
|---|---|---|
| 🧠 | Personal | Capture ideas, decisions, and insights as they happen. Voice-memo on your commute -- auto-transcribed and searchable the next morning. "What was that thing I said about the rebrand?" -- instant answer. |
| 💼 | Small Business | Client details, pricing history, project notes -- all searchable. "What did we quote for the Anderson project?" -- recalled instantly with full context. No more digging through emails. |
| 📚 | Research & Learning | Reading notes, research findings, course material. Build a personal knowledge base that compounds over time. Morning briefings pull from your latest research. |
| 🧑‍💻 | Builder / Creator | Decision log -- what you decided and why. Architecture choices, bug lessons, project context that follows you between sessions and devices. |
graph TB
subgraph You ["Your Devices"]
D1["Desktop"]
D2["Phone"]
end
subgraph Gateway ["Inference Gateway (the Cortex)"]
GW["Search Memory → Compose Prompt → Call LLM → Track Cost"]
end
subgraph Core ["Memory Server (the Hippocampus)"]
MS["REST API"]
DB[("SQLite + FTS5")]
MS --> DB
end
subgraph Adapters ["Adapters (pick what you need)"]
CLI["CLI Tool"]
CB["Chat Bot"]
VP["Voice Pipeline"]
end
D1 --> CLI
D2 --> CB
CLI --> GW
CB --> GW
VP --> Core
GW --> Core
style Gateway fill:#E74C3C,stroke:#333,color:#fff
style Core fill:#2ECC71,stroke:#333,color:#fff
style You fill:#4A90D9,stroke:#333,color:#fff
style Adapters fill:#F39C12,stroke:#333,color:#fff
You deploy Echo however it fits your life:
| Pattern | Setup | Best For |
|---|---|---|
| Everything Local | Memory server + gateway + CLI on your machine | Single-device use, getting started |
| Pi + Workstation | Memory on a Raspberry Pi, gateway + tools on your laptop | Multi-device, always-on access |
| Cloud Server | Memory + gateway on a VPS behind Tailscale | Remote access from anywhere |
| Docker Compose | docker-compose up and done | Quick spin-up, containerized deployment |
See ARCHITECTURE.md for full details, API reference, and port registry.
| Principle | What It Means |
|---|---|
| Memory-first AI | The gateway searches your knowledge before calling the LLM. Context makes the AI useful -- not the other way around. |
| Local-first | Your memories live on your machine or your own server. Nothing goes to someone else's cloud unless you choose it. |
| Agent-agnostic | Use Claude, GPT, Gemini, a local LLM, or no AI at all. Echo is the memory layer -- you pick the brain. |
| Resilient | Provider down? Auto-fallback. Cloud offline? Local LLM picks up. Your system stays responsive. |
| Cost-aware | Every API call tracked per token, per model, per channel. Know exactly what your AI costs you. |
| Simple | SQLite + full-text search. No vector databases, no Kubernetes, no PhD required. |
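That last principle is easy to see in miniature: SQLite's FTS5 extension gives ranked full-text search with zero extra infrastructure. A minimal sketch in Python (table and column names are illustrative assumptions, not Echo's actual schema, and Echo itself is Go):

```python
import sqlite3

# In-memory database for the demo; Echo persists to a file instead.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE memories USING fts5(body, tags)")
db.executemany(
    "INSERT INTO memories (body, tags) VALUES (?, ?)",
    [
        ("Met with Sarah about Q2 campaign. Budget is 12K.", "client:sarah type:note"),
        ("Anderson project quoted at $2,400 in Q2.", "client:anderson type:quote"),
    ],
)
# MATCH runs a full-text query; bm25() ranks results by relevance.
rows = db.execute(
    "SELECT body FROM memories WHERE memories MATCH ? ORDER BY bm25(memories)",
    ("sarah campaign",),
).fetchall()
```

No embeddings, no vector index -- just tokenized text and a relevance-ranked query.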
Security note: Echo's memory server has no built-in authentication -- it's designed for trusted networks (your LAN, a VPN, or localhost). If you deploy to a cloud server, put it behind Tailscale, WireGuard, or a reverse proxy with auth. Never expose the memory server port directly to the internet.
| Requirement | Version | Notes |
|---|---|---|
| Go | 1.21+ | For core services (memory server, gateway, CLI, adapters) |
| Python | 3.10+ | For setup wizard only |
| SQLite | 3.35+ with FTS5 | Usually bundled with Go and most OSes |
| Node.js | 18+ | Optional -- only for command queue |
| ffmpeg | any | Optional -- only for voice pipeline |
Is my data private?
Yes. Everything runs on hardware you control -- your laptop, a Raspberry Pi, a server in your closet. Nothing is sent to any cloud service unless you explicitly configure an AI provider, and even then only your search query goes out, not your entire memory database.
Do I need AI to use this?
No. The core of Echo is SQLite full-text search -- it works without any AI provider. The inference gateway is an optional layer that makes answers more conversational and context-aware. You can add it later, swap providers anytime, or never use it at all.
Can I use this on my phone?
Yes. Set up the Telegram bot adapter (or Discord) and you can store and search memories from any phone. No app to install -- you use the messaging app you already have.
What if I'm not technical?
Run python setup.py for a guided wizard, or open the project in an AI coding tool (Claude Code, Cursor) and say "set up Echo for me." The AI reads the setup guide and walks you through everything conversationally. You don't need to know Go, SQLite, or REST APIs.
How is this different from Notion / Apple Notes / Google Keep?
Those are note-taking apps controlled by someone else's server. Echo is infrastructure you own. Your data never leaves your machine. There's no subscription, no terms of service change, no "we're shutting down" email. And with the inference gateway, Echo doesn't just store notes -- it thinks with them.
How do I give feedback or report a bug?
Open an Issue for bugs and feature requests, or start a Discussion for questions and ideas. PRs are welcome -- see CONTRIBUTING.md.
Can I run this on a Raspberry Pi?
Yes, that's one of the primary deployment patterns. Echo was built and tested on a Pi 5 running 24/7. Go cross-compiles to ARM with zero dependencies. See deploy/ for Pi-specific setup guides.
Echo was built on a few beliefs:
- Memory beats intelligence. The smartest AI is useless without context. The simplest search is powerful with the right memories stored.
- Your data is yours. It runs on hardware you control.
- Context is the product. The inference gateway doesn't make the AI smarter -- it gives the AI your knowledge to think with.
- Infrastructure, not product. Echo is plumbing. You build what you need on top of it.
- Simple beats clever. SQLite over Postgres. Full-text search over vector embeddings. Plain files over complex schemas.
Inspired by Tiago Forte's Building a Second Brain methodology -- specifically the CODE framework: Capture, Organize, Distill, Express. Echo turns that framework into running code.
Echo is source-available under the Business Source License 1.1. Free for personal use, evaluation, and contributions. Commercial use requires a license from Textstone Labs. Contributions are welcome -- especially:
- New chat adapters (Slack, WhatsApp, SMS)
- Inference provider integrations
- Deployment guides for different platforms
- Bug reports and improvements
- Documentation and examples
See CONTRIBUTING.md for how to get started, code style, PR guidelines, and beginner-friendly issues.
BSL 1.1 -- free for personal use. Commercial use requires a license. See LICENSE for details.
Built by Textstone Labs
"All those moments will be lost in time, like tears in rain... unless you store them in Echo."