cogni-mem

cogni-mem is a local-first Python memory framework for developers building AI agents, coding agents, and multi-agent systems. It gives agents a structured, multi-channel long-term memory with adaptive lifecycle governance.

What is cogni-mem — A Local-First Agent Memory Framework

cogni-mem (GitHub: cogniweave-memory) is a local-first Python agent memory framework that models agent memory as a set of specialized channels — Key, Semantic, Episodic, Perceptual, Experience, and Sensory Buffer — rather than a single flat vector store. Each channel has its own storage backend, retrieval strategy, retention policy, and decay curve, so the agent's memory behaves more like human memory and less like a database with embeddings.

The framework runs entirely on local infrastructure: JSON for key-value memory, SQLite for metadata, and local Qdrant path mode for vector search. No external services are required to get started. Optional integrations for Neo4j (graph-enhanced semantic memory) and MiniMax (OpenAI-compatible LLM provider) are available as install extras.

At its core, cogni-mem is not just a memory store — it is a complete memory-aware agent execution loop: task understanding → sensory buffering → multi-channel parallel retrieval → cross-channel fusion → context orchestration → execution → writeback decision → classified long-term storage → memory lifecycle governance → adaptive feedback tuning.

Why not just RAG

Most memory solutions for LLM agents are essentially "RAG with a vector database" — store everything, embed everything, retrieve by similarity. This works for document search but breaks down for agent memory:

| Problem with plain RAG | How cogni-mem handles it |
| --- | --- |
| One-size-fits-all retrieval | Task-aware routing selects relevant memory channels per task type (coding, planning, QA, dialogue) |
| No concept of memory importance | Multi-dimensional scoring (importance, novelty, confidence, consistency, reusability) drives retention and retrieval priority |
| Storage grows unbounded | Memory lifecycle governance with per-channel retention profiles, forgetting, summarization, demotion, and archiving |
| Retrieved context is flat | Cross-channel fusion normalizes and merges results from semantic, episodic, perceptual, and experience memory |
| No feedback loop | Adaptive policy updates adjust retrieval bias, channel weights, and write thresholds based on execution outcomes |
| Everything is a vector | Key memory (explicit facts/rules) uses JSON key-value storage, not vector search — faster and deterministic |
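To illustrate the last row, here is a minimal sketch of why a JSON-backed key lookup is deterministic. `KeyMemoryStore` and the file name `key_memory.json` are hypothetical names made up for this example; they are not cogni-mem's actual API or on-disk layout.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class KeyMemoryStore:
    """Hypothetical JSON-backed key-value store (illustration only, not the cogni-mem API)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load existing facts if the file is already on disk.
        if path.exists():
            self._data = json.loads(path.read_text(encoding="utf-8"))
        else:
            self._data = {}

    def put(self, key: str, value: str) -> None:
        # Persist every write so the fact survives a restart.
        self._data[key] = value
        self.path.write_text(json.dumps(self._data, indent=2), encoding="utf-8")

    def get(self, key: str) -> Optional[str]:
        # Exact-match lookup: no embedding and no similarity threshold involved.
        return self._data.get(key)


store_path = Path(tempfile.mkdtemp()) / "key_memory.json"
store = KeyMemoryStore(store_path)
store.put("answer_style", "Start with the conclusion.")

# A fresh instance reads the same fact back from disk.
reloaded = KeyMemoryStore(store_path)
print(reloaded.get("answer_style"))
```

The same key always returns the same value, which is the property the table contrasts with similarity-based retrieval.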

cogni-mem is closer to MemGPT in ambition (both aim for memory-aware agents) but follows a different design philosophy: MemGPT relies on a single simulated memory hierarchy managed by an LLM orchestrator, while cogni-mem uses engineered, deterministic channels with explicit routing, scoring, and lifecycle policies — making it more predictable, debuggable, and suitable for production agent workflows. Unlike Mem0, which focuses on user-level memory for personalization, cogni-mem is designed for task-level agent reasoning across multiple memory types simultaneously. Compared to LangMem, cogni-mem provides a ready-to-use runtime with built-in retrieval, writeback, and forgetting, rather than a set of memory primitives that you assemble yourself.

Key Features of the Multi-Channel Memory Framework

Multi-channel memory architecture — Key, Semantic, Episodic, Perceptual, Experience, Sensory Buffer
Local-first runtime — JSON + SQLite + local Qdrant path mode, no external services required
Memory-aware agent loop — retrieval, context building, execution, and writeback in one cycle
RAG retrieval pipeline — query expansion, HyDE, multi-query expansion, and cross-channel fusion
Memory lifecycle governance — per-channel retention profiles, forgetting, summarization, demotion, and archiving
Adaptive feedback loop — execution outcomes automatically tune retrieval bias and channel weights
Tool Registry — inject memory tools (search, forget, lifecycle, ingestion) into the agent loop
Optional Neo4j graph — entity linking and graph traversal for semantic memory (pip install cogni-mem[graph])
Optional MiniMax provider — OpenAI-compatible LLM integration (pip install cogni-mem[minimax])
Task-aware routing — automatically selects memory channels based on task type (coding, QA, planning, dialogue)
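Task-aware routing can be pictured as a lookup from task type to a set of channels. The sketch below is an assumption-heavy illustration: the keyword cues, the `classify_task` and `route` helpers, and the channel sets per task type are all invented here and do not reflect how `TaskModalityRouter` is implemented.

```python
from enum import Enum


class TaskType(Enum):
    CODING = "coding"
    PLANNING = "planning"
    QA = "qa"
    DIALOGUE = "dialogue"


# Hypothetical routing table; the real channel selection per task type may differ.
CHANNELS_BY_TASK = {
    TaskType.CODING: ["key", "semantic", "experience"],
    TaskType.PLANNING: ["key", "episodic", "experience"],
    TaskType.QA: ["key", "semantic"],
    TaskType.DIALOGUE: ["key", "episodic"],
}

# Naive keyword cues standing in for a real task classifier.
KEYWORDS = {
    TaskType.CODING: ("bug", "function", "refactor", "stack trace"),
    TaskType.PLANNING: ("plan", "roadmap", "schedule", "steps"),
    TaskType.QA: ("what", "why", "how", "explain"),
}


def classify_task(text: str) -> TaskType:
    """Return the first task type whose cue appears in the text, else DIALOGUE."""
    lowered = text.lower()
    for task_type, cues in KEYWORDS.items():
        if any(cue in lowered for cue in cues):
            return task_type
    return TaskType.DIALOGUE


def route(text: str) -> list:
    """Map a user request to the memory channels that should be queried."""
    return CHANNELS_BY_TASK[classify_task(text)]


print(route("Fix the bug in the retry function"))
print(route("Nice to meet you"))
```

Keeping the mapping in a plain table makes the routing decision easy to inspect and test, which fits the framework's stated preference for explicit policies.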

Quick Start with the Python Agent Memory Framework

from cogniweave_full import (
    CalculatorTool,
    Config,
    LLMFactory,
    MemoryAgent,
    MemoryForgetTool,
    MemoryLifecycleTool,
    MemoryManager,
    MemorySearchTool,
    OfflineIngestionTool,
    ToolRegistry,
)

# Configure with mock LLM for zero-cost local testing
config = Config(
    llm_provider="mock",
    enable_hyde=False,
    enable_mqe=False,
    enable_qdrant=False,
    enable_neo4j=False,
)

llm = LLMFactory.create(config=config)
registry = ToolRegistry()
manager = MemoryManager(
    llm=llm,
    tool_registry=registry,
    base_dir="./runtime_demo",
    config=config,
)

# Register memory tools into the agent loop
registry.register_tool(CalculatorTool())
registry.register_tool(MemorySearchTool(manager))
registry.register_tool(MemoryForgetTool(manager))
registry.register_tool(MemoryLifecycleTool(manager))
registry.register_tool(OfflineIngestionTool(manager))

# Create a memory-aware agent
agent = MemoryAgent(
    name="assistant",
    llm=llm,
    memory_manager=manager,
    user_id="demo_user",
    session_id="demo_session",
    system_prompt="You are a memory-aware assistant.",
)

# The agent remembers across turns
print(agent.run("Remember that future answers should start with the conclusion."))
print(agent.run("What did we agree on earlier?"))

For a full working example with ReAct agent mode, see demo.py.

Architecture Overview of the Memory Lifecycle & RAG Pipeline

User Input
    │
    ▼
┌─────────────────────────────────────────────────┐
│  TaskModalityRouter                              │
│  Classifies task → selects relevant channels     │
└─────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────┐
│  MemoryRAGPipeline                               │
│  Query expansion → parallel channel retrieval    │
│  → scoring → cross-channel fusion → normalize   │
└─────────────────────────────────────────────────┘
    │
    ├── Key Memory (JSON, explicit facts/rules)
    ├── Semantic Memory (SQLite + Qdrant + optional Neo4j)
    ├── Episodic Memory (SQLite + Qdrant)
    ├── Perceptual Memory (SQLite + Qdrant)
    └── Experience Memory (SQLite + Qdrant)
    │
    ▼
┌─────────────────────────────────────────────────┐
│  ContextOrchestrator                             │
│  Builds prompt context from retrieved items      │
│  + system prompt + tool schemas + dialogue       │
└─────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────┐
│  Agent Execution (LLM + Tool Calls)              │
│  MemoryAgent / ReActMemoryAgent                  │
└─────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────┐
│  PostRunMemoryRouter                             │
│  Decides what to write, where, and at what level │
└─────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────┐
│  Consolidation + AsyncWriteBack                  │
│  Candidate extraction → record creation → write  │
└─────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────┐
│  Memory Lifecycle (ForgetScheduler)              │
│  Retention scoring → demote / summarize /        │
│  archive / delete per channel                    │
└─────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────┐
│  Feedback Loop (PolicyUpdater)                   │
│  Execution outcomes → adjust retrieval bias,     │
│  channel weights, write thresholds               │
└─────────────────────────────────────────────────┘
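The cross-channel fusion step in the MemoryRAGPipeline box can be sketched as follows. Raw scores from different channels are not directly comparable, so this example min-max normalizes each channel's scores before applying a per-channel weight and merging. The `Hit` dataclass, the `normalize` and `fuse` functions, and the weights are hypothetical; the actual pipeline may normalize and rank differently.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    channel: str
    text: str
    score: float  # raw, channel-specific relevance score


def normalize(hits: list) -> list:
    """Min-max normalize scores within one channel to the range [0, 1]."""
    if not hits:
        return []
    low = min(h.score for h in hits)
    high = max(h.score for h in hits)
    span = high - low
    # When all scores are equal, treat every hit as maximally relevant.
    return [
        Hit(h.channel, h.text, 1.0 if span == 0 else (h.score - low) / span)
        for h in hits
    ]


def fuse(results: dict, weights: dict, top_k: int = 3) -> list:
    """Merge per-channel hits into one ranked list using channel weights."""
    merged = []
    for channel, hits in results.items():
        weight = weights.get(channel, 1.0)
        for h in normalize(hits):
            merged.append(Hit(h.channel, h.text, h.score * weight))
    return sorted(merged, key=lambda h: h.score, reverse=True)[:top_k]


results = {
    "semantic": [
        Hit("semantic", "Qdrant path mode needs no server", 0.82),
        Hit("semantic", "SQLite stores metadata", 0.40),
    ],
    "episodic": [
        Hit("episodic", "User asked for conclusion-first answers", 12.0),
        Hit("episodic", "Session started with a setup question", 3.0),
    ],
}
weights = {"semantic": 1.0, "episodic": 0.8}

fused = fuse(results, weights)
for hit in fused:
    print(f"{hit.score:.2f} {hit.channel}: {hit.text}")
```

The channel weights are the natural place for the feedback loop to intervene: raising or lowering a weight changes how strongly that channel competes in the merged ranking.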

Default local runtime:

Key memory → JSON file
Metadata → SQLite (one database per channel)
Vector memory → local Qdrant path mode (no server needed)
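As a rough picture of "one database per channel", the sketch below opens a separate SQLite file for each channel under a base directory. The file names, the `memory_items` table, and its columns are assumptions made for illustration; the real schema is defined by the package and may look quite different.

```python
import sqlite3
import tempfile
from pathlib import Path

CHANNELS = ["semantic", "episodic", "perceptual", "experience"]

# Hypothetical schema; the real tables and columns may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memory_items (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    importance REAL NOT NULL,
    created_at TEXT NOT NULL
)
"""


def open_channel_dbs(base_dir: Path) -> dict:
    """Create (or open) one SQLite database file per memory channel."""
    base_dir.mkdir(parents=True, exist_ok=True)
    connections = {}
    for channel in CHANNELS:
        conn = sqlite3.connect(base_dir / f"{channel}.sqlite3")
        conn.execute(SCHEMA)
        connections[channel] = conn
    return connections


base = Path(tempfile.mkdtemp())
dbs = open_channel_dbs(base)

# Writing to one channel leaves the other channel databases untouched.
dbs["episodic"].execute(
    "INSERT INTO memory_items VALUES (?, ?, ?, ?)",
    ("m1", "User prefers conclusion-first answers", 0.8, "2024-01-01T00:00:00Z"),
)
dbs["episodic"].commit()

print(sorted(p.name for p in base.iterdir()))
```

Separate files keep each channel's retention and cleanup work isolated, so pruning one channel does not lock or rewrite the others.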

Optional services:

Set COGNIWEAVE_ENABLE_QDRANT=true to connect to a remote Qdrant instance
Set COGNIWEAVE_ENABLE_NEO4J=true and install cogni-mem[graph] for graph-enhanced semantic memory

Installation of the Local-First Memory Framework

# Base install (local runtime only)
pip install cogni-mem

# With MiniMax OpenAI-compatible provider
pip install "cogni-mem[minimax]"

# With Neo4j graph enhancement
pip install "cogni-mem[graph]"

# With everything
pip install "cogni-mem[all]"

# Pin a specific version
pip install cogni-mem==0.1.0

# Install from GitHub
pip install "git+https://github.com/luckly06/cogniweave-memory.git@v0.1.0"

Requirements: Python >= 3.11

Environment variables (see .env.example):

| Variable | Description |
| --- | --- |
| `COGNIWEAVE_LLM_PROVIDER` | LLM provider (e.g., `minimax`, `mock`) |
| `MINIMAX_API_KEY` | API key for MiniMax provider |
| `MINIMAX_BASE_URL` | Base URL for MiniMax-compatible endpoint |
| `MINIMAX_MODEL` | Model name for MiniMax provider |
| `COGNIWEAVE_ENABLE_QDRANT` | Set to `true` for remote Qdrant |
| `COGNIWEAVE_ENABLE_NEO4J` | Set to `true` for Neo4j graph |
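The variables above could be read into a typed settings object along these lines. This is a sketch only: `EnvSettings`, `load_settings`, and `_flag` are hypothetical helpers, the case-insensitive parsing of `true` is an assumption, and the `mock` fallback is assumed from the FAQ's statement that the default configuration uses a mock LLM. The package's own `Config` may handle these variables differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _flag(value: Optional[str]) -> bool:
    """Interpret 'true' (case-insensitive) as enabled; anything else as disabled."""
    return (value or "").strip().lower() == "true"


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical container for illustration; not part of cogni-mem."""

    llm_provider: str
    enable_qdrant: bool
    enable_neo4j: bool


def load_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    """Build settings from an environment mapping (defaults to os.environ)."""
    return EnvSettings(
        llm_provider=env.get("COGNIWEAVE_LLM_PROVIDER", "mock"),  # assumed default
        enable_qdrant=_flag(env.get("COGNIWEAVE_ENABLE_QDRANT")),
        enable_neo4j=_flag(env.get("COGNIWEAVE_ENABLE_NEO4J")),
    )


# Pass an explicit mapping here so the example does not depend on the real environment.
settings = load_settings(
    {"COGNIWEAVE_LLM_PROVIDER": "minimax", "COGNIWEAVE_ENABLE_NEO4J": "true"}
)
print(settings)
```

Accepting a mapping instead of reading `os.environ` directly keeps the loader easy to test.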

FAQ — Agent Memory, Episodic Memory, and Semantic Memory

How is cogni-mem different from Mem0?

Mem0 is primarily a user-level memory layer for personalization — it remembers user preferences across sessions. cogni-mem is a task-level agent memory framework that manages multiple memory channels (semantic, episodic, perceptual, experience) simultaneously, with built-in retrieval, writeback, and lifecycle governance. If you need an agent that remembers how to solve problems, not just what the user likes, cogni-mem is the better fit.

How is cogni-mem different from MemGPT?

MemGPT simulates a memory hierarchy (core memory / archival memory) managed by an LLM that decides what to move between tiers. cogni-mem takes a different approach: engineered, deterministic memory channels with explicit routing, scoring, and retention policies. This makes cogni-mem more predictable, easier to debug, and less dependent on the LLM for memory management decisions.

Does cogni-mem work without internet?

Yes. The default configuration uses a mock LLM and local storage (JSON + SQLite + local Qdrant path mode), so the entire agent memory framework runs offline. When you're ready to use a real LLM, configure the MiniMax provider or bring your own OpenAI-compatible endpoint.

What storage backends does cogni-mem support?

Local (default): JSON for key memory, SQLite for metadata, Qdrant path mode for vector search
Remote Qdrant: set COGNIWEAVE_ENABLE_QDRANT=true
Neo4j graph: install cogni-mem[graph] and set COGNIWEAVE_ENABLE_NEO4J=true

Can I use cogni-mem with my own LLM provider?

Yes. The framework includes LLMFactory with a pluggable provider interface. The built-in MiniMax provider is OpenAI-compatible, so any provider that exposes an OpenAI-compatible API (including local models via Ollama or vLLM) can be used by configuring the base URL and model name.

What happens to old memories — does the storage grow forever?

No. cogni-mem has a memory lifecycle system with per-channel retention profiles. Each channel has configurable thresholds for retention scoring, summarization, archiving, and deletion. A background scheduler periodically evaluates memories and applies the appropriate action based on importance, recency, reuse frequency, and capacity pressure.
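The factors named above (importance, recency, reuse frequency, and capacity pressure) can be combined into a single retention score that selects a lifecycle action. The sketch below shows one plausible way to do that; the formula, weights, thresholds, and the names `MemoryRecord`, `retention_score`, and `lifecycle_action` are all assumptions for illustration, not the behavior of cogni-mem's scheduler.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryRecord:
    importance: float  # 0..1, assigned at write time
    age_days: float    # time since the record was last accessed
    reuse_count: int   # how often retrieval has returned this record


def retention_score(
    rec: MemoryRecord, half_life_days: float, capacity_pressure: float
) -> float:
    """Combine importance, recency, and reuse, then discount under capacity pressure.

    All weights here are illustrative assumptions.
    """
    recency = 0.5 ** (rec.age_days / half_life_days)  # exponential decay curve
    reuse = 1.0 - math.exp(-rec.reuse_count / 3.0)    # saturates toward 1
    base = 0.5 * rec.importance + 0.3 * recency + 0.2 * reuse
    # capacity_pressure in 0..1: a fuller store lowers every score.
    return base * (1.0 - 0.5 * capacity_pressure)


def lifecycle_action(score: float) -> str:
    """Map a retention score to an action using hypothetical thresholds."""
    if score >= 0.6:
        return "keep"
    if score >= 0.4:
        return "summarize"
    if score >= 0.2:
        return "archive"
    return "delete"


fresh = MemoryRecord(importance=0.9, age_days=1, reuse_count=5)
stale = MemoryRecord(importance=0.2, age_days=120, reuse_count=0)

for name, rec in (("fresh", fresh), ("stale", stale)):
    score = retention_score(rec, half_life_days=30, capacity_pressure=0.2)
    print(name, round(score, 3), lifecycle_action(score))
```

A per-channel retention profile would then amount to a different `half_life_days` and a different set of thresholds for each channel.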

Is cogni-mem production-ready?

Not yet. cogni-mem is at v0.1.0 (alpha). The core architecture (multi-channel memory, RAG pipeline, agent loop, lifecycle governance, and feedback system) is implemented and functional. The 0.1.x release line focuses on packaging, dependency management, and documentation. For now, use it for evaluation and prototyping, and expect API changes in future releases.

What is the relationship between cogni-mem and cogniweave-memory?

cogni-mem is the PyPI package name (pip install cogni-mem). cogniweave-memory is the GitHub repository name (github.com/luckly06/cogniweave-memory). The import namespace is cogniweave_full. They refer to the same project.

License

MIT License. See LICENSE for details.
