poncho-memory is a standalone memory middleware for the Poncho AI agent. It runs as a small FastAPI service that keeps a SQLite-backed user profile and rolling session summaries so a stateless agent never has to start a chat completely cold. Claude Haiku is used to extract durable user facts and summarize each session; on every new chat, a compact context_block is generated for you to prepend to Poncho’s system prompt—no changes to Poncho’s core logic required.
From the project root (poncho-memory):

```shell
pip install -r requirements.txt && uvicorn main:app
```

Runs on http://127.0.0.1:8000 by default.
Copy .env.example to .env and set ANTHROPIC_API_KEY. Optional: set DB_PATH (default ./memory.db).
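A minimal `.env` might look like the sketch below. The exact contents of `.env.example` are an assumption; only `ANTHROPIC_API_KEY` (required) and `DB_PATH` (optional) are documented above.

```shell
# .env — ANTHROPIC_API_KEY is required; DB_PATH is optional
ANTHROPIC_API_KEY=sk-ant-your-key-here
DB_PATH=./memory.db
```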
`POST /profile/update` extracts facts from raw text and upserts them into SQLite.

```shell
curl -s -X POST http://127.0.0.1:8000/profile/update \
  -H "Content-Type: application/json" \
  -d '{"text":"I am Alex, a backend engineer. I prefer TypeScript and am building Poncho."}'
```

`GET /profile` returns the structured profile: name, role, skills, active_projects, preferences.

```shell
curl -s http://127.0.0.1:8000/profile
```

`POST /session/start` returns a context_block string for the system prompt (with an optional topic hint).

```shell
curl -s -X POST http://127.0.0.1:8000/session/start \
  -H "Content-Type: application/json" \
  -d '{"topic":"debugging the memory service"}'
```

`POST /session/end` summarizes the transcript, stores the summary, and refreshes profile facts from the same transcript.

```shell
curl -s -X POST http://127.0.0.1:8000/session/end \
  -H "Content-Type: application/json" \
  -d '{"transcript":"User: ...\nAssistant: ..."}'
```

On each new chat, Poncho calls POST /session/start and prepends the returned context_block to its system prompt, so the model sees stable user context and recent session history. After a conversation, POST /session/end persists what happened; POST /profile/update can also ingest ad-hoc messages or exports to keep the profile current.
For well-shaped GET /profile output, extracted facts should use the keys `name`, `role`, `skills`, and `active_projects` (list-valued keys stored as JSON array strings, e.g. `["a","b"]`), plus either individual `preference_*` keys or a single JSON object under `preferences`.
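These key conventions can be illustrated with a small shaping function. This is a sketch of how flat extracted facts might be folded into the structured profile; the function name and fallback behavior are assumptions, not poncho-memory's actual implementation.

```python
import json

# Keys whose values are stored as JSON array strings, per the conventions above.
LIST_KEYS = {"skills", "active_projects"}


def shape_profile(facts: dict) -> dict:
    """Fold flat key/value facts into a structured profile dict."""
    profile = {}
    preferences = {}
    for key, value in facts.items():
        if key in LIST_KEYS:
            # JSON array strings like '["a","b"]' become real lists;
            # fall back to a one-element list for plain strings.
            try:
                profile[key] = json.loads(value)
            except (TypeError, json.JSONDecodeError):
                profile[key] = [value]
        elif key.startswith("preference_"):
            preferences[key[len("preference_"):]] = value
        elif key == "preferences":
            # A whole JSON object of preferences in one fact.
            try:
                preferences.update(json.loads(value))
            except (TypeError, json.JSONDecodeError):
                pass
        else:
            profile[key] = value
    if preferences:
        profile["preferences"] = preferences
    return profile
```

Facts that don't follow the conventions still pass through as plain strings, so loosely extracted data degrades gracefully rather than breaking the profile.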