Preserve, index, and expose your AI conversation history as a queryable, programmable archive.
Polylogue archives AI conversations from ChatGPT, Claude, Claude Code, Gemini, and Codex into a unified, searchable local database. Drop your exports in a folder, run one command, and get instant full-text search across every conversation you've ever had.
- Zero-config: Drop exports in ~/.local/share/polylogue/inbox/, run polylogue run, done
- Sub-second search: FTS5-powered full-text search with smartcase matching
- Semantic search: Vector similarity via sqlite-vec embeddings (optional, Voyage AI)
- Library-first: Async Python API with composable filter chains — the CLI is a thin wrapper
- Local-first: All data stays on your machine. SQLite database, no external services
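
The smartcase rule mentioned above (an all-lowercase query matches any case, while a query containing an uppercase letter is matched case-sensitively) can be illustrated with a minimal sketch. The function name `smartcase_match` is hypothetical and uses plain substring matching; it is not Polylogue's FTS5-backed implementation.

```python
def smartcase_match(query: str, text: str) -> bool:
    """Return True if `query` occurs in `text` under the smartcase rule."""
    if any(ch.isupper() for ch in query):
        # An uppercase letter in the query makes the match case-sensitive.
        return query in text
    # An all-lowercase query matches regardless of case.
    return query.lower() in text.lower()


print(smartcase_match("error", "Unhandled Error in parser"))  # True
print(smartcase_match("Error", "unhandled error in parser"))  # False
```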
No data needed. Generate a synthetic archive and explore:
eval $(polylogue demo --seed --env-only)
polylogue # Archive stats
polylogue "error handling" # Full-text search
polylogue -p claude --latest # Latest Claude conversation
polylogue dashboard # Interactive TUI

# With uv (recommended)
uv tool install polylogue
# With pip
pip install polylogue
# From source (Nix)
git clone https://github.com/sinity/polylogue && cd polylogue
nix develop # or: uv sync

| Provider | How to Export |
|---|---|
| ChatGPT | Settings → Data Controls → Export → Download conversations.json |
| Claude | Download from claude.ai conversation history |
| Claude Code | Auto-discovered from ~/.claude/projects/ |
| Codex | Auto-discovered from ~/.codex/sessions/ |
| Gemini | Google Drive sync — run polylogue auth for OAuth setup |
# Copy or symlink your exports
cp ~/Downloads/conversations.json ~/.local/share/polylogue/inbox/
ln -s ~/.claude/projects ~/.local/share/polylogue/inbox/claude-code
# Run the pipeline
polylogue run
# Search
polylogue "error handling"

No config file needed. That's it.
Polylogue's CLI treats positional arguments as search terms. No subcommand prefix — just type what you're looking for:
# Basic search
polylogue "error handling" # Full-text search (FTS5)
polylogue "Error" # Case-sensitive (has uppercase)
polylogue "auth" "token" # AND: both terms must appear
# Semantic search (requires VOYAGE_API_KEY)
polylogue --similar "how to debug memory leaks"
# Filters (all composable)
polylogue "error" -p claude,chatgpt # By provider (comma = OR)
polylogue --since "last week" # Natural language dates
polylogue --until 2025-01-01 # ISO dates
polylogue --has thinking # Has reasoning traces
polylogue --has tools # Has tool use
polylogue -t project:backend # By tag (supports key:value)
polylogue --title "API design" # Title contains
# Sort & limit
polylogue --latest # Most recent conversation
polylogue --sort tokens --reverse # Most expensive first
polylogue --sort longest -n 10 # 10 longest conversations
polylogue --sample 5 # Random sample
# Output
polylogue "error" -f json # JSON format
polylogue "error" -f csv # CSV format
polylogue "error" -o browser # Open in browser
polylogue "error" -o clipboard # Copy to clipboard
polylogue "error" --fields id,title,date # Select columns
polylogue "error" --count # Just the count
polylogue "error" --stats-by provider # Aggregate by provider
# Content transforms
polylogue -i abc123 --transform strip-tools # Hide tool calls
polylogue -i abc123 --transform strip-thinking # Hide reasoning
polylogue -i abc123 -d # Dialogue only (user + assistant)
# Metadata modification
polylogue -i abc123 --set title "My Title"
polylogue -i abc123 --set summary "Brief description"
polylogue -i abc123 --add-tag important,project:backend
polylogue "old stuff" --delete --dry-run # Preview bulk delete

| Provider | Format | Auto-detected By | ID |
|---|---|---|---|
| ChatGPT | conversations.json | mapping field with UUID graph | chatgpt |
| Claude (web) | .jsonl | chat_messages array | claude |
| Claude Code | .json array | parentUuid/sessionId markers | claude-code |
| Codex | .jsonl | Session envelope structure | codex |
| Gemini | Google Drive API | chunkedPrompt.chunks structure | gemini |
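
Content-based detection of this kind can be sketched as a series of key checks on the parsed export. The function `detect_provider` is a hypothetical helper written for illustration, not Polylogue's detector. It covers only the markers named in the table and omits Codex, because the session envelope structure is not specified here.

```python
import json
from typing import Any, Optional


def detect_provider(payload: Any) -> Optional[str]:
    """Guess a provider ID from parsed export content (illustrative only)."""
    # Exports are often a list of records; inspect the first one.
    sample = payload[0] if isinstance(payload, list) and payload else payload
    if not isinstance(sample, dict):
        return None
    if "mapping" in sample:
        return "chatgpt"
    if "chat_messages" in sample:
        return "claude"
    if "parentUuid" in sample or "sessionId" in sample:
        return "claude-code"
    chunked = sample.get("chunkedPrompt")
    if isinstance(chunked, dict) and "chunks" in chunked:
        return "gemini"
    return None  # Unknown format (Codex detection is not sketched here)


export = json.loads('[{"title": "Demo", "mapping": {}}]')
print(detect_provider(export))  # chatgpt
```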
ZIP archives are supported (nested ZIPs too, with bomb protection). Provider detection is automatic from file content — no configuration needed.
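
As a rough illustration of what bomb protection for nested ZIPs can involve, the sketch below checks each member's declared size and compression ratio before reading it and caps the nesting depth. The limits and the names `iter_zip_members` and `make_zip` are assumptions chosen for the example; Polylogue's actual thresholds and strategy may differ.

```python
import io
import zipfile
from typing import Dict, Iterator, Optional, Tuple

MAX_TOTAL_BYTES = 1 << 30  # assumed cap on total uncompressed size (1 GiB)
MAX_RATIO = 100            # assumed cap on per-member compression ratio
MAX_DEPTH = 3              # assumed cap on ZIP-inside-ZIP nesting


def iter_zip_members(
    data: bytes, depth: int = 0, budget: Optional[list] = None
) -> Iterator[Tuple[str, bytes]]:
    """Yield (name, content) for each non-ZIP file, descending into nested ZIPs."""
    if depth > MAX_DEPTH:
        raise ValueError("ZIP nesting too deep")
    # A one-element list lets nested calls share the remaining size budget.
    budget = budget if budget is not None else [MAX_TOTAL_BYTES]
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Inspect the declared sizes before decompressing anything.
            ratio = info.file_size / max(info.compress_size, 1)
            budget[0] -= info.file_size
            if ratio > MAX_RATIO or budget[0] < 0:
                raise ValueError(f"refusing suspicious member: {info.filename}")
            content = archive.read(info)
            if zipfile.is_zipfile(io.BytesIO(content)):
                yield from iter_zip_members(content, depth + 1, budget)
            else:
                yield info.filename, content


def make_zip(members: Dict[str, bytes], compression: int = zipfile.ZIP_STORED) -> bytes:
    """Build an in-memory ZIP for the demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()


inner = make_zip({"conversations.json": b"[]"})
outer = make_zip({"export.zip": inner, "session.jsonl": b"{}"})
print(sorted(name for name, _ in iter_zip_members(outer)))
```

A member whose declared size is forged would still be caught later by the CRC check in `zipfile`, but a production guard would usually also stream and count bytes rather than trust the headers.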
| Format | Flag | Description |
|---|---|---|
| Markdown | -f markdown | Default. Syntax-highlighted, human-readable |
| JSON | -f json | Machine-readable, with all metadata |
| HTML | -f html | Styled for browser viewing |
| CSV | -f csv | Tabular, for spreadsheets |
| Obsidian | -f obsidian | Markdown with YAML frontmatter and [[wikilinks]] |
| Org | -f org | Emacs org-mode format |
| YAML | -f yaml | Structured, human-readable |
| Plaintext | -f plaintext | Stripped of all markup |
Output can be sent to stdout (default), --output browser, or --output clipboard.
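
Because output goes to stdout by default, a script can drive the CLI and consume its JSON. The sketch below builds an argument list from flags shown in this README (`-p`, `-f`, `-n`) and parses the result. The helper names are hypothetical, the shape of the JSON is not assumed, and `run_search` needs `polylogue` on the PATH.

```python
import json
import subprocess
from typing import Any, List, Optional, Sequence


def build_search_command(
    terms: Sequence[str],
    providers: Sequence[str] = (),
    fmt: Optional[str] = None,
    limit: Optional[int] = None,
) -> List[str]:
    """Assemble a polylogue search command as an argv list."""
    cmd = ["polylogue", *terms]
    if providers:
        cmd += ["-p", ",".join(providers)]  # comma-separated providers mean OR
    if fmt:
        cmd += ["-f", fmt]
    if limit is not None:
        cmd += ["-n", str(limit)]
    return cmd


def run_search(terms: Sequence[str], providers: Sequence[str] = ()) -> Any:
    """Run the search and return the parsed JSON output, whatever its shape."""
    cmd = build_search_command(terms, providers=providers, fmt="json")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


print(build_search_command(["error"], providers=["claude", "chatgpt"], fmt="json"))
```

Passing an argv list rather than a shell string avoids quoting problems when search terms contain spaces.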
polylogue run # Full pipeline: acquire → parse → render → index
polylogue run --source claude # Single source
polylogue run --preview # Preview counts, confirm before writing
polylogue run --stage parse # Single stage only
# Watch mode — continuous sync
polylogue run --watch # Watch sources for changes
polylogue run --watch --notify # Desktop notifications on new conversations
polylogue run --watch --webhook URL # Webhook on new conversations

Stages: acquire → parse → render → index
The pipeline is idempotent — re-running imports is always safe. Content hashing (SHA-256 + NFC normalization) ensures unchanged conversations are skipped.
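
The idea behind this can be sketched in a few lines: normalize text to NFC so that visually identical strings produce the same bytes, hash the result with SHA-256, and skip anything whose digest has been seen before. The names `content_hash` and `should_import` are hypothetical, and which fields Polylogue actually hashes is not specified here.

```python
import hashlib
import unicodedata


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the NFC-normalized, UTF-8 encoded text."""
    normalized = unicodedata.normalize("NFC", text)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


seen = set()  # digests of content already imported


def should_import(text: str) -> bool:
    """Return True the first time a piece of content is seen, False afterwards."""
    digest = content_hash(text)
    if digest in seen:
        return False
    seen.add(digest)
    return True


# "é" as one code point and as "e" + combining accent hash identically after NFC.
print(content_hash("caf\u00e9") == content_hash("cafe\u0301"))  # True
```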
polylogue check # Health check: DB integrity, index status, stats
polylogue check --repair # Auto-fix issues (orphaned refs, stale FTS entries)
polylogue check --deep # Full SQLite integrity check
polylogue embed # Generate vector embeddings for semantic search
polylogue embed --stats # Show embedding coverage
polylogue embed --model voyage-4-large # Use larger model
polylogue tags # List all tags with counts
polylogue tags -p claude --json # Tags for a provider, as JSON
polylogue site -o ./public # Build static HTML archive
polylogue site --title "My Archive" # Custom title
polylogue site --search-provider lunr # Client-side search engine
polylogue dashboard # Interactive Textual TUI
polylogue auth # Google OAuth flow (for Gemini/Drive)
polylogue auth --revoke # Revoke stored credentials
polylogue reset --database # Delete SQLite database
polylogue reset --all # Reset everything
polylogue completions --shell fish # Generate shell completions
polylogue mcp # Start MCP server (stdio)

Polylogue provides a Model Context Protocol server, giving AI assistants direct access to your conversation archive.
Claude Code (~/.claude/settings.json):
{
  "mcpServers": {
    "polylogue": {
      "command": "polylogue",
      "args": ["mcp"]
    }
  }
}

Claude Desktop (~/.config/claude/claude_desktop_config.json): same format.
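
If a settings file already contains other entries, the polylogue server can be merged in rather than pasted over them. The following is a minimal sketch using only the standard library; `add_polylogue_server` is a hypothetical helper, and the commented usage assumes the Claude Code path given above.

```python
import json
from pathlib import Path


def add_polylogue_server(settings: dict) -> dict:
    """Return a copy of `settings` with the polylogue MCP server registered."""
    updated = dict(settings)
    servers = dict(updated.get("mcpServers", {}))  # keep existing servers
    servers["polylogue"] = {"command": "polylogue", "args": ["mcp"]}
    updated["mcpServers"] = servers
    return updated


# Usage (uncomment to modify the real file):
# path = Path.home() / ".claude" / "settings.json"
# current = json.loads(path.read_text()) if path.exists() else {}
# path.write_text(json.dumps(add_polylogue_server(current), indent=2))

print(json.dumps(add_polylogue_server({"theme": "dark"}), indent=2))
```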
| Capability | Available |
|---|---|
| Tools | search, list_conversations, get_conversation, stats |
| Resources | polylogue://stats, polylogue://conversations, polylogue://conversation/{id} |
| Prompts | analyze-errors, summarize-week, extract-code |
Resources support query parameters for filtering: polylogue://conversations?provider=claude&since=2024-01-01&limit=50
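
To show how such a URI breaks down, here is a small sketch that splits it with the standard library. `parse_resource_uri` is a hypothetical helper for illustration; it returns the raw string parameters and makes no claim about how the server validates them.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qsl, urlsplit


def parse_resource_uri(uri: str) -> Tuple[str, Optional[str], Dict[str, str]]:
    """Split a polylogue:// URI into (resource, optional id, query parameters)."""
    parts = urlsplit(uri)
    if parts.scheme != "polylogue":
        raise ValueError(f"not a polylogue URI: {uri}")
    resource = parts.netloc                 # e.g. "conversations"
    ident = parts.path.lstrip("/") or None  # e.g. the {id} in conversation/{id}
    params = dict(parse_qsl(parts.query))
    return resource, ident, params


print(parse_resource_uri(
    "polylogue://conversations?provider=claude&since=2024-01-01&limit=50"
))
```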
Polylogue is library-first — the CLI is a thin wrapper around the Python API.
import asyncio

from polylogue import Polylogue


async def main():
    async with Polylogue() as archive:
        # Archive-wide stats
        stats = await archive.stats()

        # Search with composable filters
        convs = await (
            archive.filter()
            .provider("claude")
            .contains("error handling")
            .since("2024-06-01")
            .limit(10)
            .list()
        )

        # Retrieve a single conversation
        conv = await archive.get("chatgpt:abc123")

        # Message-level projections
        for msg in conv.project().substantive().min_words(50).iter():
            print(f"{msg.role}: {msg.text[:80]}...")

        # Semantic search (requires embeddings)
        similar = await archive.filter().similar("debugging memory leaks").limit(5).list()

        # Metadata
        await archive.set_metadata("chatgpt:abc123", title="Auth Bug Investigation")
        await archive.add_tags("chatgpt:abc123", ["important", "project:backend"])


asyncio.run(main())

Filter chain methods (all chainable):
| Method | Purpose |
|---|---|
| .contains(text) / .exclude_text(text) | FTS search |
| .provider(*names) / .exclude_provider(*names) | Filter by provider |
| .tag(*tags) / .exclude_tag(*tags) | Filter by tag |
| .has(*types) | Content type: thinking, tools, summary, attachments |
| .since(date) / .until(date) | Date range (strings or datetime) |
| .title(pattern) / .id(prefix) | Text matching |
| .similar(text) | Semantic similarity (vector search) |
| .sort(field) | date, tokens, messages, words, longest, random |
| .reverse() / .limit(n) / .sample(n) | Order and limit |
Terminal methods (async): .list(), .first(), .count(), .delete()
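
For readers curious how a chainable filter with async terminal methods can be structured, the toy below implements the pattern over an in-memory tuple. It is not Polylogue's implementation: the `Conversation` and `Filter` classes here are hypothetical stand-ins, only three chain methods are shown, and a real archive would translate the chain into SQL instead of scanning a list.

```python
import asyncio
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Conversation:
    id: str
    provider: str
    text: str


@dataclass(frozen=True)
class Filter:
    source: Tuple[Conversation, ...]
    predicates: Tuple[Callable[[Conversation], bool], ...] = ()
    max_items: Optional[int] = None

    def _where(self, predicate: Callable[[Conversation], bool]) -> "Filter":
        # Each chain step returns a new Filter, so partial chains can be reused.
        return replace(self, predicates=self.predicates + (predicate,))

    def provider(self, *names: str) -> "Filter":
        return self._where(lambda conv: conv.provider in names)

    def contains(self, text: str) -> "Filter":
        return self._where(lambda conv: text.lower() in conv.text.lower())

    def limit(self, n: int) -> "Filter":
        return replace(self, max_items=n)

    async def list(self):
        # Terminal method: only here is the chain evaluated.
        matches = [c for c in self.source if all(p(c) for p in self.predicates)]
        return matches[: self.max_items]

    async def count(self) -> int:
        return len(await self.list())


ARCHIVE = (
    Conversation("chatgpt:1", "chatgpt", "Error handling in Go"),
    Conversation("claude:1", "claude", "Error handling in Rust"),
    Conversation("claude:2", "claude", "Sourdough starter tips"),
)

found = asyncio.run(
    Filter(ARCHIVE).provider("claude").contains("error handling").limit(10).list()
)
print([conv.id for conv in found])  # ['claude:1']
```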
Full library API documentation →
Polylogue includes a complete demo system for exploring features without real data.
Create a full demo environment — synthetic database with realistic conversations from all providers:
# Interactive — prints env vars and instructions
polylogue demo --seed
# Shell integration — eval sets env vars in current shell
eval $(polylogue demo --seed --env-only)
# Customize
polylogue demo --seed -p chatgpt,claude -n 10

The seeded environment runs through the real pipeline (acquire → parse → render → index), so the demo exercises the same code paths as production.
Write raw provider-format files (JSON, JSONL) to disk for inspection:
polylogue demo --corpus # All providers, 3 each
polylogue demo --corpus -p chatgpt -n 5 # ChatGPT only, 5 files
polylogue demo --corpus -o /tmp/corpus # Custom output directory

Useful for inspecting wire formats, testing parser changes, or generating fixture data.
Exercise the entire CLI surface area (58 exercises across 7 groups) and generate a verification report:
polylogue demo --showcase # Full validation
polylogue demo --showcase --live # Read-only against real data
polylogue demo --showcase --json # Machine-readable report
polylogue demo --showcase --verbose # Print each exercise output

The showcase seeds a workspace and runs every query mode, output format, filter combination, and mutation. It then produces a summary report, JSON results, and a markdown cookbook of all commands with their output.
Zero-config by default. Polylogue follows the XDG Base Directory specification:
| Path | Purpose |
|---|---|
| ~/.local/share/polylogue/polylogue.db | SQLite database |
| ~/.local/share/polylogue/inbox/ | Drop exports here |
| ~/.local/share/polylogue/render/ | Rendered output |
| ~/.config/polylogue/ | OAuth credentials |
Environment variables:
| Variable | Purpose |
|---|---|
| POLYLOGUE_ARCHIVE_ROOT | Custom database location |
| POLYLOGUE_RENDER_ROOT | Custom render output |
| VOYAGE_API_KEY | Voyage AI key for semantic search |
| POLYLOGUE_FORCE_PLAIN | Force non-interactive output |
| POLYLOGUE_LOG | Log level: error, warn, info, debug |
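
The defaults and overrides above can be approximated with a short resolver. This sketch follows the XDG rules (`XDG_DATA_HOME` defaults to ~/.local/share, `XDG_CONFIG_HOME` to ~/.config). It assumes that `POLYLOGUE_ARCHIVE_ROOT` names the directory holding polylogue.db and inbox/, which the table does not spell out, and `resolve_paths` is a hypothetical helper rather than Polylogue's own code.

```python
import os
from pathlib import Path
from typing import Dict, Mapping


def resolve_paths(env: Mapping[str, str] = os.environ) -> Dict[str, Path]:
    """Resolve Polylogue-style paths from an environment mapping."""
    home = Path(env.get("HOME") or Path.home())
    data_home = Path(env.get("XDG_DATA_HOME") or home / ".local" / "share")
    config_home = Path(env.get("XDG_CONFIG_HOME") or home / ".config")
    # Assumption: the archive root is a directory, not the database file itself.
    archive_root = Path(env.get("POLYLOGUE_ARCHIVE_ROOT") or data_home / "polylogue")
    render_root = Path(env.get("POLYLOGUE_RENDER_ROOT") or data_home / "polylogue" / "render")
    return {
        "database": archive_root / "polylogue.db",
        "inbox": archive_root / "inbox",
        "render": render_root,
        "config": config_home / "polylogue",
    }


for name, path in resolve_paths({"HOME": "/home/demo"}).items():
    print(f"{name}: {path}")
```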
Full configuration documentation →
| Document | Description |
|---|---|
| CLI Reference | Complete command reference with tips and examples |
| Library API | Python API — filter chains, projections, async patterns |
| Data Model | Conversation / Message / Attachment schemas |
| Configuration | XDG paths, environment variables, observability |
| Architecture | System design, layers, data flow, thread safety |
| MCP Integration | Model Context Protocol server for Claude Desktop/Code |
| Demo & Showcase | Demo command, synthetic data, surface-area validation |
| Providers | Provider formats, detection, session integration |
| Internals | Developer reference — invariants, schemas, debugging |
git clone https://github.com/sinity/polylogue && cd polylogue
# Enter dev environment
nix develop # Nix (recommended)
# or
uv sync # uv
# Run tests
pytest -q # Quick run (4200+ tests)
pytest --cov=polylogue # With coverage (90% minimum enforced)
# Lint & type check
ruff check polylogue/ tests/
mypy polylogue/

See CLAUDE.md for development guidelines, docs/internals.md for implementation details, and demos/ for screencast generation.



