← Back to README

CLI Reference

Invocation Modes

polylogue [QUERY...] [FILTERS...] [OUTPUT...]    # Query mode (default)
polylogue run [OPTIONS...]                        # Run pipeline (ingest → render → index)
polylogue embed [OPTIONS...]                      # Generate vector embeddings
polylogue tags [OPTIONS...]                       # List tags with counts
polylogue site [OPTIONS...]                       # Build static HTML archive
polylogue sources [OPTIONS...]                    # List configured sources
polylogue dashboard                               # Launch TUI dashboard
polylogue mcp                                     # MCP server mode
polylogue check                                   # Health check
polylogue qa                                      # Composable QA (audit, exercises, invariants)
polylogue generate                                # Synthetic data generation
polylogue auth                                    # OAuth flow (Drive)
polylogue reset                                   # Reset database/state
polylogue completions --shell SHELL               # Generate shell completions

Query Mode

Query mode is the default. Running polylogue without arguments shows archive statistics.

Query Syntax

polylogue "error"                    # FTS search (smartcase: lowercase=insensitive)
polylogue "error" "python"           # AND: both terms required
polylogue "Error"                    # Case-sensitive (has uppercase)

Positional arguments are implicit --contains (FTS). Multiple positional args are ANDed.
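The smartcase and AND rules above can be sketched in a few lines of Python. This is an illustration only: it uses plain substring matching rather than SQLite FTS5 tokenization, and the function names `is_case_sensitive` and `matches` are hypothetical, not part of Polylogue.

```python
from __future__ import annotations


def is_case_sensitive(term: str) -> bool:
    """Smartcase: a term is case-sensitive only if it contains an uppercase letter."""
    return any(ch.isupper() for ch in term)


def matches(text: str, terms: list[str]) -> bool:
    """Return True when every term occurs in text (multiple terms are ANDed)."""
    lowered = text.lower()
    for term in terms:
        haystack = text if is_case_sensitive(term) else lowered
        if term not in haystack:
            return False
    return True
```

With this sketch, `matches("Python Error log", ["error", "python"])` is true, while `matches("python error log", ["Error"])` is false because the uppercase `E` makes the term case-sensitive.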

Filters

| Flag | Short | Description |
| --- | --- | --- |
| --contains TEXT | -c | FTS term (repeatable = AND) |
| --exclude-text TEXT | | Exclude FTS term (repeatable) |
| --provider NAME,... | -p | Include providers (comma = OR) |
| --exclude-provider NAME,... | | Exclude providers |
| --tag TAG,... | -t | Include tags (comma = OR, supports key:value) |
| --exclude-tag TAG,... | | Exclude tags |
| --title TEXT | | Title contains |
| --has TYPE,... | | Has: thinking, tools, summary, attachments |
| --since DATE | | After date (today, yesterday, "last week", 2025-01-01) |
| --until DATE | | Before date |
| --id PREFIX | -i | ID prefix match |
| --limit N | -n | Max results |
| --latest | | Most recent (= --sort date --limit 1) |
| --sort FIELD | | Sort by: date (default), tokens, messages, words, longest, random |
| --reverse | | Reverse sort order |
| --sample N | | Random sample of N conversations |

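For structured fields (provider, tag), comma-separated values and repeated flags are ORed within the same field; different fields are ANDed. FTS terms (--contains and positional arguments) are always ANDed.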
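As a rough sketch of how structured filters combine, the following Python treats each field as a set of accepted values: any overlap within a field passes (OR), and every field must pass (AND). The data shapes and the names `parse_flag_values` and `conversation_matches` are assumptions made for illustration.

```python
from __future__ import annotations


def parse_flag_values(raw_values: list[str]) -> set[str]:
    """Merge repeated flags and comma-separated values into one OR set."""
    merged: set[str] = set()
    for raw in raw_values:
        merged.update(part.strip() for part in raw.split(",") if part.strip())
    return merged


def conversation_matches(conv: dict[str, set[str]], filters: dict[str, set[str]]) -> bool:
    """OR within a field, AND across fields."""
    for field, wanted in filters.items():
        if not (conv.get(field, set()) & wanted):
            return False
    return True
```

Under these assumptions, `-p claude-ai,chatgpt -t important -t review` would become `{"provider": {"claude-ai", "chatgpt"}, "tag": {"important", "review"}}`, and a conversation passes if it has one of the providers and at least one of the tags.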

Output

| Flag | Short | Description |
| --- | --- | --- |
| --output DEST,... | -o | Output destinations: browser, clipboard, stdout (default: stdout) |
| --format FMT | -f | Format: markdown (default), json, html, obsidian, org, yaml, plaintext, csv |
| --fields FIELD,... | | Select fields for list/json: id, title, provider, date, messages, words, tags, summary |
| --list | | Force list format (even for single result) |
| --stats | | Only statistics, no content |
| --count | | Print matched count and exit |
| --stats-by DIM | | Aggregate statistics by dimension: provider, month, year, day |
| --open | | Open result in browser/editor |
| --transform XFORM | | Transform output: strip-tools, strip-thinking, strip-all |
| --stream | | Stream output (low memory, requires --latest or -i ID) |
| --dialogue-only | -d | Show only user/assistant messages |

Smart defaults:

  • No query → show stats
  • Single result → show content
  • Multiple results → show list
  • --output browser → always HTML
  • Content to non-stdout → stats printed to stdout
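The smart defaults can be read as a small decision function. The sketch below is one possible reading, not Polylogue's actual code: `has_query` is assumed to mean that any query term or filter was given, `result_count` is assumed to be at least 1 when it is, and both function names are hypothetical.

```python
from __future__ import annotations


def choose_view(has_query: bool, result_count: int, force_list: bool = False) -> str:
    """Pick what to show, following the smart defaults listed above."""
    if force_list:
        return "list"       # --list overrides the single-result default
    if not has_query:
        return "stats"      # no query: show archive statistics
    if result_count == 1:
        return "content"    # single result: show the conversation
    return "list"           # multiple results: show a list


def choose_format(requested: str | None, outputs: list[str]) -> str:
    """--output browser always renders HTML; otherwise markdown is the default."""
    if "browser" in outputs:
        return "html"
    return requested or "markdown"
```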

Multiple outputs: --output browser,clipboard performs both actions. Content is rendered once, sent to multiple destinations.

Clipboard behavior:

  • Single conversation: Full markdown content copied
  • Multiple conversations (with --list or when query returns many): Each conversation separated by --- delimiter
  • Format respects --format flag (markdown default, or json)

--stats output (for filtered results):

Matched: 12 conversations

Messages: 847 total (234 user, 421 assistant)
Words: 45,231
Thinking: 89 traces
Tool use: 156 calls
Attachments: 23
Date range: 2025-01-18 to 2025-01-24

--stats-by output (replaces individual --by-* flags):

polylogue --stats-by month                        # Activity histogram by month
polylogue -p claude-ai --stats-by provider        # Provider breakdown for Claude
polylogue --stats-by year                         # Year-by-year overview
polylogue --since 2025-01 --stats-by day          # Daily breakdown

--stream mode: Streams messages to stdout one at a time for constant memory usage on large conversations. Supports --dialogue-only to filter to user/assistant messages. Output format is controlled via --format (plaintext, markdown, or json for JSON Lines).
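When the stream is emitted as JSON Lines, a consumer can process one message at a time without loading the whole conversation. The sketch below counts messages per role; the `role` key is an assumed field name, since the message schema is not described here.

```python
from __future__ import annotations

import json
from typing import Iterable, Iterator


def iter_messages(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed message per non-empty JSON Lines row."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by_role(lines: Iterable[str]) -> dict[str, int]:
    """Tally messages by their (assumed) "role" field, one line at a time."""
    counts: dict[str, int] = {}
    for message in iter_messages(lines):
        role = message.get("role", "unknown")
        counts[role] = counts.get(role, 0) + 1
    return counts
```

The `lines` argument can be any iterable of strings, for example `sys.stdin` when the script sits at the end of a pipe fed by `polylogue --latest --stream --format json`.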

--transform options:

| Transform | Effect |
| --- | --- |
| strip-tools | Remove tool call/result messages |
| strip-thinking | Remove thinking/reasoning traces |
| strip-all | Remove both tools and thinking |
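Conceptually, each transform is a filter over the message list. The following sketch assumes every message carries a `kind` field with values such as `"tool"` or `"thinking"`; that field and the `apply_transform` helper are hypothetical and only illustrate the table above.

```python
from __future__ import annotations

# Message kinds dropped by each transform (kind names are assumptions).
TRANSFORMS: dict[str, set[str]] = {
    "strip-tools": {"tool"},
    "strip-thinking": {"thinking"},
    "strip-all": {"tool", "thinking"},
}


def apply_transform(messages: list[dict], name: str) -> list[dict]:
    """Return the messages that survive the named transform, in order."""
    dropped = TRANSFORMS[name]
    return [m for m in messages if m.get("kind") not in dropped]
```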

Modifiers (Write Operations)

# Metadata (unified k:v storage)
polylogue -i abc123 --set title "My Custom Title"
polylogue -i abc123 --set summary "Brief description..."
polylogue -i abc123 --set priority high            # Custom metadata key
polylogue -i abc123 --add-tag important,project:foo
polylogue -i abc123 --delete                       # Remove from archive

# Bulk operations with safety
polylogue "urgent" --add-tag review --dry-run      # Preview changes
polylogue -p old --delete --dry-run                # Preview deletions
polylogue -p old --delete --force                  # Skip confirmation

Metadata: Title, summary, and tags are stored as unified k:v metadata. Custom keys are allowed. Access via --fields or filter with --has.

--delete safety: Requires at least one filter flag (-i, -p, -t, --since, etc.). Cannot delete entire archive without explicit filter.

--dry-run: Shows what would be changed without executing. Works with --add-tag, --set, and --delete.

--force: Skips confirmation prompts for bulk operations (more than 10 conversations).
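The three safety rules interact in a fixed order: a missing filter is rejected first, a dry run never writes, and only then does the bulk threshold decide whether to prompt. The sketch below models that order; the `plan_delete` name and the assumption that operations on 10 or fewer conversations run without a prompt are illustrative, not confirmed behavior.

```python
from __future__ import annotations

BULK_THRESHOLD = 10  # "more than 10 conversations" counts as a bulk operation


def plan_delete(filter_flags: list[str], matched: int,
                dry_run: bool = False, force: bool = False) -> str:
    """Return "preview", "confirm", or "execute" for a --delete request."""
    if not filter_flags:
        # --delete refuses to run against the whole archive.
        raise ValueError("--delete requires at least one filter flag")
    if dry_run:
        return "preview"
    if matched > BULK_THRESHOLD and not force:
        return "confirm"
    return "execute"
```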

List output format:

  ID (24 chars)             DATE        [PROVIDER    ]  TITLE (MSG COUNT)
  claude-ai:a8f2c3d4e5f6... 2025-01-24  [claude-code ]  Debugging OAuth (42 msgs)
  chatgpt:b9d8e7f6a5...     2025-01-23  [chatgpt     ]  Python patterns (18 msgs)

ID prefix matching: If prefix is ambiguous (matches multiple), error with list of matches. Use longer prefix to disambiguate.
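Prefix resolution with an ambiguity error might look like the sketch below. The exception class and function name are hypothetical; only the behavior (unique match succeeds, multiple matches are listed in the error) follows the description above.

```python
from __future__ import annotations


class AmbiguousPrefix(Exception):
    """Raised when a prefix matches more than one conversation ID."""

    def __init__(self, prefix: str, candidates: list[str]) -> None:
        super().__init__(f"prefix {prefix!r} is ambiguous: {', '.join(candidates)}")
        self.candidates = candidates


def resolve_prefix(prefix: str, ids: list[str]) -> str:
    """Return the single ID starting with prefix, or raise."""
    candidates = sorted(i for i in ids if i.startswith(prefix))
    if not candidates:
        raise KeyError(prefix)
    if len(candidates) > 1:
        raise AmbiguousPrefix(prefix, candidates)
    return candidates[0]
```

A caller that catches `AmbiguousPrefix` can print `candidates` so the user can retry with a longer prefix.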

Run Mode

polylogue run                             # Run pipeline on all sources
polylogue run --source claude-ai          # Run only the claude-ai source
polylogue run --preview                   # Preview counts, confirm before writing
polylogue run --stage parse               # Run only parse stage
polylogue run --stage all                 # Run all stages (default)
polylogue run --format markdown           # Render as Markdown (default: html)
polylogue run --watch                     # Watch sources and sync continuously
polylogue run --watch --notify            # Desktop notification on new conversations
polylogue run --watch --exec "echo new"   # Execute command on new conversations
polylogue run --watch --webhook URL       # Call webhook on new conversations

Pipeline stages: acquire, validate, parse, render, index, generate-schemas. By default all stages run. The parse stage consumes only rows marked validation_status=passed|skipped.

Source scoping: Use --source NAME (repeatable) to process only specific sources. Use polylogue sources to list available sources.

Deduplication: Conversations are identified by content hash (SHA-256 of normalized content). Re-importing the same conversation is a no-op. Modified conversations (same provider ID, different content) update the existing record.
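The deduplication rule can be illustrated with a small in-memory store keyed by provider ID. The normalization step here (JSON with sorted keys and compact separators) is an assumption, since the actual normalization is not specified; `content_hash` and `upsert` are hypothetical names.

```python
from __future__ import annotations

import hashlib
import json


def content_hash(conversation: dict) -> str:
    """SHA-256 over a normalized serialization (normalization is assumed)."""
    normalized = json.dumps(conversation, sort_keys=True,
                            separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def upsert(store: dict[str, str], provider_id: str, conversation: dict) -> str:
    """Insert, update, or skip based on the stored hash for provider_id."""
    digest = content_hash(conversation)
    existing = store.get(provider_id)
    if existing == digest:
        return "unchanged"  # re-import of identical content is a no-op
    store[provider_id] = digest
    return "inserted" if existing is None else "updated"
```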

Partial failures: Pipeline continues on individual file failures, reports errors at end. Exit code 0 if any files succeeded, non-zero only if all failed.

Embed Mode

polylogue embed                           # Embed all unembedded conversations
polylogue embed -c <id>                   # Embed specific conversation
polylogue embed --model voyage-4-large    # Use larger model
polylogue embed --rebuild                 # Re-embed everything
polylogue embed --stats                   # Show embedding statistics
polylogue embed -n 50                     # Limit to 50 conversations

Generates vector embeddings using Voyage AI, stored in sqlite-vec for semantic search. Requires VOYAGE_API_KEY environment variable.

Tags

polylogue tags                            # List all tags with counts
polylogue tags -p claude-ai               # Tags for Claude conversations only
polylogue tags --json                     # Machine-readable output
polylogue tags -n 10                      # Top 10 tags

Site

polylogue site                            # Build static HTML site
polylogue site -o ./public                # Build to custom directory
polylogue site --title "My Archive"       # Custom site title
polylogue site --no-search                # Disable search index
polylogue site --search-provider lunr     # Use lunr.js instead of pagefind
polylogue site --no-dashboard             # Skip dashboard page

Generates a browsable static HTML site with index pages, per-provider views, dashboard with statistics, and client-side search.

Sources

polylogue sources                         # List configured sources
polylogue sources --json                  # JSON output

Dashboard

polylogue dashboard                       # Launch TUI dashboard

Opens the Textual-based TUI (Mission Control) for interactive browsing.

Generate

Generate synthetic conversations for testing and exploration:

polylogue generate                          # Raw provider-format files
polylogue generate -p chatgpt -n 5          # ChatGPT only, 5 conversations
polylogue generate -o /tmp/corpus           # Custom output directory
polylogue generate --seed                   # Full demo environment
eval "$(polylogue generate --seed --env-only)" # Shell-friendly: load the exports into the current shell

Options:

| Flag | Short | Description |
| --- | --- | --- |
| --provider NAME | -p | Providers to include (repeatable, default: all) |
| --count N | -n | Conversations per provider (default: 3) |
| --output-dir PATH | -o | Output directory |
| --seed | | Run pipeline to produce usable demo environment |
| --env-only | | Print export statements only (requires --seed) |

QA

Composable quality assurance: schema audit, CLI exercises, and invariant checks.

polylogue qa                                # Full synthetic QA
polylogue qa --live                         # Exercises against real data
polylogue qa --source inbox                 # Fresh workspace from inbox data
polylogue qa --only audit                   # Schema audit only
polylogue qa --only exercises --tier 0      # Tier-0 smoke test
polylogue qa --skip invariants              # Skip invariant checks
polylogue qa --snapshot release-v3          # QA + archive results
polylogue qa --snapshot-from ./qa_outputs   # Archive existing directory

Options:

| Flag | Description |
| --- | --- |
| --synthetic / --live | Data source (default: synthetic) |
| --source NAME | Specific real source(s) in fresh workspace (repeatable) |
| --fresh | Isolated temp workspace (default for synthetic) |
| --workspace DIR | Reuse a specific workspace directory |
| --ingest | Run ingestion pipeline (auto for synthetic/fresh) |
| --schemas | Regenerate schemas during pipeline |
| --only STAGE | Run only: audit, exercises, or invariants |
| --skip STAGE | Skip stage (repeatable) |
| --tier N | Exercise tier filter (0/1/2) |
| --fail-fast | Stop on first exercise failure |
| --report-dir DIR | Artifact directory |
| --json | Machine-readable output |
| --verbose | Print exercise outputs |
| --snapshot [LABEL] | Archive results after QA |
| --snapshot-from DIR | Archive existing directory (skip QA) |

Other Modes

polylogue mcp                             # Start MCP server (stdio)
polylogue check                           # Health check (DB, index, stats)
polylogue check --verbose                 # Show breakdown by provider
polylogue check --repair                  # Fix issues that can be auto-fixed
polylogue check --repair --preview        # Preview what would be repaired
polylogue check --repair --vacuum         # Compact database after repair
polylogue check --deep                    # Run SQLite integrity check
polylogue check --json                    # Machine-readable output
polylogue check --schemas                 # Raw-corpus schema verification gate
polylogue check --schemas --schema-provider chatgpt --schema-samples all
polylogue check --schemas --schema-provider claude-code --schema-record-limit 500 --schema-record-offset 1000
polylogue qa                              # Full synthetic QA (audit → exercises → invariants)
polylogue qa --live                       # QA against real data
polylogue qa --only audit --json          # Schema audit with JSON output
polylogue auth                            # OAuth flow for Google Drive
polylogue auth --refresh                  # Force token refresh
polylogue auth --revoke                   # Revoke stored credentials
polylogue reset --database                # Delete SQLite database
polylogue reset --render                  # Delete rendered outputs
polylogue reset --cache                   # Delete search indexes
polylogue reset --auth                    # Delete OAuth tokens
polylogue reset --all                     # Reset everything
polylogue reset --all --yes               # Non-interactive reset

polylogue qa --snapshot writes snapshots to <archive_root>/qa/snapshots/<timestamp>-<label> with manifest.json (hashes + metadata), INDEX.md, and a best-effort latest symlink.

Global Flags

polylogue --version                       # Version
polylogue --plain                         # Force non-interactive plain output
polylogue -v                              # Verbose output
polylogue -h / --help                     # Help

Shell Completions

Generate and install completions:

# Fish
polylogue completions --shell fish > ~/.config/fish/completions/polylogue.fish

# Zsh
polylogue completions --shell zsh > ~/.zfunc/_polylogue

# Bash
polylogue completions --shell bash > /etc/bash_completion.d/polylogue

Technical Details

FTS (Full-Text Search):

  • SQLite FTS5 with default tokenizer
  • Smartcase: all-lowercase query → case-insensitive; contains uppercase → case-sensitive
  • Supports phrase queries with quotes: "exact phrase"

Date parsing: Uses dateparser library. Supports:

  • ISO format: 2025-01-15, 2025-01-15T10:30:00
  • Relative: today, yesterday, "last week", "2 days ago", "last month"
  • Natural language: "January 15", "Jan 2025"

Exit codes:

| Code | Meaning |
| --- | --- |
| 0 | Success |
| 1 | General error (invalid args, config error) |
| 2 | No results found (for queries) |
| 3 | Partial failure (some items failed in sync) |
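Scripts that wrap the CLI can branch on these codes. The sketch below maps each documented code to a label and runs a query through `subprocess`; the helper names are hypothetical, and `run_query` assumes a `polylogue` executable is on `PATH`.

```python
from __future__ import annotations

import subprocess

# Labels for the documented exit codes.
EXIT_MEANINGS = {
    0: "success",
    1: "general error",
    2: "no results",
    3: "partial failure",
}


def describe_exit(code: int) -> str:
    """Translate an exit code into a short label."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def run_query(args: list[str]) -> tuple[str, str]:
    """Run polylogue with args and return (exit label, captured stdout)."""
    completed = subprocess.run(["polylogue", *args], capture_output=True, text=True)
    return describe_exit(completed.returncode), completed.stdout
```

For example, `run_query(["error", "--count"])` would return a `"no results"` label when nothing matches, which lets a caller distinguish an empty result from a real failure.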

Terminal output:

  • Colors enabled by default on TTY, respects NO_COLOR env var
  • Rich-formatted output (tables, panels) when interactive; plain text when piped
  • Set POLYLOGUE_FORCE_PLAIN=1 to force plain output, or use --plain
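The rules above amount to two small checks, one for plain output and one for color. This sketch is an approximation of that logic rather than Polylogue's implementation; the function names and the injectable `env` and `stream` parameters are assumptions added to make it testable.

```python
from __future__ import annotations

import os
import sys
from typing import Mapping, Optional


def use_plain_output(plain_flag: bool = False,
                     env: Optional[Mapping[str, str]] = None,
                     stream=None) -> bool:
    """Plain output when forced by flag or env var, or when not on a TTY."""
    env = os.environ if env is None else env
    stream = sys.stdout if stream is None else stream
    if plain_flag or env.get("POLYLOGUE_FORCE_PLAIN") == "1":
        return True
    return not stream.isatty()


def use_color(env: Optional[Mapping[str, str]] = None, stream=None) -> bool:
    """Color only on a TTY, and never when NO_COLOR is set to a non-empty value."""
    env = os.environ if env is None else env
    stream = sys.stdout if stream is None else stream
    if env.get("NO_COLOR"):
        return False
    return stream.isatty()
```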

Logging: Set POLYLOGUE_LOG=debug for verbose logging to stderr. Levels: error, warn, info, debug.

Tips & Recipes

Polylogue switches to plain mode automatically when stdout/stderr are not TTYs. Use --interactive on a TTY to force prompts, or set POLYLOGUE_FORCE_PLAIN=1 in CI.

Drive Auth

  • Provide an OAuth client JSON at ~/.config/polylogue/polylogue-credentials.json or set POLYLOGUE_CREDENTIAL_PATH.
  • Tokens are stored at ~/.config/polylogue/token.json (or POLYLOGUE_TOKEN_PATH).
  • Drive auth requires --interactive for the browser authorization code.
  • Run polylogue auth to initiate the OAuth flow.

Source Scoping

  • Use --source NAME (repeatable) on run to avoid reprocessing everything.
  • Use --source last to reuse the previous interactive selection.
  • Example: polylogue run --source gemini --stage validate.

Preview Mode

  • Use polylogue run --preview to preview counts without writing.

Search Defaults

  • Interactive runs open a picker when multiple results are returned, then open the selection.
  • Omitting the query opens the latest render in interactive mode and prints the path in plain mode.
  • Use --list to force the full list output (no picker or auto-open).
  • Use --open to open the newest render without searching.
  • Use --verbose to include snippets in list output.

Index Rebuild

  • Search automatically rebuilds the index if it is missing.

Path Debugging

  • Use polylogue run --preview to confirm resolved sources and output paths.
  • Use POLYLOGUE_RENDER_ROOT to override render output without editing config.

Health Checks

  • polylogue check verifies database integrity, FTS index, and render files.
  • polylogue check --repair fixes issues that can be auto-fixed.
  • polylogue check --vacuum compacts the database and reclaims space.
  • polylogue check --schemas runs non-mutating schema verification over raw_conversations.
  • Use --schema-provider to scope providers, --schema-samples for per-record sample depth (N or all), and --schema-record-limit/--schema-record-offset for chunked verification on large corpora.

Examples

# Statistics
polylogue

# Search
polylogue "OAuth bug"
polylogue "error" "python" -p claude-ai,chatgpt

# Filter and output
polylogue -p claude-ai --has thinking --output browser
polylogue --latest --output browser,clipboard

# Sorting and sampling
polylogue --sort tokens --reverse --limit 10     # Longest conversations
polylogue --sort random --limit 5                # Random 5
polylogue --sample 10                            # Random sample of 10

# Field selection
polylogue -p claude-ai --fields id,title,tokens --format json

# Aggregation
polylogue --stats-by month                       # Activity by month
polylogue -p claude-ai --stats-by provider       # Provider breakdown

# Exclusions
polylogue "error" --exclude-text "warning" --exclude-provider gemini
polylogue -t important --exclude-tag archived

# Transforms
polylogue --latest --transform strip-tools       # Clean output without tool calls
polylogue -i abc123 --dialogue-only              # Just the conversation

# Streaming (memory-efficient for large conversations)
polylogue --latest --stream                      # Stream most recent
polylogue -i abc123 --stream -d                  # Stream dialogue only

# Count
polylogue -p claude-ai --count                   # Quick count

# Metadata
polylogue -i abc123 --set title "The OAuth Fix"
polylogue -i abc123 --set summary "Fixed OAuth by..."
polylogue -i abc123 --add-tag project:polylogue,important
polylogue --tag project:polylogue --list

# Bulk operations with safety
polylogue -p claude-ai --since "last month" --add-tag review --dry-run
polylogue -p old --delete --dry-run

# Run pipeline
polylogue run
polylogue run --source claude-ai
polylogue run --preview
polylogue run --watch --notify

# Embeddings
polylogue embed
polylogue embed --stats

# Static site
polylogue site -o ./public

# Maintenance
polylogue check --repair --vacuum

See also: Library API · Data Model · Configuration · MCP Integration