diff --git a/AGENTS.md b/AGENTS.md index d41b2a2..cabdddc 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -8,394 +8,4 @@ Stack: Rust · SQLite · fastembed (ONNX, in-process) · Claude Code `PreToolUse Reduces token usage 60–90% by intercepting large reads and replacing them with focused context. -## Build & Install - -```bash -cargo build -cargo build --release -cargo install --path . # installs to ~/.cargo/bin -tokenix --help -``` - -## Key Files - -| File | Purpose | -|---|---| -| `src/main.rs` | CLI entry (clap), command dispatch, `install-hook`/`remove-hook` helpers (including Antigravity global install/uninstall through `agy plugin`, repo-local OpenCode native `opencode.json` MCP registration/removal, and local `.agents/plugins/tokenix`), `install-binary` (copies the running exe to `global_bin_dir()` — `%LOCALAPPDATA%\tokenix\bin` / `~/.local/bin` — and persists the Windows user PATH via PowerShell `[Environment]::SetEnvironmentVariable`, never `setx`). `banner()` = neon "tokenix" wordmark + tagline; `help_catalog()` = audience-grouped command list (AI agent vs human) + examples, wired via custom `HELP_TEMPLATE` (`before_help`/`after_help`); bare `tokenix` prints this help | -| `src/chunker.rs` | Symbol-aware heuristic chunking, `generate_outline()`, token counting. Tree-sitter for Rust/Python/TS/JS/Go/C++; `chunk_by_symbol_lines()` line-scanning chunkers for grammar-less languages — VB6/VBA (`Sub`/`Function`/`Property`/`Attribute VB_Name`) and SQL (`CREATE [OR REPLACE] `) | -| `src/embed.rs` | fastembed ONNX — `embed_documents()`, `embed_query()`. Model **registry** (`MODELS`, `spec_for`) + thread-local active model (`set_active_model`/`active_model_id`) + per-id loaded-model cache. 
Per-model query/doc prefixes; query cache keyed by model | -| `src/store.rs` | SQLite schema, CRUD, cosine similarity search (int8-quantized vectors + legacy f32 fallback, `quantize_q8`/`backfill_quantized_embeddings`), import graph (`graph_imports`, `file_imports`), hook log I/O + 5 MB rotation, PID index lock, branch-aware DB paths | -| `src/indexer.rs` | File walk + incremental index pipeline. Runs at below-normal OS priority (`lower_process_priority()`, opt-out `--no-low-priority`/`TOKENIX_FOREGROUND`). `decode_text()` handles UTF-16 BOMs (SSMS-saved `.sql`) and skips binary files (NUL in first 8 KiB). Embeds in batches (default 16) with a progress bar; each batch commits to the embedding cache so a killed run resumes via cache hits | -| `src/query.rs` | Hybrid semantic/lexical ranking (FTS5 + BM25 + RRF), strict `context` modes, budget enforcement, cross-project search | -| `src/pack.rs` | `tokenix pack` — budgeted repo map + focused context, changed-file packs, token maps, and safety report | -| `src/graph.rs` | Symbol graph with PageRank, cycle detection (Tarjan's SCC, homonym-filtered, `path:line`-annotated), tree-sitter references, incremental repair (`update_symbol_graph_incremental` — FTS-narrowed inbound-edge restore; `rebuild-graph` = full escape hatch), file-level import graph (`rebuild_import_graph`, per-language import extraction + path resolution), HTML + Mermaid export. Repo-wide overview (`tokenix graph`): `repo_hotspots` (degree + transitive-dependent blast radius, trivial-symbol filtered), `format_repo_report` (god nodes / bottlenecks / blast-radius leaders), `format_edges_dot` (Graphviz of the top subgraph) | -| `src/artifacts.rs` | Context artifacts — index non-code files (schemas, API specs, docs) via `.tokenix/artifacts.json` | -| `src/hook.rs` | `run_hook()` — called by PreToolUse hook. Tries daemon first for Grep. 
Thresholds (Read 200 lines / Grep 3 words) overridable via `[hook]` in `.tokenix.toml` (`read_min_lines`, `grep_min_words`) | -| `src/daemon.rs` | Background TCP server (port 47392). Holds model + int8-quantized embedding cache (LRU, max 3 projects, content cap 1000). Bounded to 4 handler threads. Protocol: `search`/`health`/`status`; CLI `tokenix daemon status\|stop\|restart` | -| `src/compress.rs` | Legacy `PostToolUse` compatibility compression + `tokenix run` command-output compression: ANSI strip, emoji removal, blank-line collapse, repeat grouping, JSON compaction, cargo/git-log heuristics. `tokenix run` only applies command-specific filters to stderr when `filter_stderr=true`; otherwise stderr uses safe generic compression so errors are not turned into success sentinels | -| `src/filters.rs` | `FilterDef` (TOML schema), active filter listing, `load_user_filters()`, `load_bundled_filters()` (rust-embed), `apply_filter()`. `find_filter()` matches via `derive_command_candidates()`, which unwraps shell runners, strips `cd`/env prefixes, and `split_on_operators()` splits compound commands quote-aware on `&&`/`\|\|`/`;`/`\|` so anchored `match_command` patterns match a base command in any segment/position | -| `src/cmd_filter.rs` | `tokenix filter list/active/generate` + `filter record start/stop/status` subcommands. `generate` prefers `recordings::read_samples` over a re-run, invokes a detected AI CLI, and saves to `~/.tokenix/filters/`; reused by the TUI Studio tab as a foreground drop-out | -| `src/tui.rs` | Interactive ratatui shell shown by a bare `tokenix` / `tokenix filter` in a TTY (else falls back to help / `filter list`). 
Tab bar (`←`/`→`): **Stats** dashboard (wordmark + version + hook status + index summary, with selectable Index / Install hooks / Install binary actions — Index runs in the foreground with live progress, the two install actions confirm before writing; Install binary self-execs `tokenix install-binary`), **Filters** (3-pane groups · filters · live `apply_filter` input→output preview with a `chunker::count_tokens` gauge line showing `X → Y tokens · % saved` between the panes), **Studio** (surfaces the record→preview→generate filter loop: `r`/`s` arm/stop a `recordings::start`/`stop` session, left column is a unified candidate list from `cmd_filter::suggest_filters` — recordings unioned with the tokens-wasted ranking, badged `⚠` unfiltered sink (biggest waste first) / `✓` already filtered / `●` recorded-only — plus saved `~/.tokenix/filters/*.toml`, right pane previews a `recordings::read_samples` head with a live `apply_filter` before→after `chunker::count_tokens` delta when an active filter matches the base command; `g` sets `request_generate` to run `cmd_filter::cmd_filter_generate` as a foreground drop-out — same pattern as Index — then resumes the TUI; `x` deletes a saved filter with confirm; `Tab` switches pane), **Gain** (native colored render of `gain::compute_gain`: tokens-saved headline with ≈USD at the ★ reference model's input rate, savings-by-source split — semantic index vs command filters — and numbered by command / by project tables with share %, toggles `c`/`a`), **Usage** (self-exec captured `tokenix usage` via dynamic argv: `s` cycles daily/model/blocks/project/session, `a` toggles all-projects, `r` refresh), **Doctor**/**Tokenmap** (self-exec captured output), **Graph** (self-exec captured `tokenix graph` repo overview — god nodes / bottlenecks / blast radius; `r` refresh), **Secrets** (background-threaded `secrets_scan::scan_findings` with spinner; dedup by distinct value + count; `v` reveal, `c` copy raw value to system clipboard via 
`clip`/`pbcopy`/`wl-copy`/`xclip`/`xsel`, `x` write `[REDACTED]`), **Egress** (background-threaded `egress_scan::scan_findings` with the same 3-pane pattern as Secrets: groups · destinations · occurrence detail; `s` cycles host/rule/agent/file grouping; `r` rescans; host reputation colors: green safe, red dangerous, yellow unknown). Both Secrets and Egress open scoped to the current repo (cwd) and `g` toggles a global all-repos view; scoping filters the raw scan by each finding's attributed `repo` (`is_local` matches exact `cwd` paths plus Claude `~slug:`/Gemini `~dir:` fallback markers against the project root) | -| `src/ui.rs` | Shared terminal-UI vocabulary for human-facing CLI output (`box_header`, `bar`, `section`/`kv`, `format_num`, `table` via `tabled`); LLM/JSON output deliberately does not route through it | -| `src/gain.rs` | `compute_gain()`/`compute_global_gain()`, `GainStats` (incl. `index_saved`/`filter_saved` source split: empty `command` = semantic-index intercept, non-empty = command filter; pre-phase Bash/PowerShell rewrite markers are excluded from `filter_calls`), `MODELS` pricing table (Anthropic/OpenAI/Google, with `input`/`output`/`cache_read`/`cache_write` per-1M rates; `price_for` name/prefix match + `usage_cost` per-record helper reused by `tokenix usage`). Grep semantic intercepts are logged as neutral usage, not claimed savings, because native grep output is not measured before interception | -| `src/transcripts.rs` | Shared enumeration of local agent transcript files (`roots` per agent: Claude/Codex/Copilot/OpenAI, `transcript_files` walker). Single source of truth reused by `conversation-audit` and `usage` | -| `src/usage.rs` | `tokenix usage` — absolute token spend + ≈USD cost parsed from transcript `message.usage` blocks (input/output/cache read+write), deduped by `(message.id, requestId)`. 
Aggregates by `daily\|weekly\|monthly\|session\|model\|project`; rolling 5-hour `blocks` with burn rate + projection; month-end forecast; `--cost-mode auto\|calculate\|display`; `--statusline`; `--all-projects` scope; `--json` | -| `src/mcp.rs` | MCP server. `--profile full` exposes all tools; `--profile slim` exposes context/search/call meta-tools for progressive discovery | -| `src/mcp_audit.rs` | `tokenix prompt-audit` / `session-audit` — per-agent MCP config discovery (Claude, Codex, Copilot, OpenCode, Antigravity) + minimal synchronous MCP stdio client (`initialize`/`tools/list`) + token scoring/report | -| `src/secrets_scan.rs` | `tokenix scan-secrets` — gitleaks-style credential scan of Claude/Gemini/Copilot/Antigravity conversation transcripts under `~`; rules loaded from TOML (`assets/secret-rules/` bundled via `rust-embed`, extended by `/` then `~/.tokenix/secret-rules/*.toml`, later `id` wins), backtracking-free regex + entropy-gated generic rule. Each finding is attributed to its repo + git branch via the transcript line's `cwd`/`gitBranch` (Claude), falling back to the project dir slug. Report supports `--filter` (substring), `--group `, `--reveal` (raw values, default redacted), `--json`; exit 1 on hits. `scan_findings()` returns structured `ScanFinding`s (raw + redacted) for the TUI; `redact_in_files()` rewrites `[REDACTED]` over a value in text files (SQLite DBs skipped) | -| `src/egress_scan.rs` | `tokenix egress-audit` — scans Claude/Gemini/Copilot/Antigravity conversation transcripts for external DNS/IP destinations; bundled TOML rules live under `assets/egress-rules/`, local safe hosts are loaded from `~/.tokenix/safe-hosts.toml`, and local blocklist hosts from `~/.tokenix/dangerous-hosts.toml` (`dangerous`, `blocklist`, or `hosts` arrays); report supports `--filter`, `--group `, `--safe`, and `--json`. 
`scan_findings()` returns structured `EgressFinding`s for the TUI | -| `assets/filters/` | 386 TOML output filters embedded via `rust-embed`, each homologated with ≥2 golden `[[tests]]` cases (realistic success + failure-path inputs; the failure case must prove errors are never masked). 800 cases run through the real `apply_filter` pipeline in `bundled_filters_pass_embedded_golden_tests`; `verbose_real_output_compresses_at_least_70pct` proves ≥70% reduction on realistic verbose output and `match_command_resolves_many_invocation_variants` homologates wrapper/shell/global-opt command variants. User filters in `~/.tokenix/filters/` take priority | - -## SQLite Schema - -```sql -files(id, path TEXT UNIQUE, mtime REAL, content_hash TEXT) -chunks(id, file_id, path, start_line, end_line, symbol, kind, content, token_count) -chunks_fts(rowid, content, symbol, path) -- FTS5 virtual table for keyword search -embeddings(chunk_id PK, embedding BLOB, scale REAL) - -- scale NOT NULL → int8-quantized vector (1 byte/dim); scale NULL → legacy - -- float32 LE blob. Search branches per row; the scale cancels out of the - -- cosine, so q8 search needs only the raw bytes. Legacy rows are migrated - -- (re-encode only) by backfill_quantized_embeddings() at index time + VACUUM. -embedding_cache(content_hash PK, embedding BLOB, updated_at) -- stays float32 so - -- model switches and quantization changes never force a re-embed -graph_nodes(chunk_id PK, file_id, path, name, kind, start_line, end_line, rank) -graph_edges(id, caller_chunk_id, callee_chunk_id, reference, edge_kind) -graph_imports(id, source_path, target, resolved_path, kind, line) - -- file-level import edges; resolved_path NULL = external dependency -meta(key PK, value) -- 'indexed_at', git fingerprint -``` - -`meta` stores `indexed_at` and a Git fingerprint (worktree root + branch + HEAD). Hooks and `--if-stale` treat a different fingerprint as stale so branch switches don't reuse stale context. 
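The schema comment above says the scale cancels out of the cosine, so q8 search needs only the raw bytes. A minimal sketch of why, assuming a symmetric quantizer with one positive scale per vector (max-abs / 127). This is not the actual `quantize_q8` in `store.rs`; every function name below is hypothetical.

```rust
/// Hypothetical symmetric int8 quantizer: returns (bytes, scale).
/// Assumption: one positive scale per vector, chosen as max_abs / 127.
fn quantize_q8_sketch(v: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = v.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    if max_abs == 0.0 {
        return (vec![0; v.len()], 1.0);
    }
    let scale = max_abs / 127.0;
    let q: Vec<i8> = v
        .iter()
        .map(|x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Reverse the quantization (only needed here to demonstrate the cancellation).
fn dequantize_sketch(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|x| *x as f32 * scale).collect()
}

/// Cosine similarity on the raw int8 bytes; no scale involved.
fn cosine_q8(a: &[i8], b: &[i8]) -> f32 {
    let (mut dot, mut na, mut nb) = (0i64, 0i64, 0i64);
    for (x, y) in a.iter().zip(b) {
        let (x, y) = (*x as i64, *y as i64);
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0 || nb == 0 {
        return 0.0;
    }
    (dot as f64 / ((na as f64).sqrt() * (nb as f64).sqrt())) as f32
}

/// Plain float cosine, used as the reference.
fn cosine_f32(a: &[f32], b: &[f32]) -> f32 {
    let (mut dot, mut na, mut nb) = (0f64, 0f64, 0f64);
    for (x, y) in a.iter().zip(b) {
        let (x, y) = (*x as f64, *y as f64);
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0.0 || nb == 0.0 {
        return 0.0;
    }
    (dot / (na.sqrt() * nb.sqrt())) as f32
}
```

Each scale multiplies both the dot product and the matching norm, so it divides out: the cosine of the dequantized vectors equals the cosine of the raw bytes up to float rounding. Only the rounding to int8 introduces a small error relative to the original float vectors.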
- -Query paths open old DBs without running migrations — SELECTs must degrade when the `scale` column is missing (`embeddings_have_scale()` probe selects `NULL` instead). - -Hook log: `~/.tokenix/.log` — NDJSON, one `HookEvent` per line. Rotates at 5 MB to `.log.1` (one generation kept); `read_hook_log()` reads both. Fallback when the home dir is unavailable is repo-local `.tokenix/hook.log`. - -## Intercept Logic - -``` -Read tool: - file < 200 lines OR offset/limit set → exit 0 (pass through) - file ≥ 200 lines, no offset/limit → return outline, exit 2 (intercept) - -Grep tool: - pattern < 3 words → exit 0 (pass — likely a regex/symbol search) - pattern ≥ 3 words → return semantic results, exit 2 (intercept); gain records this as neutral usage, not saved tokens - -Bash / PowerShell tools: - command matches a bundled/user filter → rewrite to `tokenix run` (PowerShell - uses `& 'exe' run --shell pwsh ''`, re-executed under pwsh with UTF-8) - otherwise → exit 0 (pass) - -Index missing or >1h old → always exit 0 regardless of tool -``` - -Matcher (installer): `^(Read|Grep|Bash|PowerShell|grep_search|run_in_terminal)$`. -Claude Code's dedicated `PowerShell` tool (exact name) takes the pwsh path; the -generic lowercase `powershell` from Copilot/Antigravity stays on the bash path. - -`get_effective_command` normalizes a command before matching so filters anchored -on the bare tool still hit: it strips shell wrappers, `cd`/env prefixes, package -runners (`uv run`, `python -m`, `npx`, `bunx`, `pnpm exec/dlx`, `yarn dlx`, -`bun x`, `deno run/task`), and tool-global options (`git -C`, `kubectl -n`, -`docker -H`, `cargo +tc`). Verified against Codex/Antigravity histories where -`uv run pytest`, `python -m ruff`, `bunx biome` were bypassing their filters. 
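The Read/Grep rules in the intercept table above can be sketched as a single decision function. This is an illustration under stated assumptions, not the code in `hook.rs`: the `ToolCall` and `Thresholds` types and `decide_exit_code` are hypothetical, word counting by whitespace is assumed, and the Bash/PowerShell rewrite path is left out.

```rust
/// Hypothetical view of the two intercepted tool calls.
enum ToolCall<'a> {
    Read { line_count: usize, has_offset_or_limit: bool },
    Grep { pattern: &'a str },
}

/// Hypothetical holder for the two thresholds (defaults in the doc: 200 and 3).
struct Thresholds {
    read_min_lines: usize,
    grep_min_words: usize,
}

const PASS_THROUGH: i32 = 0; // original tool runs
const INTERCEPT: i32 = 2; // hook output replaces the tool call

fn decide_exit_code(call: &ToolCall<'_>, index_fresh: bool, t: &Thresholds) -> i32 {
    // A missing or stale index always passes through, regardless of tool.
    if !index_fresh {
        return PASS_THROUGH;
    }
    match call {
        ToolCall::Read { line_count, has_offset_or_limit } => {
            // Targeted reads (offset/limit) and small files are left alone.
            if *has_offset_or_limit || *line_count < t.read_min_lines {
                PASS_THROUGH
            } else {
                INTERCEPT
            }
        }
        ToolCall::Grep { pattern } => {
            // Short patterns are likely regex/symbol searches.
            // Assumption: words are split on whitespace.
            if pattern.split_whitespace().count() < t.grep_min_words {
                PASS_THROUGH
            } else {
                INTERCEPT
            }
        }
    }
}
```

Note that the function never returns `1`, and every uncertain case resolves to pass-through, which keeps the fallback behavior safe by construction.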
- -Both thresholds are per-project tunable via `.tokenix.toml`: - -```toml -[hook] -read_min_lines = 120 # default 200 -grep_min_words = 3 # default 3 -``` - -## Critical Rules - -**Never lose content.** The chunker must store 100% of every indexed file. Generic files (.md, .txt, .yaml, .json) use `clean_generic_text()` — full content with formatting stripped. Truncated previews are forbidden. Only code files use symbol-based outlines stored in full in SQLite. - -**Never break hook fallback.** `run_hook()` must always `exit(0)` on any error — missing index, stale index, parse failures, embed errors. Breaking Claude Code sessions is worse than missing a token-saving opportunity. - -**Hook exit codes:** `0` = pass through (original tool runs) · `2` = block tool (hook stderr becomes Claude's context). Never exit `1`. - -**Daemon is optional.** If `tokenix serve` is not running, `handle_grep()` auto-starts it and retries once (800ms wait). If autostart fails, falls back to direct in-process embed. - -**Directory filtering in indexer:** `filter_entry` for directories uses ONLY `IGNORED_DIRS`. Do NOT call `should_index()` on directories — it returns false for dirs without extensions and breaks traversal. Keep `should_index` / `filter_entry` separation intact. - -**Cross-platform paths:** `tokenix_bin_path()` normalizes to forward slashes for shell/JSON config strings. Preserve for Windows compatibility. - -**Hook log format:** Do not change `~/.tokenix/.log` away from NDJSON without updating `gain.rs`. - -**Token count is approximate.** `count_tokens()` = `(len + 3) / 4`. Intentional — no tiktoken dep. - -**Keep docs in sync.** Every new or changed user-facing feature MUST update both `README.md` (Features table, Commands Reference, Usage, Architecture) and `AGENTS.md` (Key Files + relevant section) in the same change. 
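For the "Token count is approximate" rule above, a minimal sketch of the `(len + 3) / 4` estimate. It assumes `len` is the UTF-8 byte length of the text, which the rule does not spell out; the function name is hypothetical and is not the real `count_tokens()`.

```rust
/// Approximate token count: ceiling of byte length divided by 4.
/// Assumption: `len` in the documented formula means UTF-8 byte length.
fn count_tokens_sketch(text: &str) -> usize {
    (text.len() + 3) / 4
}
```

Adding 3 before the integer division rounds up, so any non-empty text counts as at least one token and an empty string counts as zero.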
- -## Daemon - -```bash -tokenix serve # start daemon (blocks; use & or detached) -tokenix serve --port 9999 -tokenix stop # stop daemon (reads ~/.tokenix/daemon.pid) -tokenix daemon status # pid, port, uptime, model, cached projects + RAM -tokenix daemon restart # stop (if running) + detached respawn - -# Health check -echo '{"type":"health"}' | nc 127.0.0.1 47392 -# → {"ok":true,"cached_projects":1,"chunks":197} -# Status over the same socket: {"type":"status"} -``` - -Warm Grep calls via daemon: ~80ms vs ~430ms cold in-process. Daemon auto-starts on first Grep hook call. - -**Resource limits (prevents freeze under parallel hooks):** -- Max **4 concurrent handler threads** — unbounded spawning was the primary Windows freeze trigger -- **Spawn lock** (`daemon.pid.spawning`) + PID liveness check — prevents N parallel hooks from each spawning a separate 130 MB daemon process -- **Content cache capped at 1000 entries** per project - -## Output Filters - -Legacy `hook-post` compression flows through (in order): -1. User TOML filters (`~/.tokenix/filters/*.toml`) — highest priority -2. Bundled TOML filters (`assets/filters/*.toml`, rust-embed) -3. Built-in heuristics in `compress.rs` — cargo, git-log, generic head/tail - -`apply_filter()` pipeline: `match_output` short-circuit → `strip_ansi` → `strip_lines_matching` → `keep_lines_matching` → `head/tail/max_lines` → `truncate_lines_at` → `on_empty`. Opt-in `passthrough_when_emptied`: when the pipeline reduces *non-empty* output to nothing (an unrecognized output shape, not a genuinely empty command), emit a bounded view of the real output instead of `on_empty` — set on `git-log`/`git-diff` so `--oneline`/`--stat` don't report a false "no commits"/"no changes". 
The same bounded fallback fires **automatically** (no opt-in) whenever the original output matches `output_has_failure_signal()` (a strict, case/anchor-tuned `error`/`fatal`/`panic`/`FAILED`/`exit code N` probe) — so a failed build/test/deploy whose error text isn't matched by the tool's `keep_lines_matching` is never masked as the success `on_empty`. Guarded by `bundled_filters_never_mask_generic_failure`. - -**Filter design rule: never use `on_empty` — use `passthrough_when_emptied = true` instead.** `on_empty` fabricates a static string when real output is filtered to nothing; `passthrough_when_emptied` returns the original unfiltered output. Filters must only filter, never invent responses. `match_output` is the only valid short-circuit (it fires only when a confirmed pattern exists in the real output). Tests must not assert on fabricated strings. - -```toml -[filters.my-cmd] -match_command = "^my-cmd\\b" -passthrough_when_emptied = true -strip_ansi = true -strip_lines_matching = ["^\\s+Downloading"] -match_output = [{ pattern = "Success", message = "ok" }] -max_lines = 30 -``` - -## Prompt Audit (MCP/tool weight) - -`tokenix prompt-audit` estimates the variable cost of the effective system prompt -per agent. The base system prompt is internal and **cannot be read or intercepted -via hooks** — this measures the next-largest lever instead: MCP tool-definition -JSON. All logic lives in `src/mcp_audit.rs`. 
- -Per-agent MCP config sources (one `ConfigSource` each, absent = silently skipped): - -| Agent | Path(s) | Format | -|---|---|---| -| Claude Code | `/.mcp.json` + `~/.claude.json` (`mcpServers` + `projects[]`) | JSON | -| Codex | `~/.codex/config.toml` → `[mcp_servers.]` | TOML (`toml` dep) | -| OpenCode | `/opencode.json` (`mcp`) | JSON | -| Antigravity | `~/.gemini/antigravity-cli/mcp_config.json` (`mcp_config_path()`) | JSON | -| Copilot | `.vscode/mcp.json` (`servers`) + VS Code user `mcp.json` | JSON, best-effort | - -Pipeline: discover → dedupe stdio transports → `introspect_stdio()` (spawn, JSON-RPC -`initialize`/`tools/list`, 5s timeout via reader thread + `recv_timeout`, kill on -done) → tokenize schemas with `count_tokens` → add static `Agent::native_tokens()` -baseline → compare to thresholds (`TOKENIX_AUDIT_WARN_{TOKENS,SERVERS,TOOLS}`). -`TOKENIX_BRANCH_AWARE=true` suffixes the SQLite DB with the git branch name to isolate indexes per branch. -HTTP/SSE servers are not introspected (shown `unknown`). CLI-only — no hooks, no -settings.json changes. - -`--recommend` adds conservative reduction advice. `--profile-impact` estimates -the tokenix full-vs-slim MCP schema delta. `tokenix session-audit` reuses the -same summary and combines it with index freshness plus hook-log evidence; -`--cache-hygiene` also reports stable-prefix/cache-risk hints. - -`tokenix mcp --profile slim` is the token-saving MCP mode: it advertises only -`tokenix_context`, `tokenix_search_tools`, and `tokenix_call`. Keep `full` as -the default for compatibility with hosts that do not support progressive tool -discovery. - -## Repository Pack - -`tokenix pack` emits a budgeted repo map for non-hook AI tools. Modes/profiles: -`plan`, `debug`, `audit`, `security`, `review`. Formats: `markdown`, `xml`, -`json`. `--changed` and `--since ` produce review-sized packs; `--token-map` -adds per-file token/reason metadata. 
-It uses indexed context, file token counts, and symbol outlines; it must skip -obvious secrets, credentials, `.env`, key files, `.git`, and build output by -default. Do not turn `pack` into a raw full-repo dump. - -## Benchmark - -`tokenix benchmark` measures tokenix against a plain **vanilla** baseline only — -no external tools. It prints read-only token reduction, targeted -outline+symbol workflows, semantic Hit@1/Hit@3, context homologation (vanilla -full file vs tokenix budgeted context), and command-output compression. Flags: -`--budget N`, `--refresh-index`, `--cases FILE`, `--json`. - -**Fairness contract (do not regress).** Benchmark is tokenix-vs-vanilla only — -do not add competitor/market comparison arms. Vanilla and tokenix are scored on -identical input counted with the same `count_tokens`. Semantic Hit@1/Hit@3 are -reported as measured (misses included), never filtered to flatter tokenix. -Default scenarios span Rust/TS/Go/Python plus SQLite vector search and command -output (cargo, git, npm, docker compose). Verdict logic is unit-tested in -`benchmark.rs`. - -**Windows caveat:** `npx`/`uvx` run via `cmd /C`; `child.kill()` kills the wrapper -but a `node` grandchild may linger briefly until stdin EOF. Kill-the-tree -(`taskkill /T`) is a possible hardening follow-up. - -## Common Tasks - -**Add a language:** `chunker.rs` — add extension to `INDEXED_EXTS`, add `Lang` variant, map in `detect_lang()`, implement `chunk_()` following `chunk_rust()` pattern (tree-sitter), or `chunk_by_symbol_lines()` with a `_symbol_of()` line matcher when no grammar is bundled (see VB6/SQL). Also add the new `Lang` arms in `graph.rs` (`extract_references_tree_sitter`, `extract_file_imports`). Do NOT add to `INDEXED_EXTS` without a symbol-aware chunker. - -**Add a bundled filter:** create `assets/filters/.toml` with **≥2 embedded `[[tests.]]` golden cases** (input/expected — enforced by `bundled_filters_require_minimum_tests`). 
Filters with an `on_empty` sentinel must NOT also set `passthrough_when_emptied` (they conflict; passthrough wins and the sentinel never fires), and any filter that can empty a failure payload must keep failure markers (`(?i)error|fail|fatal`) or set `passthrough_when_emptied` — else `bundled_filters_never_mask_generic_failure` fails. Rebuild — rust-embed includes it automatically. Homologate with `cargo test --bin tokenix filters::tests::` (golden + 70% economy + never-mask + no-inflate). Currently 386 filters · 800 golden cases. - -**`filter record` token-economy preview:** `recordings::economy()` reconstructs each captured command's raw output (stripping the `$ cmd`/`--- stderr ---`/truncation scaffold), resolves the bundled filter via the real `find_filter`+`apply_filter` path, and reports `raw→filtered` tokens. `record stop`/`status` render it as a per-command compression bar + total via `print_economy_table` in `cmd_filter.rs`. - -**Change intercept threshold:** `hook.rs` constants — `MAX_INDEX_AGE_SECS`, `MIN_LINES_FOR_OUTLINE`, `MIN_QUERY_WORDS`. - -**Extend hook to a new tool:** -1. Add variant to `Tool` enum in `main.rs` -2. Implement `install_()` and `remove_()` -3. Add match arms in `cmd_install_hook()` and `cmd_remove_hook()` -4. Update `hook.rs` only if the tool has a real hook protocol -5. Document in `README.md` - -**Add an agent to `prompt-audit`:** `mcp_audit.rs` — add an `Agent` variant (with `label`/`key`/`native_tokens`), a `discover_()` config source, and an `AuditAgent` value + mapping in `main.rs`. Reuse `parse_json_map` for JSON `mcpServers`-style configs. - -**Add a `scan-secrets` rule:** no Rust change needed — append a `[[rules]]` block (`id`, `pattern`, optional `capture`/`min_entropy`) to `assets/secret-rules/default.toml` (or a new bundled `*.toml`), or to `~/.tokenix/secret-rules/*.toml` / `/.tokenix/secret-rules/*.toml` at runtime. Patterns use the backtracking-free `regex` crate (no lookaround). 
`secrets_scan.rs::compile_rules` dedups by `id` (later source wins) and skips invalid regexes with a stderr warning. - -**Change token budget:** `query.rs` — `DEFAULT_BUDGET` constant, or pass `--budget` flag. - -**Embedding model (flexible):** `embed.rs` `MODELS` registry maps a friendly id → `EmbeddingModel` + query/doc prefixes. Default is `nomic-v1.5` (existing indexes keep working). Select with `tokenix index --model ` or `TOKENIX_EMBED_MODEL=`. The chosen model is **stamped in the index `meta` (`embed_model`)**; query/hook/daemon read it back via `store::index_model_id` and `embed::set_active_model` so query vectors always match the indexed docs. The model is **sticky** (a plain re-index keeps it); an explicit switch forces a full re-embed. `index_staleness` only flags a model change when `TOKENIX_EMBED_MODEL` is explicitly set. The embedding cache key (`chunk_embedding_key`) and the persistent query cache are namespaced by model id. Add a built-in model: append a `ModelSpec` with `ModelSource::BuiltIn(EmbeddingModel::…)` (use the non-quantized variant if the Qdrant-Q ONNX fails ORT's `SkipLayerNormalization`). Add a **custom** model (one fastembed does not ship, e.g. code-specialized): `ModelSource::Custom { hf_repo, onnx_file, pooling }` — `build_custom_embedding` downloads the onnx + tokenizer files (`reqwest`) into `/custom//` and loads them via fastembed's `UserDefinedEmbeddingModel`. `jina-code` (jinaai/jina-embeddings-v2-base-code, mean pooling) is the first such model. `tokenix doctor` lists available + active + this-repo's model, and validates user/local filters' `semantic_filter` config (`filters::semantic_filter_issues`); an unknown `semantic_filter.model` also warns at apply time before falling back to the default. - -**Update pricing table:** `gain.rs` — edit `MODELS` constant and bump `PRICING_COLLECTED_AT`. Fields: `name`, `input_per_1m` (USD), `reference` (marks ★ model — used for the Gain tab's ≈USD headline). 
- -## Testing the Hook - -```bash -tokenix index . - -# Should intercept (exit 2) — large file -echo '{"tool_name":"Read","tool_input":{"file_path":"src/main.rs"}}' | tokenix hook; echo $? - -# Should pass through (exit 0) — small file -echo '{"tool_name":"Read","tool_input":{"file_path":"Cargo.toml"}}' | tokenix hook; echo $? - -# Should intercept (exit 2) — semantic query (auto-starts daemon) -echo '{"tool_name":"Grep","tool_input":{"pattern":"how does embedding work"}}' | tokenix hook; echo $? - -# Copilot-style input -echo '{"toolName":"view","toolArgs":"{\"path\":\"src/main.rs\"}"}' | tokenix hook; echo $? - -# PostToolUse compression — bundled filter short-circuit -echo '{"tool_name":"Bash","tool_input":{"command":"uv sync"},"tool_response":{"output":"Resolved 42 packages in 123ms\nAudited 42 packages in 0.05ms\n"}}' | tokenix hook-post; echo $? - -tokenix gain --history -``` - -## Claude Code Integration Setup - -After `cargo install --path .`, configure Claude Code globally: - -### 1. Hooks (`~/.claude/settings.json` or project `.claude/settings.local.json`) - -Add to the `hooks` key — merging with any existing entries: - -```json -{ - "hooks": { - "PreToolUse": [ - { - "matcher": "^(Read|Grep|Bash)$", - "hooks": [{ "type": "command", "command": "tokenix hook", "timeout": 10 }] - } - ] - } -} -``` - -`PreToolUse` intercepts large reads and semantic Grep queries, and rewrites noisy Bash commands so they execute through tokenix before the model sees the output. - -### 2. Behavioral instruction (`~/.claude/CLAUDE.md`) - -Add a section so Claude prefers tokenix over raw Grep/Glob/Read for codebase searches: - -```markdown -## Tokenix Indexed Search (Token Economy) -- `tokenix` is indexed and available in PATH. 
-- For any codebase search, PREFER tokenix over Grep/Glob/Read: - - Symbol by name: `tokenix symbols ` - - Semantic query: `tokenix query ""` - - Callers of a function: `tokenix callers ` - - Callees: `tokenix callees ` - - Impact graph: `tokenix impact ` - - Focused context for a task: `tokenix context ""` - - Explore related: `tokenix explore ` -- Fall back to Grep/Glob/Read only when tokenix returns no results or for exact literal matches. -``` - -### 3. Index the project - -```bash -cd -tokenix index . -tokenix stats # verify file/chunk count -``` - -The daemon auto-starts on first Grep hook call. Run `tokenix serve` manually only to pre-warm it. - -## Tool Integration Model - -### Claude Code -- Config: `PreToolUse` in `~/.claude/settings.json` or project `.claude/settings.local.json` (see setup above) -- Input: `{"tool_name":"Read","tool_input":{"file_path":"src/main.rs"}}` - -### GitHub Copilot -- Config: `.github/copilot-instructions.md` + VS Code-compatible `.github/hooks/hooks.json` -- Input: `{"toolName":"view","toolArgs":"{\"path\":\"src/main.rs\"}"}` -- tokenix normalizes `view`/`read` → `Read` - -### OpenAI Codex CLI -- Config: `~/.codex/hooks.json` for `PreToolUse` Bash rewrites + optional shell helpers under `~/.codex/` - -### OpenCode -- Config: repo-local `opencode.json` native `mcp` block -- Shape: `{"mcp":{"tokenix":{"type":"local","command":["tokenix","mcp"]}}}` -- Note: tokenix does **not** install `experimental.hook`; OpenCode support is native MCP registration only - -### Antigravity -- Global config: `~/.gemini/config/plugins/tokenix/`, installed and registered through `agy plugin install` -- Local config: `/.agents/plugins/tokenix/`, validated through `agy plugin validate` -- Input: `{"toolCall":{"name":"read_file","args":{"path":"src/main.rs"}}}` -- Output: native `decision: allow|deny`; command rewrites use `overwrite`. Do not install `PostToolUse` for compression because Antigravity cannot replace the original output there. 
- -## Agent Workflow (when working on this repo) - -Before opening a large or unfamiliar file: - -```bash -tokenix query "what you need to understand" -tokenix read -``` - -Narrow context with: - -```bash -tokenix read --symbol -tokenix read --lines N-M -tokenix read --mode signatures # signatures only (no bodies) -tokenix read --mode diff # outline + uncommitted hunks -tokenix read --mode density:40 # keep ~40% highest-entropy lines -``` - -Only read a full file directly when tokenix shows it is small. - -Inspect the symbol graph and spend with: - -```bash -tokenix graph # repo-wide god nodes / bottlenecks / blast radius -tokenix graph --format dot # Graphviz of the top subgraph -tokenix usage # absolute token spend + ≈USD cost (daily) -tokenix usage blocks # rolling 5-hour billing blocks + burn rate -``` - -## Release - -Releases are automated via GitHub Actions (`.github/workflows/release.yml`). Pushing to `main` auto-creates a version tag and GitHub Release with pre-built binaries for Linux, macOS, and Windows. - -To trigger manually: push a commit to `main` — the workflow reads version from `Cargo.toml`. +## Build & Install \ No newline at end of file diff --git a/README.md b/README.md index 7976747..24db2d5 100644 --- a/README.md +++ b/README.md @@ -10,7 +10,7 @@ GitHub stars CI Supply Chain - OpenSSF Scorecard + OpenSSF Scorecard License Built with Rust Platforms @@ -29,792 +29,4 @@ --- -> **tokenix** is a local-first Rust CLI that helps AI coding agents understand a repository without dumping huge files into the prompt. It indexes your code, finds relevant chunks by meaning, returns compact file outlines, and can hook into AI tools to replace noisy reads and command output with smaller, more useful context. Works with Claude Code, GitHub Copilot, OpenAI Codex CLI, OpenCode, Gemini, and any MCP client. 
**No Ollama or external server required.** - -``` -Without tokenix: Read(src/auth/middleware.rs) → 800 lines → ~2,400 tokens (illustrative) -With tokenix: tokenix read src/auth/middleware.rs → symbol outline → ~180 tokens -``` - -Savings depend on codebase size, AI behavior, and file sizes. Run `tokenix gain` to see measured Read and command-filter savings; semantic Grep context is logged as usage, not counted as saved tokens. - ---- - -## 🖥 Interactive Dashboard - -Run bare `tokenix` to open a terminal dashboard — ten tabs, zero flags. `←`/`→` switch tabs, `↑`/`↓` move, `q` quits. Piped or non-TTY falls back to `--help`. - - - - - - - - - - - - - - - - - - - - - - - - - - -
Stats: wordmark, version, per-agent hook status, index summary, and one-key actions (index repo · install hooks · install binary on PATH).
Gain: tokens saved with a reduction bar, split by source and by command/tool. `c` adds the ≈USD cost table · `a` all-projects · `r` refresh.
Usage: absolute token spend and ≈USD cost read from agent transcripts (the spend-side counterpart to Gain). `s` cycles the breakdown (daily · model · 5-hour blocks · project · session), `a` toggles this-repo vs all-projects, `r` refreshes. The active 5-hour block shows burn rate and a projected cost.
Graph: repo-wide symbol-graph overview with god nodes (most connected), bottlenecks (high fan-in / low fan-out), and blast-radius leaders (most transitive dependents). `r` refreshes.
Filters: browse all 386 bundled filters by tool with a live input → output preview and a per-filter X → Y tokens · % saved gauge.
Secrets: credentials leaked across agent transcripts, grouped by rule and attributed to repo + branch. Starts scoped to the current repo; `g` toggles all repos. `v` reveal · `c` copy · `x` redact.
Tokenmap: the repository as a tree weighted by token count, heaviest paths first.
Doctor: build/GPU support, detected GPU + CUDA/cuDNN status, active embedding model & cache, and bundled-filter inventory, all on one screen.
Egress: external DNS/IP destinations found in agent transcripts, with local reputation validation (safe hosts green, dangerous hosts red, unknown hosts yellow). Three-pane style like Secrets: group · destination · occurrence detail. Starts scoped to the current repo; `g` toggles all repos. `s` rotates host/rule/agent/file grouping · `r` rescans.
Studio: record → preview → generate output filters without leaving the dashboard. The command list ranks the biggest unfiltered token sinks first (with tokens wasted), marks commands that already have a filter, and shows captured recordings, so you always know what to filter next. `r` arms recording (capture needs the tokenix hook installed; run your commands in your agent, then come back), `s` stops. The right pane previews a recorded sample with a live before → after token delta when a filter matches. `g` generates a filter from the selected command (drops to the interactive `filter generate` flow, then returns) · `x` deletes a saved filter · `Tab` switches pane.
- ---- - -## What Is tokenix? - -AI coding agents often waste context on the wrong shape of information: entire files, long grep output, repeated build logs, and directory listings that are much larger than the useful signal inside them. tokenix is a context layer between the agent and your repository. - -It does four jobs: - -| Job | What tokenix does | Why it matters | -|---|---|---| -| **Index the repository** | Walks source files, splits them into symbol-aware chunks, and stores local embeddings in SQLite | The agent can search by intent instead of opening files blindly | -| **Read files compactly** | Returns outlines, symbols, or line ranges instead of full files when possible | Large files stop consuming thousands of unnecessary tokens | -| **Intercept assistant tools** | Hooks into supported tools before large reads and rewrites noisy command output | Optimization happens automatically during normal AI sessions | -| **Measure savings** | Logs hook decisions and reports measured token/cost reduction where the original output is known | You can see whether it is actually helping on your codebase | - -tokenix is not a cloud service, not a vector database server, and not a replacement for your AI assistant. It is a local repository index plus a set of CLI and hook integrations that make the assistant's context smaller and more targeted. - ---- - -## ⚡ Quick Install - -### Pre-built binary (recommended) - -Every release ships a binary per platform. The version-less -`releases/latest/download/` URL **always resolves to the newest release**, -so you never pin a version. 
- -| Platform | Asset | -|---|---| -| Linux x86_64 | `tokenix-linux-x86_64` | -| Linux arm64 | `tokenix-linux-aarch64` | -| macOS x86_64 | `tokenix-macos-x86_64` | -| macOS arm64 (Apple Silicon) | `tokenix-macos-aarch64` | -| Windows x86_64 | `tokenix-windows-x86_64.exe` | -| Windows x86_64 (GPU / DirectML) | `tokenix-windows-x86_64-directml.exe` | - -Each release also publishes `sha256sums.txt` and SLSA build-provenance -attestations — see [SECURITY.md](SECURITY.md) for how to verify a download. - -**macOS / Linux** — swap the asset for your platform from the table: - -```bash -curl -fsSL https://github.com/juninmd/tokenix/releases/latest/download/tokenix-linux-x86_64 -o tokenix -chmod +x tokenix && sudo mv tokenix /usr/local/bin/tokenix -tokenix doctor # verify -``` - -**Windows (PowerShell):** - -```powershell -irm https://github.com/juninmd/tokenix/releases/latest/download/tokenix-windows-x86_64.exe -OutFile tokenix.exe -``` - -> **🤖 For AI agents / LLMs:** prefer this prebuilt binary over `cargo install` (no Rust -> toolchain, no compile step). Always fetch the version-less URL -> `https://github.com/juninmd/tokenix/releases/latest/download/tokenix-` — it -> redirects to the latest release, so **never hard-code a version**. Detect the platform, -> download the matching asset from the table, mark it executable, then run `tokenix doctor` -> to confirm. The embedding model downloads automatically on first use. - -### From crates.io - -```bash -cargo install tokenix --locked -``` - -### From source - -```bash -git clone https://github.com/juninmd/tokenix -cd tokenix -cargo install --path . --locked -``` - -> **Use `--locked`.** It builds against the committed `Cargo.lock`; without it `cargo install` re-resolves dependencies and can pull an incompatible `ureq` into the `ort-sys` build script. - -> **Requirements:** a recent stable [Rust](https://www.rust-lang.org/tools/install) toolchain (edition 2021). No Ollama, no Python, no external services. 
- -The embedding model (`nomic-embed-text-v1.5`, ~130 MB) is downloaded automatically on first use and cached locally. - ---- - -## ✨ Features - -| Feature | Description | -|---|---| -| **Semantic search** | Find relevant code by meaning, not just keywords (`tokenix query`); cross-project with `--link` | -| **Context artifacts** | `tokenix artifacts` indexes non-code schemas, API docs, and specs via `.tokenix/artifacts.json` | -| **Hybrid ranking** | FTS5 BM25 + vector cosine + RRF fusion for ranked results | -| **Exact search** | Regex/literal search over indexed content, no embedding (`tokenix grep`) | -| **One-call task context** | `tokenix context` combines semantic search, entry points, and compact outlines with strict budget modes (`plan`, `debug`, `audit`, `security`, `review`) | -| **Graph-aware explore** | `tokenix explore` returns related symbols, relationship maps, and grouped source in one capped call | -| **Repository pack** | `tokenix pack` emits a budgeted, secret-safe repo map with changed-file packs, token maps, and safety reporting | -| **Symbol graph** | `tokenix symbols` (`--kind` filters by symbol type), `callers`, `callees`, `impact`, `flow`, and `cycles` trace relationships, call-flow, and circular deps between indexed symbols | -| **Import graph** | `tokenix deps FILE` shows file-level import dependencies (`--reverse` for importers, `--transitive` to follow the chain); external deps are tracked too | -| **Int8-quantized embeddings** | Vectors are stored int8-quantized (4x smaller DB + daemon RAM, near-identical recall); legacy f32 indexes migrate automatically on the next `tokenix index` | -| **JSON output** | `--json` on `query`, `context`, `explore`, `read`, `symbols`, `callers`, `callees`, `deps` (+ `impact --format json`) for scripts and agent pipelines | -| **PC-friendly indexing** | `tokenix index` runs at below-normal OS priority by default so long index runs never starve the machine (`--no-low-priority` opts out) | -| **Interactive 
HTML/Mermaid graphs** | `tokenix impact --format html\|mermaid` exports vis.js / Mermaid flowcharts; `tokenix flow --format mermaid` traces call flow | -| **Repo graph overview** | `tokenix graph` ranks god nodes, bottlenecks, and blast-radius leaders across the whole symbol graph (`--format text\|dot\|json`, `--top N`) | -| **Cycle detection** | `tokenix cycles` finds circular dependencies via Tarjan's strongly-connected components algorithm, dropping same-name (homonym) false positives and annotating each node with `path:line` | -| **Token map** | `tokenix tokenmap` shows a directory tree with token counts per file/folder | -| **Preference memory** | `tokenix memory add/list` stores global and project preferences in editable Markdown; context/explore include saved preferences | -| **Dynamic language detection** | Map custom file extensions to any built-in parser via a project `.tokenix.toml` — no recompile needed | -| **Legacy VB6 + SQL sources** | `.bas`/`.cls`/`.ctl`/`.frm`/`.vbp` and `.sql`/`.fnc`/`.trg`/`.pkg`/`.prc`/`.tab`/`.vw` indexed with symbol-aware heuristic chunking (`Sub`/`Function`/`Property`, `CREATE` objects); UTF-16 SQL files decoded via BOM; binary files (e.g. 
`.frx`) skipped by a NUL sniff | -| **Symbol-aware chunking** | AST Tree-sitter parsers for Rust, Python, TypeScript, JavaScript, Go, C/C++ | -| **Multi-agent safe index** | PID-based index lock prevents concurrent reindex; embeddings are committed per batch, so a killed index run resumes from the last completed batch | -| **Smart file reader** | Outlines large files; supports `--symbol` and `--lines` reads, plus `--mode full\|outline\|signatures\|diff\|density:X` (signatures-only, changed-hunks, or entropy-filtered reads) | -| **Hook-based interception** | `PreToolUse` intercepts large reads and rewrites noisy Bash **and PowerShell** commands before execution; thresholds tunable via `[hook]` in `.tokenix.toml` | -| **Structural output compression** | Fuzzy grouping, compact `git`/`cargo` filters, NDJSON/JSON compaction, and ANSI/Emoji stripping | -| **Local project filters** | Drop `.toml` files in `.tokenix/filters/` for project-scoped compression rules — highest priority over user and bundled filters | -| **Output filters** | 386 TOML output filters embedded in the binary (each homologated against 800 golden cases) — auto-applied to Bash/PowerShell output for `uv`, `cargo`, `terraform`, `ansible`, `docker`, `kubectl`, `git`, `npm`, `pnpm`, `bun`, `deno`, `vite`, `pip`, `poetry`, `go`, `rust`, `helm`, `apt`, `journalctl`, `trivy`, `semgrep`, `bazel`, `ctest`, `tox`, `conda`, `pulumi`, `dnf`/`yum`, `pacman`, `apk`, `pip-audit`, `ng test`, `bru`, `ps`, `cargo tree`, `npm ls`, `kubectl explain`, `lsof`, `ss`, `netstat`, `ip`, `systemctl list-*`, and more | -| **Filter generation** | `tokenix filter generate` writes a TOML filter for a command; `tokenix filter record` captures real output for richer generation, with a per-command **token-economy preview** (raw→filtered tokens, % saved, compression bar) shown by `record stop`/`status` | -| **GPU acceleration (opt-in)** | Build with `--features directml` (Windows) or `--features cuda` to run embeddings on GPU; GPU is 
used by default at runtime with automatic CPU fallback, or force CPU with `--only-cpu` | -| **Environment diagnostics** | `tokenix doctor` reports the compiled backend, detected GPU, CUDA/cuDNN status, model cache, and daemon | -| **Branch-aware indexing** | `TOKENIX_BRANCH_AWARE=true` isolates indexes per git branch | -| **In-memory daemon** | `tokenix serve` keeps model + index in RAM so repeated hook calls avoid reloading the model each invocation; `tokenix daemon status\|stop\|restart` manages it | -| **Graceful fallback** | Exits `0` on errors — your AI session is never broken | -| **Token budget** | Results fit within a configurable token budget (default `1200`) | -| **Savings analytics** | `tokenix gain` — token summary, savings split by source (semantic index vs command filters), and by-tool histogram; `--cost-estimate` adds a per-model cost table (10 reference models across Anthropic / OpenAI / Google) | -| **Spend analytics** | `tokenix usage` — absolute token spend and ≈USD cost read from agent transcripts, by `daily\|weekly\|monthly\|session\|model\|project\|blocks`; rolling 5-hour blocks with burn rate, month-end forecast, `--cost-mode auto\|calculate\|display`, `--statusline`, and `--json` | -| **Slim MCP profile** | `tokenix mcp --profile slim` exposes 3 meta-tools instead of the full tool surface for hosts that support progressive discovery | -| **MCP/prompt weight audit** | `tokenix prompt-audit --recommend --profile-impact` connects to configured MCP servers, tokenizes tool schemas, and shows full-vs-slim MCP savings | -| **Session audit** | `tokenix session-audit --cache-hygiene` combines index freshness, hook history, MCP/tool weight, and prompt-cache stability risks | -| **Conversation token-waste audit** | `tokenix conversation-audit` scans local Claude / Codex / Copilot / OpenAI histories for large assistant-visible blobs such as full reads, command logs, bootstrap prompts, connector JSON, images, patches, and task artifacts | -| 
**Conversation secret scan** | `tokenix scan-secrets` — gitleaks-style credential scan of Claude / Gemini / Copilot / Antigravity conversation transcripts (no git); findings are always redacted, exits non-zero when any are found. Patterns live in TOML (`assets/secret-rules/`), extensible via `~/.tokenix/secret-rules/*.toml` or `/.tokenix/secret-rules/*.toml` | -| **Conversation egress audit** | `tokenix egress-audit` — scans AI agent transcripts for external DNS/IP destinations, groups by host/rule/agent/file, validates host reputation from local safe/dangerous lists, and colors safe/dangerous/unknown hosts in the TUI | -| **Local-first, no dependencies** | fastembed ONNX in-process — no Ollama, no server, no internet after first run | - ---- - -## 🔌 Supported AI Tools - -| Tool | Integration | -|---|---| -| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | `PreToolUse` hooks in `~/.claude/settings.json` or project `.claude/settings.local.json` | -| [GitHub Copilot](https://docs.github.com/en/copilot) | `.github/copilot-instructions.md` + VS Code-compatible `.github/hooks/hooks.json` | -| [OpenAI Codex CLI](https://help.openai.com/en/articles/11096431-openai-codex-cli-getting-started) | `~/.codex/hooks.json` for `PreToolUse` Bash rewrites + optional shell helpers | -| OpenCode | `tokenix install-hook --tool opencode` — registers `tokenix mcp` in a native `opencode.json` `mcp` block | -| Antigravity | `tokenix install-hook --tool antigravity` — installs and validates a native `PreToolUse` plugin through `agy plugin` | -| Any MCP client | `tokenix mcp` — Model Context Protocol server over stdin/stdout (`--tool mcp`) | - ---- - -## 🚀 How It Works - -tokenix has two modes: - -1. **Manual mode**: run `tokenix query`, `tokenix read`, `tokenix context`, etc. directly when you want compact context. -2. **Hook mode**: install hooks so supported AI tools call tokenix automatically before large reads and before noisy Bash commands execute. 
- -### Output compression - -tokenix includes structural output-filtering logic. It doesn't just truncate output; it understands the structure of common CLI tools. - -- **Fuzzy grouping:** collapses hundreds of `Compiling…` or `Removing…` lines into a single summary line. -- **Structural compaction:** compacts pretty-printed JSON and NDJSON into single-line formats. -- **Signal preservation:** keeps error messages and summaries even when the middle of a log is truncated. - ---- - -## 🛠 Usage - -### 1. Index your repository - -```bash -cd my-project -tokenix index . -``` - -> **First run:** the model (~130 MB) is downloaded automatically. Subsequent runs use the local cache. - -### 2. Search - -```bash -tokenix query "how does JWT validation work" # semantic -tokenix query "database connection pooling" --budget 2000 -tokenix grep "fn validate_token" --ignore-case # exact regex/literal -``` - -### 3. One-call task context - -```bash -tokenix context "fix login refresh token bug" -tokenix context "how does the indexer batch embeddings" --mode debug --budget 2000 -tokenix context "review this auth change" --mode review --budget 1200 -tokenix explore "run_hook hook_post compression" --budget 4000 -``` - -### 4. Repository pack - -```bash -tokenix pack --mode plan --budget 8000 --format markdown --token-map -tokenix pack --mode review --changed --budget 4000 -tokenix pack --mode security --format json --output tokenix-security-pack.json -``` - -`pack` builds a stable repo map plus focused context for tools that cannot call -tokenix directly. It respects the index, skips obvious secrets and build output, -and supports `plan`, `debug`, `audit`, `security`, and `review` modes. Use -`--changed` or `--since ` for compact review packs. - -### 5. 
Smart file reader - -```bash -tokenix read src/auth/middleware.rs # symbol outline -tokenix read src/auth/middleware.rs --symbol validate_token # targeted -tokenix read src/auth/middleware.rs --lines 45-80 # line range -tokenix read src/auth/middleware.rs --mode signatures # signatures only -tokenix read src/auth/middleware.rs --mode diff # outline + changed hunks -tokenix read src/auth/middleware.rs --mode density:40 # keep ~40% highest-entropy lines -``` - -### 6. Symbol graph & maps - -```bash -tokenix symbols validate_token -tokenix symbols Token --kind struct # filter by symbol kind -tokenix callers validate_token -tokenix callees run_hook -tokenix impact update_user --depth 2 -tokenix impact update_user --format html --output update_user.html # vis.js graph -tokenix deps src/indexer.rs # file-level import dependencies -tokenix deps src/store.rs --reverse # who imports this file -tokenix deps src/daemon.rs --transitive # follow the import chain -tokenix graph # repo-wide hotspots / blast radius -tokenix graph --format dot --top 20 -o graph.dot # Graphviz of the top subgraph -tokenix tokenmap # token tree -tokenix rebuild-graph # recompute relationships without re-embedding -``` - -Most retrieval commands accept `--json` for machine-readable output: - -```bash -tokenix query "jwt validation" --json | jq '.[0].path' -tokenix callers run_hook --json -``` - -### 7. 
Token savings analytics - -```bash -tokenix gain # token summary + by-tool histogram -tokenix gain --history # include per-call history -tokenix gain --cost-estimate # add the per-model cost table -tokenix usage # absolute spend (daily) + ≈USD cost -tokenix usage model # spend by model · also: weekly|monthly|session|project|blocks -tokenix usage blocks # rolling 5-hour billing blocks + burn rate -tokenix usage --statusline # compact one-liner for a status bar -tokenix session-audit # index + hook + MCP token-economy health -``` - -`tokenix gain` shows a `BY SOURCE` section splitting measured savings between -large Read intercepts answered with outlines and command filters (Bash/PowerShell -output compression), so you can see which half of tokenix is earning its keep. -Semantic Grep intercepts can add useful indexed context, but the native grep -output is not known before interception, so they are logged as neutral usage -instead of claimed savings. `tokenix gain --cost-estimate` prices the savings -against 10 reference models across Anthropic, OpenAI, and Google. Prices are -shown with their collection date (currently `2026-06-11`) so the numbers stay -auditable. - -### 8. Audit MCP / tool weight - -```bash -tokenix prompt-audit # every agent that has MCP config -tokenix prompt-audit --agent claude # one agent (claude|codex|copilot|opencode|antigravity) -tokenix prompt-audit --json # machine-readable -tokenix prompt-audit --recommend # include practical reduction advice -tokenix conversation-audit # scan agent histories for token-waste blobs -tokenix conversation-audit --agent codex --json -``` - -Discovers the MCP servers configured for each agent, connects to each one live -(`initialize` + `tools/list`), tokenizes the returned tool schemas, and warns -when too many servers/tools inflate the effective system prompt. 
The base system -prompt itself cannot be read by tools, so this is a **relative bloat estimate**: -the native-tool baseline is approximate and HTTP/SSE servers are shown as -`unknown`. Thresholds are overridable via `TOKENIX_AUDIT_WARN_TOKENS`, -`TOKENIX_AUDIT_WARN_SERVERS`, and `TOKENIX_AUDIT_WARN_TOOLS`. - -For MCP hosts that support progressive discovery, run `tokenix mcp --profile slim`. -The slim profile advertises only `tokenix_context`, `tokenix_search_tools`, and -`tokenix_call`, reducing tool-schema tokens while preserving access to the full -tokenix capability set through the meta-tool path. - -`tokenix conversation-audit` walks local Claude (`~/.claude/projects`), Codex -(`~/.codex/sessions`), Copilot (`~/.copilot/session-state,logs` plus the VS Code -Copilot chat store when present), and OpenAI (`~/.openai`) histories. It -classifies the largest assistant-visible strings by waste scenario and reports -the matching tokenix mitigation: add an output filter, use indexed file reads, -trim hook payloads, slim MCP/tool schemas, or avoid replaying image/connector -payloads into context. - -### 9. Benchmark - -```bash -tokenix benchmark -tokenix benchmark --json -``` - -`tokenix benchmark` measures tokenix against a plain **vanilla** baseline using -the actual index/search code — no external tools involved. It reports read-only -token reduction on large files, targeted outline+symbol workflows, semantic -search Hit@1/Hit@3, context homologation (vanilla full file vs tokenix budgeted -context), and command-output compression. Scenarios span Rust/TS/Go/Python, -SQLite vector search, and common command output (cargo, git, npm, docker -compose); misses are included as measured. Pass `--refresh-index` to re-embed -first, `--cases FILE` for project-specific cases, and `--json` for a -machine-readable summary. - -### 10. 
Scan conversations for exposed secrets - -```bash -tokenix scan-secrets # all agents, redacted, exit 1 on hits -tokenix scan-secrets --group value # one block per distinct secret -tokenix scan-secrets --group repo # group by the repo it leaked from -tokenix scan-secrets --filter telegram # filter rule/agent/file/value/repo/branch -tokenix scan-secrets --agent claude --json # machine-readable -tokenix scan-secrets --filter aws --reveal # print raw values (warns on stderr) -``` - -Like `gitleaks --no-git`, but it walks each AI agent's **conversation transcripts** -(Claude `~/.claude/projects`, Gemini `~/.gemini/tmp,history`, Copilot -`~/.copilot/session-state,logs`, Antigravity `~/.gemini/antigravity`) for pasted -credentials. Every finding is **redacted by default** and attributed to the -**repository + git branch** it was exposed in (from each Claude message's -`cwd`/`gitBranch`, falling back to the project directory). Detection patterns are -TOML `[[rules]]` in `assets/secret-rules/`, extensible without a rebuild via -`~/.tokenix/secret-rules/*.toml` or `/.tokenix/secret-rules/*.toml`. - -### 11. Audit outbound destinations in conversations - -```bash -tokenix egress-audit # all agents, grouped by host -tokenix egress-audit --group rule # group by detection rule -tokenix egress-audit --filter openai # filter host/rule/agent/file -tokenix egress-audit --safe # mark known-safe hosts -tokenix egress-audit --agent claude --json # machine-readable -``` - -This scans local AI agent transcripts for external DNS/IP destinations, so -unexpected outbound domains pasted into sessions are visible without opening raw -history files. The TUI Egress tab uses the same three-pane pattern as Secrets: -group list, distinct destination list, and occurrence detail with agent, file, -repo, and branch when known. Both the Secrets and Egress tabs open scoped to the -current repository (cwd); press `g` to toggle a global view across all repos. - -Host reputation is local and explicit. 
Put trusted domains in -`~/.tokenix/safe-hosts.toml` and suspicious domains in -`~/.tokenix/dangerous-hosts.toml`; `www.` is ignored and subdomains inherit the -parent verdict. The TUI paints safe hosts green, dangerous hosts red, and unknown -hosts yellow. Example: - -```toml -# ~/.tokenix/safe-hosts.toml -safe = ["api.openai.com", "github.com"] - -# ~/.tokenix/dangerous-hosts.toml -dangerous = ["example-malware.test"] -``` - ---- - -## 🔧 Setup by Tool - -### Claude Code - -```bash -tokenix install-hook --tool claude-code -``` - -Writes a `PreToolUse` hook to `~/.claude/settings.json` (or `.claude/settings.local.json` with `--local`). Large reads, semantic greps, and noisy Bash commands are intercepted automatically — no changes to your prompts needed. `hook-post` (`PostToolUse`) remains a compatibility handler, not a default Claude install, because it cannot replace the original tool output. - -### GitHub Copilot - -```bash -cd my-project -tokenix install-hook --tool copilot -git add .github/ -git commit -m "chore: add tokenix context instructions" -``` - -Creates `.github/copilot-instructions.md` and `.github/hooks/hooks.json`. - -### OpenAI Codex CLI - -```bash -tokenix install-hook --tool codex -# bash / zsh -echo 'source ~/.codex/tokenix-init.sh' >> ~/.bashrc -# PowerShell -echo '. ~/.codex/tokenix-init.ps1' >> $PROFILE -``` - -Then use `tx-read` and `tx-query` as shell helpers. On Windows this also installs `~/.codex/hooks.json` and a PowerShell wrapper that forwards `PreToolUse` intercepts for Bash-like terminal tools (`Bash`, `run_in_terminal`) and normalizes `grep_search` to the same semantic path as `Grep`. - -### OpenCode - -```bash -tokenix install-hook --tool opencode -``` - -Writes a native `opencode.json` entry for `mcp.tokenix` in the current repository root: - -```json -{ - "mcp": { - "tokenix": { - "type": "local", - "command": ["tokenix", "mcp"] - } - } -} -``` - -This integration is MCP-only. 
tokenix does **not** install OpenCode `experimental.hook` entries and does **not** emulate Claude-style `PreToolUse` / `PostToolUse` hooks in OpenCode. The generated config expects `tokenix` to be available on `PATH`; run `tokenix install-binary` first if needed. - -### Antigravity - -```bash -tokenix install-hook --tool antigravity -# workspace-only: -tokenix install-hook --tool antigravity --local -``` - -Global installation uses `agy plugin install`, validates the result, and stores it under -`~/.gemini/config/plugins/tokenix/`. Workspace installation writes -`.agents/plugins/tokenix/` and validates it with `agy plugin validate`. -The native hook handles Antigravity's `toolCall.name/args` payload and returns -`decision: allow|deny`; Bash savings use a `PreToolUse` `overwrite`. Antigravity -`PostToolUse` cannot replace tool output, so tokenix does not install a no-op post hook. - -### All tools at once - -```bash -tokenix install-hook --tool all -``` - -`--tool all` intentionally skips OpenCode. Use `tokenix install-hook --tool opencode` explicitly when you want tokenix to write a repo-local `opencode.json` MCP registration. - ---- - -## 📖 Commands Reference - -> Run bare `tokenix` (or `tokenix --help`) for an audience-grouped -> command catalog with examples. The reference below mirrors that grouping: -> **AI agent commands** (the LLM/hooks drive these for token-lean retrieval) vs -> **human commands** (setup, ops & analytics you run yourself). 
- -### 🤖 AI agent commands - -| Command | Description | -|---|---| -| `tokenix context TEXT` | One-call task context: entry points, relevant source, compact outlines, strict budget modes | -| `tokenix explore TEXT` | Graph-aware exploration: entry points, relationships, grouped source | -| `tokenix query TEXT` | Semantic search over indexed chunks | -| `tokenix grep PATTERN` | Exact regex/literal search over indexed content (no embedding) | -| `tokenix read FILE` | Smart reader — outline for large files, full for small (`--symbol`, `--lines`, `--mode full\|outline\|signatures\|diff\|density:X`) | -| `tokenix symbols QUERY` | Find indexed symbols by name or path (`--kind` filters by symbol type) | -| `tokenix callers SYMBOL` | Show symbols that call/reference a symbol | -| `tokenix callees SYMBOL` | Show symbols called/referenced by a symbol | -| `tokenix deps FILE` | File-level import dependencies (`--reverse`, `--transitive`, `--json`) | -| `tokenix impact SYMBOL` | Bidirectional impact graph (`--format html\|mermaid` for vis.js graph or Mermaid flowchart) | -| `tokenix flow SYMBOL` | Forward call-flow trace from a symbol (`--depth`, `--format text\|mermaid`) | -| `tokenix graph` | Repo-wide symbol-graph overview — god nodes, bottlenecks, blast-radius leaders (`--format text\|dot\|json`, `--top N`, `--output`) | -| `tokenix pack` | Budgeted repo pack for non-hook AI tools (`--mode/--profile`, `--changed`, `--token-map`) | -| `tokenix memory add TEXT` | Save a preference (`--global` or `--project`) for future context | -| `tokenix memory list` | List global and project preferences | -| `tokenix memory remove TEXT` | Remove preferences matching text | -| `tokenix memory edit TEXT` | Replace preferences matching text | - -### 🧑 Human commands - -| Command | Description | -|---|---| -| `tokenix` (no args) | Open the [interactive dashboard](#-interactive-dashboard) — Stats · Filters · Studio · Gain · Usage · Doctor · Tokenmap · Graph · Secrets · Egress tabs; 
piped/non-TTY falls back to help | -| `tokenix filter` (no args) | Open the dashboard on the Filters tab; piped falls back to `filter list` | -| `tokenix index [PATH]` | Index the repo at PATH (default `.`) | -| `tokenix install-hook` | Install assistant hook/instructions (default `--tool all`) | -| `tokenix remove-hook` | Remove assistant hook/instructions (default `--tool all`) | -| `tokenix install-binary` | Copy the running executable to a per-user global bin dir (`%LOCALAPPDATA%\tokenix\bin` on Windows, `~/.local/bin` on Linux/macOS) and ensure it is on PATH (Windows: user PATH updated automatically; Linux/macOS: prints the shell-profile line) | -| `tokenix doctor` | Diagnose embedding backend, GPU availability, model cache, daemon, bundled filter inventory (filter + golden-case counts), active recording session, and user/local filter config (unknown `semantic_filter.model`, bad threshold) | -| `tokenix serve` | Start the background embedding daemon (keeps model + index in RAM) | -| `tokenix stop` | Stop the background daemon | -| `tokenix daemon status\|stop\|restart` | Inspect (pid, port, uptime, model, cache RAM) or control the daemon | -| `tokenix gain` | Token savings analytics with a by-source split — measured Read savings vs command filters; semantic Grep is neutral usage (`--cost-estimate` adds a per-model cost table) | -| `tokenix usage` | Absolute token spend + ≈USD cost from agent transcripts (`daily\|weekly\|monthly\|session\|model\|project\|blocks`, `--since/--until`, `--all-projects`, `--cost-mode`, `--statusline`, `--json`) | -| `tokenix stats` | Index statistics (files, chunks, tokens, age) | -| `tokenix tokenmap` | Directory tree map with token counts, heaviest paths first, plus a top-10 files summary (`--format html` supported) | -| `tokenix benchmark` | Reproducible token-savings and retrieval-quality benchmark — vanilla vs tokenix (`--json`) | -| `tokenix filter list` | Show top Bash commands by tokens wasted (no filter yet) | -| `tokenix 
filter active` | Show active user and bundled output filters | -| `tokenix filter generate [CMD]` | AI-generate a TOML output filter for a command | -| `tokenix filter record [CMD]` | Record real command output for richer filter generation | -| `tokenix prompt-audit` | Audit MCP/tool token weight across agents; warns on bloat (`--agent`, `--json`, `--recommend`, `--profile-impact`) | -| `tokenix session-audit` | Token-economy health check: index, hook events, MCP/tool weight, cache hygiene | -| `tokenix conversation-audit` | Scan local AI conversation histories for token-waste patterns (`--agent`, `--min-chars`, `--limit`, `--json`) | -| `tokenix scan-secrets` | Scan AI agent conversation transcripts for exposed credentials, gitleaks-style; attributes each to its repo + git branch (`--agent`, `--filter`, `--group`, `--reveal`, `--json`) | -| `tokenix egress-audit` | Scan AI agent conversation transcripts for external DNS/IP destinations and validate hosts against local safe/dangerous reputation lists (`--agent`, `--filter`, `--group`, `--safe`, `--json`) | -| `tokenix artifacts list` | List context artifacts defined in `.tokenix/artifacts.json` | -| `tokenix artifacts show NAME` | Show context artifact content | -| `tokenix cycles` | Detect circular dependencies in the symbol graph using Tarjan's SCC algorithm | -| `tokenix rebuild-graph` | Rebuild graph tables from existing chunks without re-embedding | - -### ⚙ Internal (invoked by hooks/agents, not by hand) - -| Command | Description | -|---|---| -| `tokenix hook` | `PreToolUse` handler — intercepts large reads, semantic grep, and noisy Bash/PowerShell commands (called by AI tools) | -| `tokenix hook-post` | Legacy `PostToolUse` compatibility handler | -| `tokenix run -- CMD` | Run a command and compress its output through tokenix filters | -| `tokenix mcp` | MCP server exposing context, read/search, graph, and gain tools (`--profile slim\|full`) | - -
-Selected flags - -**Global** - -| Flag | Description | -|---|---|---| -| `--only-cpu` | Force CPU embedding even on a GPU-enabled build (no-op on CPU-only builds) | -| `TOKENIX_BRANCH_AWARE=true` | Env var: suffix SQLite DB per git branch (isolate indexes per branch) | - -**`tokenix index`** — `--force/-f`, `--cpu-profile `, `--jobs N`, `--embed-batch N` (default 16 CPU / 64 GPU), `--if-stale`, `--path/-p`, `--model `, `--no-low-priority` (indexing runs at below-normal OS priority by default; this flag or `TOKENIX_FOREGROUND=1` keeps normal priority) - -**Embedding model** — default `nomic-v1.5`. Select another with `tokenix index --model ` or `TOKENIX_EMBED_MODEL=`; run `tokenix doctor` to list available ids (`nomic-v1.5`, `bge-small`, `bge-base`, `minilm-l6`, `e5-small`, `jina-code`). The model is stamped into the index and read back at query time, so search always matches what was indexed; it is sticky across re-indexes and an explicit switch re-embeds. `nomic-v1.5` (768d) is the quality default; `bge-small` (384d) indexes faster; `e5-small` is multilingual; `jina-code` is code-specialized (a custom ONNX downloaded from Hugging Face on first use). Existing indexes keep working unchanged. 
- -**`tokenix query`** — `--budget/-b` (1200), `--k` (20), `--file/-f`, `--link` (cross-project, repeatable), `--json`, `--path/-p` - -**`tokenix symbols`** — `--limit/-l` (20), `--kind/-k `, `--json`, `--path/-p` - -**`tokenix deps`** — `--reverse` (files importing the target), `--transitive` (follow resolved imports), `--json`, `--path/-p` - -**`tokenix grep`** — `--limit/-l` (20), `--ignore-case/-i`, `--file/-f`, `--path/-p` - -**`tokenix context`** — `--mode `, `--budget/-b` (1200), `--max-files`, `--budget-breakdown`, `--path/-p` - -**`tokenix impact`** — `--depth/-d` (2), `--limit/-l` (50), `--format `, `--output/-o`, `--path/-p` - -**`tokenix flow`** — `--depth/-d` (3), `--format `, `--output/-o`, `--path/-p` - -**`tokenix install-hook` / `remove-hook`** — `--tool ` (default `all`), `--local` (Claude Code, Copilot, and Antigravity) - -**`tokenix pack`** — `--mode/--profile `, `--budget N` (8000), `--format `, `--changed`, `--since REF`, `--token-map`, `--output/-o` - -**`tokenix benchmark`** — `--budget N` (1200), `--json`, `--refresh-index`, `--cases FILE` - -**`tokenix prompt-audit`** — `--agent ` (default `all`), `--json`, `--recommend`, `--profile-impact` - -**`tokenix session-audit`** — `--json`, `--cache-hygiene`, `--path/-p` - -**`tokenix conversation-audit`** — `--agent ` (default `all`), `--min-chars N` (default `5000`), `--limit N` (default `30`), `--json`. Scans local conversation stores for token-waste patterns: full file reads, huge command/log outputs, bootstrap/system prompts, duplicated hook payloads, MCP/tool schemas, diff/test logs, task context blobs, image base64 payloads, connector JSON, build artifacts, provider signatures, documentation blobs, and oversized patches. 
- -**`tokenix scan-secrets`** — `--agent ` (default `all`), `--filter ` (case-insensitive match over rule/agent/file/value/repo/branch), `--group ` (default `none`; `value` collapses each distinct secret into one block with its occurrence count, `repo` groups by the repository the secret was exposed in), `--reveal` (print raw values instead of redacting — warns on stderr), `--json`. Each finding is attributed to its **repository + git branch** when recoverable: Claude transcripts carry an exact `cwd`/`gitBranch` per message; otherwise the project directory is used as a best-effort `~slug:`/`~dir:` label. Scans each agent's conversation transcripts under `~` (Claude `~/.claude/projects`, Gemini `~/.gemini/tmp,history`, Copilot `~/.copilot/session-state,logs`, Antigravity `~/.gemini/antigravity`) for credential patterns; output is redacted by default and exit code is `1` when findings exist. Patterns are TOML `[[rules]]` (`id`, `pattern`, optional `capture`/`min_entropy`): bundled defaults in `assets/secret-rules/`, extended/overridden by `/.tokenix/secret-rules/*.toml` then `~/.tokenix/secret-rules/*.toml` (later sources win on matching `id`). - -**`tokenix mcp`** — `--profile ` (default `full`) - -
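The `scan-secrets` rule layering described above (bundled defaults, then project rules, then user rules, with later sources winning on a matching `id`) can be pictured as an ordered merge keyed by `id`. The following is a minimal sketch under stated assumptions, not tokenix's actual loader: the `Rule` struct, `rule` helper, `merge_rules` function, and the rule ids and patterns are all hypothetical, and TOML parsing is left out entirely.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory form of one `[[rules]]` entry (illustration only).
#[derive(Debug, Clone, PartialEq)]
struct Rule {
    id: String,
    pattern: String,
}

/// Convenience constructor used by the example below (hypothetical helper).
fn rule(id: &str, pattern: &str) -> Rule {
    Rule { id: id.to_string(), pattern: pattern.to_string() }
}

/// Merge rule layers in load order. A later layer replaces any earlier rule
/// that shares its `id`; rules with new ids are simply added.
/// The result is sorted by `id` because a BTreeMap is used.
fn merge_rules(layers: &[Vec<Rule>]) -> Vec<Rule> {
    let mut by_id: BTreeMap<String, Rule> = BTreeMap::new();
    for layer in layers {
        for r in layer {
            by_id.insert(r.id.clone(), r.clone());
        }
    }
    by_id.into_values().collect()
}

fn main() {
    // Assumed example data: ids and patterns are made up for the sketch.
    let bundled = vec![rule("aws-key", "AKIA[0-9A-Z]{16}"), rule("generic-token", "token=\\S+")];
    let project = vec![rule("aws-key", "(AKIA|ASIA)[0-9A-Z]{16}")]; // overrides bundled
    let user = vec![rule("internal-key", "corp_[a-z0-9]{32}")]; // adds a new rule

    for r in merge_rules(&[bundled, project, user]) {
        println!("{} => {}", r.id, r.pattern);
    }
}
```

Keying the merge on `id` keeps an override local to one rule: a project can tighten a single bundled pattern without copying the rest of the defaults.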
- ---- - -## 🧠 Supported Languages - -| Language | Extensions | Symbol types | -|---|---|---| -| Rust | `.rs` | `fn`, `struct`, `enum`, `impl`, `trait`, `mod` | -| Python | `.py` | `def`, `async def`, `class` | -| TypeScript | `.ts`, `.tsx` | `function`, `class`, `interface`, `type`, arrow functions | -| JavaScript | `.js`, `.jsx`, `.mjs`, `.cjs` | `function`, `class`, arrow functions | -| Go | `.go` | `func`, `type` | -| C / C++ | `.c`, `.cpp`, `.h`, `.hpp`, `.cc`, `.cxx` | `function`, `class`, `struct`, `namespace` | -| Config / Docs | `.toml`, `.md`, `.txt`, `.sh`, `.bash` | line blocks | -| Data files (opt-in) | `.json`, `.yaml`, `.yml` | Indexed only when `data_files = true` in `.tokenix.toml` | -| **Custom** | any extension | Mapped to an existing parser via `.tokenix.toml` | - -Languages without a symbol-aware chunker (Java, C#, Ruby, Swift, Kotlin, …) are not indexed by default — blind line-block chunking produces low-quality search results. - -### Custom language mapping - -Create a `.tokenix.toml` (or `tokenix.toml`) in the project root: - -```toml -[languages] -pyi = "python" # Python stub files -mts = "typescript" # TypeScript module files -lua = "generic" # use sliding-window chunks -``` - -Valid parser values: `rust`, `python`, `typescript`, `javascript`, `go`, `cpp`, `c`, `generic`. - -### Hook tuning - -The same `.tokenix.toml` accepts a `[hook]` section to tune when the -`PreToolUse` hook intercepts: - -```toml -[hook] -read_min_lines = 120 # outline files with >= this many lines (default 200) -grep_min_words = 3 # treat Grep patterns with >= this many words as semantic (default 3; neutral in gain) -``` - -Lower `read_min_lines` to intercept more reads (saving more tokens); raise it -when you prefer verbatim file content. The fail-open contract is unchanged — -hook errors never break the session. 
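As a rough illustration of how the two `[hook]` thresholds might gate interception, the sketch below models the decision in isolation. Only the two keys and their documented defaults (200 lines, 3 words) come from the text above; `HookConfig`, `ReadAction`, `decide_read`, `is_semantic_grep`, and `fail_open` are hypothetical names, and the real hook may apply further conditions.

```rust
/// Hypothetical mirror of the `[hook]` section; field names follow the TOML keys.
#[derive(Debug, Clone, Copy)]
struct HookConfig {
    read_min_lines: usize,
    grep_min_words: usize,
}

impl Default for HookConfig {
    fn default() -> Self {
        // Defaults as documented: outline at >= 200 lines, semantic at >= 3 words.
        Self { read_min_lines: 200, grep_min_words: 3 }
    }
}

#[derive(Debug, PartialEq)]
enum ReadAction {
    Outline,     // replace the read with a compact outline
    PassThrough, // let the tool read the file verbatim
}

/// Files at or above the threshold are outlined; smaller files pass through.
fn decide_read(cfg: &HookConfig, line_count: usize) -> ReadAction {
    if line_count >= cfg.read_min_lines {
        ReadAction::Outline
    } else {
        ReadAction::PassThrough
    }
}

/// Grep patterns with enough whitespace-separated words are treated as semantic.
fn is_semantic_grep(cfg: &HookConfig, pattern: &str) -> bool {
    pattern.split_whitespace().count() >= cfg.grep_min_words
}

/// Assumed shape of the fail-open contract: any hook error degrades to pass-through.
fn fail_open(result: Result<ReadAction, String>) -> ReadAction {
    result.unwrap_or(ReadAction::PassThrough)
}

fn main() {
    let tuned = HookConfig { read_min_lines: 120, ..HookConfig::default() };
    println!("150 lines, tuned:   {:?}", decide_read(&tuned, 150));
    println!("150 lines, default: {:?}", decide_read(&HookConfig::default(), 150));
    println!("semantic grep:      {}", is_semantic_grep(&tuned, "where is auth handled"));
    println!("hook error:         {:?}", fail_open(Err("index locked".to_string())));
}
```

With `read_min_lines = 120`, a 150-line file is outlined, while the default of 200 would leave it untouched.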
- ---- - -## 🔧 Output Filters - -tokenix reduces noisy shell output by rewriting matching `Bash` commands in `PreToolUse` so they run through `tokenix run` before the agent sees the result. Filtering happens in three layers (highest priority first): - -1. **Local project filters** — `.toml` files in `.tokenix/filters/` inside the repo. Scoped to the project, committed to version control. -2. **User filters** — `.toml` files in `~/.tokenix/filters/`. Apply to all projects, override bundled filters. -3. **Bundled filters** — 386 TOML output filters shipped inside the binary (each homologated against 800 embedded golden cases), covering `uv`, `cargo build`/`cargo run`/`cargo audit`, `git`, `gradle`, `terraform plan`, `make`, `npm`/`npm audit`, `pnpm`, `bun`, `deno`, `vite`, `node --test`, `poetry`, `docker`, `kubectl`/`kubectl top`, `helm`, `go`, `rust`, `python`, `dotnet`, `swift`, `apt`/`apt-get`, `journalctl`, `trivy`, `semgrep`, `bazel`, `ctest`, `tox`, `conda`/`mamba`, `pulumi up`/`preview`/`destroy`, `dnf`/`yum`, `pacman`, `apk`, `pip-audit`, `ng test` (Karma), `bru` (Bruno), `ps`, and more. Applied automatically — no setup needed. - -### Filter format - -```toml -[filters.uv-sync] -description = "Compact uv sync output" -match_command = "^uv\\s+(sync|pip\\s+install)\\b" -strip_ansi = true -strip_lines_matching = ["^\\s*$", "^\\s+Downloading ", "^\\s+Using cached "] -match_output = [ - { pattern = "Audited \\d+ package", message = "ok (up to date)" }, -] -max_lines = 20 -on_empty = "uv: ok" -``` - -| Field | Description | -|---|---| -| `match_command` | Rust regex matched against the command. Compound commands are split (quote-aware) on `&&`, `\|\|`, `;`, and `\|`, and each segment is matched independently, so anchoring on the base command (e.g. 
`^gitleaks\b`) still matches `cd repo && gitleaks`, `cd repo;gitleaks`, or `producer \| gitleaks` | -| `strip_ansi` | Remove ANSI colour codes before filtering | -| `strip_lines_matching` | Drop lines matching any of these regex patterns | -| `keep_lines_matching` | Keep only lines matching these patterns | -| `match_output` | Short-circuit: if output matches `pattern`, return `message` immediately; use `unless` for error/warning guards | -| `max_lines` / `head_lines` / `tail_lines` | Truncate output | -| `truncate_lines_at` | Truncate individual lines at N characters | -| `on_empty` | Message to return when filtering produces empty output. **Never emitted if the original output carries a generic failure signal** (`error`/`fatal`/`panic`/`FAILED`/`exit code N`…) — the engine falls back to a bounded view of the real output so a failed command is never masked as success, even when its error format isn't recognized by `keep_lines_matching` | -| `passthrough_when_emptied` | When the filter reduces *non-empty* output to nothing (an unexpected output shape the keep/extract rules don't recognize), show a bounded view of the real output instead of `on_empty` — so format-specific filters never report a false "nothing here" (e.g. `git log --oneline` against the full-log filter) | -| `filter_stderr` | Opt in to applying this command-specific filter to stderr. 
Without it, stderr uses generic safe compression so command errors are not turned into success sentinels | - -### AI-assisted filter generation - -```bash -tokenix filter list # commands wasting the most tokens (no filter yet) -tokenix filter active # all active user + bundled filters -tokenix filter record "cargo test" # capture real output for richer generation -tokenix filter generate "cargo test" # generate a TOML filter via a local AI CLI -``` - ---- - -## 🏗 Architecture - -``` -src/ -├── main.rs CLI entry (clap), command dispatch, install-hook helpers -├── chunker.rs Symbol-aware AST chunking (Tree-sitter) + dynamic language config (.tokenix.toml) -├── embed.rs fastembed ONNX: embed_documents(), embed_query() — optional GPU via ort features -├── store.rs SQLite schema, CRUD, FTS5, hybrid search, incremental branch fingerprint check -├── indexer.rs File walker + incremental index pipeline (parallel chunking + batch embedding) -├── query.rs Hybrid semantic + sparse FTS5 ranking, strict context modes, token-budget selection -├── pack.rs Budgeted repo pack generation for non-hook AI tools, changed packs, token maps -├── graph.rs Symbol relationship graph, cycle detection (Tarjan's SCC), HTML/Mermaid export -├── artifacts.rs Context artifacts — parse `.tokenix/artifacts.json`, read non-code content -├── hook.rs PreToolUse handler — Claude-, Copilot-, and grep_search/run_in_terminal-style JSON input -├── daemon.rs Background TCP server — holds model + in-memory embedding cache -├── compress.rs Legacy PostToolUse compatibility pipeline for tool-output rewriting -├── filters.rs FilterDef, load_local/user/bundled_filters(), priority merge, apply_filter() -├── cmd_filter.rs `tokenix filter` subcommands (list, active, generate, record) -├── recordings.rs Capture/replay real command output for filter generation -├── memory.rs Global/project preference memory (editable Markdown) -├── gain.rs Analytics from the hook log — per-model cost table -├── benchmark.rs Reproducible 
savings + retrieval-quality benchmark -├── doctor.rs Backend / GPU / model-cache / daemon diagnostics -├── mcp.rs Model Context Protocol server (full and slim profiles) -└── mcp_audit.rs Multi-agent MCP config discovery + live tools/list introspection (prompt/session audit) - -assets/ -└── filters/ 386 TOML output filters (+800 golden cases), embedded in the binary via rust-embed -``` - -### GPU acceleration (opt-in) - -A default build runs embeddings on CPU. Compile with a GPU feature to use the GPU — it then becomes the **default at runtime, with automatic CPU fallback** if the provider is unavailable: - -```bash -# Windows — DirectML (works with any D3D12-capable GPU, no CUDA toolkit required) -cargo install --path . --features directml --locked - -# Linux / Windows — CUDA (needs CUDA 12.x + cuDNN 9.x installed and on PATH; -# ort rc.9 does not support CUDA 13 yet) -cargo install --path . --features cuda --locked -``` - -On a GPU build, force CPU per-invocation with the global `--only-cpu` flag: - -```bash -tokenix index . # uses the GPU -tokenix --only-cpu index . # forces CPU on a GPU build -``` - -`--embed-batch` drives peak memory (default 16 on CPU, 64 on GPU) — lower it if RAM/VRAM is tight. Run `tokenix doctor` to see the compiled backend, detected GPU, CUDA/cuDNN status, and tailored recommendations. - -### Daemon - -The background daemon (`tokenix serve`) keeps the ONNX model and project embeddings in RAM (int8-quantized — 4x less memory than f32). Hook calls route over TCP loopback instead of re-loading the model on each subprocess invocation, and it auto-starts on the first Grep hook call — you don't need to run it manually. Manage it with `tokenix daemon status` (pid, port, uptime, model, cache size), `tokenix daemon stop`, and `tokenix daemon restart`. 
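The "4x less memory" figure follows from storing one `i8` (1 byte) per dimension instead of one `f32` (4 bytes). The sketch below shows one common way to do this, symmetric max-abs scaling with a single scale per vector, and why cosine similarity can be computed directly on the int8 codes: the scale multiplies both the dot product and the norms, so it cancels. This is an illustration under assumptions, not tokenix's actual quantizer; the function names and the scaling scheme are hypothetical.

```rust
/// Quantize an f32 vector to int8 with one per-vector scale
/// (symmetric max-abs scaling; an assumed scheme for illustration).
fn quantize(v: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = v.iter().fold(0.0_f32, |m, x| m.max(x.abs()));
    if max_abs == 0.0 {
        return (vec![0; v.len()], 1.0);
    }
    let scale = max_abs / 127.0;
    let q = v.iter().map(|x| (x / scale).round() as i8).collect();
    (q, scale)
}

/// Cosine similarity on the raw int8 codes. The per-vector scales are not
/// needed here because they appear in both numerator and denominator.
fn cosine_q8(a: &[i8], b: &[i8]) -> f32 {
    let (mut dot, mut na, mut nb) = (0i64, 0i64, 0i64);
    for (&x, &y) in a.iter().zip(b) {
        let (x, y) = (x as i64, y as i64);
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0 || nb == 0 {
        return 0.0;
    }
    dot as f32 / ((na as f32).sqrt() * (nb as f32).sqrt())
}

/// Reference cosine similarity on the original f32 vectors.
fn cosine_f32(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 0.0;
    }
    dot / (na * nb)
}

fn main() {
    // Toy 4-dimensional vectors; real embeddings have hundreds of dimensions.
    let a = [0.12_f32, -0.40, 0.33, 0.05];
    let b = [0.10_f32, -0.35, 0.40, -0.02];
    let (qa, _) = quantize(&a);
    let (qb, _) = quantize(&b);
    println!("f32 cosine:  {:.4}", cosine_f32(&a, &b));
    println!("int8 cosine: {:.4}", cosine_q8(&qa, &qb));
}
```

The only error left is the rounding to the nearest int8 step, which is why the two cosine values differ slightly rather than exactly matching.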
- -### Embedding model - -| Property | Value | -|---|---| -| Model | `nomic-embed-text-v1.5` (quantized) | -| Dimensions | 768 | -| File size | ~130 MB | -| Cache location | `%LOCALAPPDATA%\tokenix\models` (Windows) / `~/.cache/tokenix/models` (Linux/macOS) | -| Download | Automatic on first run | -| Runtime | fastembed (ONNX Runtime, in-process) | - -Index storage lives at `~/.tokenix/.db` (one DB per project). Embeddings are stored as **int8-quantized** blobs (4x smaller than f32, near-identical recall — the per-vector scale cancels out of the cosine) and similarity is computed in Rust — no external vector database needed. Indexes created before quantization migrate automatically (re-encode only, no re-embedding) on the next `tokenix index`; `tokenix doctor` reports migration coverage. - ---- - -## 🔒 Security - -tokenix's build and release pipeline is hardened against supply-chain attacks: -SHA-pinned GitHub Actions, least-privilege workflow permissions, `cargo-deny` -(advisories + license + crates.io-only sources), `zizmor` workflow analysis, -OpenSSF Scorecard, SLSA build-provenance attestations, and tokenless crates.io -publishing via OIDC. See [SECURITY.md](SECURITY.md) for the disclosure policy -and release-verification steps. - ---- - -## 🤝 Contributing - -Contributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for how to get started. - ---- - -## 📄 License - -[MIT](LICENSE) - - +> **tokenix** is a local-first Rust CLI that helps AI coding agents understand a repository without dumping huge files into the prompt. It indexes your code, finds relevant chunks by meaning, returns compact file outlines, and can hook into AI tools to replace noisy reads and command output with smaller, more useful context. Works with Claude Code, GitHub Copilot, OpenAI Codex CLI, OpenCode, Gemini, and any MCP client. 
**No Ollama or external server required.** \ No newline at end of file diff --git a/assets/filters/airflow.toml b/assets/filters/airflow.toml new file mode 100644 index 0000000..697bb7f --- /dev/null +++ b/assets/filters/airflow.toml @@ -0,0 +1,37 @@ +[filters.airflow] +description = "Keep airflow task state transitions and errors, drop INFO setup logs" +match_command = "^airflow\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "Marking task as SUCCESS", + "Marking task as FAILED", + "Task exited with return code", + "(?i)error", + "(?i)exception", + "(?i)traceback", +] +max_lines = 80 +on_empty = "airflow: task done" +token_budget = 2000 + +[[tests.airflow]] +name = "task state and error kept" +input = """ +[2024-06-01 12:00:00] INFO - Executing task on 2024-06-01 +[2024-06-01 12:00:01] INFO - Started process to run task +[2024-06-01 12:00:05] ERROR - Task failed with exception +[2024-06-01 12:00:05] INFO - Marking task as FAILED +""" +expected = """[2024-06-01 12:00:05] ERROR - Task failed with exception +[2024-06-01 12:00:05] INFO - Marking task as FAILED""" + +[[tests.airflow]] +name = "clean run collapses" +input = """ +[2024-06-01 12:00:00] INFO - Executing task on 2024-06-01 +[2024-06-01 12:00:01] INFO - Started process to run task +""" +expected = "airflow: task done" diff --git a/assets/filters/alex.toml b/assets/filters/alex.toml new file mode 100644 index 0000000..3b63350 --- /dev/null +++ b/assets/filters/alex.toml @@ -0,0 +1,33 @@ +[filters.alex] +description = "Keep alex insensitive-language warnings and totals, drop filenames" +match_command = "^(npx\\s+)?alex\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "\\d+:\\d+(-\\d+:\\d+)?\\s+(warning|error)", + "✖", + "problems?\\b", + "(?i)error", +] +max_lines = 100 +on_empty = "alex: no issues" + +[[tests.alex]] +name = "warnings and total kept" +input = """ +README.md + 5:10-5:18 warning `master` may be insensitive, use 
`main` master retext-equality + +✖ 1 warning +""" +expected = """ 5:10-5:18 warning `master` may be insensitive, use `main` master retext-equality +✖ 1 warning""" + +[[tests.alex]] +name = "clean run collapses" +input = """ +README.md +""" +expected = "alex: no issues" diff --git a/assets/filters/ansible-galaxy.toml b/assets/filters/ansible-galaxy.toml new file mode 100644 index 0000000..213d6c7 --- /dev/null +++ b/assets/filters/ansible-galaxy.toml @@ -0,0 +1,38 @@ +[filters.ansible-galaxy] +description = "Keep ansible-galaxy install results and errors, drop download progress" +match_command = "^ansible-galaxy\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^- downloading ", + "^Starting galaxy", + "^Process install", +] +keep_lines_matching = [ + "was installed successfully", + "is already installed", + "(?i)error", + "(?i)\\[warning\\]", +] +max_lines = 80 +on_empty = "ansible-galaxy: nothing installed" + +[[tests.ansible-galaxy]] +name = "install results kept" +input = """ +Starting galaxy role install process +- downloading role 'nginx', owned by geerlingguy +- geerlingguy.nginx (3.1.4) was installed successfully +- downloading role 'postgresql' +- geerlingguy.postgresql (3.4.0) was installed successfully +""" +expected = """- geerlingguy.nginx (3.1.4) was installed successfully +- geerlingguy.postgresql (3.4.0) was installed successfully""" + +[[tests.ansible-galaxy]] +name = "install error kept" +input = """ +Starting galaxy role install process +ERROR! - the role 'missing.role' was not found +""" +expected = "ERROR! 
- the role 'missing.role' was not found" diff --git a/assets/filters/argocd.toml b/assets/filters/argocd.toml new file mode 100644 index 0000000..3e7d408 --- /dev/null +++ b/assets/filters/argocd.toml @@ -0,0 +1,50 @@ +[filters.argocd] +description = "Keep argocd app sync/health status and errors, drop verbose resource tree" +match_command = "^argocd\\b" +strip_ansi = true +keep_lines_matching = [ + "^Name:", + "^Project:", + "^Namespace:", + "Sync Status:", + "Health Status:", + "^Phase:", + "^Message:", + "OutOfSync", + "(?i)error", + "(?i)fail", +] +max_lines = 40 +on_empty = "argocd: ok" + +[[tests.argocd]] +name = "status fields kept, resource tree dropped" +input = """ +Name: myapp +Project: default +Server: https://kubernetes.default.svc +Namespace: prod +URL: https://argocd.example.com/applications/myapp +Sync Status: Synced to main (abc1234) +Health Status: Healthy + +GROUP KIND NAMESPACE NAME STATUS HEALTH +apps Deployment prod web Synced Healthy + Service prod web Synced Healthy +""" +expected = """Name: myapp +Project: default +Namespace: prod +Sync Status: Synced to main (abc1234) +Health Status: Healthy""" + +[[tests.argocd]] +name = "out of sync kept" +input = """ +Name: myapp +Sync Status: OutOfSync from main +Health Status: Degraded +""" +expected = """Name: myapp +Sync Status: OutOfSync from main +Health Status: Degraded""" diff --git a/assets/filters/astro.toml b/assets/filters/astro.toml new file mode 100644 index 0000000..f1d742c --- /dev/null +++ b/assets/filters/astro.toml @@ -0,0 +1,40 @@ +[filters.astro] +description = "Keep astro build totals and errors, drop per-page generated lines" +match_command = "^(npx\\s+)?astro\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "▶ ", + "└─ ", + "^\\s*generating ", +] +keep_lines_matching = [ + "page\\(s\\) built", + "Complete!", + "Server built", + "built in", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "astro: build done" + +[[tests.astro]] +name = "build totals kept, 
per-page dropped" +input = """ +12:00:00 [build] Building static entrypoints... +▶ src/pages/index.astro +└─ /index.html (+12ms) +12:00:01 [build] 10 page(s) built in 1.23s +12:00:01 [build] Complete! +""" +expected = """12:00:01 [build] 10 page(s) built in 1.23s +12:00:01 [build] Complete!""" + +[[tests.astro]] +name = "build error kept" +input = """ +▶ src/pages/index.astro +[ERROR] Could not resolve "./missing" +""" +expected = """[ERROR] Could not resolve "./missing\"""" diff --git a/assets/filters/atlas.toml b/assets/filters/atlas.toml new file mode 100644 index 0000000..847655b --- /dev/null +++ b/assets/filters/atlas.toml @@ -0,0 +1,39 @@ +[filters.atlas] +description = "Keep atlas migration version/status and errors, drop raw SQL" +match_command = "^atlas\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*-> ", + "No migration files", +] +keep_lines_matching = [ + "Migrating to version", + "migrating version", + "^\\s*-- ok", + "Current version", + "(?i)error", +] +max_lines = 80 +on_empty = "atlas: schema up to date" + +[[tests.atlas]] +name = "migration steps kept, sql dropped" +input = """ +Migrating to version 20240101 (1 migrations in total): + + -- migrating version 20240101 + -> CREATE TABLE users (id int primary key); + -> CREATE INDEX idx ON users (id); + -- ok (12.3ms) +""" +expected = """Migrating to version 20240101 (1 migrations in total): + -- migrating version 20240101 + -- ok (12.3ms)""" + +[[tests.atlas]] +name = "up to date collapses" +input = """ +No migration files to execute +""" +expected = "atlas: schema up to date" diff --git a/assets/filters/attw.toml b/assets/filters/attw.toml new file mode 100644 index 0000000..66d020b --- /dev/null +++ b/assets/filters/attw.toml @@ -0,0 +1,40 @@ +[filters.attw] +description = "Keep are-the-types-wrong problem rows, drop OK rows and box borders" +match_command = "^(npx\\s+)?attw\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^[┌├└╭╞╰][─┬┼┴┄]+", + "🟢", +] 
+keep_lines_matching = [ + "💥", + "❌", + "⚠️", + "Masquerading", + "no problems found", + "(?i)error", +] +max_lines = 60 +on_empty = "attw: no problems found" + +[[tests.attw]] +name = "problem rows kept, ok rows dropped" +input = """ +mypackage v1.0.0 + +┌───────────────────┬──────────────────────┐ +│ node10 │ 🟢 │ +├───────────────────┼──────────────────────┤ +│ node16 (from CJS) │ 💥 Masquerading as ESM │ +└───────────────────┴──────────────────────┘ +""" +expected = """│ node16 (from CJS) │ 💥 Masquerading as ESM │""" + +[[tests.attw]] +name = "clean package collapses" +input = """ +mypackage v1.0.0 +🟢 No problems found +""" +expected = "attw: no problems found" diff --git a/assets/filters/autoflake.toml b/assets/filters/autoflake.toml new file mode 100644 index 0000000..82a78e3 --- /dev/null +++ b/assets/filters/autoflake.toml @@ -0,0 +1,42 @@ +[filters.autoflake] +description = "Keep autoflake changed-file lines and diffs, collapse clean runs" +match_command = "^autoflake\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^Fixing ", + "^---", + "^\\+\\+\\+", + "^@@", + "^[+-][^+-]", + "(?i)error", +] +max_lines = 100 +on_empty = "autoflake: nothing to fix" + +[[tests.autoflake]] +name = "fix diff kept" +input = """ +Fixing src/app.py + +--- original/src/app.py ++++ fixed/src/app.py +@@ -1,4 +1,2 @@ + import os +-import sys +-import json + print(os.getcwd()) +""" +expected = """Fixing src/app.py +--- original/src/app.py ++++ fixed/src/app.py +@@ -1,4 +1,2 @@ +-import sys +-import json""" + +[[tests.autoflake]] +name = "clean run collapses" +input = "" +expected = "autoflake: nothing to fix" diff --git a/assets/filters/ava.toml b/assets/filters/ava.toml new file mode 100644 index 0000000..9b39d73 --- /dev/null +++ b/assets/filters/ava.toml @@ -0,0 +1,38 @@ +[filters.ava] +description = "Keep AVA failures and result counts, drop passing test lines" +match_command = "^(npx\\s+)?ava\\b" +strip_ansi = true 
+strip_lines_matching = [ + "^\\s*$", + "^\\s*✔", +] +keep_lines_matching = [ + "✖", + "tests? (passed|failed)", + "Rejected promise", + "(?i)error", +] +max_lines = 80 +on_empty = "ava: all tests passed" + +[[tests.ava]] +name = "failures and counts kept" +input = """ + ✔ passes test one + ✔ passes test two + ✖ fails test three + + 2 tests passed + 1 test failed +""" +expected = """ ✖ fails test three + 2 tests passed + 1 test failed""" + +[[tests.ava]] +name = "all pass collapses" +input = """ + ✔ passes test one + ✔ passes test two +""" +expected = "ava: all tests passed" diff --git a/assets/filters/buck2.toml b/assets/filters/buck2.toml new file mode 100644 index 0000000..67474fa --- /dev/null +++ b/assets/filters/buck2.toml @@ -0,0 +1,41 @@ +[filters.buck2] +description = "Keep buck2 build result and job summary, drop per-action progress" +match_command = "^buck2\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*Action ", + "^\\s*Cache ", + "^Watchman", +] +keep_lines_matching = [ + "BUILD SUCCEEDED", + "BUILD FAILED", + "Jobs completed", + "Time elapsed", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "buck2: build done" + +[[tests.buck2]] +name = "result and summary kept" +input = """ +Watchman fresh instance +Action: compile //app:lib +Jobs completed: 120. Time elapsed: 12.3s. +BUILD SUCCEEDED +""" +expected = """Jobs completed: 120. Time elapsed: 12.3s. 
+BUILD SUCCEEDED""" + +[[tests.buck2]] +name = "build failure kept" +input = """ +Action: compile //app:lib +BUILD FAILED +Caused by: compilation error in app/lib.rs +""" +expected = """BUILD FAILED +Caused by: compilation error in app/lib.rs""" diff --git a/assets/filters/buildah.toml b/assets/filters/buildah.toml new file mode 100644 index 0000000..37d5dbe --- /dev/null +++ b/assets/filters/buildah.toml @@ -0,0 +1,42 @@ +[filters.buildah] +description = "Keep buildah build STEP/COMMIT lines and errors, drop layer noise" +match_command = "^buildah\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^--> ", + "Copying blob", + "Copying config", +] +keep_lines_matching = [ + "^STEP ", + "^COMMIT ", + "Successfully tagged", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "buildah: ok" + +[[tests.buildah]] +name = "step and commit kept" +input = """ +STEP 1/2: FROM alpine +STEP 2/2: RUN apk add curl +--> a1b2c3d +COMMIT app +Successfully tagged localhost/app:latest +""" +expected = """STEP 1/2: FROM alpine +STEP 2/2: RUN apk add curl +COMMIT app +Successfully tagged localhost/app:latest""" + +[[tests.buildah]] +name = "error kept" +input = """ +STEP 2/2: RUN false +error building at STEP "RUN false": exit status 1 +""" +expected = """STEP 2/2: RUN false +error building at STEP "RUN false": exit status 1""" diff --git a/assets/filters/c8.toml b/assets/filters/c8.toml new file mode 100644 index 0000000..1748b53 --- /dev/null +++ b/assets/filters/c8.toml @@ -0,0 +1,34 @@ +[filters.c8] +description = "Keep c8 coverage table rows with uncovered lines and the All-files summary" +match_command = "^(npx\\s+)?c8\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^-+\\|", + "^\\s*-+$", +] +keep_lines_matching = [ + "% Stmts", + "All files", + "(?i)error", +] +max_lines = 120 +on_empty = "c8: coverage collected" + +[[tests.c8]] +name = "summary and header kept" +input = """ 
+----------|---------|----------|---------|---------|------------------- +File | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s +----------|---------|----------|---------|---------|------------------- +All files | 85.71 | 66.66 | 100 | 85.71 | + app.js | 85.71 | 66.66 | 100 | 85.71 | 12-14 +----------|---------|----------|---------|---------|------------------- +""" +expected = """File | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s +All files | 85.71 | 66.66 | 100 | 85.71 |""" + +[[tests.c8]] +name = "no table collapses" +input = "" +expected = "c8: coverage collected" diff --git a/assets/filters/cargo-add.toml b/assets/filters/cargo-add.toml new file mode 100644 index 0000000..2ab5f9f --- /dev/null +++ b/assets/filters/cargo-add.toml @@ -0,0 +1,36 @@ +[filters.cargo-add] +description = "Keep cargo add dependency lines, drop index update and feature dumps" +match_command = "^cargo\\s+add\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*Updating crates.io index", + "^\\s*Features:", + "^\\s*[-+] ", +] +keep_lines_matching = [ + "Adding ", + "(?i)error", + "(?i)warning", +] +max_lines = 40 +on_empty = "cargo add: ok" + +[[tests.cargo-add]] +name = "added deps kept, feature dump dropped" +input = """ + Updating crates.io index + Adding serde v1.0.197 to dependencies + Features: + + derive + - alloc +""" +expected = " Adding serde v1.0.197 to dependencies" + +[[tests.cargo-add]] +name = "error kept" +input = """ + Updating crates.io index +error: the crate `nonexistent` could not be found +""" +expected = "error: the crate `nonexistent` could not be found" diff --git a/assets/filters/cargo-expand.toml b/assets/filters/cargo-expand.toml new file mode 100644 index 0000000..683b895 --- /dev/null +++ b/assets/filters/cargo-expand.toml @@ -0,0 +1,33 @@ +[filters.cargo-expand] +description = "Bound cargo expand macro output, surface compile errors" +match_command = "^cargo\\s+expand\\b" +strip_ansi = true +strip_lines_matching = 
[ + "^\\s*$", + "^\\s*Checking ", + "^\\s*Compiling ", + "^\\s*Finished ", +] +max_lines = 150 +on_empty = "cargo expand: no output" +token_budget = 2500 + +[[tests.cargo-expand]] +name = "expanded code bounded, build lines dropped" +input = """ + Checking app v0.1.0 + Finished dev profile +#![feature(prelude_import)] +fn main() { + { ::std::io::_print(format_args!("hi\\n")); }; +} +""" +expected = """#![feature(prelude_import)] +fn main() { + { ::std::io::_print(format_args!("hi\\n")); }; +}""" + +[[tests.cargo-expand]] +name = "empty collapses" +input = "" +expected = "cargo expand: no output" diff --git a/assets/filters/cargo-fix.toml b/assets/filters/cargo-fix.toml new file mode 100644 index 0000000..d113848 --- /dev/null +++ b/assets/filters/cargo-fix.toml @@ -0,0 +1,34 @@ +[filters.cargo-fix] +description = "Keep cargo fix applied-fix lines and warnings, drop compile noise" +match_command = "^cargo\\s+fix\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*Checking ", + "^\\s*Compiling ", + "^\\s*Finished ", +] +keep_lines_matching = [ + "Fixed ", + "(?i)error", + "(?i)warning", +] +max_lines = 60 +on_empty = "cargo fix: no changes" + +[[tests.cargo-fix]] +name = "applied fixes kept" +input = """ + Checking app v0.1.0 + Fixed src/main.rs (2 fixes) + Finished dev profile in 1.2s +""" +expected = " Fixed src/main.rs (2 fixes)" + +[[tests.cargo-fix]] +name = "nothing to fix collapses" +input = """ + Checking app v0.1.0 + Finished dev profile in 1.2s +""" +expected = "cargo fix: no changes" diff --git a/assets/filters/cargo-llvm-cov.toml b/assets/filters/cargo-llvm-cov.toml new file mode 100644 index 0000000..1f3a364 --- /dev/null +++ b/assets/filters/cargo-llvm-cov.toml @@ -0,0 +1,41 @@ +[filters.cargo-llvm-cov] +description = "Keep llvm-cov header and TOTAL coverage line, drop per-file rows and separators" +match_command = "^cargo\\s+llvm-cov\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^-+$", + "^\\s*Compiling", + 
"^\\s*Finished", + "^\\s*Running", +] +keep_lines_matching = [ + "^Filename", + "^TOTAL", + "(?i)error", +] +max_lines = 20 +on_empty = "cargo-llvm-cov: no coverage table" + +[[tests.cargo-llvm-cov]] +name = "header and total kept, per-file rows dropped" +input = """ + Compiling app v0.1.0 + Finished test profile in 3.2s +Filename Regions Missed Regions Cover Functions Missed Functions Executed Lines Missed Lines Cover +----------------------------------------------------------------------------------------------------------------------------------- +src/main.rs 12 3 75.00% 4 1 75.00% 40 8 80.00% +src/lib.rs 20 0 100.00% 6 0 100.00% 55 0 100.00% +----------------------------------------------------------------------------------------------------------------------------------- +TOTAL 32 3 90.62% 10 1 90.00% 95 8 91.57% +""" +expected = """Filename Regions Missed Regions Cover Functions Missed Functions Executed Lines Missed Lines Cover +TOTAL 32 3 90.62% 10 1 90.00% 95 8 91.57%""" + +[[tests.cargo-llvm-cov]] +name = "no table collapses" +input = """ + Compiling app v0.1.0 + Finished test profile in 3.2s +""" +expected = "cargo-llvm-cov: no coverage table" diff --git a/assets/filters/cargo-machete.toml b/assets/filters/cargo-machete.toml new file mode 100644 index 0000000..3fdd11c --- /dev/null +++ b/assets/filters/cargo-machete.toml @@ -0,0 +1,41 @@ +[filters.cargo-machete] +description = "Keep cargo-machete unused-dependency findings, collapse clean runs" +match_command = "^cargo\\s+machete\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^If you believe", + "^Use ", + "didn't find any unused", +] +keep_lines_matching = [ + "unused dependencies", + "Cargo.toml:", + "^\\t", + "^ ", + "(?i)error", +] +max_lines = 80 +on_empty = "cargo-machete: no unused dependencies" + +[[tests.cargo-machete]] +name = "unused deps kept" +input = """ +cargo-machete found the following unused dependencies in /repo: +crate-a -- /repo/Cargo.toml: + serde + regex + +If you 
believe these are false positives, ... +""" +expected = """cargo-machete found the following unused dependencies in /repo: +crate-a -- /repo/Cargo.toml: + serde + regex""" + +[[tests.cargo-machete]] +name = "clean run collapses" +input = """ +cargo-machete didn't find any unused dependencies. Good job! +""" +expected = "cargo-machete: no unused dependencies" diff --git a/assets/filters/cargo-outdated.toml b/assets/filters/cargo-outdated.toml new file mode 100644 index 0000000..f413413 --- /dev/null +++ b/assets/filters/cargo-outdated.toml @@ -0,0 +1,36 @@ +[filters.cargo-outdated] +description = "Keep cargo-outdated table rows, collapse all-up-to-date" +match_command = "^cargo\\s+outdated\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^-+$", + "^=+$", + "All dependencies are up to date", +] +keep_lines_matching = [ + "^Name\\s", + "^\\S+\\s+\\d", + "(?i)error", +] +max_lines = 80 +on_empty = "cargo-outdated: all up to date" + +[[tests.cargo-outdated]] +name = "outdated rows kept" +input = """ +Name Project Compat Latest Kind Platform +---- ------- ------ ------ ---- -------- +serde 1.0.150 1.0.197 1.0.197 Normal --- +tokio 1.20.0 1.36.0 1.36.0 Normal --- +""" +expected = """Name Project Compat Latest Kind Platform +serde 1.0.150 1.0.197 1.0.197 Normal --- +tokio 1.20.0 1.36.0 1.36.0 Normal ---""" + +[[tests.cargo-outdated]] +name = "up to date collapses" +input = """ +All dependencies are up to date, yay! 
+""" +expected = "cargo-outdated: all up to date" diff --git a/assets/filters/cargo-publish.toml b/assets/filters/cargo-publish.toml new file mode 100644 index 0000000..48403de --- /dev/null +++ b/assets/filters/cargo-publish.toml @@ -0,0 +1,44 @@ +[filters.cargo-publish] +description = "Keep cargo publish package/upload steps and errors, drop compile noise" +match_command = "^cargo\\s+publish\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*Compiling ", + "^\\s*Updating crates.io index", + "^\\s*Downloading ", + "^\\s*Downloaded ", +] +keep_lines_matching = [ + "Packaging ", + "Verifying ", + "Uploading ", + "Uploaded ", + "Finished ", + "(?i)error", + "(?i)warning", +] +max_lines = 40 +on_empty = "cargo publish: ok" + +[[tests.cargo-publish]] +name = "package and upload steps kept" +input = """ + Updating crates.io index + Packaging mycrate v0.1.0 + Verifying mycrate v0.1.0 + Compiling mycrate v0.1.0 + Uploading mycrate v0.1.0 +""" +expected = """ Packaging mycrate v0.1.0 + Verifying mycrate v0.1.0 + Uploading mycrate v0.1.0""" + +[[tests.cargo-publish]] +name = "error kept" +input = """ + Packaging mycrate v0.1.0 +error: failed to verify package tarball +""" +expected = """ Packaging mycrate v0.1.0 +error: failed to verify package tarball""" diff --git a/assets/filters/cargo-tarpaulin.toml b/assets/filters/cargo-tarpaulin.toml new file mode 100644 index 0000000..5f01874 --- /dev/null +++ b/assets/filters/cargo-tarpaulin.toml @@ -0,0 +1,40 @@ +[filters.cargo-tarpaulin] +description = "Keep tarpaulin coverage percentage and uncovered files, drop per-line trace" +match_command = "^cargo\\s+tarpaulin\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*Compiling ", + "^\\s*Finished ", + "^\\s*Running ", + "^\\|\\| Uncovered Line", +] +keep_lines_matching = [ + "% coverage", + "lines covered", + "^\\|\\| Tested/Total", + "(?i)error", +] +max_lines = 60 +on_empty = "cargo-tarpaulin: coverage collected" + +[[tests.cargo-tarpaulin]] 
+name = "coverage summary kept" +input = """ + Compiling app v0.1.0 +|| Tested/Total Lines: +|| src/main.rs: 10/12 +|| src/lib.rs: 50/58 +|| +85.71% coverage, 60/70 lines covered +""" +expected = """|| Tested/Total Lines: +85.71% coverage, 60/70 lines covered""" + +[[tests.cargo-tarpaulin]] +name = "no coverage collapses" +input = """ + Compiling app v0.1.0 + Finished test profile +""" +expected = "cargo-tarpaulin: coverage collected" diff --git a/assets/filters/cargo-udeps.toml b/assets/filters/cargo-udeps.toml new file mode 100644 index 0000000..8f3b8f2 --- /dev/null +++ b/assets/filters/cargo-udeps.toml @@ -0,0 +1,42 @@ +[filters.cargo-udeps] +description = "Keep cargo-udeps unused-dependency report, collapse clean runs" +match_command = "^cargo\\s+udeps\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*Checking ", + "^\\s*Compiling ", + "^\\s*Finished ", +] +keep_lines_matching = [ + "unused dependencies", + "^\\s*dependencies", + "^\\s*\"", + "(?i)error", +] +max_lines = 80 +on_empty = "cargo-udeps: no unused dependencies" + +[[tests.cargo-udeps]] +name = "unused deps kept" +input = """ + Checking app v0.1.0 +unused dependencies: +`app v0.1.0` + dependencies + "regex" + "lazy_static" +""" +expected = """unused dependencies: + dependencies + "regex" + "lazy_static\"""" + +[[tests.cargo-udeps]] +name = "clean run collapses" +input = """ + Checking app v0.1.0 + Finished dev profile +All deps seem to have been used. 
+""" +expected = "cargo-udeps: no unused dependencies" diff --git a/assets/filters/cdktf.toml b/assets/filters/cdktf.toml new file mode 100644 index 0000000..985c768 --- /dev/null +++ b/assets/filters/cdktf.toml @@ -0,0 +1,41 @@ +[filters.cdktf] +description = "Keep cdktf apply/deploy summary and resource changes, drop progress spinners" +match_command = "^cdktf\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Refreshing state", + "Still creating", + "Still modifying", +] +keep_lines_matching = [ + "Apply complete", + "Plan:", + "will be created", + "will be destroyed", + "will be updated", + "Creation complete", + "Destruction complete", + "(?i)error", + "(?i)fail", +] +max_lines = 80 +on_empty = "cdktf: no changes" + +[[tests.cdktf]] +name = "apply summary kept" +input = """ +Refreshing state... +aws_s3_bucket.data: Still creating... [10s elapsed] +aws_s3_bucket.data: Creation complete after 12s +Apply complete! Resources: 1 added, 0 changed, 0 destroyed. +""" +expected = """aws_s3_bucket.data: Creation complete after 12s +Apply complete! Resources: 1 added, 0 changed, 0 destroyed.""" + +[[tests.cdktf]] +name = "no-op apply collapses" +input = """ +Refreshing state... 
+""" +expected = "cdktf: no changes" diff --git a/assets/filters/cfn-lint.toml b/assets/filters/cfn-lint.toml new file mode 100644 index 0000000..f8f29d5 --- /dev/null +++ b/assets/filters/cfn-lint.toml @@ -0,0 +1,33 @@ +[filters.cfn-lint] +description = "Keep cfn-lint findings (rule code + location), drop blanks" +match_command = "^cfn-lint\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^[EW][0-9]{4}", + "^[a-zA-Z0-9_./-]+:[0-9]+:[0-9]+", + "(?i)error", +] +max_lines = 100 +on_empty = "cfn-lint: no issues" + +[[tests.cfn-lint]] +name = "findings kept" +input = """ +W3010 Don't hardcode us-east-1a in Availability Zones +template.yaml:5:7 + +E3012 Property Resources/Bucket/Properties/VersioningConfiguration should be an object +template.yaml:12:9 +""" +expected = """W3010 Don't hardcode us-east-1a in Availability Zones +template.yaml:5:7 +E3012 Property Resources/Bucket/Properties/VersioningConfiguration should be an object +template.yaml:12:9""" + +[[tests.cfn-lint]] +name = "clean template collapses" +input = "" +expected = "cfn-lint: no issues" diff --git a/assets/filters/changeset.toml b/assets/filters/changeset.toml new file mode 100644 index 0000000..3dc413a --- /dev/null +++ b/assets/filters/changeset.toml @@ -0,0 +1,37 @@ +[filters.changeset] +description = "Keep changeset bump/status lines, drop blanks" +match_command = "^(npx\\s+)?changeset\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "🦋", + "(?i)bump", + "(?i)release", + "(?i)major|minor|patch", + "(?i)error", +] +max_lines = 60 +on_empty = "changeset: no changes" + +[[tests.changeset]] +name = "bump plan kept" +input = """ +🦋 info Bumping the following packages + +🦋 major +🦋 - my-pkg +🦋 patch +🦋 - my-utils +""" +expected = """🦋 info Bumping the following packages +🦋 major +🦋 - my-pkg +🦋 patch +🦋 - my-utils""" + +[[tests.changeset]] +name = "no changesets collapses" +input = "" +expected = "changeset: no changes" 
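Reviewer note: each filter in this diff is selected by its `match_command` regex, but the diff does not show how tokenix performs that selection. The Python sketch below is one plausible reading, not the project's implementation. It assumes the pattern is searched against the raw command string and that the first matching filter wins. `MATCH_COMMANDS` and `select_filter` are hypothetical names; the three patterns are copied from the cdktf, cfn-lint and changeset filters above, and they use only constructs that Python's `re` reads in the usual way (`^`, `\s`, `\b`, an optional group).

```python
import re

# match_command patterns copied from the three filter files above.
MATCH_COMMANDS = {
    "cdktf": r"^cdktf\b",
    "cfn-lint": r"^cfn-lint\b",
    "changeset": r"^(npx\s+)?changeset\b",
}


def select_filter(command):
    """Return the name of the first filter whose pattern matches, else None.

    Assumption: first match wins and the command is matched as typed,
    with no trimming or environment-prefix handling.
    """
    for name, pattern in MATCH_COMMANDS.items():
        if re.search(pattern, command):
            return name
    return None


print(select_filter("npx changeset status"))  # prints "changeset"
print(select_filter("yarn changeset"))        # prints "None": the pattern is anchored at ^
```

One property worth keeping in mind when reviewing these patterns: `\b` treats a hyphen as a word boundary, so under this reading `^cdktf\b` would also match a hypothetical command named `cdktf-helper`.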
diff --git a/assets/filters/clj-kondo.toml b/assets/filters/clj-kondo.toml new file mode 100644 index 0000000..80689e2 --- /dev/null +++ b/assets/filters/clj-kondo.toml @@ -0,0 +1,34 @@ +[filters.clj-kondo] +description = "Keep clj-kondo warnings/errors and the linting summary line" +match_command = "^clj-kondo\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + ": warning:", + ": error:", + "^linting took", + "(?i)\\berror\\b", +] +max_lines = 120 +on_empty = "clj-kondo: clean" + +[[tests.clj-kondo]] +name = "findings and summary kept, blanks dropped" +input = """ +/tmp/example.cljc:1:14: warning: Expected: number, received: string. + +/tmp/example.cljc:1:26: warning: Expected: number, received: keyword. + +linting took 16ms, errors: 0, warnings: 2 +""" +expected = """/tmp/example.cljc:1:14: warning: Expected: number, received: string. +/tmp/example.cljc:1:26: warning: Expected: number, received: keyword. +linting took 16ms, errors: 0, warnings: 2""" + +[[tests.clj-kondo]] +name = "no findings collapses" +input = """ +""" +expected = "clj-kondo: clean" diff --git a/assets/filters/codespell.toml b/assets/filters/codespell.toml new file mode 100644 index 0000000..2900272 --- /dev/null +++ b/assets/filters/codespell.toml @@ -0,0 +1,27 @@ +[filters.codespell] +description = "Keep codespell misspelling findings (file:line + correction), drop noise" +match_command = "^codespell\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "==>", + "(?i)error", +] +max_lines = 120 +on_empty = "codespell: no misspellings found" + +[[tests.codespell]] +name = "findings kept" +input = """ +./README.md:12: comparsion ==> comparison +./src/app.py:45: recieve ==> receive +""" +expected = """./README.md:12: comparsion ==> comparison +./src/app.py:45: recieve ==> receive""" + +[[tests.codespell]] +name = "clean run collapses" +input = "" +expected = "codespell: no misspellings found" diff --git 
a/assets/filters/consul.toml b/assets/filters/consul.toml new file mode 100644 index 0000000..7585eee --- /dev/null +++ b/assets/filters/consul.toml @@ -0,0 +1,35 @@ +[filters.consul] +description = "Keep consul members/status rows and errors, drop blanks" +match_command = "^consul\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^Node\\s", + "alive", + "failed", + "left", + "Synced", + "(?i)error", +] +max_lines = 80 +on_empty = "consul: ok" + +[[tests.consul]] +name = "members kept" +input = """ +Node Address Status Type Build Protocol +node1 10.0.0.1:8301 alive server 1.17.0 2 +node2 10.0.0.2:8301 alive client 1.17.0 2 +node3 10.0.0.3:8301 failed client 1.17.0 2 +""" +expected = """Node Address Status Type Build Protocol +node1 10.0.0.1:8301 alive server 1.17.0 2 +node2 10.0.0.2:8301 alive client 1.17.0 2 +node3 10.0.0.3:8301 failed client 1.17.0 2""" + +[[tests.consul]] +name = "no output collapses" +input = "" +expected = "consul: ok" diff --git a/assets/filters/crane.toml b/assets/filters/crane.toml new file mode 100644 index 0000000..6c66025 --- /dev/null +++ b/assets/filters/crane.toml @@ -0,0 +1,27 @@ +[filters.crane] +description = "Keep crane digests/tags/manifests output, drop blanks" +match_command = "^crane\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +max_lines = 60 +on_empty = "crane: ok" + +[[tests.crane]] +name = "tags listed" +input = """ +latest +v1.2.3 +v1.2.4 +""" +expected = """latest +v1.2.3 +v1.2.4""" + +[[tests.crane]] +name = "digest kept" +input = """ +sha256:abc123def4567890abc123def4567890abc123def4567890abc123def4567890 +""" +expected = "sha256:abc123def4567890abc123def4567890abc123def4567890abc123def4567890" diff --git a/assets/filters/dagger.toml b/assets/filters/dagger.toml new file mode 100644 index 0000000..d7048fc --- /dev/null +++ b/assets/filters/dagger.toml @@ -0,0 +1,38 @@ +[filters.dagger] +description = "Keep dagger pipeline results and errors, drop per-step 
spinners" +match_command = "^dagger\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*●", + "^\\s*▶", + "CACHED", +] +keep_lines_matching = [ + "✔", + "✗", + "✘", + "completed", + "(?i)error", + "(?i)fail", +] +max_lines = 80 +on_empty = "dagger: pipeline complete" + +[[tests.dagger]] +name = "results kept, steps dropped" +input = """ +● connect +▶ exec go build +✔ build pipeline (2.3s) +✗ test pipeline (1.1s) +""" +expected = """✔ build pipeline (2.3s) +✗ test pipeline (1.1s)""" + +[[tests.dagger]] +name = "all pass kept" +input = """ +✔ build pipeline (2.3s) +""" +expected = "✔ build pipeline (2.3s)" diff --git a/assets/filters/dart-analyze.toml b/assets/filters/dart-analyze.toml new file mode 100644 index 0000000..fcc9a4c --- /dev/null +++ b/assets/filters/dart-analyze.toml @@ -0,0 +1,38 @@ +[filters.dart-analyze] +description = "Keep dart analyze findings and issue count, collapse clean runs" +match_command = "^dart\\s+analyze\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Analyzing ", + "No issues found", +] +keep_lines_matching = [ + "•", + "issues? found", + "(?i)error", +] +max_lines = 100 +on_empty = "dart analyze: no issues" + +[[tests.dart-analyze]] +name = "findings and count kept" +input = """ +Analyzing project... + + info • Unused import • lib/main.dart:3:8 • unused_import + warning • Dead code • lib/util.dart:12:1 • dead_code + +2 issues found. +""" +expected = """ info • Unused import • lib/main.dart:3:8 • unused_import + warning • Dead code • lib/util.dart:12:1 • dead_code +2 issues found.""" + +[[tests.dart-analyze]] +name = "clean run collapses" +input = """ +Analyzing project... +No issues found! 
+""" +expected = "dart analyze: no issues" diff --git a/assets/filters/dart-test.toml b/assets/filters/dart-test.toml new file mode 100644 index 0000000..1773529 --- /dev/null +++ b/assets/filters/dart-test.toml @@ -0,0 +1,33 @@ +[filters.dart-test] +description = "Keep dart test final result and failures, drop passing tick lines" +match_command = "^dart\\s+test\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\d\\d:\\d\\d \\+\\d+: .*[^!]$", +] +keep_lines_matching = [ + "All tests passed", + "Some tests failed", + "-\\d+:", + "\\[E\\]", + "(?i)error", +] +max_lines = 80 +on_empty = "dart test: passed" + +[[tests.dart-test]] +name = "failure result kept" +input = """ +00:01 +1: test one +00:02 +9 -1: Some tests failed. +""" +expected = "00:02 +9 -1: Some tests failed." + +[[tests.dart-test]] +name = "all pass kept" +input = """ +00:01 +1: test one +00:02 +10: All tests passed! +""" +expected = "00:02 +10: All tests passed!" diff --git a/assets/filters/dbmate.toml b/assets/filters/dbmate.toml new file mode 100644 index 0000000..922fa22 --- /dev/null +++ b/assets/filters/dbmate.toml @@ -0,0 +1,34 @@ +[filters.dbmate] +description = "Keep dbmate applied/rolled-back migrations and errors" +match_command = "^dbmate\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^Applying:", + "^Applied:", + "^Rolling back:", + "^Rolled back:", + "(?i)error", +] +max_lines = 80 +on_empty = "dbmate: no pending migrations" + +[[tests.dbmate]] +name = "applied migrations kept" +input = """ +Applying: 20240101120000_create_users.sql +Applied: 20240101120000_create_users.sql +Applying: 20240102120000_add_index.sql +Applied: 20240102120000_add_index.sql +""" +expected = """Applying: 20240101120000_create_users.sql +Applied: 20240101120000_create_users.sql +Applying: 20240102120000_add_index.sql +Applied: 20240102120000_add_index.sql""" + +[[tests.dbmate]] +name = "no pending collapses" +input = "" +expected = "dbmate: no pending 
migrations" diff --git a/assets/filters/dependency-check.toml b/assets/filters/dependency-check.toml new file mode 100644 index 0000000..2d5aebc --- /dev/null +++ b/assets/filters/dependency-check.toml @@ -0,0 +1,40 @@ +[filters.dependency-check] +description = "Keep OWASP dependency-check vulnerable dependencies and CVEs" +match_command = "^dependency-check(\\.sh|\\.bat)?\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Checking for updates", + "Download Started", + "Download Complete", + "Analysis Started", +] +keep_lines_matching = [ + "known vulnerabilit", + "CVE-", + "\\): (CVE|GHSA)", + "(?i)error", +] +max_lines = 100 +on_empty = "dependency-check: no known vulnerabilities" +token_budget = 2000 + +[[tests.dependency-check]] +name = "vulnerable deps kept" +input = """ +Checking for updates +Analysis Started +One or more dependencies were identified with known vulnerabilities in the project: + +log4j-core-2.14.1.jar (pkg:maven/org.apache.logging.log4j) : CVE-2021-44228, CVE-2021-45046 +""" +expected = """One or more dependencies were identified with known vulnerabilities in the project: +log4j-core-2.14.1.jar (pkg:maven/org.apache.logging.log4j) : CVE-2021-44228, CVE-2021-45046""" + +[[tests.dependency-check]] +name = "clean scan collapses" +input = """ +Checking for updates +Analysis Started +""" +expected = "dependency-check: no known vulnerabilities" diff --git a/assets/filters/dive.toml b/assets/filters/dive.toml new file mode 100644 index 0000000..0714f8c --- /dev/null +++ b/assets/filters/dive.toml @@ -0,0 +1,41 @@ +[filters.dive] +description = "Keep dive image efficiency results, drop layer-by-layer detail" +match_command = "^dive\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^\\s*efficiency:", + "WastedBytes|wastedBytes|WastedPercent", + "^\\s*Result:", + "image source bytes", + "PASS", + "FAIL", + "(?i)error", +] +max_lines = 40 +on_empty = "dive: analysis complete" + +[[tests.dive]] +name 
= "efficiency summary kept" +input = """ + efficiency: 98.7531 % + wastedBytes: 1421850 bytes (1.4 MB) + userWastedPercent: 12.3401 % +Inefficient Files: +Count Wasted Space File Path + 2 1.4 MB /var/cache/apt/archives +Result:PASS [Total:3] [Passed:3] [Failed:0] +""" +expected = """ efficiency: 98.7531 % + wastedBytes: 1421850 bytes (1.4 MB) + userWastedPercent: 12.3401 % +Result:PASS [Total:3] [Passed:3] [Failed:0]""" + +[[tests.dive]] +name = "ci fail kept" +input = """ +Result:FAIL [Total:3] [Passed:2] [Failed:1] +""" +expected = "Result:FAIL [Total:3] [Passed:2] [Failed:1]" diff --git a/assets/filters/docker-scout.toml b/assets/filters/docker-scout.toml new file mode 100644 index 0000000..3b0d236 --- /dev/null +++ b/assets/filters/docker-scout.toml @@ -0,0 +1,51 @@ +[filters.docker-scout] +description = "Keep docker scout vulnerability counts and findings, drop indexing checkmarks" +match_command = "^docker\\s+scout\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "✓ Image stored", + "✓ Indexed ", + "✓ Pulled", +] +keep_lines_matching = [ + "✗ ", + "vulnerabilit", + "CRITICAL", + "HIGH", + "MEDIUM", + "LOW", + "CVE-", + "(?i)error", +] +max_lines = 80 +on_empty = "docker scout: no vulnerabilities" +token_budget = 2000 + +[[tests.docker-scout]] +name = "vulnerability summary kept" +input = """ + ✓ Image stored for indexing + ✓ Indexed 234 packages + ✗ Detected 5 vulnerable packages with a total of 12 vulnerabilities + +5 vulnerabilities found in 5 packages + CRITICAL 0 + HIGH 2 + MEDIUM 3 + LOW 7 +""" +expected = """ ✗ Detected 5 vulnerable packages with a total of 12 vulnerabilities +5 vulnerabilities found in 5 packages + CRITICAL 0 + HIGH 2 + MEDIUM 3 + LOW 7""" + +[[tests.docker-scout]] +name = "clean image collapses" +input = """ + ✓ Image stored for indexing + ✓ Indexed 234 packages +""" +expected = "docker scout: no vulnerabilities" diff --git a/assets/filters/doctl.toml b/assets/filters/doctl.toml new file mode 100644 index 
0000000..e0105c8 --- /dev/null +++ b/assets/filters/doctl.toml @@ -0,0 +1,33 @@ +[filters.doctl] +description = "Keep doctl resource table rows and errors, drop blanks" +match_command = "^doctl\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^ID\\s", + "^Name\\s", + "^\\w", + "(?i)error", +] +max_lines = 80 +on_empty = "doctl: no resources" + +[[tests.doctl]] +name = "droplet table kept" +input = """ +ID Name Public IPv4 Status +123456789 web-1 203.0.113.10 active +987654321 db-1 203.0.113.20 active +""" +expected = """ID Name Public IPv4 Status +123456789 web-1 203.0.113.10 active +987654321 db-1 203.0.113.20 active""" + +[[tests.doctl]] +name = "auth error kept" +input = """ +Error: Unable to authenticate you +""" +expected = "Error: Unable to authenticate you" diff --git a/assets/filters/dprint.toml b/assets/filters/dprint.toml new file mode 100644 index 0000000..c2c58b7 --- /dev/null +++ b/assets/filters/dprint.toml @@ -0,0 +1,31 @@ +[filters.dprint] +description = "Keep dprint unformatted-file list and errors, collapse clean runs" +match_command = "^dprint\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "not formatted", + "^Error", + "from ", + "\\.(ts|js|tsx|jsx|json|md):", + "(?i)error", +] +max_lines = 80 +on_empty = "dprint: all formatted" + +[[tests.dprint]] +name = "unformatted files kept" +input = """ +src/app.ts +src/utils.ts + +Found 2 not formatted files. 
+""" +expected = """Found 2 not formatted files.""" + +[[tests.dprint]] +name = "clean run collapses" +input = "" +expected = "dprint: all formatted" diff --git a/assets/filters/drizzle-kit.toml b/assets/filters/drizzle-kit.toml new file mode 100644 index 0000000..150d0d2 --- /dev/null +++ b/assets/filters/drizzle-kit.toml @@ -0,0 +1,37 @@ +[filters.drizzle-kit] +description = "Keep drizzle-kit migration/generate results and errors" +match_command = "^(npx\\s+)?drizzle-kit\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Reading config", + "^Reading schema", +] +keep_lines_matching = [ + "migration file", + "changes? applied", + "No schema changes", + "✓", + "tables?", + "(?i)error", +] +max_lines = 60 +on_empty = "drizzle-kit: no changes" + +[[tests.drizzle-kit]] +name = "generated migration kept" +input = """ +Reading config file '/app/drizzle.config.ts' +Reading schema files... + +[✓] Your SQL migration file ➜ drizzle/0000_cool_name.sql +""" +expected = "[✓] Your SQL migration file ➜ drizzle/0000_cool_name.sql" + +[[tests.drizzle-kit]] +name = "no changes collapses" +input = """ +Reading config file '/app/drizzle.config.ts' +Reading schema files... 
+""" +expected = "drizzle-kit: no changes" diff --git a/assets/filters/dvc.toml b/assets/filters/dvc.toml new file mode 100644 index 0000000..c3c9c8f --- /dev/null +++ b/assets/filters/dvc.toml @@ -0,0 +1,42 @@ +[filters.dvc] +description = "Keep dvc stage runs and errors, drop skip/git-hint noise" +match_command = "^dvc\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "didn't change, skipping", + "To track the changes with git", + "^\\s*git add ", + "Use `dvc push`", +] +keep_lines_matching = [ + "Running stage", + "Updating lock file", + "added", + "modified", + "(?i)error", + "(?i)fail", +] +max_lines = 80 +on_empty = "dvc: up to date" + +[[tests.dvc]] +name = "stage runs kept, skips dropped" +input = """ +Stage 'prepare' didn't change, skipping +Running stage 'train': +Updating lock file 'dvc.lock' + +To track the changes with git, run: + git add dvc.lock +""" +expected = """Running stage 'train': +Updating lock file 'dvc.lock'""" + +[[tests.dvc]] +name = "all up to date collapses" +input = """ +Stage 'prepare' didn't change, skipping +Stage 'train' didn't change, skipping +""" +expected = "dvc: up to date" diff --git a/assets/filters/earthly.toml b/assets/filters/earthly.toml new file mode 100644 index 0000000..e8df7de --- /dev/null +++ b/assets/filters/earthly.toml @@ -0,0 +1,41 @@ +[filters.earthly] +description = "Keep earthly result banner and output artifacts, drop per-step logs" +match_command = "^earthly\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "\\|\\s+--> ", + "\\|\\s+RUN ", + "\\|\\s+COPY ", +] +keep_lines_matching = [ + "=== SUCCESS ===", + "=== FAILURE ===", + "Image .* output as", + "Artifact .* output as", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "earthly: done" + +[[tests.earthly]] +name = "result and artifacts kept" +input = """ ++build | --> RUN go build ++build | RUN go build ./... 
++build | Image +build output as docker.io/app:latest +=========================== SUCCESS =========================== +""" +expected = """+build | Image +build output as docker.io/app:latest +=========================== SUCCESS ===========================""" + +[[tests.earthly]] +name = "failure kept" +input = """ ++build | --> RUN go build +=========================== FAILURE =========================== +Error: build target +build failed +""" +expected = """=========================== FAILURE =========================== +Error: build target +build failed""" diff --git a/assets/filters/expo.toml b/assets/filters/expo.toml new file mode 100644 index 0000000..f0d83aa --- /dev/null +++ b/assets/filters/expo.toml @@ -0,0 +1,40 @@ +[filters.expo] +description = "Keep expo/eas build results and errors, drop bundling progress" +match_command = "^(npx\\s+)?(expo|eas)\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Bundling ", + "^\\s*Starting ", + "^\\s*Waiting ", +] +keep_lines_matching = [ + "✔", + "✓", + "✖", + "Build finished", + "Exported", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "expo: done" + +[[tests.expo]] +name = "build result kept" +input = """ +Starting Metro Bundler +Bundling index.js 100% +✔ Exported bundle to dist/ +✓ Build finished +""" +expected = """✔ Exported bundle to dist/ +✓ Build finished""" + +[[tests.expo]] +name = "build error kept" +input = """ +Bundling index.js 100% +✖ Build failed: Gradle build failed +""" +expected = "✖ Build failed: Gradle build failed" diff --git a/assets/filters/fastlane.toml b/assets/filters/fastlane.toml new file mode 100644 index 0000000..99b4065 --- /dev/null +++ b/assets/filters/fastlane.toml @@ -0,0 +1,39 @@ +[filters.fastlane] +description = "Keep fastlane result and summary table, drop per-step timestamps" +match_command = "^fastlane\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\[\\d\\d:\\d\\d:\\d\\d\\]: \\$", + "^\\[\\d\\d:\\d\\d:\\d\\d\\]: Driving", +] 
+keep_lines_matching = [ + "finished successfully", + "fastlane finished", + "Successfully", + "error", + "failed", + "\\| fastlane", + "(?i)exception", +] +max_lines = 60 +on_empty = "fastlane: done" + +[[tests.fastlane]] +name = "success result kept" +input = """ +[12:00:00]: Driving the lane 'ios release' +[12:00:01]: $ xcodebuild ... +[12:00:30]: Successfully uploaded the new binary +fastlane.tools finished successfully +""" +expected = """[12:00:30]: Successfully uploaded the new binary +fastlane.tools finished successfully""" + +[[tests.fastlane]] +name = "failure kept" +input = """ +[12:00:00]: Driving the lane 'ios release' +[12:00:10]: Build failed with error code 65 +""" +expected = "[12:00:10]: Build failed with error code 65" diff --git a/assets/filters/flutter-build.toml b/assets/filters/flutter-build.toml new file mode 100644 index 0000000..5ec9317 --- /dev/null +++ b/assets/filters/flutter-build.toml @@ -0,0 +1,36 @@ +[filters.flutter-build] +description = "Keep flutter build result artifact and errors, drop Gradle/Xcode progress" +match_command = "^flutter\\s+build\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Running Gradle task", + "^\\s*Running ", + "^\\s*Resolving dependencies", +] +keep_lines_matching = [ + "✓ Built", + "Built build/", + "(?i)error", + "(?i)fail", + "(?i)exception", +] +max_lines = 60 +on_empty = "flutter build: done" + +[[tests.flutter-build]] +name = "built artifact kept" +input = """ +Running Gradle task 'assembleRelease'... +Resolving dependencies... +✓ Built build/app/outputs/flutter-apk/app-release.apk (12.3MB) +""" +expected = "✓ Built build/app/outputs/flutter-apk/app-release.apk (12.3MB)" + +[[tests.flutter-build]] +name = "build failure kept" +input = """ +Running Gradle task 'assembleRelease'... +FAILURE: Build failed with an exception. +""" +expected = "FAILURE: Build failed with an exception." 
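Reviewer note: the filter files do not state how `strip_lines_matching`, `keep_lines_matching`, `max_lines` and `on_empty` combine, so the Python sketch below shows one reading that agrees with the inline tests. It is an assumption, not tokenix's actual code. In this reading a line matching any strip pattern is dropped first; if a keep list is present, only lines matching a keep pattern survive; the result is capped at `max_lines` (head truncation is assumed); and an empty result is replaced by `on_empty`. Strip taking precedence over keep is inferred from the dart-analyze filter earlier in this diff, where `No issues found` is stripped even though the keep pattern `issues? found` would match it. `strip_ansi` and `token_budget` are not modeled, and `apply_filter` and `FLUTTER_BUILD` are hypothetical names; the pattern values are copied from the flutter-build filter above.

```python
import re

# Values copied from the flutter-build filter above.
FLUTTER_BUILD = {
    "strip_lines_matching": [
        r"^\s*$",
        r"Running Gradle task",
        r"^\s*Running ",
        r"^\s*Resolving dependencies",
    ],
    "keep_lines_matching": [
        r"✓ Built",
        r"Built build/",
        r"(?i)error",
        r"(?i)fail",
        r"(?i)exception",
    ],
    "max_lines": 60,
    "on_empty": "flutter build: done",
}


def apply_filter(spec, output):
    """Assumed semantics: strip, then keep, then cap, then on_empty."""
    strip = [re.compile(p) for p in spec.get("strip_lines_matching", [])]
    keep = [re.compile(p) for p in spec.get("keep_lines_matching", [])]

    kept = []
    for line in output.splitlines():
        if any(p.search(line) for p in strip):
            continue  # a strip match wins even if a keep pattern also matches
        if keep and not any(p.search(line) for p in keep):
            continue  # with a keep list present, unmatched lines are dropped
        kept.append(line)

    kept = kept[: spec.get("max_lines", len(kept))]  # assumed: keep the head
    return "\n".join(kept) if kept else spec["on_empty"]


sample = "Running Gradle task 'assembleRelease'...\nFAILURE: Build failed with an exception.\n"
print(apply_filter(FLUTTER_BUILD, sample))  # prints only the FAILURE line
```

Run against the two `[[tests.flutter-build]]` inputs above, this reading reproduces both `expected` strings, which is the only evidence offered here that the ordering is right.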
diff --git a/assets/filters/flux.toml b/assets/filters/flux.toml new file mode 100644 index 0000000..4bba8b3 --- /dev/null +++ b/assets/filters/flux.toml @@ -0,0 +1,39 @@ +[filters.flux] +description = "Keep flux get/reconcile status rows, drop blanks" +match_command = "^flux\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^► ", + "^✔ no", +] +keep_lines_matching = [ + "^NAME\\b", + "True", + "False", + "Applied revision", + "reconciliation", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "flux: nothing to report" + +[[tests.flux]] +name = "status rows kept" +input = """ +NAME READY MESSAGE REVISION SUSPENDED +app True Applied revision: main@sha1:ab main@sha1:ab False +infra False building artifact main@sha1:cd False +""" +expected = """NAME READY MESSAGE REVISION SUSPENDED +app True Applied revision: main@sha1:ab main@sha1:ab False +infra False building artifact main@sha1:cd False""" + +[[tests.flux]] +name = "reconcile progress collapses" +input = """ +► annotating GitRepository app in flux-system namespace +✔ no changes detected +""" +expected = "flux: nothing to report" diff --git a/assets/filters/gatsby.toml b/assets/filters/gatsby.toml new file mode 100644 index 0000000..873839f --- /dev/null +++ b/assets/filters/gatsby.toml @@ -0,0 +1,39 @@ +[filters.gatsby] +description = "Keep gatsby build success/warning/result lines, drop info progress" +match_command = "^(npx\\s+)?gatsby\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^info ", + "^verbose ", + "^\\s*⠋", +] +keep_lines_matching = [ + "^success ", + "^warning ", + "Done building", + "(?i)error", + "(?i)failed", +] +max_lines = 80 +on_empty = "gatsby: build done" + +[[tests.gatsby]] +name = "success and result kept, info dropped" +input = """ +info Building production JavaScript and CSS bundles +success Building production JavaScript and CSS bundles - 12.345s +warning Browserslist: caniuse-lite is outdated +Done building in 45.678 sec +""" +expected = """success 
Building production JavaScript and CSS bundles - 12.345s +warning Browserslist: caniuse-lite is outdated +Done building in 45.678 sec""" + +[[tests.gatsby]] +name = "build failure kept" +input = """ +info Building production JavaScript and CSS bundles +failed Building production JavaScript and CSS bundles - 5.000s +""" +expected = "failed Building production JavaScript and CSS bundles - 5.000s" diff --git a/assets/filters/gci.toml b/assets/filters/gci.toml new file mode 100644 index 0000000..9ac0ac7 --- /dev/null +++ b/assets/filters/gci.toml @@ -0,0 +1,31 @@ +[filters.gci] +description = "Keep gci import-order diffs/changed files, collapse clean runs" +match_command = "^gci\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "\\.go$", + "^---", + "^\\+\\+\\+", + "^@@", + "would be skipped|wrong", + "(?i)error", +] +max_lines = 100 +on_empty = "gci: imports ordered" + +[[tests.gci]] +name = "changed files kept" +input = """ +src/app.go +src/handler.go +""" +expected = """src/app.go +src/handler.go""" + +[[tests.gci]] +name = "clean run collapses" +input = "" +expected = "gci: imports ordered" diff --git a/assets/filters/ggshield.toml b/assets/filters/ggshield.toml new file mode 100644 index 0000000..6c44a95 --- /dev/null +++ b/assets/filters/ggshield.toml @@ -0,0 +1,39 @@ +[filters.ggshield] +description = "Keep ggshield secret-incident summary and locations, collapse clean scans" +match_command = "^ggshield\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Scanning ", + "No secrets have been found", +] +keep_lines_matching = [ + "(?i)secret", + "(?i)incident", + "^>\\s", + "(?i)error", +] +max_lines = 80 +on_empty = "ggshield: no secrets found" +token_budget = 2000 + +[[tests.ggshield]] +name = "incidents kept" +input = """ +Scanning commits... 
+ +secrets-found: 2 +> src/config.py: 1 incident +> .env: 1 incident +""" +expected = """secrets-found: 2 +> src/config.py: 1 incident +> .env: 1 incident""" + +[[tests.ggshield]] +name = "clean scan collapses" +input = """ +Scanning commits... +No secrets have been found +""" +expected = "ggshield: no secrets found" diff --git a/assets/filters/ginkgo.toml b/assets/filters/ginkgo.toml new file mode 100644 index 0000000..17806b2 --- /dev/null +++ b/assets/filters/ginkgo.toml @@ -0,0 +1,38 @@ +[filters.ginkgo] +description = "Keep ginkgo run totals and failures, drop dots and per-spec noise" +match_command = "^ginkgo\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^[•.]+$", + "^Will run ", +] +keep_lines_matching = [ + "^Ran \\d+ of \\d+ Specs", + "SUCCESS!", + "FAIL!", + "Passed \\|", + "\\[FAILED\\]", + "(?i)error", +] +max_lines = 80 +on_empty = "ginkgo: specs passed" + +[[tests.ginkgo]] +name = "totals kept" +input = """ +•••••••••• +Ran 10 of 10 Specs in 2.345 seconds +SUCCESS! -- 10 Passed | 0 Failed | 0 Pending | 0 Skipped +""" +expected = """Ran 10 of 10 Specs in 2.345 seconds +SUCCESS! -- 10 Passed | 0 Failed | 0 Pending | 0 Skipped""" + +[[tests.ginkgo]] +name = "failure kept" +input = """ +Ran 10 of 10 Specs in 2.345 seconds +FAIL! -- 9 Passed | 1 Failed | 0 Pending | 0 Skipped +""" +expected = """Ran 10 of 10 Specs in 2.345 seconds +FAIL! 
-- 9 Passed | 1 Failed | 0 Pending | 0 Skipped""" diff --git a/assets/filters/git-secrets.toml b/assets/filters/git-secrets.toml new file mode 100644 index 0000000..f757bf0 --- /dev/null +++ b/assets/filters/git-secrets.toml @@ -0,0 +1,29 @@ +[filters.git-secrets] +description = "Keep git-secrets matched-secret lines and errors, collapse clean scans" +match_command = "^git[ -]secrets\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "prohibited", + "\\[ERROR\\]", + ":\\d+:", + "(?i)error", +] +max_lines = 80 +on_empty = "git-secrets: no secrets found" + +[[tests.git-secrets]] +name = "matched secret kept" +input = """ +config/prod.env:5:AWS_SECRET_ACCESS_KEY=AKIAIOSFODNN7EXAMPLE +[ERROR] Matched one or more prohibited patterns +""" +expected = """config/prod.env:5:AWS_SECRET_ACCESS_KEY=AKIAIOSFODNN7EXAMPLE +[ERROR] Matched one or more prohibited patterns""" + +[[tests.git-secrets]] +name = "clean scan collapses" +input = "" +expected = "git-secrets: no secrets found" diff --git a/assets/filters/go-install.toml b/assets/filters/go-install.toml new file mode 100644 index 0000000..f3a337c --- /dev/null +++ b/assets/filters/go-install.toml @@ -0,0 +1,36 @@ +[filters.go-install] +description = "Surface go install errors, collapse silent success" +match_command = "^go\\s+install\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^go: downloading ", + "^go: extracting ", + "^go: found ", +] +keep_lines_matching = [ + "\\.go:\\d+:\\d+:", + "cannot find package", + "no required module", + "(?i)error", + "(?i)fatal", +] +max_lines = 60 +on_empty = "go install: ok" + +[[tests.go-install]] +name = "download noise collapses on success" +input = """ +go: downloading github.com/spf13/cobra v1.8.0 +go: downloading github.com/spf13/pflag v1.0.5 +""" +expected = "go install: ok" + +[[tests.go-install]] +name = "build error kept" +input = """ +go: downloading github.com/x/y v1.0.0 +# github.com/x/y +../y/main.go:5:2: undefined: 
Bar +""" +expected = "../y/main.go:5:2: undefined: Bar" diff --git a/assets/filters/go-run.toml b/assets/filters/go-run.toml new file mode 100644 index 0000000..09de4c3 --- /dev/null +++ b/assets/filters/go-run.toml @@ -0,0 +1,36 @@ +[filters.go-run] +description = "Bound go run program output, surface compile errors and panics" +match_command = "^go\\s+run\\b" +strip_ansi = true +passthrough_when_emptied = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "\\.go:\\d+:\\d+:", + "^panic:", + "^goroutine ", + "cannot find package", + "undefined:", + "(?i)error", + "(?i)fatal", +] +max_lines = 80 +token_budget = 2000 + +[[tests.go-run]] +name = "compile error surfaced" +input = """ +# command-line-arguments +./main.go:10:6: undefined: foo +""" +expected = """./main.go:10:6: undefined: foo""" + +[[tests.go-run]] +name = "clean program output passes through bounded" +input = """ +hello world +done +""" +expected = """hello world +done""" diff --git a/assets/filters/gocritic.toml b/assets/filters/gocritic.toml new file mode 100644 index 0000000..7eb0e5f --- /dev/null +++ b/assets/filters/gocritic.toml @@ -0,0 +1,27 @@ +[filters.gocritic] +description = "Keep gocritic findings, collapse clean runs" +match_command = "^gocritic\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "\\.go:\\d+:\\d+:", + "(?i)error", +] +max_lines = 100 +on_empty = "gocritic: no issues" + +[[tests.gocritic]] +name = "findings kept" +input = """ +main.go:10:5: ifElseChain: rewrite if-else to switch statement +util.go:22:1: commentFormatting: put a space between // and comment text +""" +expected = """main.go:10:5: ifElseChain: rewrite if-else to switch statement +util.go:22:1: commentFormatting: put a space between // and comment text""" + +[[tests.gocritic]] +name = "clean run collapses" +input = "" +expected = "gocritic: no issues" diff --git a/assets/filters/goimports.toml b/assets/filters/goimports.toml new file mode 100644 index 
0000000..3ebdf7c --- /dev/null +++ b/assets/filters/goimports.toml @@ -0,0 +1,28 @@ +[filters.goimports] +description = "Keep goimports -l file list and errors, collapse clean runs" +match_command = "^goimports\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "\\.go$", + "\\.go:\\d+:", + "(?i)error", +] +max_lines = 100 +on_empty = "goimports: all formatted" + +[[tests.goimports]] +name = "unformatted files listed" +input = """ +src/app.go +src/handler.go +""" +expected = """src/app.go +src/handler.go""" + +[[tests.goimports]] +name = "clean run collapses" +input = "" +expected = "goimports: all formatted" diff --git a/assets/filters/goreleaser.toml b/assets/filters/goreleaser.toml new file mode 100644 index 0000000..ed70805 --- /dev/null +++ b/assets/filters/goreleaser.toml @@ -0,0 +1,41 @@ +[filters.goreleaser] +description = "Keep goreleaser result and errors, drop routine build bullets" +match_command = "^goreleaser\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "•\\s+loading", + "•\\s+getting", + "•\\s+setting", + "•\\s+writing", + "•\\s+cleaning", +] +keep_lines_matching = [ + "release succeeded", + "release failed", + "build succeeded", + "published", + "took", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "goreleaser: done" + +[[tests.goreleaser]] +name = "result kept, routine bullets dropped" +input = """ + • loading config file file=.goreleaser.yaml + • cleaning dist + • building binaries + • release succeeded after 32.10s +""" +expected = """ • release succeeded after 32.10s""" + +[[tests.goreleaser]] +name = "failure kept" +input = """ + • loading config file file=.goreleaser.yaml + ⨯ release failed after 5.00s error=git is in a dirty state +""" +expected = " ⨯ release failed after 5.00s error=git is in a dirty state" diff --git a/assets/filters/gotestsum.toml b/assets/filters/gotestsum.toml new file mode 100644 index 0000000..4807371 --- /dev/null +++ 
b/assets/filters/gotestsum.toml @@ -0,0 +1,39 @@ +[filters.gotestsum] +description = "Keep gotestsum failures and DONE summary, drop passing packages" +match_command = "^gotestsum\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^✓", + "^∅", +] +keep_lines_matching = [ + "^✖", + "^DONE", + "FAIL", + "failure", + "(?i)error", + "--- FAIL", +] +max_lines = 80 +on_empty = "gotestsum: all tests passed" + +[[tests.gotestsum]] +name = "failures and summary kept" +input = """ +✓ pkg/foo (cached) +✓ pkg/baz (0.2s) +✖ pkg/bar (0.5s) + +DONE 42 tests, 1 failure in 2.345s +""" +expected = """✖ pkg/bar (0.5s) +DONE 42 tests, 1 failure in 2.345s""" + +[[tests.gotestsum]] +name = "all pass collapses" +input = """ +✓ pkg/foo (cached) +✓ pkg/baz (0.2s) +""" +expected = "gotestsum: all tests passed" diff --git a/assets/filters/great-expectations.toml b/assets/filters/great-expectations.toml new file mode 100644 index 0000000..c55d22b --- /dev/null +++ b/assets/filters/great-expectations.toml @@ -0,0 +1,37 @@ +[filters.great-expectations] +description = "Keep Great Expectations validation result and failed expectations" +match_command = "^great_expectations\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Calculating Metrics", + "^Using ", +] +keep_lines_matching = [ + "Validation succeeded", + "Validation failed", + "expectation", + "(?i)error", +] +max_lines = 80 +on_empty = "great-expectations: done" + +[[tests.great-expectations]] +name = "validation result kept" +input = """ +Using v3 (Batch Request) API +Calculating Metrics: 100%|██████| 25/25 + +Validation failed! +2 of 25 expectations were not met. +""" +expected = """Validation failed! +2 of 25 expectations were not met.""" + +[[tests.great-expectations]] +name = "success kept" +input = """ +Calculating Metrics: 100%|██████| 25/25 +Validation succeeded! +""" +expected = "Validation succeeded!" 
diff --git a/assets/filters/helmfile.toml b/assets/filters/helmfile.toml new file mode 100644 index 0000000..606786f --- /dev/null +++ b/assets/filters/helmfile.toml @@ -0,0 +1,46 @@ +[filters.helmfile] +description = "Keep helmfile apply/sync release results and diffs, drop progress noise" +match_command = "^helmfile\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Adding repo ", + "^Updating repo ", + "^Building dependency ", + "^Comparing release", +] +keep_lines_matching = [ + "^Upgrading release", + "^Listing releases", + "^Affected releases", + "has been (upgraded|installed|uninstalled)", + "STATUS: deployed", + "^UPDATED RELEASES", + "(?i)error", + "(?i)fail", +] +max_lines = 80 +on_empty = "helmfile: no changes" + +[[tests.helmfile]] +name = "release results kept" +input = """ +Adding repo stable https://charts.example.com +Updating repo +Comparing release=web, chart=stable/web +Upgrading release=web, chart=stable/web +Release "web" has been upgraded. Happy Helming! +STATUS: deployed +""" +expected = """Upgrading release=web, chart=stable/web +Release "web" has been upgraded. Happy Helming! +STATUS: deployed""" + +[[tests.helmfile]] +name = "no-op sync collapses" +input = """ +Adding repo stable https://charts.example.com +Updating repo +Comparing release=web, chart=stable/web +""" +expected = "helmfile: no changes" diff --git a/assets/filters/htmlhint.toml b/assets/filters/htmlhint.toml new file mode 100644 index 0000000..1472d1c --- /dev/null +++ b/assets/filters/htmlhint.toml @@ -0,0 +1,35 @@ +[filters.htmlhint] +description = "Keep htmlhint error locations and scan summary, drop code snippets" +match_command = "^(npx\\s+)?htmlhint\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "Scanned ", + "\\(.*\\)\\s*$", + "L\\d+ \\|", + "(?i)error", +] +max_lines = 100 +on_empty = "htmlhint: no errors" + +[[tests.htmlhint]] +name = "errors and summary kept" +input = """ + index.html + L5 |
+ ^ Tag must be paired (tag-pair) + + Scanned 1 files, found 1 errors in 12 ms +""" +expected = """ L5 |
+ ^ Tag must be paired (tag-pair) + Scanned 1 files, found 1 errors in 12 ms""" + +[[tests.htmlhint]] +name = "clean scan collapses" +input = """ + index.html +""" +expected = "htmlhint: no errors" diff --git a/assets/filters/hugo.toml b/assets/filters/hugo.toml new file mode 100644 index 0000000..53fa512 --- /dev/null +++ b/assets/filters/hugo.toml @@ -0,0 +1,48 @@ +[filters.hugo] +description = "Keep Hugo build stats table and total time, drop banner and separators" +match_command = "^hugo\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Start building sites", + "^hugo v", + "^Built in", + "^[-+\\s]+$", + "WARN.*found no layout", +] +keep_lines_matching = [ + "\\|", + "^Total in", + "(?i)error", + "(?i)^warn", +] +max_lines = 60 +on_empty = "hugo: built (no stats)" + +[[tests.hugo]] +name = "stats table and total kept, banner dropped" +input = """ +Start building sites … +hugo v0.120.0+extended linux/amd64 + + | EN +-------------------+----- + Pages | 22 + Static files | 12 + Aliases | 4 + +Total in 18 ms +""" +expected = """ | EN + Pages | 22 + Static files | 12 + Aliases | 4 +Total in 18 ms""" + +[[tests.hugo]] +name = "bare success collapses" +input = """ +Start building sites … +hugo v0.120.0+extended linux/amd64 +""" +expected = "hugo: built (no stats)" diff --git a/assets/filters/hyperfine.toml b/assets/filters/hyperfine.toml new file mode 100644 index 0000000..5580ce9 --- /dev/null +++ b/assets/filters/hyperfine.toml @@ -0,0 +1,50 @@ +[filters.hyperfine] +description = "Keep hyperfine mean timings and comparison summary, drop range/warmup detail" +match_command = "^hyperfine\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*Range \\(min", + "^\\s*Warning:", +] +keep_lines_matching = [ + "^Benchmark \\d", + "Time \\(mean", + "^Summary", + "ran", + "faster", + "(?i)error", +] +max_lines = 60 +on_empty = "hyperfine: benchmark done" + +[[tests.hyperfine]] +name = "means and summary kept" +input = """ +Benchmark 1: grep foo + 
Time (mean ± σ): 5.2 ms ± 0.3 ms + Range (min … max): 4.8 ms … 6.1 ms + +Benchmark 2: rg foo + Time (mean ± σ): 1.1 ms ± 0.1 ms + Range (min … max): 0.9 ms … 1.4 ms + +Summary + 'rg foo' ran 4.73 ± 0.45 times faster than 'grep foo' +""" +expected = """Benchmark 1: grep foo + Time (mean ± σ): 5.2 ms ± 0.3 ms +Benchmark 2: rg foo + Time (mean ± σ): 1.1 ms ± 0.1 ms +Summary + 'rg foo' ran 4.73 ± 0.45 times faster than 'grep foo'""" + +[[tests.hyperfine]] +name = "single benchmark kept" +input = """ +Benchmark 1: ls + Time (mean ± σ): 2.0 ms ± 0.2 ms + Range (min … max): 1.7 ms … 2.5 ms +""" +expected = """Benchmark 1: ls + Time (mean ± σ): 2.0 ms ± 0.2 ms""" diff --git a/assets/filters/infracost.toml b/assets/filters/infracost.toml new file mode 100644 index 0000000..d470b95 --- /dev/null +++ b/assets/filters/infracost.toml @@ -0,0 +1,42 @@ +[filters.infracost] +description = "Keep infracost project totals and cost lines, drop sub-resource detail" +match_command = "^infracost\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*├─", + "^\\s*└─", +] +keep_lines_matching = [ + "OVERALL TOTAL", + "Monthly Cost", + "^\\s*Project:", + "Total Monthly Cost", + "\\$[0-9]", + "(?i)error", +] +max_lines = 60 +on_empty = "infracost: no cost data" + +[[tests.infracost]] +name = "totals kept, sub-resources dropped" +input = """ + Project: org/repo + + Name Monthly Qty Unit Monthly Cost + aws_instance.web + └─ Instance usage 730 hours $69.35 + ├─ root_block_device $5.27 + + OVERALL TOTAL $74.62 +""" +expected = """ Project: org/repo + Name Monthly Qty Unit Monthly Cost + OVERALL TOTAL $74.62""" + +[[tests.infracost]] +name = "project line kept when no resources" +input = """ + Project: org/repo +""" +expected = """ Project: org/repo""" diff --git a/assets/filters/interrogate.toml b/assets/filters/interrogate.toml new file mode 100644 index 0000000..424c308 --- /dev/null +++ b/assets/filters/interrogate.toml @@ -0,0 +1,40 @@ +[filters.interrogate] +description = "Keep interrogate
docstring-coverage result and TOTAL row, drop per-file rows and table borders" +match_command = "^interrogate\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*[-=|]+\\s*$", + "^\\|-+", +] +keep_lines_matching = [ + "RESULT:", + "PASSED", + "FAILED", + "TOTAL", + "(?i)error", +] +max_lines = 40 +on_empty = "interrogate: ok" + +[[tests.interrogate]] +name = "result and total kept" +input = """ +==================== Coverage for /app ==================== +------------------------------------- Summary -------------------------------- +| Name | Total | Miss | Cover | +|---------------|-------|------|-------| +| app.py | 10 | 1 | 90% | +| TOTAL | 10 | 1 | 90.0% | +------------------------------------------------------------------------------ +RESULT: PASSED (minimum: 80.0%, actual: 90.0%) +""" +expected = """| TOTAL | 10 | 1 | 90.0% | +RESULT: PASSED (minimum: 80.0%, actual: 90.0%)""" + +[[tests.interrogate]] +name = "below threshold kept" +input = """ +RESULT: FAILED (minimum: 80.0%, actual: 55.0%) +""" +expected = "RESULT: FAILED (minimum: 80.0%, actual: 55.0%)" diff --git a/assets/filters/istioctl.toml b/assets/filters/istioctl.toml new file mode 100644 index 0000000..3eec8eb --- /dev/null +++ b/assets/filters/istioctl.toml @@ -0,0 +1,35 @@ +[filters.istioctl] +description = "Keep istioctl analyze findings, collapse clean analysis" +match_command = "^istioctl\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^✔ No validation issues", +] +keep_lines_matching = [ + "^Error ", + "^Warning ", + "^Info ", + "\\[IST[0-9]+\\]", + "(?i)error", +] +max_lines = 80 +on_empty = "istioctl: no issues found" + +[[tests.istioctl]] +name = "analyze findings kept" +input = """ +Error [IST0101] (Gateway my-gw) Referenced selector not found: "app=missing" +Warning [IST0118] (Service web) Port name http-web is invalid +Info [IST0102] (Namespace default) The namespace is not enabled for Istio injection +""" +expected = """Error [IST0101] (Gateway my-gw) Referenced
selector not found: "app=missing" +Warning [IST0118] (Service web) Port name http-web is invalid +Info [IST0102] (Namespace default) The namespace is not enabled for Istio injection""" + +[[tests.istioctl]] +name = "clean analysis collapses" +input = """ +✔ No validation issues found when analyzing namespace: default. +""" +expected = "istioctl: no issues found" diff --git a/assets/filters/jsonlint.toml b/assets/filters/jsonlint.toml new file mode 100644 index 0000000..a313f21 --- /dev/null +++ b/assets/filters/jsonlint.toml @@ -0,0 +1,32 @@ +[filters.jsonlint] +description = "Keep jsonlint parse errors, collapse valid files" +match_command = "^(npx\\s+)?jsonlint\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^Error", + "Parse error", + "on line \\d+", + "Expecting ", + "(?i)error", +] +max_lines = 40 +on_empty = "jsonlint: valid" + +[[tests.jsonlint]] +name = "parse error kept" +input = """ +Error: Parse error on line 3: +... "name": "x" "age": 5 ... 
+-----------------^ +Expecting 'EOF', '}', ',', ']', got 'STRING' +""" +expected = """Error: Parse error on line 3: +Expecting 'EOF', '}', ',', ']', got 'STRING'""" + +[[tests.jsonlint]] +name = "valid file collapses" +input = "" +expected = "jsonlint: valid" diff --git a/assets/filters/kafka-topics.toml b/assets/filters/kafka-topics.toml new file mode 100644 index 0000000..77cfdd6 --- /dev/null +++ b/assets/filters/kafka-topics.toml @@ -0,0 +1,35 @@ +[filters.kafka-topics] +description = "Keep kafka-topics descriptions and errors, drop blanks" +match_command = "^kafka-topics(\\.sh)?\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^Topic:", + "^\\s*Topic:", + "Partition:", + "Created topic", + "(?i)error", + "(?i)exception", +] +max_lines = 80 +on_empty = "kafka-topics: ok" + +[[tests.kafka-topics]] +name = "topic description kept" +input = """ +Topic: orders PartitionCount: 3 ReplicationFactor: 2 Configs: retention.ms=604800000 + Topic: orders Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2 + Topic: orders Partition: 1 Leader: 2 Replicas: 2,3 Isr: 2,3 +""" +expected = """Topic: orders PartitionCount: 3 ReplicationFactor: 2 Configs: retention.ms=604800000 + Topic: orders Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2 + Topic: orders Partition: 1 Leader: 2 Replicas: 2,3 Isr: 2,3""" + +[[tests.kafka-topics]] +name = "created topic kept" +input = """ +Created topic orders. +""" +expected = "Created topic orders." 
diff --git a/assets/filters/kics.toml b/assets/filters/kics.toml new file mode 100644 index 0000000..28fdc57 --- /dev/null +++ b/assets/filters/kics.toml @@ -0,0 +1,55 @@ +[filters.kics] +description = "Keep kics severity summary and results, drop scan progress" +match_command = "^kics\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Scanning with", + "Preparing Scan Assets", + "Executing queries:", +] +keep_lines_matching = [ + "^(HIGH|MEDIUM|LOW|INFO|CRITICAL):", + "Results Summary", + "Files scanned", + "Queries failed", + "Total .* Results", + "(?i)error", +] +max_lines = 60 +on_empty = "kics: no findings" + +[[tests.kics]] +name = "summary kept, progress dropped" +input = """ +Scanning with Keeping you (KICS) v1.7.0 +Preparing Scan Assets: Done +Executing queries: 100.00% + +Files scanned: 10 +Queries failed to execute: 0 + +Results Summary: +CRITICAL: 0 +HIGH: 2 +MEDIUM: 3 +LOW: 1 +INFO: 0 +""" +expected = """Files scanned: 10 +Queries failed to execute: 0 +Results Summary: +CRITICAL: 0 +HIGH: 2 +MEDIUM: 3 +LOW: 1 +INFO: 0""" + +[[tests.kics]] +name = "clean scan collapses" +input = """ +Scanning with Keeping you (KICS) v1.7.0 +Preparing Scan Assets: Done +Executing queries: 100.00% +""" +expected = "kics: no findings" diff --git a/assets/filters/knex.toml b/assets/filters/knex.toml new file mode 100644 index 0000000..05c91d9 --- /dev/null +++ b/assets/filters/knex.toml @@ -0,0 +1,35 @@ +[filters.knex] +description = "Keep knex migration batch results and errors" +match_command = "^(npx\\s+)?knex\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Using environment", + "^Requiring external", +] +keep_lines_matching = [ + "^Batch ", + "Already up to date", + "ran successfully", + "migration", + "(?i)error", +] +max_lines = 60 +on_empty = "knex: up to date" + +[[tests.knex]] +name = "batch result kept" +input = """ +Using environment: development +Requiring external module ts-node/register + +Batch 1 run: 2 migrations +""" +expected 
= "Batch 1 run: 2 migrations" + +[[tests.knex]] +name = "up to date collapses" +input = """ +Using environment: development +""" +expected = "knex: up to date" diff --git a/assets/filters/ko.toml b/assets/filters/ko.toml new file mode 100644 index 0000000..396533c --- /dev/null +++ b/assets/filters/ko.toml @@ -0,0 +1,37 @@ +[filters.ko] +description = "Keep ko published image references and errors, drop build progress" +match_command = "^ko\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\d{4}/\\d{2}/\\d{2} .* Building ", + "^\\d{4}/\\d{2}/\\d{2} .* Publishing ", + "^\\d{4}/\\d{2}/\\d{2} .* pushing ", +] +keep_lines_matching = [ + "@sha256:", + "Published ", + "(?i)error", + "(?i)fail", +] +max_lines = 40 +on_empty = "ko: published" + +[[tests.ko]] +name = "published refs kept" +input = """ +2024/06/01 12:00:00 Building github.com/org/app for linux/amd64 +2024/06/01 12:00:05 Publishing ko.local/app:latest +2024/06/01 12:00:06 Published ko.local/app@sha256:abc123 +ko.local/app@sha256:abc123def456 +""" +expected = """2024/06/01 12:00:06 Published ko.local/app@sha256:abc123 +ko.local/app@sha256:abc123def456""" + +[[tests.ko]] +name = "build error kept" +input = """ +2024/06/01 12:00:00 Building github.com/org/app for linux/amd64 +error: go build failed: exit status 2 +""" +expected = "error: go build failed: exit status 2" diff --git a/assets/filters/kube-bench.toml b/assets/filters/kube-bench.toml new file mode 100644 index 0000000..5dcbfe0 --- /dev/null +++ b/assets/filters/kube-bench.toml @@ -0,0 +1,46 @@ +[filters.kube-bench] +description = "Keep kube-bench FAIL/WARN checks and summary, drop PASS/INFO" +match_command = "^kube-bench\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\[PASS\\]", + "^\\[INFO\\]", +] +keep_lines_matching = [ + "^\\[FAIL\\]", + "^\\[WARN\\]", + "checks (PASS|FAIL|WARN|INFO)", + "== Summary", + "(?i)error", +] +max_lines = 100 +on_empty = "kube-bench: all checks passed" + +[[tests.kube-bench]] +name = 
"fail/warn and summary kept" +input = """ +[INFO] 1 Master Node Security Configuration +[PASS] 1.1.1 Ensure that the API server pod spec file permissions +[FAIL] 1.1.2 Ensure that the etcd data dir ownership +[WARN] 1.1.3 Ensure that the controller manager + +== Summary == +50 checks PASS +3 checks FAIL +2 checks WARN +""" +expected = """[FAIL] 1.1.2 Ensure that the etcd data dir ownership +[WARN] 1.1.3 Ensure that the controller manager +== Summary == +50 checks PASS +3 checks FAIL +2 checks WARN""" + +[[tests.kube-bench]] +name = "all pass collapses" +input = """ +[INFO] 1 Master Node Security Configuration +[PASS] 1.1.1 Ensure permissions +""" +expected = "kube-bench: all checks passed" diff --git a/assets/filters/kube-linter.toml b/assets/filters/kube-linter.toml new file mode 100644 index 0000000..4286da2 --- /dev/null +++ b/assets/filters/kube-linter.toml @@ -0,0 +1,37 @@ +[filters.kube-linter] +description = "Keep kube-linter findings and error count, drop banner" +match_command = "^kube-linter\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^KubeLinter ", +] +keep_lines_matching = [ + "\\(check:", + "^Error: found", + "lint errors", + "(?i)error", +] +max_lines = 100 +on_empty = "kube-linter: no lint errors" + +[[tests.kube-linter]] +name = "findings and count kept" +input = """ +KubeLinter 0.6.0 + +web.yaml: (object: /web apps/v1, Kind=Deployment) container "web" does not have a read-only root file system (check: no-read-only-root-fs) +web.yaml: (object: /web apps/v1, Kind=Deployment) container "web" has cpu limit 0 (check: unset-cpu-requirements) + +Error: found 2 lint errors +""" +expected = """web.yaml: (object: /web apps/v1, Kind=Deployment) container "web" does not have a read-only root file system (check: no-read-only-root-fs) +web.yaml: (object: /web apps/v1, Kind=Deployment) container "web" has cpu limit 0 (check: unset-cpu-requirements) +Error: found 2 lint errors""" + +[[tests.kube-linter]] +name = "no findings collapses" +input = 
""" +KubeLinter 0.6.0 +""" +expected = "kube-linter: no lint errors" diff --git a/assets/filters/kube-score.toml b/assets/filters/kube-score.toml new file mode 100644 index 0000000..2b88c23 --- /dev/null +++ b/assets/filters/kube-score.toml @@ -0,0 +1,44 @@ +[filters.kube-score] +description = "Keep kube-score critical/warning checks, drop OK checks and decoration" +match_command = "^kube-score\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "\\[OK\\]", +] +keep_lines_matching = [ + "\\[CRITICAL\\]", + "\\[WARNING\\]", + "·", + "apps/v1", + "v1/", + "(?i)error", +] +max_lines = 100 +on_empty = "kube-score: all checks ok" + +[[tests.kube-score]] +name = "critical and warning checks kept, ok dropped" +input = """ +apps/v1/Deployment web + + [OK] Container Image Pull Policy + [CRITICAL] Container Resources + · web -> CPU limit is not set + [WARNING] Pod NetworkPolicy + · The pod does not have a matching NetworkPolicy +""" +expected = """apps/v1/Deployment web + [CRITICAL] Container Resources + · web -> CPU limit is not set + [WARNING] Pod NetworkPolicy + · The pod does not have a matching NetworkPolicy""" + +[[tests.kube-score]] +name = "all ok collapses" +input = """ +apps/v1/Deployment web + [OK] Container Resources + [OK] Pod NetworkPolicy +""" +expected = "apps/v1/Deployment web" diff --git a/assets/filters/kubens.toml b/assets/filters/kubens.toml new file mode 100644 index 0000000..084a49a --- /dev/null +++ b/assets/filters/kubens.toml @@ -0,0 +1,29 @@ +[filters.kubens] +description = "Bound kubens namespace list and keep switch confirmation" +match_command = "^(kubens|kubectx)\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^Active namespace", + "^Switched to ", + "^\\S", + "(?i)error", +] +max_lines = 80 +on_empty = "kubens: ok" + +[[tests.kubens]] +name = "switch confirmation kept" +input = """ +Active namespace is "production". +""" +expected = "Active namespace is \"production\"." 
+ +[[tests.kubens]] +name = "switched context kept" +input = """ +Switched to context "prod-cluster". +""" +expected = "Switched to context \"prod-cluster\"." diff --git a/assets/filters/kustomize.toml b/assets/filters/kustomize.toml new file mode 100644 index 0000000..f4a779d --- /dev/null +++ b/assets/filters/kustomize.toml @@ -0,0 +1,41 @@ +[filters.kustomize] +description = "Summarize kustomize build YAML to resource kind+name per document" +match_command = "^kustomize\\s+build\\b" +strip_ansi = true +keep_lines_matching = [ + "^kind:", + "^ name:", + "^---", + "(?i)error", +] +max_lines = 120 +on_empty = "kustomize: no resources" +token_budget = 2000 + +[[tests.kustomize]] +name = "resources summarized to kind+name" +input = """ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: web + labels: + app: web +spec: + replicas: 3 +--- +apiVersion: v1 +kind: Service +metadata: + name: web-svc +""" +expected = """kind: Deployment + name: web +--- +kind: Service + name: web-svc""" + +[[tests.kustomize]] +name = "empty build collapses" +input = "" +expected = "kustomize: no resources" diff --git a/assets/filters/lychee.toml b/assets/filters/lychee.toml new file mode 100644 index 0000000..8b83088 --- /dev/null +++ b/assets/filters/lychee.toml @@ -0,0 +1,45 @@ +[filters.lychee] +description = "Keep lychee broken-link errors and the run summary, drop per-OK link lines" +match_command = "^lychee\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\[200\\]", + "^\\s*\\[200\\]", +] +keep_lines_matching = [ + "\\[ERROR\\]", + "Total", + "Errors", + "Excluded", + "✅", + "🚫", + "OK", + "(?i)error", +] +max_lines = 80 +on_empty = "lychee: all links ok" +token_budget = 2000 + +[[tests.lychee]] +name = "summary kept, ok links dropped" +input = """ +[200] http://localhost:1234/a/ +[200] http://localhost:1234/b/ +[ERROR] https://example.com/dead | Failed: Network error +🔍 99 Total (in 0s) +✅ 98 OK +🚫 1 Errors +""" +expected = """[ERROR] https://example.com/dead | 
Failed: Network error +🔍 99 Total (in 0s) +✅ 98 OK +🚫 1 Errors""" + +[[tests.lychee]] +name = "all-ok run collapses" +input = """ +[200] http://localhost:1234/a/ +[200] http://localhost:1234/b/ +""" +expected = "lychee: all links ok" diff --git a/assets/filters/migrate.toml b/assets/filters/migrate.toml new file mode 100644 index 0000000..b4feb75 --- /dev/null +++ b/assets/filters/migrate.toml @@ -0,0 +1,31 @@ +[filters.migrate] +description = "Keep golang-migrate applied versions and errors" +match_command = "^migrate\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^\\d+/[uvd] ", + "no change", + "Dirty database", + "(?i)error", +] +max_lines = 80 +on_empty = "migrate: no change" + +[[tests.migrate]] +name = "applied versions kept" +input = """ +20240101120000/u create_users (12.3ms) +20240102120000/u add_index (8.1ms) +""" +expected = """20240101120000/u create_users (12.3ms) +20240102120000/u add_index (8.1ms)""" + +[[tests.migrate]] +name = "dirty database kept" +input = """ +error: Dirty database version 20240101120000. Fix and force version. +""" +expected = "error: Dirty database version 20240101120000. Fix and force version." diff --git a/assets/filters/mill.toml b/assets/filters/mill.toml new file mode 100644 index 0000000..56adc0e --- /dev/null +++ b/assets/filters/mill.toml @@ -0,0 +1,37 @@ +[filters.mill] +description = "Keep mill task failures and totals, drop per-step compile progress" +match_command = "^(\\./)?mill\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\[\\d+/\\d+\\] \\w+\\.(compile|resolve)", +] +keep_lines_matching = [ + "targets? failed", + "\\d+ tests? 
failed", + "failed", + "BUILD ", + "(?i)error", +] +max_lines = 80 +on_empty = "mill: ok" + +[[tests.mill]] +name = "failure kept, compile steps dropped" +input = """ +[1/100] app.compile +[50/100] app.test.compile +[100/100] app.test +1 targets failed +app.test scala.AssertionError +""" +expected = """1 targets failed +app.test scala.AssertionError""" + +[[tests.mill]] +name = "clean build collapses" +input = """ +[1/100] app.compile +[100/100] app.test.compile +""" +expected = "mill: ok" diff --git a/assets/filters/mlflow.toml b/assets/filters/mlflow.toml new file mode 100644 index 0000000..1631f50 --- /dev/null +++ b/assets/filters/mlflow.toml @@ -0,0 +1,36 @@ +[filters.mlflow] +description = "Keep mlflow run result and errors, drop INFO setup logs" +match_command = "^mlflow\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "mlflow.projects.utils", + "=== Creating ", + "=== Fetching ", +] +keep_lines_matching = [ + "=== Run .* succeeded ===", + "=== Run .* failed ===", + "succeeded", + "failed", + "(?i)error", + "(?i)exception", +] +max_lines = 60 +on_empty = "mlflow: done" + +[[tests.mlflow]] +name = "run result kept" +input = """ +2024/06/01 12:00:00 INFO mlflow.projects.utils: === Fetching project === +2024/06/01 12:00:05 INFO mlflow.projects: === Run (ID 'abc123') succeeded === +""" +expected = "2024/06/01 12:00:05 INFO mlflow.projects: === Run (ID 'abc123') succeeded ===" + +[[tests.mlflow]] +name = "run failure kept" +input = """ +2024/06/01 12:00:00 INFO mlflow.projects.utils: === Fetching project === +2024/06/01 12:00:05 ERROR mlflow.cli: === Run (ID 'abc123') failed === +""" +expected = "2024/06/01 12:00:05 ERROR mlflow.cli: === Run (ID 'abc123') failed ===" diff --git a/assets/filters/mockgen.toml b/assets/filters/mockgen.toml new file mode 100644 index 0000000..a03ba36 --- /dev/null +++ b/assets/filters/mockgen.toml @@ -0,0 +1,28 @@ +[filters.mockgen] +description = "Surface mockgen errors, collapse silent success" +match_command = 
"^mockgen\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "(?i)error", + "(?i)fail", + "cannot ", + "no such", + "loading ", +] +max_lines = 40 +on_empty = "mockgen: generated" + +[[tests.mockgen]] +name = "error kept" +input = """ +2024/06/01 mockgen: Loading input failed: cannot find package "./missing" +""" +expected = """2024/06/01 mockgen: Loading input failed: cannot find package "./missing\"""" + +[[tests.mockgen]] +name = "silent success collapses" +input = "" +expected = "mockgen: generated" diff --git a/assets/filters/moon.toml b/assets/filters/moon.toml new file mode 100644 index 0000000..a24999b --- /dev/null +++ b/assets/filters/moon.toml @@ -0,0 +1,41 @@ +[filters.moon] +description = "Keep moon task summary and failures, drop per-task progress bars" +match_command = "^moon\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^▪+ ", + "^\\s*▪+ ", +] +keep_lines_matching = [ + "^Tasks:", + "^\\s*Time:", + "completed", + "failed", + "(?i)error", +] +max_lines = 60 +on_empty = "moon: done" + +[[tests.moon]] +name = "summary kept, task bars dropped" +input = """ +▪▪▪▪ app:build (1.2s) +▪▪▪▪ app:test (0.8s) + +Tasks: 2 completed + Time: 2.5s +""" +expected = """Tasks: 2 completed + Time: 2.5s""" + +[[tests.moon]] +name = "failure kept" +input = """ +▪▪▪▪ app:build (1.2s) + +Tasks: 1 completed, 1 failed + Time: 2.5s +""" +expected = """Tasks: 1 completed, 1 failed + Time: 2.5s""" diff --git a/assets/filters/nbconvert.toml b/assets/filters/nbconvert.toml new file mode 100644 index 0000000..eb33e65 --- /dev/null +++ b/assets/filters/nbconvert.toml @@ -0,0 +1,36 @@ +[filters.nbconvert] +description = "Keep jupyter nbconvert output-written lines and errors" +match_command = "^jupyter\\s+nbconvert\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Making directory", +] +keep_lines_matching = [ + "Writing \\d+ bytes", + "Converting notebook", + "(?i)error", + "(?i)exception", + "(?i)traceback", +] 
+max_lines = 40 +on_empty = "nbconvert: done" + +[[tests.nbconvert]] +name = "written output kept" +input = """ +[NbConvertApp] Converting notebook analysis.ipynb to html +[NbConvertApp] Making directory output +[NbConvertApp] Writing 124356 bytes to output/analysis.html +""" +expected = """[NbConvertApp] Converting notebook analysis.ipynb to html +[NbConvertApp] Writing 124356 bytes to output/analysis.html""" + +[[tests.nbconvert]] +name = "execution error kept" +input = """ +[NbConvertApp] Converting notebook analysis.ipynb to html +nbclient.exceptions.CellExecutionError: NameError: name 'foo' is not defined +""" +expected = """[NbConvertApp] Converting notebook analysis.ipynb to html +nbclient.exceptions.CellExecutionError: NameError: name 'foo' is not defined""" diff --git a/assets/filters/nerdctl.toml b/assets/filters/nerdctl.toml new file mode 100644 index 0000000..d829f29 --- /dev/null +++ b/assets/filters/nerdctl.toml @@ -0,0 +1,38 @@ +[filters.nerdctl] +description = "Keep nerdctl table rows and results, drop layer download progress" +match_command = "^nerdctl\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*[0-9a-f]{12}:\\s+(downloading|extracting|waiting)", + "elapsed:", + "total:", +] +keep_lines_matching = [ + "^CONTAINER ID", + "^REPOSITORY", + "^IMAGE ID", + "^[0-9a-f]{12}\\b", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "nerdctl: ok" + +[[tests.nerdctl]] +name = "ps rows kept, progress dropped" +input = """ +a1b2c3d4e5f6: downloading |++++++++| 12.0 MiB/12.0 MiB +elapsed: 1.2 s total: 12.0 M +CONTAINER ID IMAGE STATUS NAMES +a1b2c3d4e5f6 app:latest Up 1m web +""" +expected = """CONTAINER ID IMAGE STATUS NAMES +a1b2c3d4e5f6 app:latest Up 1m web""" + +[[tests.nerdctl]] +name = "run error kept" +input = """ +FATA[0000] error: image not found +""" +expected = "FATA[0000] error: image not found" diff --git a/assets/filters/nomad.toml b/assets/filters/nomad.toml new file mode 100644 index 0000000..e3d26b5 --- /dev/null 
+++ b/assets/filters/nomad.toml @@ -0,0 +1,38 @@ +[filters.nomad] +description = "Keep nomad job status key fields and errors, drop blanks" +match_command = "^nomad\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^\\w[\\w ]*=", + "^ID\\s", + "Status", + "running", + "failed", + "dead", + "(?i)error", +] +max_lines = 80 +on_empty = "nomad: ok" + +[[tests.nomad]] +name = "job status fields kept" +input = """ +ID = web +Name = web +Status = running +Type = service +Priority = 50 +""" +expected = """ID = web +Name = web +Status = running +Type = service +Priority = 50""" + +[[tests.nomad]] +name = "no output collapses" +input = "" +expected = "nomad: ok" diff --git a/assets/filters/nox.toml b/assets/filters/nox.toml new file mode 100644 index 0000000..1bf4e97 --- /dev/null +++ b/assets/filters/nox.toml @@ -0,0 +1,44 @@ +[filters.nox] +description = "Keep nox session start/result lines and the multi-session summary, drop inner command echoes" +match_command = "^nox\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Creating virtualenv", + "Re-using existing virtualenv", +] +keep_lines_matching = [ + "Running session", + "was successful", + "Ran multiple sessions", + "nox > \\*", + "(?i)\\bfailed\\b", + "(?i)error", +] +max_lines = 80 +on_empty = "nox: done" + +[[tests.nox]] +name = "session results kept, inner echoes dropped" +input = """ +nox > Running session lint +nox > Creating virtualenv using python3.10 in .nox/lint +nox > python -m pip install flake8 +nox > flake8 example.py +nox > Session lint was successful. +nox > Ran multiple sessions: +nox > * lint: success +""" +expected = """nox > Running session lint +nox > Session lint was successful. +nox > Ran multiple sessions: +nox > * lint: success""" + +[[tests.nox]] +name = "failed session kept" +input = """ +nox > Running session tests +nox > Session tests failed. 
+""" +expected = """nox > Running session tests +nox > Session tests failed.""" diff --git a/assets/filters/nuclei.toml b/assets/filters/nuclei.toml new file mode 100644 index 0000000..a80d2a2 --- /dev/null +++ b/assets/filters/nuclei.toml @@ -0,0 +1,42 @@ +[filters.nuclei] +description = "Keep nuclei findings, drop banner and progress" +match_command = "^nuclei\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*__", + "projectdiscovery.io", + "\\[INF\\]", + "\\[WRN\\] ", +] +keep_lines_matching = [ + "\\[critical\\]", + "\\[high\\]", + "\\[medium\\]", + "\\[low\\]", + "\\[CVE-", + "(?i)error", +] +max_lines = 100 +on_empty = "nuclei: no findings" +token_budget = 2000 + +[[tests.nuclei]] +name = "findings kept, banner dropped" +input = """ + __ _ + ____ __ _______/ /__ (_) +projectdiscovery.io +[INF] Templates loaded: 5000 +[CVE-2021-1234] [http] [critical] https://target/path +[exposed-panel] [http] [info] https://target/admin +""" +expected = "[CVE-2021-1234] [http] [critical] https://target/path" + +[[tests.nuclei]] +name = "clean scan collapses" +input = """ +[INF] Templates loaded: 5000 +[INF] No results found +""" +expected = "nuclei: no findings" diff --git a/assets/filters/osv-scanner.toml b/assets/filters/osv-scanner.toml new file mode 100644 index 0000000..b63b649 --- /dev/null +++ b/assets/filters/osv-scanner.toml @@ -0,0 +1,42 @@ +[filters.osv-scanner] +description = "Keep OSV-Scanner vulnerability rows and the severity summary, drop box borders" +match_command = "^osv-scanner\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "OSV URL", + "Scanning dir", + "Scanned .* file", +] +keep_lines_matching = [ + "osv\\.dev", + "known vulnerabilit", + "can be fixed", + "(?i)error", +] +max_lines = 80 +on_empty = "osv-scanner: no vulnerabilities found" +token_budget = 2000 + +[[tests.osv-scanner]] +name = "vuln rows and summary kept, borders dropped" +input = """ 
+╭─────────────────────────────────────┬──────┬───────────┬──────────────────────────┬───────────────┬─────────┬────────────────────╮ +│ OSV URL │ CVSS │ ECOSYSTEM │ PACKAGE │ FIXED VERSION │ VERSION │ SOURCE │ +├─────────────────────────────────────┼──────┼───────────┼──────────────────────────┼───────────────┼─────────┼────────────────────┤ +│ https://osv.dev/GHSA-c3h9-896r-86jm │ 8.6 │ Go │ github.com/gogo/protobuf │ 1.3.2 │ 1.3.1 │ path/to/go.mod │ +╰─────────────────────────────────────┴──────┴───────────┴──────────────────────────┴───────────────┴─────────┴────────────────────╯ +Total 2 packages affected by 2 known vulnerabilities (1 Critical, 1 High, 0 Medium, 0 Low, 0 Unknown) from 2 ecosystems. +1 vulnerability can be fixed. +""" +expected = """│ https://osv.dev/GHSA-c3h9-896r-86jm │ 8.6 │ Go │ github.com/gogo/protobuf │ 1.3.2 │ 1.3.1 │ path/to/go.mod │ +Total 2 packages affected by 2 known vulnerabilities (1 Critical, 1 High, 0 Medium, 0 Low, 0 Unknown) from 2 ecosystems. +1 vulnerability can be fixed.""" + +[[tests.osv-scanner]] +name = "clean scan collapses" +input = """ +Scanned /app/go.mod file and found 12 packages +No issues found +""" +expected = "osv-scanner: no vulnerabilities found" diff --git a/assets/filters/pants.toml b/assets/filters/pants.toml new file mode 100644 index 0000000..23f1d14 --- /dev/null +++ b/assets/filters/pants.toml @@ -0,0 +1,38 @@ +[filters.pants] +description = "Keep pants results and failures, drop per-target INFO progress" +match_command = "^(\\./)?pants\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "\\[INFO\\] Completed:", + "\\[INFO\\] Initializing", +] +keep_lines_matching = [ + "✓", + "✕", + "✗", + "passed", + "FAILURE", + "failed", + "(?i)error", +] +max_lines = 80 +on_empty = "pants: ok" + +[[tests.pants]] +name = "results kept, info dropped" +input = """ +17:23:45.12 [INFO] Initializing scheduler... +17:23:46.00 [INFO] Completed: Building app +✓ app/tests:tests passed. +✕ app/lib:lint failed. 
+""" +expected = """✓ app/tests:tests passed. +✕ app/lib:lint failed.""" + +[[tests.pants]] +name = "all pass kept" +input = """ +✓ app/tests:tests passed. +""" +expected = "✓ app/tests:tests passed." diff --git a/assets/filters/pg_dump.toml b/assets/filters/pg_dump.toml new file mode 100644 index 0000000..8b98c76 --- /dev/null +++ b/assets/filters/pg_dump.toml @@ -0,0 +1,41 @@ +[filters.pg_dump] +description = "Surface pg_dump errors/warnings, collapse silent SQL dump" +match_command = "^pg_dump\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^--", + "^SET ", + "^SELECT pg_catalog", +] +keep_lines_matching = [ + "pg_dump: error", + "pg_dump: warning", + "(?i)error:", + "(?i)fatal:", + "permission denied", +] +max_lines = 60 +on_empty = "pg_dump: dump complete" + +[[tests.pg_dump]] +name = "errors surfaced" +input = """ +-- +-- PostgreSQL database dump +-- +SET statement_timeout = 0; +pg_dump: error: connection to server failed: FATAL: role "x" does not exist +""" +expected = """pg_dump: error: connection to server failed: FATAL: role "x" does not exist""" + +[[tests.pg_dump]] +name = "clean dump collapses" +input = """ +-- +-- PostgreSQL database dump +-- +SET statement_timeout = 0; +SET lock_timeout = 0; +""" +expected = "pg_dump: dump complete" diff --git a/assets/filters/pip-compile.toml b/assets/filters/pip-compile.toml new file mode 100644 index 0000000..7c6b484 --- /dev/null +++ b/assets/filters/pip-compile.toml @@ -0,0 +1,34 @@ +[filters.pip-compile] +description = "Keep pip-compile pinned requirements, drop comment provenance lines" +match_command = "^pip-compile\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*#", +] +keep_lines_matching = [ + "==", + "(?i)error", +] +max_lines = 200 +on_empty = "pip-compile: no requirements" +token_budget = 2500 + +[[tests.pip-compile]] +name = "pins kept, provenance comments dropped" +input = """ +# +# This file is autogenerated by pip-compile +# +click==8.1.7 + # via flask 
+flask==3.0.0 + # via -r requirements.in +""" +expected = """click==8.1.7 +flask==3.0.0""" + +[[tests.pip-compile]] +name = "empty collapses" +input = "" +expected = "pip-compile: no requirements" diff --git a/assets/filters/pipdeptree.toml b/assets/filters/pipdeptree.toml new file mode 100644 index 0000000..483b178 --- /dev/null +++ b/assets/filters/pipdeptree.toml @@ -0,0 +1,28 @@ +[filters.pipdeptree] +description = "Bound pipdeptree dependency tree, keep conflict warnings" +match_command = "^pipdeptree\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +max_lines = 120 +on_empty = "pipdeptree: no dependencies" +token_budget = 2000 + +[[tests.pipdeptree]] +name = "tree kept and bounded" +input = """ +flask==3.0.0 +├── blinker [required: >=1.6.2, installed: 1.7.0] +├── click [required: >=8.1.3, installed: 8.1.7] +└── jinja2 [required: >=3.1.2, installed: 3.1.3] +""" +expected = """flask==3.0.0 +├── blinker [required: >=1.6.2, installed: 1.7.0] +├── click [required: >=8.1.3, installed: 8.1.7] +└── jinja2 [required: >=3.1.2, installed: 3.1.3]""" + +[[tests.pipdeptree]] +name = "empty collapses" +input = "" +expected = "pipdeptree: no dependencies" diff --git a/assets/filters/please.toml b/assets/filters/please.toml new file mode 100644 index 0000000..a7462b5 --- /dev/null +++ b/assets/filters/please.toml @@ -0,0 +1,37 @@ +[filters.please] +description = "Keep please (plz) build result and failures, drop per-target progress" +match_command = "^(plz|please)\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*Building ", + "^\\s*Parsing ", +] +keep_lines_matching = [ + "Build finished", + "targets? failed", + "Some targets failed", + "failed", + "(?i)error", +] +max_lines = 60 +on_empty = "please: build done" + +[[tests.please]] +name = "result kept" +input = """ +Parsing //src:all +Building //src:lib +Build finished; total time 12s, incrementality 50.0%, 20 targets built. 
+""" +expected = "Build finished; total time 12s, incrementality 50.0%, 20 targets built." + +[[tests.please]] +name = "failure kept" +input = """ +Building //src:lib +//src:lib failed: compile error +Some targets failed +""" +expected = """//src:lib failed: compile error +Some targets failed""" diff --git a/assets/filters/pod.toml b/assets/filters/pod.toml new file mode 100644 index 0000000..c0d6e3d --- /dev/null +++ b/assets/filters/pod.toml @@ -0,0 +1,40 @@ +[filters.pod] +description = "Keep CocoaPods install result and errors, drop per-pod install lines" +match_command = "^pod\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Installing ", + "^Using ", + "^Analyzing dependencies", + "^Downloading dependencies", + "^Generating Pods project", +] +keep_lines_matching = [ + "Pod installation complete", + "dependencies from the Podfile", + "(?i)error", + "(?i)\\[!\\]", +] +max_lines = 60 +on_empty = "pod: done" + +[[tests.pod]] +name = "install result kept, per-pod lines dropped" +input = """ +Analyzing dependencies +Downloading dependencies +Installing AFNetworking (4.0.1) +Installing Alamofire (5.8.0) +Generating Pods project +Pod installation complete! There are 2 dependencies from the Podfile and 2 total pods installed. +""" +expected = "Pod installation complete! There are 2 dependencies from the Podfile and 2 total pods installed." + +[[tests.pod]] +name = "error kept" +input = """ +Analyzing dependencies +[!] Unable to find a specification for `MissingPod` +""" +expected = "[!] 
Unable to find a specification for `MissingPod`" diff --git a/assets/filters/podman-build.toml b/assets/filters/podman-build.toml new file mode 100644 index 0000000..e97b4fb --- /dev/null +++ b/assets/filters/podman-build.toml @@ -0,0 +1,44 @@ +[filters.podman-build] +description = "Keep podman build STEP/COMMIT lines and errors, drop layer noise" +match_command = "^podman\\s+build\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^--> ", + "Copying blob", + "Copying config", +] +keep_lines_matching = [ + "^STEP ", + "^COMMIT ", + "Successfully tagged", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "podman build: ok" + +[[tests.podman-build]] +name = "step and commit kept" +input = """ +STEP 1/3: FROM alpine:3.19 +STEP 2/3: RUN apk add curl +--> a1b2c3d +STEP 3/3: CMD ["/app"] +COMMIT app:latest +Successfully tagged localhost/app:latest +""" +expected = """STEP 1/3: FROM alpine:3.19 +STEP 2/3: RUN apk add curl +STEP 3/3: CMD ["/app"] +COMMIT app:latest +Successfully tagged localhost/app:latest""" + +[[tests.podman-build]] +name = "build error kept" +input = """ +STEP 2/3: RUN false +Error: building at STEP "RUN false": exit status 1 +""" +expected = """STEP 2/3: RUN false +Error: building at STEP "RUN false": exit status 1""" diff --git a/assets/filters/podman.toml b/assets/filters/podman.toml new file mode 100644 index 0000000..ed70f05 --- /dev/null +++ b/assets/filters/podman.toml @@ -0,0 +1,41 @@ +[filters.podman] +description = "Keep podman ps/images/inspect table rows, drop blanks" +match_command = "^podman\\s+(ps|images|image\\s+ls|container\\s+ls|inspect|pull|push)\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Copying blob", + "Copying config", + "Writing manifest", +] +keep_lines_matching = [ + "^CONTAINER ID", + "^REPOSITORY", + "^IMAGE ID", + "^[0-9a-f]{12}\\b", + "Storing signatures", + "(?i)error", +] +max_lines = 60 +on_empty = "podman: ok" + +[[tests.podman]] +name = "ps table rows kept" +input = """ 
+CONTAINER ID IMAGE COMMAND STATUS NAMES +a1b2c3d4e5f6 docker.io/app:latest /entry.sh Up 2 minutes web +b2c3d4e5f6a7 docker.io/db:15 postgres Up 5 minutes db +""" +expected = """CONTAINER ID IMAGE COMMAND STATUS NAMES +a1b2c3d4e5f6 docker.io/app:latest /entry.sh Up 2 minutes web +b2c3d4e5f6a7 docker.io/db:15 postgres Up 5 minutes db""" + +[[tests.podman]] +name = "pull progress collapses" +input = """ +Copying blob sha256:abc done +Copying config sha256:def done +Writing manifest to image destination +Storing signatures +""" +expected = "Storing signatures" diff --git a/assets/filters/prefect.toml b/assets/filters/prefect.toml new file mode 100644 index 0000000..ca8ff03 --- /dev/null +++ b/assets/filters/prefect.toml @@ -0,0 +1,36 @@ +[filters.prefect] +description = "Keep prefect flow/task final states and errors, drop routine INFO" +match_command = "^prefect\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Created task run", + "Executing ", +] +keep_lines_matching = [ + "Finished in state", + "Completed\\(", + "Failed\\(", + "Crashed\\(", + "(?i)error", + "(?i)exception", +] +max_lines = 80 +on_empty = "prefect: flow done" + +[[tests.prefect]] +name = "final state kept" +input = """ +12:00:00.000 | INFO | Created task run 'load-0' for task 'load' +12:00:01.000 | INFO | Executing 'load-0' +12:00:02.000 | INFO | Finished in state Completed() +""" +expected = "12:00:02.000 | INFO | Finished in state Completed()" + +[[tests.prefect]] +name = "failed state kept" +input = """ +12:00:01.000 | INFO | Executing 'load-0' +12:00:02.000 | ERROR | Finished in state Failed('Task run encountered an exception') +""" +expected = "12:00:02.000 | ERROR | Finished in state Failed('Task run encountered an exception')" diff --git a/assets/filters/proselint.toml b/assets/filters/proselint.toml new file mode 100644 index 0000000..ec9f2eb --- /dev/null +++ b/assets/filters/proselint.toml @@ -0,0 +1,27 @@ +[filters.proselint] +description = "Keep proselint findings, 
collapse clean prose" +match_command = "^proselint\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + ":\\d+:\\d+:", + "(?i)error", +] +max_lines = 100 +on_empty = "proselint: no issues" + +[[tests.proselint]] +name = "findings kept" +input = """ +README.md:10:3: typography.symbols.curly_quotes Use curly quotes +README.md:22:1: leonard.exclamation.30ppm More than 30 ppm of exclamations +""" +expected = """README.md:10:3: typography.symbols.curly_quotes Use curly quotes +README.md:22:1: leonard.exclamation.30ppm More than 30 ppm of exclamations""" + +[[tests.proselint]] +name = "clean prose collapses" +input = "" +expected = "proselint: no issues" diff --git a/assets/filters/protolint.toml b/assets/filters/protolint.toml new file mode 100644 index 0000000..76fe9fd --- /dev/null +++ b/assets/filters/protolint.toml @@ -0,0 +1,28 @@ +[filters.protolint] +description = "Keep protolint findings, collapse clean protos" +match_command = "^protolint\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "\\.proto:\\d+:\\d+\\]", + "\\.proto:\\d+:\\d+:", + "(?i)error", +] +max_lines = 100 +on_empty = "protolint: no issues" + +[[tests.protolint]] +name = "findings kept" +input = """ +[proto/user.proto:10:1] EnumField name "active" must be CONSTANT_CASE +[proto/user.proto:15:3] Field "userId" must be lower_snake_case +""" +expected = """[proto/user.proto:10:1] EnumField name "active" must be CONSTANT_CASE +[proto/user.proto:15:3] Field "userId" must be lower_snake_case""" + +[[tests.protolint]] +name = "clean proto collapses" +input = "" +expected = "protolint: no issues" diff --git a/assets/filters/prowler.toml b/assets/filters/prowler.toml new file mode 100644 index 0000000..468b5c1 --- /dev/null +++ b/assets/filters/prowler.toml @@ -0,0 +1,39 @@ +[filters.prowler] +description = "Keep prowler FAIL findings and severity, drop PASS checks" +match_command = "^prowler\\b" +strip_ansi = true 
+strip_lines_matching = [ + "^\\s*$", + "^\\s*PASS ", + "^\\s*\\[PASS\\]", +] +keep_lines_matching = [ + "FAIL", + "CRITICAL", + "HIGH", + "MEDIUM", + "Findings", + "(?i)error", +] +max_lines = 100 +on_empty = "prowler: no failed checks" +token_budget = 2000 + +[[tests.prowler]] +name = "failed checks kept" +input = """ +PASS us-east-1 iam_root_mfa_enabled +FAIL us-east-1 s3_bucket_public_access [HIGH] Bucket is public +PASS us-east-1 ec2_ebs_encryption +FAIL us-east-1 rds_no_public_access [CRITICAL] RDS publicly accessible +""" +expected = """FAIL us-east-1 s3_bucket_public_access [HIGH] Bucket is public +FAIL us-east-1 rds_no_public_access [CRITICAL] RDS publicly accessible""" + +[[tests.prowler]] +name = "all pass collapses" +input = """ +PASS us-east-1 iam_root_mfa_enabled +PASS us-east-1 ec2_ebs_encryption +""" +expected = "prowler: no failed checks" diff --git a/assets/filters/publint.toml b/assets/filters/publint.toml new file mode 100644 index 0000000..c9a4218 --- /dev/null +++ b/assets/filters/publint.toml @@ -0,0 +1,41 @@ +[filters.publint] +description = "Keep publint errors/warnings/suggestions, collapse clean packages" +match_command = "^(npx\\s+)?publint\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Linting ", +] +keep_lines_matching = [ + "^Errors:", + "^Warnings:", + "^Suggestions:", + "^\\s*\\d+\\.", + "All good", + "(?i)error", +] +max_lines = 80 +on_empty = "publint: all good" + +[[tests.publint]] +name = "errors and suggestions kept" +input = """ +Linting mypackage + +Errors: +1. pkg.main is mjs/index.js but the file does not exist. + +Suggestions: +1. pkg.exports should be defined. +""" +expected = """Errors: +1. pkg.main is mjs/index.js but the file does not exist. +Suggestions: +1. 
pkg.exports should be defined.""" + +[[tests.publint]] +name = "clean package collapses" +input = """ +Linting mypackage +""" +expected = "publint: all good" diff --git a/assets/filters/pyupgrade.toml b/assets/filters/pyupgrade.toml new file mode 100644 index 0000000..c0041db --- /dev/null +++ b/assets/filters/pyupgrade.toml @@ -0,0 +1,27 @@ +[filters.pyupgrade] +description = "Keep pyupgrade rewritten-file lines, collapse no-op runs" +match_command = "^pyupgrade\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^Rewriting ", + "(?i)error", +] +max_lines = 80 +on_empty = "pyupgrade: no changes" + +[[tests.pyupgrade]] +name = "rewritten files kept" +input = """ +Rewriting src/app.py +Rewriting src/utils.py +""" +expected = """Rewriting src/app.py +Rewriting src/utils.py""" + +[[tests.pyupgrade]] +name = "no changes collapses" +input = "" +expected = "pyupgrade: no changes" diff --git a/assets/filters/radon.toml b/assets/filters/radon.toml new file mode 100644 index 0000000..da7416d --- /dev/null +++ b/assets/filters/radon.toml @@ -0,0 +1,35 @@ +[filters.radon] +description = "Keep radon complexity grades C/D/E/F (worst offenders), drop A/B and files with no flags" +match_command = "^radon\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "- [AB]$", +] +keep_lines_matching = [ + "- [C-F]$", + "Average complexity", + "(?i)error", +] +max_lines = 80 +on_empty = "radon: no high-complexity blocks" + +[[tests.radon]] +name = "high-complexity blocks kept, A/B dropped" +input = """ +src/app.py + F 12:0 process - C + M 40:4 Handler.run - A + F 60:0 validate - E +""" +expected = """ F 12:0 process - C + F 60:0 validate - E""" + +[[tests.radon]] +name = "all simple collapses" +input = """ +src/app.py + F 12:0 process - A + M 40:4 Handler.run - B +""" +expected = "radon: no high-complexity blocks" diff --git a/assets/filters/rclone.toml b/assets/filters/rclone.toml new file mode 100644 index 0000000..cb3668b --- /dev/null 
+++ b/assets/filters/rclone.toml @@ -0,0 +1,44 @@ +[filters.rclone] +description = "Keep rclone final transfer stats and errors, drop interim progress" +match_command = "^rclone\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Transferring:", + "^\\s*\\* ", +] +keep_lines_matching = [ + "^Transferred:", + "^Errors:", + "^Checks:", + "^Elapsed time:", + "^Deleted:", + "(?i)error", +] +max_lines = 40 +on_empty = "rclone: done" + +[[tests.rclone]] +name = "final stats kept, interim dropped" +input = """ +Transferring: + * bigfile.dat: 50% /10M, 1M/s, ETA 5s + +Transferred: 10 MiB / 10 MiB, 100%, 2 MiB/s, ETA 0s +Errors: 0 +Checks: 5 / 5, 100% +Elapsed time: 2.3s +""" +expected = """Transferred: 10 MiB / 10 MiB, 100%, 2 MiB/s, ETA 0s +Errors: 0 +Checks: 5 / 5, 100% +Elapsed time: 2.3s""" + +[[tests.rclone]] +name = "transfer error kept" +input = """ +Errors: 1 (retrying may help) +2024/06/01 ERROR : bigfile.dat: Failed to copy: permission denied +""" +expected = """Errors: 1 (retrying may help) +2024/06/01 ERROR : bigfile.dat: Failed to copy: permission denied""" diff --git a/assets/filters/refurb.toml b/assets/filters/refurb.toml new file mode 100644 index 0000000..6762300 --- /dev/null +++ b/assets/filters/refurb.toml @@ -0,0 +1,27 @@ +[filters.refurb] +description = "Keep refurb suggestions (FURB codes), drop blanks" +match_command = "^refurb\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "\\[FURB[0-9]+\\]", + "(?i)error", +] +max_lines = 100 +on_empty = "refurb: no suggestions" + +[[tests.refurb]] +name = "suggestions kept" +input = """ +src/app.py:10:5 [FURB109]: Replace `in (x, y)` with `in {x, y}` +src/app.py:22:1 [FURB104]: Replace `os.getcwd()` with `Path.cwd()` +""" +expected = """src/app.py:10:5 [FURB109]: Replace `in (x, y)` with `in {x, y}` +src/app.py:22:1 [FURB104]: Replace `os.getcwd()` with `Path.cwd()`""" + +[[tests.refurb]] +name = "clean run collapses" +input = "" +expected = "refurb: no 
suggestions" diff --git a/assets/filters/restic.toml b/assets/filters/restic.toml new file mode 100644 index 0000000..b69bd8b --- /dev/null +++ b/assets/filters/restic.toml @@ -0,0 +1,43 @@ +[filters.restic] +description = "Keep restic backup summary and snapshot id, drop scan progress" +match_command = "^restic\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\[\\d+:\\d+\\] ", + "scan finished", +] +keep_lines_matching = [ + "^Files:", + "^Dirs:", + "^Added to the repo", + "snapshot \\w+ saved", + "processed ", + "(?i)error", + "(?i)fatal", +] +max_lines = 40 +on_empty = "restic: done" + +[[tests.restic]] +name = "backup summary kept" +input = """ +[0:00] 100.00% 10 / 10 files +Files: 10 new, 0 changed, 0 unmodified +Dirs: 2 new, 0 changed, 0 unmodified +Added to the repo: 1.234 MiB +processed 10 files, 5.678 MiB in 0:02 +snapshot ab12cd34 saved +""" +expected = """Files: 10 new, 0 changed, 0 unmodified +Dirs: 2 new, 0 changed, 0 unmodified +Added to the repo: 1.234 MiB +processed 10 files, 5.678 MiB in 0:02 +snapshot ab12cd34 saved""" + +[[tests.restic]] +name = "fatal error kept" +input = """ +Fatal: unable to open repository at /backup: no such file +""" +expected = "Fatal: unable to open repository at /backup: no such file" diff --git a/assets/filters/retire.toml b/assets/filters/retire.toml new file mode 100644 index 0000000..ae1e855 --- /dev/null +++ b/assets/filters/retire.toml @@ -0,0 +1,29 @@ +[filters.retire] +description = "Keep retire.js vulnerability findings, collapse clean runs" +match_command = "^retire\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "has known vulnerabilities", + "CVE-", + "↳", + "severity:", + "(?i)error", +] +max_lines = 100 +on_empty = "retire: no known vulnerabilities" + +[[tests.retire]] +name = "findings kept" +input = """ +public/js/app.js + ↳ jquery 1.8.0 has known vulnerabilities: severity: medium; CVE-2011-4969 +""" +expected = " ↳ jquery 1.8.0 has known 
vulnerabilities: severity: medium; CVE-2011-4969" + +[[tests.retire]] +name = "clean run collapses" +input = "" +expected = "retire: no known vulnerabilities" diff --git a/assets/filters/sam.toml b/assets/filters/sam.toml new file mode 100644 index 0000000..83ef58a --- /dev/null +++ b/assets/filters/sam.toml @@ -0,0 +1,51 @@ +[filters.sam] +description = "Keep AWS SAM build/deploy results and stack outputs, drop event stream noise" +match_command = "^sam\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*-+\\s*$", + "CREATE_IN_PROGRESS", + "UPDATE_IN_PROGRESS", + "DELETE_IN_PROGRESS", +] +keep_lines_matching = [ + "Build Succeeded", + "Build Failed", + "Successfully created/updated stack", + "CREATE_COMPLETE", + "UPDATE_COMPLETE", + "ROLLBACK", + "FAILED", + "Outputs", + "(?i)error", +] +max_lines = 80 +on_empty = "sam: done" + +[[tests.sam]] +name = "build and deploy results kept" +input = """ +Building codeuri: hello_world runtime: python3.12 + +Build Succeeded + +CloudFormation events from stack operations +ResourceStatus ResourceType LogicalResourceId +CREATE_IN_PROGRESS AWS::Lambda::Function HelloWorldFunction +CREATE_COMPLETE AWS::Lambda::Function HelloWorldFunction + +Successfully created/updated stack - my-app in us-east-1 +""" +expected = """Build Succeeded +CREATE_COMPLETE AWS::Lambda::Function HelloWorldFunction +Successfully created/updated stack - my-app in us-east-1""" + +[[tests.sam]] +name = "build failure kept" +input = """ +Build Failed +Error: PythonPipBuilder:ResolveDependencies - {pkg} +""" +expected = """Build Failed +Error: PythonPipBuilder:ResolveDependencies - {pkg}""" diff --git a/assets/filters/scons.toml b/assets/filters/scons.toml new file mode 100644 index 0000000..6e72626 --- /dev/null +++ b/assets/filters/scons.toml @@ -0,0 +1,40 @@ +[filters.scons] +description = "Keep scons result and errors, drop per-file compile commands" +match_command = "^scons\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + 
"^(gcc|g\\+\\+|cc|clang) ", + "^Compiling ", +] +keep_lines_matching = [ + "^scons: ", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "scons: done" + +[[tests.scons]] +name = "result kept, compile commands dropped" +input = """ +scons: Reading SConscript files ... +scons: Building targets ... +gcc -o app.o -c app.c +gcc -o app app.o +scons: done building targets. +""" +expected = """scons: Reading SConscript files ... +scons: Building targets ... +scons: done building targets.""" + +[[tests.scons]] +name = "build error kept" +input = """ +scons: Building targets ... +app.c:5:1: error: expected ';' +scons: building terminated because of errors. +""" +expected = """scons: Building targets ... +app.c:5:1: error: expected ';' +scons: building terminated because of errors.""" diff --git a/assets/filters/scorecard.toml b/assets/filters/scorecard.toml new file mode 100644 index 0000000..b218f1a --- /dev/null +++ b/assets/filters/scorecard.toml @@ -0,0 +1,42 @@ +[filters.scorecard] +description = "Keep OSSF scorecard aggregate score and per-check scores, drop details" +match_command = "^scorecard\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^-+$", + "^\\|-", + "Starting [Ss]corecard", + "^GITHUB_AUTH_TOKEN", +] +keep_lines_matching = [ + "Aggregate score", + "^\\|", + "(?i)error", +] +max_lines = 60 +on_empty = "scorecard: no score" + +[[tests.scorecard]] +name = "scores kept, details dropped" +input = """ +Starting [scorecard] version 4.13.1 + +Aggregate score: 7.2 / 10 + +Check scores: +|---------|----------------|----------------------| +| SCORE | NAME | REASON | +|---------|----------------|----------------------| +| 10 / 10 | Binary-Artifacts | no binaries found | +| 5 / 10 | Branch-Protection | branch not protected | +""" +expected = """Aggregate score: 7.2 / 10 +| SCORE | NAME | REASON | +| 10 / 10 | Binary-Artifacts | no binaries found | +| 5 / 10 | Branch-Protection | branch not protected |""" + +[[tests.scorecard]] +name = "no 
output collapses" +input = "" +expected = "scorecard: no score" diff --git a/assets/filters/sequelize.toml b/assets/filters/sequelize.toml new file mode 100644 index 0000000..0d13ad3 --- /dev/null +++ b/assets/filters/sequelize.toml @@ -0,0 +1,40 @@ +[filters.sequelize] +description = "Keep sequelize-cli migration steps and errors, drop loaded-config noise" +match_command = "^(npx\\s+)?sequelize(-cli)?\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Loaded configuration", + "^Using environment", + "^Sequelize CLI ", + "No migrations were executed", +] +keep_lines_matching = [ + ": migrating", + ": migrated", + ": reverting", + ": reverted", + "(?i)error", +] +max_lines = 80 +on_empty = "sequelize: nothing to migrate" + +[[tests.sequelize]] +name = "migration steps kept" +input = """ +Sequelize CLI [Node: 20.0.0, CLI: 6.6.0] +Loaded configuration file "config/config.json". + +== 20240101120000-create-users: migrating ======= +== 20240101120000-create-users: migrated (0.025s) +""" +expected = """== 20240101120000-create-users: migrating ======= +== 20240101120000-create-users: migrated (0.025s)""" + +[[tests.sequelize]] +name = "nothing pending collapses" +input = """ +Sequelize CLI [Node: 20.0.0, CLI: 6.6.0] +No migrations were executed, database schema was already up to date. 
+""" +expected = "sequelize: nothing to migrate" diff --git a/assets/filters/serverless.toml b/assets/filters/serverless.toml new file mode 100644 index 0000000..a54bedd --- /dev/null +++ b/assets/filters/serverless.toml @@ -0,0 +1,52 @@ +[filters.serverless] +description = "Keep serverless deploy endpoints/functions/result, drop upload progress" +match_command = "^(serverless|sls)\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Packaging ", + "Uploading ", + "Validating template", + "Retrieving CloudFormation", +] +keep_lines_matching = [ + "Service deployed", + "endpoint", + "^endpoints:", + "^functions:", + "^\\s{2,}\\w+:", + "GET - ", + "POST - ", + "deployment bucket", + "(?i)error", + "(?i)fail", +] +max_lines = 80 +on_empty = "serverless: deployed" + +[[tests.serverless]] +name = "endpoints and functions kept" +input = """ +Deploying myservice to stage dev (us-east-1) +Packaging service... +Uploading service.zip file to S3 (2 MB) +✔ Service deployed to stack myservice-dev (112s) + +endpoints: + GET - https://abc.execute-api.us-east-1.amazonaws.com/dev/hello +functions: + hello: myservice-dev-hello (1.2 MB) +""" +expected = """✔ Service deployed to stack myservice-dev (112s) +endpoints: + GET - https://abc.execute-api.us-east-1.amazonaws.com/dev/hello +functions: + hello: myservice-dev-hello (1.2 MB)""" + +[[tests.serverless]] +name = "deploy error kept" +input = """ +Packaging service... 
+Error: The security token included in the request is invalid +""" +expected = "Error: The security token included in the request is invalid" diff --git a/assets/filters/shellspec.toml b/assets/filters/shellspec.toml new file mode 100644 index 0000000..48ecc4b --- /dev/null +++ b/assets/filters/shellspec.toml @@ -0,0 +1,33 @@ +[filters.shellspec] +description = "Keep shellspec failures and example/failure counts, drop dots" +match_command = "^shellspec\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^[.]+$", +] +keep_lines_matching = [ + "Examples:", + "Failures:", + "^\\s*\\d+\\)", + "expected", + "(?i)error", +] +max_lines = 80 +on_empty = "shellspec: all examples passed" + +[[tests.shellspec]] +name = "failure summary kept" +input = """ +....F. + +Examples: 6, Failures: 1 +""" +expected = "Examples: 6, Failures: 1" + +[[tests.shellspec]] +name = "all pass collapses" +input = """ +...... +""" +expected = "shellspec: all examples passed" diff --git a/assets/filters/size-limit.toml b/assets/filters/size-limit.toml new file mode 100644 index 0000000..e5db794 --- /dev/null +++ b/assets/filters/size-limit.toml @@ -0,0 +1,44 @@ +[filters.size-limit] +description = "Keep size-limit package size/limit/time lines, drop decoration" +match_command = "^(npx\\s+)?size-limit\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "(?i)size", + "(?i)limit", + "(?i)loading time", + "(?i)running time", + "(?i)total time", + "(?i)error", +] +max_lines = 40 +on_empty = "size-limit: ok" + +[[tests.size-limit]] +name = "size report kept" +input = """ + Package size: 10.5 kB + Size limit: 12 kB + + Loading time: 210 ms + Running time: 80 ms + Total time: 290 ms +""" +expected = """ Package size: 10.5 kB + Size limit: 12 kB + Loading time: 210 ms + Running time: 80 ms + Total time: 290 ms""" + +[[tests.size-limit]] +name = "over limit kept" +input = """ + Package size: 15 kB + Size limit: 12 kB + Error: exceeds limit +""" +expected = 
""" Package size: 15 kB + Size limit: 12 kB + Error: exceeds limit""" diff --git a/assets/filters/skaffold.toml b/assets/filters/skaffold.toml new file mode 100644 index 0000000..1d54cbf --- /dev/null +++ b/assets/filters/skaffold.toml @@ -0,0 +1,48 @@ +[filters.skaffold] +description = "Keep skaffold build/deploy results, drop progress and tag generation" +match_command = "^skaffold\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Generating tags", + "^Checking cache", + "^Starting build", + "^Build \\[", + "^Building \\[", + "^\\s*-->", +] +keep_lines_matching = [ + "Successfully built", + "Tags used in deployment", + "Deploy completed", + "Starting deploy", + "deployed", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "skaffold: done" + +[[tests.skaffold]] +name = "build and deploy results kept" +input = """ +Generating tags... +Checking cache... +Building [app]... +Successfully built abc123def456 +Tags used in deployment: +Starting deploy... +Deploy completed in 2.3s +""" +expected = """Successfully built abc123def456 +Tags used in deployment: +Starting deploy... +Deploy completed in 2.3s""" + +[[tests.skaffold]] +name = "build failure kept" +input = """ +Building [app]... 
+ERROR building app: exit status 1 +""" +expected = "ERROR building app: exit status 1" diff --git a/assets/filters/spark-submit.toml b/assets/filters/spark-submit.toml new file mode 100644 index 0000000..b6e6b01 --- /dev/null +++ b/assets/filters/spark-submit.toml @@ -0,0 +1,37 @@ +[filters.spark-submit] +description = "Keep spark-submit errors/warnings and final status, drop INFO log spam" +match_command = "^spark-submit\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + " ERROR ", + " WARN ", + "Exception", + "final status:", + "final app status", + "(?i)failed", +] +max_lines = 80 +on_empty = "spark-submit: completed" +token_budget = 2000 + +[[tests.spark-submit]] +name = "errors and final status kept" +input = """ +24/06/01 12:00:00 INFO SparkContext: Running Spark version 3.5.0 +24/06/01 12:00:01 INFO Client: Submitting application +24/06/01 12:00:30 ERROR TaskSetManager: Task 3 failed 4 times +24/06/01 12:00:31 INFO Client: final status: FAILED +""" +expected = """24/06/01 12:00:30 ERROR TaskSetManager: Task 3 failed 4 times +24/06/01 12:00:31 INFO Client: final status: FAILED""" + +[[tests.spark-submit]] +name = "clean run collapses" +input = """ +24/06/01 12:00:00 INFO SparkContext: Running Spark version 3.5.0 +24/06/01 12:00:01 INFO Client: Submitting application +""" +expected = "spark-submit: completed" diff --git a/assets/filters/sqlx.toml b/assets/filters/sqlx.toml new file mode 100644 index 0000000..6dabc13 --- /dev/null +++ b/assets/filters/sqlx.toml @@ -0,0 +1,30 @@ +[filters.sqlx] +description = "Keep sqlx migrate applied/reverted lines and errors" +match_command = "^sqlx\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^Applied ", + "^Reverted ", + "^Skipped ", + "no migrations", + "(?i)error", +] +max_lines = 80 +on_empty = "sqlx: no pending migrations" + +[[tests.sqlx]] +name = "applied migrations kept" +input = """ +Applied 20240101120000/migrate create users 
(1.234ms) +Applied 20240102120000/migrate add index (0.876ms) +""" +expected = """Applied 20240101120000/migrate create users (1.234ms) +Applied 20240102120000/migrate add index (0.876ms)""" + +[[tests.sqlx]] +name = "nothing pending collapses" +input = "" +expected = "sqlx: no pending migrations" diff --git a/assets/filters/standard.toml b/assets/filters/standard.toml new file mode 100644 index 0000000..65b530d --- /dev/null +++ b/assets/filters/standard.toml @@ -0,0 +1,30 @@ +[filters.standard] +description = "Compact StandardJS lint output, keep findings" +match_command = "^(npx\\s+)?standard\\b" +passthrough_when_emptied = true +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^standard: Use JavaScript Standard Style", + "^standard: Run `standard --fix`", +] +keep_lines_matching = [ + ":\\d+:\\d+:", + "(?i)error", +] +max_lines = 100 + +[[tests.standard]] +name = "findings kept, banner dropped" +input = """ +standard: Use JavaScript Standard Style (https://standardjs.com) + /src/app.js:10:1: 'x' is not defined. (no-undef) + /src/app.js:15:20: Missing semicolon. (semi) +""" +expected = """ /src/app.js:10:1: 'x' is not defined. (no-undef) + /src/app.js:15:20: Missing semicolon. (semi)""" + +[[tests.standard]] +name = "clean run stays empty" +input = "" +expected = "" diff --git a/assets/filters/swag.toml b/assets/filters/swag.toml new file mode 100644 index 0000000..bb5af7f --- /dev/null +++ b/assets/filters/swag.toml @@ -0,0 +1,37 @@ +[filters.swag] +description = "Keep swag generation result and errors, drop per-file progress" +match_command = "^swag\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "Generating ", +] +keep_lines_matching = [ + "create docs.go", + "create swagger.json", + "create swagger.yaml", + "(?i)error", + "(?i)fail", + "cannot ", +] +max_lines = 40 +on_empty = "swag: docs generated" + +[[tests.swag]] +name = "doc artifacts kept" +input = """ +2024/06/01 12:00:00 Generate swagger docs.... 
+2024/06/01 12:00:00 Generating model.User +2024/06/01 12:00:00 create docs.go at docs/docs.go +2024/06/01 12:00:00 create swagger.json at docs/swagger.json +""" +expected = """2024/06/01 12:00:00 create docs.go at docs/docs.go +2024/06/01 12:00:00 create swagger.json at docs/swagger.json""" + +[[tests.swag]] +name = "parse error kept" +input = """ +2024/06/01 12:00:00 Generate swagger docs.... +2024/06/01 12:00:00 cannot find type definition: User +""" +expected = "2024/06/01 12:00:00 cannot find type definition: User" diff --git a/assets/filters/tap.toml b/assets/filters/tap.toml new file mode 100644 index 0000000..e0a46d0 --- /dev/null +++ b/assets/filters/tap.toml @@ -0,0 +1,37 @@ +[filters.tap] +description = "Keep TAP failures and summary, drop passing assertions" +match_command = "^(npx\\s+)?tap\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^ok \\d", + "^\\s*ok \\d", +] +keep_lines_matching = [ + "^not ok", + "^\\s*not ok", + "^# (failed|pass|fail|tests)", + "^1\\.\\.\\d", + "(?i)error", +] +max_lines = 80 +on_empty = "tap: all tests passed" + +[[tests.tap]] +name = "failures and summary kept" +input = """ +ok 1 - test one +not ok 2 - test two +ok 3 - test three +# failed 1 of 3 tests +""" +expected = """not ok 2 - test two +# failed 1 of 3 tests""" + +[[tests.tap]] +name = "all pass collapses" +input = """ +ok 1 - test one +ok 2 - test two +""" +expected = "tap: all tests passed" diff --git a/assets/filters/taplo.toml b/assets/filters/taplo.toml new file mode 100644 index 0000000..319fa0c --- /dev/null +++ b/assets/filters/taplo.toml @@ -0,0 +1,32 @@ +[filters.taplo] +description = "Keep taplo TOML errors/warnings and locations, collapse clean files" +match_command = "^taplo\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^error", + "^warning", + "┌─", + "(?i)error", +] +max_lines = 60 +on_empty = "taplo: ok" + +[[tests.taplo]] +name = "syntax error kept" +input = """ +error: expected `=`, `.` + 
┌─ config.toml:3:5 + │ +3 │ key value + │ ^^^^^ +""" +expected = """error: expected `=`, `.` + ┌─ config.toml:3:5""" + +[[tests.taplo]] +name = "clean file collapses" +input = "" +expected = "taplo: ok" diff --git a/assets/filters/terraform-docs.toml b/assets/filters/terraform-docs.toml new file mode 100644 index 0000000..b77bd32 --- /dev/null +++ b/assets/filters/terraform-docs.toml @@ -0,0 +1,44 @@ +[filters.terraform-docs] +description = "Keep terraform-docs generated tables and headings, drop blanks" +match_command = "^terraform-docs\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^#", + "^\\|", + "(?i)error", +] +max_lines = 120 +on_empty = "terraform-docs: no content" +token_budget = 2000 + +[[tests.terraform-docs]] +name = "markdown tables and headings kept" +input = """ +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|----------| +| region | AWS region | string | "us-east-1" | no | + +## Outputs + +| Name | Description | +|------|-------------| +| vpc_id | The VPC id | +""" +expected = """## Inputs +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|----------| +| region | AWS region | string | "us-east-1" | no | +## Outputs +| Name | Description | +|------|-------------| +| vpc_id | The VPC id |""" + +[[tests.terraform-docs]] +name = "no content collapses" +input = "" +expected = "terraform-docs: no content" diff --git a/assets/filters/terrascan.toml b/assets/filters/terrascan.toml new file mode 100644 index 0000000..2a2f386 --- /dev/null +++ b/assets/filters/terrascan.toml @@ -0,0 +1,53 @@ +[filters.terrascan] +description = "Keep terrascan violations and scan summary, drop decoration" +match_command = "^terrascan\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*-+\\s*$", +] +keep_lines_matching = [ + "Description", + "^\\s*File", + "Severity", + "Violated Policies", + "Policies Validated", + 
"^\\s*(Low|Medium|High)\\b", + "(?i)error", +] +max_lines = 100 +on_empty = "terrascan: no violations" +token_budget = 2000 + +[[tests.terrascan]] +name = "violations and summary kept" +input = """ +Violation Details - + + Description : S3 bucket Access is not restricted + File : main.tf + Severity : HIGH + +Scan Summary - + + Policies Validated : 100 + Violated Policies : 1 + Low : 0 + Medium : 0 + High : 1 +""" +expected = """ Description : S3 bucket Access is not restricted + File : main.tf + Severity : HIGH + Policies Validated : 100 + Violated Policies : 1 + Low : 0 + Medium : 0 + High : 1""" + +[[tests.terrascan]] +name = "clean scan collapses" +input = """ +Scan Summary - +""" +expected = "terrascan: no violations" diff --git a/assets/filters/textlint.toml b/assets/filters/textlint.toml new file mode 100644 index 0000000..b9a864c --- /dev/null +++ b/assets/filters/textlint.toml @@ -0,0 +1,33 @@ +[filters.textlint] +description = "Keep textlint findings and problem total, drop filename headers" +match_command = "^(npx\\s+)?textlint\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^\\s*\\d+:\\d+\\s+(error|warning)", + "✖", + "problems?\\b", + "(?i)error", +] +max_lines = 100 +on_empty = "textlint: no issues" + +[[tests.textlint]] +name = "findings and total kept" +input = """ +/docs/guide.md + 1:5 error Found double spaces no-doubled-spaces + 3:1 error Don't start with lowercase some-rule + +✖ 2 problems (2 errors, 0 warnings) +""" +expected = """ 1:5 error Found double spaces no-doubled-spaces + 3:1 error Don't start with lowercase some-rule +✖ 2 problems (2 errors, 0 warnings)""" + +[[tests.textlint]] +name = "clean run collapses" +input = "" +expected = "textlint: no issues" diff --git a/assets/filters/tsup.toml b/assets/filters/tsup.toml new file mode 100644 index 0000000..e841d77 --- /dev/null +++ b/assets/filters/tsup.toml @@ -0,0 +1,46 @@ +[filters.tsup] +description = "Keep tsup build outputs and result, drop 
per-chunk noise" +match_command = "^(npx\\s+)?tsup\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "CLI Building entry", + "CLI Using ", + "CLI Target:", + "CLI tsup ", +] +keep_lines_matching = [ + "Build success", + "Build start", + "dist/", + "(?i)error", + "(?i)fail", +] +max_lines = 60 +on_empty = "tsup: build done" + +[[tests.tsup]] +name = "build outputs and result kept" +input = """ +CLI Building entry: src/index.ts +CLI Using tsconfig: tsconfig.json +CLI tsup v8.0.0 +ESM dist/index.mjs 1.20 KB +CJS dist/index.js 1.45 KB +ESM ⚡️ Build success in 320ms +DTS dist/index.d.ts 0.50 KB +DTS ⚡️ Build success in 1200ms +""" +expected = """ESM dist/index.mjs 1.20 KB +CJS dist/index.js 1.45 KB +ESM ⚡️ Build success in 320ms +DTS dist/index.d.ts 0.50 KB +DTS ⚡️ Build success in 1200ms""" + +[[tests.tsup]] +name = "build error kept" +input = """ +CLI Building entry: src/index.ts +X [ERROR] Could not resolve "./missing" +""" +expected = """X [ERROR] Could not resolve "./missing\"""" diff --git a/assets/filters/twine.toml b/assets/filters/twine.toml new file mode 100644 index 0000000..5074d72 --- /dev/null +++ b/assets/filters/twine.toml @@ -0,0 +1,44 @@ +[filters.twine] +description = "Keep twine uploaded artifacts and view URL, drop progress bars" +match_command = "^twine\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^Uploading distributions", + "%\\s*[━─=]+", + "[━─]{3,}", +] +keep_lines_matching = [ + "^Uploading \\S+\\.(whl|tar\\.gz)", + "View at:", + "^https?://", + "(?i)error", + "(?i)fail", +] +max_lines = 40 +on_empty = "twine: uploaded" + +[[tests.twine]] +name = "uploaded files and url kept" +input = """ +Uploading distributions to https://upload.pypi.org/legacy/ +Uploading mypackage-1.0.0-py3-none-any.whl +100% ━━━━━━━━━━━━ 12.3/12.3 kB • 00:01 +Uploading mypackage-1.0.0.tar.gz +100% ━━━━━━━━━━━━ 10.1/10.1 kB • 00:00 + +View at: +https://pypi.org/project/mypackage/1.0.0/ +""" +expected = """Uploading 
mypackage-1.0.0-py3-none-any.whl +Uploading mypackage-1.0.0.tar.gz +View at: +https://pypi.org/project/mypackage/1.0.0/""" + +[[tests.twine]] +name = "auth error kept" +input = """ +Uploading distributions to https://upload.pypi.org/legacy/ +HTTPError: 403 Forbidden: Invalid or non-existent authentication +""" +expected = "HTTPError: 403 Forbidden: Invalid or non-existent authentication" diff --git a/assets/filters/typos.toml b/assets/filters/typos.toml new file mode 100644 index 0000000..bbbae3e --- /dev/null +++ b/assets/filters/typos.toml @@ -0,0 +1,35 @@ +[filters.typos] +description = "Keep typos findings (word + location), drop source-context rendering" +match_command = "^typos\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*\\|", + "^\\s*\\d+\\s*\\|", + "\\^\\^", +] +keep_lines_matching = [ + "should be", + "^\\s*-->", + "(?i)^error", +] +max_lines = 100 +on_empty = "typos: no typos found" + +[[tests.typos]] +name = "finding word and location kept, context dropped" +input = """ +error: `recieved` should be `received` + --> ./src/api/handlers.js:15:10 + | +15 | recieved = true + | ^^^^^^^^ + | +""" +expected = """error: `recieved` should be `received` + --> ./src/api/handlers.js:15:10""" + +[[tests.typos]] +name = "clean run collapses" +input = "" +expected = "typos: no typos found" diff --git a/assets/filters/vault.toml b/assets/filters/vault.toml new file mode 100644 index 0000000..91afef1 --- /dev/null +++ b/assets/filters/vault.toml @@ -0,0 +1,37 @@ +[filters.vault] +description = "Keep vault status/secret key-value rows and errors, drop blanks" +match_command = "^vault\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^-+\\s+-+$", +] +keep_lines_matching = [ + "^Key\\s+Value", + "^\\S.*\\s{2,}\\S", + "Success!", + "(?i)error", +] +max_lines = 80 +on_empty = "vault: ok" + +[[tests.vault]] +name = "status rows kept" +input = """ +Key Value +--- ----- +Seal Type shamir +Sealed false +Total Shares 5 +""" +expected = """Key 
Value +Seal Type shamir +Sealed false +Total Shares 5""" + +[[tests.vault]] +name = "error kept" +input = """ +Error checking seal status: connection refused +""" +expected = "Error checking seal status: connection refused" diff --git a/assets/filters/wdio.toml b/assets/filters/wdio.toml new file mode 100644 index 0000000..2db713a --- /dev/null +++ b/assets/filters/wdio.toml @@ -0,0 +1,37 @@ +[filters.wdio] +description = "Keep WebdriverIO spec results and failures, drop passing test lines" +match_command = "^(npx\\s+)?wdio\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", + "^\\s*✓", + "^\\s*green ", +] +keep_lines_matching = [ + "^\\s*✖", + "Spec Files:", + "\\bfailed\\b", + "\\bFailed\\b", + "(?i)error", +] +max_lines = 80 +on_empty = "wdio: all specs passed" + +[[tests.wdio]] +name = "failures and spec summary kept" +input = """ + ✓ should load the page + ✖ should submit the form + +Spec Files: 1 passed, 1 failed, 2 total (100% completed) +""" +expected = """ ✖ should submit the form +Spec Files: 1 passed, 1 failed, 2 total (100% completed)""" + +[[tests.wdio]] +name = "all pass collapses" +input = """ + ✓ should load the page + ✓ should submit the form +""" +expected = "wdio: all specs passed" diff --git a/assets/filters/write-good.toml b/assets/filters/write-good.toml new file mode 100644 index 0000000..56669cf --- /dev/null +++ b/assets/filters/write-good.toml @@ -0,0 +1,29 @@ +[filters.write-good] +description = "Keep write-good suggestions, collapse clean prose" +match_command = "^(npx\\s+)?write-good\\b" +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "on line \\d+", + "weasel word", + "passive voice", + "(?i)error", +] +max_lines = 100 +on_empty = "write-good: no suggestions" + +[[tests.write-good]] +name = "suggestions kept" +input = """ +"very" is a weasel word on line 5 at column 10 +"was written" may be passive voice on line 8 at column 3 +""" +expected = """"very" is a weasel word on line 5 at column 
10 +"was written" may be passive voice on line 8 at column 3""" + +[[tests.write-good]] +name = "clean prose collapses" +input = "" +expected = "write-good: no suggestions" diff --git a/assets/filters/xo.toml b/assets/filters/xo.toml new file mode 100644 index 0000000..6a6bedb --- /dev/null +++ b/assets/filters/xo.toml @@ -0,0 +1,36 @@ +[filters.xo] +description = "Compact XO lint output, keep findings and totals" +match_command = "^(npx\\s+)?xo\\b" +passthrough_when_emptied = true +strip_ansi = true +strip_lines_matching = [ + "^\\s*$", +] +keep_lines_matching = [ + "^\\s*✖", + "^\\s*\\S+:\\d+:\\d+", + "^\\s+\\d+:\\d+", + "\\b(error|warning)\\b", + "problems?\\b", + "(?i)error", +] +max_lines = 100 + +[[tests.xo]] +name = "findings kept" +input = """ + src/app.js:10:1 + ✖ 10:1 'x' is not defined no-undef + ✖ 15:3 Missing semicolon semi + + 2 problems (2 errors, 0 warnings) +""" +expected = """ src/app.js:10:1 + ✖ 10:1 'x' is not defined no-undef + ✖ 15:3 Missing semicolon semi + 2 problems (2 errors, 0 warnings)""" + +[[tests.xo]] +name = "clean run stays empty" +input = "" +expected = "" diff --git a/src/usage.rs b/src/usage.rs index 57025c2..f249483 100644 --- a/src/usage.rs +++ b/src/usage.rs @@ -490,9 +490,9 @@ fn report_blocks(records: &[Record], opts: &Options) -> Result<()> { } fn floor_hour(ts: DateTime<Local>) -> DateTime<Local> { - Local - .with_ymd_and_hms(ts.year(), ts.month(), ts.day(), ts.hour(), 0, 0) - .single() + ts.with_minute(0) + .and_then(|t| t.with_second(0)) + .and_then(|t| t.with_nanosecond(0)) .unwrap_or(ts) } @@ -590,8 +590,8 @@ fn print_statusline(records: &[Record], mode: CostMode) { } fn short(s: &str) -> String { - if s.len() > 20 { - format!("{}…", &s[..19]) + if s.chars().count() > 20 { + format!("{}…", s.chars().take(19).collect::<String>()) } else { s.to_string() } @@ -615,4 +615,4 @@ fn fmt_cost(c: f64) -> String { } else { format!("${:.4}", c) } -} +} \ No newline at end of file