CLI tools for AI agent workflows: structure-aware code search and web content extraction.
See the Wiki for more specifics.
Tree-sitter powered code search with a simple DSL for finding calls, imports, and definitions across your codebase.
# Find all axios.get/post calls
ast-find --lang ts,js --query 'call(prop=/^(get|post)$/)' --within src/
# Find all imports of a module
ast-find --lang py --query 'import(module=/^requests$/)'
# Find function definitions
ast-find --lang js --query 'def(name=/^handle/)'

Fetch web pages, extract main content, and convert to clean Markdown for LLM consumption.
# Fetch a single page
web-get "https://example.com/article"
# Batch fetch with stdin
cat urls.txt | web-get --concurrency 8
# Use CSS selector for precise extraction
web-get "https://blog.example.com" --selector "article, .post-content"

Use the bundled installer to build both binaries in release mode and copy them
to /usr/local/bin (pass --prefix/--destdir to customize the target):
git clone https://github.com/Chris-Cullins/agent-tools.git
cd agent-tools
./install.sh

To install each crate manually:
cargo install --path crates/ast-find
cargo install --path crates/web-get

Both tools emit NDJSON (newline-delimited JSON) for easy parsing:
ast-find:
{"type":"match","lang":"javascript","path":"./src/api.js","start_line":42,"end_line":42,"chunk_id":"abc123...","score":1.0,"excerpt":"...code...","capture":{"callee":"get","object":"axios"}}

web-get:
{"type":"document","url":"https://example.com","title":"Page Title","text_md":"# Heading\n\nContent...","word_count":523,"links":["https://..."],"hash":"blake3hex"}

For more information on the tree-sitter output that ast-find reads, see the Tree-sitter docs.
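Because each line is a standalone JSON object, a downstream script can process the stream one record at a time. The following is a minimal Python sketch that assumes only the field names visible in the samples above; the helper name `parse_ndjson` is hypothetical and not part of either tool, and the sample records are abbreviated copies of the ones shown.

```python
import json
from typing import Iterable, Iterator


def parse_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty NDJSON line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Sample records abbreviated from the output shown above.
sample = [
    '{"type":"match","lang":"javascript","path":"./src/api.js",'
    '"start_line":42,"end_line":42,"capture":{"callee":"get","object":"axios"}}',
    '{"type":"document","url":"https://example.com","title":"Page Title","word_count":523}',
]

records = list(parse_ndjson(sample))

# Split the stream by the "type" field, as the two tools use different shapes.
matches = [r for r in records if r["type"] == "match"]
documents = [r for r in records if r["type"] == "document"]

for m in matches:
    print(f'{m["path"]}:{m["start_line"]} {m["capture"]}')
for d in documents:
    print(f'{d["url"]} ({d["word_count"]} words)')
```

In a real pipeline, `sys.stdin` could be passed to `parse_ndjson` in place of the sample list.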
Simple pattern matching with regex predicates:
| Pattern | Example | Matches |
|---|---|---|
| `call(callee=/regex/)` | `call(callee=/^fetch$/)` | Function calls |
| `call(prop=/regex/)` | `call(prop=/^(get\|post)$/)` | Method calls |
| `call(text=/regex/)` | `call(text=/axios\.get\(.*Authorization/)` | Full call text (multi-line) |
| `import(module=/regex/)` | `import(module=/^axios/)` | Import statements |
| `def(name=/regex/)` | `def(name=/^handle/)` | Function/class definitions |
Multi-line calls, imports, or definitions often defeat plain regex searches. Combine AST matching with the text/code predicate to stay accurate without giving up span-aware matching:
ast-find --lang js --query 'call(text=/axios\.get\(.*Authorization/)'

The query above only returns call_expression nodes whose full source (including nested options objects) mentions an Authorization header anywhere inside the call. Because the text predicate automatically turns on dot-all mode, .* will cross line breaks out of the box. The alias code=/.../ behaves identically if you prefer a more descriptive keyword.
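The effect of dot-all mode can be reproduced with any regex engine. The sketch below uses Python's `re` module purely as a stand-in (it is not ast-find's own engine), and the JavaScript call text is an invented sample.

```python
import re

# Invented multi-line call, similar to the source text a text predicate would see.
call_text = """axios.get(url, {
  headers: { Authorization: token },
})"""

pattern = r"axios\.get\(.*Authorization"

# Without dot-all, '.' stops at the first newline, so the search fails.
without_dotall = re.search(pattern, call_text)

# With dot-all, '.' also matches newlines, so the header on line 2 is reached.
with_dotall = re.search(pattern, call_text, flags=re.DOTALL)

print("matched without dot-all:", without_dotall is not None)
print("matched with dot-all:", with_dotall is not None)
```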
Combine queries using boolean operators:
- and(expr, ...): results present in every operand
- or(expr, ...): results from any operand
- not(expr): removes operand matches from sibling expressions
Example: and(call(prop=/log/), not(call(object=/console/)))
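One way to read these operators is as set operations over match results. The Python sketch below is an illustration under that assumption, not a description of how ast-find evaluates queries: matches are keyed by file path and line span, and the names `Match`, `q_and`, `q_or`, and `q_and_not` are hypothetical.

```python
from typing import NamedTuple, Set


class Match(NamedTuple):
    path: str
    start_line: int
    end_line: int


def q_and(*operands: Set[Match]) -> Set[Match]:
    """and(...): keep matches present in every operand."""
    return set.intersection(*operands) if operands else set()


def q_or(*operands: Set[Match]) -> Set[Match]:
    """or(...): keep matches from any operand."""
    return set().union(*operands)


def q_and_not(keep: Set[Match], drop: Set[Match]) -> Set[Match]:
    """and(keep, not(drop)): remove drop's matches from its sibling."""
    return keep - drop


# Invented results for call(prop=/log/) and call(object=/console/).
log_calls = {Match("a.js", 10, 10), Match("b.js", 5, 5)}
console_calls = {Match("a.js", 10, 10), Match("a.js", 20, 20)}

# and(call(prop=/log/), not(call(object=/console/)))
result = q_and_not(log_calls, console_calls)
print(sorted(result))
```

Under this reading, the example query keeps log-style calls except those made on a console object.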
Code Search + Context:
ast-find --lang ts --query 'call(prop=/^deprecated/)' --max-results 50 | \
jq -r '.path' | sort -u > files_to_migrate.txt

Documentation Retrieval:
echo "https://docs.example.com/api
https://docs.example.com/auth" | \
web-get --selector "article" | \
jq -r 'select(.type=="document") | .text_md' > combined_docs.md

Dependency Audit:
ast-find --lang js --query 'import(module=/^[^\\.\/]/)' --max-results 1000 | \
jq -r '.capture.module' | sort -u > all_dependencies.txt

ast-find:
- Go language adapter
- Rust language adapter
- Java language adapter
- Boolean combinators (And, Or, Not)
- Incremental caching
- Multi-line pattern matching
web-get:
- Advanced Readability scoring
- PDF text extraction (beyond stubs)
- Image alt-text extraction
- Table → Markdown table conversion
git-diff-json — Structured diffs and hunks
Emit per-file changes and per-hunk spans between commits/branches/working tree. Avoids fragile parsing of patch text; directly yields file-level and hunk-level structures with stable IDs and spans.
repo-ls — Repo-aware file inventory
Walk repo honoring .gitignore and emit files with metadata: size, lang, hash, line counts. Single pass, deterministic, with language detection and content hashes for deduplication.
dep-scan — Dependency manifest normalizer
Parse manifests and lockfiles (package.json, Cargo.toml, pyproject.toml, etc.) into a unified dependency list. Cross-ecosystem, structured graph with resolved versions and sources.
doc-index — Markdown/MDX indexer and link graph
Parse .md/.mdx to extract headings, anchors, links, and per-section spans. Structured table of contents and document graph; per-section chunking with stable IDs.
link-check — Concurrent HTTP link validator
Verify internal/external links with redirects, status, canonical, content-type. Handles concurrency, timeouts, and outputs structured results.
api-probe — JSON API sampler and shape extractor
Perform HEAD/GET/POST with sample payloads and emit structured response metadata plus inferred JSON shape (keys/types). Quickly map REST endpoint response shapes.
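As an illustration of what an inferred JSON shape could look like, here is a minimal Python sketch. The function name `infer_shape`, the type labels, and the sample payload are all assumptions made for this example; the planned tool's actual output format is not specified above.

```python
import json
from typing import Any


def infer_shape(value: Any) -> Any:
    """Replace every leaf value with a type label, keeping keys and nesting."""
    if isinstance(value, dict):
        return {key: infer_shape(item) for key, item in value.items()}
    if isinstance(value, list):
        # Assumption: a list is described by the shape of its first element.
        return [infer_shape(value[0])] if value else []
    if isinstance(value, bool):  # checked before int, since bool is an int subclass
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if value is None:
        return "null"
    return type(value).__name__


# Invented response body.
body = json.loads('{"id": 7, "name": "x", "active": true, "tags": ["a"], "owner": null}')
shape = infer_shape(body)
print(json.dumps(shape))
```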
code-chop — Deterministic chunker for code/text
Split files into stable, size-bounded chunks for embedding/context with language-aware boundaries. Stable chunk IDs and boundaries; reproducible spans.
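The idea of stable chunk IDs can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it splits on fixed line windows rather than language-aware boundaries, uses SHA-256 as a stand-in hash, and the names `Chunk` and `chop` are hypothetical.

```python
import hashlib
from typing import List, NamedTuple


class Chunk(NamedTuple):
    chunk_id: str
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive
    text: str


def chop(path: str, text: str, max_lines: int = 40) -> List[Chunk]:
    """Split text into fixed-size line windows with content-derived IDs."""
    lines = text.splitlines()
    chunks = []
    for start in range(0, len(lines), max_lines):
        body = "\n".join(lines[start:start + max_lines])
        # The ID depends only on path, offset, and content, so reruns agree.
        key = "\x00".join([path, str(start), body]).encode("utf-8")
        chunk_id = hashlib.sha256(key).hexdigest()[:16]
        end = min(start + max_lines, len(lines))
        chunks.append(Chunk(chunk_id, start + 1, end, body))
    return chunks


sample_text = "l1\nl2\nl3\nl4\nl5"
first_run = chop("a.py", sample_text, max_lines=2)
second_run = chop("a.py", sample_text, max_lines=2)
for chunk in first_run:
    print(chunk.chunk_id, chunk.start_line, chunk.end_line)
```

Because the ID is derived from the inputs alone, running the chunker twice on unchanged files yields identical IDs and spans.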
secret-find — Deterministic secrets scanner
Scan repo for likely secrets using curated detectors with entropy and context windows. Structured findings with types and confidence; consistent ignores and sorting.
task-list — Discover runnable project tasks
Enumerate build/test tasks from Makefile, package.json scripts, Cargo.toml workspace, and common CI configs. Normalizes disparate task definitions into a single structured list.
MIT