57 changes: 0 additions & 57 deletions skills/firecrawl-agent/SKILL.md

This file was deleted.

28 changes: 21 additions & 7 deletions skills/firecrawl-cli/SKILL.md
Original file line number Diff line number Diff line change
@@ -71,17 +71,31 @@ For detailed command reference, run `firecrawl <command> --help`.

## When to Load References

- **Searching the web or finding sources first** -> [firecrawl-search](../firecrawl-search/SKILL.md)
- **Scraping a known URL** -> [firecrawl-scrape](../firecrawl-scrape/SKILL.md)
- **Finding URLs on a known site** -> [firecrawl-map](../firecrawl-map/SKILL.md)
- **Bulk extraction from a docs section or site** -> [firecrawl-crawl](../firecrawl-crawl/SKILL.md)
- **AI-powered structured extraction from complex sites** -> [firecrawl-agent](../firecrawl-agent/SKILL.md)
- **Clicks, forms, login, pagination, or post-scrape browser actions** -> [firecrawl-instruct](../firecrawl-instruct/SKILL.md)
- **Downloading a site to local files** -> [firecrawl-download](../firecrawl-download/SKILL.md)
- **Searching the web or finding sources first** -> [references/search.md](references/search.md)
- **Scraping a known URL** -> [references/scrape.md](references/scrape.md)
- **Finding URLs on a known site** -> [references/map.md](references/map.md)
- **Bulk extraction from a docs section or site** -> [references/crawl.md](references/crawl.md)
- **AI-powered structured extraction from complex sites** -> [references/agent.md](references/agent.md)
- **Clicks, forms, login, pagination, or post-scrape browser actions** -> [references/interact.md](references/interact.md)
- **Downloading a site to local files** -> [references/download.md](references/download.md)
- **Install, auth, or setup problems** -> [rules/install.md](rules/install.md)
- **Output handling and safe file-reading patterns** -> [rules/security.md](rules/security.md)
- **Integrating Firecrawl into an app, adding `FIRECRAWL_API_KEY` to `.env`, or choosing endpoint usage in product code** -> install the Firecrawl skills repo with `npx skills add firecrawl/skills`

## References

| Need | Reference |
| ---- | --------- |
| Search the web or discover sources | [references/search.md](references/search.md) |
| Scrape a known URL | [references/scrape.md](references/scrape.md) |
| Find URLs on a known site | [references/map.md](references/map.md) |
| Crawl a site section | [references/crawl.md](references/crawl.md) |
| Extract structured data with AI | [references/agent.md](references/agent.md) |
| Interact with a page after scraping | [references/interact.md](references/interact.md) |
| Download a site to local files | [references/download.md](references/download.md) |
| Fix install or auth issues | [rules/install.md](rules/install.md) |
| Review output-handling guidance | [rules/security.md](rules/security.md) |

## Output & Organization

Unless the user asks for results directly in context, write them to `.firecrawl/` with `-o`, and add `.firecrawl/` to `.gitignore`. Always quote URLs, since the shell interprets `?` and `&` as special characters.
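
These conventions can be sketched as a short shell setup (the example URL is illustrative, not from the skill files):

```shell
# Create the output directory and keep it out of version control
mkdir -p .firecrawl
grep -qxF '.firecrawl/' .gitignore 2>/dev/null || echo '.firecrawl/' >> .gitignore

# Quote the URL so the shell does not interpret ? and &
if command -v firecrawl >/dev/null; then
  firecrawl scrape "https://example.com/search?q=docs&page=2" -o .firecrawl/page.md
fi
```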
35 changes: 35 additions & 0 deletions skills/firecrawl-cli/references/agent.md
@@ -0,0 +1,35 @@
# Agent Reference

Use this when manual scraping would require navigating many pages or when the user wants structured JSON output with a schema.

## Quick start

```bash
# Extract structured data
firecrawl agent "extract all pricing tiers" --wait -o .firecrawl/pricing.json

# With a JSON schema
firecrawl agent "extract products" --schema '{"type":"object","properties":{"name":{"type":"string"},"price":{"type":"number"}}}' --wait -o .firecrawl/products.json

# Focus on specific pages
firecrawl agent "get feature list" --urls "<url>" --wait -o .firecrawl/features.json
```

## Options

| Option | Description |
| ---------------------- | ----------------------------------------- |
| `--urls <urls>` | Starting URLs for the agent |
| `--model <model>` | Model to use: spark-1-mini or spark-1-pro |
| `--schema <json>` | JSON schema for structured output |
| `--schema-file <path>` | Path to JSON schema file |
| `--max-credits <n>` | Credit limit for this agent run |
| `--wait` | Wait for agent to complete |
| `--pretty` | Pretty print JSON output |
| `-o, --output <path>` | Output file path |

## Tips

- Use `--schema` for predictable output whenever possible.
- Agent runs usually cost more and take longer than plain scrape or crawl.
- For single-page extraction, prefer `scrape` first.
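
For schemas of any size, `--schema-file` keeps the command line readable. A minimal sketch, reusing the same product fields as the quick start (the `firecrawl` call is guarded so the sketch is a no-op when the CLI is absent):

```shell
mkdir -p .firecrawl

# Write the schema once and reuse it across runs
cat > .firecrawl/product-schema.json <<'EOF'
{
  "type": "object",
  "properties": {
    "name":  { "type": "string" },
    "price": { "type": "number" }
  }
}
EOF

if command -v firecrawl >/dev/null; then
  firecrawl agent "extract products" --schema-file .firecrawl/product-schema.json --wait -o .firecrawl/products.json
fi
```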
37 changes: 37 additions & 0 deletions skills/firecrawl-cli/references/crawl.md
@@ -0,0 +1,37 @@
# Crawl Reference

Use this when you need content from many pages on one site, such as a docs section or large content area.

## Quick start

```bash
# Crawl a docs section
firecrawl crawl "<url>" --include-paths /docs --limit 50 --wait -o .firecrawl/crawl.json

# Full crawl with depth limit
firecrawl crawl "<url>" --max-depth 3 --wait --progress -o .firecrawl/crawl.json

# Check status of a running crawl
firecrawl crawl <job-id>
```

## Options

| Option | Description |
| ------------------------- | ------------------------------------------- |
| `--wait` | Wait for crawl to complete before returning |
| `--progress` | Show progress while waiting |
| `--limit <n>` | Max pages to crawl |
| `--max-depth <n>` | Max link depth to follow |
| `--include-paths <paths>` | Only crawl URLs matching these paths |
| `--exclude-paths <paths>` | Skip URLs matching these paths |
| `--delay <ms>` | Delay between requests |
| `--max-concurrency <n>` | Max parallel crawl workers |
| `--pretty` | Pretty print JSON output |
| `-o, --output <path>` | Output file path |

## Tips

- Use `--wait` when you need results in the current session.
- Scope crawls with `--include-paths` instead of crawling an entire domain unnecessarily.
- Large crawls consume credits per page, so check `firecrawl credit-usage` first.
36 changes: 36 additions & 0 deletions skills/firecrawl-cli/references/download.md
@@ -0,0 +1,36 @@
# Download Reference

Use this when the goal is to save an entire site or section as local files rather than just extract content once.

## Quick start

```bash
# Interactive wizard
firecrawl download https://docs.example.com

# With screenshots
firecrawl download https://docs.example.com --screenshot --limit 20 -y

# Filter to specific sections
firecrawl download https://docs.example.com --include-paths "/features,/sdks"

# Skip translations
firecrawl download https://docs.example.com --exclude-paths "/zh,/ja,/fr,/es,/pt-BR"
```

## Options

| Option | Description |
| ------------------------- | ------------------------------------------- |
| `--limit <n>` | Max pages to download |
| `--search <query>` | Filter URLs by search query |
| `--include-paths <paths>` | Only download matching paths |
| `--exclude-paths <paths>` | Skip matching paths |
| `--allow-subdomains` | Include subdomain pages |
| `-y` | Skip confirmation prompt in automated flows |

## Tips

- Always pass `-y` in automated flows to skip the confirmation prompt.
- All scrape options still work with download.
- Use download when the end goal is local files, not just extraction output.
37 changes: 37 additions & 0 deletions skills/firecrawl-cli/references/interact.md
@@ -0,0 +1,37 @@
# Interact Reference

Use this when content requires clicks, forms, login, pagination, or multi-step navigation and plain scrape is not enough.

## Quick start

```bash
# 1. Scrape a page
firecrawl scrape "<url>"

# 2. Interact with natural language
firecrawl interact --prompt "Click the login button"
firecrawl interact --prompt "Fill in the email field with test@example.com"

# 3. Or use code for precise control
firecrawl interact --code "agent-browser click @e5" --language bash

# 4. Stop the session
firecrawl interact stop
```

## Options

| Option | Description |
| --------------------- | -------------------------------------- |
| `--prompt <text>` | Natural language instruction |
| `--code <code>` | Code to execute in the browser session |
| `--language <lang>` | Language for code: bash, python, node |
| `--timeout <seconds>` | Execution timeout |
| `--scrape-id <id>` | Target a specific scrape |
| `-o, --output <path>` | Output file path |

## Tips

- Always scrape first. `interact` needs a scrape ID from a previous scrape.
- Use `firecrawl interact stop` to free resources when done.
- Do not use interact for general web search. Use `search` for that.
29 changes: 29 additions & 0 deletions skills/firecrawl-cli/references/map.md
@@ -0,0 +1,29 @@
# Map Reference

Use this when the user knows the site but not the exact page, or wants to list and filter URLs before scraping.

## Quick start

```bash
# Find a specific page on a large site
firecrawl map "<url>" --search "authentication" -o .firecrawl/filtered.txt

# Get all URLs
firecrawl map "<url>" --limit 500 --json -o .firecrawl/urls.json
```

## Options

| Option | Description |
| --------------------------------- | ---------------------------- |
| `--limit <n>` | Max number of URLs to return |
| `--search <query>` | Filter URLs by search query |
| `--sitemap <include\|skip\|only>` | Sitemap handling strategy |
| `--include-subdomains` | Include subdomain URLs |
| `--json` | Output as JSON |
| `-o, --output <path>` | Output file path |

## Tips

- `map --search` plus `scrape` is a common pattern for large docs sites.
- If the goal is to download or extract lots of pages, graduate to `crawl` or `download`.
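
The map-then-scrape pattern from the first tip can be sketched as below. The `.links[0]` jq path is an assumption about the map JSON shape, not confirmed by this reference; check your actual output before relying on it.

```shell
mkdir -p .firecrawl

if command -v firecrawl >/dev/null && command -v jq >/dev/null; then
  # Narrow the site to pages about authentication
  firecrawl map "https://docs.example.com" --search "authentication" --json -o .firecrawl/auth-urls.json
  # Scrape the top match; .links[0] is an assumed field name
  url=$(jq -r '.links[0]' .firecrawl/auth-urls.json)
  firecrawl scrape "$url" -o .firecrawl/auth.md
fi
```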
38 changes: 38 additions & 0 deletions skills/firecrawl-cli/references/scrape.md
@@ -0,0 +1,38 @@
# Scrape Reference

Use this when you already have a URL and want content from a static page or JS-rendered SPA.

## Quick start

```bash
# Basic markdown extraction
firecrawl scrape "<url>" -o .firecrawl/page.md

# Main content only
firecrawl scrape "<url>" --only-main-content -o .firecrawl/page.md

# Wait for JS to render
firecrawl scrape "<url>" --wait-for 3000 -o .firecrawl/page.md

# Multiple URLs
firecrawl scrape https://example.com https://example.com/blog https://example.com/docs
```

## Options

| Option | Description |
| ------------------------ | ---------------------------------------------------------------- |
| `-f, --format <formats>` | Output formats: markdown, html, rawHtml, links, screenshot, json |
| `-Q, --query <prompt>` | Ask a question about the page content (5 credits) |
| `-H` | Include HTTP headers in output |
| `--only-main-content` | Strip nav, footer, sidebar and keep the main content |
| `--wait-for <ms>` | Wait for JS rendering before scraping |
| `--include-tags <tags>` | Only include these HTML tags |
| `--exclude-tags <tags>` | Exclude these HTML tags |
| `-o, --output <path>` | Output file path |

## Tips

- Prefer plain scrape over `--query` when you want the full page content to inspect yourself.
- Try scrape before interact. Escalate only when you truly need clicks, forms, login, or pagination.
- Multiple URLs are scraped concurrently. Check `firecrawl --status` for the current concurrency limit.
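
When the URL list lives in a file, the multi-URL form above can be driven with `xargs`; a single invocation lets the CLI scrape them concurrently (URLs here are illustrative):

```shell
mkdir -p .firecrawl
printf '%s\n' \
  "https://example.com" \
  "https://example.com/blog" \
  "https://example.com/docs" > .firecrawl/urls.txt

if command -v firecrawl >/dev/null; then
  # One invocation, all URLs: scraped concurrently by the CLI
  xargs firecrawl scrape < .firecrawl/urls.txt
fi
```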
@@ -1,21 +1,6 @@
---
name: firecrawl-search
description: |
Web search with full page content extraction. Use this skill whenever the user asks to search the web, find articles, research a topic, look something up, find recent news, discover sources, or says "search for", "find me", "look up", "what are people saying about", or "find articles about". Returns real search results with optional full-page markdown — not just snippets. Provides capabilities beyond Claude's built-in WebSearch.
allowed-tools:
- Bash(firecrawl *)
- Bash(npx firecrawl *)
---
# Search Reference

# firecrawl search

Web search with optional content scraping. Returns search results as JSON, optionally with full page content.

## When to use

- You don't have a specific URL yet
- You need to find pages, answer questions, or discover sources
- First step in the [workflow escalation pattern](firecrawl-cli): search → scrape → map → crawl → interact
Use this when you do not have a specific URL yet and need to find pages, answer questions, discover sources, or gather recent results.

## Quick start

@@ -47,13 +32,6 @@ firecrawl search "your query" --sources news --tbs qdr:d -o .firecrawl/news.json

## Tips

- **`--scrape` fetches full content** — don't re-scrape URLs from search results. This saves credits and avoids redundant fetches.
- `--scrape` already fetches full page content, so do not re-scrape the same result URLs.
- Always write results to `.firecrawl/` with `-o` to avoid context window bloat.
- Use `jq` to extract URLs or titles: `jq -r '.data.web[].url' .firecrawl/search.json`
- Naming convention: `.firecrawl/search-{query}.json` or `.firecrawl/search-{query}-scraped.json`

## See also

- [firecrawl-scrape](../firecrawl-scrape/SKILL.md) — scrape a specific URL
- [firecrawl-map](../firecrawl-map/SKILL.md) — discover URLs within a site
- [firecrawl-crawl](../firecrawl-crawl/SKILL.md) — bulk extract from a site
- Use `jq -r '.data.web[].url' .firecrawl/search.json` to extract URLs quickly.
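
The jq path in that tip can be exercised against a minimal sample file, assuming the `data.web[]` result shape it implies:

```shell
mkdir -p .firecrawl

# Minimal sample in the assumed search-result shape
cat > .firecrawl/search-sample.json <<'EOF'
{"data": {"web": [
  {"url": "https://example.com/a", "title": "A"},
  {"url": "https://example.com/b", "title": "B"}
]}}
EOF

# Same jq path as the tip above
if command -v jq >/dev/null; then
  jq -r '.data.web[].url' .firecrawl/search-sample.json
fi
```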