AGENTS.md

This file provides guidance to Codex (Codex.ai/code) when working with code in this repository.

Project Overview

C# SDK for Firecrawl, the web scraping and crawling API. The SDK is auto-generated from the official Firecrawl OpenAPI specification using AutoSDK and published as the NuGet package Firecrawl. The repository also includes a .NET CLI tool (Firecrawl.Cli) for command-line scraping and crawling.

Build Commands

# Build the solution
dotnet build Firecrawl.slnx

# Build for release (also produces NuGet packages)
dotnet build Firecrawl.slnx -c Release

# Run integration tests (requires FIRECRAWL_API_KEY env var)
dotnet test src/tests/IntegrationTests/Firecrawl.IntegrationTests.csproj

# Regenerate SDK from OpenAPI spec
cd src/libs/Firecrawl && ./generate.sh

# Install and use the CLI tool
dotnet tool install -g Firecrawl.Cli
firecrawl auth <API_KEY>
firecrawl scrape https://example.com
firecrawl crawl https://example.com --limit 5
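
Because the integration tests need FIRECRAWL_API_KEY, it can help to fail fast when the variable is missing. The following is a minimal sketch, not part of the repository: require_api_key is a hypothetical helper name, and only the variable name and the test project path come from the commands above.

```shell
# Hypothetical helper (not part of this repo): fail fast when the API key is missing.
require_api_key() {
  if [ -z "${FIRECRAWL_API_KEY:-}" ]; then
    echo "FIRECRAWL_API_KEY is not set; the integration tests need it" >&2
    return 1
  fi
}

# Usage from the repository root (uncomment to run):
# require_api_key && dotnet test src/tests/IntegrationTests/Firecrawl.IntegrationTests.csproj
```

The check only confirms that the variable is non-empty; it does not validate the key against the API.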

Architecture

Code Generation Pipeline

The SDK code in src/libs/Firecrawl/Generated/ is entirely auto-generated -- do not manually edit files there.

  1. src/libs/Firecrawl/openapi.yaml -- the Firecrawl OpenAPI spec (fetched from the official Firecrawl repo)
  2. src/libs/Firecrawl/generate.sh -- orchestrates: download spec, run AutoSDK CLI, output to Generated/
  3. CI auto-updates the spec and creates PRs if changes are detected
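
To check locally whether a regeneration produced any changes, similar in spirit to what CI does before opening a PR, something like the sketch below may be enough. It assumes git is available and that the command runs inside the repository; has_changes is a hypothetical helper name, and the actual CI workflow may detect changes differently.

```shell
# Hypothetical helper (not part of this repo): exit 0 when the given path
# contains modified or untracked files, exit 1 when it is clean.
has_changes() {
  # --porcelain prints one line per changed or untracked entry; empty output means clean.
  [ -n "$(git status --porcelain -- "$1")" ]
}

# Usage from the repository root (uncomment to run):
# (cd src/libs/Firecrawl && ./generate.sh)
# if has_changes src/libs/Firecrawl; then echo "spec or generated code changed"; fi
```

Checking the whole src/libs/Firecrawl directory covers both openapi.yaml and the Generated/ output in one call.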

Project Layout

| Project | Purpose |
| --- | --- |
| src/libs/Firecrawl/ | Main SDK library (FirecrawlClient) |
| src/libs/Firecrawl.Cli/ | .NET CLI tool for scraping/crawling (auth, scrape, crawl, map commands) |
| src/tests/IntegrationTests/ | Integration tests against real Firecrawl API |
| src/helpers/GenerateDocs/ | Documentation generator from integration tests |
| src/helpers/TrimmingHelper/ | NativeAOT/trimming compatibility validator |

Hand-Written Extensions

| File | Purpose |
| --- | --- |
| CrawlClient.WaitJob.cs | Polling helper to wait for crawl jobs to complete |

CLI Tool Structure

The Firecrawl.Cli project provides a command-line interface with these commands:

| Command | File |
| --- | --- |
| firecrawl auth | Commands/AuthCommand.cs |
| firecrawl scrape | Commands/ScrapeCommand.cs |
| firecrawl crawl | Commands/CrawlCommand.cs |
| firecrawl map | Commands/MapCommand.cs |

Build Configuration

  • Target: net10.0 (single target)
  • Language: C# preview with nullable reference types
  • Signing: Strong-named assemblies via src/key.snk
  • Versioning: Semantic versioning from git tags (v prefix) via MinVer
  • Analysis: All .NET analyzers enabled, AOT/trimming compatibility enforced
  • Testing: MSTest + FluentAssertions
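
As a rough illustration of the tag-based versioning above: for a commit tagged exactly, the package version is the tag with its v prefix removed. The sketch below shows only that mapping; version_from_tag is a hypothetical helper, not something MinVer or this repository provides, and MinVer itself also derives pre-release versions for untagged commits, which this sketch ignores.

```shell
# Hypothetical helper (not part of this repo or MinVer): map a release tag
# such as v1.2.3 to the version string 1.2.3.
version_from_tag() {
  # ${1#v} strips a single leading "v" if present and leaves other input unchanged.
  printf '%s\n' "${1#v}"
}

# Usage (uncomment to run):
# git tag v1.2.3
# version_from_tag "$(git describe --tags --abbrev=0)"
```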

Key Conventions

  • The client class is named FirecrawlClient
  • The namespace is Firecrawl
  • To get crawl results, call client.Crawling.WaitJobAsync(), which polls until the crawl job completes

CI/CD

  • Uses shared workflows from HavenDV/workflows repo
  • Dependabot updates NuGet packages weekly (auto-merged)
  • Documentation deployed to GitHub Pages via MkDocs Material