FuzzPilot is an experimental C/C++ controller for AFL++ fuzzing campaigns. It adds telemetry collection, plateau detection, model-backed planning, short micro-campaigns, recipe-guided mutation, and optional reverse-engineering context from Ghidra.
The core design rule is simple: model calls never run in the AFL++ custom mutator hot path. The model observes telemetry and proposes strategy between fuzzing windows; the mutator stays local and fast.
- Documentation
- Current status
- Repository layout
- Concepts
- Dependencies
- Clone and submodules
- Build
- Ubuntu/x86 Docker smoke
- Ubuntu/x86 cloud notes
- Test
- Configuration overview
- Quickstart: fixture-only dry run
- Quickstart: cJSON AFL++ smoke
- Using a real model API
- Running full FuzzPilot loops
- M6 experiment matrix
- Recording experiment metadata
- Interpreting outputs
- Target development guide
- Security and secret handling
- Troubleshooting
- Cleanup
- CI
- Version: 0.1.0
- Maturity: experimental research prototype
- Primary development platform: macOS Apple Silicon
- CI platform: Linux on GitHub Actions
- Bundled real-world targets: cJSON and libpng
- Model provider support: OpenAI-compatible chat completions endpoints
- Confirmed local smoke coverage:
  - CMake build
  - CTest suite
  - process capture and env override smoke
  - JSON proposal validation smoke
  - M6 experiment matrix generation
  - dry-run orchestration
  - AFL++ cJSON smoke with the custom mutator
  - real DeepSeek/OpenAI-compatible agent smoke
- include/ - public C++ headers
- src/ - controller, CLI, model gateway, telemetry, storage, process runner
- mutators/ - AFL++ custom mutator implementation
- configs/ - example configs
- db/ - SQLite schema
- docs/ - project brief, quickstart, and evaluation plan
- experiments/targets/ - target harnesses, seeds, and per-target configs
- tests/ - fixtures used by local smoke tests
- tools/ - small helper binaries compiled for tests
- .github/workflows/ci.yml - Linux build/test workflow
FuzzPilot has a few moving pieces. Knowing these names makes the CLI output much easier to read.
- Main campaign: the primary AFL++ process that fuzzes the target and produces telemetry.
- Telemetry: parsed AFL++ fuzzer_stats plus mutator telemetry such as recipe hits and misses.
- Plateau: a period where execution continues but coverage growth slows or stops.
- Blackboard: compact JSON context passed to model-backed agents. It includes plateau metadata, AFL metrics, static-analysis context, previous decisions, and agent memory.
- Agent decision: one model-backed proposal from an agent such as CoordinatorAgent, DictionaryAgent, or MutatorAgent.
- Micro-campaign: a short AFL++ run launched from a corpus snapshot to compare interventions such as dictionary probes or per-seed recipe probes.
- Recipe: structured mutation guidance consumed by the custom mutator.
- Promotion: selecting the best micro-campaign result and persisting its recipe or intervention as the next fuzzing strategy.
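The plateau concept above can be illustrated with a short sketch. It parses two AFL++ fuzzer_stats snapshots (plain `key : value` text) and flags a plateau when executions advanced but coverage barely moved. The helper names, the use of `bitmap_cvg` as the coverage signal, and the 0.01-point threshold are assumptions made for illustration; this is not FuzzPilot's actual detector.

```python
def parse_fuzzer_stats(text: str) -> dict[str, str]:
    """Parse AFL++ fuzzer_stats text ("key : value" per line) into a dict."""
    stats = {}
    for line in text.splitlines():
        # Split on the first colon only, so values that contain colons survive.
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def is_plateau(older: dict[str, str], newer: dict[str, str],
               min_cvg_gain: float = 0.01) -> bool:
    """Hypothetical rule: execs grew, but bitmap coverage gained < min_cvg_gain points."""
    execs_delta = int(newer["execs_done"]) - int(older["execs_done"])
    cvg_delta = (float(newer["bitmap_cvg"].rstrip("%"))
                 - float(older["bitmap_cvg"].rstrip("%")))
    return execs_delta > 0 and cvg_delta < min_cvg_gain
```

Requiring a positive execution delta separates a plateau (the fuzzer is running but not learning) from a stalled or stopped process, which needs a different response.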
Install these before building:
- CMake 3.22 or newer
- Ninja or another CMake generator
- SQLite3 development headers and library
- C++20 compiler
- AFL++
- Git
- Bash
- curl
Optional but useful:
- Ghidra headless if using static_analysis.backend=ghidra (the only supported backend)
- gh for GitHub PR and CI workflows
- jq or sqlite3 for inspecting run artifacts
brew install cmake ninja sqlite afl++ git openjdk
sudo apt-get update
sudo apt-get install -y afl++ build-essential clang cmake curl git \
libpng-dev libsqlite3-dev ninja-build pkg-config \
python3 python3-pip sqlite3 zlib1g-dev openjdk-17-jdk-headless
sudo apt-get install -y python3-matplotlib python3-pandas python3-yaml
# For Ghidra (required by full-agent ablation):
sudo FUZZPILOT_GHIDRA_INSTALL_DIR=/opt scripts/install_ghidra_ubuntu.sh
scripts/fuzzpilot_docker.sh build
scripts/fuzzpilot_docker.sh preflight
scripts/fuzzpilot_docker.sh smoke
FUZZPILOT_MODEL_API_KEY="$KEY" \
scripts/fuzzpilot_docker.sh run-batch --exp E1a --parallel 4
The wrapper auto-selects linux/amd64 or linux/arm64 from the host
architecture and only mounts results/ into the container. Set
FUZZPILOT_DOCKER_PLATFORM=linux/amd64 for the paper-canonical platform.
The image pins AFL++, Ghidra, cJSON, and libpng, then builds all targets inside
the image, so host submodules and host AFL++ are not required for Docker runs.
The bundled experiment target sources are tracked as Git submodules/gitlinks for
native development. Docker builds clone the pinned target revisions internally,
so submodule initialization is not required for scripts/fuzzpilot_docker.sh.
Clone with submodules when starting a native checkout:
git clone --recurse-submodules https://github.com/qiaozhiyi/fuzz_agent.git
cd fuzz_agent
If the repository was already cloned:
git submodule update --init --recursive
Expected target source paths:
- experiments/targets/cjson/src
- experiments/targets/libpng/src
The checked-in target binaries may have been built on a developer machine and are not portable across CPU/OS boundaries. Before native long runs, rebuild target binaries on that machine:
scripts/build_ubuntu_targets.sh
If cJSON is skipped in native mode, initialize submodules first with the
command above or install libcjson-dev so the harness can link against the
system package. Docker builds do this internally.
For normal development:
cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build
The bundled configs point mutation_strategy.custom_mutator_path at
./build/mutators/fuzzpilot/libfuzzpilot_mutator. CMake creates that
extensionless link after building the platform library, so the same config works
on macOS (.dylib) and Ubuntu/Linux (.so).
Typical binaries after build:
- build/fuzzpilot
- build/mutators/fuzzpilot/libfuzzpilot_mutator (cross-platform link)
- build/mutators/fuzzpilot/libfuzzpilot_mutator.so on Linux
- build/mutators/fuzzpilot/libfuzzpilot_mutator.dylib on macOS
- build/fuzzpilot_mutator_smoke
- build/fuzzpilot_process_capture_smoke
- build/fuzzpilot_json_utils_smoke
Run the same smoke path on macOS or Linux, amd64 or arm64:
scripts/fuzzpilot_docker.sh smoke
The wrapper builds docker/ubuntu/Dockerfile when needed, runs preflight inside
the image, validates runtime configs, and launches a short cJSON baseline run.
Smoke artifacts are written under results/docker_smoke/ and remain readable on
the host.
For paper-comparable data, use the canonical platform:
FUZZPILOT_DOCKER_PLATFORM=linux/amd64 scripts/fuzzpilot_docker.sh smoke
Use experiments/README.md when moving to a long-running
server. The recommended path is still Docker; force
FUZZPILOT_DOCKER_PLATFORM=linux/amd64 for paper-comparable runs. Native runs
must rebuild all target binaries on that machine before starting M6 runs.
Run the full local test suite:
ctest --test-dir build --output-on-failure
Useful individual checks:
./build/fuzzpilot --version
./build/fuzzpilot check-config --config configs/examples/libpng.yaml
./build/fuzzpilot parse-stats --stats tests/fixtures/fuzzer_stats_newer
./build/fuzzpilot detect-plateau \
--older tests/fixtures/fuzzer_stats_older \
--newer tests/fixtures/fuzzer_stats_newer \
--window-sec 600
Validate the metadata script:
bash -n scripts/capture_run_metadata.sh
Each run is driven by a YAML-like config. The parser is intentionally lightweight, so prefer simple key/value structures like the checked-in examples.
Important sections:
project: "cJSON_fuzz"
target:
  name: "cjson_parser"
  binary: "experiments/targets/cjson/cjson_fuzzer"
  input_dir: "experiments/targets/cjson/seeds"
  args: ["@@"]
  timeout_ms: 1000
  memory_mb: 1024
afl:
  base_env:
    AFL_MAP_SIZE: "4096"
    AFL_NO_UI: "1"
    AFL_SKIP_CPUFREQ: "1"
  main_budget_sec: 120
  plateau_window_sec: 10
micro_campaign:
  budget_sec: 20
  num_candidates: 4
mutation_strategy:
  enabled: true
  recipe_store: "work_cjson/recipes"
  custom_mutator_path: "./build/mutators/fuzzpilot/libfuzzpilot_mutator"
static_analysis:
  enabled: false
  backend: "ghidra"
  ghidra_home: "/opt/ghidra"
  extractor_script: "scripts/ghidra/FuzzPilotGhidraExtract.java"
  timeout_sec: 120
model_api:
  provider: "openai-compatible"
  endpoint: "https://api.deepseek.com/chat/completions"
  model: "deepseek-chat"
  api_key_env: "FUZZPILOT_MODEL_API_KEY"
Config notes:
- target.args should include @@ when the target reads from a file path.
- afl.base_env.AFL_MAP_SIZE must match the instrumented target when AFL++ asks for a non-default map size.
- mutation_strategy.custom_mutator_path can point at the extensionless libfuzzpilot_mutator link, or directly at the platform .so/.dylib.
- model_api.api_key_env is the name of the environment variable, not the secret value.
- model_api.provider accepts openai-compatible and legacy openai_compatible.
- target.dict and legacy target.dictionary are both supported.
- model.model_name and model_api.model are both supported.
- static_analysis.backend only accepts ghidra; keep enabled: false until Ghidra headless is installed (see scripts/install_ghidra_ubuntu.sh).
This path does not launch AFL++. It replays fixture stats and is the fastest way to verify the controller, database, plateau detection, micro-campaign planning, agent runtime, and report writing.
./build/fuzzpilot run \
--config configs/examples/libpng.yaml \
--work-dir build/smoke/mvp_run \
--schema db/schema.sql \
--afl-output-dir tests/fixtures/afl_out \
--stats tests/fixtures/fuzzer_stats_older \
--stats tests/fixtures/fuzzer_stats_newer \
--micro-stats tests/fixtures/fuzzer_stats_micro_control \
--micro-stats tests/fixtures/fuzzer_stats_micro_dictionary \
--micro-stats tests/fixtures/fuzzer_stats_micro_seed_focus \
--micro-stats tests/fixtures/fuzzer_stats_micro_winner \
--provider fake
Expected artifacts:
- build/smoke/mvp_run/<run_id>/report.md
- coverage.csv
- events.jsonl
- agent_decisions.jsonl
- agent_memory.jsonl
- fuzzpilot.sqlite
This launches AFL++ for a short bounded run. Use it to confirm target execution, AFL++ instrumentation, the custom mutator, and output directory creation.
mkdir -p /tmp/fuzzpilot_cjson_recipes
AFL_CUSTOM_MUTATOR_LIBRARY=./build/mutators/fuzzpilot/libfuzzpilot_mutator \
AFL_MAP_SIZE=4096 \
AFL_NO_UI=1 \
AFL_SKIP_CPUFREQ=1 \
FUZZPILOT_RECIPE_STORE=/tmp/fuzzpilot_cjson_recipes \
afl-fuzz -V 5 \
-i experiments/targets/cjson/seeds \
-o /tmp/fuzzpilot_cjson_afl_smoke \
-m 1024 \
-t 1000 \
-- experiments/targets/cjson/cjson_fuzzer @@
Healthy signs:
- AFL++ loads the custom mutator successfully.
- The fork server starts.
- The seed dry run succeeds.
- AFL++ reports new corpus items or at least stable execution.
- Crashes and timeouts remain zero for a basic smoke.
FuzzPilot supports OpenAI-compatible chat completions APIs. DeepSeek works with the checked-in target configs.
Never put a real API key in a config, README, commit, shell history, issue, or PR comment. Export it only in your local shell:
export FUZZPILOT_MODEL_API_KEY="<YOUR_API_KEY>"
Optional endpoint override:
export FUZZPILOT_MODEL_ENDPOINT="https://api.deepseek.com/chat/completions"
Minimal real API smoke:
./build/fuzzpilot run-model-agents \
--db /tmp/fuzzpilot_model_probe.sqlite \
--schema db/schema.sql \
--run-id run_model_probe \
--plateau-id plateau_model_probe \
--blackboard-json '{"plateau":{"reason":"api_probe"},"target":{"name":"cjson_parser","format":"JSON"},"main_metrics":{"execs_done":1000,"execs_per_sec":1000},"static_analysis_context":{"magic_tokens":["true","false","null","[","]","{","}"]}}' \
--provider openai-compatible \
--endpoint https://api.deepseek.com/chat/completions \
--model deepseek-chat \
--api-key-env FUZZPILOT_MODEL_API_KEY
Inspect the results:
sqlite3 /tmp/fuzzpilot_model_probe.sqlite \
'select agent, schema_valid, fallback_used, latency_ms from agent_decisions order by agent;'
Expected result:
- one row per agent
- schema_valid should be 1
- fallback_used should be 0
- latency values should be nonzero
If a model response is truncated or malformed, FuzzPilot now marks
schema_valid=0 instead of silently accepting partial JSON.
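The validation behavior described above can be pictured with a minimal sketch. It accepts a response only when it parses as a complete JSON object containing every required field, and otherwise keeps the text as a raw string. The field names in `REQUIRED_KEYS` and the record layout are hypothetical; the real proposal schema is defined in FuzzPilot's sources and may differ.

```python
import json

# Hypothetical required fields, chosen only for this illustration.
REQUIRED_KEYS = ("agent", "actions")


def classify_response(raw: str) -> dict:
    """Return a decision record; partial or off-schema JSON is kept as raw text."""
    try:
        proposal = json.loads(raw)
    except json.JSONDecodeError:
        # Truncated or malformed output: never embed it as broken JSON.
        return {"schema_valid": 0, "raw_text": raw}
    if not isinstance(proposal, dict) or any(k not in proposal for k in REQUIRED_KEYS):
        # Valid JSON but not a proposal, for example an API error body.
        return {"schema_valid": 0, "raw_text": raw}
    return {"schema_valid": 1, "proposal": proposal}
```

Storing the rejected text as a plain string keeps the decision log itself parseable even when the model output is not.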
Use this when validating local orchestration without spending API credits:
./build/fuzzpilot run \
--config experiments/targets/cjson/config.yaml \
--work-dir work_cjson_fake \
--schema db/schema.sql \
--provider fake
Use this when you are ready to run AFL++ plus real agent decisions:
export FUZZPILOT_MODEL_API_KEY="<YOUR_API_KEY>"
./build/fuzzpilot run \
--config experiments/targets/cjson/config.yaml \
--work-dir work_cjson_deepseek \
--schema db/schema.sql \
--real-run
For libpng:
export FUZZPILOT_MODEL_API_KEY="<YOUR_API_KEY>"
./build/fuzzpilot run \
--config experiments/targets/libpng/config.yaml \
--work-dir work_libpng_deepseek \
--schema db/schema.sql \
--real-run
Operational notes:
- --real-run launches AFL++ instead of replaying fixture stats.
- The main AFL++ run collects telemetry until budget or plateau.
- On plateau, FuzzPilot snapshots the corpus, runs static analysis if enabled, asks agents for strategies, then launches micro-campaigns.
- The controller stops the main AFL++ process before micro-campaigns so AFL++ can release shared memory cleanly on macOS.
- Reports are written under the selected --work-dir.
M6 is the real-target validation phase. Generate the reproducible target/mode matrix before launching long runs:
./build/fuzzpilot m6-matrix \
--config experiments/targets/cjson/config.yaml \
--config experiments/targets/libpng/config.yaml \
--out-dir results/m6_matrix \
--work-dir work_m6 \
--repeats 3 \
--main-budget-sec 86400 \
--micro-budget-sec 300
This writes:
- results/m6_matrix/m6_matrix.json
- results/m6_matrix/m6_matrix.md
Each target is planned across these modes:
- baseline-afl - plain AFL++ control, no model agents, static analysis, or custom mutator
- rule-only - deterministic fallback agents with micro-campaign validation
- no-static-analysis - full loop without reverse-engineering intelligence
- no-mutator - full loop without custom mutator recipes
- full-agent - complete configured loop
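As a rough picture of what the matrix covers, the sketch below enumerates one planned run per target, mode, and repeat. The `plan_matrix` helper and its record fields are hypothetical. The work-directory pattern is inferred from the `work_m6/cjson_parser/full-agent/r1` path used elsewhere in this README, and the real m6_matrix.json layout may differ.

```python
from itertools import product

# The five ablation modes listed above.
MODES = ["baseline-afl", "rule-only", "no-static-analysis", "no-mutator", "full-agent"]


def plan_matrix(targets: list[str], repeats: int, work_dir: str = "work_m6") -> list[dict]:
    """Enumerate one planned run per (target, mode, repeat) combination."""
    return [
        {"target": target, "mode": mode, "repeat": repeat,
         "work_dir": f"{work_dir}/{target}/{mode}/r{repeat}"}
        for target, mode, repeat in product(targets, MODES, range(1, repeats + 1))
    ]
```

With two targets and `--repeats 3`, this layout yields 2 x 5 x 3 = 30 runs, which is worth checking against the available machine time before using a 24-hour main budget.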
The run command accepts the same ablation modes directly:
./build/fuzzpilot run \
--config experiments/targets/cjson/config.yaml \
--work-dir work_m6/cjson_parser/full-agent/r1 \
--schema db/schema.sql \
--real-run \
--ablation full-agent \
--main-budget-sec 86400 \
--micro-budget-sec 300
Use the metadata helper before or after important runs:
scripts/capture_run_metadata.sh \
--run-id run_cjson_001 \
--config experiments/targets/cjson/config.yaml \
--target cjson_parser \
--out-dir results/run_cjson_001
The script writes:
- run_metadata.json - commit, branch, config hash, target, OS, arch
- git_status.txt - branch and dirty worktree summary
- git.patch - staged and unstaged local diff
Recommended run bundle:
results/<run_id>/
run_metadata.json
git_status.txt
git.patch
report.md
coverage.csv
events.jsonl
agent_decisions.jsonl
agent_memory.jsonl
fuzzpilot.sqlite
See docs/evaluation.md for the required field baseline.
Common files:
- report.md - human-readable summary of the run
- coverage.csv - time series of AFL++ telemetry
- events.jsonl - append-only machine-readable event log
- agent_decisions.jsonl - model/fake agent decisions
- agent_memory.jsonl - memory patches persisted from agent results
- fuzzpilot.sqlite - normalized run, telemetry, campaign, plateau, and agent data
- main_launch.sh - reproducible main AFL++ command preview
- micro/ - micro-campaign directories
- promoted_recipes/ - selected recipes after intervention evaluation
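For scripted checks, the agent data in fuzzpilot.sqlite can be summarized with Python's standard sqlite3 module. The sketch below is a hypothetical helper, not part of FuzzPilot. It relies only on the agent_decisions columns that this README's own queries use, and it assumes schema_valid and fallback_used are stored as 0/1 integers.

```python
import sqlite3


def summarize_agent_decisions(conn: sqlite3.Connection) -> dict:
    """Count decisions that failed schema validation or used a fallback."""
    rows = conn.execute(
        "select agent, schema_valid, fallback_used, latency_ms from agent_decisions"
    ).fetchall()
    return {
        "total": len(rows),
        "invalid": sum(1 for _, valid, _, _ in rows if not valid),
        "fallbacks": sum(1 for _, _, fallback, _ in rows if fallback),
        "agents": sorted({agent for agent, _, _, _ in rows}),
    }
```

A typical call would pass `sqlite3.connect("work_cjson_deepseek/<run_id>/fuzzpilot.sqlite")` with the real run directory substituted for `<run_id>`.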
Useful SQLite queries:
sqlite3 work_cjson_deepseek/<run_id>/fuzzpilot.sqlite \
'select agent, schema_valid, fallback_used, latency_ms from agent_decisions order by created_ts;'
sqlite3 work_cjson_deepseek/<run_id>/fuzzpilot.sqlite \
'select campaign_id, execs_done, paths_total, bitmap_cvg, unique_crashes from telemetry order by ts desc limit 10;'
A target directory should contain:
experiments/targets/<name>/
config.yaml
harness.c or harness.cpp
<target_fuzzer>
seeds/
seed1.*
src/
Target checklist:
- Build the harness with AFL++ instrumentation.
- Confirm the harness exits cleanly on every seed.
- Keep seed files small but structurally valid.
- Set target.args to ["@@"] if the harness expects a file path.
- Use target.timeout_ms to catch hangs without killing normal parsing.
- Set target.memory_mb high enough for legitimate target allocations.
- Set AFL_MAP_SIZE if AFL++ reports a required map size.
- Add a dictionary when the format has stable magic tokens.
- Keep crash-triggering or generated outputs out of Git unless they are tiny, intentional fixtures.
Direct seed checks:
experiments/targets/cjson/cjson_fuzzer experiments/targets/cjson/seeds/seed1.json
experiments/targets/libpng/libpng_fuzzer experiments/targets/libpng/seeds/seed1.png
Rules for API keys:
- Do not commit secrets.
- Do not paste real secrets into README examples.
- Do not store secrets in config.yaml.
- Prefer api_key_env: "FUZZPILOT_MODEL_API_KEY".
- Export the secret only in the local shell that runs FuzzPilot.
- Rotate the key if it appears in a commit, issue, PR, log, or screenshot.
Rules for command execution:
- FuzzPilot uses argv-based process execution for model curl, environment capture, and static-analysis helper paths.
- Model API secrets are passed to curl through a private temporary header file, not directly in process arguments.
- Captured subprocess output is bounded and model/static-analysis helper calls have timeouts.
- Environment variable names are validated before use.
- Model responses must be complete agent proposal JSON before being marked schema_valid=true.
- Invalid or truncated proposal text is serialized as raw text instead of being embedded as broken JSON.
- Corpus snapshots skip symlinks and cap copied queue files to protect local storage during long M6 runs.
- Compact recipes reject control-character tokens so model or CLI output cannot inject extra recipe directives through dictionary token text.
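The header-file rule above can be sketched as follows. The function names, the temp-file prefix, the 60-second timeout, and the environment-name pattern are all assumptions made for illustration; FuzzPilot's own implementation is in C++ and may differ. The sketch relies on two documented behaviors: `tempfile.mkstemp` creates a file readable and writable only by its owner, and curl reads headers from a file when given `-H @file`.

```python
import os
import re
import tempfile

# Assumed pattern for a portable environment variable name.
ENV_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def write_private_header(api_key_env: str) -> str:
    """Write the Authorization header to an owner-only temp file; return its path."""
    if not ENV_NAME_RE.fullmatch(api_key_env):
        raise ValueError(f"invalid environment variable name: {api_key_env!r}")
    secret = os.environ[api_key_env]
    fd, path = tempfile.mkstemp(prefix="fuzzpilot_hdr_")  # created with mode 0600
    with os.fdopen(fd, "w") as handle:
        handle.write(f"Authorization: Bearer {secret}\n")
    return path


def curl_argv(endpoint: str, header_path: str, body_path: str) -> list[str]:
    """Build an argv list for curl; the secret never appears in the arguments."""
    return ["curl", "--silent", "--max-time", "60",
            "-H", f"@{header_path}",
            "-H", "Content-Type: application/json",
            "--data", f"@{body_path}", endpoint]
```

Passing an argv list rather than a shell string avoids quoting problems, and keeping the key out of the arguments means it does not show up in process listings. The caller should delete the header file once the request finishes.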
If AFL++ cannot find the target binary, build the target harness or update target.binary in the config.
If the custom mutator library cannot be loaded, build the project in the directory referenced by
mutation_strategy.custom_mutator_path, or update the config.
If AFL++ reports a required map size, set afl.base_env.AFL_MAP_SIZE in the config to that value.
If AFL++ fails because of leftover shared memory or stale state, stop stale AFL++ processes and rerun. FuzzPilot stops the main AFL++ process before micro-campaigns to reduce shared memory pressure, but interrupted manual runs can still leave state behind.
Check:
- FUZZPILOT_MODEL_API_KEY is exported in the current shell.
- The endpoint is reachable.
- The model name is valid for the provider.
- The provider is openai-compatible.
- run-model-agents works before attempting a full --real-run.
A schema_valid=0 row means the model response was not a complete agent proposal JSON object. It may be truncated, off-schema, or an API error body. Reduce prompt size, provide more focused blackboard context, or increase provider-side output limits if the provider supports it.
Ensure .gitmodules contains entries for target source gitlinks:
- experiments/targets/cjson/src
- experiments/targets/libpng/src
Check for fuzzing leftovers:
pgrep -af 'afl-fuzz|fuzzpilot run|cjson_fuzzer|libpng_fuzzer'
Stop a known leftover process:
kill <pid>
Temporary local run directories are usually safe to remove when you no longer need their reports:
rm -rf work_cjson_fake work_cjson_deepseek work_libpng_deepseek
rm -rf /tmp/fuzzpilot_*
Do not delete artifacts that you still need for experiment reproducibility.
GitHub Actions runs on push and pull request:
.github/workflows/ci.yml
The workflow installs CMake, Ninja, SQLite, and a C++ compiler, then runs:
cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build build
ctest --test-dir build --output-on-failure
CI intentionally does not require AFL++ or real model credentials. Those are validated with local smoke tests and explicit real-run experiments.