This page is the “map” of the major systems in FBA-Bench and where to find the implementation.
- Prompt Battery (prompt): fast, model-only evaluation (no tools/memory implied).
- Agentic Simulation (agentic): long-horizon, stateful simulation for full agent systems.

See: docs/benchmark_philosophy.md
- World state + arbitration:
docs/simulation_services.md (WorldStore)
- Market demand + sales processing:
docs/simulation_services.md (MarketSimulationService)
- Supply chain disruptions + black swans:
docs/simulation_services.md (SupplyChainService)
- Fees and unit economics:
docs/simulation_services.md (FeeCalculationService)
- Trust/reputation signals:
docs/simulation_services.md (TrustScoreService)
- Red Team Gauntlet (adversarial injections + scoring):
docs/red_team_gauntlet.md
- Per-day long-term memory consolidation + competition awareness modes:
docs/cognitive_memory.md
- Utility-based autonomous shoppers:
docs/consumer_utility_model.md
- Token/cost/call budgets and enforcement:
docs/budget_constraints.md
- Deterministic seeding + response caching + golden masters:
docs/reproducibility.md
- Audit and replay (artifacts, playback paths, and audit layers):
docs/audit_and_replay.md
- Plugin framework (scenarios/agents/tools/metrics):
docs/plugin_framework.md
- Multi-axis metrics suite (finance/ops/trust/stress/adversarial/cost):
docs/metrics_suite.md
- Logs, metrics, tracing, ClearML hooks:
docs/observability_stack.md
- Learning modules and post-run analysis tooling:
docs/learning_systems.md
- Godot Simulation Theater (observer-friendly GUI) and recording workflows:
docs/RUNBOOK_SIM_THEATER.md, docs/press/promo_video.md, scripts/record_godot_demo.ps1