Backtester has three visible layers:
- Core Python backtesting package in `backtester/`.
- FastAPI API wrapper in `backtester/api/`.
- Next.js dashboard in `frontend/`, branded in the UI as Backtest Lab.

The core package is intentionally modular. Data loading, strategies, portfolio state, metrics, research utilities, and visualization are independently testable. Engines and API services compose those modules. The frontend is an API client: it renders forms, validation, charts, and tables, but it does not reimplement backtesting logic.
- `backtester/data/`
  - Fetches OHLCV data with yfinance, cleans it, validates schema, and caches Parquet files under `~/.backtester/cache/`.
- `backtester/ai/`
  - Defines the safe natural-language strategy draft contract, prompt template, provider abstraction and factory, deterministic fake provider, optional OpenAI-compatible provider, optional LangChain OpenAI-compatible adapter, limited provider-output normalization, validation helpers, and compilers into existing API request schemas. Drafts and compiled payloads are inert data and are not executed.
- `backtester/agents/`
  - Defines a backend Research Copilot graph. It moves from natural-language goal interpretation to inert AI draft, validation, compile, approval gate, optional approved workflow execution, deterministic result analysis, and next-step recommendation through typed state transitions.
- `backtester/strategy/`
  - Defines `Strategy`, `MultiAssetStrategy`, `Signal`, built-in momentum and mean-reversion strategies, constrained rule DSL schemas, `RuleBasedStrategy`, and a wrapper for applying one single-asset strategy across multiple assets.
- `backtester/portfolio/`
  - Defines `Order`, `Trade`, `Position`, and `Portfolio`. Tracks cash, positions, trade history, and equity curve.
- `backtester/engine/`
  - Contains single-asset and multi-asset engines, immutable configs, result dataclasses, and shared position sizing logic.
- `backtester/metrics/`
  - Computes returns, drawdowns, Sharpe/Sortino, alpha/beta, information ratio, profit factor, benchmark equity, and trade-level summaries.
- `backtester/research/`
  - Runs parameter grid searches and returns sorted `pandas.DataFrame` results, including failed-combination rows for research diagnostics.
- `backtester/viz/`
  - Matplotlib chart helpers for equity, drawdown, trades, and strategy comparison.
- `backtester/api/`
  - FastAPI routes, Pydantic schemas, and service conversion between engine objects and JSON responses.
- `frontend/app/`
  - Next.js App Router entrypoint, layout, global dark dashboard styles, and page-level state orchestration.
- `frontend/components/`
  - Backtest Lab UI components: app shell, sidebar, top bar, form, metric cards, charts, states, AI Builder, Research Copilot, result tabs, and trade table.
- `frontend/lib/`
  - Frontend API client, TypeScript API types, defaults, and form validation helpers.
- `examples/`
  - Demo scripts and chart generation scripts.
- `benchmarks/`
  - Synthetic benchmark and cProfile scripts.
- `tests/`
  - Unit and smoke tests using deterministic synthetic data where possible.

- `DataLoader.fetch(ticker, start, end)` returns a cleaned OHLCV `DataFrame` with lowercase columns and a `DatetimeIndex` named `date`.
- `BacktestEngine.run()` runs one ticker with a `Strategy`.
- `MultiAssetBacktestEngine.run()` runs multiple tickers with a `MultiAssetStrategy`.
- `RuleBasedStrategy` evaluates Pydantic-validated rule specs over precomputed close, SMA, prior rolling high/low, and Bollinger band indicators without executing generated code.
- `Portfolio.execute_order()` applies slippage/commission and mutates cash/positions on accepted trades.
- `generate_report()` computes primary performance metrics and optional benchmark comparison keys.
- Additional metrics helpers compute rolling Sharpe, rolling volatility, rolling drawdown, drawdown duration, best/worst day, monthly returns, VaR, and CVaR from first principles.
- `buy_and_hold_equity()` creates a benchmark equity curve for comparison.
- `run_grid_search()` expands a parameter grid, runs backtests, and records errors per combination.
- FastAPI `POST /api/backtest` wraps a single-asset backtest for Backtest Lab.
- FastAPI `POST /api/grid-search` wraps single-asset parameter sweeps, heatmap data, and deterministic robustness analysis.
- FastAPI `POST /api/walk-forward` wraps rolling train/test validation using grid-search-selected parameters per fold.
- FastAPI `POST /api/ai/strategy-draft` wraps the AI Strategy Builder provider factory and returns validated structured drafts. The fake provider is the default; real providers are server-side opt-in.
- FastAPI `POST /api/ai/compile` compiles validated drafts into existing API-compatible request payloads without running them.
- FastAPI `POST /api/ai/research-plan` exposes the Research Copilot draft-and-compile path and stops before workflow execution.
- FastAPI `POST /api/ai/research-approve` resumes a prior response state and executes at most one existing workflow when approval matches the compiled target mode.
- `backtester/agents/research_graph.py` wires the research workflow with LangGraph when the backend dependency is installed. Importing the package does not require LangGraph; direct graph construction without it raises a sanitized dependency error, while the high-level runner can still use the same local state transitions for deterministic tests.
- `backtester/agents/nodes.py` reuses the existing AI provider, draft validator, compiler, and API service wrappers. It never runs a compiled payload until an explicit matching `approved_action` is present.
- `backtester/agents/tools.py` revalidates approved payload JSON with the existing API request schemas, runs only the matching service method, and reports malformed payload errors without including raw payload values.
- Frontend `frontend/lib/api.ts` isolates API calls from UI components.
- Frontend `frontend/lib/validation.ts` performs inline form validation before POST requests.

- `DataLoader` fetches and validates OHLCV data.
- `BacktestEngine` initializes `Portfolio` and calls `strategy.precompute(data)`.
- For each bar, the engine passes the full `data` plus `current_index` to `strategy.generate_signal`.
- Engine converts signals into `Order` objects.
- `Portfolio` executes accepted orders and records equity.
- Engine returns `BacktestResult`.
- Metrics, charts, CLI, API, or frontend consume the result.

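
The single-asset flow above can be sketched with plain lists. This is a minimal illustration and not the package's engine: `ToyMomentum`, `run_toy_backtest`, the all-in sizing, and the absence of costs are assumptions made only to show the `precompute` / `generate_signal(data, current_index)` contract, where the strategy receives the full series but reads nothing past `current_index`.

```python
from dataclasses import dataclass, field
from enum import Enum


class Signal(Enum):
    BUY = "buy"
    SELL = "sell"
    HOLD = "hold"


@dataclass
class ToyMomentum:
    """Hypothetical strategy: long while close is above its trailing mean."""

    window: int = 3
    sma: list = field(default_factory=list)

    def precompute(self, closes):
        # The value at bar i uses only closes[i - window + 1 : i + 1].
        self.sma = [
            sum(closes[i - self.window + 1 : i + 1]) / self.window
            if i >= self.window - 1
            else None
            for i in range(len(closes))
        ]

    def generate_signal(self, closes, current_index):
        # Contract: never read anything past current_index.
        average = self.sma[current_index]
        if average is None:
            return Signal.HOLD
        return Signal.BUY if closes[current_index] > average else Signal.SELL


def run_toy_backtest(closes, strategy, cash=1000.0):
    """Walk the bars once, trade all-in or all-out, and record equity per bar."""
    strategy.precompute(closes)
    shares = 0
    equity_curve = []
    for i, price in enumerate(closes):
        signal = strategy.generate_signal(closes, i)
        if signal is Signal.BUY and shares == 0:
            shares = int(cash // price)
            cash -= shares * price
        elif signal is Signal.SELL and shares > 0:
            cash += shares * price
            shares = 0
        equity_curve.append(cash + shares * price)
    return equity_curve
```

Passing the full series keeps indicator lookups cheap, which is why look-ahead prevention has to be a convention the strategy honors and not something the loop enforces.
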
- Backtest Lab loads health status from `GET /health`.
- Backtest Lab loads strategy metadata from `GET /api/strategies`; local fallback metadata keeps the form renderable if the API is offline.
- User edits the single-asset config in the right-side form.
- Frontend validates the request shape and strategy parameters inline.
- Browser submits `BacktestRequest` to `POST /api/backtest`.
- `backtester/api/services.py` builds `BacktestConfig` and the selected strategy.
- Existing Python engine and metrics run server-side.
- API returns:
  - Submitted config
  - Summary metrics
  - Equity series
  - Optional benchmark series
  - Drawdown series
  - Price series
  - Executed trades
- Frontend renders KPI cards, Recharts equity/drawdown charts, result tabs, trades, metrics, and parameters.

- A client submits a prompt to `POST /api/ai/strategy-draft`.
- The API calls `get_strategy_draft_provider()`, which selects the deterministic fake provider by default or an opt-in server-side OpenAI-compatible provider from backend environment variables.
- The provider returns a constrained draft describing a single-run, grid-search, walk-forward, or unspecified target. Real-provider responses are parsed as JSON and checked for raw code-looking output.
- A limited normalization step repairs only deterministic schema-adjacent mistakes before validation: simple boolean strings for `benchmark`, clear `equity_sizing` objects into `position_size_method`/`position_size_value`, and clean `rule_spec.conditions` references into the existing `rule_spec.rules` DSL.
- `StrategyDraft` Pydantic validation remains the strict boundary. Unexpected fields, ambiguous sizing, malformed rule specs, unsupported indicators/operators, and extra provider output are rejected with sanitized validation errors.
- `backtester/ai/validator.py` checks semantic safety: ticker readiness, ISO dates, date order, supported strategy kind, positive windows, momentum window ordering, mean-reversion bands, unsupported concepts, and raw-code fields.
- The API returns structured JSON containing the draft, status, warnings, unsupported items, and validation errors.
- A reviewed draft can be submitted to `POST /api/ai/compile`.
- `backtester/ai/compiler.py` maps the draft into an existing `BacktestRequest`, `GridSearchRequest`, or `WalkForwardRequest` payload.
- Rule-based drafts compile to a single-run `BacktestRequest` with a strict `rule_spec`; built-in momentum and mean-reversion drafts can also compile to research workflows.
- Missing research grids, date ranges, optimization metrics, and walk-forward windows use deterministic defaults with warnings.
- The compiled payload is not executed. Clients must submit it to the existing workflow endpoints if they choose to run it.

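
The normalize-then-validate boundary described above can be illustrated without Pydantic. The sketch below is an assumption-laden stand-in: `ALLOWED_FIELDS` is a hypothetical subset of the draft schema, and `normalize_benchmark` and `validate_draft` are illustrative names, not functions from `backtester/ai/`. It shows the one repair the text names for `benchmark` (simple boolean strings) and a strict check that rejects unexpected fields while reporting only field names, never raw values.

```python
# Hypothetical subset of draft fields, used only for this illustration.
ALLOWED_FIELDS = {"ticker", "start_date", "end_date", "strategy", "benchmark"}


def normalize_benchmark(raw_draft):
    """Repair only an unambiguous boolean string; leave everything else untouched."""
    draft = dict(raw_draft)
    value = draft.get("benchmark")
    if isinstance(value, str) and value.strip().lower() in {"true", "false"}:
        draft["benchmark"] = value.strip().lower() == "true"
    return draft


def validate_draft(draft):
    """Strict boundary: return sanitized error messages (field names only)."""
    errors = []
    unexpected = sorted(set(draft) - ALLOWED_FIELDS)
    if unexpected:
        errors.append("unexpected fields: " + ", ".join(unexpected))
    if "benchmark" in draft and not isinstance(draft["benchmark"], bool):
        errors.append("benchmark must be a boolean")
    return errors
```

Keeping the repair step narrow means anything ambiguous still reaches the validator unchanged and is rejected there.
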
- Natural-language rule prompts are converted into `RuleBasedStrategySpec`, not Python code.
- The spec contains only enum-backed indicators and operators:
  - indicators: `close`, `sma`, `rolling_high`, `rolling_low`, `bollinger_upper`, `bollinger_lower`
  - operators: `>`, `<`, `>=`, `<=`, `crosses_above`, `crosses_below`
- API schemas validate the nested `rule_spec` with `extra="forbid"` before strategy construction. AI provider normalization may translate one narrow indicator/conditions shape into this DSL, but only when every condition validates and no unused or unsupported indicators remain.
- `backtester/api/services.py` builds `RuleBasedStrategy` server-side from the structured spec.
- `RuleBasedStrategy.precompute(data)` calculates indicator arrays from Pandas/NumPy only.
- `generate_signal(data, current_index)` reads only current and previous indicator values. Entry conditions use ALL logic; exit conditions use ANY logic.
- The engine remains strategy-agnostic and continues to receive only `Signal.BUY`, `Signal.SELL`, or `Signal.HOLD`.

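
A rough sketch of how enum-backed operators can be evaluated over precomputed indicator arrays follows. It is not the `RuleBasedStrategy` implementation: the `Operator` enum, `evaluate`, `should_enter`, and `should_exit` are illustrative names, and the exact crossing rule (previous bar at or below, current bar strictly above) is an assumption. The operator strings, the current-and-previous-bar restriction, and the ALL/ANY logic come from the list above.

```python
from enum import Enum


class Operator(str, Enum):
    GT = ">"
    LT = "<"
    GTE = ">="
    LTE = "<="
    CROSSES_ABOVE = "crosses_above"
    CROSSES_BELOW = "crosses_below"


def evaluate(op, left, right, i):
    """Evaluate one condition at bar i, reading only bars i and i - 1."""
    a, b = left[i], right[i]
    if a is None or b is None:
        return False  # indicator is still warming up
    if op is Operator.GT:
        return a > b
    if op is Operator.LT:
        return a < b
    if op is Operator.GTE:
        return a >= b
    if op is Operator.LTE:
        return a <= b
    # Cross operators also need the previous bar.
    if i == 0 or left[i - 1] is None or right[i - 1] is None:
        return False
    prev_a, prev_b = left[i - 1], right[i - 1]
    if op is Operator.CROSSES_ABOVE:
        return prev_a <= prev_b and a > b
    return prev_a >= prev_b and a < b


def should_enter(conditions, i):
    # Entry uses ALL logic; an empty rule list never triggers an entry.
    return bool(conditions) and all(evaluate(op, l, r, i) for op, l, r in conditions)


def should_exit(conditions, i):
    # Exit uses ANY logic.
    return any(evaluate(op, l, r, i) for op, l, r in conditions)
```

Because operators are enum members, a string such as `"exec"` fails at `Operator("exec")` with a `ValueError` before any rule is evaluated, which is the property that keeps the DSL from becoming a code path.
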
- User switches Backtest Lab into AI Builder mode.
- Frontend collects a natural-language prompt and submits it to `POST /api/ai/strategy-draft`.
- The browser renders the returned draft as an auditable strategy card: target mode, ticker/date range, strategy kind, parameters, sizing, costs, benchmark, assumptions, warnings, unsupported items, validation errors, and readiness status.
- User can inspect secondary reproducibility JSON for the original prompt, validated draft, and latest compiled payload.
- When the user chooses to load the draft, frontend calls `POST /api/ai/compile`.
- The compile response payload is copied into the existing Single Run, Grid Search, or Walk-Forward form based on `target_mode`.
- The workflow form is shown for review. Backtest Lab does not execute the loaded request until the user runs the existing workflow.

- User switches Backtest Lab into Research Copilot mode.
- Frontend collects a natural-language research goal and submits it to `POST /api/ai/research-plan`.
- The browser renders the returned graph state: status, target mode, step timeline, audit log, draft summary, compiled payload preview, warnings, unsupported items, validation errors, and recommendation.
- The plan response stops before execution. Backtest Lab does not call `research-approve` automatically.
- When a ready compiled payload is present, the user may load it into the existing Single Run, Grid Search, or Walk-Forward form for manual review. Loading does not run the workflow.
- If the user clicks the explicit approval button, frontend sends the previous response state plus the matching `approved_action` to `POST /api/ai/research-approve`.
- The backend runs at most one existing workflow and returns an updated state containing workflow result summary, deterministic analysis, and recommended next step.
- Frontend displays that analysis as server-provided research commentary. It does not compute backtest metrics, grid-search rankings, walk-forward folds, or portfolio accounting in TypeScript.

- A backend caller, or `POST /api/ai/research-plan`, creates a `ResearchGraphState` with a `user_goal`.
- The graph records `interpret_research_goal`, then calls the existing AI draft provider path to create an inert `StrategyDraft`.
- The draft passes through `backtester/ai/validator.py`, preserving warnings, unsupported concepts, and validation errors in state.
- The compiler maps ready drafts into one existing API request payload: single run, grid search, or walk-forward.
- The graph stops at `await_user_approval` with `approval_required=true` when a compiled payload is ready and no explicit approval is present. The plan endpoint returns sanitized state and never runs `backtest`, `grid-search`, or `walk-forward`.
- `POST /api/ai/research-approve` accepts the prior response state plus `approved_action`, then sets exactly one matching action: `run_backtest`, `run_grid_search`, or `run_walk_forward`.
- Mismatched approval is recorded as a validation error and no workflow is run.
- Approved execution uses thin wrappers around existing API service functions only, refuses already-executed response states, revalidates the browser-returned compiled payload against the target request schema, and does not create server-side sessions.
- Malformed or tampered approval payloads return sanitized field-level validation messages and clear the compiled payload from the response so raw browser-supplied values are not echoed back.
- There are no filesystem, shell, generated-code, broker, live-trading, auth, database, or persistence tools.
- Result analysis is deterministic and heuristic. It summarizes high drawdown, sparse trades, failed grid combinations, benchmark underperformance where available, and walk-forward degradation. It is transparent research commentary, not prediction.

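
The approval gate can be pictured as a pure function over the response state. The sketch below is illustrative only: the `target_mode` values (`single_run`, `grid_search`, `walk_forward`), the error strings, and the `approve` helper are assumptions, while the state keys and the three `approved_action` values are taken from the text. It does not revalidate the payload against a request schema, which the real flow also does before running anything.

```python
# Assumed target_mode spellings; the action names come from the document.
ACTION_FOR_MODE = {
    "single_run": "run_backtest",
    "grid_search": "run_grid_search",
    "walk_forward": "run_walk_forward",
}


def approve(state, approved_action):
    """Return a new state; record an error instead of approving on any mismatch."""
    new_state = dict(state)
    errors = list(state.get("validation_errors", []))
    if state.get("workflow_result") is not None:
        errors.append("state was already executed")
    elif state.get("compile_payload") is None:
        errors.append("no compiled payload to approve")
    elif ACTION_FOR_MODE.get(state.get("target_mode")) != approved_action:
        errors.append("approved_action does not match target_mode")
    else:
        new_state["approved_action"] = approved_action
        new_state["approval_required"] = False
    new_state["validation_errors"] = errors
    return new_state
```

Since the browser carries the state between the two endpoints, every check here runs on untrusted input, and a failed check leaves `approved_action` unset so no workflow can follow.
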
- User switches Backtest Lab into Grid Search mode.
- Frontend validates ticker/date range, base portfolio settings, optimization metric, and strategy parameter ranges for UX.
- Browser submits `GridSearchRequest` to `POST /api/grid-search`.
- `backtester/api/services.py` builds a base `BacktestConfig`, strategy factory, and parameter grid.
- `backtester/research/run_grid_search()` expands combinations and runs the Python engine for each one.
- The service converts the result frame into ranked JSON rows, failed-combination rows, best parameters, heatmap points when two numeric parameters vary, and deterministic robustness warnings.
- Frontend renders the leaderboard, heatmap, robustness panel, failed combinations, exports, and a "Run selected config" action.

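
Grid expansion with per-combination error capture can be sketched in a few lines. This is not `run_grid_search()` itself: `expand_grid`, `run_grid`, the row keys `metric` and `error`, and the higher-is-better sort are assumptions chosen for illustration, and the real function returns a `pandas.DataFrame`.

```python
from itertools import product


def expand_grid(parameter_grid):
    """Turn {"a": [1, 2], "b": [3]} into [{"a": 1, "b": 3}, {"a": 2, "b": 3}]."""
    names = list(parameter_grid)
    value_lists = [parameter_grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def run_grid(parameter_grid, run_one):
    """Run every combination; keep failures as rows instead of aborting the sweep."""
    rows = []
    for params in expand_grid(parameter_grid):
        try:
            rows.append({**params, "metric": run_one(**params), "error": None})
        except Exception as exc:  # one bad combination must not stop the others
            rows.append({**params, "metric": None, "error": str(exc)})
    ranked = sorted(
        (row for row in rows if row["error"] is None),
        key=lambda row: row["metric"],
        reverse=True,
    )
    failed = [row for row in rows if row["error"] is not None]
    return ranked + failed
```

Recording failures as rows is what lets the service report failed combinations alongside the leaderboard.
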
- User switches Backtest Lab into Walk-Forward mode.
- Frontend validates base config, parameter grid, optimization metric, and train/test/step bar windows.
- Browser submits `WalkForwardRequest` to `POST /api/walk-forward`.
- The API fetches the full single-asset price window once and slices train/test folds server-side.
- Each train fold runs grid search; the best train parameters are then evaluated on the following out-of-sample test fold.
- The service returns selected parameters, train/test metrics, degradation ratios, fold warnings, aggregate averages, parameter stability, and overall warnings.
- Frontend renders a table-first validation view. It does not optimize, rank, or compute metrics in TypeScript.

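
Fold slicing from the three bar windows can be illustrated as index arithmetic. The function below is a sketch, not the service code: `walk_forward_folds` is a hypothetical name, bounds are half-open `(start, stop)` pairs, and dropping a trailing fold that lacks a full test window is an assumption. The parameter names mirror `train_window_bars`, `test_window_bars`, and `step_bars` from the request schema.

```python
def walk_forward_folds(n_bars, train_window_bars, test_window_bars, step_bars):
    """Return rolling folds as half-open index ranges over a series of n_bars."""
    if min(train_window_bars, test_window_bars, step_bars) <= 0:
        raise ValueError("window and step sizes must be positive")
    folds = []
    start = 0
    while start + train_window_bars + test_window_bars <= n_bars:
        train_end = start + train_window_bars
        folds.append(
            {
                "train": (start, train_end),
                # The test fold starts where training ends, so it is out of sample.
                "test": (train_end, train_end + test_window_bars),
            }
        )
        start += step_bars
    return folds
```

With 10 bars, a 4-bar train window, a 2-bar test window, and a 2-bar step, this yields three folds whose test ranges are `(4, 6)`, `(6, 8)`, and `(8, 10)`.
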
- Engine fetches each ticker independently.
- DataFrames align on the intersection of dates available for all tickers.
- Strategy returns ticker-to-signal mappings.
- Orders execute in config ticker order.
- Equity is recorded once per shared timestamp using all current prices.
Multi-asset support exists in Python. It is not currently exposed by FastAPI, CLI, or Backtest Lab.
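
The date-intersection alignment described above can be shown with dictionaries standing in for DataFrames. `align_on_shared_dates` is a hypothetical helper written for this illustration, and it assumes ISO-formatted date strings so that lexical order matches chronological order.

```python
def align_on_shared_dates(closes_by_ticker):
    """Keep only dates present for every ticker (an inner join on the index)."""
    if not closes_by_ticker:
        raise ValueError("at least one ticker is required")
    shared = set.intersection(*(set(series) for series in closes_by_ticker.values()))
    dates = sorted(shared)
    aligned = {
        ticker: [series[date] for date in dates]
        for ticker, series in closes_by_ticker.items()
    }
    return dates, aligned
```

After alignment, position `i` refers to the same timestamp in every ticker's list, which is what allows equity to be recorded once per shared timestamp.
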
FastAPI app: `backtester/api/main.py`

- `GET /health`
  - Response: `{ "status": "ok" }`.
- `GET /api/strategies`
  - Returns supported strategy ids, descriptions, and parameter metadata.
- `POST /api/backtest`
  - Request schema: `ticker`, `start_date`, `end_date`, `strategy`, `initial_cash`, `commission_rate`, `slippage_bps`, `position_size_method`, `position_size_value`, `benchmark`, `parameters`, and optional `rule_spec` for `strategy="rule_based"`.
  - Response schema: `config`, `summary`, `series.equity`, `series.benchmark`, `series.drawdown`, `series.price`, `trades`, `risk`.
- `POST /api/grid-search`
  - Request schema: base single-asset config fields, `strategy`, `parameter_grid`, `optimization_metric`, `max_results`.
  - Response schema: `config`, `strategy_id`, `strategy_name`, `optimization_metric`, `total_combinations`, `results`, `failed_combinations`, `best_parameters`, `best_row`, `heatmap`, `analysis`.
- `POST /api/walk-forward`
  - Request schema: base single-asset config fields, `strategy`, `parameter_grid`, `optimization_metric`, `train_window_bars`, `test_window_bars`, `step_bars`.
  - Response schema: `config`, `folds`, `summary`.
- `POST /api/ai/strategy-draft`
  - Request schema: `prompt`, plus optional `provider`, `model`, and `current_config` placeholders for future compatibility.
  - Response schema: `draft`, `status`, `warnings`, `unsupported`, `validation_errors`.
- `POST /api/ai/compile`
  - Request schema: `draft`, or a bare `StrategyDraft`-shaped body.
  - Response schema: `target_mode`, `status`, `payload`, `assumptions`, `warnings`, `unsupported`, `validation_errors`.
- `POST /api/ai/research-plan`
  - Request schema: `user_goal`, optional `current_config`, optional `context`.
  - Response schema: `session_id`, `user_goal`, `status`, `current_step`, `target_mode`, `steps`, `draft`, `compile_response`, `compile_payload`, `approval_required`, `approved_action`, `workflow_result`, `analysis`, `recommendation`, `warnings`, `unsupported`, `validation_errors`, `audit_log`.
- `POST /api/ai/research-approve`
  - Request schema: `state` (prior `ResearchGraphResponse`) and `approved_action` (`run_backtest`, `run_grid_search`, or `run_walk_forward`).
  - Response schema: same sanitized `ResearchGraphResponse` shape as `research-plan`.

The API normalizes ticker case and validates strategy parameters with Pydantic.
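
As a rough illustration of the `POST /api/backtest` request shape, the sketch below builds (but does not send) a request with the standard library. The field names come from the schema listed above and the base URL is the documented local default; every value is a placeholder, and in particular the `strategy` id and `position_size_method` string are assumptions that should be checked against `GET /api/strategies` and the Pydantic schemas. `build_backtest_request` is a hypothetical helper.

```python
import json
from urllib.request import Request

API_URL = "http://localhost:8000"  # documented local default


def build_backtest_request(payload, base_url=API_URL):
    """Build a JSON POST request for /api/backtest without sending it."""
    return Request(
        url=f"{base_url}/api/backtest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "ticker": "spy",  # lowercase on purpose: the API normalizes ticker case
    "start_date": "2020-01-01",
    "end_date": "2021-01-01",
    "strategy": "momentum",  # placeholder id, not confirmed by the document
    "initial_cash": 10000,
    "commission_rate": 0.001,
    "slippage_bps": 5,
    "position_size_method": "percent_equity",  # placeholder value
    "position_size_value": 0.1,
    "benchmark": True,
    "parameters": {},  # strategy-specific; validated server-side
}
request = build_backtest_request(payload)
# urllib.request.urlopen(request) would submit it to a running local API.
```
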
Backtest Lab uses Next.js App Router with client-side state in `frontend/app/page.tsx`.

Main component groups:
- `AppShell`, `Sidebar`, `TopBar`
  - Full-screen application frame and run context.
- `BacktestForm`, `GridSearchForm`, `WalkForwardForm`
  - Controlled right-side configuration panels for single-run and research workflows.
- `ai-builder/*`
  - Natural-language prompt panel, prompt templates, generated strategy preview, assumptions/warnings display, compile handoff, and reproducibility JSON.
- `research-copilot/*`
  - Natural-language research goal panel, graph step timeline, payload preview, explicit approval card, workflow summary, deterministic analysis, warnings/errors display, and safe load-into-form handoff.
- `ResultsDashboard`
  - Run hero, KPI cards, chart stack, and tab orchestration.
- `GridSearchResults`, `WalkForwardResults`
  - Research result views for leaderboard, heatmap, robustness warnings, fold tables, and export actions.
- `EquityChart`, `DrawdownChart`
  - Recharts charts with dark financial styling and custom tooltips.
- `ResultsTabs`, `TradeTable`
  - Summary, trades, metrics, richer risk analytics, exports, and parameters views.
- `EmptyState`, `LoadingSkeleton`, `ErrorState`
  - Non-happy-path dashboard states.
- `formatters`
  - Shared currency, percent, number, decimal, and date formatting.

The design system lives mostly in Tailwind classes plus `frontend/app/globals.css` CSS variables. Numeric values use a mono font stack through `font-mono-finance`.

- yfinance is used for historical market data.
- FastAPI serves the local API.
- The Next.js frontend calls `NEXT_PUBLIC_API_URL`, defaulting to `http://localhost:8000`.
- No database, auth provider, broker API, payment system, paid data feed, or live trading integration is present.
- The AI Strategy Builder uses a deterministic fake provider by default. Optional real provider support uses server-side OpenAI-compatible chat completion calls or a LangChain structured-output adapter only when `BACKTESTER_AI_PROVIDER` and server-side credentials are configured.
- OpenRouter is supported as a first-class backend AI provider with `BACKTESTER_AI_PROVIDER=openrouter`. It calls `POST https://openrouter.ai/api/v1/chat/completions` by default, uses bearer auth from `BACKTESTER_AI_API_KEY`, defaults to `tencent/hy3-preview:free`, and can send backend-only attribution headers from `BACKTESTER_AI_APP_NAME` and `BACKTESTER_AI_APP_URL`.
- LangChain is supported as an optional backend provider with `BACKTESTER_AI_PROVIDER=langchain_openai_compatible`. It reuses `BACKTESTER_AI_MODEL`, `BACKTESTER_AI_API_KEY`, `BACKTESTER_AI_BASE_URL`, and `BACKTESTER_AI_TIMEOUT_SECONDS`, requires the optional `langchain-openai` dependency, invokes `ChatOpenAI.with_structured_output(StrategyDraft)`, and still returns through the existing normalization and validation boundary.

- Python dependencies are in `requirements.txt` and `pyproject.toml`; optional LangChain provider dependencies are in the `ai-langchain` extra and `requirements-ai-langchain.txt`.
- LangGraph powers the backend-only Research Copilot graph and is listed with backend Python dependencies. The graph module imports it lazily; missing installations do not affect the rest of the backend, and direct graph construction reports a sanitized dependency error.
- Tests are configured in `pyproject.toml` with `testpaths = ["tests"]`.
- Mypy is configured strict for Python 3.11 in `pyproject.toml`.
- Frontend dependencies and scripts are in `frontend/package.json`.
- Frontend optional env file: `frontend/.env.local`, based on `frontend/.env.example`.
- API CORS currently allows:
  - `http://localhost:3000`
  - `http://127.0.0.1:3000`
- Additional API CORS origins can be configured with comma-separated `BACKTESTER_CORS_ORIGINS`.
- AI Builder backend env vars:
  - `BACKTESTER_AI_ENABLED=true|false`
  - `BACKTESTER_AI_PROVIDER=fake|deepseek|openrouter|openai_compatible|langchain_openai_compatible`
  - `BACKTESTER_AI_MODEL`
  - `BACKTESTER_AI_API_KEY`
  - `BACKTESTER_AI_BASE_URL`
  - `BACKTESTER_AI_TIMEOUT_SECONDS`
  - `BACKTESTER_AI_APP_NAME`
  - `BACKTESTER_AI_APP_URL`
- OpenRouter defaults:
  - `BACKTESTER_AI_MODEL=tencent/hy3-preview:free`
  - `BACKTESTER_AI_BASE_URL=https://openrouter.ai/api/v1`
  - `BACKTESTER_AI_APP_NAME=Backtest Lab`
- AI provider keys are backend-only. The frontend receives draft statuses, warnings, unsupported items, and validation errors, never API keys.
- CI is `.github/workflows/ci.yml`; it installs Python requirements, runs `python -m pytest`, runs `python -m mypy backtester`, installs frontend dependencies with `npm ci`, runs `npm audit`, runs `npm run lint`, runs `npm run typecheck`, and runs `npm run build`.

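
One plausible way to read the comma-separated `BACKTESTER_CORS_ORIGINS` variable mentioned above is sketched here. `cors_origins` is a hypothetical helper, and the choices to trim whitespace, skip empty entries, and append to the two built-in origins without duplicates are assumptions about behavior the document does not spell out.

```python
DEFAULT_ORIGINS = ["http://localhost:3000", "http://127.0.0.1:3000"]


def cors_origins(env):
    """Merge the default origins with extras from a mapping such as os.environ."""
    raw = env.get("BACKTESTER_CORS_ORIGINS", "")
    merged = list(DEFAULT_ORIGINS)
    for origin in (part.strip() for part in raw.split(",")):
        if origin and origin not in merged:
            merged.append(origin)
    return merged
```

Calling `cors_origins(os.environ)` at startup would produce the list to hand to the CORS middleware configuration.
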
- No domain-specific backtesting or finance metrics libraries are used.
- Strategies use the full DataFrame plus `current_index` for speed; look-ahead prevention is a strategy contract.
- Multi-asset backtests use inner-join date alignment for simplicity and predictable shared indexing.
- Rejected orders return `None`; rejection is normal simulation behavior.
- Cash is rounded to cents after trades; production-grade accounting would likely use `Decimal`.
- Backtest Lab is deliberately a single-asset API client even though the Python engine supports multi-asset backtests.
- Frontend validation improves UX but does not replace API/Pydantic validation.
- Robustness scoring uses transparent, deterministic heuristics only. It flags sparse trades, severe drawdowns, failed combinations, benchmark underperformance, and concentrated parameter performance; it is not ML and not a guarantee of strategy quality.
- AI strategy drafts are never executable code. Real-provider output is treated as untrusted JSON, may pass through only limited deterministic normalization, and must pass Pydantic schema validation plus `validator.py`; unexpected fields, raw-code fields, unsupported indicators/operators, unsupported strategy kinds, broker execution, live trading, intraday minute bars, options flow, sentiment feeds, filesystem/code loading, and multi-asset portfolios are surfaced as unsupported or clarification-needed for the v1 builder. OpenRouter support does not change this flow: draft JSON is validated, compiled only into existing API request payloads, and never executed automatically.
- The Research Copilot graph preserves that same boundary. It can resume and run one existing workflow only after explicit matching approval. The API and frontend use request/response state passing only: no server-side session persistence, auth, database, broker integration, generated-code execution, or frontend API-key handling is added. Because the browser returns state to the approval endpoint, the backend treats the compiled payload as untrusted and validates it again before execution.
- Backtest Lab favors the existing stack: Next.js, TypeScript, Tailwind CSS, Recharts, and small local components instead of heavy UI libraries.

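
Two of the decisions above, rejected orders returning `None` and cash rounded to cents, can be illustrated with a small buy-side function. This is a sketch under stated assumptions and not `Portfolio.execute_order()`: `execute_buy` is a hypothetical name, slippage is assumed to worsen the fill price by `slippage_bps` basis points, and commission is assumed to be `commission_rate` times the filled notional.

```python
def execute_buy(cash, price, quantity, commission_rate, slippage_bps):
    """Return (new_cash, fill_price, commission), or None if the order is rejected."""
    if quantity <= 0:
        return None
    # Assumed model: a buy fills slightly above the quoted price.
    fill_price = price * (1 + slippage_bps / 10_000)
    notional = fill_price * quantity
    commission = notional * commission_rate
    total_cost = notional + commission
    if total_cost > cash:
        return None  # rejection is a normal outcome, not an exception
    # Round cash to cents after the trade, mirroring the float-based approach.
    return round(cash - total_cost, 2), fill_price, commission
```

Swapping the floats for `decimal.Decimal` with an explicit quantize step would be the natural change if exact accounting were required.
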
- Whether to expose multi-asset backtesting in API, CLI, and Backtest Lab.
- Whether generated dashboard screenshots should ever be committed; current policy is to regenerate them on demand.
- Whether CLI should expose multi-asset backtesting.
- Whether live yfinance examples should be replaced with fully synthetic defaults for all demo paths.