Crypto Trading Research Repo

Lightweight Python research repository for building and comparing Binance spot crypto strategies. The repo now covers:

data acquisition, cleaning, and return construction
pair selection and pairs backtesting
multi-asset trend-following backtesting
transaction cost modelling with Roll slippage estimates
report-ready tables and figures for the chosen final strategies

Current Final Strategies

Pairs: AVAXUSDT/ICPUSDT
Trend: 4-asset trend strategy on BTCUSDT, ETHUSDT, SOLUSDT, BNBUSDT

The latest cost-adjusted comparison outputs are saved in:

Key Result Note

There are two different performance layers in this repo, and they should not be compared as if they are the same quantity:

The original strategy sweep files such as data/processed/trend/strategy/trend_strategy_parameter_sweep.csv report the strategy-selection results from the original backtest configuration.
The strategy cost files such as data/processed/costs/strategy_comparison.csv report a separate rerun that applies the Roll-model slippage stage and computes both gross and net performance.

Example for the chosen trend strategy:

In the parameter sweep, test_cumulative_return = 0.897441, which is the original net test return under the original sweep setup.
In the strategy comparison, gross_return_pct = 107.21% and net_return_pct = 57.84%.

So:

89.74% is the original sweep net test return
107.21% is the gross test return in the strategy comparison, before Roll slippage
57.84% is the net test return in the strategy comparison, after Roll slippage

These differ because the strategy comparison stage reruns the chosen strategies with asset-level Roll slippage costs and reports gross and net performance separately.

Cost Logic

The transaction cost stage uses the turnover form:

Cost_t = s * sum_i |theta_t^i - theta_{t-1}^i (1 + r_{t-1}^i)|
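
The turnover form above can be sketched in a few lines of plain Python. This is an illustration only, not the code in src/crypto_trader/: the function name `turnover_cost`, the argument layout, and the example numbers are all hypothetical, and `s` is treated as a single proportional cost rate exactly as it appears in the formula.

```python
def turnover_cost(s, theta_now, theta_prev, r_prev):
    """Cost_t = s * sum_i |theta_t^i - theta_{t-1}^i * (1 + r_{t-1}^i)|.

    theta_now / theta_prev are notional exposures per asset (USDT),
    r_prev is the per-asset return that drifted the previous exposure.
    """
    return s * sum(
        abs(now - prev * (1.0 + r))
        for now, prev, r in zip(theta_now, theta_prev, r_prev)
    )


# Hypothetical two-asset example (all numbers are made up for illustration).
cost = turnover_cost(
    s=0.001,
    theta_now=[5000.0, 3000.0],
    theta_prev=[4000.0, 3000.0],
    r_prev=[0.02, -0.01],
)
print(cost)
```

The point of the `(1 + r)` term is that the previous exposure drifts with the market before the rebalance, so only the gap between the target exposure and the drifted exposure is charged as traded notional.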

Implementation details:

theta is notional exposure by asset
unallocated capital stays in USDT cash
profits and losses are reinvested through the live equity state
exposure remains constrained by the fixed gross exposure cap
Roll slippage is estimated per asset from the negative first-order autocovariance of price changes
if the estimated autocovariance is non-negative, the slippage estimate is clipped to zero
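
The last two points can be illustrated with the classic Roll (1984) estimator, which sets the effective spread to 2 * sqrt(-cov) when the first-order autocovariance of price changes is negative. The sketch below is an assumption about the general technique, not the repository's implementation: the name `roll_spread` is hypothetical, and how the repo converts the estimate into a per-asset slippage rate (for example half-spread or relative to price) is not stated here.

```python
from math import sqrt


def roll_spread(prices):
    """Roll effective-spread estimate in price units, clipped to zero."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    if len(changes) < 3:
        raise ValueError("need at least four prices")
    # Pair each price change with the one before it.
    x, y = changes[1:], changes[:-1]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    if cov >= 0:
        # Non-negative autocovariance: the estimator is undefined, so clip to zero.
        return 0.0
    return 2.0 * sqrt(-cov)


# Bid-ask bounce produces negative autocovariance; a steady trend does not.
bouncing = roll_spread([100, 101, 100, 101, 100, 101])
trending = roll_spread([100, 101, 103, 106, 110])
print(bouncing, trending)
```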

The methodology note is saved in:

data/processed/costs/performance_methodology.txt

Report Figures

Combined cost comparison:

Combined gross vs net PnL during the test period:

Roll slippage by asset:

Pairs sensitivity:

Trend sensitivity:

Structure

config/: YAML configuration for the research pipeline
data/raw/binance/: raw Binance OHLCV parquet files
data/raw/risk_free/: raw risk-free parquet files
data/interim/binance/: cleaned and repaired per-symbol parquet files
data/interim/risk_free/: aligned per-step risk-free parquet files
data/processed/binance/: per-symbol parquet files with returns and excess returns
data/processed/pairs/: pairs selection and chosen-pair backtest outputs
data/processed/trend/: trend parameter sweeps and final backtest outputs
data/processed/costs/: Roll slippage summaries, comparison tables, and cost sensitivity outputs
outputs/figures/prices/: cleaned price charts with repaired bars highlighted
outputs/figures/candles/: recent-window candlestick charts
outputs/figures/report/: stacked multi-symbol report charts
outputs/figures/returns/: return time-series charts
outputs/figures/volatility/: rolling-volatility charts
outputs/figures/costs/: slippage and cost-adjusted performance figures
scripts/: runnable project scripts
src/crypto_trader/: research code for config, data, signals, backtests, analysis, and plots
tests/: unit tests for accounting, signal logic, and slippage estimation

Setup

Create a virtual environment and install the package in editable mode:

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -e .

Run The Data Pipeline

After installation, run:

python scripts/run_data_pipeline.py

To regenerate figures only from existing processed parquet files:

python scripts/run_plot_pipeline.py

To plot only one symbol:

python scripts/run_plot_pipeline.py --symbol BTCUSDT

To generate stacked report plots for selected symbols:

python scripts/run_plot_pipeline.py --symbol BTCUSDT --symbol ETHUSDT --symbol BNBUSDT --report

For the dedicated report workflow, with default symbols BTCUSDT, ETHUSDT, and BNBUSDT:

python scripts/run_report_plots.py

To override the default report symbols:

python scripts/run_report_plots.py --symbol BTCUSDT --symbol ETHUSDT --symbol BNBUSDT

The pipeline will:

download Binance spot OHLCV data for the configured symbols
reuse an exact cached raw Binance parquet instead of downloading again when start_date, end_date, and interval already match an existing raw file
save raw Binance parquet files to data/raw/binance/
clean, validate, and repair bad bars before saving cleaned parquet files to data/interim/binance/
download a short-rate proxy, save it to data/raw/risk_free/, and align it to the asset timestamps in data/interim/risk_free/
compute simple returns and lagged-risk-free excess returns
save processed parquet files to data/processed/binance/
save price, candlestick, return, and rolling-volatility plots to outputs/figures/
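
The exact-match cache check can be pictured as a filename lookup built from the NAME_YYYY-MM-DD_YYYY-MM-DD_interval.parquet pattern described under Data Outputs. The sketch below is a guess at the idea, not the pipeline's code: `raw_parquet_path` and `needs_download` are hypothetical helpers, and the dates and interval are example values.

```python
from datetime import date
from pathlib import Path


def raw_parquet_path(raw_dir, name, start_date, end_date, interval):
    """Build the expected raw file path from the documented naming pattern."""
    filename = (
        f"{name}_{start_date.isoformat()}_{end_date.isoformat()}_{interval}.parquet"
    )
    return Path(raw_dir) / filename


def needs_download(raw_dir, name, start_date, end_date, interval):
    """Download only when no file matches start_date, end_date, and interval exactly."""
    return not raw_parquet_path(raw_dir, name, start_date, end_date, interval).exists()


# Example values only; the real dates and interval come from the YAML config.
path = raw_parquet_path(
    "data/raw/binance", "BTCUSDT", date(2023, 1, 1), date(2024, 1, 1), "1h"
)
print(path.name)
```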

Run Strategy Research

Pairs research:

python scripts/run_pairs_research.py

Trend strategy research:

python scripts/run_trend_strategy.py

Trend report plots:

python scripts/run_trend_report_plots.py

Pairs report plot for a saved run:

python scripts/run_pairs_report_plot.py --pair-slug avaxusdt_icpusdt --run-key a00a6c7066b7

Cost analysis and comparison outputs:

python scripts/run_cost_analysis.py

This produces:

Roll slippage summary by asset
cost-adjusted pair and trend summaries
slippage sensitivity tables
combined pair vs trend comparison table
report-ready cost figures

Data Outputs

Raw, interim, and processed parquet files use the naming pattern NAME_YYYY-MM-DD_YYYY-MM-DD_interval.parquet.
Figure files use the matching pattern NAME_YYYY-MM-DD_YYYY-MM-DD_interval_plot-name.png.
The default plotting style is applied project-wide through Matplotlib rcParams and can still be overridden per figure when needed.
Report plots use a consistent per-symbol color mapping across stacked and overlaid comparisons.
Report figures default to single-column LaTeX sizing and can be adjusted through plotting.report_single_column_width, plotting.report_stack_panel_height, and plotting.report_overlay_height.
Raw parquet files contain standardized Binance spot OHLCV data with UTC timestamps and columns: timestamp, open, high, low, close, volume, symbol.
Interim Binance parquet files contain cleaned and repaired OHLCV data after sorting, deduplication, missing-bar handling, OHLC validation, and simple outlier repair, plus a readable datetime_utc column.
Interim risk-free parquet files contain the aligned per-step rate, its lagged rf_{t-1} version, and the underlying annualized source fields used to form excess returns.
Processed Binance parquet files contain the cleaned OHLCV data plus simple_return, the lagged risk_free_rate, and excess_return.
Chosen pair backtests include bar-level PnL, turnover, transaction cost, and equity paths.
Final trend backtests include per-asset positions, turnover, trade IDs, transaction costs, and equity paths.
Cost outputs include gross and net PnL series, cumulative transaction costs, and sensitivity tables at multiple slippage multipliers.

Excess returns are formed as simple_return_t - rf_{t-1} so that the risk-free rate used at timestamp t is information available before the return over period t is realised.
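
A minimal sketch of that lag, using plain lists and made-up numbers. The function name `excess_returns` is hypothetical, and the repository presumably does the equivalent on parquet-backed frames rather than lists.

```python
def excess_returns(closes, rf_per_step):
    """Return simple_return_t - rf_{t-1} for t = 1 .. len(closes) - 1.

    rf_per_step[t] is the per-bar risk-free rate observed at bar t, so the
    value subtracted from the return over bar t is the one known at t - 1.
    """
    out = []
    for t in range(1, len(closes)):
        simple_return = closes[t] / closes[t - 1] - 1.0
        out.append(simple_return - rf_per_step[t - 1])
    return out


# Illustrative inputs only.
result = excess_returns([100.0, 110.0, 99.0], [0.001, 0.002, 0.003])
print(result)
```

With pandas, the same lag is usually written as a one-bar `shift(1)` on the risk-free column before subtracting it from the return column.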

Risk-Free Rate Note

At mid to high frequency, the per-bar risk-free accrual is usually extremely small relative to crypto return volatility, so it is often ignored in practical intraday research. It is still included here for completeness and methodological correctness, especially when comparing results across sampling frequencies.
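
To give a sense of scale, the arithmetic below uses assumed numbers that do not come from this repo's data: a 5% annualized rate, hourly bars on a market that trades around the clock, and a 1% hourly return volatility.

```python
# All three inputs are illustrative assumptions, not values from this repo.
annual_rate = 0.05
bars_per_year = 365 * 24          # hourly bars, 24/7 market
hourly_volatility = 0.01

# Simple (non-compounded) per-bar accrual.
per_bar_rf = annual_rate / bars_per_year
ratio = per_bar_rf / hourly_volatility

print(f"per-bar risk-free accrual: {per_bar_rf:.3e}")
print(f"accrual as a fraction of one-bar volatility: {ratio:.3e}")
```

Under these assumptions the per-bar accrual is several orders of magnitude smaller than a typical one-bar move, which is why the adjustment rarely changes intraday conclusions.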

Latest Comparison

The current combined comparison file is:

data/processed/costs/strategy_comparison.csv

Current headline values:

| Strategy | Gross PnL (USDT) | Net PnL (USDT) | Gross Return | Net Return | Total Transaction Cost (USDT) |
| --- | --- | --- | --- | --- | --- |
| Pairs (`AVAX/ICP`) | 3816.84 | 2573.95 | 38.17% | 25.74% | 1242.89 |
| Trend (`BTC, ETH, BNB, SOL`) | 10720.61 | 5784.21 | 107.21% | 57.84% | 4936.40 |

Those values come from the rerun with Roll-model slippage and therefore differ from the original sweep files that were used to choose the strategies.
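
As a quick sanity check on the headline values, the sketch below confirms that net PnL equals gross PnL minus total transaction cost for both rows. It also backs out the starting capital implied by the gross return; the result of roughly 10,000 USDT is an inference from the reported numbers, not a figure stated in this README. The dictionary keys and the `check_row` helper are hypothetical names.

```python
rows = {
    "pairs": {"gross_pnl": 3816.84, "net_pnl": 2573.95,
              "gross_return_pct": 38.17, "cost": 1242.89},
    "trend": {"gross_pnl": 10720.61, "net_pnl": 5784.21,
              "gross_return_pct": 107.21, "cost": 4936.40},
}


def check_row(row, tol=0.01):
    """Return (is_consistent, implied_starting_capital) for one table row."""
    residual = row["gross_pnl"] - row["cost"] - row["net_pnl"]
    implied_capital = row["gross_pnl"] / (row["gross_return_pct"] / 100.0)
    return abs(residual) < tol, implied_capital


for name, row in rows.items():
    consistent, capital = check_row(row)
    print(name, consistent, round(capital, 2))
```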
