191 changes: 172 additions & 19 deletions README.md

# DeepMM
DeepMM (Deep dive Market Making) module is a Python package for backtesting, analyzing and optimizing trading
strategies. It includes a number of pre-implemented strategies, but it is also possible to create new strategies, as
well as to combine them.

# Introduction

In this work, we implemented a variety of strategies for market making around the mid-price.
Specifically, we implemented the Avellaneda-Stoikov market-making model [1]. This model allowed us to adjust quotes for
inventory and volatility in a much more profitable way. A detailed implementation of this model is given in the file
`src/strategy/asmodel.py`.

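For orientation, the model's core quoting rule can be sketched as follows. This is a generic illustration of the Avellaneda-Stoikov formulas, not the repository's implementation in `src/strategy/asmodel.py`; the order-arrival intensity `k` is a standard model parameter assumed here for the example.

```python
import math

def as_quotes(mid, inventory, gamma, sigma, time_left, k):
    """Avellaneda-Stoikov reservation price and optimal bid/ask quotes.

    mid: current mid-price; inventory: signed position q;
    gamma: risk aversion; sigma: volatility; time_left: T - t;
    k: order-arrival intensity (liquidity parameter).
    """
    # Reservation price skews quotes away from the mid when holding inventory.
    reservation = mid - inventory * gamma * sigma**2 * time_left
    # Optimal total spread around the reservation price.
    spread = gamma * sigma**2 * time_left + (2.0 / gamma) * math.log(1.0 + gamma / k)
    return reservation - spread / 2.0, reservation + spread / 2.0

# With a positive inventory, both quotes shift below the mid-price.
bid, ask = as_quotes(mid=1000.0, inventory=5, gamma=0.5, sigma=2.0, time_left=0.5, k=1.5)
```

Note how a long position (positive `inventory`) lowers the reservation price, encouraging the market maker to sell down the position.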
# Implementation

In this section, we demonstrate steps to install and run the project and reproduce the experimental results.

## Installation

This section guides you through installing the necessary software to run the pipeline.

For detailed and advanced user guides, please refer to our documentation:
<details>
<summary>Details</summary>
<ul>
<li> <a href='./docs/en/user_guides/installation.md'>Installation</a></li>
<li> <a href='./docs/en/user_guides/preparing_dataset.md'>Data preparation</a></li>
<li> <a href='./docs/en/user_guides/run.md'>Run</a></li>
<li> <a href='./docs/en/user_guides/papertrading.md'>Papertrading</a></li>
</ul>
</details>
### Clone the GitHub repo
The GitHub repo can be cloned by running
```bash
git clone git@github.com:algotrade-research/deepmm.git
```
Please make sure you have the correct credentials and privileges to run that command.

Alternatively, download the zip file from the repo's main page: `https://github.com/algotrade-research/deepmm`

We also provide the experiment as a [Colab notebook](https://colab.research.google.com/drive/1gnMGsCedhIbKEm4xRO7utDPsQAFcxTXm?usp=sharing).
### Create a virtual environment

This project uses Python 3.12. Make sure you are using Python 3.12 when creating the virtual environment.

Run the following to create a virtual environment.
```bash
python -m venv .venv
```
where `.venv` is the folder containing the virtual environment. Refer to this [tutorial](https://docs.python.org/3/library/venv.html) for more information.

Then activate the environment by running:
```bash
source .venv/bin/activate
```

Then install the required packages with `pip` from the `requirements.txt` file by running:
```bash
pip install -r requirements.txt
```

### Installing Plutus
The Plutus source code (in zip file) can be downloaded [here](https://drive.google.com/file/d/1O6i_B6EhxJ1EijGl0MHNPhykG21QD1Zz/view?usp=drive_link).

After downloading the source code, the Plutus package can be installed by running:
```bash
pip install path/to/the/zip/file/plutus-0.0.1.zip
```

Now the project can be run.

## Data

### Data Format
deepmm works with data formatted like this:
```
datetime,price,tickersymbol
```
Here's what each column means:
- `datetime`: the timestamp of the observation.
- `price`: the price of the asset at that timestamp.
- `tickersymbol`: the symbol of the asset (e.g., VN30F2304).
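As a quick illustration, rows in this format can be parsed with the Python standard library alone. The sample rows below are hypothetical; real data lives in files such as `datasetATDB/train.csv`.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample in the deepmm data format.
sample = """datetime,price,tickersymbol
2023-04-03 09:00:01,1045.3,VN30F2304
2023-04-03 09:00:02,1045.5,VN30F2304
"""

def load_ticks(fh):
    """Parse rows of (datetime, price, tickersymbol) into typed tuples."""
    reader = csv.DictReader(fh)
    return [
        (datetime.fromisoformat(row["datetime"]), float(row["price"]), row["tickersymbol"])
        for row in reader
    ]

ticks = load_ticks(io.StringIO(sample))
```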

### Downloading Data from Algotrade Public Datasets

A dataset curated by Algotrade is available for download through this [link](https://drive.google.com/drive/folders/1ZJzFUcxd5mdt8MA9r7lhx3MY1zw1uqdY?usp=sharing). It contains a large number of data points designed to serve as a benchmark for scientific experimentation.

Unzip the dataset into the source root under the folder `datasetATDB`. Inside the folder, there should be three CSV files: `train.csv`, `test.csv`, `val.csv`.

## Configuration File

### General Parameters

* fee (float): Transaction fee charged per trade (default: 0.125 points).
* save_dir (str): Directory where the pipeline's output files are saved (e.g., "runs/market_making").
* is_optimization (bool): Whether the pipeline runs in optimization mode (hyper-parameter tuning).

### Market Making Specific Parameters
* maximum_inventory (int): Maximum number of underlying assets the pipeline will hold at any given time, to limit risk (e.g., 35).
* num_of_spread (float): Multiplier applied to the spread when quoting bid and ask prices around the current price (e.g., if the current price is 10, the spread is 3, and num_of_spread is 2.0, the bid and ask will be 4 and 16, respectively).
* gamma (float): Risk aversion parameter used in the Avellaneda-Stoikov model (higher values indicate lower risk tolerance).
* historical_window_size (int): Number of days of historical data used to calculate the underlying asset's volatility.
* min_second_time_step (int): Minimum time (in seconds) between order updates placed by the pipeline.
* close_at (str): Time of day to stop retrieving prices and release any remaining inventory. (default: "14:20:45")
* start_at (str): Time of day to begin retrieving prices and actively participate in the market. (default: "09:00:00")

### Dataset

* TRAIN, VAL, TEST: Definitions for training, validation, and testing datasets used by the pipeline.
* csv_file: Path to the CSV file containing the market data for each set.

### Custom Flags for Pipeline Configuration
This section explains how to modify specific pipeline parameters at runtime using command-line flags. These flags override the defaults defined in the configuration file.

Example:
```bash
python run.py -c configs/parameters/pseudo_marketmaking.yaml -o PIPELINE.params.save_dir='new_exp'
```

In this example:

* `python run.py`: This launches the Python script (`run.py`) that executes the pipeline.
* `-c configs/parameters/pseudo_marketmaking.yaml`: This flag specifies the configuration file (`pseudo_marketmaking.yaml`) containing the default pipeline parameters.
* `-o PIPELINE.params.save_dir='new_exp'`: This is the custom flag denoted by `-o`. It overrides the default value for the `save_dir` parameter within the `PIPELINE.params` section of the configuration file. Here, we're setting the new directory to `new_exp`.

**Note**: Essentially, you can use the `-o` flag followed by the desired parameter path and its new value to customize specific parameters on the fly without modifying the configuration file itself.
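The override mechanism can be pictured as walking a dotted key path into a nested configuration dictionary. The sketch below is an illustration of the idea, not the project's actual parser; for simplicity it treats every override value as a string.

```python
def apply_override(config, assignment):
    """Apply one "a.b.c=value" override to a nested dict in place."""
    path, _, raw = assignment.partition("=")
    keys = path.split(".")
    node = config
    # Walk down to the parent of the final key, creating levels as needed.
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = raw.strip("'\"")  # strip surrounding quotes, keep as string
    return config

cfg = {"PIPELINE": {"params": {"save_dir": "runs/market_making"}}}
apply_override(cfg, "PIPELINE.params.save_dir='new_exp'")
```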

## Training Pipeline

Run the Training Script with the following command:
```bash
python run.py -c configs/parameters/pseudo_marketmaking.yaml
```
This command launches the `run.py` script, performs optimization (brute-forcing all possible combinations of the
parameters specified in the `OPTIMIZER` section), and uses the best parameter combination (highest Sharpe ratio) on the
training and validation data.

Note that hyper-parameter optimization can take a very long time, depending on the data size and the number of possible
hyper-parameter combinations. The default brute-force search takes about one week of continuous running on a 2019
MacBook Pro. To reduce the running time, one can reduce the number of candidate values for each hyper-parameter in the
`OPTIMIZER` section.

After the optimization phase, one can use the best parameters picked by the optimizer (best Sharpe ratio on the
training data) for the subsequent experimental results (back-testing and paper trading).
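Conceptually, the brute-force optimization described above amounts to the sketch below, where `backtest` stands in for running the strategy on the training data (it is a placeholder, not a function from this repository), and the Sharpe ratio is computed without annualization for simplicity.

```python
import itertools
import statistics

def sharpe_ratio(returns):
    """Mean return over standard deviation (no annualization, for illustration)."""
    return statistics.mean(returns) / statistics.stdev(returns)

def grid_search(grid, backtest):
    """Brute-force every parameter combination; keep the highest-Sharpe one.

    grid: dict mapping a parameter name to its list of candidate values.
    backtest: callable mapping a parameter dict to a list of returns.
    """
    best_params, best_sharpe = None, float("-inf")
    for combo in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        sharpe = sharpe_ratio(backtest(params))
        if sharpe > best_sharpe:
            best_params, best_sharpe = params, sharpe
    return best_params, best_sharpe

# Toy stand-in backtest: higher gamma yields a higher mean with the same spread.
best_params, best_sharpe = grid_search(
    {"gamma": [0.1, 0.5, 1.0]},
    lambda p: [p["gamma"] - 0.1, p["gamma"], p["gamma"] + 0.1],
)
```

The number of backtests grows multiplicatively with each added hyper-parameter value, which is why trimming the candidate lists shortens the search so effectively.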

## Back-testing

One can run the algorithm on historical data using the best parameter combination by running the following command:
```bash
python run.py -c configs/parameters/pseudo_marketmaking.yaml -o PIPELINE.params.is_optimization=False
```

In the above command, the parameters are taken from the `PIPELINE` section of `pseudo_marketmaking.yaml`. In addition,
we override the `is_optimization` setting (set to False) to skip the hyper-parameter search in this phase.

The back-testing results can be viewed in the accompanying document (`docs/Algotrade_marketmaking.pdf`, section 3).

## Run paper-trading

In this section, we provide instructions for running paper trading (using real-time price data to make trading
decisions) with our market-making strategy.

### Redis connection setup

Paper trading requires a connection to retrieve real-time prices, so you have to set up a connection to Algotrade's
Redis. Specifically, create a file `configs/usr/redis_account.yaml` with the following information:
```yaml
host: ####
port: ####
password: ####
```
Please contact Algotrade's team to get information about host, port, and password.

### Paper-trading configuration file

An example configuration file is `configs/parameters/papertrading.yaml`. The market-making parameters should be the
parameters found during the training (back-testing) process.

### Run paper trading

One can run paper-trading with the following command:
```bash
python run_papertrading.py -c configs/parameters/papertrading.yaml
```
where `configs/parameters/papertrading.yaml` is the configuration file.

At the moment, the paper-trading result is only viewable in the output of the running script. Ideally, the output
should be captured, analyzed, and presented in a paper-trading report. This limitation is due to the project's time
constraints and is a topic for future improvement.

**Note**: This simulation receives data during Vietnamese trading hours only. If you want to run offline paper-trading simulations with test data, please create a separate test dataset and run the `run.py` script as described in the `run.md` documentation.

# References
- [Avellaneda M. & Stoikov S. (2006). High Frequency Trading in a Limit Order Book](https://www.researchgate.net/publication/24086205_High_Frequency_Trading_in_a_Limit_Order_Book)
8 changes: 1 addition & 7 deletions configs/parameters/pseudo_marketmaking.yaml
@@ -18,13 +18,7 @@ PIPELINE:
start_at: "09:00:00"

OPTIMIZER:
name: "custom_optuna"
params:
n_trials: 50
study_name: "market_making"
storage: "sqlite:///market_making.db"
load_if_exists: True

gamma:
type: "float"
values: [0.1, 0.2, 0.3, 0.4, 0.5, 1.0, 1.5, 2.0, 2.5] # 9 hyperparameters
@@ -47,4 +41,4 @@ DATASET:
VAL:
csv_file: "datasetATDB/val.csv"
TEST:
csv_file: "datasetATDB/test.csv"
File renamed without changes.
51 changes: 0 additions & 51 deletions docs/en/user_guides/config.md

This file was deleted.

39 changes: 0 additions & 39 deletions docs/en/user_guides/installation.md

This file was deleted.

27 changes: 0 additions & 27 deletions docs/en/user_guides/papertrading.md

This file was deleted.

43 changes: 0 additions & 43 deletions docs/en/user_guides/preparing_dataset.md

This file was deleted.
