A TensorFlow re-implementation of FinGAN (Vuletic, Prenzel, Cucuringu) with a deterministic LSTM-Fin baseline. The pipeline trains conditional return-forecasting models over US equities and sector ETFs, then evaluates them with the financial loss combinations and trading-style metrics from the original paper.
Reference PyTorch implementation: https://github.com/milenavuletic/Fin-GAN
src/
├── main.py                           # CLI entry point
├── models/
│   ├── generator.py                  # LSTM + dense head, noise + condition → next return
│   ├── discriminator.py              # PnL/MSE/SR/STD/BCE-aware discriminator
│   ├── LSTM.py                       # deterministic LSTM-Fin baseline
│   └── utils.py
├── libraries/
│   ├── data/
│   │   ├── loader.py                 # Nasdaq Data Link fetch + ArcticDB cache
│   │   └── preprocessing.py          # excess-return construction, train/val/test split
│   └── training/
│       ├── fin_timegan_trainer.py    # per-ticker FinGAN training over loss combos
│       ├── fin_timegan_universal.py  # shared model over a ticker pool
│       ├── lstm_fin_trainer.py       # deterministic LSTM-Fin pipeline
│       └── plots.py
├── utils/
│   ├── evaluation_fintimegan.py      # Sharpe ratio, hit rate, correlation, sanity checks
│   └── lstm_eval.py
└── database/arctic/                  # local ArcticDB cache (created on first run)
Run outputs (models, plots, results CSVs, run notes) land under src/runs/<label>/.
Prices are pulled from Nasdaq Data Link (SHARADAR/SEP for stocks, SHARADAR/SFP for ETFs) on first use and cached in a local ArcticDB at src/database/arctic/.
You must supply your own Nasdaq Data Link API key. Open src/libraries/data/loader.py and replace the value of ndl.ApiConfig.api_key with your own key before running the pipeline. Without a valid key, the first fetch for any uncached ticker will fail.
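If you would rather not commit a key to the source file, one option is to read it from the environment. The snippet below is a minimal sketch and not part of this repository: the helper `resolve_api_key` is hypothetical, and the variable name `NASDAQ_DATA_LINK_API_KEY` is an assumption chosen for this example.

```python
import os

# Assumed environment variable name for this sketch; pick any name and use it consistently.
ENV_VAR = "NASDAQ_DATA_LINK_API_KEY"


def resolve_api_key(environ=os.environ):
    """Return the API key found in `environ`, or raise a clear error if it is missing."""
    key = environ.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {ENV_VAR} before fetching uncached tickers.")
    return key


# Hypothetical use inside src/libraries/data/loader.py, replacing the hard-coded value:
#   ndl.ApiConfig.api_key = resolve_api_key()
```

With this in place, a missing key fails immediately with an explicit message rather than at the first fetch of an uncached ticker.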
- FinGAN: conditional GAN. Generator: noise + length-`l` condition window → next return via an LSTM and two dense layers. Discriminator is trained against any combination of PnL, MSE, Sharpe-ratio, std, and BCE objectives.
- LSTM-Fin: deterministic baseline mirroring Vuletic's `LSTM` class. Same condition window, no noise input.
Loss combinations evaluated by default (one model is trained per combination):
PnL MSE STD, PnL, PnL MSE, PnL MSE SR, PnL SR,
PnL STD, SR, SR MSE, MSE, BCE
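Several of these combinations include a PnL term. The PnL of a sign-based strategy is not differentiable in the forecast, so a smooth surrogate is needed for gradient training. The sketch below assumes the surrogate is `tanh(tanh_coeff * forecast)` in place of `sign(forecast)`, which is one plausible reading of the `--tanh_coeff` flag. The function names are illustrative and do not come from this codebase.

```python
import math


def hard_pnl(forecasts, realized):
    """PnL of a sign strategy: long when the forecast is positive, short when negative."""
    def sign(x):
        return (x > 0) - (x < 0)
    return sum(sign(f) * r for f, r in zip(forecasts, realized))


def soft_pnl(forecasts, realized, tanh_coeff=100.0):
    """Smooth surrogate (assumed form): tanh(k * forecast) stands in for sign(forecast)."""
    return sum(math.tanh(tanh_coeff * f) * r for f, r in zip(forecasts, realized))


forecasts = [0.012, -0.008, 0.0001]
realized = [0.010, -0.004, -0.002]
print("hard PnL:", hard_pnl(forecasts, realized))
print("soft PnL:", soft_pnl(forecasts, realized))
```

A larger coefficient pushes the surrogate closer to the hard sign, at the cost of steeper gradients near zero forecasts.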
From the src/ directory:
# Per-ticker FinGAN over multiple seeds
python main.py --model fingan --train_mode single \
--tickers AMZN HD KO --seeds 10 20 40
# Universal FinGAN over a preset ticker pool
python main.py --model fingan --train_mode universal \
--experiment xlp --seed 42
# Deterministic LSTM-Fin baseline
python main.py --model lstm --tickers AMZN HD KO --seed 42
Universal experiment presets (--experiment): full, xlp, xlp_etf.
| flag | default | meaning |
|---|---|---|
| --l | 10 | condition window length |
| --pred | 1 | prediction horizon |
| --tr / --vl / --h | 0.8 / 0.1 / 1 | train/val split, holdout |
| --hidden_dim / --noise_dim | 8 / 8 | network widths |
| --n_epochs / --ngrad | 100 / 25 | training / GradientCheck epochs |
| --batch_size / --lr | 100 / 1e-4 | optimisation |
| --diter / --tanh_coeff | 1 / 100 | discriminator iters, soft-sign coefficient |
| --eval_samples | 1000 | MC samples at evaluation time |
| --jit_compile | off | enable XLA |
| --plot / --no-plot | on | save per-run PNGs |
| --notes_file | auto | path for run_notes.txt |
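To make `--l`, `--pred`, `--tr`, and `--vl` concrete, here is a minimal sketch of how a return series could be cut into (condition window, target) pairs and split chronologically. It is an illustration under assumptions, not the code in `preprocessing.py`: the function names are hypothetical, the split is applied to the windowed pairs, and `--h` is left out because its exact semantics are not described here.

```python
def make_windows(returns, l=10, pred=1):
    """Pair each length-l condition window with the return `pred` steps after its end."""
    pairs = []
    for start in range(len(returns) - l - pred + 1):
        condition = returns[start:start + l]
        target = returns[start + l + pred - 1]
        pairs.append((condition, target))
    return pairs


def chronological_split(pairs, tr=0.8, vl=0.1):
    """Split in time order (no shuffling): first `tr` train, next `vl` validation, rest test."""
    n_train = int(len(pairs) * tr)
    n_val = int(len(pairs) * vl)
    train = pairs[:n_train]
    val = pairs[n_train:n_train + n_val]
    test = pairs[n_train + n_val:]
    return train, val, test


returns = [0.001 * i for i in range(30)]  # toy series standing in for excess returns
train, val, test = chronological_split(make_windows(returns))
print(len(train), len(val), len(test))
```

Keeping the split chronological avoids leaking future returns into training, which matters for any forecasting evaluation.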
Each run writes under src/runs/<label>/:
- `models/`: trained weights per loss combo
- `plots/`: overlays, correlation heatmaps, gradient-norm and intraday/overnight charts
- `results/`: per-seed and aggregated metric CSVs (Sharpe ratio, hit rate, correlation, sanity-check shuffles)
- `pnl/`: PnL series
- `run_notes.txt`: config + summary snapshot
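For readers unfamiliar with the reported metrics, the following sketch shows one common way to compute them from forecasts and realized returns. It is not taken from `evaluation_fintimegan.py`. The function names are hypothetical, the point forecast is assumed to be the mean of the Monte Carlo samples, and the Sharpe ratio is assumed to be annualised with 252 periods per year.

```python
import math
import statistics


def point_forecast(samples):
    """Collapse Monte Carlo samples for one day into a single forecast (assumed: the mean)."""
    return statistics.fmean(samples)


def strategy_pnl(forecasts, realized):
    """Daily PnL of trading the sign of each forecast against the realized return."""
    def sign(x):
        return (x > 0) - (x < 0)
    return [sign(f) * r for f, r in zip(forecasts, realized)]


def sharpe_ratio(pnl, periods_per_year=252):
    """Annualised mean/std of the PnL series; NaN if the series has no variation."""
    sd = statistics.stdev(pnl)
    if sd == 0:
        return float("nan")
    return math.sqrt(periods_per_year) * statistics.fmean(pnl) / sd


def hit_rate(forecasts, realized):
    """Fraction of days on which the forecast and the realized return share a sign."""
    hits = sum(1 for f, r in zip(forecasts, realized) if f * r > 0)
    return hits / len(forecasts)


forecasts = [0.01, -0.02, 0.03, -0.01]
realized = [0.02, 0.01, 0.01, -0.03]
pnl = strategy_pnl(forecasts, realized)
print("hit rate:", hit_rate(forecasts, realized))
print("Sharpe ratio:", sharpe_ratio(pnl))
```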
Required Python packages: tensorflow, numpy, pandas, matplotlib, tqdm, nasdaqdatalink, arcticdb.