Youngjoong Kim, Duhoe Kim, Woosung Kim, Jaesik Park†
Seoul National University
† Corresponding author.
This repository has been tested in the following conda environments:
- Python 3.12.11, CUDA 11.8, NVIDIA RTX 3090
- Python 3.12.12, CUDA 12, NVIDIA A100
- Python 3.12.12, CUDA 12.9, NVIDIA H200
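
To see which of the tested environments a machine most closely matches, a short check like the one below may help. It is a minimal sketch, not part of the repository: the helper name `describe_environment` is hypothetical, and it assumes PyTorch is the framework pulled in by the dependencies. If PyTorch is not installed yet, the CUDA and GPU fields are simply left as `None`.

```python
import platform


def describe_environment():
    """Collect the Python version and, if PyTorch is importable, CUDA details.

    Hypothetical helper for illustration; not part of this repository.
    """
    info = {
        "python": platform.python_version(),
        "cuda": None,
        "gpu": None,
    }
    try:
        # Assumption: PyTorch is the deep learning framework in use.
        import torch
    except ImportError:
        # PyTorch is not installed yet, so only the Python version is known.
        return info

    # CUDA version PyTorch was built against (None for CPU-only builds).
    info["cuda"] = torch.version.cuda
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running the script before and after installing the dependencies gives a quick way to compare the local setup with the Python and CUDA versions listed above.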
See requirements.txt for Python dependencies. The default versions have been tested in the RTX 3090 environment, and the commented versions have been tested in the H200 environment.
pip install -r requirements.txt

- Download the required materials from Hugging Face:
- Pretrained weights
- VAE checkpoints and latent statistics
- Reference files for FID calculation
| Checkpoint | Network | Steps | FID50K |
|---|---|---|---|
| 2026.02.15KST14.22.08-base4 | FlowMapTiT-B/4 (SD-VAE, TrigFlow) | 400K | 14.58 |
| 2026.01.18KST19.26.11-xlarge1 | ADiT-XL/1 (VA-VAE, Linear) | 600K | 2.30 |
- Place the downloaded `outputs` and `buffers` directories at the top level of the repository.
- Preprocess the dataset (skip this step if you are not training networks):
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch \
--num_processes 8 \
--num_machines 1 \
--mixed_precision bf16 \
data.py \
--data_path /data/imagenet-1k/ILSVRC2012_img_train \
--config sdvae_f8c4 \
--image_size 256 \
--batch_size 20 \
--num_workers 8

Train few-step models with improved self-distillation:
# Use the training script.
bash train.sh
# Or run the training command directly.
CUDA_VISIBLE_DEVICES=1 accelerate launch \
--dynamo_backend=no \
--num_processes=1 \
--num_machines=1 \
--mixed_precision=bf16 \
train.py \
--config ./configs/base4.yaml

Evaluate models by calculating FID and Inception Score:
bash eval.sh

This repository is released under the non-commercial research and educational use license in LICENSE. It is a source-available research release and is not distributed under an OSI-approved open-source license.
The improved Self-Distillation research license applies only to the original code in this repository that is owned by the authors. Third-party software, models, weights, and datasets remain under their own license terms. See THIRD_PARTY.md for a concise inventory.
This repository is mainly based on LINs-Lab/UCGM, LTH14/JiT, and hustvl/LightningDiT. We thank the authors for their excellent projects.
@misc{kim2026stabilizingconsistencytrainingflow,
title={Stabilizing Consistency Training: A Flow Map Analysis and Self-Distillation},
author={Youngjoong Kim and Duhoe Kim and Woosung Kim and Jaesik Park},
year={2026},
eprint={2601.22679},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2601.22679},
}