SNU-VGILab/improvedSelfDistillation

Stabilizing Consistency Training: A Flow Map Analysis and Self-Distillation

Youngjoong Kim, Duhoe Kim, Woosung Kim, Jaesik Park
Seoul National University
Corresponding author.

iSD (improved Self-Distillation) sample images

Requirements

This repository has been tested in the following conda environments:

  • Python 3.12.11, CUDA 11.8, NVIDIA RTX 3090
  • Python 3.12.12, CUDA 12, NVIDIA A100
  • Python 3.12.12, CUDA 12.9, NVIDIA H200

See requirements.txt for Python dependencies. The versions enabled by default were tested in the RTX 3090 environment; the commented-out versions were tested in the H200 environment.

pip install -r requirements.txt
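
Before installing, it can help to confirm that the interpreter matches one of the tested setups. The snippet below is a minimal sketch using only the standard library. The helper name `is_tested_python` is hypothetical and not part of this repository, and it compares only the major and minor Python version (3.12) from the list above, not the CUDA toolkit or the GPU.

```python
import sys

# (major, minor) Python versions listed as tested in this README.
TESTED_PYTHON = {(3, 12)}


def is_tested_python(version_info=None):
    """Return True if the major.minor version matches a tested environment.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in TESTED_PYTHON


if __name__ == "__main__":
    status = "tested" if is_tested_python() else "untested"
    print(f"Python {sys.version.split()[0]}: {status} with this repository")
```

An untested version may still work; the check only flags a deviation from the environments listed above.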

Preparation

  1. Download the required materials from Hugging Face:

     Checkpoint                     Network                            Steps  FID50K
     2026.02.15KST14.22.08-base4    FlowMapTiT-B/4 (SD-VAE, TrigFlow)  400K   14.58
     2026.01.18KST19.26.11-xlarge1  ADiT-XL/1 (VA-VAE, Linear)         600K   2.30

  2. Place the downloaded outputs and buffers directories at the top level of the repository.
  3. Preprocess the dataset (skip this step if you are not training networks):
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch \
    --num_processes 8 \
    --num_machines 1 \
    --mixed_precision bf16 \
    data.py \
    --data_path /data/imagenet-1k/ILSVRC2012_img_train \
    --config sdvae_f8c4 \
    --image_size 256 \
    --batch_size 20 \
    --num_workers 8
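
To make the flags in the preprocessing command concrete, the following sketch works through the arithmetic they imply. It rests on assumptions not stated in this README: that `--batch_size` is counted per process, that the `f8c4` suffix in `sdvae_f8c4` denotes a VAE with 8x spatial downsampling and 4 latent channels, and that the training split is the standard ILSVRC2012 one with 1,281,167 images. The function names are hypothetical and not part of `data.py`.

```python
import math


def global_batch_size(num_processes, per_process_batch):
    """Images handled per step across all processes (assumes a per-process batch)."""
    return num_processes * per_process_batch


def latent_shape(image_size, downsample=8, channels=4):
    """Latent tensor shape (C, H, W) under the assumed f8c4 reading."""
    if image_size % downsample != 0:
        raise ValueError("image_size must be divisible by the downsampling factor")
    side = image_size // downsample
    return (channels, side, side)


def num_iterations(num_images, batch):
    """Steps needed for one pass over the dataset, counting a final partial batch."""
    return math.ceil(num_images / batch)


batch = global_batch_size(num_processes=8, per_process_batch=20)
print(batch)                             # 160
print(latent_shape(256))                 # (4, 32, 32)
print(num_iterations(1_281_167, batch))  # 8008
```

If the available memory differs from the tested GPUs, `--batch_size` is the natural value to adjust; under these assumptions it changes only the number of iterations, not the latent shape.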

Usage

Train few-step models with improved self-distillation:

# Use the training script.
bash train.sh

# Or run the training command directly.
CUDA_VISIBLE_DEVICES=1 accelerate launch \
    --dynamo_backend=no \
    --num_processes=1 \
    --num_machines=1 \
    --mixed_precision=bf16 \
    train.py \
    --config ./configs/base4.yaml
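
When moving between machines with different GPU counts, the device list and `--num_processes` have to change together. The sketch below assembles the same `accelerate launch` invocation as the direct command above for an arbitrary GPU list. `build_launch` is a hypothetical helper, not part of this repository; it uses only the flags that appear in that command and assumes one process per visible GPU on a single machine.

```python
import shlex


def build_launch(gpu_ids, config="./configs/base4.yaml"):
    """Return (env, argv) for a single-machine run with one process per GPU."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpu_ids)}
    argv = [
        "accelerate", "launch",
        "--dynamo_backend=no",
        f"--num_processes={len(gpu_ids)}",  # kept in sync with the GPU list
        "--num_machines=1",
        "--mixed_precision=bf16",
        "train.py",
        "--config", config,
    ]
    return env, argv


env, argv = build_launch([1])
# Print the equivalent shell command for inspection before running it.
print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']} {shlex.join(argv)}")
```

To actually start training from Python, the pair could be passed to `subprocess.run(argv, env={**os.environ, **env})`; printing the command first makes it easy to compare against `train.sh`.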

Evaluate models by calculating FID and Inception Score:

bash eval.sh
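
For orientation, FID is the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. The sketch below computes that distance in pure Python for the simplified case of diagonal covariances. It illustrates the metric only and is not the computation performed by `eval.sh`, which this README does not show; the function name and the toy inputs are hypothetical.

```python
import math


def frechet_distance_diag(mu1, var1, mu2, var2):
    """Squared Fréchet distance between two Gaussians with diagonal covariances.

    General form: ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)).
    With diagonal S1 and S2, the trace reduces to a per-dimension sum.
    """
    if not (len(mu1) == len(var1) == len(mu2) == len(var2)):
        raise ValueError("all inputs must have the same length")
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum(
        v1 + v2 - 2.0 * math.sqrt(v1 * v2) for v1, v2 in zip(var1, var2)
    )
    return mean_term + cov_term


# Toy two-dimensional feature statistics, chosen only for illustration.
print(frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [4.0, 1.0]))  # 2.0
```

Real FID implementations use full covariance matrices and a matrix square root, so values from this simplified form are not comparable to the FID50K numbers reported above.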

License

This repository is released under the non-commercial research and educational use license in LICENSE. It is a source-available research release and is not distributed under an OSI-approved open-source license.

The improved Self-Distillation research license applies only to the original code in this repository that is owned by the authors. Third-party software, models, weights, and datasets remain under their own license terms. See THIRD_PARTY.md for a concise inventory.

Acknowledgements

This repository is mainly based on LINs-Lab/UCGM, LTH14/JiT, and hustvl/LightningDiT. We thank the authors for their excellent projects.

Citation

@misc{kim2026stabilizingconsistencytrainingflow,
      title={Stabilizing Consistency Training: A Flow Map Analysis and Self-Distillation},
      author={Youngjoong Kim and Duhoe Kim and Woosung Kim and Jaesik Park},
      year={2026},
      eprint={2601.22679},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2601.22679},
}
