This repository is an official PyTorch implementation of the paper Fast Inference of Visual Autoregressive Model with Adjacency-Adaptive Dynamical Draft Trees.
Autoregressive (AR) image models achieve diffusion-level quality but suffer from sequential inference, requiring approximately 2,000 steps for a 576x576 image. Speculative decoding with draft trees accelerates LLMs yet underperforms on visual AR models; we identify the key obstacle as inconsistent acceptance rates across draft trees, caused by the varying prediction difficulty of different image regions. We propose Adjacency-Adaptive Dynamical Draft Trees (ADT-Tree), which dynamically adjusts draft-tree depth and width by leveraging adjacent token states and prior acceptance rates. ADT-Tree initializes via horizontal adjacency, then refines depth and width via bisectional adaptation, yielding deeper trees in simple regions and wider trees in complex ones.
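The adaptive sizing described above can be sketched as follows. This is a hypothetical illustration, not the repository's actual implementation: the function names, the linear depth/width mapping, and the 0.5 bisection threshold are all assumptions made for the sketch.

```python
# Hypothetical sketch of adjacency-adaptive draft-tree sizing (assumed
# names and rules; not the repository's actual code).

def adapt_tree_shape(neighbor_accept_rates, max_depth=8, max_width=8):
    """Pick (depth, width) from the mean acceptance rate of adjacent tokens.

    High acceptance (simple region)  -> deeper, narrower tree.
    Low acceptance (complex region)  -> shallower, wider tree.
    """
    if not neighbor_accept_rates:
        # No horizontal neighbors decoded yet: fall back to a balanced tree.
        return max_depth // 2, max_width // 2
    rate = sum(neighbor_accept_rates) / len(neighbor_accept_rates)
    depth = max(1, round(rate * max_depth))          # deeper when tokens are easy
    width = max(1, round((1.0 - rate) * max_width))  # wider when tokens are hard
    return depth, width


def bisect_refine(depth, last_round_rate, lo=1, hi=8):
    """One bisection-style refinement step: shrink the depth toward lo if
    the previous round's acceptance rate was poor, grow it toward hi if
    it was good (0.5 threshold is an assumption)."""
    if last_round_rate < 0.5:
        return max(lo, (lo + depth) // 2)
    return min(hi, (depth + hi + 1) // 2)
```

A simple region with neighbor acceptance rates near 1.0 thus gets a deep, narrow tree, while a complex region with rates near 0.0 gets a shallow, wide one.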
The main code is based on the project LANTERN.
We thank the LANTERN team for their contributions to the open-source community.
- [2025-11-28] TODO: Change the eagle tree
- [2025-11-20] 🎉🎉🎉 Our ADT-Tree is released! 🎉🎉🎉
- Paper Portal for Top Conferences in the Field of Artificial Intelligence: CV_Paper_Portal
Below is a comparison of different methods.
Install Required Packages

Requirements:
- Python >= 3.10
- PyTorch >= 2.4.0
Install the dependencies listed in `requirements.txt`:

```shell
git clone https://github.com/Haodong-Lei-Ray/ADT-Tree.git
cd ADT-Tree
conda create -n ADT-Tree python=3.10 -y
conda activate ADT-Tree
pip install -r requirements.txt
```
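After installing, a minimal sanity check that the environment meets the stated requirements (Python >= 3.10, PyTorch >= 2.4.0). This helper is a sketch for convenience, not part of the repository:

```python
# Check the environment against the README's stated version requirements.
import sys


def check_env(py_version=sys.version_info, torch_version=None):
    """Return True if Python >= 3.10 and (if given) torch >= 2.4."""
    ok = tuple(py_version[:2]) >= (3, 10)
    if torch_version is not None:
        # Version strings like "2.4.0" or "2.5.1+cu121" both parse here.
        major, minor = (int(x) for x in torch_version.split(".")[:2])
        ok = ok and (major, minor) >= (2, 4)
    return ok
```

Call it as `check_env(torch_version=torch.__version__)` after importing torch.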
Additional Setup
- Lumina-mGPT
For Lumina-mGPT, we need to install the `flash_attention` and `xllmx` packages:

```shell
pip install flash-attn --no-build-isolation
cd models/base_models/lumina_mgpt
pip install -e .
```
Checkpoints

All model weights and other required data should be stored in `ckpts/`.
Lumina-mGPT

For Lumina-mGPT, since the Chameleon implementation in transformers currently does not contain the VQ-VAE decoder, please manually download the original VQ-VAE weights provided by Meta and put them in the following directory:

```
ckpts
└── lumina_mgpt
    └── chameleon
        └── tokenizer
            ├── text_tokenizer.json
            ├── vqgan.yaml
            └── vqgan.ckpt
```

Also download the original model `Lumina-mGPT-7B-768` from Huggingface 🤗 and put it in the following directory:

```
ckpts
└── lumina_mgpt
    └── Lumina-mGPT-7B-768
        ├── config.json
        ├── generation_config.json
        ├── model-00001-of-00002.safetensors
        └── other files...
```
Anole

For Anole, download `Anole-7b-v0.1-hf`, which is a Huggingface-style converted model from `Anole`. In addition, you should download the original VQ-VAE weights provided by Meta and put them in the following directory:

```
ckpts
└── anole
    ├── Anole-7b-v0.1-hf
    │   ├── config.json
    │   ├── generation_config.json
    │   ├── model-00001-of-00003.safetensors
    │   └── other files...
    └── chameleon
        └── tokenizer
            ├── text_tokenizer.json
            ├── vqgan.yaml
            └── vqgan.ckpt
```

(Optional) Trained drafter

To use a trained drafter, you need to download `anole_drafter` and save it under the `trained_drafters` directory:

```
ckpts
└── anole
    └── trained_drafters
        └── anole_drafter
            ├── config.json
            ├── generation_config.json
            ├── pytorch_model.bin
            └── other files...
```
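Before launching generation, it can be useful to confirm the checkpoint layout is in place. The helper below is an assumption added for illustration (not part of the repo); the file lists mirror the directory trees shown above, omitting the sharded `.safetensors` weights:

```python
# Hypothetical helper to verify the expected checkpoint files exist
# under ckpts/ before running (not part of the repository).
from pathlib import Path

EXPECTED = {
    "anole": [
        "Anole-7b-v0.1-hf/config.json",
        "Anole-7b-v0.1-hf/generation_config.json",
        "chameleon/tokenizer/text_tokenizer.json",
        "chameleon/tokenizer/vqgan.yaml",
        "chameleon/tokenizer/vqgan.ckpt",
    ],
    "lumina_mgpt": [
        "Lumina-mGPT-7B-768/config.json",
        "Lumina-mGPT-7B-768/generation_config.json",
        "chameleon/tokenizer/text_tokenizer.json",
        "chameleon/tokenizer/vqgan.yaml",
        "chameleon/tokenizer/vqgan.ckpt",
    ],
}


def missing_files(ckpt_root, model):
    """Return the expected files that are missing under <ckpt_root>/<model>."""
    root = Path(ckpt_root) / model
    return [rel for rel in EXPECTED[model] if not (root / rel).is_file()]
```

An empty return value means the layout matches; otherwise the list names exactly which files still need to be downloaded.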
ADT-Tree+LANTERN in MSCOCO2017Val
```shell
cd ./ADT-Tree
prompt=MSCOCO2017Val
model=anole
temperature=1
model_type=eagle
lantern_delta=0.5
lantern_k=100
#output_path=/home/leihaodong/TIP26/exp/Anole/MSCOCO2017Val/lantern_ADT-Tree
output_path=<your out path>
mkdir -p ${output_path}
nohup python main.py generate_images \
    --prompt $prompt \
    --model $model \
    --temperature $temperature \
    --model_type $model_type \
    --model_path leloy/Anole-7b-v0.1-hf \
    --drafter_path jadohu/anole_drafter \
    --output_dir $output_path \
    --lantern \
    --peanut \
    --lantern_k $lantern_k \
    --lantern_delta ${lantern_delta} \
    --num_images -1 > ${output_path}.log 2>&1 &
```
ADT-Tree+LANTERN
This project is distributed under the Chameleon License by Meta Platforms, Inc. For more information, please see the LICENSE file in the repository.
This repository is built with extensive reference to FoundationVision/LlamaGen, Alpha-VLLM/Lumina-mGPT and SafeAILab/EAGLE, leveraging many of their core components and approaches.
```bibtex
@misc{lei2025fastinferencevisualautoregressive,
      title={Fast Inference of Visual Autoregressive Model with Adjacency-Adaptive Dynamical Draft Trees},
      author={Haodong Lei and Hongsong Wang and Xin Geng and Liang Wang and Pan Zhou},
      year={2025},
      eprint={2512.21857},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.21857},
}
```

