ADT-Tree: Fast Inference of Visual Autoregressive Model

This repository is an official PyTorch implementation of the paper Fast Inference of Visual Autoregressive Model with Adjacency-Adaptive Dynamical Draft Trees.

Autoregressive (AR) image models achieve diffusion-level quality but suffer from sequential inference, requiring approximately 2,000 decoding steps for a 576x576 image. Speculative decoding with draft trees accelerates LLMs, yet it underperforms on visual AR models: token prediction difficulty varies across image regions, so acceptance rates are inconsistent from one draft tree to the next. We propose Adjacency-Adaptive Dynamical Draft Trees (ADT-Tree), which dynamically adjusts draft tree depth and width by leveraging adjacent token states and prior acceptance rates. ADT-Tree initializes each tree via horizontal adjacency, then refines its depth and width via bisectional adaptation, yielding deeper trees in simple regions and wider trees in complex ones.
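The exact update rule lives in the paper and the repository code; the snippet below is only a rough illustration of the idea described above. The function name, the depth/width bounds, and the acceptance-rate threshold are all illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: the real ADT-Tree update rule is defined in the paper.
def adapt_draft_shape(neighbor_depth, accept_rate,
                      depth_bounds=(1, 8), width_bounds=(1, 4),
                      threshold=0.5):
    """Pick a (depth, width) for the next draft tree.

    neighbor_depth: accepted draft depth of the horizontally adjacent token
                    (horizontal-adjacency initialization).
    accept_rate:    running acceptance rate observed in this image region.
    """
    lo, hi = depth_bounds
    depth = max(lo, min(hi, neighbor_depth))  # initialize from the neighbor
    # Bisectional adaptation: move depth halfway toward the bound
    # suggested by the observed acceptance rate.
    if accept_rate >= threshold:   # "simple" region: draft deeper, stay narrow
        depth = (depth + hi + 1) // 2
        width = width_bounds[0]
    else:                          # "complex" region: draft shallower but wider
        depth = (depth + lo) // 2
        width = width_bounds[1]
    return depth, width
```

With a neighbor depth of 4, a high acceptance rate bisects toward the maximum depth, while a low one bisects toward the minimum and widens the tree instead.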

The main code is adapted from the LANTERN project.

We thank the LANTERN team for their contributions to the open-source community.


📰 News

  • [2025-11-28] TODO: Change the EAGLE tree
  • [2025-11-20] 🎉🎉🎉 Our ADT-Tree is released! 🎉🎉🎉
  • Paper portal for top conferences in artificial intelligence: CV_Paper_Portal

Method and Performance

Method

Below is a comparison of the effects of different methods.

Performance


⚙️ Installation

  1. Install Required Packages

    Requirements:

    • Python >= 3.10
    • PyTorch >= 2.4.0

    Install the dependencies listed in requirements.txt:

    git clone https://github.com/Haodong-Lei-Ray/ADT-Tree.git
    cd ADT-Tree
    conda create -n ADT-Tree python=3.10 -y
    conda activate ADT-Tree
    pip install -r requirements.txt
  2. Additional Setup

    1. Lumina-mGPT

      For Lumina-mGPT, install the flash-attn and xllmx packages:

      pip install flash-attn --no-build-isolation
      cd models/base_models/lumina_mgpt
      pip install -e .
  3. Checkpoints

    All model weights and other required data should be stored in ckpts/.

    1. Lumina-mGPT

      For Lumina-mGPT, since the current Chameleon implementation in transformers does not include the VQ-VAE decoder, manually download the original VQ-VAE weights provided by Meta and put them in the following directory:

      ckpts
      └── lumina_mgpt
          └── chameleon
              └── tokenizer
                  ├── text_tokenizer.json
                  ├── vqgan.yaml
                  └── vqgan.ckpt
      

      Also download the original Lumina-mGPT-7B-768 model from Huggingface 🤗 and put it in the following directory:

      ckpts
      └── lumina_mgpt
          └── Lumina-mGPT-7B-768
              ├── config.json
              ├── generation_config.json
              ├── model-00001-of-00002.safetensors
              └── other files...
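      A minimal shell sketch for creating the layout above before copying the downloaded files into place. The commented-out huggingface-cli line is one possible way to fetch the model; the repository id is an assumption and should be checked against the official Lumina-mGPT release.

      ```shell
      # Create the expected Lumina-mGPT checkpoint layout.
      mkdir -p ckpts/lumina_mgpt/chameleon/tokenizer
      mkdir -p ckpts/lumina_mgpt/Lumina-mGPT-7B-768
      # Place text_tokenizer.json / vqgan.yaml / vqgan.ckpt under
      # ckpts/lumina_mgpt/chameleon/tokenizer, and the downloaded model
      # files under ckpts/lumina_mgpt/Lumina-mGPT-7B-768. For example
      # (repo id is an assumption, requires huggingface_hub's CLI):
      # huggingface-cli download Alpha-VLLM/Lumina-mGPT-7B-768 --local-dir ckpts/lumina_mgpt/Lumina-mGPT-7B-768
      ```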
      
    2. Anole

      For Anole, download Anole-7b-v0.1-hf, a Hugging Face-style converted model from Anole.

      In addition, download the original VQ-VAE weights provided by Meta and put them in the following directory:

      ckpts
      └── anole
          ├── Anole-7b-v0.1-hf
          |   ├── config.json
          |   ├── generation_config.json
          |   ├── model-00001-of-00003.safetensors
          |   └── other files...
          └── chameleon
              └── tokenizer
                  ├── text_tokenizer.json
                  ├── vqgan.yaml
                  └── vqgan.ckpt
      

      (Optional) Trained drafter: To use a trained drafter, download anole_drafter and save it under the trained_drafters directory.

      ckpts
      └── anole
          └── trained_drafters
              └── anole_drafter
                  ├── config.json
                  ├── generation_config.json
                  ├── pytorch_model.bin
                  └── other files...
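      A minimal shell sketch for creating the Anole layout above. The commented-out huggingface-cli line is one possible way to fetch the weights (it assumes huggingface_hub's CLI is installed); the repo id matches the --model_path used in the Usage section.

      ```shell
      # Create the expected Anole checkpoint layout.
      mkdir -p ckpts/anole/Anole-7b-v0.1-hf
      mkdir -p ckpts/anole/chameleon/tokenizer
      mkdir -p ckpts/anole/trained_drafters/anole_drafter
      # One possible way to fetch the converted model:
      # huggingface-cli download leloy/Anole-7b-v0.1-hf --local-dir ckpts/anole/Anole-7b-v0.1-hf
      # The VQ-VAE weights (text_tokenizer.json / vqgan.yaml / vqgan.ckpt)
      # go under ckpts/anole/chameleon/tokenizer.
      ```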
      

✨ Usage

ANOLE

ADT-Tree+LANTERN on MSCOCO2017Val

cd ./ADT-Tree
prompt=MSCOCO2017Val
model=anole
temperature=1
model_type=eagle
lantern_delta=0.5
lantern_k=100

output_path=<your out path>

mkdir -p ${output_path}

nohup python main.py generate_images \
 --prompt $prompt \
 --model $model \
 --temperature $temperature \
 --model_type $model_type \
 --model_path leloy/Anole-7b-v0.1-hf \
 --drafter_path jadohu/anole_drafter \
 --output_dir $output_path \
 --lantern \
 --peanut \
 --lantern_k $lantern_k \
 --lantern_delta ${lantern_delta} \
 --num_images -1 > ${output_path}.log 2>&1 &

ADT-Tree+LANTERN

⚖️ License

This project is distributed under the Chameleon License by Meta Platforms, Inc. For more information, please see the LICENSE file in the repository.


🙏 Acknowledgement

This repository is built with extensive reference to FoundationVision/LlamaGen, Alpha-VLLM/Lumina-mGPT and SafeAILab/EAGLE, leveraging many of their core components and approaches.

📄 Citation

@misc{lei2025fastinferencevisualautoregressive,
      title={Fast Inference of Visual Autoregressive Model with Adjacency-Adaptive Dynamical Draft Trees}, 
      author={Haodong Lei and Hongsong Wang and Xin Geng and Liang Wang and Pan Zhou},
      year={2025},
      eprint={2512.21857},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.21857}, 
}
