
Request for Pre-trained Tokenizer Weights (pytorch_model.bin) for Reproducibility #3

@j1ajunzhu

First of all, thank you for sharing your amazing work! The framework is impressive.

While exploring the codebase and the training configurations, I noticed that the config.py files for all stages refer to local paths for the image tokenizer checkpoints, specifically:
pytorch_model.bin (referenced in Stage 1 & 2)
pytorch_model_512.bin (referenced in Stage 3; this one can be found on Hugging Face)

The paths in the config (e.g., /opt/tiger/ju/ckpt/flowtitok_swiglu_bl77_vae/...) suggest these are internal checkpoints. Since the image tokenizer is the "foundation" for the Flow Matching training, it is currently difficult for the community to reproduce the Stage 1/2 pre-training results without pytorch_model.bin.

Could you kindly consider sharing these tokenizer weights (e.g., on Hugging Face or GitHub Releases)?
