chawuciren11/Sync-R1

Sync-R1 Core Training Logic

This repository snapshot keeps only the current paper-aligned GRPO training path for Sync-R1.

Included Code

The GitHub-facing version intentionally keeps a single public entrypoint and its direct dependencies:

  • train_grpo_paper.py: training launcher
  • grpo_paper.py: GRPO implementation that follows the paper more closely
  • ref_model.py: reference-side rollout helper
  • pdata.py, utils.py, clip_eval.py, glm_api.py: runtime helpers
  • models/, training/, llava/: model and training modules
  • configs/: config files
  • requirements.txt: minimal dependency list

Older draft variants are intentionally excluded to avoid ambiguity for users.

Paths To Fill In

The config files use placeholder paths:

  • path/to/show-o-512x512
  • path/to/show-o
  • path/to/magvitv2
  • path/to/phi-1_5
  • path/to/checkpoints

Update them in:

  • configs/showo_demo_512x512.yaml
  • configs/showo_demo.yaml
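Before launching, it can help to confirm that no placeholder is left in the configs. The following is a minimal sketch, not part of the repository: the helper names `find_placeholders` and `check_configs` are hypothetical, and the check simply assumes that every unresolved entry still contains the literal string `path/to/`.

```python
from pathlib import Path

# Assumption: every unfilled entry still contains this literal prefix.
PLACEHOLDER = "path/to/"


def find_placeholders(text):
    """Return (line_number, stripped_line) pairs that still contain a placeholder."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if PLACEHOLDER in line:
            hits.append((lineno, line.strip()))
    return hits


def check_configs(paths):
    """Map each config file that exists to its unresolved placeholder lines."""
    report = {}
    for p in map(Path, paths):
        if p.is_file():
            report[str(p)] = find_placeholders(p.read_text(encoding="utf-8"))
    return report


if __name__ == "__main__":
    configs = ["configs/showo_demo_512x512.yaml", "configs/showo_demo.yaml"]
    for name, hits in check_configs(configs).items():
        for lineno, line in hits:
            print(f"{name}:{lineno}: {line}")
```

Run it from Sync-R1/; it prints nothing once every placeholder has been replaced.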

The training command also expects:

  • --data_root path/to/unictokens_data
  • --pre_trained_ckpt_name path/to/second_stage_checkpoint_dir

External Assets Not Included

This trimmed repo does not include:

  • training data
  • pretrained checkpoints
  • CLIP weights
  • facenet weights
  • generated images and logs

You need to provide them locally before training.

Launch

Run the commands below from the Sync-R1/ directory. The script initializes torch.distributed, so torchrun is the recommended launcher.

3 GPUs

torchrun --nproc_per_node=3 train_grpo_paper.py \
  --num_gpus 3 \
  --config_file configs/showo_demo_512x512.yaml \
  --data_root path/to/unictokens_data \
  --pre_trained_ckpt_name path/to/second_stage_checkpoint_dir \
  --concept adrien_brody \
  --save_dir ./tmp_result_accelerate/ \
  --epoch_to_load 15 \
  --batch_num 10 \
  --batch_size 1 \
  --num_gen 9 \
  --llm glm \
  --accelerate True \
  --semantic True

1 GPU

torchrun --nproc_per_node=1 train_grpo_paper.py \
  --num_gpus 1 \
  --config_file configs/showo_demo_512x512.yaml \
  --data_root path/to/unictokens_data \
  --pre_trained_ckpt_name path/to/second_stage_checkpoint_dir \
  --concept adrien_brody \
  --save_dir ./tmp_result_accelerate/ \
  --epoch_to_load 15 \
  --batch_num 10 \
  --batch_size 1 \
  --num_gen 3 \
  --llm glm \
  --accelerate True \
  --semantic True
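Both commands state the process count twice, once for torchrun and once for the script. On a single node, torchrun exports the total number of processes as the `WORLD_SIZE` environment variable, so a mismatch can be caught early. This is a hedged sketch; `world_size_matches` is a hypothetical helper, not something the repository is known to provide.

```python
import os


def world_size_matches(num_gpus, env=None):
    """Return True if the launcher's WORLD_SIZE equals the --num_gpus value.

    torchrun sets WORLD_SIZE for every worker; when the variable is absent
    (plain `python` launch) a single process is assumed.
    """
    env = os.environ if env is None else env
    world_size = int(env.get("WORLD_SIZE", "1"))
    return world_size == num_gpus


if __name__ == "__main__":
    # Example: the value that would be passed as --num_gpus.
    requested = 3
    if not world_size_matches(requested):
        print(f"--num_gpus={requested} does not match WORLD_SIZE")
```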

Runtime Notes

  • --num_gen is the rollout group size for a single prompt.
  • The current training loop assumes batch_size=1 and multiple rollouts per prompt.
  • --num_gpus should match --nproc_per_node.
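To illustrate why the group size matters: in the usual GRPO formulation, each rollout's reward is normalized against the other rollouts for the same prompt, so a group of one carries no learning signal. The sketch below shows that standard normalization only. It is an assumption that grpo_paper.py does something equivalent; the function name, the `eps` value, and the use of the population standard deviation are illustrative choices.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one rollout group (one prompt, num_gen samples).

    advantage_i = (r_i - mean(r)) / (std(r) + eps)
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use the sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Three rollouts for a single prompt, as with --num_gen 3.
    print(group_relative_advantages([0.2, 0.5, 0.9]))
```

With identical rewards in a group, every advantage is zero, which is one reason larger `--num_gen` values tend to give a more informative update.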

Optional Environment Variables

If you use the LLM-based scoring paths, configure credentials through environment variables instead of hardcoding them:

  • ZAI_API_KEY
  • VERTEXAI_PROJECT
  • VERTEXAI_LOCATION
  • GOOGLE_APPLICATION_CREDENTIALS
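A quick pre-flight check can report which of these variables are unset. This is a minimal sketch; `missing_env_vars` is a hypothetical helper, and which variables a given scoring path actually needs is not specified here, so the list passed in the example is only an assumption.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    # Assumption: check all four; trim the list to the scoring path you use.
    wanted = [
        "ZAI_API_KEY",
        "VERTEXAI_PROJECT",
        "VERTEXAI_LOCATION",
        "GOOGLE_APPLICATION_CREDENTIALS",
    ]
    for name in missing_env_vars(wanted):
        print(f"not set: {name}")
```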

Scope

This trimmed repo is meant for:

  • reading the current GRPO training logic
  • reproducing the implementation that follows the paper more closely
  • auditing or modifying the Sync-R1 training path

It will not train out of the box: you must first supply the required local datasets, checkpoints, and external model assets.
