Hello, thanks for the great work again. I am trying to use the following command (almost the same as the one your repo provides) to reproduce the training on LIBERO with 4x H100.
It turns out it will take about 28 hours to train, which does not match the ~5 hours reported in the repo.

Also, the GPUs are not fully utilized; is this normal?
Do you have any thoughts on why the training time is so much longer? Thanks, and looking forward to hearing from you!
data_name=libero_spatial_no_noops
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --standalone --nnodes 1 --nproc-per-node 4 vla-scripts/finetune.py \
--vlm_path pretrained_models/prism-qwen25-extra-dinosiglip-224px-0_5b \
--config_file_path pretrained_models/configs \
--data_root_dir data/libero \
--dataset_name $data_name \
--run_root_dir outputs \
--use_film False \
--num_images_in_input 2 \
--use_proprio True \
--use_lora True \
--use_fz False \
--use_minivlm True \
--image_aug True \
--num_steps_before_decay 150000 \
--max_steps 150005 \
--save_freq 5000 \
--save_latest_checkpoint_only False \
--merge_lora_during_training True \
--batch_size 16 \
--grad_accumulation_steps 1 \
--learning_rate 2e-4 \
--lora_rank 64 \
--use_pro_version True \
--wandb_project "$data_name" \
--run_id_note VLA-Adapter--spatial--testing
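For reference, here is the back-of-envelope arithmetic behind the mismatch, assuming wall-clock time scales linearly with --max_steps (a rough sketch, not a profiling result):

```python
# Implied seconds per optimizer step for the two wall-clock times,
# given max_steps from the command above.
MAX_STEPS = 150005  # from --max_steps

def sec_per_step(hours: float, steps: int = MAX_STEPS) -> float:
    """Convert a total wall-clock time in hours into seconds per step."""
    return hours * 3600 / steps

observed = sec_per_step(28)  # my run on 4x H100
reported = sec_per_step(5)   # implied by the ~5 h the repo mentions

print(f"observed: {observed:.2f} s/step, "
      f"reported: {reported:.2f} s/step, "
      f"ratio: {observed / reported:.1f}x")
```

So my run is spending roughly 0.67 s/step versus the ~0.12 s/step implied by the repo's number, a 5.6x gap, which together with the low GPU utilization makes me suspect an input-pipeline or configuration bottleneck rather than compute.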