
feat(examples/hunyuanimage): add HunyuanImage3.0-80B inference&finetune#1432

Open
Dong1017 wants to merge 23 commits into mindspore-lab:master from Dong1017:hunyuan_image_3

Conversation

Contributor

@Dong1017 Dong1017 commented Nov 20, 2025

What does this PR do?

Adds

  1. Hunyuan-Image-3 models and the corresponding text-to-image pipeline.
  2. A LoRA finetuning script for Hunyuan-Image-3 text-to-image, using the lambdalabs/pokemon-blip-captions dataset.

Usage

  1. Text-to-image inference
```shell
#!/bin/bash
export TOKENIZERS_PARALLELISM=False
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

# Distributed inference configuration
MASTER_ADDR=${MASTER_ADDR:-"127.0.0.1"}
MASTER_PORT=${MASTER_PORT:-$(shuf -i 20001-29999 -n 1)}
NPROC_PER_NODE=${WORLD_SIZE:-8}

# Model configuration
MODEL_ID=${MODEL_ID:-"HunyuanImage-3/"}  # Local path or HuggingFace model ID

# Inference entry point
entry_file="run_image_gen.py"

# Input arguments (to be filled)
image_path="image_repro.png"
prompt="A brown and white dog is running on the grass"
seed=0
verbose=1
enable_amp="True"
image_size="832x1216"

# Launch inference
msrun --worker_num=${NPROC_PER_NODE} \
    --local_worker_num=${NPROC_PER_NODE} \
    --master_addr=${MASTER_ADDR} \
    --master_port=${MASTER_PORT} \
    --log_dir="logs/infer" \
    --join=True \
    ${entry_file} \
    --model-id "${MODEL_ID}" \
    --save "${image_path}" \
    --prompt "${prompt}" \
    --seed "${seed}" \
    --verbose "${verbose}" \
    --enable-ms-amp "${enable_amp}" \
    --image-size "${image_size}" \
    --reproduce \
    --bf16
```
  2. Finetune
```shell
#!/bin/bash
export TOKENIZERS_PARALLELISM=False
export ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

# Distributed training configuration
MASTER_ADDR=${MASTER_ADDR:-"127.0.0.1"}
MASTER_PORT=${MASTER_PORT:-$(shuf -i 20001-29999 -n 1)}
NPROC_PER_NODE=${WORLD_SIZE:-8}

# Model configuration
MODEL_ID=${MODEL_ID:-"HunyuanImage-3/"}  # Local path or HuggingFace model ID

# Training entry point
entry_file="run_image_train.py"

# Output configuration
output_dir="output/train"

# Input arguments (to be filled)
dataset_path="datasets/pokemon-blip-captions"
deepspeed="scripts/zero3.json"
learning_rate=1e-5
num_train_epochs=1
seed=0
save_strategy="no"
do_eval="False"

# Launch finetuning
msrun --worker_num=${NPROC_PER_NODE} \
    --local_worker_num=${NPROC_PER_NODE} \
    --master_addr=${MASTER_ADDR} \
    --master_port=${MASTER_PORT} \
    --log_dir="logs/train" \
    --join=True \
    ${entry_file} \
    --dataset_path "${dataset_path}" \
    --deepspeed "${deepspeed}" \
    --model_path "${MODEL_ID}" \
    --output_dir "${output_dir}" \
    --num_train_epochs "${num_train_epochs}" \
    --learning_rate "${learning_rate}" \
    --seed "${seed}" \
    --save_strategy "${save_strategy}" \
    --do_eval "${do_eval}" \
    --bf16
```
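For orientation, the flags passed in the launch scripts above map onto a small command-line surface. Below is a minimal `argparse` sketch of the inference side, reconstructed purely from the flags in the script — the actual parser in `run_image_gen.py` may differ in names, defaults, and types:

```python
import argparse

# Hypothetical reconstruction of run_image_gen.py's CLI, inferred from the
# launch script's flags; the real parser may differ.
def build_parser():
    p = argparse.ArgumentParser(description="HunyuanImage-3 text-to-image inference")
    p.add_argument("--model-id", type=str, required=True)      # local path or HF model ID
    p.add_argument("--save", type=str, default="image.png")    # output image path
    p.add_argument("--prompt", type=str, required=True)
    p.add_argument("--seed", type=int, default=0)
    p.add_argument("--verbose", type=int, default=0)
    p.add_argument("--enable-ms-amp", type=str, default="True")
    p.add_argument("--image-size", type=str, default="832x1216")  # "WxH"
    p.add_argument("--reproduce", action="store_true")
    p.add_argument("--bf16", action="store_true")
    return p

args = build_parser().parse_args([
    "--model-id", "HunyuanImage-3/",
    "--prompt", "A brown and white dog is running on the grass",
    "--image-size", "832x1216", "--reproduce", "--bf16",
])
# Split the "WxH" image-size string into integer dimensions.
width, height = (int(v) for v in args.image_size.split("x"))
```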

More information is available in the README.md
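The finetune launch passes `--deepspeed scripts/zero3.json`; that file is not reproduced in this description. For orientation only, a typical ZeRO stage-3 configuration uses standard DeepSpeed-style fields like these (an illustrative sketch, not necessarily the exact file in this PR):

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "bf16": {
    "enabled": "auto"
  }
}
```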

Performance

Inference experiments are tested on Ascend Atlas 800T A2 machines with MindSpore 2.7.1, using 8 NPUs.

| Type | Weight loading time | Mode | Speed (s/it) |
|---|---|---|---|
| Inference | 6m48s | pynative | 28.20 |

Finetune experiments are tested on Ascend Atlas 800T A2 machines with MindSpore 2.7.0, using 8 NPUs.

| Type | Mode | Trainable ratio | Speed for one step in an epoch (s/it) |
|---|---|---|---|
| Finetune | pynative | 0.073% | 54.41 |
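The trainable ratio above is simply LoRA adapter parameters divided by total parameters. A minimal sketch of that bookkeeping — the ~58.4M LoRA-parameter figure below is an illustrative assumption chosen to reproduce the reported ratio, not a number measured from this PR:

```python
def trainable_ratio(params):
    """params: iterable of (num_params, requires_grad) pairs."""
    total = sum(n for n, _ in params)
    trainable = sum(n for n, grad in params if grad)
    return trainable / total

# Illustrative only: ~80B frozen base weights plus a hypothetical
# ~58.4M-parameter LoRA adapter yields roughly the 0.073% reported above.
ratio = trainable_ratio([(80_000_000_000, False), (58_400_000, True)])
```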

Option

Use #1422 to accelerate model weight loading.

Limitations

  1. MindSpore version: inference supports ms >= 2.7.0; LoRA finetuning supports ms 2.7.0.
  2. The app server is not supported yet. Model inference relies on msrun for distributed execution, which conflicts with how Gradio is currently integrated: each NPU node spawns an independent process that loads the model weights and attempts to start the server on a designated port. Although each process can be assigned its own port, the core issue is Gradio's launch method, which blocks the main thread by default (prevent_thread_lock=False); this blocking behavior deadlocks the distributed processes in a multi-NPU environment. Setting prevent_thread_lock=True lets the server start briefly, but it terminates immediately and fails to stay active. A potential fix is to replace the Gradio-based server with a lightweight web framework such as Flask.
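A non-blocking server along the lines suggested above can be sketched with the standard library alone (a Flask version would look similar): the HTTP server runs on a background thread so the main distributed loop is never blocked, and in a real deployment only rank 0 would bind the port. Everything here (handler shape, endpoint name, response format) is an assumption for illustration, not code from this PR:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class GenHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        # A real server would broadcast the prompt to all ranks and run the
        # distributed pipeline here (hypothetical placeholder).
        body = json.dumps({"status": "ok", "prompt": payload.get("prompt", "")}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence default request logging on stderr

def start_server(port=0):
    """Bind (port=0 picks a free port) and serve on a daemon thread."""
    server = HTTPServer(("127.0.0.1", port), GenHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server  # server.server_address[1] is the bound port
```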

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the documentation guidelines.
  • Did you build and run the code without any errors?
  • Did you report the running environment (NPU type/MS version) and performance in the doc? (better record it for data loading, model inference, or training tasks)
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@xxx

@Dong1017 Dong1017 marked this pull request as ready for review December 22, 2025 06:47