Describe the bug
When training an Eagle3 draft model for Qwen2.5-VL, the `--build-dataset-num-proc` parameter in the training script has to be set to 0; otherwise a deadlock is triggered and the dataset `map` workers get stuck.
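For context on why I suspect a fork deadlock: `--build-dataset-num-proc` looks like it is forwarded to Hugging Face `datasets.Dataset.map` as `num_proc` (an assumption on my side, based on the `map` workers hanging). A minimal sketch of the pattern below; the toy dataset and map function are illustrative stand-ins, not SpecForge code:

```python
# Minimal sketch of the suspected failure mode (not SpecForge code).
from datasets import Dataset

ds = Dataset.from_dict({"text": ["hello", "world"] * 100})

def add_length(example):
    example["length"] = len(example["text"])
    return example

# num_proc > 0 makes `map` fork worker processes. Inside a torchrun rank
# that has already initialized CUDA/NCCL (or a fast tokenizer's threads),
# the forked children can inherit a held lock and block forever.
ds = ds.map(add_length, num_proc=8)

# num_proc=None runs the map in the main process with no fork, which
# matches the behavior I see with --build-dataset-num-proc 0.
ds = ds.map(add_length, num_proc=None)
```

Setting the flag to 0 sidesteps the fork entirely, which is consistent with the hang disappearing.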
Reproduction
```bash
torchrun \
  --standalone \
  --nproc_per_node $NUM_GPUS \
  $ROOT_DIR/scripts/train_eagle3.py \
  --target-model-path Qwen/Qwen2.5-VL-7B-Instruct \
  --target-model-backend hf \
  --draft-model-config $ROOT_DIR/configs/qwen2-5-vl-eagle3.json \
  --build-dataset-num-proc 8 \
  --train-data-path $ROOT_DIR/cache/dataset/allava4v_train.jsonl \
  --output-dir $ROOT_DIR/outputs/Qwen2.5-VL-7B-eagle3 \
  --num-epochs 10 \
  --batch-size 1 \
  --learning-rate 1e-4 \
  --max-length 8192 \
  --dist-timeout 360 \
  --chat-template qwen2-vl \
  --cache-dir $ROOT_DIR/cache \
  --embedding-key model.embed_tokens.weight \
  --tp-size 1 \
  --is-vlm \
  --min-pixels 50176 \
  --max-pixels 802816
```
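If the root cause is indeed forking after CUDA initialization, a possible stopgap would be for the dataset build to fall back to a single-process map in that case. A hypothetical helper as a sketch (`safe_num_proc` is my name, not an existing SpecForge function):

```python
import torch

def safe_num_proc(requested: int) -> int | None:
    """Fall back to a single-process `datasets.map` when forking is unsafe."""
    # Forking after CUDA has been initialized in this rank is a known
    # source of child-process deadlocks; run the map in the main process.
    if requested > 0 and torch.cuda.is_initialized():
        return None  # `datasets` treats num_proc=None as single-process
    return requested if requested > 0 else None
```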
Environment
sglang 0.5.3