
[feat][Draft] Refactor slime to work with custom Megatron and vllm version #7

Open
knlnguyen1802 wants to merge 13 commits into SamitHuang:dev_vllm from knlnguyen1802:refactor_dev

Conversation

@knlnguyen1802
Collaborator

This branch runs with vLLM 0.17.0 and Megatron-LM core 0.16.1.

How to run this
Base image for docker
nvcr.io/nvidia/pytorch:26.01-py3

Download Megatron-LM 0.16.1
https://github.com/NVIDIA/Megatron-LM/releases/tag/core_v0.16.1
Unzip it, rename the folder to Megatron-LM, and place it under /root/

pip install vllm==0.17.0

Clone the slime repo

cd slime
pip install -e .

For compatibility:

pip install "numpy<2"
pip install torch_memory_saver
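The steps above can be collected into one setup script. This is a sketch, not part of the PR: the archive filename pattern, install paths, and placeholder repo URL are assumptions; adjust them to your environment.

```shell
#!/usr/bin/env bash
# Sketch of the setup steps above; intended to run inside the
# nvcr.io/nvidia/pytorch:26.01-py3 container. Paths are assumptions.
set -euo pipefail

# Megatron-LM core_v0.16.1, unpacked and renamed under /root/
cd /root
curl -LO https://github.com/NVIDIA/Megatron-LM/archive/refs/tags/core_v0.16.1.tar.gz
tar -xzf core_v0.16.1.tar.gz
mv Megatron-LM-core_v0.16.1 Megatron-LM

# vLLM at the version this branch targets
pip install vllm==0.17.0

# slime itself (the PR does not give a URL; use your fork)
git clone <your-slime-repo-url> slime
cd slime
pip install -e .

# compatibility pins
pip install "numpy<2"
pip install torch_memory_saver
```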

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request implements a backend separation plan to decouple the framework from a hard dependency on sglang, enabling support for vLLM. Key changes include renaming backend-agnostic arguments like --server-concurrency, introducing lazy imports for sglang-specific modules, and providing local fallbacks for weight-synchronization utilities to support environments without sglang. The PR also updates Megatron-Core parameter gathering to support strided layouts and includes a comprehensive RFC detailing the refactoring strategy. A critical issue was identified in the argument parsing logic where Hugging Face checkpoint validation was inadvertently disabled by hardcoding a skip flag to true.
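The lazy-import pattern the review summary mentions can be sketched roughly as follows. This is an illustrative helper, not the PR's actual code: the function name, cache, and error message are assumptions. The idea is that backend modules such as sglang are imported only on first use, so a vLLM-only environment never hits an ImportError at module load time.

```python
import importlib

def lazy_import(name: str, _cache: dict = {}):
    """Import a backend module on first use and cache it.

    Hypothetical sketch of a lazy-import-with-fallback helper;
    raises a clear RuntimeError if the backend is not installed.
    """
    if name not in _cache:
        try:
            _cache[name] = importlib.import_module(name)
        except ImportError as exc:
            raise RuntimeError(
                f"backend module {name!r} requested but not installed; "
                "install it or select another backend"
            ) from exc
    return _cache[name]
```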

@knlnguyen1802 knlnguyen1802 changed the title [feat] Refactor slime to work with custom Megatron and vllm version [feat][Draft] Refactor slime to work with custom Megatron and vllm version Apr 1, 2026
Owner

@SamitHuang left a comment


clear design

Owner


move it to design doc


  self.semaphore = asyncio.Semaphore(
-     args.sglang_server_concurrency * args.rollout_num_gpus // args.rollout_num_gpus_per_engine
+     args.server_concurrency * args.rollout_num_gpus // args.rollout_num_gpus_per_engine
Owner


it's clearer to name it rollout_server_concurrency?
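For context, the semaphore line under discussion caps total in-flight requests at the per-server concurrency times the number of rollout engines. A minimal sketch of that arithmetic, with an illustrative function name and sample numbers (not from the PR):

```python
import asyncio

def total_concurrency(server_concurrency: int,
                      rollout_num_gpus: int,
                      rollout_num_gpus_per_engine: int) -> int:
    """Total in-flight request cap across all rollout engines.

    Mirrors the expression in the diff above: per-server concurrency
    multiplied by the number of engines (GPUs // GPUs-per-engine).
    """
    num_engines = rollout_num_gpus // rollout_num_gpus_per_engine
    return server_concurrency * num_engines

# e.g. 64 requests per server, 8 GPUs, 4 GPUs per engine -> 2 engines
sem = asyncio.Semaphore(total_concurrency(64, 8, 4))
```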

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>