madengine v2 add Primus launcher support#80
Open
coketaste wants to merge 6 commits into coketaste/refactor-dis from
Resolved conflicts and integrated Primus launcher support into refactored codebase architecture. The refactor-dis branch introduced cleaner code organization by extracting common utilities and launcher logic into dedicated modules.

Conflict Resolutions:
- src/madengine/deployment/common.py: Added "primus" to VALID_LAUNCHERS
- src/madengine/deployment/kubernetes_launcher_mixin.py: Added _generate_primus_command()
- src/madengine/deployment/slurm.py: Removed duplicate functions, now imports from common.py
- src/madengine/deployment/kubernetes.py: Uses KubernetesLauncherMixin, removed duplicates
- src/madengine/execution/container_runner.py: Integrated helper functions while preserving Primus features

Key Changes:
- Primus launcher fully integrated into refactored architecture
- Maintained Primus-specific features: image resolution, config path, CLI extra args
- Merged environment variables: PRIMUS_CONFIG_PATH, PRIMUS_CLI_EXTRA, TORCH_ELASTIC_RDZV_TIMEOUT
- Used refactored helpers: resolve_run_timeout(), make_run_log_file_path(), _resolve_multiple_results_path()
- Preserved all existing launcher functionality (torchrun, vllm, sglang, deepspeed, megatron, torchtitan)

Architecture Improvements:
- Common launcher utilities in deployment/common.py
- Kubernetes launcher logic in kubernetes_launcher_mixin.py
- Container runner helpers in execution/container_runner_helpers.py
- Run details utilities in utils/run_details.py and utils/path_utils.py

Validation:
- All Python files compile without syntax errors
- All imports successful
- Launcher normalization works for all launchers including Primus
- Helper functions tested and working
- No conflict markers remaining

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
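For reviewers who want a concrete picture of the launcher normalization mentioned under Validation, here is a minimal sketch. It is an illustration only: the contents of VALID_LAUNCHERS are assumed from the launcher names listed in this description, and normalize_launcher() is a hypothetical name, not necessarily the function defined in deployment/common.py.

```python
# Hypothetical sketch; the real definitions in deployment/common.py may differ.
# The tuple is assumed from the launchers named in this PR description.
VALID_LAUNCHERS = (
    "torchrun",
    "vllm",
    "sglang",
    "deepspeed",
    "megatron",
    "torchtitan",
    "primus",  # added by this PR
)


def normalize_launcher(name: str) -> str:
    """Return the lowercase, trimmed launcher name, or raise if it is unknown."""
    normalized = name.strip().lower()
    if normalized not in VALID_LAUNCHERS:
        raise ValueError(
            f"Unknown launcher {name!r}; expected one of: {', '.join(VALID_LAUNCHERS)}"
        )
    return normalized
```

With a check of this shape, both the SLURM and Kubernetes paths could accept "Primus" or " primus " and reject typos early, before any job is submitted.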
PRIMUS_CONFIG_PATH, PRIMUS_CLI_EXTRA); pass PRIMUS_* in container_runner.
for primus_pretrain/* when model-specific image missing; preserve resolved
docker_image in run_results for perf/reports; create_run_details_dict uses
run_results docker_image.
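The commit note above mentions passing PRIMUS_* variables in container_runner. One possible way to do that is sketched below, assuming the variables are forwarded to the container as `docker run -e KEY=VALUE` flags. The helper names collect_primus_env() and to_docker_env_flags() are hypothetical and are not taken from the madengine codebase.

```python
import os
from typing import Dict, List, Mapping, Optional


def collect_primus_env(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return every variable whose name starts with PRIMUS_ (hypothetical helper)."""
    source = os.environ if environ is None else environ
    return {key: value for key, value in source.items() if key.startswith("PRIMUS_")}


def to_docker_env_flags(env: Mapping[str, str]) -> List[str]:
    """Turn a mapping into `-e KEY=VALUE` arguments for `docker run`."""
    flags: List[str] = []
    for key in sorted(env):  # sorted so the generated command is deterministic
        flags.extend(["-e", f"{key}={env[key]}"])
    return flags


if __name__ == "__main__":
    # Example values are placeholders, not real paths or arguments.
    example = {
        "PRIMUS_CONFIG_PATH": "/path/to/config.yaml",
        "PRIMUS_CLI_EXTRA": "--example-flag",
        "HOME": "/root",
    }
    print(to_docker_env_flags(collect_primus_env(example)))
```

Filtering by prefix means a newly introduced PRIMUS_* variable would be forwarded without a code change, at the cost of also forwarding any unintended variable that shares the prefix.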