
add a simple profiler context #136

Closed
bowenyang008 wants to merge 11 commits into main from boweny/simple-profiler


@bowenyang008
Collaborator

No description provided.
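
Since the PR has no description, here is a minimal sketch of what a "simple profiler context" could look like. It is an illustration only, not the code from this branch: the name `simple_profiler` and the `results` parameter are hypothetical, and it measures wall-clock time only.

```python
import time
from contextlib import contextmanager


@contextmanager
def simple_profiler(name, results=None):
    """Time the enclosed block (hypothetical helper, not the PR's API).

    If `results` is a dict, the elapsed seconds are accumulated under `name`;
    otherwise the timing is printed.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed steps are still timed.
        elapsed = time.perf_counter() - start
        if results is not None:
            results[name] = results.get(name, 0.0) + elapsed
        else:
            print(f"[profile] {name}: {elapsed:.6f}s")


if __name__ == "__main__":
    timings = {}
    with simple_profiler("sleep_step", timings):
        time.sleep(0.05)
    print(timings)
```

Accumulating into a dict rather than printing makes it easy to forward the timings to whatever logger the training loop already uses.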

abaheti95 and others added 11 commits July 29, 2025 14:55
`torch.distributed.barrier` has a fixed, limited timeout that depends on
its backend, so it won't help keep all MCT-managed processes alive, and
we need a new barrier mechanism. This implementation refactors the
existing SyncActor into a more general barrier that can be used between
clients.
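
To illustrate the idea of a barrier that waits without a built-in timeout, here is a rough sketch. It is not the SyncActor refactor from this commit: the class `ClientBarrier` and the helper `run_clients` are hypothetical, and threads stand in for the distributed clients and the actor that would coordinate them.

```python
import threading


class ClientBarrier:
    """Reusable barrier with no timeout (hypothetical stand-in, threads only)."""

    def __init__(self, num_clients: int) -> None:
        self._num_clients = num_clients
        self._arrived = 0
        self._generation = 0  # bumped every time the barrier releases
        self._cond = threading.Condition()

    def wait(self) -> int:
        """Block until every client has arrived; return the released generation."""
        with self._cond:
            generation = self._generation
            self._arrived += 1
            if self._arrived == self._num_clients:
                # Last arrival: reset the count and wake everyone up.
                self._arrived = 0
                self._generation += 1
                self._cond.notify_all()
            else:
                # Wait indefinitely; there is deliberately no timeout here.
                while generation == self._generation:
                    self._cond.wait()
            return generation


def run_clients(num_clients: int) -> list:
    """Start `num_clients` threads that all meet at the barrier once."""
    barrier = ClientBarrier(num_clients)
    passed = []
    lock = threading.Lock()

    def client(rank: int) -> None:
        barrier.wait()
        with lock:
            passed.append(rank)

    threads = [threading.Thread(target=client, args=(r,)) for r in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(passed)


if __name__ == "__main__":
    print(run_clients(4))
```

The generation counter lets the same barrier be reused across steps, and it guards against spurious wakeups from the condition variable.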
Added a single argparse argument, `file_path`, for the path to an OmegaConf YAML file.

Also added support for OmegaConf YAML files.
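
A minimal sketch of that flow, under stated assumptions: the commit only says a `file_path` argument was added, so the `--file_path` flag spelling and the helper names `build_parser` and `load_config` are hypothetical. `OmegaConf.load` is the OmegaConf call for reading a YAML file and requires the third-party `omegaconf` package.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser with a single argument for the config path (flag name assumed)."""
    parser = argparse.ArgumentParser(description="Load an OmegaConf YAML config.")
    parser.add_argument(
        "--file_path",
        type=str,
        required=True,
        help="Path to the OmegaConf YAML file.",
    )
    return parser


def load_config(file_path: str):
    """Read the YAML file into an OmegaConf config object."""
    # Imported lazily so that argument parsing works without omegaconf installed.
    from omegaconf import OmegaConf

    return OmegaConf.load(file_path)


if __name__ == "__main__":
    args = build_parser().parse_args()
    config = load_config(args.file_path)
    print(config)
```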
Example run: compose-rl-distributed-ppo-test-yRLEyN 

This is tested with 2x8 GPUs
<img width="1072" height="668" alt="image"
src="https://github.com/user-attachments/assets/a6166c79-b136-4290-9344-fb68f1aedfa8"
/>


Still missing a couple of steps for some reason. Going to ask the MLflow
team about it, but wanted to get this up ASAP so we can have logging.
@bowenyang008 deleted the boweny/simple-profiler branch August 6, 2025 04:57
5 participants