
Multi-Motion Training with Future-Conditioned Observation#42

Closed
badajhong wants to merge 2 commits into amazon-far:main from badajhong:dev/badajhong/multi-motion-train


🚀 Description

This PR introduces multi-motion training capabilities for the G1 humanoid (29 DOF) and enhances the inference pipeline.

The primary contribution is a Future-Conditioned Observation mechanism. By providing the agent with temporal context (future target states), the policy can now robustly learn and switch between diverse motion styles within a single training session.

🛠 Key Changes

1. Future Motion Observation

To stabilize learning across heterogeneous motions, I added a Temporal Look-ahead (Preview Window) to the observation space:

Logic: The agent now observes target states for the next 3 time steps (t+1, t+2, t+3).

Features: Concatenated joint_pos and joint_vel for each future step.

Impact: This look-ahead allows the controller to anticipate sharp transitions or changes in motion, which is critical for multi-motion tracking performance.
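As a rough illustration of the preview window described above, the following sketch builds a 3-step look-ahead vector from reference joint trajectories. It is not the code from this PR: the function name `future_motion_obs`, the per-step layout (`joint_pos` followed by `joint_vel` for each future step), and the choice to clamp indices at the last frame of the clip are all assumptions made for the example.

```python
import numpy as np


def future_motion_obs(joint_pos, joint_vel, t, horizon=3):
    """Return reference joint_pos and joint_vel for steps t+1 .. t+horizon, flattened.

    joint_pos, joint_vel: arrays of shape (num_frames, num_dof).
    Hypothetical helper; layout and end-of-clip handling are assumptions.
    """
    num_frames = joint_pos.shape[0]
    # Future frame indices; clamp to the final frame near the end of the clip (assumption).
    idx = np.minimum(t + 1 + np.arange(horizon), num_frames - 1)
    # Per future step: [joint_pos, joint_vel] -> shape (horizon, 2 * num_dof).
    per_step = np.concatenate([joint_pos[idx], joint_vel[idx]], axis=-1)
    # Flatten so the result can be appended to the rest of the observation.
    return per_step.reshape(-1)


# Example with a 29-DOF robot and a 5-frame reference clip.
ref_pos = np.arange(5 * 29, dtype=np.float32).reshape(5, 29)
ref_vel = -ref_pos
obs = future_motion_obs(ref_pos, ref_vel, t=0)
print(obs.shape)  # 3 steps * (29 + 29) values
```

With 29 DOF and a 3-step window, this layout adds 3 × 58 = 174 values to the observation.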

2. Multi-Motion Training Support

Updated train_agent.py to support directory-based motion loading.

Compatible with both PPO and Fast-SAC frameworks.

Enabled recursive loading of .npz files (Holosoma training format).
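Directory-based, recursive loading of `.npz` files could look roughly like the sketch below. The helper names `find_motion_files` and `load_motions` are hypothetical and do not come from `train_agent.py`; the sketch only assumes that `motion_file` may point either to a single file or to a directory tree of `.npz` files, and it makes no assumption about which arrays each file contains.

```python
from pathlib import Path

import numpy as np


def find_motion_files(motion_path):
    """Return a sorted list of .npz files for a file path or a directory tree."""
    path = Path(motion_path)
    if path.is_file():
        return [path]
    # rglob walks all subdirectories; sorting keeps the motion order deterministic.
    return sorted(path.rglob("*.npz"))


def load_motions(motion_path):
    """Load every .npz file into a dict: file path -> {array name: array}."""
    motions = {}
    for file in find_motion_files(motion_path):
        with np.load(file) as data:
            # Copy arrays out so they stay valid after the file is closed.
            motions[str(file)] = {key: data[key] for key in data.files}
    return motions
```

Sorting the file list is a design choice for reproducibility: motion indices stay stable across runs regardless of filesystem ordering.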

3. Inference Pipeline Update (run_policy.py)

Added support for loading .npy files with the following keys:

    root_trans (frame, 3), root_ori (frame, 4), joint_pos (frame, 29), and fps.

Aligned input format (qpos=[root_trans,root_ori,joint_pos]) with the holosoma retargeting web interface.
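The `qpos` layout above can be sketched as a simple concatenation. This is an illustrative helper, not the code in `run_policy.py`: the name `build_qpos` is hypothetical, and it assumes the motion has already been loaded into a dict-like object holding the listed keys (for example, a pickled dict read with `np.load(path, allow_pickle=True).item()`, which is itself an assumption about how the `.npy` file is stored). No quaternion component order is assumed.

```python
import numpy as np


def build_qpos(motion, num_joints=29):
    """Stack root_trans, root_ori, and joint_pos into qpos of shape (frames, 3 + 4 + num_joints)."""
    root_trans = np.asarray(motion["root_trans"])  # (frames, 3)
    root_ori = np.asarray(motion["root_ori"])      # (frames, 4)
    joint_pos = np.asarray(motion["joint_pos"])    # (frames, num_joints)

    frames = root_trans.shape[0]
    expected = {
        "root_trans": (root_trans, (frames, 3)),
        "root_ori": (root_ori, (frames, 4)),
        "joint_pos": (joint_pos, (frames, num_joints)),
    }
    # Fail early with a readable message if any array has an unexpected shape.
    for name, (array, shape) in expected.items():
        if array.shape != shape:
            raise ValueError(f"{name} has shape {array.shape}, expected {shape}")

    return np.concatenate([root_trans, root_ori, joint_pos], axis=1)


# Example with dummy data: 10 frames at 50 fps.
motion = {
    "root_trans": np.zeros((10, 3)),
    "root_ori": np.zeros((10, 4)),
    "joint_pos": np.zeros((10, 29)),
    "fps": 50,
}
qpos = build_qpos(motion)
print(qpos.shape)
```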

🧪 Testing Guide

1. Training Execution

Ensure the motion directory contains multiple converted .npz files.

PPO Training:

python src/holosoma/holosoma/train_agent.py \
    exp:g1_29dof_wbt_multi_motion \
    logger:wandb \
    --command.setup_terms.motion_command.params.motion_config.motion_file holosoma/data/motions/g1_29dof/whole_body_tracking/train_data_sample/

Fast-SAC Training:

python src/holosoma/holosoma/train_agent.py \
    exp:g1-29dof-wbt-fast-sac-multi-motion \
    logger:wandb \
    --command.setup_terms.motion_command.params.motion_config.motion_file holosoma/data/motions/g1_29dof/whole_body_tracking/train_data_sample/

2. Simulation & Inference (Verification)

Terminal 1: Mujoco Sim

source scripts/source_mujoco_setup.sh
python src/holosoma/holosoma/run_sim.py robot:g1-29dof

Terminal 2: Policy Runner

# Environment Setup
source scripts/setup_unified.sh
source scripts/source_unified_setup.sh

# Run Policy
python3 src/holosoma_inference/holosoma_inference/run_policy.py inference:g1-29dof-wbt \
    --task.model-path YOUR_MODEL_PATH/*.onnx \
    --task.no-use-joystick \
    --task.use-sim-time \
    --task.rl-rate 50 \
    --task.interface lo \
    --task.target-pose-path PATH_TO_NPY_FILE/*.npy

@badajhong badajhong closed this Feb 12, 2026