ENPM690 - Robot Learning
Varad Nerlekar, Anirudh Swarankar
├── customPPO_TD3
│ ├── environment.py
│ ├── evaluate.py
│ ├── main.py
│ ├── memory.py
│ ├── model.py
│ ├── parser.py
│ ├── PPO.py
│ ├── TD3.py
│ ├── train.py
│ └── utils.py
├── docker-compose.yml
├── Dockerfile
├── models
│ ├── aniswa_fully_impaired.zip
│ ├── aniswa_partial_impaired.zip
│ ├── heathy_robot.zip
│ ├── Readme.md
│ ├── varad_100_percent_impaired.zip
│ └── varad_50_percent_impaired.zip
├── PPO
│ └── mars_ppo.py
├── ppo_leo_tensorboard
│ ├── PPO_1
│ │ └── events.out.tfevents.1745810268.arthavnuc.2483.0
│ ├── PPO_2
│ │ └── events.out.tfevents.1745810312.arthavnuc.2891.0
│ ├── PPO_3
│ │ └── events.out.tfevents.1745814732.arthavnuc.37220.0
│ ├── PPO_4
│ │ └── events.out.tfevents.1745900765.arthavnuc.115574.0
│ ├── PPO_5
│ │ └── events.out.tfevents.1745901552.arthavnuc.137728.0
│ ├── PPO_6
│ │ └── events.out.tfevents.1745942398.arthavnuc.1062466.0
│ ├── PPO_7
│ │ └── events.out.tfevents.1745987178.arthavnuc.2098751.0
│ └── PPO_8
│ └── events.out.tfevents.1745987510.83fbafc735cf.1059.0
├── README.md
└── src
├── leo_erc_common
└── leo_erc_desktop
├── leo_erc_gazebo_worlds
├── leo_erc_viz

The project was implemented using several approaches. To get a better understanding of PPO and other RL algorithms, we implemented PPO and TD3 from scratch, drawing heavily on code available from various resources. The custom implementation gave us insight into, and confidence in, the Stable-Baselines3 (SB3) implementation.
In the final version of the code, i.e. mars_ppo.py, we used SB3 PPO to achieve the desired results.
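
mars_ppo.py is driven from the command line with a mode (train, fine_tune, or test) and a few optional flags. The block below is a minimal sketch of an argparse parser that accepts that usage line. It is not the parser from mars_ppo.py itself: the argument types, the help texts, and the function name build_parser are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the usage line of mars_ppo.py."""
    parser = argparse.ArgumentParser(prog="mars_ppo.py")
    # Help texts and types below are assumptions, not taken from the script.
    parser.add_argument("--base", help="path to an existing model archive (.zip)")
    parser.add_argument("--new", help="name for a newly saved model")
    parser.add_argument("--timesteps", type=int, help="number of training timesteps")
    parser.add_argument("--episodes", type=int, help="number of test episodes")
    # Positional mode, restricted to the three choices in the usage line.
    parser.add_argument("mode", choices=["train", "fine_tune", "test"])
    return parser


if __name__ == "__main__":
    # Parse the same arguments as the test command used in this README.
    args = build_parser().parse_args(
        ["test", "--episodes", "1", "--base", "/models/aniswa_fully_impaired.zip"]
    )
    print(args.mode, args.episodes, args.base)
```

Restricting the positional argument with choices makes argparse reject any mode other than the three listed, which keeps a mistyped mode from silently starting a long training run.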
TensorBoard graphs of the various runs can be visualized using
tensorboard --logdir ppo_leo_tensorboard

To save computation, Gazebo runs in headless mode, i.e. only gzserver runs; gzclient can be spawned manually. Visualization is done in RViz, as it is computationally light compared to gzclient.
catkin build
source devel/setup.bash
roslaunch leo_erc_gazebo leo_gazebo.launch
# On another terminal
python3 mars_ppo.py [-h] [--base BASE] [--new NEW] [--timesteps TIMESTEPS] [--episodes EPISODES] {train,fine_tune,test}

To build the Docker environment and run the model in a test environment for 10 episodes, use the commands below. run_test.sh is the default executable of the container; it brings up RViz and loads a test environment. All the trained models are stored in the /models directory.
python3 mars_ppo.py test --episodes 1 --base /models/aniswa_fully_impaired.zip
docker compose -f docker-compose.yml up
# or manually build docker image using
docker buildx build -f Dockerfile -t mcih:latest .
The robot can be impaired using rqt_reconfigure by manipulating the PI values. For full impairment, set P and I to 0.
NOTE: Please refresh after spawning for the configurations to load.
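
To illustrate why zeroing the PI values impairs the robot, here is a minimal sketch of a generic discrete PI controller. It is not the controller used in the simulation: the class name PIController, the gain values, and the assumption that each impaired wheel is driven by a controller of this form are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Generic PI controller (illustrative, not the simulator's code)."""
    kp: float               # proportional gain (P)
    ki: float               # integral gain (I)
    integral: float = 0.0   # accumulated error over time

    def step(self, target: float, measured: float, dt: float) -> float:
        """Return the control output for one time step of length dt."""
        error = target - measured
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


if __name__ == "__main__":
    healthy = PIController(kp=2.0, ki=0.5)    # hypothetical gains
    impaired = PIController(kp=0.0, ki=0.0)   # P and I set to 0
    # Same velocity error for both controllers.
    print(healthy.step(target=1.0, measured=0.0, dt=0.1))
    print(impaired.step(target=1.0, measured=0.0, dt=0.1))
```

With both gains at 0 the output is zero for any error, so under these assumptions the affected wheel would no longer respond to velocity commands. Intermediate gain values would give a weaker response, which is one plausible reading of partial impairment.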
docker exec -it mars_rover_1 tensorboard --logdir /ppo_leo_tensorboard