Arthav24/MCIH-RL

Model Compensation for Impaired Hardware using RL

ENPM690 - Robot Learning
Varad Nerlekar, Anirudh Swarankar

Directory Structure

├── customPPO_TD3
│   ├── environment.py
│   ├── evaluate.py
│   ├── main.py
│   ├── memory.py
│   ├── model.py
│   ├── parser.py
│   ├── PPO.py
│   ├── TD3.py
│   ├── train.py
│   └── utils.py
├── docker-compose.yml
├── Dockerfile
├── models
│   ├── aniswa_fully_impaired.zip
│   ├── aniswa_partial_impaired.zip
│   ├── heathy_robot.zip
│   ├── Readme.md
│   ├── varad_100_percent_impaired.zip
│   └── varad_50_percent_impaired.zip
├── PPO
│   └── mars_ppo.py
├── ppo_leo_tensorboard
│   ├── PPO_1
│   │   └── events.out.tfevents.1745810268.arthavnuc.2483.0
│   ├── PPO_2
│   │   └── events.out.tfevents.1745810312.arthavnuc.2891.0
│   ├── PPO_3
│   │   └── events.out.tfevents.1745814732.arthavnuc.37220.0
│   ├── PPO_4
│   │   └── events.out.tfevents.1745900765.arthavnuc.115574.0
│   ├── PPO_5
│   │   └── events.out.tfevents.1745901552.arthavnuc.137728.0
│   ├── PPO_6
│   │   └── events.out.tfevents.1745942398.arthavnuc.1062466.0
│   ├── PPO_7
│   │   └── events.out.tfevents.1745987178.arthavnuc.2098751.0
│   └── PPO_8
│       └── events.out.tfevents.1745987510.83fbafc735cf.1059.0
├── README.md
└── src
    ├── leo_erc_common
    └── leo_erc_desktop
        ├── leo_erc_gazebo_worlds
        ├── leo_erc_viz

Implementation Notes

The project was implemented using several approaches. To better understand PPO and other RL algorithms, we implemented PPO and TD3 from scratch, drawing heavily on code available from various resources. The custom implementation gave us insight into, and confidence in, the Stable-Baselines3 (SB3) implementation.
In the final version of the code, i.e. mars_ppo.py, we used SB3 PPO to achieve the desired results.
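
For readers unfamiliar with PPO, the core idea that both a from-scratch implementation and SB3 rely on is the clipped surrogate objective. The snippet below is a minimal, dependency-free sketch of that objective only. It is not the code in customPPO_TD3 or mars_ppo.py; the function name `ppo_clip_objective` and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    clip_eps:   clipping range (illustrative default, not taken from the repo)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Keep the probability ratio inside [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


if __name__ == "__main__":
    # One action became more likely (ratio 1.5) with a positive advantage,
    # another became less likely (ratio 0.5) with a negative advantage.
    print(ppo_clip_objective([1.5, 0.5], [1.0, -1.0]))
```

The clipping removes the incentive to move the policy far from the one that collected the data, which is what keeps each PPO update conservative.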

How to visualize tensorboard graphs

TensorBoard graphs of the various runs can be visualized using:

tensorboard --logdir ppo_leo_tensorboard

How to run

Native

To save computation, Gazebo runs in headless mode, i.e. only gzserver runs; gzclient can be spawned manually. Visualization is done in RViz because it is computationally lighter than gzclient.

catkin build
source devel/setup.bash
roslaunch leo_erc_gazebo leo_gazebo.launch

# On another terminal
python3 mars_ppo.py [-h] [--base BASE] [--new NEW] [--timesteps TIMESTEPS] [--episodes EPISODES] {train,fine_tune,test}
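
The usage line above suggests an argparse-style interface. The following is a minimal sketch of a parser that would produce the same usage string; it is not the parser from mars_ppo.py. The help texts, the integer types for `--timesteps` and `--episodes`, the meaning given to `--new`, and the destination name `mode` are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented mars_ppo.py usage line."""
    parser = argparse.ArgumentParser(prog="mars_ppo.py")
    # Assumed meaning: path of an existing model archive to load.
    parser.add_argument("--base", help="model to load (assumed meaning)")
    # Assumed meaning: name or path under which a new model is saved.
    parser.add_argument("--new", help="model to save (assumed meaning)")
    # Integer types are an assumption; the README does not state them.
    parser.add_argument("--timesteps", type=int, help="training timesteps")
    parser.add_argument("--episodes", type=int, help="evaluation episodes")
    # The three modes come directly from the usage line.
    parser.add_argument("mode", choices=["train", "fine_tune", "test"])
    return parser


if __name__ == "__main__":
    # Same arguments as the test command shown in the Docker section.
    example = ["test", "--episodes", "1",
               "--base", "/models/aniswa_fully_impaired.zip"]
    args = build_parser().parse_args(example)
    print(args.mode, args.episodes, args.base)
```

Options that are not passed default to `None` in this sketch, so a caller would need to check which ones each mode actually requires.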

Docker

The commands below build the Docker environment and run the model in a test environment for 10 episodes. run_test.sh is the default executable of the container; it brings up RViz and loads a test environment. All the trained models are stored in the /models directory.

python3 mars_ppo.py test --episodes 1 --base /models/aniswa_fully_impaired.zip
docker compose -f docker-compose.yml up
# or manually build docker image using
docker buildx build -f Dockerfile -t mcih:latest .

How to impair robot

rqt

The robot can be impaired using rqt reconfigure by changing the PI values. For full impairment, set P and I to 0. NOTE: Please refresh after spawning so that the configurations load.
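
To see why zeroing P and I amounts to full impairment, consider a generic discrete PI controller. The sketch below is illustrative only: the class `PIController`, the gain values, and the time step are hypothetical, and the rover's actual controller code is not shown in this README.

```python
class PIController:
    """Generic discrete PI controller (illustrative, not the rover's code)."""

    def __init__(self, kp, ki):
        self.kp = kp          # proportional gain (the "P" value)
        self.ki = ki          # integral gain (the "I" value)
        self.integral = 0.0   # accumulated error over time

    def update(self, target, measured, dt):
        """Return the control effort for one time step of length dt."""
        error = target - measured
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


if __name__ == "__main__":
    # Hypothetical gains for a healthy controller and a fully impaired one.
    healthy = PIController(kp=2.0, ki=0.5)
    impaired = PIController(kp=0.0, ki=0.0)

    # Same velocity error for both: target 1.0, measured 0.0.
    print(healthy.update(target=1.0, measured=0.0, dt=0.1))
    print(impaired.update(target=1.0, measured=0.0, dt=0.1))
```

With both gains at zero the controller produces no effort regardless of the error, while intermediate gains weaken the response without removing it, which corresponds to partial impairment.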

To visualize tensorboard graphs

docker exec -it mars_rover_1 tensorboard --logdir /ppo_leo_tensorboard
