Official source code from the paper "Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience"
This repo was designed for Python 3.7.9. Dependencies are listed in requirements.txt.
agents/ contains all experiment scripts, including RL pre-training (Learning a Basis for Intentions) in agents/pretraining_forward.py and Inferring Intentions with IRL in agents/irl.py
We include all default hyperparameters in config/default.yaml, with domain-specific configs in config/domain and mode-specific configs in config/mode. By default all training scripts log to wandb online. To turn this off, set WANDB_MODE to offline in misc/utils.py
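As a rough sketch of what offline logging amounts to: wandb reads the WANDB_MODE environment variable, so it needs to be set before wandb.init() is called. How misc/utils.py sets it in this repo is not shown here, and the helper name set_wandb_mode below is hypothetical.

```python
import os

# Modes commonly accepted by wandb through the WANDB_MODE environment variable.
ALLOWED_WANDB_MODES = {"online", "offline", "disabled"}


def set_wandb_mode(mode="offline"):
    """Hypothetical helper (not part of the repo): set WANDB_MODE for this process.

    Must run before wandb.init() for the setting to take effect.
    """
    if mode not in ALLOWED_WANDB_MODES:
        raise ValueError("unsupported WANDB_MODE: {!r}".format(mode))
    os.environ["WANDB_MODE"] = mode
    return os.environ["WANDB_MODE"]


# Keep runs local instead of syncing them to the wandb servers.
set_wandb_mode("offline")
```

Runs logged offline stay on disk and can be uploaded later with the wandb sync command if needed.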
For all experiments, you will run:
./_train.sh
To specify the mode and domain, index into the appropriate parameter in main.py.
We recommend the following workflow for training a BASIS agent:
- Pre-train the agent by setting mode to multitask-forward and training until the reward converges on wandb.
- Train the expert agent with the (randomly generated) preferences you want the BASIS agent to infer by setting mode to expert, again training until convergence.
- After learning the expert policy, collect the expert trajectories by setting mode to play-expert, and set the config.demonstrations parameter in the appropriate domain config file.
- Set mode to irl and run the IRL phase to see how well the BASIS agent infers the expert's preferences.
If you would like to run the ablation of multi-task IRL pre-training:
- Collect expert trajectories for all tasks by setting mode to play-multitask-forward and setting traj_task_id in config/mode/play-multitask-forward.yaml to the series of task indices (e.g. 0,1,2 for 3 tasks).
- Pre-train the agent by setting mode to multitask-irl and training until the reward converges on wandb.
- Resume from step #2 in the previous set of instructions.
- All models from each mode will be saved and loaded automatically.
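The two workflows above can be summarized as ordered lists of modes. This is only an illustrative sketch: the mode names come from the steps above, but STANDARD_WORKFLOW, ABLATION_WORKFLOW, and describe are hypothetical names that do not exist in the repo, and each mode still has to be selected in main.py and run with ./_train.sh by hand.

```python
# Mode names are taken from the workflow steps; every other name is hypothetical.
STANDARD_WORKFLOW = ["multitask-forward", "expert", "play-expert", "irl"]

# The ablation swaps the first step for two steps, then resumes from step #2.
ABLATION_WORKFLOW = ["play-multitask-forward", "multitask-irl"] + STANDARD_WORKFLOW[1:]


def describe(workflow):
    """Return one human-readable instruction per mode, in run order."""
    return [
        "{}. set mode to {} and run ./_train.sh".format(step, mode)
        for step, mode in enumerate(workflow, start=1)
    ]


for line in describe(ABLATION_WORKFLOW):
    print(line)
```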
- agents/ contains all scripts for each agent type.
- config/ contains all the configs with default, domain, and mode-specific parameters.
- envs/ contains all environment files for the Fruitgrid, Highway, and Roundabout domains used in the paper.
- misc/ contains utils.
- networks/ contains model files (for both simple and image features).
