Code supplement comparing Proximal Policy Optimization (PPO) and REINFORCE, prepared for a pedagogical project on PPO for Machine Learning and Optimization 4968. Authors: Christopher Metcalfe and Avery Iorio.
Because this is a pedagogical project, the code primarily supplements our lecture notes and problem set. It illustrates important facts about PPO using modifications of existing code supplied by OpenAI and Gymnasium to train and run a host of classical reinforcement learning benchmarks.
To run the provided code, please follow the installation instructions provided by CleanRL (https://github.com/vwxyzjn/cleanrl). Once the package requirements are met, the Python files can be run directly.
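
For readers who want a quick picture of the difference between the two methods compared here, the following is a minimal sketch of the two loss functions, written in plain Python on lists of numbers. It is not taken from the project files or from CleanRL; the function names `reinforce_loss` and `ppo_clip_loss` and the default `clip_coef=0.2` are assumptions chosen for illustration. It only computes loss values, so a real training script would use an autodiff library to obtain gradients.

```python
import math


def reinforce_loss(log_probs, returns):
    """REINFORCE loss: negative mean of log pi(a|s) weighted by the return."""
    terms = [lp * g for lp, g in zip(log_probs, returns)]
    return -sum(terms) / len(terms)


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_coef=0.2):
    """PPO clipped surrogate loss (hypothetical helper, illustration only)."""
    terms = []
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_coef, 1 + clip_coef].
        clipped_ratio = max(1.0 - clip_coef, min(ratio, 1.0 + clip_coef))
        # Take the more pessimistic of the unclipped and clipped objectives.
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return -sum(terms) / len(terms)


if __name__ == "__main__":
    # The policy doubled the probability of an action with positive advantage.
    print(ppo_clip_loss([math.log(2.0)], [0.0], [1.0]))
    # Same probability change, but the action had negative advantage.
    print(ppo_clip_loss([math.log(2.0)], [0.0], [-1.0]))
    print(reinforce_loss([-1.0, -2.0], [1.0, 3.0]))
```

The clipping removes the incentive to move the probability ratio far from 1 when doing so would improve the objective, while leaving the penalty in place when the change makes things worse. REINFORCE has no such limit on how far a single update can move the policy.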