reinforce-safe_exploration-gridworld

Project for the graduate course EEL-6935 Safe Autonomous Systems at the Department of Electrical and Computer Engineering, University of Florida.

The objective is to solve the safe exploration problem in a reinforcement learning framework in the Island Navigation environment, one of the AI Safety Gridworlds proposed by DeepMind [1]. We define dangerous states as states from which the agent can reach a catastrophic state within a short period, and we propose to distinguish safe from dangerous states by training a separate supervised neural network. This network is trained whenever a catastrophe is detected. During subsequent exploration, we reduce the reward obtained in dangerous states, motivating the agent to avoid them and hence avoid catastrophes. The proposed approach was used to modify the Advantage Actor-Critic (A2C) algorithm. Experimental results suggest an improvement over the base A2C algorithm, both in reducing catastrophic events and in faster convergence.

[1] J. Leike, M. Martic, V. Krakovna, P. A. Ortega, T. Everitt, A. Lefrancq, L. Orseau, and S. Legg, “AI safety grid-worlds,” CoRR, vol. abs/1711.09883, 2017. [Online]. Available: http://arxiv.org/abs/1711.09883.
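The idea above can be sketched in a few lines. This is a hypothetical illustration, not code from this repository: a tiny logistic-regression classifier stands in for the supervised neural network that flags dangerous states, and its predicted danger probability is subtracted from the environment reward. All names (`DangerClassifier`, `shaped_reward`, the toy state labels) are illustrative assumptions.

```python
import math

class DangerClassifier:
    """Logistic regression over one-hot state indices -- a minimal stand-in
    for the separate supervised network trained whenever a catastrophe
    is detected."""

    def __init__(self, n_states, lr=0.5):
        self.w = [0.0] * n_states   # one weight per state (one-hot features)
        self.b = 0.0
        self.lr = lr

    def predict(self, s):
        """Probability that state s is dangerous."""
        return 1.0 / (1.0 + math.exp(-(self.w[s] + self.b)))

    def update(self, states, labels):
        """SGD on binary cross-entropy over (state, label) pairs."""
        for s, y in zip(states, labels):
            grad = self.predict(s) - y
            self.w[s] -= self.lr * grad
            self.b -= self.lr * grad


def shaped_reward(r, p_danger, penalty=1.0):
    """Reduce the environment reward in proportion to predicted danger."""
    return r - penalty * p_danger


# Toy data: four states next to water; only state 3 can reach the
# catastrophic water cell in one step, so it is labeled dangerous.
clf = DangerClassifier(n_states=4)
states = [0, 1, 2, 3] * 200
labels = [0, 0, 0, 1] * 200
clf.update(states, labels)

# The shaped reward is lower in the flagged state, steering the agent away.
print(clf.predict(3), clf.predict(1))
```

In the actual project, a neural network over the grid observation would replace the logistic model, and the shaped reward would feed into the A2C update in place of the raw environment reward.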

Required modules: PyTorch & NumPy.

Just run the notebook.
