This repository contains implementations of agents using Nervana Systems' Reinforcement Learning Coach, and Bayesian optimization implementations built on Sheffield's GPyOpt.
These scripts require Linux.
There is also a heavy reliance on pandas to manage the .csv files used for logging data during training. This is implemented in such a way that, while running Bayesian optimization, the log files of a failed run are removed if the environment/agent crashes for any reason.
This repository was put together quite quickly and not everything has been extensively tested. However, the premise behind the code works: Bayesian optimization of RL Coach agents for any arbitrary environment, provided the environment uses either the OpenAI Gym or the RL Coach interface.
Optimal hyperparameters are found by maximizing the following objective:

    f(x) = (1/N) * sum_{i=1..N} R_i

where f(x) is defined as the averaged sum of 'Training rewards' R_i over all N episodes in a training cycle.
Here are some reasons for choosing this metric:
- Very easy to implement
- Quantifies learning rate: faster is better
- Quantifies stable learning: continual progress is better
- Quantifies asymptotic performance: the higher the final performance, the better
Of course there is a possibility that a local maximum is found, but in my experience this metric is sufficient for achieving acceptable results.
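As a concrete sketch of this metric: assuming the training log is a .csv file with one 'Training Reward' value per episode (the column name is an assumption about the RL Coach log format, not verified here), the objective could be computed with pandas like so:

```python
import pandas as pd

def objective_from_log(log_path, reward_col="Training Reward"):
    """Averaged sum of per-episode training rewards: sum(R_i) / N.

    The column name 'Training Reward' is assumed; adjust it to match
    the actual log files produced by the agent.
    """
    rewards = pd.read_csv(log_path)[reward_col].dropna()
    # mean() is exactly the "averaged sum" from the formula above
    return rewards.mean()
```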
Using the GPyOpt library, a Gaussian process Bayesian optimizer is constructed. The two mandatory parameters that must be passed are:
- `domain`: the boundary definitions of the hyper-parameter set, as a list of dicts.
- `f`: the objective function, which can be implemented in Python as an ordinary function, e.g. `def run_ai(x): do_stuff(); return y`.
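A minimal sketch of this construction, assuming GPyOpt's `BayesianOptimization` interface; the hyper-parameter names and ranges in `bounds` are placeholders, and `run_ai` returns a dummy value rather than training a real agent:

```python
import numpy as np

# Hypothetical search space: the names and ranges are illustrative only.
bounds = [
    {"name": "learning_rate", "type": "continuous", "domain": (1e-5, 1e-2)},
    {"name": "discount", "type": "continuous", "domain": (0.9, 0.999)},
]

def run_ai(x):
    # GPyOpt passes a 2-D array of shape (1, n_params). A real
    # implementation would train the agent here and return the objective;
    # GPyOpt minimizes by default, so the reward is negated.
    return -np.sum(x, axis=1, keepdims=True)  # dummy stand-in value

def make_optimizer():
    import GPyOpt  # third-party dependency
    return GPyOpt.methods.BayesianOptimization(
        f=run_ai,
        domain=bounds,
        acquisition_type="EI",  # Expected Improvement (the default choice)
    )
```

A call such as `make_optimizer().run_optimization(max_iter=20)` would then evaluate `run_ai` for 20 further parameter sets.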
The acquisition function used to determine the next choice of hyper-parameters is, by default, Expected Improvement.
The algorithm/function defined here actually performs 3 steps:
- It writes the new parameters to an opt_params.csv file.
- It calls the agent .py script and waits for it to finish running.
- After successful execution of the agent script, it reads the log file and returns the summed total reward over all episodes. If the agent script crashes, it deletes the incomplete training data and exits.
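The three steps above might look roughly like this in Python. The agent script name, the opt_params.csv layout, and the 'Training Reward' column name are all assumptions for illustration, not the repo's exact files:

```python
import os
import subprocess
import sys
import pandas as pd

AGENT_SCRIPT = "agent.py"      # hypothetical path to the RL Coach agent script
PARAMS_FILE = "opt_params.csv"

def write_params(x, names):
    # Step 1: append the new hyper-parameter set for the agent to pick up.
    row = pd.DataFrame([dict(zip(names, x))])
    row.to_csv(PARAMS_FILE, mode="a",
               header=not os.path.exists(PARAMS_FILE), index=False)

def total_reward(log_path, reward_col="Training Reward"):
    # Step 3: sum the per-episode rewards from the agent's log file.
    return pd.read_csv(log_path)[reward_col].sum()

def run_agent(log_path):
    # Step 2: run the agent and wait for it to finish. On a crash, remove
    # the incomplete log and abort, mirroring the behaviour described above.
    result = subprocess.run([sys.executable, AGENT_SCRIPT])
    if result.returncode != 0:
        if os.path.exists(log_path):
            os.remove(log_path)
        sys.exit("Agent crashed; removed incomplete log and stopped.")
    return total_reward(log_path)
```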
In the agent script, the opt_params.csv file is read and the latest hyper-parameter entry is used to construct a new agent.
The new agent is trained for a predefined number of iterations, and upon completion the hyper-parameter optimization process resumes.
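On the agent side, reading the newest hyper-parameter entry could be as simple as the following sketch (the file layout is assumed, with one parameter set per row, newest last):

```python
import pandas as pd

def load_latest_params(path="opt_params.csv"):
    # The last row holds the most recently written hyper-parameter set.
    return pd.read_csv(path).iloc[-1].to_dict()
```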
The output of the bayesopt.py script is the optimization_parameters.csv file. Agents implemented in Reinforcement Learning Coach automatically generate log files that are used both by the Dashboard app that ships with RL Coach and by the optimization script implemented in this repo.
This means that any agent realized in RL Coach can easily be optimized using bayesopt.py; all that is required is having the agent load new parameters from the opt_params.csv file and defining the boundaries of the hyper-parameter search space.
Future goals, time permitting:
- Implement multi-agent optimization techniques
- Implement CMA-ES as an alternate optimization method