Upside-Down RL

title: Upside-Down Reinforcement Learning
emoji: 🤖
colorFrom: green
colorTo: gray
sdk: streamlit
python_version: 3.10
sdk_version: 1.39.0
app_file: app.py
pinned: true
short_description: Upside-Down Reinforcement Learning (UDRL)

Upside-Down RL

This project implements an Upside-Down Reinforcement Learning (UDRL) agent.

This is the codebase of the paper: arXiv

The website associated with it is: demo

Installation

Make sure you have Python 3.10 installed. You can check your version with python --version.
Note: Use a virtual environment to avoid dependency clashes.
Install the project dependencies using Poetry:
```
poetry install
```
If you do not have Poetry, install the requirements with pip instead:
```
pip install -r requirements.txt
```

Running the Experiment

You can run the experiment with various configuration options using the command line:

poetry run python -m udrl [options]

Note: If you are already inside a virtual environment, python -m udrl [options] is enough.
Note: All defaults are for CartPole-v0.

Available options include:

--env_name: Name of the Gym environment (default: CartPole-v0)
--estimator_name: "neural" for NN or a fully qualified name of the scikit-learn estimator class (default: ensemble.RandomForestClassifier)
--seed: Random seed (default: 42)
--max_episode: Maximum training episodes (default: 500)
--collect_episode: Episodes to collect between training (default: 15)
--batch_size: Batch size for training (default: 0, uses entire replay buffer)
Other options related to warm-up, memory size, exploration, testing, saving, etc.
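
To make the documented flags and their defaults concrete, here is a minimal sketch of a parser that mirrors only the options listed above. It is not the project's actual argument handling: `build_parser` and `resolve_estimator_class` are hypothetical names, and the assumption that an estimator name such as ensemble.RandomForestClassifier is looked up relative to the `sklearn` package is an illustration only.

```python
import argparse
import importlib


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the options documented in this README."""
    parser = argparse.ArgumentParser(prog="udrl")
    parser.add_argument("--env_name", default="CartPole-v0")
    parser.add_argument("--estimator_name", default="ensemble.RandomForestClassifier")
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--max_episode", type=int, default=500)
    parser.add_argument("--collect_episode", type=int, default=15)
    # 0 is documented as "use the entire replay buffer"
    parser.add_argument("--batch_size", type=int, default=0)
    return parser


def resolve_estimator_class(name: str, package: str = "sklearn"):
    """Assumed lookup: split "module.Class" and import the module under `package`."""
    module_path, _, class_name = name.rpartition(".")
    full_module = f"{package}.{module_path}" if module_path else package
    module = importlib.import_module(full_module)
    return getattr(module, class_name)


if __name__ == "__main__":
    # Override a single option and keep the documented defaults for the rest.
    args = build_parser().parse_args(["--seed", "7"])
    print(args.env_name, args.estimator_name, args.seed)
```

With scikit-learn installed, `resolve_estimator_class("ensemble.RandomForestClassifier")` would return the class itself under this assumed lookup; the value "neural" would need separate handling, which is not shown here.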

Result Data

Experiment configuration and final test results are saved in a JSON file (conf.json) within a directory structure based on the environment, seed, and non-default configuration values (e.g., data/[env-name]/[experiment_name]/[seed]/conf.json).
If save_policy is True, the trained policy is saved in the same directory (policy).
If save_learning_infos is True, the learning infos and rewards collected during training are saved in the same directory as a NumPy file (e.g., test_rewards.npy) and a JSON file (e.g., learning_infos.json).
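
The layout above can be read back with a few lines of Python. The following is a minimal sketch, not part of the udrl package: `run_directory` and `load_run` are hypothetical helpers, and the file names test_rewards.npy and learning_infos.json are taken from the examples above, so they may differ in an actual run.

```python
import json
from pathlib import Path


def run_directory(env_name, experiment_name, seed, root="data"):
    """Build data/[env-name]/[experiment_name]/[seed] as described above."""
    return Path(root) / env_name / experiment_name / str(seed)


def load_run(run_dir):
    """Load conf.json and, when present, the optional learning artifacts."""
    run_dir = Path(run_dir)
    with open(run_dir / "conf.json", encoding="utf-8") as fh:
        result = {"conf": json.load(fh)}

    # Only written when save_learning_infos is True; file names are assumed.
    infos_path = run_dir / "learning_infos.json"
    if infos_path.exists():
        with open(infos_path, encoding="utf-8") as fh:
            result["learning_infos"] = json.load(fh)

    rewards_path = run_dir / "test_rewards.npy"
    if rewards_path.exists():
        import numpy as np  # imported lazily so conf.json can be read without NumPy

        result["test_rewards"] = np.load(rewards_path)
    return result
```

For example, `load_run(run_directory("CartPole-v0", "my_experiment", 42))["conf"]` would return the saved configuration as a dictionary, where "my_experiment" stands in for whatever experiment name the run produced.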

Process Data

A basic post-processing step is available to convert the result data into CSV files. Run it with python -m udrl.data_proc.

Project Structure

data: Stores experiment results and other data.
old_code: Contains previous code versions (not used in the current setup).
poetry.lock, pyproject.toml: Manage project dependencies and configuration.
README.md: This file.
udrl: Contains the main Python modules for the UDRL agent.

Please refer to the code and comments for further details on the implementation.

Troubleshooting

If you encounter any errors during installation or execution, or if you have any questions about the project, feel free to reach out to me at massimiliano@falzari.dev or open an issue. I'll be happy to assist you!
