Update quickstart guide for Stable-Baselines3 #9
Yassineachkhity wants to merge 1 commit into Stable-Baselines-Team:master
Conversation
Fix old-version mismatch error
Pull request overview
This pull request attempts to update the quickstart guide documentation, but it incorrectly updates the stable-baselines (v2) repository's documentation to use the stable-baselines3 API. This is a critical error because this repository is for stable-baselines v2 (which is in maintenance mode), not stable-baselines3. The two libraries have completely different APIs, import paths, and dependencies.
Changes:
- Replaces stable-baselines v2 API examples with stable-baselines3 API examples
- Changes imports from "stable_baselines" to "stable_baselines3"
- Changes gym imports from "gym" to "gymnasium"
- Updates PPO2 references to PPO
- Adds new installation instructions and expanded explanations
| import gym | ||
| from stable_baselines3.common.vec_env import DummyVecEnv | ||
| from stable_baselines3 import PPO |
The import statement uses "from stable_baselines3 import PPO", but this repository is stable-baselines (v2) which uses "from stable_baselines import PPO2" (note: PPO2, not PPO). This import will fail for users of this library.
| import gym | ||
| from stable_baselines3.common.vec_env import DummyVecEnv | ||
| from stable_baselines3 import PPO | ||
| import gymnasium as gym |
The import uses "gymnasium as gym", but stable-baselines (v2) is designed to work with the old "gym" library (OpenAI Gym), not "gymnasium". Throughout the rest of the documentation in this repository, "import gym" is used consistently. Gymnasium is used by stable-baselines3, not stable-baselines v2.
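The practical difference behind this comment can be sketched as follows. The classic Gym API (before version 0.26) returns a 4-tuple from step(), while Gymnasium returns a 5-tuple that splits "done" into "terminated" and "truncated". This is a minimal illustration that assumes only those two tuple shapes; normalize_step is a hypothetical helper and is not part of either library.

```python
def normalize_step(result):
    """Convert a step() result to the classic (obs, reward, done, info) form.

    Hypothetical helper for illustration; neither gym nor gymnasium ships it.
    """
    if len(result) == 4:
        # Classic Gym layout: (obs, reward, done, info)
        return tuple(result)
    if len(result) == 5:
        # Gymnasium layout: (obs, reward, terminated, truncated, info)
        obs, reward, terminated, truncated, info = result
        return obs, reward, bool(terminated or truncated), info
    raise ValueError(f"unexpected step() result of length {len(result)}")
```

Code written against one layout will fail to unpack the other, which is why mixing "gymnasium" into documentation for a library built on classic Gym is likely to break the examples.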
| .. code-block:: python | ||
| import gym | ||
| from stable_baselines3.common.vec_env import DummyVecEnv |
The import statement uses "from stable_baselines3.common.vec_env import DummyVecEnv", but this repository is stable-baselines (v2) which uses "from stable_baselines.common.vec_env import DummyVecEnv" (note: stable_baselines, not stable_baselines3).
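Because the two libraries live under different top-level module names, a quick way to see which one an environment actually provides is to probe for both. The sketch below is not part of this PR; installed_sb_packages is a hypothetical helper that relies only on the standard-library importlib.util.find_spec.

```python
from importlib.util import find_spec

# Top-level module names of the two libraries discussed in this review.
CANDIDATES = ("stable_baselines", "stable_baselines3")


def installed_sb_packages(candidates=CANDIDATES):
    """Return the subset of `candidates` that is importable in this environment."""
    return [name for name in candidates if find_spec(name) is not None]


if __name__ == "__main__":
    # Prints a list containing zero, one, or both module names.
    print(installed_sb_packages())
```

If only "stable_baselines" is reported, the imports in the proposed quickstart would raise ImportError for that user.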
| Or just train a model with a one liner if | ||
| `the environment is registered in Gym <https://github.com/openai/gym/wiki/Environments>`_ and if | ||
| `the policy is registered <custom_policy.html>`_: | ||
| If the environment is registered in Gymnasium and the default policy is appropriate, |
The comment references "Gymnasium", but stable-baselines (v2) uses OpenAI Gym, not Gymnasium. Gymnasium is used by stable-baselines3, not this library.
| action, _states = model.predict(obs) | ||
| obs, rewards, dones, info = env.step(action) | ||
| env.render() | ||
| - ``DummyVecEnv`` wraps the environment into a vectorized format required by Stable-Baselines3. |
The comment states that DummyVecEnv is "required by Stable-Baselines3", but this repository is stable-baselines (v2). The explanation should reference stable-baselines, not stable-baselines3. Additionally, in stable-baselines v2, vectorized environments are optional for PPO2 (the environment is wrapped automatically when passing it to the constructor).
| - ``DummyVecEnv`` wraps the environment into a vectorized format required by Stable-Baselines3. | |
| - ``DummyVecEnv`` wraps the environment into a vectorized format used by stable-baselines; for PPO2 in stable-baselines (v2), this wrapping is optional because the environment is automatically vectorized when passed to the constructor. |
| .. code-block:: python | ||
| from stable_baselines import PPO2 | ||
| from stable_baselines3 import PPO |
The import uses "from stable_baselines3 import PPO", but this repository is stable-baselines (v2) which uses "from stable_baselines import PPO2" (note the different module name and algorithm class name).
| obs = env.reset() | ||
| for _ in range(1000): | ||
| action, _ = model.predict(obs, deterministic=True) | ||
| obs, reward, done, info = env.step(action) |
When using vectorized environments in stable-baselines (v2), the step() method returns arrays with pluralized variable names: "obs, rewards, dones, info". This code uses singular "reward, done" which would cause confusion. Check examples.rst lines 87, 145, etc. for the correct pattern with vectorized environments in stable-baselines v2.
| obs, reward, done, info = env.step(action) | |
| obs, rewards, dones, infos = env.step(action) |
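To show why the plural names fit, here is a toy sketch of a vectorized step() that returns one entry per sub-environment. ToyVecEnv is a hypothetical stand-in written for this illustration and is not the library's DummyVecEnv; plain Python lists stand in for the NumPy arrays a real vectorized environment would return.

```python
class ToyVecEnv:
    """Hypothetical stand-in for a vectorized env; NOT the library's DummyVecEnv."""

    def __init__(self, n_envs, episode_len=3):
        self.n_envs = n_envs
        self.episode_len = episode_len
        self.steps = [0] * n_envs

    def reset(self):
        self.steps = [0] * self.n_envs
        return [0] * self.n_envs  # one observation per sub-environment

    def step(self, actions):
        obs, rewards, dones, infos = [], [], [], []
        for i, action in enumerate(actions):
            self.steps[i] += 1
            done = self.steps[i] >= self.episode_len
            if done:
                # Assumption for this toy: finished sub-envs restart immediately.
                self.steps[i] = 0
            obs.append(self.steps[i])
            rewards.append(float(action))  # toy reward: echo the action
            dones.append(done)
            infos.append({})
        return obs, rewards, dones, infos


env = ToyVecEnv(n_envs=2)
obs = env.reset()
obs, rewards, dones, infos = env.step([1, 0])
```

Each returned value holds one element per sub-environment, so a check such as "if done:" on the whole batch would be ambiguous; the plural names make the batched shape explicit.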
| Stable-Baselines3 follows a scikit-learn–like interface for Reinforcement Learning algorithms. | ||
| Models are created, trained using `.learn()`, and used for prediction via `.predict()`. |
This documentation update appears to be for the wrong repository. This is the stable-baselines (v2) repository, which is in maintenance mode and uses the old API with "stable_baselines" imports and "gym" environments. However, this PR updates the documentation to use stable-baselines3 API patterns. The description mentions stable-baselines3, but this should remain as stable-baselines v2 documentation. Please review the entire change and ensure it matches the correct library version.
| pip install stable-baselines3[extra] | ||
| pip install gymnasium[box2d] |
The installation instructions reference stable-baselines3, but this repository is for stable-baselines (v2). The correct installation command for this library is "pip install stable-baselines", not "pip install stable-baselines3[extra]".
Description
Motivation and Context
Types of changes
Checklist:
pytest and pytype both pass (by running make pytest and make type).