Update quickstart guide for Stable-Baselines3 #9

Closed
Yassineachkhity wants to merge 1 commit into Stable-Baselines-Team:master from Yassineachkhity:patch-1

Yassineachkhity commented Feb 26, 2026

Fix old mismatching version error

Description

Motivation and Context

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist:

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have ensured pytest and pytype both pass (by running make pytest and make type).

fix old mismatching old version error
Copilot AI review requested due to automatic review settings February 26, 2026 12:30
Copilot AI left a comment

Pull request overview

This pull request attempts to update the quickstart guide documentation, but it incorrectly updates the stable-baselines (v2) repository's documentation to use the stable-baselines3 API. This is a critical error because this repository is for stable-baselines v2 (which is in maintenance mode), not stable-baselines3. The two libraries have completely different APIs, import paths, and dependencies.

Changes:

  • Replaces stable-baselines v2 API examples with stable-baselines3 API examples
  • Changes imports from stable_baselines to stable_baselines3
  • Changes gym imports from gym to gymnasium
  • Updates PPO2 references to PPO
  • Adds new installation instructions and expanded explanations

import gym
from stable_baselines3.common.vec_env import DummyVecEnv
from stable_baselines3 import PPO
Copilot AI Feb 26, 2026

The import statement uses "from stable_baselines3 import PPO", but this repository is stable-baselines (v2) which uses "from stable_baselines import PPO2" (note: PPO2, not PPO). This import will fail for users of this library.

import gym
from stable_baselines3.common.vec_env import DummyVecEnv
from stable_baselines3 import PPO
import gymnasium as gym
Copilot AI Feb 26, 2026

The import uses "gymnasium as gym", but stable-baselines (v2) is designed to work with the old "gym" library (OpenAI Gym), not "gymnasium". Throughout the rest of the documentation in this repository, "import gym" is used consistently. Gymnasium is used by stable-baselines3, not stable-baselines v2.

.. code-block:: python

import gym
from stable_baselines3.common.vec_env import DummyVecEnv
Copilot AI Feb 26, 2026

The import statement uses "from stable_baselines3.common.vec_env import DummyVecEnv", but this repository is stable-baselines (v2) which uses "from stable_baselines.common.vec_env import DummyVecEnv" (note: stable_baselines, not stable_baselines3).

Or just train a model with a one liner if
`the environment is registered in Gym <https://github.com/openai/gym/wiki/Environments>`_ and if
`the policy is registered <custom_policy.html>`_:
If the environment is registered in Gymnasium and the default policy is appropriate,
Copilot AI Feb 26, 2026

The comment references "Gymnasium", but stable-baselines (v2) uses OpenAI Gym, not Gymnasium. Gymnasium is used by stable-baselines3, not this library.

action, _states = model.predict(obs)
obs, rewards, dones, info = env.step(action)
env.render()
- ``DummyVecEnv`` wraps the environment into a vectorized format required by Stable-Baselines3.
Copilot AI Feb 26, 2026

The comment states that DummyVecEnv is "required by Stable-Baselines3", but this repository is stable-baselines (v2). The explanation should reference stable-baselines, not stable-baselines3. Additionally, in stable-baselines v2, vectorized environments are optional for PPO2 (the environment is wrapped automatically when passing it to the constructor).

Suggested change
- ``DummyVecEnv`` wraps the environment into a vectorized format required by Stable-Baselines3.
- ``DummyVecEnv`` wraps the environment into a vectorized format used by stable-baselines; for PPO2 in stable-baselines (v2), this wrapping is optional because the environment is automatically vectorized when passed to the constructor.

.. code-block:: python

from stable_baselines import PPO2
from stable_baselines3 import PPO
Copilot AI Feb 26, 2026

The import uses "from stable_baselines3 import PPO", but this repository is stable-baselines (v2) which uses "from stable_baselines import PPO2" (note the different module name and algorithm class name).

obs = env.reset()
for _ in range(1000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
Copilot AI Feb 26, 2026

When using vectorized environments in stable-baselines (v2), the step() method returns arrays with pluralized variable names: "obs, rewards, dones, info". This code uses singular "reward, done" which would cause confusion. Check examples.rst lines 87, 145, etc. for the correct pattern with vectorized environments in stable-baselines v2.

Suggested change
obs, reward, done, info = env.step(action)
obs, rewards, dones, infos = env.step(action)
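
To illustrate why the plural names fit, here is a minimal sketch of what a vectorized `step()` returns. `ToyEnv` and `ToyVecEnv` are hypothetical stand-ins written for this comment, not classes from stable-baselines; the only assumption is that a vectorized wrapper returns one entry per sub-environment and resets a sub-environment when its episode ends.

```python
class ToyEnv:
    """Hypothetical single environment whose episode ends after `horizon` steps."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        # (observation, reward, done, info) for ONE environment
        return float(self.t), 1.0, done, {}


class ToyVecEnv:
    """Hypothetical stand-in for a vectorized wrapper (not the library class)."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs, rewards, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            ob, reward, done, info = env.step(action)
            if done:
                # Assumed behaviour: finished sub-environments are reset automatically.
                ob = env.reset()
            obs.append(ob)
            rewards.append(reward)
            dones.append(done)
            infos.append(info)
        # One entry per sub-environment, hence the plural names.
        return obs, rewards, dones, infos


env = ToyVecEnv([lambda: ToyEnv(horizon=2), lambda: ToyEnv(horizon=3)])
obs = env.reset()
obs, rewards, dones, infos = env.step([0, 0])
obs, rewards, dones, infos = env.step([0, 0])
print(obs, rewards, dones, infos)
```

With a single sub-environment the values are still sequences of length one, so a check such as `if done:` on the returned value would test a container rather than a boolean.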

Comment on lines +7 to +8
Stable-Baselines3 follows a scikit-learn–like interface for Reinforcement Learning algorithms.
Models are created, trained using `.learn()`, and used for prediction via `.predict()`.
Copilot AI Feb 26, 2026

This documentation update appears to be for the wrong repository. This is the stable-baselines (v2) repository, which is in maintenance mode and uses the old API with "stable_baselines" imports and "gym" environments. However, this PR updates the documentation to use stable-baselines3 API patterns. The description mentions stable-baselines3, but this should remain as stable-baselines v2 documentation. Please review the entire change and ensure it matches the correct library version.
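
One way to review the whole change for version mismatches is to scan the edited documentation for stable-baselines3-style identifiers. The sketch below is only an illustration: `find_sb3_references` and the pattern list are hypothetical helpers, and the patterns cover just the identifiers called out in this review (`stable_baselines3` / `stable-baselines3` and `gymnasium`).

```python
import re

# Patterns for identifiers that belong to stable-baselines3, not stable-baselines (v2).
# Illustrative only: limited to the names mentioned in this review thread.
PATTERNS = [
    (re.compile(r"\bstable[-_]baselines3\b", re.IGNORECASE),
     "use stable_baselines / stable-baselines"),
    (re.compile(r"\bgymnasium\b", re.IGNORECASE),
     "use gym"),
]


def find_sb3_references(text):
    """Return (line_number, matched_text, hint) for each SB3-style reference."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, hint in PATTERNS:
            for match in pattern.finditer(line):
                hits.append((lineno, match.group(0), hint))
    return hits


sample = """\
import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines import PPO2
"""

hits = find_sb3_references(sample)
for lineno, found, hint in hits:
    print(f"line {lineno}: {found!r} -> {hint}")
```

The third line of the sample is the v2 import and is not flagged, because the pattern requires the trailing `3`.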

Comment on lines +17 to +18
pip install stable-baselines3[extra]
pip install gymnasium[box2d]
Copilot AI Feb 26, 2026

The installation instructions reference stable-baselines3, but this repository is for stable-baselines (v2). The correct installation for this library would be "pip install stable-baselines" not "pip install stable-baselines3[extra]".
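
Because the two pip packages install different import names (`stable_baselines` versus `stable_baselines3`), a reader can check which one is present before following either quickstart. This is a small sketch using only the standard library; `installed_variants` is a hypothetical helper name.

```python
import importlib.util


def installed_variants():
    """Report which of the two libraries is importable in the current environment."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in ("stable_baselines", "stable_baselines3")
    }


status = installed_variants()
for name, present in status.items():
    print(f"{name}: {'installed' if present else 'not installed'}")
```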

@Yassineachkhity Yassineachkhity closed this by deleting the head repository Feb 26, 2026