Update quickstart guide for Stable-Baselines3 by Yassineachkhity · Pull Request #9 · Stable-Baselines-Team/stable-baselines

Yassineachkhity · 2026-02-26T12:30:19Z

Fix an error caused by a mismatch with the old version

Description

Motivation and Context

I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)

Checklist:

I've read the CONTRIBUTION guide (required)
I have updated the changelog accordingly (required).
My change requires a change to the documentation.
I have updated the tests accordingly (required for a bug fix or a new feature).
I have updated the documentation accordingly.
I have ensured pytest and pytype both pass (by running make pytest and make type).

Copilot

Pull request overview

This pull request attempts to update the quickstart guide documentation, but it incorrectly updates the stable-baselines (v2) repository's documentation to use the stable-baselines3 API. This is a critical error because this repository is for stable-baselines v2 (which is in maintenance mode), not stable-baselines3. The two libraries have completely different APIs, import paths, and dependencies.

Changes:

Replaces stable-baselines v2 API examples with stable-baselines3 API examples
Changes imports from stable_baselines to stable_baselines3
Changes gym imports from gym to gymnasium
Updates PPO2 references to PPO
Adds new installation instructions and expanded explanations

Copilot · 2026-02-26T12:34:49Z

docs/guide/quickstart.rst


-  import gym
+   from stable_baselines3.common.vec_env import DummyVecEnv
+   from stable_baselines3 import PPO


The import statement uses "from stable_baselines3 import PPO", but this repository is stable-baselines (v2), which uses "from stable_baselines import PPO2" (note: PPO2, not PPO). This import will raise ModuleNotFoundError for users who have only stable-baselines (v2) installed.

Copilot · 2026-02-26T12:34:49Z

docs/guide/quickstart.rst

-  import gym
+   from stable_baselines3.common.vec_env import DummyVecEnv
+   from stable_baselines3 import PPO
+   import gymnasium as gym


The import uses "gymnasium as gym", but stable-baselines (v2) is designed to work with the old "gym" library (OpenAI Gym), not "gymnasium". Throughout the rest of the documentation in this repository, "import gym" is used consistently. Gymnasium is used by stable-baselines3, not stable-baselines v2.
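To make the gym/gymnasium mismatch concrete: older OpenAI Gym releases (the API that stable-baselines v2 targets) return a 4-tuple from step(), while Gymnasium returns a 5-tuple that splits "done" into "terminated" and "truncated". The sketch below is illustrative only. `normalize_step` is a hypothetical helper, not part of either library, and hand-written tuples stand in for real environment output.

```python
def normalize_step(result):
    """Convert a step() result to the classic 4-tuple (obs, reward, done, info).

    Hypothetical helper for illustration; not part of gym or gymnasium.
    """
    if len(result) == 4:
        # Classic Gym: (obs, reward, done, info)
        return tuple(result)
    if len(result) == 5:
        # Gymnasium: (obs, reward, terminated, truncated, info)
        obs, reward, terminated, truncated, info = result
        return obs, reward, terminated or truncated, info
    raise ValueError(f"unexpected step() result of length {len(result)}")


# Hand-written tuples stand in for real environment output.
classic = normalize_step(([0.1, 0.2], 1.0, False, {}))
modern = normalize_step(([0.1, 0.2], 1.0, False, True, {}))
```

Code written for one return shape cannot unpack the other without an adapter of this kind, which is one reason swapping "import gym" for "import gymnasium as gym" is not a drop-in change.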

Copilot · 2026-02-26T12:34:50Z

docs/guide/quickstart.rst

 .. code-block:: python

-  import gym
+   from stable_baselines3.common.vec_env import DummyVecEnv


The import statement uses "from stable_baselines3.common.vec_env import DummyVecEnv", but this repository is stable-baselines (v2) which uses "from stable_baselines.common.vec_env import DummyVecEnv" (note: stable_baselines, not stable_baselines3).

Copilot · 2026-02-26T12:34:50Z

docs/guide/quickstart.rst

-Or just train a model with a one liner if
-`the environment is registered in Gym <https://github.com/openai/gym/wiki/Environments>`_ and if
-`the policy is registered <custom_policy.html>`_:
+If the environment is registered in Gymnasium and the default policy is appropriate,


The text references "Gymnasium", but stable-baselines (v2) uses OpenAI Gym, not Gymnasium. Gymnasium is used by stable-baselines3, not this library.

Copilot · 2026-02-26T12:34:50Z

docs/guide/quickstart.rst

-      action, _states = model.predict(obs)
-      obs, rewards, dones, info = env.step(action)
-      env.render()
+- ``DummyVecEnv`` wraps the environment into a vectorized format required by Stable-Baselines3.


The bullet states that DummyVecEnv is "required by Stable-Baselines3", but this repository is stable-baselines (v2), so the explanation should reference stable-baselines, not stable-baselines3. Additionally, in stable-baselines v2 you do not need to wrap the environment yourself for PPO2: a plain environment is wrapped automatically when it is passed to the constructor.

Suggested change

-- ``DummyVecEnv`` wraps the environment into a vectorized format required by Stable-Baselines3.
+- ``DummyVecEnv`` wraps the environment into a vectorized format used by stable-baselines; for PPO2 in stable-baselines (v2), this wrapping is optional because the environment is automatically vectorized when passed to the constructor.
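For reference, a minimal sketch of what a stable-baselines (v2) quickstart looks like without an explicit DummyVecEnv. It assumes stable-baselines (v2) and classic gym are installed; the function name `train_ppo2_quickstart` and the `n_steps` parameter are hypothetical and exist only for this illustration. Imports are kept inside the function so that defining it does not require either package.

```python
def train_ppo2_quickstart(total_timesteps=10000, n_steps=1000):
    """Train PPO2 on CartPole-v1 with stable-baselines (v2), then run the agent.

    Hypothetical wrapper for illustration. Imports are local so that merely
    defining this function does not require stable-baselines (v2) or gym.
    """
    import gym
    from stable_baselines import PPO2
    from stable_baselines.common.policies import MlpPolicy

    env = gym.make("CartPole-v1")
    # No explicit DummyVecEnv here: PPO2 wraps a plain gym env automatically.
    model = PPO2(MlpPolicy, env, verbose=1)
    model.learn(total_timesteps=total_timesteps)

    # `env` is still the plain (non-vectorized) env, so step() returns scalars.
    obs = env.reset()
    for _ in range(n_steps):
        action, _states = model.predict(obs)
        obs, reward, done, info = env.step(action)
        if done:
            obs = env.reset()
    env.close()
    return model
```

Calling `train_ppo2_quickstart()` runs the training. Note the singular `reward, done` names are appropriate here only because the loop steps the plain environment rather than a vectorized one.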

Copilot · 2026-02-26T12:34:50Z

docs/guide/quickstart.rst

 .. code-block:: python

-    from stable_baselines import PPO2
+   from stable_baselines3 import PPO


The import uses "from stable_baselines3 import PPO", but this repository is stable-baselines (v2) which uses "from stable_baselines import PPO2" (note the different module name and algorithm class name).
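Because the two libraries are separate packages with different module names, a reader can check which one is importable before copying an example. The following is a small sketch using only the standard library; `installed_sb_flavors` and the label strings are hypothetical names chosen for this illustration.

```python
import importlib.util


def installed_sb_flavors():
    """Return the Stable-Baselines packages that are importable here."""
    candidates = {
        "stable_baselines": "stable-baselines (v2): use PPO2",
        "stable_baselines3": "stable-baselines3: use PPO",
    }
    return {
        name: label
        for name, label in candidates.items()
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is not None
    }


found = installed_sb_flavors()
print(found or "neither stable_baselines nor stable_baselines3 is installed")
```

This does not import either package, so it is safe to run in any environment.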

Copilot · 2026-02-26T12:34:50Z

docs/guide/quickstart.rst

+   obs = env.reset()
+   for _ in range(1000):
+       action, _ = model.predict(obs, deterministic=True)
+       obs, reward, done, info = env.step(action)


When using vectorized environments in stable-baselines (v2), step() returns one entry per environment, so the examples bind the results to plural names: "obs, rewards, dones, info". This code uses the singular "reward, done", which suggests scalar values and is inconsistent with the rest of the documentation. Check examples.rst lines 87, 145, etc. for the correct pattern with vectorized environments in stable-baselines v2.

Suggested change

-       obs, reward, done, info = env.step(action)
+       obs, rewards, dones, infos = env.step(action)
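The reason for the plural names can be shown with a toy example. This is a pure-Python sketch, not the library's DummyVecEnv: `ToyEnv` and `ToyVecEnv` are hypothetical classes, and plain lists stand in for the NumPy arrays the real vectorized environments return.

```python
class ToyEnv:
    """Minimal stand-in for a classic-gym environment (4-tuple step)."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return [float(self.t)], 1.0, done, {}


class ToyVecEnv:
    """Steps several ToyEnv copies in lockstep, returning one entry per env."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        obs, rewards, dones, infos = (list(x) for x in zip(*results))
        # Finished sub-environments are reset automatically, as DummyVecEnv does.
        for i, done in enumerate(dones):
            if done:
                obs[i] = self.envs[i].reset()
        return obs, rewards, dones, infos


# Two environments with different episode lengths.
vec_env = ToyVecEnv([ToyEnv, lambda: ToyEnv(horizon=2)])
obs = vec_env.reset()
obs, rewards, dones, infos = vec_env.step([0, 0])
obs, rewards, dones, infos = vec_env.step([0, 0])
```

After the second step, `dones` holds one flag per environment, and only the second environment has finished. A single `done` name would hide the fact that the value is a collection.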

Copilot · 2026-02-26T12:34:51Z

docs/guide/quickstart.rst

+Stable-Baselines3 follows a scikit-learn–like interface for Reinforcement Learning algorithms.
+Models are created, trained using `.learn()`, and used for prediction via `.predict()`.


This documentation update appears to be for the wrong repository. This is the stable-baselines (v2) repository, which is in maintenance mode and uses the old API with "stable_baselines" imports and "gym" environments. However, this PR updates the documentation to use stable-baselines3 API patterns. The description mentions stable-baselines3, but this should remain as stable-baselines v2 documentation. Please review the entire change and ensure it matches the correct library version.

Copilot · 2026-02-26T12:34:51Z

docs/guide/quickstart.rst

+   pip install stable-baselines3[extra]
+   pip install gymnasium[box2d]


The installation instructions reference stable-baselines3, but this repository is for stable-baselines (v2). The correct installation for this library would be "pip install stable-baselines" not "pip install stable-baselines3[extra]".

Update quickstart guide for Stable-Baselines3

379458a

fix old mismatching old version error

Copilot AI review requested due to automatic review settings February 26, 2026 12:30

Copilot started reviewing on behalf of Yassineachkhity February 26, 2026 12:30

Copilot AI reviewed Feb 26, 2026

Yassineachkhity closed this by deleting the head repository Feb 26, 2026

Yassineachkhity wants to merge 1 commit into Stable-Baselines-Team:master from Yassineachkhity:patch-1

2 participants
