Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: DummyVecEnv.seed() resets the environment #1481

Closed
4 tasks done
Kallinteris-Andreas opened this issue May 4, 2023 · 4 comments · Fixed by #1486
Closed
4 tasks done

[Bug]: DummyVecEnv.seed() resets the environment #1481

Kallinteris-Andreas opened this issue May 4, 2023 · 4 comments · Fixed by #1486
Labels
bug Something isn't working

Comments

@Kallinteris-Andreas
Copy link
Contributor

Kallinteris-Andreas commented May 4, 2023

🐛 Bug

Seeding a DummyVecEnv resets the env with that seed (instead of saving the seed to be used in the next reset)

def seed(self, seed: Optional[int] = None) -> List[Union[None, int]]:

This behavior is weird and undocumented, and may cause reproducibility issues, I am assuming this happened in the transition to gymnasium-v26 API

the expected behavior would be to save the seed to be used in the next reset

Also, DummyVecEnv.seed() returns a List[None] instead of None

Possible Solutions:

  1. save the seed to be used in the next DummyVecEnv.reset()
  2. add the seed argument to reset (Just as Gymansium.Env has done and remove the seed() function)

As a Gymnasium dev, I would recommend the second solution

Note: the same behavior can be seen in subproc_vec_env

To Reproduce

import gymnasium

from stable_baselines3.common.vec_env import DummyVecEnv

MY_SEED = 0

env = gymnasium.make('Ant-v4')
d_env = DummyVecEnv([lambda: gymnasium.make('Ant-v4')])

obs, _ = env.reset(seed=MY_SEED)
obs1, _ = env.reset()
seeds = d_env.seed(MY_SEED)
d_obs = d_env.reset()

assert (d_obs[0] == obs1).all()

breakpoint()

Relevant log output / Error message

No response

System Info

$ pip list | grep stabl
stable-baselines3 2.0.0a5

$ python --version
Python 3.10.10

$ pip list | grep torch
torch 1.12.1

$ pip list | grep gym
gymnasium 0.28.1

Checklist

  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • I have provided a minimal working example to reproduce the bug
  • I've used the markdown code blocks for both code and stack traces.
@Kallinteris-Andreas Kallinteris-Andreas added the bug Something isn't working label May 4, 2023
@qgallouedec
Copy link
Collaborator

In addition, from the documentation:

  • the reset() method doesn't take any parameter. If you want to seed the pseudo-random generator, you should call vec_env.seed(seed=seed) and obs = vec_env.reset() afterward.

This means that the environment is actually reset twice. I agree that this is not very intuitive.

@araffin
Copy link
Member

araffin commented May 5, 2023

Hello,

I am assuming this happened in the transition to gymnasium-v26 API

This happened indeed when Gym broke its API.

save the seed to be used in the next DummyVecEnv.reset()

This is actually a nicer option I didn't consider. I would be happy to receive a PR =)
(we would need to document the behavior though)
We still need to double check but I think we always call reset() after calling seed(), we could also remove the compatibility wrapper I had.

add the seed argument to reset (Just as Gymansium.Env has done and remove the seed() function)

I don't think it's needed and it would change/break too many things (all the VecEnvWrapper for instance).

@Kallinteris-Andreas
Copy link
Contributor Author

Kallinteris-Andreas commented May 5, 2023

I don't think it's needed and it would change/break too many things (all the VecEnvWrapper for instance).

This would be the best long term option in my opinion.
There is no functional benefit in keeping the seed function over adding a seed argument.
(though you probably know the project better than me)

I can make a PR, all I need is conformation that the only affected classes (ones with a seed() function) are subproc_vec_env and DummyVecEnv

@araffin
Copy link
Member

araffin commented May 5, 2023

There is no functional benefit in keeping the seed function over adding a seed argument.

Consistency over time/long-term support is the main benefit.
Both options are fine, but one keeps the API (and behavior) the same from a user perspective, no matter which version of gym/gymnasium/SB3 is used.

SB3 is no longer a new library, we should avoid breaking people's code when it is not needed.
This is something that Gym didn't follow and caused a lot of unnecessary effort for users. (sail-sg/envpool#240 (comment)).

I can make a PR, all I need is conformation that the only affected files/classes are subproc_vec_env and DummyVecEnv

yes and the async version in SB3 contrib.

PS: I already spent too much time discussing that issue in gym: openai/gym#2422 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants