[Bug]: DummyVecEnv.seed() resets the environment #1481
Comments
In addition, from the documentation:
This means that the environment is actually reset twice. I agree that this is not very intuitive.
Hello,
This happened indeed when Gym broke its API.
This is actually a nicer option I didn't consider. I would be happy to receive a PR =)
I don't think it's needed and it would change/break too many things (all the VecEnvWrapper for instance).
This would be the best long-term option in my opinion. I can make a PR; all I need is confirmation that the only affected classes (ones with a
Consistency over time/long-term support is the main benefit. SB3 is no longer a new library, we should avoid breaking people's code when it is not needed.
Yes, and the async version in SB3 contrib. PS: I already spent too much time discussing that issue in gym: openai/gym#2422 (comment)
🐛 Bug
Seeding a DummyVecEnv resets the env with that seed (instead of saving the seed to be used in the next reset):
stable-baselines3/stable_baselines3/common/vec_env/dummy_vec_env.py, line 73 in d6ddee9
stable-baselines3/stable_baselines3/common/utils.py, line 563 in d6ddee9
This behavior is surprising and undocumented, and may cause reproducibility issues. I assume it was introduced in the transition to the gymnasium-v26 API.
The expected behavior would be to save the seed to be used in the next reset.
Also, DummyVecEnv.seed() returns a List[None] instead of None.
Possible Solutions:
1. Save the seed to be used in the next DummyVecEnv.reset()
2. Add a seed argument to reset (just as Gymnasium.Env has done) and remove the seed() function
As a Gymnasium dev, I would recommend the second solution.
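For illustration, here is a minimal sketch of the first option. It uses hypothetical ToyEnv and ToyVecEnv classes built on the standard library only; it is not the SB3 implementation, and the `_seeds` attribute and the `seed + idx` per-environment offset are assumptions made for the example. The idea is that seed() only records the seeds and returns None, and the next reset() passes them on through a Gymnasium-style reset(seed=...) call.

```python
import random
from typing import List, Optional


class ToyEnv:
    """Hypothetical stand-in for a Gymnasium-style environment."""

    def __init__(self) -> None:
        self.rng = random.Random()
        self.reset_count = 0

    def reset(self, seed: Optional[int] = None) -> float:
        # Reseed only when a seed is given, mirroring reset(seed=...).
        if seed is not None:
            self.rng = random.Random(seed)
        self.reset_count += 1
        # The "observation" is just the first random draw of the episode.
        return self.rng.random()


class ToyVecEnv:
    """Hypothetical vectorized wrapper; not the SB3 DummyVecEnv."""

    def __init__(self, num_envs: int) -> None:
        self.envs = [ToyEnv() for _ in range(num_envs)]
        self._seeds: List[Optional[int]] = [None] * num_envs

    def seed(self, seed: int) -> None:
        # Store one seed per sub-environment (assumed offset: seed + idx).
        # Nothing is reset here and nothing is returned.
        self._seeds = [seed + idx for idx in range(len(self.envs))]

    def reset(self) -> List[float]:
        obs = [env.reset(seed=s) for env, s in zip(self.envs, self._seeds)]
        # Consume the stored seeds so later resets do not replay the episode.
        self._seeds = [None] * len(self.envs)
        return obs
```

With this pattern, calling seed(0) followed by reset() twice in a row yields the same initial observations, and each sub-environment is reset exactly once per reset() call. The second option would instead drop seed() and expose the seed as an argument of the vectorized reset().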
Note: the same behavior can be seen in subproc_vec_env.
To Reproduce
Relevant log output / Error message
No response
System Info
$ pip list | grep stabl
stable-baselines3 2.0.0a5
$ python --version
Python 3.10.10
$ pip list | grep torch
torch 1.12.1
$ pip list | grep gym
gymnasium 0.28.1
Checklist