-
Notifications
You must be signed in to change notification settings - Fork 8.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Batch action_space in VectorEnv #2280
Batch action_space in VectorEnv #2280
Conversation
4b8dbb5
to
7c64777
Compare
gym/vector/utils/spaces.py
Outdated
>>> next(it) | ||
StopIteration | ||
""" | ||
if isinstance(space, _BaseGymSpaces): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not put the space as the first argument, and use a singledispatch callable here?
This would let users customize how this function should behave with their custom spaces, which isn't possible as it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're also essentially doing the same thing as a singledispatch here, but worse! :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is definitely a prime candidate for singledispatch!
I was waiting for #2093, but I'll get the ball rolling and switch this function to singledispatch already.
gym/vector/utils/spaces.py
Outdated
|
||
Parameters | ||
---------- | ||
items : samples of `space` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: those docstrings and the doctest below still have the "items, space" ordering
What is the status of this PR? Any timeline for this PR getting merged? |
Pinging @jkterry1 |
I'm just going to reply to all the notifications you sent in one place: Everything regarding changes to the vector API is temporarily on hold until I get more time to develop a cohesive plan for it going forward. This is has been a lower priority for me than many other fixes to Gym at the moment. The documentation stuff isn't on hold per se, there's just been a long list of problems in getting the desired website fully operational and getting other parts of the text of the documentation written. The status of all your other PRs is that I'm aware they exist and I need to go through them again myself for one reason or another and have not had the free time. Amongst other considerations, I'm working on 3 first author ICLR submissions. Gym wasn't maintained for years, things are going to take a little while to catch up. |
I see, perhaps you should get other maintainers on board to not be limited by the free time you can allocate to Gym, and so that you can offload some of those tasks. For the particular case of this PR (to stay focused), these changes were discussed in #2279, and you were welcome to comment on those changes. You could rely on the community more instead of taking the responsibility to devise a plan by yourself if you are too busy. |
6dad0da
to
7167c33
Compare
Any update on this PR? @jkterry1 |
The proposed changes make sense to me. I have a quick question.
So what happened if you have |
You mean if the environment has action space from gym.vector.utils.spaces import batch_space
space = Tuple((MultiDiscrete([2, 2, 2, 2, 2]), MultiDiscrete([2, 2, 2, 2, 2])))
batch_space(space, n=5)
# Tuple(Box([[0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]], [[1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]], (5, 5), int64), Box([[0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]], [[1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]], (5, 5), int64)) Just like for |
No I meant if the |
Again, it follows the rules of space = MultiDiscrete([2, 2, 2, 2, 2])
batch_space(space, n=5)
# Box([[0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]], [[1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]], (5, 5), int64) |
I see. Thank you. The API makes sense. LGTM @jkterry1. |
A more complete example: class CustomEnv(gym.Env):
observation_space = Box(low=0., high=1., shape=(2,), dtype=np.float32)
action_space = MultiDiscrete([2, 2, 2, 2, 2])
env = AsyncVectorEnv([lambda: CustomEnv() for _ in range(5)])
print(env.observation_space)
# Box([[0. 0.]
# [0. 0.]
# [0. 0.]
# [0. 0.]
# [0. 0.]], [[1. 1.]
# [1. 1.]
# [1. 1.]
# [1. 1.]
# [1. 1.]], (5, 2), float32)
print(env.action_space)
# Box([[0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]
# [0 0 0 0 0]], [[1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]
# [1 1 1 1 1]], (5, 5), int64) |
Given the discussion in #2279, here is a proposal to have a batch
action_space
instead of aTuple
instance.This handles any nested action space as well:
batch_space
instead ofTuple
inVectorEnv
iterate
utility function to iterate over items from a (batch) spaceaction_space
are all the same in all sub-environments