Dict space support for VecEnv #207

AdamGleave · 2019-02-18T05:12:40Z

OpenAI added support for Dict observation spaces to DummyVecEnv and SubprocVecEnv after the fork. This PR cherry-picks these changes, while retaining the improvements made by Stable Baselines. This will be needed (amongst other changes) to support use cases such as #133.

OpenAI also added ShmemVecEnv, with common code factored out into util.py. I've retained the util.py as a separate file in case we wish to also port over ShmemVecEnv in the future.

araffin · 2019-02-18T06:20:12Z

Hello,
Thanks for this pr. Before any review, please read "CONTRIBUTING.md" carefully ;)

AdamGleave · 2019-02-18T21:32:41Z

Thanks @araffin. I think the only things I missed were updating the change log (done now) and creating an accompanying issue for the PR (closes #208). Let me know if there are any changes you'd like to see.

There's also a case for adding Tuple support, as discussed in the issue; let me know if you'd like to see this and I can add it as well.

araffin · 2019-02-18T21:58:50Z

I think the only things I missed

Yes for the changelog, but I was more concern about the import and the documentation of each function (and the doc style).

AdamGleave · 2019-02-18T23:41:00Z

Thanks for the comments. Re: the imports, I assume the issue is with the relative import style. This was already present in DummyVecEnv so I did not want to change it, but have now switched to an absolute import style.

I've expanded on the documentation for the functions added and made them a consistent style. Happy to elaborate on this more but the functions added in the PR are all module-private so documentation doesn't seem as important as if part of public API.

araffin · 2019-02-19T08:47:16Z

I should have been more explicit, relative import are fine, i was refering to the import collection that was not at the top of the file.
Sure, go ahead if you can also add support for tuple =)!
Every function documentation matter ;)
EDIT: i will try to do a full review this week, but not sure to have the time.

AdamGleave · 2019-02-22T03:43:47Z

I've added support for Tuple observations, ready for review.

One area I'd appreciate feedback is what axis ordering we want for actions. Note there are two axes in the observations/actions for Tuple and Dict spaces in VecEnv environments: the index into the tuple or dict, and the index of the environment.

For observations, the first axis indexes into the tuple or dict, and the environment id second. This follows the convention used in OpenAI master, and I think is generally a reasonable design choice: e.g. one might want to feed a dict of observations of differently shaped values into different placeholders in a TensorFlow graph.

However, I'm less sure what the right choice is for actions. Right now, step_async follows the opposite convention: first index is environment, second index is into tuple or dict. This is the same convention as in OpenAI master, and does have its advantages: e.g. random sampling is easy, [action_space.sample() for _ in range(num_envs)]. However, I dislike the inconsistency, and for my own application (multi-agent RL, where each tuple index corresponds to a different agent) it would be more convenient to have the tuple or dict axis come first.

Converting between these two conventions is just a transpose operation, so the user can always change it, but it'd be good to get the default right.

araffin · 2019-02-26T21:08:08Z

Hello,

I did a quick test today and found a minor inconsistency:

the implementation seems inconsistent between DummyVecEnv and SubprocVecEnv: one requires an ordered dict while the other is ok with a normal dict. Also, that is a problem because it just crashes if this requirement is not satisfied.

It also seems that the doc needs to be updated (+ adding a note saying that currently no model supports those spaces).

I'll try to have a deeper look in the coming days but I'm quite busy right now...

PS: please add your name at the bottom of the changelog too ;)

AdamGleave · 2019-02-27T00:31:39Z

Thanks for the review and catching the error with dict v.s. OrderedDict. Gym is a little unclear on this point: the Dict space has spaces be an OrderedDict and sample always returns an OrderedDict, but contains accepts any dict instance (and some Gym environments return unordered dicts as observations). Following Postel's law, I've modified the VecEnv classes to accept unordered dict as observations, but to always return an OrderedDict to the user. I've also added a test case to verify this behaviour.

What docs would you like me to update? I believe the VecEnv classes should now support all observation spaces supported in Gym; do you think this should be stated explicitly (currently the docs do not mention anything regarding supported observation spaces)?

araffin · 2019-02-28T10:01:24Z

What docs would you like me to update?

I would say this one.

do you think this should be stated explicitly (currently the docs do not mention anything regarding supported observation spaces)?

yes, that what I meant for updating the doc. Please add a note about what is supported by the vec env.
For the algorithm, we already have the table in the doc and a more complete one in the readme.

araffin

Some minor changes required.
For your question about the order (key of the dict first or env idx first), I need to think a bit more about that.

stable_baselines/common/vec_env/util.py

stable_baselines/common/vec_env/subproc_vec_env.py

AdamGleave · 2019-02-28T23:03:30Z

Thanks for your comments. I've added documentation on supported spaces to the vector environment page, and also clarified on the RL algorithms page that the algorithms do not currently support Dict and Tuple spaces. I believe I've addressed the comments you made on the code. I also realised while writing the documentation I was missing test cases for some of the more obscure spaces (e.g. MultiBinary) so added those.

docs/guide/algos.rst

docs/guide/vec_envs.rst

araffin · 2019-03-08T09:45:59Z

Thank you very much for your good PR.
Just minor details in the doc, it will be merged after ;)

AdamGleave · 2019-03-08T21:35:22Z

I've made the suggested improvements to the documentation, thanks for the feedback.

AdamGleave added 3 commits February 17, 2019 21:06

Cherry-pick dict space support for VecEnv from OpenAI master

745fa71

Smoke test for dict observation spaces

3b6c30d

Codacity linting

5592477

Update changelog

2798da2

Clean up imports and docs

0851256

Doc formatting

77d9010

AdamGleave and others added 3 commits February 19, 2019 13:04

Move import to top-level

4b941d7

Add support for Tuple spaces to VecEnv

bec7905

Merge branch 'master' into dict-spaces

6b43a97

araffin requested review from hill-a, araffin and ernestum and removed request for hill-a February 22, 2019 13:13

dict vs OrderedDict observations

78e16a0

araffin requested changes Feb 28, 2019

View reviewed changes

AdamGleave added 3 commits February 28, 2019 13:40

Add messages to assertions

736af5f

Test VecEnvs with all supported spaces

eef0744

Document dict and tuple support for VecEnv

4287f81

araffin mentioned this pull request Mar 8, 2019

Make SubprocVecEnv Thread Safe #218

Merged

araffin reviewed Mar 8, 2019

View reviewed changes

docs/guide/algos.rst Outdated Show resolved Hide resolved

araffin reviewed Mar 8, 2019

View reviewed changes

docs/guide/vec_envs.rst Outdated Show resolved Hide resolved

araffin previously approved these changes Mar 8, 2019

View reviewed changes

Documentation improvements

076dfb6

AdamGleave dismissed araffin’s stale review via 076dfb6 March 8, 2019 21:34

araffin approved these changes Mar 8, 2019

View reviewed changes

araffin merged commit 8e60a78 into hill-a:master Mar 8, 2019

araffin mentioned this pull request Mar 8, 2019

[feature request] Dict spaces in VecEnv #208

Closed

srivatsankrishnan mentioned this pull request Mar 29, 2019

[question] [feature request] support for Dict and Tuple spaces #133

Closed

araffin mentioned this pull request Mar 8, 2020

V3.0 implementation design #576

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dict space support for VecEnv #207

Dict space support for VecEnv #207

AdamGleave commented Feb 18, 2019

araffin commented Feb 18, 2019

AdamGleave commented Feb 18, 2019

araffin commented Feb 18, 2019 •

edited

Loading

AdamGleave commented Feb 18, 2019

araffin commented Feb 19, 2019 •

edited

Loading

AdamGleave commented Feb 22, 2019

araffin commented Feb 26, 2019

AdamGleave commented Feb 27, 2019

araffin commented Feb 28, 2019

araffin left a comment

AdamGleave commented Feb 28, 2019

araffin commented Mar 8, 2019

AdamGleave commented Mar 8, 2019

Dict space support for VecEnv #207

Dict space support for VecEnv #207

Conversation

AdamGleave commented Feb 18, 2019

araffin commented Feb 18, 2019

AdamGleave commented Feb 18, 2019

araffin commented Feb 18, 2019 • edited Loading

AdamGleave commented Feb 18, 2019

araffin commented Feb 19, 2019 • edited Loading

AdamGleave commented Feb 22, 2019

araffin commented Feb 26, 2019

AdamGleave commented Feb 27, 2019

araffin commented Feb 28, 2019

araffin left a comment

Choose a reason for hiding this comment

AdamGleave commented Feb 28, 2019

araffin commented Mar 8, 2019

AdamGleave commented Mar 8, 2019

araffin commented Feb 18, 2019 •

edited

Loading

araffin commented Feb 19, 2019 •

edited

Loading