Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug Report] VectorizeAction not working #1169

Closed
1 task done
pkrack opened this issue Sep 20, 2024 · 3 comments · Fixed by #1170
Closed
1 task done

[Bug Report] VectorizeAction not working #1169

pkrack opened this issue Sep 20, 2024 · 3 comments · Fixed by #1170
Labels
bug Something isn't working

Comments

@pkrack
Copy link

pkrack commented Sep 20, 2024

Describe the bug

The VectorizeAction wrapper is not correctly implemented.

The problem lies in how self.out is initialized, how iterate is called and how concatenate is called. There is a confusion between what should be self.action_space, self.single_action_space, self.env.action_space and self.env.single_action_space.
Cf. example/test below.

I did not submit a PR because the contributing guidelines mention that you have to run the test suite, but the jax functional test seem to run forever on my machine. PR is ready though, let me know if you want me to submit it.

Proposed solution:

diff --git a/gymnasium/wrappers/vector/vectorize_action.py b/gymnasium/wrappers/vector/vectorize_action.py
index 3dc4a797..89677da0 100644
--- a/gymnasium/wrappers/vector/vectorize_action.py
+++ b/gymnasium/wrappers/vector/vectorize_action.py
@@ -138,7 +138,7 @@ class VectorizeTransformAction(VectorActionWrapper):
         self.action_space = batch_space(self.single_action_space, self.num_envs)
 
         self.same_out = self.action_space == self.env.action_space
-        self.out = create_empty_array(self.single_action_space, self.num_envs)
+        self.out = create_empty_array(self.env.single_action_space, self.num_envs)
 
     def actions(self, actions: ActType) -> ActType:
         """Applies the wrapper to each of the action.
@@ -154,17 +154,17 @@ class VectorizeTransformAction(VectorActionWrapper):
                 self.single_action_space,
                 tuple(
                     self.wrapper.func(action)
-                    for action in iterate(self.action_space, actions)
+                    for action in iterate(self.single_action_space, actions)
                 ),
                 actions,
             )
         else:
             return deepcopy(
                 concatenate(
-                    self.single_action_space,
+                    self.env.single_action_space,
                     tuple(
                         self.wrapper.func(action)
-                        for action in iterate(self.env.action_space, actions)
+                        for action in iterate(self.single_action_space, actions)
                     ),
                     self.out,
                 )

Code example

"""Test suite for VectorizeAction wrapper."""
import numpy as np

import gymnasium as gym
from gymnasium import spaces
from gymnasium.vector import SyncVectorEnv
from tests.testing_env import GenericTestEnv


def create_env():
    return GenericTestEnv(
        action_space=spaces.Box(
            low=0.0,
            high=1.0,
            shape=(1,),
            dtype=np.float32
        )
    )


def test_vectorize_box_to_dict_action():
    def func(x): return x["key"]
    env = SyncVectorEnv([create_env for _ in range(2)])
    env = gym.wrappers.vector.VectorizeTransformAction(
        env=env,
        wrapper=gym.wrappers.TransformAction,
        func=func,
        action_space=gym.spaces.Dict(
            {
                "key": gym.spaces.Box(
                    low=0.0,
                    high=1.0,
                    shape=(1,),
                    dtype=np.float32
                )
            }
        )
    )
    env.reset()
    # Check that func transforms the outer env action into the inner env action
    assert func(
        env.single_action_space.sample()
    ) in env.env.single_action_space
    # Check that the VectorizeWrapper correctly vectorizes func
    assert env.actions(env.action_space.sample()) in env.env.action_space

System info

  • installed form source with pip install -e
  • version 1.0.0
  • Debian 12
  • Python 3.11.2

Additional context

No response

Checklist

  • I have checked that there is no similar issue in the repo
@pkrack pkrack added the bug Something isn't working label Sep 20, 2024
@pseudo-rnd-thoughts
Copy link
Member

Thanks for the issue again, I'll have a look. At least initially, your solution seems to work but I'm uncertain about bits of it

@pkrack
Copy link
Author

pkrack commented Sep 20, 2024

In the message of this commit, there is some reasoning about which action space should be used where. Maybe that helps identifying issues.

I think the current implementation was adapted from the VectorizeObservation wrapper. But TransformObservation and TransformAction work in an opposite direction. TransformObservation wants a func that transforms an inner obs to an outer obs, TransformAction wants a func that transforms an outer action to an inner action. Hence the reversal of self.env.x_space and self.x_space in the VectorizeObservation and VectorizeAction wrappers.

@pseudo-rnd-thoughts
Copy link
Member

Yes, in short, the problem was that when I was implementing it I was thinking about the data "moving" in the same direction of observation rather than the opposite.

I'm adding some more testing but yes, I think the primary issue was the out was using self.single_action_space not self.env.single_action_space as you corrected.
Also looking at the actions function, the two cases are identical except for the concatenate output which should be actions and out respectively.

Currently, I'm adding more testing for this to doubly check this.

Thanks again @pkrack

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants