Commit

Add reset_target argument to antmaze environments (#167)
* add argument reset_target

* fix argument pass

* point maze reset_target

* pointmaze docs

---------

Co-authored-by: rodrigodelazcano <rperezvicente@farama.org>
rodrigodelazcano and rodrigodelazcano authored Sep 2, 2023
1 parent 56eb9a6 commit 58d5cfe
Showing 3 changed files with 10 additions and 1 deletion.
5 changes: 4 additions & 1 deletion gymnasium_robotics/envs/maze/ant_maze_v4.py
@@ -183,7 +183,8 @@ class AntMazeEnv(MazeEnv, EzPickle):
### Arguments
* `maze_map` - Optional argument to initialize the environment with a custom maze map.
- * `continuing_task` - If set to `True` the episode won't be terminated when reaching the goal, instead a new goal location will be generated. If `False` the environment is terminated when the ant reaches the final goal.
+ * `continuing_task` - If set to `True` the episode won't be terminated when reaching the goal, instead a new goal location will be generated (unless `reset_target` argument is `True`). If `False` the environment is terminated when the ant reaches the final goal.
+ * `reset_target` - If set to `True` and the argument `continuing_task` is also `True`, when the ant reaches the target goal the location of the goal will be kept the same and no new goal location will be generated. If `False` a new goal will be generated when reached.
* `use_contact_forces` - If `True` contact forces of the ant are included in the `observation`.
Note that, the maximum number of timesteps before the episode is `truncated` can be increased or decreased by specifying the `max_episode_steps` argument at initialization. For example,
@@ -216,6 +217,7 @@ def __init__(
maze_map: List[List[Union[str, int]]] = U_MAZE,
reward_type: str = "sparse",
continuing_task: bool = True,
+ reset_target: bool = True,
**kwargs,
):
# Get the ant.xml path from the Gymnasium package
@@ -229,6 +231,7 @@ def __init__(
maze_height=0.5,
reward_type=reward_type,
continuing_task=continuing_task,
+ reset_target=reset_target,
**kwargs,
)
# Create the MuJoCo environment, include position observation of the Ant for GoalEnv
3 changes: 3 additions & 0 deletions gymnasium_robotics/envs/maze/maze_v4.py
@@ -238,6 +238,7 @@ def __init__(
agent_xml_path: str,
reward_type: str = "dense",
continuing_task: bool = True,
+ reset_target: bool = True,
maze_map: List[List[Union[int, str]]] = U_MAZE,
maze_size_scaling: float = 1.0,
maze_height: float = 0.5,
@@ -247,6 +248,7 @@ def __init__(

self.reward_type = reward_type
self.continuing_task = continuing_task
+ self.reset_target = reset_target
self.maze, self.tmp_xml_file_path = Maze.make_maze(
agent_xml_path, maze_map, maze_size_scaling, maze_height
)
@@ -375,6 +377,7 @@ def update_goal(self, achieved_goal: np.ndarray) -> None:
"""Update goal position if continuing task and within goal radius."""
if (
self.continuing_task
+ and self.reset_target
and bool(np.linalg.norm(achieved_goal - self.goal) <= 0.45)
and len(self.maze.unique_goal_locations) > 1
):
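
Read literally, the condition in this hunk lets `update_goal` pick a new goal only when `reset_target` is `True`, whereas the docstrings added in this commit say that `reset_target=True` keeps the goal fixed. The sketch below reproduces just the boolean condition so the behavior can be checked in isolation. The helper `should_update_goal` and the constant `GOAL_RADIUS` are hypothetical names introduced for illustration, and the body of the `if` is not visible in this diff, so this is an approximation of the gate rather than the library's implementation.

```python
import math
from typing import Sequence

# Distance threshold copied from the hunk above.
GOAL_RADIUS = 0.45


def should_update_goal(
    continuing_task: bool,
    reset_target: bool,
    achieved_goal: Sequence[float],
    goal: Sequence[float],
    num_unique_goals: int,
) -> bool:
    """Hypothetical helper mirroring the `if` condition shown in `update_goal`."""
    # Euclidean distance between the agent's achieved position and the goal.
    within_radius = math.dist(achieved_goal, goal) <= GOAL_RADIUS
    return (
        continuing_task
        and reset_target
        and within_radius
        and num_unique_goals > 1
    )


# With the condition as written, reaching the goal triggers an update
# only when reset_target is True.
near = (0.1, 0.1)
goal = (0.0, 0.0)
update_when_true = should_update_goal(True, True, near, goal, 3)
update_when_false = should_update_goal(True, False, near, goal, 3)
```

Under this reading, `reset_target=False` leaves the goal where it is for the rest of the episode, so callers relying on the docstring wording may want to confirm the behavior against the installed version.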
3 changes: 3 additions & 0 deletions gymnasium_robotics/envs/maze/point_maze.py
@@ -279,6 +279,7 @@ class PointMazeEnv(MazeEnv, EzPickle):
* `maze_map` - Optional argument to initialize the environment with a custom maze map.
* `continuing_task` - If set to `True` the episode won't be terminated when reaching the goal, instead a new goal location will be generated. If `False` the environment is terminated when the ball reaches the final goal.
+ * `reset_target` - If set to `True` and the argument `continuing_task` is also `True`, when the ant reaches the target goal the location of the goal will be kept the same and no new goal location will be generated. If `False` a new goal will be generated when reached.
Note that, the maximum number of timesteps before the episode is `truncated` can be increased or decreased by specifying the `max_episode_steps` argument at initialization. For example,
to increase the total number of timesteps to 100 make the environment as follows:
@@ -309,6 +310,7 @@ def __init__(
render_mode: Optional[str] = None,
reward_type: str = "sparse",
continuing_task: bool = True,
+ reset_target: bool = False,
**kwargs,
):
point_xml_file_path = path.join(
@@ -321,6 +323,7 @@ def __init__(
maze_height=0.4,
reward_type=reward_type,
continuing_task=continuing_task,
+ reset_target=reset_target,
**kwargs,
)
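
Across the three files, `reset_target` is accepted by each subclass constructor and forwarded to `MazeEnv.__init__`, and the defaults shown in the hunks differ: `True` in `AntMazeEnv` and `MazeEnv`, `False` in `PointMazeEnv`. A minimal sketch of that forwarding pattern follows. The `*Sketch` classes are hypothetical stand-ins that only store the flags; they are not part of gymnasium_robotics and omit everything else the real constructors do.

```python
class MazeEnvSketch:
    """Stand-in for the base class: stores the two flags and nothing else."""

    def __init__(
        self,
        continuing_task: bool = True,
        reset_target: bool = True,
        **kwargs,
    ):
        self.continuing_task = continuing_task
        self.reset_target = reset_target


class AntMazeSketch(MazeEnvSketch):
    """Stand-in subclass using the default shown for the ant maze (True)."""

    def __init__(
        self,
        continuing_task: bool = True,
        reset_target: bool = True,
        **kwargs,
    ):
        super().__init__(
            continuing_task=continuing_task,
            reset_target=reset_target,
            **kwargs,
        )


class PointMazeSketch(MazeEnvSketch):
    """Stand-in subclass using the default shown for the point maze (False)."""

    def __init__(
        self,
        continuing_task: bool = True,
        reset_target: bool = False,
        **kwargs,
    ):
        super().__init__(
            continuing_task=continuing_task,
            reset_target=reset_target,
            **kwargs,
        )


# The same call with no arguments yields different flag values per class.
ant_default = AntMazeSketch().reset_target
point_default = PointMazeSketch().reset_target
```

Because the defaults differ, passing `reset_target` explicitly when constructing either environment avoids depending on which default applies.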
