Adds setting of environment seed at initialization (isaac-sim#940)
# Description

Various randomization and procedural generation operations happen at
initialization. However, as noted in the linked issue, the seed is set
only after all of these operations have run, so the creation of the
environment is not completely deterministic. This MR resolves the issue
by adding a `seed` configuration to the environment.

Fixes isaac-sim#904
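
The ordering problem described above can be illustrated with a minimal sketch in plain Python. It does not use the Isaac Lab API: `build_terrain` and `make_env` are hypothetical stand-ins for the procedural generation that runs at initialization, and the unseeded `random.Random()` models the state before any seed has been applied.

```python
from __future__ import annotations

import random


def build_terrain(rng: random.Random, size: int = 5) -> list[float]:
    """Hypothetical stand-in for procedural generation done at initialization."""
    return [round(rng.uniform(0.0, 1.0), 6) for _ in range(size)]


def make_env(seed: int, seed_first: bool) -> list[float]:
    """Create a toy 'environment' and return its generated terrain."""
    # An unseeded generator models the state before any seed has been applied.
    rng = random.Random()
    if seed_first:
        rng.seed(seed)  # seed first, then generate
    terrain = build_terrain(rng)
    if not seed_first:
        rng.seed(seed)  # the seed arrives after generation, which is too late
    return terrain


# Seeding before generation gives identical terrain on every call.
same = make_env(42, seed_first=True) == make_env(42, seed_first=True)
# Seeding afterwards leaves the terrain dependent on the unseeded state.
late = make_env(42, seed_first=False) == make_env(42, seed_first=False)
print(same, late)
```

With `seed_first=True` the two terrains always match. With `seed_first=False` they depend on the unseeded generator state, so they will almost certainly differ from call to call.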

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Screenshots

Results before and after this change, over **three runs** with the default
seed (seed: 42):

```bash
./isaaclab.sh -p source/standalone/workflows/rsl_rl/train.py --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --run_name seed
```

| Results over three runs |
| ------ |
| Before (main at 788a061) ![before](https://github.com/user-attachments/assets/21a6a9f3-7438-4e73-92dd-a32106272fcb) |
| Now (this MR) ![after](https://github.com/user-attachments/assets/821b9c63-34b7-4ce2-8d36-4c979c47070b) |

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

---------

Signed-off-by: Mayank Mittal <12863862+Mayankm96@users.noreply.github.com>
Mayankm96 authored Sep 10, 2024
1 parent 0d7eb76 commit 5f2c90c
Showing 23 changed files with 337 additions and 76 deletions.
1 change: 1 addition & 0 deletions docs/index.rst
@@ -75,6 +75,7 @@ Table of Contents
source/features/tiled_rendering
source/features/environments
source/features/actuators
source/features/reproducibility
.. source/features/motion_generators

.. toctree::
34 changes: 17 additions & 17 deletions docs/source/features/environments.rst
@@ -24,16 +24,16 @@ Classic environments that are based on IsaacGymEnvs implementation of MuJoCo-sty
+------------------+-----------------------------+-------------------------------------------------------------------------+
| World | Environment ID | Description |
+==================+=============================+=========================================================================+
| |humanoid| | | |humanoid-link| | Move towards a direction with the MuJoCo humanoid robot |
| | | |humanoid-direct-link| | |
| |humanoid| | |humanoid-link| | Move towards a direction with the MuJoCo humanoid robot |
| | |humanoid-direct-link| | |
+------------------+-----------------------------+-------------------------------------------------------------------------+
| |ant| | | |ant-link| | Move towards a direction with the MuJoCo ant robot |
| | | |ant-direct-link| | |
| |ant| | |ant-link| | Move towards a direction with the MuJoCo ant robot |
| | |ant-direct-link| | |
+------------------+-----------------------------+-------------------------------------------------------------------------+
| |cartpole| | | |cartpole-link| | Move the cart to keep the pole upwards in the classic cartpole control |
| | | |cartpole-direct-link| | |
| | | |cartpole-camera-rgb-link|| |
| | | |cartpole-camera-dpt-link|| |
| |cartpole| | |cartpole-link| | Move the cart to keep the pole upwards in the classic cartpole control |
| | |cartpole-direct-link| | |
| | |cartpole-camera-rgb-link| | |
| | |cartpole-camera-dpt-link| | |
+------------------+-----------------------------+-------------------------------------------------------------------------+

.. |humanoid| image:: ../_static/tasks/classic/humanoid.jpg
@@ -77,12 +77,12 @@ for the reach environment:
+----------------+---------------------------+-----------------------------------------------------------------------------+
| |cabi-franka| | |cabi-franka-link| | Grasp the handle of a cabinet's drawer and open it with the Franka robot |
+----------------+---------------------------+-----------------------------------------------------------------------------+
| |cube-allegro| | | |cube-allegro-link| | In-hand reorientation of a cube using Allegro hand |
| | | |allegro-direct-link| | |
| |cube-allegro| | |cube-allegro-link| | In-hand reorientation of a cube using Allegro hand |
| | |allegro-direct-link| | |
+----------------+---------------------------+-----------------------------------------------------------------------------+
| |cube-shadow| | | |cube-shadow-link| | In-hand reorientation of a cube using Shadow hand |
| | | |cube-shadow-ff-link| | |
| | | |cube-shadow-lstm-link| | |
| |cube-shadow| | |cube-shadow-link| | In-hand reorientation of a cube using Shadow hand |
| | |cube-shadow-ff-link| | |
| | |cube-shadow-lstm-link| | |
+----------------+---------------------------+-----------------------------------------------------------------------------+

.. |reach-franka| image:: ../_static/tasks/manipulation/franka_reach.jpg
@@ -120,11 +120,11 @@ Environments based on legged locomotion tasks.
+------------------------------+----------------------------------------------+------------------------------------------------------------------------------+
| |velocity-rough-anymal-b| | |velocity-rough-anymal-b-link| | Track a velocity command on rough terrain with the Anymal B robot |
+------------------------------+----------------------------------------------+------------------------------------------------------------------------------+
| |velocity-flat-anymal-c| | | |velocity-flat-anymal-c-link| | Track a velocity command on flat terrain with the Anymal C robot |
| | | |velocity-flat-anymal-c-direct-link| | |
| |velocity-flat-anymal-c| | |velocity-flat-anymal-c-link| | Track a velocity command on flat terrain with the Anymal C robot |
| | |velocity-flat-anymal-c-direct-link| | |
+------------------------------+----------------------------------------------+------------------------------------------------------------------------------+
| |velocity-rough-anymal-c| | | |velocity-rough-anymal-c-link| | Track a velocity command on rough terrain with the Anymal C robot |
| | | |velocity-rough-anymal-c-direct-link| | |
| |velocity-rough-anymal-c| | |velocity-rough-anymal-c-link| | Track a velocity command on rough terrain with the Anymal C robot |
| | |velocity-rough-anymal-c-direct-link| | |
+------------------------------+----------------------------------------------+------------------------------------------------------------------------------+
| |velocity-flat-anymal-d| | |velocity-flat-anymal-d-link| | Track a velocity command on flat terrain with the Anymal D robot |
+------------------------------+----------------------------------------------+------------------------------------------------------------------------------+
42 changes: 42 additions & 0 deletions docs/source/features/reproducibility.rst
@@ -0,0 +1,42 @@
Reproducibility and Determinism
-------------------------------

Given the same hardware and Isaac Sim (and consequently PhysX) version, the simulation produces
identical results for scenes with rigid bodies and articulations. However, the simulation results can
vary across different hardware configurations due to floating point precision and rounding errors.
At present, PhysX does not guarantee determinism for any scene with non-rigid bodies, such as cloth
or soft bodies. For more information, please refer to the `PhysX Determinism documentation`_.

Based on the above, Isaac Lab provides a deterministic simulation that ensures consistent simulation
results across different runs. This is achieved by using the same random seed for the
simulation environment and the physics engine. At construction of the environment, the random seed
is set to a fixed value using the :meth:`~omni.isaac.core.utils.torch.set_seed` method. This method sets the
random seed for both the CPU and GPU globally across different libraries, including PyTorch and
NumPy.
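
As a rough sketch of what seeding "globally across different libraries" amounts to, the helper below seeds every generator it can find. `seed_everything` is a hypothetical name used only for illustration; it is not the implementation of `set_seed`, and NumPy and PyTorch are seeded only when they are installed.

```python
import importlib.util
import random


def seed_everything(seed: int) -> int:
    """Hypothetical helper that seeds the global RNGs that are available."""
    random.seed(seed)  # Python's built-in generator
    if importlib.util.find_spec("numpy") is not None:
        import numpy as np

        np.random.seed(seed)  # NumPy's legacy global generator
    if importlib.util.find_spec("torch") is not None:
        import torch

        torch.manual_seed(seed)  # PyTorch's default generator
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)  # generators on every visible GPU
    return seed


seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)
```

Re-seeding with the same value replays the same random sequence, which is the property the environment relies on at construction.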

In the included workflow scripts, the seed specified in the learning agent's configuration file or the
command line argument is used to set the random seed for the environment. This ensures that the
simulation results are reproducible across different runs. The seed is set into the environment
parameters :attr:`omni.isaac.lab.envs.ManagerBasedEnvCfg.seed` or :attr:`omni.isaac.lab.envs.DirectRLEnvCfg.seed`
depending on the manager-based or direct environment implementation respectively.
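
The flow from the agent configuration or the command line into the environment configuration might look like the sketch below. `AgentCfg`, `EnvCfg`, and `resolve_seed` are hypothetical stand-ins, and the precedence shown (a command-line seed overrides the agent's seed) is an assumption. Only the `seed` field on the environment configuration comes from this change.

```python
from __future__ import annotations

import argparse
from dataclasses import dataclass


@dataclass
class AgentCfg:
    """Hypothetical stand-in for a learning agent's configuration."""

    seed: int = 42


@dataclass
class EnvCfg:
    """Hypothetical stand-in for an environment configuration with a ``seed`` field."""

    seed: int | None = None


def resolve_seed(cli_seed: int | None, agent_cfg: AgentCfg, env_cfg: EnvCfg) -> int:
    """Pick one seed and write it into both configurations."""
    # Assumption: a seed given on the command line takes precedence.
    if cli_seed is not None:
        agent_cfg.seed = cli_seed
    env_cfg.seed = agent_cfg.seed
    return env_cfg.seed


parser = argparse.ArgumentParser()
parser.add_argument("--seed", type=int, default=None, help="Seed used for the environment")
args = parser.parse_args(["--seed", "7"])  # explicit list so the sketch runs anywhere

agent_cfg, env_cfg = AgentCfg(), EnvCfg()
print(resolve_seed(args.seed, agent_cfg, env_cfg))
```

Writing the resolved seed into the environment configuration before the environment is constructed is what lets the environment apply it at the start of initialization.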

For results on our determinacy testing for RL training, please check the GitHub Pull Request `#940`_.

.. tip::

Due to GPU work scheduling, there's a possibility that runtime changes to simulation parameters
may alter the order in which operations take place. This occurs because environment updates can
happen while the GPU is occupied with other tasks. Due to the inherent nature of floating-point
numeric storage, any modification to the execution ordering can result in minor changes in the
least significant bits of output data. These changes may lead to divergent execution over the
course of simulating thousands of environments and simulation frames.

An illustrative example of this issue is observed with the runtime domain randomization of object's
physics materials. This process can introduce both determinacy and simulation issues when executed
on the GPU due to the way these parameters are passed from the CPU to the GPU in the lower-level APIs.
Consequently, it is strongly advised to perform this operation only at setup time, before the
environment stepping commences.
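
The sensitivity to execution order described in the tip can be reproduced in a few lines of plain Python. This is a generic illustration of floating-point arithmetic, not of PhysX internals: the same three values are added in two different orders.

```python
# The same three values, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two results differ in the least significant bits.
print(left_to_right == right_to_left)
print(abs(left_to_right - right_to_left))
```

Floating-point addition is not associative, so a reordering of operations on the GPU can change the last bits of a result. Over many environments and simulation frames, such differences may accumulate into visibly different trajectories.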


.. _PhysX Determinism documentation: https://nvidia-omniverse.github.io/PhysX/physx/5.4.1/docs/API.html#determinism
.. _#940: https://github.com/isaac-sim/IsaacLab/pull/940
30 changes: 0 additions & 30 deletions docs/source/refs/issues.rst
@@ -36,35 +36,6 @@ over stepping different parts of the simulation app. However, at this point, the
timeline for this feature request.


Non-determinism in physics simulation
-------------------------------------

Due to GPU work scheduling, there's a possibility that runtime changes to simulation parameters
may alter the order in which operations take place. This occurs because environment updates can
happen while the GPU is occupied with other tasks. Due to the inherent nature of floating-point
numeric storage, any modification to the execution ordering can result in minor changes in the
least significant bits of output data. These changes may lead to divergent execution over the
course of simulating thousands of environments and simulation frames.

An illustrative example of this issue is observed with the runtime domain randomization of object's
physics materials. This process can introduce both determinancy and simulation issues when executed
on the GPU due to the way these parameters are passed from the CPU to the GPU in the lower-level APIs.
Consequently, it is strongly advised to perform this operation only at setup time, before the
environment stepping commences.

For more information, please refer to the `PhysX Determinism documentation`_.

In addition, due to floating point precision, states across different environments in the simulation
may be non-deterministic when the same set of actions are applied to the same initial
states. This occurs as environments are placed further apart from the world origin at (0, 0, 0).
As actors get placed at different origins in the world, floating point errors may build up
and result in slight variance in results even when starting from the same initial states. One
possible workaround for this issue is to place all actors/environments at the world origin
at (0, 0, 0) and filter out collisions between the environments. Note that this may induce
a performance degradation of around 15-50%, depending on the complexity of actors and
environment.


Blank initial frames from the camera
------------------------------------

@@ -99,7 +70,6 @@ are stored in the instanceable asset's USD file and not in its stage reference's

.. _instanceable assets: https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/tutorial_gym_instanceable_assets.html
.. _Omniverse Isaac Sim documentation: https://docs.omniverse.nvidia.com/isaacsim/latest/known_issues.html
.. _PhysX Determinism documentation: https://nvidia-omniverse.github.io/PhysX/physx/5.3.1/docs/BestPractices.html#determinism


Exiting the process
2 changes: 1 addition & 1 deletion docs/source/tutorials/01_assets/run_articulation.rst
@@ -49,7 +49,7 @@ an instance of the :class:`assets.Articulation` class by passing the configurati

.. literalinclude:: ../../../../source/standalone/tutorials/01_assets/run_articulation.py
:language: python
:start-at: # Create separate groups called "Origin1", "Origin2", "Origin3"
:start-at: # Create separate groups called "Origin1", "Origin2"
:end-at: cartpole = Articulation(cfg=cartpole_cfg)


2 changes: 1 addition & 1 deletion source/extensions/omni.isaac.lab/config/extension.toml
@@ -1,7 +1,7 @@
[package]

# Note: Semantic Versioning is used: https://semver.org/
version = "0.22.9"
version = "0.22.10"

# Description
title = "Isaac Lab framework for Robot Learning"
12 changes: 12 additions & 0 deletions source/extensions/omni.isaac.lab/docs/CHANGELOG.rst
@@ -1,6 +1,18 @@
Changelog
---------

0.22.10 (2024-09-09)
~~~~~~~~~~~~~~~~~~~~

Added
^^^^^

* Added a seed parameter to the :attr:`omni.isaac.lab.envs.ManagerBasedEnvCfg` and :attr:`omni.isaac.lab.envs.DirectRLEnvCfg`
classes to set the seed for the environment. This seed is used to initialize the random number generator for the environment.
* Adapted the workflow scripts to set the seed for the environment using the seed specified in the learning agent's configuration
file or the command line argument. This ensures that the simulation results are reproducible across different runs.
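
The two changelog entries above can be pictured with the following toy sketch. `SketchEnvCfg` and `SketchEnv` are hypothetical classes written for illustration, not the Isaac Lab implementation. They mirror only the pattern from this change: an optional `seed` field, seeding at the very start of construction with a warning when no seed is given, and a `reset` that does not repeat procedural generation.

```python
from __future__ import annotations

import random
import warnings
from dataclasses import dataclass


@dataclass
class SketchEnvCfg:
    """Hypothetical configuration; only the ``seed`` field mirrors the change."""

    seed: int | None = None
    num_envs: int = 4


class SketchEnv:
    """Toy environment that seeds first and generates terrain once."""

    def __init__(self, cfg: SketchEnvCfg):
        self.cfg = cfg
        self._rng = random.Random()
        # Seed before any randomized setup, or warn that creation may vary.
        if cfg.seed is not None:
            self.seed(cfg.seed)
        else:
            warnings.warn("Seed not set. Environment creation may not be deterministic.")
        # Procedural generation happens once, at construction.
        self.terrain = [self._rng.random() for _ in range(cfg.num_envs)]

    def seed(self, seed: int) -> int:
        self._rng.seed(seed)
        return seed

    def reset(self, seed: int | None = None) -> list[float]:
        """Re-sample per-episode state; the terrain is left untouched."""
        if seed is not None:
            self.seed(seed)
        return [self._rng.uniform(-1.0, 1.0) for _ in range(self.cfg.num_envs)]


env_a = SketchEnv(SketchEnvCfg(seed=42))
env_b = SketchEnv(SketchEnvCfg(seed=42))
terrain_before = list(env_a.terrain)
env_a.reset(seed=0)
print(env_a.terrain == env_b.terrain, env_a.terrain == terrain_before)
```

Two environments built from the same seed share the same terrain, and resetting with a different seed changes only the per-episode state.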


0.22.9 (2024-09-08)
~~~~~~~~~~~~~~~~~~~

@@ -84,6 +84,12 @@ def __init__(self, cfg: DirectRLEnvCfg, render_mode: str | None = None, **kwargs
        # initialize internal variables
        self._is_closed = False

        # set the seed for the environment
        if self.cfg.seed is not None:
            self.seed(self.cfg.seed)
        else:
            carb.log_warn("Seed not set for the environment. The environment creation may not be deterministic.")

        # create a simulation context to control the simulator
        if SimulationContext.instance() is None:
            self.sim: SimulationContext = SimulationContext(self.cfg.sim)
@@ -93,6 +99,7 @@ def __init__(self, cfg: DirectRLEnvCfg, render_mode: str | None = None, **kwargs
# print useful information
print("[INFO]: Base environment:")
print(f"\tEnvironment device : {self.device}")
print(f"\tEnvironment seed : {self.cfg.seed}")
print(f"\tPhysics step-size : {self.physics_dt}")
print(f"\tRendering step-size : {self.physics_dt * self.cfg.sim.render_interval}")
print(f"\tEnvironment step-size : {self.step_dt}")
@@ -241,6 +248,10 @@ def max_episode_length(self):
def reset(self, seed: int | None = None, options: dict[str, Any] | None = None) -> tuple[VecEnvObs, dict]:
"""Resets all the environments and returns observations.
This function calls the :meth:`_reset_idx` function to reset all the environments.
However, certain operations, such as procedural terrain generation, that happened during initialization
are not repeated.
Args:
seed: The seed to use for randomization. Defaults to None, in which case the seed is not set.
options: Additional information to specify how the environment is reset. Defaults to None.
@@ -254,13 +265,13 @@ def reset(self, seed: int | None = None, options: dict[str, Any] | None = None)
# set the seed
if seed is not None:
self.seed(seed)

# reset state of scene
indices = torch.arange(self.num_envs, dtype=torch.int64, device=self.device)
self._reset_idx(indices)

obs = self._get_observations()
# return observations
return obs, self.extras
return self._get_observations(), self.extras

def step(self, action: torch.Tensor) -> VecEnvStepReturn:
"""Execute one time-step of the environment's dynamics.
@@ -41,6 +41,14 @@ class DirectRLEnvCfg:
"""

# general settings
seed: int | None = None
"""The seed for the random number generator. Defaults to None, in which case the seed is not set.
Note:
The seed is set at the beginning of the environment initialization. This ensures that the environment
creation is deterministic and behaves similarly across different runs.
"""

decimation: int = MISSING
"""Number of control action updates @ sim dt per policy dt.
@@ -74,6 +74,12 @@ def __init__(self, cfg: ManagerBasedEnvCfg):
        # initialize internal variables
        self._is_closed = False

        # set the seed for the environment
        if self.cfg.seed is not None:
            self.seed(self.cfg.seed)
        else:
            carb.log_warn("Seed not set for the environment. The environment creation may not be deterministic.")

        # create a simulation context to control the simulator
        if SimulationContext.instance() is None:
            # the type-annotation is required to avoid a type-checking error
@@ -89,6 +95,7 @@ def __init__(self, cfg: ManagerBasedEnvCfg):
# print useful information
print("[INFO]: Base environment:")
print(f"\tEnvironment device : {self.device}")
print(f"\tEnvironment seed : {self.cfg.seed}")
print(f"\tPhysics step-size : {self.physics_dt}")
print(f"\tRendering step-size : {self.physics_dt * self.cfg.sim.render_interval}")
print(f"\tEnvironment step-size : {self.step_dt}")
@@ -222,6 +229,10 @@ def load_managers(self):
def reset(self, seed: int | None = None, options: dict[str, Any] | None = None) -> tuple[VecEnvObs, dict]:
"""Resets all the environments and returns observations.
This function calls the :meth:`_reset_idx` function to reset all the environments.
However, certain operations, such as procedural terrain generation, that happened during initialization
are not repeated.
Args:
seed: The seed to use for randomization. Defaults to None, in which case the seed is not set.
options: Additional information to specify how the environment is reset. Defaults to None.
@@ -235,9 +246,11 @@ def reset(self, seed: int | None = None, options: dict[str, Any] | None = None)
# set the seed
if seed is not None:
self.seed(seed)

# reset state of scene
indices = torch.arange(self.num_envs, dtype=torch.int64, device=self.device)
self._reset_idx(indices)

# return observations
return self.observation_manager.compute(), self.extras

@@ -56,6 +56,14 @@ class ManagerBasedEnvCfg:
"""

# general settings
seed: int | None = None
"""The seed for the random number generator. Defaults to None, in which case the seed is not set.
Note:
The seed is set at the beginning of the environment initialization. This ensures that the environment
creation is deterministic and behaves similarly across different runs.
"""

decimation: int = MISSING
"""Number of control action updates @ sim dt per policy dt.