Adds setting of environment seed at initialization (isaac-sim#940)

# Description

Various randomization and procedural generation operations happen at initialization. However, as noted in one of the issues, the seed setting happens after all these operations are performed. This means that the creation of the environment is not completely deterministic.

This MR resolves this issue by adding a `seed` configuration to the environment.

Fixes isaac-sim#904

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Screenshots

The before and after results over **three runs** with the default seed (seed: 42)

```bash
./isaaclab.sh -p source/standalone/workflows/rsl_rl/train.py --task Isaac-Velocity-Rough-Anymal-C-v0 --headless --run_name seed
```

| Results over three runs |
| ------ |
| Before (main at 788a061) ![before](https://github.com/user-attachments/assets/21a6a9f3-7438-4e73-92dd-a32106272fcb) |
| Now (this MR) ![after](https://github.com/user-attachments/assets/821b9c63-34b7-4ce2-8d36-4c979c47070b) |

## Checklist

- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there

---------

Signed-off-by: Mayank Mittal <12863862+Mayankm96@users.noreply.github.com>
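
The ordering problem described above can be illustrated with a minimal, self-contained sketch. It is not Isaac Lab code: `ToyEnvCfg`, `ToyEnv`, and `build_terrain` are hypothetical names, and Python's standard `random` module stands in for the libraries that the real environment seeds. Only the idea of a `seed` config field that is applied before any procedural generation mirrors this change.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyEnvCfg:
    """Hypothetical config; only the ``seed`` field mirrors the one added here."""

    seed: Optional[int] = None
    num_tiles: int = 5


class ToyEnv:
    """Toy environment that generates 'terrain' heights at construction."""

    def __init__(self, cfg: ToyEnvCfg):
        self.cfg = cfg
        # Seed first, so every random draw made during construction is reproducible.
        if cfg.seed is not None:
            random.seed(cfg.seed)
        # Stand-in for procedural generation that happens at initialization.
        self.terrain: List[float] = [random.uniform(0.0, 1.0) for _ in range(cfg.num_tiles)]


def build_terrain(seed: Optional[int]) -> List[float]:
    """Construct a toy environment and return its generated terrain."""
    return ToyEnv(ToyEnvCfg(seed=seed)).terrain
```

Under these assumptions, two constructions with the same seed yield the same terrain, whereas seeding only after construction (for example in `reset`) would leave the terrain dependent on whatever state the global generator had at that point.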
iamdrfly · Sep 10, 2024 · 5f2c90c
1 parent 0d7eb76
Showing 23 changed files with 337 additions and 76 deletions.
diff --git a/docs/index.rst b/docs/index.rst
@@ -75,6 +75,7 @@ Table of Contents
    source/features/tiled_rendering
    source/features/environments
    source/features/actuators
+   source/features/reproducibility
    .. source/features/motion_generators
 
 .. toctree::

diff --git a/docs/source/features/environments.rst b/docs/source/features/environments.rst
@@ -24,16 +24,16 @@ Classic environments that are based on IsaacGymEnvs implementation of MuJoCo-sty
     +------------------+-----------------------------+-------------------------------------------------------------------------+
     | World            | Environment ID              | Description                                                             |
     +==================+=============================+=========================================================================+
-    | |humanoid|       | | |humanoid-link|           | Move towards a direction with the MuJoCo humanoid robot                 |
-    |                  | | |humanoid-direct-link|    |                                                                         |
+    | |humanoid|       | |humanoid-link|             | Move towards a direction with the MuJoCo humanoid robot                 |
+    |                  | |humanoid-direct-link|      |                                                                         |
     +------------------+-----------------------------+-------------------------------------------------------------------------+
-    | |ant|            | | |ant-link|                | Move towards a direction with the MuJoCo ant robot                      |
-    |                  | | |ant-direct-link|         |                                                                         |
+    | |ant|            | |ant-link|                  | Move towards a direction with the MuJoCo ant robot                      |
+    |                  | |ant-direct-link|           |                                                                         |
     +------------------+-----------------------------+-------------------------------------------------------------------------+
-    | |cartpole|       | | |cartpole-link|           | Move the cart to keep the pole upwards in the classic cartpole control  |
-    |                  | | |cartpole-direct-link|    |                                                                         |
-    |                  | | |cartpole-camera-rgb-link||                                                                         |
-    |                  | | |cartpole-camera-dpt-link||                                                                         |
+    | |cartpole|       | |cartpole-link|             | Move the cart to keep the pole upwards in the classic cartpole control  |
+    |                  | |cartpole-direct-link|      |                                                                         |
+    |                  | |cartpole-camera-rgb-link|  |                                                                         |
+    |                  | |cartpole-camera-dpt-link|  |                                                                         |
     +------------------+-----------------------------+-------------------------------------------------------------------------+
 
 .. |humanoid| image:: ../_static/tasks/classic/humanoid.jpg
@@ -77,12 +77,12 @@ for the reach environment:
     +----------------+---------------------------+-----------------------------------------------------------------------------+
     | |cabi-franka|  | |cabi-franka-link|        | Grasp the handle of a cabinet's drawer and open it with the Franka robot    |
     +----------------+---------------------------+-----------------------------------------------------------------------------+
-    | |cube-allegro| | | |cube-allegro-link|     | In-hand reorientation of a cube using Allegro hand                          |
-    |                | | |allegro-direct-link|   |                                                                             |
+    | |cube-allegro| | |cube-allegro-link|       | In-hand reorientation of a cube using Allegro hand                          |
+    |                | |allegro-direct-link|     |                                                                             |
     +----------------+---------------------------+-----------------------------------------------------------------------------+
-    | |cube-shadow|  | | |cube-shadow-link|      | In-hand reorientation of a cube using Shadow hand                           |
-    |                | | |cube-shadow-ff-link|   |                                                                             |
-    |                | | |cube-shadow-lstm-link| |                                                                             |
+    | |cube-shadow|  | |cube-shadow-link|        | In-hand reorientation of a cube using Shadow hand                           |
+    |                | |cube-shadow-ff-link|     |                                                                             |
+    |                | |cube-shadow-lstm-link|   |                                                                             |
     +----------------+---------------------------+-----------------------------------------------------------------------------+
 
 .. |reach-franka| image:: ../_static/tasks/manipulation/franka_reach.jpg
@@ -120,11 +120,11 @@ Environments based on legged locomotion tasks.
     +------------------------------+----------------------------------------------+------------------------------------------------------------------------------+
     | |velocity-rough-anymal-b|    | |velocity-rough-anymal-b-link|               | Track a velocity command on rough terrain with the Anymal B robot            |
     +------------------------------+----------------------------------------------+------------------------------------------------------------------------------+
-    | |velocity-flat-anymal-c|     | | |velocity-flat-anymal-c-link|              | Track a velocity command on flat terrain with the Anymal C robot             |
-    |                              | | |velocity-flat-anymal-c-direct-link|       |                                                                              |
+    | |velocity-flat-anymal-c|     | |velocity-flat-anymal-c-link|                | Track a velocity command on flat terrain with the Anymal C robot             |
+    |                              | |velocity-flat-anymal-c-direct-link|         |                                                                              |
     +------------------------------+----------------------------------------------+------------------------------------------------------------------------------+
-    | |velocity-rough-anymal-c|    | | |velocity-rough-anymal-c-link|             | Track a velocity command on rough terrain with the Anymal C robot            |
-    |                              | | |velocity-rough-anymal-c-direct-link|      |                                                                              |
+    | |velocity-rough-anymal-c|    | |velocity-rough-anymal-c-link|               | Track a velocity command on rough terrain with the Anymal C robot            |
+    |                              | |velocity-rough-anymal-c-direct-link|        |                                                                              |
     +------------------------------+----------------------------------------------+------------------------------------------------------------------------------+
     | |velocity-flat-anymal-d|     | |velocity-flat-anymal-d-link|                | Track a velocity command on flat terrain with the Anymal D robot             |
     +------------------------------+----------------------------------------------+------------------------------------------------------------------------------+

diff --git a/docs/source/features/reproducibility.rst b/docs/source/features/reproducibility.rst
@@ -0,0 +1,42 @@
+Reproducibility and Determinism
+-------------------------------
+
+Given the same hardware and Isaac Sim (and consequently PhysX) version, the simulation produces
+identical results for scenes with rigid bodies and articulations. However, the simulation results can
+vary across different hardware configurations due to floating point precision and rounding errors.
+At present, PhysX does not guarantee determinism for any scene with non-rigid bodies, such as cloth
+or soft bodies. For more information, please refer to the `PhysX Determinism documentation`_.
+
+Based on the above, Isaac Lab provides a deterministic simulation that ensures consistent simulation
+results across different runs. This is achieved by using the same random seed for the
+simulation environment and the physics engine. At construction of the environment, the random seed
+is set to a fixed value using the :meth:`~omni.isaac.core.utils.torch.set_seed` method. This method sets the
+random seed for both the CPU and GPU globally across different libraries, including PyTorch and
+NumPy.
+
+In the included workflow scripts, the seed specified in the learning agent's configuration file or the
+command line argument is used to set the random seed for the environment. This ensures that the
+simulation results are reproducible across different runs. The seed is set into the environment
+parameters :attr:`omni.isaac.lab.envs.ManagerBasedEnvCfg.seed` or :attr:`omni.isaac.lab.envs.DirectRLEnvCfg.seed`
+depending on the manager-based or direct environment implementation respectively.
+
+For results on our determinacy testing for RL training, please check the GitHub Pull Request `#940`_.
+
+.. tip::
+
+  Due to GPU work scheduling, there's a possibility that runtime changes to simulation parameters
+  may alter the order in which operations take place. This occurs because environment updates can
+  happen while the GPU is occupied with other tasks. Due to the inherent nature of floating-point
+  numeric storage, any modification to the execution ordering can result in minor changes in the
+  least significant bits of output data. These changes may lead to divergent execution over the
+  course of simulating thousands of environments and simulation frames.
+
+  An illustrative example of this issue is observed with the runtime domain randomization of object's
+  physics materials. This process can introduce both determinacy and simulation issues when executed
+  on the GPU due to the way these parameters are passed from the CPU to the GPU in the lower-level APIs.
+  Consequently, it is strongly advised to perform this operation only at setup time, before the
+  environment stepping commences.
+
+
+.. _PhysX Determinism documentation: https://nvidia-omniverse.github.io/PhysX/physx/5.4.1/docs/API.html#determinism
+.. _#940: https://github.com/isaac-sim/IsaacLab/pull/940
diff --git a/docs/source/refs/issues.rst b/docs/source/refs/issues.rst
@@ -36,35 +36,6 @@ over stepping different parts of the simulation app. However, at this point, the
 timeline for this feature request.
 
 
-Non-determinism in physics simulation
--------------------------------------
-
-Due to GPU work scheduling, there's a possibility that runtime changes to simulation parameters
-may alter the order in which operations take place. This occurs because environment updates can
-happen while the GPU is occupied with other tasks. Due to the inherent nature of floating-point
-numeric storage, any modification to the execution ordering can result in minor changes in the
-least significant bits of output data. These changes may lead to divergent execution over the
-course of simulating thousands of environments and simulation frames.
-
-An illustrative example of this issue is observed with the runtime domain randomization of object's
-physics materials. This process can introduce both determinancy and simulation issues when executed
-on the GPU due to the way these parameters are passed from the CPU to the GPU in the lower-level APIs.
-Consequently, it is strongly advised to perform this operation only at setup time, before the
-environment stepping commences.
-
-For more information, please refer to the `PhysX Determinism documentation`_.
-
-In addition, due to floating point precision, states across different environments in the simulation
-may be non-deterministic when the same set of actions are applied to the same initial
-states. This occurs as environments are placed further apart from the world origin at (0, 0, 0).
-As actors get placed at different origins in the world, floating point errors may build up
-and result in slight variance in results even when starting from the same initial states. One
-possible workaround for this issue is to place all actors/environments at the world origin
-at (0, 0, 0) and filter out collisions between the environments. Note that this may induce
-a performance degradation of around 15-50%, depending on the complexity of actors and
-environment.
-
-
 Blank initial frames from the camera
 ------------------------------------
 
@@ -99,7 +70,6 @@ are stored in the instanceable asset's USD file and not in its stage reference's
 
 .. _instanceable assets: https://docs.omniverse.nvidia.com/app_isaacsim/app_isaacsim/tutorial_gym_instanceable_assets.html
 .. _Omniverse Isaac Sim documentation: https://docs.omniverse.nvidia.com/isaacsim/latest/known_issues.html
-.. _PhysX Determinism documentation: https://nvidia-omniverse.github.io/PhysX/physx/5.3.1/docs/BestPractices.html#determinism
 
 
 Exiting the process

diff --git a/docs/source/tutorials/01_assets/run_articulation.rst b/docs/source/tutorials/01_assets/run_articulation.rst
@@ -49,7 +49,7 @@ an instance of the :class:`assets.Articulation` class by passing the configurati
 
 .. literalinclude:: ../../../../source/standalone/tutorials/01_assets/run_articulation.py
    :language: python
-   :start-at: # Create separate groups called "Origin1", "Origin2", "Origin3"
+   :start-at: # Create separate groups called "Origin1", "Origin2"
    :end-at: cartpole = Articulation(cfg=cartpole_cfg)
 
 

diff --git a/source/extensions/omni.isaac.lab/config/extension.toml b/source/extensions/omni.isaac.lab/config/extension.toml
@@ -1,7 +1,7 @@
 [package]
 
 # Note: Semantic Versioning is used: https://semver.org/
-version = "0.22.9"
+version = "0.22.10"
 
 # Description
 title = "Isaac Lab framework for Robot Learning"

diff --git a/source/extensions/omni.isaac.lab/docs/CHANGELOG.rst b/source/extensions/omni.isaac.lab/docs/CHANGELOG.rst
@@ -1,6 +1,18 @@
 Changelog
 ---------
 
+0.22.10 (2024-09-09)
+~~~~~~~~~~~~~~~~~~~~
+
+Added
+^^^^^
+
+* Added a seed parameter to the :class:`omni.isaac.lab.envs.ManagerBasedEnvCfg` and :class:`omni.isaac.lab.envs.DirectRLEnvCfg`
+  classes to set the seed for the environment. This seed is used to initialize the random number generator for the environment.
+* Adapted the workflow scripts to set the seed for the environment using the seed specified in the learning agent's configuration
+  file or the command line argument. This ensures that the simulation results are reproducible across different runs.
+
+
 0.22.9 (2024-09-08)
 ~~~~~~~~~~~~~~~~~~~
 

diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/direct_rl_env.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/direct_rl_env.py
@@ -84,6 +84,12 @@ def __init__(self, cfg: DirectRLEnvCfg, render_mode: str | None = None, **kwargs
         # initialize internal variables
         self._is_closed = False
 
+        # set the seed for the environment
+        if self.cfg.seed is not None:
+            self.seed(self.cfg.seed)
+        else:
+            carb.log_warn("Seed not set for the environment. The environment creation may not be deterministic.")
+
         # create a simulation context to control the simulator
         if SimulationContext.instance() is None:
             self.sim: SimulationContext = SimulationContext(self.cfg.sim)
@@ -93,6 +99,7 @@ def __init__(self, cfg: DirectRLEnvCfg, render_mode: str | None = None, **kwargs
         # print useful information
         print("[INFO]: Base environment:")
         print(f"\tEnvironment device    : {self.device}")
+        print(f"\tEnvironment seed      : {self.cfg.seed}")
         print(f"\tPhysics step-size     : {self.physics_dt}")
         print(f"\tRendering step-size   : {self.physics_dt * self.cfg.sim.render_interval}")
         print(f"\tEnvironment step-size : {self.step_dt}")
@@ -241,6 +248,10 @@ def max_episode_length(self):
     def reset(self, seed: int | None = None, options: dict[str, Any] | None = None) -> tuple[VecEnvObs, dict]:
         """Resets all the environments and returns observations.
 
+        This function calls the :meth:`_reset_idx` function to reset all the environments.
+        However, certain operations, such as procedural terrain generation, that happened during initialization
+        are not repeated.
+
         Args:
             seed: The seed to use for randomization. Defaults to None, in which case the seed is not set.
             options: Additional information to specify how the environment is reset. Defaults to None.
@@ -254,13 +265,13 @@ def reset(self, seed: int | None = None, options: dict[str, Any] | None = None)
         # set the seed
         if seed is not None:
             self.seed(seed)
+
         # reset state of scene
         indices = torch.arange(self.num_envs, dtype=torch.int64, device=self.device)
         self._reset_idx(indices)
 
-        obs = self._get_observations()
         # return observations
-        return obs, self.extras
+        return self._get_observations(), self.extras
 
     def step(self, action: torch.Tensor) -> VecEnvStepReturn:
         """Execute one time-step of the environment's dynamics.

diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/direct_rl_env_cfg.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/direct_rl_env_cfg.py
@@ -41,6 +41,14 @@ class DirectRLEnvCfg:
     """
 
     # general settings
+    seed: int | None = None
+    """The seed for the random number generator. Defaults to None, in which case the seed is not set.
+
+    Note:
+      The seed is set at the beginning of the environment initialization. This ensures that the environment
+      creation is deterministic and behaves similarly across different runs.
+    """
+
     decimation: int = MISSING
     """Number of control action updates @ sim dt per policy dt.
 

diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/manager_based_env.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/manager_based_env.py
@@ -74,6 +74,12 @@ def __init__(self, cfg: ManagerBasedEnvCfg):
         # initialize internal variables
         self._is_closed = False
 
+        # set the seed for the environment
+        if self.cfg.seed is not None:
+            self.seed(self.cfg.seed)
+        else:
+            carb.log_warn("Seed not set for the environment. The environment creation may not be deterministic.")
+
         # create a simulation context to control the simulator
         if SimulationContext.instance() is None:
             # the type-annotation is required to avoid a type-checking error
@@ -89,6 +95,7 @@ def __init__(self, cfg: ManagerBasedEnvCfg):
         # print useful information
         print("[INFO]: Base environment:")
         print(f"\tEnvironment device    : {self.device}")
+        print(f"\tEnvironment seed      : {self.cfg.seed}")
         print(f"\tPhysics step-size     : {self.physics_dt}")
         print(f"\tRendering step-size   : {self.physics_dt * self.cfg.sim.render_interval}")
         print(f"\tEnvironment step-size : {self.step_dt}")
@@ -222,6 +229,10 @@ def load_managers(self):
     def reset(self, seed: int | None = None, options: dict[str, Any] | None = None) -> tuple[VecEnvObs, dict]:
         """Resets all the environments and returns observations.
 
+        This function calls the :meth:`_reset_idx` function to reset all the environments.
+        However, certain operations, such as procedural terrain generation, that happened during initialization
+        are not repeated.
+
         Args:
             seed: The seed to use for randomization. Defaults to None, in which case the seed is not set.
             options: Additional information to specify how the environment is reset. Defaults to None.
@@ -235,9 +246,11 @@ def reset(self, seed: int | None = None, options: dict[str, Any] | None = None)
         # set the seed
         if seed is not None:
             self.seed(seed)
+
         # reset state of scene
         indices = torch.arange(self.num_envs, dtype=torch.int64, device=self.device)
         self._reset_idx(indices)
+
         # return observations
         return self.observation_manager.compute(), self.extras
 

diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/manager_based_env_cfg.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/manager_based_env_cfg.py
@@ -56,6 +56,14 @@ class ManagerBasedEnvCfg:
     """
 
     # general settings
+    seed: int | None = None
+    """The seed for the random number generator. Defaults to None, in which case the seed is not set.
+
+    Note:
+      The seed is set at the beginning of the environment initialization. This ensures that the environment
+      creation is deterministic and behaves similarly across different runs.
+    """
+
     decimation: int = MISSING
     """Number of control action updates @ sim dt per policy dt.