Fix/dmc wrapper #222

Merged 5 commits on Feb 29, 2024

4 changes: 2 additions & 2 deletions howto/learn_in_dmc.md
@@ -29,10 +29,10 @@ python sheeprl.py exp=dreamer_v3 env=mujoco env.id=Walker2d-v4 algo.cnn_keys.enc
```

## DeepMind Control
In order to train your agents on the [DeepMind control suite](https://github.com/deepmind/dm_control/blob/main/dm_control/suite/README.md), you have to select the *DMC* environment (`env=dmc`) and set the id of the environment you want to use. A list of the available environments can be found [here](https://arxiv.org/abs/1801.00690). For instance, if you want to train your agent on the *walker walk* environment, you need to set the `env.id` to `"walker_walk"`.
In order to train your agents on the [DeepMind control suite](https://github.com/deepmind/dm_control/blob/main/dm_control/suite/README.md), you have to select the *DMC* environment (`env=dmc`) and set the `domain` and the `task` of the environment you want to use. A list of the available environments can be found [here](https://arxiv.org/abs/1801.00690). For instance, if you want to train your agent on the *walker walk* environment, you need to set the `env.wrapper.domain_name` to `"walker"` and the `env.wrapper.task_name` to `"walk"`.

```bash
python sheeprl.py exp=dreamer_v3 env=dmc env.id=walker_walk algo.cnn_keys.encoder=[rgb]
python sheeprl.py exp=dreamer_v3 env=dmc env.wrapper.domain_name=walker env.wrapper.task_name=walk algo.cnn_keys.encoder=[rgb]
```
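
Under the hood, `env=dmc` with a `domain_name`/`task_name` pair boils down to loading the corresponding dm_control task. The snippet below is only an illustrative sketch of the underlying `dm_control.suite` call (it is not SheepRL's wrapper code, and the seed value is an arbitrary example):

```python
from dm_control import suite

# Each DMC environment is identified by a (domain, task) pair,
# e.g. the "walker" domain with the "walk" task.
dmc_env = suite.load(domain_name="walker", task_name="walk", task_kwargs={"random": 42})

# All available (domain, task) pairs can be listed programmatically.
for domain, task in sorted(suite.ALL_TASKS):
    print(f"{domain}_{task}")
```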

> [!NOTE]
2 changes: 1 addition & 1 deletion howto/run_experiments.md
@@ -9,6 +9,6 @@ In this document, we give the user some advice to execute its experiments.
Now that you are familiar with [hydra](https://hydra.cc/docs/intro/) and the organization of the configs of this repository, we can introduce a few constraints to launch experiments:

1. When you launch an experiment you **must** specify the experiment config of the agent you want to train: `python sheeprl.py exp=...`. The list of the available experiment configs can be retrieved with the following command: `python sheeprl.py --help`
2. Then you have to specify the hyper-parameters of your experiment: you can override the hyper-parameters by specifying them as cli arguments (e.g., `exp=dreamer_v3 algo=dreamer_v3 env=dmc env.id=walker_walk env.action_repeat=2 ...`) or you can write your custom experiment file (you must put it in the `./sheeprl/configs/exp` folder) and call your script with the command `python sheeprl.py exp=custom_experiment` (the last option is recommended). There are some available examples, just check the [exp folder](../sheeprl/configs/exp/).
2. Then you have to specify the hyper-parameters of your experiment: you can override the hyper-parameters by specifying them as cli arguments (e.g., `exp=dreamer_v3 algo=dreamer_v3 env=dmc env.wrapper.domain_name=walker env.wrapper.task_name=walk env.action_repeat=2 ...`) or you can write your custom experiment file (you must put it in the `./sheeprl/configs/exp` folder) and call your script with the command `python sheeprl.py exp=custom_experiment` (the last option is recommended). There are some available examples, just check the [exp folder](../sheeprl/configs/exp/).
3. You **cannot mix the agent command with the configs of another algorithm**: this might raise an error or cause anomalous behaviors. So if you want to train the `dreamer_v3` agent, be sure to select the correct algorithm configuration (in our case `algo=dreamer_v3`)
4. To change the optimizer of an algorithm through the CLI you must do the following: suppose that you want to run an experiment with Dreamer-V3 and want to change the world model optimizer from Adam (the default in the `sheeprl/configs/algo/dreamer_v3.yaml` config) to SGD; then in the CLI you must type `python sheeprl.py algo=dreamer_v3 ... optim@algo.world_model.optimizer=sgd`, where `optim@algo.world_model.optimizer=sgd` means that the `optimizer` field of the `world_model` of the chosen `algo` config (the dreamer_v3.yaml one) will be equal to the config `sgd.yaml` found under the `sheeprl/configs/optim` folder (see the sketch below)
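
To sanity-check how such overrides compose before launching a full training run, the configuration can also be resolved programmatically with Hydra's compose API. The snippet below is only a sketch: the `config_path` and `config_name` values are assumptions and may not match the repository's actual entry point.

```python
from hydra import compose, initialize

# Assumed config location and entry-point name; adjust to the repository layout.
with initialize(version_base=None, config_path="../sheeprl/configs"):
    cfg = compose(
        config_name="config",
        overrides=[
            "exp=dreamer_v3",
            "env=dmc",
            "env.wrapper.domain_name=walker",
            "env.wrapper.task_name=walk",
            "optim@algo.world_model.optimizer=sgd",  # swap the world model optimizer
        ],
    )
    # Inspect the composed config, e.g. the optimizer that will actually be used.
    print(cfg.algo.world_model.optimizer)
```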
4 changes: 2 additions & 2 deletions sheeprl/algos/dreamer_v1/README.md
@@ -80,9 +80,9 @@ There are two versions for most Atari environments: one version uses the *frame
For more information see the official documentation of [Gymnasium Atari environments](https://gymnasium.farama.org/environments/atari/).

## DMC environments
It is possible to use the environments provided by the [DeepMind Control suite](https://www.deepmind.com/open-source/deepmind-control-suite). To use such environments it is necessary to specify "dmc" and the name of the environment in the `env` and `env.id` hyper-parameters respectively, e.g., `env=dmc env.id=walker_walk` will create an instance of the walker walk environment. For more information about all the environments, check their [paper](https://arxiv.org/abs/1801.00690).
It is possible to use the environments provided by the [DeepMind Control suite](https://www.deepmind.com/open-source/deepmind-control-suite). To use such environments it is necessary to specify "dmc", the domain and the task of the environment in the `env`, `env.wrapper.domain_name` and `env.wrapper.task_name` hyper-parameters respectively, e.g., `env=dmc env.wrapper.domain_name=walker env.wrapper.task_name=walk` will create an instance of the walker walk environment. For more information about all the environments, check their [paper](https://arxiv.org/abs/1801.00690).

When running DreamerV1 in a DMC environment on a server (or a PC without a video terminal) it could be necessary to add two variables to the command to launch the script: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa <command>`. For instance, to run walker walk with DreamerV1 on two gpus (0 and 1) it is necessary to runthe following command: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa python sheeprl.py exp=dreamer_v1 fabric.devices=2 fabric.accelerator=gpu env=dmc env.id=walker_walk env.action_repeat=2 env.capture_video=True checkpoint.every=100000 cnn_keys.encoder=[rgb]`.
When running DreamerV1 in a DMC environment on a server (or a PC without a video terminal) it could be necessary to add two variables to the command to launch the script: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa <command>`. For instance, to run walker walk with DreamerV1 on two GPUs (0 and 1) it is necessary to run the following command: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa python sheeprl.py exp=dreamer_v1 fabric.devices=2 fabric.accelerator=gpu env=dmc env.wrapper.domain_name=walker env.wrapper.task_name=walk env.action_repeat=2 env.capture_video=True checkpoint.every=100000 algo.cnn_keys.encoder=[rgb]`.
Other possibilities for the variable `MUJOCO_GL` are `GLFW` for rendering to an X11 window or `EGL` for hardware-accelerated headless rendering. (For more information, click [here](https://mujoco.readthedocs.io/en/stable/programming/index.html#using-opengl)).
Moreover, it could be necessary to uncomment two lines in the `sheeprl.algos.dreamer_v1.dreamer_v1.py` file.
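
If setting those variables on the command line is inconvenient (e.g., inside a notebook or a job script), an equivalent approach is to export them from Python before anything MuJoCo-related is imported. This is a generic sketch, not code taken from SheepRL:

```python
import os

# Must run before dm_control / MuJoCo (or anything importing them) is loaded,
# otherwise the rendering backend has already been selected.
os.environ["PYOPENGL_PLATFORM"] = ""
os.environ.setdefault("MUJOCO_GL", "osmesa")  # or "egl" for hardware-accelerated headless rendering

from dm_control import suite  # noqa: E402  (imported after setting the variables on purpose)

env = suite.load(domain_name="walker", task_name="walk")
frame = env.physics.render(height=84, width=84, camera_id=0)
print(frame.shape)  # (84, 84, 3) if headless rendering works
```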

7 changes: 4 additions & 3 deletions sheeprl/algos/dreamer_v2/README.md
@@ -147,9 +147,9 @@ checkpoint.every=100000 \
```

## DMC environments
It is possible to use the environments provided by the [DeepMind Control suite](https://www.deepmind.com/open-source/deepmind-control-suite). To use such environments it is necessary to specify "dmc" and the name of the environment in the `env` and `env.id` hyper-parameters respectively, e.g., `env=dmc env.id=walker_walk` will create an instance of the walker walk environment. For more information about all the environments, check their [paper](https://arxiv.org/abs/1801.00690).
It is possible to use the environments provided by the [DeepMind Control suite](https://www.deepmind.com/open-source/deepmind-control-suite). To use such environments it is necessary to specify "dmc", the domain and the task of the environment in the `env`, `env.wrapper.domain_name` and `env.wrapper.task_name` hyper-parameters respectively, e.g., `env=dmc env.wrapper.domain_name=walker env.wrapper.task_name=walk` will create an instance of the walker walk environment. For more information about all the environments, check their [paper](https://arxiv.org/abs/1801.00690).

When running DreamerV2 in a DMC environment on a server (or a PC without a video terminal) it could be necessary to add two variables to the command to launch the script: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa <command>`. For instance, to run walker walk with DreamerV2 on two gpus (0 and 1) it is necessary to runthe following command: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa CUDA_VISIBLE_DEVICES="2,3" python sheeprl.py exp=dreamer_v2 fabric.devices=2 fabric.accelerator=gpu env=dmc env.id=walker_walk env.action_repeat=2 env.capture_video=True checkpoint.every=80000 cnn_keys.encoder=[rgb]`.
When running DreamerV2 in a DMC environment on a server (or a PC without a video terminal) it could be necessary to add two variables to the command to launch the script: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa <command>`. For instance, to run walker walk with DreamerV2 on two GPUs (0 and 1) it is necessary to run the following command: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa CUDA_VISIBLE_DEVICES="2,3" python sheeprl.py exp=dreamer_v2 fabric.devices=2 fabric.accelerator=gpu env=dmc env.wrapper.domain_name=walker env.wrapper.task_name=walk env.action_repeat=2 env.capture_video=True checkpoint.every=80000 algo.cnn_keys.encoder=[rgb]`.
Other possibilities for the variable `MUJOCO_GL` are `GLFW` for rendering to an X11 window or `EGL` for hardware-accelerated headless rendering. (For more information, click [here](https://mujoco.readthedocs.io/en/stable/programming/index.html#using-opengl)).
Moreover, it could be necessary to uncomment two lines in the `sheeprl.algos.dreamer_v1.dreamer_v1.py` file.

@@ -160,7 +160,8 @@ PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa python sheeprl.py \
exp=dreamer_v2 \
fabric.devices=1 \
env=dmc \
env.id=walker_walk \
env.wrapper.domain_name=walker \
env.wrapper.task_name=walk \
env.capture_video=True \
env.action_repeat=2 \
env.clip_rewards=False \
4 changes: 2 additions & 2 deletions sheeprl/algos/sac_ae/README.md
@@ -199,9 +199,9 @@ There are two versions for most Atari environments: one version uses the *frame
For more information see the official documentation of [Gymnasium Atari environments](https://gymnasium.farama.org/environments/atari/).

## DMC environments
It is possible to use the environments provided by the [DeepMind Control suite](https://www.deepmind.com/open-source/deepmind-control-suite). To use such environments it is necessary to specify "dmc" and the name of the environment in the `env` and `env.id` hyper-parameters respectively, e.g., `env=dmc env.id=walker_walk` will create an instance of the walker walk environment. For more information about all the environments, check their [paper](https://arxiv.org/abs/1801.00690).
It is possible to use the environments provided by the [DeepMind Control suite](https://www.deepmind.com/open-source/deepmind-control-suite). To use such environments it is necessary to specify "dmc", the domain and the task of the environment in the `env`, `env.wrapper.domain_name` and `env.wrapper.task_name` hyper-parameters respectively, e.g., `env=dmc env.wrapper.domain_name=walker env.wrapper.task_name=walk` will create an instance of the walker walk environment. For more information about all the environments, check their [paper](https://arxiv.org/abs/1801.00690).

When running DreamerV1 in a DMC environment on a server (or a PC without a video terminal) it could be necessary to add two variables to the command to launch the script: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa <command>`. For instance, to run walker walk with DreamerV1 on two gpus (0 and 1) it is necessary to runthe following command: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa python sheeprl.py exp=sac_ae fabric.devices=2 fabric.accelerator=gpu env=dmc env.id=walker_walk env.action_repeat=2 env.capture_video=True checkpoint.every=80000 cnn_keys.encoder=[rgb]`.
When running SAC-AE in a DMC environment on a server (or a PC without a video terminal) it could be necessary to add two variables to the command to launch the script: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa <command>`. For instance, to run walker walk with SAC-AE on two GPUs (0 and 1) it is necessary to run the following command: `PYOPENGL_PLATFORM="" MUJOCO_GL=osmesa python sheeprl.py exp=sac_ae fabric.devices=2 fabric.accelerator=gpu env=dmc env.wrapper.domain_name=walker env.wrapper.task_name=walk env.action_repeat=2 env.capture_video=True checkpoint.every=80000 algo.cnn_keys.encoder=[rgb]`.
Other possibilities for the variable `MUJOCO_GL` are `GLFW` for rendering to an X11 window or `EGL` for hardware-accelerated headless rendering. (For more information, click [here](https://mujoco.readthedocs.io/en/stable/programming/index.html#using-opengl)).

## Recommendations
5 changes: 3 additions & 2 deletions sheeprl/configs/env/dmc.yaml
@@ -3,15 +3,16 @@ defaults:
  - _self_

# Override from `default` config
id: walker_walk
id: ${env.wrapper.domain_name}_${env.wrapper.task_name}
action_repeat: 1
max_episode_steps: 1000
sync_env: True

# Wrapper to be instantiated
wrapper:
  _target_: sheeprl.envs.dmc.DMCWrapper
  id: ${env.id}
  domain_name: walker
  task_name: walk
  width: ${env.screen_size}
  height: ${env.screen_size}
  seed: null
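
The `env.id` field is now derived from the wrapper's `domain_name` and `task_name` through OmegaConf interpolation, so overriding either of the two automatically keeps `env.id` consistent. Below is a minimal, self-contained sketch of how that interpolation resolves, using a hand-written dict in place of the real config tree:

```python
from omegaconf import OmegaConf

# Toy replica of the relevant part of the config tree.
cfg = OmegaConf.create(
    {
        "env": {
            "id": "${env.wrapper.domain_name}_${env.wrapper.task_name}",
            "wrapper": {"domain_name": "walker", "task_name": "walk"},
        }
    }
)
print(cfg.env.id)  # walker_walk

# Overriding the wrapper fields changes the derived id as well.
cfg.env.wrapper.domain_name = "cartpole"
cfg.env.wrapper.task_name = "swingup_sparse"
print(cfg.env.id)  # cartpole_swingup_sparse
```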
47 changes: 47 additions & 0 deletions sheeprl/configs/exp/dreamer_v3_dmc_cartpole_swingup_sparse.yaml
@@ -0,0 +1,47 @@
# @package _global_

defaults:
  - dreamer_v3
  - override /algo: dreamer_v3_S
  - override /env: dmc
  - _self_

# Experiment
seed: 5

# Environment
env:
  num_envs: 4
  action_repeat: 2
  max_episode_steps: -1
  wrapper:
    domain_name: cartpole
    task_name: swingup_sparse
    from_vectors: False
    from_pixels: True
    seed: ${seed}

# Checkpoint
checkpoint:
  every: 10000

# Buffer
buffer:
  size: 100000
  checkpoint: True
  memmap: True

# Algorithm
algo:
  total_steps: 1000000
  cnn_keys:
    encoder:
      - rgb
  mlp_keys:
    encoder: []
  learning_starts: 1024
  train_every: 2

# Metric
metric:
  log_every: 5000
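
With this file placed under `sheeprl/configs/exp`, the experiment should be selectable by file name, e.g. `python sheeprl.py exp=dreamer_v3_dmc_cartpole_swingup_sparse` (assuming, as for the other exp configs, that the experiment name matches the file name).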
3 changes: 2 additions & 1 deletion sheeprl/configs/exp/dreamer_v3_dmc_walker_walk.yaml
@@ -13,8 +13,9 @@ seed: 5
env:
  num_envs: 4
  max_episode_steps: -1
  id: walker_walk
  wrapper:
    domain_name: walker
    task_name: walk
    from_vectors: False
    from_pixels: True

7 changes: 4 additions & 3 deletions sheeprl/envs/dmc.py
@@ -49,7 +49,8 @@ def _flatten_obs(obs: Dict[Any, Any]) -> np.ndarray:
class DMCWrapper(gym.Wrapper):
    def __init__(
        self,
        id: str,
        domain_name: str,
        task_name: str,
        from_pixels: bool = False,
        from_vectors: bool = True,
        height: int = 84,
@@ -74,7 +75,8 @@ def __init__(
        by the camera specified through the `camera_id` parameter

        Args:
            id (str): the task id, e.g. 'walker_walk'. The id must be 'underscore' separated.
            domain_name (str): the domain of the environment, e.g., "walker".
            task_name (str): the task of the environment, e.g., "walk".
            from_pixels (bool, optional): whether to return the image observation.
                If both 'from_pixels' and 'from_vectors' are True, then the observation space
                will be a `gymnasium.spaces.Dict` with two keys: 'rgb' and 'state' for the
@@ -112,7 +114,6 @@ def __init__(
f"got {from_vectors} and {from_pixels} respectively."
)

domain_name, task_name = id.split("_")
self._from_pixels = from_pixels
self._from_vectors = from_vectors
self._height = height
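
A hypothetical usage sketch of the updated wrapper, based only on the signature and docstring shown in this diff; the `height`/`width` values and the Gymnasium-style `reset(seed=...)` call are assumptions:

```python
from sheeprl.envs.dmc import DMCWrapper

# Domain and task are now two separate arguments instead of a single id="walker_walk".
env = DMCWrapper(
    domain_name="walker",
    task_name="walk",
    from_pixels=True,   # adds the rendered "rgb" observation
    from_vectors=True,  # adds the flattened "state" observation
    height=64,
    width=64,
)
obs, info = env.reset(seed=5)
print(env.observation_space)  # per the docstring: a Dict space with "rgb" and "state" keys
```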