[RLlib] New API stack: (Multi)RLModule overhaul vol 04 (deprecate RLModuleConfig; cleanups, DefaultModelConfig dataclass). #47908

sven1977 · 2024-10-04T23:51:18Z

New API stack: (Multi)RLModule overhaul vol 04:

- Deprecated RLModuleConfig and MultiRLModuleConfig in favor of using only specs (individual constructor args) to construct/build RLModules.
- Cleaned up some attribute naming for better clarity, e.g. MultiRLModule.module_specs -> rl_module_specs.
- Introduced the DefaultModelConfig class to replace the old stack's MODEL_CONFIG dict. This dataclass is used when training with RLlib's default models. When training with a custom RLModule, users should simply pass a dict with arbitrary key/value pairs into the RLModuleSpec.
- Renamed RLModule.model_config_dict to simply model_config. This shows why deprecating RLModuleConfig matters: having two different config classes/attributes around is very confusing.
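The split described above (a fully defined dataclass for default models, an arbitrary dict for custom modules, both passed through the spec) can be sketched in plain Python. This is only an illustration under stated assumptions: `SketchDefaultModelConfig`, `SketchModuleSpec`, and `SketchModule` are hypothetical stand-ins, not RLlib classes, and the field names and default values are invented for the example.

```python
from dataclasses import asdict, dataclass, field, is_dataclass
from typing import Any, Dict, List, Union


@dataclass
class SketchDefaultModelConfig:
    """Hypothetical stand-in for a fully defined default-model config."""

    fcnet_hiddens: List[int] = field(default_factory=lambda: [256, 256])
    fcnet_activation: str = "tanh"


class SketchModule:
    """Hypothetical module that only ever sees a plain `model_config` dict."""

    def __init__(self, model_config: Dict[str, Any]):
        self.model_config = model_config


@dataclass
class SketchModuleSpec:
    """Hypothetical spec: carries constructor args directly, no extra config class."""

    module_class: type
    model_config: Union[Dict[str, Any], SketchDefaultModelConfig] = field(
        default_factory=dict
    )

    def build(self):
        cfg = self.model_config
        # Normalize a dataclass config to a dict; pass a user dict through as-is.
        if is_dataclass(cfg):
            cfg = asdict(cfg)
        return self.module_class(model_config=cfg)


# Default model: the dataclass guarantees every key is present.
default_module = SketchModuleSpec(SketchModule, SketchDefaultModelConfig()).build()
# Custom module: arbitrary key/value pairs, interpreted by the module itself.
custom_module = SketchModuleSpec(SketchModule, {"my_custom_key": 42}).build()
```

In this sketch the module reads a single `model_config` attribute in both cases, which is the kind of ambiguity-free access the rename is aiming for.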


Signed-off-by: sven1977 <svenmika1977@gmail.com>


sven1977 · 2024-10-07T08:49:48Z

doc/source/rllib/doc_code/rlmodule_guide.py

@@ -54,18 +54,18 @@
 from ray.rllib.core.testing.torch.bc_module import DiscreteBCTorchModule

 spec = MultiRLModuleSpec(
-    module_specs={
+    rl_module_specs={


renamed: rl_module_specs for more clarity

sven1977 · 2024-10-07T08:50:18Z

doc/source/rllib/doc_code/rlmodule_guide.py

@@ -40,7 +40,7 @@
    module_class=DiscreteBCTorchModule,
    observation_space=env.observation_space,
    action_space=env.action_space,
-    model_config_dict={"fcnet_hiddens": [64]},
+    model_config={"fcnet_hiddens": [64]},


renamed: model_config for more clarity (now that the redundant (Multi)RLModuleConfig are gone).

sven1977 · 2024-10-07T08:51:11Z

doc/source/rllib/doc_code/rlmodule_guide.py

-        input_dim = self.config.observation_space.shape[0]
-        hidden_dim = self.config.model_config_dict["fcnet_hiddens"][0]
-        output_dim = self.config.action_space.n
+        input_dim = self.observation_space.shape[0]


Note: the old notation, self.config...., still works in case users have old RLModule classes that use this attribute, but it is no longer advertised.

Maybe we should set a date by which users have to switch, so that we can deprecate cleanly.
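One common way to keep an old access path working while steering users to the new one is a property that emits a `DeprecationWarning`. The sketch below is an assumption about how such a shim could look, not the code in this PR: `SketchRLModule` is a hypothetical class, and the spaces are replaced by placeholder strings to keep the example free of dependencies.

```python
import warnings
from types import SimpleNamespace


class SketchRLModule:
    """Hypothetical module exposing the new attributes plus a legacy `config` shim."""

    def __init__(self, observation_space, action_space, model_config=None):
        self.observation_space = observation_space
        self.action_space = action_space
        self.model_config = model_config or {}

    @property
    def config(self):
        # Legacy access path: still works, but tells users what to switch to.
        warnings.warn(
            "`self.config` is deprecated; use `self.observation_space`, "
            "`self.action_space`, and `self.model_config` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return SimpleNamespace(
            observation_space=self.observation_space,
            action_space=self.action_space,
            model_config_dict=self.model_config,
        )


module = SketchRLModule("obs_space", "act_space", {"fcnet_hiddens": [64]})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = module.config.model_config_dict["fcnet_hiddens"][0]

new_style = module.model_config["fcnet_hiddens"][0]
```

A shim like this also gives a natural place to state the cut-off date in the warning message once one is agreed on.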


simonsays1980

LGTM. Some general questions and some nits here and there.

simonsays1980 · 2024-10-07T08:48:53Z

doc/source/rllib/doc_code/rlmodule_guide.py

-        hidden_dim = self.config.model_config_dict["fcnet_hiddens"][0]
-        output_dim = self.config.action_space.n
+        input_dim = self.observation_space.shape[0]
+        hidden_dim = self.model_config["fcnet_hiddens"][0]


This is very nice!

simonsays1980 · 2024-10-07T08:53:45Z

rllib/algorithms/algorithm_config.py

        configure the `RLModule` in the new stack and the `ModelV2` in the old
        stack.

        Returns:
            A dictionary with the model configuration.
        """
-        return self._model_config_auto_includes | self._model_config_dict
+        return self._model_config_auto_includes | (
+            self._model_config


Theoretically, self._model_config could be empty, couldn't it? This case was covered before by the MODEL_DEFAULTS from self._model_config_auto_includes.

Correct! self._model_config could be empty, but then the user would have to handle such empty dicts in their own custom RLModule class's setup method. If the user uses a default RLModule from one of the algos, they have to provide the DefaultModelConfig dataclass, which is always fully defined.

I'm 100% with you that we have to come up with a unified solution for all our configs (across RLlib, but even better across Ray itself). Right now, it's a mix of dicts, dataclasses, and custom config classes (e.g. AlgorithmConfig).
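For readers unfamiliar with the `|` operator used in the diff above: on dicts (Python 3.9+) it returns a new dict in which keys from the right operand override those from the left. The sketch below shows only these plain-Python semantics; the keys and values are illustrative and not taken from RLlib's defaults.

```python
# Illustrative stand-in for auto-included settings (values are made up).
auto_includes = {"fcnet_hiddens": [256, 256], "fcnet_activation": "tanh"}

# The right operand wins on key collisions.
merged_with_override = auto_includes | {"fcnet_hiddens": [64]}

# An empty right operand leaves the left-hand values untouched,
# but still produces a new dict object.
merged_with_empty = auto_includes | {}
```

So an empty user-provided dict does not break the merge itself; the result then contains only whatever the left-hand dict supplies.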

simonsays1980 · 2024-10-07T08:54:23Z

rllib/algorithms/bc/bc_catalog.py

@@ -26,8 +26,8 @@ class BCCatalog(Catalog):
    Any custom head can be built by overriding the `build_pi_head()` method.
    Alternatively, the `PiHeadConfig` can be overridden to build a custom
    policy head during runtime. To change solely the network architecture,
-    `model_config_dict["post_fcnet_hiddens"]` and
-    `model_config_dict["post_fcnet_activation"]` can be used.
+    `model_config_dict["head_fcnet_hiddens"]` and


Yes, I like this naming convention. This makes it much clearer where to pass in which configs.

simonsays1980 · 2024-10-07T08:57:25Z

rllib/algorithms/marwil/torch/marwil_torch_rl_module.py

@@ -13,54 +12,41 @@
 torch, nn = try_import_torch()


-class MARWILTorchRLModule(TorchRLModule, MARWILRLModule):
-    framework: str = "torch"
+class MARWILTorchRLModule(TorchRLModule):


I think we can simply derive from PPOTorchRLModule, can't we? It should all be the same.

Done. Removed MARWILTorchRLModule entirely (similar to IMPALA, which uses PPOTorchRLModule).

simonsays1980 · 2024-10-07T12:01:02Z

doc/source/rllib/doc_code/rlmodule_guide.py

-        input_dim = self.config.observation_space.shape[0]
-        hidden_dim = self.config.model_config_dict["fcnet_hiddens"][0]
-        output_dim = self.config.action_space.n
+        input_dim = self.observation_space.shape[0]


Maybe we should set a date by which users have to switch, so that we can deprecate cleanly.

simonsays1980 · 2024-10-07T12:15:24Z

rllib/examples/rl_modules/classes/autoregressive_actions_rlm.py

-            hidden_layer_activation=self.config.model_config_dict[
-                "post_fcnet_activation"
-            ],
+            hidden_layer_dims=self.model_config["post_fcnet_hiddens"],


Same here :)

simonsays1980 · 2024-10-07T12:15:32Z

rllib/examples/rl_modules/classes/autoregressive_actions_rlm.py

-            hidden_layer_activation=self.config.model_config_dict[
-                "post_fcnet_activation"
-            ],
+            hidden_layer_dims=self.model_config["post_fcnet_hiddens"],


And here :)

simonsays1980 · 2024-10-07T12:17:03Z

rllib/models/catalog.py

@@ -50,268 +50,69 @@
 # fmt: off
 # __sphinx_doc_begin__
 MODEL_DEFAULTS: ModelConfigDict = {


DefaultModelConfig.to_dict()?

Let's keep it as-is (it will be gone soon anyway). Some names/keys have changed, so this would be a nightmare.

simonsays1980 · 2024-10-07T12:18:24Z

rllib/utils/typing.py

@@ -69,7 +69,16 @@

 # Represents the model config sub-dict of the algo config that is passed to
 # the model catalog.
-ModelConfigDict = dict
+ModelConfigDict = dict  # @OldAPIStack


Ah got it. Here it is defined ...

simonsays1980 · 2024-10-07T12:18:40Z

rllib/utils/typing.py

+# Each inner list has the format: [num_output_filters, kernel, stride], where kernel
+# and stride may be single ints (width and height are the same) or 2-tuples (int, int)
+# for width and height (different values).
+ConvFilterSpec = List[


Alright. Can be tuple.
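The comment in the diff describes entries of the form [num_output_filters, kernel, stride], where kernel and stride are either a single int or a 2-tuple. A small helper that normalizes both forms to explicit pairs could look like the sketch below. All names here (`IntOrPair`, `as_pair`, `normalize_filters`) are hypothetical, and the alias is not the `ConvFilterSpec` definition from the PR, which is truncated in the excerpt.

```python
from typing import List, Sequence, Tuple, Union

# Hypothetical alias: a single int (same width and height) or an explicit pair.
IntOrPair = Union[int, Tuple[int, int]]


def as_pair(value: IntOrPair) -> Tuple[int, int]:
    """Expand a single int to (value, value); pass a 2-tuple through."""
    if isinstance(value, int):
        return (value, value)
    width, height = value
    return (int(width), int(height))


def normalize_filters(
    filters: Sequence[Sequence[IntOrPair]],
) -> List[Tuple[int, Tuple[int, int], Tuple[int, int]]]:
    """Turn [num_filters, kernel, stride] entries into explicit-pair form."""
    normalized = []
    for num_filters, kernel, stride in filters:
        normalized.append((num_filters, as_pair(kernel), as_pair(stride)))
    return normalized


example = normalize_filters([[16, 4, 2], [32, (3, 5), (1, 2)]])
```

Normalizing once at the boundary means downstream code only has to deal with the pair form.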


ema-pe · 2025-01-16T16:09:41Z

rllib/models/catalog.py

-    # Experimental flag.
-    # If True, user specified no preprocessor to be created
-    # (via config._disable_preprocessor_api=True). If True, observations
-    # will arrive in model as they are returned by the env.


Hi @sven1977, I know that this pull request has been merged and that this is the old API, but why have these comments (from this key on down) been removed? Thank you in advance.

wip

6fd97ba

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 requested review from ArturNiederfahrenhorst and simonsays1980 as code owners October 4, 2024 23:51

sven1977 assigned simonsays1980 Oct 4, 2024

sven1977 added 11 commits October 5, 2024 01:51

Merge branch 'master' of https://github.com/ray-project/ray into rl_m…

69b2d6c

…odule_do_over_bc_default_module_04_refactor_rl_module_and_multi_rl_module

wip

416b622

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

face4a3

Signed-off-by: sven1977 <svenmika1977@gmail.com>

fix

901ade5

Signed-off-by: sven1977 <svenmika1977@gmail.com>

fix

ac2acd9

Signed-off-by: sven1977 <svenmika1977@gmail.com>

fixes

5a00666

Signed-off-by: sven1977 <svenmika1977@gmail.com>

fixes

4cd530e

Signed-off-by: sven1977 <svenmika1977@gmail.com>

fixes

2a012f7

Signed-off-by: sven1977 <svenmika1977@gmail.com>

fixes

e4cb737

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 requested review from maxpumperla and a team as code owners October 6, 2024 20:53

sven1977 added 2 commits October 7, 2024 09:53

fixes

053832a

Signed-off-by: sven1977 <svenmika1977@gmail.com>

fixes

a178c74

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 enabled auto-merge (squash) October 7, 2024 08:48

github-actions bot added the go add ONLY when ready to merge, run all tests label Oct 7, 2024

sven1977 commented Oct 7, 2024


merge

9f6f391

Signed-off-by: sven1977 <svenmika1977@gmail.com>

github-actions bot disabled auto-merge October 7, 2024 11:37

wip

0b2f87b

Signed-off-by: sven1977 <svenmika1977@gmail.com>

simonsays1980 approved these changes Oct 7, 2024


sven1977 added 2 commits October 7, 2024 14:32

wip

6bb5c1a

Signed-off-by: sven1977 <svenmika1977@gmail.com>

wip

7c1e0c0

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 added 6 commits October 8, 2024 11:07

wip

05222ad

Signed-off-by: sven1977 <svenmika1977@gmail.com>

Merge branch 'master' of https://github.com/ray-project/ray into rl_m…

1359bac

…odule_do_over_bc_default_module_04_refactor_rl_module_and_multi_rl_module

merge

37561c9

Signed-off-by: sven1977 <svenmika1977@gmail.com>

fix

7dab1a8

Signed-off-by: sven1977 <svenmika1977@gmail.com>

fix

bed9703

Signed-off-by: sven1977 <svenmika1977@gmail.com>

merge

58ed3ae

Signed-off-by: sven1977 <svenmika1977@gmail.com>

sven1977 enabled auto-merge (squash) October 9, 2024 15:00

sven1977 merged commit a62d4bf into ray-project:master Oct 9, 2024
6 checks passed

sven1977 deleted the rl_module_do_over_bc_default_module_04_refactor_rl_module_and_multi_rl_module branch October 9, 2024 17:31

ema-pe reviewed Jan 16, 2025
