-
Notifications
You must be signed in to change notification settings - Fork 6.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RLlib] New API stack: (Multi)RLModule overhaul vol 04 (deprecate RLModuleConfig; cleanups, DefaultModelConfig dataclass). #47908
Conversation
…odule_do_over_bc_default_module_04_refactor_rl_module_and_multi_rl_module
…odule_do_over_bc_default_module_04_refactor_rl_module_and_multi_rl_module Signed-off-by: sven1977 <svenmika1977@gmail.com> # Conflicts: # rllib/algorithms/ppo/tests/test_ppo_rl_module.py # rllib/algorithms/ppo/torch/ppo_torch_rl_module.py # rllib/core/rl_module/multi_rl_module.py # rllib/core/rl_module/rl_module.py # rllib/tuned_examples/dqn/multi_agent_cartpole_dqn.py
…odule_do_over_bc_default_module_04_refactor_rl_module_and_multi_rl_module Signed-off-by: sven1977 <svenmika1977@gmail.com> # Conflicts: # rllib/algorithms/ppo/tests/test_ppo_rl_module.py # rllib/algorithms/ppo/torch/ppo_torch_rl_module.py # rllib/core/rl_module/multi_rl_module.py # rllib/core/rl_module/rl_module.py # rllib/tuned_examples/dqn/multi_agent_cartpole_dqn.py
@@ -54,18 +54,18 @@ | |||
from ray.rllib.core.testing.torch.bc_module import DiscreteBCTorchModule | |||
|
|||
spec = MultiRLModuleSpec( | |||
module_specs={ | |||
rl_module_specs={ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed: rl_module_specs
for more clarity
@@ -40,7 +40,7 @@ | |||
module_class=DiscreteBCTorchModule, | |||
observation_space=env.observation_space, | |||
action_space=env.action_space, | |||
model_config_dict={"fcnet_hiddens": [64]}, | |||
model_config={"fcnet_hiddens": [64]}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed: model_config
for more clarity (now that the redundant (Multi)RLModuleConfig
are gone).
input_dim = self.config.observation_space.shape[0] | ||
hidden_dim = self.config.model_config_dict["fcnet_hiddens"][0] | ||
output_dim = self.config.action_space.n | ||
input_dim = self.observation_space.shape[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: the old notation: self.config....
still works in case users have old RLModule classes that use this attribute, but it is no longer advertized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should set a date until users should switch to be able to deprecate cleanly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Some general questions and some nit here and there.
hidden_dim = self.config.model_config_dict["fcnet_hiddens"][0] | ||
output_dim = self.config.action_space.n | ||
input_dim = self.observation_space.shape[0] | ||
hidden_dim = self.model_config["fcnet_hiddens"][0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very nice!
configure the `RLModule` in the new stack and the `ModelV2` in the old | ||
stack. | ||
|
||
Returns: | ||
A dictionary with the model configuration. | ||
""" | ||
return self._model_config_auto_includes | self._model_config_dict | ||
return self._model_config_auto_includes | ( | ||
self._model_config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Theoretically self._model_config
could be empty, couldn't it? This case was covered before by the MODEL_DEFAULTS
from self._model_config_auto_includes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct! self._model_config
could be empty, BUT then the user would have to catch such empty dicts in their own custom RLModule class' setup
method. If the user uses a default RLModule from one of the algos, they have to provide the DefaultModelConfig
dataclass, which is always fully defined.
I'm 100% with you that we have to come up with a unified solution for all our configs (across RLlib, but even better across Ray itself). Right now, it's a mix of dicts, dataclasses, and custom config classes (e.g. AlgorithmConfig
).
@@ -26,8 +26,8 @@ class BCCatalog(Catalog): | |||
Any custom head can be built by overriding the `build_pi_head()` method. | |||
Alternatively, the `PiHeadConfig` can be overridden to build a custom | |||
policy head during runtime. To change solely the network architecture, | |||
`model_config_dict["post_fcnet_hiddens"]` and | |||
`model_config_dict["post_fcnet_activation"]` can be used. | |||
`model_config_dict["head_fcnet_hiddens"]` and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I like this naming convention. This makes it much clearer where to pass in which configs.
@@ -13,54 +12,41 @@ | |||
torch, nn = try_import_torch() | |||
|
|||
|
|||
class MARWILTorchRLModule(TorchRLModule, MARWILRLModule): | |||
framework: str = "torch" | |||
class MARWILTorchRLModule(TorchRLModule): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, we can simply derive from PPOTorchRLModule
, can't we. Should be all the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, removed MARWILTorchRLModule
entirely (similar to IMPALA using PPOTorchRLModule).
input_dim = self.config.observation_space.shape[0] | ||
hidden_dim = self.config.model_config_dict["fcnet_hiddens"][0] | ||
output_dim = self.config.action_space.n | ||
input_dim = self.observation_space.shape[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should set a date until users should switch to be able to deprecate cleanly.
hidden_layer_activation=self.config.model_config_dict[ | ||
"post_fcnet_activation" | ||
], | ||
hidden_layer_dims=self.model_config["post_fcnet_hiddens"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
hidden_layer_activation=self.config.model_config_dict[ | ||
"post_fcnet_activation" | ||
], | ||
hidden_layer_dims=self.model_config["post_fcnet_hiddens"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -50,268 +50,69 @@ | |||
# fmt: off | |||
# __sphinx_doc_begin__ | |||
MODEL_DEFAULTS: ModelConfigDict = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DefaultModelConfig.to_dict()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep it as-is (will be gone, soon, anyways). Some names/keys have changed, so this would be a nightmare.
@@ -69,7 +69,16 @@ | |||
|
|||
# Represents the model config sub-dict of the algo config that is passed to | |||
# the model catalog. | |||
ModelConfigDict = dict | |||
ModelConfigDict = dict # @OldAPIStack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah got it. Here it is defined ...
# Each inner list has the format: [num_output_filters, kernel, stride], where kernel | ||
# and stride may be single ints (width and height are the same) or 2-tuples (int, int) | ||
# for width and height (different values). | ||
ConvFilterSpec = List[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright. Can be tuple.
…odule_do_over_bc_default_module_04_refactor_rl_module_and_multi_rl_module
…oduleConfig; cleanups, DefaultModelConfig dataclass). (ray-project#47908) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
…oduleConfig; cleanups, DefaultModelConfig dataclass). (ray-project#47908) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
…oduleConfig; cleanups, DefaultModelConfig dataclass). (ray-project#47908) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
…oduleConfig; cleanups, DefaultModelConfig dataclass). (ray-project#47908) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
…oduleConfig; cleanups, DefaultModelConfig dataclass). (ray-project#47908) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
…oduleConfig; cleanups, DefaultModelConfig dataclass). (ray-project#47908) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
…oduleConfig; cleanups, DefaultModelConfig dataclass). (ray-project#47908) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
…oduleConfig; cleanups, DefaultModelConfig dataclass). (ray-project#47908) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
…oduleConfig; cleanups, DefaultModelConfig dataclass). (ray-project#47908) Signed-off-by: ujjawal-khare <ujjawal.khare@dream11.com>
# Experimental flag. | ||
# If True, user specified no preprocessor to be created | ||
# (via config._disable_preprocessor_api=True). If True, observations | ||
# will arrive in model as they are returned by the env. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I @sven1977, I know that this pull request has been merged and that this is the old API, but why have these comments (from this key and below) been removed? Thank you in advance.
New API stack: (Multi)RLModule overhaul vol 04:
MultiRLModule.module_specs
->rl_module_specs
.DefaultModelConfig
class to replace the old stack'sMODEL_CONFIG
dict. This dataclass will be used when training with RLlib's default models. When training with any custom RLModule, users should simply pass in a dict with arbitrary key/value pairs into the RLModuleSpec.RLModule.model_config_dict
into simplymodel_config
. (<- now we can see, why it's important to deprecateRLModuleConfig
as it is very confusing to have 2 different config classes/attributes around).Why are these changes needed?
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.