[RLlib] Single agent RLTrainer made easy #31802
Conversation
…nabled
2. don't do numpy conversion for batch on the base class
… agent losses
…e the unittest is located and import torch would import the relative torch module instead of the global torch module
a couple of random questions.
module_id, module_batch, module_fwd_out
)
results_all_modules[module_id] = module_results
loss = module_results[self.TOTAL_LOSS_KEY]
just a bit confusing, why would module_results ever contain TOTAL_LOSS_KEY at this point?
That's the convention: the user needs to return their loss under that key.
they can optionally include other metrics they want to log.
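For concreteness, here is a minimal sketch of what that convention looks like from the user's side. The class and method names (`MySingleAgentLoss`, `compute_loss_per_module`) and the batch/forward-output keys are assumptions for illustration, not the exact RLlib API:

```python
import torch.nn.functional as F


class MySingleAgentLoss:
    # Stand-in for the trainer class; the real trainer exposes TOTAL_LOSS_KEY
    # as a class attribute so user code never has to import it.
    TOTAL_LOSS_KEY = "total_loss"

    def compute_loss_per_module(self, module_id, batch, fwd_out):
        # Compute whatever loss makes sense for this module ...
        loss = F.mse_loss(fwd_out["vf_preds"], batch["value_targets"])

        # ... and return it under TOTAL_LOSS_KEY. Any extra entries are
        # optional metrics that the trainer can log alongside the loss.
        return {
            self.TOTAL_LOSS_KEY: loss,
            "vf_loss": loss.detach(),
        }
```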
return loss_dict
return {self.TOTAL_LOSS_KEY: loss}
Maybe this should get saved under a per-module key, so that rl_trainer can always aggregate the loss into a total key.
Also, why is TOTAL_LOSS_KEY a constant on self?
Also, why is TOTAL_LOSS_KEY a constant on self?
to be accessible everywhere. No need to import :)
Maybe this should get saved under a per-module key, so that rl_trainer can always aggregate the loss into a total key.
This is more general. Trust me :)
ok :)
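To illustrate the aggregation being discussed, here is a hedged sketch (not the actual RLlib implementation) of how per-module results keyed by module id can be rolled up into a single total-loss entry:

```python
import torch

TOTAL_LOSS_KEY = "total_loss"


def aggregate_module_losses(results_all_modules):
    # Sum each module's loss into one total while keeping the per-module
    # results nested under their module ids.
    total_loss = sum(
        module_results[TOTAL_LOSS_KEY]
        for module_results in results_all_modules.values()
    )
    return {TOTAL_LOSS_KEY: total_loss, **results_all_modules}


# Usage with two hypothetical module ids:
results = {
    "policy_0": {TOTAL_LOSS_KEY: torch.tensor(0.5)},
    "policy_1": {TOTAL_LOSS_KEY: torch.tensor(1.5)},
}
print(aggregate_module_losses(results)[TOTAL_LOSS_KEY])  # tensor(2.)
```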
Why are these changes needed?
The compute_loss method is built for multi-agent training. While it is still possible to write single-agent losses with it, this may become confusing to users. We should give them a way to specify single-agent losses as well, without having to think about the extra layer of hierarchy for module ids. This PR makes that easy.
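As a hedged illustration of the difference (class, method, and key names here are placeholders based on this description, not the API actually added by the PR), the multi-agent path forces users to index by module id, while a per-module hook lets single-agent losses ignore that layer entirely:

```python
import torch


def my_loss_fn(fwd_out, batch):
    # Placeholder loss purely for illustration.
    return torch.mean((fwd_out["pred"] - batch["target"]) ** 2)


class SketchTrainerBase:
    # Stand-in base class; TOTAL_LOSS_KEY lives on the class so user code
    # never has to import it.
    TOTAL_LOSS_KEY = "total_loss"


# Multi-agent style: the user handles the module-id hierarchy themselves.
class MyMultiAgentTrainer(SketchTrainerBase):
    def compute_loss(self, fwd_out, batch):
        per_module = {
            module_id: my_loss_fn(fwd_out[module_id], batch[module_id])
            for module_id in batch
        }
        return {self.TOTAL_LOSS_KEY: sum(per_module.values()), **per_module}


# Single-agent style (the convenience this PR is after): no module ids in sight.
class MySingleAgentTrainer(SketchTrainerBase):
    def compute_loss_per_module(self, module_id, batch, fwd_out):
        return {self.TOTAL_LOSS_KEY: my_loss_fn(fwd_out, batch)}
```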
Related issue number

Checks
- I've signed off every commit (git commit -s) in this PR.
- I've run scripts/format.sh to lint the changes in this PR.