[RLlib] Forward fix for failing PPO Torch RLTrainer test #32308

Merged
merged 136 commits into from
Feb 8, 2023
Commits
28679ca
added quick cleanups to trainer_runner.
kouroshHakha Jan 10, 2023
a2f9439
created test_trainer_runner
kouroshHakha Jan 10, 2023
d8b36c1
added TODO tag
kouroshHakha Jan 10, 2023
d24bff5
Merge branch 'master' into trainer-runner-quick-cleanups
kouroshHakha Jan 11, 2023
2b67577
fixed imports
kouroshHakha Jan 11, 2023
fe60e20
typo in BUILD
kouroshHakha Jan 11, 2023
916d674
started to create torch_rl_trainer
kouroshHakha Jan 12, 2023
71026e5
added bc_rl_trainer
kouroshHakha Jan 12, 2023
ae61014
torch trainer test works now
kouroshHakha Jan 12, 2023
ef1ffb8
lint
kouroshHakha Jan 12, 2023
f393cea
Merge branch 'master' into torch-trainer
kouroshHakha Jan 13, 2023
16f64f9
updated TODOs and BUILD
kouroshHakha Jan 13, 2023
9719518
wip: trainer_runner multi-gpu test
kouroshHakha Jan 13, 2023
77730d8
torch version runs but the parameters are not synced
kouroshHakha Jan 14, 2023
97573dc
wip
kouroshHakha Jan 14, 2023
091c406
got the multi-gpu gradient sync up working
kouroshHakha Jan 17, 2023
a42b0f1
fixed add/remove multi-gpu tests
kouroshHakha Jan 17, 2023
62fe11f
moved the DDPRLModuleWrapper outside of RLTrainer + lint
kouroshHakha Jan 17, 2023
2cc5185
merged tf and torch train_runner tests
kouroshHakha Jan 17, 2023
435c352
fixed trainer_runner auto-scaling on a cluster where autoscaling is e…
kouroshHakha Jan 18, 2023
ff845c3
fix rl_trainer unittest failures.
kouroshHakha Jan 18, 2023
7c3eed7
1. renamed the DDP wrapper
kouroshHakha Jan 18, 2023
d56ce2c
removed in_test from the production code
kouroshHakha Jan 18, 2023
200b5f7
clarified todo
kouroshHakha Jan 18, 2023
f747e50
comments
kouroshHakha Jan 18, 2023
7ce81f0
renamed make_distributed to make_distributed_module
kouroshHakha Jan 18, 2023
b2ddd2d
fixed test torch rl_trainer lint
kouroshHakha Jan 18, 2023
5bc625c
fixed marl_module stuff
kouroshHakha Jan 18, 2023
3302db7
fixed the import issue
kouroshHakha Jan 18, 2023
5455e29
lint
kouroshHakha Jan 18, 2023
a2d042f
Merge branch 'master' into torch-trainer
kouroshHakha Jan 19, 2023
b9159a8
fixed lint
kouroshHakha Jan 19, 2023
f3edd50
Merge branch 'master' into torch-trainer
kouroshHakha Jan 19, 2023
2aec198
test trainer runner updated
kouroshHakha Jan 19, 2023
873cdd5
fixed the scaling config and in_test issues introduced after the merge.
kouroshHakha Jan 19, 2023
13e19aa
fixed the scaling config and in_test issues introduced after the merge.
kouroshHakha Jan 19, 2023
68b72e4
Merge branch 'torch-trainer' of github.com:kouroshHakha/ray into torc…
kouroshHakha Jan 19, 2023
ca3e225
wip
kouroshHakha Jan 20, 2023
ff3b335
Merge branch 'master' into torch-trainer
kouroshHakha Jan 20, 2023
3234aaf
wip
kouroshHakha Jan 20, 2023
15b99ee
wip
kouroshHakha Jan 20, 2023
8b9ae92
wip
kouroshHakha Jan 20, 2023
eb82e67
wip
kouroshHakha Jan 20, 2023
dac0d6b
fixed trainer_runner config test
kouroshHakha Jan 20, 2023
cfdaa04
removed the stuff that got moved to SARLTrainer made easy PR
kouroshHakha Jan 20, 2023
7b5938b
fixed torch import
kouroshHakha Jan 20, 2023
f4cbe5a
removed the override decorator for nn.Module
kouroshHakha Jan 20, 2023
9fb749c
Merge branch 'master' into ppo-torch-trainer
kouroshHakha Jan 20, 2023
9645137
added unittest (wip)
kouroshHakha Jan 20, 2023
eac5223
fixed import torch in bc_module.py
kouroshHakha Jan 20, 2023
0baa3cf
fixed the bazel bug where the working directory gets switched to wher…
kouroshHakha Jan 20, 2023
78860f8
Merge branch 'torch-trainer' into ppo-torch-trainer
kouroshHakha Jan 20, 2023
c125e20
wip
kouroshHakha Jan 20, 2023
5aaa603
Merge branch 'master' into ppo-torch-trainer
kouroshHakha Jan 21, 2023
c77be0c
fixed the unittest
kouroshHakha Jan 21, 2023
0e6f511
wip
kouroshHakha Jan 21, 2023
1778c44
Merge branch 'master' into policy-with-marl
kouroshHakha Jan 23, 2023
4bdd949
added dataclass specs for RLModule and MARLModule for easier construc…
kouroshHakha Jan 24, 2023
58bbe82
test trainer runner local passed
kouroshHakha Jan 24, 2023
93b27ec
add_module() api is now update to accept a module_spec instead of mod…
kouroshHakha Jan 24, 2023
7aaee18
get_trainer_runner_config() now gets an optional ModuleSpec object
kouroshHakha Jan 24, 2023
7c831d3
Algorithm can now construct the trainer_runner based on the policy_maps
kouroshHakha Jan 24, 2023
d3d610e
lint and clean up
kouroshHakha Jan 24, 2023
97acdb7
fixed imports
kouroshHakha Jan 24, 2023
f3ccf54
Merge branch 'policy-with-marl' into ppo-torch-trainer
kouroshHakha Jan 24, 2023
93f3ce9
fixed the unittest for ppo_rl_trainer
kouroshHakha Jan 24, 2023
c58293a
got the PPO POC running
kouroshHakha Jan 25, 2023
3f36f0d
Merge branch 'master' into ppo-torch-trainer
kouroshHakha Jan 25, 2023
77ff585
lint
kouroshHakha Jan 25, 2023
87bda01
wip
kouroshHakha Jan 25, 2023
6346a20
wip
kouroshHakha Jan 25, 2023
eb5106f
wip
kouroshHakha Jan 25, 2023
b2c01ad
multi-gpu test works now
kouroshHakha Jan 25, 2023
ea3d9c6
removed left out api get_weight()
kouroshHakha Jan 25, 2023
45830c6
get_weights() updated
kouroshHakha Jan 26, 2023
94ca772
trying out a new configuration pattern for trainer runner and rl trai…
kouroshHakha Jan 26, 2023
29ac2fb
wip
kouroshHakha Jan 26, 2023
4714e20
lint
kouroshHakha Jan 26, 2023
abd5e5e
rl_trainer tf test passes again
kouroshHakha Jan 27, 2023
a44c370
torch rl trainer test passed
kouroshHakha Jan 27, 2023
e496fcc
trainer_runner_config test works too
kouroshHakha Jan 27, 2023
477795d
tested the multigpu
kouroshHakha Jan 27, 2023
58fe5df
docstring updated
kouroshHakha Jan 27, 2023
d5bcd3b
updated the docstring
kouroshHakha Jan 27, 2023
d8841d1
wip
kouroshHakha Jan 28, 2023
026899e
renamed the classes and variables to backend
kouroshHakha Jan 28, 2023
f04e99d
renamed
kouroshHakha Jan 28, 2023
d280887
wip
kouroshHakha Jan 28, 2023
97d80b1
refactor
kouroshHakha Jan 28, 2023
85387e5
lin
kouroshHakha Jan 28, 2023
1a70b6e
fix the lint and tf_dependency test issue via adding tf stubs
kouroshHakha Jan 29, 2023
317a9fd
wip on unittest trianer_runner
kouroshHakha Jan 29, 2023
cbc9b02
wip
kouroshHakha Jan 29, 2023
869717e
Merge branch 'master' into trainer-runner-scaling-config
kouroshHakha Jan 29, 2023
defa5f1
test_trainer_runner updated to support all variations of scaling config
kouroshHakha Jan 29, 2023
e0a0bcf
removed test trainer runner local and moved it to test_trainer_runner.py
kouroshHakha Jan 29, 2023
cf4041e
fixed the test failures
kouroshHakha Jan 29, 2023
d4cd654
1. Removed tf due to flakiness from test_trainer_runner
kouroshHakha Jan 29, 2023
54eb315
removed backend class definitions
kouroshHakha Jan 29, 2023
1c826db
Removed Hyperparams class
kouroshHakha Jan 29, 2023
e8cf7e1
introed FrameworkHPs to differebntiate between tf/torch specific stuf…
kouroshHakha Jan 29, 2023
2ea3a4d
the unittests pass
kouroshHakha Jan 29, 2023
92dd832
Merge branch 'trainer-runner-scaling-config' into ppo-torch-trainer
kouroshHakha Jan 30, 2023
fd84f7a
addressed comments and fixed some introduced bug
kouroshHakha Jan 30, 2023
4c38455
Merge branch 'master' into ppo-torch-trainer
kouroshHakha Jan 30, 2023
d40c48d
fix from_worker_or_trainer renaming issue
kouroshHakha Jan 30, 2023
b0fed29
fixed tests
kouroshHakha Jan 30, 2023
b92eee9
fixed test_ppo_rl_trainer.py
kouroshHakha Jan 30, 2023
1c3ccb5
rerunning ci
kouroshHakha Jan 31, 2023
37c9fca
lint
kouroshHakha Jan 31, 2023
1687241
added TODO
kouroshHakha Jan 31, 2023
d3ee81a
empty commit
kouroshHakha Jan 31, 2023
24abebd
Merge branch 'master' into ppo-torch-trainer
kouroshHakha Jan 31, 2023
3ff0668
fixed weights to numpy
kouroshHakha Jan 31, 2023
d91eca5
[release] minor fix to pytorch_pbt_failure test when using gpu. (#32070)
xwjiang2010 Jan 31, 2023
5b209be
Merge branch 'master' into err-out-marl-env
kouroshHakha Jan 31, 2023
833e491
Merge branch 'err-out-marl-env' into ppo-torch-trainer
kouroshHakha Jan 31, 2023
e29021d
error out when no agent is passed in in the indepenent MARL case
kouroshHakha Jan 31, 2023
d113d3a
Merge branch 'err-out-marl-env' into ppo-torch-trainer
kouroshHakha Jan 31, 2023
f28a385
1. set resources for trainable 2. convert_to_numpy weights on RLTrain…
kouroshHakha Jan 31, 2023
993932f
added examples as a unittest to BUILD kite
kouroshHakha Jan 31, 2023
05c8297
fixed test name conflict
kouroshHakha Jan 31, 2023
839ff90
removed the wrong tag from docs
kouroshHakha Feb 1, 2023
320b116
Merge branch 'master' into ppo-torch-trainer
kouroshHakha Feb 1, 2023
8466fc8
fixed as test flag
kouroshHakha Feb 1, 2023
4c4f7cc
Merge branch 'master' into ppo-torch-trainer
kouroshHakha Feb 2, 2023
d78f3d5
made the sync_weights equivalent to the implementation before this PR
kouroshHakha Feb 6, 2023
7a68bb4
addressed jun's comments, created a minibatchCycleIterator
kouroshHakha Feb 7, 2023
0008833
Merge branch 'ppo-torch-trainer' of github.com:kouroshHakha/ray into …
kouroshHakha Feb 7, 2023
edbf081
added annotations
kouroshHakha Feb 7, 2023
2698973
Merge branch 'master' into ppo-torch-trainer
kouroshHakha Feb 7, 2023
4c8ce18
empty
kouroshHakha Feb 7, 2023
9f29038
empty
kouroshHakha Feb 7, 2023
b1d3f63
empty
kouroshHakha Feb 7, 2023
f345c70
Merge branch 'master' into ppo-torch-trainer
kouroshHakha Feb 8, 2023
653bf3d
fwd fix for the failing test
kouroshHakha Feb 8, 2023
6 changes: 5 additions & 1 deletion rllib/algorithms/ppo/tests/test_ppo_rl_trainer.py
@@ -20,6 +20,10 @@
         [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]],
         dtype=np.float32,
     ),
+    SampleBatch.NEXT_OBS: np.array(
+        [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]],
+        dtype=np.float32,
+    ),
     SampleBatch.ACTIONS: np.array([0, 1, 1]),
     SampleBatch.PREV_ACTIONS: np.array([0, 1, 1]),
     SampleBatch.REWARDS: np.array([1.0, -1.0, 0.5], dtype=np.float32),
@@ -57,7 +61,7 @@ def test_loss(self):
     .training(
         gamma=0.99,
         model=dict(
-            fcnet_hiddens=[10],
+            fcnet_hiddens=[10, 10],
             fcnet_activation="linear",
             vf_share_layers=False,
         ),