[RLlib] Give more time to impala tests #31910

ArturNiederfahrenhorst · 2023-01-24T23:31:48Z

Signed-off-by: Artur Niederfahrenhorst artur@anyscale.com

Why are these changes needed?

IMPALA release tests have been flakey. They are capable of learning >> 200 mean reward on breakout, but struggle to do so within 1800s. This PR moves that bar to 2400s.
The following screenshot depicts two representative runs of how the mean reward behaves over time (1 on horizontal scale is 1h).
Note that TF has much more variance.

I could not find out why TF has much more variance but noted that the major difference in metrics is with the value loss.

I also tuned LR and num workers to find out of TF being faster is simply an off-policy thing. It's not.

Related TB https://tensorboard.dev/experiment/5vi5EGOdSJWdfwoc1A2qNw/#scalars&tagFilter=cur_lr&_smoothingWeight=0

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>

ArturNiederfahrenhorst · 2023-01-24T23:38:11Z

release/rllib_tests/learning_tests/yaml_files/impala/impala-breakoutnoframeskip-v5.yaml

-            [0, 0.0005],
-            [20000000, 0.000000000001],
-        ]
+        lr: 0.0005


I noticed that 20000000, twntymillion in words, is never reached so to not confuse this for 2M, we should just leaved it out.

this sounds good.

kouroshHakha · 2023-01-27T17:21:43Z

@gjoliver @sven1977 Can someone merge this plz?

gjoliver · 2023-01-27T17:59:32Z

can you not leave the y scale out of the screen shot?
does it really take 4 hrs to reach 200? if we are judging based on max(mean_episode_reward), why would the variance matter?

ArturNiederfahrenhorst · 2023-01-27T18:02:26Z

@gjoliver The veriance does indeed not manifest in the max(mean_episode_reward).
I still found it surprising that it does that. Offpolicyness is the same.

gjoliver · 2023-01-28T17:03:17Z

ok, but I still have the question of why then do we need 4 hrs for this test?

ArturNiederfahrenhorst · 2023-01-28T20:07:29Z

@gjoliver Sorry, my mistake. I dug out the original tensorboard that you see in the picture: https://tensorboard.dev/experiment/5vi5EGOdSJWdfwoc1A2qNw/#scalars&tagFilter=cur_lr&_smoothingWeight=0
I might have swapped two screenshots when uploading.

ArturNiederfahrenhorst · 2023-01-28T20:07:52Z

So the run in the picture is a actually 40 minutes long.

ArturNiederfahrenhorst · 2023-01-28T20:08:28Z

In the file changes, I have only moved this from 30 to 40 minutes.

gjoliver · 2023-01-29T01:08:26Z

oh ok, man, this sounds a lot better. sorry I misread the numbers 😓

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com> Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>

give more time to impala tests

154d250

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>

ArturNiederfahrenhorst assigned sven1977 Jan 24, 2023

ArturNiederfahrenhorst added 2 commits January 24, 2023 15:36

change config

39916a8

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>

remove old configs

69747de

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com>

ArturNiederfahrenhorst commented Jan 24, 2023

View reviewed changes

kouroshHakha mentioned this pull request Jan 27, 2023

[RLlib] Fixed the autorom dependency issue #31933

Merged

7 tasks

kouroshHakha assigned gjoliver Jan 27, 2023

gjoliver approved these changes Jan 29, 2023

View reviewed changes

gjoliver merged commit 1929bb1 into ray-project:master Jan 29, 2023

edoakes pushed a commit to edoakes/ray that referenced this pull request Mar 22, 2023

[RLlib] Give more time to impala release tests (ray-project#31910)

e49968c

Signed-off-by: Artur Niederfahrenhorst <artur@anyscale.com> Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RLlib] Give more time to impala tests #31910

[RLlib] Give more time to impala tests #31910

ArturNiederfahrenhorst commented Jan 24, 2023 •

edited

Loading

ArturNiederfahrenhorst Jan 24, 2023

gjoliver Jan 27, 2023

kouroshHakha commented Jan 27, 2023

gjoliver commented Jan 27, 2023

ArturNiederfahrenhorst commented Jan 27, 2023

gjoliver commented Jan 28, 2023

ArturNiederfahrenhorst commented Jan 28, 2023

ArturNiederfahrenhorst commented Jan 28, 2023

ArturNiederfahrenhorst commented Jan 28, 2023

gjoliver commented Jan 29, 2023

[RLlib] Give more time to impala tests #31910

[RLlib] Give more time to impala tests #31910

Conversation

ArturNiederfahrenhorst commented Jan 24, 2023 • edited Loading

Why are these changes needed?

ArturNiederfahrenhorst Jan 24, 2023

Choose a reason for hiding this comment

gjoliver Jan 27, 2023

Choose a reason for hiding this comment

kouroshHakha commented Jan 27, 2023

gjoliver commented Jan 27, 2023

ArturNiederfahrenhorst commented Jan 27, 2023

gjoliver commented Jan 28, 2023

ArturNiederfahrenhorst commented Jan 28, 2023

ArturNiederfahrenhorst commented Jan 28, 2023

ArturNiederfahrenhorst commented Jan 28, 2023

gjoliver commented Jan 29, 2023

ArturNiederfahrenhorst commented Jan 24, 2023 •

edited

Loading