[Tune|RLlib] PB2 scheduler runs into error when learning rate is used as hyperparameter. #42180

Closed
simonsays1980 opened this issue Jan 4, 2024 · 0 comments · Fixed by #42181
Labels
bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component)

Comments

@simonsays1980
Collaborator

What happened + What you expected to happen

What happened

I used the PB2 scheduler to tune the hyperparameters of a custom Algorithm, including the learning rate. Whenever Bayesian Optimization explored a new config, the trial failed with the following error:

Traceback (most recent call last):
  File "python/ray/_raylet.pyx", line 1590, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 1683, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 1596, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 1536, in ray._raylet.execute_task.function_executor
  File "/home/simon/git-projects/rllib/.venv-nightly/lib/python3.9/site-packages/ray/_private/function_manager.py", line 726, in actor_method_executor
    return method(__ray_actor, *args, **kwargs)
  File "/home/simon/git-projects/rllib/.venv-nightly/lib/python3.9/site-packages/ray/util/tracing/tracing_helper.py", line 464, in _resume_span
    return method(self, *_args, **_kwargs)
  File "/home/simon/git-projects/rllib/.venv-nightly/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm.py", line 437, in __init__
    config.validate()
  File "/home/simon/git-projects/rllib/.venv-nightly/lib/python3.9/site-packages/ray/rllib/algorithms/ppo/ppo.py", line 315, in validate
    super().validate()
  File "/home/simon/git-projects/rllib/.venv-nightly/lib/python3.9/site-packages/ray/rllib/algorithms/pg/pg.py", line 100, in validate
    super().validate()
  File "/home/simon/git-projects/rllib/.venv-nightly/lib/python3.9/site-packages/ray/rllib/algorithms/algorithm_config.py", line 904, in validate
    Scheduler.validate(
  File "/home/simon/git-projects/rllib/.venv-nightly/lib/python3.9/site-packages/ray/rllib/utils/schedules/scheduler.py", line 96, in validate
    raise ValueError(
ValueError: Invalid `lr` (0.00016038432659115642) specified! Must be a list of at least 2 tuples, each of the form (`timestep`, `learning rate to reach`), e.g. `[(0, 0.001), (1e6, 0.0001), (2e6, 0.00005)]`.
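
To make the error message easier to read, here is a minimal sketch of the kind of check it describes. This is not the RLlib implementation: `validate_lr_setting` is a hypothetical name, and the rule that a plain `int` or `float` counts as a fixed learning rate is an assumption. Under that assumption, one way the scalar in the traceback could be rejected is that it arrives as a numeric type other than a built-in `float`. The report does not confirm this, so `Fraction` is used below purely as a stand-in for such a type.

```python
from fractions import Fraction


def validate_lr_setting(value, name="lr"):
    """Hypothetical check, modeled only on the error message in this report."""
    # Assumption: None or a plain built-in number is a fixed learning rate.
    if value is None:
        return
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return
    # Otherwise the value must be a schedule: at least two
    # (timestep, value-to-reach) pairs.
    is_schedule = (
        isinstance(value, (list, tuple))
        and len(value) >= 2
        and all(isinstance(p, (list, tuple)) and len(p) == 2 for p in value)
    )
    if not is_schedule:
        raise ValueError(
            f"Invalid `{name}` ({value}) specified! Must be a list of at "
            "least 2 tuples, each of the form (`timestep`, `value to reach`)."
        )


# A built-in float and a two-point schedule both pass.
validate_lr_setting(0.00016038432659115642)
validate_lr_setting([(0, 0.001), (1e6, 0.0001)])

try:
    # Fraction stands in for any numeric type that is not a built-in float.
    validate_lr_setting(Fraction(1, 1000))
except ValueError as err:
    print(err)
```

In this sketch the value is numerically fine and is rejected only because of its type, which would match the symptom above if the assumption holds.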

What you expected to happen

PB2, as a major hyperparameter search algorithm, should handle such errors gracefully, either by skipping the failed trial and starting a new one (see issue #40787) or by not failing on such an important hyperparameter in the first place.
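
As an illustration of the second option, a possible workaround is to normalize explored values to built-in Python numbers before the config reaches validation. The helper below is a sketch under assumptions, not part of Ray: `to_builtin_numbers` is a hypothetical name, the config is assumed to be a nested dict, and whether a non-built-in numeric type is the actual cause is not established in this report.

```python
import numbers
from fractions import Fraction


def to_builtin_numbers(config):
    """Recursively convert numeric entries to built-in int/float (hypothetical helper)."""
    if isinstance(config, dict):
        return {k: to_builtin_numbers(v) for k, v in config.items()}
    if isinstance(config, (list, tuple)):
        # Rebuild the same container type so schedules stay lists or tuples.
        return type(config)(to_builtin_numbers(v) for v in config)
    if isinstance(config, bool):
        # bool is an Integral subtype; keep flags untouched.
        return config
    if isinstance(config, numbers.Integral):
        return int(config)
    if isinstance(config, numbers.Real):
        return float(config)
    return config


# Fraction stands in for a non-built-in real number produced during exploration.
explored = {"lr": Fraction(1, 8), "train_batch_size": 4000, "use_gae": True}
print(to_builtin_numbers(explored))
```

Where such a conversion would belong (in the scheduler's explore step or in user code) is left open here.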

Versions / Dependencies

Ray nightly version
Fedora 37
Python 3.9.12

Reproduction script

Use the example given in the docs and use the learning rate as a hyperparameter to tune.

Issue Severity

High: It blocks me from completing my task.

@simonsays1980 simonsays1980 added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Jan 4, 2024
@simonsays1980 simonsays1980 changed the title [Tune|RLlib] PB2 scheduler runs into error when learning rate is used as hpyerparameter. [Tune|RLlib] PB2 scheduler runs into error when learning rate is used as hyperparameter. Jan 4, 2024