
[RLlib] Evaluation logic needs clean-up #44595

Closed
simonsays1980 opened this issue Apr 9, 2024 · 2 comments · Fixed by #45652
Labels
bug Something that is supposed to be working; but isn't P1 Issue that should be fixed within a few weeks rllib RLlib related issues

Comments

@simonsays1980
Collaborator

What happened + What you expected to happen

What happened

I tested evaluation of algorithms and noted two things:

  1. Algorithm.evaluate() does not count agent steps and environment steps if a custom evaluation function is used.
  2. The evaluation worker set is not None, even if evaluation_num_workers=0.

What you expected to happen

That metrics are collected identically when a custom evaluation function is used, and that evaluation_num_workers=0 lets us run the evaluation on the local worker in Algorithm.workers.

Versions / Dependencies

Ray nightly
Python 3.9.12
Fedora Linux 39

Reproduction script

Run our custom evaluation example.
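
The exact example script is not linked here; as a stand-in, a minimal sketch of the setup (the custom_eval_fn body is an assumption, not the actual example script):

from ray.rllib.algorithms.ppo import PPOConfig


def custom_eval_fn(algorithm, eval_workers):
    # Hypothetical minimal custom evaluation: sample once on the local
    # eval worker and return an (empty) metrics dict.
    eval_workers.local_worker().sample()
    return {}


config = (
    PPOConfig()
    .environment("CartPole-v1")
    .evaluation(
        evaluation_interval=1,
        evaluation_num_env_runners=0,
        custom_evaluation_function=custom_eval_fn,
    )
)

algo = config.build()
results = algo.evaluate()
# Observed issue: agent/env step counters do not get incremented when the
# custom evaluation function is used.
print(results)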

Issue Severity

Medium: It is a significant difficulty but I can work around it.

@simonsays1980 simonsays1980 added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Apr 9, 2024
@anyscalesam anyscalesam added the rllib RLlib related issues label Apr 15, 2024
@sven1977
Contributor

On the second point:

The evaluation worker set is not None, even if evaluation_num_workers=0.

This is the expected behavior as long as evaluation_interval != 0.

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")
    .evaluation(
        evaluation_num_env_runners=0,
    )
)

algo = config.build()

print(algo.evaluation_workers is None)

This should print True. However, if I change evaluation_interval to 1 (the default is 0), RLlib will create an eval WorkerSet with num_workers=0 (only 1 local worker, no remote workers).
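
For completeness, a hedged variant of the same snippet with evaluation_interval set to 1, which should then build the eval WorkerSet (local worker only):

from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")
    .evaluation(
        # Non-zero interval -> eval WorkerSet is created even with
        # 0 remote env runners (only the local worker).
        evaluation_interval=1,
        evaluation_num_env_runners=0,
    )
)

algo = config.build()

# Now expected to print False: the eval WorkerSet exists.
print(algo.evaluation_workers is None)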

@simonsays1980 simonsays1980 added P1 Issue that should be fixed within a few weeks enhancement Request for new feature and/or capability and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) enhancement Request for new feature and/or capability labels Apr 27, 2024
@simonsays1980 simonsays1980 self-assigned this Apr 27, 2024
@simonsays1980
Collaborator Author

@sven1977 Are agent and env steps now counted in a custom evaluation function with the new MetricsLogger?
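
For reference, a rough sketch of how step counting from a custom evaluation function might look with the MetricsLogger; the use of algorithm.metrics, the metric keys, and the reduce mode are assumptions here, not confirmed behavior:

from ray.rllib.utils.metrics import (
    NUM_AGENT_STEPS_SAMPLED,
    NUM_ENV_STEPS_SAMPLED,
)


def custom_eval_fn(algorithm, eval_workers):
    # Sample on the local eval worker (assumes evaluation_num_env_runners=0) ...
    batch = eval_workers.local_worker().sample()
    # ... and log env/agent steps through the algorithm's MetricsLogger so they
    # get aggregated like in the default evaluation path (assumption).
    algorithm.metrics.log_value(NUM_ENV_STEPS_SAMPLED, batch.env_steps(), reduce="sum")
    algorithm.metrics.log_value(NUM_AGENT_STEPS_SAMPLED, batch.agent_steps(), reduce="sum")
    return {}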
