Closed
Labels: P1, bug, community-backlog, rllib, rllib-envrunners, stability
Description
What happened + What you expected to happen
When a MultiAgentEnvRunner samples in episode-collection mode (i.e. with num_episodes set), the for-loop in _sample contains an early-out:
    # Also early-out if we reach the number of episodes within this
    # for-loop.
    if eps == num_episodes:
        break
The break skips the creation of a new episode for the slot that just finished. As a side effect, done_episodes_to_run_env_to_module becomes a subset of the episodes list. Normally the two are disjoint, because each finished episode in the episodes list is replaced by a freshly created one in the same slot. The overlap becomes a problem slightly later in the same call, where the env-to-module connectors are invoked first with done_episodes_to_run_env_to_module and then with episodes:
    if done_episodes_to_run_env_to_module:
        # Run the env-to-module connector pipeline for all done episodes.
        # Note, this is needed to postprocess last-step data, e.g. if the
        # user uses a connector that one-hot encodes observations.
        # Note, this pipeline run is not timed as the number of episodes
        # can differ from `num_envs_per_env_runner` and would bias time
        # measurements.
        self._env_to_module(
            episodes=done_episodes_to_run_env_to_module,
            explore=explore,
            rl_module=self.module,
            shared_data=shared_data,
            metrics=None,
        )
    self._cached_to_module = self._env_to_module(
        episodes=episodes,
        explore=explore,
        rl_module=self.module,
        shared_data=shared_data,
        metrics=self.metrics,
        metrics_prefix_key=(ENV_TO_MODULE_CONNECTOR,),
    )
As a result, the connectors receive the done episodes a second time and have to deal with data they have already processed.
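The effect can be illustrated without Ray by a small toy model. The sketch below is not RLlib code: ToyEpisode, CountingConnector, toy_sample_step and run are hypothetical names invented for illustration, and the loop is a heavily simplified stand-in for the per-env loop in _sample. It only shows how the early-out leaves a done episode in its slot, so that a connector called first on the done episodes and then on all episodes sees that episode twice.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEpisode:
    """Hypothetical stand-in for a multi-agent episode (not an RLlib class)."""
    id: int
    is_done: bool = False


@dataclass
class CountingConnector:
    """Hypothetical connector that counts how often it is handed each episode."""
    seen: dict = field(default_factory=dict)

    def __call__(self, episodes):
        for ep in episodes:
            self.seen[ep.id] = self.seen.get(ep.id, 0) + 1


def toy_sample_step(episodes, done_flags, eps, num_episodes, next_id, early_out):
    """Simplified model of one pass over the env slots.

    Returns the list of done episodes collected during this pass.
    """
    done_episodes = []
    for env_index, done in enumerate(done_flags):
        if not done:
            continue
        episodes[env_index].is_done = True
        done_episodes.append(episodes[env_index])
        eps += 1
        if early_out and eps == num_episodes:
            # Early-out: the slot still holds the done episode.
            break
        # Normal path: replace the finished episode with a fresh one.
        episodes[env_index] = ToyEpisode(id=next_id)
        next_id += 1
    return done_episodes


def run(early_out):
    episodes = [ToyEpisode(id=0), ToyEpisode(id=1)]
    done_episodes = toy_sample_step(
        episodes,
        done_flags=[True, False],  # only slot 0 finishes in this step
        eps=0,
        num_episodes=1,
        next_id=2,
        early_out=early_out,
    )
    connector = CountingConnector()
    # Same call order as in the snippet above: done episodes first, then all.
    if done_episodes:
        connector(done_episodes)
    connector(episodes)
    return connector.seen


print(run(early_out=True))   # episode 0 is counted twice
print(run(early_out=False))  # every episode is counted once
```

In this toy model, the early-out run hands episode 0 to the connector in both calls, while the run without the early-out keeps the two lists disjoint. Whether the real fix should recreate the episode before breaking or filter the done episodes out of the second call is left open here.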
Versions / Dependencies
We use ray[rllib]==2.44.1, but the issue also appears to be present in the current code on GitHub.
Reproduction script
See description above.
Issue Severity
None