Make the Agent reset immediately after Done #3291
_AgentReset();
m_RequestAction = false;
m_RequestDecision = false;
m_Reward = 0f;
m_CumulativeReward = 0f;
Feels like this could be moved into NotifyAgentDone() (or maybe combine Done and NotifyAgentDone, unless you don't want the user to set maxStepReached).
Moved some things around
@@ -44,7 +44,7 @@ public struct AgentInfo
 /// Unique identifier each agent receives at initialization. It is used
 /// to separate between different agents in the environment.
 /// </summary>
-public int id;
+public int episodeId;
Update this comment too.
@@ -144,7 +144,7 @@ def add_experiences(
 )
 for traj_queue in self.trajectory_queues:
     traj_queue.put(trajectory)
-self.experience_buffers[global_id] = []
+del self.experience_buffers[global_id]
Need to del last_step_result and last_take_action_outputs as well. Can probably do it right after the experience buffer del.
Hmm, unfortunately it seems last_step_result and policy.previous_actions are both modified after the check for done, so even if I delete them, they will be re-added. I need to do more experiments...
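To make the ordering problem concrete, here is a minimal sketch. ToyProcessor is a hypothetical stand-in, not the real processor class; only the attribute names experience_buffers and last_step_result come from this thread, and the method body is an assumption about the ordering described above.

```python
from typing import Any, Dict, List


class ToyProcessor:
    """Hypothetical stand-in for illustration; not the real processor class."""

    def __init__(self) -> None:
        self.experience_buffers: Dict[str, List[Any]] = {}
        self.last_step_result: Dict[str, Any] = {}

    def add_experiences(self, global_id: str, step: Any, done: bool) -> None:
        self.experience_buffers.setdefault(global_id, []).append(step)
        if done:
            # Cleanup happens inside the done check...
            del self.experience_buffers[global_id]
            self.last_step_result.pop(global_id, None)
        # ...but this write runs afterwards and re-creates the entry.
        self.last_step_result[global_id] = step


processor = ToyProcessor()
processor.add_experiences("agent-0", step=1, done=True)
# The experience buffer is gone, yet the per-agent entry is back.
leaked = "agent-0" in processor.last_step_result
```

In this toy version the entry for a finished agent survives every episode, which is the kind of slow growth a naive del inside the done branch would not prevent.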
for terminated_id in terminated_agents:
    self._clean_agent_data(terminated_id)

def _clean_agent_data(self, global_id: str) -> None:
@ervteng Tell me what you think
Looks good - we'll keep an eye out for mem leaks
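For readers following along, the deferred cleanup pattern in the last hunk could look roughly like the sketch below. It assumes cleanup is postponed until every per-agent write for the batch has finished. DeferredCleanupProcessor and the batch tuple format are hypothetical; only terminated_agents, _clean_agent_data, experience_buffers, last_step_result and last_take_action_outputs appear in this thread, and the body of _clean_agent_data here is a guess, since the diff shows only its signature.

```python
from typing import Any, Dict, List, Tuple


class DeferredCleanupProcessor:
    """Hypothetical sketch; only the attribute and method names come from the diff."""

    def __init__(self) -> None:
        self.experience_buffers: Dict[str, List[Any]] = {}
        self.last_step_result: Dict[str, Any] = {}
        self.last_take_action_outputs: Dict[str, Any] = {}

    def add_experiences(self, batch: List[Tuple[str, Any, bool]]) -> None:
        terminated_agents: List[str] = []
        for global_id, step, done in batch:
            self.experience_buffers.setdefault(global_id, []).append(step)
            if done:
                # Only record the id here; do not delete yet.
                terminated_agents.append(global_id)
            # Writes that happen after the done check.
            self.last_step_result[global_id] = step
            self.last_take_action_outputs[global_id] = {"step": step}
        # All writes are finished, so nothing can re-add the entries.
        for terminated_id in terminated_agents:
            self._clean_agent_data(terminated_id)

    def _clean_agent_data(self, global_id: str) -> None:
        """Drop every per-agent entry for a finished agent (assumed behavior)."""
        for store in (
            self.experience_buffers,
            self.last_step_result,
            self.last_take_action_outputs,
        ):
            store.pop(global_id, None)


processor = DeferredCleanupProcessor()
processor.add_experiences([("agent-0", 1, True), ("agent-1", 1, False)])
```

Using pop with a default keeps the helper safe to call even if one of the stores never saw the agent. policy.previous_actions is left out of the sketch because its structure is not shown in the thread.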
No description provided.