-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass action masker as input to CollectObservations #3389
Conversation
I am rather unsure about the doubling of the CollectObservation methods (and the copy pasta that comes along) Need to edit the documentation and the migrating doc once we agree we want to do this
/// to the action the agent will be unable to perform. | ||
/// </summary> | ||
/// <param name="actionIndices">The indices of the masked actions on branch 0</param> | ||
protected void SetActionMask(IEnumerable<int> actionIndices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be public, right? They were protected on the Agent, but now they need to be called from outside of the Agent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right
/// When using Discrete Control, you can prevent the Agent from using a certain | ||
/// action by masking it. You can call the following method on the ActionMasker | ||
/// input : | ||
/// - <see cref="SetActionMask(int branch, IEnumerable<int> actionIndices)"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Links to other overloads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Seems reasonable. It makes sense to handle these together, since the observations might affect the possible actions. The name "CollectObservations" sounds less appropriate now, but I don't have a better suggest. |
@@ -19,6 +19,43 @@ internal ActionMasker(BrainParameters brainParameters) | |||
m_BrainParameters = brainParameters; | |||
} | |||
|
|||
/// <summary> | |||
/// Sets an action mask for discrete control agents. When used, the agent will not be | |||
/// able to perform the action passed as argument at the next decision. If no branch is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this doc, can you just say that for this method the branch is assumed to be 0 instead of saying "if no branch is specified."
|
||
/// <summary> | ||
/// Sets an action mask for discrete control agents. When used, the agent will not be | ||
/// able to perform the action passed as argument at the next decision. If no branch is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, you cannot specify a branch, so just say it's assumed to be 0.
What's the motivation behind this PR? I'm a bit confused 🙃 |
Trying to do the same thing as we did for the VectorSensor but with masked actions. This will make sure that the user will only be able to call SetActionMask when CollectObservations is called. |
I am rather unsure about the doubling of the CollectObservation methods (and the copy pasta that comes along)
Need to edit the documentation and the migrating doc once we agree we want to do this