Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return deterministic actions #5597

Merged

Conversation

cmard
Copy link
Contributor

@cmard cmard commented Oct 27, 2021

Proposed change(s)

This PR will add an ability to retrieve actions deterministically based on the input to the model. A new run-options configuration has been added as well as a new CLI flag --deterministic.

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

@cmard cmard changed the base branch from main to deterministic-actions-python-training October 28, 2021 15:58
Copy link
Collaborator

@miguelalonsojr miguelalonsojr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see CR feedback.

com.unity.ml-agents/CHANGELOG.md Outdated Show resolved Hide resolved
com.unity.ml-agents/CHANGELOG.md Outdated Show resolved Hide resolved
docs/Training-Configuration-File.md Outdated Show resolved Hide resolved
@@ -66,22 +67,31 @@ def __init__(
# During training, clipping is done in TorchPolicy, but we need to clip before ONNX
# export as well.
self._clip_action_on_export = not tanh_squash
self.deterministic = deterministic
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless it's going to be used outside of the ActionModel class, refactor to make it
self._deterministic.

cmard and others added 4 commits October 29, 2021 13:10
Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Copy link
Contributor

@maryamhonari maryamhonari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great! Some nit changes on docs and I believe the setting test for yaml is not set right and I wonder how it's passing CI.

com.unity.ml-agents/CHANGELOG.md Outdated Show resolved Hide resolved
ml-agents/mlagents/trainers/settings.py Outdated Show resolved Hide resolved
ml-agents/mlagents/trainers/cli_utils.py Outdated Show resolved Hide resolved
@@ -32,6 +32,7 @@ def __init__(
action_spec: ActionSpec,
conditional_sigma: bool = False,
tanh_squash: bool = False,
deterministic: bool = False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update the docstring

@@ -541,6 +542,7 @@ def test_default_settings():
test1_settings = run_options.behaviors["test1"]
assert test1_settings.max_steps == 2
assert test1_settings.network_settings.hidden_units == 2000
assert not test1_settings.network_settings.deterministic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we use == True just for readability

agent_action1 = action_model._sample_action(dists)
agent_action2 = action_model._sample_action(dists)
agent_action3 = action_model._sample_action(dists)
assert torch.equal(agent_action1.continuous_tensor, agent_action2.continuous_tensor)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some tests on discrete actions would be great!

@@ -541,6 +542,7 @@ def test_default_settings():
test1_settings = run_options.behaviors["test1"]
assert test1_settings.max_steps == 2
assert test1_settings.network_settings.hidden_units == 2000
assert not test1_settings.network_settings.deterministic
Copy link
Contributor

@maryamhonari maryamhonari Oct 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this test is wrong. Above we set deterministic: true in yaml, so it should be assert test1_settings.network_settings.deterministic == True, right
?

cmard and others added 5 commits November 1, 2021 09:33
Co-authored-by: Maryam Honari <honari.m94@gmail.com>
Co-authored-by: Maryam Honari <honari.m94@gmail.com>
Co-authored-by: Maryam Honari <honari.m94@gmail.com>
…y-Technologies/ml-agents into develop-staging-determinstic-action
cmard and others added 2 commits November 3, 2021 13:53
@cmard cmard merged commit 98da4b1 into deterministic-actions-python-training Nov 15, 2021
@delete-merged-branch delete-merged-branch bot deleted the develop-staging-determinstic-action branch November 15, 2021 22:43
@maryamhonari maryamhonari restored the develop-staging-determinstic-action branch November 15, 2021 22:55
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants