Return deterministic actions #5597

cmard · 2021-10-27T17:29:21Z

Proposed change(s)

This PR adds the ability to retrieve actions deterministically based on the input to the model. It adds a new run-options configuration setting and a new CLI flag, --deterministic.
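For readers unfamiliar with the idea, here is a minimal sketch of what "deterministic" could mean for a continuous (Gaussian) policy. It assumes the deterministic action is the distribution mean, which is a common choice but is not confirmed by this description. The function name `select_continuous_action` and the plain-list representation are hypothetical and are not ML-Agents code.

```python
import random


def select_continuous_action(mean, std, deterministic, rng):
    """Return the mean when deterministic, else a Gaussian sample around it.

    Assumption: "deterministic" means "use the distribution mean".
    """
    if deterministic:
        # Same input (mean) always yields the same action.
        return list(mean)
    # Stochastic path: draw one sample per action dimension.
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


rng = random.Random(0)
mean, std = [0.5, -1.0], [0.1, 0.2]

deterministic_a = select_continuous_action(mean, std, True, rng)
deterministic_b = select_continuous_action(mean, std, True, rng)
stochastic = select_continuous_action(mean, std, False, rng)
```

With the flag on, repeated calls with the same `mean` return identical actions. With it off, each call draws a fresh sample.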

Types of change(s)

Checklist

Added tests that prove my fix is effective or that my feature works
Updated the changelog (if applicable)
Updated the documentation (if applicable)
Updated the migration guide (if applicable)

Other comments

miguelalonsojr

Please see CR feedback.

com.unity.ml-agents/CHANGELOG.md

docs/Training-Configuration-File.md

miguelalonsojr · 2021-10-29T15:05:27Z

ml-agents/mlagents/trainers/torch/action_model.py

@@ -66,22 +67,31 @@ def __init__(
        # During training, clipping is done in TorchPolicy, but we need to clip before ONNX
        # export as well.
        self._clip_action_on_export = not tanh_squash
+        self.deterministic = deterministic


Unless it's going to be used outside of the ActionModel class, refactor to make it self._deterministic.
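A rough sketch of the requested rename follows. `ActionModelSketch` is a hypothetical stand-in that keeps only the two arguments relevant here (the real constructor in the diff also takes `action_spec` and `conditional_sigma`). The body of `_sample_action` is an illustrative assumption, not the PR's implementation.

```python
class ActionModelSketch:
    """Hypothetical stand-in for ActionModel; only the flag handling is shown."""

    def __init__(self, tanh_squash: bool = False, deterministic: bool = False):
        self._clip_action_on_export = not tanh_squash
        # Leading underscore: the flag is an implementation detail of the class.
        self._deterministic = deterministic

    def _sample_action(self, mean: float, sample: float) -> float:
        # Illustrative only: pick the mean when deterministic, else the sample.
        return mean if self._deterministic else sample


model = ActionModelSketch(deterministic=True)
action = model._sample_action(mean=1.0, sample=2.0)
```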


maryamhonari

Looks great! Some nit changes on the docs. I also believe the settings test for the YAML is not set up right, and I wonder how it's passing CI.

com.unity.ml-agents/CHANGELOG.md

ml-agents/mlagents/trainers/settings.py

ml-agents/mlagents/trainers/cli_utils.py

maryamhonari · 2021-10-29T18:53:46Z

ml-agents/mlagents/trainers/torch/action_model.py

@@ -32,6 +32,7 @@ def __init__(
        action_spec: ActionSpec,
        conditional_sigma: bool = False,
        tanh_squash: bool = False,
+        deterministic: bool = False,


Please update the docstring.

maryamhonari · 2021-10-29T18:56:23Z

ml-agents/mlagents/trainers/tests/test_settings.py

@@ -541,6 +542,7 @@ def test_default_settings():
    test1_settings = run_options.behaviors["test1"]
    assert test1_settings.max_steps == 2
    assert test1_settings.network_settings.hidden_units == 2000
+    assert not test1_settings.network_settings.deterministic


nit: can we use == True just for readability?

maryamhonari · 2021-10-29T19:24:35Z

ml-agents/mlagents/trainers/tests/torch/test_action_model.py

+    agent_action1 = action_model._sample_action(dists)
+    agent_action2 = action_model._sample_action(dists)
+    agent_action3 = action_model._sample_action(dists)
+    assert torch.equal(agent_action1.continuous_tensor, agent_action2.continuous_tensor)


Some tests on discrete actions would be great!
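As a rough idea of what such a test might check, the sketch below mirrors the continuous test above for discrete branches. It assumes the deterministic choice is the highest-probability index in each branch. `select_discrete_actions` is a hypothetical pure-Python helper, not the `_sample_action` method used in the PR.

```python
import random


def select_discrete_actions(branch_probs, deterministic, rng):
    """Pick one action index per branch.

    Assumption: deterministic mode takes the argmax of each branch's
    probabilities; stochastic mode samples in proportion to them.
    """
    actions = []
    for probs in branch_probs:
        indices = range(len(probs))
        if deterministic:
            actions.append(max(indices, key=probs.__getitem__))
        else:
            actions.append(rng.choices(indices, weights=probs, k=1)[0])
    return actions


def test_deterministic_discrete_actions():
    rng = random.Random()  # unseeded on purpose: the result must not depend on it
    branch_probs = [[0.1, 0.7, 0.2], [0.6, 0.4]]
    action1 = select_discrete_actions(branch_probs, True, rng)
    action2 = select_discrete_actions(branch_probs, True, rng)
    action3 = select_discrete_actions(branch_probs, True, rng)
    assert action1 == action2 == action3


test_deterministic_discrete_actions()
```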

maryamhonari · 2021-10-29T19:31:10Z

ml-agents/mlagents/trainers/tests/test_settings.py

@@ -541,6 +542,7 @@ def test_default_settings():
    test1_settings = run_options.behaviors["test1"]
    assert test1_settings.max_steps == 2
    assert test1_settings.network_settings.hidden_units == 2000
+    assert not test1_settings.network_settings.deterministic


IIUC this test is wrong. Above we set deterministic: true in the YAML, so it should be assert test1_settings.network_settings.deterministic == True, right?
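To make the point concrete, here is a small self-contained sketch of the assertion being asked for. It uses a dict in place of the parsed YAML and a hypothetical `NetworkSettingsSketch` dataclass. The real settings classes in `settings.py` are not reproduced here, and the default values shown are assumptions.

```python
from dataclasses import dataclass


@dataclass
class NetworkSettingsSketch:
    """Hypothetical stand-in for the network settings; defaults are assumed."""

    hidden_units: int = 128
    deterministic: bool = False


def load_network_settings(raw: dict) -> NetworkSettingsSketch:
    # Missing keys fall back to the dataclass defaults.
    return NetworkSettingsSketch(**raw.get("network_settings", {}))


# Stand-in for a YAML config that sets `deterministic: true`.
parsed_yaml = {
    "max_steps": 2,
    "network_settings": {"hidden_units": 2000, "deterministic": True},
}

settings = load_network_settings(parsed_yaml)
assert settings.hidden_units == 2000
assert settings.deterministic is True  # the flag was set, so `assert not ...` would fail

# With nothing set, the default should be off.
assert load_network_settings({}).deterministic is False
```

If the original `assert not ...` passes in CI, one possible explanation is that the value from the YAML never reaches the settings object, which would be worth checking.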


ml-agents/mlagents/trainers/torch/distributions.py


cmard added 2 commits October 27, 2021 09:54

Progress on propagating the setting to the action model.

7e7c3e2

Added the _sample_action logic and tests.

824f54b

cmard requested review from maryamhonari and miguelalonsojr October 27, 2021 17:29

cmard added 2 commits October 27, 2021 13:32

Add information to the changelog.

3e1a60a

Prioritize the CLI over the configuration file.

2918be6
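The commit "Prioritize the CLI over the configuration file" describes a precedence rule. One way such a rule could be sketched with the standard library is shown below. This is not the code from `cli_utils.py`: the function `resolve_deterministic` and the flat config dict are hypothetical, and only the `--deterministic` flag name comes from the PR description.

```python
import argparse


def resolve_deterministic(config: dict, argv: list) -> bool:
    """Merge a config value with the CLI flag, letting the CLI win."""
    parser = argparse.ArgumentParser()
    # SUPPRESS leaves the attribute out of the namespace unless the flag is
    # actually passed, so "not given" is distinguishable from "False".
    parser.add_argument(
        "--deterministic", action="store_true", default=argparse.SUPPRESS
    )
    cli_values = vars(parser.parse_args(argv))
    # Later entries override earlier ones: default < config file < CLI.
    merged = {"deterministic": False, **config, **cli_values}
    return bool(merged["deterministic"])


from_default = resolve_deterministic({}, [])
from_config = resolve_deterministic({"deterministic": True}, [])
from_cli = resolve_deterministic({"deterministic": False}, ["--deterministic"])
```

Note that a `store_true` flag can only switch the option on. In this sketch the CLI cannot turn off a value that the config file turned on.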

cmard changed the base branch from main to deterministic-actions-python-training October 28, 2021 15:58

Update documentation for config file.

f1d0965

miguelalonsojr requested changes Oct 29, 2021


cmard and others added 4 commits October 29, 2021 13:10

CR refactor.

646498e

Update docs/Training-Configuration-File.md

78fd1c8

Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>

Update com.unity.ml-agents/CHANGELOG.md

6e43451

Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>

Update com.unity.ml-agents/CHANGELOG.md

4b6808f

Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>

maryamhonari suggested changes Oct 29, 2021


cmard and others added 5 commits November 1, 2021 09:33

Update com.unity.ml-agents/CHANGELOG.md

a507c7d

Co-authored-by: Maryam Honari <honari.m94@gmail.com>

Update ml-agents/mlagents/trainers/settings.py

5b059fd

Co-authored-by: Maryam Honari <honari.m94@gmail.com>

Update ml-agents/mlagents/trainers/cli_utils.py

c8eb7a9

Co-authored-by: Maryam Honari <honari.m94@gmail.com>

Fix CR requests

283ed15

Merge branch 'develop-staging-determinstic-action' of github.com:Unit…

0267d3d

…y-Technologies/ml-agents into develop-staging-determinstic-action

miguelalonsojr mentioned this pull request Nov 2, 2021

support for deterministic inference in onnx #5593

Merged

10 tasks

maryamhonari reviewed Nov 3, 2021


ml-agents/mlagents/trainers/torch/distributions.py (outdated, resolved)

cmard and others added 2 commits November 3, 2021 13:53

Add tests for discrete.

0431025

Update ml-agents/mlagents/trainers/torch/distributions.py

98da4b1

Co-authored-by: Maryam Honari <honari.m94@gmail.com>

cmard merged commit 98da4b1 into deterministic-actions-python-training Nov 15, 2021

delete-merged-branch bot deleted the develop-staging-determinstic-action branch November 15, 2021 22:43

maryamhonari restored the develop-staging-determinstic-action branch November 15, 2021 22:55

maryamhonari mentioned this pull request Nov 18, 2021

Deterministic actions python training #5619

Merged

10 tasks

maryamhonari mentioned this pull request Nov 30, 2021

Deterministic actions python training #5626

Merged

10 tasks

github-actions bot locked as resolved and limited conversation to collaborators Nov 16, 2022
