Package changes to support deterministic inference #5599

maryamhonari · 2021-10-29T18:08:24Z

Proposed change(s)

Models newly exported from ML-Agents include two additional action tensors that support deterministic action selection during inference with Barracuda. This option is only available in editor mode (with no Python training process attached), via the Stochastic Inference flag.
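
As a rough illustration of what deterministic versus stochastic action selection means, here is a minimal Python sketch. It assumes the common convention that a deterministic policy returns the distribution mean for continuous actions and the arg-max for discrete actions. The function names `select_continuous` and `select_discrete` are hypothetical and are not part of the ML-Agents or Barracuda APIs.

```python
import math
import random
from typing import List, Sequence


def select_continuous(mean: Sequence[float], std: Sequence[float],
                      deterministic: bool, rng: random.Random) -> List[float]:
    """Return the mean when deterministic, otherwise one Gaussian sample per dimension."""
    if deterministic:
        return list(mean)
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def select_discrete(logits: Sequence[float], deterministic: bool,
                    rng: random.Random) -> int:
    """Return the arg-max when deterministic, otherwise sample from softmax(logits)."""
    if deterministic:
        return max(range(len(logits)), key=lambda i: logits[i])
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

With `deterministic=True` the same observation always maps to the same action; with `deterministic=False` repeated calls generally differ.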

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

MLA-2263

Types of change(s)

Checklist

Added tests that prove my fix is effective or that my feature works
Updated the changelog (if applicable)
Updated the documentation (if applicable)
Updated the migration guide (if applicable)

Other comments

miguelalonsojr

High level review before digging into the code:

1. You mentioned this is an editor-only option. Can we make this a runtime option as well? My thinking is that folks may want to package up the policy, set it to deterministic mode, and ship it that way.
2. Let's be consistent between the Python and Unity sides. The default is assumed to be stochastic, so you must add a "deterministic" flag to either the CLI or the config. Let's do the same in Behavior Parameters, i.e. the checkbox should be labeled "Deterministic Policy" and be unchecked by default.

- Add tests for deterministic sampling - update editor and tooltips

maryamhonari · 2021-11-04T16:28:54Z

1. The flag only works within the Unity Editor or in a Unity standalone build. When running from Python with mlagents-learn, --deterministic should be used instead.
2. Done, reverted the flag to "Deterministic Inference".

Thanks
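
For the Python side, the commit log for this feature mentions prioritizing the CLI over the configuration file. A minimal sketch of that precedence rule follows. Only the `--deterministic` flag name comes from this thread; the helper `resolve_deterministic` and the `"deterministic"` config key are assumptions for illustration, and the real mlagents-learn parser may be structured differently.

```python
import argparse
from typing import Any, Dict, List


def resolve_deterministic(argv: List[str], config: Dict[str, Any]) -> bool:
    """Hypothetical helper: a CLI flag, when given, wins over the config value."""
    parser = argparse.ArgumentParser()
    # default=None lets us tell "flag not passed" apart from an explicit True.
    parser.add_argument("--deterministic", action="store_true", default=None)
    args = parser.parse_args(argv)
    if args.deterministic is not None:
        return args.deterministic
    # Fall back to the (assumed) config key; stochastic is the default.
    return bool(config.get("deterministic", False))
```

Because the flag is a plain switch, this sketch can only turn deterministic mode on from the CLI; turning it off when the config enables it would need a separate option.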

miguelalonsojr · 2021-11-10T14:07:44Z

com.unity.ml-agents/Tests/Editor/Inference/ModelRunnerTest.cs

+            modelRunner.DecideBatch();
+            var stochAction2 = (float[])modelRunner.GetAction(1).ContinuousActions.Array.Clone();
+            // Stochastic action selection should output randomly different action values with same obs
+            Assert.IsFalse(Enumerable.SequenceEqual(stochAction1, stochAction2, new FloatThresholdComparer(0.001f)));


I think it would be a good idea to refactor this test into two tests, one for deterministic actions and one for stochastic actions.
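
The suggested split could look roughly like the following, sketched in Python with `unittest` rather than in the C# of ModelRunnerTest.cs. `ToyPolicy` is a hypothetical stand-in for the model runner, and the 0.001 tolerance mirrors the threshold in the snippet above.

```python
import random
import unittest


class ToyPolicy:
    """Hypothetical stand-in for a model runner: one Gaussian action per call."""

    def __init__(self, deterministic: bool, seed: int = 0) -> None:
        self.deterministic = deterministic
        self._rng = random.Random(seed)

    def decide(self, obs: float) -> float:
        mean = 2.0 * obs
        return mean if self.deterministic else self._rng.gauss(mean, 1.0)


class DeterministicActionTest(unittest.TestCase):
    def test_same_obs_gives_same_action(self) -> None:
        policy = ToyPolicy(deterministic=True)
        self.assertAlmostEqual(policy.decide(0.5), policy.decide(0.5), delta=1e-3)


class StochasticActionTest(unittest.TestCase):
    def test_same_obs_gives_different_actions(self) -> None:
        policy = ToyPolicy(deterministic=False)
        self.assertGreater(abs(policy.decide(0.5) - policy.decide(0.5)), 1e-3)
```

Keeping the two cases in separate tests means a failure points directly at the mode that regressed.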

miguelalonsojr

LGTM.

* Init: actor.forward outputs separate deterministic actions
* fix tensor shape for discrete actions
* Add test and editor flag
  - Add tests for deterministic sampling
  - update editor and tooltips
* Reverting to "Deterministic Inference"
* dissect tests
* Update docs

* Progress on propagating the setting to the action model.
* Added the _sample_action logic and tests.
* Add information to the changelog.
* Prioritize the CLI over the configuration file.
* Update documentation for config file.
* CR refactor.
* Update docs/Training-Configuration-File.md
  Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
  Update com.unity.ml-agents/CHANGELOG.md
  Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
  Update com.unity.ml-agents/CHANGELOG.md
  Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
  Update com.unity.ml-agents/CHANGELOG.md
  Co-authored-by: Maryam Honari <honari.m94@gmail.com>
  Update ml-agents/mlagents/trainers/settings.py
  Co-authored-by: Maryam Honari <honari.m94@gmail.com>
  Update ml-agents/mlagents/trainers/cli_utils.py
  Co-authored-by: Maryam Honari <honari.m94@gmail.com>
* Fix CR requests
* Add tests for discrete.
* Update ml-agents/mlagents/trainers/torch/distributions.py
  Co-authored-by: Maryam Honari <honari.m94@gmail.com>
* Added more stable test.
* Return deterministic actions for training (#5615)
* Added more stable test.
* Fix the tests.
* Fix pre-commit
* Fix help line to pass precommit.
* support for deterministic inference in onnx (#5593)
* Init: actor.forward outputs separate deterministic actions
* changelog
* Renaming
* Add more tests
* Package changes to support deterministic inference (#5599)
* Init: actor.forward outputs separate deterministic actions
* fix tensor shape for discrete actions
* Add test and editor flag
  - Add tests for deterministic sampling
  - update editor and tooltips
* Reverting to "Deterministic Inference"
* dissect tests
* Update docs
* Update CHANGELOG.md
* Fix the deterministic showing up all the tiime (#5621)

Co-authored-by: Chingiz Mardanov <chingiz.mardanov@unity3d.com>
Co-authored-by: cmard <87716492+cmard@users.noreply.github.com>

maryamhonari added 4 commits October 22, 2021 10:34

Init: actor.forward outputs separate deterministic actions

b340c85

fix tensor shape for discrete actions

07c11d8

clean up

b047e5d

changelog

085e56e

maryamhonari marked this pull request as ready for review November 2, 2021 14:21

maryamhonari requested review from miguelalonsojr and jrupert-unity November 2, 2021 16:23

miguelalonsojr requested changes Nov 2, 2021

maryamhonari added 5 commits November 2, 2021 16:29

Renaming

fb7849f

Add more tests

afa4d83

set stochasticInference in package

60e4206

Add test and editor flag

ec95350

- Add tests for deterministic sampling - update editor and tooltips

Reverting to "Deterministic Inference"

b241118

maryamhonari force-pushed the develop-deterministic-policy-serialize branch from 6d609e2 to b241118 Compare November 3, 2021 19:39

maryamhonari added 3 commits November 3, 2021 12:43

formatting

0153d01

update tooltip/warning messages

32f38af

cleanup

bdb8c26

maryamhonari requested review from miguelalonsojr and cmard November 4, 2021 16:29

miguelalonsojr requested changes Nov 10, 2021

cmard force-pushed the deterministic-actions-python-training branch from bf15d2e to 604d7c1 Compare November 15, 2021 23:06

maryamhonari added 2 commits November 15, 2021 15:26

dissect tests

f7fd062

Merge featurebranch

5163791

maryamhonari requested a review from miguelalonsojr November 16, 2021 21:55

cmard approved these changes Nov 16, 2021

miguelalonsojr approved these changes Nov 17, 2021

maryamhonari added 3 commits November 18, 2021 07:27

merge feature branch

376145e

Update docs

9b3520f

Update docs

8856c3a

maryamhonari force-pushed the develop-deterministic-policy-serialize branch from 55bfe70 to 8856c3a Compare November 18, 2021 15:36

maryamhonari merged commit 60a2b26 into deterministic-actions-python-training Nov 18, 2021

delete-merged-branch bot deleted the develop-deterministic-policy-serialize branch November 18, 2021 15:39

maryamhonari mentioned this pull request Nov 18, 2021

Deterministic actions python training #5619

Merged

10 tasks

maryamhonari mentioned this pull request Nov 30, 2021

Deterministic actions python training #5626

Merged

10 tasks

github-actions bot locked as resolved and limited conversation to collaborators Nov 18, 2022
