Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Package changes to support deterministic inference #5599

Conversation

maryamhonari
Copy link
Contributor

@maryamhonari maryamhonari commented Oct 29, 2021

Proposed change(s)

New models in ml-agents are exported with 2 new action tensors to support deterministic action selection during inference with barracuda. This option is only available in editor mode (with no python training attached) using Stochastic Inference flag.
Screen Shot 2021-11-03 at 11 51 26 AM

Screen Shot 2021-11-04 at 8 23 42 AM

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

MLA-2263

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

@maryamhonari maryamhonari marked this pull request as ready for review November 2, 2021 14:21
Copy link
Collaborator

@miguelalonsojr miguelalonsojr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High level review before digging into the code:

  1. You mentioned this is an editor only option. Can we make this a runtime option as well? My thinking here is that folks may want to package up the policy and set it to deterministic mode and ship it that way.
  2. Let's be consistent between the python and Unity side. The default is assumed to be Stochastic, so you must add a "deterministic" flag to either the CLI or config. Let's do the same in Behavior Parameters, i.e. the checkbox should be labeled "Deterministic Policy" and have it unchecked. by default.

@maryamhonari maryamhonari force-pushed the develop-deterministic-policy-serialize branch from 6d609e2 to b241118 Compare November 3, 2021 19:39
@maryamhonari
Copy link
Contributor Author

  1. The flag only works within unity editor or unity standalone after build. If it's used from python mlagents-learn, --deterministic should be used.
  2. 2.Done, reverted the flag to deterministic inference.

Thanks

modelRunner.DecideBatch();
var stochAction2 = (float[])modelRunner.GetAction(1).ContinuousActions.Array.Clone();
// Stochastic action selection should output randomly different action values with same obs
Assert.IsFalse(Enumerable.SequenceEqual(stochAction1, stochAction2, new FloatThresholdComparer(0.001f)));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a good idea to refactor this test into two tests, one for deterministic actions and one for stochastic actions.

@cmard cmard force-pushed the deterministic-actions-python-training branch from bf15d2e to 604d7c1 Compare November 15, 2021 23:06
Copy link
Collaborator

@miguelalonsojr miguelalonsojr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@maryamhonari maryamhonari force-pushed the develop-deterministic-policy-serialize branch from 55bfe70 to 8856c3a Compare November 18, 2021 15:36
@maryamhonari maryamhonari merged commit 60a2b26 into deterministic-actions-python-training Nov 18, 2021
@delete-merged-branch delete-merged-branch bot deleted the develop-deterministic-policy-serialize branch November 18, 2021 15:39
maryamhonari added a commit that referenced this pull request Nov 18, 2021
* Init: actor.forward outputs separate deterministic actions

* fix tensor shape for discrete actions

* Add test and editor flag

- Add tests for deterministic sampling
- update editor and tooltips

* Reverting to "Deterministic Inference"

* dissect tests

* Update docs
maryamhonari added a commit that referenced this pull request Nov 18, 2021
* Init: actor.forward outputs separate deterministic actions

* fix tensor shape for discrete actions

* Add test and editor flag

- Add tests for deterministic sampling
- update editor and tooltips

* Reverting to "Deterministic Inference"

* dissect tests

* Update docs
maryamhonari added a commit that referenced this pull request Nov 18, 2021
* Progress on propagating the setting to the action model.

* Added the _sample_action logic and tests.

* Add information to the changelog.

* Prioritize the CLI over the configuration file.

* Update documentation for config file.

* CR refactor.

* Update docs/Training-Configuration-File.md

Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Update com.unity.ml-agents/CHANGELOG.md

Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Update com.unity.ml-agents/CHANGELOG.md

Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Update com.unity.ml-agents/CHANGELOG.md

Co-authored-by: Maryam Honari <honari.m94@gmail.com>
Update ml-agents/mlagents/trainers/settings.py

Co-authored-by: Maryam Honari <honari.m94@gmail.com>
Update ml-agents/mlagents/trainers/cli_utils.py

Co-authored-by: Maryam Honari <honari.m94@gmail.com>

* Fix CR requests

* Add tests for discrete.

* Update ml-agents/mlagents/trainers/torch/distributions.py

Co-authored-by: Maryam Honari <honari.m94@gmail.com>

* Added more stable test.

* Return deterministic actions for training (#5615)

* Added more stable test.

* Fix the tests.

* Fix pre-commit

* Fix help line to pass precommit.

* support for deterministic inference in onnx (#5593)

* Init: actor.forward outputs separate deterministic actions

* changelog

* Renaming

* Add more tests

* Package changes to support deterministic inference (#5599)

* Init: actor.forward outputs separate deterministic actions

* fix tensor shape for discrete actions

* Add test and editor flag

- Add tests for deterministic sampling
- update editor and tooltips

* Reverting to "Deterministic Inference"

* dissect tests

* Update docs

* Update CHANGELOG.md

Co-authored-by: Chingiz Mardanov <chingiz.mardanov@unity3d.com>
Co-authored-by: cmard <87716492+cmard@users.noreply.github.com>
miguelalonsojr pushed a commit that referenced this pull request Dec 2, 2021
* Progress on propagating the setting to the action model.

* Added the _sample_action logic and tests.

* Add information to the changelog.

* Prioritize the CLI over the configuration file.

* Update documentation for config file.

* CR refactor.

* Update docs/Training-Configuration-File.md

Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Update com.unity.ml-agents/CHANGELOG.md

Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Update com.unity.ml-agents/CHANGELOG.md

Co-authored-by: Miguel Alonso Jr. <76960110+miguelalonsojr@users.noreply.github.com>
Update com.unity.ml-agents/CHANGELOG.md

Co-authored-by: Maryam Honari <honari.m94@gmail.com>
Update ml-agents/mlagents/trainers/settings.py

Co-authored-by: Maryam Honari <honari.m94@gmail.com>
Update ml-agents/mlagents/trainers/cli_utils.py

Co-authored-by: Maryam Honari <honari.m94@gmail.com>

* Fix CR requests

* Add tests for discrete.

* Update ml-agents/mlagents/trainers/torch/distributions.py

Co-authored-by: Maryam Honari <honari.m94@gmail.com>

* Added more stable test.

* Return deterministic actions for training (#5615)

* Added more stable test.

* Fix the tests.

* Fix pre-commit

* Fix help line to pass precommit.

* support for deterministic inference in onnx (#5593)

* Init: actor.forward outputs separate deterministic actions

* changelog

* Renaming

* Add more tests

* Package changes to support deterministic inference (#5599)

* Init: actor.forward outputs separate deterministic actions

* fix tensor shape for discrete actions

* Add test and editor flag

- Add tests for deterministic sampling
- update editor and tooltips

* Reverting to "Deterministic Inference"

* dissect tests

* Update docs

* Update CHANGELOG.md

* Fix the deterministic showing up all the tiime (#5621)

Co-authored-by: Chingiz Mardanov <chingiz.mardanov@unity3d.com>
Co-authored-by: cmard <87716492+cmard@users.noreply.github.com>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants