
Update to barracuda 1.3.3 and changes to the model inputs and outputs for LSTM #5236

Merged: vincentpierre merged 10 commits into main from experiment-barr-1.3.3 on Apr 13, 2021

Conversation


vincentpierre (Contributor) commented Apr 8, 2021

Proposed change(s)

In Barracuda 1.3.3-preview, the LSTM module no longer uses the special `_c` and `_h` inputs and outputs. This means we now have to feed the LSTM memory data manually, like any other input/output.
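For illustration, here is a minimal sketch (not the actual ML-Agents export code) of an LSTM exported to ONNX with its memory exposed as plain named inputs and outputs; the tensor names below are made up:

```python
# Minimal sketch with made-up tensor names; not the actual ML-Agents export code.
import torch

class TinyLSTM(torch.nn.Module):
    def __init__(self, input_size=8, hidden_size=16):
        super().__init__()
        self.lstm = torch.nn.LSTM(input_size, hidden_size, batch_first=True)

    def forward(self, obs, h0, c0):
        # h0/c0 come in as ordinary tensors; no special _h/_c handling.
        out, (hn, cn) = self.lstm(obs, (h0, c0))
        return out, hn, cn

model = TinyLSTM()
dummy = (torch.zeros(1, 1, 8), torch.zeros(1, 1, 16), torch.zeros(1, 1, 16))
torch.onnx.export(
    model, dummy, "lstm.onnx",
    input_names=["obs_0", "recurrent_in_h", "recurrent_in_c"],
    output_names=["output", "recurrent_out_h", "recurrent_out_c"],
)
```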
This PR contains:

  • Changes to the LSTM module in PyTorch (additional transposes for the ONNX export)
  • Changes to the Memory Applier and generator
  • A new model version called MLAgents_2_0_Recurrent
  • Changes to the ModelLoader to use this new model serialization

There is a bug in 1.3.3 with `Profiler.BeginSample`; we will probably need to use the next Barracuda version for this change.
This change would also make it impossible to run new models with Barracuda 1.3.2. We need to address that.

Future releases of Barracuda will break backwards compatibility, meaning that old LSTM models will no longer run with the latest Barracuda (and ML-Agents).

It seems the next release will need to break compatibility both ways: new models will not run with the old C#, and the new C# will not run with old models.

This table summarizes the compatibility between Barracuda and ML-Agents:

Generated with     |              Barracuda version
ML-Agents version  |  1.3.2  |  1.3.3   |  1.4.0  |  2.0
---------------------------------------------------------
1.9.0              |   ok    |  bad     |   ok    |  bad
2.0                |   bad   |  ok* **  |   ok    |  ok*

*  with this PR change only
** has console errors due to the profiler

KNOWN ISSUE: Updating to Barracuda 1.3.3 will generate spam error messages in the console when using LSTM. This issue will be resolved in 1.4.0, so we need to update to Barracuda 1.4.0 BEFORE the next release.

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

vincentpierre self-assigned this Apr 8, 2021
@@ -206,7 +207,16 @@ def forward(
# We don't use torch.split here since it is not supported by Barracuda
h0 = memories[:, :, : self.hidden_size].contiguous()
c0 = memories[:, :, self.hidden_size :].contiguous()

if exporting_to_onnx.is_exporting():
Contributor

Is the comment above about torch.split still accurate?

Contributor Author

Yes, this corresponds to a slice operator, not a split.

Contributor

I actually think split was added to recent versions of Barracuda, and we could use it here if we're breaking compat with old versions anyway. https://docs.unity3d.com/Packages/com.unity.barracuda@1.3/manual/SupportedOperators.html#Split
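For reference, a small sketch of what the suggested change could look like, assuming the target Barracuda version's Split support covers this case (shapes here are illustrative):

```python
import torch

hidden_size = 16
memories = torch.zeros(1, 1, 2 * hidden_size)  # illustrative (layers, batch, 2 * hidden)

# torch.split along the last dim produces the same two halves as the manual
# slices above, and would export as an ONNX Split node.
h0, c0 = torch.split(memories, hidden_size, dim=-1)
h0, c0 = h0.contiguous(), c0.contiguous()
```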

@@ -19,8 +19,9 @@ internal enum ModelApiVersion
{
MLAgents1_0 = 2,
MLAgents2_0 = 3,
MLAgents2_0_Recurrent = 4,
Contributor

Since we haven't released a version that uses MLAgents2_0, I'd be in favor of just defining MLAgents2_0 to contain the LSTM changes too.

Either way, we should describe the differences between the versions here, just for our own internal knowledge.


Yeah, you might consider calling MLAgents2_0 => MLAgents1_9 if that is the model version that went out with the last release.

vincentpierre marked this pull request as ready for review April 9, 2021 22:51
vincentpierre changed the title from "Experiment barr 1.3.3" to "Update to barracuda 1.3.3 and changes to the model inputs and outputs for LSTM" Apr 9, 2021
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
@@ -643,6 +643,7 @@ def forward(
At this moment, torch.onnx.export() doesn't accept None as tensor to be exported,
so the size of return tuple varies with action spec.
"""

Contributor

Extra line?

# This transpose is needed both at input and output of the LSTM when
# exporting because ONNX will expect (sequence_len, batch, memory_size)
# instead of (batch, sequence_len, memory_size)
h0 = torch.transpose(h0, 0, 1)
ervteng (Contributor) Apr 13, 2021

I think we should transpose it before the split into (h0, c0) as it will be marginally faster and will be symmetrical with the transpose on the output below. But it's not a dealbreaker.
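A rough sketch of the suggested ordering (shapes are illustrative, not the PR's actual code):

```python
import torch

batch, seq, hidden_size = 4, 1, 16
memories = torch.zeros(batch, seq, 2 * hidden_size)

# Transpose the whole memory tensor once, then slice into h0/c0, instead of
# transposing h0 and c0 separately after the slice.
memories_t = torch.transpose(memories, 0, 1)           # (seq, batch, 2 * hidden)
h0 = memories_t[:, :, :hidden_size].contiguous()
c0 = memories_t[:, :, hidden_size:].contiguous()
```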

@@ -38,7 +39,7 @@ different sizes using the same model. For a summary of the interface changes, pl

### Minor Changes
#### com.unity.ml-agents / com.unity.ml-agents.extensions (C#)
- The `.onnx` models input names have changed. All input placeholders will now use the prefix `obs_` removing the distinction between visual and vector observations. Models created with this version will not be usable with previous versions of the package (#5080)
- The `.onnx` models input names have changed. All input placeholders will now use the prefix `obs_` removing the distinction between visual and vector observations. Models created with this version will not be usable with previous versions of the package (#5080, #5236)
Contributor

Maybe add something about LSTM as well

ervteng (Contributor) left a comment

Minor comments but otherwise looks fine

vincentpierre merged commit c6c28b7 into main Apr 13, 2021
delete-merged-branch bot deleted the experiment-barr-1.3.3 branch April 13, 2021 20:36
github-actions bot locked as resolved and limited conversation to collaborators Apr 14, 2022