-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add functional tests for ONNX scenarios #2984
Conversation
These tests roundtrip the main ONNX exportable classes in ML.NET, and show that it's possible to serialize, deserialize, and use them as scorers / transforms. Thus, there are no separate tests for the first scenario. |
Codecov Report
@@ Coverage Diff @@
## master #2984 +/- ##
==========================================
+ Coverage 72.38% 72.4% +0.01%
==========================================
Files 803 803
Lines 143569 143655 +86
Branches 16162 16165 +3
==========================================
+ Hits 103924 104012 +88
+ Misses 35227 35224 -3
- Partials 4418 4419 +1
|
public float Score { get; set; } | ||
} | ||
|
||
private class OnnxScoreColumn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between OnnxScoreColumn
and ClusteringScoreColumn
? Without specifying the dimension of Score
field, I am not sure the code is safe. Maybe we can do
[VectorType(dimension)]
public float[] Score { get; set; }
``` #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ScoreColumn
class will be replaced with a general one being added by the ModelFiles
PR.
The OnnxScoreColumn
is there to make it explicit that we are working around a ?bug? in our ONNX implementation.
The ClusteringScoreColumn
I added specifically for clustering.
Having wrote that out, we can delete the last two and make a ScoreArrayColumn
class in the general helper class files (in ModelFiles
, but I'll pull it into this PR instead and rebase that one).
On the topic of VectorType
, we don't need to specify the dimension. Specifying a dimension just guarantees that the vector will be the same length for each row. That has the downside of making the classes non-reusable, so for helper classes in tests, we usually don't specify this attribute.
In reply to: 266636611 [](ancestors = 266636611)
.Append(mlContext.Transforms.Normalize("Features")) | ||
.AppendCacheCheckpoint(mlContext) | ||
.Append(mlContext.Regression.Trainers.FastTree( | ||
new FastTreeRegressionTrainer.Options { NumberOfThreads = 1, NumberOfTrees = 10 })); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
= 1 [](start = 76, length = 3)
why threads equal to 1? is there a know issue with multi-threading? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are no issues with multithreading. This is for convenience. Namely, we run these sorts of tests single-threaded because
- Many are running at the same time;
- Some algorithms are non-deterministic when multitheaded (e.g. SDCA) and make testing harder (boxing vs exact).
In reply to: 267028362 [](ancestors = 267028362)
// TODO #2980: ONNX outputs don't match the outputs of the model, so we must hand-correct this for now. | ||
// TODO #2981: ONNX models cannot be fit as part of a pipeline, so we must use a workaround like this. | ||
var onnxWorkaroundPipeline = onnxModel.Append( | ||
mlContext.Transforms.CopyColumns("Score", "Score0").Fit(onnxModel.Transform(data))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Score0 [](start = 59, length = 6)
Just a question: where does Score0
come from? is it produced by onnx transform? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As laid out in #2498 , we need scenarios to cover the ONNX functionality we want fully supported in V1.
Scenarios:
Fixes #2963