Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Timer and ctx.CancelExecution() to fix AutoML max-time experiment bug #5445
Use Timer and ctx.CancelExecution() to fix AutoML max-time experiment bug #5445
Changes from 25 commits
d0f7054
4fa26f8
48a6267
f324030
ee70024
36bf24e
d5d23de
33cf5a6
bfc93e9
c69a19f
299b05b
2e2d441
ce747fb
7635500
94a80de
abe1d7f
9585a50
1ab662f
bc9e578
71ebf23
2d8d06f
490d8c1
b0de1d3
0918afa
0922aed
ef4b34f
b4b49ce
6502fc8
28e2f2e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this stop user from getting ongoing training log? As the context that is used for training will be different from the context where AutoML experiment is created, and is unavailable externally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justinormont @mstfbl @JakeRadMSFT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LittleLittleCloud: Good point. Have you seen issues?
We can always duplicate the logger. Or attach a logger to the new context, and when called, have it pass the message to the original context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in Image classification we subscribe to the log channel to show training progress. It's quite important since image-classification training is way more time consuming than other scenarios and we need to show something to let users know the training progress
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might make sense to catch the
OperationCanceledException
where the other AutoML iteration errors are caught:machinelearning/src/Microsoft.ML.AutoML/Experiment/Runners/RunnerUtil.cs
Lines 26 to 51 in f87a3bb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured it's better to catch
OperationCanceledException
inExecute()
as the timers that triggers that exception are started and defined inExecute()
as well. Plus catchingOperationCanceledException
inRunnerUtil.cs
requires returning a whole new(ModelContainer model, TMetrics metrics, Exception exception, double score)
tuple, whereas when caught inExecute()
, we can send the sameiterationResults
inExecute()
and the same(ModelContainer model, TMetrics metrics, Exception exception, double score)
tuple as should be.