-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add threshold for RCA #5218
add threshold for RCA #5218
Conversation
/// <example> | ||
/// <format type="text/markdown"> | ||
/// <data:image/s3,"s3://crabby-images/faca9/faca97f7661f1ce61476aeab4fefb843ab403658" alt="CDATA[ | ||
/// [!code-csharp[LocalizeRootCause"] | ||
/// ]]> | ||
/// </format> | ||
/// </example> | ||
public static RootCause LocalizeRootCause(this AnomalyDetectionCatalog catalog, RootCauseLocalizationInput src, double beta = 0.5) | ||
public static RootCause LocalizeRootCause(this AnomalyDetectionCatalog catalog, RootCauseLocalizationInput src, double beta = 0.5, double rootCauseThreshold = 0.95) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: another option as suggested by Harish (for AD) is to create an Options class to avoid multiple overload in future. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: another option as suggested by Harish (for AD) is to create an Options class to avoid multiple overload in future.
I found that the option is always served for transformer implementation. Does it makes sense to use it in the extentions catalog? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ML.Net team treats it at a pattern of handling multiple overloaded methods thus not confined to which type of class it can be used. yet it's just a minor comments so feel free to ignore #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Four parameters to a function is not too bad. If you are expecting to enhance this API further in the future by adding more parameters, I would suggest adding an Options parameter now. If there are no plans to add new parameters now, please add an Options parameter the next time there is a new parameter added.
E.g. We have now migrated rootCauseThreshold from being a private constant in RootCauseAnaylyzer to being a parameter here. But I also see _anomalyRatioThreshold and _anomalyPreDeltaThreshold as private constants there?
Does it make sense to convert them to parameters? If so, please create an Options class and add all those parameters there.
In reply to: 438493061 [](ancestors = 438493061)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Four parameters to a function is not too bad. If you are expecting to enhance this API further in the future by adding more parameters, I would suggest adding an Options parameter now. If there are no plans to add new parameters now, please add an Options parameter the next time there is a new parameter added.
E.g. We have now migrated rootCauseThreshold from being a private constant in RootCauseAnaylyzer to being a parameter here. But I also see _anomalyRatioThreshold and _anomalyPreDeltaThreshold as private constants there?
Does it make sense to convert them to parameters? If so, please create an Options class and add all those parameters there.
In reply to: 438493061 [](ancestors = 438493061)
Thanks for the suggestion, have updated #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I don't think I communicated well.
The pattern in ML.NET is to have two overloaded functions. A convenience function and a full function. The convenience function takes in a minimal set of parameters that will meet the common use case and use default values for all other parameters. We always have this function. The second function is if a particular feature requires a function with lots of parameters. In this case we add a second overloaded function that takes an Options parameter where the user can set each parameter individually.
So at the very least you should have one function without an Options parameter. You add a second function with an Options parameter only if you are going to need lots of parameters.
Like I mentioned above, if all you need are four parameters, a single function is okay. Only if you were going to migrate the other constants from RootCauseAnalyzer, please add a second function with an Options parameter.
In reply to: 442571173 [](ancestors = 442571173)
@@ -421,7 +422,7 @@ private TimeSeriesPoint GetPointByDimension(Dictionary<string, TimeSeriesPoint> | |||
|
|||
private static string GetDicCode(Dictionary<string, Object> dic) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this an ideal way to create a key? or probably more efficient (and accurate) to maintain a tuple dictionary for <key, value> pair and invoke object.equal methods? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also for this piece, does it assume that the value is additive? what if the values are not additive? Sample1 and Sample2, when we set the AggregateType.Unknown, is this piece still applicable?
p.Value = +leave.Value; p.ExpectedValue = +leave.ExpectedValue; p.Delta = +leave.Delta;
Currently, for the unknown type, as the calculation might be complex, we use a default method like above to treat it.
#Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this an ideal way to create a key? or probably more efficient (and accurate) to maintain a tuple dictionary for <key, value> pair and invoke object.equal methods?
It is hard to implement it as the object is very complex. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm... will user just override the equal/Hashcode methods and ML.Net just needs to do object.EqualTo(..)? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm... will user just override the equal/Hashcode methods and ML.Net just needs to do object.EqualTo(..)?
Thanks for the reminder, have changed #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to change libmf, right? Please take change from submodules(when git clone or git pull, add --recurse-submodules option as well)
To clarify @frank-dong-ms's comment above, you are seeing test failures above because your rebase didn't include an update to the submodule. Recently there was an update to the libmf submodule that fixed test failures. Please pull and rebase with --recurse-submodules and update your branch and these test failures should go away. #Resolved |
Co-authored-by: Mustafa Bal <5262061+mstfbl@users.noreply.github.com>
Codecov Report
@@ Coverage Diff @@
## master #5218 +/- ##
==========================================
- Coverage 73.31% 69.38% -3.93%
==========================================
Files 1007 770 -237
Lines 187969 144849 -43120
Branches 20241 18435 -1806
==========================================
- Hits 137812 100507 -37305
+ Misses 44630 39195 -5435
+ Partials 5527 5147 -380
|
Thanks for the reminder, updated #Resolved |
Thanks for the reminder, updated #Resolved |
@@ -187,14 +187,15 @@ public static class TimeSeriesCatalog | |||
/// It is used when score is calculated for each root cause item. The range of beta should be in [0,1]. | |||
/// For a larger beta, root cause items which have a large difference between value and expected value will get a high score. | |||
/// For a small beta, root cause items which have a high relative change will get a low score.</param> | |||
/// <param name="rootCauseThreshold">A threshold to determine whether the point should be root cause. If the point's delta is equal to or larger than rootCauseThreshold multiplied by anomaly dimension point's delta, this point is treated as a root cause. Different threshold will turn out different results. Users can choose the delta according to their data and requirments. </param> | |||
/// <example> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the range of valid values for this parameter? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the range of valid values for this parameter?
Have updated the range
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update Root Cause Analysis interface.