-
Notifications
You must be signed in to change notification settings - Fork 36
Verifying multi-entity detectors #240
Verifying multi-entity detectors #240
Conversation
Codecov Report
@@ Coverage Diff @@
## master #240 +/- ##
============================================
+ Coverage 71.70% 72.84% +1.14%
- Complexity 1367 1425 +58
============================================
Files 157 160 +3
Lines 6513 6680 +167
Branches 493 508 +15
============================================
+ Hits 4670 4866 +196
+ Misses 1610 1567 -43
- Partials 233 247 +14
Flags with carried forward coverage won't be shown. Click here to find out more.
|
} | ||
|
||
/** | ||
* Precondition: anomalyDetector.getCategoryField() != null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this comment mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not true anymore. Removed.
if (shingleSize != null && shingleSize < 1) { | ||
throw new IllegalArgumentException("Shingle size must be a positive integer"); | ||
} | ||
if (categoryField != null && categoryField.size() > 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 is magic number, how about we create CATEGORY_FIELD_LIMIT instant and use it at all other places. like IndexAnomalyDetectorActionHandler.java line 269:
if (categoryField.size() != 1) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I like the simpler check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
@@ -87,6 +88,7 @@ | |||
private final Map<String, Object> uiMetadata; | |||
private final Integer schemaVersion; | |||
private final Instant lastUpdateTime; | |||
private final List<String> categoryField; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: if we plan to support multiple category fields, should we name this as "categoryFields" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the changes look good to me.
@@ -87,6 +88,7 @@ | |||
private final Map<String, Object> uiMetadata; | |||
private final Integer schemaVersion; | |||
private final Instant lastUpdateTime; | |||
private final List<String> categoryField; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see we just use 1 category field today, do we see possible use cases of multiple category fields ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, maybe in the future
@@ -22,6 +22,9 @@ | |||
// index name for anomaly checkpoint of each model. One model one document. | |||
public static final String CHECKPOINT_INDEX_NAME = ".opendistro-anomaly-checkpoints"; | |||
|
|||
// The alias of the index in which to write single-entity AD result history |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we plan to write HC detector's result into another index?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not in the plan. The code would have to change a lot for that to happen. e.g., depends on there is categorical field or not, job scheduler will save to different places. Right now, job scheduler does not have access to AnomalyDetector object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, how about remove the "single-entity" from the comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed.
@@ -72,6 +75,7 @@ | |||
private static final String SHINGLE_SIZE_FIELD = "shingle_size"; | |||
private static final String LAST_UPDATE_TIME_FIELD = "last_update_time"; | |||
public static final String UI_METADATA_FIELD = "ui_metadata"; | |||
public static final String CATEGORY_FIELD = "category_field"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change to "category_fields" ? Same question for line 113, 130, and other places. Suggest to replace by searching all files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do it later? Both Yizhe and I need to change for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'm ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for your consideration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me.
This PR adds categorical fields' number and length check. We only support one categorical field, and the categorical field can only be of type keyword and ip. We also limit the max multi-entity detectors to 10. Testing done: 1. added unit tests 2. did manual testing.
Issue #, if available:
Description of changes:
This PR adds categorical fields' number and type check. We only support one categorical field, and the categorical field can only be of type keyword and ip. We also limit the max multi-entity detectors to 10.
Testing done:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.