-
Notifications
You must be signed in to change notification settings - Fork 36
Selectively store anomaly results when index pressure is high #241
Selectively store anomaly results when index pressure is high #241
Conversation
Codecov Report
@@ Coverage Diff @@
## master #241 +/- ##
============================================
+ Coverage 72.76% 73.04% +0.27%
- Complexity 1422 1461 +39
============================================
Files 160 164 +4
Lines 6680 6834 +154
Branches 508 527 +19
============================================
+ Hits 4861 4992 +131
- Misses 1571 1591 +20
- Partials 248 251 +3
Flags with carried forward coverage won't be shown. Click here to find out more.
|
@@ -151,6 +210,7 @@ public AnomalyDetector( | |||
this.uiMetadata = uiMetadata; | |||
this.schemaVersion = schemaVersion; | |||
this.lastUpdateTime = lastUpdateTime; | |||
this.categoryField = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may refactor this constructor to
public AnomalyDetector(....,lastUpdateTime) {
new AnomalyDetector(....,lastUpdateTime, null);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, changed.
@@ -91,6 +95,55 @@ public AnomalyResult( | |||
this.executionStartTime = executionStartTime; | |||
this.executionEndTime = executionEndTime; | |||
this.error = error; | |||
this.entity = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above to refactor constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
*/ | ||
public AnomalyDetector( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR has some overlap with #240 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they have common dependencies.
@@ -428,7 +497,9 @@ public TimeConfiguration getWindowDelay() { | |||
} | |||
|
|||
public Integer getShingleSize() { | |||
return shingleSize == null ? DEFAULT_SHINGLE_SIZE : shingleSize; | |||
return shingleSize == null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we move this logic to detector constructor?Always set shingle size as default value if it's null. Then this get method can return shingleSize
directly; and we should persist shingle size in detector document even it's default shingle size in case the default shingle size changes in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea. Changed.
public static final String ENTITY_NAME_FIELD = "name"; | ||
public static final String ENTITY_VALUE_FIELD = "value"; | ||
|
||
private final String name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name
is category field name and value
is the entity's unique value? For example, name
will be "IP" and value
will be 53.230.198.42
?
Is it necessary to store same name
for every AD result doc ? As high cardinality detector will generate too much AD results, I think it will save some disk usage if we remove such duplicate information (same for feature data).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, your understanding is correct. I am mostly doing this for the future: in case we have multiple categorical fields and the parser knows which value belongs to which field. Open to change if we have a solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep it as is currently. Remove these duplicate fields needs some snapshot mechanism to track the old version of detector configuration. Let's discuss more to make a decision.
} | ||
} | ||
|
||
client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check bulk request size, if no anomaly result, no need to execute bulk index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. Added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the change
…hold. Our performance evaluation has observed a large number of EsRejectedExecutionException due to indexing too many anomaly results. This PR adds a customized bulk transport action to deal with high index memory pressure. When the index pressure exceeds 80%, we only index non-zero anomaly grade index results and a random subset of zero anomaly grade index results. This PR adds a few supporting data structures related changes:1. add constructors in AnomalyResult/FeatureData to use them in the newly added transport action.2. add Entity class to represent an entity in anomaly results. This PR also renames ANOMALY_RESULT_INDEX to ANOMALY_RESULT_INDEX_ALIAS and moves it to the CommonName class. The name refers to an alias and is accessed in multiple places. Testing done: 1. we don't observe EsRejectedExecutionException after applying the change (together with a series of other related changes). 2. added unit tests.
Issue #, if available:
Description of changes:
Our performance evaluation has observed a large number of EsRejectedExecutionException due to indexing too many anomaly results. This PR adds a customized bulk transport action to deal with high index memory pressure. When the index pressure exceeds 80%, we only index non-zero anomaly grade index results and a random subset of zero anomaly grade index results.
This PR adds a few supporting data structures related changes:1. add constructors in AnomalyResult/FeatureData to use them in the newly added transport action.2. add Entity class to represent an entity in anomaly results.
This PR also renames ANOMALY_RESULT_INDEX to ANOMALY_RESULT_INDEX_ALIAS and moves it to the CommonName class. The name refers to an alias and is accessed in multiple places.
Testing done:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.