if (profilesToCollect.contains(DetectorProfileName.TOTAL_ENTITIES)) {
    totalResponsesToWait++;
}
if (profilesToCollect.contains(DetectorProfileName.COORDINATING_NODE)
    || profilesToCollect.contains(DetectorProfileName.SHINGLE_SIZE)
    || profilesToCollect.contains(DetectorProfileName.TOTAL_SIZE_IN_BYTES)
    || profilesToCollect.contains(DetectorProfileName.MODELS)
    || profilesToCollect.contains(DetectorProfileName.ACTIVE_ENTITIES)
    || profilesToCollect.contains(DetectorProfileName.INIT_PROGRESS)
    || profilesToCollect.contains(DetectorProfileName.STATE)) {
    totalResponsesToWait++;
}
minor: can we combine these two if statements into a single one?
I separated them on purpose: each group costs MultiResponsesDelegateActionListener one response, so each group needs its own increment.
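To illustrate why the two checks stay separate, here is a minimal sketch of the counting idea. `CountingListener` is a simplified, hypothetical stand-in and not the real MultiResponsesDelegateActionListener, and `ProfileName` is a shortened, illustrative enum. The point is only that the expected count must equal the number of responses that actually arrive: one per group.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

class ResponseCountSketch {
    // Hypothetical subset of profile names, for illustration only.
    enum ProfileName { TOTAL_ENTITIES, COORDINATING_NODE, MODELS, STATE }

    // Simplified stand-in for a multi-response listener: it invokes the
    // callback once, when exactly `expected` responses have arrived.
    static final class CountingListener {
        private final int expected;
        private final Consumer<List<String>> onAll;
        private final List<String> received = new ArrayList<>();

        CountingListener(int expected, Consumer<List<String>> onAll) {
            this.expected = expected;
            this.onAll = onAll;
        }

        synchronized void onResponse(String response) {
            received.add(response);
            if (received.size() == expected) {
                onAll.accept(new ArrayList<>(received));
            }
        }
    }

    // One increment per group of profiles that is answered by a single response.
    static int expectedResponses(Set<ProfileName> profiles) {
        int total = 0;
        if (profiles.contains(ProfileName.TOTAL_ENTITIES)) {
            total++;
        }
        if (profiles.contains(ProfileName.COORDINATING_NODE)
            || profiles.contains(ProfileName.MODELS)
            || profiles.contains(ProfileName.STATE)) {
            total++;
        }
        return total;
    }

    public static void main(String[] args) {
        Set<ProfileName> profiles = EnumSet.of(ProfileName.TOTAL_ENTITIES, ProfileName.STATE);
        CountingListener listener = new CountingListener(
            expectedResponses(profiles),
            all -> System.out.println("all responses received: " + all));
        listener.onResponse("total_entities");
        listener.onResponse("state");
    }
}
```

If the expected count were higher than the number of responses actually sent, the callback in this sketch would never run, which is the same shape as the hang described in the PR.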
 * @return if the entity is in the cache, return the timestamp in epoch
 * milliseconds when the entity's state was last used. Otherwise, return -1.
 */
long getLastActiveMs(String detectorId, String entityModelId);
just name it getLastActiveModels
For "Ms", I meant milliseconds. Please see https://en.wikipedia.org/wiki/Millisecond.
thanks for the change.
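For readers following the getLastActiveMs contract discussed above, here is a minimal sketch of the "timestamp in epoch milliseconds, or -1 if absent" behavior. The map-backed class, the key format, and the `recordUse` method are assumptions made for illustration; the PR's actual cache implementation is not shown in this thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class LastActiveSketch {
    // Hypothetical storage: last-used time in epoch milliseconds per entity model.
    private final Map<String, Long> lastUsedMs = new ConcurrentHashMap<>();

    // Illustrative key format only; the real cache may key entries differently.
    private static String key(String detectorId, String entityModelId) {
        return detectorId + "|" + entityModelId;
    }

    // Hypothetical helper that records when an entity's state was used.
    void recordUse(String detectorId, String entityModelId, long epochMs) {
        lastUsedMs.put(key(detectorId, entityModelId), epochMs);
    }

    // Returns the last-used time in epoch milliseconds, or -1 if the entity
    // is not in the cache.
    long getLastActiveMs(String detectorId, String entityModelId) {
        return lastUsedMs.getOrDefault(key(detectorId, entityModelId), -1L);
    }
}
```

Returning -1 keeps the return type a primitive long, at the cost of callers having to check for the sentinel value.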
This PR makes several improvements to the profile API:

First, it fixes the hang issue. Previously, when users ran _profile/state or _profile on a multi-entity detector, the request hung because an incorrect maxResponseCount was passed to MultiResponsesDelegateActionListener.

Second, it fixes the multi-entity detector's wrong state issue. Previously, the profile could show the init state after an anomaly had already appeared. The likely cause is that we read the most active entity's init progress from the cache as the detector's init_progress, but an entity that had already produced an anomaly could have been evicted from the cache. This PR fixes the issue by checking the result index for a non-zero RCF score for a multi-entity detector before reporting the init state. If any non-zero RCF score exists, we report the running state instead of the init state.

Third, it adds more information to the entity-level profile: last_active_timestamp, last_sample_timestamp, init_progress, model, and state.

Fourth, it adds models and total_size_in_bytes to the multi-entity detector-level profile.

This PR also fixes various "fail to return" issues in the REST API-related transport actions. We didn't return after sending a channel response, so when the channel was later used to send another response, we got "java.lang.IllegalStateException: Channel is already closed."

Testing done:
1. Manual testing passes.
2. Actively adding unit tests.
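The "fail to return" problem described above can be sketched as follows. `OneShotChannel` is a hypothetical stand-in for a REST channel that accepts only one response, and both handler methods are invented for illustration; this is not the PR's actual transport action code.

```java
class ReturnAfterResponseSketch {
    // Hypothetical channel that can send exactly one response.
    static final class OneShotChannel {
        private boolean closed = false;
        private String sent;

        void sendResponse(String body) {
            if (closed) {
                throw new IllegalStateException("Channel is already closed.");
            }
            sent = body;
            closed = true;
        }

        String sent() {
            return sent;
        }
    }

    // Buggy shape: after sending the error response, execution falls through
    // and tries to send a second response on the same channel.
    static void handleBuggy(OneShotChannel channel, String detectorId) {
        if (detectorId == null) {
            channel.sendResponse("error: missing detector id");
            // missing return
        }
        channel.sendResponse("ok");
    }

    // Fixed shape: return immediately after the response has been sent.
    static void handleFixed(OneShotChannel channel, String detectorId) {
        if (detectorId == null) {
            channel.sendResponse("error: missing detector id");
            return;
        }
        channel.sendResponse("ok");
    }
}
```

Calling `handleBuggy` with a null id throws the IllegalStateException on the second send, while `handleFixed` sends only the error response.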
Codecov Report
@@             Coverage Diff              @@
##             master     #298      +/-   ##
============================================
+ Coverage     71.25%   72.01%   +0.76%
- Complexity     1869     1967      +98
============================================
  Files           194      199       +5
  Lines          9024     9466     +442
  Branches        766      844      +78
============================================
+ Hits           6430     6817     +387
- Misses         2231     2236       +5
- Partials        363      413      +50
if (hits.getTotalHits().value == 0L) {
    processInitResponse(detector, profilesToCollect, totalUpdates, false, profileBuilder, listener);
} else {
    createRunningStateAndInitProgress(profilesToCollect, profileBuilder);
If a running detector is stopped and then restarted but has not passed initialization yet, we can still find anomaly results with anomaly score > 0 because the detector was running before. In that case, we can't tell for certain that the detector is in the running state.
I am searching records older than the job's enabled time. Does that cover the issue you mentioned?
Cool, makes sense.
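The restart case discussed in this thread can be sketched as follows. The exact query is not shown in the thread, so this sketch assumes that the check counts only results from the current run, meaning results recorded at or after the job's enabled time, since that is what keeps results from an earlier run from marking a restarted detector as running. `Result`, `State`, and `resolveState` are hypothetical names.

```java
import java.util.List;

class InitStateSketch {
    enum State { INIT, RUNNING }

    // Hypothetical, simplified view of one record in the result index.
    static final class Result {
        final long executionEndMs;
        final double rcfScore;

        Result(long executionEndMs, double rcfScore) {
            this.executionEndMs = executionEndMs;
            this.rcfScore = rcfScore;
        }
    }

    // Assumption: only results from the current run count toward RUNNING.
    // A non-zero RCF score from before the enabled time is ignored.
    static State resolveState(List<Result> results, long enabledTimeMs) {
        for (Result r : results) {
            if (r.executionEndMs >= enabledTimeMs && r.rcfScore > 0) {
                return State.RUNNING;
            }
        }
        return State.INIT;
    }
}
```

Under this assumption, a detector that produced anomalies, was stopped, and was then restarted still reports the init state until the new run writes its first non-zero RCF score.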
After the change, we have the following output for multi-entity detectors:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.