This repository has been archived by the owner on Aug 2, 2022. It is now read-only.
start historical detector #355
`./gradlew spotlessApply` will reset the copyright year to 2020, as configured in the file spotless.license.java. To keep this PR clean, I will send out a separate PR to update this license file and apply it to all files. It will replace `2020` with `$YEAR`, so that spotless fills in the current year automatically.
Why do we need to call getNearbyPointsForShingle, which helps real-time detectors deal with the uneven arrival of requests? You run in batches, and your timestamps within an interval are fixed.
This is to handle sparse data. The method is used to impute missing points in the shingle from neighboring points.
In getNearbyPointsForShingle, the imputing distance is half of the interval, which does not apply to your case.
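To make the half-interval point concrete, here is a minimal sketch of neighbor-based imputation with a maximum distance of half the interval. It is not the plugin's getNearbyPointsForShingle; the class name, method name, and the TreeMap-of-timestamps representation are all assumptions made up for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical illustration only; not the plugin's actual implementation.
class ShingleImputationSketch {

    /**
     * Returns the value of the stored point closest to targetMillis,
     * but only if it lies within half of the detector interval.
     */
    static Optional<Double> nearestWithinHalfInterval(
            TreeMap<Long, Double> points, long targetMillis, long intervalMillis) {
        long maxDistance = intervalMillis / 2; // assumed imputing distance
        Map.Entry<Long, Double> before = points.floorEntry(targetMillis);
        Map.Entry<Long, Double> after = points.ceilingEntry(targetMillis);
        Map.Entry<Long, Double> best = null;
        if (before != null && targetMillis - before.getKey() <= maxDistance) {
            best = before;
        }
        if (after != null
                && after.getKey() - targetMillis <= maxDistance
                && (best == null
                    || after.getKey() - targetMillis < targetMillis - best.getKey())) {
            best = after; // the later neighbor is strictly closer
        }
        return best == null ? Optional.empty() : Optional.of(best.getValue());
    }

    public static void main(String[] args) {
        TreeMap<Long, Double> points = new TreeMap<>();
        points.put(0L, 1.0);
        points.put(60_000L, 2.0);
        points.put(185_000L, 4.0); // arrived 5 seconds late (jitter)
        long interval = 60_000L;
        // Jittered point: a neighbor exists within half an interval.
        System.out.println(nearestWithinHalfInterval(points, 180_000L, interval));
        // Real hole in the data: no neighbor within half an interval.
        System.out.println(nearestWithinHalfInterval(points, 120_000L, interval));
    }
}
```

In this sketch the half-interval tolerance only helps when a point arrives slightly off its expected timestamp. With batch queries on fixed interval boundaries, a lookup either hits exactly or the point is really missing, so the tolerance adds nothing for the historical case.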
Do we really need this logic for the historical detector?
This comes from a product/user-experience consideration: we will not differentiate historical and real-time detection, and will keep the model/algorithm consistent.
For the historical detector, we don't need to impute missing data caused by runtime jitter; we only need to impute data holes in the source data.
Discussed with kaituo: will simplify the code for the historical detector for now. For the single flow, we can create a new function to handle both historical and real-time detection.
Seems that we have lots of ideas for the new universal workflow. Should we create a doc/issue to track them?
Yeah, as the universal flow has just started, I will document these ideas in my notebook and share them with the internal team first. I don't think sharing such unconnected ideas on GitHub now will benefit the community, as we haven't even mentioned what the universal flow is. Will create an RFC issue later once we finish the research, and put those ideas in that GitHub issue. That way users can see the big picture (the background and our solution) and understand why we came up with these ideas.