fix: Time shifts with different granularity for ECharts #24176
I know it's still in draft, but a few initial thoughts
Codecov Report
@@            Coverage Diff             @@
##           master   #24176      +/-   ##
==========================================
+ Coverage   68.29%   68.45%   +0.15%
==========================================
  Files        1957     1951       -6
  Lines       75624    75528      -96
  Branches     8223     8218       -5
==========================================
+ Hits        51649    51699      +50
+ Misses      21867    21718     -149
- Partials     2108     2111       +3
... and 2 files with indirect coverage changes
Additionally, I believe the time grain and the time offset granularity should be consistent. We could use the same dataset to discuss this topic.
I could do that, but I guess I was excited to fix the problem and ended up doing everything at once. Sorry if this makes the review more difficult, but I think the important part is that I took the time to improve the codebase.
It was not working indeed with |
offset_metrics_df[index] = offset_metrics_df[index] - DateOffset(
    **normalize_time_delta(offset)
)
The index of offset_metrics_df seems to be where an appropriate index for the resulting dataframe is constructed. If the results are misaligned, I think we should look for the root cause here.
I'm not sure if I understand what you're suggesting. Can you explain more?
Thanks for providing more information. If I have more time, I will continue the review.
A few second pass comments
@villebro I addressed your comments.
A few small optional stylistic comments
LGTM!
@villebro @michael-s-molina I'll take another look at this PR; give me some time. Thanks!
I drafted a PR to refine the join key generator. Please review. Thanks both!
@zhaoyongjie That's not the best way to help review a PR. I spent a lot of time debugging the issue, addressing comments, and executing tests using Airbnb charts to get to a solution. If you think the solution can be improved, you can make suggestions, provide more context, contribute code to the PR, etc. Completely ignoring the PR and opening your own is not cool 😕
@michael-s-molina
@zhaoyongjie I thought I had addressed both of your suggestions with my latest commits. You even removed the Request Changes status. Then you asked for time to review the PR and suddenly opened your own, which was really confusing to me.
SUMMARY
The Advanced Analytics section of Explore has an option called Time Shift that lets a user compare the query results with the same query shifted in time by the amount specified in the control. It is implemented by firing a query for each specified Time Shift and later joining the results with the original query. To make the join work, the algorithm modifies the temporal column of the time shift results to match the one in the original query. The problem is that the algorithm does not take into account that the time granularity and the time shifts can both influence the temporal column and affect the resulting timestamps. For example, if the Time Grain is Quarter and the Time Shift is 28 days, the results can have different timestamps but still belong to the same quarter. Another example is when the Time Grain is Week and, after the time shifts are applied, the results fall in a different year, which may produce different week numbers for the timestamps. The objective of this PR is to fix these problems by generating an artificial join key that takes the Time Grain into consideration when joining the results.
The following table illustrates how the join keys are calculated depending on the Time Grain:
This join key is only used when joining the results and is discarded in the end.
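As a rough illustration of the idea (not the code in this PR), the sketch below builds a grain-aware join key with pandas and uses it to join a Quarter-grain result with a 28-day shifted result. The helper name `join_key`, the key formats, the ISO 8601 grain strings, and the sample data are all assumptions made for the example.

```python
import pandas as pd


def join_key(ts: pd.Series, grain: str) -> pd.Series:
    """Hypothetical helper: derive a string join key from timestamps.

    The grain strings ("P1Y", "P3M", "P1M", "P1W") are assumed ISO 8601
    durations; the key formats are illustrative only.
    """
    if grain == "P1Y":
        return ts.dt.strftime("%Y")
    if grain == "P3M":
        return ts.dt.year.astype(str) + "-Q" + ts.dt.quarter.astype(str)
    if grain == "P1M":
        return ts.dt.strftime("%Y-%m")
    if grain == "P1W":
        # Use the ISO year together with the ISO week so that weeks
        # spanning a year boundary still get a consistent key.
        iso = ts.dt.isocalendar()
        return iso["year"].astype(str) + "-W" + iso["week"].astype(str).str.zfill(2)
    # Finer grains: fall back to the full timestamp.
    return ts.dt.strftime("%Y-%m-%d %H:%M:%S")


# Original query, Quarter grain.
main = pd.DataFrame(
    {"ts": pd.to_datetime(["2023-01-01", "2023-04-01"]), "metric": [10, 20]}
)

# Time shift results after their temporal column was moved by 28 days:
# the timestamps no longer equal the quarter starts above.
shifted = pd.DataFrame(
    {
        "ts": pd.to_datetime(["2023-01-01", "2023-04-01"]) + pd.Timedelta(days=28),
        "metric_28d_ago": [7, 9],
    }
)

# Joining on the raw timestamps finds no matches.
exact = main.merge(shifted, on="ts", how="left")

# Joining on the grain-aware key matches rows in the same quarter;
# the key is dropped once the join is done.
main["key"] = join_key(main["ts"], "P3M")
shifted["key"] = join_key(shifted["ts"], "P3M")
joined = main.merge(shifted.drop(columns="ts"), on="key", how="left").drop(columns="key")
```

Here `exact` has only missing values in `metric_28d_ago`, while `joined` pairs each quarter of the original query with its shifted counterpart.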
This PR also improves the Python codebase by creating a TimeGrain class and replacing all string literals with references to the new class.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
1 - Choose a dataset with plenty of temporal data available
2 - Create a Line Chart (ECharts)
3 - Play with different Time Grains and Time Shifts
4 - Make sure the offsets are correctly calculated
ADDITIONAL INFORMATION