chore(druid): Standardizing time grain transformations #17050
  "PT5S": "TIME_FLOOR({col}, 'PT5S')",
  "PT30S": "TIME_FLOOR({col}, 'PT30S')",
- "PT1M": "FLOOR({col} TO MINUTE)",
+ "PT1M": "TIME_FLOOR({col}, 'PT1M')",
  "PT5M": "TIME_FLOOR({col}, 'PT5M')",
  "PT10M": "TIME_FLOOR({col}, 'PT10M')",
  "PT15M": "TIME_FLOOR({col}, 'PT15M')",
  "PT0.5H": "TIME_FLOOR({col}, 'PT30M')",
@villebro I've never understood why this is PT0.5H ("Half hour") as opposed to PT30M ("30 minute"). I was thinking of doing a pass to change these, which will require a database migration. Thoughts?
I've wondered about this before, as well as why P0.25Y is used in Superset instead of P3M. I tried looking at ISO-8601 to see if there's any guidance there, but I can't find anything. So I think they might be used interchangeably. However, since the decimal character varies from country to country, I think we'd be better off replacing PT0.5H and P0.25Y with PT30M and P3M respectively.
I think it's because originally it was called just "half an hour", so PT0.5H made sense. It was a direct translation, and I was just looking for a way to standardize the intervals across DB engine specs.
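As a rough illustration of the replacement discussed in this thread, the sketch below maps the two fractional grains to their integer-valued equivalents. The names `FRACTIONAL_TO_INTEGRAL` and `normalize_grain` are hypothetical and not part of Superset; only the two pairs themselves come from the comments above, and an actual change would also need the database migration mentioned earlier.

```python
# Hypothetical helper, not Superset code: only the two pairs below are taken
# from the discussion; the names are made up for illustration.
FRACTIONAL_TO_INTEGRAL = {
    "PT0.5H": "PT30M",  # half an hour -> 30 minutes
    "P0.25Y": "P3M",    # quarter of a year -> 3 months
}


def normalize_grain(grain: str) -> str:
    """Return the integer-valued equivalent of a fractional grain, if known.

    Grains that are not in the mapping are returned unchanged.
    """
    return FRACTIONAL_TO_INTEGRAL.get(grain, grain)


if __name__ == "__main__":
    for grain in ("PT0.5H", "P0.25Y", "PT5M"):
        print(grain, "->", normalize_grain(grain))
```

A plain lookup table is used here rather than general duration arithmetic because only two known values are under discussion.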
Codecov Report
@@           Coverage Diff           @@
##           master   #17050   +/-   ##
=======================================
  Coverage   76.91%   76.91%
=======================================
  Files        1031     1031
  Lines       55163    55163
  Branches     7501     7501
=======================================
+ Hits        42428    42430    +2
+ Misses      12483    12481    -2
  Partials      252      252
94a8f88 to f87b171
LGTM. If/when a migration is done, I'd also replace P0.25Y with P3M.
* chore(druid): Standardizing time grain transformations
* Update druid_tests.py
* Update druid_tests.py

Co-authored-by: John Bodley <john.bodley@airbnb.com>
SUMMARY
The Apache Druid time grain transformations use a mix of FLOOR, TIME_FLOOR, and TIMESTAMPADD UDFs. This PR merely standardizes these to consistently use the TIME_FLOOR and TIME_SHIFT functions, which both leverage the ISO 8601 standard for defining periods.
TESTING INSTRUCTIONS
CI.
ADDITIONAL INFORMATION
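For readers unfamiliar with how the templates in the diff are used, here is a minimal sketch of rendering a time grain expression by substituting a column name for the `{col}` placeholder. The dictionary name, the `render_grain` function, and the `__time` column in the usage example are assumptions made for illustration; the template strings are copied from the diff in this PR, and the real engine spec may apply them differently.

```python
# Illustrative only: a small subset of the templates shown in the diff.
# TIME_GRAIN_EXPRESSIONS and render_grain are hypothetical names.
TIME_GRAIN_EXPRESSIONS = {
    "PT5S": "TIME_FLOOR({col}, 'PT5S')",
    "PT1M": "TIME_FLOOR({col}, 'PT1M')",
    "PT0.5H": "TIME_FLOOR({col}, 'PT30M')",
}


def render_grain(grain: str, column: str) -> str:
    """Fill the {col} placeholder of the template registered for `grain`.

    Raises KeyError if the grain has no template in the mapping.
    """
    return TIME_GRAIN_EXPRESSIONS[grain].format(col=column)


if __name__ == "__main__":
    # "__time" is used here as an example column name.
    print(render_grain("PT1M", "__time"))
```

Note that the "PT0.5H" key still renders a 'PT30M' period, which is the mismatch raised in the review comments.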