-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to RangeByMinutes backfilling tasks #1863
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I first thought of RangeMinuetly
but that doesn't make sense now does it? ^^
Overall very good. I just don't get the =5
default.
"in minutes from current time. Prevents infinite loop when stop is none") | ||
|
||
minutes_interval = luigi.IntParameter( | ||
default=5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems way to arbitrary. Why not just default=1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Tarrasch, thanks for reviewing it.
Not sure what you mean with your first comment, is it just the choice of name of RangeByMinutes vs RangeMinutely? English is not my mother tongue, but minutely didn't sound correct to me. Please suggest a name that you think is best.
About the default value of 5 minutes interval. Ultimately it responds to the fact it is the interval that was needed in our project. I could justify it as being a reasonable interval in the minutes range, making every minute a too frequent interval... but it is not trouble to change it.
@j-santander I think your name of Thanks for changing the default to A lost point which I would like you to fix. Can you throw a |
Added the validations, let me know if anything else is needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks almost ready to merge. I added some comments to further improve quality. After that I think this can be merged! :)
@@ -460,6 +460,11 @@ def finite_datetimes(self, finite_start, finite_stop): | |||
""" | |||
Simply returns the points in time that correspond to a whole number of minutes intervals. | |||
""" | |||
# Validate that the minutes_interval can divide 60 and it is greater than 0. | |||
if self.minutes_interval <= 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you're at it, can you change this to
if not (0 < self.minutes_interval < 60)
task.requires() | ||
except luigi.parameter.ParameterException: | ||
return | ||
self.fail("Expected a parameter exception") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use this helper instead? https://docs.python.org/2/library/unittest.html#unittest.TestCase.assertRaises
for task_path in task_a_paths: | ||
MockTarget(task_path) | ||
# this test takes a few seconds. Since stop is not defined, | ||
# finite_datetimes constitute many years to consider |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove this comment? (and some for the other test)
Thanks @j-santander! :) |
Description
This change adds RangeByMinutesBase (inheriting from existing RangeBase) and RangeByMinutes (inheriting from RangeByMinutesBase) providing support for back-filling classes operating in units of minutes.
Behavior (and coding) follows closely the existing RangeHourly and RangeDaily classes. Only addition is
the ability to specify the number of minutes considered as one interval (defaulting to 5 minutes)
Motivation and Context
In our project we produce reports on 5 minutes intervals and the existing backfilling tasks do not cover
such a short interval.
Have you tested this? If so, how?
We've tested the code using on our existing jobs, working succesfully.
In addition, I've added additional test cases to test/range_test.py, following the same structure as existing cases for hours and days intervals.