move implementation of Timedelta to tslibs.timedeltas #18085
# resolution in ns
Timedelta.min = Timedelta(np.iinfo(np.int64).min +1)
Timedelta.max = Timedelta(np.iinfo(np.int64).max)

cdef PyTypeObject* td_type = <PyTypeObject*> Timedelta


cdef inline bint is_timedelta(object o):
This is never used, could probably be removed (along with td_type above)
pandas/_libs/tslibs/timedeltas.pyx
Outdated
# ----------------------------------------------------------------------

cpdef int64_t _delta_to_nanoseconds(delta) except? -1:
_delta_to_nanoseconds is cut/paste
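For readers following the move without the file open, here is a minimal pure-Python sketch of the arithmetic a helper like `_delta_to_nanoseconds` performs on a `datetime.timedelta`. The name `delta_to_nanoseconds_sketch` is hypothetical, and the Cython function in the PR may accept other input types that are not modeled here.

```python
from datetime import timedelta


def delta_to_nanoseconds_sketch(delta: timedelta) -> int:
    """Hypothetical pure-Python stand-in; not the pandas implementation."""
    # datetime.timedelta normalizes itself to days, seconds and microseconds,
    # so the total in microseconds is an exact integer sum of the three fields.
    total_us = (delta.days * 24 * 60 * 60 * 1_000_000
                + delta.seconds * 1_000_000
                + delta.microseconds)
    # 1 microsecond == 1000 nanoseconds
    return total_us * 1000
```

Staying in integer arithmetic avoids the rounding that going through the float returned by `delta.total_seconds()` could introduce for large values.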
delta.microseconds) * 1000


cpdef convert_to_timedelta64(object ts, object unit):
convert_to_timedelta64 is cut/paste
return ts.astype('timedelta64[ns]')


cpdef array_to_timedelta64(ndarray[object] values, unit='ns', errors='raise'):
array_to_timedelta64 is cut/paste
# define a binary operation that only works if the other argument is
# timedelta like or an array of timedeltalike
def f(self, other):
    if hasattr(other, 'delta') and not PyDelta_Check(other):
`isinstance(other, Timedelta)` --> `PyDelta_Check(other)`: effectively equivalent, should be more performant.
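A small pure-Python sketch of why the two conditions agree in this branch. It assumes that `Timedelta` subclasses `datetime.timedelta` and exposes a `delta` attribute; `TimedeltaLike` and `TickLike` are hypothetical stand-ins, and `isinstance(x, timedelta)` is used as the Python-level analogue of `PyDelta_Check(x)`.

```python
from datetime import timedelta


class TimedeltaLike(timedelta):
    """Hypothetical stand-in for Timedelta (a datetime.timedelta subclass)."""

    @property
    def delta(self):
        # Assumed attribute, so that hasattr(obj, 'delta') is True.
        return self


class TickLike:
    """Hypothetical offset-style object that wraps a timedelta in .delta."""

    def __init__(self, delta):
        self.delta = delta


def old_check(other):
    return hasattr(other, "delta") and not isinstance(other, TimedeltaLike)


def new_check(other):
    # A plain datetime.timedelta has no .delta attribute, so widening the
    # exclusion from the subclass to the base class changes nothing here.
    return hasattr(other, "delta") and not isinstance(other, timedelta)


samples = [timedelta(hours=1), TimedeltaLike(hours=1),
           TickLike(timedelta(hours=1)), 5]
results = [(old_check(s), new_check(s)) for s in samples]
```

Under these assumptions both checks select only the `TickLike` object; they would diverge only for a `timedelta` subclass other than `Timedelta` that also defines `delta`.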
pandas/_libs/tslibs/timedeltas.pyx
Outdated
# We are implicitly requiring the canonical behavior to be
# defined by Timestamp methods.

elif PyDateTime_CheckExact(other):
`isinstance(other, datetime) and not isinstance(other, Timestamp)` --> `PyDateTime_CheckExact(other)`. Effectively equivalent, should be more performant, and doesn't require Timestamp in the namespace.
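The "effectively" can be made concrete with a pure-Python sketch, using `type(x) is datetime` as the analogue of `PyDateTime_CheckExact(x)`. `TimestampLike` and `OtherDatetime` are hypothetical subclasses introduced only for illustration.

```python
from datetime import datetime


class TimestampLike(datetime):
    """Hypothetical stand-in for Timestamp (a datetime subclass)."""


class OtherDatetime(datetime):
    """Hypothetical third-party datetime subclass."""


def old_check(other):
    return isinstance(other, datetime) and not isinstance(other, TimestampLike)


def exact_check(other):
    # Matches only objects whose type is exactly datetime, no subclasses.
    return type(other) is datetime


plain = datetime(2017, 1, 1)
stamp = TimestampLike(2017, 1, 1)
third = OtherDatetime(2017, 1, 1)

agreement = {
    "plain": (old_check(plain), exact_check(plain)),
    "stamp": (old_check(stamp), exact_check(stamp)),
    "third": (old_check(third), exact_check(third)),
}
```

The two checks agree for plain `datetime` objects and for the `Timestamp` stand-in; they differ only for other `datetime` subclasses, which the exact check rejects.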
why 2 separate branches here that are the same?
Historical accident, will fix.
pandas/_libs/tslibs/timedeltas.pyx
Outdated
cdef _to_py_int_float(v):
    # Note: This used to be defined inside _timedelta_value_kwargs
    # (and Timedelta.__new__ before that), but cython
    # will not allow dynamically-defined functions nested that way.
This comment should be re-worded to clarify that cython will not allow cdef functions to be defined dynamically.
So...go and reword it? 😄
Codecov Report
@@ Coverage Diff @@
## master #18085 +/- ##
==========================================
- Coverage 91.27% 91.26% -0.02%
==========================================
Files 163 163
Lines 50120 50120
==========================================
- Hits 45749 45740 -9
- Misses 4371 4380 +9
Codecov Report
@@ Coverage Diff @@
## master #18085 +/- ##
==========================================
- Coverage 91.4% 91.39% -0.02%
==========================================
Files 163 163
Lines 50073 50134 +61
==========================================
+ Hits 45769 45819 +50
- Misses 4304 4315 +11
@jbrockmendel : Minor comments. Also, let's |
pandas/_libs/tslib.pyx
Outdated
from tslibs.timedeltas cimport parse_timedelta_string, cast_from_unit
from tslibs.timedeltas cimport cast_from_unit, _delta_to_nanoseconds
from tslibs.timedeltas import (Timedelta, convert_to_timedelta64,
                               _delta_to_nanoseconds, array_to_timedelta64)
double import
why are you not cimporting?
Only cimporting things that are used in `tslib`. Other imports are pass-through for other modules to find in `tslib`. Outside imports will be updated in a follow-up.
_delta_to_nanoseconds cimport version is used in tslib, and python version is pass-through.
what do you mean passthru? if it's not used then remove it
> Only cimporting things that are used in tslib. Other imports are pass-through for other modules to find in tslib. Outside imports will be updated in a follow-up.
no let's fix this here
OK. I'll de-privatize `_delta_to_nanoseconds` while I'm at it.
pandas/_libs/tslibs/timedeltas.pxd
Outdated
# Exposed for tslib, not intended for outside use.
cdef parse_timedelta_string(object ts)
cpdef int64_t cast_from_unit(object ts, object unit) except? -1
cpdef int64_t _delta_to_nanoseconds(delta) except? -1
cpdef convert_to_timedelta64(object ts, object unit)
cpdef array_to_timedelta64(ndarray[object] values, unit=*, errors=*)
so remove it
Several asv runs:
|
you have some wild variability. pls run for timedeltas too.
This is a truthfact.
ok so it doesn't change perf, which is expected, good.
I think you need to add timedeltas.pxd to period in setup.py; I don't think you can remove anything from tslib def (but check). rebase.
from nattype import nat_strings
from nattype import nat_strings, NaT
from nattype cimport _checknull_with_nat
timedeltas should depend on nattype.pxd
pandas/_libs/tslibs/timedeltas.pyx
Outdated
# ----------------------------------------------------------------------
# Timedelta Construction

cdef _to_py_int_float(v):
    # Note: This used to be defined inside _timedelta_value_kwargs
should prob move to util.pyx (I know it doesn't exist, but should). though maybe not useful outside of this module.
> util.pyx (I know it doesn't exist, but should).
Separate issue: the vast majority of usage of `util` is for `is_foo_object`, and those functions could be made pure cython (in C-equivalent ways as in #18059), i.e. put in a file without having dependencies on src, so we wouldn't have to futz around with `include`/`sources` (or in my opinion `depends`/`pxdfiles` since cython will take care of that for us).
I agree, though I don't think it's a big deal to simply include it. Having too fine-grained dependencies is a problem too.
pandas/_libs/tslibs/timedeltas.pyx
Outdated
"float.".format(type(v)))


cdef _timedelta_value_kwargs(dict kwargs):
you pulled these out in the previous PR, is it more clear to put them back (into the constructor)?
I'm OK either way. It seemed like an appropriate scope for a function, as the `__new__` method was a bit ungainly.
don't change things like this now; propose it first, later (I don't think it's a good idea)
put this back and i think can merge
pandas/_libs/tslibs/timedeltas.pyx
Outdated
elif PyDelta_Check(other):
    ots = Timedelta(other)
else:
    ndim = getattr(other, _NDIM_STRING, -1)
is this the only use?
of `_NDIM_STRING`? Here, yes. I think it's used in Timestamp too. Not totally sure what the benefit is.
yeah this seems kind of silly
fix this too
pandas/_libs/tslibs/timedeltas.pyx
Outdated
cdef _Timedelta td_base

if value is _no_input:
    value = _timedelta_value_kwargs(kwargs)
e.g. here
If we can push this through we can port most of Timestamp to its own module. Along with #18086 we can then get the rest.
Get rid of the matching one in |
pandas/_libs/tslib.pxd
Outdated
@@ -1,8 +1,8 @@
from numpy cimport ndarray, int64_t

from tslibs.conversion cimport convert_to_tsobject
from tslibs.timedeltas cimport convert_to_timedelta64
these shouldn’t be here
if something wants to import them they should directly cimport from conversion or timedelta and not tslib
thanks @jbrockmendel making nice progress!
rebase any prs that touch this code
Good teamwork; this was a big one.
@@ -13,7 +13,7 @@
from pandas.tseries.frequencies import to_offset, is_subperiod, is_superperiod
from pandas.core.indexes.datetimes import DatetimeIndex, date_range
from pandas.core.indexes.timedeltas import TimedeltaIndex
from pandas.tseries.offsets import DateOffset, Tick, Day, _delta_to_nanoseconds
from pandas.tseries.offsets import DateOffset, Tick, Day, delta_to_nanoseconds
pandas.tseries.offsets is a public-ish module. In general I would not add non-underscored internal methods to that namespace like delta_to_nanoseconds
Yah, if nothing else `delta_to_nanoseconds` should be imported directly from tslibs.timedelta. I'll put something together.
On second thought, do we need to obscure the name if it isn't in `__all__`?
Well, tab completion (= for me my interaction with the API) does not take into account `__all__`, I think.
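A quick way to see the distinction, as a hedged sketch: `__all__` governs `from module import *`, while `dir()` (which completers such as the standard library's `rlcompleter` build on) still lists every non-underscored name. The module name `offsets_sketch` and its contents are hypothetical stand-ins, not the real `pandas.tseries.offsets`.

```python
import sys
import types

# Build a throwaway module that imitates a public-ish namespace.
mod = types.ModuleType("offsets_sketch")
exec(
    "__all__ = ['DateOffset']\n"
    "class DateOffset:\n"
    "    pass\n"
    "def delta_to_nanoseconds(delta):\n"
    "    return delta\n",
    mod.__dict__,
)
sys.modules["offsets_sketch"] = mod

# Names pulled in by a star-import: restricted by __all__.
star_ns = {}
exec("from offsets_sketch import *", star_ns)
star_names = {name for name in star_ns if not name.startswith("__")}

# Names offered by dir(): __all__ plays no role here.
dir_names = {name for name in dir(mod) if not name.startswith("_")}
```

Here `star_names` contains only `DateOffset`, while `dir_names` also contains `delta_to_nanoseconds`, which is the behaviour the comment above is worried about.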
git diff upstream/master -u -- "*.py" | flake8 --diff