Separate _TSObject into conversion #18060
Codecov Report
@@ Coverage Diff @@
## master #18060 +/- ##
==========================================
- Coverage 91.25% 91.23% -0.02%
==========================================
Files 163 163
Lines 50115 50115
==========================================
- Hits 45730 45721 -9
- Misses 4385 4394 +9
Codecov Report
@@ Coverage Diff @@
## master #18060 +/- ##
==========================================
- Coverage 91.25% 91.23% -0.02%
==========================================
Files 163 163
Lines 50120 50120
==========================================
- Hits 45737 45728 -9
- Misses 4383 4392 +9
+1 for refactoring and logic decoupling, but let's make sure performance didn't take a hit by accident. (FYI, if you do plan on doing any more of these types of PRs, I would suggest you provide benchmark results from the get-go.)
pandas/_libs/tslibs/conversion.pyx
# TODO: We can type the input as a np.datetime64 right?
# and the output as an int64_t?
yes
add a doc-string
pandas/_libs/tslibs/conversion.pyx
@@ -48,13 +91,227 @@ cdef class _TSObject:
        return self.value


# helper to extract datetime and int64 from several different possibilities
I would rather you integrate the comments with the doc-strings; in this case the comment is not useful, as the doc-string is pretty good.
No big deal, but the comment was cut/pasted from the original. Will change.
pandas/_libs/tslibs/conversion.pyx
cdef _TSObject convert_str_to_tsobject(object ts, object tz, object unit,
                                       bint dayfirst=False,
                                       bint yearfirst=False):
    """ ts must be a string """
can you update doc-string
pandas/_libs/tslibs/conversion.pyx
if unit != PANDAS_FR_ns:
    pandas_datetime_to_datetimestruct(ival, unit, &dts)
    check_dts_bounds(&dts)
    return dtstruct_to_dt64(&dts)
ival = dtstruct_to_dt64(&dts)
return ival
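To make the branch under discussion easier to follow, here is a rough pure-Python analogue of converting a non-nanosecond integer timestamp to nanoseconds with a bounds check. It is only a sketch: `to_nanos`, `NANOS_PER_UNIT`, and the use of `OverflowError` are hypothetical stand-ins chosen for illustration, not the Cython helpers (`pandas_datetime_to_datetimestruct`, `check_dts_bounds`, `dtstruct_to_dt64`) touched by this PR.

```python
# Hypothetical pure-Python sketch; not the Cython code from this PR.
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1

# Nanoseconds per unit for a few fixed-length datetime64 units (assumed subset).
NANOS_PER_UNIT = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3_600 * 1_000_000_000,
    "D": 86_400 * 1_000_000_000,
}


def to_nanos(ival: int, unit: str) -> int:
    """Convert a count of `unit` since the epoch to nanoseconds since the epoch."""
    if unit == "ns":
        # Already in nanoseconds: return the input unchanged.
        return ival
    nanos = ival * NANOS_PER_UNIT[unit]
    # Stand-in for the bounds check: the result must fit in a signed 64-bit integer.
    if not INT64_MIN <= nanos <= INT64_MAX:
        raise OverflowError(f"{ival} {unit} does not fit in int64 nanoseconds")
    return nanos
```

Assigning the converted value to a name before returning, as the review suggests, keeps both branches returning the same kind of value.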
pandas/_libs/tslibs/conversion.pyx
# ----------------------------------------------------------------------
# Misc Helpers

cdef inline bint _is_timestamp(datetime obj):
rename this to make it clear: is_instance_timestamp
if this is *only* used in that one place I would just in-line it directly.
if this is *only* used in that one place I would just in-line it directly.
Makes sense.
Just realized: recall recently there was an issue with Timestamp classmethods returning Timestamp instead of cls instances. The conclusion was that this could not be fixed because is_timestamp would break on subclasses. If this implementation is in fact equally performant, that would solve the problem.
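The subclass concern can be illustrated in plain Python. This is a sketch under assumptions: the `Timestamp` class below is a hypothetical stand-in (a bare `datetime` subclass), and the second check assumes the new helper amounts to "a datetime whose type is not exactly `datetime`"; the actual Cython in the PR may differ.

```python
from datetime import datetime


class Timestamp(datetime):
    """Hypothetical stand-in for pandas' Timestamp, which subclasses datetime."""


class MyTimestamp(Timestamp):
    """A user-defined subclass, as in the classmethod issue mentioned above."""


def is_exact_timestamp(obj) -> bool:
    # Exact type comparison: cheap, but rejects subclasses of Timestamp.
    return type(obj) is Timestamp


def is_not_plain_datetime(obj: datetime) -> bool:
    # Assumed alternative: anything typed as datetime that is not exactly
    # datetime counts as timestamp-like, so subclasses pass as well.
    return type(obj) is not datetime


plain = datetime(2017, 1, 1)
ts = Timestamp(2017, 1, 1)
sub = MyTimestamp(2017, 1, 1)

print(is_exact_timestamp(ts), is_exact_timestamp(sub))
print(is_not_plain_datetime(plain), is_not_plain_datetime(sub))
```

Both checks are a single type comparison, which is why comparable performance seems plausible; that still needs to be confirmed by benchmarks rather than assumed.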
pandas/_libs/tslibs/conversion.pyx
offset = get_utcoffset(obj.tzinfo, ts)
obj.value -= int(offset.total_seconds() * 1e9)

if _is_timestamp(ts):
comment here
pandas/_libs/tslibs/conversion.pyx
    check_dts_bounds(&dts)
    return dtstruct_to_dt64(&dts)
else:
    return ival
de-privatize
Note that I am suggesting these things not only for general cleanliness, but also because they prevent accidental imports; IOW things will immediately break because of the new name unless they are fixed.
…libs-conversion10
Write decent docstrings; de-privatize _get_datetime64_nanos; also remove the unnecessary unit input from convert_str_to_tsobject
Appveyor error looks like it can't import scipy; saw the same error in #18016 a few minutes ago.
|
Suggestions for the scipy import error on appveyor? |
@jreback : This looks like a |
Forgot to use affinity on the first run here:
Note master got updated to a new commit between runs here.
Given the excess variability caused by my local entropy field, this looks about unchanged. |
looks good. pls rebase |
…libs-conversion10
thanks! |
This is what we've been building towards folks.

convert_to_tsobject, convert_str_to_tsobject, and convert_datetime_to_tsobject are moved to tslibs.conversion. Concerns regarding _TSObjects can be considered separated.

This isn't quite cut/paste: convert_to_tsobject calls is_timestamp, which is not available in the namespace. Similarly, _localize_pydatetime checks isinstance(dt, Timestamp). So this re-implements private versions:

conversion._is_timestamp checks that its input has type datetime. IIUC this check should be as performant as tslib.is_timestamp. (moderate sized "if")

conversion._localize_pydatetime includes small optimizations that are available because we know the context in which it is being called. Specifically, 1) it can skip the Timestamp case since any nanos get dropped in pydatetime_to_dt64, 2) the tz arg is typed as a tzinfo object, and 3) it skips an unnecessary None check.

Other notes:

_get_datetime64_nanos is called by convert_to_tsobject, so it also had to be moved. It fits well with the "conversion" theme.