Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: support for reading and writing datetimes with timezones #253
ENH: support for reading and writing datetimes with timezones #253
Changes from 31 commits
e075091
3df7936
d68b473
9aa5a8c
1a2af4d
fbd2898
127d0a7
016778a
faa0631
a8c200e
6047375
6061563
3ba42cf
d983140
e9993bd
b6ca5cf
05cc1cf
a78a76c
b681656
3426fdc
bb6fd4e
87419ac
fc78bd9
5fab348
26c403a
ebdb71b
f46e716
44686f9
6b946f5
ec16ed3
da0639a
85a67c2
d96d67e
3eb70dc
c37c1ed
4064f25
3df12c0
8fd30a5
55293c0
8040c21
fccc8fb
200cc1d
ebfc01c
c8c186a
95030c0
c5c272b
086e52e
2b2dd5f
2167d0f
ab0fbf6
e3f4d6a
0f02115
52a922d
7c99e51
a5f5f9d
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tz_as_datetime.utcoffset()
is permitted to return None, perhaps it makes sense to check for that explicitly. I believe that given the way this is supplied using to_pydatetime()
that shouldn't happen, but there could be edge cases I'm not aware of.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This first attempt will already raise a warning with the latest pandas, so we will need to catch that warning:
(I have to check if the warning is actually wrong about the future behaviour of raising a warning. I would assume the deprecation would warn for an error in the future. In which case we should also already start catching that, I think)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I noticed this a while back in the geopandas equivalent test, so we should fix there too.
Based on the discussion in pandas-dev/pandas#50887 and pandas-dev/pandas#54014, it does seem like the error is supposed to read "will raise an error" rather than warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slight preference to prefix params specifically intended to pass things down to GDAL as
gdal_
, and since the only thing being stored here is the GDAL offset rather than other metadata, this rename seems a bit more clear (provided I follow correctly).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you do follow correctly, the new name makes more sense. I think the name I had predates me passing down the offset only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is probably a bit verbose right now.
Basic idea of this is that I can't seem to find a way to produce a series/ numpy array of timezone offsets in a vectorised way through pandas.
Best I could seem to do is that
ser.array.to_pydatetime()
uses the cython functionints_to_pydatetime
. Note the DateTimeArray is marked as experimental and doesn't actually document any methods on the website which isn't great. The equivalent Series / DateTimeIndex method is another option, but it throws unavoidable userwarnings for changing in behaviour.This seems wasteful in terms of duplication (basically the same datetime information twice), but seemed better to pass this down and compute the offsets in cython (but I haven't profiled it either).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A potentially faster alternative:
Although it has some more steps, it's vectorized and avoids creating python datetime.datetime objects, and based on a quick test is much faster. And I think it should give the same result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks much nicer! We can also do the conversion into the gdal representation directly rather than in cython. I've pushed a commit which should implement this - tested locally and it looks like it works, but my local setup doesn't have the best version of GDAL to test against. I will do some performance comparisons but haven't got to that yet.