Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: ENH: Support ZoneInfo timezones #40135

Closed

Conversation

AlexKirko
Copy link
Member

@AlexKirko AlexKirko commented Mar 1, 2021

WIP for #37654. Starting from scratch on this one. I will mark this ready for a review when it gets the job done. Apart from refactoring the code to rely only on the public interface of tzinfo independent of the implementation, the main worry is performance. I'll make sure to benchmark appopriately.

Apologies for the long absence. Had some health issues, but they are, thankfully, pinned down now, and I have mostly recovered.


from pandas import (
Timestamp,
date_range,
)
import pandas._testing as tm

if PY39:
from zoneinfo import ZoneInfo
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's also a backport i think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel , thanks!

Added the backport. Since this is my first time messing with the dev requirements, please tell me if I missed something:

  • added backports.zoneinfo to environment.yml
  • ran the hook to mirror the addition in the main pip requirements
  • added it to all the CI pipeline yml files. For MacOS and numpy dev added it to the pip section, since it wasn't available on conda
  • added it separately to the py37_32bit setup script

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once you get this to work, would have to add this as a required dep for < 3.9 (which is prob ok to do).

Copy link
Member Author

@AlexKirko AlexKirko Mar 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback, hello! It certainly wouldn't hurt to add it, but it shouldn't be necessary if I manage to do it right. The idea was to use this issue to move away from relying on tz internals to get stuff done and to use only the public interface, which is the same between tzinfo implementations.

To be frank, last time I tried to tackle this, I got bogged down in the amount of rewriting that needs to get done with how barren base tzinfo is, but hopefully I'll have better luck this time. If not, and if we determine that this approach isn't viable, then we will need to add backports.tzinfo as a dependency and rely on zoneinfo specifics as another tzinfo implementation that we support.

But I know that thinking that we can drop all the internal stuff without an unacceptable drop in performance is optimistic.

def test_zoneinfo_support():
# GH 37654
# Ensure that TimeStamp supports ZoneInfo objects
tz = ZoneInfo("US/Pacific")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a suggestion, the tz_aware_fixture object should include a ZoneInfo object so it's exposed across the testing suite.

P.S. Glad to hear you're doing well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mroeschke Thank you! I'm very happy to feel better too.

Created the fixture, so that should make my future breaking the tz conversion and Timestamp much easier to debug.

Maybe you have any idea why this new test could be throwing an unraisable error, though? I'm a bit stumped, because in all other tests (or if I just open the console and execute the code), the error propagates through the call stack properly. I'll probably replace the test or cludge something together tomorrow, but I don't want to accumulate tech debt in case it's something less innocuous than it looks to be:

pandas/tests/tslibs/test_conversion.py::test_zoneinfo_support
  C:\Users\amatamune\Anaconda3\envs\pandas-dev\lib\site-packages\_pytest\unraisableexception.py:78: PytestUnraisableExceptionWarning: Exception ignored in: 'pandas._libs.tslibs.tzconversion.tz_convert_from_utc_single'

  Traceback (most recent call last):
    File "pandas\_libs\tslibs\timezones.pyx", line 263, in pandas._libs.tslibs.timezones.get_dst_info
      num = int(get_utcoffset(tz, None).total_seconds()) * 1_000_000_000
  AttributeError: 'NoneType' object has no attribute 'total_seconds'

    warnings.warn(pytest.PytestUnraisableExceptionWarning(msg))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any idea why this new test could be throwing an unraisable error, though?

get_dst_info is defined without an except? -1 clause, meaning that an exception raised inside of it is suppressed, instead printing the "ignored in [...]" and returning (uh, not sure what it returns, but its almost certainly not useful)

For troubleshooting, changing cdef object get_dst_info(tzinfo tz): to cdef object get_dst_info(tzinfo tz) except? -1: should get you a more pythonic failure

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thank you, that's super-useful!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably won't need that test anyway. Adding zoneinfo to fixture seems to take care of testing.

@jreback jreback added the Timezones Timezone data dtype label Mar 2, 2021

from pandas import (
Timestamp,
date_range,
)
import pandas._testing as tm

if PY39:
from zoneinfo import ZoneInfo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once you get this to work, would have to add this as a required dep for < 3.9 (which is prob ok to do).

@jreback jreback added the Dependencies Required and optional dependencies label Mar 2, 2021
@github-actions
Copy link
Contributor

github-actions bot commented Apr 4, 2021

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Apr 4, 2021
@@ -1081,6 +1087,7 @@ def iris(datapath):
timezone.utc,
timezone(timedelta(hours=1)),
timezone(timedelta(hours=-1), name="foo"),
ZoneInfo("US/Pacific"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

US/Pacific is a deprecated zone. America/Los_Angeles is the preferred key ID.

@simonjayhawkins
Copy link
Member

@AlexKirko closing as stale.

@AlexKirko AlexKirko deleted the tzinfo-only-public-interface branch January 22, 2022 10:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dependencies Required and optional dependencies Stale Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: support zoneinfo tzinfo objects
6 participants