Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python][CI] Hypothesis nightly test fails with pytz.exceptions.UnknownTimeZoneError: 'build/etc/localtime' #36349

Closed
raulcd opened this issue Jun 28, 2023 · 4 comments · Fixed by #36391

Comments

@raulcd
Copy link
Member

raulcd commented Jun 28, 2023

Describe the bug, including details regarding any error messages, version, and platform.

During the last 5 days the test_python_datetime_with_pytz_timezone on test-conda-python-3.11-hypothesis has been failing with the following error:

>               raise UnknownTimeZoneError(zone)
E               pytz.exceptions.UnknownTimeZoneError: 'build/etc/localtime'
E               Falsifying example: test_python_datetime_with_pytz_timezone(
E                   self=<pyarrow.tests.test_pandas.TestConvertDateTimeLikeTypes object at 0x7f9626db7710>,
E                   tz=ZoneInfo('build/etc/localtime'),
E               )

I have been NOT able to reproduce locally with:
PYTHON=3.11 PYARROW_TEST_HYPOTHESIS="ON" PYTEST_ARGS="-m hypothesis" docker-compose run --rm conda-python-pandas

Component(s)

Continuous Integration, Python

@raulcd
Copy link
Member Author

raulcd commented Jun 28, 2023

@jorisvandenbossche any idea on that one? I am not sure why it tries to use build/etc/localtime as a timezone.

@jorisvandenbossche
Copy link
Member

No, I looked at the traceback a while ago (it's already failing longer than 5 days), and honestly no idea how this can happen or where this comes from ..

The test is using the hypothesis.extra.pytz.timezones() and hypothesis.strategies.timezones() strategies. But somehow this strategy can produce a tzinfo that gives a "build/etc/localtime" string representation. I wonder if that's a bug upstream, as such a zone doesn't sound to make sense.

@jorisvandenbossche
Copy link
Member

So actually, just trying that out, this string works for zoneinfo, and this is one of the "available" zones:

>>> import datetime
>>> import zoneinfo
>>> "build/etc/localtime" in zoneinfo.available_timezones()
True
>>> tz = zoneinfo.ZoneInfo("build/etc/localtime")
>>> tz
zoneinfo.ZoneInfo(key='build/etc/localtime')
>>> tz.key
'build/etc/localtime'
>>> tz.utcoffset(datetime.datetime.now())
datetime.timedelta(0)
>>> print(datetime.datetime.now().replace(tzinfo=tz))
2023-06-28 18:19:13.737991+00:00

So probably this timezone gets created by the zoneinfo hypothesis strategy, but we still always use pytz timezones for export (there is an issue about stopping to do this), and then pytz doesn't recognize is:

>>> pa.lib.tzinfo_to_string(tz)
'build/etc/localtime'
>>> pa.lib.string_to_tzinfo("build/etc/localtime")
...
UnknownTimeZoneError: 'build/etc/localtime'

which comes from

>>> pytz.timezone("build/etc/localtime")
>>>
UnknownTimeZoneError: 'build/etc/localtime'

I think the question is what we should do with such a 'build/etc/localtime' string though. The easy thing would be to check for this string and then create a zoneinfo timezone instead of pytz (in pa.lib.string_to_tzinfo). But is that the correct thing to do? Should we even allow such a timezone in the arrow type? I think our spec says that we allow either fixed offset strings or a name from the Olsen database. zoneinfo should be using an Olsen tzdata database, but is 'build/etc/localtime' part of that? And is that cross-platform? Can we somehow convert that in a proper name or fixed offset?

@jorisvandenbossche
Copy link
Member

So the "build/etc/localtime" is an actual file in my tzdata directory (base path from zoneinfo.TZPATH), and it's an actual TZif2 file (just like all other zones).
But for example pytz bundles its own copy of the timezone database, and in this directory there is no such "build" subdir.

And the tzdata package (https://pypi.org/project/tzdata) also doesn't include such a "build/etc/localtime" (because it can't know what the local time is of the machine it will get installed on).

Maybe the best path forward is to disallow this timezone? (although in general we don't really check if the timezone provided by the user is correct, i.e. is a fixed offset or valid Olsen zone name)

Or short term for this test, skip it in case the timezone is this 'build/etc/localtime'

@jorisvandenbossche jorisvandenbossche added this to the 13.0.0 milestone Jun 28, 2023
jorisvandenbossche added a commit to jorisvandenbossche/arrow that referenced this issue Jun 29, 2023
raulcd pushed a commit that referenced this issue Jun 30, 2023
…hypothesis tests (#36391)

### What changes are included in this PR?

Skip the test if the timezone is `zoneinfo.ZoneInfo(key='build/etc/localtime')`, because we don't support roundtripping such a timezone

* Closes: #36349

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Raúl Cumplido <raulcumplido@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment