API: should only Area/Location time-zone-identifiers (other than UTC) be allowed? #53250

Closed
@MarcoGorelli

Currently, it's possible to set all kinds of time zones, such as:

In [7]: to_datetime(['2020-01-01']).tz_localize('+01:00')
Out[7]: DatetimeIndex(['2020-01-01 00:00:00+01:00'], dtype='datetime64[ns, UTC+01:00]', freq=None)

In [10]: to_datetime(['2020-01-01']).tz_localize('CET')
Out[10]: DatetimeIndex(['2020-01-01 00:00:00+01:00'], dtype='datetime64[ns, CET]', freq=None)

In [9]: to_datetime(['2020-01-01']).tz_localize('Cuba')
Out[9]: DatetimeIndex(['2020-01-01 00:00:00-05:00'], dtype='datetime64[ns, Cuba]', freq=None)

In https://en.wikipedia.org/wiki/List_of_tz_database_time_zones, it's recommended that people use an 'Area/Location' time zone identifier instead - e.g. 'Africa/Lagos' instead of the first, 'Europe/Paris' instead of the second, and 'America/Havana' instead of the third.

Passing non-Area/Location tz-identifiers exposes people to common misconceptions and traps about time zones. For example, although Greenwich is in London, London does not observe GMT all year round (only for roughly half the year, in winter).
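To make the London/GMT trap concrete, here is a minimal sketch using only the standard library's `zoneinfo` (not pandas). The helper name `utc_offset_hours` is made up for illustration, and the sketch assumes the IANA tz database is available on the system.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo


def utc_offset_hours(tz_name: str, when: datetime) -> float:
    """Return the UTC offset (in hours) that `tz_name` observes at the
    naive local wall-clock time `when`. Hypothetical helper, for illustration."""
    offset = when.replace(tzinfo=ZoneInfo(tz_name)).utcoffset()
    return offset / timedelta(hours=1)


winter = datetime(2020, 1, 1, 12, 0)
summer = datetime(2020, 7, 1, 12, 0)

# 'Europe/London' follows the local rules, including daylight saving time.
london_winter = utc_offset_hours("Europe/London", winter)
london_summer = utc_offset_hours("Europe/London", summer)

# 'GMT' is a fixed zero offset all year, so it drifts away from London in summer.
gmt_summer = utc_offset_hours("GMT", summer)

print(london_winter, london_summer, gmt_summer)
```

Someone who localizes London data with 'GMT' would therefore be an hour off for every summer timestamp, which is the kind of mistake an Area/Location identifier avoids.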

In https://en.wikipedia.org/wiki/List_of_tz_database_time_zones, every single tz-identifier which isn't in the 'Area/Location' format links to one which is, suggesting that the linked identifier be used instead.

Would it be safe to make such a restriction?
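As a rough sketch of what such a restriction could look like, the check below accepts 'UTC' plus names whose first component is a geographic area. The function `is_area_location` and the `AREAS` set are assumptions made up for this illustration; they are not part of pandas, and the real rule (if adopted) might well differ.

```python
# Hypothetical allow-list of geographic areas used as the first path
# component of Area/Location identifiers (an assumption for this sketch).
AREAS = frozenset(
    {
        "Africa", "America", "Antarctica", "Arctic", "Asia",
        "Atlantic", "Australia", "Europe", "Indian", "Pacific",
    }
)


def is_area_location(name: str) -> bool:
    """Return True for 'UTC' or for names shaped like 'Area/Location'.

    Illustrative only: this is a purely syntactic check and does not
    confirm that the name exists in the tz database.
    """
    if name == "UTC":
        return True
    # Split on the first '/', so 'America/Argentina/Buenos_Aires' also passes.
    area, sep, location = name.partition("/")
    return bool(sep) and area in AREAS and bool(location)


for candidate in ["Europe/Paris", "America/Havana", "UTC", "+01:00", "CET", "Cuba", "US/Eastern"]:
    print(candidate, is_area_location(candidate))
```

Checking the area against an allow-list rather than just looking for a slash matters here, because names such as 'US/Eastern' or 'Etc/GMT+1' contain a slash without being Area/Location identifiers in the sense discussed above.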

cc @mroeschke @jbrockmendel @pganssle @rebecca-palmer (sorry for the pings, would really value your input here if possible!)


This would go hand-in-hand with #50887. What we'd get to in the end would be:

Current behaviour (pandas 2.0.1):

In [11]: to_datetime(['2020-01-01 00:00+01:00'])
Out[11]: DatetimeIndex(['2020-01-01 00:00:00+01:00'], dtype='datetime64[ns, UTC+01:00]', freq=None)

In [12]: to_datetime(['2020-01-01 00:00+01:00']).tz_convert('+02:00')
Out[12]: DatetimeIndex(['2020-01-01 01:00:00+02:00'], dtype='datetime64[ns, UTC+02:00]', freq=None)

New behaviour (pandas 3.x):

In [11]: to_datetime(['2020-01-01 00:00+01:00'])
Out[11]: DatetimeIndex(['2019-12-31 23:00:00+00:00'], dtype='datetime64[ns, UTC]', freq=None)

In [12]: to_datetime(['2020-01-01 00:00+01:00']).tz_convert('+02:00')
UnknownTimeZoneError: 'Please use Area/Location time-zone-identifier, see https://en.wikipedia.org/wiki/List_of_tz_database_time_zones'

In [13]: to_datetime(['2020-01-01 00:00+01:00']).tz_convert('Europe/Athens')
Out[13]: DatetimeIndex(['2020-01-01 01:00:00+02:00'], dtype='datetime64[ns, Europe/Athens]', freq=None)
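The arithmetic behind the proposed outputs can be checked with the standard library alone. This is only a sketch of the conversions involved (offset string to UTC, then UTC to an Area/Location zone), not of how pandas would implement them, and it assumes the IANA tz database is available to `zoneinfo`.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Parse a timestamp that carries a fixed +01:00 offset.
parsed = datetime.fromisoformat("2020-01-01 00:00+01:00")

# Proposed In [11]: the offset is resolved and the result is held as UTC.
as_utc = parsed.astimezone(timezone.utc)

# Proposed In [13]: converting to an Area/Location zone gives the local time.
in_athens = as_utc.astimezone(ZoneInfo("Europe/Athens"))

print(as_utc.isoformat())     # 2019-12-31T23:00:00+00:00
print(in_athens.isoformat())  # 2020-01-01T01:00:00+02:00
```

All three values denote the same instant; only the attached zone information differs.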
