Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: to_datetime - confusion with timeawarness #48678

Closed
3 tasks done
PrimozGodec opened this issue Sep 21, 2022 · 3 comments · Fixed by #48686
Closed
3 tasks done

BUG: to_datetime - confusion with timeawarness #48678

PrimozGodec opened this issue Sep 21, 2022 · 3 comments · Fixed by #48686
Labels
Datetime Datetime data dtype Regression Functionality that used to work in a prior pandas version
Milestone

Comments

@PrimozGodec
Copy link

PrimozGodec commented Sep 21, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

>>> import pandas as pd

# passes
>>> pd.to_datetime([pd.Timestamp("2022-01-01 00:00:00"), pd.Timestamp("1724-12-20 20:20:20+01:00")], utc=True)
# fails
>>> pd.to_datetime([pd.Timestamp("1724-12-20 20:20:20+01:00"), pd.Timestamp("2022-01-01 00:00:00")], utc=True)
Traceback (most recent call last):
  File "/Users/primoz/miniconda3/envs/orange3/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3398, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-69a49866a765>", line 1, in <cell line: 1>
    pd.to_datetime([pd.Timestamp("1724-12-20 20:20:20+01:00"), pd.Timestamp("2022-01-01 00:00:00")], utc=True)
  File "/Users/primoz/python-projects/pandas/pandas/core/tools/datetimes.py", line 1124, in to_datetime
    result = convert_listlike(argc, format)
  File "/Users/primoz/python-projects/pandas/pandas/core/tools/datetimes.py", line 439, in _convert_listlike_datetimes
    result, tz_parsed = objects_to_datetime64ns(
  File "/Users/primoz/python-projects/pandas/pandas/core/arrays/datetimes.py", line 2182, in objects_to_datetime64ns
    result, tz_parsed = tslib.array_to_datetime(
  File "pandas/_libs/tslib.pyx", line 428, in pandas._libs.tslib.array_to_datetime
  File "pandas/_libs/tslib.pyx", line 535, in pandas._libs.tslib.array_to_datetime
ValueError: Cannot mix tz-aware with tz-naive values

# the same examples with strings only also pass
>>> pd.to_datetime(["1724-12-20 20:20:20+01:00", "2022-01-01 00:00:00"], utc=True)

Issue Description

When calling to_datetime with pd.Timestamps in the list/series conversion fail when the datetime at position 0 is tz-aware and the other datetime is tz-naive. When datetimes are reordered the same example pass. The same example also pass when datetimes are strings.

So I am a bit confused about what is supported now. Should a case that fails work?
The error is only present in pandas==1.5.0 and in the main branch. I think it was introduced with this PR #47018.

Expected Behavior

I think to_datetime should have the same behavior in all cases.

Installed Versions

INSTALLED VERSIONS ------------------ commit : 87cfe4e python : 3.10.4.final.0 python-bits : 64 OS : Darwin OS-release : 21.6.0 Version : Darwin Kernel Version 21.6.0: Wed Aug 10 14:28:23 PDT 2022; root:xnu-8020.141.5~2/RELEASE_ARM64_T6000 machine : arm64 processor : arm byteorder : little LC_ALL : None LANG : None LOCALE : None.UTF-8 pandas : 1.5.0 numpy : 1.22.4 pytz : 2022.1 dateutil : 2.8.2 setuptools : 61.2.0 pip : 22.1.2 Cython : None pytest : None hypothesis : None sphinx : 5.1.1 blosc : None feather : None xlsxwriter : 3.0.3 lxml.etree : 4.9.1 html5lib : None pymysql : None psycopg2 : None jinja2 : 3.1.2 IPython : 8.4.0 pandas_datareader: None bs4 : 4.11.1 bottleneck : 1.3.5 brotli : None fastparquet : None fsspec : None gcsfs : None matplotlib : 3.5.2 numba : 0.56.0 numexpr : None odfpy : None openpyxl : 3.0.10 pandas_gbq : None pyarrow : None pyreadstat : None pyxlsb : None s3fs : None scipy : 1.9.1 snappy : None sqlalchemy : 1.4.39 tables : None tabulate : 0.8.10 xarray : None xlrd : 2.0.1 xlwt : None zstandard : None tzdata : 2022.2
@PrimozGodec PrimozGodec added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 21, 2022
@MarcoGorelli
Copy link
Member

MarcoGorelli commented Sep 21, 2022

I think it was introduced with this PR #47018.

Can confirm (though haven't looked any more closely than that yet), thanks for the report

0277a9c49f5efb1edcb0ebdba48123e2fda2a590 is the first bad commit
commit 0277a9c49f5efb1edcb0ebdba48123e2fda2a590
Author: jbrockmendel <jbrockmendel@gmail.com>
Date:   Sat May 21 12:47:14 2022 -0700

    REF: merge datetime_to_datetime64 into array_to_datetime (#47018)

 pandas/_libs/tslib.pyx                 | 28 ++++++++++---
 pandas/_libs/tslibs/conversion.pyi     |  3 --
 pandas/_libs/tslibs/conversion.pyx     | 74 ----------------------------------
 pandas/core/arrays/datetimes.py        |  8 ----
 pandas/core/tools/datetimes.py         | 72 ++++++++++++---------------------
 pandas/tests/tools/test_to_datetime.py |  9 +++++
 6 files changed, 58 insertions(+), 136 deletions(-)
bisect run success

@MarcoGorelli MarcoGorelli added Datetime Datetime data dtype Regression Functionality that used to work in a prior pandas version and removed Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 21, 2022
@mroeschke mroeschke added this to the 1.5.1 milestone Sep 21, 2022
@MarcoGorelli
Copy link
Member

Hey @PrimozGodec

Thanks for having reported this - there's now a release candidate for pandas 2.0.0, which you can install with

mamba install -c conda-forge/label/pandas_rc pandas==2.0.0rc0

or

python -m pip install --upgrade --pre pandas==2.0.0rc0

If you try it out and report bugs, then we can fix them before the final 2.0.0 release

@PrimozGodec
Copy link
Author

@MarcoGorelli, thank you for working on 2.0. For now I don't find any issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants