BUG: Multiplication of two Series changes the timezone of the given Series #33671

Closed
eyjay-ok opened this issue Apr 20, 2020 · 7 comments · Fixed by #36503
Labels
Bug · Datetime (Datetime data dtype) · Numeric Operations (Arithmetic, Comparison, and Logical operations) · Timezones (Timezone data dtype)

Comments

@eyjay-ok

(since this is my first bug report I am happy for any feedback :) )

import pandas as pd

cet_idx = pd.date_range(start=pd.to_datetime('today').normalize(), periods=10, freq='15min', tz='CET')
utc_idx = cet_idx.tz_convert('utc')

cet_series = pd.Series(data=2, index=cet_idx)
utc_series = pd.Series(data=2, index=utc_idx)

print(cet_series.index.tz)
# prints: <DstTzInfo 'CET' CET+1:00:00 STD>

out_series = utc_series * cet_series   # The output is irrelevant

print(cet_series.index.tz)
# prints: <UTC>

Problem description

The multiplication changes the timezone of the index of 'cet_series'. Since 'cet_series' is only an input to the operation, its timezone should stay the same.

Expected Output

... # the same initialization as above

print(cet_series.index.tz)
# prints: <DstTzInfo 'CET' CET+1:00:00 STD>

utc_series * cet_series   # The output is irrelevant

print(cet_series.index.tz)
# prints: <DstTzInfo 'CET' CET+1:00:00 STD>

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.8.2.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.0-46-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 19.2.3
setuptools : 41.2.0
Cython : None
pytest : None
hypothesis : None
sphinx : 2.4.4
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.13.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pytest : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : 1.3.13
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None

eyjay-ok added the Bug and Needs Triage (Issue that has not been reviewed by a pandas team member) labels Apr 20, 2020
@TomAugspurger
Contributor

Thanks for the report. Confirmed on master.

@eyjay-ok do you know if this happened in pandas 0.25?

TomAugspurger added the Numeric Operations (Arithmetic, Comparison, and Logical operations), Datetime (Datetime data dtype), and Timezones (Timezone data dtype) labels and removed the Needs Triage (Issue that has not been reviewed by a pandas team member) label Apr 20, 2020
TomAugspurger added this to the Contributions Welcome milestone Apr 20, 2020
@eyjay-ok
Author

Yes, the behavior was the same in pandas 0.25.1.
I also tested pandas 0.24.0 and saw the same thing (the output was a bit different, but the timezone was still changed to UTC).

@mroeschke
Member

Thanks for the report.

This is the expected behavior. Please see https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#working-with-time-zones

Operations between Series in different time zones will yield UTC Series, aligning the data on the UTC timestamps:
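As an illustration of the documented behavior quoted above, here is a minimal sketch. The fixed start date and the values 2 and 3 are assumptions chosen for the example; they are not taken from the report.

```python
import pandas as pd

# Two indexes holding the same instants, expressed in different time zones.
cet_idx = pd.date_range("2020-04-20", periods=4, freq="15min", tz="CET")
utc_idx = cet_idx.tz_convert("UTC")

cet_series = pd.Series(2, index=cet_idx)
utc_series = pd.Series(3, index=utc_idx)

# The operands are aligned on their UTC timestamps, so every row matches
# and the result carries a UTC index.
product = utc_series * cet_series

print(product.index.tz)
print(product)
```

The UTC index on the result is the expected part. The question in this issue is what happens to the inputs.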

@TomAugspurger
Contributor

TomAugspurger commented Apr 20, 2020

I think the surprising behavior is that one of the operands (cet_series) is mutated so that its index changes from CET to UTC.
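One way to check for the mutation described here is to compare the operand's timezone before and after the operation. This is only a sketch: the fixed start date and the variable names tz_before, tz_after, and operand_mutated are assumptions for illustration, and the outcome depends on the installed pandas version.

```python
import pandas as pd

cet_idx = pd.date_range("2020-04-20", periods=4, freq="15min", tz="CET")
cet_series = pd.Series(2, index=cet_idx)
utc_series = pd.Series(2, index=cet_idx.tz_convert("UTC"))

tz_before = str(cet_series.index.tz)
_ = utc_series * cet_series  # only the side effect on the operands matters
tz_after = str(cet_series.index.tz)

# On the affected versions reported in this issue tz_after becomes "UTC";
# the expected behavior is that it stays "CET".
operand_mutated = tz_before != tz_after
print(tz_before, tz_after, operand_mutated)
```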

mroeschke reopened this Apr 20, 2020
@mroeschke
Member

Ah that is really weird behavior. Yes, that definitely looks like a bug. Sorry for reading the issue too quickly.

@mroeschke
Member

mroeschke commented Apr 20, 2020

This fixes the issue locally:

(pandas-dev) matthewroeschke:pandas-mroeschke matthewroeschke$ git diff
diff --git a/pandas/core/ops/__init__.py b/pandas/core/ops/__init__.py
index 9a7c9fdad..048eb6cc5 100644
--- a/pandas/core/ops/__init__.py
+++ b/pandas/core/ops/__init__.py
@@ -380,7 +380,7 @@ def _align_method_SERIES(left, right, align_asobject=False):
                 left = left.astype(object)
                 right = right.astype(object)

-            left, right = left.align(right, copy=False)
+            left, right = left.align(right, copy=True)

     return left, right

Not sure if always copying is necessary though. Maybe we should set copy=True for Extension Array indexes?

@phofl
Member

phofl commented Apr 20, 2020

Not sure if always copying is necessary though. Maybe we should set copy=True for Extension Array indexes?

We could set copy=True if the indices at this point (the else block):

https://github.com/pandas-dev/pandas/blob/master/pandas/core/generic.py#L8486-8491

are not equal. This works locally.
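The idea of copying only when the indexes differ can be sketched from user code as follows. The helper align_for_op is hypothetical and is not how pandas implements alignment internally; it only illustrates the condition discussed above, using the public Index.equals, Series.copy, and Series.align methods.

```python
import pandas as pd


def align_for_op(left: pd.Series, right: pd.Series):
    """Hypothetical helper (not pandas internals): align two Series for an
    arithmetic operation, copying only when the indexes differ."""
    if left.index.equals(right.index):
        # Indexes already match, so nothing is reindexed and no copy is made.
        return left, right
    # Indexes differ: align() reindexes both sides to a shared index, and
    # working on copies keeps the caller's objects out of that step.
    return left.copy().align(right.copy())


cet_idx = pd.date_range("2020-04-20", periods=4, freq="15min", tz="CET")
cet_series = pd.Series(2, index=cet_idx)
utc_series = pd.Series(2, index=cet_idx.tz_convert("UTC"))

same_left, same_right = align_for_op(cet_series, cet_series)
new_left, new_right = align_for_op(utc_series, cet_series)

print(cet_series.index.tz)  # the input keeps its own timezone
```

The trade-off is the one raised in the thread: the copy is paid for only in the branch where realignment happens, not on every operation.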

jreback modified the milestones: Contributions Welcome, 1.2 Sep 21, 2020