DataFrame groupby partially drops timezone info (to_csv, in notebook) #7622


Closed
Poquaruse opened this issue Jun 30, 2014 · 4 comments
Labels
Bug Timezones Timezone data dtype

Comments

@Poquaruse

Hi all,

I've encountered a problem with DataFrames, groupby and timezones.

import pandas as pd
import numpy as np

dt_rng = pd.date_range(start='2014-01-01 00:00', periods=1000, freq='1s', tz='Europe/Berlin')
df = pd.DataFrame({'a': np.random.randn(1000), 'b': np.random.randn(1000)}, index=dt_rng)
df['b'] = df['b'].round()
df.to_csv()

--> Timezones are shown in the CSV output, for example 2014-01-01 00:00:00+01:00

Now with resampling:

dt_rng = pd.date_range(start='2014-01-01 00:00', periods=1000, freq='1s', tz='Europe/Berlin')
df = pd.DataFrame({'a': np.random.randn(1000), 'b': np.random.randn(1000)}, index=dt_rng)
df['b'] = df['b'].round()
df.groupby(df['b']).resample('1min').to_csv()

--> The output shows, for example, 2013-12-31 23:01:00: no timezone info, not even a UTC marker.

However:

dt_rng = pd.date_range(start='2014-01-01 00:00', periods=1000, freq='1s', tz='Europe/Berlin')
df = pd.DataFrame({'a': np.random.randn(1000), 'b': np.random.randn(1000)}, index=dt_rng)
df['b'] = df['b'].round()
df.groupby(df['b']).resample('1min').index.levels[1]

shows: Timezone: Europe/Berlin

So the timezone info seems to be there, but it is not exported, even though it was exported before without resampling.

Any ideas?

Thanks and best regards

@jreback
Contributor

jreback commented Jun 30, 2014

Please post the output of pd.show_versions()

@Poquaruse
Author

Sorry, I forgot. Here it is:

INSTALLED VERSIONS

commit: None
python: 3.4.1.final.0
python-bits: 64
OS: Windows
OS-release: 8
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.14.0
nose: 1.3.3
Cython: 0.20.1
numpy: 1.8.1
scipy: 0.14.0
statsmodels: None
IPython: 2.1.0
sphinx: 1.2.2
patsy: 0.2.1
scikits.timeseries: None
dateutil: 2.1
pytz: 2014.3
bottleneck: None
tables: 3.1.1
numexpr: 2.3.1
matplotlib: 1.3.1
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: None
xlsxwriter: 0.5.5
lxml: 3.3.5
bs4: 4.3.1
html5lib: None
bq: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.4
pymysql: None
psycopg2: None

@jreback
Contributor

jreback commented Jun 30, 2014

This works in master; lots of bugs related to tz preservation are fixed for 0.14.1 (releasing soon).
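
For anyone re-checking this on a recent install, here is a minimal sketch of the same repro. It assumes a current pandas, where resample after groupby needs an explicit aggregation (mean() is used here, which as far as I know matches the old default). The seeded generator and the names resampled, ts_level and csv_text are only for illustration and are not from the original report.

```python
import numpy as np
import pandas as pd

# Seeded generator so the run is repeatable (an addition, not in the report).
rng = np.random.default_rng(0)

dt_rng = pd.date_range(start="2014-01-01 00:00", periods=1000, freq="1s", tz="Europe/Berlin")
df = pd.DataFrame(
    {"a": rng.standard_normal(1000), "b": rng.standard_normal(1000)},
    index=dt_rng,
)
df["b"] = df["b"].round()

# Group on the rounded column, then resample column "a" to 1-minute means.
resampled = df.groupby("b")["a"].resample("1min").mean()

# Level 1 of the resulting MultiIndex holds the resampled timestamps.
ts_level = resampled.index.levels[1]
print(ts_level.tz)

# With no path argument, to_csv() returns the CSV as a string.
csv_text = resampled.to_csv()
print(csv_text.splitlines()[1])  # first data row after the header
```

If the index level reports Europe/Berlin and the CSV rows carry a +01:00 offset, the timezone is preserved on export and the behaviour described in this issue is not present on that install.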

@Poquaruse
Author

Thanks for the heads up! :-)
