Compatibility with pandas 0.18 #18

Closed
philipstarkey opened this issue Aug 22, 2016 · 4 comments

Comments

@philipstarkey
Contributor

Original report (archived issue) by Russell Anderson (Bitbucket: rpanderson, GitHub: rpanderson).


Since pandas 0.18, adding shots to the lyse GUI fails with:

#!python

Traceback (most recent call last):
  File "C:\Anaconda\lib\threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "C:\labscript_suite\lyse\__main__.py", line 1523, in incoming_buffer_loop
    self.shots_model.add_files(filepaths, new_row_data)
  File "C:\Anaconda\lib\site-packages\qtutils\invoke_in_main.py", line 114, in f
    return inmain(fn, *args, **kwargs)
  File "C:\Anaconda\lib\site-packages\qtutils\invoke_in_main.py", line 74, in inmain
    return get_inmain_result(in_main_later(fn,False,*args,**kwargs))
  File "C:\Anaconda\lib\site-packages\qtutils\invoke_in_main.py", line 94, in get_inmain_result
    exec('raise type, value, traceback')
  File "C:\Anaconda\lib\site-packages\qtutils\invoke_in_main.py", line 53, in event
    result = event.fn(*event.args, **event.kwargs)
  File "C:\labscript_suite\lyse\__main__.py", line 1335, in add_files
    self.dataframe = concat_with_padding(self.dataframe, new_row_data)
  File "C:\labscript_suite\lyse\dataframe_utilities.py", line 144, in concat_with_padding
    return pandas.concat(dataframes, ignore_index=True)
  File "C:\Anaconda\lib\site-packages\pandas\tools\merge.py", line 846, in concat
    return op.get_result()
  File "C:\Anaconda\lib\site-packages\pandas\tools\merge.py", line 1038, in get_result
    copy=self.copy)
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 4545, in concatenate_block_managers
    for placement, join_units in concat_plan]
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 4642, in concatenate_join_units
    for ju in join_units]
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 4915, in get_reindexed_values
    missing_arr = np.empty(self.shape, dtype=empty_dtype)
TypeError: data type not understood
@philipstarkey
Contributor Author

Original comment by Russell Anderson (Bitbucket: rpanderson, GitHub: rpanderson).


The error arises in concat_with_padding, which attempts to concatenate an initially empty DataFrame with the first non-empty DataFrame of added shots. Specifically, it is triggered by columns containing timezone-aware datetimes, e.g. the run time column.

Minimal breaking example (pandas 0.18.1, numpy 1.11.0):

#!python
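import pandas as pd

# Empty DataFrame with its columns specified, plus one row with a timezone-aware 'run time'
df1 = pd.DataFrame(columns=['filepath'])
df2 = pd.DataFrame(data=[['C:\\test.h5', pd.Timestamp('2016-08-18 16:04:59+1000', tz='Australia/Sydney')]],
                   columns=['filepath', 'run time'])
pd.concat([df1, df2], ignore_index=True)  # TypeError on pandas 0.18.1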

This fails as above, at the call to np.empty. Explicitly,

#!python
In [151]: df2.dtypes[1]
Out[151]: datetime64[ns, Australia/Sydney]

In [152]: np.empty((0, 1), df2.dtypes[1])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
C:\labscript_suite\lyse\dataframe_utilities.py in <module>()
----> 1 np.empty((0, 1), df2.dtypes[1])

TypeError: data type not understood

The error does not occur for naive datetimes.
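
To make the naive versus timezone-aware distinction concrete, here is a minimal sketch that probes whether NumPy can allocate an empty array for a given column dtype, which is the step that fails in the traceback. The helper name `numpy_understands` is hypothetical and not part of lyse, pandas, or NumPy. The expectation is that the naive dtype is accepted and the timezone-aware one is rejected, since the latter is a pandas extension dtype rather than a NumPy dtype.

```python
import numpy as np
import pandas as pd


def numpy_understands(dtype):
    """Return True if NumPy can allocate an empty array of this dtype.

    Hypothetical helper for illustration only.
    """
    try:
        np.empty((0, 1), dtype=dtype)
    except TypeError:
        # NumPy cannot interpret pandas extension dtypes such as
        # datetime64[ns, Australia/Sydney].
        return False
    return True


# One naive and one timezone-aware datetime column
naive = pd.Series([pd.Timestamp('2016-08-18 16:04:59')])
aware = pd.Series([pd.Timestamp('2016-08-18 16:04:59', tz='Australia/Sydney')])

print(naive.dtype, numpy_understands(naive.dtype))
print(aware.dtype, numpy_understands(aware.dtype))
```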

If the columns of the empty DataFrame are not specified, there is no such problem, i.e. the following works.

#!python
import pandas as pd

# Empty DataFrame with no columns specified: concatenation succeeds
df1 = pd.DataFrame()
df2 = pd.DataFrame(data=[['C:\\test.h5', pd.Timestamp('2016-08-18 16:04:59+1000', tz='Australia/Sydney')]],
                   columns=['filepath', 'run time'])
pd.concat([df1, df2], ignore_index=True)

@philipstarkey
Contributor Author

Original comment by Russell Anderson (Bitbucket: rpanderson, GitHub: rpanderson).


This is resolved by only concatenating non-empty DataFrames, as per pull request #5.
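
A rough sketch of the idea, for readers who do not have the pull request to hand. This is not the actual lyse implementation: the function name `concat_non_empty` is hypothetical, and the padding logic of concat_with_padding is left out. It only shows empty DataFrames being filtered out before the call to pandas.concat.

```python
import pandas as pd


def concat_non_empty(dataframes):
    """Concatenate only the non-empty DataFrames in the given list.

    Hypothetical helper for illustration; not the lyse code.
    """
    non_empty = [df for df in dataframes if not df.empty]
    if not non_empty:
        # Nothing to concatenate: return a copy of the first frame (keeping
        # its columns), or a bare DataFrame if the list itself is empty.
        return dataframes[0].copy() if dataframes else pd.DataFrame()
    return pd.concat(non_empty, ignore_index=True)


# The failing case from the report: empty frame with a declared column,
# followed by a row containing a timezone-aware datetime.
df1 = pd.DataFrame(columns=['filepath'])
df2 = pd.DataFrame(
    data=[['C:\\test.h5', pd.Timestamp('2016-08-18 16:04:59', tz='Australia/Sydney')]],
    columns=['filepath', 'run time'],
)
result = concat_non_empty([df1, df2])
print(result.dtypes)
```

Because the empty DataFrame never reaches pandas.concat, pandas has no need to build a placeholder array for the timezone-aware column, which is the step that raised the TypeError.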

@philipstarkey
Contributor Author

Original comment by Russell Anderson (Bitbucket: rpanderson, GitHub: rpanderson).


Looks related to these pandas bugs:

pandas-dev/pandas#12985

pandas-dev/pandas#12244

@philipstarkey
Contributor Author

Original comment by Russell Anderson (Bitbucket: rpanderson, GitHub: rpanderson).


  • changed state from "new" to "resolved"

Fixes issue #18, where adding shots to lyse failed with pandas >= 0.18.
concat_with_padding now only tries to concatenate non-empty DataFrames.

Modified pandas requirement accordingly, with no upper limit on version.
Modified labscript_utils requirement to allow above version specification.

→ <<cset f1b822e>>
