Pickle incompatibility between 0.25 and 1.0 when saving a MultiIndex dataframe #34535
Comments
@WillAyd #29335 (comment) suggested that @mspacek open a new issue, but I did not see one. I have run into the same problem.
Are there any current efforts to fix this issue, or is the recommended approach to find a workaround?
You would have to show an actual example. pd.read_pickle is the way to load, as it ensures compatibility (pickle.load will not work). We test loading older pickles explicitly, so this is likely your setup.
Thanks, @jreback. The issue is of course related to having containers that contain dataframes. Here is an example with dataclasses. Run it first with pandas==0.25.2 and then with 1.1 to reproduce.
Is there a way to modify the container to share the logic used by |
You should simply use pd.read_pickle.
@jreback: What exactly do you mean by this? The example above is a toy example showing the problem. In our systems we have a computational graph where each node can itself be a computational graph. The primary building blocks of the graph may contain dataframes, and what we do now is pickle these computational graphs so that we can run them elsewhere.
Exactly what I said: pd.read_pickle does that.
This seems like a valid use case: dataframes are part of a larger container that is pickled as a whole. In this case pd.read_pickle does not apply, although one can work around this by defining a new class that holds the dataframe and whose __setstate__ calls pd.read_pickle. I'd hope that pickle.load of a pandas dataframe would include some backward compatibility support like that.
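A minimal sketch of the wrapper workaround described in this comment. `FrameHolder` is a hypothetical name, and the sketch assumes a pandas version whose `pd.read_pickle` accepts a file-like object. It only helps for pickles written after the wrapper is introduced. Containers already pickled with a bare dataframe inside are not changed by it.

```python
import io
import pickle

import pandas as pd


class FrameHolder:
    """Hypothetical wrapper: stores its DataFrame as an opaque bytes payload."""

    def __init__(self, frame):
        self.frame = frame

    def __getstate__(self):
        # Pickle the frame on its own, so the outer pickle only sees bytes.
        return {"payload": pickle.dumps(self.frame)}

    def __setstate__(self, state):
        # Route the payload through pd.read_pickle instead of plain pickle.load.
        self.frame = pd.read_pickle(io.BytesIO(state["payload"]))


index = pd.MultiIndex.from_tuples([("a", 1), ("b", 2)], names=["key", "num"])
frame = pd.DataFrame({"x": [1.0, 2.0]}, index=index)

# The container is pickled as a whole with the standard library.
graph = {"node": FrameHolder(frame)}
restored = pickle.loads(pickle.dumps(graph))
```

Because the outer pickle contains only bytes for the frame, loading the container with plain `pickle.load` does not need to resolve any pandas-internal class names itself. That step is deferred to `pd.read_pickle` inside `__setstate__`.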
Corrects brain-score#11. I encountered a FrozenIndices error when trying to load a pandas dataframe after updating my pandas version. This commit introduces backward compatibility, i.e. you are fine if you pickle.dump() using an old pandas version and then load using a new pandas version (pandas-dev/pandas#34535). pickle.load() is still called for all non-DataFrame objects; see https://github.com/pandas-dev/pandas/blob/f2c8480af2f25efdbd803218b9d87980f416563e/pandas/io/pickle.py#L203
@jreback Ya, the idea that one only ever loads isolated pandas frames is quite simplified. As @veneto-maggio said, a pandas frame is often part of a larger pickle file, so direct support for pickle is the most general solution.
Old issue, I know, but if anyone else finds this by googling: pd.read_pickle will handle any pickled object, not just pickled DataFrames! I believe that is what jreback is referring to.
This seems to have caused the problem described here:
https://stackoverflow.com/questions/61641738/pandas-1-0-cannot-pickle-load-dict-containing-dataframe-with-multiindex
which I'm now also experiencing. I dumped a MultiIndex dataframe containing ndarrays to a pickle on disk under pandas 0.25.x in Python 3.6, and now I'm getting:
AttributeError: Can't get attribute 'FrozenNDArray' on <module 'pandas.core.indexes.frozen'
when trying to load it in pandas 1.0.3 (still on Python 3.6). Any suggestions/workarounds? Should I instead open up a new issue?
This solved issue seems related, but is for Python 2.7: #31988
This comment in the original rationale for getting rid of FrozenNDArray mentions pandas.compat.pickle_compat.py, which seems relevant: #9031 (comment)
Originally posted by @mspacek in #29335 (comment)
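For readers wondering how a compatibility loader can get past an error like the `Can't get attribute 'FrozenNDArray'` one above, here is a stdlib-only sketch of the general technique: subclass `pickle.Unpickler` and override `find_class` to redirect names that no longer exist. This is not pandas' actual code. `legacy_mod`, `OldPoint`, `NewPoint`, `RENAMED`, `CompatUnpickler`, `compat_loads` and `make_legacy_pickle` are all hypothetical names made up for the illustration.

```python
import io
import pickle
import sys
import types


class NewPoint:
    """Stands in for the class that exists in the 'new' library version."""


# Hypothetical table: (module, name) recorded in old pickles -> replacement class.
RENAMED = {("legacy_mod", "OldPoint"): NewPoint}


class CompatUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect known-removed names; defer everything else to the default lookup.
        replacement = RENAMED.get((module, name))
        if replacement is not None:
            return replacement
        return super().find_class(module, name)


def compat_loads(data):
    """Unpickle bytes, remapping removed classes along the way."""
    return CompatUnpickler(io.BytesIO(data)).load()


def make_legacy_pickle():
    """Pickle an object whose class is then removed, imitating a library upgrade."""
    module = types.ModuleType("legacy_mod")
    old_cls = type("OldPoint", (), {"__module__": "legacy_mod"})
    module.OldPoint = old_cls
    sys.modules["legacy_mod"] = module
    obj = old_cls()
    obj.x, obj.y = 1, 2
    data = pickle.dumps({"node": obj})
    del module.OldPoint  # the module survives, the class is gone
    return data


data = make_legacy_pickle()
point = compat_loads(data)["node"]  # plain pickle.loads(data) raises AttributeError
```

The remapping applies to every object in the pickle stream, so it also covers a class buried inside a dict or another container, which is the situation discussed in this thread.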