TL;DR: To unpickle a DataFrame that stores type objects, every type stored therein must be imported prior to calling pandas.read_pickle.
Our initial implementation of the Tracer would store raw type objects into the DataFrame.
This was advantageous for us, as every instance's real type, its location on disk for import-related tasks, and its base class information were made available to anyone who loaded the DataFrame.
We had tested this on multiple basic types, such as str and int, then went on to trace a numpy test, which produced instances of numpy.ndarray. At this stage, the DataFrame still remained loadable.
Satisfied with this, we declared that we could build upon this DataFrame format, and began implementing further features on top of it. Then, when working on an MRE for inline annotation generation that would use class names as the type hints, our DataFrame refused to load, stating that the given types could not be loaded.
As the TL;DR states, importing the relevant class prior to unpickling resolves the error.
However, this was not a viable solution for us.
The reason that even tracing the numpy test worked is that when we imported pandas for unpickling, numpy was imported as a dependency of pandas, which made numpy.ndarray accessible.
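The failure mode can be sketched without pandas or numpy, since pickle stores a class as a reference to its module and qualified name rather than as the class itself. The following is a minimal sketch under assumptions: the module name `traced_project` and the class `Widget` are hypothetical stand-ins for a class from the traced project, and pickling the bare type object stands in for pickling a DataFrame that contains it.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Hypothetical stand-in for a project module that is only importable
# while tracing, not in the process that later loads the pickle.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "traced_project.py"), "w") as f:
        f.write("class Widget:\n    pass\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    module = importlib.import_module("traced_project")

    # Pickling a type object records only "traced_project" and "Widget".
    payload = pickle.dumps(module.Widget)

    # Simulate the loading process: the module is neither imported
    # nor reachable through sys.path any more.
    sys.path.remove(tmp)
    del sys.modules["traced_project"]

importlib.invalidate_caches()
try:
    pickle.loads(payload)
    load_error = None
except ImportError as exc:
    # Unpickling tries to import the recorded module and fails.
    load_error = exc

print(type(load_error).__name__)
```

A type from numpy does not hit this path in the scenario above, because its module is already loaded by the time pandas is imported.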
We fixed this by storing the module and name of each type as strings within the DataFrame, which is sufficient to dynamically import the desired types from the project, the standard library, and virtualenvs.
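The string-based approach might look roughly like the sketch below. It is not the project's actual code: the helper names `type_to_strings` and `strings_to_type` are hypothetical, and plain dictionaries stand in for DataFrame rows so that the example runs with the standard library alone. The same two strings would be stored as two DataFrame columns.

```python
import collections
import fractions
import importlib
import pickle


def type_to_strings(tp):
    """Describe a type by its module and qualified name (hypothetical helper)."""
    return tp.__module__, tp.__qualname__


def strings_to_type(module_name, qualname):
    """Import the module on demand and walk the qualified name to the type."""
    obj = importlib.import_module(module_name)
    for part in qualname.split("."):
        obj = getattr(obj, part)
    return obj


traced_types = [int, collections.OrderedDict, fractions.Fraction]

# Each record holds only strings, so it can be pickled and loaded
# without any of the described types being imported first.
records = [
    dict(zip(("module", "name"), type_to_strings(tp))) for tp in traced_types
]
restored_records = pickle.loads(pickle.dumps(records))

# The types are resolved only when a consumer actually needs them.
resolved = [strings_to_type(r["module"], r["name"]) for r in restored_records]
print(resolved == traced_types)
```

Resolution still requires the module to be importable in the consuming process, but loading the data itself no longer depends on it.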
To reproduce: import pandas and numpy, store an object from numpy in the DataFrame, then pickle and store it. Then, import only pandas for the unpickling procedure.