instance doesn't use pickled __class__ when same-named class exists #243

Closed
piccolbo opened this issue Nov 1, 2017 · 13 comments

@piccolbo

piccolbo commented Nov 1, 2017

I was just reading @mmckerns's answer on SO and marveling at the power of dill. So I tried his first example, and to my surprise it fails exactly like pickle would, whereas @mmckerns states "Pickle would blow up on the above," which I think implies that dill doesn't. Then I found out that it fails in IPython but not in plain python. Does anyone know what the cause is? Are there known limitations when using dill with IPython?

In [59]: import dill
    ...: 
    ...: class Foo(object):
    ...:   def bar(self, x):
    ...:     return x+self.y       
    ...:   y = 1
    ...: 
    ...: f = Foo()
    ...: _Foo = dill.dumps(Foo)
    ...: _f = dill.dumps(f)
    ...: class Foo(object):
    ...:   def bar(self, x):
    ...:     return x*self.z  
    ...:   z = -1
    ...: f_ = dill.loads(_f)
    ...: f_.y
    ...: 
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-59-de94877455f9> in <module>()
     14   z = -1
     15 f_ = dill.loads(_f)
---> 16 f_.y

AttributeError: 'Foo' object has no attribute 'y'

I reported this on SO as a comment, but that thread seemed too far past its prime for anyone to pay attention, so I am reporting it here and cross-linking.

@piccolbo piccolbo changed the title AttributeError in @mmckerns own SO example AttributeError in @mmckerns own SO example in Ipython Nov 1, 2017
@mmckerns mmckerns changed the title AttributeError in @mmckerns own SO example in Ipython instance doesn't use pickled __class__ when in IPython Feb 26, 2018
@mmckerns
Member

mmckerns commented Feb 26, 2018

Still haven't gotten to this, but I wanted to make it clearer what is going on. If this works as expected from python but not in IPython, I'll have to investigate what IPython is doing that causes the stored __class__ to be ignored.

@mmckerns
Member

So, this works:

In [1]: import dill

In [2]: class Foo(object):
   ...:     def bar(self, x):
   ...:         return x+self.y
   ...:     y = 1
   ...: 

In [3]: f = Foo()

In [4]: _Foo = dill.dumps(Foo)

In [5]: _f = dill.dumps(f)

In [6]: del Foo

In [7]: del f

In [8]: f_ = dill.loads(_f)

In [9]: f_
Out[9]: <__main__.Foo at 0x1046bb7f0>

In [10]: f_.__class__
Out[10]: __main__.Foo

In [11]: f_.y
Out[11]: 1

In [12]: 

In [12]: class Foo(object):
    ...:     def bar(self, x):
    ...:         return x+self.z
    ...:     z = -1
    ...:     

In [13]: f_.y
Out[13]: 1

@mmckerns
Member

You can see (from the contents of _f below) that the original class Foo is indeed serialized with the instance f. However, the presence of a new class Foo in the global namespace causes the new Foo to become linked to the instance when _f is unpickled.

In [1]: import dill

In [2]: class Foo(object):
   ...:     def bar(self, x):
   ...:         return x+self.y
   ...:     y = 1
   ...: 

In [3]: f = Foo()

In [4]: _Foo = dill.dumps(Foo)

In [5]: _f = dill.dumps(f)

In [6]: del Foo, f

In [7]: class Foo(object):
   ...:     def bar(self, x):
   ...:         return x+self.z
   ...:     z = -1
   ...: 

In [8]: f_ = dill.loads(_f)

In [9]: f_.__class__ == Foo
Out[9]: True

In [10]: f_.y
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-c57fc71c226a> in <module>()
----> 1 f_.y

AttributeError: 'Foo' object has no attribute 'y'

In [11]: _f
Out[11]: b'\x80\x03cdill.dill\n_create_type\nq\x00(cdill.dill\n_load_type\nq\x01X\x04\x00\x00\x00typeq\x02\x85q\x03Rq\x04X\x03\x00\x00\x00Fooq\x05h\x01X\x06\x00\x00\x00objectq\x06\x85q\x07Rq\x08\x85q\t}q\n(X\n\x00\x00\x00__module__q\x0bX\x08\x00\x00\x00__main__q\x0cX\x03\x00\x00\x00barq\rcdill.dill\n_create_function\nq\x0e(h\x01X\x08\x00\x00\x00CodeTypeq\x0f\x85q\x10Rq\x11(K\x02K\x00K\x02K\x02KCC\n|\x01|\x00j\x00\x17\x00S\x00q\x12N\x85q\x13X\x01\x00\x00\x00yq\x14\x85q\x15X\x04\x00\x00\x00selfq\x16X\x01\x00\x00\x00xq\x17\x86q\x18X\x1e\x00\x00\x00<ipython-input-2-8e4875db3fbc>q\x19h\rK\x02C\x02\x00\x01q\x1a))tq\x1bRq\x1cc__builtin__\n__main__\nh\rNN}q\x1dtq\x1eRq\x1fh\x14K\x01X\x07\x00\x00\x00__doc__q NX\r\x00\x00\x00__slotnames__q!]q"utq#Rq$)\x81q%.'

In [12]: f_.z
Out[12]: -1

In [13]: del Foo

In [14]: f_.y
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-c57fc71c226a> in <module>()
----> 1 f_.y

AttributeError: 'Foo' object has no attribute 'y'

In [15]: f_ = dill.loads(_f)

In [16]: f_.y
Out[16]: 1

@piccolbo
Author

piccolbo commented Feb 27, 2018

Interesting, so you just delayed the redefinition of Foo and the problem goes away. Works for me too FWIW.

@mmckerns
Member

I'm just checking some different versions, and it doesn't seem to be either IPython-specific or version-specific. As long as there's a new definition for Foo after the dill.dumps and before the dill.loads, the pickled class will be ignored and the new class definition will be used. However, removing the new definition and reloading (i.e. calling dill.loads again) will use the stored version of Foo. This seems to be the case for all the python versions I tested (2.6, 2.7, 3.4, 3.5, 3.6), regardless of IPython. Obviously something changed (either in python or dill) some while ago since I added the example to SO. I'll look into how dill finds classes when it's deserializing...

@mmckerns mmckerns changed the title instance doesn't use pickled __class__ when in IPython instance doesn't use pickled __class__ when same-named class exists Feb 27, 2018
@mmckerns
Member

@piccolbo: yes, a serialized version of Foo is stored on the pickled instance of f in all cases. Apparently, when the new instance is being built, dill (or pickle underneath) decides to hook up the new Foo instead of the Foo that is stored on the pickled instance.
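
For comparison, here is a minimal sketch of how the standard-library pickle handles the same situation. It records only the class's module and name, so on load it resolves whatever class currently sits under that name. The module name `demo_mod` and the `make_foo` helper are made up for this illustration, and nothing here is dill code; dill, as noted above, additionally stores the class definition itself with the instance.

```python
import pickle
import sys
import types

# Hypothetical stand-in module, so the example does not depend on __main__.
demo_mod = types.ModuleType("demo_mod")
sys.modules["demo_mod"] = demo_mod


def make_foo(attr_name, attr_value):
    """Build a class named Foo inside demo_mod with one class attribute."""
    cls = type("Foo", (object,), {"__module__": "demo_mod", attr_name: attr_value})
    setattr(demo_mod, "Foo", cls)
    return cls


OldFoo = make_foo("y", 1)
payload = pickle.dumps(OldFoo())   # stores only a reference to demo_mod.Foo

NewFoo = make_foo("z", -1)         # same name, new definition
restored = pickle.loads(payload)   # the name is resolved at load time

print(type(restored) is NewFoo)    # is the instance linked to the new class?
print(hasattr(restored, "y"))      # does the old class attribute survive?
```

With plain pickle there is no stored class to fall back on, so the redefinition always wins.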

@piccolbo
Author

I can repro in python as well now, contrary to my original report. I may have upgraded all of the above in the meantime. Next time I will submit a reproducible docker image. ;)

@mmckerns
Member

Added an ignore setting in 16125f7. When ignore=True, dill does not attempt to look up the class (i.e. it relies only on stored classes). I've kept the default behavior, which overrides the stored class if a new class is found... however, now you can toggle this behavior to get what you expected.

Try your code again with dill.settings['ignore'] = True

There was indeed a time (a while ago) that ignore=True was the default behavior, but that was due to a bug... so it was unintentional.
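
The toggle described above can be pictured with a small stand-alone model. This is only a sketch of the described behavior, not dill's source: `relink`, `namespace`, and the two stand-in classes are hypothetical names, and in real use the lookup happens inside dill.loads.

```python
def relink(obj, namespace, ignore=False):
    """Hypothetical model of the lookup step that `ignore` switches off.

    With ignore=False, an object whose class name also exists in
    `namespace` is re-pointed at that class. With ignore=True, the
    class that came with the object is left alone.
    """
    if ignore:
        return obj
    candidate = namespace.get(type(obj).__name__)
    if isinstance(candidate, type):
        try:
            obj.__class__ = candidate
        except TypeError:
            pass  # incompatible layouts: keep the stored class
    return obj


# Stand-ins for the class stored in the pickle and for the redefined one.
StoredFoo = type("Foo", (object,), {"y": 1})
NewFoo = type("Foo", (object,), {"z": -1})
namespace = {"Foo": NewFoo}

default_result = relink(StoredFoo(), namespace)               # lookup enabled
ignored_result = relink(StoredFoo(), namespace, ignore=True)  # lookup skipped

print(type(default_result) is NewFoo, hasattr(default_result, "y"))
print(type(ignored_result) is StoredFoo, ignored_result.y)
```

In a real session the corresponding switch is dill.settings['ignore'] = True, as suggested above.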

@piccolbo
Author

Works for me too!

@mmckerns mmckerns added this to the dill-0.2.8 milestone Jun 22, 2018
@pranav9056

pranav9056 commented Jul 6, 2018

I want to save an instance of a class that is defined in a different file, so I dump the instance using dill.dump. If I make changes to the class and load the object again with ignore=True set, it refers to the new definition of the class. Am I missing something? (When my class is defined in the same file, it refers to the old definition of the class.)

@mmckerns
Member

mmckerns commented Jul 6, 2018

@pranav9056: please open a new issue, and provide a more detailed description (i.e. with some minimal code example)

@bariod

bariod commented Aug 16, 2019

I'm facing the same problem as @pranav9056 but with dill.dumps.
I can reproduce the SO example but not with a class defined in a file.

>>> import dill
>>> import importlib
>>> import test_file
>>> f = test_file.Foo()
>>> _f = dill.dumps(f)
>>> # (class Foo is changed in test_file.py at this point)
>>> importlib.reload(test_file)
<module 'test_file' from 'C:\\test_file.py'>
>>> g = test_file.Foo()
>>> f_ = dill.loads(_f, ignore=True)
>>> f_.bar()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: bar() missing 1 required positional argument: 'x'
>>> g.bar()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: bar() missing 1 required positional argument: 'x'

@mmckerns
Member

@bariod: Thanks for the example. Please repost/open this as a new issue.


4 participants