Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

threading.local global variable in _make.py breaks (cloud)pickling #754

Closed
pwais opened this issue Feb 3, 2021 · 5 comments
Closed

threading.local global variable in _make.py breaks (cloud)pickling #754

pwais opened this issue Feb 3, 2021 · 5 comments
Labels

Comments

@pwais
Copy link

pwais commented Feb 3, 2021

I was trying to use cloudpickle and attrs-based object and ran into a really weird issue where cloudpickle found a threading.local that obviously can't be pickled. When I instrumented the pickle and cloudpickle source code in my container, I found the issue traces back to the _already_repring global threading.local object in _make.py:
https://github.com/python-attrs/attrs/blame/efcbae51cd8053c193055618cc8a6ac7880ef062/src/attr/_make.py#L1744

Here is a minimal example that shows the error:

!pip3 install cloudpickle attrs

import cloudpickle
import attr

@attr.s(slots=True, eq=False, weakref_slot=False)
class Foo(object):
    
    
    bar = attr.ib(type=str, default='')

meow = Foo('wow')

cloudpickle.dumps(meow)

And here is a notebook where you can see the crash (at the time of writing): https://colab.research.google.com/drive/1WK4EqzWyNQ3PFNSopXk-Y32IQXi8hHL9?usp=sharing

Screenshot of the error:
Screen Shot 2021-02-03 at 2 44 34 PM

I believe pickle for an attrs-based object probably works, though, since I've done that with previous versions of attrs ...

Can we get rid of this global variable _already_repring ? The addition of it seems to address a rather odd use case ( 5e46afd ). Perhaps _already_repring could be initialized on first invocation of _make_repr() (e.g. turn _make_repr() into a callable object that creates _already_repring on __call__) rather than initialized like this at package load time. (Removing globals is probably a good thing here anyways?)

@hynek
Copy link
Member

hynek commented Feb 4, 2021

This is a duplicate of #458 and my thoughts on it are in #458 (comment). So far the cloudpickle team hasn't reached out to us.

@pwais
Copy link
Author

pwais commented Feb 4, 2021

Thanks for the context @hynek ! Are you familiar enough with attrs to have any thoughts on just modifying the _already_repring global thread-local variable? Might you know if the unit tests for attrs are good enough to test for regressions if this variable were modified? Looking through the discussion in the original PR ( #358 ) I don't really see warrant for the global scope. _make_repr() only has one call site so I'd be down for trying to fix this issue....

While I would agree that cloudpickle is doing something fairly non-standard by trying to pickle the entire module, I will have to invoke the hand-wavy argument that having mutable state (and thread-local state) at the module level can cause really bad bugs in general. Furthermore, perhaps multiprocessing uses cloudpickle some day? FWIW I arrived at this bug using pyspark, which uses cloudpickle (like ray). Sadly cloudpickle has been forked a few times (pyspark has its own internal copy right now), though perhaps the project assembles better stewardship in the future. (So I wouldn't wait on the "cloudpickle team" to reply ..)...

@wsanchez wsanchez added the Bug label Apr 9, 2021
@wsanchez
Copy link

wsanchez commented Apr 9, 2021

Let's move any discussion to #358 so it's all in one place.

@wsanchez wsanchez closed this as completed Apr 9, 2021
@hynek
Copy link
Member

hynek commented May 26, 2021

I mean #358 is closed, so… :D But I don not have any thoughts...if anybody has any ideas how to implement this without compromising the current behavior (aka the tests still pass), our PRs are open as the kids nowadays say. :)

@pwais
Copy link
Author

pwais commented May 26, 2021

This issue is open and has essentially the same repro as in my report: #458

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants