RuntimeError when sending graph with dask.array.ufunc.ufuncs to scheduler and importing the wrapped numpy function #8442
Comments
Note that the pickled bytestring looks like Notably, the former contains
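For comparing two pickled bytestrings like the ones discussed here, it can help to list the module and attribute names embedded in each payload instead of reading the raw bytes. The following is a minimal sketch using only the standard library; `string_operands` is a hypothetical helper name, and a builtin stands in for the numpy ufunc so the snippet runs without numpy installed.

```python
import pickle
import pickletools


def string_operands(payload: bytes) -> list[str]:
    """Return every string operand found in a pickle stream, in order.

    For objects pickled by reference, these strings include the module
    name and the attribute name that unpickling will try to import.
    """
    return [
        arg
        for _opcode, arg, _pos in pickletools.genops(payload)
        if isinstance(arg, str)
    ]


# A builtin stands in for the ufunc here (assumption: any object that is
# pickled by reference is good enough to show the idea).
payload = pickle.dumps(len, protocol=4)
print(string_operands(payload))
```

Running the same helper on `pickle.dumps(exp)` from both variants should make the differing module name stand out; `pickletools.dis(payload)` prints the full opcode listing if more detail is needed.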
A slightly easier way to reproduce this is without the cluster:
import dask
from numpy import array, exp
import dask.array as da
import pickle
from dask.base import collections_to_dsk
graph = collections_to_dsk(da.exp(da.from_array([1,2,3], chunks=(-1,))))
print(pickle.dumps(graph))
At least this script returns the same bytes as above and the diff between
In fact, even removing the graphs themselves already shows the same byte string:
import pickle
import dask
from numpy import exp
print(pickle.dumps(exp))
The dask import somehow forces the
It looks like which mutates the so the new reproducer is:
import pickle
import multiprocessing
from numpy import exp
print(pickle.dumps(exp))
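Because the result depends on what has already been imported, each variant is best run in its own fresh interpreter. Below is a rough sketch of one way to automate that comparison; `pickled_in_fresh_interpreter` and `compare_ufunc_pickles` are hypothetical helper names, not part of dask or distributed, and the second function assumes numpy is installed. No particular output is claimed, since it may vary across numpy and Python versions.

```python
import subprocess
import sys


def pickled_in_fresh_interpreter(preamble: str, expression: str) -> bytes:
    """Evaluate pickle.dumps(<expression>) in a new Python process.

    `preamble` holds the import lines to run first, so each variant starts
    from a clean sys.modules and cannot be affected by earlier imports.
    """
    script = "\n".join(
        [
            "import pickle, sys",
            preamble,
            f"sys.stdout.buffer.write(pickle.dumps({expression}))",
        ]
    )
    result = subprocess.run(
        [sys.executable, "-c", script], capture_output=True, check=True
    )
    return result.stdout


def compare_ufunc_pickles() -> None:
    """Print the pickled ufunc with and without the extra import (needs numpy)."""
    plain = pickled_in_fresh_interpreter("from numpy import exp", "exp")
    with_mp = pickled_in_fresh_interpreter(
        "import multiprocessing\nfrom numpy import exp", "exp"
    )
    print(plain)
    print(with_mp)
    print("identical:", plain == with_mp)
```

Calling `compare_ufunc_pickles()` prints both byte strings side by side; swapping `import multiprocessing` for `import dask` in the preamble checks the earlier reproducer the same way.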
Similar to @fjetter I went down this rabbit hole. Here are a few observations:
For imports to distributed/distributed/protocol/pickle.py Lines 70 to 77 in 7562f9c
__mp_main__ as well given the above issues.
At the moment, Unfortunately, just mashing those two fixes together won't fix our problem; maybe using
A few last notes: Just using As to why
Describe the issue:
There's an odd pickle error that occurs when importing a numpy function that dask.array wraps in dask.array.ufunc.ufunc:
Minimal Complete Verifiable Example:
Traceback
This seems to be reproducible with all ufuncs. For example, try replacing exp with absolute or sin. The SubprocessCluster is not needed; the scheduler just has to run in a different process (i.e., no process-local LocalCluster).
Anything else we need to know?:
This example starts working if we do either of the following:
import dask
from numpy import exp
Replace da.exp(...) with a different ufunc that is not imported, e.g., da.sin(...)
Environment:
main