Initial implementation for pickle5 support #5611
This looks great @suquark! Can we add a flag that selects one codepath or the other (the default should be the old one), so we can merge this earlier and start experimenting with it?
Test FAILed.
Force-pushed from e80b4e5 to b430168.
Test FAILed.
Test PASSed.
@suquark There are some more linting errors; can you fix them as well? Let's merge the PR as it is and address the other issues in follow-up PRs :)
@pcmoritz Sorry for the late response. I am still working on it; it's not ready for review yet.
Force-pushed from 5260a17 to 11d1b97.
@richardliaw has also created a PR to update cloudpickle; you two should coordinate to avoid conflicts (and make sure the version gets updated), see #5643 :)
@suquark Yeah, makes sense! :)
Test FAILed.
Test FAILed.
Test FAILed.
Test FAILed.
@suquark It looks like the other PR is having trouble passing Jenkins. This PR shouldn't be blocked on that one. How about we remove the cloudpickle.py update from this PR, so it can be updated separately in a different PR?
@pcmoritz Yep, but then I have to drop the Python 3.8 cloudpickle_fast support (it depends on the latest cloudpickle). That seems fine to me, since we will not support Python 3.8 in the near future.
Test PASSed.
Test FAILed.
The Jenkins build failed with:
Autodetected HDF5 1.10.2
********************************************************************************
Summary of the h5py configuration
Path to HDF5: None
HDF5 Version: '1.10.2'
MPI Enabled: False
Rebuild Required: True
********************************************************************************
Executing cythonize()
h5py requires pkg-config unless the HDF5 path is explicitly specified
error: pkg-config probably not installed: FileNotFoundError(2, "No such file or directory: 'pkg-config'")
It seems that all tests have failed due to h5py. Not sure what happened.
jenkins retest it please
close and reopen the PR to re-trigger all tests
Now that we have #5643 merged, can you rebase this PR so we can get it merged as well?
Force-pushed from 1f3a100 to fbceecf.
Test FAILed.
Force-pushed from 82a0ca4 to fd67311.
Test PASSed.
Test PASSed.
@pcmoritz It's ready to review and merge.
@@ -59,6 +58,14 @@
import uuid
import threading

PICKLE5_ENABLED = False
This should only be needed for cloudpickle_fast, right? Let's focus on supporting that codepath, so it doesn't break other things.
If we have pickle5 enabled (python3 && python<3.8 && pickle5 installed), then we will try to use the old cloudpickle.
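The condition in that comment can be sketched as a small check. This is only an illustration under stated assumptions: the helper name `pickle5_backport_usable` and its injectable parameters are hypothetical and do not come from the PR; only the `PICKLE5_ENABLED` name appears in the diff.

```python
import importlib.util
import sys


def pickle5_backport_usable(version_info=sys.version_info,
                            find_spec=importlib.util.find_spec):
    """Hypothetical helper mirroring
    "python3 && python<3.8 && pickle5 installed".

    Both parameters are injectable so the logic can be exercised without
    changing the running interpreter or installing anything.
    """
    major, minor = version_info[0], version_info[1]
    if major != 3 or minor >= 8:
        # Python 2 has no protocol 5; Python 3.8+ ships it in the stdlib,
        # so the backport is not needed there.
        return False
    # find_spec returns None when the module cannot be located.
    return find_spec("pickle5") is not None


# Assumption: the module-level flag from the diff would be set like this.
PICKLE5_ENABLED = pickle5_backport_usable()
```

Keeping the check in one function makes it easy to leave the old codepath as the default whenever any part of the condition is false.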
Thanks, we need to make sure that cloudpickle as well as cloudpickle_fast are unchanged from upstream. We can't start to diverge from them, otherwise it will be very hard to maintain our fork in the future (there are frequent changes in cloudpickle). If they need to be changed, we need to get the changes upstream. If possible, let's try to avoid that (there is a way to avoid changing them by re-running CloudPickler.__init__, as in the code I shared with you, right?)
@pcmoritz Yep, but that code doesn't work for some reason. We couldn't figure out why it failed a few weeks ago.
@suquark Do you know what is going wrong?
I think it's fine if we cannot support the old cloudpickle with pickle5 for now (the old cloudpickle can already be quite slow, around 20x slower on Python objects, since it uses the pure-Python pickle implementation without C extensions).
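For a rough feel of the gap between the pure-Python and C picklers, here is a minimal timing sketch. It relies on `pickle._dumps`, the private pure-Python fallback in CPython's `pickle` module, and on an arbitrary payload chosen for illustration; it does not use cloudpickle itself, and the ratio it prints will vary by machine and data.

```python
import pickle
import timeit

# A plain-Python payload; the shape and size are arbitrary choices.
payload = [{"id": i, "name": str(i), "tags": ("a", "b")} for i in range(2000)]


def time_dumps(dumps, repeat=5):
    """Best-of-`repeat` wall time, in seconds, for one dumps(payload) call."""
    return min(timeit.repeat(lambda: dumps(payload), number=1, repeat=repeat))


# pickle.dumps is the C-accelerated version when the _pickle extension exists.
c_seconds = time_dumps(pickle.dumps)
# pickle._dumps is CPython's pure-Python implementation (a private name).
py_seconds = time_dumps(pickle._dumps)

# Both implementations produce streams the standard loader understands.
assert pickle.loads(pickle._dumps(payload)) == payload

print("C pickler:      %.4f s" % c_seconds)
print("Python pickler: %.4f s" % py_seconds)
```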
@pcmoritz I am not sure. But it could be because of how Python 3.8 deals with the C++ hook function in cloudpickle_fast. I haven't tried on
@pcmoritz Yes, but I can create a PR to make the old cloudpickle 20x faster with
Here's the cloudpickle PR: cloudpipe/cloudpickle#308
Thanks for doing it, looks great!
Test FAILed.
Jenkins retest this please
Can be merged when tests pass
Test FAILed.
Test FAILed.
Test FAILed.
Test FAILed.
Test PASSed.
Why are these changes needed?
This is one of the planned steps in refactoring the serialization code.
pickle5
cloudpickle_fast.py
Direct Arrow zero-copy support for PickleBuffer (should implement in the future)
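As background for the PickleBuffer item, the sketch below shows pickle protocol 5 out-of-band buffers using only the standard library on Python 3.8+. It is not Ray's or Arrow's implementation; the `"blob"` key and the 1 MiB size are arbitrary choices. On older Python 3 versions the pickle5 backport is expected to offer the same interface, but that is an assumption here.

```python
import pickle  # Python 3.8+: protocol 5 is part of the standard library

# A large writable buffer standing in for tensor or column data.
data = bytearray(b"x" * (1 << 20))

# With a buffer_callback, each PickleBuffer is handed to the callback
# instead of being copied into the pickle stream (list.append returns
# None, which tells pickle to keep the buffer out-of-band).
buffers = []
payload = pickle.dumps({"blob": pickle.PickleBuffer(data)},
                       protocol=5,
                       buffer_callback=buffers.append)

# The stream itself stays tiny; the bulk data travels via `buffers`.
print(len(payload), len(buffers))

# On load, the same buffers must be supplied in the same order.
restored = pickle.loads(payload, buffers=buffers)

# The restored buffer views the original memory: a write to `data`
# is visible through it, so no copy was made.
view = memoryview(restored["blob"])
data[0] = ord("y")
print(view[0] == ord("y"))
```

In a zero-copy store, the entries of `buffers` would be placed in (or already live in) shared memory, and the reader would pass views of that memory to `pickle.loads`.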
Related issue number
Checks
scripts/format.sh to lint the changes in this PR.