Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial implementation for pickle5 support #5611

Merged
merged 13 commits into from
Sep 14, 2019

Conversation

suquark
Copy link
Member

@suquark suquark commented Sep 1, 2019

Why are these changes needed?

This one of the plans to refactor the serialization part.

  • Pass all tests on python3.8
  • Backport pickle5
  • Check compatibility of cloudpickle_fast.py
  • Direct Arrow zero-copy support for PickleBuffer(should implement in the future)

Related issue number

Checks

@pcmoritz
Copy link
Contributor

pcmoritz commented Sep 1, 2019

This looks great @suquark! Can we make it so there is a flag that uses one or the other codepath (the default should be the old one) so we can merge this earlier and start experimenting with it?

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16706/
Test FAILed.

@suquark suquark force-pushed the pickle5serializaion branch from e80b4e5 to b430168 Compare September 2, 2019 04:02
@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16718/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16721/
Test PASSed.

@pcmoritz
Copy link
Contributor

pcmoritz commented Sep 5, 2019

@suquark There are some more linting errors, can you also fix them? Let's merge the PR as it is and address the other issues in follow up PRs :)

@pcmoritz pcmoritz marked this pull request as ready for review September 5, 2019 18:19
@suquark
Copy link
Member Author

suquark commented Sep 5, 2019

@pcmoritz Sorry for my late response. I am still working on it, it's not ready for review.

@suquark suquark force-pushed the pickle5serializaion branch from 5260a17 to 11d1b97 Compare September 5, 2019 22:34
@pcmoritz
Copy link
Contributor

pcmoritz commented Sep 5, 2019

@richardliaw has also created a PR to update cloudpickle, you should coordinate to avoid conflicts (also make sure the version gets updated), see #5643 :)

@suquark
Copy link
Member Author

suquark commented Sep 5, 2019

@pcmoritz Thanks for your mention! I am also using the lastest cloudpickle but slightly modified. I think I should wait #5643 to merge so that this PR can work as a small patch.

@pcmoritz
Copy link
Contributor

pcmoritz commented Sep 6, 2019

@suquark Yeah, makes sense! :)

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16816/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16823/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16821/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16824/
Test FAILed.

@pcmoritz
Copy link
Contributor

pcmoritz commented Sep 6, 2019

@suquark Looks like the other PR has some problem to pass Jenkins. We shouldn't have this PR be blocked on the other one, how about we just remove the update of cloudpickle.py from this PR, so we can update it separately in a different PR?

@suquark
Copy link
Member Author

suquark commented Sep 6, 2019

@pcmoritz Yep, but then I have to drop the Python3.8 cloudpickle_fast support (it depends on the lastest cloudpickle). But it seems totally fine for me since we will not support python3.8 recently.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16830/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16844/
Test FAILed.

@suquark
Copy link
Member Author

suquark commented Sep 7, 2019

The jenkins build failed with:

    Autodetected HDF5 1.10.2
    ********************************************************************************
                           Summary of the h5py configuration
    
        Path to HDF5: None
        HDF5 Version: '1.10.2'
         MPI Enabled: False
    Rebuild Required: True
    
    ********************************************************************************
    Executing cythonize()
    h5py requires pkg-config unless the HDF5 path is explicitly specified
    error: pkg-config probably not installed: FileNotFoundError(2, "No such file or directory: 'pkg-config'")

@suquark
Copy link
Member Author

suquark commented Sep 7, 2019

It seems that all tests have failed due to h5py. Not sure what has happened.

@suquark suquark requested a review from pcmoritz September 7, 2019 22:28
@suquark
Copy link
Member Author

suquark commented Sep 7, 2019

jenkins retest it please

@suquark suquark closed this Sep 7, 2019
@suquark suquark reopened this Sep 7, 2019
@suquark
Copy link
Member Author

suquark commented Sep 7, 2019

close and reopen the PR to re-trigger all tests

@pcmoritz pcmoritz mentioned this pull request Sep 8, 2019
2 tasks
@pcmoritz
Copy link
Contributor

Now that we have #5643 merged, can you rebase this PR and we can get it merged as well?

@suquark suquark force-pushed the pickle5serializaion branch from 1f3a100 to fbceecf Compare September 10, 2019 01:40
@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16923/
Test FAILed.

@suquark suquark force-pushed the pickle5serializaion branch from 82a0ca4 to fd67311 Compare September 10, 2019 04:13
@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16924/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16931/
Test PASSed.

@suquark
Copy link
Member Author

suquark commented Sep 11, 2019

@pcmoritz it's ready to review and merge

@@ -59,6 +58,14 @@
import uuid
import threading

PICKLE5_ENABLED = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be needed for cloudpickle_fast, right? Let's focus on supporting that codepath, so it doesn't break other things.

Copy link
Member Author

@suquark suquark Sep 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we we have pickle5 enabled (python3 && python<3.8 && pickle5 installed), then we will try to use the old cloudpickle

@pcmoritz
Copy link
Contributor

Thanks, we need to make sure that cloudpickle as well as cloudpickle_fast are unchanged from upstream. We can't start to diverge from them, otherwise it will be very hard to maintain our fork in the future (there are frequent changes in cloudpickle).

If they need to be changed, we need to get the changes upstream. If possible let's try to avoid that (there is a way to not change them by re-running the CloudPickler.init as in the code I shared with you, right?)

@suquark
Copy link
Member Author

suquark commented Sep 12, 2019

@pcmoritz Yep, but that code doesn't work somehow. We didn't find out why it failed weeks ago.

@pcmoritz
Copy link
Contributor

@suquark Do you know what is going wrong?

@pcmoritz
Copy link
Contributor

pcmoritz commented Sep 12, 2019

It's fine if we cannot support the old cloudpickle with pickle5 for now I think (the old cloudpickle can be quite slow like 20x slower already on python objects since it uses the python pickle implementation, without C extensions)

@suquark
Copy link
Member Author

suquark commented Sep 12, 2019

@pcmoritz I am not sure. But it could because of how python3.8 deals with the C++ hook function in cloudpickle_fast. I haven't tried on pickle5-backport, but it is likely to work because pickle5-backport haven't implemented the hook function. BTW, the hook function is critical for performance. It is not possible to make use of cPickle without the hook function (everything falls back to python-based pickle). I also created this issue for pickle5-backport: pitrou/pickle5-backport#9

@suquark
Copy link
Member Author

suquark commented Sep 12, 2019

@pcmoritz yes, but I can create a PR to make the old cloudpickle 20x faster with pickle5-backport by implementing the hook function.

@suquark
Copy link
Member Author

suquark commented Sep 12, 2019

Here's the cloudpickle PR: cloudpipe/cloudpickle#308

@pcmoritz
Copy link
Contributor

Thanks for doing it, looks great!

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16989/
Test FAILed.

@pcmoritz
Copy link
Contributor

Jenkins retest this please

Copy link
Contributor

@pcmoritz pcmoritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be merged when tests pass

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/17003/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/17013/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/17012/
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/17016/
Test FAILed.

@pcmoritz pcmoritz merged commit 4c964c0 into ray-project:master Sep 14, 2019
@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/17023/
Test PASSed.

@suquark suquark deleted the pickle5serializaion branch October 3, 2019 23:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants