-
-
Notifications
You must be signed in to change notification settings - Fork 718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Pickle's protocol 5 #3784
Merged
jacobtomlinson
merged 18 commits into
dask:master
from
jakirkham:support_pickle_protocol_5
May 21, 2020
Merged
Changes from all commits
Commits
Show all changes
18 commits
Select commit
Hold shift + click to select a range
b3b5c2d
Assign `header` and `frames` before returning
jakirkham b3dfdb2
Import `HIGHEST_PROTOCOL` at top-level
jakirkham 4ad42d4
Collect keyword arguments to `*dumps`
jakirkham f5380c1
Assign `result` and `return` once at end
jakirkham ed54236
Support out-of-band buffer serialization
jakirkham d374832
Require `cloudpickle` version `1.3.0`
jakirkham 1982049
Test Pickle with out-of-band buffers
jakirkham 4b54ba5
Import `PickleBuffer` (if available)
jakirkham f5aed04
Test `serialize`/`deserialize` with `pickle`
jakirkham 924dd45
Check serialized header + frames
jakirkham 6efa0ac
Check out-of-band buffers' content
jakirkham 70b7f41
Take `memoryview` of `PickleBuffer` for testing
jakirkham 8038bba
Collect buffers internally first
jakirkham 0578b6a
Only collect buffers if `buffer_callback` exists
jakirkham 1f3033c
Use `elif` instead for simplicity
jakirkham 3878f02
Use De Morgan's law to simplify logic
jakirkham 6745c7f
Check `buffer_callback` before calling it
jakirkham 195924f
Use `buffer.clear()` instead of `del buffer[:]`
jakirkham File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My expectation is that
frames[1:]
would be an empty list here. It might not be though. If it's not, that would be good to fix. Whether that has to do withdeserialize_bytes
behavior or not, I'm less sure. Maybe someone else knows?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
frames
inpickle_loads
comes fromdeserialize_bytes
for me (take a look at exception stack trace). And for meframes[1:]
is not empty. That's why I am asking if this is expected or not.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. So do we have a reproducer? That should help us debug this further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I've been trying to reproduce this too, but haven't had any luck. Though I believe you that there could be a problem here. It's just tricky to come up with fixes in the dark. So anything you can come up with would help 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you say that Dask is able to split frames, may it happen when large pandas
DataFrames
are serialized? My test is a benchmark which operates on such objects, about 20M of records long and 40 columns wide.Modin splits them for parallel processing, but they still remain considerably large in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something like PR ( #3841 ) will help? Still lack a test case though. So I have no way to confirm whether it actually helps (or ensure we don't accidentally regress).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even when sending large bytes that do get split by
frame_split_size
, I'm unable to reproduce an error. Things are properly reassembled on the other side. Not sure what's going on there.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah me neither unfortunately. Happy to dig deeper once we identify a reproducer 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A similar issue was raised recently ( #3851 ). Not sure if that is the same. Suspect PR ( #3639 ) fixes this. Would be good if you can try though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was fixed by PR ( #3639 ).