Handling custom serialization with MsgPack directly #4379
Comments
If there is an easy system within msgpack that we can hijack to avoid our
traversing the object then great, I'm all for it.
Aside: I like all of these micro-improvements, but I think that we should
also take a step back, and figure out if they're going to get us where we
want to go. I am curious if we're going to get an eventual 5-10x speedup
if we continue along the many-small-improvements path. If not, then we
might want to bail and reconsider things more broadly.
On Tue, Jan 5, 2021 at 1:21 PM jakirkham wrote:
cc @mrocklin @quasiben
|
This came up again today when we were revisiting one of the cases where the idea that the scheduler is slow originated. In that case about 50% of the time on the scheduler is split evenly between receiving/deserializing data (admittedly this was before the HLG work) and performing transitions. Additionally, another ~20% of the time on the scheduler is spent sending data, most of which is spent serializing it. As a result, it seems reasonable to conclude that improvements in serialization/deserialization are well worth our time. |
@jakirkham maybe spending time on Cythonization is the wrong move if this can be done easily. |
Yeah mostly was curious if the Looking at @madsbk, if you have some time on Monday, maybe we can chat about this? 🙂 |
Just chatted with @madsbk about this, I think we will start with trying to remove Will continue pushing on optimizing transitions and moving communication out of there ( #4454 ) ( #4451 ). cc @quasiben (for vis) |
As a first step we are adding a fast path for things that can be handled by MsgPack alone ( #4480 ). Though we are still interested in improving the serialization workflow overall, which may be handled in later work. |
Today we are using things like `extract_serialize` to pull out objects MsgPack can't handle and serialize them alongside the message. In benchmarks we have done, our extra handling code takes about 3x more time than MsgPack alone. An interesting idea to follow up on would be to see if we can add an `ExtType` or something to default encoding/decoding to handle out-of-band buffers and merely track where to insert them later. This would be analogous to how pickle works with out-of-band buffers, though it may also speed up serializing and deserializing by doing fewer passes over the data, leveraging MsgPack's own passes. In theory we could get up to a 4x speedup in serialization by following this strategy.
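To make the out-of-band idea concrete, here is a minimal sketch of how it might look with msgpack-python's `default` and `ext_hook` callbacks. It is not what Distributed does today: the `OutOfBand` wrapper, the `OOB_CODE` value, and the `dumps`/`loads` helpers are hypothetical names chosen for illustration. The only library calls assumed are `msgpack.packb`, `msgpack.unpackb`, and `msgpack.ExtType`.

```python
import struct

import msgpack

# Hypothetical application-specific ExtType code (must be in 0..127).
OOB_CODE = 1


class OutOfBand:
    """Hypothetical marker for a payload that should travel outside the MsgPack stream."""

    def __init__(self, data):
        self.data = data


def dumps(msg):
    """Pack ``msg`` in a single MsgPack pass, collecting out-of-band buffers."""
    buffers = []

    def default(obj):
        # MsgPack calls this only for objects it cannot encode itself.
        if isinstance(obj, OutOfBand):
            buffers.append(obj.data)
            # Store just the buffer's position; the bytes stay out of the stream.
            return msgpack.ExtType(OOB_CODE, struct.pack("<I", len(buffers) - 1))
        raise TypeError(f"cannot serialize {type(obj).__name__}")

    header = msgpack.packb(msg, default=default, use_bin_type=True)
    return header, buffers


def loads(header, buffers):
    """Unpack ``header`` and splice the out-of-band buffers back into place."""

    def ext_hook(code, data):
        if code == OOB_CODE:
            (index,) = struct.unpack("<I", data)
            return buffers[index]
        # Leave unknown extension types untouched.
        return msgpack.ExtType(code, data)

    return msgpack.unpackb(header, ext_hook=ext_hook, raw=False)


payload = b"x" * 1000
header, buffers = dumps({"op": "update", "data": OutOfBand(payload)})
restored = loads(header, buffers)
```

Here the traversal is done by MsgPack itself rather than by a separate pre-pass over the message, and the header only records where each buffer belongs. Whether this actually recovers the extra handling cost would need to be benchmarked.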