[REVIEW] Msgpack handles extract serialize #4531
Conversation
@mrocklin @jakirkham @quasiben, this PR isn't ready to be reviewed yet, but it should work in most cases. It would be great if one of you could run a benchmark and see how much of an impact it has. Running the following code, I am seeing a 6-10x speedup :)

```python
import time

from distributed.protocol import dumps

n = 10**6
msg = [{"op": "health", "status": "OK"} for _ in range(n)]
t1 = time.time()
dumps(msg)
t2 = time.time()
print(f"n: {n}, time: ", t2 - t1)
```
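For context, here is a minimal pure-Python sketch of the extraction idea the PR title refers to (the helper names are hypothetical, not distributed's actual implementation): walk a nested message, pull bytes-like payloads out into a side table, and leave small placeholders behind so the remaining skeleton is cheap to pack with msgpack.

```python
def extract_serialize(msg):
    """Walk a nested message, pulling out bytes-like payloads so the
    remaining skeleton can be packed cheaply by msgpack.

    Returns (skeleton, extracted) where `extracted` maps the path of
    each payload (a tuple of keys/indices) to the payload itself.
    """
    extracted = {}

    def walk(obj, path):
        if isinstance(obj, (bytes, bytearray, memoryview)):
            # Record the payload out-of-band and leave a placeholder.
            extracted[path] = obj
            return {"__extracted__": path}
        if isinstance(obj, dict):
            return {k: walk(v, path + (k,)) for k, v in obj.items()}
        if isinstance(obj, (list, tuple)):
            return [walk(v, path + (i,)) for i, v in enumerate(obj)]
        return obj

    skeleton = walk(msg, ())
    return skeleton, extracted


skeleton, extracted = extract_serialize({"x": b"abc", "y": 1, "z": [b"d", 2]})
print(skeleton)
print(extracted)
```

The receiving side would do the inverse walk, substituting each placeholder with the corresponding out-of-band frame.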
I'll give it a shot :)
Running the following computation on my laptop, with 20 workers with 1 thread each:

```python
from dask.distributed import Client, performance_report, wait

client = Client("localhost:8786")

import dask
import dask.dataframe as dd

dask.config.set({"optimization.fuse.active": False})
df = dask.datasets.timeseries(
    start="2020-01-01",
    end="2020-01-10",
    partition_freq="1h",
    freq="60s",
).persist()
df2 = df.set_index("x").persist()
wait(df2)
```
So a 25% reduction then?
Hrm, I'm having some difficulty reproducing those results. I don't fully trust my install process. Let me do a couple of clean installs before we celebrate on this one.
No worries.
OK, I've reinstalled twice into two different environments. I'm now also running the computation ten times in a loop. Results:
Definitely still a significant improvement. Variation is still high though. It would be good to see what this looks like on a quieter system.
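A small harness along these lines makes the run-to-run variation explicit (a sketch; `bench` is a hypothetical helper, not part of distributed):

```python
import statistics
import time


def bench(fn, repeat=10):
    """Time fn() `repeat` times; return (mean, stdev) in seconds."""
    times = []
    for _ in range(repeat):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return statistics.mean(times), statistics.stdev(times)


# Example: benchmark an arbitrary workload ten times.
mean, stdev = bench(lambda: sum(i * i for i in range(100_000)))
print(f"{mean:.4f}s +/- {stdev:.4f}s")
```

Reporting mean and standard deviation over repeated runs (rather than a single timing) is what makes a "23.11 +/- .77"-style comparison meaningful.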
Thanks for the update! What do you mean by a quieter system? One dedicated to the benchmark, or something else?
One not running Chrome, and one that is not over-saturated with 4x the number of workers as cores, for example :) I've started running with
```diff
@@ -85,6 +85,7 @@ def test_maybe_compress_sample():
     assert compressed == payload


+@pytest.mark.xfail(reason="TODO: fix")
 def test_large_bytes():
     for tp in (bytes, bytearray):
         msg = {"x": tp(b"0" * 1000000), "y": 1}
```
Why does this fail? Does msgpack fail on large messages?
No, it is because it doesn't support splitting of large frames. I am working on a PR that should make it much easier to handle splitting and writability: #4541
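To illustrate what frame splitting involves (a sketch with hypothetical helper names, not the #4541 implementation): a frame larger than some limit is cut into bounded chunks for transmission and reassembled on the receiving side.

```python
def split_frame(frame, limit=64 * 1024 * 1024):
    """Split a frame into chunks of at most `limit` bytes (zero-copy views)."""
    view = memoryview(frame)
    return [view[i:i + limit] for i in range(0, len(view), limit)]


def merge_frames(chunks):
    """Reassemble chunks back into a single bytes object."""
    return b"".join(chunks)


frame = b"0" * (10 * 1024)
chunks = split_frame(frame, limit=4096)   # 4096 + 4096 + 2048 bytes
assert merge_frames(chunks) == frame
```

Using `memoryview` slices avoids copying the payload at split time; the copy happens once, on reassembly.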
Ben and I tried to benchmark this today, but ran into some issues with hangs near the end of the shuffle. Unfortunately we weren't able to identify exactly where they were coming from. Also, these hangs did not happen every time, but happened frequently enough that a few retries were sufficient to see a hang. That being said, as things are still being worked on here, maybe this isn't totally unexpected.
Yeah, this PR still needs some work. Getting #4541 merged will help a lot.
extract_serialize was removed entirely from the trace. What I'm saying here is that there are new costs associated with this msgpack approach. This was a picture of one of them.
Sounds good. Reviewed yesterday. Will follow up on any comments/updates today.
Yep, I believe that. Just noting the particular screenshot wouldn't have shown any calls to
@mrocklin, @jakirkham, @quasiben, I have fixed most of the bugs and the PR should be ready for testing/benchmarking.
@quasiben, could you try to test again? I found and fixed a bug in Tornado triggered when
Ah, this makes more sense. Thanks for digging into this, Mads. Do you think it would be possible to do this casting as part of msgpack serialization?
Thanks @madsbk. While the shuffle now completes, I do see errors like the following:
Woot! Working smoothly now, with better performance too! We have typically been seeing: 23.11 +/- .77
So 5-6s faster than what we currently see. Does that sound right? Looks like the 0th iteration was slower than that, though. Is that consistent? Do we know what is slow about that iteration?
Went ahead and merged
Thanks @jakirkham!
Do we want to hold off on merging this until after the release?
That seems reasonable. Though I think our testing here has given me more confidence in this change.
Thanks Mads for working on this and everyone for the reviews! 😄
This PR reduces the overhead of `protocol.dumps()` and `protocol.loads()` by handling the extraction of serializable objects directly in `msgpack`.
distributed/protocol/numpy.py, lines 105 to 109 at 3b8bd25