[REVIEW] Msgpack handles extract serialize #4531
Conversation
@mrocklin @jakirkham @quasiben, this PR isn't ready to be reviewed yet, but it should work in most cases. It would be great if one of you could run a benchmark and see how much of an impact it has. Running the following code, I am seeing a 6-10x speedup :)

```python
import time

from distributed.protocol import dumps

n = 10**6
msg = [{"op": "health", "status": "OK"} for _ in range(n)]
t1 = time.time()
dumps(msg)
t2 = time.time()
print(f"n: {n}, time: ", t2 - t1)
```
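For context, here is a minimal pure-Python sketch of the extraction idea the PR title refers to (the helper names are hypothetical, not distributed's actual implementation): walk a nested message, pull bytes-like payloads out into a side table, and leave small placeholders behind so the remaining skeleton is cheap to pack with msgpack.

```python
def extract_serialize(msg):
    """Walk a nested message, pulling out bytes-like payloads so the
    remaining skeleton can be packed cheaply by msgpack.

    Returns (skeleton, extracted) where `extracted` maps the path of
    each payload (a tuple of keys/indices) to the payload itself.
    """
    extracted = {}

    def walk(obj, path):
        if isinstance(obj, (bytes, bytearray, memoryview)):
            # Record the payload out-of-band and leave a placeholder.
            extracted[path] = obj
            return {"__extracted__": path}
        if isinstance(obj, dict):
            return {k: walk(v, path + (k,)) for k, v in obj.items()}
        if isinstance(obj, (list, tuple)):
            return [walk(v, path + (i,)) for i, v in enumerate(obj)]
        return obj

    skeleton = walk(msg, ())
    return skeleton, extracted


skeleton, extracted = extract_serialize({"x": b"abc", "y": 1, "z": [b"d", 2]})
print(skeleton)
print(extracted)
```

The receiving side would do the inverse walk, substituting each placeholder with the corresponding out-of-band frame.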
I'll give it a shot :)
Running the following computation on my laptop, with 20 workers with 1 thread each:

```python
from dask.distributed import Client, performance_report, wait

client = Client("localhost:8786")

import dask
import dask.dataframe as dd

dask.config.set({"optimization.fuse.active": False})
df = dask.datasets.timeseries(
    start="2020-01-01",
    end="2020-01-10",
    partition_freq="1h",
    freq="60s",
).persist()
df2 = df.set_index("x").persist()
wait(df2)
```
So a 25% reduction then?
Hrm, I'm having some difficulty reproducing those results. I don't fully trust my install process. Let me do a couple of clean installs before we celebrate on this one.
No worries.
OK, I've reinstalled twice into two different environments. I'm now also running the computation ten times in a loop. Results:
Definitely still a significant improvement. Variation is still high though. It would be good to see what this looks like on a quieter system.
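A small harness along these lines makes the run-to-run variation explicit (a sketch; `bench` is a hypothetical helper, not part of distributed):

```python
import statistics
import time


def bench(fn, repeat=10):
    """Time fn() `repeat` times; return (mean, stdev) in seconds."""
    times = []
    for _ in range(repeat):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return statistics.mean(times), statistics.stdev(times)


# Example: benchmark an arbitrary workload ten times.
mean, stdev = bench(lambda: sum(i * i for i in range(100_000)))
print(f"{mean:.4f}s +/- {stdev:.4f}s")
```

Reporting mean and standard deviation over repeated runs (rather than a single timing) is what makes a "23.11 +/- .77"-style comparison meaningful.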
Thanks for the update! What do you mean by a quieter system? One dedicated to the benchmark, or something else?
One not running Chrome, and one that is not over-saturated with 4x the number of workers as cores, for example :) I've started running with
```diff
@@ -85,6 +85,7 @@ def test_maybe_compress_sample():
     assert compressed == payload


+@pytest.mark.xfail(reason="TODO: fix")
 def test_large_bytes():
     for tp in (bytes, bytearray):
         msg = {"x": tp(b"0" * 1000000), "y": 1}
```
Why does this fail? Does msgpack fail on large messages?
No, it is because it doesn't support splitting of large frames. I am working on a PR that should make it much easier to handle splitting and writability: #4541
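To illustrate what frame splitting involves (a sketch with hypothetical helper names, not the #4541 implementation): a frame larger than some limit is cut into bounded chunks for transmission and reassembled on the receiving side.

```python
def split_frame(frame, limit=64 * 1024 * 1024):
    """Split a frame into chunks of at most `limit` bytes (zero-copy views)."""
    view = memoryview(frame)
    return [view[i:i + limit] for i in range(0, len(view), limit)]


def merge_frames(chunks):
    """Reassemble chunks back into a single bytes object."""
    return b"".join(chunks)


frame = b"0" * (10 * 1024)
chunks = split_frame(frame, limit=4096)   # 4096 + 4096 + 2048 bytes
assert merge_frames(chunks) == frame
```

Using `memoryview` slices avoids copying the payload at split time; the copy happens once, on reassembly.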
Ben and I tried to benchmark this today, but ran into some issues with hangs near the end of the shuffle. Unfortunately we weren't able to identify exactly where they were coming from. Also, these hangs did not happen every time, but happened frequently enough that a few retries were sufficient to see a hang. That being said, as things are still being worked on here, maybe this isn't totally unexpected.
Yeah, this PR still needs some work. Getting #4541 merged will help a lot.
extract_serialize was removed entirely from the trace. What I'm saying here is that there are new costs associated with this msgpack approach. This was a picture of one of them.
Sounds good. Reviewed yesterday. Will follow up on any comments/updates today.
Yep, I believe that. Just noting the particular screenshot wouldn't have shown any calls to
@mrocklin, @jakirkham, @quasiben, I have fixed most of the bugs and the PR should be ready for testing/benchmarking.
@quasiben, could you try to test again? I found and fixed a bug in Tornado triggered when
Ah, this makes more sense. Thanks for digging into this, Mads. Do you think it would be possible to do this casting as part of msgpack serialization?
Thanks @madsbk. While the shuffle now completes, I do see errors like the following:
Woot! Working smoothly now, with better performance too! We have typically been seeing: 23.11 +/- .77
So 5-6s faster than what we currently see. Does that sound right? Looks like the 0th iteration was slower than that, though. Is that consistent? Do we know what is slow about that iteration?
Went ahead and merged
Thanks @jakirkham!
Do we want to hold off on merging this until after the release?
That seems reasonable. Though I think our testing here has given me more confidence in this change.
Thanks Mads for working on this and everyone for the reviews! 😄
This PR reduces the overhead of `protocol.dumps()` and `protocol.loads()` by handling the extraction of serializable objects directly in `msgpack`.
distributed/protocol/numpy.py, lines 105 to 109 at 3b8bd25