Bad performance with input_buffer > 3.0 #210

Closed
JakkuSakura opened this issue Jun 26, 2021 · 5 comments · Fixed by #214

Comments

@JakkuSakura

The assembling process takes up most of the time in my IO thread. Why is it so slow? I would like to file a PR to improve it.
image

@JakkuSakura
Author

JakkuSakura commented Jun 26, 2021

Sadly, the performance of input_buffer is unacceptable. The benchmark below measures nothing but memcpy:
https://github.com/qiujiangkun/input_buffer/blob/master/benches/read_data.rs

throughput/input_buffer time:   [35.198 ms 37.608 ms 40.192 ms]                                    
                        thrpt:  [199.04 MiB/s 212.72 MiB/s 227.29 MiB/s]

throughput/extend_from_slice                                                                            
                        time:   [221.84 us 222.59 us 223.36 us]
                        thrpt:  [34.977 GiB/s 35.098 GiB/s 35.217 GiB/s]

throughput/with_capacity                                                                            
                        time:   [222.57 us 223.88 us 225.39 us]
                        thrpt:  [34.662 GiB/s 34.896 GiB/s 35.102 GiB/s]

throughput/with_capacity_unsafe                                                                            
                        time:   [165.58 us 166.01 us 166.50 us]
                        thrpt:  [46.921 GiB/s 47.060 GiB/s 47.182 GiB/s]

After a hack that removes the quadratic zeroing, performance improves by about 160x.

throughput/input_buffer time:   [226.32 us 227.91 us 229.67 us]                                    
                        thrpt:  [34.016 GiB/s 34.278 GiB/s 34.520 GiB/s]
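
The thread does not show the code behind the "quadratic zeroing", so the following is only a sketch of how such a cost can arise, under the assumption that the buffer zero-fills its whole reserved region before every read. `Trickle`, `read_zeroing_each_time`, and `read_zeroing_once` are hypothetical names for illustration and are not part of input_buffer. If the reserve grows with the message size while the stream delivers small pieces, the bytes zeroed grow roughly with (message size / piece size) × reserve, which is quadratic in the message size.

```rust
use std::io::{self, Read};

/// Hypothetical reader that hands out at most `step` bytes per `read` call,
/// imitating a socket that delivers data in small pieces.
pub struct Trickle<'a> {
    pub data: &'a [u8],
    pub step: usize,
}

impl Read for Trickle<'_> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.step.min(buf.len()).min(self.data.len());
        buf[..n].copy_from_slice(&self.data[..n]);
        self.data = &self.data[n..];
        Ok(n)
    }
}

/// Assumed slow pattern: zero-fill `reserve` bytes before every single read.
/// Returns the collected data and the total number of bytes zero-filled.
pub fn read_zeroing_each_time<R: Read>(
    mut src: R,
    reserve: usize,
) -> io::Result<(Vec<u8>, usize)> {
    let mut data: Vec<u8> = Vec::new();
    let mut zeroed = 0usize;
    loop {
        let len = data.len();
        data.resize(len + reserve, 0); // zero-fills `reserve` bytes on every call
        zeroed += reserve;
        let n = src.read(&mut data[len..])?;
        data.truncate(len + n); // keep only the bytes actually read
        if n == 0 {
            return Ok((data, zeroed));
        }
    }
}

/// Safe alternative: zero a scratch buffer once and reuse it for every read.
pub fn read_zeroing_once<R: Read>(
    mut src: R,
    reserve: usize,
) -> io::Result<(Vec<u8>, usize)> {
    let mut data: Vec<u8> = Vec::new();
    let mut scratch = vec![0u8; reserve]; // zeroed a single time
    loop {
        let n = src.read(&mut scratch)?;
        if n == 0 {
            return Ok((data, reserve));
        }
        data.extend_from_slice(&scratch[..n]);
    }
}
```

Calling both functions on the same payload through `Trickle { data: &payload, step: 64 }` with a 4096-byte reserve returns identical data, but the first one zero-fills 4096 bytes for every 64 bytes it receives. Both versions stay within safe Rust; the second costs one extra copy per read instead.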

For those who want the improved performance, add the following to Cargo.toml:

[patch.crates-io]
input_buffer = { git = "https://github.com/qiujiangkun/input_buffer", tag="HACK" }

snapview/input_buffer#6 (comment)

@daniel-abramov daniel-abramov changed the title from "About the performance of assembling" to "Bad performance with input_buffer > 3.0" Jun 26, 2021
@daniel-abramov
Member

daniel-abramov commented Jun 29, 2021

Note that the provided custom "hack" version is essentially the same as an older version of input_buffer. That version was changed because its implementation was unsound, so as a maintainer I unfortunately can't recommend using the "hack" version right now (it may lead to UB, depending on the implementation of the stream that is passed to tungstenite-rs).
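
For readers unfamiliar with this class of problem: a common source of unsoundness in read buffers is handing a `Read` implementation a `&mut [u8]` that covers uninitialized memory (for example after `Vec::with_capacity` plus an unsafe `set_len`). Whether that is exactly what the older input_buffer did is an assumption here; the linked discussion has the details. The sketch below shows one safe way to append to a `Vec<u8>` without zeroing in user code, using only `std`. The helper name `append_up_to` is hypothetical.

```rust
use std::io::{self, Read};

/// Appends up to `limit` bytes from `src` to the end of `buf`.
///
/// The caller never creates a slice over uninitialized memory; growing the
/// vector and filling it is left to the standard library's `read_to_end`.
pub fn append_up_to<R: Read>(
    src: &mut R,
    buf: &mut Vec<u8>,
    limit: u64,
) -> io::Result<usize> {
    // `take` caps how many bytes `read_to_end` may pull from the stream.
    src.by_ref().take(limit).read_to_end(buf)
}
```

One caveat of this approach: `read_to_end` keeps reading until the limit is reached or the stream reports EOF, so on a blocking socket it waits for `limit` bytes instead of returning after the first partial read. That differs from a single `read` call and may not suit every protocol loop.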

Current state and discussion: snapview/input_buffer#6 (comment)

We are going to release a new version soon that is as safe as our current sound implementation but significantly faster.

@whyCPPgofast

whyCPPgofast commented Jul 4, 2021

I see that the input_buffer crate was updated and the bottleneck has been fixed: snapview/input_buffer@f2e8410
Is it as simple as bumping the input_buffer version to 0.5.0 in tungstenite, or is there more work to do? Thank you for the great library.

@daniel-abramov
Member

You're right, it was just a matter of bumping the dependency version and releasing a new version. I decided to postpone it a bit so that we can merge and release an even better version that is faster still on average. I expect us to deliver a new version tomorrow if the PR gets reviewed today.

@whyCPPgofast

whyCPPgofast commented Jul 9, 2021

@application-developer-DA I see you updated the input_buffer crate and added benchmarks. Mind bumping the version of tungstenite-rs? Thank you for the great work!
