Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync vs Unsync versions benchmarks on MIPS and ARM #242

Open
glebpom opened this issue Jan 21, 2019 · 8 comments
Open

Sync vs Unsync versions benchmarks on MIPS and ARM #242

glebpom opened this issue Jan 21, 2019 · 8 comments

Comments

@glebpom
Copy link

glebpom commented Jan 21, 2019

Hi,

I have done some benchmarks based on @stbuehler fork with Unsync implementation (#200). My goal was to test it on MIPS and ARM processors. Generally, sync version shows comparable or better (someteims much better) results. But there are a couple of cases where unsync version is significantly better:

MIPSel (MediaTek MT7628AN ver:1 eco:2):

Unsync
test deref_two ... bench: 27,566 ns/iter (+/- 93,022)
test from_long_slice ... bench: 1,770 ns/iter (+/- 15,711) = 72 MB/s

Sync
test deref_two ... bench: 49,816 ns/iter (+/- 89,548)
test from_long_slice ... bench: 3,700 ns/iter (+/- 9,269) = 34 MB/s

ARMv6 (ARMv6-compatible processor rev 7 (v6l))

Unsync
test split_off_and_drop ... bench: 2,755,606 ns/iter (+/- 129,117)

Sync
test split_off_and_drop ... bench: 3,642,042 ns/iter (+/- 59,619)

@carllerche
Copy link
Member

Thanks for doing the benchmarks.

Could you clarify the action item for this issue?

@glebpom
Copy link
Author

glebpom commented Jan 29, 2019

This is more like a follow-up to #200
According to this benchmark, there are some specific cases on particular hardware, where current Bytes implementation may be optimized by possible using of unsync version. Do you think it makes sense to implement separate non-atomic version? Or is there any possibility to optimize current implementation for ARMv6 or MIPS?

@mzabaluev
Copy link
Contributor

mzabaluev commented Jun 16, 2019

A related, but more generally applicable idea is to redo BytesMut as !Sync and only put the buffer under atomic reference counting on .freeze(). The rationale is that the contents of a BytesMut are typically frequently mutated by a single thread or task accumulating input, but frozen buffers need to be handed off to processing threads with zero copying.

A consistent rework in this manner would need to slash mutating APIs on Bytes to channel people into more performant ways of forming their buffers.

@mzabaluev
Copy link
Contributor

FYI, #269 has a design proposal allowing !Sync instances without code forks or even feature flags.

@glebpom
Copy link
Author

glebpom commented Jun 26, 2019

@mzabaluev Looks great to me! @carllerche what do you think?

@carllerche
Copy link
Member

Generally, !Sync types are pretty annoying to work with as the !Sync aspect leaks. It's better, IMO, to have types be Sync and have fns that require mutating the state to take &mut self.

@mzabaluev
Copy link
Contributor

mzabaluev commented Jun 27, 2019

Generally, !Sync types are pretty annoying to work with as the !Sync aspect leaks.

It (as well as the propagation of !Send) can be considered a design feature, if it can be tuned in and out of generically. I had a very good experience with tower-h2 where it basically instructed me that my service implementation was unsuitable for the executor I was trying to use due to lacking Send, but it worked just fine with the current thread executor.

@glebpom
Copy link
Author

glebpom commented Jun 27, 2019

I use tokio-current-thread heavily, where using !Sync is just fine. I think it makes a lot of sense to provide the optional way to not pay the unnecessary price for Sync

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants