Slow performance on Linux? #336
When running the benchmark using tokio, sending all the requests at once and waiting for them to return concurrently, performance is a lot better. Tested with
I was able to replicate this on Ubuntu. I wonder where the performance loss is occurring.
Hello, we've been testing this benchmark on our own systems. When plugged in, we see benchmark results in line with the ones you have posted for non-Linux platforms, @alshdavid. That said, we've noticed that power saving mode, or throttling due to being unplugged, has a massive effect on the results. For instance, when I switch my machine to "Power Saver" in GNOME, the results I get are: Ryzen 7 7840U / Ubuntu
Macbook M3 Max
Perhaps what's happening here is that the Linux implementation is very sensitive to power saving mode.
I can confirm the same (i.e. worse performance on power saving, and numbers on par with OP's Windows and macOS for performance mode) on NixOS 24.05, 24 × 12th Gen Intel® Core™ i7-12800HX, 64 GB RAM.
Although the above measurements show Linux to be ten times slower than macOS and five times slower than Windows, it's not clear to me why this is unexpected. The platform layer has distinct code for Linux, macOS, and Windows based on completely different OS primitives, so some performance differences would not be surprising. In particular, I wonder whether the macOS support uses Mach ports, rather than BSD features, for better performance. I'm also intrigued whether a factor of ten in these benchmarks represents a measurable performance problem for Servo (or other projects consuming IPC channel, if there are any). (I found one Servo issue specifically about layout of real-world web pages being up to two times slower on Linux when using "parallel" rather than "sequential" layout, but I have no idea if that could be caused by IPC channel performance differences.)
We were evaluating using IPC channels at Atlassian for a project that has a Rust core which calls out to external processes (Node.js and other runtimes) to execute "plugin" code. The messaging overhead on Linux machines, however, made it impractical, so we started looking at alternative options. IPC is certainly still preferred, as it's far simpler and a much nicer mental model than the alternatives.
Thanks @alshdavid. Although Servo is probably the main consumer of IPC channel, I would be grateful for more information about your use case:
That's helpful, thank you. So it seems we don't yet have evidence that any multi-process implementation could perform sufficiently well for very chatty use cases such as yours on Linux.
Unlikely. Is the overhead seen here a result of the serialization/deserialization of values across the IPC bridge? If that's the case, can we just send pointers? I am toying around with the idea of using shared memory between the processes to store Rust channels which act as a bridge, though I don't know enough about how that actually works yet. Still quite new to working with OS APIs. Naively, I'm hoping I can store only a Rust channel in shared memory and send pointers to heap values between processes, though I don't know if the receiving process can access the referenced value or if the OS prevents this (virtual memory?). Perhaps I can have access to a shared heap by forking the parent process? Or perhaps there is a custom Rust allocator that manages a cross-process shared heap.
I believe IPC channel is predicated on (de)serialising values sent across the channel, so I suspect "direct" transmission of values is beyond the scope of IPC channel.
Shared memory or memory-mapped files are likely part of any performant solution. Indeed, the current implementation already uses shared memory. These resources may be useful: https://users.rust-lang.org/t/shared-memory-for-interprocess-communication/92408
I personally think sharing (part of) the Rust heap between processes is a non-starter. It might be possible to build a library for managing shared memory or memory-mapped files as a way of passing values between processes, but that's likely to be a large piece of work. That said, it feels to me that this discussion is going beyond an issue against the current IPC channel implementation and is getting into the realm of speculating about better alternatives. Would you be comfortable closing the issue?
True, I am happy to close this issue. Thanks for helping out 🙏
@alshdavid Thanks and I wish you good progress with https://github.com/atlassian-labs/atlaspack. |
Hi, I have written a wrapper util on top of `ipc_channel` that handles the handshake, swaps channels between the host/child, and adds a request/response API. The performance on my M1 MBP was great, but I was surprised to find that the performance on Linux was significantly slower!
So I wrote a benchmark to test it out. The benchmark sends n requests, blocking on their responses (100k requests means 200k messages over the channel).
I'm not sure if it's my configuration (perhaps something else is interfering), but here are my results.
Hardware
Results
Time taken for n round-trip messages, lower is better:

| Run | macOS | Windows | Linux |
| --- | --- | --- | --- |
| 1 | 0.487 s | 0.356 s | 2.301 s |
| 2 | 1.550 s | 3.497 s | 13.608 s |
| 3 | 14.404 s | 34.769 s | 150.514 s |
I have tried with/without the `memfd` option enabled, and I have tried making this async (using tokio channels/threads) with the same outcome. This is my wrapper (benchmarks are under `examples`): https://github.com/alshdavid/ipc-channel-adapter
To run the benchmark, run `just bench {number_of_requests}`, e.g. `just bench 100000`.
I'm investigating whether another dependency is interfering and will update with my findings, but on the surface, any idea why this might be?