Performance with Async #7
Comments
I've taken a (very quick) look at the code and have a few suggestions:
With these, I get numbers like this:
It seems like io_uring massively reduces voluntary context switches, as expected 😁
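For anyone wanting to cross-check the context-switch and system-time observations outside the project's own examples, here is a minimal sketch. It assumes a Unix system, uses plain blocking reads rather than io_uring or the async backend, and `measure_read` is a hypothetical helper name, not part of this repository. It only shows how to collect the counters with `getrusage`.

```python
import resource


def measure_read(path, chunk_size=1 << 20):
    """Read `path` sequentially with blocking reads and return the
    byte count plus the change in this process's rusage counters."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    total = 0
    # buffering=0 gives a raw file object, so each read() is one syscall.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    after = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "bytes": total,
        "user_s": after.ru_utime - before.ru_utime,
        "system_s": after.ru_stime - before.ru_stime,
        "voluntary_ctx": after.ru_nvcsw - before.ru_nvcsw,
        "involuntary_ctx": after.ru_nivcsw - before.ru_nivcsw,
    }
```

Calling `measure_read` twice on the same file and keeping the second result roughly approximates the hot-cache condition, provided the file fits in the page cache.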
Oh wow, I forgot to enable release optimizations. Obviously, in a CPU-bound workload these would have a significant impact. When you have a chance, could you try running the tcp examples and piping a file into them via netcat? They essentially do the same thing but via TCP; locally I can't measure any difference in speed between the two. If you get the same result, then I think it makes sense to try just retrofitting file IO using uring onto the existing loop rather than doing a whole-hog async event loop rewrite.
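To make the netcat test concrete, here is a rough stand-in for the receiving side. It is not the project's tcp example. It is a hypothetical byte-counting sink (`serve_once` and its parameters are invented for illustration) that accepts one connection, drains it, and reports the bytes received and the elapsed time.

```python
import socket
import time


def serve_once(host="127.0.0.1", port=0, chunk_size=1 << 16, on_ready=None):
    """Accept a single TCP connection, read until the peer closes,
    and return (total_bytes, elapsed_seconds)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))  # port=0 lets the OS pick a free port
        srv.listen(1)
        if on_ready is not None:
            # Report the port actually bound, useful when port=0.
            on_ready(srv.getsockname()[1])
        conn, _ = srv.accept()
        with conn:
            total = 0
            start = time.perf_counter()
            while True:
                data = conn.recv(chunk_size)
                if not data:  # empty read: the sender closed the stream
                    break
                total += len(data)
            return total, time.perf_counter() - start
```

With the sink listening on a fixed port, say `serve_once(port=9000)` (the port is an arbitrary choice), a file can be piped in with `nc 127.0.0.1 9000 < somefile`. Some netcat builds need an extra flag to close the connection at end of input.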
@mt-caret
Adding this here as a central point for discussion rather than cross-linking between different repositories.
When you have some time it would be great if you could validate some experimental results and check the examples in the prototype-async backend. In particular I'm seeing higher system usage in read-heavy workloads with Async. I can't figure out if this is an increase in load due to buffering or issues driven by file-IO threading.
If it is a true gain, then the backend might have a notable impact on throughput in IO-bound applications with streamed computation, since CPUs tend to be tuned to boost workloads on a single core.
With io_uring:
With Async:
Tested on a 10GB file with hot caches.
In particular, I'm using the bytes_ variants of the methods with async. I don't think the extra memcpy should affect CPU usage all that much, though. Perf top shows an increase in kernel locking, but it's not too illuminating.
Kernel traces (a bit misleading, since with uring the read path is obscured by the async kernel, whereas in the syscall path the read path is visible in the trace):
Would be great to get your thoughts and make sure this isn't just PEBKAC.