Use io_uring instead of epoll when supported #753

Open
lpereira opened this issue Dec 11, 2019 · 31 comments
Labels
area-System.Net.Sockets enhancement Product code improvement that does NOT require public API changes/additions os-linux Linux OS (any supported distro) tenet-performance Performance related issue
@lpereira
Contributor

io_uring is a new method to perform efficient I/O on Linux systems. It provides a completion model (rather than a readiness model), similar to what IOCP on Windows provides, and unlike the standard poll-like interfaces, it can be used to request I/O from regular files as well (and, unlike the old/broken AIO in Linux, it doesn't require files to be opened in O_DIRECT mode).
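
To make the readiness/completion distinction concrete, here is a minimal sketch of the readiness model as epoll exposes it. The function name `read_when_ready` is made up for illustration and is not taken from pal_networking. epoll only reports that the descriptor is readable, and the caller then issues a separate read(). Under a completion model such as io_uring, the read itself would be submitted up front and its result delivered later as a completion entry.

```c
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* Readiness model (hypothetical helper): epoll only reports that fd is
 * readable; the caller still has to perform the read() itself.
 * Returns the number of bytes read, or -1 on error or timeout. */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    int ep = epoll_create1(0);
    if (ep < 0)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    ssize_t n = -1;

    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0) {
        struct epoll_event ready;
        /* Step 1: wait until the kernel says the descriptor is ready. */
        if (epoll_wait(ep, &ready, 1, timeout_ms) == 1) {
            /* Step 2: a second system call performs the actual I/O. */
            n = read(fd, buf, len);
        }
    }

    close(ep);
    return n;
}
```

The two system calls per operation (wait, then read) are the overhead that a completion model is meant to avoid.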

It is a recent development, but reports of it being used by servers are very promising, often yielding gains exceeding 2 or 4x in throughput. Here's a talk by its main author with details, including benchmarks.

In addition to I/O (read/write/poll), it's also possible to handle connections (accept/connect) and a bunch of other things.

It should be possible to enable this and have both io_uring and epoll (as a fallback) in pal_networking.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Dec 11, 2019
@am11
Member

am11 commented Dec 11, 2019

Going by the pdf, it seems that polled IO might be the most suitable option for PAL networking, because it is efficient, closer to the epoll implementation, and does not require elevated privileges (as the 'kernel side polling' option does). A few questions:

  • Should the implementation take a dependency on liburing (which comes from package managers), or can it carry some boilerplate code, drop the liburing dependency, and make the kernel calls directly?
  • Should it be added as a shim to support a runtime check (like the existing shims for libssl and libnuma)? That way the same portable Linux build, when running on a kernel version lower than 5.1, will switch to epoll.
  • In kernel v5.4 the implementation has improved significantly, so should the PAL implementation take 5.4 as the baseline for switching from epoll to io_uring, or keep 5.1 (where it was first implemented) as the baseline?
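
One way the runtime check asked about above could work without linking against liburing is to probe the system call directly. The sketch below is only an illustration: the names `io_uring_syscall_available`, `choose_backend`, and the `io_backend` enum are hypothetical, and the fallback syscall number 425 is an assumption that holds on most architectures but not all. It relies on the expectation that a kernel implementing io_uring_setup rejects a NULL parameter pointer with EFAULT, while an older kernel reports ENOSYS and a sandboxed environment may report EPERM.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_io_uring_setup
#define __NR_io_uring_setup 425 /* assumption: value used by most architectures */
#endif

/* Returns 1 if the io_uring_setup system call appears usable, 0 otherwise.
 * Passing NULL for the params pointer never creates a ring: a kernel that
 * implements the call is expected to fail with EFAULT, while an older kernel
 * fails with ENOSYS (and a sandbox may fail with EPERM). */
int io_uring_syscall_available(void)
{
    long rc = syscall(__NR_io_uring_setup, 1u, NULL);
    if (rc >= 0) {
        /* Not expected with NULL params, but do not leak a descriptor. */
        close((int)rc);
        return 1;
    }
    return errno == EFAULT;
}

/* Hypothetical backend selector for the PAL. */
enum io_backend { BACKEND_EPOLL, BACKEND_IO_URING };

enum io_backend choose_backend(void)
{
    return io_uring_syscall_available() ? BACKEND_IO_URING : BACKEND_EPOLL;
}
```

A probe like this only says that the system call exists. It says nothing about which opcodes or fixes a given kernel has, so it would likely need to be combined with a baseline check.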

@lpereira
Contributor Author

I'd say depend on liburing. Doing stuff by hand is possible but we would be essentially replicating it inside the runtime; better to stick with something that's been debugged and tested already. I don't know how much they care about API and ABI compatibility at this point, so using it as a shim might not be a good idea; maybe using a git submodule?

As for the minimum kernel requirement: for io_uring, we should support 5.4+ only, falling back to epoll on older kernel versions. There were many improvements in the 5.5 series too, so eventually we might even bump the requirement if we end up taking advantage of those features, just to simplify how we implement stuff -- for instance, async file I/O and not only sockets. (This kernel is still not common in most distributions, but it would be nice if the performance just appeared out of the blue after a kernel upgrade.)
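
A rough sketch of the "5.4+ only, otherwise epoll" policy described above, assuming the decision is made from the release string reported by uname(). The function names `kernel_at_least` and `should_use_io_uring` are invented for illustration. Distribution kernels sometimes backport features, so a version comparison is only an approximation of what the running kernel supports.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Parses a release string such as "5.4.0-42-generic" and reports whether it
 * is at least major.minor. Returns 0 if the string cannot be parsed. */
int kernel_at_least(const char *release, int major, int minor)
{
    int maj = 0, min = 0;
    if (sscanf(release, "%d.%d", &maj, &min) != 2)
        return 0;
    return maj > major || (maj == major && min >= minor);
}

/* Hypothetical policy from the discussion: io_uring on 5.4+, epoll otherwise. */
int should_use_io_uring(void)
{
    struct utsname u;
    if (uname(&u) != 0)
        return 0;
    return kernel_at_least(u.release, 5, 4);
}
```

Bumping the baseline later (for example to a 5.5-series kernel) would then be a one-line change in `should_use_io_uring`.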

@damageboy
Contributor

Possible dupe of:

https://github.com/dotnet/coreclr/issues/24441

This situation with the issues not yet ported is starting to generate noise...

@lpereira
Contributor Author

lpereira commented Dec 11, 2019

Indeed it's a dupe, @damageboy. (I'll keep this issue open here as it might be easier to reference it and it's unlikely a lot of folks will keep a close eye on the coreclr repo after the consolidation.)

@damageboy
Contributor

@lpereira Aren't the issues moving? Has anything changed?

@lpereira
Contributor Author

> @lpereira Aren't the issues moving? Has anything changed?

They're moving, but it should take a month or so. I can close this one once the move is complete (can't easily mark as dupe in different repos.)

@am11
Member

am11 commented Dec 11, 2019

> It should be possible to enable this and have both io_uring and epoll (as a fallback) in pal_networking.

I think pal_networking, coming from corefx, deserves a separate issue, as it has a defined, finite surface area that currently uses epoll and where io_uring can be incorporated. It can be tracked here.

The coreclr issue is a broader discussion of how to use io_uring in a variety of scenarios, which coreclr's PAL currently handles without epoll and friends, in a kernel-agnostic manner, afaict.

@lpereira
Contributor Author

Another thing I think we can use io_uring for -- maybe not right now, but we could contribute a patch to the Linux kernel -- is implementing WaitForMultipleObjectsEx() using futexes directly, with a command in io_uring to perform operations on multiple futexes at the same time.

@jeffschwMSFT jeffschwMSFT removed the untriaged New issue has not been triaged by the area owner label Jan 8, 2020
@jeffschwMSFT jeffschwMSFT added this to the Future milestone Jan 8, 2020
@isilence

> Another thing I think we can use io_uring for -- maybe not right now, but we could contribute a patch to the Linux kernel -- is implementing WaitForMultipleObjectsEx() using futexes directly, with a command in io_uring to perform operations on multiple futexes at the same time.

@lpereira, I'm speculating, but would a new futex opcode, combined with the already implemented linked commands and timeouts, suffice for you?
Someone already mentioned supporting futex(2): axboe/liburing#39

@benaadams
Member

benaadams commented Jan 19, 2020

epoll bare minimum echo server

50 clients, running 512 bytes, 60 sec.

Speed: 189185 request/sec, 189185 response/sec
Requests: 11351122
Responses: 11351122

io_uring bare minimum echo server (Linux 5.4 needed; lower versions don't return the right number of bytes read from io_uring_prep_readv in cqe->res): https://github.com/frevib/io_uring-echo-server

Benchmarking: localhost:5555
50 clients, running 512 bytes, 60 sec.

Speed: 368368 request/sec, 368368 response/sec
Requests: 22102112
Responses: 22102110
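
As a quick sanity check on the figures above: the per-second rates are the request totals divided by the 60-second run (11351122 / 60 ≈ 189185 and 22102112 / 60 ≈ 368368), which puts the io_uring server at roughly 1.95x the epoll server's throughput. The small helpers below, with names chosen purely for illustration, reproduce that arithmetic.

```c
/* Average requests per second over a run of the given length. */
double requests_per_second(long long requests, double seconds)
{
    return (double)requests / seconds;
}

/* Throughput ratio between two runs of the same duration. */
double speedup(long long baseline_requests, long long candidate_requests)
{
    return (double)candidate_requests / (double)baseline_requests;
}
```

For example, `speedup(11351122, 22102112)` compares the epoll run against the io_uring run.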

@isilence

isilence commented Jan 19, 2020

The difference looks good, even though it can do even better. E.g. io_uring allows registered buffers and fds, supports IORING_OP_ACCEPT, etc. (or get rid of callocs in the loop...)

@benaadams
Member

Edit: removed links, as the author has decided on GPL v3.0.

@frevib

frevib commented Jan 20, 2020

@benaadams changed it to MIT, sorry for the inconvenience. @isilence it definitely needs some optimizations and I think there are some tiny bugs. If you want/like/have time to issue a PR, I’m happy to merge.

@benaadams
Member

Edit: the author changed the license to MIT, so I put the link back: https://github.com/frevib/io_uring-echo-server :)

It's a networking example using liburing, which is LGPL, so it can be linked to (though MIT code can't be derived from it, so don't look at the liburing source in case we do our own io_uring implementation, which must be clean and not derived from LGPL code).

Though I don't know the dotnet policy on linking to LGPL and whether it's allowed. /cc @jkotas

There's a very detailed document from @axboe, the author of liburing and also one of the authors of io_uring (https://kernel.dk/io_uring.pdf), on the motivation for io_uring and what it achieves, as well as how to use it (including considerations around memory barriers).

That then leads to the motivation for liburing and how to use it (it simplifies all the boilerplate setup and teardown for io_uring and handles all the memory barriers, etc.).

To quote

With the inner details of the io_uring out of the way, you'll now be relieved to learn that there's a simpler way to do much of the above. The liburing library serves two purposes:

  • Remove the need for boiler plate code for setup of an io_uring instance.
  • Provide a simplified API for basic use cases.

Also, an LWN.net article about io_uring.

@am11
Member

am11 commented Jan 20, 2020

As noted above, I think the use case in pal_networking.c in this repository, where the implementation currently uses epoll, does not require linking to liburing (a convenience library). It is more work, yes, but IMO worth it for the dotnet runtime. Taking a dependency on another runtime library also comes with a packaging cost. For example, liburing is not readily available as an Alpine Linux package or in many other package management systems; see Absent in repositories.

@lpereira
Contributor Author

lpereira commented Jan 20, 2020

Notwithstanding library availability -- because we could use git submodules, for instance, and statically link with liburing -- there's a bigger issue: linking with LGPL would require us to also distribute .o files in addition to the binaries for .NET.

So I agree that it would be better to reimplement what liburing does; it's a thin wrapper around the kernel API. It mostly reduces a lot of the boilerplate necessary to map the queues and provides a bunch of auxiliary functions and whatnot.

If we're unsure how to use the API, though, it's possible to read from other implementations; for instance, there's a dual-licensed Apache 2/MIT library for Rust that could be used for studying purposes.

@benaadams
Member

Also, the libuv PR for io_uring could be something to look at: libuv/libuv#2322 (libuv uses a Joyent attribution license). They also state there that they can't look at the liburing source, as it's LGPL: libuv/libuv#2322 (comment)

@axboe

axboe commented Jan 20, 2020

FWIW, I'd be willing to change the liburing license to dual MIT/GPL. There's really nothing fancy in the library, it's mostly just helpers, and a simplified interface should the application wish to use that. But it'd be a shame to have some of this code duplicated just because of licensing constraints.

@lpereira
Contributor Author

lpereira commented Jan 20, 2020

@axboe That would be appreciated; it would indeed help a lot with io_uring adoption, given that the GPL family of licenses isn't, unfortunately (in my personal opinion), that popular these days.

@axboe

axboe commented Jan 20, 2020

I like GPL for applications, and I still use it, but it makes less sense for libraries. And in particular for something like liburing, which isn't really a lot of smarts, it's mostly just setup and helper code. I'm doing some due diligence by emailing folks that have more than a few commits in liburing, then I'll change it provided nobody objects (can't see why they would).

@axboe

axboe commented Jan 21, 2020

> I'm doing some due diligence by emailing folks that have more than a few commits in liburing, then I'll change it provided nobody objects (can't see why they would).

This has now been done.

@lpereira
Contributor Author

For the record, here's an ASP.NET transport by @tkp1n that reimplements liburing in C#: https://github.com/tkp1n/IoUring

@isilence

> @lpereira, I'm speculating, but would a new futex opcode, combined with the already implemented linked commands and timeouts, suffice for you?
> Someone already mentioned supporting futex(2): axboe/liburing#39

Going back to the ignored question... Guys, what's your use case, and what would you need in order to integrate io_uring? Support for futex(2)? Something else?

@lpereira
Contributor Author

> > @lpereira, I'm speculating, but would a new futex opcode, combined with the already implemented linked commands and timeouts, suffice for you?
> > Someone already mentioned supporting futex(2): axboe/liburing#39
>
> Going back to the ignored question... Guys, what's your use case, and what would you need in order to integrate io_uring? Support for futex(2)? Something else?

Yeah, futex support for io_uring would be very welcome, especially if it had the FUTEX_WAIT_MULTIPLE command that was proposed a while ago (the use case is for Wine's implementation of WaitForMultipleObjects(), which is currently using polled eventfds, but we also have an implementation in our PAL that could benefit from this.)
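
For readers unfamiliar with the "polled eventfds" approach mentioned above, here is a minimal sketch of a wait-any over a set of eventfds using poll(). It is not Wine's or the PAL's actual code: the name `wait_any` and the `MAX_WAIT_OBJECTS` limit are assumptions made for illustration (the limit mirrors the Windows cap of 64 handles per wait).

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_WAIT_OBJECTS 64 /* assumed cap, mirroring the Windows limit */

/* Waits until any of the n eventfds is signaled. Returns the index of the
 * first signaled one (consuming its counter), or -1 on timeout or error. */
int wait_any(const int *fds, int n, int timeout_ms)
{
    struct pollfd pfds[MAX_WAIT_OBJECTS];
    if (n <= 0 || n > MAX_WAIT_OBJECTS)
        return -1;

    for (int i = 0; i < n; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    if (poll(pfds, (nfds_t)n, timeout_ms) <= 0)
        return -1;

    for (int i = 0; i < n; i++) {
        if (pfds[i].revents & POLLIN) {
            uint64_t value;
            /* Reading resets the eventfd counter, like an auto-reset event. */
            if (read(fds[i], &value, sizeof value) != (ssize_t)sizeof value)
                return -1;
            return i;
        }
    }
    return -1;
}
```

Every wait here costs at least a poll() and a read(), and every signal costs a write(), which is the overhead a futex-based wait with a userspace fast path would aim to reduce.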

@isilence

isilence commented Jan 27, 2020

> Yeah, futex support for io_uring would be very welcome, especially if it had the FUTEX_WAIT_MULTIPLE command that was proposed a while ago (the use case is for Wine's implementation of WaitForMultipleObjects(), which is currently using polled eventfds, but we also have an implementation in our PAL that could benefit from this.)

Great, I'll try to take a look. I'm concerned about not having a fast path for in-userspace locking, but it should still be better than eventfd + epoll. I haven't seen FUTEX_WAIT_MULTIPLE, but it will need to be merged first.

@lpereira
Contributor Author

lpereira commented Feb 3, 2020

This article about using io_uring in modern C++ (with coroutines et al) is a pretty good read and gives some API insights, too: https://cor3ntin.github.io/posts/iouring/

@benaadams
Member

LWN article: The rapid growth of io_uring

@scalablecory scalablecory added enhancement Product code improvement that does NOT require public API changes/additions os-linux Linux OS (any supported distro) labels Feb 18, 2020
@scalablecory scalablecory modified the milestones: Future, 5.0 Feb 18, 2020
@karelz karelz added the tenet-performance Performance related issue label Feb 18, 2020
@antonfirsov
Member

antonfirsov commented Mar 2, 2020

A general update:

All prototyping is being done on https://github.com/tmds/Tmds.LinuxAsync, together with other experiments from #14304. We hope to see some numbers soon. After that, we can think about productizing the changes.

@karelz karelz modified the milestones: 5.0, Future May 6, 2020
@ericsampson

Is it possible to dupe-close one of these two issues, so that there is one main tracking issue?
#12650

radical pushed a commit to radical/runtime that referenced this issue Jul 7, 2022
…et#753)

Recognizes a bad state from a specific exit code and then exits with SIMULATOR_FAILURE exit code instead.
Prevents hangs when running the application.
@ShreyasJejurkar
Contributor

Hopefully this will be considered for 9.0

@ReubenBond
Member

Nice docs on io_uring for anyone interested in this: https://nick-black.com/dankwiki/index.php/Io_uring
