-
Notifications
You must be signed in to change notification settings - Fork 680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change sys::aio::lio_listio to sys::aio::LioCb::listio #872
Conversation
I found the copy/paste error that was causing the tests to fail on armv7 and i686-musl |
I discovered a problem with this PR: it makes it difficult to handle |
c458f49
to
022790e
Compare
The new LioCb structure allows us to control the exact arguments passed to lio_listio, guaranteeing that each call gets a unique storage location for the list argument. This prevents clients from misusing lio_listio in a way that causes events to get dropped from a kqueue Fixes nix-rust#870
A double panic can screw up the first panic's stack trace. Better not to assert! anything when the thread is already panicing.
Supporting the bytes crate was unnecessarily specific. This change replaces from_bytes and from_bytes_mut with from_boxed_slice and from_boxed_mut_slice, which can work with anything that implements Borrow<[u8]> and BorrowMut<[u8]>, respectively.
aac3bbc
to
fef70ad
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a test you could add that would fail with the old API and works with the new one? I didn't see any tests added to check that the new API doesn't have a similar failure mode.
/// The returned value will be the argument that was passed to | ||
/// `from_boxed_slice` when this `AioCb` was created. | ||
/// | ||
/// It is an error to call this method while the `AioCb` is still in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to actually make this a state machine so this error condition can be caught at compile time? So use a different type for in-progress AioCb
s and not-in-progress ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. It's conceivable. I would have to change the signature of methods that change the in_progress state to consume the object and return a new one, like this:
pub fn read(self) -> result::Result<AioCbInProgress, (Error, AioCb)> {
let p: *mut libc::aiocb = &mut self.aiocb;
Errno::result(unsafe {
libc::aio_read(p)
}).map(|_| {
AioCbInProgress::from_aiocb(self)
}).map_err(|e| {
(self, e)
})
}
I would also have to put a Box
around AioCb::aiocb
, because the libc::aiocb
must have a stable address once it's been passed to the kernel.
But I can't guarantee that the compiler will always catch the problem, because if lio_listio
returns EIO
, then some operations may be in-progress but not all. Which means that LioCb::aiocbs
would have to become a Vec
of an Enum
type. So you wouldn't be able to catch misuse of an LioCb
at compile time.
I think I could make such an arrangement work with the downsteam mio-aio and tokio-file crates, but they would likewise have to use Enum
types, so they wouldn't gain anything. tokio-file already has a totally different way of ensuring that you don't abuse an in-progress aiocb (based on runtime checks).
Do you think it's worth it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't particularly think either way, the idea crossed my mind so I asked. I really don't understand this API very well and I don't plan to spend the time to do so. But I wanted to ask in case you hadn't thought about it and it would be a good idea. Current API LGTM.
src/sys/aio.rs
Outdated
/// operations that were restarted by [`LioCb::listio_resubmit`] | ||
/// | ||
/// [`AioCb::error`] (struct.AioCb.html#method.error) | ||
/// [`LioCb::listio_resubmit`] #method.listio_resubmit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This link is incorrect when rendered using rustdoc.
src/sys/aio.rs
Outdated
/// | ||
/// [`AioCb::aio_return`] (struct.AioCb.html#method.aio_return) | ||
/// [`AioCb::error`] (struct.AioCb.html#method.error) | ||
/// [`LioCb::aio_return`] #method.aio_return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect link format.
src/sys/aio.rs
Outdated
/// | ||
/// Use `listio` to submit an aio operation and wait for its completion. In | ||
/// this case, there is no need to use `aio_suspend` to wait or | ||
/// `AioCb#error` to poll. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect doc link here also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two bugs fixed with the new API. #870 could cause kqueue to deliver too few notifications if you issued multiple lio_listio
calls with the list
argument having the same stack location. The new API makes that impossible, because LioCb
owns the list.
The other bug dealt with lio_listio
returning EAGAIN
or EIO
because not all operations were enqueued because of resource limitations. I actually have written a test for this in the tokio-file crate. However, reliably hitting that resource limit requires knowing more information about the system, which I got with the sysrc
crate. I didn't want to import that crate into nix
as a dev dependency, so I didn't add the test here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you think that dependency is too much for our testing? I'm always wary of fixing subtle bugs but not having a test to make sure they don't re-occur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I pushed a test for it.
src/sys/aio.rs
Outdated
/// [`AioCb::aio_return`] (struct.AioCb.html#method.aio_return) | ||
/// [`AioCb::error`] (struct.AioCb.html#method.error) | ||
/// [`LioCb::aio_return`] #method.aio_return | ||
/// [`lio_listio`]: http://pubs.opengroup.org/onlinepubs/9699919799/functions/lio_listio.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why doesn't this use the parens format for external URL like on line 1188? Is there a reason to prefer one syntax over the other and/or is this actually incorrect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this rare case, I actually got it right. Line 1188 is an inline link. Lines 1190-1191 are footnotes.
I think just that last comment, then squash everything and this is GTM. |
Looks pretty good I think, though I didn't dig too deep into AIO/LIO, but it seems reasonable. I'd resolve the one last issue I raised, then squash and I'm happy! |
// Deliberately exceed system resource limits, causing lio_listio to return EIO. | ||
// This test must run in its own process since it deliberately uses all AIO | ||
// resources. ATM it is only enabled on FreeBSD, because I don't know how to | ||
// check system AIO limits on other operating systems. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this theoretically possible on other platforms? Maybe a FIXME or TODO should be added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, but these limits are OS-specific, and I don't know the rules on anything but FreeBSD. The errors, however, are standardized. So as long as listio_resubmit
does the right thing when it gets the error on FreeBSD, then it should do the right thing everywhere.
Which "one last issue" is that? |
To add a TODO or FIXME to add an implementation for other platforms. But it sounds like you don't think they're necessary. |
I really don't. And I don't know how to do it, either. I can't find any documentation about limits for Linux, DragonFly or NetBSD. |
Please squash them r+ me. |
It helps deal with errors like EAGAIN, which can result in a subset of an LioCb's operations being queued. The test is only enabled on FreeBSD, because it requires intimate knowledge of AIO system limits.
bors r+ susurrus |
872: Change sys::aio::lio_listio to sys::aio::LioCb::listio r=asomers a=asomers The new LioCb structure allows us to control the exact arguments passed to lio_listio, guaranteeing that each call gets a unique storage location for the list argument. This prevents clients from misusing lio_listio in a way that causes events to get dropped from a kqueue Fixes #870
The new LioCb structure allows us to control the exact arguments passed
to lio_listio, guaranteeing that each call gets a unique storage
location for the list argument. This prevents clients from misusing
lio_listio in a way that causes events to get dropped from a kqueue
Fixes #870