enhance SubmissionQueue to support generate SQE in place #119
Can you describe what problem it solves? |
Sorry, I mistakenly used an old version as my working base. I found a bug in that old version, and it is already fixed in the latest code base. |
The The |
Sure, I started working by reading all the code and then found something interesting:) |
BTW, I failed to run "cargo run --package io-uring-test" with the latest code with an error:
And the toolchain used is "nightly-x86_64-unknown-linux-gnu". |
You may be using a distribution that does not support statx, see here. |
Aha, the RefCell actually helps ownedsplit. Say:
|
My fault, SubmissionUring does not support Clone yet, so it should be safe. |
Hi @quininer , I have updated the MR and feel it's ready for review:) |
I missed it and I will look at it on the weekend. |
I am a little dissatisfied with the API proposed by this PR. Although it does avoid building an io_uring_sqe, it still has to build the Opcode type once.
Benchmarks show that it has little performance advantage over a normal push. I am not sure this change is worthwhile.
normal time: [6.0094 us 6.0865 us 6.1785 us]
change: [-0.1695% +1.3807% +3.1320%] (p = 0.11 > 0.05)
No change in performance detected.
prepare time: [6.3320 us 6.4818 us 6.6643 us]
change: [-8.5613% -5.6659% -2.6755%] (p = 0.00 < 0.05)
Performance has improved.
src/squeue.rs
@@ -285,6 +354,44 @@ impl Entry {
    }
}

impl OptionValues {
The name should be more descriptive.
How about "SqeCommonOptions"?
src/opcode.rs
/// Trait to prepare an SQE from an opcode object.
pub trait PrepareSQE {
    /// Prepare an SQE from an opcode object.
    fn prepare(&self, sqe: &mut sys::io_uring_sqe, options: Option<&OptionValues>);
It should not expose internal types; maybe this trait should not be exposed at all.
Both push_command() and push_commands() use PrepareSQE as the bound on their generic type.
@quininer I have implemented a version to prepare SQE in place. And take
|
Introduce helpers next_sqe() and move_forward() for the SQ; they will be reused in the following patches. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Introduce trait PrepareSQE to support preparing an SQE in place by calling opcode.prepare(&mut sqe). Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Add SubmissionQueue::push_command()/push_commands() to generate SQEs in place. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Add unit test cases and a benchmark for push_command(). Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Currently the opcode module provides a data structure for each io_uring operation code, and an SQE is submitted as follows:
- generate an opcode object, such as Nop, Readv etc.
- convert the opcode object into an SQE object
- copy the generated SQE object onto the next available SQE in the submission queue

One previous patch in this series implements PrepareSQE to shorten the submission process to:
- generate an opcode object, such as Nop, Readv etc.
- convert the opcode object to an SQE in the submission queue in place

The process to submit a request could be further optimized as:
- get the next available SQE from the submission queue
- prepare the SQE for the io_uring operation
- commit the SQE

This patch enhances the opcode module to prepare an SQE in place, without involving the intermediate opcode structures. The way to achieve this is to introduce a structure for each opcode that supports in-place preparation.

    #[repr(transparent)]
    pub struct ReadvSqe {
        sqe: sys::io_uring_sqe,
    }

    impl ReadvSqe {
        #[inline]
        pub fn prepare(&mut self, fd: sealed::Target, iovec: *const libc::iovec, len: u32) {
        }

        #[inline]
        pub fn ioprio(mut self, ioprio: u16) -> Self {
            self.sqe.ioprio = ioprio.as_sqe_value() as _;
            self
        }
    }

    impl<'a> From<&'a mut sys::io_uring_sqe> for &'a mut ReadvSqe {
        #[inline]
        fn from(sqe: &'a mut sys::io_uring_sqe) -> &'a mut ReadvSqe {
            unsafe { mem::transmute(sqe) }
        }
    }

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Export SubmissionQueue::get_available_sqe() to obtain an available SQE for in-place preparation, and export SubmissionQueue::move_forward() to commit prepared SQEs.

Sample code using the new interface:

    pub fn prepare_sqe(mut sq: SubmissionQueue<'_>) {
        unsafe {
            match sq.get_available_sqe(0) {
                Ok(sqe) => {
                    let nop_sqe: &mut crate::opcode::NopSqe = sqe.into();
                    nop_sqe.prepare();
                    sq.move_forward(1);
                }
                Err(_) => return,
            }
        }
    }

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
Add benchmark for preparing SQE in place. Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
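The three-step flow described in the commit messages above (get the next available SQE, prepare it in place, commit it) can be sketched without the real crate. This is only an illustration under stated assumptions: `MockSqe`, `MockQueue`, `QueueFull`, `OP_READV` and `submit_readv_in_place` are hypothetical stand-ins, not the io-uring crate's types, and the builder methods take `&mut self` here so that they can be chained on the borrowed slot.

```rust
/// Hypothetical stand-in for `sys::io_uring_sqe`; only a few fields are modelled.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct MockSqe {
    pub opcode: u8,
    pub ioprio: u16,
    pub fd: i32,
    pub addr: u64,
    pub len: u32,
}

/// Hypothetical opcode number, used only by this sketch.
pub const OP_READV: u8 = 1;

/// Typed view over an SQE slot; `repr(transparent)` guarantees the same layout.
#[repr(transparent)]
pub struct ReadvSqe {
    sqe: MockSqe,
}

impl ReadvSqe {
    /// Overwrite the slot with a fresh readv request.
    pub fn prepare(&mut self, fd: i32, iovec_addr: u64, len: u32) -> &mut Self {
        self.sqe = MockSqe { opcode: OP_READV, fd, addr: iovec_addr, len, ..MockSqe::default() };
        self
    }

    /// Set the request priority on the slot.
    pub fn ioprio(&mut self, ioprio: u16) -> &mut Self {
        self.sqe.ioprio = ioprio;
        self
    }
}

impl<'a> From<&'a mut MockSqe> for &'a mut ReadvSqe {
    fn from(sqe: &'a mut MockSqe) -> Self {
        // SAFETY: `ReadvSqe` is `repr(transparent)` over `MockSqe`.
        unsafe { &mut *(sqe as *mut MockSqe as *mut ReadvSqe) }
    }
}

#[derive(Debug, PartialEq)]
pub struct QueueFull;

/// Toy ring buffer standing in for the submission queue.
pub struct MockQueue {
    entries: Vec<MockSqe>,
    head: usize, // entries before `head` are treated as consumed
    tail: usize, // next slot to hand out
}

impl MockQueue {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        MockQueue { entries: vec![MockSqe::default(); capacity], head: 0, tail: 0 }
    }

    pub fn len(&self) -> usize {
        self.tail - self.head
    }

    pub fn entries(&self) -> &[MockSqe] {
        &self.entries
    }

    /// Borrow the slot `offset` places past the tail without committing it.
    pub fn get_available_sqe(&mut self, offset: usize) -> Result<&mut MockSqe, QueueFull> {
        if self.len() + offset >= self.entries.len() {
            return Err(QueueFull);
        }
        let mask = self.entries.len() - 1;
        let idx = (self.tail + offset) & mask;
        Ok(&mut self.entries[idx])
    }

    /// Commit `n` prepared slots by advancing the tail.
    pub fn move_forward(&mut self, n: usize) {
        self.tail += n;
    }
}

/// Get a slot, prepare it in place, then commit it.
pub fn submit_readv_in_place(
    sq: &mut MockQueue,
    fd: i32,
    iovec_addr: u64,
    len: u32,
    ioprio: u16,
) -> Result<(), QueueFull> {
    let sqe = sq.get_available_sqe(0)?;
    let readv: &mut ReadvSqe = sqe.into();
    readv.prepare(fd, iovec_addr, len).ioprio(ioprio);
    sq.move_forward(1);
    Ok(())
}
```

In this sketch no temporary SQE value is built and copied; the slot inside the queue is written directly, which is the point of the in-place interface.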
I have also conducted a round of benchmarks on one platform. For the Nop tests, the results are as below (in us):
For a detailed analysis of the performance results, please refer to #116. There are slight performance degradations for Nop, but I think the performance behavior may change for actual workloads. And the new interface enables new usages of the io-uring crate :) |
This is the same as I expected. opcode is just an entry builder by design. It should be constructed simply and pushed into sq immediately, so it should always be easy to optimize. If this PR does not provide a visible performance improvement, I tend not to accept it. |
Thanks for your patience, let's close it because it doesn't bring much value :) |
Anyway, thank you for trying, thank you. :D |
Enhance io_uring with new interfaces to support preparing SQEs in place. These new interfaces pass arguments by reference instead of by value, which could then be used to support ringbahn.
I have added some unit test cases and one benchmark case. According to my test results, there is no regression for the existing SubmissionQueue::push(). The new SubmissionQueue::push_command() is slightly slower than push(), but acceptably so.
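The difference between pushing by value and preparing by reference can be illustrated with a small model. This is a rough sketch rather than the PR's actual code: `SketchSqe`, `SketchQueue`, `SketchQueueFull` and the simplified one-argument `PrepareSqe::prepare` are assumptions made for the example (the trait quoted in the review also takes an `options` argument, which is omitted here).

```rust
/// Hypothetical stand-in for the kernel SQE layout.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct SketchSqe {
    pub opcode: u8,
    pub user_data: u64,
}

#[derive(Debug, PartialEq)]
pub struct SketchQueueFull;

/// Simplified version of the trait under review: write the request into a slot.
pub trait PrepareSqe {
    fn prepare(&self, sqe: &mut SketchSqe);
}

/// A no-op request; opcode value 0 is chosen arbitrarily for this sketch.
pub struct Nop {
    pub user_data: u64,
}

impl PrepareSqe for Nop {
    fn prepare(&self, sqe: &mut SketchSqe) {
        *sqe = SketchSqe { opcode: 0, user_data: self.user_data };
    }
}

/// Toy fixed-capacity queue (no wrap-around, to keep the sketch short).
pub struct SketchQueue {
    slots: Vec<SketchSqe>,
    tail: usize,
}

impl SketchQueue {
    pub fn new(capacity: usize) -> Self {
        SketchQueue { slots: vec![SketchSqe::default(); capacity], tail: 0 }
    }

    pub fn len(&self) -> usize {
        self.tail
    }

    pub fn slots(&self) -> &[SketchSqe] {
        &self.slots
    }

    /// By-value path: the caller builds an entry, which is then copied into the slot.
    pub fn push(&mut self, entry: SketchSqe) -> Result<(), SketchQueueFull> {
        let slot = self.slots.get_mut(self.tail).ok_or(SketchQueueFull)?;
        *slot = entry;
        self.tail += 1;
        Ok(())
    }

    /// By-reference path: the opcode object writes straight into the slot.
    pub fn push_command<T: PrepareSqe>(&mut self, op: &T) -> Result<(), SketchQueueFull> {
        let slot = self.slots.get_mut(self.tail).ok_or(SketchQueueFull)?;
        op.prepare(slot);
        self.tail += 1;
        Ok(())
    }

    /// Batch variant: all-or-nothing, so a full queue leaves the tail untouched.
    pub fn push_commands<T: PrepareSqe>(&mut self, ops: &[T]) -> Result<(), SketchQueueFull> {
        if self.tail + ops.len() > self.slots.len() {
            return Err(SketchQueueFull);
        }
        for op in ops {
            op.prepare(&mut self.slots[self.tail]);
            self.tail += 1;
        }
        Ok(())
    }
}
```

Because `push_command()` only borrows the opcode object, a caller that keeps long-lived request objects can resubmit them without rebuilding an entry each time. Whether that pays off in practice depends on the workload, as the benchmark discussion above suggests.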