
[vsock] refactor VhostUserVsockThread worker #450

Merged 2 commits into rust-vmm:main on Sep 26, 2023

Conversation

@ikicha (Contributor) commented Sep 19, 2023

Summary of the PR

Currently, VhostUserVsockThread uses the thread pool executor from the futures crate, but it doesn't need a thread pool executor (or futures at all): all we need is a background worker thread and a way to hand it work.

So I removed the unnecessary external dependency and simplified the logic by using just a thread and a channel.
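
For readers unfamiliar with the pattern, here is a minimal sketch of the thread-plus-channel worker described above; the EventData fields and the processing body are hypothetical placeholders, not the code that was merged:

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for the EventData type from the diff.
struct EventData {
    queue_idx: u16,
}

fn main() {
    // The sender stays with the device; the receiver moves into the worker.
    let (sender, receiver) = mpsc::channel::<EventData>();

    // A plain background worker: runs until every sender is dropped.
    let worker = thread::spawn(move || {
        while let Ok(event) = receiver.recv() {
            // Placeholder for the real packet-processing logic.
            println!("processing event for queue {}", event.queue_idx);
        }
    });

    sender.send(EventData { queue_idx: 0 }).unwrap();
    drop(sender); // closing the channel lets the worker exit
    worker.join().unwrap();
}
```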

Requirements

Before submitting your PR, please make sure you addressed the following
requirements:

  • [x] All commits in this PR are signed (with git commit -s), and the commit
    message has max 60 characters for the summary and max 75 characters for each
    description line.
  • [x] All added/changed functionality has a corresponding unit/integration
    test.
  • [x] All added/changed public-facing functionality has entries in the "Upcoming
    Release" section of CHANGELOG.md (if no such section exists, please create one).
  • [x] Any newly added unsafe code is properly documented.

Review thread on crates/vsock/src/vhu_vsock_thread.rs (outdated; resolved):
@@ -126,7 +136,39 @@ impl VhostUserVsockThread {
),
);
}

let (sender, receiver) = mpsc::channel::<EventData>();
thread::spawn(move || loop {

Nit: It might be worth factoring this closure out to be its own function that explicitly takes the captured parameters by value.
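
For illustration, the suggested refactor might look roughly like this; run_worker and spawn_worker are hypothetical names, not the ones that landed in the PR:

```rust
use std::sync::mpsc;
use std::thread;

struct EventData; // hypothetical placeholder

// The closure body becomes a named function that explicitly takes
// the previously captured parameters by value.
fn run_worker(receiver: mpsc::Receiver<EventData>) {
    while let Ok(_event) = receiver.recv() {
        // process the event
    }
}

fn spawn_worker() -> mpsc::Sender<EventData> {
    let (sender, receiver) = mpsc::channel::<EventData>();
    thread::spawn(move || run_worker(receiver));
    sender
}

fn main() {
    let sender = spawn_worker();
    sender.send(EventData).unwrap();
}
```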

@ikicha (author) replied:

Done. Is this what you wanted?

@stefano-garzarella (Member) commented Sep 19, 2023

@ikicha thanks for the PR. Here are some comments.

Does this change also improve performance?

I like that we reduce dependencies, but I honestly found the previous implementation easier to read. The only thing I didn't like was the duplication of code, which we could fix anyway.

Now with this change, IIUC, each device creates a new thread that manages the packets.

Going back to the previous implementation: since we now support multiple devices at the same time, could we think about having a single ThreadPool for the entire application, with a view to reducing resource consumption? (Of course, we could add a new option to let the user decide the size of the pool.)

@ikicha (Contributor Author) commented Sep 19, 2023

> @ikicha thanks for the PR. Here are some comments.
>
> Does this change also improve performance?

TBH, I think it's almost the same (it might be slightly better because we don't use the thread pool executor's state-management logic).

> I like that we reduce dependencies, but I honestly found the previous implementation easier to read. The only thing I didn't like was the duplication of code, which we could fix anyway.
>
> Now with this change, IIUC, each device creates a new thread that manages the packets.

I think the thread count is the same.
As-is: each device has its own thread pool executor with a pool size of 1.
To-be: each device has its own thread.
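
To make the comparison concrete, here is a sketch of the two per-device variants (not code from the PR), assuming futures 0.3 with its thread-pool feature enabled for the as-is side:

```rust
use futures::executor::ThreadPool;
use std::thread;

fn main() {
    // As-is: each device owns a single-threaded pool executor.
    let pool = ThreadPool::builder()
        .pool_size(1)
        .create()
        .expect("failed to create pool");
    pool.spawn_ok(async {
        // packet processing for one device
    });

    // To-be: each device owns a plain OS thread instead.
    let worker = thread::spawn(|| {
        // packet processing for one device
    });
    worker.join().unwrap();
}
```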

> Going back to the previous implementation: since we now support multiple devices at the same time, could we think about having a single ThreadPool for the entire application, with a view to reducing resource consumption? (Of course, we could add a new option to let the user decide the size of the pool.)

Maybe we could use a single thread to handle every event in the entire application, but I'm hesitant about multiple threads because, IIUC, packet order is important, isn't it?

Also, regarding the thread pool implementation (if we decide to use a thread pool), I think tokio looks better than futures-executor because it's more popular and efficient, and it supports epoll events (we could replace the epoll listener with it). But that would be a big change, so we could do it in the future if a performance issue comes up.

@stefano-garzarella (Member) replied:

>> @ikicha thanks for the PR. Here are some comments.
>>
>> Does this change also improve performance?
>
> TBH, I think it's almost the same (it might be slightly better because we don't use the thread pool executor's state-management logic).

It would be great if you could run some performance tests.

>> I like that we reduce dependencies, but I honestly found the previous implementation easier to read. The only thing I didn't like was the duplication of code, which we could fix anyway.
>>
>> Now with this change, IIUC, each device creates a new thread that manages the packets.
>
> I think the thread count is the same. As-is: each device has its own thread pool executor with a pool size of 1. To-be: each device has its own thread.

Yep, I agree on that. It was a premise for the next paragraph :-)

>> Going back to the previous implementation: since we now support multiple devices at the same time, could we think about having a single ThreadPool for the entire application, with a view to reducing resource consumption? (Of course, we could add a new option to let the user decide the size of the pool.)
>
> Maybe we could use a single thread to handle every event in the entire application, but I'm hesitant about multiple threads because, IIUC, packet order is important, isn't it?

How would the order be impacted in that case?
Order matters, but only from a single device's point of view. I mean, a device should always process its packets in order, but multiple devices can overlap, IMHO.

> Also, regarding the thread pool implementation (if we decide to use a thread pool), I think tokio looks better than futures-executor because it's more popular and efficient, and it supports epoll events (we could replace the epoll listener with it). But that would be a big change, so we could do it in the future if a performance issue comes up.

Got it, thanks for the details.

@ikicha (Contributor Author) commented Sep 20, 2023

>>> @ikicha thanks for the PR. Here are some comments.
>>>
>>> Does this change also improve performance?
>>
>> TBH, I think it's almost the same (it might be slightly better because we don't use the thread pool executor's state-management logic).
>
> It would be great if you could run some performance tests.

Guest: dd if=/dev/zero bs=100M count=10 | ncat --vsock 2 1234
Host: nc -l -U /tmp/vm4.vsock_1234 > /dev/null

I don't see a difference between them (it's around 450 MB/s in both cases).

>>> I like that we reduce dependencies, but I honestly found the previous implementation easier to read. The only thing I didn't like was the duplication of code, which we could fix anyway.
>>>
>>> Now with this change, IIUC, each device creates a new thread that manages the packets.
>>
>> I think the thread count is the same. As-is: each device has its own thread pool executor with a pool size of 1. To-be: each device has its own thread.
>
> Yep, I agree on that. It was a premise for the next paragraph :-)
>
>>> Going back to the previous implementation: since we now support multiple devices at the same time, could we think about having a single ThreadPool for the entire application, with a view to reducing resource consumption? (Of course, we could add a new option to let the user decide the size of the pool.)
>>
>> Maybe we could use a single thread to handle every event in the entire application, but I'm hesitant about multiple threads because, IIUC, packet order is important, isn't it?
>
> How would the order be impacted in that case? Order matters, but only from a single device's point of view. I mean, a device should always process its packets in order, but multiple devices can overlap, IMHO.

That's fine with the current implementation.

But I was talking about a global thread pool with a size greater than 1. In that case, multiple threads could handle packets coming from the same device, and without an additional locking mechanism, ordering might become a problem.
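
To illustrate the concern, here is a minimal sketch (not code from the PR) of two pool workers draining a single shared queue; nothing guarantees that two packets from the same device complete in order:

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

fn main() {
    // (device_id, seq): packets from one device, tagged with a sequence number.
    let (tx, rx) = mpsc::channel::<(u32, u32)>();
    let rx = Arc::new(Mutex::new(rx));

    // A "pool" of two workers sharing one queue.
    let workers: Vec<_> = (0..2)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // The lock is released at the end of this statement, so both
                // workers can be processing packets from device 0 at once.
                let msg = rx.lock().unwrap().recv();
                match msg {
                    Ok((dev, seq)) => println!("device {dev}: packet {seq}"),
                    Err(_) => break, // channel closed
                }
            })
        })
        .collect();

    for seq in 0..4 {
        tx.send((0, seq)).unwrap();
    }
    drop(tx); // close the channel so the workers exit
    for w in workers {
        w.join().unwrap();
    }
}
```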

>> Also, regarding the thread pool implementation (if we decide to use a thread pool), I think tokio looks better than futures-executor because it's more popular and efficient, and it supports epoll events (we could replace the epoll listener with it). But that would be a big change, so we could do it in the future if a performance issue comes up.

> Got it, thanks for the details.

@stefano-garzarella (Member) commented:

In the same PR, we usually try to avoid adding code in one commit and then modifying it in subsequent ones, unless it's strictly necessary for bisectability or other reasons. In this case, I think you can merge everything into one commit.

The squashed commit message:

For now, VhostUserVsockThread uses the thread pool executor from
futures, but it doesn't need a thread pool executor or futures because
we just need a background worker thread and a way to hand it work.

So I removed the unnecessary external dependency and made the logic
simpler by using just a thread and a channel.

Signed-off-by: Jeongik Cha <jeongik@google.com>
@ikicha (Contributor Author) commented Sep 21, 2023

Squashed the commits into one.

@stefano-garzarella (Member) left a review:

Thanks! LGTM!

@ikicha (Contributor Author) commented Sep 22, 2023

@vireshk, can you review this PR? It looks like it needs two approvals to be merged.

@vireshk merged commit 7b2632b into rust-vmm:main on Sep 26, 2023
@ikicha deleted the remove_futures_lib branch on Sep 26, 2023, 08:39