
vsock: Increase NUM_QUEUES to 3 #409

Merged — 3 commits merged into rust-vmm:main from the crosvm branch on Jul 28, 2023
Conversation

@ikicha (Contributor) commented Jul 26, 2023

In the virtio standard, vsock uses 3 virtqueues (https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-4380002), and crosvm and QEMU also use 3 queues, but this vhost-user-vsock device implementation assumes there are only 2 virtqueues, which causes problems with crosvm (at least).

Fixes #408
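For context, the virtqueue layout the spec mandates, expressed with the event-index constants this crate uses elsewhere in this thread (an illustrative sketch, not the exact file contents):

```rust
// Virtqueue layout per virtio spec §5.10.2: 0 = rx, 1 = tx, 2 = event.
pub const RX_QUEUE_EVENT: u16 = 0;
pub const TX_QUEUE_EVENT: u16 = 1;
pub const EVT_QUEUE_EVENT: u16 = 2;

// The change proposed here: report all three queues instead of two, so
// frontends that validate the queue count (like crosvm) accept the device.
pub const NUM_QUEUES: usize = 3;
```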

@stefano-garzarella (Member) left a comment


What problems does it cause in crosvm?

This device behaves like the in-kernel vhost-vsock device and only handles the TX and RX queues. The event queue is handled by the VMM (as with the in-kernel device).
Indeed, in handle_event() the EVT_QUEUE_EVENT arm is empty.

I think we should make this change only once we implement something there for VMMs that don't handle the event queue.
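For reference, this is the state Stefano describes, before this PR: a hedged sketch of the dispatch inside handle_event() in crates/vsock/src/vhu_vsock.rs (the exact surrounding code, variable names, and error handling are assumed):

```rust
// Inside handle_event(): dispatch on which epoll event fired.
match device_event {
    RX_QUEUE_EVENT => {
        thread.process_rx(vring_rx, evt_idx)?;
    }
    TX_QUEUE_EVENT => {
        thread.process_tx(vring_tx, evt_idx)?;
    }
    // Intentionally empty: the VMM is expected to handle the event queue,
    // as with the in-kernel vhost-vsock device.
    EVT_QUEUE_EVENT => {}
    _ => {
        // Unknown event id; error handling elided in this sketch.
    }
}
```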

@ikicha (Contributor, Author) commented Jul 27, 2023

> What problems does it cause in crosvm?
>
> This device behaves like the in-kernel vhost-vsock device and only handles the TX and RX queues. The event queue is handled by the VMM (as with the in-kernel device). Indeed, in handle_event() the EVT_QUEUE_EVENT arm is empty.
>
> I think we should make this change only once we implement something there for VMMs that don't handle the event queue.

crosvm checks whether the number of queues matches between the device and the VMM, and if not, it fails with an initialization error.

I was curious whether QEMU still works if we set the number of queues to 3.

@stefano-garzarella (Member)

> crosvm checks whether the number of queues matches between the device and the VMM, and if not, it fails with an initialization error.
>
> I was curious whether QEMU still works if we set the number of queues to 3.

Just tested, and it works, but should we do something in handle_event() for EVT_QUEUE_EVENT? At least print a warning.

What is crosvm expecting from that queue?

@stefano-garzarella (Member)

Does crosvm also support the vhost-vsock in-kernel device? How does it deal with it, since that device does not handle the event queue?

@ikicha (Contributor, Author) commented Jul 27, 2023

For vhost-vsock in crosvm: https://github.com/google/crosvm/blob/e70a8774a1b56ab7b7f6667a7783694cf72ced95/devices/src/virtio/vhost/vsock.rs#L191C17-L191C28

It looks like event_queue only takes care of VIRTIO_VSOCK_EVENT_TRANSPORT_RESET during activation.

For the vhost-user-vsock device for Windows in crosvm: https://github.com/google/crosvm/blob/e70a8774a1b56ab7b7f6667a7783694cf72ced95/devices/src/virtio/vsock/sys/windows/vsock.rs#L1304

In this implementation, it doesn't act on events from the event_queue; it just logs an error.
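As background, the payload carried on the event queue is tiny: the driver posts empty buffers and the device fills each with a single event word. A sketch of the layout from the virtio spec (the Rust rendering and names are illustrative):

```rust
// virtio-vsock event, per the virtio spec ("Device Events", §5.10.6.7).
#[repr(C)]
struct VirtioVsockEvent {
    event: u32, // le32 on the wire
}

// The only event id the spec defines: the transport was reset (e.g. after
// a live migration), so the guest must tear down existing connections.
const VIRTIO_VSOCK_EVENT_TRANSPORT_RESET: u32 = 0;
```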

@stefano-garzarella (Member)

> For vhost-vsock in crosvm: https://github.com/google/crosvm/blob/e70a8774a1b56ab7b7f6667a7783694cf72ced95/devices/src/virtio/vhost/vsock.rs#L191C17-L191C28
>
> It looks like event_queue only takes care of VIRTIO_VSOCK_EVENT_TRANSPORT_RESET during activation.

Sorry, I don't know the crosvm code. It looks like it is handling that queue, right?

Yep, QEMU also uses it just to send VIRTIO_VSOCK_EVENT_TRANSPORT_RESET during a live migration.

> For the vhost-user-vsock device for Windows in crosvm: https://github.com/google/crosvm/blob/e70a8774a1b56ab7b7f6667a7783694cf72ced95/devices/src/virtio/vsock/sys/windows/vsock.rs#L1304
>
> In this implementation, it doesn't act on events from the event_queue; it just logs an error.

So, what should we do here?

Maybe it is better to fix crosvm to expect that a vhost-user device may not implement all queues, since that is legitimate behavior.

@ikicha (Contributor, Author) commented Jul 27, 2023

Out of curiosity, doesn't QEMU check whether the number of queues matches?

I think which queues are left unimplemented may vary per vhost-user device implementation or protocol, so it's hard for crosvm to check for these exceptions (for example, the rule would be that omitting the 3rd queue is okay for vsock but not okay for another device?). So, if there is no regression in QEMU, I prefer increasing the queue count. (And judging from the Windows implementation, it looks like crosvm doesn't expect any action on event-queue events for a vhost-user vsock device.)

(But I don't have much expertise in the vhost protocol, so I might be wrong.)

@stefano-garzarella (Member) commented Jul 27, 2023

> Out of curiosity, doesn't QEMU check whether the number of queues matches?

Yep, but QEMU requires only 2 queues, since it handles the event queue itself. It is exactly the same code that handles the in-kernel vhost-vsock device.

Is crosvm expecting 3 virtqueues from the in-kernel vhost-vsock device? I suppose not, because we don't handle it.

> I think which queues are left unimplemented may vary per vhost-user device implementation or protocol, so it's hard for crosvm to check for these exceptions (for example, the rule would be that omitting the 3rd queue is okay for vsock but not okay for another device?). So, if there is no regression in QEMU, I prefer increasing the queue count. (And judging from the Windows implementation, it looks like crosvm doesn't expect any action on event-queue events for a vhost-user vsock device.)
>
> (But I don't have much expertise in the vhost protocol, so I might be wrong.)

I think it depends on whether crosvm expects the vhost-user device to handle that queue or not. If that is the case, we also need to do something in handle_event().

In the future we may want to implement all the queues in the device, but still, we can't say we handle 3 queues and then do nothing in handle_event().

@stefano-garzarella (Member)

However, since for now the event queue is only used during live migration and this device doesn't support that, we might as well merge this PR, but we have to at least add a warning message saying that we don't support it.

(I still don't think it makes sense to claim the event queue is supported if we don't do anything there, though if there's really no way to fix crosvm, we can do it here.)

@epilys (Member) commented Jul 27, 2023

I agree that this needs to be merged even if the device has no migration support, since the spec requires three queues [1].

Footnotes

[1] 5.10.2 Virtqueues
    0 rx
    1 tx
    2 event

@stefano-garzarella (Member)

> I agree that this needs to be merged even if the device has no migration support, since the spec requires three queues [1].
>
> Footnotes
>
> [1] 5.10.2 Virtqueues
>     0 rx
>     1 tx
>     2 event

That is the virtio spec; vhost-user can support only a subset of the virtqueues.

Just quoting Stefan:

"vhost (in general and that includes vhost-user) was not designed to be
a full VIRTIO device, just an interface for offloading specific
virtqueues. Now that vhost-user is being extended to support full
VIRTIO devices ("standalone"),..."

https://lore.kernel.org/qemu-devel/CAJSP0QXU7k3+J=zA=uuSfXjM=tz6wh13pabumbSEdvXFmJ6xMQ@mail.gmail.com/

So in view of supporting standalone it might make sense, but we have to do something in handle_event(), even if it's just printing a message to say we don't support it.

@epilys (Member) commented Jul 27, 2023

I have some questions:

> That is the virtio spec; vhost-user can support only a subset of the virtqueues.

In the QEMU spec, it says: "Many devices have a fixed number of virtqueues." I cannot find where it is stated that a subset is allowed; is it in another document?

https://qemu.readthedocs.io/en/latest/interop/vhost-user.html#multiple-queue-support

> So in view of supporting standalone it might make sense, but we have to do something in handle_event(), even if it's just printing a message to say we don't support it.

We should just add an extra dummy queue. Vsock events are only sent by the device (in this case, vhost-user-vsock) to the guest, and since we don't support migration we won't ever use this queue. Right?

Another question, since you mentioned the vhost-vsock kernel driver: isn't that for the virtio transport, not the device? (Correct me if I have mixed things up.)

device: net/vmw_vsock/virtio_transport.c

https://github.com/torvalds/linux/blob/0a8db05b571ad5b8d5c8774a004c0424260a90bd/net/vmw_vsock/virtio_transport.c#L346

transport: https://github.com/torvalds/linux/blob/master/drivers/vhost/vsock.c

@stefano-garzarella (Member) commented Jul 27, 2023

> I have some questions:
>
> > That is the virtio spec; vhost-user can support only a subset of the virtqueues.
>
> In the QEMU spec, it says: "Many devices have a fixed number of virtqueues." I cannot find where it is stated that a subset is allowed; is it in another document?

The vhost-user specs are unfortunately not great :-(
Perhaps we should clarify this aspect, especially when we add support for "standalone."
One of the differences between a "standalone" and a non-"standalone" vhost-user device is exactly this: a "standalone" device has to implement all the virtqueues specified in the spec, because the VMM offloads all device emulation to the external process (like vDPA). That is not true for non-standalone devices.

> https://qemu.readthedocs.io/en/latest/interop/vhost-user.html#multiple-queue-support
>
> > So in view of supporting standalone it might make sense, but we have to do something in handle_event(), even if it's just printing a message to say we don't support it.
>
> We should just add an extra dummy queue. Vsock events are only sent by the device (in this case, vhost-user-vsock) to the guest, and since we don't support migration we won't ever use this queue. Right?

Right.

But in theory this is already done by the vhost crate, so this PR is already almost good. IMO, though, we should just add the warning in handle_event().

> Another question, since you mentioned the vhost-vsock kernel driver: isn't that for the virtio transport, not the device? (Correct me if I have mixed things up.)

Yep, there is a bit of an overlap of concepts here.

By "transport" in Linux's AF_VSOCK, we mean the mechanism used under the hood to exchange packets (from AF_VSOCK's point of view). The supported transports usually depend on the hypervisor used; for example, we have virtio and vhost (KVM), hv_vsock (Hyper-V), and VMCI (VMware).

So by virtio_transport we mean the virtio transport for vsock, which is ultimately the virtio-vsock driver. Here "transport" refers to vsock, not virtio.

> device: net/vmw_vsock/virtio_transport.c

Nope, this is the virtio-vsock driver (running in the guest).

> https://github.com/torvalds/linux/blob/0a8db05b571ad5b8d5c8774a004c0424260a90bd/net/vmw_vsock/virtio_transport.c#L346

> transport: https://github.com/torvalds/linux/blob/master/drivers/vhost/vsock.c

And this is the virtio-vsock device (vhost, running in the host).

Both are vsock transports: virtio_transport and vhost_transport.

@epilys (Member) commented Jul 27, 2023

> But in theory this is already done by the vhost crate, so this PR is already almost good. IMO, though, we should just add the warning in handle_event().

Where is that?

@stefano-garzarella (Member)

> But in theory this is already done by the vhost crate, so this PR is already almost good. IMO, though, we should just add the warning in handle_event().
>
> Where is that?

I meant VhostUserBackend; it will handle the queues returned by num_queues().

Maybe I didn't get the question...
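To make the mechanism concrete, a minimal sketch of how the backend reports its queue count, assuming the vhost-user-backend crate's VhostUserBackend trait (the method set and generics vary between versions, and the remaining required methods are elided):

```rust
impl VhostUserBackend<VringRwLock, ()> for VhostUserVsockBackend {
    fn num_queues(&self) -> usize {
        // The daemon sets up one vring per queue reported here, so bumping
        // NUM_QUEUES is enough to expose the extra (dummy) event queue.
        NUM_QUEUES
    }

    // features(), protocol_features(), handle_event(), ... elided.
}
```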

@epilys (Member) commented Jul 27, 2023

@stefano-garzarella the question was in response to "add an extra dummy queue" -> "this is already done by the vhost crate"; I thought you meant before this PR. Now it's all clear!

@ikicha apologies for the notification spam!

@stefano-garzarella (Member) left a comment


I'd reorganize the commit order a bit, in this way (for bisectability and to avoid changing code across subsequent commits in the same PR):

  • patch 1: add a comment on BACKEND_EVENT and use NUM_QUEUES + 1 and BACKEND_EVENT + 1 (no functional changes; see the sketch after this comment)
  • patch 2: add a warn!() for EVT_QUEUE_EVENT
  • patch 3: increase NUM_QUEUES to 3

Maybe patches 2 and 3 could be merged into a single patch. Anyway, I don't have a strong opinion on that, so it is up to you ;-)
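A hedged sketch of what patch 1 amounts to (constant names are from this crate; the pre-change literals are assumed):

```rust
// Before patch 3, the device still reports two queues (event ids 0 and 1)...
pub const NUM_QUEUES: usize = 2;

// ...but BACKEND_EVENT is now derived rather than hardcoded: it is the first
// event id after the virtqueues, so bumping NUM_QUEUES later renumbers it
// automatically.
pub const BACKEND_EVENT: u16 = NUM_QUEUES as u16;

// The epoll handler correspondingly needs one slot per queue plus one for
// the backend event: NUM_QUEUES + 1 slots, i.e. BACKEND_EVENT + 1.
```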

@ikicha (Contributor, Author) commented Jul 27, 2023

I reorganized the commit order. Thanks for the review (and I learned a bit about vhost from your conversation :)).

@stefano-garzarella (Member) left a comment


Please also add the vsock: prefix to the first patch.

@ikicha force-pushed the crosvm branch 2 times, most recently from f20aeb1 to dbbe239, on July 27, 2023 at 15:51
@ikicha (Contributor, Author) commented Jul 28, 2023

@vireshk, can you review this PR? It looks like it requires reviews from two or more owners.

@vireshk (Collaborator) left a comment


Please add a commit log to all the commits.

ikicha added 3 commits on July 28, 2023 at 13:22:
BACKEND_EVENT depends on NUM_QUEUES, because it is the next value after the queues, so set it based on NUM_QUEUES.

Signed-off-by: Jeongik Cha <jeongik@google.com>

EVT_QUEUE_EVENT is an unexpected message for the current implementation, so add a warning for it.

Signed-off-by: Jeongik Cha <jeongik@google.com>

In the virtio standard, vsock uses 3 virtqueues. crosvm expects 3 virtqueues from a vhost-user-vsock implementation, but this vhost-user-vsock device sets up only 2 because the event virtqueue isn't handled, and that causes a crash in crosvm. To avoid the crash, increase NUM_QUEUES to 3.

Signed-off-by: Jeongik Cha <jeongik@google.com>
@ikicha (Contributor, Author) commented Jul 28, 2023

> Please add a commit log to all the commits.

Did you mean the first commit, vsock: Set BACKEND_EVENT based on NUM_QUEUES?

I added a commit message for it.

@vireshk enabled auto-merge (rebase) on July 28, 2023 at 04:26
@vireshk merged commit 637969d into rust-vmm:main on Jul 28, 2023
```diff
@@ -295,7 +297,9 @@ impl VhostUserBackend<VringRwLock, ()> for VhostUserVsockBackend {
             TX_QUEUE_EVENT => {
                 thread.process_tx(vring_tx, evt_idx)?;
             }
-            EVT_QUEUE_EVENT => {}
+            EVT_QUEUE_EVENT => {
+                warn!("Received an unexpected EVT_QUEUE_EVENT");
```
A reviewer commented on this change:

It is actually the device that sends events to the guest. When you get a virtqueue request from the guest, it isn't sending an event; it is just providing a buffer so that the host can send the event in the response whenever it feels the need. Keeping this as a no-op is probably more appropriate.

A member replied:

In this device we do not support sending events to the guest, so we expect the VMM to do so (e.g. QEMU). That's why we added the warning: it's the VMM that should intercept these events and handle the free buffers posted by the driver.

The reviewer replied:

I see. I'd guess you really want to print the message when/if the 3rd queue is supplied at all, not necessarily when you get requests on it, but this is the easiest approximation.

The member replied:

The third queue has to be provided to the guest in any case, but QEMU, for example, handles it itself, so there is no expectation of receiving this event here. Is this not true in crosvm?

@fkm3 commented Mar 22, 2024

FYI: If you implement VHOST_USER_PROTOCOL_F_MQ, then you could tell the (crosvm) frontend that you only support 2 queues and it should just work.
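A minimal sketch of fkm3's suggestion, assuming the vhost crate's VhostUserProtocolFeatures and the vhost-user-backend trait methods (shown outside their impl block for brevity; not code from this PR):

```rust
use vhost::vhost_user::message::VhostUserProtocolFeatures;

// Advertise F_MQ so the frontend queries VHOST_USER_GET_QUEUE_NUM...
fn protocol_features(&self) -> VhostUserProtocolFeatures {
    // ...combined with whatever features the device already advertises.
    VhostUserProtocolFeatures::MQ
}

// ...and report only the queues this backend actually services.
fn num_queues(&self) -> usize {
    2 // rx + tx; the VMM would keep handling the event queue itself
}
```

(As the follow-up comments explain, this turns out not to help crosvm, which sets up exactly VHOST_USER_GET_QUEUE_NUM queues for the guest.)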

@stefano-garzarella (Member)

> FYI: If you implement VHOST_USER_PROTOCOL_F_MQ, then you could tell the (crosvm) frontend that you only support 2 queues and it should just work.

Here we have a fixed number of queues, so it's not really needed, but if that simplifies crosvm, I'm fine with enabling F_MQ.

@ikicha @fkm3 At this point, what does crosvm expect the vhost-user device to return if we enable F_MQ? The number of queues that the device is able to handle (2 in this case), or the total number of queues (3)?

@fkm3 commented Mar 25, 2024

> At this point, what does crosvm expect the vhost-user device to return if we enable F_MQ? The number of queues that the device is able to handle (2 in this case), or the total number of queues (3)?

crosvm assumes everything is a "standalone" device ATM, so those are the same in its case. It will only set up VHOST_USER_GET_QUEUE_NUM queues for the guest. So, actually, this won't work. I was imagining that the vsock driver would be OK if the 3rd queue wasn't supplied, but it seems to require it. Sorry for the noise.

@stefano-garzarella (Member)

> At this point, what does crosvm expect the vhost-user device to return if we enable F_MQ? The number of queues that the device is able to handle (2 in this case), or the total number of queues (3)?
>
> crosvm assumes everything is a "standalone" device ATM, so those are the same in its case.

What do you mean by a "standalone" device? Do you mean that it expects the vhost-user device to handle all the queues?

> It will only set up VHOST_USER_GET_QUEUE_NUM queues for the guest. So, actually, this won't work.

What would not work? Returning 2 instead of 3?

> I was imagining that the vsock driver would be OK if the 3rd queue wasn't supplied, but it seems to require it.

Nope, the VMM must provide 3 queues, as required by the virtio spec. For example, QEMU always allocates 3 queues but exposes only the first two to the vhost-user device.

> Sorry for the noise.

Don't worry, this is interesting to understand ;-)

@fkm3 commented Mar 25, 2024

Yeah, crosvm expects the vhost-user backend to handle all the queues; the frontend is very thin. Theoretically crosvm could do the same thing as QEMU here; there is just no precedent for it in the codebase yet, as far as I know.

@stefano-garzarella (Member)

@fkm3 got it. Do you want to send a patch to remove that warning?
We should explain that crosvm requires the device to handle all the queues, etc.
