vsock: Increase NUM_QUEUES to 3 #409
What problems does it cause in crosvm?
This device behaves like the vhost-vsock in-kernel device and only handles TX and RX queues. The event queue is handled by the VMM (as with the in-kernel device).
Indeed, in handle_event() the EVT_QUEUE_EVENT block is empty.
I think we should make this change only when we implement something there for VMMs that don't handle the event queue.
crosvm checks whether the number of queues matches between the device and the VMM, and fails initialization if it does not. I was curious whether QEMU still works if we set the number of queues to 3.
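To make the failure mode concrete, here is a minimal sketch of the kind of check described above. It is not crosvm's code: `check_queue_count` and `QueueCountMismatch` are hypothetical names, and the only assumption carried over from the thread is that a frontend refuses to initialize when the backend advertises a different number of queues than it expects.

```rust
/// Hypothetical error type; crosvm's real error handling is not shown in this thread.
#[derive(Debug, PartialEq)]
struct QueueCountMismatch {
    expected: usize,
    advertised: usize,
}

/// Hypothetical frontend-side check: fail initialization when the backend
/// advertises a different number of virtqueues than the frontend expects.
fn check_queue_count(expected: usize, advertised: usize) -> Result<(), QueueCountMismatch> {
    if expected == advertised {
        Ok(())
    } else {
        Err(QueueCountMismatch { expected, advertised })
    }
}
```

Under this assumption, a frontend that expects 3 vsock queues rejects a backend that advertises 2, which matches the initialization error reported here.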
Just tested, and it works, but should we do something in
What is crosvm expecting with that queue?
Does crosvm also support the in-kernel vhost-vsock device? How does it deal with it, since that device does not handle the event queue?
For vhost-vsock in crosvm: https://github.com/google/crosvm/blob/e70a8774a1b56ab7b7f6667a7783694cf72ced95/devices/src/virtio/vhost/vsock.rs#L191C17-L191C28
It looks like event_queue only takes care of
For the vhost-user-vsock device for Windows from crosvm: https://github.com/google/crosvm/blob/e70a8774a1b56ab7b7f6667a7783694cf72ced95/devices/src/virtio/vsock/sys/windows/vsock.rs#L1304
In this implementation, it doesn't act on events from event_queue; it just logs an error.
Sorry, I don't know the crosvm code. It looks like it is handling that queue, right? Yep, QEMU also uses it just to send
So, what should we do here? Maybe it is better to fix crosvm so that it does not expect a vhost-user device to implement all queues, since that is legitimate behavior.
Out of curiosity, doesn't QEMU check whether the number of queues matches? I think leaving some queues unimplemented may vary per vhost-user device implementation or protocol, so it's hard to check for these exceptions in crosvm (for example, would the rule be that omitting the 3rd queue is okay for vsock but not for another device?). So, if there is no regression in QEMU, I prefer increasing the number of queues. (And from the Windows implementation, it looks like crosvm doesn't expect the vhost-user vsock device to act on event queue events.) (But I don't have much expertise in the vhost protocol, so I might be wrong.)
Yep, but QEMU requires only 2 queues, since it handles the event queue. Is crosvm expecting 3 vqs from the in-kernel vhost-vsock device?
I think it depends on whether crosvm expects the vhost-user device to handle that queue or not. In the future we may want to implement all the queues in the device, but still, we can't say we handle 3 queues and then do nothing in handle_event().
However, since for now the event queue is only used during live migration and this device doesn't support it, we might as well merge this PR, but we have to add at least a warning message to say that we don't support it. (I still don't think it makes sense to support the event queue if we don't do anything here, though if there's really no way to fix crosvm, we can do it here.)
this is a virtio spec, vhost-user can support only a subset of VQs. Just quoting Stefan:
So in view of supporting standalone, it might make sense, but we have to do something in handle_event(), even if we just print a message to say we don't support it.
I have some questions:
In the QEMU spec, it says: https://qemu.readthedocs.io/en/latest/interop/vhost-user.html#multiple-queue-support
We should only just add an extra dummy queue. Vsock events are only sent by the device (in this case,
Another question, since you mentioned the vhost-vsock kernel driver: isn't that for the virtio transport, not the device? (Correct me if I have mixed things up.)
device: net/vmw_vsock/virtio_transport.c
transport: https://github.com/torvalds/linux/blob/master/drivers/vhost/vsock.c
The vhost-user specs are not great unfortunately :-(
Right. But in theory this is already done by the vhost crate, so this PR is already almost good. IMO though, we should just add the warning in handle_event().
Yep, there is a bit of an overlap of concepts here. By transport in Linux's AF_VSOCK, we mean the system that is used under the hood to exchange packets (from AF_VSOCK's point of view). So by virtio_transport, we mean virtio transport for vsock. Which is ultimately the virtio-vsock driver. So here transport is used regarding vsock, not virtio.
Nope, this is the virtio-vsock driver (running in the guest).
And this is the virtio-vsock device (vhost, running in the host). Both are vsock transports: virtio_transport and vhost_transport.
Where is that?
I meant VhostUserBackend, it will handle the queues returned by
Maybe I didn't get the question...
@stefano-garzarella the question was in response to "add an extra dummy queue" -> "this is already done by the vhost crate". I thought you meant before this PR. Now it's all clear! @ikicha apologies for the notification spam!
I'd reorganize the commit order a bit, in this way (for bisectability and to avoid changing code in subsequent commits in the same PR):
- patch 1: add a comment on BACKEND_EVENT and use NUM_QUEUES + 1 and BACKEND_EVENT + 1 (no functional changes)
- patch 2: add warn!() for EVT_QUEUE_EVENT
- patch 3: increase NUM_QUEUES to 3
Maybe patches 2 and 3 could be merged into a single patch. Anyway, I don't have a strong opinion on that, so it is up to you ;-)
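As a rough illustration of what patch 1 and patch 3 amount to, the constants could be derived from one another as sketched below. The queue indices (0, 1, 2) follow the virtio vsock queue order; the concrete values, and the name `NEXT_EVENT` for the constant defined as BACKEND_EVENT + 1, are assumptions made for this sketch and are not taken from the PR.

```rust
// Number of virtqueues after patch 3 (rx, tx, event).
const NUM_QUEUES: usize = 3;

// Per-queue event indices, assumed to follow the virtio vsock queue order.
const RX_QUEUE_EVENT: u16 = 0;
const TX_QUEUE_EVENT: u16 = 1;
const EVT_QUEUE_EVENT: u16 = 2;

// Patch 1: derive the backend event from NUM_QUEUES instead of hard-coding it,
// so that it moves automatically when NUM_QUEUES changes.
const BACKEND_EVENT: u16 = (NUM_QUEUES + 1) as u16;

// Hypothetical name for the constant the review describes as BACKEND_EVENT + 1.
const NEXT_EVENT: u16 = BACKEND_EVENT + 1;

// Compile-time check: the derived event must never collide with a queue index.
const _: () = assert!(BACKEND_EVENT > EVT_QUEUE_EVENT);
```

With this arrangement, changing NUM_QUEUES from 2 to 3 in patch 3 is a one-line change, which is the point of doing the no-functional-change patch first.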
I reorganized the commit order, and thanks for the review. (And I learned a little about vhost from your conversation :))
Please also add the vsock: prefix to the first patch.
f20aeb1
to
dbbe239
Compare
@vireshk can you review this PR? It looks like it requires two or more owner reviews.
Please add a commit log to all the commits.
BACKEND_EVENT value depends on NUM_QUEUES, because it is the next value of NUM_QUEUES, so set it based on NUM_QUEUES Signed-off-by: Jeongik Cha <jeongik@google.com>
EVT_QUEUE_EVENT is an unexpected message from current impl, so add warning for that Signed-off-by: Jeongik Cha <jeongik@google.com>
In virtio standard, vsock uses 3 vqs. crosvm expects 3 vqs from vhost-user-vsock impl, but this vhost-user-vsock device sets up only 2 vqs because event vq isn't handled. And it causes crash in crosvm. To avoid crash in crosvm, I increase NUM_QUEUES to 3 Signed-off-by: Jeongik Cha <jeongik@google.com>
Did you mean the first commit, "vsock: Set BACKEND_EVENT based on NUM_QUEUES"? I added a commit message for that.
@@ -295,7 +297,9 @@ impl VhostUserBackend<VringRwLock, ()> for VhostUserVsockBackend {
             TX_QUEUE_EVENT => {
                 thread.process_tx(vring_tx, evt_idx)?;
             }
-            EVT_QUEUE_EVENT => {}
+            EVT_QUEUE_EVENT => {
+                warn!("Received an unexpected EVT_QUEUE_EVENT");
It is actually the device that sends events to the guest. When you get a virtqueue request from the guest, it isn't sending an event, it is just providing a buffer so that the host can send the event in the response whenever it feels the need. Keeping this as a no-op is probably more appropriate.
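The direction of the event queue can be sketched as follows. In the virtio spec, a vsock event is a single little-endian 32-bit id, and VIRTIO_VSOCK_EVENT_TRANSPORT_RESET is 0. The helper `fill_event_buffer` is a hypothetical name, and the guest-posted buffer is modeled as a plain byte slice rather than a real descriptor chain, so this is only an illustration of "the guest provides a buffer, the device writes the event into it".

```rust
/// Event id defined by the virtio vsock specification.
const VIRTIO_VSOCK_EVENT_TRANSPORT_RESET: u32 = 0;

/// Hypothetical helper: write an event id into a buffer that the guest driver
/// previously posted on the event queue. Returns the number of bytes written,
/// or None if the posted buffer is too small to hold the event.
fn fill_event_buffer(guest_buf: &mut [u8], event_id: u32) -> Option<usize> {
    let bytes = event_id.to_le_bytes(); // virtio fields are little-endian
    let dst = guest_buf.get_mut(..bytes.len())?;
    dst.copy_from_slice(&bytes);
    Some(bytes.len())
}
```

In this model, a notification on the event queue only means that new empty buffers are available; nothing has to be read from them, which is why a no-op on that notification is a reasonable default.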
In this device we do not support sending events to the guest, so we expect the VMM to do so (e.g. QEMU). That's why we put the warning, because it's the VMM that should intercept these events and handle the free buffers posted by the driver.
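A self-contained sketch of the dispatch being discussed, under stated assumptions: the event indices are assumed to follow the virtio vsock queue order, `dispatch` and `Outcome` are hypothetical names, and `eprintln!` stands in for the `warn!()` macro so the example needs no external crate. The real handle_event() processes vrings; this only shows the control flow.

```rust
// Assumed event indices, following the virtio vsock queue order.
const RX_QUEUE_EVENT: u16 = 0;
const TX_QUEUE_EVENT: u16 = 1;
const EVT_QUEUE_EVENT: u16 = 2;

/// Hypothetical summary of what the dispatcher did with an event.
#[derive(Debug, PartialEq)]
enum Outcome {
    ProcessedRx,
    ProcessedTx,
    IgnoredEventQueue,
}

/// Hypothetical stand-in for handle_event(): rx and tx are processed, while the
/// event queue is acknowledged with a warning because this device does not
/// send events to the guest.
fn dispatch(device_event: u16) -> Result<Outcome, String> {
    match device_event {
        RX_QUEUE_EVENT => Ok(Outcome::ProcessedRx),
        TX_QUEUE_EVENT => Ok(Outcome::ProcessedTx),
        EVT_QUEUE_EVENT => {
            // Stand-in for warn!(): the VMM was expected to handle this queue.
            eprintln!("Received an unexpected EVT_QUEUE_EVENT");
            Ok(Outcome::IgnoredEventQueue)
        }
        other => Err(format!("unknown device event {other}")),
    }
}
```

The warning arm does not fail the request, so a frontend that forwards event queue notifications keeps working; it only gets a log line saying the backend did nothing with them.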
I see. I'd guess you really want to print the message when/if the 3rd queue is supplied at all, not necessarily when you get requests from it, but this is the easiest approximation.
The third queue has to be provided to the guest in any case, but for example in QEMU, it handles it itself. So there is no expectation of receiving this event here. In crosvm this is not true?
FYI: If you implement
Here we have a fixed number of queues so it's not really needed, but if that simplifies crosvm, it's fine with me to enable
@ikicha @fkm3 At this point, what does crosvm expect the vhost-device to return if we enable
crosvm assumes everything is a "standalone" device ATM, so those are the same in its case. It will only set up
What do you mean with "standalone" device?
What would not work? Returning 2 instead of 3?
Nope, the VMM must provide 3 queues as required by the virtio spec.
Don't worry, this is interesting to understand ;-)
Yeah, crosvm expects the vhost-user backend to handle all the queues; the frontend is very thin. Theoretically crosvm could do the same thing as QEMU here, there is just no precedent in the codebase for it yet as far as I know.
@fkm3 got it. Do you want to send a patch to remove that warning?
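The two frontend models that came up in this thread can be summarized in a small sketch. The names `FrontendModel` and `queues_for_backend` are hypothetical; the numbers come from the discussion (the guest always sees 3 vsock queues, QEMU handles the event queue itself and hands 2 to the backend, crosvm hands over all 3).

```rust
/// Number of virtqueues a vsock device exposes to the guest per the virtio spec.
const VSOCK_VIRTQUEUES: usize = 3;

/// Hypothetical description of how a VMM splits queue handling with the backend.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FrontendModel {
    /// The VMM handles the event queue itself (the QEMU behavior described above).
    VmmHandlesEventQueue,
    /// The backend is expected to handle every queue (the crosvm behavior described above).
    BackendHandlesAllQueues,
}

/// How many queues the vhost-user backend is asked to handle under each model.
fn queues_for_backend(model: FrontendModel) -> usize {
    match model {
        FrontendModel::VmmHandlesEventQueue => VSOCK_VIRTQUEUES - 1,
        FrontendModel::BackendHandlesAllQueues => VSOCK_VIRTQUEUES,
    }
}
```

Advertising 3 queues from the backend satisfies the second model, and according to the test reported earlier in the thread it does not break QEMU either.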
In the virtio standard, vsock uses 3 vqs (https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-4380002), and crosvm and QEMU also use 3 queues, but this vhost-user-vsock device implementation assumes that there are only 2 vqs, which causes a problem with crosvm (at least).
Fixes #408