[WIP] Add VirtIO vhost-blk for Gen3 #168

Draft
wants to merge 16 commits into base: v5.10.41/rcar-5.1.4.1-xt0.2

Conversation

LKomaryanskiy
Collaborator

Rebased patch series that implements the VirtIO vhost-blk functionality:
https://lore.kernel.org/kvm/20221013151839.689700-1-andrey.zhadchenko@virtuozzo.com/

The PR also contains additional cherry-picked commits that should not change functionality but make the rebase much smoother.

This PR is still a WIP because, when the VirtIO vhost-blk feature is used on Gen3, DomA freezes during storage operations.

mikechristie and others added 16 commits September 6, 2024 12:23
vhost_work_flush doesn't do anything with the work arg. This patch drops
it and then renames vhost_work_flush to vhost_work_dev_flush to reflect
that the function flushes all the works in the dev and not just a
specific queue or work item.
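
The interface change in sketch form (the prototypes here mirror the description above and may differ cosmetically from the tree):

        /* Before: the work argument was accepted but never used. */
        void vhost_work_flush(struct vhost_dev *dev, struct vhost_work *work);

        /* After: the argument is dropped and the name says what is flushed. */
        void vhost_work_dev_flush(struct vhost_dev *dev);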

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210525174733.6212-2-michael.christie@oracle.com
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_poll_flush() is a simple wrapper around vhost_work_dev_flush().
It gives the wrong impression that we are doing some work over the
vhost_poll, while in fact it flushes vhost_poll->dev.
It only complicates understanding of the code and leads to mistakes
like flushing the same vhost_dev several times in a row.

Just remove vhost_poll_flush() and call vhost_work_dev_flush() directly.
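
In sketch form, the removed wrapper and the direct call that replaces it (illustrative):

        /* Removed: a wrapper that only forwarded to the device-wide flush. */
        void vhost_poll_flush(struct vhost_poll *poll)
        {
                vhost_work_dev_flush(poll->dev);
        }

        /* Call sites now say what they mean: */
        vhost_work_dev_flush(poll->dev);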

Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
[merge vhost_poll_flush removal from Stefano Garzarella]
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220517180850.198915-2-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_net_flush_vq() calls vhost_work_dev_flush() twice, passing the
vhost_dev pointer obtained via 'n->poll[index].dev' and
'n->vqs[index].vq.poll.dev'. This is actually the same pointer,
initialized in vhost_net_open()/vhost_dev_init()/vhost_poll_init().

Remove vhost_net_flush_vq() and call vhost_work_dev_flush() directly.
Do the flushes only once instead of several flush calls in a row,
which seems rather useless.
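
The aliasing is easy to see in vhost_poll_init(), which stores the
vhost_dev pointer handed down from vhost_dev_init(), so every poll in a
device points at the same dev (illustrative excerpt):

        void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
                             __poll_t mask, struct vhost_dev *dev)
        {
                /* ... */
                poll->dev = dev; /* n->poll[i].dev == n->vqs[i].vq.poll.dev */
        }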

Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
[drop vhost_dev forward declaration in vhost.h]
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220517180850.198915-4-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When vhost_work_dev_flush returns, all work queued at that time will have
completed. There is then no need to flush after every vhost_poll_stop
call, and we can move the flush call to after the loop that stops the
pollers.
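
Illustrative shape of the change (not the exact diff):

        /* Before: one device-wide flush per stopped poller. */
        for (i = 0; i < nvqs; i++) {
                vhost_poll_stop(&n->vqs[i].poll);
                vhost_work_dev_flush(dev);
        }

        /* After: stop all pollers first; a single flush then suffices,
         * because the flush waits for everything queued up to that point. */
        for (i = 0; i < nvqs; i++)
                vhost_poll_stop(&n->vqs[i].poll);
        vhost_work_dev_flush(dev);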

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220517180850.198915-3-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_vsock_flush() calls vhost_work_dev_flush(vsock->vqs[i].poll.dev)
before vhost_work_dev_flush(&vsock->dev). This seems pointless,
as vsock->vqs[i].poll.dev is the same as &vsock->dev, and several flushes
in a row don't do anything useful; one is enough.

Signed-off-by: Andrey Ryabinin <arbn@yandex-team.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220517180850.198915-6-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch renames vhost_work_dev_flush to just vhost_dev_flush to
reflect that it flushes everything on the device and that drivers
don't know/care that polls are based on vhost_works. Drivers just
flush the entire device; polls and works (for vhost-scsi management
TMFs, IO net virtqueues, etc.) are all flushed.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20220517180850.198915-9-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leonid Komarianskyi <leonid_komarianskyi@epam.com>
Although QEMU virtio is quite fast, there is still some room for
improvement. Disk latency can be reduced if we handle virtio-blk requests
in the host kernel instead of passing them to QEMU. The patch adds a
vhost-blk kernel module to do so.
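
For context, a hedged userspace sketch of how a VMM might wire the device
up. The /dev/vhost-blk node name and the VHOST_BLK_SET_BACKEND ioctl are
assumptions inferred from vhost_blk_set_backend() in this series;
VHOST_SET_OWNER and the vring ioctls are the standard vhost UAPI:

        #include <fcntl.h>
        #include <sys/ioctl.h>
        #include <linux/vhost.h>

        int main(void)
        {
                int vhost_fd = open("/dev/vhost-blk", O_RDWR); /* assumed node */
                int disk_fd = open("/dev/vdX", O_RDWR | O_DIRECT); /* cache=none */

                if (vhost_fd < 0 || disk_fd < 0)
                        return 1;

                ioctl(vhost_fd, VHOST_SET_OWNER); /* standard vhost ioctl */
                /* ... VHOST_SET_VRING_NUM/ADDR/KICK/CALL per queue,
                 * as for vhost-net ... */

                /* assumed ioctl, mirroring VHOST_NET_SET_BACKEND */
                struct vhost_vring_file backend = { .index = 0, .fd = disk_fd };
                ioctl(vhost_fd, VHOST_BLK_SET_BACKEND, &backend);
                return 0;
        }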

Some test setups:
fio --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=128
QEMU drive options: cache=none
filesystem: xfs

SSD:
               | randread, IOPS | randwrite, IOPS |
Host           |      95.8k     |      85.3k      |
QEMU virtio    |      57.5k     |      79.4k      |
QEMU vhost-blk |      95.6k     |      84.3k      |

RAMDISK (vq = vcpu = numjobs):
                 | randread, IOPS | randwrite, IOPS |
virtio, 1vcpu    |      133k      |      133k       |
virtio, 2vcpu    |      305k      |      306k       |
virtio, 4vcpu    |      310k      |      298k       |
vhost-blk, 1vcpu |      110k      |      113k       |
vhost-blk, 2vcpu |      247k      |      252k       |
vhost-blk, 4vcpu |      558k      |      556k       |

v2:
 - removed unused VHOST_BLK_VQ
 - reworked bio handling a bit: now add all pages from a single iov into
a bio until it is full, instead of allocating one bio per page (see the
sketch after this list)
 - changed the sector increment calculation
 - check move_iovec() in vhost_blk_req_handle()
 - removed the snprintf check and better check the return of copy_to_iter
for VIRTIO_BLK_ID_BYTES requests
 - discard the vq request if vhost_blk_req_handle() returned a negative code
 - forbid changing a nonzero backend in vhost_blk_set_backend(). First of
all, QEMU sets the backend only once. Also, if we want to change the
backend while requests are already running, we need to be much more
careful in vhost_blk_handle_guest_kick(), as it does not take any
references. If userspace really wants to change the backend, it can
always reset the device.
 - removed EXPERIMENTAL from Kconfig
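
A hedged sketch of the reworked bio filling from the second item above
(variable names are illustrative, not the patch's exact code; based on
the v5.10 block API, where BIO_MAX_PAGES bounds a bio's page count):

        while (nr_pages) {
                if (!bio) {
                        /* start a new bio at the current sector */
                        bio = bio_alloc(GFP_KERNEL,
                                        min_t(unsigned int, nr_pages,
                                              BIO_MAX_PAGES));
                        bio->bi_iter.bi_sector = sector;
                        bio_set_dev(bio, bdev);
                }
                if (bio_add_page(bio, pages[i], len, off) < len) {
                        /* bio is full: submit it, retry the page in a new one */
                        submit_bio(bio);
                        bio = NULL;
                        continue;
                }
                sector += len >> 9; /* advance by the bytes just added */
                i++;
                nr_pages--;
        }
        if (bio)
                submit_bio(bio);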

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
We want to support several vhost workers. The first step is to
rework vhost to use an array of workers rather than a single pointer.
Update the creation and cleanup routines.
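
Illustrative shape of the data-structure change (field and constant names
are assumptions, not the series' exact identifiers):

        struct vhost_dev {
                /* ... */
                struct vhost_worker *workers[VHOST_MAX_WORKERS]; /* was one */
                int nworkers; /* live entries in workers[] */
                /* ... */
        };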

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
Make vhost_dev_flush support several workers and flush
them simultaneously.
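
A hedged sketch of the simultaneous flush: vhost_flush_struct,
vhost_work_init(), and vhost_flush_work() are existing vhost internals;
vhost_worker_queue() is an assumed per-worker queueing helper:

        static void vhost_dev_flush(struct vhost_dev *dev)
        {
                struct vhost_flush_struct flush[VHOST_MAX_WORKERS];
                int i;

                /* queue a flush work on every worker first ... */
                for (i = 0; i < dev->nworkers; i++) {
                        init_completion(&flush[i].wait_event);
                        vhost_work_init(&flush[i].work, vhost_flush_work);
                        vhost_worker_queue(dev->workers[i], &flush[i].work);
                }

                /* ... then wait, so the flushes run in parallel, not serially */
                for (i = 0; i < dev->nworkers; i++)
                        wait_for_completion(&flush[i].wait_event);
        }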

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
Rework vhost_attach_cgroups to manipulate a specified worker.
Implement vhost_worker_flush, as we need to flush a specific worker.

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
Add a function to create a vhost worker and add it to the device.
Rework vhost_dev_set_owner.

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
Finally, add an ioctl to allow userspace to create additional workers.
For now, only increasing the number of workers is allowed.
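
A hedged userspace sketch; the ioctl name and argument type are
assumptions based on the commit text, not a documented UAPI:

        unsigned int nworkers = 2; /* total number of workers requested */

        /* only increases are allowed; a value below the current count fails */
        if (ioctl(vhost_fd, VHOST_SET_NWORKERS, &nworkers) < 0)
                perror("VHOST_SET_NWORKERS");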

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
Add a worker pointer to every virtqueue. Add a routine to assign
workers to virtqueues, and call it after any worker creation.
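
Illustrative round-robin assignment (the helper name is an assumption;
the commit only states that workers are reassigned after each worker
creation):

        static void vhost_vq_assign_workers(struct vhost_dev *dev)
        {
                int i;

                /* spread virtqueues across the available workers */
                for (i = 0; i < dev->nvqs; i++)
                        dev->vqs[i]->worker = dev->workers[i % dev->nworkers];
        }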

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
Add routines to queue works on the workers assigned to virtqueues.

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
Allow vhost polls to be associated with vqs so we can queue them
on the assigned workers.
If a poll is not associated with a specific vq, queue it on the first
virtqueue.

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
Update vhost_blk to queue works on virtqueue workers. Together
with the previous changes, this allows us to split virtio-blk requests
across several threads.

                        | randread, IOPS | randwrite, IOPS |
8vcpu, 1 kernel worker  |      576k      |      575k       |
8vcpu, 2 kernel workers |      803k      |      779k       |

Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com>
LKomaryanskiy marked this pull request as draft September 27, 2024 12:22