VirtSnd: Driver MUST NOT place device writable buffers in tx queue #3
patchew-importer pushed a commit that referenced this issue on Sep 11, 2023

virtio_load() as a whole should run in coroutine context because it reads from the migration stream and we don't want this to block. However, it calls virtio_set_features_nocheck() and devices don't expect their .set_features callback to run in a coroutine and therefore call functions that may not be called in coroutine context. To fix this, drop out of coroutine context for calling virtio_set_features_nocheck().

Without this fix, the following crash was reported:

#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007efc738c05d3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007efc73873d26 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007efc738477f3 in __GI_abort () at abort.c:79
#4 0x00007efc7384771b in __assert_fail_base (fmt=0x7efc739dbcb8 "", assertion=assertion@entry=0x560aebfbf5cf "!qemu_in_coroutine()", file=file@entry=0x560aebfcd2d4 "../block/graph-lock.c", line=line@entry=275, function=function@entry=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:92
#5 0x00007efc7386ccc6 in __assert_fail (assertion=0x560aebfbf5cf "!qemu_in_coroutine()", file=0x560aebfcd2d4 "../block/graph-lock.c", line=275, function=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:101
#6 0x0000560aebcd8dd6 in bdrv_register_buf ()
#7 0x0000560aeb97ed97 in ram_block_added.llvm ()
#8 0x0000560aebb8303f in ram_block_add.llvm ()
#9 0x0000560aebb834fa in qemu_ram_alloc_internal.llvm ()
#10 0x0000560aebb2ac98 in vfio_region_mmap ()
#11 0x0000560aebb3ea0f in vfio_bars_register ()
#12 0x0000560aebb3c628 in vfio_realize ()
#13 0x0000560aeb90f0c2 in pci_qdev_realize ()
#14 0x0000560aebc40305 in device_set_realized ()
#15 0x0000560aebc48e07 in property_set_bool.llvm ()
#16 0x0000560aebc46582 in object_property_set ()
#17 0x0000560aebc4cd58 in object_property_set_qobject ()
#18 0x0000560aebc46ba7 in object_property_set_bool ()
#19 0x0000560aeb98b3ca in qdev_device_add_from_qdict ()
#20 0x0000560aebb1fbaf in virtio_net_set_features ()
#21 0x0000560aebb46b51 in virtio_set_features_nocheck ()
#22 0x0000560aebb47107 in virtio_load ()
#23 0x0000560aeb9ae7ce in vmstate_load_state ()
#24 0x0000560aeb9d2ee9 in qemu_loadvm_state_main ()
#25 0x0000560aeb9d45e1 in qemu_loadvm_state ()
#26 0x0000560aeb9bc32c in process_incoming_migration_co.llvm ()
#27 0x0000560aebeace56 in coroutine_trampoline.llvm ()

Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-832
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230905145002.46391-3-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
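For illustration, a minimal sketch of the "drop out of coroutine context" pattern the message describes: if the caller is a coroutine, defer the non-coroutine-safe call to a bottom half on the main loop and yield until it has run. The struct name and the wrapper name below are invented for the example (they may or may not match the upstream helper); only qemu_in_coroutine(), aio_bh_schedule_oneshot(), qemu_get_aio_context(), aio_co_wake() and qemu_coroutine_yield() are existing QEMU APIs.

typedef struct {
    VirtIODevice *vdev;
    uint64_t features;
    int ret;
    Coroutine *co;
} SetFeaturesData;   /* illustrative name */

static void set_features_bh(void *opaque)
{
    SetFeaturesData *data = opaque;

    /* Runs in plain main-loop context, so .set_features callbacks that are
     * not coroutine-safe (e.g. taking the graph lock) are fine here. */
    data->ret = virtio_set_features_nocheck(data->vdev, data->features);
    aio_co_wake(data->co);                 /* resume the waiting coroutine */
}

static int virtio_set_features_nocheck_outside_co(VirtIODevice *vdev,
                                                  uint64_t features)
{
    if (!qemu_in_coroutine()) {
        return virtio_set_features_nocheck(vdev, features);
    }

    SetFeaturesData data = {
        .vdev = vdev,
        .features = features,
        .co = qemu_coroutine_self(),
    };
    aio_bh_schedule_oneshot(qemu_get_aio_context(), set_features_bh, &data);
    qemu_coroutine_yield();                /* woken up by set_features_bh() */
    return data.ret;
}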
patchew-importer pushed a commit that referenced this issue on Sep 13, 2023

Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555888630 in dpy_ui_info_supported (con=0x0) at ../ui/console.c:812
812         return con->hw_ops->ui_info != NULL;
(gdb) bt
#0 0x0000555555888630 in dpy_ui_info_supported (con=0x0) at ../ui/console.c:812
#1 0x00005555558a44b1 in protocol_client_msg (vs=0x5555578c76c0, data=0x5555581e93f0 <incomplete sequence \373>, len=24) at ../ui/vnc.c:2585
#2 0x00005555558a19ac in vnc_client_read (vs=0x5555578c76c0) at ../ui/vnc.c:1607
#3 0x00005555558a1ac2 in vnc_client_io (ioc=0x5555581eb0e0, condition=G_IO_IN, opaque=0x5555578c76c0) at ../ui/vnc.c:1635

Fixes: https://issues.redhat.com/browse/RHEL-2600
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Albert Esteve <aesteve@redhat.com>
patchew-importer pushed a commit that referenced this issue on Sep 27, 2023

Replace the return path retry logic with finishing and restarting the thread. This fixes a race when resuming the migration that leads to a segfault.

Currently when doing postcopy we consider that an IO error on the return path file could be due to a network intermittency. We then keep the thread alive but have it do cleanup of the 'from_dst_file' and wait on the 'postcopy_pause_rp' semaphore. When the user issues a migrate resume, a new return path is opened and the thread is allowed to continue.

There's a race condition in the above mechanism. It is possible for the new return path file to be setup *before* the cleanup code in the return path thread has had a chance to run, leading to the *new* file being closed and the pointer set to NULL. When the thread is released after the resume, it tries to dereference 'from_dst_file' and crashes:

Thread 7 "return path" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffd1dbf700 (LWP 9611)]
0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
154         return f->last_error;
(gdb) bt
#0 0x00005555560e4893 in qemu_file_get_error_obj (f=0x0, errp=0x0) at ../migration/qemu-file.c:154
#1 0x00005555560e4983 in qemu_file_get_error (f=0x0) at ../migration/qemu-file.c:206
#2 0x0000555555b9a1df in source_return_path_thread (opaque=0x555556e06000) at ../migration/migration.c:1876
#3 0x000055555602e14f in qemu_thread_start (args=0x55555782e780) at ../util/qemu-thread-posix.c:541
#4 0x00007ffff38d76ea in start_thread (arg=0x7fffd1dbf700) at pthread_create.c:477
#5 0x00007ffff35efa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Here's the race (important bit is open_return_path happening before migration_release_dst_files):

migration | qmp | return path
--------------------------+-----------------------------+---------------------------------
qmp_migrate_pause()
shutdown(ms->to_dst_file)
f->last_error = -EIO
migrate_detect_error()
postcopy_pause()
set_state(PAUSED)
wait(postcopy_pause_sem)
qmp_migrate(resume)
migrate_fd_connect()
resume = state == PAUSED
open_return_path <-- TOO SOON!
set_state(RECOVER)
post(postcopy_pause_sem)
(incoming closes to_src_file)
res = qemu_file_get_error(rp)
migration_release_dst_files()
ms->rp_state.from_dst_file = NULL
post(postcopy_pause_rp_sem)
postcopy_pause_return_path_thread()
wait(postcopy_pause_rp_sem)
rp = ms->rp_state.from_dst_file
goto retry
qemu_file_get_error(rp)
SIGSEGV
-------------------------------------------------------------------------------------------

We can keep the retry logic without having the thread alive and waiting. The only piece of data used by it is the 'from_dst_file' and it is only allowed to proceed after a migrate resume is issued and the semaphore released at migrate_fd_connect().

Move the retry logic to outside the thread by waiting for the thread to finish before pausing the migration.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230918172822.19052-8-farosas@suse.de>
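A simplified sketch of the approach described above: let the return path thread finish and join it before pausing, instead of parking it on a semaphore while the file pointer is being torn down. Field and function names are approximations of the migration code, not the literal patch.

static void migration_pause_return_path(MigrationState *ms)
{
    if (ms->rp_state.rp_thread_created) {
        /* Wake the thread if it is blocked on the (now broken) stream ... */
        qemu_file_shutdown(ms->rp_state.from_dst_file);
        /* ... and wait for it to exit before touching from_dst_file. */
        qemu_thread_join(&ms->rp_state.rp_thread);
        ms->rp_state.rp_thread_created = false;
    }

    /* Only now is it safe to clean up; a later resume starts a fresh
     * return path thread instead of releasing a parked one. */
    migration_release_dst_files(ms);
}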
patchew-importer pushed a commit that referenced this issue on Oct 24, 2023

When the given uuid is already present in the hash table, virtio_add_resource() does not add the passed VirtioSharedObject. In this case, free it in the callers to avoid leaking memory.

This fixed the following `make check` error, when built with --enable-sanitizers:

4/166 qemu:unit / test-virtio-dmabuf    ERROR  1.51s  exit status 1
==7716==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 320 byte(s) in 20 object(s) allocated from:
#0 0x7f6fc16e3808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144
#1 0x7f6fc1503e98 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98)
#2 0x564d63cafb6b in test_add_invalid_resource ../tests/unit/test-virtio-dmabuf.c:100
#3 0x7f6fc152659d (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a59d)
SUMMARY: AddressSanitizer: 320 byte(s) leaked in 20 allocation(s).

The changes at virtio_add_resource() itself are not strictly necessary for the memleak fix, but they make it more obvious that, on an error return, the passed object is not added to the hash.

Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com>
Message-Id: <c61c13f9a0c67dec473bdbfc8789c29ef26c900b.1696624734.git.quic_mathbern@quicinc.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Albert Esteve <aesteve@redhat.com>
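The ownership rule this fix establishes can be shown with a small self-contained GLib example (plain GLib, not the QEMU code itself): the table takes ownership of the value only when the key was actually inserted, so the caller frees it otherwise.

#include <glib.h>

/* Insert only if the uuid is not present yet; mirrors the behaviour the
 * commit message describes for virtio_add_resource(). */
static gboolean add_resource(GHashTable *table, const char *uuid, gpointer value)
{
    if (g_hash_table_contains(table, uuid)) {
        return FALSE;          /* not inserted: caller keeps ownership */
    }
    g_hash_table_insert(table, g_strdup(uuid), value);
    return TRUE;
}

int main(void)
{
    GHashTable *table = g_hash_table_new_full(g_str_hash, g_str_equal,
                                              g_free, g_free);
    gpointer first = g_malloc0(16);
    gpointer second = g_malloc0(16);

    add_resource(table, "uuid-1", first);        /* owned by the table now */
    if (!add_resource(table, "uuid-1", second)) {
        g_free(second);                          /* duplicate: free it, no leak */
    }

    g_hash_table_destroy(table);
    return 0;
}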
patchew-importer pushed a commit that referenced this issue on Nov 8, 2023

If there is a pending DMA operation during ide_bus_reset(), the fact that the IDEState is already reset before the operation is canceled can be problematic. In particular, ide_dma_cb() might be called and then use the reset IDEState which contains the signature after the reset. When used to construct the IO operation this leads to ide_get_sector() returning 0 and nsector being 1. This is particularly bad, because a write command will thus destroy the first sector which often contains a partition table or similar.

Traces showing the unsolicited write happening with IDEState 0x5595af6949d0 being used after reset:

> ahci_port_write ahci(0x5595af6923f0)[0]: port write [reg:PxSCTL] @ 0x2c: 0x00000300
> ahci_reset_port ahci(0x5595af6923f0)[0]: reset port
> ide_reset IDEstate 0x5595af6949d0
> ide_reset IDEstate 0x5595af694da8
> ide_bus_reset_aio aio_cancel
> dma_aio_cancel dbs=0x7f64600089a0
> dma_blk_cb dbs=0x7f64600089a0 ret=0
> dma_complete dbs=0x7f64600089a0 ret=0 cb=0x5595acd40b30
> ahci_populate_sglist ahci(0x5595af6923f0)[0]
> ahci_dma_prepare_buf ahci(0x5595af6923f0)[0]: prepare buf limit=512 prepared=512
> ide_dma_cb IDEState 0x5595af6949d0; sector_num=0 n=1 cmd=DMA WRITE
> dma_blk_io dbs=0x7f6420802010 bs=0x5595ae2c6c30 offset=0 to_dev=1
> dma_blk_cb dbs=0x7f6420802010 ret=0

> (gdb) p *qiov
> $11 = {iov = 0x7f647c76d840, niov = 1, {{nalloc = 1, local_iov = {iov_base = 0x0,
> iov_len = 512}}, {__pad = "\001\000\000\000\000\000\000\000\000\000\000",
> size = 512}}}
> (gdb) bt
> #0 blk_aio_pwritev (blk=0x5595ae2c6c30, offset=0, qiov=0x7f6420802070, flags=0,
> cb=0x5595ace6f0b0 <dma_blk_cb>, opaque=0x7f6420802010)
> at ../block/block-backend.c:1682
> #1 0x00005595ace6f185 in dma_blk_cb (opaque=0x7f6420802010, ret=<optimized out>)
> at ../softmmu/dma-helpers.c:179
> #2 0x00005595ace6f778 in dma_blk_io (ctx=0x5595ae0609f0,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> io_func=io_func@entry=0x5595ace6ee30 <dma_blk_write_io_func>,
> io_func_opaque=io_func_opaque@entry=0x5595ae2c6c30,
> cb=0x5595acd40b30 <ide_dma_cb>, opaque=0x5595af6949d0,
> dir=DMA_DIRECTION_TO_DEVICE) at ../softmmu/dma-helpers.c:244
> #3 0x00005595ace6f90a in dma_blk_write (blk=0x5595ae2c6c30,
> sg=sg@entry=0x5595af694d00, offset=offset@entry=0, align=align@entry=512,
> cb=cb@entry=0x5595acd40b30 <ide_dma_cb>, opaque=opaque@entry=0x5595af6949d0)
> at ../softmmu/dma-helpers.c:280
> #4 0x00005595acd40e18 in ide_dma_cb (opaque=0x5595af6949d0, ret=<optimized out>)
> at ../hw/ide/core.c:953
> #5 0x00005595ace6f319 in dma_complete (ret=0, dbs=0x7f64600089a0)
> at ../softmmu/dma-helpers.c:107
> #6 dma_blk_cb (opaque=0x7f64600089a0, ret=0) at ../softmmu/dma-helpers.c:127
> #7 0x00005595ad12227d in blk_aio_complete (acb=0x7f6460005b10)
> at ../block/block-backend.c:1527
> #8 blk_aio_complete (acb=0x7f6460005b10) at ../block/block-backend.c:1524
> #9 blk_aio_write_entry (opaque=0x7f6460005b10) at ../block/block-backend.c:1594
> #10 0x00005595ad258cfb in coroutine_trampoline (i0=<optimized out>,
> i1=<optimized out>) at ../util/coroutine-ucontext.c:177

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: simon.rowe@nutanix.com
Message-ID: <20230906130922.142845-1-f.ebner@proxmox.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
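A hedged sketch of the ordering issue: cancel the in-flight DMA before wiping the per-drive state, so a completion callback such as ide_dma_cb() can never observe the post-reset signature (which is what turns the request into a one-sector write at LBA 0). Field names roughly follow hw/ide/core.c; this is not the literal upstream diff.

void ide_bus_reset(IDEBus *bus)
{
    bus->unit = 0;
    bus->cmd = 0;

    /* Quiesce any pending async DMA *first* ... */
    if (bus->dma->aiocb) {
        blk_aio_cancel(bus->dma->aiocb);   /* settles the AIO before returning */
        bus->dma->aiocb = NULL;
    }

    /* ... and only then reset the IDEState, so ide_dma_cb() cannot run
     * against the freshly reset signature values. */
    ide_reset(&bus->ifs[0]);
    ide_reset(&bus->ifs[1]);
}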
patchew-importer pushed a commit that referenced this issue on Jan 16, 2024

…ock_status

Using fleecing backup like in [0] on a qcow2 image (with metadata preallocation) can lead to the following assertion failure:

> bdrv_co_do_block_status: Assertion `!(ret & BDRV_BLOCK_ZERO)' failed.

In the reproducer [0], it happens because the BDRV_BLOCK_RECURSE flag will be set by the qcow2 driver, so the caller will recursively check the file child. Then the BDRV_BLOCK_ZERO flag gets set too. Later up the call chain, in bdrv_co_do_block_status() for the snapshot-access driver, the assertion failure will happen, because both flags are set.

To fix it, clear the recurse flag after the recursive check was done.

In detail:

> #0 qcow2_co_block_status

Returns 0x45 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID.

> #1 bdrv_co_do_block_status

Because of the data flag, bdrv_co_do_block_status() will now also set BDRV_BLOCK_ALLOCATED. Because of the recurse flag, bdrv_co_do_block_status() for the bdrv_file child will be called, which returns 0x16 = BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ZERO. Now the return value inherits the zero flag.

Returns 0x57 = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_ZERO.

> #2 bdrv_co_common_block_status_above
> #3 bdrv_co_block_status_above
> #4 bdrv_co_block_status
> #5 cbw_co_snapshot_block_status
> #6 bdrv_co_snapshot_block_status
> #7 snapshot_access_co_block_status
> #8 bdrv_co_do_block_status

Return value is propagated all the way up to here, where the assertion failure happens, because BDRV_BLOCK_RECURSE and BDRV_BLOCK_ZERO are both set.

> #9 bdrv_co_common_block_status_above
> #10 bdrv_co_block_status_above
> #11 block_copy_block_status
> #12 block_copy_dirty_clusters
> #13 block_copy_common
> #14 block_copy_async_co_entry
> #15 coroutine_trampoline

[0]:

> #!/bin/bash
> rm /tmp/disk.qcow2
> ./qemu-img create /tmp/disk.qcow2 -o preallocation=metadata -f qcow2 1G
> ./qemu-img create /tmp/fleecing.qcow2 -f qcow2 1G
> ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \
> --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2 \
> --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "node-name": "snap0" } }
> {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node1", "sync": "full", "job-id": "backup0" } }
> EOF

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240116154839.401030-1-f.ebner@proxmox.com>
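The flag values quoted above (0x45, 0x16, 0x57) make the arithmetic of the fix easy to replay in isolation. In the stand-alone program below the enum constants are only stand-ins chosen to match those hex values; the real definitions live in QEMU's block headers.

#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

enum {
    BDRV_BLOCK_DATA         = 0x01,
    BDRV_BLOCK_ZERO         = 0x02,
    BDRV_BLOCK_OFFSET_VALID = 0x04,
    BDRV_BLOCK_ALLOCATED    = 0x10,
    BDRV_BLOCK_RECURSE      = 0x40,
};

int main(void)
{
    /* qcow2 returned 0x45; the recursive check of the file child then
     * contributed BDRV_BLOCK_ALLOCATED and BDRV_BLOCK_ZERO, giving 0x57. */
    int64_t ret = BDRV_BLOCK_RECURSE | BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID
                | BDRV_BLOCK_ALLOCATED | BDRV_BLOCK_ZERO;
    assert(ret == 0x57);

    /* The fix: once the recursive check is done, drop the recurse flag so
     * RECURSE and ZERO are never reported together to the callers above. */
    ret &= ~BDRV_BLOCK_RECURSE;

    assert(!((ret & BDRV_BLOCK_RECURSE) && (ret & BDRV_BLOCK_ZERO)));
    printf("flags after the fix: 0x%" PRIx64 "\n", ret);   /* 0x17 */
    return 0;
}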
patchew-importer pushed a commit that referenced this issue on Jan 30, 2024

A memory page poisoned from the hypervisor level is no longer readable. The migration of a VM will crash Qemu when it tries to read the memory address space and stumbles on the poisoned page with a similar stack trace:

Program terminated with signal SIGBUS, Bus error.
#0 _mm256_loadu_si256
#1 buffer_zero_avx2
#2 select_accel_fn
#3 buffer_is_zero
#4 save_zero_page
#5 ram_save_target_page_legacy
#6 ram_save_host_page
#7 ram_find_and_save_block
#8 ram_save_iterate
#9 qemu_savevm_state_iterate
#10 migration_iteration_run
#11 migration_thread
#12 qemu_thread_start

To avoid this VM crash during the migration, prevent the migration when a known hardware poison exists on the VM.

Signed-off-by: William Roche <william.roche@oracle.com>
Message-Id: <20240130190640.139364-2-william.roche@oracle.com>
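A heavily simplified sketch of the idea, not the upstream API: remember that the kernel reported a hardware-poisoned page and turn that into a migration blocker, instead of letting the RAM save path read the page and die on SIGBUS. The flag and function names below are invented for the example.

#include "qapi/error.h"

static bool have_hwpoisoned_page;   /* set from the SIGBUS/MCE handling path */

static bool migration_blocked_by_hwpoison(Error **errp)
{
    if (have_hwpoisoned_page) {
        error_setg(errp, "cannot migrate: guest RAM contains a "
                   "hardware-poisoned page the source cannot read");
        return true;
    }
    return false;
}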
patchew-importer pushed a commit that referenced this issue on Mar 15, 2024
Hi, While doing some testing using numactl-based interleaving of application memory across regular memory and CXL-based memory using QEMU with tcg, I ran into an issue similar to what we saw a while back - link to old issue: https://lore.kernel.org/qemu-devel/CAFEAcA_a_AyQ=Epz3_+CheAT8Crsk9mOu894wbNW_FywamkZiw@mail.gmail.com/#t. When running: numactl --interleave 0,1 ./cachebench … I hit the following: numactl --interleave 0,1 ./cachebench --json_test_config ../test_configs/hit_ratio/graph_cache_follower_assocs/config.json qemu: fatal: cpu_io_recompile: could not find TB for pc=0x7fffc3926dd4 RAX=00007f65df55ba18 RBX=00007f65df55ba60 RCX=00007f65df221620 RDX=0000000000000000 RSI=00000000011c0260 RDI=00007f65df55ba60 RBP=00007ffdb4b4b280 RSP=00007ffdb4b4b1d0 R8 =00000000011c02c0 R9 =00007f65debf6b20 R10=00000000011bf5d0 R11=00007f65deb7d300 R12=00007ffdb4b4b260 R13=00007ffdb4b4b200 R14=00007ffdb4b4b220 R15=00000000011bf5a0 RIP=00007f65df18affc RFL=00000246 [---Z-P-] CPL=3 II=0 A20=1 SMM=0 HLT=0 ES =0000 0000000000000000 00000000 00000000 CS =0033 0000000000000000 ffffffff 00affb00 DPL=3 CS64 [-RA] SS =002b 0000000000000000 ffffffff 00cff300 DPL=3 DS [-WA] DS =0000 0000000000000000 00000000 00000000 FS =0000 00007f65de2f64c0 00000000 00000000 GS =0000 0000000000000000 00000000 00000000 LDT=0000 0000000000000000 00000000 00008200 DPL=0 LDT TR =0040 fffffe6c37990000 00004087 00008900 DPL=0 TSS64-avl GDT= fffffe6c3798e000 0000007f IDT= fffffe0000000000 00000fff CR0=80050033 CR2=00007f65df1b3eb0 CR3=0000000152a1e000 CR4=00350ef0 DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 DR6=00000000ffff0ff0 DR7=0000000000000400 CCS=0000000000000000 CCD=0000000000000001 CCO=CLR EFER=0000000000000d01 FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80 FPR0=0000000000000000 0000 FPR1=0000000000000000 0000 FPR2=0000000000000000 0000 FPR3=0000000000000000 0000 FPR4=0000000000000000 0000 FPR5=0000000000000000 0000 FPR6=0000000000000000 0000 FPR7=0000000000000000 0000 YMM00=0000000000000000 0000000000000000 00007f65df2233e0 00007f65df221620 YMM01=0000000000000000 0000000000000000 0000000000000000 43e0000000000000 YMM02=0000000000000000 0000000000000000 0000000000000000 0000000000000000 YMM03=0000000000000000 0000000000000000 0000000000000000 0000000000000000 YMM04=0000000000000000 0000000000000000 0000000000000000 3ff0000000000000 YMM05=0000000000000000 0000000000000000 0000000000000000 00007f65df2233e0 YMM06=0000000000000000 0000000000000000 0000000000000000 00007f65df2233b0 YMM07=0000000000000000 0000000000000000 62694c6568636143 2f65636170736b72 YMM08=0000000000000000 0000000000000000 6d622070656d7320 327876612031696d YMM09=0000000000000000 0000000000000000 0000000000000004 0000000000000004 YMM10=0000000000000000 0000000000000000 0000000000000002 0000000000000002 YMM11=0000000000000000 0000000000000000 0000000000000010 0000000000000010 YMM12=0000000000000000 0000000000000000 0000000000ff00fb 0000000000fe00fa YMM13=0000000000000000 0000000000000000 0000000000000000 00ff00fd00fb00f9 YMM14=0000000000000000 0000000000000000 0000000000000000 0000000000000000 YMM15=0000000000000000 0000000000000000 0000000000000000 0000000000000000 The backtrace is (using Jonathans cxl-2024-03-05 branch): (gdb) bt #0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737297516096) at ./nptl/pthread_kill.c:44 #1 __pthread_kill_internal (signo=6, threadid=140737297516096) at ./nptl/pthread_kill.c:78 #2 __GI___pthread_kill (threadid=140737297516096, signo=signo@entry=6) at 
./nptl/pthread_kill.c:89 #3 0x00007ffff7642476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #4 0x00007ffff76287f3 in __GI_abort () at ./stdlib/abort.c:79 #5 0x0000555555c5a9df in cpu_abort (cpu=cpu@entry=0x5555578c19c0, fmt=fmt@entry=0x55555605d100 "cpu_io_recompile: could not find TB for pc=%p") at ../cpu-target.c:371 #6 0x0000555555caa065 in cpu_io_recompile (cpu=cpu@entry=0x5555578c19c0, retaddr=140736474541524) at ../accel/tcg/translate-all.c:610 #7 0x0000555555cacee7 in io_prepare (retaddr=140736474541524, addr=140075515361944, attrs=..., xlat=<optimized out>, cpu=0x5555578c19c0, out_offset=<synthetic pointer>) at ../accel/tcg/cputlb.c:1336 #8 do_st_mmio_leN (cpu=0x5555578c19c0, full=0x7ffd1a1554d0, val_le=140075515361816, addr=140075515361944, size=8, mmu_idx=3, ra=140736474541524) at ../accel/tcg/cputlb.c:2591 #9 0x0000555555cb179d in do_st_8 (ra=<optimized out>, memop=<optimized out>, mmu_idx=<optimized out>, val=140075515361816, p=<optimized out>, cpu=<optimized out>) at ../accel/tcg/cputlb.c:2784 #10 do_st8_mmu (cpu=0x5555578c19c0, addr=39050, val=140075515361816, oi=6, ra=140736474541524) at ../accel/tcg/cputlb.c:2862 #11 0x00007fffc3926e15 in code_gen_buffer () #12 0x0000555555ca0e5b in cpu_tb_exec (cpu=cpu@entry=0x5555578c19c0, itb=itb@entry=0x7fffc3926cc0 <code_gen_buffer+464678035>, tb_exit=tb_exit@entry=0x7ffff49ff6d8) at ../accel/tcg/cpu-exec.c:449 #13 0x0000555555ca13ac in cpu_loop_exec_tb (tb_exit=0x7ffff49ff6d8, last_tb=<synthetic pointer>, pc=<optimized out>, tb=0x7fffc3926cc0 <code_gen_buffer+464678035>, cpu=0x5555578c19c0) at ../accel/tcg/cpu-exec.c:904 #14 cpu_exec_loop (cpu=cpu@entry=0x5555578c19c0, sc=sc@entry=0x7ffff49ff770) at ../accel/tcg/cpu-exec.c:1019 #15 0x0000555555ca1bb1 in cpu_exec_setjmp (cpu=cpu@entry=0x5555578c19c0, sc=sc@entry=0x7ffff49ff770) at ../accel/tcg/cpu-exec.c:1036 #16 0x0000555555ca2388 in cpu_exec (cpu=cpu@entry=0x5555578c19c0) at ../accel/tcg/cpu-exec.c:1062 #17 0x0000555555cc65c4 in tcg_cpu_exec (cpu=cpu@entry=0x5555578c19c0) at ../accel/tcg/tcg-accel-ops.c:76 #18 0x0000555555cc671f in mttcg_cpu_thread_fn (arg=arg@entry=0x5555578c19c0) at ../accel/tcg/tcg-accel-ops-mttcg.c:95 #19 0x0000555555e61261 in qemu_thread_start (args=<optimized out>) at ../util/qemu-thread-posix.c:541 #20 0x00007ffff7694ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442 #21 0x00007ffff7726850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 Looking at the tb being executed, it looks like it is a single instruction tb, so with my _very_ limited understanding of tcg, it shouldn’t be necessary to do the IO recompile: (gdb) up 13 #13 0x0000555555ca13ac in cpu_loop_exec_tb (tb_exit=0x7ffff49ff6d8, last_tb=<synthetic pointer>, pc=<optimized out>, tb=0x7fffc3926cc0 <code_gen_buffer+464678035>, cpu=0x5555578c19c0) at ../accel/tcg/cpu-exec.c:904 904 tb = cpu_tb_exec(cpu, tb, tb_exit); (gdb) print *tb $1 = {pc = 0, cs_base = 0, flags = 415285939, cflags = 4278321152, size = 7, icount = 1, tc = {ptr = 0x7fffc3926d80 <code_gen_buffer+464678227>, size = 176}, page_next = {0, 0}, page_addr = {18446744073709551615, 18446744073709551615}, jmp_lock = {value = 0}, jmp_reset_offset = {65535, 65535}, jmp_insn_offset = {65535, 65535}, jmp_target_addr = {0, 0}, jmp_list_head = 140736474540928, jmp_list_next = {0, 0}, jmp_dest = {0, 0}} If the application is run entirely out of MMIO memory, things work fine (the previous patches related to this is in Jonathans branch), so one thought is that it is related to having the code on a mix 
of regular and CXL memory. Since we previously had issues with code crossing page boundaries where only the second page is MMIO, I tried out the following change to the fix introduced for that issue thinking that reverting to the slow path in the middle of the translation might not correctly update can_do_io: Message-Id: <33748BB7-E617-4661-BDE3-5D29780FC9FF@wdc.com>
patchew-importer pushed a commit that referenced this issue on Mar 21, 2024

IODA PCT table (#3) is implemented without any functionality, being a debug table.

Signed-off-by: Saif Abrar <saif.abrar@linux.vnet.ibm.com>
Message-Id: <20240321100422.5347-9-saif.abrar@linux.vnet.ibm.com>
patchew-importer pushed a commit that referenced this issue on Mar 21, 2024

…B changes

The old_bs variable in bdrv_next() is currently determined by looking at the old block backend. However, if the block graph changes before the next bdrv_next() call, it might be that the associated BDS is not the same that was referenced previously. In that case, the wrong BDS is unreferenced, leading to an assertion failure later:

> bdrv_unref: Assertion `bs->refcnt > 0' failed.

In particular, this can happen in the context of bdrv_flush_all(), when polling for bdrv_co_flush() in the generated co-wrapper leads to a graph change (for example with a stream block job [0]).

A racy reproducer:

> #!/bin/bash
> rm -f /tmp/backing.qcow2
> rm -f /tmp/top.qcow2
> ./qemu-img create /tmp/backing.qcow2 -f qcow2 64M
> ./qemu-io -c "write -P42 0x0 0x1" /tmp/backing.qcow2
> ./qemu-img create /tmp/top.qcow2 -f qcow2 64M -b /tmp/backing.qcow2 -F qcow2
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/top.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "block-stream", "arguments": { "job-id": "stream0", "device": "node0" } }
> {"execute": "quit"}
> EOF

[0]:

> #0 bdrv_replace_child_tran (child=..., new_bs=..., tran=...)
> #1 bdrv_replace_node_noperm (from=..., to=..., auto_skip=..., tran=..., errp=...)
> #2 bdrv_replace_node_common (from=..., to=..., auto_skip=..., detach_subchain=..., errp=...)
> #3 bdrv_drop_filter (bs=..., errp=...)
> #4 bdrv_cor_filter_drop (cor_filter_bs=...)
> #5 stream_prepare (job=...)
> #6 job_prepare_locked (job=...)
> #7 job_txn_apply_locked (fn=..., job=...)
> #8 job_do_finalize_locked (job=...)
> #9 job_exit (opaque=...)
> #10 aio_bh_poll (ctx=...)
> #11 aio_poll (ctx=..., blocking=...)
> #12 bdrv_poll_co (s=...)
> #13 bdrv_flush (bs=...)
> #14 bdrv_flush_all ()
> #15 do_vm_stop (state=..., send_stop=...)
> #16 vm_shutdown ()

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Message-Id: <20240321133323.831133-3-f.ebner@proxmox.com>
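A simplified sketch of the invariant the fix restores in bdrv_next(): unreference exactly the BDS that was referenced when it was handed out, by storing it in the iterator, instead of re-deriving it from the block backend after a possible graph change. Struct and field names are approximations, not the real BdrvNextIterator.

typedef struct {
    BlockBackend *blk;
    BlockDriverState *bs;   /* the BDS returned (and referenced) last time */
} NextIterator;             /* illustrative stand-in */

static void release_previous(NextIterator *it)
{
    /* Wrong: bdrv_unref(blk_bs(it->blk)) -- after a graph change the BB may
     * point at a different BDS, and the wrong node loses a reference.
     * Right: unref exactly what we stored when we handed it out. */
    if (it->bs) {
        bdrv_unref(it->bs);
        it->bs = NULL;
    }
}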
patchew-importer pushed a commit that referenced this issue on Mar 21, 2024

This update changes how virtio_snd_handle_tx_xfer handles message size discrepancies and null streams. Instead of using error handling paths which led to unnecessary processing and potential null pointer dereferences, the function now continues to the next loop iteration.

ASAN log illustrating the issue addressed:

ERROR: AddressSanitizer: SEGV on unknown address 0x0000000000b4 (pc 0x57cea39967b8 bp 0x7ffce84d51b0 sp 0x7ffce84d5160 T0)
#0 0x57cea39967b8 in qemu_mutex_lock_impl qemu/util/qemu-thread-posix.c:92:5
#1 0x57cea128c462 in qemu_mutex_lock qemu/include/qemu/thread.h:122:5
#2 0x57cea128d72f in qemu_lockable_lock qemu/include/qemu/lockable.h:95:5
#3 0x57cea128c294 in qemu_lockable_auto_lock qemu/include/qemu/lockable.h:105:5
#4 0x57cea1285eb2 in virtio_snd_handle_rx_xfer qemu/hw/audio/virtio-snd.c:1026:9
#5 0x57cea2caebbc in virtio_queue_notify_vq qemu/hw/virtio/virtio.c:2268:9
#6 0x57cea2cae412 in virtio_queue_host_notifier_read qemu/hw/virtio/virtio.c:3671:9
#7 0x57cea39822f1 in aio_dispatch_handler qemu/util/aio-posix.c:372:9
#8 0x57cea3979385 in aio_dispatch_handlers qemu/util/aio-posix.c:414:20
#9 0x57cea3978eb1 in aio_dispatch qemu/util/aio-posix.c:424:5
#10 0x57cea3a1eede in aio_ctx_dispatch qemu/util/async.c:360:5

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Message-Id: <20240321214259.518560-1-zheyuma97@gmail.com>
patchew-importer pushed a commit that referenced this issue on Mar 22, 2024

This patch improves error handling in virtio_snd_handle_tx_xfer() and virtio_snd_handle_rx_xfer() in the VirtIO sound driver. Previously, 'goto' statements were used for error paths, leading to unnecessary processing and potential null pointer dereferences. Now, 'continue' is used to skip the rest of the current loop iteration for errors such as message size discrepancies or null streams, reducing crash risks.

ASAN log illustrating the issue addressed:

ERROR: AddressSanitizer: SEGV on unknown address 0x0000000000b4
#0 0x57cea39967b8 in qemu_mutex_lock_impl qemu/util/qemu-thread-posix.c:92:5
#1 0x57cea128c462 in qemu_mutex_lock qemu/include/qemu/thread.h:122:5
#2 0x57cea128d72f in qemu_lockable_lock qemu/include/qemu/lockable.h:95:5
#3 0x57cea128c294 in qemu_lockable_auto_lock qemu/include/qemu/lockable.h:105:5
#4 0x57cea1285eb2 in virtio_snd_handle_rx_xfer qemu/hw/audio/virtio-snd.c:1026:9
#5 0x57cea2caebbc in virtio_queue_notify_vq qemu/hw/virtio/virtio.c:2268:9
#6 0x57cea2cae412 in virtio_queue_host_notifier_read qemu/hw/virtio/virtio.c:3671:9
#7 0x57cea39822f1 in aio_dispatch_handler qemu/util/aio-posix.c:372:9
#8 0x57cea3979385 in aio_dispatch_handlers qemu/util/aio-posix.c:414:20
#9 0x57cea3978eb1 in aio_dispatch qemu/util/aio-posix.c:424:5
#10 0x57cea3a1eede in aio_ctx_dispatch qemu/util/async.c:360:5

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20240322110827.568412-1-zheyuma97@gmail.com>
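A sketch of the control-flow change, as it might look inside the TX/RX handler loop; surrounding declarations are omitted, and helper names such as virtio_snd_pcm_get_stream() are assumptions based on the commit text rather than verified signatures.

for (;;) {
    VirtQueueElement *elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
    if (!elem) {
        break;                                  /* queue drained */
    }

    if (iov_size(elem->out_sg, elem->out_num) < sizeof(hdr)) {
        /* malformed message: complete it and move on, no shared error path */
        virtqueue_push(vq, elem, 0);
        g_free(elem);
        continue;
    }

    stream = virtio_snd_pcm_get_stream(s, le32_to_cpu(hdr.stream_id));
    if (stream == NULL) {
        virtqueue_push(vq, elem, 0);
        g_free(elem);
        continue;                               /* never dereference a NULL stream */
    }

    /* ... normal TX/RX processing of this element ... */
}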
patchew-importer pushed a commit that referenced this issue on Apr 2, 2024

When vhost-user or vhost-kernel is handling the virtio net datapath, QEMU should not touch the used ring. But in the vhost-user socket reconnect scenario, in a very rare case (when there is a pending kick event), VRING_USED_F_NO_NOTIFY is set by QEMU in the following code path:

#0 virtio_queue_split_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:511
#1 0x0000559d6dbf033b in virtio_queue_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:576
#2 0x0000559d6dbbbdbc in virtio_net_handle_tx_bh (vdev=0x559d703a6aa0, vq=0x7ff5f4c920a8) at ../hw/net/virtio-net.c:2801
#3 0x0000559d6dbf4791 in virtio_queue_notify_vq (vq=0x7ff5f4c920a8) at ../hw/virtio/virtio.c:2248
#4 0x0000559d6dbf79da in virtio_queue_host_notifier_read (n=0x7ff5f4c9211c) at ../hw/virtio/virtio.c:3525
#5 0x0000559d6d9a5814 in virtio_bus_cleanup_host_notifier (bus=0x559d703a6a20, n=1) at ../hw/virtio/virtio-bus.c:321
#6 0x0000559d6dbf83c9 in virtio_device_stop_ioeventfd_impl (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3774
#7 0x0000559d6d9a55c8 in virtio_bus_stop_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:259
#8 0x0000559d6d9a53e8 in virtio_bus_grab_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:199
#9 0x0000559d6dbf841c in virtio_device_grab_ioeventfd (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3783
#10 0x0000559d6d9bde18 in vhost_dev_enable_notifiers (hdev=0x559d707edd70, vdev=0x559d703a6aa0) at ../hw/virtio/vhost.c:1592
#11 0x0000559d6d89a0b8 in vhost_net_start_one (net=0x559d707edd70, dev=0x559d703a6aa0) at ../hw/net/vhost_net.c:266
#12 0x0000559d6d89a6df in vhost_net_start (dev=0x559d703a6aa0, ncs=0x559d7048d890, data_queue_pairs=31, cvq=0) at ../hw/net/vhost_net.c:412
#13 0x0000559d6dbb5b89 in virtio_net_vhost_status (n=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:311
#14 0x0000559d6dbb5e34 in virtio_net_set_status (vdev=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:392
#15 0x0000559d6dbb60d8 in virtio_net_set_link_status (nc=0x559d7048d890) at ../hw/net/virtio-net.c:455
#16 0x0000559d6da64863 in qmp_set_link (name=0x559d6f0b83d0 "hostnet1", up=true, errp=0x7ffdd76569f0) at ../net/net.c:1459
#17 0x0000559d6da7226e in net_vhost_user_event (opaque=0x559d6f0b83d0, event=CHR_EVENT_OPENED) at ../net/vhost-user.c:301
#18 0x0000559d6ddc7f63 in chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:62
#19 0x0000559d6ddc7fdc in qemu_chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:82

This issue causes the guest kernel to stop kicking the device, and traffic stops.

Add a vhost_started check in virtio_net_handle_tx_bh to avoid this incorrect setting of VRING_USED_F_NO_NOTIFY.

Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Message-Id: <20240402045109.97729-1-yajunw@nvidia.com>
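A simplified version of the guard the commit message describes, placed at the top of the bottom-half TX handler; helper names such as vq2q() follow hw/net/virtio-net.c but are not verified against the exact tree, and error handling is trimmed.

static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq)
{
    VirtIONet *n = VIRTIO_NET(vdev);
    VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))];

    /* The vhost backend owns the datapath: a stale kick that arrives while
     * reconnecting must not make QEMU touch the used ring (and set
     * VRING_USED_F_NO_NOTIFY behind the guest's back). */
    if (unlikely(n->vhost_started)) {
        return;
    }

    virtio_queue_set_notification(vq, 0);
    qemu_bh_schedule(q->tx_bh);
}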
patchew-importer
pushed a commit
that referenced
this issue
Apr 2, 2024
When vhost-user or vhost-kernel is handling the virtio-net datapath, QEMU should not touch the used ring. But in the vhost-user socket reconnect scenario, in a very rare case (when a kick event is still pending), VRING_USED_F_NO_NOTIFY is set by QEMU in the following code path: #0 virtio_queue_split_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:511 #1 0x0000559d6dbf033b in virtio_queue_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:576 #2 0x0000559d6dbbbdbc in virtio_net_handle_tx_bh (vdev=0x559d703a6aa0, vq=0x7ff5f4c920a8) at ../hw/net/virtio-net.c:2801 #3 0x0000559d6dbf4791 in virtio_queue_notify_vq (vq=0x7ff5f4c920a8) at ../hw/virtio/virtio.c:2248 #4 0x0000559d6dbf79da in virtio_queue_host_notifier_read (n=0x7ff5f4c9211c) at ../hw/virtio/virtio.c:3525 #5 0x0000559d6d9a5814 in virtio_bus_cleanup_host_notifier (bus=0x559d703a6a20, n=1) at ../hw/virtio/virtio-bus.c:321 #6 0x0000559d6dbf83c9 in virtio_device_stop_ioeventfd_impl (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3774 #7 0x0000559d6d9a55c8 in virtio_bus_stop_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:259 #8 0x0000559d6d9a53e8 in virtio_bus_grab_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:199 #9 0x0000559d6dbf841c in virtio_device_grab_ioeventfd (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3783 #10 0x0000559d6d9bde18 in vhost_dev_enable_notifiers (hdev=0x559d707edd70, vdev=0x559d703a6aa0) at ../hw/virtio/vhost.c:1592 #11 0x0000559d6d89a0b8 in vhost_net_start_one (net=0x559d707edd70, dev=0x559d703a6aa0) at ../hw/net/vhost_net.c:266 #12 0x0000559d6d89a6df in vhost_net_start (dev=0x559d703a6aa0, ncs=0x559d7048d890, data_queue_pairs=31, cvq=0) at ../hw/net/vhost_net.c:412 #13 0x0000559d6dbb5b89 in virtio_net_vhost_status (n=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:311 #14 0x0000559d6dbb5e34 in virtio_net_set_status (vdev=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:392 #15 0x0000559d6dbb60d8 in virtio_net_set_link_status (nc=0x559d7048d890) at ../hw/net/virtio-net.c:455 #16 0x0000559d6da64863 in qmp_set_link (name=0x559d6f0b83d0 "hostnet1", up=true, errp=0x7ffdd76569f0) at ../net/net.c:1459 #17 0x0000559d6da7226e in net_vhost_user_event (opaque=0x559d6f0b83d0, event=CHR_EVENT_OPENED) at ../net/vhost-user.c:301 #18 0x0000559d6ddc7f63 in chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:62 #19 0x0000559d6ddc7fdc in qemu_chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:82 This issue causes the guest kernel to stop kicking the device, and traffic stops. Add a vhost_started check in virtio_net_handle_tx_bh to prevent this incorrect setting of VRING_USED_F_NO_NOTIFY. Signed-off-by: Yajun Wu <yajunw@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20240402045109.97729-1-yajunw@nvidia.com>
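For illustration only, here is a minimal, self-contained C sketch of the guard described above. It is not the actual QEMU patch; the struct and field names (net_state, vq_state, no_notify) are stand-ins invented for this example. The idea is simply that the TX bottom-half handler returns early while the vhost backend owns the datapath, so the device model never flips the notification flag behind vhost's back.

#include <stdbool.h>
#include <stdio.h>

/* Stand-in state, invented for this sketch. */
struct vq_state {
    bool no_notify;      /* models VRING_USED_F_NO_NOTIFY */
};

struct net_state {
    bool vhost_started;  /* true while vhost-user/vhost-kernel owns the ring */
    struct vq_state txq;
};

/* TX bottom-half handler: must not touch the used ring while vhost runs. */
static void handle_tx_bh(struct net_state *n)
{
    if (n->vhost_started) {
        /* Datapath belongs to the vhost backend; leave notifications alone. */
        return;
    }
    n->txq.no_notify = true;   /* emulated path may suppress guest kicks */
    /* ... process TX descriptors in the emulated datapath ... */
}

int main(void)
{
    struct net_state n = { .vhost_started = true };
    handle_tx_bh(&n);
    printf("no_notify=%d (stays 0 while vhost owns the queue)\n", n.txq.no_notify);
    return 0;
}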
patchew-importer
pushed a commit
that referenced
this issue
Apr 10, 2024
Per "SD Host Controller Standard Specification Version 3.00": * 2.2.5 Transfer Mode Register (Offset 00Ch) Writes to this register shall be ignored when the Command Inhibit (DAT) in the Present State register is 1. Do not update the TRNMOD register when Command Inhibit (DAT) bit is set to avoid the present-status register going out of sync, leading to malicious guest using DMA mode and overflowing the FIFO buffer: $ cat << EOF | qemu-system-i386 \ -display none -nographic -nodefaults \ -machine accel=qtest -m 512M \ -device sdhci-pci,sd-spec-version=3 \ -device sd-card,drive=mydrive \ -drive if=none,index=0,file=null-co://,format=raw,id=mydrive \ -qtest stdio outl 0xcf8 0x80001013 outl 0xcfc 0x91 outl 0xcf8 0x80001001 outl 0xcfc 0x06000000 write 0x9100002c 0x1 0x05 write 0x91000058 0x1 0x16 write 0x91000005 0x1 0x04 write 0x91000028 0x1 0x08 write 0x16 0x1 0x21 write 0x19 0x1 0x20 write 0x9100000c 0x1 0x01 write 0x9100000e 0x1 0x20 write 0x9100000f 0x1 0x00 write 0x9100000c 0x1 0x00 write 0x91000020 0x1 0x00 EOF Stack trace (part): ================================================================= ==89993==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x615000029900 at pc 0x55d5f885700d bp 0x7ffc1e1e9470 sp 0x7ffc1e1e9468 WRITE of size 1 at 0x615000029900 thread T0 #0 0x55d5f885700c in sdhci_write_dataport hw/sd/sdhci.c:564:39 #1 0x55d5f8849150 in sdhci_write hw/sd/sdhci.c:1223:13 #2 0x55d5fa01db63 in memory_region_write_accessor system/memory.c:497:5 #3 0x55d5fa01d245 in access_with_adjusted_size system/memory.c:573:18 #4 0x55d5fa01b1a9 in memory_region_dispatch_write system/memory.c:1521:16 #5 0x55d5fa09f5c9 in flatview_write_continue system/physmem.c:2711:23 #6 0x55d5fa08f78b in flatview_write system/physmem.c:2753:12 #7 0x55d5fa08f258 in address_space_write system/physmem.c:2860:18 ... 0x615000029900 is located 0 bytes to the right of 512-byte region [0x615000029700,0x615000029900) allocated by thread T0 here: #0 0x55d5f7237b27 in __interceptor_calloc #1 0x7f9e36dd4c50 in g_malloc0 #2 0x55d5f88672f7 in sdhci_pci_realize hw/sd/sdhci-pci.c:36:5 #3 0x55d5f844b582 in pci_qdev_realize hw/pci/pci.c:2092:9 #4 0x55d5fa2ee74b in device_set_realized hw/core/qdev.c:510:13 #5 0x55d5fa325bfb in property_set_bool qom/object.c:2358:5 #6 0x55d5fa31ea45 in object_property_set qom/object.c:1472:5 #7 0x55d5fa332509 in object_property_set_qobject om/qom-qobject.c:28:10 #8 0x55d5fa31f6ed in object_property_set_bool qom/object.c:1541:15 #9 0x55d5fa2e2948 in qdev_realize hw/core/qdev.c:292:12 #10 0x55d5f8eed3f1 in qdev_device_add_from_qdict system/qdev-monitor.c:719:10 #11 0x55d5f8eef7ff in qdev_device_add system/qdev-monitor.c:738:11 #12 0x55d5f8f211f0 in device_init_func system/vl.c:1200:11 #13 0x55d5fad0877d in qemu_opts_foreach util/qemu-option.c:1135:14 #14 0x55d5f8f0df9c in qemu_create_cli_devices system/vl.c:2638:5 #15 0x55d5f8f0db24 in qmp_x_exit_preconfig system/vl.c:2706:5 #16 0x55d5f8f14dc0 in qemu_init system/vl.c:3737:9 ... SUMMARY: AddressSanitizer: heap-buffer-overflow hw/sd/sdhci.c:564:39 in sdhci_write_dataport Add assertions to ensure the fifo_buffer[] is not overflowed by malicious accesses to the Buffer Data Port register. 
Fixes: CVE-2024-3447 Cc: qemu-stable@nongnu.org Fixes: d7dfca0 ("hw/sdhci: introduce standard SD host controller") Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58813 Reported-by: Alexander Bulekov <alxndr@bu.edu> Reported-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <CAFEAcA9iLiv1XGTGKeopgMa8Y9+8kvptvsb8z2OBeuy+5=NUfg@mail.gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240409145524.27913-1-philmd@linaro.org> Message-Id: <20240410091315.57241-15-philmd@linaro.org>
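As a rough illustration of the two defenses the message describes, the following standalone C sketch drops writes to a transfer-mode register while a Command Inhibit (DAT) bit is set, and asserts that data-port writes stay inside the FIFO. The register and field names are made up for the sketch and do not match the real SDHCIState layout.

#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

#define FIFO_SIZE 512u

/* Illustrative controller state, not the real SDHCI model. */
struct sdhci {
    uint16_t trnmod;           /* transfer mode register */
    bool     cmd_inhibit_dat;  /* Present State: Command Inhibit (DAT) */
    uint8_t  fifo[FIFO_SIZE];
    uint32_t fifo_pos;
};

static void write_trnmod(struct sdhci *s, uint16_t val)
{
    /* Spec 3.00, 2.2.5: ignore the write while Command Inhibit (DAT) is 1,
     * so the transfer mode can never go out of sync with the present state. */
    if (s->cmd_inhibit_dat) {
        return;
    }
    s->trnmod = val;
}

static void write_dataport(struct sdhci *s, uint8_t val)
{
    /* Guard the FIFO: a guest-driven index must never run past the buffer. */
    assert(s->fifo_pos < FIFO_SIZE);
    s->fifo[s->fifo_pos++] = val;
}

int main(void)
{
    struct sdhci s = { .cmd_inhibit_dat = true };
    write_trnmod(&s, 0x0020);  /* dropped: inhibit bit is set */
    write_dataport(&s, 0xAA);  /* in bounds; would assert otherwise */
    return s.trnmod;           /* 0 */
}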
patchew-importer
pushed a commit
that referenced
this issue
Apr 10, 2024
If a fragmented packet size is too short, do not try to calculate its checksum. Reproduced using: $ cat << EOF | qemu-system-i386 -display none -nodefaults \ -machine q35,accel=qtest -m 32M \ -device igb,netdev=net0 \ -netdev user,id=net0 \ -qtest stdio outl 0xcf8 0x80000810 outl 0xcfc 0xe0000000 outl 0xcf8 0x80000804 outw 0xcfc 0x06 write 0xe0000403 0x1 0x02 writel 0xe0003808 0xffffffff write 0xe000381a 0x1 0x5b write 0xe000381b 0x1 0x00 EOF Assertion failed: (offset == 0), function iov_from_buf_full, file util/iov.c, line 39. #1 0x5575e81e952a in iov_from_buf_full qemu/util/iov.c:39:5 #2 0x5575e6500768 in net_tx_pkt_update_sctp_checksum qemu/hw/net/net_tx_pkt.c:144:9 #3 0x5575e659f3e1 in igb_setup_tx_offloads qemu/hw/net/igb_core.c:478:11 #4 0x5575e659f3e1 in igb_tx_pkt_send qemu/hw/net/igb_core.c:552:10 #5 0x5575e659f3e1 in igb_process_tx_desc qemu/hw/net/igb_core.c:671:17 #6 0x5575e659f3e1 in igb_start_xmit qemu/hw/net/igb_core.c:903:9 #7 0x5575e659f3e1 in igb_set_tdt qemu/hw/net/igb_core.c:2812:5 #8 0x5575e657d6a4 in igb_core_write qemu/hw/net/igb_core.c:4248:9 Fixes: CVE-2024-3567 Cc: qemu-stable@nongnu.org Reported-by: Zheyu Ma <zheyuma97@gmail.com> Fixes: f199b13 ("igb: Implement Tx SCTP CSO") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2273 Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20240410070459.49112-1-philmd@linaro.org> Message-Id: <20240410091315.57241-16-philmd@linaro.org>
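A minimal sketch of the length check this fix describes, with an assumed minimum header size standing in for the real SCTP offload path: if the fragment is shorter than the smallest packet that could carry a checksum, skip the checksum update instead of reading past the buffer.

#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption for the sketch: an SCTP common header is 12 bytes, so anything
 * shorter cannot hold a checksum field to update. */
#define MIN_SCTP_LEN 12u

static bool update_sctp_checksum(const uint8_t *pkt, size_t len)
{
    if (pkt == NULL || len < MIN_SCTP_LEN) {
        /* Fragment too short: refuse instead of reading past the buffer. */
        return false;
    }
    /* ... compute and store the CRC32c checksum over pkt[0..len) ... */
    return true;
}

int main(void)
{
    uint8_t tiny[4] = { 0 };
    printf("short fragment handled: %s\n",
           update_sctp_checksum(tiny, sizeof(tiny)) ? "checksummed" : "skipped");
    return 0;
}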
patchew-importer
pushed a commit
that referenced
this issue
Apr 10, 2024
Replace qemu_bh_new_guarded() by virtio_bh_new_guarded() so the bus and device use the same guard. Otherwise the DMA-reentrancy protection can be bypassed: $ cat << EOF | qemu-system-i386 -display none -nodefaults \ -machine q35,accel=qtest \ -m 512M \ -device virtio-gpu \ -qtest stdio outl 0xcf8 0x80000820 outl 0xcfc 0xe0004000 outl 0xcf8 0x80000804 outw 0xcfc 0x06 write 0xe0004030 0x4 0x024000e0 write 0xe0004028 0x1 0xff write 0xe0004020 0x4 0x00009300 write 0xe000401c 0x1 0x01 write 0x101 0x1 0x04 write 0x103 0x1 0x1c write 0x9301c8 0x1 0x18 write 0x105 0x1 0x1c write 0x107 0x1 0x1c write 0x109 0x1 0x1c write 0x10b 0x1 0x00 write 0x10d 0x1 0x00 write 0x10f 0x1 0x00 write 0x111 0x1 0x00 write 0x113 0x1 0x00 write 0x115 0x1 0x00 write 0x117 0x1 0x00 write 0x119 0x1 0x00 write 0x11b 0x1 0x00 write 0x11d 0x1 0x00 write 0x11f 0x1 0x00 write 0x121 0x1 0x00 write 0x123 0x1 0x00 write 0x125 0x1 0x00 write 0x127 0x1 0x00 write 0x129 0x1 0x00 write 0x12b 0x1 0x00 write 0x12d 0x1 0x00 write 0x12f 0x1 0x00 write 0x131 0x1 0x00 write 0x133 0x1 0x00 write 0x135 0x1 0x00 write 0x137 0x1 0x00 write 0x139 0x1 0x00 write 0xe0007003 0x1 0x00 EOF ... ================================================================= ==276099==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d000011178 at pc 0x562cc3b736c7 bp 0x7ffed49dee60 sp 0x7ffed49dee58 READ of size 8 at 0x60d000011178 thread T0 #0 0x562cc3b736c6 in virtio_gpu_ctrl_response hw/display/virtio-gpu.c:180:42 #1 0x562cc3b7c40b in virtio_gpu_ctrl_response_nodata hw/display/virtio-gpu.c:192:5 #2 0x562cc3b7c40b in virtio_gpu_simple_process_cmd hw/display/virtio-gpu.c:1015:13 #3 0x562cc3b82873 in virtio_gpu_process_cmdq hw/display/virtio-gpu.c:1050:9 #4 0x562cc4a85514 in aio_bh_call util/async.c:169:5 #5 0x562cc4a85c52 in aio_bh_poll util/async.c:216:13 #6 0x562cc4a1a79b in aio_dispatch util/aio-posix.c:423:5 #7 0x562cc4a8a2da in aio_ctx_dispatch util/async.c:358:5 #8 0x7f36840547a8 in g_main_context_dispatch (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x547a8) #9 0x562cc4a8b753 in glib_pollfds_poll util/main-loop.c:290:9 #10 0x562cc4a8b753 in os_host_main_loop_wait util/main-loop.c:313:5 #11 0x562cc4a8b753 in main_loop_wait util/main-loop.c:592:11 #12 0x562cc3938186 in qemu_main_loop system/runstate.c:782:9 #13 0x562cc43b7af5 in qemu_default_main system/main.c:37:14 #14 0x7f3683a6c189 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16 #15 0x7f3683a6c244 in __libc_start_main csu/../csu/libc-start.c:381:3 #16 0x562cc2a58ac0 in _start (qemu-system-i386+0x231bac0) 0x60d000011178 is located 56 bytes inside of 136-byte region [0x60d000011140,0x60d0000111c8) freed by thread T0 here: #0 0x562cc2adb662 in __interceptor_free (qemu-system-i386+0x239e662) #1 0x562cc3b86b21 in virtio_gpu_reset hw/display/virtio-gpu.c:1524:9 #2 0x562cc416e20e in virtio_reset hw/virtio/virtio.c:2145:9 #3 0x562cc37c5644 in virtio_pci_reset hw/virtio/virtio-pci.c:2249:5 #4 0x562cc4233758 in memory_region_write_accessor system/memory.c:497:5 #5 0x562cc4232eea in access_with_adjusted_size system/memory.c:573:18 previously allocated by thread T0 here: #0 0x562cc2adb90e in malloc (qemu-system-i386+0x239e90e) #1 0x7f368405a678 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5a678) #2 0x562cc4163ffc in virtqueue_split_pop hw/virtio/virtio.c:1612:12 #3 0x562cc4163ffc in virtqueue_pop hw/virtio/virtio.c:1783:16 #4 0x562cc3b91a95 in virtio_gpu_handle_ctrl hw/display/virtio-gpu.c:1112:15 #5 0x562cc4a85514 in aio_bh_call util/async.c:169:5 #6 0x562cc4a85c52 in 
aio_bh_poll util/async.c:216:13 #7 0x562cc4a1a79b in aio_dispatch util/aio-posix.c:423:5 SUMMARY: AddressSanitizer: heap-use-after-free hw/display/virtio-gpu.c:180:42 in virtio_gpu_ctrl_response With this change, the same reproducer triggers: qemu-system-i386: warning: Blocked re-entrant IO on MemoryRegion: virtio-pci-common-virtio-gpu at addr: 0x6 Fixes: CVE-2024-3446 Cc: qemu-stable@nongnu.org Reported-by: Alexander Bulekov <alxndr@bu.edu> Reported-by: Yongkang Jia <kangel@zju.edu.cn> Reported-by: Xiao Lei <nop.leixiao@gmail.com> Reported-by: Yiming Tao <taoym@zju.edu.cn> Buglink: https://bugs.launchpad.net/qemu/+bug/1888606 Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20240409105537.18308-3-philmd@linaro.org>
patchew-importer
pushed a commit
that referenced
this issue
Apr 10, 2024
nand_command() and nand_getio() check neither that @offset points into the block nor that the available data length (s->iolen) is non-negative. To fix this: - check that the offset is in range in nand_blk_load_NAND_PAGE_SIZE(), - do not set @iolen if blk_load() failed. Reproducer: $ cat << EOF | qemu-system-arm -machine tosa \ -monitor none -serial none \ -display none -qtest stdio write 0x10000111 0x1 0xca write 0x10000104 0x1 0x47 write 0x1000ca04 0x1 0xd7 write 0x1000ca01 0x1 0xe0 write 0x1000ca04 0x1 0x71 write 0x1000ca00 0x1 0x50 write 0x1000ca04 0x1 0xd7 read 0x1000ca02 0x1 write 0x1000ca01 0x1 0x10 EOF ================================================================= ==15750==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61f000000de0 at pc 0x560e61557210 bp 0x7ffcfc4a59f0 sp 0x7ffcfc4a59e8 READ of size 1 at 0x61f000000de0 thread T0 #0 0x560e6155720f in mem_and hw/block/nand.c:101:20 #1 0x560e6155ac9c in nand_blk_write_512 hw/block/nand.c:663:9 #2 0x560e61544200 in nand_command hw/block/nand.c:293:13 #3 0x560e6153cc83 in nand_setio hw/block/nand.c:520:13 #4 0x560e61a0a69e in tc6393xb_nand_writeb hw/display/tc6393xb.c:380:13 #5 0x560e619f9bf7 in tc6393xb_writeb hw/display/tc6393xb.c:524:9 #6 0x560e647c7d03 in memory_region_write_accessor softmmu/memory.c:492:5 #7 0x560e647c7641 in access_with_adjusted_size softmmu/memory.c:554:18 #8 0x560e647c5f66 in memory_region_dispatch_write softmmu/memory.c:1514:16 #9 0x560e6485409e in flatview_write_continue softmmu/physmem.c:2825:23 #10 0x560e648421eb in flatview_write softmmu/physmem.c:2867:12 #11 0x560e64841ca8 in address_space_write softmmu/physmem.c:2963:18 #12 0x560e61170162 in qemu_writeb tests/qtest/videzzo/videzzo_qemu.c:1080:5 #13 0x560e6116eef7 in dispatch_mmio_write tests/qtest/videzzo/videzzo_qemu.c:1227:28 0x61f000000de0 is located 0 bytes to the right of 3424-byte region [0x61f000000080,0x61f000000de0) allocated by thread T0 here: #0 0x560e611276cf in malloc /root/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x7f7959a87e98 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98) #2 0x560e64b98871 in object_new qom/object.c:749:12 #3 0x560e64b5d1a1 in qdev_new hw/core/qdev.c:153:19 #4 0x560e61547ea5 in nand_init hw/block/nand.c:639:11 #5 0x560e619f8772 in tc6393xb_init hw/display/tc6393xb.c:558:16 #6 0x560e6390bad2 in tosa_init hw/arm/tosa.c:250:12 SUMMARY: AddressSanitizer: heap-buffer-overflow hw/block/nand.c:101:20 in mem_and ==15750==ABORTING Broken since introduction in commit 3e3d581 ("NAND Flash memory emulation and ECC calculation helpers for use by NAND controllers"). Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1445 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1446 Reported-by: Qiang Liu <cyruscyliu@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240409135944.24997-4-philmd@linaro.org>
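A minimal sketch of the two-part fix described above, using invented names and buffer sizes rather than the real NANDFlashState: the page-load helper refuses out-of-range offsets, and the caller only updates the data length when the load succeeded.

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   2048u
#define OOB_SIZE    64u
#define IOBUF_SIZE  (PAGE_SIZE + OOB_SIZE)

/* Illustrative device state, not the real NAND model. */
struct nand {
    uint8_t  io[IOBUF_SIZE];
    uint32_t iolen;
    uint64_t storage_size;
};

/* Returns false (and loads nothing) when @addr/@offset is out of range. */
static bool blk_load_page(struct nand *s, uint64_t addr, unsigned offset)
{
    if (offset >= IOBUF_SIZE || addr > s->storage_size ||
        s->storage_size - addr < IOBUF_SIZE - offset) {
        return false;
    }
    memset(s->io + offset, 0xff, IOBUF_SIZE - offset);  /* stand-in for the block read */
    return true;
}

static void read_page(struct nand *s, uint64_t addr, unsigned offset)
{
    if (!blk_load_page(s, addr, offset)) {
        return;               /* do not set iolen if the load failed */
    }
    s->iolen = IOBUF_SIZE - offset;
}

int main(void)
{
    struct nand s = { .storage_size = 1u << 20 };
    read_page(&s, 0, IOBUF_SIZE + 8);   /* out-of-range offset: iolen stays 0 */
    return (int)s.iolen;
}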
patchew-importer
pushed a commit
that referenced
this issue
Apr 10, 2024
When the MAC Interface Layer (MIL) transmit FIFO is full, truncate the packet, and raise the Transmitter Error (TXE) flag. Broken since model introduction in commit 2a42499 ("LAN9118 emulation"). When using the reproducer from https://gitlab.com/qemu-project/qemu/-/issues/2267 we get: hw/net/lan9118.c:798:17: runtime error: index 2048 out of bounds for type 'uint8_t[2048]' (aka 'unsigned char[2048]') #0 0x563ec9a057b1 in tx_fifo_push hw/net/lan9118.c:798:43 #1 0x563ec99fbb28 in lan9118_writel hw/net/lan9118.c:1042:9 #2 0x563ec99f2de2 in lan9118_16bit_mode_write hw/net/lan9118.c:1205:9 #3 0x563ecbf78013 in memory_region_write_accessor system/memory.c:497:5 #4 0x563ecbf776f5 in access_with_adjusted_size system/memory.c:573:18 #5 0x563ecbf75643 in memory_region_dispatch_write system/memory.c:1521:16 #6 0x563ecc01bade in flatview_write_continue_step system/physmem.c:2713:18 #7 0x563ecc01b374 in flatview_write_continue system/physmem.c:2743:19 #8 0x563ecbff1c9b in flatview_write system/physmem.c:2774:12 #9 0x563ecbff1768 in address_space_write system/physmem.c:2894:18 ... [*] LAN9118 DS00002266B.pdf, Table 5.3.3 "INTERRUPT STATUS REGISTER" Cc: qemu-stable@nongnu.org Reported-by: Will Lester Reported-by: Chuhong Yuan <hslester96@gmail.com> Suggested-by: Peter Maydell <peter.maydell@linaro.org> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2267 Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20240409133801.23503-3-philmd@linaro.org>
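A small standalone sketch of that behaviour, with made-up field names standing in for the LAN9118 model: when the TX FIFO is already full, the push routine drops the byte and raises an error flag instead of writing past the array.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TX_FIFO_SIZE 2048u

/* Illustrative MIL state; field names are invented for this sketch. */
struct mil {
    uint8_t  fifo[TX_FIFO_SIZE];
    uint32_t len;
    bool     txe;        /* Transmitter Error status flag */
};

static void tx_fifo_push(struct mil *s, uint8_t byte)
{
    if (s->len >= TX_FIFO_SIZE) {
        /* FIFO full: truncate the packet and flag the error
         * instead of writing past fifo[]. */
        s->txe = true;
        return;
    }
    s->fifo[s->len++] = byte;
}

int main(void)
{
    struct mil s = { .len = TX_FIFO_SIZE };   /* already full */
    tx_fifo_push(&s, 0x42);
    printf("TXE=%d, len=%u\n", s.txe, s.len);
    return 0;
}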
patchew-importer
pushed a commit
that referenced
this issue
May 29, 2024
Using -fsanitize=undefined with Clang v18 causes an error if function pointers are casted: qapi/qapi-clone-visitor.c:188:5: runtime error: call to function visit_type_SocketAddress through pointer to incorrect function type 'bool (*)(struct Visitor *, const char *, void **, struct Error **)' /tmp/qemu-ubsan/qapi/qapi-visit-sockets.c:487: note: visit_type_SocketAddress defined here #0 0x5642aa2f7f3b in qapi_clone qapi/qapi-clone-visitor.c:188:5 #1 0x5642aa2c8ce5 in qio_channel_socket_listen_async io/channel-socket.c:285:18 #2 0x5642aa2b8903 in test_io_channel_setup_async tests/unit/test-io-channel-socket.c:116:5 #3 0x5642aa2b8204 in test_io_channel tests/unit/test-io-channel-socket.c:179:9 #4 0x5642aa2b8129 in test_io_channel_ipv4 tests/unit/test-io-channel-socket.c:323:5 ... It also prevents enabling the strict mode of CFI which is currently disabled with -fsanitize-cfi-icall-generalize-pointers. The problematic casts are necessary to pass visit_type_T() and visit_type_T_members() as callbacks to qapi_clone() and qapi_clone_members(), respectively. Open-code these two functions to avoid the callbacks, and thus the type casts. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2346 Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-ID: <20240524-xkb-v4-3-2de564e5c859@daynix.com> [thuth: Improve commit message according to Markus' suggestions] Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20240529105454.1149225-23-thuth@redhat.com>
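To illustrate the problem class (not the actual QAPI code; all names below are invented), calling a function through a pointer of a mismatched type is undefined behaviour and is exactly what UBSan/CFI flags; an open-coded direct call with the exact signature avoids the cast entirely.

#include <stdbool.h>
#include <stdio.h>

struct visitor;               /* opaque, for illustration */
struct error;

/* The generic callback type a hypothetical cloner would expect. */
typedef bool (*visit_fn)(struct visitor *v, const char *name,
                         void **obj, struct error **errp);

/* A concrete visit function with a more specific pointer type... */
struct socket_address { int port; };
static bool visit_socket_address(struct visitor *v, const char *name,
                                 struct socket_address **obj,
                                 struct error **errp)
{
    (void)v; (void)name; (void)errp;
    return *obj != NULL;
}

/* ...so casting it to visit_fn and calling through that pointer would be the
 * incompatible-function-type call the sanitizer complains about.  Open-coding
 * the call keeps every invocation type-correct. */
static bool clone_socket_address(struct visitor *v, struct socket_address **obj)
{
    return visit_socket_address(v, NULL, obj, NULL);
}

int main(void)
{
    struct socket_address sa = { 443 }, *p = &sa;
    printf("cloned ok: %d\n", clone_socket_address(NULL, &p));
    return 0;
}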
patchew-importer
pushed a commit
that referenced
this issue
Jun 18, 2024
Added bounds checking in the aspeed_gpio_read() and aspeed_gpio_write() functions to ensure the index idx is within the valid range of the reg_table array. The correct size of reg_table is determined dynamically based on whether it is aspeed_3_3v_gpios or aspeed_1_8v_gpios. If idx exceeds the size of reg_table, an error is logged, and the function returns. AddressSanitizer log indicating the issue: ==2602930==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a5da29e128 at pc 0x55a5d700dc62 bp 0x7fff096c4e90 sp 0x7fff096c4e88 READ of size 2 at 0x55a5da29e128 thread T0 #0 0x55a5d700dc61 in aspeed_gpio_read hw/gpio/aspeed_gpio.c:564:14 #1 0x55a5d933f3ab in memory_region_read_accessor system/memory.c:445:11 #2 0x55a5d92fba40 in access_with_adjusted_size system/memory.c:573:18 #3 0x55a5d92f842c in memory_region_dispatch_read1 system/memory.c:1426:16 #4 0x55a5d92f7b68 in memory_region_dispatch_read system/memory.c:1459:9 #5 0x55a5d9376ad1 in flatview_read_continue_step system/physmem.c:2836:18 #6 0x55a5d9376399 in flatview_read_continue system/physmem.c:2877:19 #7 0x55a5d93775b8 in flatview_read system/physmem.c:2907:12 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Message-Id: <20240618130928.3075494-1-zheyuma97@gmail.com>
patchew-importer
pushed a commit
that referenced
this issue
Jun 18, 2024
This patch fixes a heap-buffer-overflow issue in the flash_erase function of the m25p80 flash memory emulation. The overflow occurs when the combination of offset and length exceeds the allocated memory for the storage. The patch adds a check to ensure that the erase length does not exceed the storage size and adjusts the length accordingly if necessary. Reproducer: cat << EOF | qemu-system-aarch64 -display none \ -machine accel=qtest, -m 512M -machine kudo-bmc -qtest stdio writeq 0xc0000010 0x6 writel 0xc000000c 0x9 writeq 0xc0000010 0xf27f9412 writeq 0xc000000f 0x2b5cdc26 writeq 0xc000000c 0xffffffffffffffff writeq 0xc000000c 0xffffffffffffffff writeq 0xc000000c 0xffffffffffffffff writel 0xc000000c 0x9 writeq 0xc000000c 0x9 EOF ASan log: ==2614308==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fd3fb7fc000 at pc 0x55aa77a442dc bp 0x7fffaa155900 sp 0x7fffaa1550c8 WRITE of size 65536 at 0x7fd3fb7fc000 thread T0 #0 0x55aa77a442db in __asan_memset llvm/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:26:3 #1 0x55aa77e7e6b3 in flash_erase hw/block/m25p80.c:631:5 #2 0x55aa77e6f8b1 in complete_collecting_data hw/block/m25p80.c:773:9 #3 0x55aa77e6aaa9 in m25p80_transfer8 hw/block/m25p80.c:1550:13 #4 0x55aa78e9a691 in ssi_transfer_raw_default hw/ssi/ssi.c:92:16 #5 0x55aa78e996c0 in ssi_transfer hw/ssi/ssi.c:165:14 #6 0x55aa78e8d76a in npcm7xx_fiu_uma_transaction hw/ssi/npcm7xx_fiu.c:336:9 #7 0x55aa78e8be4b in npcm7xx_fiu_ctrl_write hw/ssi/npcm7xx_fiu.c:428:13 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Message-Id: <20240618152328.3163680-1-zheyuma97@gmail.com>
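A minimal sketch of the bounds check described above, with an invented flash struct and arbitrary sizes: the erase length is clamped so the memset can never run past the backing storage.

#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Illustrative flash state; sizes are arbitrary for the sketch. */
struct flash {
    uint8_t *storage;
    uint32_t size;
};

static void flash_erase(struct flash *s, uint32_t offset, uint32_t len)
{
    if (offset >= s->size) {
        return;                       /* nothing to erase inside the array */
    }
    if (len > s->size - offset) {
        len = s->size - offset;       /* clamp so memset stays in bounds */
    }
    memset(s->storage + offset, 0xff, len);
}

int main(void)
{
    static uint8_t backing[4096];
    struct flash s = { backing, sizeof(backing) };
    flash_erase(&s, 4000, 65536);     /* oversized request is clamped */
    printf("last byte: 0x%02x\n", backing[4095]);
    return 0;
}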
patchew-importer
pushed a commit
that referenced
this issue
Jun 19, 2024
ASan detected a global-buffer-overflow error in the aspeed_gpio_read() function. This issue occurred when reading beyond the bounds of the reg_table. To enhance the safety and maintainability of the Aspeed GPIO code, this commit introduces a reg_table_size member to the AspeedGPIOClass structure. This change ensures that the size of the GPIO register table is explicitly tracked and initialized, reducing the risk of errors if new register tables are introduced in the future. AddressSanitizer log indicating the issue: ==2602930==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a5da29e128 at pc 0x55a5d700dc62 bp 0x7fff096c4e90 sp 0x7fff096c4e88 READ of size 2 at 0x55a5da29e128 thread T0 #0 0x55a5d700dc61 in aspeed_gpio_read hw/gpio/aspeed_gpio.c:564:14 #1 0x55a5d933f3ab in memory_region_read_accessor system/memory.c:445:11 #2 0x55a5d92fba40 in access_with_adjusted_size system/memory.c:573:18 #3 0x55a5d92f842c in memory_region_dispatch_read1 system/memory.c:1426:16 #4 0x55a5d92f7b68 in memory_region_dispatch_read system/memory.c:1459:9 #5 0x55a5d9376ad1 in flatview_read_continue_step system/physmem.c:2836:18 #6 0x55a5d9376399 in flatview_read_continue system/physmem.c:2877:19 #7 0x55a5d93775b8 in flatview_read system/physmem.c:2907:12 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Message-Id: <20240619073301.3706650-1-zheyuma97@gmail.com>
patchew-importer
pushed a commit
that referenced
this issue
Jun 19, 2024
ASan detected a global-buffer-overflow error in the aspeed_gpio_read() function. This issue occurred when reading beyond the bounds of the reg_table. To enhance the safety and maintainability of the Aspeed GPIO code, this commit introduces a reg_table_size member to the AspeedGPIOClass structure. This change ensures that the size of the GPIO register table is explicitly tracked and initialized, reducing the risk of errors if new register tables are introduced in the future. Reproducer: cat << EOF | qemu-system-aarch64 -display none \ -machine accel=qtest, -m 512M -machine ast1030-evb -qtest stdio readq 0x7e780272 EOF ASAN log indicating the issue: ==2602930==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a5da29e128 at pc 0x55a5d700dc62 bp 0x7fff096c4e90 sp 0x7fff096c4e88 READ of size 2 at 0x55a5da29e128 thread T0 #0 0x55a5d700dc61 in aspeed_gpio_read hw/gpio/aspeed_gpio.c:564:14 #1 0x55a5d933f3ab in memory_region_read_accessor system/memory.c:445:11 #2 0x55a5d92fba40 in access_with_adjusted_size system/memory.c:573:18 #3 0x55a5d92f842c in memory_region_dispatch_read1 system/memory.c:1426:16 #4 0x55a5d92f7b68 in memory_region_dispatch_read system/memory.c:1459:9 #5 0x55a5d9376ad1 in flatview_read_continue_step system/physmem.c:2836:18 #6 0x55a5d9376399 in flatview_read_continue system/physmem.c:2877:19 #7 0x55a5d93775b8 in flatview_read system/physmem.c:2907:12 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au> Message-Id: <20240619183638.4073070-1-zheyuma97@gmail.com>
patchew-importer
pushed a commit
that referenced
this issue
Jun 21, 2024
ASan detected a global-buffer-overflow error in the aspeed_gpio_read() function. This issue occurred when reading beyond the bounds of the reg_table. To enhance the safety and maintainability of the Aspeed GPIO code, this commit introduces a reg_table_count member to the AspeedGPIOClass structure. This change ensures that the size of the GPIO register table is explicitly tracked and initialized, reducing the risk of errors if new register tables are introduced in the future. Reproducer: cat << EOF | qemu-system-aarch64 -display none \ -machine accel=qtest, -m 512M -machine ast1030-evb -qtest stdio readq 0x7e780272 EOF ASAN log indicating the issue: ==2602930==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a5da29e128 at pc 0x55a5d700dc62 bp 0x7fff096c4e90 sp 0x7fff096c4e88 READ of size 2 at 0x55a5da29e128 thread T0 #0 0x55a5d700dc61 in aspeed_gpio_read hw/gpio/aspeed_gpio.c:564:14 #1 0x55a5d933f3ab in memory_region_read_accessor system/memory.c:445:11 #2 0x55a5d92fba40 in access_with_adjusted_size system/memory.c:573:18 #3 0x55a5d92f842c in memory_region_dispatch_read1 system/memory.c:1426:16 #4 0x55a5d92f7b68 in memory_region_dispatch_read system/memory.c:1459:9 #5 0x55a5d9376ad1 in flatview_read_continue_step system/physmem.c:2836:18 #6 0x55a5d9376399 in flatview_read_continue system/physmem.c:2877:19 #7 0x55a5d93775b8 in flatview_read system/physmem.c:2907:12 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240620140239.375338-1-zheyuma97@gmail.com>
patchew-importer
pushed a commit
that referenced
this issue
Jun 25, 2024
For multi-byte commands, our implementation uses the @data_start and @data_offset fields to track byte access. We initialize the command start/offset in the buffer once. A malicious guest might abuse this by switching commands while staying in the 'transfer' state: the command buffer size changes, and our implementation can access beyond the buffer boundary. For example, CMD17 (READ_SINGLE_BLOCK) allows reading up to 512 bytes, and CMD13 (SEND_STATUS) up to 64 bytes. By switching from CMD17 to CMD13 (see the reproducer below), bytes [64-511] are out of the 'status' buffer. Our implementation returns the R0 status code for unexpected commands. Such an in-transaction command switch is unexpected and returns R0; this is a good place to reset the start/offset fields to avoid malicious accesses. Can be reproduced by running: $ export UBSAN_OPTIONS=print_stacktrace=1:halt_on_error=1 $ cat << EOF | qemu-system-i386 \ -display none -nographic \ -machine accel=qtest -m 512M \ -nodefaults \ -device sdhci-pci,sd-spec-version=3 \ -device sd-card,drive=mydrive \ -drive if=none,index=0,file=null-co://,format=raw,id=mydrive \ -qtest stdio -trace sd\* -trace -sdbus_read outl 0xcf8 0x80001010 outl 0xcfc 0xe0000000 outl 0xcf8 0x80001004 outw 0xcfc 0x02 write 0xe000002c 0x1 0x05 write 0xe000000f 0x1 0x37 write 0xe000000a 0x1 0x01 write 0xe000000f 0x1 0x29 write 0xe000000f 0x1 0x02 write 0xe000000f 0x1 0x03 write 0xe000000c 0x1 0x32 write 0xe000000f 0x1 0x06 write 0xe0000005 0x1 0x01 write 0xe0000007 0x1 0x01 write 0xe0000003 0x1 0x00 write 0xe000000f 0x1 0x11 write 0xe000002a 0x1 0x01 write 0xe000002a 0x1 0x02 write 0xe000000f 0x1 0x0d write 0xe000002a 0x1 0x01 write 0xe000002a 0x1 0x02 EOF hw/sd/sd.c:1984:15: runtime error: index 256 out of bounds for type 'uint8_t [64]' #0 sd_read_byte hw/sd/sd.c:1984:15 #1 sdbus_read_data hw/sd/core.c:157:23 #2 sdhci_read_block_from_card hw/sd/sdhci.c:423:9 #3 sdhci_blkgap_write hw/sd/sdhci.c:1074:13 #4 sdhci_write hw/sd/sdhci.c:1195:13 #5 memory_region_write_accessor softmmu/memory.c:492:5 #6 access_with_adjusted_size softmmu/memory.c:554:18 #7 memory_region_dispatch_write softmmu/memory.c #8 flatview_write_continue softmmu/physmem.c:2778:23 #9 flatview_write softmmu/physmem.c:2818:14 #10 address_space_write softmmu/physmem.c:2910:18 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior hw/sd/sd.c:1984:15 Reported-by: Alexander Bulekov <alxndr@bu.edu> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/487 Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=36240 Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240408141717.66154-2-philmd@linaro.org>
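A standalone sketch of the reset-on-command-switch idea described above. The struct, the command numbers and the buffer sizes are stand-ins for illustration only: when a different command arrives mid-transfer, the tracked start/offset is cleared, so later byte reads are bounded by the new (possibly smaller) buffer.

#include <stdint.h>
#include <stdio.h>

#define BLOCK_LEN   512u
#define STATUS_LEN  64u

/* Illustrative card state; only what the sketch needs. */
struct sd {
    uint8_t  current_cmd;
    uint32_t data_start;
    uint32_t data_offset;
    uint8_t  status[STATUS_LEN];
    uint8_t  block[BLOCK_LEN];
};

static void sd_switch_command(struct sd *s, uint8_t cmd)
{
    if (cmd != s->current_cmd) {
        /* Unexpected in-transaction command switch: answer R0 elsewhere and,
         * crucially, forget the previous transfer's position so the next
         * read starts from byte 0 of the new buffer. */
        s->data_start = 0;
        s->data_offset = 0;
        s->current_cmd = cmd;
    }
}

static uint8_t sd_read_byte(struct sd *s)
{
    uint32_t limit = (s->current_cmd == 13) ? STATUS_LEN : BLOCK_LEN;
    if (s->data_offset >= limit) {
        return 0xff;                       /* never index past the buffer */
    }
    const uint8_t *buf = (s->current_cmd == 13) ? s->status : s->block;
    return buf[s->data_offset++];
}

int main(void)
{
    struct sd s = { .current_cmd = 17, .data_offset = 200 };
    sd_switch_command(&s, 13);             /* CMD17 -> CMD13 mid-transfer */
    printf("offset after switch: %u\n", s.data_offset);   /* 0, not 200 */
    (void)sd_read_byte(&s);
    return 0;
}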
patchew-importer
pushed a commit
that referenced
this issue
Jul 1, 2024
The musb_reset function was causing a memory leak by not properly freeing the memory associated with USBPacket instances before reinitializing them. This commit addresses the memory leak by adding calls to usb_packet_cleanup for each USBPacket instance before reinitializing them with usb_packet_init. Asan log: =2970623==ERROR: LeakSanitizer: detected memory leaks Direct leak of 256 byte(s) in 16 object(s) allocated from: #0 0x561e20629c3d in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 #1 0x7fee91885738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) #2 0x561e21b4d0e1 in usb_packet_init hw/usb/core.c:531:5 #3 0x561e21c5016b in musb_reset hw/usb/hcd-musb.c:372:9 #4 0x561e21c502a9 in musb_init hw/usb/hcd-musb.c:385:5 #5 0x561e21c893ef in tusb6010_realize hw/usb/tusb6010.c:827:15 #6 0x561e23443355 in device_set_realized hw/core/qdev.c:510:13 #7 0x561e2346ac1b in property_set_bool qom/object.c:2354:5 #8 0x561e23463895 in object_property_set qom/object.c:1463:5 #9 0x561e23477909 in object_property_set_qobject qom/qom-qobject.c:28:10 #10 0x561e234645ed in object_property_set_bool qom/object.c:1533:15 #11 0x561e2343c830 in qdev_realize hw/core/qdev.c:291:12 #12 0x561e2343c874 in qdev_realize_and_unref hw/core/qdev.c:298:11 #13 0x561e20ad5091 in sysbus_realize_and_unref hw/core/sysbus.c:261:12 #14 0x561e22553283 in n8x0_usb_setup hw/arm/nseries.c:800:5 #15 0x561e2254e99b in n8x0_init hw/arm/nseries.c:1356:5 #16 0x561e22561170 in n810_init hw/arm/nseries.c:1418:5 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com> Message-Id: <20240630163225.2973081-1-zheyuma97@gmail.com>
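A minimal sketch of the cleanup-before-reinit pattern this fix applies, with stand-in packet helpers instead of the real USBPacket API: if reset re-initializes packets that were already initialized, the previous allocations must be released first or they leak.

#include <stdlib.h>

#define NUM_EP 16

/* Stand-ins for USBPacket and its init/cleanup helpers. */
struct packet {
    void *iov;           /* heap allocation owned by the packet */
};

static void packet_init(struct packet *p)    { p->iov = malloc(16); }
static void packet_cleanup(struct packet *p) { free(p->iov); p->iov = NULL; }

struct musb {
    struct packet ep[NUM_EP];
    int initialized;
};

static void musb_reset(struct musb *s)
{
    for (int i = 0; i < NUM_EP; i++) {
        if (s->initialized) {
            packet_cleanup(&s->ep[i]);   /* free before re-init, or it leaks */
        }
        packet_init(&s->ep[i]);
    }
    s->initialized = 1;
}

int main(void)
{
    struct musb s = { 0 };
    musb_reset(&s);      /* first init */
    musb_reset(&s);      /* reset: old allocations are released first */
    for (int i = 0; i < NUM_EP; i++) {
        packet_cleanup(&s.ep[i]);
    }
    return 0;
}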
patchew-importer
pushed a commit
that referenced
this issue
Jul 2, 2024
ASan detected a global-buffer-overflow error in the aspeed_gpio_read() function. This issue occurred when reading beyond the bounds of the reg_table. To enhance the safety and maintainability of the Aspeed GPIO code, this commit introduces a reg_table_count member to the AspeedGPIOClass structure. This change ensures that the size of the GPIO register table is explicitly tracked and initialized, reducing the risk of errors if new register tables are introduced in the future. Reproducer: cat << EOF | qemu-system-aarch64 -display none \ -machine accel=qtest, -m 512M -machine ast1030-evb -qtest stdio readq 0x7e780272 EOF ASAN log indicating the issue: ==2602930==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a5da29e128 at pc 0x55a5d700dc62 bp 0x7fff096c4e90 sp 0x7fff096c4e88 READ of size 2 at 0x55a5da29e128 thread T0 #0 0x55a5d700dc61 in aspeed_gpio_read hw/gpio/aspeed_gpio.c:564:14 #1 0x55a5d933f3ab in memory_region_read_accessor system/memory.c:445:11 #2 0x55a5d92fba40 in access_with_adjusted_size system/memory.c:573:18 #3 0x55a5d92f842c in memory_region_dispatch_read1 system/memory.c:1426:16 #4 0x55a5d92f7b68 in memory_region_dispatch_read system/memory.c:1459:9 #5 0x55a5d9376ad1 in flatview_read_continue_step system/physmem.c:2836:18 #6 0x55a5d9376399 in flatview_read_continue system/physmem.c:2877:19 #7 0x55a5d93775b8 in flatview_read system/physmem.c:2907:12 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2355 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au> Message-Id: <20240702080042.464220-2-clg@redhat.com>
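A compact sketch of the bounds check this series of commits describes, with invented types standing in for AspeedGPIOClass: each variant carries its register table together with an explicit element count, and the read path rejects any index at or beyond that count.

#include <stdint.h>
#include <stdio.h>

/* Illustrative "class" data: each variant carries its table and its size. */
struct gpio_class {
    const uint32_t *reg_table;
    unsigned        reg_table_count;
};

struct gpio {
    const struct gpio_class *cls;
};

static uint32_t gpio_read(struct gpio *s, unsigned idx)
{
    const struct gpio_class *c = s->cls;
    if (idx >= c->reg_table_count) {
        fprintf(stderr, "gpio: no register at table index %u\n", idx);
        return 0;                          /* out of range: refuse the access */
    }
    return c->reg_table[idx];
}

int main(void)
{
    static const uint32_t table_1_8v[4] = { 1, 2, 3, 4 };
    static const struct gpio_class cls = { table_1_8v, 4 };
    struct gpio s = { &cls };
    printf("%u %u\n", (unsigned)gpio_read(&s, 2), (unsigned)gpio_read(&s, 99));
    return 0;
}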
patchew-importer
pushed a commit
that referenced
this issue
Jul 2, 2024
…te_asym_session The assignment of the op_code in the virtio_crypto_create_asym_session function was moved before its usage to ensure it is correctly set. Previously, if the function failed during the key_len check, the op_code did not have a proper value, causing virtio_crypto_free_create_session_req to not free the memory correctly, leading to a memory leak. By setting the op_code before performing any checks, we ensure that virtio_crypto_free_create_session_req has the correct context to perform cleanup operations properly, thus preventing memory leaks. ASAN log: ==3055068==ERROR: LeakSanitizer: detected memory leaks Direct leak of 512 byte(s) in 1 object(s) allocated from: #0 0x5586a75e6ddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 #1 0x7fb6b63b6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) #2 0x5586a864bbde in virtio_crypto_handle_ctrl hw/virtio/virtio-crypto.c:407:19 #3 0x5586a94fc84c in virtio_queue_notify_vq hw/virtio/virtio.c:2277:9 #4 0x5586a94fc0a2 in virtio_queue_host_notifier_read hw/virtio/virtio.c:3641:9 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Message-Id: <20240702210227.3059947-1-zheyuma97@gmail.com>
patchew-importer
pushed a commit
that referenced
this issue
Jul 2, 2024
…te_asym_session Currently, if the function fails during the key_len check, the op_code does not have a proper value, causing virtio_crypto_free_create_session_req not to free the memory correctly, leading to a memory leak. By setting the op_code before performing any checks, we ensure that virtio_crypto_free_create_session_req has the correct context to perform cleanup operations properly, thus preventing memory leaks. ASAN log: ==3055068==ERROR: LeakSanitizer: detected memory leaks Direct leak of 512 byte(s) in 1 object(s) allocated from: #0 0x5586a75e6ddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 #1 0x7fb6b63b6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) #2 0x5586a864bbde in virtio_crypto_handle_ctrl hw/virtio/virtio-crypto.c:407:19 #3 0x5586a94fc84c in virtio_queue_notify_vq hw/virtio/virtio.c:2277:9 #4 0x5586a94fc0a2 in virtio_queue_host_notifier_read hw/virtio/virtio.c:3641:9 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Message-Id: <20240702211835.3064505-1-zheyuma97@gmail.com>
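To make the ordering point concrete, here is a self-contained sketch (invented names, not the virtio-crypto code): the request's op_code is recorded before any validation, so the op_code-dispatched cleanup can still free what the request owns even when an early check fails.

#include <stddef.h>
#include <stdlib.h>
#include <stdint.h>

enum { OP_RSA = 1 };

/* Illustrative request: the key buffer is owned by the request. */
struct create_session_req {
    uint32_t op_code;
    uint8_t *key;
};

static void free_create_session_req(struct create_session_req *r)
{
    switch (r->op_code) {
    case OP_RSA:
        free(r->key);        /* cleanup dispatches on op_code */
        r->key = NULL;
        break;
    default:
        break;               /* unknown op_code: nothing freed -> leak */
    }
}

static int create_asym_session(struct create_session_req *r, size_t key_len)
{
    /* Set op_code before any check, so every error path below can hand the
     * request to the op_code-dispatched cleanup and actually free the key. */
    r->op_code = OP_RSA;
    if (key_len == 0 || key_len > 8192) {
        free_create_session_req(r);
        return -1;
    }
    /* ... import the key and create the session ... */
    free_create_session_req(r);          /* normal completion also frees */
    return 0;
}

int main(void)
{
    struct create_session_req r = { 0, malloc(64) };
    return create_asym_session(&r, 0) == -1 ? 0 : 1;   /* fails, key is freed */
}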
patchew-importer
pushed a commit
that referenced
this issue
Jul 10, 2024
The allocated memory to hold LBA ranges leaks in the nvme_dsm function. This happens because the allocated memory for iocb->range is not freed in all error handling paths. Fix this by adding a free to ensure that the allocated memory is properly freed. ASAN log: ==3075137==ERROR: LeakSanitizer: detected memory leaks Direct leak of 480 byte(s) in 6 object(s) allocated from: #0 0x55f1f8a0eddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 #1 0x7f531e0f6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) #2 0x55f1faf1f091 in blk_aio_get block/block-backend.c:2583:12 #3 0x55f1f945c74b in nvme_dsm hw/nvme/ctrl.c:2609:30 #4 0x55f1f945831b in nvme_io_cmd hw/nvme/ctrl.c:4470:16 #5 0x55f1f94561b7 in nvme_process_sq hw/nvme/ctrl.c:7039:29 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com> Message-Id: <20240702231303.3077584-1-zheyuma97@gmail.com>
patchew-importer
pushed a commit
that referenced
this issue
Jul 23, 2024
The allocated memory to hold LBA ranges leaks in the nvme_dsm function. This happens because the allocated memory for iocb->range is not freed in all error handling paths. Fix this by adding a free to ensure that the allocated memory is properly freed. ASAN log: ==3075137==ERROR: LeakSanitizer: detected memory leaks Direct leak of 480 byte(s) in 6 object(s) allocated from: #0 0x55f1f8a0eddd in malloc llvm/compiler-rt/lib/asan/asan_malloc_linux.cpp:129:3 #1 0x7f531e0f6738 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5e738) #2 0x55f1faf1f091 in blk_aio_get block/block-backend.c:2583:12 #3 0x55f1f945c74b in nvme_dsm hw/nvme/ctrl.c:2609:30 #4 0x55f1f945831b in nvme_io_cmd hw/nvme/ctrl.c:4470:16 #5 0x55f1f94561b7 in nvme_process_sq hw/nvme/ctrl.c:7039:29 Cc: qemu-stable@nongnu.org Fixes: d7d1474 ("hw/nvme: reimplement dsm to allow cancellation") Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
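A small sketch of the pattern behind this fix, with an invented request context instead of the real NVMe iocb: the range list allocated at the start of the request must also be released on every early error return, not only on the successful completion path.

#include <stdint.h>
#include <stdlib.h>

/* Illustrative DSM request context. */
struct dsm_iocb {
    uint32_t *range;     /* LBA range list copied from the guest */
    unsigned  nr;
};

static int dsm_start(struct dsm_iocb *iocb, unsigned nr, int simulate_error)
{
    iocb->nr = nr;
    iocb->range = calloc(nr, sizeof(*iocb->range));
    if (!iocb->range) {
        return -1;
    }
    if (simulate_error) {
        /* Error path: release the range list here too, otherwise the
         * allocation above is only freed on the successful path. */
        free(iocb->range);
        iocb->range = NULL;
        return -1;
    }
    /* ... kick off the asynchronous discard for each range ... */
    free(iocb->range);   /* normal completion frees it as well */
    iocb->range = NULL;
    return 0;
}

int main(void)
{
    struct dsm_iocb iocb = { 0 };
    return dsm_start(&iocb, 6, 1) == -1 ? 0 : 1;
}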