Commit 528f449

Roman Pen authored and stefanhaRH committed
coroutine-lock: do not touch coroutine after another one has been entered
Submission of requests on linux aio is a bit tricky and can lead to
requests completions on submission path:

    44713c9 ("linux-aio: Handle io_submit() failure gracefully")
    0ed93d8 ("linux-aio: process completions from ioq_submit()")

That means that any coroutine which has been yielded in order to wait
for completion can be resumed from submission path and be eventually
terminated (freed).

The following use-after-free crash was observed when IO throttling
was enabled:

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7f5813dff700 (LWP 56417)]
    virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
    (gdb) bt
    #0  virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
                                 ^^^^^^^^^^^^^^
                                 remember the address

    #1  virtqueue_fill (vq=0x5598b20d21b0, elem=0x7f5804009a30, len=1, idx=0) at virtio.c:282
    #2  virtqueue_push (vq=0x5598b20d21b0, elem=elem@entry=0x7f5804009a30, len=<optimized out>) at virtio.c:308
    #3  virtio_blk_req_complete (req=req@entry=0x7f5804009a30, status=status@entry=0 '\000') at virtio-blk.c:61
    #4  virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at virtio-blk.c:126
    #5  blk_aio_complete (acb=0x7f58040068d0) at block-backend.c:923
    #6  coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:78

    (gdb) p * elem
    $8 = {index = 77, out_num = 2, in_num = 1,
          in_addr = 0x7f5804009ad8, out_addr = 0x7f5804009ae0,
          in_sg = 0x0, out_sg = 0x7f5804009a50}
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
          'in_sg' and 'out_sg' are invalid.
          e.g. it is impossible that 'in_sg' is zero,
          instead its value must be equal to:

    (gdb) p/x 0x7f5804009ad8 + sizeof(elem->in_addr[0]) + 2 * sizeof(elem->out_addr[0])
    $26 = 0x7f5804009af0

Seems 'elem' was corrupted. Meanwhile another thread raised an abort:

    Thread 12 (Thread 0x7f57f2ffd700 (LWP 56426)):
    #0  raise () from /lib/x86_64-linux-gnu/libc.so.6
    #1  abort () from /lib/x86_64-linux-gnu/libc.so.6
    #2  qemu_coroutine_enter (co=0x7f5804009af0) at qemu-coroutine.c:113
    #3  qemu_co_queue_run_restart (co=0x7f5804009a30) at qemu-coroutine-lock.c:60
    #4  qemu_coroutine_enter (co=0x7f5804009a30) at qemu-coroutine.c:119
                              ^^^^^^^^^^^^^^^^^^
                              WTF?? this is equal to elem from crashed thread

    #5  qemu_co_queue_run_restart (co=0x7f57e7f16ae0) at qemu-coroutine-lock.c:60
    #6  qemu_coroutine_enter (co=0x7f57e7f16ae0) at qemu-coroutine.c:119
    #7  qemu_co_queue_run_restart (co=0x7f5807e112a0) at qemu-coroutine-lock.c:60
    #8  qemu_coroutine_enter (co=0x7f5807e112a0) at qemu-coroutine.c:119
    #9  qemu_co_queue_run_restart (co=0x7f5807f17820) at qemu-coroutine-lock.c:60
    #10 qemu_coroutine_enter (co=0x7f5807f17820) at qemu-coroutine.c:119
    #11 qemu_co_queue_run_restart (co=0x7f57e7f18e10) at qemu-coroutine-lock.c:60
    #12 qemu_coroutine_enter (co=0x7f57e7f18e10) at qemu-coroutine.c:119
    #13 qemu_co_enter_next (queue=queue@entry=0x5598b1e742d0) at qemu-coroutine-lock.c:106
    #14 timer_cb (blk=0x5598b1e74280, is_write=<optimized out>) at throttle-groups.c:419

Crash can be explained by access of 'co' object from the loop inside
qemu_co_queue_run_restart():

    while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
        QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
                             ^^^^^^^^^^^^^^^^^^^^
                             on each iteration 'co' is accessed,
                             but 'co' can be already freed

        qemu_coroutine_enter(next);
    }

When 'next' coroutine is resumed (entered) it can in its turn resume
'co', and eventually free it. That's why we see 'co' (which was freed)
has the same address as 'elem' from the first backtrace.

The fix is obvious: use temporary queue and do not touch coroutine after
first qemu_coroutine_enter() is invoked.

The issue is quite rare and happens every ~12 hours on very high IO
and CPU load (building linux kernel with -j512 inside guest) when IO
throttling is enabled. With the fix applied guest is running ~35 hours
and is still alive so far.

Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170601160847.23720-1-roman.penyaev@profitbricks.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

1 parent 3a586d2 commit 528f449

2 files changed: +22 -2 lines changed

util/qemu-coroutine-lock.c

Lines changed: 17 additions & 2 deletions
@@ -77,10 +77,25 @@ void coroutine_fn qemu_co_queue_wait(CoQueue *queue, CoMutex *mutex)
 void qemu_co_queue_run_restart(Coroutine *co)
 {
     Coroutine *next;
+    QSIMPLEQ_HEAD(, Coroutine) tmp_queue_wakeup =
+        QSIMPLEQ_HEAD_INITIALIZER(tmp_queue_wakeup);
 
     trace_qemu_co_queue_run_restart(co);
-    while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
-        QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
+
+    /* Because "co" has yielded, any coroutine that we wakeup can resume it.
+     * If this happens and "co" terminates, co->co_queue_wakeup becomes
+     * invalid memory.  Therefore, use a temporary queue and do not touch
+     * the "co" coroutine as soon as you enter another one.
+     *
+     * In its turn resumed "co" can pupulate "co_queue_wakeup" queue with
+     * new coroutines to be woken up.  The caller, who has resumed "co",
+     * will be responsible for traversing the same queue, which may cause
+     * a different wakeup order but not any missing wakeups.
+     */
+    QSIMPLEQ_CONCAT(&tmp_queue_wakeup, &co->co_queue_wakeup);
+
+    while ((next = QSIMPLEQ_FIRST(&tmp_queue_wakeup))) {
+        QSIMPLEQ_REMOVE_HEAD(&tmp_queue_wakeup, co_queue_next);
         qemu_coroutine_enter(next);
     }
 }
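
The temporary-queue pattern from this hunk can be tried outside QEMU. The following is only a minimal sketch under stated assumptions: `Owner`, `Waiter`, `WaitQueue`, the `queue_*` helpers, `enter_waiter()` and `run_demo()` are hypothetical stand-ins invented for illustration and are not QEMU APIs. `Owner` plays the role of "co", and entering the first waiter frees it, which mimics "co" being resumed and terminated while its wakeup list is still being drained.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Waiter Waiter;
typedef struct Owner Owner;

/* Singly linked tail queue, similar in spirit to QSIMPLEQ (hypothetical). */
typedef struct WaitQueue {
    Waiter *first;
    Waiter **last;      /* address of the final "next" pointer */
} WaitQueue;

struct Waiter {
    int id;
    Owner **frees;      /* if non-NULL, entering this waiter frees *frees */
    Waiter *next;
};

struct Owner {
    WaitQueue wakeup;   /* stands in for co->co_queue_wakeup */
};

static void queue_init(WaitQueue *q)
{
    q->first = NULL;
    q->last = &q->first;
}

static void queue_push(WaitQueue *q, Waiter *w)
{
    w->next = NULL;
    *q->last = w;
    q->last = &w->next;
}

static Waiter *queue_pop(WaitQueue *q)
{
    Waiter *w = q->first;

    if (w) {
        q->first = w->next;
        if (!q->first) {
            q->last = &q->first;
        }
    }
    return w;
}

/* Move every element of src to the tail of dst, leaving src empty. */
static void queue_concat(WaitQueue *dst, WaitQueue *src)
{
    if (src->first) {
        *dst->last = src->first;
        dst->last = src->last;
        queue_init(src);
    }
}

/* Stand-in for entering a coroutine: may free the owner as a side effect. */
static void enter_waiter(Waiter *w)
{
    printf("entered waiter %d\n", w->id);
    if (w->frees && *w->frees) {
        free(*w->frees);
        *w->frees = NULL;
    }
}

/* Drain the owner's wakeup list; returns the number of waiters entered. */
static int run_restart(Owner *owner)
{
    WaitQueue tmp;
    Waiter *next;
    int entered = 0;

    queue_init(&tmp);
    queue_concat(&tmp, &owner->wakeup);
    /* From here on "owner" is never dereferenced again. */

    while ((next = queue_pop(&tmp))) {
        enter_waiter(next);
        entered++;
    }
    return entered;
}

/* Hypothetical driver: returns waiters entered, or -1 if malloc fails. */
int run_demo(void)
{
    Owner *owner = malloc(sizeof(*owner));
    Waiter w[3] = { { 1, &owner, NULL }, { 2, NULL, NULL }, { 3, NULL, NULL } };
    int i, entered;

    if (!owner) {
        return -1;
    }
    queue_init(&owner->wakeup);
    for (i = 0; i < 3; i++) {
        queue_push(&owner->wakeup, &w[i]);
    }

    entered = run_restart(owner);
    assert(owner == NULL);  /* waiter 1 freed the owner mid-drain */
    return entered;
}
```

Calling `run_demo()` from a `main()` enters all three waiters even though the owner is freed by the first one. If the loop instead popped from `owner->wakeup` on every iteration, the second iteration would read freed memory, which is the pattern the removed lines had.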

util/qemu-coroutine.c

Lines changed: 5 additions & 0 deletions
@@ -126,6 +126,11 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co)
 
     qemu_co_queue_run_restart(co);
 
+    /* Beware, if ret == COROUTINE_YIELD and qemu_co_queue_run_restart()
+     * has started any other coroutine, "co" might have been reentered
+     * and even freed by now!  So be careful and do not touch it.
+     */
+
     switch (ret) {
     case COROUTINE_YIELD:
         return;
