You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
btrfs: delay blkdev_put until after the device remove
BugLink: https://bugs.launchpad.net/bugs/1949397
[ Upstream commit 3fa421d ]
When removing the device we call blkdev_put() on the device once we've
removed it, and because we have an EXCL open we need to take the
->open_mutex on the block device to clean it up. Unfortunately during
device remove we are holding the sb writers lock, which results in the
following lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc2+ #407 Not tainted
------------------------------------------------------
losetup/11595 is trying to acquire lock:
ffff973ac35dd138 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock:
ffff973ac9812c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
lo_open+0x28/0x60 [loop]
blkdev_get_whole+0x25/0xf0
blkdev_get_by_dev.part.0+0x168/0x3c0
blkdev_open+0xd2/0xe0
do_dentry_open+0x161/0x390
path_openat+0x3cc/0xa20
do_filp_open+0x96/0x120
do_sys_openat2+0x7b/0x130
__x64_sys_openat+0x46/0x70
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}:
__mutex_lock+0x7d/0x750
blkdev_put+0x3a/0x220
btrfs_rm_device.cold+0x62/0xe5
btrfs_ioctl+0x2a31/0x2e70
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}:
lo_write_bvec+0xc2/0x240 [loop]
loop_process_work+0x238/0xd00 [loop]
process_one_work+0x26b/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
process_one_work+0x245/0x560
worker_thread+0x55/0x3c0
kthread+0x140/0x160
ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}:
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
flush_workqueue+0x91/0x5e0
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of:
(wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&lo->lo_mutex);
lock(&disk->open_mutex);
lock(&lo->lo_mutex);
lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11595:
#0: ffff973ac9812c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace:
CPU: 0 PID: 11595 Comm: losetup Not tainted 5.14.0-rc2+ #407
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack_lvl+0x57/0x72
check_noncircular+0xcf/0xf0
? stack_trace_save+0x3b/0x50
__lock_acquire+0x10ea/0x1d90
lock_acquire+0xb5/0x2b0
? flush_workqueue+0x67/0x5e0
? lockdep_init_map_type+0x47/0x220
flush_workqueue+0x91/0x5e0
? flush_workqueue+0x67/0x5e0
? verify_cpu+0xf0/0x100
drain_workqueue+0xa0/0x110
destroy_workqueue+0x36/0x250
__loop_clr_fd+0x9a/0x660 [loop]
? blkdev_ioctl+0x8d/0x2a0
block_ioctl+0x3f/0x50
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fc21255d4cb
So instead save the bdev and do the put once we've dropped the sb
writers lock in order to avoid the lockdep recursion.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
0 commit comments