Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
module: Prevent recursion bug caused by module RCU check
The stack tracer triggered the assert in module_assert_mutex_or_preempt(), which checks that rcu_sched is used as protection, as __module_address() is protect with a synchronize_sched(). The problem with this, is that the WARN_ON() triggers a stack dump which also calls __module_address() and triggers the same warning. All I had to debug this was the following over and over. ------------[ cut here ]------------ WARNING: CPU: 5 PID: 0 at /home/rostedt/work/git/linux-trace.git/kernel/module.c:291 module_assert_mutex_or_preempt+0xa0/0xb0() Modules linked in: be2iscsi iscsi_boot_sysfs bnx2i CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.2.0-test+ torvalds#138 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 ffff88011945ba78 ffff88011945ba78 ffffffff8184d08f 0000000000000000 0000000000000000 ffff88011945bab8 ffffffff810ac1b7 ffff88011945baa8 0000000000000001 ffffffffa0080077 ffffffff81a04410 0000000000000000 Call Trace: [<ffffffff8184d08f>] dump_stack+0x4f/0xa2 [<ffffffff810ac1b7>] warn_slowpath_common+0x97/0xe0 ------------[ cut here ]------------ WARNING: CPU: 5 PID: 0 at /home/rostedt/work/git/linux-trace.git/kernel/module.c:291 module_assert_mutex_or_preempt+0xa0/0xb0() Modules linked in: be2iscsi iscsi_boot_sysfs bnx2i CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.2.0-test+ torvalds#138 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 ffff88011945b768 ffff88011945b768 ffffffff8184d08f 0000000000000000 0000000000000000 ffff88011945b7a8 ffffffff810ac1b7 ffff88011945b798 0000000000000001 ffffffffa0080077 ffffffff81a02e10 0000000000000000 Call Trace: [<ffffffff8184d08f>] dump_stack+0x4f/0xa2 [<ffffffff810ac1b7>] warn_slowpath_common+0x97/0xe0 ------------[ cut here ]------------ WARNING: CPU: 5 PID: 0 at /home/rostedt/work/git/linux-trace.git/kernel/module.c:291 module_assert_mutex_or_preempt+0xa0/0xb0() Modules linked in: be2iscsi iscsi_boot_sysfs bnx2i CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.2.0-test+ torvalds#138 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 ffff88011945b458 ffff88011945b458 ffffffff8184d08f 0000000000000000 0000000000000000 ffff88011945b498 ffffffff810ac1b7 ffff88011945b488 0000000000000001 ffffffffa0080077 ffffffff81a02e10 0000000000000000 I first tried to switch the WARN_ON() to a WARN_ON_ONCE() but that didn't help because the flag within WARN_ON_ONCE() is set after the stack dump, which means that it didn't solve the issue. I just decided to open code the "once" logic, and now I get: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at /home/rostedt/work/git/linux-trace.git/kernel/module.c:304 module_assert_mutex_or_preempt+0x84/0xa0() Modules linked in: be2iscsi iscsi_boot_sysfs bnx2i CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.3.0-rc6-test+ torvalds#185 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 ffffffff81e03a48 ffffffff81e03a48 ffffffff81445d3f 0000000000000000 0000000000000000 ffffffff81e03a88 ffffffff810ae407 ffffffff81e03a78 0000000000000000 ffffffffa008b077 ffffffff81a04550 0000000000000000 Call Trace: [<ffffffff81445d3f>] dump_stack+0x4f/0xb0 [<ffffffff810ae407>] warn_slowpath_common+0x97/0xe0 [<ffffffffa008b077>] ? 0xffffffffa008b077 [<ffffffffa008b077>] ? 0xffffffffa008b077 [<ffffffff810ae46a>] warn_slowpath_null+0x1a/0x20 [<ffffffff81155ed4>] module_assert_mutex_or_preempt+0x84/0xa0 [<ffffffff81155f28>] ? __module_address+0x38/0x220 [<ffffffffa008b077>] ? 0xffffffffa008b077 [<ffffffff81155f28>] __module_address+0x38/0x220 [<ffffffff81155ef5>] ? __module_address+0x5/0x220 [<ffffffffa008b077>] ? 0xffffffffa008b077 [<ffffffff81156115>] ? __module_text_address+0x5/0x70 [<ffffffffa008b077>] ? 0xffffffffa008b077 [<ffffffffa008b077>] ? 0xffffffffa008b077 [<ffffffff81156126>] __module_text_address+0x16/0x70 [<ffffffff8115c101>] ? is_module_text_address+0x21/0x80 [<ffffffffa008b077>] ? 0xffffffffa008b077 [<ffffffff8115c101>] is_module_text_address+0x21/0x80 [<ffffffffa008b077>] ? 0xffffffffa008b077 [<ffffffff810d43ed>] __kernel_text_address+0x6d/0xa0 [<ffffffff81025ce3>] print_context_stack+0x73/0x120 [<ffffffff81024e3c>] dump_trace+0xcc/0x330 [<ffffffff81034ed5>] ? save_stack_trace+0x5/0x50 [<ffffffff8107e350>] ? tlbflush_read_file+0x60/0x60 [<ffffffff81034eff>] save_stack_trace+0x2f/0x50 [<ffffffff811b379f>] ? stack_trace_call+0x16f/0x3b0 [<ffffffff811b379f>] stack_trace_call+0x16f/0x3b0 [<ffffffff810e5460>] ? nr_iowait_cpu+0x30/0x30 [<ffffffff816cd612>] ? cpuidle_enter_state+0x62/0x3e0 [<ffffffffa008b077>] 0xffffffffa008b077 [<ffffffff816cd612>] ? cpuidle_enter_state+0x62/0x3e0 [<ffffffff8107e355>] ? leave_mm+0x5/0x150 [<ffffffff8146664c>] ? check_preemption_disabled+0x4c/0x120 [<ffffffff8107e355>] leave_mm+0x5/0x150 [<ffffffff814bc97f>] intel_idle+0xff/0x170 [<ffffffff8107e355>] ? leave_mm+0x5/0x150 [<ffffffff814bc97f>] ? intel_idle+0xff/0x170 [<ffffffff816cd627>] cpuidle_enter_state+0x77/0x3e0 [<ffffffff816cd9c7>] cpuidle_enter+0x17/0x20 [<ffffffff811032f0>] cpu_startup_entry+0x540/0x560 [<ffffffff81860eec>] rest_init+0x13c/0x150 [<ffffffff81860db5>] ? rest_init+0x5/0x150 [<ffffffff8202e9b9>] ? ftrace_init+0xc8/0x15b [<ffffffff82006160>] start_kernel+0x4ec/0x4f9 [<ffffffff82005a69>] ? set_init_arg+0x57/0x57 [<ffffffff82005120>] ? early_idt_handler_array+0x120/0x120 [<ffffffff820055f8>] x86_64_start_reservations+0x2a/0x2c [<ffffffff82005731>] x86_64_start_kernel+0x137/0x146 ---[ end trace ea6335cd0fe6994e ]--- Which would have helped me tremendously in solving the original bug. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
- Loading branch information