-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wifi does not work #3
Comments
jcdutton
pushed a commit
that referenced
this issue
Aug 4, 2018
commit 36eb8ff upstream. Crash dump shows following instructions crash> bt PID: 0 TASK: ffffffffbe412480 CPU: 0 COMMAND: "swapper/0" #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1 #1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2 #2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c #3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a #4 [ffff891ee00039e0] no_context at ffffffffbd074643 #5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e #6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64 #7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a #8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8 #9 [ffff891ee0003b50] page_fault at ffffffffbda01925 [exception RIP: qlt_schedule_sess_for_deletion+15] RIP: ffffffffc02e526f RSP: ffff891ee0003c08 RFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffc0307847 RDX: 00000000000020e6 RSI: ffff891edbc377c8 RDI: 0000000000000000 RBP: ffff891ee0003c18 R8: ffffffffc02f0b20 R9: 0000000000000250 R10: 0000000000000258 R11: 000000000000b780 R12: ffff891ed9b43000 R13: 00000000000000f0 R14: 0000000000000006 R15: ffff891edbc377c8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx] #11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx] #12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx] #13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx] #14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59 #15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02 #16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90 #17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984 #18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5 #19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18 --- <IRQ stack> --- #20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e [exception RIP: unknown or invalid address] RIP: 000000000000001f RSP: 0000000000000000 RFLAGS: fff3b8c2091ebb3f RAX: ffffbba5a0000200 RBX: 0000be8cdfa8f9fa RCX: 0000000000000018 RDX: 0000000000000101 RSI: 000000000000015d RDI: 0000000000000193 RBP: 0000000000000083 R8: ffffffffbe403e38 R9: 0000000000000002 R10: 0000000000000000 R11: ffffffffbe56b820 R12: ffff891ee001cf00 R13: ffffffffbd11c0a4 R14: ffffffffbe403d60 R15: 0000000000000001 ORIG_RAX: ffff891ee0022ac0 CS: 0000 SS: ffffffffffffffb9 bt: WARNING: possibly bogus exception frame #21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd #22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907 #23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3 #24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42 #25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3 #26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa #27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca #28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675 #29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb #30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5 Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login") Cc: <stable@vger.kernel.org> # v4.17+ Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jcdutton
pushed a commit
that referenced
this issue
Aug 4, 2018
[ Upstream commit 83fe6b8 ] When fq_codel_init fails, qdisc_create_dflt will cleanup by using qdisc_destroy. This function calls the ->reset() op prior to calling the ->destroy() op. Unfortunately, during the failure flow for sch_fq_codel, the ->flows parameter is not initialized, so the fq_codel_reset function will null pointer dereference. kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 kernel: IP: fq_codel_reset+0x58/0xd0 [sch_fq_codel] kernel: PGD 0 P4D 0 kernel: Oops: 0000 [#1] SMP PTI kernel: Modules linked in: i40iw i40e(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc devlink ebtable_filter ebtables ip6table_filter ip6_tables rpcrdma ib_isert iscsi_target_mod sunrpc ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate iTCO_wdt iTCO_vendor_support intel_uncore ib_core intel_rapl_perf mei_me mei joydev i2c_i801 lpc_ich ioatdma shpchp wmi sch_fq_codel xfs libcrc32c mgag200 ixgbe drm_kms_helper isci ttm firewire_ohci kernel: mdio drm igb libsas crc32c_intel firewire_core ptp pps_core scsi_transport_sas crc_itu_t dca i2c_algo_bit ipmi_si ipmi_devintf ipmi_msghandler [last unloaded: i40e] kernel: CPU: 10 PID: 4219 Comm: ip Tainted: G OE 4.16.13custom-fq-codel-test+ #3 kernel: Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.05.0004.051120151007 05/11/2015 kernel: RIP: 0010:fq_codel_reset+0x58/0xd0 [sch_fq_codel] kernel: RSP: 0018:ffffbfbf4c1fb620 EFLAGS: 00010246 kernel: RAX: 0000000000000400 RBX: 0000000000000000 RCX: 00000000000005b9 kernel: RDX: 0000000000000000 RSI: ffff9d03264a60c0 RDI: ffff9cfd17b31c00 kernel: RBP: 0000000000000001 R08: 00000000000260c0 R09: ffffffffb679c3e9 kernel: R10: fffff1dab06a0e80 R11: ffff9cfd163af800 R12: ffff9cfd17b31c00 kernel: R13: 0000000000000001 R14: ffff9cfd153de600 R15: 0000000000000001 kernel: FS: 00007fdec2f92800(0000) GS:ffff9d0326480000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 0000000000000008 CR3: 0000000c1956a006 CR4: 00000000000606e0 kernel: Call Trace: kernel: qdisc_destroy+0x56/0x140 kernel: qdisc_create_dflt+0x8b/0xb0 kernel: mq_init+0xc1/0xf0 kernel: qdisc_create_dflt+0x5a/0xb0 kernel: dev_activate+0x205/0x230 kernel: __dev_open+0xf5/0x160 kernel: __dev_change_flags+0x1a3/0x210 kernel: dev_change_flags+0x21/0x60 kernel: do_setlink+0x660/0xdf0 kernel: ? down_trylock+0x25/0x30 kernel: ? xfs_buf_trylock+0x1a/0xd0 [xfs] kernel: ? rtnl_newlink+0x816/0x990 kernel: ? _xfs_buf_find+0x327/0x580 [xfs] kernel: ? _cond_resched+0x15/0x30 kernel: ? kmem_cache_alloc+0x20/0x1b0 kernel: ? rtnetlink_rcv_msg+0x200/0x2f0 kernel: ? rtnl_calcit.isra.30+0x100/0x100 kernel: ? netlink_rcv_skb+0x4c/0x120 kernel: ? netlink_unicast+0x19e/0x260 kernel: ? netlink_sendmsg+0x1ff/0x3c0 kernel: ? sock_sendmsg+0x36/0x40 kernel: ? ___sys_sendmsg+0x295/0x2f0 kernel: ? ebitmap_cmp+0x6d/0x90 kernel: ? dev_get_by_name_rcu+0x73/0x90 kernel: ? skb_dequeue+0x52/0x60 kernel: ? __inode_wait_for_writeback+0x7f/0xf0 kernel: ? bit_waitqueue+0x30/0x30 kernel: ? fsnotify_grab_connector+0x3c/0x60 kernel: ? __sys_sendmsg+0x51/0x90 kernel: ? do_syscall_64+0x74/0x180 kernel: ? entry_SYSCALL_64_after_hwframe+0x3d/0xa2 kernel: Code: 00 00 48 89 87 00 02 00 00 8b 87 a0 01 00 00 85 c0 0f 84 84 00 00 00 31 ed 48 63 dd 83 c5 01 48 c1 e3 06 49 03 9c 24 90 01 00 00 <48> 8b 73 08 48 8b 3b e8 6c 9a 4f f6 48 8d 43 10 48 c7 03 00 00 kernel: RIP: fq_codel_reset+0x58/0xd0 [sch_fq_codel] RSP: ffffbfbf4c1fb620 kernel: CR2: 0000000000000008 kernel: ---[ end trace e81a62bede66274e ]--- This is caused because flows_cnt is non-zero, but flows hasn't been initialized. fq_codel_init has left the private data in a partially initialized state. To fix this, reset flows_cnt to 0 when we fail to initialize. Additionally, to make the state more consistent, also cleanup the flows pointer when the allocation of backlogs fails. This fixes the NULL pointer dereference, since both the for-loop and memset in fq_codel_reset will be no-ops when flow_cnt is zero. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jcdutton
pushed a commit
that referenced
this issue
Aug 4, 2018
[ Upstream commit 4f4616c ] Similar to what we do when we remove a PCI function, set the QEDF_UNLOADING flag to prevent any requests from being queued while a vport is being deleted. This prevents any requests from getting stuck in limbo when the vport is unloaded or deleted. Fixes the crash: PID: 106676 TASK: ffff9a436aa90000 CPU: 12 COMMAND: "multipathd" #0 [ffff9a43567d3550] machine_kexec+522 at ffffffffaca60b2a #1 [ffff9a43567d35b0] __crash_kexec+114 at ffffffffacb13512 #2 [ffff9a43567d3680] crash_kexec+48 at ffffffffacb13600 #3 [ffff9a43567d3698] oops_end+168 at ffffffffad117768 #4 [ffff9a43567d36c0] no_context+645 at ffffffffad106f52 #5 [ffff9a43567d3710] __bad_area_nosemaphore+116 at ffffffffad106fe9 #6 [ffff9a43567d3760] bad_area+70 at ffffffffad107379 #7 [ffff9a43567d3788] __do_page_fault+1247 at ffffffffad11a8cf #8 [ffff9a43567d37f0] do_page_fault+53 at ffffffffad11a915 #9 [ffff9a43567d3820] page_fault+40 at ffffffffad116768 [exception RIP: qedf_init_task+61] RIP: ffffffffc0e13c2d RSP: ffff9a43567d38d0 RFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffbe920472c738 RCX: ffff9a434fa0e3e8 RDX: ffff9a434f695280 RSI: ffffbe920472c738 RDI: ffff9a43aa359c80 RBP: ffff9a43567d3950 R8: 0000000000000c15 R9: ffff9a3fb09b9880 R10: ffff9a434fa0e3e8 R11: ffff9a43567d35ce R12: 0000000000000000 R13: ffff9a434f695280 R14: ffff9a43aa359c80 R15: ffff9a3fb9e005c0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 10, 2019
commit a80c4ec upstream. After commit: 672ff6c ("KVM: x86: Raise #GP when guest vCPU do not support PMU") my AMD guests started #GPing like this: general protection fault: 0000 [#1] PREEMPT SMP CPU: 1 PID: 4355 Comm: bash Not tainted 5.1.0-rc6+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:x86_perf_event_update+0x3b/0xa0 with Code: pointing to RDPMC. It is RDPMC because the guest has the hardware watchdog CONFIG_HARDLOCKUP_DETECTOR_PERF enabled which uses perf. Instrumenting kvm_pmu_rdpmc() some, showed that it fails due to: if (!pmu->version) return 1; which the above commit added. Since AMD's PMU leaves the version at 0, that causes the #GP injection into the guest. Set pmu->version arbitrarily to 1 and move it above the non-applicable struct kvm_pmu members. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: kvm@vger.kernel.org Cc: Liran Alon <liran.alon@oracle.com> Cc: Mihai Carabas <mihai.carabas@oracle.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Radim Krčmář" <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: x86@kernel.org Cc: stable@vger.kernel.org Fixes: 672ff6c ("KVM: x86: Raise #GP when guest vCPU do not support PMU") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 10, 2019
commit dea37a9 upstream. Syzkaller report this: BUG: KASAN: use-after-free in sysfs_remove_file_ns+0x5f/0x70 fs/sysfs/file.c:468 Read of size 8 at addr ffff8881f59a6b70 by task syz-executor.0/8363 CPU: 0 PID: 8363 Comm: syz-executor.0 Not tainted 5.0.0-rc8+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xfa/0x1ce lib/dump_stack.c:113 print_address_description+0x65/0x270 mm/kasan/report.c:187 kasan_report+0x149/0x18d mm/kasan/report.c:317 sysfs_remove_file_ns+0x5f/0x70 fs/sysfs/file.c:468 sysfs_remove_file include/linux/sysfs.h:519 [inline] driver_remove_file+0x40/0x50 drivers/base/driver.c:122 usb_remove_newid_files drivers/usb/core/driver.c:212 [inline] usb_deregister+0x12a/0x3b0 drivers/usb/core/driver.c:1005 cpia2_exit+0xa/0x16 [cpia2] __do_sys_delete_module kernel/module.c:1018 [inline] __se_sys_delete_module kernel/module.c:961 [inline] __x64_sys_delete_module+0x3dc/0x5e0 kernel/module.c:961 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x462e99 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f86f3754c58 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000300 RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f86f37556bc R13: 00000000004bcca9 R14: 00000000006f6b48 R15: 00000000ffffffff Allocated by task 8363: set_track mm/kasan/common.c:85 [inline] __kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:495 kmalloc include/linux/slab.h:545 [inline] kzalloc include/linux/slab.h:740 [inline] bus_add_driver+0xc0/0x610 drivers/base/bus.c:651 driver_register+0x1bb/0x3f0 drivers/base/driver.c:170 usb_register_driver+0x267/0x520 drivers/usb/core/driver.c:965 0xffffffffc1b4817c do_one_initcall+0xfa/0x5ca init/main.c:887 do_init_module+0x204/0x5f6 kernel/module.c:3460 load_module+0x66b2/0x8570 kernel/module.c:3808 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 8363: set_track mm/kasan/common.c:85 [inline] __kasan_slab_free+0x130/0x180 mm/kasan/common.c:457 slab_free_hook mm/slub.c:1430 [inline] slab_free_freelist_hook mm/slub.c:1457 [inline] slab_free mm/slub.c:3005 [inline] kfree+0xe1/0x270 mm/slub.c:3957 kobject_cleanup lib/kobject.c:662 [inline] kobject_release lib/kobject.c:691 [inline] kref_put include/linux/kref.h:67 [inline] kobject_put+0x146/0x240 lib/kobject.c:708 bus_remove_driver+0x10e/0x220 drivers/base/bus.c:732 driver_unregister+0x6c/0xa0 drivers/base/driver.c:197 usb_register_driver+0x341/0x520 drivers/usb/core/driver.c:980 0xffffffffc1b4817c do_one_initcall+0xfa/0x5ca init/main.c:887 do_init_module+0x204/0x5f6 kernel/module.c:3460 load_module+0x66b2/0x8570 kernel/module.c:3808 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8881f59a6b40 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 48 bytes inside of 256-byte region [ffff8881f59a6b40, ffff8881f59a6c40) The buggy address belongs to the page: page:ffffea0007d66980 count:1 mapcount:0 mapping:ffff8881f6c02e00 index:0x0 flags: 0x2fffc0000000200(slab) raw: 02fffc0000000200 dead000000000100 dead000000000200 ffff8881f6c02e00 raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8881f59a6a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8881f59a6a80: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc >ffff8881f59a6b00: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ^ ffff8881f59a6b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8881f59a6c00: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc cpia2_init does not check return value of cpia2_init, if it failed in usb_register_driver, there is already cleanup using driver_unregister. No need call cpia2_usb_cleanup on module exit. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 10, 2019
commit 56cd26b upstream. Syzkaller report this: BUG: KASAN: use-after-free in sysfs_remove_file_ns+0x5f/0x70 fs/sysfs/file.c:468 Read of size 8 at addr ffff8881dc7ae030 by task syz-executor.0/6249 CPU: 1 PID: 6249 Comm: syz-executor.0 Not tainted 5.0.0-rc8+ #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xfa/0x1ce lib/dump_stack.c:113 print_address_description+0x65/0x270 mm/kasan/report.c:187 kasan_report+0x149/0x18d mm/kasan/report.c:317 ? 0xffffffffc1728000 sysfs_remove_file_ns+0x5f/0x70 fs/sysfs/file.c:468 sysfs_remove_file include/linux/sysfs.h:519 [inline] driver_remove_file+0x40/0x50 drivers/base/driver.c:122 remove_bind_files drivers/base/bus.c:585 [inline] bus_remove_driver+0x186/0x220 drivers/base/bus.c:725 driver_unregister+0x6c/0xa0 drivers/base/driver.c:197 serial_ir_init_module+0x169/0x1000 [serial_ir] do_one_initcall+0xfa/0x5ca init/main.c:887 do_init_module+0x204/0x5f6 kernel/module.c:3460 load_module+0x66b2/0x8570 kernel/module.c:3808 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x462e99 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f9450132c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99 RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003 RBP: 00007f9450132c70 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f94501336bc R13: 00000000004bcefa R14: 00000000006f6fb0 R15: 0000000000000004 Allocated by task 6249: set_track mm/kasan/common.c:85 [inline] __kasan_kmalloc.constprop.3+0xa0/0xd0 mm/kasan/common.c:495 kmalloc include/linux/slab.h:545 [inline] kzalloc include/linux/slab.h:740 [inline] bus_add_driver+0xc0/0x610 drivers/base/bus.c:651 driver_register+0x1bb/0x3f0 drivers/base/driver.c:170 serial_ir_init_module+0xe8/0x1000 [serial_ir] do_one_initcall+0xfa/0x5ca init/main.c:887 do_init_module+0x204/0x5f6 kernel/module.c:3460 load_module+0x66b2/0x8570 kernel/module.c:3808 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 6249: set_track mm/kasan/common.c:85 [inline] __kasan_slab_free+0x130/0x180 mm/kasan/common.c:457 slab_free_hook mm/slub.c:1430 [inline] slab_free_freelist_hook mm/slub.c:1457 [inline] slab_free mm/slub.c:3005 [inline] kfree+0xe1/0x270 mm/slub.c:3957 kobject_cleanup lib/kobject.c:662 [inline] kobject_release lib/kobject.c:691 [inline] kref_put include/linux/kref.h:67 [inline] kobject_put+0x146/0x240 lib/kobject.c:708 bus_remove_driver+0x10e/0x220 drivers/base/bus.c:732 driver_unregister+0x6c/0xa0 drivers/base/driver.c:197 serial_ir_init_module+0x14c/0x1000 [serial_ir] do_one_initcall+0xfa/0x5ca init/main.c:887 do_init_module+0x204/0x5f6 kernel/module.c:3460 load_module+0x66b2/0x8570 kernel/module.c:3808 __do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902 do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8881dc7ae000 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 48 bytes inside of 256-byte region [ffff8881dc7ae000, ffff8881dc7ae100) The buggy address belongs to the page: page:ffffea000771eb80 count:1 mapcount:0 mapping:ffff8881f6c02e00 index:0x0 flags: 0x2fffc0000000200(slab) raw: 02fffc0000000200 ffffea0007d14800 0000000400000002 ffff8881f6c02e00 raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8881dc7adf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8881dc7adf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8881dc7ae000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8881dc7ae080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8881dc7ae100: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 There are already cleanup handlings in serial_ir_init error path, no need to call serial_ir_exit do it again in serial_ir_init_module, otherwise will trigger a use-after-free issue. Fixes: fa5dc29 ("[media] lirc_serial: move out of staging and rename to serial_ir") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 10, 2019
[ Upstream commit f80c5da ] This commit makes the kernel not send the next queued HCI command until a command complete arrives for the last HCI command sent to the controller. This change avoids a problem with some buggy controllers (seen on two SKUs of QCA9377) that send an extra command complete event for the previous command after the kernel had already sent a new HCI command to the controller. The problem was reproduced when starting an active scanning procedure, where an extra command complete event arrives for the LE_SET_RANDOM_ADDR command. When this happends the kernel ends up not processing the command complete for the following commmand, LE_SET_SCAN_PARAM, and ultimately behaving as if a passive scanning procedure was being performed, when in fact controller is performing an active scanning procedure. This makes it impossible to discover BLE devices as no device found events are sent to userspace. This problem is reproducible on 100% of the attempts on the affected controllers. The extra command complete event can be seen at timestamp 27.420131 on the btmon logs bellow. Bluetooth monitor ver 5.50 = Note: Linux version 5.0.0+ (x86_64) 0.352340 = Note: Bluetooth subsystem version 2.22 0.352343 = New Index: 80:C5:F2:8F:87:84 (Primary,USB,hci0) [hci0] 0.352344 = Open Index: 80:C5:F2:8F:87:84 [hci0] 0.352345 = Index Info: 80:C5:F2:8F:87:84 (Qualcomm) [hci0] 0.352346 @ MGMT Open: bluetoothd (privileged) version 1.14 {0x0001} 0.352347 @ MGMT Open: btmon (privileged) version 1.14 {0x0002} 0.352366 @ MGMT Open: btmgmt (privileged) version 1.14 {0x0003} 27.302164 @ MGMT Command: Start Discovery (0x0023) plen 1 {0x0003} [hci0] 27.302310 Address type: 0x06 LE Public LE Random < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 #1 [hci0] 27.302496 Address: 15:60:F2:91:B2:24 (Non-Resolvable) > HCI Event: Command Complete (0x0e) plen 4 #2 [hci0] 27.419117 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7 #3 [hci0] 27.419244 Type: Active (0x01) Interval: 11.250 msec (0x0012) Window: 11.250 msec (0x0012) Own address type: Random (0x01) Filter policy: Accept all advertisement (0x00) > HCI Event: Command Complete (0x0e) plen 4 #4 [hci0] 27.420131 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #5 [hci0] 27.420259 Scanning: Enabled (0x01) Filter duplicates: Enabled (0x01) > HCI Event: Command Complete (0x0e) plen 4 #6 [hci0] 27.420969 LE Set Scan Parameters (0x08|0x000b) ncmd 1 Status: Success (0x00) > HCI Event: Command Complete (0x0e) plen 4 #7 [hci0] 27.421983 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) @ MGMT Event: Command Complete (0x0001) plen 4 {0x0003} [hci0] 27.422059 Start Discovery (0x0023) plen 1 Status: Success (0x00) Address type: 0x06 LE Public LE Random @ MGMT Event: Discovering (0x0013) plen 2 {0x0003} [hci0] 27.422067 Address type: 0x06 LE Public LE Random Discovery: Enabled (0x01) @ MGMT Event: Discovering (0x0013) plen 2 {0x0002} [hci0] 27.422067 Address type: 0x06 LE Public LE Random Discovery: Enabled (0x01) @ MGMT Event: Discovering (0x0013) plen 2 {0x0001} [hci0] 27.422067 Address type: 0x06 LE Public LE Random Discovery: Enabled (0x01) Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 10, 2019
[ Upstream commit 41a91c6 ] dwc3_gadget_suspend() is called under dwc->lock spinlock. In such context calling synchronize_irq() is not allowed. Move the problematic call out of the protected block to fix the following kernel BUG during system suspend: BUG: sleeping function called from invalid context at kernel/irq/manage.c:112 in_atomic(): 1, irqs_disabled(): 128, pid: 1601, name: rtcwake 6 locks held by rtcwake/1601: #0: f70ac2a2 (sb_writers#7){.+.+}, at: vfs_write+0x130/0x16c #1: b5fe1270 (&of->mutex){+.+.}, at: kernfs_fop_write+0xc0/0x1e4 #2: 7e597705 (kn->count#60){.+.+}, at: kernfs_fop_write+0xc8/0x1e4 #3: 8b3527d0 (system_transition_mutex){+.+.}, at: pm_suspend+0xc4/0xc04 #4: fc7f1c42 (&dev->mutex){....}, at: __device_suspend+0xd8/0x74c #5: 4b36507e (&(&dwc->lock)->rlock){....}, at: dwc3_gadget_suspend+0x24/0x3c irq event stamp: 11252 hardirqs last enabled at (11251): [<c09c54a4>] _raw_spin_unlock_irqrestore+0x6c/0x74 hardirqs last disabled at (11252): [<c09c4d44>] _raw_spin_lock_irqsave+0x1c/0x5c softirqs last enabled at (9744): [<c0102564>] __do_softirq+0x3a4/0x66c softirqs last disabled at (9737): [<c0128528>] irq_exit+0x140/0x168 Preemption disabled at: [<00000000>] (null) CPU: 7 PID: 1601 Comm: rtcwake Not tainted 5.0.0-rc3-next-20190122-00039-ga3f4ee4f8a52 #5252 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c01110f0>] (unwind_backtrace) from [<c010d120>] (show_stack+0x10/0x14) [<c010d120>] (show_stack) from [<c09a4d04>] (dump_stack+0x90/0xc8) [<c09a4d04>] (dump_stack) from [<c014c700>] (___might_sleep+0x22c/0x2c8) [<c014c700>] (___might_sleep) from [<c0189d68>] (synchronize_irq+0x28/0x84) [<c0189d68>] (synchronize_irq) from [<c05cbbf8>] (dwc3_gadget_suspend+0x34/0x3c) [<c05cbbf8>] (dwc3_gadget_suspend) from [<c05bd020>] (dwc3_suspend_common+0x154/0x410) [<c05bd020>] (dwc3_suspend_common) from [<c05bd34c>] (dwc3_suspend+0x14/0x2c) [<c05bd34c>] (dwc3_suspend) from [<c051c730>] (platform_pm_suspend+0x2c/0x54) [<c051c730>] (platform_pm_suspend) from [<c05285d4>] (dpm_run_callback+0xa4/0x3dc) [<c05285d4>] (dpm_run_callback) from [<c0528a40>] (__device_suspend+0x134/0x74c) [<c0528a40>] (__device_suspend) from [<c052c508>] (dpm_suspend+0x174/0x588) [<c052c508>] (dpm_suspend) from [<c0182134>] (suspend_devices_and_enter+0xc0/0xe74) [<c0182134>] (suspend_devices_and_enter) from [<c0183658>] (pm_suspend+0x770/0xc04) [<c0183658>] (pm_suspend) from [<c0180ddc>] (state_store+0x6c/0xcc) [<c0180ddc>] (state_store) from [<c09a9a70>] (kobj_attr_store+0x14/0x20) [<c09a9a70>] (kobj_attr_store) from [<c02d6800>] (sysfs_kf_write+0x4c/0x50) [<c02d6800>] (sysfs_kf_write) from [<c02d594c>] (kernfs_fop_write+0xfc/0x1e4) [<c02d594c>] (kernfs_fop_write) from [<c02593d8>] (__vfs_write+0x2c/0x160) [<c02593d8>] (__vfs_write) from [<c0259694>] (vfs_write+0xa4/0x16c) [<c0259694>] (vfs_write) from [<c0259870>] (ksys_write+0x40/0x8c) [<c0259870>] (ksys_write) from [<c0101000>] (ret_fast_syscall+0x0/0x28) Exception stack(0xed55ffa8 to 0xed55fff0) ... Fixes: 01c1088 ("usb: dwc3: gadget: synchronize_irq dwc irq in suspend") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 10, 2019
…cm_qla2xxx_close_session() [ Upstream commit d4023db ] This patch avoids that lockdep reports the following warning: ===================================================== WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected 5.1.0-rc1-dbg+ #11 Tainted: G W ----------------------------------------------------- rmdir/1478 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: 00000000e7ac4607 (&(&k->k_lock)->rlock){+.+.}, at: klist_next+0x43/0x1d0 and this task is already holding: 00000000cf0baf5e (&(&ha->tgt.sess_lock)->rlock){-...}, at: tcm_qla2xxx_close_session+0x57/0xb0 [tcm_qla2xxx] which would create a new lock dependency: (&(&ha->tgt.sess_lock)->rlock){-...} -> (&(&k->k_lock)->rlock){+.+.} but this new dependency connects a HARDIRQ-irq-safe lock: (&(&ha->tgt.sess_lock)->rlock){-...} ... which became HARDIRQ-irq-safe at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla2x00_fcport_event_handler+0x1f3d/0x22b0 [qla2xxx] qla2x00_async_login_sp_done+0x1dc/0x1f0 [qla2xxx] qla24xx_process_response_queue+0xa37/0x10e0 [qla2xxx] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x24d/0x2d0 secondary_startup_64+0xa4/0xb0 to a HARDIRQ-irq-unsafe lock: (&(&k->k_lock)->rlock){+.+.} ... which became HARDIRQ-irq-unsafe at: ... lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&k->k_lock)->rlock); local_irq_disable(); lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&k->k_lock)->rlock); <Interrupt> lock(&(&ha->tgt.sess_lock)->rlock); *** DEADLOCK *** 4 locks held by rmdir/1478: #0: 000000002c7f1ba4 (sb_writers#10){.+.+}, at: mnt_want_write+0x32/0x70 #1: 00000000c85eb147 (&default_group_class[depth - 1]#2/1){+.+.}, at: do_rmdir+0x217/0x2d0 #2: 000000002b164d6f (&sb->s_type->i_mutex_key#13){++++}, at: vfs_rmdir+0x7e/0x1d0 #3: 00000000cf0baf5e (&(&ha->tgt.sess_lock)->rlock){-...}, at: tcm_qla2xxx_close_session+0x57/0xb0 [tcm_qla2xxx] the dependencies between HARDIRQ-irq-safe lock and the holding lock: -> (&(&ha->tgt.sess_lock)->rlock){-...} ops: 127 { IN-HARDIRQ-W at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla2x00_fcport_event_handler+0x1f3d/0x22b0 [qla2xxx] qla2x00_async_login_sp_done+0x1dc/0x1f0 [qla2xxx] qla24xx_process_response_queue+0xa37/0x10e0 [qla2xxx] qla24xx_msix_rsp_q+0x79/0xf0 [qla2xxx] __handle_irq_event_percpu+0x79/0x3c0 handle_irq_event_percpu+0x70/0xf0 handle_irq_event+0x5a/0x8b handle_edge_irq+0x12c/0x310 handle_irq+0x192/0x20a do_IRQ+0x73/0x160 ret_from_intr+0x0/0x1d default_idle+0x23/0x1f0 arch_cpu_idle+0x15/0x20 default_idle_call+0x35/0x40 do_idle+0x2bb/0x2e0 cpu_startup_entry+0x1d/0x20 start_secondary+0x24d/0x2d0 secondary_startup_64+0xa4/0xb0 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 qla2x00_loop_resync+0xb3d/0x2690 [qla2xxx] qla2x00_do_dpc+0xcee/0xf30 [qla2xxx] kthread+0x1d2/0x1f0 ret_from_fork+0x3a/0x50 } ... key at: [<ffffffffa125f700>] __key.62804+0x0/0xfffffffffff7e900 [qla2xxx] ... acquired at: __lock_acquire+0x11ed/0x1b60 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0x4d3/0x500 [qla2xxx] qlt_unreg_sess+0x104/0x2c0 [qla2xxx] tcm_qla2xxx_close_session+0xa2/0xb0 [tcm_qla2xxx] target_shutdown_sessions+0x17b/0x190 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x1f0 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9f/0x120 [configfs] config_item_put+0x29/0x2b [configfs] configfs_rmdir+0x3d2/0x520 [configfs] vfs_rmdir+0xb3/0x1d0 do_rmdir+0x25c/0x2d0 __x64_sys_rmdir+0x24/0x30 do_syscall_64+0x77/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe the dependencies between the lock to be acquired and HARDIRQ-irq-unsafe lock: -> (&(&k->k_lock)->rlock){+.+.} ops: 14568 { HARDIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 SOFTIRQ-ON-W at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 INITIAL USE at: lock_acquire+0xe3/0x200 _raw_spin_lock+0x32/0x50 klist_add_tail+0x33/0xb0 device_add+0x7f4/0xb60 device_create_groups_vargs+0x11c/0x150 device_create_with_groups+0x89/0xb0 vtconsole_class_init+0xb2/0x124 do_one_initcall+0xc5/0x3ce kernel_init_freeable+0x295/0x32e kernel_init+0x11/0x11b ret_from_fork+0x3a/0x50 } ... key at: [<ffffffff83f3d900>] __key.15805+0x0/0x40 ... acquired at: __lock_acquire+0x11ed/0x1b60 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0x4d3/0x500 [qla2xxx] qlt_unreg_sess+0x104/0x2c0 [qla2xxx] tcm_qla2xxx_close_session+0xa2/0xb0 [tcm_qla2xxx] target_shutdown_sessions+0x17b/0x190 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x1f0 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9f/0x120 [configfs] config_item_put+0x29/0x2b [configfs] configfs_rmdir+0x3d2/0x520 [configfs] vfs_rmdir+0xb3/0x1d0 do_rmdir+0x25c/0x2d0 __x64_sys_rmdir+0x24/0x30 do_syscall_64+0x77/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe stack backtrace: CPU: 7 PID: 1478 Comm: rmdir Tainted: G W 5.1.0-rc1-dbg+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x86/0xca check_usage.cold.59+0x473/0x563 check_prev_add.constprop.43+0x1f1/0x1170 __lock_acquire+0x11ed/0x1b60 lock_acquire+0xe3/0x200 _raw_spin_lock_irqsave+0x3d/0x60 klist_next+0x43/0x1d0 device_for_each_child+0x96/0x110 scsi_target_block+0x3c/0x40 [scsi_mod] fc_remote_port_delete+0xe7/0x1c0 [scsi_transport_fc] qla2x00_mark_device_lost+0x4d3/0x500 [qla2xxx] qlt_unreg_sess+0x104/0x2c0 [qla2xxx] tcm_qla2xxx_close_session+0xa2/0xb0 [tcm_qla2xxx] target_shutdown_sessions+0x17b/0x190 [target_core_mod] core_tpg_del_initiator_node_acl+0xf3/0x1f0 [target_core_mod] target_fabric_nacl_base_release+0x25/0x30 [target_core_mod] config_item_release+0x9f/0x120 [configfs] config_item_put+0x29/0x2b [configfs] configfs_rmdir+0x3d2/0x520 [configfs] vfs_rmdir+0xb3/0x1d0 do_rmdir+0x25c/0x2d0 __x64_sys_rmdir+0x24/0x30 do_syscall_64+0x77/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 10, 2019
[ Upstream commit ff612ba ] We've been seeing the following sporadically throughout our fleet panic: kernel BUG at fs/btrfs/relocation.c:4584! netversion: 5.0-0 Backtrace: #0 [ffffc90003adb880] machine_kexec at ffffffff81041da8 #1 [ffffc90003adb8c8] __crash_kexec at ffffffff8110396c #2 [ffffc90003adb988] crash_kexec at ffffffff811048ad #3 [ffffc90003adb9a0] oops_end at ffffffff8101c19a #4 [ffffc90003adb9c0] do_trap at ffffffff81019114 #5 [ffffc90003adba00] do_error_trap at ffffffff810195d0 #6 [ffffc90003adbab0] invalid_op at ffffffff81a00a9b [exception RIP: btrfs_reloc_cow_block+692] RIP: ffffffff8143b614 RSP: ffffc90003adbb68 RFLAGS: 00010246 RAX: fffffffffffffff7 RBX: ffff8806b9c32000 RCX: ffff8806aad00690 RDX: ffff880850b295e0 RSI: ffff8806b9c32000 RDI: ffff88084f205bd0 RBP: ffff880849415000 R8: ffffc90003adbbe0 R9: ffff88085ac90000 R10: ffff8805f7369140 R11: 0000000000000000 R12: ffff880850b295e0 R13: ffff88084f205bd0 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffffc90003adbbb0] __btrfs_cow_block at ffffffff813bf1cd #8 [ffffc90003adbc28] btrfs_cow_block at ffffffff813bf4b3 #9 [ffffc90003adbc78] btrfs_search_slot at ffffffff813c2e6c The way relocation moves data extents is by creating a reloc inode and preallocating extents in this inode and then copying the data into these preallocated extents. Once we've done this for all of our extents, we'll write out these dirty pages, which marks the extent written, and goes into btrfs_reloc_cow_block(). From here we get our current reloc_control, which _should_ match the reloc_control for the current block group we're relocating. However if we get an ENOSPC in this path at some point we'll bail out, never initiating writeback on this inode. Not a huge deal, unless we happen to be doing relocation on a different block group, and this block group is now rc->stage == UPDATE_DATA_PTRS. This trips the BUG_ON() in btrfs_reloc_cow_block(), because we expect to be done modifying the data inode. We are in fact done modifying the metadata for the data inode we're currently using, but not the one from the failed block group, and thus we BUG_ON(). (This happens when writeback finishes for extents from the previous group, when we are at btrfs_finish_ordered_io() which updates the data reloc tree (inode item, drops/adds extent items, etc).) Fix this by writing out the reloc data inode always, and then breaking out of the loop after that point to keep from tripping this BUG_ON() later. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> [ add note from Filipe ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 10, 2019
…text commit 0c9e8b3 upstream. stub_probe() and stub_disconnect() call functions which could call sleeping function in invalid context whil holding busid_lock. Fix the problem by refining the lock holds to short critical sections to change the busid_priv fields. This fix restructures the code to limit the lock holds in stub_probe() and stub_disconnect(). stub_probe(): [15217.927028] BUG: sleeping function called from invalid context at mm/slab.h:418 [15217.927038] in_atomic(): 1, irqs_disabled(): 0, pid: 29087, name: usbip [15217.927044] 5 locks held by usbip/29087: [15217.927047] #0: 0000000091647f28 (sb_writers#6){....}, at: vfs_write+0x191/0x1c0 [15217.927062] #1: 000000008f9ba75b (&of->mutex){....}, at: kernfs_fop_write+0xf7/0x1b0 [15217.927072] #2: 00000000872e5b4b (&dev->mutex){....}, at: __device_driver_lock+0x3b/0x50 [15217.927082] #3: 00000000e74ececc (&dev->mutex){....}, at: __device_driver_lock+0x46/0x50 [15217.927090] #4: 00000000b20abbe0 (&(&busid_table[i].busid_lock)->rlock){....}, at: get_busid_priv+0x48/0x60 [usbip_host] [15217.927103] CPU: 3 PID: 29087 Comm: usbip Tainted: G W 5.1.0-rc6+ #40 [15217.927106] Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A18 09/24/2013 [15217.927109] Call Trace: [15217.927118] dump_stack+0x63/0x85 [15217.927127] ___might_sleep+0xff/0x120 [15217.927133] __might_sleep+0x4a/0x80 [15217.927143] kmem_cache_alloc_trace+0x1aa/0x210 [15217.927156] stub_probe+0xe8/0x440 [usbip_host] [15217.927171] usb_probe_device+0x34/0x70 stub_disconnect(): [15279.182478] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 [15279.182487] in_atomic(): 1, irqs_disabled(): 0, pid: 29114, name: usbip [15279.182492] 5 locks held by usbip/29114: [15279.182494] #0: 0000000091647f28 (sb_writers#6){....}, at: vfs_write+0x191/0x1c0 [15279.182506] #1: 00000000702cf0f3 (&of->mutex){....}, at: kernfs_fop_write+0xf7/0x1b0 [15279.182514] #2: 00000000872e5b4b (&dev->mutex){....}, at: __device_driver_lock+0x3b/0x50 [15279.182522] #3: 00000000e74ececc (&dev->mutex){....}, at: __device_driver_lock+0x46/0x50 [15279.182529] #4: 00000000b20abbe0 (&(&busid_table[i].busid_lock)->rlock){....}, at: get_busid_priv+0x48/0x60 [usbip_host] [15279.182541] CPU: 0 PID: 29114 Comm: usbip Tainted: G W 5.1.0-rc6+ #40 [15279.182543] Hardware name: Dell Inc. OptiPlex 790/0HY9JP, BIOS A18 09/24/2013 [15279.182546] Call Trace: [15279.182554] dump_stack+0x63/0x85 [15279.182561] ___might_sleep+0xff/0x120 [15279.182566] __might_sleep+0x4a/0x80 [15279.182574] __mutex_lock+0x55/0x950 [15279.182582] ? get_busid_priv+0x48/0x60 [usbip_host] [15279.182587] ? reacquire_held_locks+0xec/0x1a0 [15279.182591] ? get_busid_priv+0x48/0x60 [usbip_host] [15279.182597] ? find_held_lock+0x94/0xa0 [15279.182609] mutex_lock_nested+0x1b/0x20 [15279.182614] ? mutex_lock_nested+0x1b/0x20 [15279.182618] kernfs_remove_by_name_ns+0x2a/0x90 [15279.182625] sysfs_remove_file_ns+0x15/0x20 [15279.182629] device_remove_file+0x19/0x20 [15279.182634] stub_disconnect+0x6d/0x180 [usbip_host] [15279.182643] usb_unbind_device+0x27/0x60 Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 10, 2019
…emove commit d27e5e0 upstream. With this early return due to zfcp_unit child(ren), we don't use the zfcp_port reference from the earlier zfcp_get_port_by_wwpn() anymore and need to put it. Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: d99b601 ("[SCSI] zfcp: restore refcount check on port_remove") Cc: <stable@vger.kernel.org> #3.7+ Reviewed-by: Jens Remus <jremus@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 30, 2019
[ Upstream commit 198790d ] In free_percpu() we sometimes call pcpu_schedule_balance_work() to queue a work item (which does a wakeup) while holding pcpu_lock. This creates an unnecessary lock dependency between pcpu_lock and the scheduler's pi_lock. There are other places where we call pcpu_schedule_balance_work() without hold pcpu_lock, and this case doesn't need to be different. Moving the call outside the lock prevents the following lockdep splat when running tools/testing/selftests/bpf/{test_maps,test_progs} in sequence with lockdep enabled: ====================================================== WARNING: possible circular locking dependency detected 5.1.0-dbg-DEV #1 Not tainted ------------------------------------------------------ kworker/23:255/18872 is trying to acquire lock: 000000000bc79290 (&(&pool->lock)->rlock){-.-.}, at: __queue_work+0xb2/0x520 but task is already holding lock: 00000000e3e7a6aa (pcpu_lock){..-.}, at: free_percpu+0x36/0x260 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (pcpu_lock){..-.}: lock_acquire+0x9e/0x180 _raw_spin_lock_irqsave+0x3a/0x50 pcpu_alloc+0xfa/0x780 __alloc_percpu_gfp+0x12/0x20 alloc_htab_elem+0x184/0x2b0 __htab_percpu_map_update_elem+0x252/0x290 bpf_percpu_hash_update+0x7c/0x130 __do_sys_bpf+0x1912/0x1be0 __x64_sys_bpf+0x1a/0x20 do_syscall_64+0x59/0x400 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #3 (&htab->buckets[i].lock){....}: lock_acquire+0x9e/0x180 _raw_spin_lock_irqsave+0x3a/0x50 htab_map_update_elem+0x1af/0x3a0 -> #2 (&rq->lock){-.-.}: lock_acquire+0x9e/0x180 _raw_spin_lock+0x2f/0x40 task_fork_fair+0x37/0x160 sched_fork+0x211/0x310 copy_process.part.43+0x7b1/0x2160 _do_fork+0xda/0x6b0 kernel_thread+0x29/0x30 rest_init+0x22/0x260 arch_call_rest_init+0xe/0x10 start_kernel+0x4fd/0x520 x86_64_start_reservations+0x24/0x26 x86_64_start_kernel+0x6f/0x72 secondary_startup_64+0xa4/0xb0 -> #1 (&p->pi_lock){-.-.}: lock_acquire+0x9e/0x180 _raw_spin_lock_irqsave+0x3a/0x50 try_to_wake_up+0x41/0x600 wake_up_process+0x15/0x20 create_worker+0x16b/0x1e0 workqueue_init+0x279/0x2ee kernel_init_freeable+0xf7/0x288 kernel_init+0xf/0x180 ret_from_fork+0x24/0x30 -> #0 (&(&pool->lock)->rlock){-.-.}: __lock_acquire+0x101f/0x12a0 lock_acquire+0x9e/0x180 _raw_spin_lock+0x2f/0x40 __queue_work+0xb2/0x520 queue_work_on+0x38/0x80 free_percpu+0x221/0x260 pcpu_freelist_destroy+0x11/0x20 stack_map_free+0x2a/0x40 bpf_map_free_deferred+0x3c/0x50 process_one_work+0x1f7/0x580 worker_thread+0x54/0x410 kthread+0x10f/0x150 ret_from_fork+0x24/0x30 other info that might help us debug this: Chain exists of: &(&pool->lock)->rlock --> &htab->buckets[i].lock --> pcpu_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pcpu_lock); lock(&htab->buckets[i].lock); lock(pcpu_lock); lock(&(&pool->lock)->rlock); *** DEADLOCK *** 3 locks held by kworker/23:255/18872: #0: 00000000b36a6e16 ((wq_completion)events){+.+.}, at: process_one_work+0x17a/0x580 #1: 00000000dfd966f0 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x17a/0x580 #2: 00000000e3e7a6aa (pcpu_lock){..-.}, at: free_percpu+0x36/0x260 stack backtrace: CPU: 23 PID: 18872 Comm: kworker/23:255 Not tainted 5.1.0-dbg-DEV #1 Hardware name: ... Workqueue: events bpf_map_free_deferred Call Trace: dump_stack+0x67/0x95 print_circular_bug.isra.38+0x1c6/0x220 check_prev_add.constprop.50+0x9f6/0xd20 __lock_acquire+0x101f/0x12a0 lock_acquire+0x9e/0x180 _raw_spin_lock+0x2f/0x40 __queue_work+0xb2/0x520 queue_work_on+0x38/0x80 free_percpu+0x221/0x260 pcpu_freelist_destroy+0x11/0x20 stack_map_free+0x2a/0x40 bpf_map_free_deferred+0x3c/0x50 process_one_work+0x1f7/0x580 worker_thread+0x54/0x410 kthread+0x10f/0x150 ret_from_fork+0x24/0x30 Signed-off-by: John Sperbeck <jsperbeck@google.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 30, 2019
[ Upstream commit 41be3e2 ] vfio_dev_present() which is the condition to wait_event_interruptible_timeout(), will call vfio_group_get_device and try to acquire the mutex group->device_lock. wait_event_interruptible_timeout() will set the state of the current task to TASK_INTERRUPTIBLE, before doing the condition check. This means that we will try to acquire the mutex while already in a sleeping state. The scheduler warns us by giving the following warning: [ 4050.264464] ------------[ cut here ]------------ [ 4050.264508] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b33c00e2>] prepare_to_wait_event+0x14a/0x188 [ 4050.264529] WARNING: CPU: 12 PID: 35924 at kernel/sched/core.c:6112 __might_sleep+0x76/0x90 .... 4050.264756] Call Trace: [ 4050.264765] ([<000000000017bbaa>] __might_sleep+0x72/0x90) [ 4050.264774] [<0000000000b97edc>] __mutex_lock+0x44/0x8c0 [ 4050.264782] [<0000000000b9878a>] mutex_lock_nested+0x32/0x40 [ 4050.264793] [<000003ff800d7abe>] vfio_group_get_device+0x36/0xa8 [vfio] [ 4050.264803] [<000003ff800d87c0>] vfio_del_group_dev+0x238/0x378 [vfio] [ 4050.264813] [<000003ff8015f67c>] mdev_remove+0x3c/0x68 [mdev] [ 4050.264825] [<00000000008e01b0>] device_release_driver_internal+0x168/0x268 [ 4050.264834] [<00000000008de692>] bus_remove_device+0x162/0x190 [ 4050.264843] [<00000000008daf42>] device_del+0x1e2/0x368 [ 4050.264851] [<00000000008db12c>] device_unregister+0x64/0x88 [ 4050.264862] [<000003ff8015ed84>] mdev_device_remove+0xec/0x130 [mdev] [ 4050.264872] [<000003ff8015f074>] remove_store+0x6c/0xa8 [mdev] [ 4050.264881] [<000000000046f494>] kernfs_fop_write+0x14c/0x1f8 [ 4050.264890] [<00000000003c1530>] __vfs_write+0x38/0x1a8 [ 4050.264899] [<00000000003c187c>] vfs_write+0xb4/0x198 [ 4050.264908] [<00000000003c1af2>] ksys_write+0x5a/0xb0 [ 4050.264916] [<0000000000b9e270>] system_call+0xdc/0x2d8 [ 4050.264925] 4 locks held by sh/35924: [ 4050.264933] #0: 000000001ef90325 (sb_writers#4){.+.+}, at: vfs_write+0x9e/0x198 [ 4050.264948] #1: 000000005c1ab0b3 (&of->mutex){+.+.}, at: kernfs_fop_write+0x1cc/0x1f8 [ 4050.264963] #2: 0000000034831ab8 (kn->count#297){++++}, at: kernfs_remove_self+0x12e/0x150 [ 4050.264979] #3: 00000000e152484f (&dev->mutex){....}, at: device_release_driver_internal+0x5c/0x268 [ 4050.264993] Last Breaking-Event-Address: [ 4050.265002] [<000000000017bbaa>] __might_sleep+0x72/0x90 [ 4050.265010] irq event stamp: 7039 [ 4050.265020] hardirqs last enabled at (7047): [<00000000001cee7a>] console_unlock+0x6d2/0x740 [ 4050.265029] hardirqs last disabled at (7054): [<00000000001ce87e>] console_unlock+0xd6/0x740 [ 4050.265040] softirqs last enabled at (6416): [<0000000000b8fe26>] __udelay+0xb6/0x100 [ 4050.265049] softirqs last disabled at (6415): [<0000000000b8fe06>] __udelay+0x96/0x100 [ 4050.265057] ---[ end trace d04a07d39d99a9f9 ]--- Let's fix this as described in the article https://lwn.net/Articles/628628/. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> [remove now redundant vfio_dev_present()] Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 30, 2019
[ Upstream commit 347ab94 ] This patch fixes deadlock warning if removing PWM device when CONFIG_PROVE_LOCKING is enabled. This issue can be reproceduced by the following steps on the R-Car H3 Salvator-X board if the backlight is disabled: # cd /sys/class/pwm/pwmchip0 # echo 0 > export # ls device export npwm power pwm0 subsystem uevent unexport # cd device/driver # ls bind e6e31000.pwm uevent unbind # echo e6e31000.pwm > unbind [ 87.659974] ====================================================== [ 87.666149] WARNING: possible circular locking dependency detected [ 87.672327] 5.0.0 #7 Not tainted [ 87.675549] ------------------------------------------------------ [ 87.681723] bash/2986 is trying to acquire lock: [ 87.686337] 000000005ea0e178 (kn->count#58){++++}, at: kernfs_remove_by_name_ns+0x50/0xa0 [ 87.694528] [ 87.694528] but task is already holding lock: [ 87.700353] 000000006313b17c (pwm_lock){+.+.}, at: pwmchip_remove+0x28/0x13c [ 87.707405] [ 87.707405] which lock already depends on the new lock. [ 87.707405] [ 87.715574] [ 87.715574] the existing dependency chain (in reverse order) is: [ 87.723048] [ 87.723048] -> #1 (pwm_lock){+.+.}: [ 87.728017] __mutex_lock+0x70/0x7e4 [ 87.732108] mutex_lock_nested+0x1c/0x24 [ 87.736547] pwm_request_from_chip.part.6+0x34/0x74 [ 87.741940] pwm_request_from_chip+0x20/0x40 [ 87.746725] export_store+0x6c/0x1f4 [ 87.750820] dev_attr_store+0x18/0x28 [ 87.754998] sysfs_kf_write+0x54/0x64 [ 87.759175] kernfs_fop_write+0xe4/0x1e8 [ 87.763615] __vfs_write+0x40/0x184 [ 87.767619] vfs_write+0xa8/0x19c [ 87.771448] ksys_write+0x58/0xbc [ 87.775278] __arm64_sys_write+0x18/0x20 [ 87.779721] el0_svc_common+0xd0/0x124 [ 87.783986] el0_svc_compat_handler+0x1c/0x24 [ 87.788858] el0_svc_compat+0x8/0x18 [ 87.792947] [ 87.792947] -> #0 (kn->count#58){++++}: [ 87.798260] lock_acquire+0xc4/0x22c [ 87.802353] __kernfs_remove+0x258/0x2c4 [ 87.806790] kernfs_remove_by_name_ns+0x50/0xa0 [ 87.811836] remove_files.isra.1+0x38/0x78 [ 87.816447] sysfs_remove_group+0x48/0x98 [ 87.820971] sysfs_remove_groups+0x34/0x4c [ 87.825583] device_remove_attrs+0x6c/0x7c [ 87.830197] device_del+0x11c/0x33c [ 87.834201] device_unregister+0x14/0x2c [ 87.838638] pwmchip_sysfs_unexport+0x40/0x4c [ 87.843509] pwmchip_remove+0xf4/0x13c [ 87.847773] rcar_pwm_remove+0x28/0x34 [ 87.852039] platform_drv_remove+0x24/0x64 [ 87.856651] device_release_driver_internal+0x18c/0x21c [ 87.862391] device_release_driver+0x14/0x1c [ 87.867175] unbind_store+0xe0/0x124 [ 87.871265] drv_attr_store+0x20/0x30 [ 87.875442] sysfs_kf_write+0x54/0x64 [ 87.879618] kernfs_fop_write+0xe4/0x1e8 [ 87.884055] __vfs_write+0x40/0x184 [ 87.888057] vfs_write+0xa8/0x19c [ 87.891887] ksys_write+0x58/0xbc [ 87.895716] __arm64_sys_write+0x18/0x20 [ 87.900154] el0_svc_common+0xd0/0x124 [ 87.904417] el0_svc_compat_handler+0x1c/0x24 [ 87.909289] el0_svc_compat+0x8/0x18 [ 87.913378] [ 87.913378] other info that might help us debug this: [ 87.913378] [ 87.921374] Possible unsafe locking scenario: [ 87.921374] [ 87.927286] CPU0 CPU1 [ 87.931808] ---- ---- [ 87.936331] lock(pwm_lock); [ 87.939293] lock(kn->count#58); [ 87.945120] lock(pwm_lock); [ 87.950599] lock(kn->count#58); [ 87.953908] [ 87.953908] *** DEADLOCK *** [ 87.953908] [ 87.959821] 4 locks held by bash/2986: [ 87.963563] #0: 00000000ace7bc30 (sb_writers#6){.+.+}, at: vfs_write+0x188/0x19c [ 87.971044] #1: 00000000287991b2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xb4/0x1e8 [ 87.978872] #2: 00000000f739d016 (&dev->mutex){....}, at: device_release_driver_internal+0x40/0x21c [ 87.988001] #3: 000000006313b17c (pwm_lock){+.+.}, at: pwmchip_remove+0x28/0x13c [ 87.995481] [ 87.995481] stack backtrace: [ 87.999836] CPU: 0 PID: 2986 Comm: bash Not tainted 5.0.0 #7 [ 88.005489] Hardware name: Renesas Salvator-X board based on r8a7795 ES1.x (DT) [ 88.012791] Call trace: [ 88.015235] dump_backtrace+0x0/0x190 [ 88.018891] show_stack+0x14/0x1c [ 88.022204] dump_stack+0xb0/0xec [ 88.025514] print_circular_bug.isra.32+0x1d0/0x2e0 [ 88.030385] __lock_acquire+0x1318/0x1864 [ 88.034388] lock_acquire+0xc4/0x22c [ 88.037958] __kernfs_remove+0x258/0x2c4 [ 88.041874] kernfs_remove_by_name_ns+0x50/0xa0 [ 88.046398] remove_files.isra.1+0x38/0x78 [ 88.050487] sysfs_remove_group+0x48/0x98 [ 88.054490] sysfs_remove_groups+0x34/0x4c [ 88.058580] device_remove_attrs+0x6c/0x7c [ 88.062671] device_del+0x11c/0x33c [ 88.066154] device_unregister+0x14/0x2c [ 88.070070] pwmchip_sysfs_unexport+0x40/0x4c [ 88.074421] pwmchip_remove+0xf4/0x13c [ 88.078163] rcar_pwm_remove+0x28/0x34 [ 88.081906] platform_drv_remove+0x24/0x64 [ 88.085996] device_release_driver_internal+0x18c/0x21c [ 88.091215] device_release_driver+0x14/0x1c [ 88.095478] unbind_store+0xe0/0x124 [ 88.099048] drv_attr_store+0x20/0x30 [ 88.102704] sysfs_kf_write+0x54/0x64 [ 88.106359] kernfs_fop_write+0xe4/0x1e8 [ 88.110275] __vfs_write+0x40/0x184 [ 88.113757] vfs_write+0xa8/0x19c [ 88.117065] ksys_write+0x58/0xbc [ 88.120374] __arm64_sys_write+0x18/0x20 [ 88.124291] el0_svc_common+0xd0/0x124 [ 88.128034] el0_svc_compat_handler+0x1c/0x24 [ 88.132384] el0_svc_compat+0x8/0x18 The sysfs unexport in pwmchip_remove() is completely asymmetric to what we do in pwmchip_add_with_polarity() and commit 0733424 ("pwm: Unexport children before chip removal") is a strong indication that this was wrong to begin with. We should just move pwmchip_sysfs_unexport() where it belongs, which is right after pwmchip_sysfs_unexport_children(). In that case, we do not need separate functions anymore either. We also really want to remove sysfs irrespective of whether or not the chip will be removed as a result of pwmchip_remove(). We can only assume that the driver will be gone after that, so we shouldn't leave any dangling sysfs files around. This warning disappears if we move pwmchip_sysfs_unexport() to the top of pwmchip_remove(), pwmchip_sysfs_unexport_children(). That way it is also outside of the pwm_lock section, which indeed doesn't seem to be needed. Moving the pwmchip_sysfs_export() call outside of that section also seems fine and it'd be perfectly symmetric with pwmchip_remove() again. So, this patch fixes them. Signed-off-by: Phong Hoang <phong.hoang.wz@renesas.com> [shimoda: revise the commit log and code] Fixes: 76abbdd ("pwm: Add sysfs interface") Fixes: 0733424 ("pwm: Unexport children before chip removal") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Hoan Nguyen An <na-hoan@jinso.co.jp> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 30, 2019
commit a58f2ce upstream. There was the below bug report from Wu Fangsuo. On the CMA allocation path, isolate_migratepages_range() could isolate unevictable LRU pages and reclaim_clean_page_from_list() can try to reclaim them if they are clean file-backed pages. page:ffffffbf02f33b40 count:86 mapcount:84 mapping:ffffffc08fa7a810 index:0x24 flags: 0x19040c(referenced|uptodate|arch_1|mappedtodisk|unevictable|mlocked) raw: 000000000019040c ffffffc08fa7a810 0000000000000024 0000005600000053 raw: ffffffc009b05b20 ffffffc009b05b20 0000000000000000 ffffffc09bf3ee80 page dumped because: VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page)) page->mem_cgroup:ffffffc09bf3ee80 ------------[ cut here ]------------ kernel BUG at /home/build/farmland/adroid9.0/kernel/linux/mm/vmscan.c:1350! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 7125 Comm: syz-executor Tainted: G S 4.14.81 #3 Hardware name: ASR AQUILAC EVB (DT) task: ffffffc00a54cd00 task.stack: ffffffc009b00000 PC is at shrink_page_list+0x1998/0x3240 LR is at shrink_page_list+0x1998/0x3240 pc : [<ffffff90083a2158>] lr : [<ffffff90083a2158>] pstate: 60400045 sp : ffffffc009b05940 .. shrink_page_list+0x1998/0x3240 reclaim_clean_pages_from_list+0x3c0/0x4f0 alloc_contig_range+0x3bc/0x650 cma_alloc+0x214/0x668 ion_cma_allocate+0x98/0x1d8 ion_alloc+0x200/0x7e0 ion_ioctl+0x18c/0x378 do_vfs_ioctl+0x17c/0x1780 SyS_ioctl+0xac/0xc0 Wu found it's due to commit ad6b670 ("mm: remove SWAP_MLOCK in ttu"). Before that, unevictable pages go to cull_mlocked so that we can't reach the VM_BUG_ON_PAGE line. To fix the issue, this patch filters out unevictable LRU pages from the reclaim_clean_pages_from_list in CMA. Link: http://lkml.kernel.org/r/20190524071114.74202-1-minchan@kernel.org Fixes: ad6b670 ("mm: remove SWAP_MLOCK in ttu") Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Wu Fangsuo <fangsuowu@asrmicro.com> Debugged-by: Wu Fangsuo <fangsuowu@asrmicro.com> Tested-by: Wu Fangsuo <fangsuowu@asrmicro.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Pankaj Suryawanshi <pankaj.suryawanshi@einfochips.com> Cc: <stable@vger.kernel.org> [4.12+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
jcdutton
pushed a commit
that referenced
this issue
Jun 30, 2019
[ Upstream commit dfe3de8 ] struct dfl_feature_platform_data (and it's mutex) is used by both fme and port devices, and when lockdep is enabled it complains about nesting between these locks. Tell lockdep about the difference so it can track each class separately. Here's the lockdep complaint: [ 409.680668] WARNING: possible recursive locking detected [ 409.685983] 5.1.0-rc3.fpga+ #1 Tainted: G E [ 409.691469] -------------------------------------------- [ 409.696779] fpgaconf/9348 is trying to acquire lock: [ 409.701746] 00000000a443fe2e (&pdata->lock){+.+.}, at: port_enable_set+0x24/0x60 [dfl_afu] [ 409.710006] [ 409.710006] but task is already holding lock: [ 409.715837] 0000000063b78782 (&pdata->lock){+.+.}, at: fme_pr_ioctl+0x21d/0x330 [dfl_fme] [ 409.724012] [ 409.724012] other info that might help us debug this: [ 409.730535] Possible unsafe locking scenario: [ 409.730535] [ 409.736457] CPU0 [ 409.738910] ---- [ 409.741360] lock(&pdata->lock); [ 409.744679] lock(&pdata->lock); [ 409.747999] [ 409.747999] *** DEADLOCK *** [ 409.747999] [ 409.753920] May be due to missing lock nesting notation [ 409.753920] [ 409.760704] 4 locks held by fpgaconf/9348: [ 409.764805] #0: 0000000063b78782 (&pdata->lock){+.+.}, at: fme_pr_ioctl+0x21d/0x330 [dfl_fme] [ 409.773408] #1: 00000000213c8a66 (®ion->mutex){+.+.}, at: fpga_region_program_fpga+0x24/0x200 [fpga_region] [ 409.783489] #2: 00000000fe63afb9 (&mgr->ref_mutex){+.+.}, at: fpga_mgr_lock+0x15/0x40 [fpga_mgr] [ 409.792354] #3: 000000000b2285c5 (&bridge->mutex){+.+.}, at: __fpga_bridge_get+0x26/0xa0 [fpga_bridge] [ 409.801740] [ 409.801740] stack backtrace: [ 409.806102] CPU: 45 PID: 9348 Comm: fpgaconf Kdump: loaded Tainted: G E 5.1.0-rc3.fpga+ #1 [ 409.815658] Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.01.00.0763.022420181017 02/24/2018 [ 409.825911] Call Trace: [ 409.828369] dump_stack+0x5e/0x8b [ 409.831686] __lock_acquire+0xf3d/0x10e0 [ 409.835612] ? find_held_lock+0x3c/0xa0 [ 409.839451] lock_acquire+0xbc/0x1d0 [ 409.843030] ? port_enable_set+0x24/0x60 [dfl_afu] [ 409.847823] ? port_enable_set+0x24/0x60 [dfl_afu] [ 409.852616] __mutex_lock+0x86/0x970 [ 409.856195] ? port_enable_set+0x24/0x60 [dfl_afu] [ 409.860989] ? port_enable_set+0x24/0x60 [dfl_afu] [ 409.865777] ? __mutex_unlock_slowpath+0x4b/0x290 [ 409.870486] port_enable_set+0x24/0x60 [dfl_afu] [ 409.875106] fpga_bridges_disable+0x36/0x50 [fpga_bridge] [ 409.880502] fpga_region_program_fpga+0xea/0x200 [fpga_region] [ 409.886338] fme_pr_ioctl+0x13e/0x330 [dfl_fme] [ 409.890870] fme_ioctl+0x66/0xe0 [dfl_fme] [ 409.894973] do_vfs_ioctl+0xa9/0x720 [ 409.898548] ? lockdep_hardirqs_on+0xf0/0x1a0 [ 409.902907] ksys_ioctl+0x60/0x90 [ 409.906225] __x64_sys_ioctl+0x16/0x20 [ 409.909981] do_syscall_64+0x5a/0x220 [ 409.913644] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 409.918698] RIP: 0033:0x7f9d31b9b8d7 [ 409.922276] Code: 44 00 00 48 8b 05 b9 15 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 89 15 2d 00 f7 d8 64 89 01 48 [ 409.941020] RSP: 002b:00007ffe4cae0d68 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 409.948588] RAX: ffffffffffffffda RBX: 00007f9d32ade6a0 RCX: 00007f9d31b9b8d7 [ 409.955719] RDX: 00007ffe4cae0df0 RSI: 000000000000b680 RDI: 0000000000000003 [ 409.962852] RBP: 0000000000000003 R08: 00007f9d2b70a177 R09: 00007ffe4cae0e40 [ 409.969984] R10: 00007ffe4cae0160 R11: 0000000000000202 R12: 00007ffe4cae0df0 [ 409.977115] R13: 000000000000b680 R14: 0000000000000000 R15: 00007ffe4cae0f60 Signed-off-by: Scott Wood <swood@redhat.com> Acked-by: Wu Hao <hao.wu@intel.com> Acked-by: Alan Tull <atull@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Aug 11, 2019
[ Upstream commit a4c0b3d ] INFO: task syz-executor.5:8634 blocked for more than 143 seconds. Not tainted 5.2.0-rc5+ #3 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. syz-executor.5 D25632 8634 8224 0x00004004 Call Trace: context_switch kernel/sched/core.c:2818 [inline] __schedule+0x658/0x9e0 kernel/sched/core.c:3445 schedule+0x131/0x1d0 kernel/sched/core.c:3509 schedule_timeout+0x9a/0x2b0 kernel/time/timer.c:1783 do_wait_for_common+0x35e/0x5a0 kernel/sched/completion.c:83 __wait_for_common kernel/sched/completion.c:104 [inline] wait_for_common kernel/sched/completion.c:115 [inline] wait_for_completion+0x47/0x60 kernel/sched/completion.c:136 kthread_stop+0xb4/0x150 kernel/kthread.c:559 io_sq_thread_stop fs/io_uring.c:2252 [inline] io_finish_async fs/io_uring.c:2259 [inline] io_ring_ctx_free fs/io_uring.c:2770 [inline] io_ring_ctx_wait_and_kill+0x268/0x880 fs/io_uring.c:2834 io_uring_release+0x5d/0x70 fs/io_uring.c:2842 __fput+0x2e4/0x740 fs/file_table.c:280 ____fput+0x15/0x20 fs/file_table.c:313 task_work_run+0x17e/0x1b0 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:185 [inline] exit_to_usermode_loop arch/x86/entry/common.c:168 [inline] prepare_exit_to_usermode+0x402/0x4f0 arch/x86/entry/common.c:199 syscall_return_slowpath+0x110/0x440 arch/x86/entry/common.c:279 do_syscall_64+0x126/0x140 arch/x86/entry/common.c:304 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x412fb1 Code: 80 3b 7c 0f 84 c7 02 00 00 c7 85 d0 00 00 00 00 00 00 00 48 8b 05 cf a6 24 00 49 8b 14 24 41 b9 cb 2a 44 00 48 89 ee 48 89 df <48> 85 c0 4c 0f 45 c8 45 31 c0 31 c9 e8 0e 5b 00 00 85 c0 41 89 c7 RSP: 002b:00007ffe7ee6a180 EFLAGS: 00000293 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000412fb1 RDX: 0000001b2d920000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000000000001 R08: 00000000f3a3e1f8 R09: 00000000f3a3e1fc R10: 00007ffe7ee6a260 R11: 0000000000000293 R12: 000000000075c9a0 R13: 000000000075c9a0 R14: 0000000000024c00 R15: 000000000075bf2c ============================================= There is an wrong logic, when kthread_park running in front of io_sq_thread. CPU#0 CPU#1 io_sq_thread_stop: int kthread(void *_create): kthread_park() __kthread_parkme(self); <<< Wrong kthread_stop() << wait for self->exited << clear_bit KTHREAD_SHOULD_PARK ret = threadfn(data); | |- io_sq_thread |- kthread_should_park() << false |- schedule() <<< nobody wake up stuck CPU#0 stuck CPU#1 So, use a new variable sqo_thread_started to ensure that io_sq_thread run first, then io_sq_thread_stop. Reported-by: syzbot+94324416c485d422fe15@syzkaller.appspotmail.com Suggested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Aug 11, 2019
[ Upstream commit 8c6166c ] Prior to commit d021fab ("rds: rdma: add consumer reject") function "rds_rdma_cm_event_handler_cmn" would always honor a rejected connection attempt by issuing a "rds_conn_drop". The commit mentioned above added a "break", eliminating the "fallthrough" case and made the "rds_conn_drop" rather conditional: Now it only happens if a "consumer defined" reject (i.e. "rdma_reject") carries an integer-value of "1" inside "private_data": if (!conn) break; err = (int *)rdma_consumer_reject_data(cm_id, event, &len); if (!err || (err && ((*err) == RDS_RDMA_REJ_INCOMPAT))) { pr_warn("RDS/RDMA: conn <%pI6c, %pI6c> rejected, dropping connection\n", &conn->c_laddr, &conn->c_faddr); conn->c_proposed_version = RDS_PROTOCOL_COMPAT_VERSION; rds_conn_drop(conn); } rdsdebug("Connection rejected: %s\n", rdma_reject_msg(cm_id, event->status)); break; /* FALLTHROUGH */ A number of issues are worth mentioning here: #1) Previous versions of the RDS code simply rejected a connection by calling "rdma_reject(cm_id, NULL, 0);" So the value of the payload in "private_data" will not be "1", but "0". #2) Now the code has become dependent on host byte order and sizing. If one peer is big-endian, the other is little-endian, or there's a difference in sizeof(int) (e.g. ILP64 vs LP64), the *err check does not work as intended. #3) There is no check for "len" to see if the data behind *err is even valid. Luckily, it appears that the "rdma_reject(cm_id, NULL, 0)" will always carry 148 bytes of zeroized payload. But that should probably not be relied upon here. #4) With the added "break;", we might as well drop the misleading "/* FALLTHROUGH */" comment. This commit does _not_ address issue #2, as the sender would have to agree on a byte order as well. Here is the sequence of messages in this observed error-scenario: Host-A is pre-QoS changes (excluding the commit mentioned above) Host-B is post-QoS changes (including the commit mentioned above) #1 Host-B issues a connection request via function "rds_conn_path_transition" connection state transitions to "RDS_CONN_CONNECTING" #2 Host-A rejects the incompatible connection request (from #1) It does so by calling "rdma_reject(cm_id, NULL, 0);" #3 Host-B receives an "RDMA_CM_EVENT_REJECTED" event (from #2) But since the code is changed in the way described above, it won't drop the connection here, simply because "*err == 0". #4 Host-A issues a connection request #5 Host-B receives an "RDMA_CM_EVENT_CONNECT_REQUEST" event and ends up calling "rds_ib_cm_handle_connect". But since the state is already in "RDS_CONN_CONNECTING" (as of #1) it will end up issuing a "rdma_reject" without dropping the connection: if (rds_conn_state(conn) == RDS_CONN_CONNECTING) { /* Wait and see - our connect may still be succeeding */ rds_ib_stats_inc(s_ib_connect_raced); } goto out; #6 Host-A receives an "RDMA_CM_EVENT_REJECTED" event (from #5), drops the connection and tries again (goto #4) until it gives up. Tested-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Aug 11, 2019
[ Upstream commit e88439d ] [BUG] Lockdep will report the following circular locking dependency: WARNING: possible circular locking dependency detected 5.2.0-rc2-custom #24 Tainted: G O ------------------------------------------------------ btrfs/8631 is trying to acquire lock: 000000002536438c (&fs_info->qgroup_ioctl_lock#2){+.+.}, at: btrfs_qgroup_inherit+0x40/0x620 [btrfs] but task is already holding lock: 000000003d52cc23 (&fs_info->tree_log_mutex){+.+.}, at: create_pending_snapshot+0x8b6/0xe60 [btrfs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&fs_info->tree_log_mutex){+.+.}: __mutex_lock+0x76/0x940 mutex_lock_nested+0x1b/0x20 btrfs_commit_transaction+0x475/0xa00 [btrfs] btrfs_commit_super+0x71/0x80 [btrfs] close_ctree+0x2bd/0x320 [btrfs] btrfs_put_super+0x15/0x20 [btrfs] generic_shutdown_super+0x72/0x110 kill_anon_super+0x18/0x30 btrfs_kill_super+0x16/0xa0 [btrfs] deactivate_locked_super+0x3a/0x80 deactivate_super+0x51/0x60 cleanup_mnt+0x3f/0x80 __cleanup_mnt+0x12/0x20 task_work_run+0x94/0xb0 exit_to_usermode_loop+0xd8/0xe0 do_syscall_64+0x210/0x240 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #1 (&fs_info->reloc_mutex){+.+.}: __mutex_lock+0x76/0x940 mutex_lock_nested+0x1b/0x20 btrfs_commit_transaction+0x40d/0xa00 [btrfs] btrfs_quota_enable+0x2da/0x730 [btrfs] btrfs_ioctl+0x2691/0x2b40 [btrfs] do_vfs_ioctl+0xa9/0x6d0 ksys_ioctl+0x67/0x90 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x65/0x240 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (&fs_info->qgroup_ioctl_lock#2){+.+.}: lock_acquire+0xa7/0x190 __mutex_lock+0x76/0x940 mutex_lock_nested+0x1b/0x20 btrfs_qgroup_inherit+0x40/0x620 [btrfs] create_pending_snapshot+0x9d7/0xe60 [btrfs] create_pending_snapshots+0x94/0xb0 [btrfs] btrfs_commit_transaction+0x415/0xa00 [btrfs] btrfs_mksubvol+0x496/0x4e0 [btrfs] btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs] btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs] btrfs_ioctl+0xa90/0x2b40 [btrfs] do_vfs_ioctl+0xa9/0x6d0 ksys_ioctl+0x67/0x90 __x64_sys_ioctl+0x1a/0x20 do_syscall_64+0x65/0x240 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Chain exists of: &fs_info->qgroup_ioctl_lock#2 --> &fs_info->reloc_mutex --> &fs_info->tree_log_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&fs_info->tree_log_mutex); lock(&fs_info->reloc_mutex); lock(&fs_info->tree_log_mutex); lock(&fs_info->qgroup_ioctl_lock#2); *** DEADLOCK *** 6 locks held by btrfs/8631: #0: 00000000ed8f23f6 (sb_writers#12){.+.+}, at: mnt_want_write_file+0x28/0x60 #1: 000000009fb1597a (&type->i_mutex_dir_key#10/1){+.+.}, at: btrfs_mksubvol+0x70/0x4e0 [btrfs] #2: 0000000088c5ad88 (&fs_info->subvol_sem){++++}, at: btrfs_mksubvol+0x128/0x4e0 [btrfs] #3: 000000009606fc3e (sb_internal#2){.+.+}, at: start_transaction+0x37a/0x520 [btrfs] #4: 00000000f82bbdf5 (&fs_info->reloc_mutex){+.+.}, at: btrfs_commit_transaction+0x40d/0xa00 [btrfs] #5: 000000003d52cc23 (&fs_info->tree_log_mutex){+.+.}, at: create_pending_snapshot+0x8b6/0xe60 [btrfs] [CAUSE] Due to the delayed subvolume creation, we need to call btrfs_qgroup_inherit() inside commit transaction code, with a lot of other mutex hold. This hell of lock chain can lead to above problem. [FIX] On the other hand, we don't really need to hold qgroup_ioctl_lock if we're in the context of create_pending_snapshot(). As in that context, we're the only one being able to modify qgroup. All other qgroup functions which needs qgroup_ioctl_lock are either holding a transaction handle, or will start a new transaction: Functions will start a new transaction(): * btrfs_quota_enable() * btrfs_quota_disable() Functions hold a transaction handler: * btrfs_add_qgroup_relation() * btrfs_del_qgroup_relation() * btrfs_create_qgroup() * btrfs_remove_qgroup() * btrfs_limit_qgroup() * btrfs_qgroup_inherit() call inside create_subvol() So we have a higher level protection provided by transaction, thus we don't need to always hold qgroup_ioctl_lock in btrfs_qgroup_inherit(). Only the btrfs_qgroup_inherit() call in create_subvol() needs to hold qgroup_ioctl_lock, while the btrfs_qgroup_inherit() call in create_pending_snapshot() is already protected by transaction. So the fix is to detect the context by checking trans->transaction->state. If we're at TRANS_STATE_COMMIT_DOING, then we're in commit transaction context and no need to get the mutex. Reported-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
jcdutton
pushed a commit
that referenced
this issue
Sep 26, 2020
[ Upstream commit 843d926 ] syzbot reported twice a lockdep issue in fib6_del() [1] which I think is caused by net->ipv6.fib6_null_entry having a NULL fib6_table pointer. fib6_del() already checks for fib6_null_entry special case, we only need to return earlier. Bug seems to occur very rarely, I have thus chosen a 'bug origin' that makes backports not too complex. [1] WARNING: suspicious RCU usage 5.9.0-rc4-syzkaller #0 Not tainted ----------------------------- net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 4 locks held by syz-executor.5/8095: #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401 #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline] #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312 #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613 #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline] #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245 stack backtrace: CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x198/0x1fd lib/dump_stack.c:118 fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996 fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180 fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102 fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150 fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230 __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246 fib6_clean_all net/ipv6/ip6_fib.c:2257 [inline] fib6_run_gc+0x113/0x2d0 net/ipv6/ip6_fib.c:2320 ndisc_netdev_event+0x217/0x350 net/ipv6/ndisc.c:1805 notifier_call_chain+0xb5/0x200 kernel/notifier.c:83 call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2033 call_netdevice_notifiers_extack net/core/dev.c:2045 [inline] call_netdevice_notifiers net/core/dev.c:2059 [inline] dev_close_many+0x30b/0x650 net/core/dev.c:1634 rollback_registered_many+0x3a8/0x1210 net/core/dev.c:9261 rollback_registered net/core/dev.c:9329 [inline] unregister_netdevice_queue+0x2dd/0x570 net/core/dev.c:10410 unregister_netdevice include/linux/netdevice.h:2774 [inline] ppp_release+0x216/0x240 drivers/net/ppp/ppp_generic.c:403 __fput+0x285/0x920 fs/file_table.c:281 task_work_run+0xdd/0x190 kernel/task_work.c:141 tracehook_notify_resume include/linux/tracehook.h:188 [inline] exit_to_user_mode_loop kernel/entry/common.c:163 [inline] exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 421842e ("net/ipv6: Add fib6_null_entry") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Wifi does not currently work.
The Linux kernel includes all the drivers needed: mwifiex
The Device Tree needs updating.
./linux/arch/arm/boot/dts/imx6q-utilite-hdmi.dts
./linux/arch/arm/boot/dts/imx6q-utilite-pro.dts
This is a placeholder, until the Device tree is updated.
The text was updated successfully, but these errors were encountered: