11aa: barf on teardown #24

twpedersen · 2013-01-11T19:06:29Z

[58440.829805] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[58440.829966] IP: [] skb_dequeue+0x52/0x80
[58440.830013] PGD 5efdd067 PUD 1a5eb067 PMD 0
[58440.830013] Oops: 0002 [#1] SMP
[58440.830013] CPU 3
[58440.830013] Modules linked in: ath9k ath9k_htc ath9k_common ath9k_hw ath mac80211 cfg80211 r8169
[58440.830013]
[58440.830013] Pid: 22928, comm: ip Not tainted 3.5.0-5ca2e19+ #412 To Be Filled By O.E.M. To Be Filled
By O.E.M./To be filled by O.E.M.
[58440.830013] RIP: 0010:[] [] skb_dequeue+0x52/0x80
[58440.830013] RSP: 0018:ffff88001b0e7648 EFLAGS: 00010002
[58440.830013] RAX: 0000000000000286 RBX: ffff880068365f00 RCX: 0000000000000000
[58440.830013] RDX: 0000000000000000 RSI: ffffea0001a0d940 RDI: ffff8800796f4fbc
[58440.830013] RBP: ffff88001b0e7668 R08: ffff880068365c00 R09: 000000018010000e
[58440.830013] R10: ffffffff815bfe57 R11: ffffffffffffffe2 R12: ffff8800796f4fa8
[58440.830013] R13: ffff8800796f4fbc R14: 0000000000000000 R15: 0000000000000100
[58440.830013] FS: 00007fb2f23f0700(0000) GS:ffff88007f380000(0000) knlGS:0000000000000000
[58440.830013] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[58440.830013] CR2: 0000000000000008 CR3: 000000001a604000 CR4: 00000000000007e0
[58440.830013] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[58440.830013] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[58440.830013] Process ip (pid: 22928, threadinfo ffff88001b0e6000, task ffff88007bb7e180)
[58440.830013] Stack:
[58440.830013] ffffffff815bea44 ffff8800796f4fa8 ffff8800796f4480 ffff8800796f5210
[58440.830013] ffff88001b0e7688 ffffffff815c02b8 ffff88001b0e76a8 0000000000000001
[58440.830013] ffff88001b0e76a8 ffffffffa00d6997 ffff88007c0ba6c0 ffff8800796f4480
[58440.830013] Call Trace:
[58440.830013] [] ? skb_dequeue+0x74/0x80
[58440.830013] [] skb_queue_purge+0x28/0x40
[58440.830013] [] ieee80211_clear_tx_pending+0x37/0x50 [mac80211]
[58440.830013] [] ieee80211_do_stop+0x2b9/0x7c0 [mac80211]

@task

One of the problems that arise when converting dedicated custom threadpool to workqueue is that the shared worker pool used by workqueue anonimizes each worker making it more difficult to identify what the worker was doing on which target from the output of sysrq-t or debug dump from oops, BUG() and friends. For example, after writeback is converted to use workqueue instead of priviate thread pool, there's no easy to tell which backing device a writeback work item was working on at the time of task dump, which, according to our writeback brethren, is important in tracking down issues with a lot of mounted file systems on a lot of different devices. This patchset implements a way for a work function to mark its execution instance so that task dump of the worker task includes information to indicate what the work item was doing. An example WARN dump would look like the following. WARNING: at fs/fs-writeback.c:1015 bdi_writeback_workfn+0x2b4/0x3c0() Modules linked in: CPU: 0 Pid: 28 Comm: kworker/u18:0 Not tainted 3.9.0-rc1-work+ #24 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 Workqueue: writeback bdi_writeback_workfn (flush-8:16) ffffffff820a3a98 ffff88015b927cb8 ffffffff81c61855 ffff88015b927cf8 ffffffff8108f500 0000000000000000 ffff88007a171948 ffff88007a1716b0 ffff88015b49df00 ffff88015b8d3940 0000000000000000 ffff88015b927d08 Call Trace: [<ffffffff81c61855>] dump_stack+0x19/0x1b [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0 ... This patch: Implement probe_kthread_data() which returns kthread_data if accessible. The function is equivalent to kthread_data() except that the specified @task may not be a kthread or its vfork_done is already cleared rendering struct kthread inaccessible. In the former case, probe_kthread_data() may return any value. In the latter, NULL. This will be used to safely print debug information without affecting synchronization in the normal paths. Workqueue debug info printing on dump_stack() and friends will make use of it. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Writeback has been recently converted to use workqueue instead of its private thread pool implementation. One negative side effect of this conversion is that there's no easy to tell which backing device a writeback work item was working on at the time of task dump, be it sysrq-t, BUG, WARN or whatever, which, according to our writeback brethren, is important in tracking down issues with a lot of mounted file systems on a lot of different devices. This patch restores that information using the new worker description facility. bdi_writeback_workfn() calls set_work_desc() to identify which bdi it's working on. The description is printed out together with the worqueue name and worker function as in the following example dump. WARNING: at fs/fs-writeback.c:1015 bdi_writeback_workfn+0x2b4/0x3c0() Modules linked in: Pid: 28, comm: kworker/u18:0 Not tainted 3.9.0-rc1-work+ #24 empty empty/S3992 Workqueue: writeback bdi_writeback_workfn (flush-8:16) ffffffff820a3a98 ffff88015b927cb8 ffffffff81c61855 ffff88015b927cf8 ffffffff8108f500 0000000000000000 ffff88007a171948 ffff88007a1716b0 ffff88015b49df00 ffff88015b8d3940 0000000000000000 ffff88015b927d08 Call Trace: [<ffffffff81c61855>] dump_stack+0x19/0x1b [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0 [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20 [<ffffffff81200144>] bdi_writeback_workfn+0x2b4/0x3c0 [<ffffffff810b4c87>] process_one_work+0x1d7/0x660 [<ffffffff810b5c72>] worker_thread+0x122/0x380 [<ffffffff810bdfea>] kthread+0xea/0xf0 [<ffffffff81c6cedc>] ret_from_fork+0x7c/0xb0 Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Several people reported the warning: "kernel BUG at kernel/timer.c:729!" and the stack trace is: #7 [ffff880214d25c10] mod_timer+501 at ffffffff8106d905 #8 [ffff880214d25c50] br_multicast_del_pg.isra.20+261 at ffffffffa0731d25 [bridge] #9 [ffff880214d25c80] br_multicast_disable_port+88 at ffffffffa0732948 [bridge] #10 [ffff880214d25cb0] br_stp_disable_port+154 at ffffffffa072bcca [bridge] #11 [ffff880214d25ce8] br_device_event+520 at ffffffffa072a4e8 [bridge] #12 [ffff880214d25d18] notifier_call_chain+76 at ffffffff8164aafc #13 [ffff880214d25d50] raw_notifier_call_chain+22 at ffffffff810858f6 #14 [ffff880214d25d60] call_netdevice_notifiers+45 at ffffffff81536aad #15 [ffff880214d25d80] dev_close_many+183 at ffffffff81536d17 #16 [ffff880214d25dc0] rollback_registered_many+168 at ffffffff81537f68 #17 [ffff880214d25de8] rollback_registered+49 at ffffffff81538101 #18 [ffff880214d25e10] unregister_netdevice_queue+72 at ffffffff815390d8 #19 [ffff880214d25e30] __tun_detach+272 at ffffffffa074c2f0 [tun] #20 [ffff880214d25e88] tun_chr_close+45 at ffffffffa074c4bd [tun] #21 [ffff880214d25ea8] __fput+225 at ffffffff8119b1f1 #22 [ffff880214d25ef0] ____fput+14 at ffffffff8119b3fe #23 [ffff880214d25f00] task_work_run+159 at ffffffff8107cf7f #24 [ffff880214d25f30] do_notify_resume+97 at ffffffff810139e1 #25 [ffff880214d25f50] int_signal+18 at ffffffff8164f292 this is due to I forgot to check if mp->timer is armed in br_multicast_del_pg(). This bug is introduced by commit 9f00b2e (bridge: only expire the mdb entry when query is received). Same for __br_mdb_del(). Tested-by: poma <pomidorabelisima@gmail.com> Reported-by: LiYonghua <809674045@qq.com> Reported-by: Robert Hancock <hancockrwd@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>

As the new x86 CPU bootup printout format code maintainer, I am taking immediate action to improve and clean (and thus indulge my OCD) the reporting of the cores when coming up online. Fix padding to a right-hand alignment, cleanup code and bind reporting width to the max number of supported CPUs on the system, like this: [ 0.074509] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 #6 #7 OK [ 0.644008] smpboot: Booting Node 1, Processors: #8 #9 #10 #11 #12 #13 #14 #15 OK [ 1.245006] smpboot: Booting Node 2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK [ 1.864005] smpboot: Booting Node 3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK [ 2.489005] smpboot: Booting Node 4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK [ 3.093005] smpboot: Booting Node 5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK [ 3.698005] smpboot: Booting Node 6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK [ 4.304005] smpboot: Booting Node 7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK [ 4.961413] Brought up 64 CPUs and this: [ 0.072367] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 #6 #7 OK [ 0.686329] Brought up 8 CPUs Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Libin <huawei.libin@huawei.com> Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>

Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 [ 0.603005] .... node #1, CPUs: #8 #9 #10 #11 #12 #13 #14 #15 [ 1.200005] .... node #2, CPUs: #16 #17 #18 #19 #20 #21 #22 #23 [ 1.796005] .... node #3, CPUs: #24 #25 #26 #27 #28 #29 #30 #31 [ 2.393005] .... node #4, CPUs: #32 #33 #34 #35 #36 #37 #38 #39 [ 2.996005] .... node #5, CPUs: #40 #41 #42 #43 #44 #45 #46 #47 [ 3.600005] .... node #6, CPUs: #48 #49 #50 #51 #52 #53 #54 #55 [ 4.202005] .... node #7, CPUs: #56 #57 #58 #59 #60 #61 #62 #63 [ 4.811005] .... node #8, CPUs: #64 #65 #66 #67 #68 #69 #70 #71 [ 5.421006] .... node #9, CPUs: #72 #73 #74 #75 #76 #77 #78 #79 [ 6.032005] .... node #10, CPUs: #80 #81 #82 #83 #84 #85 #86 #87 [ 6.648006] .... node #11, CPUs: #88 #89 #90 #91 #92 #93 #94 #95 [ 7.262005] .... node #12, CPUs: #96 #97 #98 #99 #100 #101 #102 #103 [ 7.865005] .... node #13, CPUs: #104 #105 #106 #107 #108 #109 #110 #111 [ 8.466005] .... node #14, CPUs: #112 #113 #114 #115 #116 #117 #118 #119 [ 9.073006] .... node #15, CPUs: #120 #121 #122 #123 #124 #125 #126 #127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: huawei.libin@huawei.com Cc: wangyijing@huawei.com Cc: fenghua.yu@intel.com Cc: guohanjun@huawei.com Cc: paul.gortmaker@windriver.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org>

…urces. When we change group_thread_cnt from sysfs entry, it can OOPS. The kernel messages are: [ 135.299021] BUG: unable to handle kernel NULL pointer dereference at (null) [ 135.299073] IP: [<ffffffff815188ab>] handle_active_stripes+0x32b/0x440 [ 135.299107] PGD 0 [ 135.299122] Oops: 0000 [#1] SMP [ 135.299144] Modules linked in: netconsole e1000e ptp pps_core [ 135.299188] CPU: 3 PID: 2225 Comm: md0_raid5 Not tainted 3.12.0+ #24 [ 135.299214] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015 11/09/2011 [ 135.299255] task: ffff8800b9638f80 ti: ffff8800b77a4000 task.ti: ffff8800b77a4000 [ 135.299283] RIP: 0010:[<ffffffff815188ab>] [<ffffffff815188ab>] handle_active_stripes+0x32b/0x440 [ 135.299323] RSP: 0018:ffff8800b77a5c48 EFLAGS: 00010002 [ 135.299344] RAX: ffff880037bb5c70 RBX: 0000000000000000 RCX: 0000000000000008 [ 135.299371] RDX: ffff880037bb5cb8 RSI: 0000000000000001 RDI: ffff880037bb5c00 [ 135.299398] RBP: ffff8800b77a5d08 R08: 0000000000000001 R09: 0000000000000000 [ 135.299425] R10: ffff8800b77a5c98 R11: 00000000ffffffff R12: ffff880037bb5c00 [ 135.299452] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880037bb5c70 [ 135.299479] FS: 0000000000000000(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000 [ 135.299510] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 135.299532] CR2: 0000000000000000 CR3: 0000000001c0b000 CR4: 00000000000407e0 [ 135.299559] Stack: [ 135.299570] ffff8800b77a5c88 ffffffff8107383e ffff8800b77a5c88 ffff880037a64300 [ 135.299611] 000000000000ec08 ffff880037bb5cb8 ffff8800b77a5c98 ffffffffffffffd8 [ 135.299654] 000000000000ec08 ffff880037bb5c60 ffff8800b77a5c98 ffff8800b77a5c98 [ 135.299696] Call Trace: [ 135.299711] [<ffffffff8107383e>] ? __wake_up+0x4e/0x70 [ 135.299733] [<ffffffff81518f88>] raid5d+0x4c8/0x680 [ 135.299756] [<ffffffff817174ed>] ? schedule_timeout+0x15d/0x1f0 [ 135.299781] [<ffffffff81524c9f>] md_thread+0x11f/0x170 [ 135.299804] [<ffffffff81069cd0>] ? wake_up_bit+0x40/0x40 [ 135.299826] [<ffffffff81524b80>] ? md_rdev_init+0x110/0x110 [ 135.299850] [<ffffffff81069656>] kthread+0xc6/0xd0 [ 135.299871] [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70 [ 135.299899] [<ffffffff81722ffc>] ret_from_fork+0x7c/0xb0 [ 135.299923] [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70 [ 135.299951] Code: ff ff ff 0f 84 d7 fe ff ff e9 5c fe ff ff 66 90 41 8b b4 24 d8 01 00 00 45 31 ed 85 f6 0f 8e 7b fd ff ff 49 8b 9c 24 d0 01 00 00 <48> 3b 1b 49 89 dd 0f 85 67 fd ff ff 48 8d 43 28 31 d2 eb 17 90 [ 135.300005] RIP [<ffffffff815188ab>] handle_active_stripes+0x32b/0x440 [ 135.300005] RSP <ffff8800b77a5c48> [ 135.300005] CR2: 0000000000000000 [ 135.300005] ---[ end trace 504854e5bb7562ed ]--- [ 135.300005] Kernel panic - not syncing: Fatal exception This is because raid5d() can be running when the multi-thread resources are changed via system. We see need to provide locking. mddev->device_lock is suitable, but we cannot simple call alloc_thread_groups under this lock as we cannot allocate memory while holding a spinlock. So change alloc_thread_groups() to allocate and return the data structures, then raid5_store_group_thread_cnt() can take the lock while updating the pointers to the data structures. This fixes a bug introduced in 3.12 and so is suitable for the 3.12.x stable series. Fixes: b721420 Cc: stable@vger.kernel.org (3.12) Signed-off-by: Jianpeng Ma <majianpeng@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Shaohua Li <shli@kernel.org>

After commit e9e4ea7 "net: smc91x: dont't use SMC_outw for fixing up halfword-aligned data" The Versatile SMSC LAN91C111 is crashing like this: ------------[ cut here ]------------ kernel BUG at /home/linus/linux/drivers/net/ethernet/smsc/smc91x.c:599! Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: CPU: 0 PID: 43 Comm: udhcpc Not tainted 3.13.0-rc1+ #24 task: c6ccfaa0 ti: c6cd0000 task.ti: c6cd0000 PC is at smc_hardware_send_pkt+0x198/0x22c LR is at smc_hardware_send_pkt+0x24/0x22c pc : [<c01be324>] lr : [<c01be1b0>] psr: 20000013 sp : c6cd1d08 ip : 00000001 fp : 00000000 r10: c02adb08 r9 : 00000000 r8 : c6ced802 r7 : c786fba0 r6 : 00000146 r5 : c8800000 r4 : c78d6000 r3 : 0000000f r2 : 00000146 r1 : 00000000 r0 : 00000031 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0005317f Table: 06cf4000 DAC: 00000015 Process udhcpc (pid: 43, stack limit = 0xc6cd01c0) Stack: (0xc6cd1d08 to 0xc6cd2000) 1d00: 00000010 c8800000 c78d6000 c786fba0 c78d6000 c01be868 1d20: c01be7a4 00004000 00000000 c786fba0 c6c12b80 c0208554 000004d0 c780fc60 1d40: 00000220 c01fb734 00000000 00000000 00000000 c6c9a440 c6c12b80 c78d6000 1d60: c786fba0 c6c9a440 00000000 c021d1d8 00000000 00000000 c6c12b80 c78d6000 1d80: c786fba0 00000001 c6c9a440 c02087f8 c6c9a4a0 00080008 00000000 00000000 1da0: c78d6000 c786fba0 c78d6000 00000138 00000000 00000000 00000000 00000000 1dc0: 00000000 c027ba74 00000138 00000138 00000001 00000010 c6cedc00 00000000 1de0: 00000008 c7404400 c6cd1eec c6cd1f14 c067a73c c065c0b8 00000000 c067a740 1e00: 01ffffff 002040d0 00000000 00000000 00000000 00000000 00000000 ffffffff 1e20: 43004400 00110022 c6cdef20 c027ae8c c6ccfaa0 be82d65c 00000014 be82d3cc 1e40: 00000000 00000000 00000000 c01f2870 00000000 00000000 00000000 c6cd1e88 1e60: c6ccfaa0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1e80: 00000000 00000000 00000031 c7802310 c7802300 00000138 c7404400 c0771da0 1ea0: 00000000 c6cd1eec c7800340 00000138 be82d65c 00000014 be82d3cc c6cd1f08 1ec0: 00000014 00000000 c7404400 c7404400 00000138 c01f4628 c78d6000 00000000 1ee0: 00000000 be82d3cc 00000138 c6cd1f08 00000014 c6cd1ee4 00000001 00000000 1f00: 00000000 00000000 00080011 00000002 06000000 ffffffff 0000ffff 00000002 1f20: 06000000 ffffffff 0000ffff c00928c8 c065c520 c6cd1f58 00000003 c009299c 1f40: 00000003 c065c520 c7404400 00000000 c7404400 c01f2218 c78106b0 c7441cb0 1f60: 00000000 00000006 c06799fc 00000000 00000000 00000006 00000000 c01f3ee0 1f80: 00000000 00000000 be82d678 be82d65c 00000014 00000001 00000122 c00139c8 1fa0: c6cd0000 c0013840 be82d65c 00000014 00000006 be82d3cc 00000138 00000000 1fc0: be82d65c 00000014 00000001 00000122 00000000 00000000 00018cb1 00000000 1fe0: 00003801 be82d3a8 0003a0c7 b6e9af08 60000010 00000006 00000000 00000000 [<c01be324>] (smc_hardware_send_pkt+0x198/0x22c) from [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) from [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) from [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) from [<c02087f8>] (dev_queue_xmit+0x238/0x42c) [<c02087f8>] (dev_queue_xmit+0x238/0x42c) from [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) from [<c01f2870>] (sock_sendmsg+0x84/0xa8) [<c01f2870>] (sock_sendmsg+0x84/0xa8) from [<c01f4628>] (SyS_sendto+0xb8/0xdc) [<c01f4628>] (SyS_sendto+0xb8/0xdc) from [<c0013840>] (ret_fast_syscall+0x0/0x2c) Code: e3130002 1a000001 e3130001 0affffcd (e7f001f2) ---[ end trace 81104fe70e8da7fe ]--- Kernel panic - not syncing: Fatal exception in interrupt This is because the macro operations in smc91x.h defined for Versatile are missing SMC_outsw() as used in this commit. The Versatile needs and uses the same accessors as the other platforms in the first if(...) clause, just switch it to using that and we have one problem less to worry about. Checkpatch complains about spacing, but I have opted to follow the style of this .h-file. Cc: Russell King <linux@arm.linux.org.uk> Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Jonathan Cameron <jic23@cam.ac.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>

After commit e9e4ea7 "net: smc91x: dont't use SMC_outw for fixing up halfword-aligned data" The Versatile SMSC LAN91C111 is crashing like this: ------------[ cut here ]------------ kernel BUG at /home/linus/linux/drivers/net/ethernet/smsc/smc91x.c:599! Internal error: Oops - BUG: 0 [#1] ARM Modules linked in: CPU: 0 PID: 43 Comm: udhcpc Not tainted 3.13.0-rc1+ #24 task: c6ccfaa0 ti: c6cd0000 task.ti: c6cd0000 PC is at smc_hardware_send_pkt+0x198/0x22c LR is at smc_hardware_send_pkt+0x24/0x22c pc : [<c01be324>] lr : [<c01be1b0>] psr: 20000013 sp : c6cd1d08 ip : 00000001 fp : 00000000 r10: c02adb08 r9 : 00000000 r8 : c6ced802 r7 : c786fba0 r6 : 00000146 r5 : c8800000 r4 : c78d6000 r3 : 0000000f r2 : 00000146 r1 : 00000000 r0 : 00000031 Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 0005317f Table: 06cf4000 DAC: 00000015 Process udhcpc (pid: 43, stack limit = 0xc6cd01c0) Stack: (0xc6cd1d08 to 0xc6cd2000) 1d00: 00000010 c8800000 c78d6000 c786fba0 c78d6000 c01be868 1d20: c01be7a4 00004000 00000000 c786fba0 c6c12b80 c0208554 000004d0 c780fc60 1d40: 00000220 c01fb734 00000000 00000000 00000000 c6c9a440 c6c12b80 c78d6000 1d60: c786fba0 c6c9a440 00000000 c021d1d8 00000000 00000000 c6c12b80 c78d6000 1d80: c786fba0 00000001 c6c9a440 c02087f8 c6c9a4a0 00080008 00000000 00000000 1da0: c78d6000 c786fba0 c78d6000 00000138 00000000 00000000 00000000 00000000 1dc0: 00000000 c027ba74 00000138 00000138 00000001 00000010 c6cedc00 00000000 1de0: 00000008 c7404400 c6cd1eec c6cd1f14 c067a73c c065c0b8 00000000 c067a740 1e00: 01ffffff 002040d0 00000000 00000000 00000000 00000000 00000000 ffffffff 1e20: 43004400 00110022 c6cdef20 c027ae8c c6ccfaa0 be82d65c 00000014 be82d3cc 1e40: 00000000 00000000 00000000 c01f2870 00000000 00000000 00000000 c6cd1e88 1e60: c6ccfaa0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 1e80: 00000000 00000000 00000031 c7802310 c7802300 00000138 c7404400 c0771da0 1ea0: 00000000 c6cd1eec c7800340 00000138 be82d65c 00000014 be82d3cc c6cd1f08 1ec0: 00000014 00000000 c7404400 c7404400 00000138 c01f4628 c78d6000 00000000 1ee0: 00000000 be82d3cc 00000138 c6cd1f08 00000014 c6cd1ee4 00000001 00000000 1f00: 00000000 00000000 00080011 00000002 06000000 ffffffff 0000ffff 00000002 1f20: 06000000 ffffffff 0000ffff c00928c8 c065c520 c6cd1f58 00000003 c009299c 1f40: 00000003 c065c520 c7404400 00000000 c7404400 c01f2218 c78106b0 c7441cb0 1f60: 00000000 00000006 c06799fc 00000000 00000000 00000006 00000000 c01f3ee0 1f80: 00000000 00000000 be82d678 be82d65c 00000014 00000001 00000122 c00139c8 1fa0: c6cd0000 c0013840 be82d65c 00000014 00000006 be82d3cc 00000138 00000000 1fc0: be82d65c 00000014 00000001 00000122 00000000 00000000 00018cb1 00000000 1fe0: 00003801 be82d3a8 0003a0c7 b6e9af08 60000010 00000006 00000000 00000000 [<c01be324>] (smc_hardware_send_pkt+0x198/0x22c) from [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) from [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) from [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) [<c021d1d8>] (sch_direct_xmit+0x94/0x18c) from [<c02087f8>] (dev_queue_xmit+0x238/0x42c) [<c02087f8>] (dev_queue_xmit+0x238/0x42c) from [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) [<c027ba74>] (packet_sendmsg+0xbe8/0xd28) from [<c01f2870>] (sock_sendmsg+0x84/0xa8) [<c01f2870>] (sock_sendmsg+0x84/0xa8) from [<c01f4628>] (SyS_sendto+0xb8/0xdc) [<c01f4628>] (SyS_sendto+0xb8/0xdc) from [<c0013840>] (ret_fast_syscall+0x0/0x2c) Code: e3130002 1a000001 e3130001 0affffcd (e7f001f2) ---[ end trace 81104fe70e8da7fe ]--- Kernel panic - not syncing: Fatal exception in interrupt This is because the macro operations in smc91x.h defined for Versatile are missing SMC_outsw() as used in this commit. The Versatile needs and uses the same accessors as the other platforms in the first if(...) clause, just switch it to using that and we have one problem less to worry about. This includes a hunk of a patch from Will Deacon fixin the other 32bit platforms as well: Innokom, Ramses, PXA, PCM027. Checkpatch complains about spacing, but I have opted to follow the style of this .h-file. Cc: Russell King <linux@arm.linux.org.uk> Cc: Nicolas Pitre <nico@fluxnic.net> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Jonathan Cameron <jic23@cam.ac.uk> Cc: stable@vger.kernel.org Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>

Avoid circular mutex lock by pushing the dev->lock to the .fini callback on each extension. As em28xx-dvb, em28xx-alsa and em28xx-rc have their own data structures, and don't touch at the common structure during .fini, only em28xx-v4l needs to be locked. [ 90.994317] ====================================================== [ 90.994356] [ INFO: possible circular locking dependency detected ] [ 90.994395] 3.13.0-rc1+ #24 Not tainted [ 90.994427] ------------------------------------------------------- [ 90.994458] khubd/54 is trying to acquire lock: [ 90.994490] (&card->controls_rwsem){++++.+}, at: [<ffffffffa0177b08>] snd_ctl_dev_free+0x28/0x60 [snd] [ 90.994656] [ 90.994656] but task is already holding lock: [ 90.994688] (&dev->lock){+.+.+.}, at: [<ffffffffa040db81>] em28xx_close_extension+0x31/0x90 [em28xx] [ 90.994843] [ 90.994843] which lock already depends on the new lock. [ 90.994843] [ 90.994874] [ 90.994874] the existing dependency chain (in reverse order) is: [ 90.994905] -> #1 (&dev->lock){+.+.+.}: [ 90.995057] [<ffffffff810b8fa3>] __lock_acquire+0xb43/0x1330 [ 90.995121] [<ffffffff810b9f82>] lock_acquire+0xa2/0x120 [ 90.995182] [<ffffffff816a5b6c>] mutex_lock_nested+0x5c/0x3c0 [ 90.995245] [<ffffffffa0422cca>] em28xx_vol_put_mute+0x1ba/0x1d0 [em28xx_alsa] [ 90.995309] [<ffffffffa017813d>] snd_ctl_elem_write+0xfd/0x140 [snd] [ 90.995376] [<ffffffffa01791c2>] snd_ctl_ioctl+0xe2/0x810 [snd] [ 90.995442] [<ffffffff811db8b0>] do_vfs_ioctl+0x300/0x520 [ 90.995504] [<ffffffff811dbb51>] SyS_ioctl+0x81/0xa0 [ 90.995568] [<ffffffff816b1929>] system_call_fastpath+0x16/0x1b [ 90.995630] -> #0 (&card->controls_rwsem){++++.+}: [ 90.995780] [<ffffffff810b7a47>] check_prevs_add+0x947/0x950 [ 90.995841] [<ffffffff810b8fa3>] __lock_acquire+0xb43/0x1330 [ 90.995901] [<ffffffff810b9f82>] lock_acquire+0xa2/0x120 [ 90.995962] [<ffffffff816a762b>] down_write+0x3b/0xa0 [ 90.996022] [<ffffffffa0177b08>] snd_ctl_dev_free+0x28/0x60 [snd] [ 90.996088] [<ffffffffa017a255>] snd_device_free+0x65/0x140 [snd] [ 90.996154] [<ffffffffa017a751>] snd_device_free_all+0x61/0xa0 [snd] [ 90.996219] [<ffffffffa0173af4>] snd_card_do_free+0x14/0x130 [snd] [ 90.996283] [<ffffffffa0173f14>] snd_card_free+0x84/0x90 [snd] [ 90.996349] [<ffffffffa0423397>] em28xx_audio_fini+0x97/0xb0 [em28xx_alsa] [ 90.996411] [<ffffffffa040dba6>] em28xx_close_extension+0x56/0x90 [em28xx] [ 90.996475] [<ffffffffa040f639>] em28xx_usb_disconnect+0x79/0x90 [em28xx] [ 90.996539] [<ffffffff814a06e7>] usb_unbind_interface+0x67/0x1d0 [ 90.996620] [<ffffffff8142920f>] __device_release_driver+0x7f/0xf0 [ 90.996682] [<ffffffff814292a5>] device_release_driver+0x25/0x40 [ 90.996742] [<ffffffff81428b0c>] bus_remove_device+0x11c/0x1a0 [ 90.996801] [<ffffffff81425536>] device_del+0x136/0x1d0 [ 90.996863] [<ffffffff8149e0c0>] usb_disable_device+0xb0/0x290 [ 90.996923] [<ffffffff814930c5>] usb_disconnect+0xb5/0x1d0 [ 90.996984] [<ffffffff81495ab6>] hub_port_connect_change+0xd6/0xad0 [ 90.997044] [<ffffffff814967c3>] hub_events+0x313/0x9b0 [ 90.997105] [<ffffffff81496e95>] hub_thread+0x35/0x170 [ 90.997165] [<ffffffff8108ea2f>] kthread+0xff/0x120 [ 90.997226] [<ffffffff816b187c>] ret_from_fork+0x7c/0xb0 [ 90.997287] [ 90.997287] other info that might help us debug this: [ 90.997287] [ 90.997318] Possible unsafe locking scenario: [ 90.997318] [ 90.997348] CPU0 CPU1 [ 90.997378] ---- ---- [ 90.997408] lock(&dev->lock); [ 90.997497] lock(&card->controls_rwsem); [ 90.997607] lock(&dev->lock); [ 90.997697] lock(&card->controls_rwsem); [ 90.997786] [ 90.997786] *** DEADLOCK *** [ 90.997786] [ 90.997817] 5 locks held by khubd/54: [ 90.997847] #0: (&__lockdep_no_validate__){......}, at: [<ffffffff81496564>] hub_events+0xb4/0x9b0 [ 90.998025] #1: (&__lockdep_no_validate__){......}, at: [<ffffffff81493076>] usb_disconnect+0x66/0x1d0 [ 90.998204] #2: (&__lockdep_no_validate__){......}, at: [<ffffffff8142929d>] device_release_driver+0x1d/0x40 [ 90.998383] #3: (em28xx_devlist_mutex){+.+.+.}, at: [<ffffffffa040db77>] em28xx_close_extension+0x27/0x90 [em28xx] [ 90.998567] #4: (&dev->lock){+.+.+.}, at: [<ffffffffa040db81>] em28xx_close_extension+0x31/0x90 [em28xx] Reviewed-by: Frank Schäfer <fschaefer.oss@googlemail.com> Tested-by: Antti Palosaari <crope@iki.fi> Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

11aa: barf on teardown #24

11aa: barf on teardown #24

twpedersen commented Jan 11, 2013

11aa: barf on teardown #24

11aa: barf on teardown #24

Comments

twpedersen commented Jan 11, 2013