complete lockup #3217

Closed
tomposmiko opened this issue Mar 24, 2015 · 9 comments


@tomposmiko

The very first one from yesterday in the log files:

Mar 23 14:22:29 v303 kernel: [3386006.881486] ------------[ cut here ]------------
Mar 23 14:22:29 v303 kernel: [3386006.881497] WARNING: CPU: 6 PID: 6992 at /build/buildd/linux-3.13.0/kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xd0()
Mar 23 14:22:29 v303 kernel: [3386006.881499] Watchdog detected hard LOCKUP on cpu 6
Mar 23 14:22:29 v303 kernel: [3386006.881501] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp veth ip6table_filter ip6_tables iptable_filter ip_tables x_tables zram(C) bridge stp llc intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul joydev glue_helper ablk_helper cryptd serio_raw lpc_ich ipmi_si nct6775 hwmon_vid video jc42 mac_hid coretemp zfs(POX) zunicode(POX) zavl(POX) zcommon(POX) znvpair(POX) spl(OX) btrfs libcrc32c raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor ses enclosure raid6_pq raid1 e1000e hid_generic mpt2sas raid0 ptp raid_class usbhid ahci multipath psmouse hid libahci pps_core scsi_transport_sas linear
Mar 23 14:22:29 v303 kernel: [3386006.881568] CPU: 6 PID: 6992 Comm: chgrp Tainted: P         C OX 3.13.0-45-generic #74-Ubuntu
Mar 23 14:22:29 v303 kernel: [3386006.881570] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.10 01/09/2014
Mar 23 14:22:29 v303 kernel: [3386006.881572]  0000000000000009 ffff88082fd85c20 ffffffff81720eb6 ffff88082fd85c68
Mar 23 14:22:29 v303 kernel: [3386006.881577]  ffff88082fd85c58 ffffffff810677cd ffff880803a78000 0000000000000000
Mar 23 14:22:29 v303 kernel: [3386006.881580]  ffff88082fd85d88 0000000000000000 ffff88082fd85ef8 ffff88082fd85cb8
Mar 23 14:22:29 v303 kernel: [3386006.881584] Call Trace:
Mar 23 14:22:29 v303 kernel: [3386006.881586]  <NMI>  [<ffffffff81720eb6>] dump_stack+0x45/0x56
Mar 23 14:22:29 v303 kernel: [3386006.881596]  [<ffffffff810677cd>] warn_slowpath_common+0x7d/0xa0
Mar 23 14:22:29 v303 kernel: [3386006.881599]  [<ffffffff8106783c>] warn_slowpath_fmt+0x4c/0x50
Mar 23 14:22:29 v303 kernel: [3386006.881603]  [<ffffffff8110dae0>] ? restart_watchdog_hrtimer+0x50/0x50
Mar 23 14:22:29 v303 kernel: [3386006.881607]  [<ffffffff8110db7c>] watchdog_overflow_callback+0x9c/0xd0
Mar 23 14:22:29 v303 kernel: [3386006.881611]  [<ffffffff8114583e>] __perf_event_overflow+0x8e/0x240
Mar 23 14:22:29 v303 kernel: [3386006.881617]  [<ffffffff81029258>] ? x86_perf_event_set_period+0xe8/0x150
Mar 23 14:22:29 v303 kernel: [3386006.881620]  [<ffffffff81146354>] perf_event_overflow+0x14/0x20
Mar 23 14:22:29 v303 kernel: [3386006.881624]  [<ffffffff810306bd>] intel_pmu_handle_irq+0x1ed/0x3f0
Mar 23 14:22:29 v303 kernel: [3386006.881629]  [<ffffffff811883a1>] ? unmap_kernel_range_noflush+0x11/0x20
Mar 23 14:22:29 v303 kernel: [3386006.881634]  [<ffffffff8172ae1b>] perf_event_nmi_handler+0x2b/0x50
Mar 23 14:22:29 v303 kernel: [3386006.881637]  [<ffffffff8172a638>] nmi_handle.isra.3+0x88/0x180
Mar 23 14:22:29 v303 kernel: [3386006.881641]  [<ffffffff8172a800>] do_nmi+0xd0/0x340
Mar 23 14:22:29 v303 kernel: [3386006.881644]  [<ffffffff81729aa1>] end_repeat_nmi+0x1e/0x2e
Mar 23 14:22:29 v303 kernel: [3386006.881648]  [<ffffffff81728fba>] ? _raw_spin_lock_irq+0x3a/0x60
Mar 23 14:22:29 v303 kernel: [3386006.881651]  [<ffffffff81728fba>] ? _raw_spin_lock_irq+0x3a/0x60
Mar 23 14:22:29 v303 kernel: [3386006.881654]  [<ffffffff81728fba>] ? _raw_spin_lock_irq+0x3a/0x60
Mar 23 14:22:29 v303 kernel: [3386006.881656]  <<EOE>>  [<ffffffff817280af>] rwsem_down_read_failed+0x3f/0x150
Mar 23 14:22:29 v303 kernel: [3386006.881661]  [<ffffffff817276f2>] ? mutex_lock+0x12/0x2f
Mar 23 14:22:29 v303 kernel: [3386006.881667]  [<ffffffff81371d34>] call_rwsem_down_read_failed+0x14/0x30
Mar 23 14:22:29 v303 kernel: [3386006.881672]  [<ffffffff81727930>] ? down_read+0x20/0x30
Mar 23 14:22:29 v303 kernel: [3386006.881721]  [<ffffffffa036a870>] zap_get_leaf_byblk+0xd0/0x2c0 [zfs]
Mar 23 14:22:29 v303 kernel: [3386006.881724]  [<ffffffff817276f2>] ? mutex_lock+0x12/0x2f
Mar 23 14:22:29 v303 kernel: [3386006.881764]  [<ffffffffa036aac4>] zap_deref_leaf+0x64/0x80 [zfs]
Mar 23 14:22:29 v303 kernel: [3386006.881802]  [<ffffffffa036c697>] fzap_cursor_retrieve+0x107/0x2a0 [zfs]
Mar 23 14:22:29 v303 kernel: [3386006.881840]  [<ffffffffa036fbbc>] zap_cursor_retrieve+0x5c/0x2f0 [zfs]
Mar 23 14:22:29 v303 kernel: [3386006.881865]  [<ffffffffa0311ea5>] ? dmu_prefetch+0x235/0x2c0 [zfs]
Mar 23 14:22:29 v303 kernel: [3386006.881906]  [<ffffffffa038f9be>] zfs_readdir+0x14e/0x4c0 [zfs]
Mar 23 14:22:29 v303 kernel: [3386006.881911]  [<ffffffff811845c8>] ? page_add_new_anon_rmap+0xd8/0x170
Mar 23 14:22:29 v303 kernel: [3386006.881915]  [<ffffffff8117a0e2>] ? handle_mm_fault+0x882/0xf00
Mar 23 14:22:29 v303 kernel: [3386006.881920]  [<ffffffff8117f50a>] ? vma_merge+0x10a/0x340
Mar 23 14:22:29 v303 kernel: [3386006.881926]  [<ffffffff8172cf54>] ? __do_page_fault+0x204/0x560
Mar 23 14:22:29 v303 kernel: [3386006.881973]  [<ffffffffa03a8b1c>] zpl_iterate+0x3c/0x60 [zfs]
Mar 23 14:22:29 v303 kernel: [3386006.881977]  [<ffffffff811d1155>] iterate_dir+0xa5/0xe0
Mar 23 14:22:29 v303 kernel: [3386006.881982]  [<ffffffff8110e7cb>] ? __secure_computing+0x6b/0x240
Mar 23 14:22:29 v303 kernel: [3386006.881985]  [<ffffffff811d15b2>] SyS_getdents+0x92/0x120
Mar 23 14:22:29 v303 kernel: [3386006.881987]  [<ffffffff811d1270>] ? fillonedir+0xe0/0xe0
Mar 23 14:22:29 v303 kernel: [3386006.881991]  [<ffffffff81731b1c>] ? tracesys+0x7e/0xe6
Mar 23 14:22:29 v303 kernel: [3386006.881994]  [<ffffffff81731b7f>] tracesys+0xe1/0xe6
Mar 23 14:22:29 v303 kernel: [3386006.881997] ---[ end trace f6b2d42f8f85665e ]---
[  418.606052] ------------[ cut here ]------------
[  418.606058] WARNING: CPU: 1 PID: 6133 at /build/buildd/linux-3.13.0/kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xd0()
[  418.606058] Watchdog detected hard LOCKUP on cpu 1
[  418.606059] Modules linked in: ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp veth ip6table_filter ip6_tables iptable_filter ip_tables x_tables zram(C) bridge stp llc intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper serio_raw ablk_helper cryptd lpc_ich joydev nct6775 hwmon_vid shpchp jc42 ipmi_si coretemp video mac_hid zfs(POX) zunicode(POX) zavl(POX) zcommon(POX) znvpair(POX) spl(OX) btrfs libcrc32c raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq ses enclosure raid1 mpt2sas e1000e hid_generic raid0 raid_class ptp usbhid multipath hid psmouse pata_acpi scsi_transport_sas pps_core linear
[  418.606091] CPU: 1 PID: 6133 Comm: rm Tainted: P         C OX 3.13.0-48-generic #80-Ubuntu
[  418.606092] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0c 10/17/2013
[  418.606093]  0000000000000009 ffff88082fc45c20 ffffffff81721506 ffff88082fc45c68
[  418.606094]  ffff88082fc45c58 ffffffff810677dd ffff880803f50000 0000000000000000
[  418.606096]  ffff88082fc45d88 0000000000000000 ffff88082fc45ef8 ffff88082fc45cb8
[  418.606098] Call Trace:
[  418.606099]  <NMI>  [<ffffffff81721506>] dump_stack+0x45/0x56
[  418.606104]  [<ffffffff810677dd>] warn_slowpath_common+0x7d/0xa0
[  418.606105]  [<ffffffff8106784c>] warn_slowpath_fmt+0x4c/0x50
[  418.606107]  [<ffffffff8110dde0>] ? restart_watchdog_hrtimer+0x50/0x50
[  418.606109]  [<ffffffff8110de7c>] watchdog_overflow_callback+0x9c/0xd0
[  418.606111]  [<ffffffff81145b3e>] __perf_event_overflow+0x8e/0x240
[  418.606114]  [<ffffffff81029348>] ? x86_perf_event_set_period+0xe8/0x150
[  418.606115]  [<ffffffff81146654>] perf_event_overflow+0x14/0x20
[  418.606117]  [<ffffffff810307ad>] intel_pmu_handle_irq+0x1ed/0x3f0
[  418.606120]  [<ffffffff81188731>] ? unmap_kernel_range_noflush+0x11/0x20
[  418.606122]  [<ffffffff8172b45b>] perf_event_nmi_handler+0x2b/0x50
[  418.606124]  [<ffffffff8172ac78>] nmi_handle.isra.3+0x88/0x180
[  418.606125]  [<ffffffff8172ae40>] do_nmi+0xd0/0x340
[  418.606127]  [<ffffffff8172a0e1>] end_repeat_nmi+0x1e/0x2e
[  418.606129]  [<ffffffff817295ff>] ? _raw_spin_lock_irq+0x3f/0x60
[  418.606130]  [<ffffffff817295ff>] ? _raw_spin_lock_irq+0x3f/0x60
[  418.606132]  [<ffffffff817295ff>] ? _raw_spin_lock_irq+0x3f/0x60
[  418.606132]  <<EOE>>  [<ffffffff817286ef>] rwsem_down_read_failed+0x3f/0x150
[  418.606135]  [<ffffffff81727d32>] ? mutex_lock+0x12/0x2f
[  418.606138]  [<ffffffff813722b4>] call_rwsem_down_read_failed+0x14/0x30
[  418.606140]  [<ffffffff81727f70>] ? down_read+0x20/0x30
[  418.606164]  [<ffffffffa036e870>] zap_get_leaf_byblk+0xd0/0x2c0 [zfs]
[  418.606165]  [<ffffffff81727d32>] ? mutex_lock+0x12/0x2f
[  418.606182]  [<ffffffffa036eac4>] zap_deref_leaf+0x64/0x80 [zfs]
[  418.606198]  [<ffffffffa0370697>] fzap_cursor_retrieve+0x107/0x2a0 [zfs]
[  418.606214]  [<ffffffffa0373bbc>] zap_cursor_retrieve+0x5c/0x2f0 [zfs]
[  418.606225]  [<ffffffffa0315ea5>] ? dmu_prefetch+0x235/0x2c0 [zfs]
[  418.606242]  [<ffffffffa03939be>] zfs_readdir+0x14e/0x4c0 [zfs]
[  418.606249]  [<ffffffffa0293c1a>] ? tsd_hash_search.isra.1+0x10a/0x1b0 [spl]
[  418.606253]  [<ffffffffa0294f0c>] ? tsd_exit+0x28c/0x2c0 [spl]
[  418.606255]  [<ffffffff81727d32>] ? mutex_lock+0x12/0x2f
[  418.606257]  [<ffffffff810f40d2>] ? from_kgid_munged+0x12/0x20
[  418.606275]  [<ffffffffa03acb1c>] zpl_iterate+0x3c/0x60 [zfs]
[  418.606277]  [<ffffffff811d14e5>] iterate_dir+0xa5/0xe0
[  418.606278]  [<ffffffff8110eacb>] ? __secure_computing+0x6b/0x240
[  418.606279]  [<ffffffff811d1942>] SyS_getdents+0x92/0x120
[  418.606281]  [<ffffffff811d1600>] ? fillonedir+0xe0/0xe0
[  418.606282]  [<ffffffff8173216c>] ? tracesys+0x7e/0xe6
[  418.606284]  [<ffffffff817321cf>] tracesys+0xe1/0xe6
[  418.606285] ---[ end trace 7b65ff1c8c7a5c21 ]---
[  463.915704] INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 7, t=15033 jiffies, g=24982, c=24981, q=0)
[  463.918248] sending NMI to all CPUs:
[  463.918252] NMI backtrace for cpu 3
[  463.918264] CPU: 3 PID: 0 Comm: swapper/3 Tainted: P        WC OX 3.13.0-48-generic #80-Ubuntu
[  463.918265] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0c 10/17/2013
[  463.918267] task: ffff880803ee4800 ti: ffff880803eec000 task.ti: ffff880803eec000
[  463.918268] RIP: 0010:[<ffffffff813e87a8>]  [<ffffffff813e87a8>] intel_idle+0xd8/0x140
[  463.918273] RSP: 0018:ffff880803eede28  EFLAGS: 00000046
[  463.918275] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
[  463.918276] RDX: 0000000000000000 RSI: ffffffff81c93de0 RDI: 0000000001c0e000
[  463.918277] RBP: ffff880803eede50 R08: ffff88082fcd030c R09: 0000000000000018
[  463.918278] R10: 0000000000003d42 R11: 0000000000006480 R12: 0000000000000004
[  463.918279] R13: 0000000000000020 R14: 0000000000000003 R15: ffffffff81c93f58
[  463.918281] FS:  0000000000000000(0000) GS:ffff88082fcc0000(0000) knlGS:0000000000000000
[  463.918282] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  463.918284] CR2: 00007f89dde83000 CR3: 0000000001c0e000 CR4: 00000000001407e0
[  463.918285] Stack:
[  463.918286]  0000000303eede50 ffff88082fcda700 ffffffff81c93de0 0000006c160bd9e8
[  463.918289]  0000000000000004 ffff880803eede88 ffffffff815d3e00 ffff880803eedf38
[  463.918292]  ffff88082fcda700 0000000000000004 0000000000000003 ffffffff81c93de0
[  463.918294] Call Trace:
[  463.918299]  [<ffffffff815d3e00>] cpuidle_enter_state+0x40/0xc0
[  463.918301]  [<ffffffff815d3f39>] cpuidle_idle_call+0xb9/0x1f0
[  463.918305]  [<ffffffff8101d35e>] arch_cpu_idle+0xe/0x30
[  463.918308]  [<ffffffff810befa5>] cpu_startup_entry+0xc5/0x290
[  463.918311]  [<ffffffff8104150d>] start_secondary+0x21d/0x2d0
[  463.918312] Code: 48 89 d1 48 2d c8 1f 00 00 0f 01 c8 0f ae f0 65 48 8b 04 25 30 b8 00 00 48 8b 80 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f 01 c9 <85> 1d 2a b6 8a 00 75 0e 48 8d 75 dc bf 05 00 00 00 e8 92 b5 ce 
[  463.918335] NMI backtrace for cpu 1
[  463.918338] CPU: 1 PID: 6133 Comm: rm Tainted: P        WC OX 3.13.0-48-generic #80-Ubuntu
[  463.918340] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0c 10/17/2013
[  463.918341] task: ffff88059b033000 ti: ffff88056de52000 task.ti: ffff88056de52000
[  463.918343] RIP: 0010:[<ffffffff817295ff>]  [<ffffffff817295ff>] _raw_spin_lock_irq+0x3f/0x60
[  463.918347] RSP: 0018:ffff88056de53aa8  EFLAGS: 00000002
[  463.918348] RAX: 0000000000000631 RBX: ffff88059b033000 RCX: 0000000000005841
[  463.918350] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88057274ff08
[  463.918351] RBP: ffff88056de53aa8 R08: 0000000000000202 R09: ffff88082fc56340
[  463.918352] R10: ffffea0016666000 R11: 0000000000000000 R12: ffff88057274ff00
[  463.918353] R13: ffff88057274ff08 R14: 0000000000000000 R15: ffff88057274ff00
[  463.918355] FS:  00007f6e50f40740(0000) GS:ffff88082fc40000(0000) knlGS:0000000000000000
[  463.918356] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  463.918361] CR2: 00007f6e509e0f50 CR3: 0000000575f9f000 CR4: 00000000001407e0
[  463.918365] Stack:
[  463.918370]  ffff88056de53b00 ffffffff817286ef ffff88057973e9e0 ffffffff81727d32
[  463.918386]  ffff88057973e9e0 ffff88059b033000 ffffffff00000001 000000000000000a
[  463.918402]  ffff88057274ff00 0000000000000002 ffff88056de53d38 ffff88056de53b60
[  463.918418] Call Trace:
[  463.918427]  [<ffffffff817286ef>] rwsem_down_read_failed+0x3f/0x150
[  463.918436]  [<ffffffff81727d32>] ? mutex_lock+0x12/0x2f
[  463.918445]  [<ffffffff813722b4>] call_rwsem_down_read_failed+0x14/0x30
[  463.918453]  [<ffffffff81727f70>] ? down_read+0x20/0x30
[  463.918485]  [<ffffffffa036e870>] zap_get_leaf_byblk+0xd0/0x2c0 [zfs]
[  463.918487]  [<ffffffff81727d32>] ? mutex_lock+0x12/0x2f
[  463.918507]  [<ffffffffa036eac4>] zap_deref_leaf+0x64/0x80 [zfs]
[  463.918528]  [<ffffffffa0370697>] fzap_cursor_retrieve+0x107/0x2a0 [zfs]
[  463.918553]  [<ffffffffa0373bbc>] zap_cursor_retrieve+0x5c/0x2f0 [zfs]
[  463.918567]  [<ffffffffa0315ea5>] ? dmu_prefetch+0x235/0x2c0 [zfs]
[  463.918590]  [<ffffffffa03939be>] zfs_readdir+0x14e/0x4c0 [zfs]
[  463.918599]  [<ffffffffa0293c1a>] ? tsd_hash_search.isra.1+0x10a/0x1b0 [spl]
[  463.918605]  [<ffffffffa0294f0c>] ? tsd_exit+0x28c/0x2c0 [spl]
[  463.918607]  [<ffffffff81727d32>] ? mutex_lock+0x12/0x2f
[  463.918610]  [<ffffffff810f40d2>] ? from_kgid_munged+0x12/0x20
[  463.918633]  [<ffffffffa03acb1c>] zpl_iterate+0x3c/0x60 [zfs]
[  463.918635]  [<ffffffff811d14e5>] iterate_dir+0xa5/0xe0
[  463.918638]  [<ffffffff8110eacb>] ? __secure_computing+0x6b/0x240
[  463.918640]  [<ffffffff811d1942>] SyS_getdents+0x92/0x120
[  463.918642]  [<ffffffff811d1600>] ? fillonedir+0xe0/0xe0
[  463.918644]  [<ffffffff8173216c>] ? tracesys+0x7e/0xe6
[  463.918646]  [<ffffffff817321cf>] tracesys+0xe1/0xe6
[  463.918647] Code: 00 f0 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f b7 f2 b8 00 80 00 00 eb 0d 66 0f 1f 44 00 00 f3 90 83 e8 01 74 0a <0f> b7 0f 66 39 ca 75 f1 5d c3 0f 1f 80 00 00 00 00 eb d9 90 90 
[  463.918670] NMI backtrace for cpu 7
[  463.918673] CPU: 7 PID: 25712 Comm: java Tainted: P        WC OX 3.13.0-48-generic #80-Ubuntu
[  463.918674] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0c 10/17/2013
[  463.918676] task: ffff8805bc9d8000 ti: ffff8805ad4b6000 task.ti: ffff8805ad4b6000
[  463.918677] RIP: 0010:[<ffffffff81370e42>]  [<ffffffff81370e42>] __const_udelay+0x12/0x30
[  463.918680] RSP: 0018:ffff88082fdc3df0  EFLAGS: 00000046
[  463.918682] RAX: 0000000001062560 RBX: 0000000000002710 RCX: 0000000000000004
[  463.918683] RDX: 0000000000c96c9c RSI: 0000000000000100 RDI: 0000000000418958
[  463.918684] RBP: ffff88082fdc3e08 R08: 0000000000000082 R09: 0000000000000439
[  463.918686] R10: 0000000000000000 R11: ffff88082fdc3b2e R12: ffffffff81c4e180
[  463.918687] R13: ffffffff81d143a0 R14: ffffffff81c4e180 R15: 0000000000000007
[  463.918689] FS:  00007f39f0415700(0000) GS:ffff88082fdc0000(0000) knlGS:0000000000000000
[  463.918690] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  463.918691] CR2: 00007f50b87f7d00 CR3: 000000072e796000 CR4: 00000000001407e0
[  463.918692] Stack:
[  463.918693]  ffff88082fdc3e08 ffffffff8104524f ffff88082fdcd840 ffff88082fdc3e60
[  463.918696]  ffffffff810cb0c1 ffffffff81c4e180 ffffffff00000001 0000000000000000
[  463.918698]  0000000000000001 ffff8805bc9d8000 0000000000000000 0000000000000007
[  463.918701] Call Trace:
[  463.918702]  <IRQ> 
[  463.918703]  [<ffffffff8104524f>] ? arch_trigger_all_cpu_backtrace+0x8f/0xb0
[  463.918709]  [<ffffffff810cb0c1>] rcu_check_callbacks+0x631/0x650
[  463.918712]  [<ffffffff810763a7>] update_process_times+0x47/0x70
[  463.918715]  [<ffffffff810d62a5>] tick_sched_handle.isra.17+0x25/0x60
[  463.918717]  [<ffffffff810d6321>] tick_sched_timer+0x41/0x60
[  463.918721]  [<ffffffff8108e7c7>] __run_hrtimer+0x77/0x1d0
[  463.918723]  [<ffffffff810d62e0>] ? tick_sched_handle.isra.17+0x60/0x60
[  463.918726]  [<ffffffff8108ef8f>] hrtimer_interrupt+0xef/0x230
[  463.918729]  [<ffffffff81043647>] local_apic_timer_interrupt+0x37/0x60
[  463.918732]  [<ffffffff8173438f>] smp_apic_timer_interrupt+0x3f/0x60
[  463.918735]  [<ffffffff81732d1d>] apic_timer_interrupt+0x6d/0x80
[  463.918736]  <EOI> 
[  463.918737]  [<ffffffff81639b35>] ? sk_run_filter+0x295/0x700
[  463.918742]  [<ffffffff8110eb7d>] ? __secure_computing+0x11d/0x240
[  463.918744]  [<ffffffff8110eacb>] ? __secure_computing+0x6b/0x240
[  463.918747]  [<ffffffff81021157>] syscall_trace_enter+0x197/0x250
[  463.918750]  [<ffffffff8173216c>] tracesys+0x7e/0xe6
[  463.918751] Code: 89 e5 ff 15 f9 d2 91 00 5d c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 8d 04 bd 00 00 00 00 65 48 8b 14 25 e0 2c 01 00 <48> 8d 0c 12 48 c1 e2 06 48 89 e5 48 29 ca f7 e2 48 8d 7a 01 ff 
[  463.918773] NMI backtrace for cpu 5
[  463.918778] CPU: 5 PID: 0 Comm: swapper/5 Tainted: P        WC OX 3.13.0-48-generic #80-Ubuntu
[  463.918783] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0c 10/17/2013
[  463.918784] task: ffff880803ef0000 ti: ffff880803ef8000 task.ti: ffff880803ef8000
[  463.918785] RIP: 0010:[<ffffffff813e87a8>]  [<ffffffff813e87a8>] intel_idle+0xd8/0x140
[  463.918791] RSP: 0018:ffff880803ef9e28  EFLAGS: 00000046
[  463.918795] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
[  463.918800] RDX: 0000000000000000 RSI: ffffffff81c93de0 RDI: 0000000001c0e000
[  463.918805] RBP: ffff880803ef9e50 R08: ffff88082fd50310 R09: 0000000000000018
[  463.918809] R10: 0000000000030444 R11: 0000000000049e17 R12: 0000000000000004
[  463.918814] R13: 0000000000000020 R14: 0000000000000003 R15: ffffffff81c93f58
[  463.918818] FS:  0000000000000000(0000) GS:ffff88082fd40000(0000) knlGS:0000000000000000
[  463.918823] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  463.918827] CR2: 00007fe47331776c CR3: 0000000001c0e000 CR4: 00000000001407e0
[  463.918832] Stack:
[  463.918836]  0000000503ef9e50 ffff88082fd5a700 ffffffff81c93de0 0000006c13b80837
[  463.918843]  0000000000000004 ffff880803ef9e88 ffffffff815d3e00 ffff880803ef9f38
[  463.918846]  ffff88082fd5a700 0000000000000004 0000000000000005 ffffffff81c93de0
[  463.918848] Call Trace:
[  463.918851]  [<ffffffff815d3e00>] cpuidle_enter_state+0x40/0xc0
[  463.918854]  [<ffffffff815d3f39>] cpuidle_idle_call+0xb9/0x1f0
[  463.918856]  [<ffffffff8101d35e>] arch_cpu_idle+0xe/0x30
[  463.918859]  [<ffffffff810befa5>] cpu_startup_entry+0xc5/0x290
[  463.918861]  [<ffffffff8104150d>] start_secondary+0x21d/0x2d0
[  463.918862] Code: 48 89 d1 48 2d c8 1f 00 00 0f 01 c8 0f ae f0 65 48 8b 04 25 30 b8 00 00 48 8b 80 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f 01 c9 <85> 1d 2a b6 8a 00 75 0e 48 8d 75 dc bf 05 00 00 00 e8 92 b5 ce 
[  463.918885] NMI backtrace for cpu 0
[  463.918888] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P        WC OX 3.13.0-48-generic #80-Ubuntu
[  463.918890] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0c 10/17/2013
[  463.918891] task: ffffffff81c15480 ti: ffffffff81c00000 task.ti: ffffffff81c00000
[  463.918896] RIP: 0010:[<ffffffff813e87a8>]  [<ffffffff813e87a8>] intel_idle+0xd8/0x140
[  463.918905] RSP: 0018:ffffffff81c01e38  EFLAGS: 00000046
[  463.918909] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
[  463.918910] RDX: 0000000000000000 RSI: ffffffff81c93de0 RDI: 0000000001c0e000
[  463.918911] RBP: ffffffff81c01e60 R08: ffff88082fc10310 R09: 0000000000000018
[  463.918913] R10: 0000000000005225 R11: 000000000000daa8 R12: 0000000000000004
[  463.918914] R13: 0000000000000020 R14: 0000000000000003 R15: ffffffff81c93f58
[  463.918915] FS:  0000000000000000(0000) GS:ffff88082fc00000(0000) knlGS:0000000000000000
[  463.918917] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  463.918920] CR2: 00007f3cbe9ae000 CR3: 0000000001c0e000 CR4: 00000000001407f0
[  463.918925] Stack:
[  463.918926]  0000000081c01e60 ffff88082fc1a700 ffffffff81c93de0 0000006c14cb5203
[  463.918929]  0000000000000004 ffffffff81c01e98 ffffffff815d3e00 ffffffffffffffff
[  463.918932]  ffff88082fc1a700 0000000000000004 0000000000000000 ffffffff81c93de0
[  463.918936] Call Trace:
[  463.918939]  [<ffffffff815d3e00>] cpuidle_enter_state+0x40/0xc0
[  463.918944]  [<ffffffff815d3f39>] cpuidle_idle_call+0xb9/0x1f0
[  463.918947]  [<ffffffff8101d35e>] arch_cpu_idle+0xe/0x30
[  463.918950]  [<ffffffff810befa5>] cpu_startup_entry+0xc5/0x290
[  463.918953]  [<ffffffff8170f9c7>] rest_init+0x77/0x80
[  463.918957]  [<ffffffff81d35f70>] start_kernel+0x438/0x443
[  463.918959]  [<ffffffff81d35941>] ? repair_env_string+0x5c/0x5c
[  463.918962]  [<ffffffff81d35120>] ? early_idt_handlers+0x120/0x120
[  463.918964]  [<ffffffff81d355ee>] x86_64_start_reservations+0x2a/0x2c
[  463.918966]  [<ffffffff81d35733>] x86_64_start_kernel+0x143/0x152
[  463.918967] Code: 48 89 d1 48 2d c8 1f 00 00 0f 01 c8 0f ae f0 65 48 8b 04 25 30 b8 00 00 48 8b 80 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f 01 c9 <85> 1d 2a b6 8a 00 75 0e 48 8d 75 dc bf 05 00 00 00 e8 92 b5 ce 
[  463.918996] NMI backtrace for cpu 4
[  463.918998] CPU: 4 PID: 0 Comm: swapper/4 Tainted: P        WC OX 3.13.0-48-generic #80-Ubuntu
[  463.918999] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0c 10/17/2013
[  463.919000] task: ffff880803ee6000 ti: ffff880803eee000 task.ti: ffff880803eee000
[  463.919002] RIP: 0010:[<ffffffff813e87a8>]  [<ffffffff813e87a8>] intel_idle+0xd8/0x140
[  463.919004] RSP: 0018:ffff880803eefe28  EFLAGS: 00000046
[  463.919005] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
[  463.919008] RDX: 0000000000000000 RSI: ffffffff81c93de0 RDI: 0000000001c0e000
[  463.919012] RBP: ffff880803eefe50 R08: ffff88082fd10310 R09: 0000000000000014
[  463.919016] R10: 000000000000c37d R11: 0000000000024a45 R12: 0000000000000004
[  463.919017] R13: 0000000000000020 R14: 0000000000000003 R15: ffffffff81c93f58
[  463.919020] FS:  0000000000000000(0000) GS:ffff88082fd00000(0000) knlGS:0000000000000000
[  463.919021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  463.919023] CR2: 00007f9dd73d2af8 CR3: 0000000001c0e000 CR4: 00000000001407e0
[  463.919025] Stack:
[  463.919028]  0000000403eefe50 ffff88082fd1a700 ffffffff81c93de0 0000006c158f5dc1
[  463.919030]  0000000000000004 ffff880803eefe88 ffffffff815d3e00 ffff880803eeff38
[  463.919034]  ffff88082fd1a700 0000000000000004 0000000000000004 ffffffff81c93de0
[  463.919036] Call Trace:
[  463.919039]  [<ffffffff815d3e00>] cpuidle_enter_state+0x40/0xc0
[  463.919041]  [<ffffffff815d3f39>] cpuidle_idle_call+0xb9/0x1f0
[  463.919044]  [<ffffffff8101d35e>] arch_cpu_idle+0xe/0x30
[  463.919046]  [<ffffffff810befa5>] cpu_startup_entry+0xc5/0x290
[  463.919048]  [<ffffffff8104150d>] start_secondary+0x21d/0x2d0
[  463.919050] Code: 48 89 d1 48 2d c8 1f 00 00 0f 01 c8 0f ae f0 65 48 8b 04 25 30 b8 00 00 48 8b 80 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f 01 c9 <85> 1d 2a b6 8a 00 75 0e 48 8d 75 dc bf 05 00 00 00 e8 92 b5 ce 
[  463.919071] NMI backtrace for cpu 2
[  463.919075] CPU: 2 PID: 0 Comm: swapper/2 Tainted: P        WC OX 3.13.0-48-generic #80-Ubuntu
[  463.919077] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0c 10/17/2013
[  463.919085] task: ffff880803ee3000 ti: ffff880803eea000 task.ti: ffff880803eea000
[  463.919089] RIP: 0010:[<ffffffff813e87a8>]  [<ffffffff813e87a8>] intel_idle+0xd8/0x140
[  463.919093] RSP: 0018:ffff880803eebe28  EFLAGS: 00000046
[  463.919095] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
[  463.919096] RDX: 0000000000000000 RSI: ffffffff81c93de0 RDI: 0000000001c0e000
[  463.919098] RBP: ffff880803eebe50 R08: ffff88082fc90310 R09: 0000000000000018
[  463.919099] R10: 0000000000004ca6 R11: 00000000000121b2 R12: 0000000000000004
[  463.919104] R13: 0000000000000020 R14: 0000000000000003 R15: ffffffff81c93f58
[  463.919107] FS:  0000000000000000(0000) GS:ffff88082fc80000(0000) knlGS:0000000000000000
[  463.919108] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  463.919109] CR2: 00007f4f41762c00 CR3: 0000000001c0e000 CR4: 00000000001407e0
[  463.919112] Stack:
[  463.919114]  0000000203eebe50 ffff88082fc9a700 ffffffff81c93de0 0000006c15f85b4f
[  463.919118]  0000000000000004 ffff880803eebe88 ffffffff815d3e00 ffff880803eebf38
[  463.919120]  ffff88082fc9a700 0000000000000004 0000000000000002 ffffffff81c93de0
[  463.919123] Call Trace:
[  463.919126]  [<ffffffff815d3e00>] cpuidle_enter_state+0x40/0xc0
[  463.919128]  [<ffffffff815d3f39>] cpuidle_idle_call+0xb9/0x1f0
[  463.919131]  [<ffffffff8101d35e>] arch_cpu_idle+0xe/0x30
[  463.919133]  [<ffffffff810befa5>] cpu_startup_entry+0xc5/0x290
[  463.919139]  [<ffffffff8104150d>] start_secondary+0x21d/0x2d0
[  463.919144] Code: 48 89 d1 48 2d c8 1f 00 00 0f 01 c8 0f ae f0 65 48 8b 04 25 30 b8 00 00 48 8b 80 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f 01 c9 <85> 1d 2a b6 8a 00 75 0e 48 8d 75 dc bf 05 00 00 00 e8 92 b5 ce 
[  463.919165] NMI backtrace for cpu 6
[  463.919169] CPU: 6 PID: 0 Comm: swapper/6 Tainted: P        WC OX 3.13.0-48-generic #80-Ubuntu
[  463.919171] Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 2.0c 10/17/2013
[  463.919172] task: ffff880803ef1800 ti: ffff880803efa000 task.ti: ffff880803efa000
[  463.919173] RIP: 0010:[<ffffffff813e87a8>]  [<ffffffff813e87a8>] intel_idle+0xd8/0x140
[  463.919175] RSP: 0018:ffff880803efbe28  EFLAGS: 00000046
[  463.919181] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
[  463.919183] RDX: 0000000000000000 RSI: ffffffff81c93de0 RDI: 0000000001c0e000
[  463.919184] RBP: ffff880803efbe50 R08: ffff88082fd90310 R09: 0000000000000014
[  463.919185] R10: 000000000000c37d R11: 0000000000022df0 R12: 0000000000000004
[  463.919186] R13: 0000000000000020 R14: 0000000000000003 R15: ffffffff81c93f58
[  463.919188] FS:  0000000000000000(0000) GS:ffff88082fd80000(0000) knlGS:0000000000000000
[  463.919189] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  463.919194] CR2: 00007fc9c1a8ff38 CR3: 0000000001c0e000 CR4: 00000000001407e0
[  463.919197] Stack:
[  463.919198]  0000000603efbe50 ffff88082fd9a700 ffffffff81c93de0 0000006c135a615a
[  463.919201]  0000000000000004 ffff880803efbe88 ffffffff815d3e00 ffff880803efbf38
[  463.919203]  ffff88082fd9a700 0000000000000004 0000000000000006 ffffffff81c93de0
[  463.919205] Call Trace:
[  463.919211]  [<ffffffff815d3e00>] cpuidle_enter_state+0x40/0xc0
[  463.919213]  [<ffffffff815d3f39>] cpuidle_idle_call+0xb9/0x1f0
[  463.919218]  [<ffffffff8101d35e>] arch_cpu_idle+0xe/0x30
[  463.919220]  [<ffffffff810befa5>] cpu_startup_entry+0xc5/0x290
[  463.919224]  [<ffffffff8104150d>] start_secondary+0x21d/0x2d0
[  463.919226] Code: 48 89 d1 48 2d c8 1f 00 00 0f 01 c8 0f ae f0 65 48 8b 04 25 30 b8 00 00 48 8b 80 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f 01 c9 <85> 1d 2a b6 8a 00 75 0e 48 8d 75 dc bf 05 00 00 00 e8 92 b5 ce 

Linux files 3.13.0-48-generic #80-Ubuntu SMP Thu Mar 12 11:16:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

ii dkms 2.2.0.3-1.1ubuntu5.14.04+zfs9trusty all Dynamic Kernel Module Support Framework
ii libzfs2 0.6.3-5trusty amd64 Native OpenZFS filesystem library for Linux
ii mountall 2.53-zfs1 amd64 filesystem mounting tool
ii ubuntu-zfs 8trusty amd64 Native ZFS filesystem metapackage for Ubuntu.
ii zfs-dkms 0.6.3-5trusty amd64 Native OpenZFS filesystem kernel modules for Linux
ii zfs-doc 0.6.3-5trusty amd64 Native OpenZFS filesystem documentation and examples.
ii zfs-tools 0.4.1ubuntu1-tompos-cxn7 all A collection of tools for ZFS
ii zfsutils 0.6.3-5trusty amd64 Native OpenZFS management utilities for Linux

Since yesterday the machine has been locking up completely after a while.
It is reproducible by creating 10000 directories under a directory on which an inherited ACL is set.

The dataset has acltype=posixacl and xattr=sa. After switching to xattr=on the issue still remains.
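
The reproduction described above could be scripted roughly as follows. This is only a sketch under assumptions: the report does not say which ACL entries were set, so the `acl_spec` default is made up, and the function names are hypothetical. The directory names 1..10000 follow the `rm -r` command line quoted further down. Running it on an affected system may hang the machine.

```python
import os
import shutil
import subprocess
import sys


def default_acl_command(parent, acl_spec="u::rwx,g::r-x,o::r-x"):
    """Build a setfacl call that sets a default (inherited) ACL on `parent`.

    The ACL entries are an assumption; the report does not list them.
    """
    # `setfacl -d -m` modifies the default ACL that new entries inherit.
    return ["setfacl", "-d", "-m", acl_spec, parent]


def create_subdirs(parent, count=10000):
    """Create directories named 1..count directly under `parent`."""
    for i in range(1, count + 1):
        os.mkdir(os.path.join(parent, str(i)))
    return count


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: repro.py <empty directory on the affected dataset>")
    else:
        parent = sys.argv[1]
        # Only set the ACL when the setfacl tool is actually installed.
        if shutil.which("setfacl"):
            subprocess.run(default_acl_command(parent), check=True)
        print(create_subdirs(parent), "directories created under", parent)
```

The ACL step is kept separate from directory creation so that each half can be tried on its own when narrowing down which one matters.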
@dweeezil
This is the same filesystem on which my previous, similar issue (files that could not be deleted) occurred.
Those files are still there.

The log files are usually empty, except for the very first case.
When this happens, processes get stuck (at one point I saw a cat process reading from procfs and using 100% CPU), the load goes up, and the machine goes down.
The IPMI console is not accessible.
E.g.:
12560 ? R 1:01 rm -r 1 10 100 1000 10000 1001 1002 1003 1004 1005 1006 1007 1008 1009 101 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 102 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 103 10

This process has been there for about 10 minutes.
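
A small sketch for spotting such tasks without `ps`: it reads the state field from `/proc/<pid>/stat` and keeps tasks in state R (running) or D (uninterruptible sleep). The function names are hypothetical. Note that reads from procfs may themselves hang on an affected machine, as happened with the cat process mentioned above.

```python
import os


def parse_stat(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    comm = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, comm, state


def busy_or_blocked(stat_lines):
    """Keep only tasks in state R (running) or D (uninterruptible sleep)."""
    tasks = [parse_stat(line) for line in stat_lines]
    return [task for task in tasks if task[2] in ("R", "D")]


if __name__ == "__main__":
    lines = []
    if os.path.isdir("/proc"):
        for entry in os.listdir("/proc"):
            if not entry.isdigit():
                continue
            try:
                with open(os.path.join("/proc", entry, "stat")) as fh:
                    lines.append(fh.read())
            except OSError:
                # The task exited between listdir() and open().
                continue
    for pid, comm, state in busy_or_blocked(lines):
        print(pid, state, comm)
```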

@tomposmiko
Author

Now I can reproduce it with just a simple ls on the problematic directory.
However, the directories mentioned above (1..10000) have been removed successfully.

@putnam

putnam commented Mar 24, 2015

I'm having the same problem with almost exactly the same stack trace.

I have xattr=sa and acltype=off.

Running kernel 3.13.0-48-generic on Ubuntu Server with ZoL 0.6.3-5. Today I upgraded my kernel to -48 from -46 but have no direct evidence this change was the culprit.

I can reproduce this reliably on my system by copying a small app bundle over samba. Oddly other files don't trigger it. SMB ends up hanging and eventually the system falls over.

Example stack trace from my machine:

[ 3444.723463] NMI backtrace for cpu 3
[ 3444.723466] CPU: 3 PID: 4693 Comm: smbd Tainted: P W OX 3.13.0-48-generic #80-Ubuntu
[ 3444.723467] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.0003.041920141333 04/19/2014
[ 3444.723468] task: ffff8800a11c6000 ti: ffff8805f9e2e000 task.ti: ffff8805f9e2e000
[ 3444.723469] RIP: 0010:[] [] _raw_spin_lock_irq+0x3a/0x60
[ 3444.723473] RSP: 0018:ffff8805f9e2faa8 EFLAGS: 00000003
[ 3444.723474] RAX: 0000000000000287 RBX: ffff8800a11c6000 RCX: 000000000000e2b4
[ 3444.723475] RDX: 0000000000000006 RSI: 0000000000000006 RDI: ffff88064d0c0d88
[ 3444.723475] RBP: ffff8805f9e2faa8 R08: 0000000000000202 R09: ffff88083fc76340
[ 3444.723476] R10: ffffea00179c1400 R11: 0000000000000000 R12: ffff88064d0c0d80
[ 3444.723477] R13: ffff88064d0c0d88 R14: 0000000000000000 R15: ffff88064d0c0d80
[ 3444.723478] FS: 00007fcdaab6f780(0000) GS:ffff88083fc60000(0000) knlGS:0000000000000000
[ 3444.723479] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3444.723480] CR2: 00007fcdaabbe000 CR3: 0000000604179000 CR4: 00000000000407e0
[ 3444.723481] Stack:
[ 3444.723482] ffff8805f9e2fb00 ffffffff817286ef ffff8805eaa67538 ffffffff81727d32
[ 3444.723489] ffff8805eaa67538 ffff8800a11c6000 ffffffff00000001 000000000000000a
[ 3444.723500] ffff88064d0c0d80 0000000000000002 ffff8805f9e2fd38 ffff8805f9e2fb60
[ 3444.723503] Call Trace:
[ 3444.723505] [] rwsem_down_read_failed+0x3f/0x150
[ 3444.723507] [] ? mutex_lock+0x12/0x2f
[ 3444.723510] [] call_rwsem_down_read_failed+0x14/0x30
[ 3444.723517] [] ? down_read+0x20/0x30
[ 3444.723547] [] zap_get_leaf_byblk+0xd0/0x2c0 [zfs]
[ 3444.723549] [] ? mutex_lock+0x12/0x2f
[ 3444.723567] [] zap_deref_leaf+0x64/0x80 [zfs]
[ 3444.723583] [] fzap_cursor_retrieve+0x107/0x2a0 [zfs]
[ 3444.723599] [] zap_cursor_retrieve+0x5c/0x2f0 [zfs]
[ 3444.723614] [] ? sa_lookup+0x63/0x80 [zfs]
[ 3444.723616] [] ? filldir+0x88/0x100
[ 3444.723635] [] zfs_readdir+0x14e/0x4c0 [zfs]
[ 3444.723638] [] ? generic_getxattr+0x4c/0x70
[ 3444.723641] [] ? kmem_cache_free+0x1b5/0x1e0
[ 3444.723644] [] ? locks_free_lock+0x64/0x70
[ 3444.723646] [] ? __posix_lock_file+0x226/0x560
[ 3444.723665] [] zpl_iterate+0x3c/0x60 [zfs]
[ 3444.723666] [] iterate_dir+0xa5/0xe0
[ 3444.723668] [] SyS_getdents+0x92/0x120
[ 3444.723669] [] ? fillonedir+0xe0/0xe0
[ 3444.723671] [] system_call_fastpath+0x1a/0x1f
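
To check how closely two such traces match, the function names can be pulled out of the `Call Trace:` lines and compared. The sketch below is an illustration, not an existing tool: `reliable_frames` and `shared_prefix` are hypothetical names, and the regular expression assumes the frame format seen in this thread, with or without the bracketed address. Frames marked with `?` are skipped, since the kernel flags those as unreliable.

```python
import re

# Matches "[<addr>] symbol+0xoff/0xsize" and the stripped "[] symbol+..." form.
FRAME_RE = re.compile(
    r"\[(?:<[0-9a-f]+>)?\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(log_text):
    """Return the function names of call-trace frames not marked with '?'."""
    frames = []
    for line in log_text.splitlines():
        if "RIP:" in line:
            # The RIP line names the current instruction, not a stack frame.
            continue
        match = FRAME_RE.search(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


def shared_prefix(first, second):
    """Return the leading frames that two traces have in common."""
    common = []
    for a, b in zip(first, second):
        if a != b:
            break
        common.append(a)
    return common


if __name__ == "__main__":
    # Short excerpts copied from the two traces in this thread.
    trace_a = """\
[  463.918427]  [<ffffffff817286ef>] rwsem_down_read_failed+0x3f/0x150
[  463.918436]  [<ffffffff81727d32>] ? mutex_lock+0x12/0x2f
[  463.918485]  [<ffffffffa036e870>] zap_get_leaf_byblk+0xd0/0x2c0 [zfs]
[  463.918646]  [<ffffffff817321cf>] tracesys+0xe1/0xe6
"""
    trace_b = """\
[ 3444.723505] [] rwsem_down_read_failed+0x3f/0x150
[ 3444.723507] [] ? mutex_lock+0x12/0x2f
[ 3444.723547] [] zap_get_leaf_byblk+0xd0/0x2c0 [zfs]
[ 3444.723671] [] system_call_fastpath+0x1a/0x1f
"""
    print(shared_prefix(reliable_frames(trace_a), reliable_frames(trace_b)))
```

Comparing from the innermost frame outward makes the syscall entry point (`tracesys` versus `system_call_fastpath` here) the only expected difference.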

@putnam

putnam commented Mar 24, 2015

This looks a lot like #3143

@tomposmiko
Author

Indeed, thanks.

@snajpa
Contributor

snajpa commented Mar 25, 2015

Please update to current master; it now has metadata cache limit patches, which should address this.

@tomposmiko
Author

I doubt that it's cache related.
The same behaviour occurs on two different machines with the same dataset (transferred by zfs).
After I removed the problematic (but small) dataset, the production machine looks pretty stable again.

But I will give it a try.

@tomposmiko tomposmiko reopened this Mar 25, 2015
@tomposmiko
Author

I found out how to reopen the issue (via a comment :).

@snajpa
Contributor

snajpa commented Mar 25, 2015

Most likely it isn't, at least not directly - if your ARC was/is growing uncontrollably (which these patches I'm referring to address) then you might run into a deadlock really fast, because it triggers memory reclaim.
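
One way to check whether the ARC is outgrowing its limits is to compare the used and limit counters in the arcstats kstat. The sketch below assumes the file lives at `/proc/spl/kstat/zfs/arcstats`, has two header lines followed by `name type data` rows, and exposes `size`, `c_max`, `arc_meta_used` and `arc_meta_limit`. The function names are hypothetical.

```python
ARCSTATS_PATH = "/proc/spl/kstat/zfs/arcstats"  # assumed location


def parse_arcstats(text):
    """Parse 'name type data' rows into a dict, skipping the two header lines."""
    stats = {}
    for line in text.splitlines()[2:]:
        parts = line.split()
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def usage_ratios(stats):
    """Return used/limit ratios; a value above 1.0 means the limit is exceeded."""
    pairs = {"size": "c_max", "arc_meta_used": "arc_meta_limit"}
    ratios = {}
    for used, limit in pairs.items():
        # Skip pairs that are missing or whose limit is zero.
        if used in stats and stats.get(limit):
            ratios[used] = stats[used] / stats[limit]
    return ratios


if __name__ == "__main__":
    try:
        with open(ARCSTATS_PATH) as fh:
            for name, ratio in usage_ratios(parse_arcstats(fh.read())).items():
                print(f"{name}: {ratio:.2f} of its limit")
    except OSError as exc:
        print("could not read arcstats:", exc)
```

Sampling these ratios periodically while the workload runs would show whether metadata usage climbs past its limit before the lockup.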

@tomposmiko
Author

IMO this report can be closed safely.
