Hang/crash on long-running zfs receive #396

Closed
korni opened this issue Sep 10, 2011 · 4 comments
@korni

korni commented Sep 10, 2011

zfs-0.6.0-rc5 on CentOS 6.0, kernel 2.6.32-71.29.1.el6.x86_64 (2 GB memory, 8 disks on two Areca SATA RAID
controllers in JBOD mode), hung during a long-running zfs receive from an OpenSolaris snv_134 x86 host. The
incremental snapshot being transferred belonged to a series of ~500 GB; this individual chunk was rather big
(150 GB). After the crash, all disks (including those outside of ZFS) were blocked.

BTW: Installation and usage "just worked fine" .-) I'm switching from OpenSolaris to Linux because it looks
like low-level handling of the Supermicro motherboard and the RAID controllers works better on Linux ...
Wish: an inline/sharable .zfs/snapshot directory like on Solaris.

korni

Sep 10 11:16:06 yoda kernel: INFO: task kswapd0:50 blocked for more than 120 seconds.
Sep 10 11:16:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:16:06 yoda kernel: kswapd0 D ffff8800797b2ac0 0 50 2 0x00000000
Sep 10 11:16:06 yoda kernel: ffff8800798ed9f0 0000000000000046 0000000000000000 0000000100000003
Sep 10 11:16:06 yoda kernel: 0000000000000000 0000000000000082 ffff8800798ed990 ffff88007190bda0
Sep 10 11:16:06 yoda kernel: ffff8800798eba98 ffff8800798edfd8 0000000000010518 ffff8800798eba98
Sep 10 11:16:06 yoda kernel: Call Trace:
Sep 10 11:16:06 yoda kernel: [] ? prepare_to_wait_exclusive+0x4e/0x80
Sep 10 11:16:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:16:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:16:06 yoda kernel: [] __cv_wait+0x13/0x20 [spl]
Sep 10 11:16:06 yoda kernel: [] txg_wait_open+0x7b/0xa0 [zfs]
Sep 10 11:16:06 yoda kernel: [] dmu_tx_wait+0xed/0xf0 [zfs]
Sep 10 11:16:06 yoda kernel: [] dmu_tx_assign+0x6a/0x410 [zfs]
Sep 10 11:16:06 yoda kernel: [] zfs_inactive+0xec/0x1e0 [zfs]
Sep 10 11:16:06 yoda kernel: [] zpl_clear_inode+0xe/0x10 [zfs]
Sep 10 11:16:06 yoda kernel: [] clear_inode+0x8f/0x110
Sep 10 11:16:06 yoda kernel: [] dispose_list+0x40/0x120
Sep 10 11:16:06 yoda kernel: [] shrink_icache_memory+0x274/0x2e0
Sep 10 11:16:06 yoda kernel: [] shrink_slab+0x13a/0x1a0
Sep 10 11:16:06 yoda kernel: [] balance_pgdat+0x54e/0x770
Sep 10 11:16:06 yoda kernel: [] ? isolate_pages_global+0x0/0x380
Sep 10 11:16:06 yoda kernel: [] ? prepare_to_wait+0x4e/0x80
Sep 10 11:16:06 yoda kernel: [] kswapd+0x134/0x390
Sep 10 11:16:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:16:06 yoda kernel: [] ? kswapd+0x0/0x390
Sep 10 11:16:06 yoda kernel: [] kthread+0x96/0xa0
Sep 10 11:16:06 yoda kernel: [] child_rip+0xa/0x20
Sep 10 11:16:06 yoda kernel: [] ? kthread+0x0/0xa0
Sep 10 11:16:06 yoda kernel: [] ? child_rip+0x0/0x20
Sep 10 11:16:06 yoda kernel: INFO: task txg_sync:1528 blocked for more than 120 seconds.
Sep 10 11:16:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:16:06 yoda kernel: txg_sync D ffff88007f823080 0 1528 2 0x00000000
Sep 10 11:16:06 yoda kernel: ffff88007190d5a0 0000000000000046 0000000000000000 0000000100000003
Sep 10 11:16:06 yoda kernel: 0000000000000000 0000000000000086 ffff88007190d540 0000000100d9b426
Sep 10 11:16:06 yoda kernel: ffff880071909068 ffff88007190dfd8 0000000000010518 ffff880071909068
Sep 10 11:16:06 yoda kernel: Call Trace:
Sep 10 11:16:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:16:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:16:06 yoda kernel: [] __cv_wait+0x13/0x20 [spl]
Sep 10 11:16:06 yoda kernel: [] txg_wait_open+0x7b/0xa0 [zfs]
Sep 10 11:16:06 yoda kernel: [] dmu_tx_wait+0xed/0xf0 [zfs]
Sep 10 11:16:06 yoda kernel: [] dmu_tx_assign+0x6a/0x410 [zfs]
Sep 10 11:16:06 yoda kernel: [] zfs_inactive+0xec/0x1e0 [zfs]
Sep 10 11:16:06 yoda kernel: [] zpl_clear_inode+0xe/0x10 [zfs]
Sep 10 11:16:06 yoda kernel: [] clear_inode+0x8f/0x110
Sep 10 11:16:06 yoda kernel: [] dispose_list+0x40/0x120
Sep 10 11:16:06 yoda kernel: [] shrink_icache_memory+0x274/0x2e0
Sep 10 11:16:06 yoda kernel: [] shrink_slab+0x13a/0x1a0
Sep 10 11:16:06 yoda kernel: [] do_try_to_free_pages+0x2d6/0x500
Sep 10 11:16:06 yoda kernel: [] ? schedule_timeout+0x19c/0x2f0
Sep 10 11:16:06 yoda kernel: [] try_to_free_pages+0x9f/0x130
Sep 10 11:16:06 yoda kernel: [] ? isolate_pages_global+0x0/0x380
Sep 10 11:16:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:16:06 yoda kernel: [] __alloc_pages_nodemask+0x3ee/0x850
Sep 10 11:16:06 yoda kernel: [] kmem_getpages+0x62/0x170
Sep 10 11:16:06 yoda kernel: [] fallback_alloc+0x1ba/0x270
Sep 10 11:16:06 yoda kernel: [] ? cache_grow+0x2cf/0x320
Sep 10 11:16:06 yoda kernel: [] ____cache_alloc_node+0x99/0x160
Sep 10 11:16:06 yoda kernel: [] ? kmem_alloc_debug+0xb3/0x130 [spl]
Sep 10 11:16:06 yoda kernel: [] __kmalloc+0x189/0x220
Sep 10 11:16:06 yoda kernel: [] kmem_alloc_debug+0xb3/0x130 [spl]
Sep 10 11:16:06 yoda kernel: [] ddt_get_dedup_stats+0x40/0x80 [zfs]
Sep 10 11:16:06 yoda kernel: [] ddt_get_dedup_dspace+0x2a/0x40 [zfs]
Sep 10 11:16:06 yoda kernel: [] spa_update_dspace+0x30/0x50 [zfs]
Sep 10 11:16:06 yoda kernel: [] spa_sync+0x562/0x9a0 [zfs]
Sep 10 11:16:06 yoda kernel: [] ? __wake_up+0x53/0x70
Sep 10 11:16:06 yoda kernel: [] txg_sync_thread+0x225/0x3b0 [zfs]
Sep 10 11:16:06 yoda kernel: [] ? txg_sync_thread+0x0/0x3b0 [zfs]
Sep 10 11:16:06 yoda kernel: [] ? txg_sync_thread+0x0/0x3b0 [zfs]
Sep 10 11:16:06 yoda kernel: [] thread_generic_wrapper+0x68/0x80 [spl]
Sep 10 11:16:06 yoda kernel: [] ? thread_generic_wrapper+0x0/0x80 [spl]
Sep 10 11:16:06 yoda kernel: [] kthread+0x96/0xa0
Sep 10 11:16:06 yoda kernel: [] child_rip+0xa/0x20
Sep 10 11:16:06 yoda kernel: [] ? kthread+0x0/0xa0
Sep 10 11:16:06 yoda kernel: [] ? child_rip+0x0/0x20
Sep 10 11:16:06 yoda kernel: INFO: task zfs:2843 blocked for more than 120 seconds.
Sep 10 11:16:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:16:06 yoda kernel: zfs D ffff8800797b2ac0 0 2843 2841 0x00000080
Sep 10 11:16:06 yoda kernel: ffff8800649bf8b8 0000000000000082 000000026b4c1940 0000000100000003
Sep 10 11:16:06 yoda kernel: ffff88002543e040 0000000000000086 ffffffffa0409390 ffff88007190bda0
Sep 10 11:16:06 yoda kernel: ffff88007b8a5028 ffff8800649bffd8 0000000000010518 ffff88007b8a5028
Sep 10 11:16:06 yoda kernel: Call Trace:
Sep 10 11:16:06 yoda kernel: [] ? prepare_to_wait_exclusive+0x4e/0x80
Sep 10 11:16:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:16:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:16:06 yoda kernel: [] __cv_wait+0x13/0x20 [spl]
Sep 10 11:16:06 yoda kernel: [] txg_wait_open+0x7b/0xa0 [zfs]
Sep 10 11:16:06 yoda kernel: [] dmu_tx_wait+0xed/0xf0 [zfs]
Sep 10 11:16:06 yoda kernel: [] dmu_tx_assign+0x6a/0x410 [zfs]
Sep 10 11:16:06 yoda kernel: [] restore_write+0xad/0x130 [zfs]
Sep 10 11:16:06 yoda kernel: [] dmu_recv_stream+0x98c/0xd80 [zfs]
Sep 10 11:16:06 yoda kernel: [] ? nvlist_common+0x111/0x1f0 [znvpair]
Sep 10 11:16:06 yoda kernel: [] ? kmem_free_debug+0x16/0x20 [spl]
Sep 10 11:16:06 yoda kernel: [] zfs_ioc_recv+0x356/0xf40 [zfs]
Sep 10 11:16:06 yoda kernel: [] ? kmem_free_debug+0x16/0x20 [spl]
Sep 10 11:16:06 yoda kernel: [] ? spa_lookup+0x62/0xc0 [zfs]
Sep 10 11:16:06 yoda kernel: [] ? spa_open_common+0x23c/0x370 [zfs]
Sep 10 11:16:06 yoda kernel: [] zfsdev_ioctl+0xfd/0x1d0 [zfs]
Sep 10 11:16:06 yoda kernel: [] vfs_ioctl+0x22/0xa0
Sep 10 11:16:06 yoda kernel: [] ? finish_task_switch+0x42/0xd0
Sep 10 11:16:06 yoda kernel: [] do_vfs_ioctl+0x84/0x580
Sep 10 11:16:06 yoda kernel: [] ? thread_return+0x4e/0x778
Sep 10 11:16:06 yoda kernel: [] sys_ioctl+0x81/0xa0
Sep 10 11:16:06 yoda kernel: [] system_call_fastpath+0x16/0x1b
Sep 10 11:18:06 yoda kernel: INFO: task kswapd0:50 blocked for more than 120 seconds.
Sep 10 11:18:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:18:06 yoda kernel: kswapd0 D ffff8800797b2ac0 0 50 2 0x00000000
Sep 10 11:18:06 yoda kernel: ffff8800798ed9f0 0000000000000046 0000000000000000 0000000100000003
Sep 10 11:18:06 yoda kernel: 0000000000000000 0000000000000082 ffff8800798ed990 ffff88007190bda0
Sep 10 11:18:06 yoda kernel: ffff8800798eba98 ffff8800798edfd8 0000000000010518 ffff8800798eba98
Sep 10 11:18:06 yoda kernel: Call Trace:
Sep 10 11:18:06 yoda kernel: [] ? prepare_to_wait_exclusive+0x4e/0x80
Sep 10 11:18:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:18:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:18:06 yoda kernel: [] __cv_wait+0x13/0x20 [spl]
Sep 10 11:18:06 yoda kernel: [] txg_wait_open+0x7b/0xa0 [zfs]
Sep 10 11:18:06 yoda kernel: [] dmu_tx_wait+0xed/0xf0 [zfs]
Sep 10 11:18:06 yoda kernel: [] dmu_tx_assign+0x6a/0x410 [zfs]
Sep 10 11:18:06 yoda kernel: [] zfs_inactive+0xec/0x1e0 [zfs]
Sep 10 11:18:06 yoda kernel: [] zpl_clear_inode+0xe/0x10 [zfs]
Sep 10 11:18:06 yoda kernel: [] clear_inode+0x8f/0x110
Sep 10 11:18:06 yoda kernel: [] dispose_list+0x40/0x120
Sep 10 11:18:06 yoda kernel: [] shrink_icache_memory+0x274/0x2e0
Sep 10 11:18:06 yoda kernel: [] shrink_slab+0x13a/0x1a0
Sep 10 11:18:06 yoda kernel: [] balance_pgdat+0x54e/0x770
Sep 10 11:18:06 yoda kernel: [] ? isolate_pages_global+0x0/0x380
Sep 10 11:18:06 yoda kernel: [] ? prepare_to_wait+0x4e/0x80
Sep 10 11:18:06 yoda kernel: [] kswapd+0x134/0x390
Sep 10 11:18:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:18:06 yoda kernel: [] ? kswapd+0x0/0x390
Sep 10 11:18:06 yoda kernel: [] kthread+0x96/0xa0
Sep 10 11:18:06 yoda kernel: [] child_rip+0xa/0x20
Sep 10 11:18:06 yoda kernel: [] ? kthread+0x0/0xa0
Sep 10 11:18:06 yoda kernel: [] ? child_rip+0x0/0x20
Sep 10 11:18:06 yoda kernel: INFO: task txg_sync:1528 blocked for more than 120 seconds.
Sep 10 11:18:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:18:06 yoda kernel: txg_sync D ffff88007f823080 0 1528 2 0x00000000
Sep 10 11:18:06 yoda kernel: ffff88007190d5a0 0000000000000046 0000000000000000 0000000100000003
Sep 10 11:18:06 yoda kernel: 0000000000000000 0000000000000086 ffff88007190d540 0000000100d9b426
Sep 10 11:18:06 yoda kernel: ffff880071909068 ffff88007190dfd8 0000000000010518 ffff880071909068
Sep 10 11:18:06 yoda kernel: Call Trace:
Sep 10 11:18:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:18:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:18:06 yoda kernel: [] __cv_wait+0x13/0x20 [spl]
Sep 10 11:18:06 yoda kernel: [] txg_wait_open+0x7b/0xa0 [zfs]
Sep 10 11:18:06 yoda kernel: [] dmu_tx_wait+0xed/0xf0 [zfs]
Sep 10 11:18:06 yoda kernel: [] dmu_tx_assign+0x6a/0x410 [zfs]
Sep 10 11:18:06 yoda kernel: [] zfs_inactive+0xec/0x1e0 [zfs]
Sep 10 11:18:06 yoda kernel: [] zpl_clear_inode+0xe/0x10 [zfs]
Sep 10 11:18:06 yoda kernel: [] clear_inode+0x8f/0x110
Sep 10 11:18:06 yoda kernel: [] dispose_list+0x40/0x120
Sep 10 11:18:06 yoda kernel: [] shrink_icache_memory+0x274/0x2e0
Sep 10 11:18:06 yoda kernel: [] shrink_slab+0x13a/0x1a0
Sep 10 11:18:06 yoda kernel: [] do_try_to_free_pages+0x2d6/0x500
Sep 10 11:18:06 yoda kernel: [] ? schedule_timeout+0x19c/0x2f0
Sep 10 11:18:06 yoda kernel: [] try_to_free_pages+0x9f/0x130
Sep 10 11:18:06 yoda kernel: [] ? isolate_pages_global+0x0/0x380
Sep 10 11:18:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:18:06 yoda kernel: [] __alloc_pages_nodemask+0x3ee/0x850
Sep 10 11:18:06 yoda kernel: [] kmem_getpages+0x62/0x170
Sep 10 11:18:06 yoda kernel: [] fallback_alloc+0x1ba/0x270
Sep 10 11:18:06 yoda kernel: [] ? cache_grow+0x2cf/0x320
Sep 10 11:18:06 yoda kernel: [] ____cache_alloc_node+0x99/0x160
Sep 10 11:18:06 yoda kernel: [] ? kmem_alloc_debug+0xb3/0x130 [spl]
Sep 10 11:18:06 yoda kernel: [] __kmalloc+0x189/0x220
Sep 10 11:18:06 yoda kernel: [] kmem_alloc_debug+0xb3/0x130 [spl]
Sep 10 11:18:06 yoda kernel: [] ddt_get_dedup_stats+0x40/0x80 [zfs]
Sep 10 11:18:06 yoda kernel: [] ddt_get_dedup_dspace+0x2a/0x40 [zfs]
Sep 10 11:18:06 yoda kernel: [] spa_update_dspace+0x30/0x50 [zfs]
Sep 10 11:18:06 yoda kernel: [] spa_sync+0x562/0x9a0 [zfs]
Sep 10 11:18:06 yoda kernel: [] ? __wake_up+0x53/0x70
Sep 10 11:18:06 yoda kernel: [] txg_sync_thread+0x225/0x3b0 [zfs]
Sep 10 11:18:06 yoda kernel: [] ? txg_sync_thread+0x0/0x3b0 [zfs]
Sep 10 11:18:06 yoda kernel: [] ? txg_sync_thread+0x0/0x3b0 [zfs]
Sep 10 11:18:06 yoda kernel: [] thread_generic_wrapper+0x68/0x80 [spl]
Sep 10 11:18:06 yoda kernel: [] ? thread_generic_wrapper+0x0/0x80 [spl]
Sep 10 11:18:06 yoda kernel: [] kthread+0x96/0xa0
Sep 10 11:18:06 yoda kernel: [] child_rip+0xa/0x20
Sep 10 11:18:06 yoda kernel: [] ? kthread+0x0/0xa0
Sep 10 11:18:06 yoda kernel: [] ? child_rip+0x0/0x20
Sep 10 11:18:06 yoda kernel: INFO: task zfs:2843 blocked for more than 120 seconds.
Sep 10 11:18:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:18:06 yoda kernel: zfs D ffff8800797b2ac0 0 2843 2841 0x00000080
Sep 10 11:18:06 yoda kernel: ffff8800649bf8b8 0000000000000082 000000026b4c1940 0000000100000003
Sep 10 11:18:06 yoda kernel: ffff88002543e040 0000000000000086 ffffffffa0409390 ffff88007190bda0
Sep 10 11:18:06 yoda kernel: ffff88007b8a5028 ffff8800649bffd8 0000000000010518 ffff88007b8a5028
Sep 10 11:18:06 yoda kernel: Call Trace:
Sep 10 11:18:06 yoda kernel: [] ? prepare_to_wait_exclusive+0x4e/0x80
Sep 10 11:18:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:18:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:18:06 yoda kernel: [] __cv_wait+0x13/0x20 [spl]
Sep 10 11:18:06 yoda kernel: [] txg_wait_open+0x7b/0xa0 [zfs]
Sep 10 11:18:06 yoda kernel: [] dmu_tx_wait+0xed/0xf0 [zfs]
Sep 10 11:18:06 yoda kernel: [] dmu_tx_assign+0x6a/0x410 [zfs]
Sep 10 11:18:06 yoda kernel: [] restore_write+0xad/0x130 [zfs]
Sep 10 11:18:06 yoda kernel: [] dmu_recv_stream+0x98c/0xd80 [zfs]
Sep 10 11:18:06 yoda kernel: [] ? nvlist_common+0x111/0x1f0 [znvpair]
Sep 10 11:18:06 yoda kernel: [] ? kmem_free_debug+0x16/0x20 [spl]
Sep 10 11:18:06 yoda kernel: [] zfs_ioc_recv+0x356/0xf40 [zfs]
Sep 10 11:18:06 yoda kernel: [] ? kmem_free_debug+0x16/0x20 [spl]
Sep 10 11:18:06 yoda kernel: [] ? spa_lookup+0x62/0xc0 [zfs]
Sep 10 11:18:06 yoda kernel: [] ? spa_open_common+0x23c/0x370 [zfs]
Sep 10 11:18:06 yoda kernel: [] zfsdev_ioctl+0xfd/0x1d0 [zfs]
Sep 10 11:18:06 yoda kernel: [] vfs_ioctl+0x22/0xa0
Sep 10 11:18:06 yoda kernel: [] ? finish_task_switch+0x42/0xd0
Sep 10 11:18:06 yoda kernel: [] do_vfs_ioctl+0x84/0x580
Sep 10 11:18:06 yoda kernel: [] ? thread_return+0x4e/0x778
Sep 10 11:18:06 yoda kernel: [] sys_ioctl+0x81/0xa0
Sep 10 11:18:06 yoda kernel: [] system_call_fastpath+0x16/0x1b
Sep 10 11:20:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:20:06 yoda kernel: kswapd0 D ffff8800797b2ac0 0 50 2 0x00000000
Sep 10 11:20:06 yoda kernel: ffff8800798ed9f0 0000000000000046 0000000000000000 0000000100000003
Sep 10 11:20:06 yoda kernel: 0000000000000000 0000000000000082 ffff8800798ed990 ffff88007190bda0
Sep 10 11:20:06 yoda kernel: ffff8800798eba98 ffff8800798edfd8 0000000000010518 ffff8800798eba98
Sep 10 11:20:06 yoda kernel: Call Trace:
Sep 10 11:20:06 yoda kernel: [] ? prepare_to_wait_exclusive+0x4e/0x80
Sep 10 11:20:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:20:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:20:06 yoda kernel: [] __cv_wait+0x13/0x20 [spl]
Sep 10 11:20:06 yoda kernel: [] txg_wait_open+0x7b/0xa0 [zfs]
Sep 10 11:20:06 yoda kernel: [] dmu_tx_wait+0xed/0xf0 [zfs]
Sep 10 11:20:06 yoda kernel: [] dmu_tx_assign+0x6a/0x410 [zfs]
Sep 10 11:20:06 yoda kernel: [] zfs_inactive+0xec/0x1e0 [zfs]
Sep 10 11:20:06 yoda kernel: [] zpl_clear_inode+0xe/0x10 [zfs]
Sep 10 11:20:06 yoda kernel: [] clear_inode+0x8f/0x110
Sep 10 11:20:06 yoda kernel: [] dispose_list+0x40/0x120
Sep 10 11:20:06 yoda kernel: [] shrink_icache_memory+0x274/0x2e0
Sep 10 11:20:06 yoda kernel: [] shrink_slab+0x13a/0x1a0
Sep 10 11:20:06 yoda kernel: [] balance_pgdat+0x54e/0x770
Sep 10 11:20:06 yoda kernel: [] ? isolate_pages_global+0x0/0x380
Sep 10 11:20:06 yoda kernel: [] ? prepare_to_wait+0x4e/0x80
Sep 10 11:20:06 yoda kernel: [] kswapd+0x134/0x390
Sep 10 11:20:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:20:06 yoda kernel: [] ? kswapd+0x0/0x390
Sep 10 11:20:06 yoda kernel: [] kthread+0x96/0xa0
Sep 10 11:20:06 yoda kernel: [] child_rip+0xa/0x20
Sep 10 11:20:06 yoda kernel: [] ? kthread+0x0/0xa0
Sep 10 11:20:06 yoda kernel: [] ? child_rip+0x0/0x20
Sep 10 11:20:06 yoda kernel: INFO: task txg_sync:1528 blocked for more than 120 seconds.
Sep 10 11:20:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:20:06 yoda kernel: txg_sync D ffff88007f823080 0 1528 2 0x00000000
Sep 10 11:20:06 yoda kernel: ffff88007190d5a0 0000000000000046 0000000000000000 0000000100000003
Sep 10 11:20:06 yoda kernel: 0000000000000000 0000000000000086 ffff88007190d540 0000000100d9b426
Sep 10 11:20:06 yoda kernel: ffff880071909068 ffff88007190dfd8 0000000000010518 ffff880071909068
Sep 10 11:20:06 yoda kernel: Call Trace:
Sep 10 11:20:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:20:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:20:06 yoda kernel: [] __cv_wait+0x13/0x20 [spl]
Sep 10 11:20:06 yoda kernel: [] txg_wait_open+0x7b/0xa0 [zfs]
Sep 10 11:20:06 yoda kernel: [] dmu_tx_wait+0xed/0xf0 [zfs]
Sep 10 11:20:06 yoda kernel: [] dmu_tx_assign+0x6a/0x410 [zfs]
Sep 10 11:20:06 yoda kernel: [] zfs_inactive+0xec/0x1e0 [zfs]
Sep 10 11:20:06 yoda kernel: [] zpl_clear_inode+0xe/0x10 [zfs]
Sep 10 11:20:06 yoda kernel: [] clear_inode+0x8f/0x110
Sep 10 11:20:06 yoda kernel: [] dispose_list+0x40/0x120
Sep 10 11:20:06 yoda kernel: [] shrink_icache_memory+0x274/0x2e0
Sep 10 11:20:06 yoda kernel: [] shrink_slab+0x13a/0x1a0
Sep 10 11:20:06 yoda kernel: [] do_try_to_free_pages+0x2d6/0x500
Sep 10 11:20:06 yoda kernel: [] ? schedule_timeout+0x19c/0x2f0
Sep 10 11:20:06 yoda kernel: [] try_to_free_pages+0x9f/0x130
Sep 10 11:20:06 yoda kernel: [] ? isolate_pages_global+0x0/0x380
Sep 10 11:20:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:20:06 yoda kernel: [] __alloc_pages_nodemask+0x3ee/0x850
Sep 10 11:20:06 yoda kernel: [] kmem_getpages+0x62/0x170
Sep 10 11:20:06 yoda kernel: [] fallback_alloc+0x1ba/0x270
Sep 10 11:20:06 yoda kernel: [] ? cache_grow+0x2cf/0x320
Sep 10 11:20:06 yoda kernel: [] ____cache_alloc_node+0x99/0x160
Sep 10 11:20:06 yoda kernel: [] ? kmem_alloc_debug+0xb3/0x130 [spl]
Sep 10 11:20:06 yoda kernel: [] __kmalloc+0x189/0x220
Sep 10 11:20:06 yoda kernel: [] kmem_alloc_debug+0xb3/0x130 [spl]
Sep 10 11:20:06 yoda kernel: [] ddt_get_dedup_stats+0x40/0x80 [zfs]
Sep 10 11:20:06 yoda kernel: [] ddt_get_dedup_dspace+0x2a/0x40 [zfs]
Sep 10 11:20:06 yoda kernel: [] spa_update_dspace+0x30/0x50 [zfs]
Sep 10 11:20:06 yoda kernel: [] spa_sync+0x562/0x9a0 [zfs]
Sep 10 11:20:06 yoda kernel: [] ? __wake_up+0x53/0x70
Sep 10 11:20:06 yoda kernel: [] txg_sync_thread+0x225/0x3b0 [zfs]
Sep 10 11:20:06 yoda kernel: [] ? txg_sync_thread+0x0/0x3b0 [zfs]
Sep 10 11:20:06 yoda kernel: [] ? txg_sync_thread+0x0/0x3b0 [zfs]
Sep 10 11:20:06 yoda kernel: [] thread_generic_wrapper+0x68/0x80 [spl]
Sep 10 11:20:06 yoda kernel: [] ? thread_generic_wrapper+0x0/0x80 [spl]
Sep 10 11:20:06 yoda kernel: [] kthread+0x96/0xa0
Sep 10 11:20:06 yoda kernel: [] child_rip+0xa/0x20
Sep 10 11:20:06 yoda kernel: [] ? kthread+0x0/0xa0
Sep 10 11:20:06 yoda kernel: [] ? child_rip+0x0/0x20
Sep 10 11:20:06 yoda kernel: INFO: task zfs:2843 blocked for more than 120 seconds.
Sep 10 11:20:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:20:06 yoda kernel: zfs D ffff8800797b2ac0 0 2843 2841 0x00000080
Sep 10 11:20:06 yoda kernel: ffff8800649bf8b8 0000000000000082 000000026b4c1940 0000000100000003
Sep 10 11:20:06 yoda kernel: ffff88002543e040 0000000000000086 ffffffffa0409390 ffff88007190bda0
Sep 10 11:20:06 yoda kernel: ffff88007b8a5028 ffff8800649bffd8 0000000000010518 ffff88007b8a5028
Sep 10 11:20:06 yoda kernel: Call Trace:
Sep 10 11:20:06 yoda kernel: [] ? prepare_to_wait_exclusive+0x4e/0x80
Sep 10 11:20:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:20:06 yoda kernel: [] ? autoremove_wake_function+0x0/0x40
Sep 10 11:20:06 yoda kernel: [] __cv_wait+0x13/0x20 [spl]
Sep 10 11:20:06 yoda kernel: [] txg_wait_open+0x7b/0xa0 [zfs]
Sep 10 11:20:06 yoda kernel: [] dmu_tx_wait+0xed/0xf0 [zfs]
Sep 10 11:20:06 yoda kernel: [] dmu_tx_assign+0x6a/0x410 [zfs]
Sep 10 11:20:06 yoda kernel: [] restore_write+0xad/0x130 [zfs]
Sep 10 11:20:06 yoda kernel: [] dmu_recv_stream+0x98c/0xd80 [zfs]
Sep 10 11:20:06 yoda kernel: [] ? nvlist_common+0x111/0x1f0 [znvpair]
Sep 10 11:20:06 yoda kernel: [] ? kmem_free_debug+0x16/0x20 [spl]
Sep 10 11:20:06 yoda kernel: [] zfs_ioc_recv+0x356/0xf40 [zfs]
Sep 10 11:20:06 yoda kernel: [] ? kmem_free_debug+0x16/0x20 [spl]
Sep 10 11:20:06 yoda kernel: [] ? spa_lookup+0x62/0xc0 [zfs]
Sep 10 11:20:06 yoda kernel: [] ? spa_open_common+0x23c/0x370 [zfs]
Sep 10 11:20:06 yoda kernel: [] zfsdev_ioctl+0xfd/0x1d0 [zfs]
Sep 10 11:20:06 yoda kernel: [] vfs_ioctl+0x22/0xa0
Sep 10 11:20:06 yoda kernel: [] ? finish_task_switch+0x42/0xd0
Sep 10 11:20:06 yoda kernel: [] do_vfs_ioctl+0x84/0x580
Sep 10 11:20:06 yoda kernel: [] ? thread_return+0x4e/0x778
Sep 10 11:20:06 yoda kernel: [] sys_ioctl+0x81/0xa0
Sep 10 11:20:06 yoda kernel: [] system_call_fastpath+0x16/0x1b
Sep 10 11:22:06 yoda kernel: INFO: task kswapd0:50 blocked for more than 120 seconds.
Sep 10 11:22:06 yoda kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 10 11:22:06 yoda kernel: kswapd0 D ffff8800797b2ac0 0 50 2 0x00000000
Sep 10 11:22:06 yoda kernel: ffff8800798ed9f0 0000000000000046 0000000000000000 0000000100000003
Sep 10 11:22:06 yoda kernel: 0000000000000000 0000000000000082 ffff8800798ed990 ffff88007190bda0
Sep 10 11:22:06 yoda kernel: ffff8800798eba98 ffff8800798edfd8 0000000000010518 ffff8800798eba98
Sep 10 11:22:06 yoda kernel: Call Trace:
Sep 10 11:22:06 yoda kernel: [] ? prepare_to_wait_exclusive+0x4e/0x80
Sep 10 11:22:06 yoda kernel: [] cv_wait_common+0x78/0xe0 [spl]
Sep 10 11:22:06 yoda kernel: [

@behlendorf
Contributor

This looks like a deadlock on the open txg. Thanks for opening the bug; we'll get it fixed.

@behlendorf
Contributor

It's unclear from the stack exactly what caused the deadlock here. I believe the first few critical stacks are missing.

@behlendorf
Contributor

Is this still an issue in the latest source?

@behlendorf
Contributor

Closing as stale.

ahrens added a commit to ahrens/zfs that referenced this issue Aug 17, 2021
…penzfs#396)

When the agent is resuming (i.e. the agent restarted, and the kernel reconnects and resumes a txg), the kernel re-issues any outstanding writes to the agent. The agent finds any objects that were written before it crashed ("recovered objects") and creates new objects ("gap objects") only for blocks that were not already stored in objects. That is, if there is a gap in the objects, we need to fill it in based on the writes that the kernel gave us.

The problem is that the agent was putting as many writes as possible into the "gap object", but it needs to limit this to only blocks that are before the next "recovered object".
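
The rule described in this commit message can be illustrated with a small sketch. This is not the agent's actual code: the function name `plan_gap_objects`, the representation of blocks as integer IDs, the inclusive `(first_block, last_block)` ranges for recovered objects, and the `max_blocks` size cap are all assumptions made for illustration. It only shows the idea that a gap object must be closed before the next recovered object begins, rather than absorbing as many writes as possible.

```python
from bisect import bisect_right


def plan_gap_objects(write_blocks, recovered, max_blocks):
    """Group re-issued writes into gap objects (hypothetical model).

    write_blocks: sorted block IDs re-issued by the kernel.
    recovered:    sorted, non-overlapping (first_block, last_block)
                  inclusive ranges already stored before the crash.
    max_blocks:   assumed cap on blocks per object.
    Returns a list of lists of block IDs, one list per gap object.
    """
    starts = [first for first, _ in recovered]
    gaps = []
    current = []
    current_limit = None  # start of the next recovered object, if any

    for block in write_blocks:
        # Index of the last recovered object starting at or before block.
        i = bisect_right(starts, block) - 1
        if i >= 0 and block <= recovered[i][1]:
            continue  # already stored in a recovered object

        # First recovered object that begins after this block.
        next_start = starts[i + 1] if i + 1 < len(starts) else None

        # Close the gap object when it is full, or when a recovered
        # object lies between it and this block.
        if current and (len(current) >= max_blocks or next_start != current_limit):
            gaps.append(current)
            current = []
        if not current:
            current_limit = next_start
        current.append(block)

    if current:
        gaps.append(current)
    return gaps


if __name__ == "__main__":
    # Blocks 3-4 survived the crash; 1-2 and 5-8 need gap objects.
    print(plan_gap_objects([1, 2, 3, 4, 5, 6, 7, 8], [(3, 4)], 10))
    # prints [[1, 2], [5, 6, 7, 8]]
```

In this model, the behavior the commit message calls a problem would correspond to packing blocks 1, 2, 5, 6, 7 and 8 into a single gap object even though a recovered object sits between blocks 2 and 5.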
pcd1193182 pushed a commit to pcd1193182/zfs that referenced this issue Sep 26, 2023