Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

INFO: task zvol:1898 blocked for more than 120 seconds #10131

Closed
samuelxhu opened this issue Mar 14, 2020 · 3 comments
Closed

INFO: task zvol:1898 blocked for more than 120 seconds #10131

samuelxhu opened this issue Mar 14, 2020 · 3 comments
Labels
Status: Stale No recent activity for issue

Comments

@samuelxhu
Copy link

System information

Type | Version/Name
--- | --- OpenZFS on Linux
Distribution Name | GNU/Linux 4.4.131-20190810.kylin.server-generic
Distribution Version |
Linux Kernel | Linux 4.4.131
Architecture | aarch64
ZFS Version | 0.8.2-1
SPL Version | 0.8.2-1

Describe the problem you're observing

Under 4KB random write stress test, the machine was blocked after some hours of running. The system has enough free memory
root@node43:~# free -m
total used free shared buff/cache available
Mem: 120524 44229 72124 380 4169 64552
Swap: 7811 0 7811
and the kernel has a PAGESIZE of 64KB

Describe how to reproduce the problem

Repeat 4KB random write stress test

Include any warning/errors/backtraces from the system logs

Mar 13 14:32:31 dkyun01 kernel: [882121.840601] INFO: task zvol:1898 blocked for more than 120 seconds.
Mar 13 14:32:31 dkyun01 kernel: [882121.846927] Tainted: P OE 4.4.131-20190810.kylin.server-generic #kylin
Mar 13 14:32:31 dkyun01 kernel: [882121.855090] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 13 14:32:31 dkyun01 kernel: [882121.862975] zvol D ffff8000000861c4 0 1898 2 0x00000000
Mar 13 14:32:31 dkyun01 kernel: [882121.862981] Call trace:
Mar 13 14:32:31 dkyun01 kernel: [882121.865504] [] __switch_to+0xa4/0xb8
Mar 13 14:32:31 dkyun01 kernel: [882121.865510] [] __schedule+0x254/0x740
Mar 13 14:32:31 dkyun01 kernel: [882121.865513] [] schedule+0x40/0x98
Mar 13 14:32:31 dkyun01 kernel: [882121.865516] [] rwsem_down_write_failed+0xf4/0x310
Mar 13 14:32:31 dkyun01 kernel: [882121.865519] [] down_write+0x5c/0x78
Mar 13 14:32:31 dkyun01 kernel: [882121.865755] [] dmu_zfetch+0x1b8/0x5c0 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.866011] [] dmu_buf_hold_array_by_dnode+0x450/0x4d8 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.866256] [] dmu_write_uio_dnode+0x58/0x140 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.866497] [] zvol_write+0x15c/0x450 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.866521] [] taskq_thread+0x294/0x468 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.866527] [] kthread+0x110/0x118
Mar 13 14:32:31 dkyun01 kernel: [882121.866530] [] ret_from_fork+0x10/0x30
Mar 13 14:32:31 dkyun01 kernel: [882121.866534] INFO: task zvol:1899 blocked for more than 120 seconds.
Mar 13 14:32:31 dkyun01 kernel: [882121.872869] Tainted: P OE 4.4.131-20190810.kylin.server-generic #kylin
Mar 13 14:32:31 dkyun01 kernel: [882121.881021] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 13 14:32:31 dkyun01 kernel: [882121.888897] zvol D ffff8000000861c4 0 1899 2 0x00000000
Mar 13 14:32:31 dkyun01 kernel: [882121.888901] Call trace:
Mar 13 14:32:31 dkyun01 kernel: [882121.891435] [] __switch_to+0xa4/0xb8
Mar 13 14:32:31 dkyun01 kernel: [882121.891446] [] __schedule+0x254/0x740
Mar 13 14:32:31 dkyun01 kernel: [882121.891449] [] schedule+0x40/0x98
Mar 13 14:32:31 dkyun01 kernel: [882121.891468] [] cv_wait_common+0x15c/0x188 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.891492] [] __cv_wait+0x30/0x40 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.891748] [] dmu_tx_wait+0x1c0/0x330 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.891979] [] dmu_tx_assign+0x178/0x4b0 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.892209] [] zvol_write+0x140/0x450 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.892232] [] taskq_thread+0x294/0x468 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.892237] [] kthread+0x110/0x118
Mar 13 14:32:31 dkyun01 kernel: [882121.892240] [] ret_from_fork+0x10/0x30
Mar 13 14:32:31 dkyun01 kernel: [882121.892243] INFO: task zvol:1900 blocked for more than 120 seconds.
Mar 13 14:32:31 dkyun01 kernel: [882121.898562] Tainted: P OE 4.4.131-20190810.kylin.server-generic #kylin
Mar 13 14:32:31 dkyun01 kernel: [882121.906710] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 13 14:32:31 dkyun01 kernel: [882121.914600] zvol D ffff8000000861c4 0 1900 2 0x00000000
Mar 13 14:32:31 dkyun01 kernel: [882121.914604] Call trace:
Mar 13 14:32:31 dkyun01 kernel: [882121.917124] [] __switch_to+0xa4/0xb8
Mar 13 14:32:31 dkyun01 kernel: [882121.917128] [] __schedule+0x254/0x740
Mar 13 14:32:31 dkyun01 kernel: [882121.917131] [] schedule+0x40/0x98
Mar 13 14:32:31 dkyun01 kernel: [882121.917148] [] cv_wait_common+0x15c/0x188 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.917164] [] __cv_wait+0x30/0x40 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.917411] [] dmu_tx_wait+0x1c0/0x330 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.917634] [] dmu_tx_assign+0x178/0x4b0 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.917859] [] zvol_write+0x140/0x450 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.917881] [] taskq_thread+0x294/0x468 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.917884] [] kthread+0x110/0x118
Mar 13 14:32:31 dkyun01 kernel: [882121.917887] [] ret_from_fork+0x10/0x30
Mar 13 14:32:31 dkyun01 kernel: [882121.917890] INFO: task zvol:1901 blocked for more than 120 seconds.
Mar 13 14:32:31 dkyun01 kernel: [882121.924221] Tainted: P OE 4.4.131-20190810.kylin.server-generic #kylin
Mar 13 14:32:31 dkyun01 kernel: [882121.932370] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 13 14:32:31 dkyun01 kernel: [882121.940247] zvol D ffff8000000861c4 0 1901 2 0x00000000
Mar 13 14:32:31 dkyun01 kernel: [882121.940251] Call trace:
Mar 13 14:32:31 dkyun01 kernel: [882121.942780] [] __switch_to+0xa4/0xb8
Mar 13 14:32:31 dkyun01 kernel: [882121.942790] [] __schedule+0x254/0x740
Mar 13 14:32:31 dkyun01 kernel: [882121.942793] [] schedule+0x40/0x98
Mar 13 14:32:31 dkyun01 kernel: [882121.942796] [] rwsem_down_read_failed+0xdc/0x120
Mar 13 14:32:31 dkyun01 kernel: [882121.942798] [] down_read+0x48/0x58
Mar 13 14:32:31 dkyun01 kernel: [882121.943033] [] dmu_zfetch+0xa0/0x5c0 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.943258] [] dmu_buf_hold_array_by_dnode+0x450/0x4d8 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.943476] [] dmu_write_uio_dnode+0x58/0x140 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.943697] [] zvol_write+0x15c/0x450 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.943718] [] taskq_thread+0x294/0x468 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.943722] [] kthread+0x110/0x118
Mar 13 14:32:31 dkyun01 kernel: [882121.943726] [] ret_from_fork+0x10/0x30
Mar 13 14:32:31 dkyun01 kernel: [882121.943927] INFO: task txg_quiesce:85831 blocked for more than 120 seconds.
Mar 13 14:32:31 dkyun01 kernel: [882121.950958] Tainted: P OE 4.4.131-20190810.kylin.server-generic #kylin
Mar 13 14:32:31 dkyun01 kernel: [882121.959096] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 13 14:32:31 dkyun01 kernel: [882121.966982] txg_quiesce D ffff8000000861c4 0 85831 2 0x00000000
Mar 13 14:32:31 dkyun01 kernel: [882121.966989] Call trace:
Mar 13 14:32:31 dkyun01 kernel: [882121.969511] [] __switch_to+0xa4/0xb8
Mar 13 14:32:31 dkyun01 kernel: [882121.969515] [] __schedule+0x254/0x740
Mar 13 14:32:31 dkyun01 kernel: [882121.969517] [] schedule+0x40/0x98
Mar 13 14:32:31 dkyun01 kernel: [882121.969538] [] cv_wait_common+0x15c/0x188 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.969553] [] __cv_wait+0x30/0x40 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.969830] [] txg_quiesce_thread+0x254/0x338 [zfs]
Mar 13 14:32:31 dkyun01 kernel: [882121.969850] [] thread_generic_wrapper+0x70/0x88 [spl]
Mar 13 14:32:31 dkyun01 kernel: [882121.969854] [] kthread+0x110/0x118
Mar 13 14:32:31 dkyun01 kernel: [882121.969857] [] ret_from_fork+0x10/0x30
Mar 13 14:32:31 dkyun01 kernel: [882121.969936] INFO: task tp_osd_tp:86965 blocked for more than 120 seconds.
Mar 13 14:32:31 dkyun01 kernel: [882121.976791] Tainted: P OE 4.4.131-20190810.kylin.server-generic #kylin
Mar 13 14:32:31 dkyun01 kernel: [882121.984941] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 13 14:32:31 dkyun01 kernel: [882121.992829] tp_osd_tp D ffff8000000861c4 0 86965 1 0x00000000
Mar 13 14:32:31 dkyun01 kernel: [882121.992835] Call trace:
Mar 13 14:32:31 dkyun01 kernel: [882121.995356] [] __switch_to+0xa4/0xb8
Mar 13 14:32:31 dkyun01 kernel: [882121.995360] [] __schedule+0x254/0x740
Mar 13 14:32:31 dkyun01 kernel: [882121.995363] [] schedule+0x40/0x98
Mar 13 14:32:31 dkyun01 kernel: [882121.995366] [] schedule_timeout+0x188/0x250
Mar 13 14:32:31 dkyun01 kernel: [882121.995369] [] io_schedule_timeout+0xb4/0x130
Mar 13 14:32:31 dkyun01 kernel: [882121.995373] [] __blkdev_direct_IO_simple.isra.0+0x1f4/0x218
Mar 13 14:32:31 dkyun01 kernel: [882121.995375] [] blkdev_direct_IO+0xf0/0xf8
Mar 13 14:32:31 dkyun01 kernel: [882121.995379] [] generic_file_read_iter+0xb4/0x170
Mar 13 14:32:31 dkyun01 kernel: [882121.995382] [] blkdev_read_iter+0x50/0x78
Mar 13 14:32:31 dkyun01 kernel: [882121.995385] [] new_sync_read+0x88/0xb8
Mar 13 14:32:31 dkyun01 kernel: [882121.995388] [] __vfs_read+0x74/0x90
Mar 13 14:32:31 dkyun01 kernel: [882121.995390] [] vfs_read+0xa4/0x168
Mar 13 14:32:31 dkyun01 kernel: [882121.995393] [] SyS_pread64+0x84/0xc0
Mar 13 14:32:31 dkyun01 kernel: [882121.995396] [] el0_svc_naked+0x38/0x3c

@devZer0
Copy link

devZer0 commented Apr 17, 2020

i was transferring a virtual machine image from a crashed system (for recovery) to a new machine and got this error after 14gb being copied on the target machine. the sending process hung afterwards.

dd if=/dev/vg_backupvm2/vm-103-disk-0 bs=1024k | pv | ssh root@192.168.1.52 "dd of=/dev/rpool/data/vm-103-disk-0

besides that blocking, no other errors shown in dmesg or zpool

[1181714.996166] INFO: task dd:32414 blocked for more than 120 seconds.
[1181714.996223] Tainted: P IO 5.3.18-3-pve #1
[1181714.996264] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1181714.996318] dd D 0 32414 32381 0x00000000
[1181714.996321] Call Trace:
[1181714.996331] __schedule+0x2bb/0x660
[1181714.996335] ? bit_wait_timeout+0xa0/0xa0
[1181714.996337] schedule+0x33/0xa0
[1181714.996339] io_schedule+0x16/0x40
[1181714.996341] bit_wait_io+0x11/0x50
[1181714.996343] __wait_on_bit+0x33/0xa0
[1181714.996346] out_of_line_wait_on_bit+0x90/0xb0
[1181714.996351] ? var_wake_function+0x30/0x30
[1181714.996355] __block_write_begin_int+0x22c/0x610
[1181714.996357] ? blkdev_direct_IO+0x490/0x490
[1181714.996360] ? blkdev_direct_IO+0x490/0x490
[1181714.996362] block_write_begin+0x4d/0xf0
[1181714.996365] blkdev_write_begin+0x23/0x30
[1181714.996369] generic_perform_write+0xf2/0x1b0
[1181714.996375] ? file_update_time+0x5e/0x130
[1181714.996377] ? current_time+0x46/0x80
[1181714.996381] __generic_file_write_iter+0x101/0x1f0
[1181714.996383] ? touch_atime+0x33/0xe0
[1181714.996386] blkdev_write_iter+0xa0/0x120
[1181714.996390] new_sync_write+0x125/0x1c0
[1181714.996393] __vfs_write+0x29/0x40
[1181714.996395] vfs_write+0xab/0x1b0
[1181714.996398] ksys_write+0x61/0xe0
[1181714.996401] __x64_sys_write+0x1a/0x20
[1181714.996405] do_syscall_64+0x5a/0x130
[1181714.996408] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1181714.996411] RIP: 0033:0x7fb158a7e504
[1181714.996418] Code: Bad RIP value.
[1181714.996419] RSP: 002b:00007ffde0775758 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[1181714.996422] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb158a7e504
[1181714.996423] RDX: 0000000000000200 RSI: 00005610f6f74000 RDI: 0000000000000001
[1181714.996424] RBP: 0000000000000200 R08: 0000000000000000 R09: 00005610f6f74030
[1181714.996426] R10: ffffffffffffff3b R11: 0000000000000246 R12: 00005610f6f74000
[1181714.996427] R13: 0000000000000800 R14: 0000000000000200 R15: 00005610f6f72200

@devZer0
Copy link

devZer0 commented Apr 17, 2020

probably a duplicate of #9172

@stale
Copy link

stale bot commented Apr 17, 2021

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Apr 17, 2021
@stale stale bot closed this as completed Jul 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Stale No recent activity for issue
Projects
None yet
Development

No branches or pull requests

2 participants