
list_del corruption. next->prev should be ffff9df513f39368, but was dead000000000122 and freeze in abd_free_gang_abd #10401

Closed
dioni21 opened this issue Jun 4, 2020 · 15 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

dioni21 commented Jun 4, 2020

System information

Type Version/Name
Distribution Name Fedora
Distribution Version 31 updated
Linux Kernel 5.6.15-200.fc31.x86_64
Architecture x86_64
ZFS Version 0.8.0-802_g3bf3b164e (compiled from git master)
SPL Version XXX

Describe the problem you're observing

I just had to reboot because ZFS was frozen.

Describe how to reproduce the problem

I don't know what triggered the bug, but my system is compiled with debug options:

./configure --enable-silent-rules --enable-dependency-tracking --enable-asan --enable-debuginfo --enable-debug --enable-debug-kmem --enable-debug-kmem-tracking

Include any warning/errors/backtraces from the system logs

From /var/log/messages (which is not stored on ZFS, so it could still be written):

Jun  3 20:38:08 nexus kernel: [120444.102047] list_del corruption. next->prev should be ffff9df513f39368, but was dead000000000122
Jun  3 20:38:08 nexus kernel: [120444.102089] ------------[ cut here ]------------
Jun  3 20:38:08 nexus kernel: [120444.102092] kernel BUG at lib/list_debug.c:54!
Jun  3 20:38:08 nexus kernel: [120444.102110] invalid opcode: 0000 [#1] SMP NOPTI
Jun  3 20:38:08 nexus kernel: [120444.102113] CPU: 7 PID: 1471 Comm: z_wr_int Tainted: P        W  OE     5.6.15-200.fc31.x86_64 #1
Jun  3 20:38:08 nexus kernel: [120444.102114] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./970A-D3P, BIOS FD 02/26/2016
Jun  3 20:38:08 nexus kernel: [120444.102119] RIP: 0010:__list_del_entry_valid.cold+0x1d/0x55
Jun  3 20:38:08 nexus kernel: [120444.102122] Code: c7 c7 b0 2a 3d b1 e8 65 bf c3 ff 0f 0b 48 89 fe 48 c7 c7 40 2b 3d b1 e8 54 bf c3 ff 0f 0b 48 c7 c7 f0 2b 3d b1 e8 46 bf c3 ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 b0 2b 3d b1 e8 32 bf c3 ff 0f 0b
Jun  3 20:38:08 nexus kernel: [120444.102123] RSP: 0018:ffffa94141563d00 EFLAGS: 00010292
Jun  3 20:38:08 nexus kernel: [120444.102125] RAX: 0000000000000054 RBX: ffff9df513f39368 RCX: 0000000000000000
Jun  3 20:38:08 nexus kernel: [120444.102126] RDX: ffff9df95fde7c80 RSI: ffff9df95fdd9cc8 RDI: ffff9df95fdd9cc8
Jun  3 20:38:08 nexus kernel: [120444.102128] RBP: ffff9df513f39368 R08: 000000000000045d R09: 0000000000000064
Jun  3 20:38:08 nexus kernel: [120444.102129] R10: ffffa94141563bb8 R11: 0000000000000000 R12: ffff9df61554b530
Jun  3 20:38:08 nexus kernel: [120444.102130] R13: ffff9df92578cd00 R14: 0000000000012000 R15: 0000000000011fff
Jun  3 20:38:08 nexus kernel: [120444.102132] FS:  0000000000000000(0000) GS:ffff9df95fdc0000(0000) knlGS:0000000000000000
Jun  3 20:38:08 nexus kernel: [120444.102133] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun  3 20:38:08 nexus kernel: [120444.102134] CR2: 00003faa7bb1d89b CR3: 000000042ae46000 CR4: 00000000000406e0
Jun  3 20:38:08 nexus kernel: [120444.102135] Call Trace:
Jun  3 20:38:08 nexus kernel: [120444.102292]  __list_del_entry+0x9/0x20 [zfs]
Jun  3 20:38:08 nexus kernel: [120444.102348]  list_del+0x9/0x20 [zfs]
Jun  3 20:38:08 nexus kernel: [120444.102402]  list_remove_head+0x1a/0x30 [zfs]
Jun  3 20:38:08 nexus kernel: [120444.102456]  abd_free_gang_abd+0x32/0xc0 [zfs]
Jun  3 20:38:08 nexus kernel: [120444.102550]  zio_done+0x397/0x1900 [zfs]
Jun  3 20:38:08 nexus kernel: [120444.102558]  ? _cond_resched+0x15/0x30
Jun  3 20:38:08 nexus kernel: [120444.102560]  ? mutex_lock+0xe/0x30
Jun  3 20:38:08 nexus kernel: [120444.102647]  ? __raw_spin_unlock+0x5/0x10 [zfs]
Jun  3 20:38:08 nexus kernel: [120444.102741]  ? zio_wait_for_children+0xac/0x170 [zfs]
Jun  3 20:38:08 nexus kernel: [120444.102836]  zio_execute+0xe6/0x220 [zfs]
Jun  3 20:38:08 nexus kernel: [120444.102852]  taskq_thread+0x207/0x460 [spl]
Jun  3 20:38:08 nexus kernel: [120444.102858]  ? wake_up_q+0xa0/0xa0
Jun  3 20:38:08 nexus kernel: [120444.102948]  ? zio_execute_stack_check.constprop.0+0x10/0x10 [zfs]
Jun  3 20:38:08 nexus kernel: [120444.102952]  kthread+0xf9/0x130
Jun  3 20:38:08 nexus kernel: [120444.102960]  ? kasan_check_write.constprop.0+0x10/0x10 [spl]
Jun  3 20:38:08 nexus kernel: [120444.102963]  ? kthread_park+0x90/0x90
Jun  3 20:38:08 nexus kernel: [120444.102966]  ret_from_fork+0x22/0x40
Jun  3 20:38:08 nexus kernel: [120444.102968] Modules linked in: vmnet(OE) vmmon(OE) parport_pc parport fuse vmw_vsock_vmci_transport vsock vmw_vmci tcp_htcp nfnetlink bluetooth ecdh_generic rfkill ecc it87 hwmon_vid zfs(POE) zunicode(POE) zlua(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) snd_hda_codec_hdmi edac_mce_amd snd_hda_intel usblp snd_intel_dspcfg snd_hda_codec crct10dif_pclmul crc32_pclmul snd_hda_core snd_hwdep ghash_clmulni_intel pcspkr k10temp fam15h_power snd_seq sp5100_tco snd_seq_device nvidia(POE) snd_pcm i2c_piix4 snd_timer drm snd soundcore acpi_cpufreq nfsd auth_rpcgss tcp_bbr sch_fq binfmt_misc nfsv4 dns_resolver nfs_acl nfs lockd grace sunrpc fscache ip_tables raid1 nvme ata_generic pata_acpi crc32c_intel nvme_core pata_marvell r8169
Jun  3 20:38:08 nexus kernel: [120444.103055] ---[ end trace cdd7070024d56a2c ]---
Jun  3 20:38:08 nexus kernel: [120444.103058] RIP: 0010:__list_del_entry_valid.cold+0x1d/0x55
Jun  3 20:38:08 nexus kernel: [120444.103060] Code: c7 c7 b0 2a 3d b1 e8 65 bf c3 ff 0f 0b 48 89 fe 48 c7 c7 40 2b 3d b1 e8 54 bf c3 ff 0f 0b 48 c7 c7 f0 2b 3d b1 e8 46 bf c3 ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 b0 2b 3d b1 e8 32 bf c3 ff 0f 0b
Jun  3 20:38:08 nexus kernel: [120444.103062] RSP: 0018:ffffa94141563d00 EFLAGS: 00010292
Jun  3 20:38:08 nexus kernel: [120444.103063] RAX: 0000000000000054 RBX: ffff9df513f39368 RCX: 0000000000000000
Jun  3 20:38:08 nexus kernel: [120444.103064] RDX: ffff9df95fde7c80 RSI: ffff9df95fdd9cc8 RDI: ffff9df95fdd9cc8
Jun  3 20:38:08 nexus kernel: [120444.103066] RBP: ffff9df513f39368 R08: 000000000000045d R09: 0000000000000064
Jun  3 20:38:08 nexus kernel: [120444.103067] R10: ffffa94141563bb8 R11: 0000000000000000 R12: ffff9df61554b530
Jun  3 20:38:08 nexus kernel: [120444.103068] R13: ffff9df92578cd00 R14: 0000000000012000 R15: 0000000000011fff
Jun  3 20:38:08 nexus kernel: [120444.103070] FS:  0000000000000000(0000) GS:ffff9df95fdc0000(0000) knlGS:0000000000000000
Jun  3 20:38:08 nexus kernel: [120444.103072] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun  3 20:38:08 nexus kernel: [120444.103073] CR2: 00003faa7bb1d89b CR3: 000000042ae46000 CR4: 00000000000406e0
Jun  3 20:38:09 nexus abrt-dump-journal-oops[9516]: abrt-dump-journal-oops: Found oopses: 1
Jun  3 20:38:09 nexus abrt-dump-journal-oops[9516]: abrt-dump-journal-oops: Creating problem directories
Jun  3 20:38:10 nexus abrt-notification[5076]: System encountered a non-fatal error in __list_del_entry()
Jun  3 20:38:10 nexus abrt-dump-journal-oops: Reported 1 kernel oopses to Abrt

dioni21 commented Jun 4, 2020

Could this be related to the recent commit fb82226?

dioni21 referenced this issue Jun 4, 2020
Adding the gang ABD type, which allows for linear and scatter ABDs to
be chained together into a single ABD.

This can be used to avoid doing memory copies to/from ABDs. An example
of this can be found in vdev_queue.c in the vdev_queue_aggregate()
function.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Brian <bwa@clemson.edu>
Co-authored-by: Mark Maybee <mmaybee@cray.com>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #10069

ahrens commented Jun 4, 2020

@bwatkinson or @mmaybee may want to take a look.

bwatkinson commented Jun 4, 2020

@dioni21 What vdevs were in your zpool when this occurred? Currently, the only place a gang ABD is allocated is in vdev_queue_aggregate(). It almost appears as though an ABD in the list was removed before abd_free_gang_abd() was called. What doesn't make sense to me, though, is how we are getting to this error. We always allocate a new ABD struct in abd_gang_add() if the ABD is already in another gang ABD (i.e. its abd_gang_link is active), as sketched below. Currently, the only time this should be occurring is in the event of ditto blocks or when the same ABD is used in vdev_label.c.
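A minimal sketch of that check, assuming the helper and field names used by the gang ABD code (abd_is_gang(), list_link_active(), abd_get_offset(), ABD_GANG()); this paraphrases the behavior described above, not the verbatim upstream function, and ownership/flag handling is omitted:

/*
 * Sketch only: if the child ABD's gang link is already active (it
 * already belongs to another gang, e.g. because of ditto blocks), a
 * new ABD referencing the same data is allocated so that each gang
 * owns its own list link.
 */
static void
abd_gang_add_sketch(abd_t *pabd, abd_t *cabd)
{
	abd_t *child = cabd;

	ASSERT(abd_is_gang(pabd));
	if (list_link_active(&cabd->abd_gang_link)) {
		/* Already chained into another gang ABD; wrap it. */
		child = abd_get_offset(cabd, 0);
	}
	list_insert_tail(&ABD_GANG(pabd).abd_gang_chain, child);
	pabd->abd_size += abd_get_size(child);
}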

Even that confuses me, though. Somehow next->prev for the list entry is set to LIST_POISON2; looking at the 5.6 kernel code, include/linux/poison.h shows that LIST_POISON2 evaluates to the 0xdead... base address plus 0x122. Even though the trace does not show it, this error appears to come from __list_del_entry_valid(). I am assuming your kernel build defines CONFIG_DEBUG_LIST.
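For reference, this is the relevant kernel-side behavior, paraphrased from the Linux 5.6 sources (include/linux/poison.h and include/linux/list.h); on x86_64, CONFIG_ILLEGAL_POINTER_VALUE is 0xdead000000000000, which is why the oops reports dead000000000122:

/* include/linux/poison.h (paraphrased) */
#ifdef CONFIG_ILLEGAL_POINTER_VALUE
# define POISON_POINTER_DELTA _AC(CONFIG_ILLEGAL_POINTER_VALUE, UL)
#else
# define POISON_POINTER_DELTA 0
#endif
#define LIST_POISON1  ((void *) 0x100 + POISON_POINTER_DELTA)
#define LIST_POISON2  ((void *) 0x122 + POISON_POINTER_DELTA)

/*
 * include/linux/list.h (paraphrased): list_del() poisons the entry after
 * unlinking it, and with CONFIG_DEBUG_LIST a later list_del() on the same
 * entry trips __list_del_entry_valid(), producing the BUG seen above.
 */
static inline void list_del(struct list_head *entry)
{
	__list_del_entry(entry);	/* validates, then unlinks */
	entry->next = LIST_POISON1;
	entry->prev = LIST_POISON2;	/* == 0xdead000000000122 on x86_64 */
}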

dioni21 commented Jun 5, 2020

@bwatkinson My pool is composed of a single 3-way mirror vdev (please, don’t ask), a mirrored special, a mirrored slog, and a two-disk cache. I can post specifics if really needed.

My kernel is just the default from Fedora 31, with no local compilation but some dynamic modules added.

What do you mean by ditto blocks? Some datasets in this pool have the copies=2 setting, and all used dedup, but I have never used a ditto option.

bwatkinson commented

Yes, copies is what I was referring to as ditto blocks. This is still super strange to me, and I haven’t been able to figure out how this happened.

I will experiment with setting up a mirror with multiple copies for the data blocks to see if I can replicate this with the compile flags you posted. Please let me know if you encounter this again in the meantime. Maybe we can get some more stack traces if it does occur again.

dioni21 commented Jun 5, 2020

I will experiment with setting up a mirror with multiple copies for the data blocks to see if I can replicate this with the compile flags you posted. Please let me know if you encounter this again in the meantime. Maybe we can get some more stack traces if it does occur again.

Note that the error was not immediate, and my home dir (the most active dataset) has copies. You may need to do some disk burn-in to trigger the bug.

Other than another stack trace, is there something else I can do to help? Some specific compile flag or runtime parameter?

bwatkinson commented

I think if it is triggered again, any stack traces you could gather would be helpful. Other than that, I will see if I can get this to occur myself.

It is just really strange, because ASSERTs in the ABD code should have been hit/failed before the kernel's list-link validation code set off this error.

dioni21 commented Jun 17, 2020

@bwatkinson it just happened again. This time I have a personal suspect: integration with kernel-mode NFS. I almost never use NFS to access my ZFS data at home, but sometimes I do. Last time it crashed, I was using the NFS client. This may explain the rarity of occurrences and maybe the lack of detection in testing.

Here's today's stack trace. We can see two dumps: one at 21:18 in the ABD code, and another at 21:31 from NFS, probably trying to access the already-frozen ZFS data.

[Tue Jun 16 21:18:21 2020] list_del corruption, ffff9a93489a81f8->prev is LIST_POISON2 (dead000000000122)
[Tue Jun 16 21:18:21 2020] ------------[ cut here ]------------
[Tue Jun 16 21:18:21 2020] kernel BUG at lib/list_debug.c:48!
[Tue Jun 16 21:18:21 2020] invalid opcode: 0000 [#1] SMP NOPTI
[Tue Jun 16 21:18:21 2020] CPU: 2 PID: 1480 Comm: z_wr_int Tainted: P        W  OE     5.6.16-200.fc31.x86_64 #1
[Tue Jun 16 21:18:21 2020] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./970A-D3P, BIOS FD 02/26/2016
[Tue Jun 16 21:18:21 2020] RIP: 0010:__list_del_entry_valid.cold+0x45/0x55
[Tue Jun 16 21:18:21 2020] Code: c3 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 d0 2b 3d af e8 32 c5 c3 ff 0f 0b 48 89 fe 4c 89 c2 48 c7 c7 98 2b 3d af e8 1e c5 c3 ff <0f> 0b cc cc cc cc cc cc cc cc cc cc cc cc cc cc 41 54 44 8b 0e 45
[Tue Jun 16 21:18:21 2020] RSP: 0018:ffffa7fd46b47d00 EFLAGS: 00010292
[Tue Jun 16 21:18:21 2020] RAX: 000000000000004e RBX: ffff9a93489a81f8 RCX: 0000000000000000
[Tue Jun 16 21:18:21 2020] RDX: ffff9a951fca7c80 RSI: ffff9a951fc99cc8 RDI: ffff9a951fc99cc8
[Tue Jun 16 21:18:21 2020] RBP: ffff9a93489a81f8 R08: 0000000000000582 R09: 000000000000005e
[Tue Jun 16 21:18:21 2020] R10: ffffa7fd46b47bb8 R11: 0000000000000000 R12: ffff9a93489a9910
[Tue Jun 16 21:18:21 2020] R13: ffff9a94787f8000 R14: 0000000000008000 R15: 0000000000007fff
[Tue Jun 16 21:18:21 2020] FS:  0000000000000000(0000) GS:ffff9a951fc80000(0000) knlGS:0000000000000000
[Tue Jun 16 21:18:21 2020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Tue Jun 16 21:18:21 2020] CR2: 00003818bbd184d0 CR3: 0000000555c42000 CR4: 00000000000406e0
[Tue Jun 16 21:18:21 2020] Call Trace:
[Tue Jun 16 21:18:21 2020]  __list_del_entry+0x9/0x20 [zfs]
[Tue Jun 16 21:18:21 2020]  list_del+0x9/0x20 [zfs]
[Tue Jun 16 21:18:21 2020]  list_remove_head+0x1a/0x30 [zfs]
[Tue Jun 16 21:18:21 2020]  abd_free_gang_abd+0x32/0xc0 [zfs]
[Tue Jun 16 21:18:21 2020]  zio_done+0x397/0x1900 [zfs]
[Tue Jun 16 21:18:21 2020]  ? _cond_resched+0x15/0x30
[Tue Jun 16 21:18:21 2020]  ? mutex_lock+0xe/0x30
[Tue Jun 16 21:18:21 2020]  ? __raw_spin_unlock+0x5/0x10 [zfs]
[Tue Jun 16 21:18:21 2020]  ? zio_wait_for_children+0xac/0x170 [zfs]
[Tue Jun 16 21:18:21 2020]  zio_execute+0xe6/0x220 [zfs]
[Tue Jun 16 21:18:21 2020]  taskq_thread+0x207/0x460 [spl]
[Tue Jun 16 21:18:21 2020]  ? wake_up_q+0xa0/0xa0
[Tue Jun 16 21:18:21 2020]  ? zio_execute_stack_check.constprop.0+0x10/0x10 [zfs]
[Tue Jun 16 21:18:21 2020]  kthread+0xf9/0x130
[Tue Jun 16 21:18:21 2020]  ? kasan_check_write.constprop.0+0x10/0x10 [spl]
[Tue Jun 16 21:18:21 2020]  ? kthread_park+0x90/0x90
[Tue Jun 16 21:18:21 2020]  ret_from_fork+0x22/0x40
[Tue Jun 16 21:18:21 2020] Modules linked in: ipheth loop uinput cdc_acm vmnet(OE) parport_pc parport fuse vmw_vsock_vmci_transport vsock vmw_vmci vmmon(OE) nfnetlink bluetooth tcp_htcp ecdh_generic rfkill ecc it87 hwmon_vid zfs(POE) zunicode(POE) zlua(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) edac_mce_amd spl(OE) crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi usblp snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep pcspkr snd_seq nvidia(POE) sp5100_tco fam15h_power k10temp snd_seq_device i2c_piix4 snd_pcm snd_timer drm snd soundcore acpi_cpufreq nfsd auth_rpcgss tcp_bbr sch_fq binfmt_misc nfsv4 dns_resolver nfs_acl nfs lockd grace sunrpc fscache ip_tables raid1 nvme ata_generic pata_acpi nvme_core crc32c_intel pata_marvell r8169
[Tue Jun 16 21:18:21 2020] ---[ end trace b270989b09a1b739 ]---
[Tue Jun 16 21:18:21 2020] RIP: 0010:__list_del_entry_valid.cold+0x45/0x55
[Tue Jun 16 21:18:21 2020] Code: c3 ff 0f 0b 48 89 f2 48 89 fe 48 c7 c7 d0 2b 3d af e8 32 c5 c3 ff 0f 0b 48 89 fe 4c 89 c2 48 c7 c7 98 2b 3d af e8 1e c5 c3 ff <0f> 0b cc cc cc cc cc cc cc cc cc cc cc cc cc cc 41 54 44 8b 0e 45
[Tue Jun 16 21:18:21 2020] RSP: 0018:ffffa7fd46b47d00 EFLAGS: 00010292
[Tue Jun 16 21:18:21 2020] RAX: 000000000000004e RBX: ffff9a93489a81f8 RCX: 0000000000000000
[Tue Jun 16 21:18:21 2020] RDX: ffff9a951fca7c80 RSI: ffff9a951fc99cc8 RDI: ffff9a951fc99cc8
[Tue Jun 16 21:18:21 2020] RBP: ffff9a93489a81f8 R08: 0000000000000582 R09: 000000000000005e
[Tue Jun 16 21:18:21 2020] R10: ffffa7fd46b47bb8 R11: 0000000000000000 R12: ffff9a93489a9910
[Tue Jun 16 21:18:21 2020] R13: ffff9a94787f8000 R14: 0000000000008000 R15: 0000000000007fff
[Tue Jun 16 21:18:21 2020] FS:  0000000000000000(0000) GS:ffff9a951fc80000(0000) knlGS:0000000000000000
[Tue Jun 16 21:18:21 2020] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Tue Jun 16 21:18:21 2020] CR2: 00003818bbd184d0 CR3: 0000000555c42000 CR4: 00000000000406e0
[Tue Jun 16 21:31:05 2020] (ostnamed): page allocation failure: order:5, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[Tue Jun 16 21:31:05 2020] CPU: 5 PID: 36243 Comm: (ostnamed) Tainted: P      D W  OE     5.6.16-200.fc31.x86_64 #1
[Tue Jun 16 21:31:05 2020] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./970A-D3P, BIOS FD 02/26/2016
[Tue Jun 16 21:31:05 2020] Call Trace:
[Tue Jun 16 21:31:05 2020]  dump_stack+0x66/0x90
[Tue Jun 16 21:31:05 2020]  warn_alloc.cold+0x75/0xd9
[Tue Jun 16 21:31:05 2020]  ? _cond_resched+0x15/0x30
[Tue Jun 16 21:31:05 2020]  ? __alloc_pages_direct_compact+0x144/0x150
[Tue Jun 16 21:31:05 2020]  __alloc_pages_slowpath+0xd99/0xdd0
[Tue Jun 16 21:31:05 2020]  ? kernfs_add_one+0xe2/0x130
[Tue Jun 16 21:31:05 2020]  __alloc_pages_nodemask+0x2c9/0x310
[Tue Jun 16 21:31:05 2020]  kmalloc_order+0x1b/0x80
[Tue Jun 16 21:31:05 2020]  kmalloc_order_trace+0x1d/0xa0
[Tue Jun 16 21:31:05 2020]  nfsd_reply_cache_init+0xda/0x170 [nfsd]
[Tue Jun 16 21:31:05 2020]  nfsd_init_net+0x60/0x140 [nfsd]
[Tue Jun 16 21:31:05 2020]  ops_init+0x3a/0x100
[Tue Jun 16 21:31:05 2020]  setup_net+0xc5/0x220
[Tue Jun 16 21:31:05 2020]  copy_net_ns+0xf5/0x220
[Tue Jun 16 21:31:05 2020]  create_new_namespaces+0x113/0x2a0
[Tue Jun 16 21:31:05 2020]  unshare_nsproxy_namespaces+0x55/0xa0
[Tue Jun 16 21:31:05 2020]  ksys_unshare+0x164/0x330
[Tue Jun 16 21:31:05 2020]  __x64_sys_unshare+0xe/0x20
[Tue Jun 16 21:31:05 2020]  do_syscall_64+0x5b/0x1c0
[Tue Jun 16 21:31:05 2020]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Tue Jun 16 21:31:05 2020] RIP: 0033:0x7fbd213b239b
[Tue Jun 16 21:31:05 2020] Code: 73 01 c3 48 8b 0d ed fa 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 01 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd fa 0b 00 f7 d8 64 89 01 48
[Tue Jun 16 21:31:05 2020] RSP: 002b:00007fff28475218 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
[Tue Jun 16 21:31:05 2020] RAX: ffffffffffffffda RBX: 00005616cc0d3d58 RCX: 00007fbd213b239b
[Tue Jun 16 21:31:05 2020] RDX: 0000000000000000 RSI: 00007fff28475180 RDI: 0000000040000000
[Tue Jun 16 21:31:05 2020] RBP: 00007fff28475240 R08: 0000000000000000 R09: 00005616cc0fdca0
[Tue Jun 16 21:31:05 2020] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[Tue Jun 16 21:31:05 2020] R13: 0000000000000000 R14: 00000000fffffff5 R15: 0000000000000000
[Tue Jun 16 21:31:05 2020] Mem-Info:
[Tue Jun 16 21:31:05 2020] active_anon:1925169 inactive_anon:1056950 isolated_anon:0
                            active_file:332330 inactive_file:319387 isolated_file:0
                            unevictable:24 dirty:509 writeback:432 unstable:0
                            slab_reclaimable:138454 slab_unreclaimable:1809445
                            mapped:973551 shmem:892130 pagetables:27011 bounce:0
                            free:1719917 free_pcp:0 free_cma:0
[Tue Jun 16 21:31:05 2020] Node 0 active_anon:7700676kB inactive_anon:4227800kB active_file:1329320kB inactive_file:1277548kB unevictable:96kB isolated(anon):0kB isolated(file):0kB mapped:3894204kB dirty:2036kB writeback:1728kB shmem:3568520kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[Tue Jun 16 21:31:05 2020] Node 0 DMA free:15360kB min:28kB low:40kB high:52kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15360kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jun 16 21:31:05 2020] lowmem_reserve[]: 0 2687 31781 31781 31781
[Tue Jun 16 21:31:05 2020] Node 0 DMA32 free:581952kB min:5712kB low:8464kB high:11216kB reserved_highatomic:2048KB active_anon:40128kB inactive_anon:95476kB active_file:30004kB inactive_file:80268kB unevictable:0kB writepending:64kB present:3111984kB managed:2784304kB mlocked:0kB kernel_stack:544kB pagetables:1868kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jun 16 21:31:05 2020] lowmem_reserve[]: 0 0 29093 29093 29093
[Tue Jun 16 21:31:05 2020] Node 0 Normal free:6282356kB min:61836kB low:91624kB high:121412kB reserved_highatomic:2048KB active_anon:7660548kB inactive_anon:4132324kB active_file:1299316kB inactive_file:1197280kB unevictable:96kB writepending:3700kB present:30408700kB managed:29799580kB mlocked:96kB kernel_stack:34272kB pagetables:106176kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[Tue Jun 16 21:31:05 2020] lowmem_reserve[]: 0 0 0 0 0
[Tue Jun 16 21:31:05 2020] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15360kB
[Tue Jun 16 21:31:05 2020] Node 0 DMA32: 18666*4kB (UME) 14890*8kB (UME) 9763*16kB (UMEH) 5005*32kB (UMEH) 1126*64kB (UMEH) 2*128kB (H) 3*256kB (H) 1*512kB (H) 0*1024kB 0*2048kB 0*4096kB = 583752kB
[Tue Jun 16 21:31:05 2020] Node 0 Normal: 153442*4kB (UMEH) 322599*8kB (UMEH) 136537*16kB (UMEH) 24917*32kB (UMEH) 1683*64kB (MEH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6284208kB
[Tue Jun 16 21:31:05 2020] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Tue Jun 16 21:31:05 2020] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Tue Jun 16 21:31:05 2020] 1545000 total pagecache pages
[Tue Jun 16 21:31:05 2020] 1177 pages in swap cache
[Tue Jun 16 21:31:05 2020] Swap cache stats: add 20324, delete 19147, find 1910/2128
[Tue Jun 16 21:31:05 2020] Free swap  = 67034872kB
[Tue Jun 16 21:31:05 2020] Total swap = 67108856kB
[Tue Jun 16 21:31:05 2020] 8384167 pages RAM
[Tue Jun 16 21:31:05 2020] 0 pages HighMem/MovableOnly
[Tue Jun 16 21:31:05 2020] 234356 pages reserved
[Tue Jun 16 21:31:05 2020] 0 pages cma reserved
[Tue Jun 16 21:31:05 2020] 0 pages hwpoisoned

I'll try to deliberately use NFS more now, just in case.

ahrens added the Component: Encryption "native encryption" feature label Jun 17, 2020

dioni21 commented Jun 17, 2020

Note: I do not use encryption.

behlendorf added the Type: Defect Incorrect behavior (e.g. crash, hang) label and removed the Component: Encryption "native encryption" feature label Jun 18, 2020
bwatkinson commented

@dioni21, we might have found the issue. I have created the following PR:
#10511
There appeared to be a race condition between adding an ABD as a child to a gang and possibly removing the same ABD from another gang at the same time.

dioni21 commented Jul 8, 2020

Good to know. Unfortunately, I am not sure when I'll be able to test. Due to the COVID pandemic, I'm a bit far from my main machine.

I'm curious: how did you find it? Could you give a quick debrief?

bwatkinson commented

Good to know. Unfortunately, I am not sure when I'll be able to test. Due to the COVID pandemic, I'm a bit far from my main machine.

I'm curious: how did you find it? Could you give a quick debrief?

So @sdimitro was able to get a stack trace where we could see all the link values in the gang ABD after this crash occurred. We found that each child ABD's link prev value was pointing at LIST_POISON2. After reasoning this out, we both came to the conclusion that a race condition existed between adding and removing a single ABD from multiple gang ABDs at the same time. I previously was not locking the child ABDs when removing them from a gang ABD, but we now lock the children before removing them so there is a consistent view of the ABD link status.
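A minimal sketch of the locking pattern described here, assuming the field names used by the gang ABD code (abd_mtx, abd_gang_chain, ABD_GANG()); this illustrates the idea only and is not the actual #10511 diff:

static void
abd_free_gang_sketch(abd_t *pabd)
{
	abd_t *cabd;

	ASSERT(abd_is_gang(pabd));
	mutex_enter(&pabd->abd_mtx);
	while ((cabd = list_head(&ABD_GANG(pabd).abd_gang_chain)) != NULL) {
		/*
		 * Take the child's mutex before unlinking it, so a
		 * concurrent abd_gang_add() on the same child sees a
		 * consistent abd_gang_link state.
		 */
		mutex_enter(&cabd->abd_mtx);
		list_remove(&ABD_GANG(pabd).abd_gang_chain, cabd);
		mutex_exit(&cabd->abd_mtx);
		/* Freeing of children owned by the gang is omitted here. */
	}
	mutex_exit(&pabd->abd_mtx);
}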

If you are able to apply this patch with your previous setup, let me know if the issue occurs again. I am also going to fire up a VM replicating your setup, but I have previously not been able to get the LIST_POISON2 bug to occur in my VMs. I am hoping this patch will resolve the issue.

dioni21 commented Jul 9, 2020

@bwatkinson I see that there are still some changes being made. Where can I get the latest version? https://github.com/bwatkinson/zfs/tree/gang_abd_debug? The latest commit seems to be a277205, which is referenced as the last force push on #10511, but its date is Jun 9, not Jul 8!

I am still remote, but my computer keeps freezing from the same bug. I'd love to try your patch.

bwatkinson commented

@bwatkinson I see that there are still some changes being made. Where can I get the latest version? https://github.com/bwatkinson/zfs/tree/gang_abd_debug? The latest commit seems to be a277205, which is referenced as the last force push on #10511, but its date is Jun 9, not Jul 8!

I am still remote, but my computer keeps freezing from the same bug. I'd love to try your patch.

You should be able to pull in the changes from the PR now if you would like to try them out. I have just been making minor changes that will have no effect on the functionality of the patch.

behlendorf commented

Resolved by #10511
