panic: kernel BUG at net/core/skbuff.c:109! #1125

melver · 2020-08-14T14:31:54Z

Using a recent syzkaller config, we get the below panic on very recent Clang (0b90a08f7722980f6074c6eada8022242408cdb4). This issue does not exist in Clang 11 (no bisection attempted yet).

.config: bad.config.txt
steps to reproduce: 1) boot kernel; 2) try to ssh into VM or any other network-related activity.

------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:109!
invalid opcode: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
RIP: 0010:skb_panic+0xc4/0xd0 net/core/skbuff.c:105
Code: 48 8b 74 24 08 48 8b 54 24 10 44 89 e9 44 8b 44 24 04 49 89 e9 b8 00 00 00 00 53 41 54 41 57 41 56 e8 7e ba 17 fc 48 83 c4 20 <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 55 41 57 41 56 41 55 41 54 53
RSP: 0018:ffffc900000e8450 EFLAGS: 00010286
RAX: 0000000000000098 RBX: ffffffff8739c99c RCX: 9f928d0f82712900
RDX: 0000000000000301 RSI: 0000000000000301 RDI: 0000000000000000
RBP: ffff8888136be800 R08: ffffffff813ca1fc R09: 0000ffff875f7d1f
R10: 0000ffffffffffff R11: 0000000000000000 R12: 00000000000002c0
R13: 00000000f19e81cc R14: ffff888721cd6774 R15: 0000000000000140
FS:  0000000000000000(0000) GS:ffff88881fa80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3a3c1efab4 CR3: 000000081177a006 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 00000000
Call Trace:
 <IRQ>
 skb_under_panic+0xc/0x10 net/core/skbuff.c:119
 skb_push+0x96/0xa0 net/core/skbuff.c:1884
 tcp_make_synack+0x439/0x800 net/ipv4/tcp_output.c:3414
 tcp_v4_send_synack+0x71/0x3a0 net/ipv4/tcp_ipv4.c:979
 tcp_conn_request+0x1348/0x1640 net/ipv4/tcp_input.c:6771
 tcp_v4_conn_request+0x10b/0x130 net/ipv4/tcp_ipv4.c:1474
 tcp_rcv_state_process+0x74e/0x17a0 net/ipv4/tcp_input.c:6246
 tcp_v4_do_rcv+0x401/0x4a0 net/ipv4/tcp_ipv4.c:1664
 tcp_v4_rcv+0x1fdd/0x2830 net/ipv4/tcp_ipv4.c:2012
 ip_protocol_deliver_rcu+0x2f2/0x540 net/ipv4/ip_input.c:204
 ip_local_deliver_finish net/ipv4/ip_input.c:231 [inline]
 NF_HOOK include/linux/netfilter.h:301 [inline]
 ip_local_deliver+0x26d/0x310 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:449 [inline]
 ip_sublist_rcv_finish net/ipv4/ip_input.c:550 [inline]
 ip_list_rcv_finish net/ipv4/ip_input.c:600 [inline]
 ip_sublist_rcv+0x61f/0x670 net/ipv4/ip_input.c:608
 ip_list_rcv+0x262/0x290 net/ipv4/ip_input.c:643
 __netif_receive_skb_list_ptype net/core/dev.c:5329 [inline]
 __netif_receive_skb_list_core+0x34b/0x450 net/core/dev.c:5377
 __netif_receive_skb_list+0x262/0x2e0 net/core/dev.c:5429
 netif_receive_skb_list_internal+0x16c/0x440 net/core/dev.c:5534
 gro_normal_list net/core/dev.c:5645 [inline]
 napi_complete_done+0x1a1/0x3a0 net/core/dev.c:6370
 virtqueue_napi_complete+0x28/0x80 drivers/net/virtio_net.c:329
 virtnet_poll+0x64f/0x780 drivers/net/virtio_net.c:1455
 napi_poll net/core/dev.c:6687 [inline]
 net_rx_action+0x317/0x8f0 net/core/dev.c:6757
 __do_softirq+0x1b6/0x30e kernel/softirq.c:298
 asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:706
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 do_softirq_own_stack+0x71/0x90 arch/x86/kernel/irq_64.c:77
 invoke_softirq kernel/softirq.c:393 [inline]
 __irq_exit_rcu+0x10c/0x120 kernel/softirq.c:423
 irq_exit_rcu+0x5/0x10 kernel/softirq.c:435
 common_interrupt+0x1e6/0x240 arch/x86/kernel/irq.c:239
 asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:572
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
Code: ff ff e9 0d ff ff ff e8 10 d7 24 fb e9 72 ff ff ff cc cc cc cc cc cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d 84 2b 5c 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 74 2b 5c 00 f4 c3 cc cc 65 48 8b 04
RSP: 0018:ffffc90000083ef0 EFLAGS: 00000202
RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffff88881c74c340
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff88881c74c340 R08: ffffffff814f5c10 R09: 000088881c74c367
R10: 0000ffffffffffff R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: ffffc90000083f07 R15: 0000000000000000
 arch_safe_halt arch/x86/include/asm/paravirt.h:150 [inline]
 default_idle+0x1b/0x30 arch/x86/kernel/process.c:688
 default_idle_call kernel/sched/idle.c:94 [inline]
 cpuidle_idle_call kernel/sched/idle.c:163 [inline]
 do_idle+0xf7/0x2c0 kernel/sched/idle.c:276
 cpu_startup_entry+0x15/0x20 kernel/sched/idle.c:372
 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243
Modules linked in:
---[ end trace e6cbb0171868af05 ]---

melver · 2020-09-30T15:20:27Z

Likely related, but I've recently encountered other strange memory corruptions on Clang 12 (and no other compiler).

Clang 12 @ 43d239d0fadb1f8ea297580ca39dfbee96c913c1
Kernel @ next-20200930

Sample trace:

------------[ cut here ]------------
list_del corruption, ffffcb01d14e6408->next is LIST_POISON1 (dead000000000100)
WARNING: CPU: 0 PID: 138 at lib/list_debug.c:47 __list_del_entry_valid+0x46/0xa0 lib/list_debug.c:45
Modules linked in:
CPU: 0 PID: 138 Comm: systemd-journal Tainted: G        W         5.9.0-rc7-next-20200930 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
RIP: 0010:__list_del_entry_valid+0x46/0xa0 lib/list_debug.c:45
Code: c2 22 48 39 d1 74 29 48 39 31 75 3c b3 01 48 39 70 08 75 4f 89 d8 5b c3 90 31 db 48 c7 c7 10 7f a2 a9 31 c0 e8 7b f8 ab ff 90 <0f> 0b 90 90 eb e4 90 31 db 48 c7 c7 46 7f a2 a9 31 c0 e8 63 f8 ab
RSP: 0018:ffffa26801413b48 EFLAGS: 00010046
RAX: 38c708c80cfa2900 RBX: 0000000000000000 RCX: ffff9b20bd3c0040
RDX: 0000000000000000 RSI: ffffffffa99fd122 RDI: 00000000ffffffff
RBP: 0000000000000000 R08: 0000000362362d81 R09: 0000000000000002
R10: 000000002d2d2d2d R11: ffffffffa82689b0 R12: 0000000000190018
R13: 0000000000190018 R14: ffffcb01d14e6400 R15: 0000000000190019
FS:  00007f10a432c8c0(0000) GS:ffff9b221fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f10a1252000 CR3: 00000006ba9e2002 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 __list_del_entry include/linux/list.h:132 [inline]
 list_del include/linux/list.h:146 [inline]
 remove_full mm/slub.c:1064 [inline]
 __slab_free+0x29f/0x440 mm/slub.c:3055
 do_slab_free mm/slub.c:3134 [inline]
 slab_free mm/slub.c:3147 [inline]
 kmem_cache_free+0x1d0/0x230 mm/slub.c:3162
 skb_free_datagram+0x15/0x60 net/core/datagram.c:325
 unix_dgram_recvmsg+0x417/0x4e0 net/unix/af_unix.c:2181
 ____sys_recvmsg+0x22a/0x250 net/socket.c:885
 ___sys_recvmsg net/socket.c:2627 [inline]
 __sys_recvmsg+0x138/0x2c0 net/socket.c:2663
 do_syscall_64+0x34/0x50 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f10a38bddc7
Code: 89 01 b8 ff ff ff ff eb d8 66 2e 0f 1f 84 00 00 00 00 00 8b 05 0a b6 20 00 85 c0 75 2e 48 63 ff 48 63 d2 b8 2f 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 b1 71 20 00 f7 d8 64 89 02 48
RSP: 002b:00007ffda5d549e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 00007ffda5d54f60 RCX: 00007f10a38bddc7
RDX: 0000000040000040 RSI: 00007ffda5d54a40 RDI: 0000000000000006
RBP: 00007ffda5d54a40 R08: 0000000000000008 R09: 0000000000000070
R10: 000000000000000e R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000006 R14: 0000563c323ac958 R15: 0005b0894d3619db
CPU: 0 PID: 138 Comm: systemd-journal Tainted: G        W         5.9.0-rc7-next-20200930 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xdb/0x10e lib/dump_stack.c:118
 __warn+0xdd/0x1a0 kernel/panic.c:608
 report_bug+0x1bc/0x260 lib/bug.c:198
 handle_bug+0x43/0x80 arch/x86/kernel/traps.c:235
 exc_invalid_op+0x18/0xb0 arch/x86/kernel/traps.c:255
 asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:581
RIP: 0010:__list_del_entry_valid+0x46/0xa0 lib/list_debug.c:45
Code: c2 22 48 39 d1 74 29 48 39 31 75 3c b3 01 48 39 70 08 75 4f 89 d8 5b c3 90 31 db 48 c7 c7 10 7f a2 a9 31 c0 e8 7b f8 ab ff 90 <0f> 0b 90 90 eb e4 90 31 db 48 c7 c7 46 7f a2 a9 31 c0 e8 63 f8 ab
RSP: 0018:ffffa26801413b48 EFLAGS: 00010046
RAX: 38c708c80cfa2900 RBX: 0000000000000000 RCX: ffff9b20bd3c0040
RDX: 0000000000000000 RSI: ffffffffa99fd122 RDI: 00000000ffffffff
RBP: 0000000000000000 R08: 0000000362362d81 R09: 0000000000000002
R10: 000000002d2d2d2d R11: ffffffffa82689b0 R12: 0000000000190018
R13: 0000000000190018 R14: ffffcb01d14e6400 R15: 0000000000190019
 __list_del_entry include/linux/list.h:132 [inline]
 list_del include/linux/list.h:146 [inline]
 remove_full mm/slub.c:1064 [inline]
 __slab_free+0x29f/0x440 mm/slub.c:3055
 do_slab_free mm/slub.c:3134 [inline]
 slab_free mm/slub.c:3147 [inline]
 kmem_cache_free+0x1d0/0x230 mm/slub.c:3162
 skb_free_datagram+0x15/0x60 net/core/datagram.c:325
 unix_dgram_recvmsg+0x417/0x4e0 net/unix/af_unix.c:2181
 ____sys_recvmsg+0x22a/0x250 net/socket.c:885
 ___sys_recvmsg net/socket.c:2627 [inline]
 __sys_recvmsg+0x138/0x2c0 net/socket.c:2663
 do_syscall_64+0x34/0x50 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f10a38bddc7
Code: 89 01 b8 ff ff ff ff eb d8 66 2e 0f 1f 84 00 00 00 00 00 8b 05 0a b6 20 00 85 c0 75 2e 48 63 ff 48 63 d2 b8 2f 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 b1 71 20 00 f7 d8 64 89 02 48
RSP: 002b:00007ffda5d549e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 00007ffda5d54f60 RCX: 00007f10a38bddc7
RDX: 0000000040000040 RSI: 00007ffda5d54a40 RDI: 0000000000000006
RBP: 00007ffda5d54a40 R08: 0000000000000008 R09: 0000000000000070
R10: 000000000000000e R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000006 R14: 0000563c323ac958 R15: 0005b0894d3619db
---[ end trace e7c592efbbc987ba ]---

Config: clang12-corruption.config.txt

nickdesaulniers · 2020-09-30T15:46:13Z

Is this reproducible? Sounds like yes, based on ssh? If you have a ramdisk, we can probably boot test this quickly in QEMU.

Are you able to bisect LLVM commits to pinpoint the first bad commit?

melver · 2020-09-30T15:53:20Z

> Is this reproducible? Sounds like yes, based on ssh? If you have a ramdisk, we can probably boot test this quickly in QEMU.

I'd suggest reproducing the second, newer report: no user interaction is required, and just booting the kernel to user space is sufficient.

> Are you able to bisect LLVM commits to pinpoint the first bad commit?

I'm bisecting, but this will take a while. Maybe I'll have something by tomorrow.

melver · 2020-09-30T22:07:01Z

This is what I get:

4b0aa5724feaa89a9538dcab97e018110b0e4bc3 is the first bad commit
commit 4b0aa5724feaa89a9538dcab97e018110b0e4bc3
Author: James Y Knight <jyknight@google.com>
Date:   Fri May 15 23:43:30 2020 -0400
    
    Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.
    
    Before this instruction supported output values, it fit fairly
    naturally as a terminator. However, being a terminator while also
    supporting outputs causes some trouble, as the physreg->vreg COPY
    operations cannot be in the same block.
    
    Modeling it as a non-terminator allows it to be handled the same way
    as invoke is handled already.
    
    Most of the changes here were created by auditing all the existing
    users of MachineBasicBlock::isEHPad() and
    MachineBasicBlock::hasEHPadSuccessor(), and adding calls to
    isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate.

    Reviewed By: nickdesaulniers, void

    Differential Revision: https://reviews.llvm.org/D79794

$ git bisect log
git bisect start
# bad: [7ab7b979d29e1e43701cf690f5cf1903740f50e3] Bump the trunk major version to 12
git bisect bad 7ab7b979d29e1e43701cf690f5cf1903740f50e3
# good: [ca2dcbd030eadbf0aa9b660efe864ff08af6e18b] [SafeStack,NFC] Make StackColoring read-only
git bisect good ca2dcbd030eadbf0aa9b660efe864ff08af6e18b
# good: [f7a14514ee63dc2ab9558c50254efb8ac2ad7cc6] [darwin][driver] isMacosxVersionLT should check against the minimum supported OS version
git bisect good f7a14514ee63dc2ab9558c50254efb8ac2ad7cc6
# bad: [021d56abb9ee3028cb88895144d71365e566c32f] [SVE] Make Constant::getSplatValue work for scalable vector splats
git bisect bad 021d56abb9ee3028cb88895144d71365e566c32f
# bad: [0059f6ffe84241b9728e48c1eabdaf1a6abbef66] [NewPM] Add -basic-aa to pr33196.ll
git bisect bad 0059f6ffe84241b9728e48c1eabdaf1a6abbef66
# good: [3ee580d0176f69a9f724469660f1d1805e0b6a06] [ARM][LowOverheadLoops] Handle reductions
git bisect good 3ee580d0176f69a9f724469660f1d1805e0b6a06
# bad: [a59dc55c2a11c1b125d1a356a90b0b2bf72b16fb] [InstSimplify] Move assume icmp test (NFC)
git bisect bad a59dc55c2a11c1b125d1a356a90b0b2bf72b16fb
# good: [05a20a9e9aba301a828bcbd72b0ed724755752d1] [RISCV] Temporarily move riscv-expand-pseudo pass to PreEmitPass2
git bisect good 05a20a9e9aba301a828bcbd72b0ed724755752d1
# bad: [f11305780f08969488add6c84439fc91d18692dc] [CodeGen] Fix warnings in DAGCombiner::visitSCALAR_TO_VECTOR
git bisect bad f11305780f08969488add6c84439fc91d18692dc
# good: [6bd1db08e7ccd61996d3867d22ff8eb1979f8621] [InstCombine] Don't let an alignment assume prevent new/delete removals.
git bisect good 6bd1db08e7ccd61996d3867d22ff8eb1979f8621
# bad: [3eacfdc72f1aa3ac53eb300116f194d560053ec7] [BPF] Fix a BTF gen bug related to a pointer struct member
git bisect bad 3eacfdc72f1aa3ac53eb300116f194d560053ec7
# good: [0f6afd946d25a2e83288339934f8fa384e38eea3] [CVP] Use different number in test (NFC)
git bisect good 0f6afd946d25a2e83288339934f8fa384e38eea3
# bad: [4b0aa5724feaa89a9538dcab97e018110b0e4bc3] Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.
git bisect bad 4b0aa5724feaa89a9538dcab97e018110b0e4bc3
# good: [78c69a00a4cff786e0ef13c895d0db309d6b3f42] [NFC] Clean up uses of MachineModuleInfoWrapperPass
git bisect good 78c69a00a4cff786e0ef13c895d0db309d6b3f42
# first bad commit: [4b0aa5724feaa89a9538dcab97e018110b0e4bc3] Change the INLINEASM_BR MachineInstr to be a non-terminating instruction.

Also sadly it seems this found its way into Clang 11:

$ git branch -a --contains 4b0aa5724feaa89a9538dcab97e018110b0e4bc3
  ...
  remotes/origin/master
  remotes/origin/release/11.x
  ...

nickdesaulniers · 2020-09-30T22:25:39Z

ok, next thing is to isolate the configs.
tools/testing/ktest/config-bisect.pl can help bisect configs: run it first with a good and a bad config file, then repeatedly with the same arguments plus good or bad. I assume we don't observe this for defconfig?

nickdesaulniers · 2020-09-30T22:38:29Z

with the above config on mainline, I observe the following warnings from objtool:

  OBJTOOL vmlinux.o
vmlinux.o: warning: objtool: __do_fast_syscall_32()+0x49: call to syscall_enter_from_user_mode_work() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_machine_check()+0x49: call to mce_rdmsrl() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_nmi_enter()+0x11e: call to preempt_schedule_notrace_thunk() leaves .noinstr.text section

but the kernel boots just fine for me. I see now you mention linux-next. I get the same warnings from objtool when testing -next. Boots fine. It looks like your panic is in recvmsg; so maybe your test setup is listening on a socket on boot and mine is not?

There were also two critical fixes to 4b0aa5724feaa89a9538dcab97e018110b0e4bc3 which will complicate a bisection. If you test after that landed but before the fixes, then it will seem bad, when it may be a separate issue already fixed. The fixes were:

60433c63acb71935111304d71e41b7ee982398f8
f7a53d82c0902147909f28a9295a9d00b4b27d38

I also have a pending fix that should be tested:
https://reviews.llvm.org/D88438

Another thing to test: enable assertions in LLVM, then rebuild and see if anything trips. I'll try to isolate those objtool warnings, since they may be our smoking gun.

cc @jyknight @gwelymernans

nickdesaulniers · 2020-09-30T23:17:45Z

(the objtool warnings I observe exist regardless of 4b0aa5724feaa89a9538dcab97e018110b0e4bc3. Forked #1169 to track those).

bwendling · 2020-10-01T06:33:58Z

bug.log

I built an ASAN kernel and ran it through qemu. It had to sit for a bit, but it eventually panicked.

bwendling · 2020-10-01T08:04:18Z

Fairly certain it's in mm/slub.c. When I compile a gcc-built kernel with a clang-built mm/slub.o then I get the segfaults.
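
This kind of object-file bisection can be sketched roughly as below. It is a minimal sketch under stated assumptions, not the procedure actually used in this thread: `find_bad_object` and the `kernel_fails` callback are hypothetical names, and the search assumes exactly one object file is responsible. The callback would link a GCC-built kernel with the given objects swapped for Clang-built ones, boot it, and report whether the failure reproduces.

```python
from typing import Callable, Sequence


def find_bad_object(objects: Sequence[str],
                    kernel_fails: Callable[[Sequence[str]], bool]) -> str:
    """Return the single object file whose Clang build breaks the kernel.

    `kernel_fails(subset)` is a hypothetical callback: it should link a
    GCC-built kernel with `subset` swapped for Clang-built objects, boot
    it, and return True if the failure reproduces.
    Assumes exactly one culprit in `objects`.
    """
    candidates = list(objects)
    while len(candidates) > 1:
        half = candidates[:len(candidates) // 2]
        # If the first half reproduces the failure, the culprit is in it;
        # otherwise it must be in the remaining half.
        if kernel_fails(half):
            candidates = half
        else:
            candidates = candidates[len(half):]
    return candidates[0]


# Stand-in for a real build-and-boot step, for illustration only.
objs = ["mm/slab_common.o", "mm/slub.o", "mm/vmalloc.o", "net/core/skbuff.o"]
culprit = find_bad_object(objs, lambda subset: "mm/slub.o" in subset)
print(culprit)  # mm/slub.o
```

Each step costs one build and boot, so the number of boots grows with the logarithm of the number of object files.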

melver · 2020-10-01T08:25:54Z

> but the kernel boots just fine for me. I see now you mention linux-next. I get the same warnings from objtool when testing -next.

Sorry about the warnings, they might be harmless for what we're trying to debug (I just habitually enable CONFIG_DEBUG_ENTRY these days).

> Boots fine. It looks like your panic is in recvmsg; so maybe your test setup is listening on a socket on boot and mine is not?

I'm using a syzkaller image, but not syzkaller itself.

> There were also two critical fixes to 4b0aa5724feaa89a9538dcab97e018110b0e4bc3 which will complicate a bisection. If you test after that landed but before the fixes, then it will seem bad, when it may be a separate issue already fixed. The fixes were:
>
> 60433c63acb71935111304d71e41b7ee982398f8
> f7a53d82c0902147909f28a9295a9d00b4b27d38
>
> I also have a pending fix that should be tested:
> https://reviews.llvm.org/D88438

Tested that on LLVM master branch and we still crash.

> Another thing to test: enable assertions in LLVM, then rebuild and see if anything trips. I'll try to isolate those objtool warnings, since they may be our smoking gun.

I've built with assertions, and nothing fires.

The objtool warnings might be red herrings, but we shouldn't rule anything out.

I'll try to do some config bisection, but the config I gave is already quite vanilla, just with a bunch of debug options enabled, specifically:

        CONFIG_KCOV=y
        CONFIG_DEBUG_KERNEL=y
        CONFIG_EXPERT=y
        CONFIG_DEBUG_LIST=y
        CONFIG_DEBUG_ENTRY=y
        CONFIG_STACK_VALIDATION=y
        CONFIG_VMLINUX_VALIDATION=y
        CONFIG_DEBUG_VIRTUAL=y

bwendling · 2020-10-01T09:33:52Z

I'm going to be OOO until Saturday. Here are the LLVM IR files of mm/slub.c compiled with the old compiler (slub.orig.no-opt.ll) and the new compiler (slub.new.no-opt.ll). You should be able to run them through opt and llc to verify that the transformations are correct. You might want to try -O0 first, just to see if the failure is super obvious.

Anyway, to reproduce:

1) Use the .config from panic: kernel BUG at net/core/skbuff.c:109! #1125 (comment).
2) Compile with just make -j6.
3) Compile mm/slub.o from the attached .ll files and place it into the mm/ directory.
4) Recompile with make (this should only regenerate vmlinux, bzImage, and related files).
5) Run in qemu. I'm using this command, but there may be extraneous options.

$ qemu-system-x86_64   -drive file=~/rootfs-dirty.img,index=0 \
    -m 20G -smp 4 -net user,hostfwd=tcp::10022-:22 -net nic \
    -nographic   -kernel arch/x86/boot/bzImage \
    -append "console=ttyS0 root=/dev/sda rw debug earlyprintk=serial slub_debug=QUZ" \
    -enable-kvm -cpu host

slub.orig.no-opt.ll.txt

slub.new.no-opt.ll.txt

bwendling · 2020-10-01T09:42:14Z

Okay, one last thing. I did an experiment compiling mm/slub.c at -O1 and there wasn't a failure. Compiling with -O2 resulted in a failure. It should be fairly straightforward now to iterate through the optimization passes to see which one's messing up.
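
One way to iterate through the passes is LLVM's `-opt-bisect-limit=N` option (passed to clang via `-mllvm`), which skips optional passes beyond the first N. A minimal sketch of the binary search follows. The `is_bad` callback is hypothetical: it would rebuild mm/slub.o with the given limit, boot the kernel, and return True if the corruption appears. The sketch also assumes the failure is monotonic in N, which may not hold in practice.

```python
from typing import Callable


def first_bad_limit(upper: int, is_bad: Callable[[int], bool]) -> int:
    """Smallest N in [1, upper] for which is_bad(N) is True.

    Assumes a limit of 0 is good (no optional pass runs), that
    is_bad(upper) is True, and that the result is monotonic in N.
    """
    lo, hi = 0, upper          # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi


def clang_flags(limit: int) -> list:
    """Extra clang flags for the one file under test at a given limit."""
    return ["-O2", "-mllvm", f"-opt-bisect-limit={limit}"]


# Stand-in predicate: pretend pass invocation 137 is the first bad one.
n = first_bad_limit(1000, lambda limit: limit >= 137)
print(n, clang_flags(n))
```

Only the suspect file needs the extra flags; the rest of the kernel can be built normally.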

melver · 2020-10-01T09:44:58Z

> Okay, one last thing. I did an experiment compiling mm/slub.c at -O1 and there wasn't a failure. Compiling with -O2 resulted in a failure. It should be fairly straightforward now to iterate through the optimization passes to see which one's messing up.

This is very good information, thank you! Let's see if I can make something of that.

nickdesaulniers · 2020-10-01T19:47:37Z

> I built an ASAN kernel and ran it through qemu.

Was that with @melver's supplied config? I still haven't been able to reproduce.

> Fairly certain it's in mm/slub.c. When I compile a gcc-built kernel with a clang-built mm/slub.o then I get the segfaults.

Neat, how did you bisect them so quickly? It would be generally useful for us to be able to repeat the process of object file bisection.

> You should be able to run them through opt and llc to verify that the transformations are correct.

$ opt -O2 slub.new.no-opt.ll.txt -o - -S > slub.new.opt.ll
$ grep -rn callbr slub.new.opt.ll | wc -l
74

😒

(that's too many instances to analyze, we need to get more specific about which function is problematic)
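
To get more specific than a file-wide count, the `callbr` instances can be grouped by function. The sketch below is a rough text-based helper, not an LLVM API: `callbr_per_function` is a hypothetical name, and like the grep above it matches the word `callbr` anywhere on a line, so comments or value names containing it would be counted too.

```python
import re
from collections import Counter

# Matches the function name in an LLVM IR "define" line, e.g. "@__slab_free".
_DEFINE_RE = re.compile(r'^define\b.*?@("[^"]+"|[\w.$-]+)\s*\(')


def callbr_per_function(ir_text: str) -> Counter:
    """Count `callbr` occurrences in each function of textual LLVM IR."""
    counts = Counter()
    current = None
    for line in ir_text.splitlines():
        m = _DEFINE_RE.match(line)
        if m:
            current = m.group(1)          # entering a function body
        elif line.startswith("}"):
            current = None                # leaving the function body
        elif current is not None and re.search(r"\bcallbr\b", line):
            counts[current] += 1
    return counts


sample = '''\
define void @__slab_free() {
  callbr void asm sideeffect "", "X"(i8* blockaddress(@__slab_free, %l)) to label %n [label %l]
n:
  ret void
l:
  ret void
}

define void @other() {
  ret void
}
'''
print(callbr_per_function(sample).most_common())  # [('__slab_free', 1)]
```

Running it on slub.new.opt.ll and sorting with `most_common()` would show which functions carry the most `callbr` instructions.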

Based on the suspected change, it should be something going wrong during llc, right?

nickdesaulniers · 2020-10-01T22:22:30Z

@melver sent me a userspace image, I can now repro w/

$ qemu-system-x86_64 -kernel /android0/linux-next/arch/x86/boot/bzImage \
  -append "console=ttyS0 root=/dev/sda debug earlyprintk=serial slub_debug=UZ" \
  -nographic -smp 8 -m 32G  -enable-kvm -cpu host  -device virtio-scsi-pci,id=scsi  \
  -drive discard=unmap,file=debian-stretch.qcow2,if=none,id=hd0 -device scsi-hd,drive=hd0

nickdesaulniers · 2020-10-01T23:11:46Z

The panic I observe with @melver's STR, i.e.

[    5.815818] ------------[ cut here ]------------
[    5.816542] list_del corruption, ffffde299b727308->next is LIST_POISON1 (dead000000000100)

is reproducible at llvm commit c430c21202c377cfb9fce0e7272f7208d1e8a531, ie, before the big asm goto related changes.

melver · 2020-10-02T12:27:10Z

Some more experiments (llvm master branch):

I found that changing inlining also affects whether the corruptions happen. -O1 seems to use only the always-inliner (see clang/lib/CodeGen/BackendUtil.cpp:622), and switching to that inliner, even with -O2, makes the corruptions disappear.

I tried to play with it a bit more, and with this patch (without which the below didn't work, as some other inlining heuristics seem to mess with it?)

diff --git a/llvm/lib/Analysis/InlineCost.cpp b/llvm/lib/Analysis/InlineCost.cpp
index 0a2de5d4ba9b..a130671f3538 100644
--- a/llvm/lib/Analysis/InlineCost.cpp
+++ b/llvm/lib/Analysis/InlineCost.cpp
@@ -2490,6 +2490,8 @@ InlineParams llvm::getInlineParams(int Threshold) {
   else
     Params.DefaultThreshold = Threshold;

+  return Params;
+
   // Set the HintThreshold knob from the -inlinehint-threshold.
   Params.HintThreshold = HintThreshold;

, compiling slub.c with -O2 -mllvm -inline-threshold=0 results in no corruptions;
with -O2 -mllvm -inline-threshold=1 results in corruptions again.

Not sure if that gets us closer, since inlining will also affect later optimizations; it might help minimize the IR diff.

bwendling · 2020-10-04T09:19:31Z

I think it's related to https://reviews.llvm.org/D86260. I'm going to check further.

bwendling · 2020-10-04T10:21:47Z

What I have so far:

I tracked down the issue to instcombine. It's converting some xors in __slab_free(). I don't think the transformation's wrong, at least I haven't seen anything outrageous happening, but I think it's causing an issue during code generation. At first, there was the change I mentioned previously, where code generation's setting the eax register before the asm goto block instead of afterwards. But it fails even with the patch in D86260.

The thing about instcombine is that one change can have far-reaching effects. I added options to limit the number of iterations instcombine performs, but I still need to isolate it further.

bwendling · 2020-10-05T04:11:17Z

Found it! This transformation in mm/slub.c causes the error. It happens during instcombine in get_partial_node().

--- slub.good.ll	2020-10-04 20:40:56.021553985 -0700
+++ s.ll	2020-10-04 21:09:36.820367021 -0700
@@ -11288,16 +11288,17 @@
   %i20.i.i.i = load i32, i32* %i1928.i.i.i, align 8
   %i22.i.i.i = and i32 %i20.i.i.i, 2166016
   %i23.i.i.i = icmp ne i32 %i22.i.i.i, 0
-  br label %kmem_cache_has_cpu_partial.exit
+  %phi.bo.i = xor i1 %i23.i.i.i, true
+  %i116179 = getelementptr inbounds %struct.kmem_cache, %struct.kmem_cache* %arg, i64 0, i32 7
+  br i1 %phi.bo.i, label %bb113, label %bb145

 kmem_cache_has_cpu_partial.exit:                  ; preds = %bb110, %bb17.i.i.i
-  %i.0.i.i.i = phi i1 [ %i23.i.i.i, %bb17.i.i.i ], [ false, %bb110 ]
-  %i3.i174 = xor i1 %i.0.i.i.i, true
   %i116 = getelementptr inbounds %struct.kmem_cache, %struct.kmem_cache* %arg, i64 0, i32 7
-  br i1 %i3.i174, label %bb113, label %bb145
+  br label %bb113

 bb113:                                            ; preds = %kmem_cache_has_cpu_partial.exit
-  %i117 = load i32, i32* %i116, align 4
+  %i116180 = phi i32* [ %i116179, %bb17.i.i.i ], [ %i116, %kmem_cache_has_cpu_partial.exit ]
+  %i117 = load i32, i32* %i116180, align 4
   %i118 = lshr i32 %i117, 1
   %i119 = icmp ugt i32 %i97, %i118
   br i1 %i119, label %bb145, label %bb68
@@ -11308,7 +11309,7 @@
   br label %bb68

 bb145:                                            ; preds = %bb129.i, %bb86.i, %kmem_cache_has_cpu_partial.exit, %bb113, %bb68
-  %i10.3 = phi i8* [ %i10.0, %bb68 ], [ %i10.1, %bb113 ], [ %i10.1, %kmem_cache_has_cpu_partial.exit ], [ %i10.0, %bb86.i ], [ %i10.0, %bb129.i ]
+  %i10.3 = phi i8* [ %i10.0, %bb68 ], [ %i10.1, %bb113 ], [ %i10.1, %bb17.i.i.i ], [ %i10.0, %bb86.i ], [ %i10.0, %bb129.i ]
   call void @_raw_spin_unlock(%struct.raw_spinlock* %i3.i) #24
   br label %bb149

bwendling · 2020-10-05T11:24:55Z

Candidate fix here: https://reviews.llvm.org/D88823. Needs a testcase though.

nickdesaulniers · 2020-10-05T21:09:47Z

Configs

defconfig+SCSI_LOWLEVEL+VIRTIO_PCI+VIRTIO+SCSI_VIRTIO are the minimum set of configs needed to boot the provided userspace image. (Identifying those helps us bisect kernel configs, since those are a minimum).

On top of those, to repro the failure:

Difference between good (+) and bad (-)
-PREEMPT_RCU=y
-DEBUG_PREEMPT=y
-PREEMPT_TRACER=n
-PREEMPTION=y
-UNINLINE_SPIN_UNLOCK=y
 PREEMPT n -> y
 PREEMPT_NONE y -> n
+INLINE_READ_UNLOCK=y
+INLINE_READ_UNLOCK_IRQ=y
+INLINE_WRITE_UNLOCK=y
+INLINE_WRITE_UNLOCK_IRQ=y
+INLINE_SPIN_UNLOCK_IRQ=y

All of those change based on PREEMPT (changing simply that in menuconfig flips all of the above). So the issue seems specific to CONFIG_PREEMPT.

(defconfig+CONFIG_PREEMPT alone isn't enough to cause any panics with my own userspace image, so this is definitely the result of complex interactions between multiple configs).
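
A comparison like the one above can be reproduced with a small script. The kernel tree ships scripts/diffconfig for this; the version below is only an illustrative sketch with hypothetical helper names (`parse_config`, `config_diff`), and it treats a symbol that is absent from a file the same as one that is not set.

```python
def parse_config(text: str) -> dict:
    """Map CONFIG_ symbols to values; '# CONFIG_X is not set' becomes 'n'."""
    opts = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            opts[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
    return opts


def config_diff(good: str, bad: str) -> dict:
    """Return {symbol: (good_value, bad_value)} for symbols that differ.

    A symbol missing from one file is reported as 'n' on that side.
    """
    g, b = parse_config(good), parse_config(bad)
    return {
        name: (g.get(name, "n"), b.get(name, "n"))
        for name in sorted(set(g) | set(b))
        if g.get(name, "n") != b.get(name, "n")
    }


good_cfg = "CONFIG_PREEMPT_NONE=y\n# CONFIG_PREEMPT is not set\n"
bad_cfg = "# CONFIG_PREEMPT_NONE is not set\nCONFIG_PREEMPT=y\nCONFIG_PREEMPT_RCU=y\n"
for name, (g, b) in config_diff(good_cfg, bad_cfg).items():
    print(f"{name}: {g} -> {b}")
```

Listing only the differing symbols keeps the candidate set small before handing the two files to a config bisection.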

Fixes

Patching in https://reviews.llvm.org/D88823 does resolve the issue for me.

Regression

There was a comment earlier from @melver suggesting this was a regression, though I was not able to confirm. I see now the SHA I tested was incorrect.

> is reproducible at llvm commit c430c21202c377cfb9fce0e7272f7208d1e8a531, ie, before the big asm goto related changes.

78c69a00a4cff786e0ef13c895d0db309d6b3f42 is the first commit before the suspected callbr changes.

Indeed, if I sync back to 78c69a00a4cf:

78c69a00a4cf last known good
4b0aa5724fea first known bad

(So the earlier comment about me not being able to confirm the bisection result was an error on my part). I've filed https://llvm.org/pr47735 to block the clang-11 release (cc @zmodem)

Tail duplication of a block with an INLINEASM_BR may result in a PHI node on the indirect branch. This is okay, but it also introduces a copy for that PHI node *after* the INLINEASM_BR, which is not okay. See: ClangBuiltLinux/linux#1125 Differential Revision: https://reviews.llvm.org/D88823

Tail duplication of a block with an INLINEASM_BR may result in a PHI node on the indirect branch. This is okay, but it also introduces a copy for that PHI node *after* the INLINEASM_BR, which is not okay. See: ClangBuiltLinux/linux#1125 Differential Revision: https://reviews.llvm.org/D88823 (cherry picked from commit d2c61d2)

Tail duplication of a block with an INLINEASM_BR may result in a PHI node on the indirect branch. This is okay, but it also introduces a copy for that PHI node *after* the INLINEASM_BR, which is not okay. See: ClangBuiltLinux/linux#1125 Differential Revision: https://reviews.llvm.org/D88823 (cherry picked from commit 36b3bf7)

melver added the labels [ARCH] x86_64 (This bug impacts ARCH=x86_64), [BUG] llvm (main) (A bug in an unreleased version of LLVM; this label is appropriate for regressions), Kernel panic, and [BUG] linux (A bug that should be fixed in the mainline kernel) on Aug 14, 2020

nickdesaulniers added the asm goto label (related to the implementation of asm goto) on Sep 30, 2020

nickdesaulniers mentioned this issue on Sep 30, 2020: call to X() leaves .noinstr.text section #1169 (Closed)

nickdesaulniers added the more info needed label (More information requested to issue author from project members) on Sep 30, 2020

nickdesaulniers assigned melver on Sep 30, 2020

nickdesaulniers assigned nickdesaulniers and bwendling on Oct 1, 2020

bwendling removed their assignment on Oct 6, 2020

nickdesaulniers assigned jyknight and bwendling and unassigned melver, jyknight and nickdesaulniers on Oct 6, 2020

nickdesaulniers added the [PATCH] Accepted label (A submitted patch has been accepted upstream) and removed the [PATCH] Submitted label (A patch has been submitted for review) on Oct 7, 2020

nickdesaulniers added the [FIXED][LLVM] 12 (This bug was fixed in LLVM 12.0) and Needs Backport (Should be backported to either linux-stable tree or latest llvm release branch) labels and removed the [PATCH] Accepted label (A submitted patch has been accepted upstream) on Oct 7, 2020

dileks mentioned this issue on Oct 7, 2020: LLVM 11 release cycle #1136 (Closed)

dileks added the [FIXED][LLVM] 11 label (This bug was fixed in LLVM 11.0) and removed the Needs Backport label (Should be backported to either linux-stable tree or latest llvm release branch) on Oct 7, 2020

nickdesaulniers closed this as completed on Oct 7, 2020
