Kernel panic when launching container with FAN network on 24.04 #14025

Open
masnax opened this issue Aug 30, 2024 · 5 comments
Labels: Bug (Confirmed to be a bug)

Comments

@masnax
Contributor

masnax commented Aug 30, 2024

Reproducer:

# Launch an Ubuntu 24.04 VM to install LXD in (also reproducible with ubuntu-minimal-daily):
lxc launch ubuntu:24.04 v1 -c limits.cpu=4 -c limits.memory=4GiB --vm

# Proceed in the VM for the rest:
lxc exec v1 -- bash

# Install LXD from channel latest/stable
snap install lxd --channel latest/stable

# Configure LXD
lxd init --auto

# Create a FAN network
lxc network create lxdfan0 bridge.mode=fan ipv4.nat=true

# Launch a container using the FAN network
lxc launch ubuntu-minimal:22.04 c1 --network lxdfan0

# Try to interact with LXD
lxc ls
internal error, please report: running "lxd.lxc" failed: transient scope not created in 10s

An excerpt from journalctl is below:

Aug 30 21:51:57 v1 kernel: ------------[ cut here ]------------
Aug 30 21:51:57 v1 kernel: Voluntary context switch within RCU read-side critical section!
Aug 30 21:51:57 v1 kernel: WARNING: CPU: 1 PID: 2669 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: Modules linked in: veth vxlan ip6_udp_tunnel udp_tunnel dummy nft_masq nft_chain_nat bridge stp llc zfs(PO) spl(O) nvme_fabrics nvme_core nvme_auth ebtable_filter ebtables ip6table_raw ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_raw iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter nf_tables libcrc32c vhost_vsock vhost vhost_iotlb binfmt_misc kvm_amd ccp kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 nls_iso8859_1 joydev aesni_intel crypto_simd cryptd virtio_gpu 9pnet_virtio virtio_dma_buf xhci_pci psmouse ahci 9pnet virtiofs libahci vmw_vsock_virtio_transport xhci_pci_renesas vmw_vsock_virtio_transport_common vsock virtio_input input_leds serio_raw efi_pstore nfnetlink dmi_sysfs virtio_rng ip_tables x_tables autofs4
Aug 30 21:51:57 v1 kernel: CPU: 1 PID: 2669 Comm: systemd-resolve Tainted: P           O       6.8.0-41-generic #41-Ubuntu
Aug 30 21:51:57 v1 kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS unknown 2/2/2022
Aug 30 21:51:57 v1 kernel: RIP: 0010:rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: Code: fe ff ff ba 02 00 00 00 be 01 00 00 00 e8 fa d0 fe ff e9 6b fe ff ff 48 c7 c7 60 7d a6 a8 c6 05 ab 99 61 02 01 e8 d2 0d f2 ff <0f> 0b e9 96 fd ff ff 0f 0b e9 36 ff ff ff 0f 0b e9 18 ff ff ff 66
Aug 30 21:51:57 v1 kernel: RSP: 0018:ffffb611812bbd80 EFLAGS: 00010046
Aug 30 21:51:57 v1 kernel: RAX: 0000000000000000 RBX: ffff9613faeb5a00 RCX: 0000000000000000
Aug 30 21:51:57 v1 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Aug 30 21:51:57 v1 kernel: RBP: ffffb611812bbda0 R08: 0000000000000000 R09: 0000000000000000
Aug 30 21:51:57 v1 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Aug 30 21:51:57 v1 kernel: R13: ffff9613b89dd200 R14: 0000000000000000 R15: 0000000000000000
Aug 30 21:51:57 v1 kernel: FS:  00007ec3a402c5c0(0000) GS:ffff9613fae80000(0000) knlGS:0000000000000000
Aug 30 21:51:57 v1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 30 21:51:57 v1 kernel: CR2: 000062592dc892b8 CR3: 000000013890a000 CR4: 00000000007506f0
Aug 30 21:51:57 v1 kernel: PKRU: 55555554
Aug 30 21:51:57 v1 kernel: Call Trace:
Aug 30 21:51:57 v1 kernel:  <TASK>
Aug 30 21:51:57 v1 kernel:  ? show_regs+0x6d/0x80
Aug 30 21:51:57 v1 kernel:  ? __warn+0x89/0x160
Aug 30 21:51:57 v1 kernel:  ? rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel:  ? report_bug+0x17e/0x1b0
Aug 30 21:51:57 v1 kernel:  ? handle_bug+0x51/0xa0
Aug 30 21:51:57 v1 kernel:  ? exc_invalid_op+0x18/0x80
Aug 30 21:51:57 v1 kernel:  ? asm_exc_invalid_op+0x1b/0x20
Aug 30 21:51:57 v1 kernel:  ? rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel:  __schedule+0x81/0x6b0
Aug 30 21:51:57 v1 kernel:  schedule+0x33/0x110
Aug 30 21:51:57 v1 kernel:  syscall_exit_to_user_mode+0x22d/0x260
Aug 30 21:51:57 v1 kernel:  do_syscall_64+0x8c/0x180
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? syscall_exit_to_user_mode+0x89/0x260
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? do_syscall_64+0x8c/0x180
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? irqentry_exit_to_user_mode+0x7e/0x260
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? irqentry_exit+0x43/0x50
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? exc_page_fault+0x94/0x1b0
Aug 30 21:51:57 v1 kernel:  entry_SYSCALL_64_after_hwframe+0x78/0x80
Aug 30 21:51:57 v1 kernel: RIP: 0033:0x7ec3a3f14887
Aug 30 21:51:57 v1 kernel: Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Aug 30 21:51:57 v1 kernel: RSP: 002b:00007ffcbb32de08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Aug 30 21:51:57 v1 kernel: RAX: 000000000000002d RBX: 000062592dc882b0 RCX: 00007ec3a3f14887
Aug 30 21:51:57 v1 kernel: RDX: 000000000000002d RSI: 000062592dc88360 RDI: 0000000000000011
Aug 30 21:51:57 v1 kernel: RBP: 000062592dc7e690 R08: 00007ffcbb32dde4 R09: 0000000000000000
Aug 30 21:51:57 v1 kernel: R10: 00000000000005aa R11: 0000000000000246 R12: 0000000000000011
Aug 30 21:51:57 v1 kernel: R13: 0000000000000002 R14: 000000000000002d R15: 000062592dc88360
Aug 30 21:51:57 v1 kernel:  </TASK>
Aug 30 21:51:57 v1 kernel: ---[ end trace 0000000000000000 ]---
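
To check whether another machine has hit the same warning, the kernel log can be searched for the message quoted in the excerpt above. This is only a minimal sketch, assuming a systemd host where journalctl -k is available; the helper name has_rcu_warning is made up for illustration and is not part of LXD or any other tool.

```shell
#!/bin/sh
# has_rcu_warning: hypothetical helper. Reads kernel log text on stdin and
# succeeds (exit 0) if it contains the warning quoted in this issue.
has_rcu_warning() {
    grep -qF 'Voluntary context switch within RCU read-side critical section!'
}

# Usage (needs permission to read the kernel journal):
#   journalctl -k --no-pager | has_rcu_warning && echo "RCU warning found"
```

A match only shows that the same warning was logged; it does not by itself confirm the same root cause.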

@ivanlawrence

ivanlawrence commented Sep 1, 2024

I encountered the same error with all defaults in my cluster (using FAN). I thought it was a host OS issue (long story), so I rebuilt the host systems and got the same error (internal error, please report: running "lxd.lxc" failed: transient scope not created in 10s) when attempting to run lxc ls on hostA after creating a container on hostB.
Both hosts run the same version:

ivan@i5:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04 LTS
Release:        24.04
Codename:       noble
ivan@i5:~$ uname -a
Linux i5 6.8.0-41-generic #41-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug  2 20:41:06 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
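
Both reports in this thread show kernel release 6.8.0-41-generic. A quick way to see whether a host runs that exact release is sketched below. It assumes nothing about which other releases are affected or fixed, since the thread does not say; the names REPORTED_KERNEL and matches_reported_kernel are made up for illustration.

```shell
#!/bin/sh
# REPORTED_KERNEL is the release shown in the logs in this thread. Other
# releases may or may not be affected; this thread does not say.
REPORTED_KERNEL='6.8.0-41-generic'

# matches_reported_kernel: hypothetical helper. Succeeds if the given release
# string (default: the running kernel, from uname -r) equals the one above.
matches_reported_kernel() {
    release="${1:-$(uname -r)}"
    [ "$release" = "$REPORTED_KERNEL" ]
}

if matches_reported_kernel; then
    echo "Running $REPORTED_KERNEL, the kernel release reported in this issue"
else
    echo "Running $(uname -r), not the release reported in this issue"
fi
```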

@tomponline
Member

This is likely a kernel bug.

@simondeziel @mihalicyn have you seen any bugs about this on Ubuntu?

tomponline added a commit to canonical/microcloud that referenced this issue Sep 1, 2024
@tomponline
Member

tomponline commented Sep 2, 2024

We've seen similar reports previously:

#12161

And a kernel bug was opened here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2064176

@mihalicyn mihalicyn self-assigned this Sep 3, 2024
@tomponline tomponline added the Bug Confirmed to be a bug label Sep 3, 2024
@mihalicyn
Member

Fix: https://lists.ubuntu.com/archives/kernel-team/2024-September/153511.html

@tomponline
Member

@mihalicyn can we close this now, as we have a patch for the fix in the Ubuntu kernel?
