aarch64: Firecracker does not run on a RPi3 kernel 4.19.58 KVM enabled #1186

Closed
DieterReuter opened this issue Jul 16, 2019 · 10 comments

@DieterReuter

Currently Firecracker doesn't run on the RPi3 with KVM enabled.

Here are the details. When I start Firecracker on arm64 (using the binary from GitHub releases), I get the following error message:

$ sudo ./firecracker-v0.17.0-aarch64
2019-07-15T12:59:02.780924077 [anonymous-instance:ERROR:src/main.rs:53] Firecracker panicked at 'Cannot create VMM: Missing KVM capability: Irqchip', src/libcore/result.rs:1009:5
2019-07-15T12:59:02.979244234 [anonymous-instance:ERROR:src/main.rs:56] stack backtrace:
   0: backtrace::backtrace::trace_unsynchronized::hc634af106af00d20 (0x533dd3)
   1: backtrace::capture::Backtrace::new::ha2fbfd2368c482cf (0x5331a3)
   2: firecracker::main::{{closure}}::hd8d62d1cc7597ec4 (0x40370b)
   3: std::panicking::rust_panic_with_hook::h28b9ce6fa7a5033b (0x53cb2b)
             at src/libstd/panicking.rs:495
   4: std::panicking::continue_panic_fmt::h4c221b9431554bc2 (0x53c9cb)
             at src/libstd/panicking.rs:398
   5: rust_begin_unwind (0x5493d7)
             at src/libstd/panicking.rs:325
   6: core::panicking::panic_fmt::h4d67173bc68f6d5a (0x54b0df)
             at src/libcore/panicking.rs:95
   7: core::result::unwrap_failed::h2119ca699bc4feea (0x4672f3)
2019-07-15T12:59:02.979727098 [anonymous-instance:ERROR:src/main.rs:60] Failed to log metrics while panicking: Logger was not initialized.
Aborted

Firecracker basically complains about Missing KVM capability: Irqchip, but irqchip support is enabled in the kernel configuration. For reference, here are the KVM-related kernel settings:

$ grep KVM 4.19.58-hypriotos-v8.config
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQFD=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_MMIO=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM_VFIO=y
CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL=y
CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
CONFIG_HAVE_KVM_IRQ_BYPASS=y
CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE=y
CONFIG_KVM=y
CONFIG_KVM_ARM_HOST=y
CONFIG_KVM_ARM_PMU=y
CONFIG_KVM_INDIRECT_VECTORS=y

The system in question is a Raspberry Pi 3B running Debian 9 with Linux kernel 4.19.58:

[    0.000000] Linux version 4.19.58-hypriotos-v8 (root@405b3dab1308) (gcc version 7.4.1 20181213 [linaro-7.4-2019.02 revision 56ec6f6b99cc167ff0c2f8e1a2eed33b1edc85d4] (Linaro GCC 7.4-2019.02)) #1 SMP PREEMPT Mon Jul 15 11:13:17 UTC 2019
[    0.000000] Machine model: Raspberry Pi 3 Model B Plus Rev 1.3

GIC is also enabled in the kernel:

CONFIG_ARM_GIC=y
CONFIG_ARM_GIC_MAX_NR=1
CONFIG_ARM_GIC_V3=y
CONFIG_ARM_GIC_V3_ITS=y
andreeaflorescu self-assigned this Jul 16, 2019
andreeaflorescu added Feature: CPU Support: ARM Type: Bug labels Jul 16, 2019
andreeaflorescu changed the title from "aarch64: Firecracker does not run on a RPi3 kernel 4.9.58 KVM enabled" to "aarch64: Firecracker does not run on a RPi3 kernel 4.19.58 KVM enabled" Jul 16, 2019
andreeaflorescu added Support: Failure and removed Type: Bug labels Jul 16, 2019
@andreeaflorescu
Member

The error reported by Firecracker (Cannot create VMM: Missing KVM capability: Irqchip) comes from the capability checks that we run before starting a VM.

The failing KVM capability here is KVM_CAP_IRQCHIP. On aarch64 this capability reports the presence of GIC devices on the host. Removing the capability check results in the following error when issuing the InstanceStart command:

"Cannot configure virtual machine. SetupGIC(CreateGIC(Os { code: 19, kind: Other, message: No such device }))"

In other words, it looks like GIC is not available on the host. I also found some posts mentioning that GIC is not available on the RPi3: https://www.raspberrypi.org/forums/viewtopic.php?f=62&t=156639&p=1057546&hilit=GIC#p1057546

I would try to run QEMU + KVM on the host and see if that works. Based on the discussions in this issue: raspberrypi/linux#1868 I will assume that it doesn't work. Without GIC v3 support on the host there is no way to run Firecracker unfortunately.
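
For anyone who wants to reproduce the capability check outside of Firecracker, here is a minimal sketch (not Firecracker's actual code) that issues KVM_CHECK_EXTENSION for KVM_CAP_IRQCHIP on /dev/kvm. The helper name check_kvm_cap is made up for this example, and the ioctl number is hard-coded under the assumption of an architecture where _IOC_NONE is 0, such as x86 or arm64.

```python
import fcntl
import os

# Values from <linux/kvm.h>. The encoding below assumes _IOC_NONE == 0,
# which holds on x86 and arm64 (assumption: not checked on other arches).
KVMIO = 0xAE
KVM_CHECK_EXTENSION = (KVMIO << 8) | 0x03  # _IO(KVMIO, 0x03)
KVM_CAP_IRQCHIP = 0


def check_kvm_cap(cap, dev="/dev/kvm"):
    """Return the KVM_CHECK_EXTENSION result for `cap`.

    Returns None when `dev` cannot be opened (no KVM, or no permission).
    A result of 0 means the capability is reported as unsupported.
    """
    try:
        fd = os.open(dev, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None
    try:
        # With an integer argument, fcntl.ioctl returns the ioctl's return value.
        return fcntl.ioctl(fd, KVM_CHECK_EXTENSION, cap)
    finally:
        os.close(fd)
```

Calling check_kvm_cap(KVM_CAP_IRQCHIP) as root on the affected host would be expected to return 0 if the diagnosis above is right. Note that this is a runtime query, so it can disagree with the CONFIG_HAVE_KVM_IRQCHIP=y build-time setting.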

@DieterReuter
Author

DieterReuter commented Jul 16, 2019

We already checked and confirmed that QEMU 3.1.0 is able to run a guest VM on this exact system.

root@black-pearl:/home/pirate# qemu_install/bin/qemu-system-aarch64 --version
QEMU emulator version 3.1.0 (v3.1.0-dirty)
Copyright (c) 2003-2018 Fabrice Bellard and the QEMU Project developers
GUEST
=======

# uname -a
Linux onapphost 4.17.0-00001-g010275f #1 SMP PREEMPT Sun Jun 24 10:48:35 EEST 2018 aarch64 GNU/Linux
# cat /proc/cpuinfo 
processor	: 0
BogoMIPS	: 38.40
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd03
CPU revision	: 4

# cat /proc/meminfo 
MemTotal:         682220 kB
MemFree:          399876 kB
MemAvailable:     367160 kB
Buffers:               0 kB
Cached:           267800 kB
SwapCached:            0 kB
Active:            35992 kB
Inactive:         232436 kB
Active(anon):      35992 kB
Inactive(anon):   232436 kB
Active(file):          0 kB
Inactive(file):        0 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:           664 kB
Mapped:             2244 kB
Shmem:            267800 kB
Slab:               8416 kB
SReclaimable:       2148 kB
SUnreclaim:         6268 kB
KernelStack:         684 kB
PageTables:          112 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      341108 kB
Committed_AS:     270424 kB
VmallocTotal:   135290290112 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
CmaTotal:          16384 kB
CmaFree:           16128 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB

@sudeep-holla

In other words, it looks like GIC is not available on the host. I also found some articles that are mentioning GIC not being available on RPi3: https://www.raspberrypi.org/forums/viewtopic.php?f=62&t=156639&p=1057546&hilit=GIC#p1057546

That's RPi3. RPi4 does have GIC but it looks like GICv2 and not GICv3 which firecracker expects.

I would try to run QEMU + KVM on the host and see if that works. Based on the discussions in this issue: raspberrypi/linux#1868 I will assume that it doesn't work. Without GIC v3 support on the host there is no way to run Firecracker unfortunately.

Ah right, you have already mentioned the minimum requirement here. RPi4 seems to have GICv2, if I read https://github.com/raspberrypi/linux/blob/rpi-4.19.y/arch/arm/boot/dts/bcm2838.dtsi correctly; that file is included in a couple of the rpi-4.dts files I see there.

@andreeaflorescu
Member

@DieterReuter Are you running QEMU with --enable-kvm?

@andreeaflorescu
Member

In other words, it looks like GIC is not available on the host. I also found some articles that are mentioning GIC not being available on RPi3: https://www.raspberrypi.org/forums/viewtopic.php?f=62&t=156639&p=1057546&hilit=GIC#p1057546

That's RPi3. RPi4 does have GIC but it looks like GICv2 and not GICv3 which firecracker expects.

@sudeep-holla I think the RPi that @DieterReuter is using doesn't have a GIC at all. If it had GICv2, the capability check would not fail (at least that's what I understand from the kernel code). With GICv2 but not GICv3, you still would not be able to run Firecracker microVMs, but the failure would happen later in the execution path, when you send the InstanceStart command. That's when we call create_device with GIC_v3.

Some code references that got me to the above conclusion:

In kvm_vgic_hyp_init, KVM probes for both GIC_v2 and GIC_v3 devices. If neither is available on the host, the check on KVM_CAP_IRQCHIP reports the capability as unsupported.
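
To tell the two failure stages apart on a given host, one option is to ask KVM whether it could create each vGIC device type, using the KVM_CREATE_DEVICE_TEST flag so that nothing is actually instantiated. The sketch below is only an illustration under the same assumptions as the kernel headers for x86/arm64 (hard-coded ioctl numbers); probe_vgic is a hypothetical helper name, not something Firecracker provides.

```python
import fcntl
import os
import struct

# Values from <linux/kvm.h> (ioctl encodings assume x86/arm64 conventions).
KVMIO = 0xAE
KVM_CREATE_VM = (KVMIO << 8) | 0x01  # _IO(KVMIO, 0x01)
# _IOWR(KVMIO, 0xe0, struct kvm_create_device); the struct is three __u32.
KVM_CREATE_DEVICE = (3 << 30) | (12 << 16) | (KVMIO << 8) | 0xE0
KVM_CREATE_DEVICE_TEST = 1
KVM_DEV_TYPE_ARM_VGIC_V2 = 5
KVM_DEV_TYPE_ARM_VGIC_V3 = 7


def probe_vgic(dev_type, dev="/dev/kvm"):
    """Return True if KVM accepts `dev_type`, False if it rejects it,
    or None if KVM itself is not usable (cannot open `dev` or create a VM)."""
    try:
        kvm_fd = os.open(dev, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None
    try:
        try:
            vm_fd = fcntl.ioctl(kvm_fd, KVM_CREATE_VM, 0)
        except OSError:
            return None
        try:
            # struct kvm_create_device { type, fd (output), flags }
            req = struct.pack("III", dev_type, 0, KVM_CREATE_DEVICE_TEST)
            fcntl.ioctl(vm_fd, KVM_CREATE_DEVICE, req)
            return True
        except OSError:
            # e.g. ENODEV (errno 19), matching the "No such device" error above
            return False
        finally:
            os.close(vm_fd)
    finally:
        os.close(kvm_fd)
```

If the reasoning above holds, a host with only GICv2 would give True for KVM_DEV_TYPE_ARM_VGIC_V2 and False for KVM_DEV_TYPE_ARM_VGIC_V3, while a host with no usable GIC would give False for both.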

@luxas

luxas commented Jul 16, 2019

Good digging @andreeaflorescu and @sudeep-holla 👍

raspberrypi/linux#1868

In this thread they run QEMU with the arguments --enable-kvm -machine virt,kernel-irqchip=off. I didn't fully understand why they disabled the irqchip, but it may be because the RPi 3 has no hardware support for it. I also found this patch, though I'm not entirely sure whether it's related: https://patchwork.kernel.org/patch/6783091/

Also, there was another (possibly relevant) message and error:

qemu-system-aarch64: KVM with user space irqchip only works 
when the host kernel supports KVM_CAP_ARM_TIMER
...
It turned out that one of the SUSE developers is the one who sent 
userland-GIC emulation to QEMU, so I guess he incorporated necessary 
KVM patches in SUSE kernel. When I used his patched QEMU build
(agraf/qemu no-kvm-irqchip branch)

I think the RPi that @DieterReuter is using doesn't have GIC at all

Most likely. Or it has some non-standard proprietary GIC. I couldn't find a definition of a GIC here, so...
https://github.com/raspberrypi/linux/blob/rpi-4.19.y/arch/arm/boot/dts/bcm2837.dtsi
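
Instead of reading the .dtsi sources, the same question can be asked of the running system by walking the live device tree and looking at the compatible strings of nodes marked as interrupt controllers. This is a rough sketch: the function name and the small table of compatible strings are choices made for this example, and the table is deliberately incomplete.

```python
import os

# A few well-known GIC compatible strings (assumption: not an exhaustive list).
GIC_COMPATIBLES = {
    "arm,gic-400": "GICv2",
    "arm,cortex-a15-gic": "GICv2",
    "arm,gic-v3": "GICv3",
}


def find_interrupt_controllers(root="/sys/firmware/devicetree/base"):
    """List (node path, compatible strings, classification) for every
    device tree node that has an 'interrupt-controller' property."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        if "interrupt-controller" not in files or "compatible" not in files:
            continue
        with open(os.path.join(dirpath, "compatible"), "rb") as f:
            # 'compatible' is a list of NUL-terminated strings.
            names = [s.decode("ascii", "replace") for s in f.read().split(b"\0") if s]
        kind = next(
            (GIC_COMPATIBLES[n] for n in names if n in GIC_COMPATIBLES),
            "not a known GIC",
        )
        found.append((os.path.relpath(dirpath, root), names, kind))
    return sorted(found)
```

On a board without a GIC, every entry returned would be classified as "not a known GIC"; an empty list just means the device tree is not exposed at that path.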

RPi4 does have GIC but it looks like GICv2 and not GICv3 which firecracker expects.

That'd be a bummer. Can you explain to someone who doesn't understand it what you use the GIC for, and why you need v3? Is there any way this could be worked around for the RPi (even if we'd do it in an experimental fork first, for the RPi4)?

@andreeaflorescu
Member

andreeaflorescu commented Jul 16, 2019

That'd be a bummer. Can you explain to me that doesn't understand what you use the GIC for, and why you need v3? Is there any way that could be worked around in the case of RPi (even though we'd do it in an experimental fork first, for the RPi4)?

I believe we require GIC_v3 because we added support for armv8 and it looks like armv8 can use either GIC_v3 or GIC_v4. @dianpopa should have more insight into this particular choice.

As for why we need GIC, it is required for implementing our Virtio devices (block and network). As far as I can tell, interrupts are the mechanism used for communication between the CPU and the devices. I tried to find some documentation that better explains how and why Virtio devices implemented with MMIO require interrupts, but I failed miserably.

I haven't really touched that code in Firecracker. @dhrgit recently implemented a virtio device so he might be able to give a better explanation on the Virtio internals.

@acatangiu
Contributor

acatangiu commented Jul 16, 2019

@luxas GIC stands for Generic Interrupt Controller, and it serves the same purpose that IRQCHIPs serve on x86, for example. It is a device that follows the GIC spec [1] and can be programmed/configured to route IRQs from different devices to different CPUs.

GIC is the preferred way of routing interrupts on the ARM architecture and this device is required for supporting most other devices (not just virtio) that require IRQs to properly function.

Is there any way that could be worked around in the case of RPi (even though we'd do it in an experimental fork first, for the RPi4)?

I am not sure. There are alternatives to IRQs when it comes to letting the CPU know about device events. A popular one is polling, where the CPU continuously checks each device for pending events. With the right support in the guest OS and the right emulation, that is definitely possible. Unfortunately, Firecracker's devices do not support that mode of operation; our device emulation follows the standard implementation for each of our devices and thus uses IRQs.
You can't "work around" using GIC in Firecracker on ARM. One possible option to investigate is adding GICv2 support to Firecracker so that RPi4 is enabled. Another option is emulating the GIC in Firecracker instead of using the KVM one which needs support on the host.
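
To make the difference between the two notification styles concrete, here is a toy model in plain Python. It has nothing to do with Firecracker's or KVM's real code; ToyDevice, ToyInterruptController and poll_once are invented names, and the "interrupt" is just a callback.

```python
from collections import deque


class ToyDevice:
    """A device with a queue of completed events and an optional IRQ line."""

    def __init__(self, name, on_event=None):
        self.name = name
        self.pending = deque()
        self.on_event = on_event  # stands in for the device's IRQ line

    def complete(self, event):
        self.pending.append(event)
        if self.on_event is not None:
            self.on_event(self)  # interrupt-driven: notify immediately


class ToyInterruptController:
    """Routes each device's notification to a configured CPU."""

    def __init__(self, routes):
        self.routes = routes  # device name -> CPU index
        self.delivered = []

    def raise_irq(self, dev):
        cpu = self.routes[dev.name]
        self.delivered.append((cpu, dev.name, dev.pending.popleft()))


def poll_once(devices):
    """Polling: visit every device on each pass, whether or not it has work."""
    handled = []
    for dev in devices:
        while dev.pending:
            handled.append((dev.name, dev.pending.popleft()))
    return handled


# Interrupt-driven: events reach the routed CPU as soon as they complete.
gic = ToyInterruptController({"net": 0, "block": 1})
net = ToyDevice("net", gic.raise_irq)
block = ToyDevice("block", gic.raise_irq)
net.complete("rx")
block.complete("io-done")

# Polling: nothing is noticed until the next pass over all devices.
polled = [ToyDevice("net"), ToyDevice("block")]
polled[1].complete("io-done")
handled = poll_once(polled)
```

In the interrupt-driven half, the controller decides which CPU handles each event; in the polling half, the CPU pays for a pass over every device even when most of them are idle.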

[1] https://static.docs.arm.com/ihi0069/d/IHI0069D_gic_architecture_specification.pdf

@andreeaflorescu
Member

@DieterReuter As per our Slack conversation, I am closing this issue, as the root cause is that GIC v3 is not available on the RPi you are running Firecracker on. If you would like to continue the conversation about adding GIC emulation to Firecracker, please open another issue.

@sudeep-holla

That'd be a bummer. Can you explain to me that doesn't understand what you use the GIC for, and why you need v3? Is there any way that could be worked around in the case of RPi (even though we'd do it in an experimental fork first, for the RPi4)?

I believe we require GIC_v3 because we added support for armv8 and it looks like armv8 can use either GIC_v3 or GIC_v4. @dianpopa should have more insight into this particular choice.

That's not true. There is hardware that is ARMv8 with GICv2. I thought the choice of GICv3 as the minimum was made in Firecracker to keep things simple. As you can see in the spec you pointed to, the changes between versions are huge, which I assume is the reason for the minimum requirement.
