Unable to boot Ubuntu 22.04 LTS (cloud image) on arm64 #298

Closed
edigaryev opened this issue Dec 4, 2023 · 5 comments · Fixed by #355

edigaryev commented Dec 4, 2023

How to reproduce

#!/bin/bash

set -euo pipefail

QCOW2_NAME="jammy-server-cloudimg-arm64.img"

if [ ! -e "${QCOW2_NAME}" ]
then
	wget -O "${QCOW2_NAME}" https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-arm64.img
fi

RAW_NAME="jammy-server-cloudimg-arm64.raw"

if [ ! -e "${RAW_NAME}" ]
then
	qemu-img convert -p -f qcow2 -O raw "${QCOW2_NAME}" "${RAW_NAME}"
fi

if [ ! -d "rust-hypervisor-firmware" ]
then
	git clone https://github.com/cloud-hypervisor/rust-hypervisor-firmware.git
fi

FIRMWARE_PATH="rust-hypervisor-firmware/target/aarch64-unknown-none/release/hypervisor-fw"

if [ ! -e "${FIRMWARE_PATH}" ]
then
	pushd rust-hypervisor-firmware
	cargo build --release --target aarch64-unknown-none.json -Zbuild-std=core,alloc -Zbuild-std-features=compiler-builtins-mem
	popd
fi

cloud-hypervisor --serial tty --console pty --kernel "${FIRMWARE_PATH}" --disk path="${RAW_NAME}"

Expected output

The VM boots and shows the ubuntu login: prompt.

Actual output

$ ./run.sh
[...]
cloud-hypervisor: 8.063405ms: <vmm> WARN:arch/src/aarch64/fdt.rs:114 -- File: /sys/devices/system/cpu/cpu0/cache/index3/size does not exist.
cloud-hypervisor: 8.138801ms: <vmm> WARN:arch/src/aarch64/fdt.rs:148 -- File: /sys/devices/system/cpu/cpu0/cache/index3/coherency_line_size does not exist.
cloud-hypervisor: 8.187810ms: <vmm> WARN:arch/src/aarch64/fdt.rs:171 -- File: /sys/devices/system/cpu/cpu0/cache/index3/number_of_sets does not exist.
cloud-hypervisor: 8.266878ms: <vmm> WARN:arch/src/aarch64/fdt.rs:428 -- L2 cache shared with other cpus

Booting with FDT
Found PCI device vendor=8086 device=d57 in slot=0
Found PCI device vendor=1af4 device=1043 in slot=1
Found PCI device vendor=1af4 device=1042 in slot=2
Found PCI device vendor=1af4 device=1044 in slot=3
PCI Device: 0:2.0 1af4:1042
Bar: type=MemorySpace32 address=0x2ff80000 size=0x80000
Bar: type=MemorySpace32 address=0x0 size=0x0
Bar: type=MemorySpace32 address=0x0 size=0x0
Bar: type=MemorySpace32 address=0x0 size=0x0
Bar: type=MemorySpace32 address=0x0 size=0x0
Bar: type=MemorySpace32 address=0x0 size=0x0
Updated BARs: type=MemorySpace32 address=2ff80000 size=80000
Updated BARs: type=MemorySpace32 address=0 size=0
Updated BARs: type=MemorySpace32 address=0 size=0
Updated BARs: type=MemorySpace32 address=0 size=0
Updated BARs: type=MemorySpace32 address=0 size=0
Updated BARs: type=MemorySpace32 address=0 size=0
Virtio block device configured. Capacity: 4612096 sectors
Found EFI partition
Filesystem ready
Error loading default entry: File(NotFound)
Using EFI boot.
Found bootloader: \EFI\BOOT\BOOTAA64.EFI
Executable loaded
Failed to set MokListRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListRT: Unsupported
Failed to set MokListXRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListXRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListTrustedRT: Unsupported
Something has gone seriously wrong: import_mok_state() failed: Unsupported
TPM logging failed: Unsupported

The VM hangs, and the Cloud Hypervisor process can be observed using 100% CPU.

Versions tested

Rust Hypervisor Firmware built from main, and:

$ cloud-hypervisor --version
cloud-hypervisor v36.0.0

Hardware used

a1.metal AWS EC2 instance running Debian 12 (arm64).

Notes

EDK2 works just fine.

Related: #198.

@edigaryev edigaryev changed the title Unable to boot Ubuntu 22.04 LTS (cloud image): VM hangs Unable to boot Ubuntu 22.04 LTS (cloud image) on arm64 Dec 4, 2023
edigaryev added a commit to cirruslabs/vetu that referenced this issue Dec 5, 2023
* firmware: fetch EDK2 firmware instead of Rust Hypervisor Firmware

To work around cloud-hypervisor/rust-hypervisor-firmware#298.

* fetch(): validate HTTP response code

* Single Fetch() method that accepts a function + export FetchURL method

* Move FetchURL() and FetchURLToFile() to binaryfetcher package

* binaryfetcher.Fetch() → binaryfetcher.GetOrFetch()

* fetch.go → fetchurl.go
@retrage retrage self-assigned this Dec 9, 2023
retrage commented Dec 9, 2023

I tried to reproduce it on Neoverse-N1 with the current (20231207) Ubuntu 22.04 Jammy cloud image (sha256sum: d74dc6f9bc92da4dff973bab1b6dab411c7b6a5219fcdbec25413832cb4b23ba), but could not. Here is part of the log:

Found bootloader: \EFI\BOOT\BOOTAA64.EFI                                        
Executable loaded                                                               
Failed to set MokListRT: Unsupported                                            
TPM logging failed: Unsupported                                                 
Could not create MokListRT: Unsupported                                         
Failed to set MokListXRT: Unsupported                                           
TPM logging failed: Unsupported                                                 
Could not create MokListXRT: Unsupported                                        
TPM logging failed: Unsupported                                                 
Could not create MokListTrustedRT: Unsupported                                  
Something has gone seriously wrong: import_mok_state() failed: Unsupported      
TPM logging failed: Unsupported                                                 
Could not create variable: Unsupported                                          
Failed to set MokListRT: Unsupported                                            
TPM logging failed: Unsupported                                                 
Could not create MokListRT: Unsupported                                         
Failed to set MokListXRT: Unsupported                                           
TPM logging failed: Unsupported                                                 
Could not create MokListXRT: Unsupported                                        
TPM logging failed: Unsupported                                                 
Could not create MokListTrustedRT: Unsupported                                  
Something has gone seriously wrong: import_mok_state() failed: Unsupported      
TPM logging failed: Unsupported                                                 
EFI stub: Booting Linux Kernel...                                               
EFI stub: EFI_RNG_PROTOCOL unavailable                                          
EFI stub: Using DTB from configuration table                                    
EFI stub: Exiting boot services...                                              
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x413fd0c1]          
[    0.000000] Linux version 5.15.0-89-generic (buildd@bos02-arm64-007) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #99-Ubuntu SMP Mon Oct 30 23:43:36 UTC 2023 (Ubuntu 5.15.0-89.99-generic 5.15.126)

I also tried the Ubuntu 22.04 image 20231201 (sha256sum: ea069246bbd12557ee13cd17f4f8be55a3885a7186c98a3afc677babde136d1f) with exactly the same command-line arguments, but still could not reproduce it.

thomasbarrett commented Aug 20, 2024

@retrage I seem to be running into this as well, using the Ubuntu 22.04 cloud image and cloud-hypervisor main on an r8g.metal-24xl instance. I only ran into this issue when starting virtual machines with 128GiB+ of RAM. Any idea what could be causing this?

retrage commented Aug 21, 2024

I have no idea why we cannot reproduce the issue. We can close this issue when your PR #346 is merged.

acarp-crusoe added a commit to acarp-crusoe/rust-hypervisor-firmware that referenced this issue Nov 27, 2024
Bumping the possible page table range from 128G to 2TB
to support larger systems.

Fixes cloud-hypervisor#298

Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>
@acarp-crusoe

I ran into this while attempting to create VMs with a DRAM region larger than 126G. I took some time to investigate and found that the cause is a fixed 128G limit on the page table for aarch64. In the aarch64 memory layout the page table is hardcoded to cover a maximum address space of 128G, so going beyond that exceeds what the page table can map.

pub mod map {
    // Create page table for 128G is enough
    pub const END: usize = 0x20_0000_0000;

Once the space reserved for the kernel and for MMIO is included, the 128G allocated leaves about 126G for total system memory. I've put out a PR to bump this limit to support the systems that we're developing with (2TB).

#355
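
For anyone checking the numbers, here is a minimal sketch of the arithmetic behind the limit. It is illustrative only and is not code from rust-hypervisor-firmware. `OLD_END` is the constant quoted above; `NEW_END`, `RESERVED`, and `fits_below` are hypothetical names introduced for this example. `NEW_END` assumes "2TB" means 2 * 1024 GiB, and `RESERVED` is simply the difference between the 128G limit and the "about 126G" figure, not a value taken from the memory layout.

```rust
// Illustrative arithmetic only; not code from rust-hypervisor-firmware.
const GIB: u64 = 1 << 30;

// Value quoted above from the aarch64 memory layout (2^37 bytes).
const OLD_END: u64 = 0x20_0000_0000;

// Hypothetical new limit, assuming "2TB" means 2 * 1024 GiB.
const NEW_END: u64 = 2 * 1024 * GIB;

// Assumed space not available to guest RAM (kernel, MMIO): the gap between
// the 128G limit and the "about 126G" figure mentioned in the comment.
const RESERVED: u64 = 2 * GIB;

/// Returns true if a region of `size` bytes starting at `start`
/// ends at or below `end` (overflow counts as "does not fit").
fn fits_below(start: u64, size: u64, end: u64) -> bool {
    match start.checked_add(size) {
        Some(top) => top <= end,
        None => false,
    }
}

fn main() {
    println!("old END = {} GiB", OLD_END / GIB);
    println!("new END = {} GiB", NEW_END / GIB);

    for ram_gib in [64u64, 126, 128, 1024] {
        let size = ram_gib * GIB;
        println!(
            "{:>4} GiB RAM: fits under old END = {}, under new END = {}",
            ram_gib,
            fits_below(RESERVED, size, OLD_END),
            fits_below(RESERVED, size, NEW_END),
        );
    }
}
```

Under these assumptions, 126 GiB of RAM is the largest size that still ends at or below the old limit, which matches the threshold where the hang starts to appear.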

github-merge-queue bot pushed a commit that referenced this issue Nov 27, 2024
Bumping the possible page table range from 128G to 2TB
to support larger systems.

Fixes #298

Signed-off-by: Andrew Carp <acarp@crusoeenergy.com>