|  | 1 | +.. SPDX-License-Identifier: GPL-2.0 | 
|  | 2 | + | 
|  | 3 | +============== | 
|  | 4 | +Nitro Enclaves | 
|  | 5 | +============== | 
|  | 6 | + | 
|  | 7 | +Overview | 
|  | 8 | +======== | 
|  | 9 | + | 
|  | 10 | +Nitro Enclaves (NE) is a new Amazon Elastic Compute Cloud (EC2) capability | 
|  | 11 | +that allows customers to carve out isolated compute environments within EC2 | 
|  | 12 | +instances [1]. | 
|  | 13 | + | 
|  | 14 | +For example, an application that processes sensitive data and runs in a VM | 
|  | 15 | +can be separated from other applications running in the same VM. This | 
|  | 16 | +application then runs in a VM separate from the primary VM, namely an enclave. | 
|  | 17 | + | 
|  | 18 | +An enclave runs alongside the VM that spawned it. This setup suits the needs | 
|  | 19 | +of low-latency applications. The resources allocated for the enclave, such as | 
|  | 20 | +memory and CPUs, are carved out of the primary VM. Each enclave is mapped to a | 
|  | 21 | +process running in the primary VM that communicates with the NE driver via an | 
|  | 22 | +ioctl interface. | 
|  | 23 | + | 
|  | 24 | +In this sense, there are two components: | 
|  | 25 | + | 
|  | 26 | +1. An enclave abstraction process - a user space process running in the primary | 
|  | 27 | +VM guest that uses the provided ioctl interface of the NE driver to spawn an | 
|  | 28 | +enclave VM (that's 2 below). | 
|  | 29 | + | 
|  | 30 | +An NE emulated PCI device is exposed to the primary VM. The driver for this | 
|  | 31 | +new PCI device is included in the NE driver. | 
|  | 32 | + | 
|  | 33 | +The ioctl logic is mapped to PCI device commands, e.g. the NE_START_ENCLAVE | 
|  | 34 | +ioctl maps to an enclave start PCI command. The PCI device commands are then | 
|  | 35 | +translated into actions taken on the hypervisor side, that is, by the Nitro | 
|  | 36 | +hypervisor running on the host where the primary VM is running. The Nitro | 
|  | 37 | +hypervisor is based on core KVM technology. | 
|  | 38 | + | 
|  | 39 | +2. The enclave itself - a VM running on the same host as the primary VM that | 
|  | 40 | +spawned it. Memory and CPUs are carved out of the primary VM and are dedicated | 
|  | 41 | +to the enclave VM. An enclave does not have persistent storage attached. | 
|  | 42 | + | 
|  | 43 | +The memory regions carved out of the primary VM and given to an enclave need to | 
|  | 44 | +be aligned, physically contiguous 2 MiB / 1 GiB memory regions (or a multiple of | 
|  | 45 | +this size, e.g. 8 MiB). The memory can be allocated e.g. by using hugetlbfs from | 
|  | 46 | +user space [2][3]. The memory size for an enclave needs to be at least 64 MiB. | 
|  | 47 | +The enclave memory and CPUs need to be from the same NUMA node. | 
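
The memory constraints above can be sketched as a small user space check. This is an illustrative sketch only, not the NE driver's actual validation: the function names are hypothetical, and it assumes that checking against 2 MiB is sufficient because any 1 GiB aligned region is also 2 MiB aligned. The NUMA node requirement is not modelled.

```python
MIB = 1024 * 1024
HUGE_PAGE_SIZE = 2 * MIB           # smallest region granularity named in the text
MIN_ENCLAVE_MEMORY = 64 * MIB      # minimum total enclave memory named in the text


def region_is_valid(phys_addr, size):
    """Return True if a region starts on a 2 MiB boundary and its size
    is a non-zero multiple of 2 MiB (hypothetical helper)."""
    if size <= 0:
        return False
    return phys_addr % HUGE_PAGE_SIZE == 0 and size % HUGE_PAGE_SIZE == 0


def enclave_memory_is_sufficient(region_sizes):
    """Return True if the regions add up to at least 64 MiB (hypothetical helper)."""
    return sum(region_sizes) >= MIN_ENCLAVE_MEMORY
```

For instance, an 8 MiB region starting at physical address 0x200000 passes the check, while a 3 MiB region or a region starting at a 4 KiB offset does not.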
|  | 48 | + | 
|  | 49 | +An enclave runs on dedicated cores. CPU 0 and its CPU siblings need to remain | 
|  | 50 | +available for the primary VM. A CPU pool has to be set for NE purposes by a | 
|  | 51 | +user with admin capability. See the cpu list section of the kernel | 
|  | 52 | +documentation [4] for the format of a CPU pool. | 
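
To make the CPU pool rule concrete, here is a minimal sketch that parses a cpu list string and rejects pools containing CPU 0 or its siblings. It assumes only the simple "N" and "N-M" forms of the cpu list format (the kernel format in [4] has further variants that are not handled here). The helper names are hypothetical, and the set of CPU 0 siblings is passed in by the caller rather than read from the system topology.

```python
def parse_cpu_list(text):
    """Parse a cpu list such as "1-3,5" into a set of CPU ids.
    Only the "N" and "N-M" forms are handled in this sketch."""
    cpus = set()
    for chunk in text.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        first, sep, last = chunk.partition("-")
        low = int(first)
        high = int(last) if sep else low
        if high < low:
            raise ValueError("invalid CPU range: " + chunk)
        cpus.update(range(low, high + 1))
    return cpus


def pool_is_acceptable(pool, cpu0_siblings=()):
    """Return True if the pool is non-empty and leaves CPU 0 and its
    siblings to the primary VM (hypothetical helper)."""
    reserved = {0} | set(cpu0_siblings)
    return bool(pool) and not (set(pool) & reserved)
```

With CPU 1 as the sibling of CPU 0, the pool "2,3" would be acceptable under this check, while "0-3" or "1,3" would not.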
|  | 53 | + | 
|  | 54 | +An enclave communicates with the primary VM via a local communication channel, | 
|  | 55 | +using virtio-vsock [5]. The primary VM has a virtio-pci vsock emulated device, | 
|  | 56 | +while the enclave VM has a virtio-mmio vsock emulated device. The vsock device | 
|  | 57 | +uses eventfd for signaling. The enclave VM sees the usual interfaces - local | 
|  | 58 | +APIC and IOAPIC - to get interrupts from the virtio-vsock device. The virtio-mmio | 
|  | 59 | +device is placed in memory below the typical 4 GiB. | 
|  | 60 | + | 
|  | 61 | +The application that runs in the enclave needs to be packaged in an enclave | 
|  | 62 | +image together with the OS (e.g. kernel, ramdisk, init) that will run in the | 
|  | 63 | +enclave VM. The enclave VM has its own kernel and follows the standard Linux | 
|  | 64 | +boot protocol [6]. | 
|  | 65 | + | 
|  | 66 | +The kernel bzImage, the kernel command line and the ramdisk(s) are part of the | 
|  | 67 | +Enclave Image Format (EIF), along with an EIF header that includes metadata such | 
|  | 68 | +as the magic number, EIF version, image size and CRC. | 
|  | 69 | + | 
|  | 70 | +Hash values are computed for the entire enclave image (EIF), the kernel and | 
|  | 71 | +ramdisk(s). These are used, for example, to check that the enclave image that is | 
|  | 72 | +loaded in the enclave VM is the one that was intended to be run. | 
|  | 73 | + | 
|  | 74 | +These crypto measurements are included in a signed attestation document | 
|  | 75 | +generated by the Nitro Hypervisor and further used to prove the identity of the | 
|  | 76 | +enclave; KMS is an example of a service that NE is integrated with and that | 
|  | 77 | +checks the attestation doc. | 
|  | 78 | + | 
|  | 79 | +The enclave image (EIF) is loaded in the enclave memory at offset 8 MiB. The | 
|  | 80 | +init process in the enclave connects to the vsock CID of the primary VM and a | 
|  | 81 | +predefined port - 9000 - to send a heartbeat value - 0xb7. This mechanism is | 
|  | 82 | +used to check in the primary VM that the enclave has booted. The CID of the | 
|  | 83 | +primary VM is 3. | 
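
The heartbeat check on the primary VM side could look roughly like the sketch below. It is not the actual NE tooling: the function names and the timeout are assumptions, and it only covers receiving the single heartbeat byte on port 9000. It relies on the AF_VSOCK support in Python's standard socket module, which is available on Linux.

```python
import socket

HEARTBEAT_PORT = 9000     # predefined port named in the text
HEARTBEAT_VALUE = 0xB7    # heartbeat value named in the text


def read_heartbeat(conn):
    """Read one byte from a connected socket and check it against the
    expected heartbeat value (hypothetical helper)."""
    data = conn.recv(1)
    return len(data) == 1 and data[0] == HEARTBEAT_VALUE


def wait_for_heartbeat(timeout_s=30.0):
    """Listen on the vsock heartbeat port in the primary VM and wait for
    the enclave's init process to connect. The timeout is an assumption;
    a timeout error is raised if nothing arrives in time."""
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as listener:
        listener.settimeout(timeout_s)
        listener.bind((socket.VMADDR_CID_ANY, HEARTBEAT_PORT))
        listener.listen(1)
        conn, _peer = listener.accept()
        with conn:
            conn.settimeout(timeout_s)
            return read_heartbeat(conn)
```

Keeping `read_heartbeat` separate from the vsock setup makes the byte check easy to exercise with any connected stream socket.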
|  | 84 | + | 
|  | 85 | +If the enclave VM crashes or gracefully exits, an interrupt event is received by | 
|  | 86 | +the NE driver. This event is sent further to the user space enclave process | 
|  | 87 | +running in the primary VM via a poll notification mechanism. Then the user space | 
|  | 88 | +enclave process can exit. | 
|  | 89 | + | 
|  | 90 | +[1] https://aws.amazon.com/ec2/nitro/nitro-enclaves/ | 
|  | 91 | +[2] https://www.kernel.org/doc/html/latest/admin-guide/mm/hugetlbpage.html | 
|  | 92 | +[3] https://lwn.net/Articles/807108/ | 
|  | 93 | +[4] https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html | 
|  | 94 | +[5] https://man7.org/linux/man-pages/man7/vsock.7.html | 
|  | 95 | +[6] https://www.kernel.org/doc/html/latest/x86/boot.html | 