[Pal/Linux-SGX] Add sgx.disable_[cpu-feature] manifest options
This commit adds three new manifest options: `sgx.disable_avx`,
`sgx.disable_avx512`, and `sgx.disable_amx`. Setting any of these options
to `true` disables the corresponding CPU feature inside the SGX enclave
even if the feature is available on the system. This may improve
enclave performance because the feature's state will *not* be saved and
restored during enclave entry/exit. For example, disabling Intel AMX may
improve performance of some workloads by around 5%.

Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
Dmitrii Kuvaiskii authored and himanshupatelh committed Feb 18, 2022
1 parent 3d9154f commit 1451bb6
Showing 10 changed files with 393 additions and 6 deletions.
2 changes: 2 additions & 0 deletions CI-Examples/amxtest/.gitignore
@@ -0,0 +1,2 @@
/amxtest
/*.o
53 changes: 53 additions & 0 deletions CI-Examples/amxtest/Makefile
@@ -0,0 +1,53 @@
SGX_SIGNER_KEY ?= ../../Pal/src/host/Linux-SGX/signer/enclave-key.pem

CFLAGS = -Wall -Wextra

ifeq ($(DEBUG),1)
GRAMINE_LOG_LEVEL = debug
CFLAGS += -g
else
GRAMINE_LOG_LEVEL = error
CFLAGS += -O3
endif

.PHONY: all
all: amxtest amxtest.manifest
ifeq ($(SGX),1)
all: amxtest.manifest.sgx amxtest.sig amxtest.token
endif

amxtest: amxtest.o

amxtest.o: amxtest.c

amxtest.manifest: amxtest.manifest.template
	gramine-manifest \
		-Dlog_level=$(GRAMINE_LOG_LEVEL) \
		$< $@

amxtest.manifest.sgx: amxtest.manifest amxtest
	@test -s $(SGX_SIGNER_KEY) || \
	    { echo "SGX signer private key was not found, please specify SGX_SIGNER_KEY!"; exit 1; }
	gramine-sgx-sign \
		--key $(SGX_SIGNER_KEY) \
		--manifest $< \
		--output $@

amxtest.sig: amxtest.manifest.sgx

amxtest.token: amxtest.sig
	gramine-sgx-get-token \
		--output $@ --sig $<

ifeq ($(SGX),)
GRAMINE = gramine-direct
else
GRAMINE = gramine-sgx
endif

.PHONY: clean
clean:
	$(RM) *.token *.sig *.manifest.sgx *.manifest *.o amxtest OUTPUT

.PHONY: distclean
distclean: clean
50 changes: 50 additions & 0 deletions CI-Examples/amxtest/README.md
@@ -0,0 +1,50 @@
# AMX test

This directory contains a Makefile and a manifest template for running a simple
AMX test in Gramine. The test performs 10,000,000 `sched_yield()` system calls.
This system call was chosen because Gramine maps it 1:1 to the actual host
syscall. In other words, every `sched_yield()` in the test app results in one
EEXIT -> host `sched_yield` -> EENTER sequence in Gramine-SGX.

Thus, this test app can be used as a micro-benchmark of the latency of the
EEXIT/EENTER and AEX/ERESUME SGX flows, including the XSAVE/XRSTOR done as part
of these flows.

# Building

## Building for Linux

Run `make` (non-debug) or `make DEBUG=1` (debug) in the directory.

## Building for SGX

Run `make SGX=1` (non-debug) or `make SGX=1 DEBUG=1` (debug) in the directory.

# Run with Gramine

- Modify the `sgx.disable_amx` manifest option to enable/disable the AMX
  feature inside the SGX enclave (i.e., hide the AMX feature from the
  XSAVE/XRSTOR flows).

- Modify the SSA frame size in Gramine to test different SSA sizes. To do this,
  patch Gramine with a one-liner, `#define SSA_FRAME_SIZE (PRESET_PAGESIZE * 1)`,
  and rebuild Gramine.

Don't forget to test with Gramine built *in release mode*!

Without SGX (shown for sanity; this actually makes no difference):
```sh
# run without initializing AMX feature (so-called XINUSE)
gramine-direct amxtest

# run with initializing AMX feature (argv[1] can be any string)
gramine-direct amxtest inuse
```

With SGX:
```sh
# run without initializing AMX feature (so-called XINUSE)
gramine-sgx amxtest

# run with initializing AMX feature (argv[1] can be any string)
gramine-sgx amxtest inuse
```
178 changes: 178 additions & 0 deletions CI-Examples/amxtest/amxtest.c
@@ -0,0 +1,178 @@
#define _GNU_SOURCE
#include <err.h>
#include <errno.h>
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h> /* rand(), aligned_alloc(), free() */
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <x86intrin.h>

#ifndef __x86_64__
# error This test is 64-bit only
#endif

#define LOOPS (10 * 1000 * 1000)

#define XFEATURE_XTILECFG 17
#define XFEATURE_XTILEDATA 18
#define XFEATURE_MASK_XTILECFG (1 << XFEATURE_XTILECFG)
#define XFEATURE_MASK_XTILEDATA (1 << XFEATURE_XTILEDATA)
#define XFEATURE_MASK_XTILE (XFEATURE_MASK_XTILECFG | XFEATURE_MASK_XTILEDATA)

#define XSTATE_CPUID 0xd
#define XSTATE_USER_STATE_SUBLEAVE 0x0

#define XSAVE_HDR_OFFSET 512

static uint32_t xsave_size;
static uint32_t xsave_xtiledata_offset;
static uint32_t xsave_xtiledata_size;
static void* xsave_buffer;

static inline uint64_t __xgetbv(uint32_t index) {
    uint32_t eax, edx;

    asm volatile("xgetbv;"
                 : "=a" (eax), "=d" (edx)
                 : "c" (index));
    return eax + ((uint64_t)edx << 32);
}

static inline void __cpuid(uint32_t *eax, uint32_t *ebx, uint32_t *ecx, uint32_t *edx) {
    asm volatile("cpuid;"
                 : "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx)
                 : "0" (*eax), "2" (*ecx));
}

static inline void __xsave(void *buffer, uint32_t lo, uint32_t hi) {
    asm volatile("xsave (%%rdi)"
                 : : "D" (buffer), "a" (lo), "d" (hi)
                 : "memory");
}

static inline void __xrstor(void *buffer, uint32_t lo, uint32_t hi) {
    asm volatile("xrstor (%%rdi)"
                 : : "D" (buffer), "a" (lo), "d" (hi));
}

static inline bool check_xsave_capability(void) {
    if (__xgetbv(0) & XFEATURE_MASK_XTILEDATA) {
        return true;
    }
    return false;
}

static void check_cpuid(void) {
    uint32_t eax, ebx, ecx, edx;

    eax = XSTATE_CPUID;
    ecx = XSTATE_USER_STATE_SUBLEAVE;

    __cpuid(&eax, &ebx, &ecx, &edx);
    if (!ebx)
        err(1, "xstate cpuid: xsave size");

    xsave_size = ebx;

    eax = XSTATE_CPUID;
    ecx = XFEATURE_XTILECFG;

    __cpuid(&eax, &ebx, &ecx, &edx);
    if (!eax || !ebx)
        err(1, "xstate cpuid: tile config state");

    eax = XSTATE_CPUID;
    ecx = XFEATURE_XTILEDATA;

    __cpuid(&eax, &ebx, &ecx, &edx);
    if (!eax || !ebx)
        err(1, "xstate cpuid: tile data state");

    xsave_xtiledata_size = eax;
    xsave_xtiledata_offset = ebx;
}

static inline uint64_t get_xstatebv(void *xsave) {
    return *(uint64_t *)(xsave + XSAVE_HDR_OFFSET);
}

static inline void set_xstatebv(void *xsave, uint64_t bv) {
    *(uint64_t *)(xsave + XSAVE_HDR_OFFSET) = bv;
}

static void set_tiledata(void *tiledata) {
    int *ptr = tiledata;
    int data = rand();

    for (size_t i = 0; i < xsave_xtiledata_size / sizeof(int); i++, ptr++)
        *ptr = data;
}

static bool xrstor(void *buffer, uint32_t lo, uint32_t hi) {
    __xrstor(buffer, lo, hi);
    return true;
}

static int init_amx_random(void) {
    if (!check_xsave_capability()) {
        printf("XSAVE disabled/Tile data not available.\n");
        return 1;
    }

    check_cpuid();

    xsave_buffer = aligned_alloc(64, xsave_size);
    if (!xsave_buffer)
        err(1, "aligned_alloc()");

    /* aligned_alloc() returns uninitialized memory; zero it so that XRSTOR does not see garbage
     * in the reserved XSAVE header bytes or in the MXCSR field */
    memset(xsave_buffer, 0, xsave_size);

    set_xstatebv(xsave_buffer, XFEATURE_MASK_XTILE);
    set_tiledata(xsave_buffer + xsave_xtiledata_offset);

    /* XRSTOR below may overwrite MXCSR with the value from the buffer, so save and restore it */
    unsigned int mxcsr;
    __asm__ ("stmxcsr %0" : "=m"(mxcsr));

    if (!xrstor(xsave_buffer, -1, -1)) {
        printf("[FAIL]\tXRSTOR failed (loading tile data).\n");
        return 1;
    }

    __asm__ ("ldmxcsr %0" : : "m"(mxcsr));

    free(xsave_buffer);
    return 0;
}

int main(int argc, char** argv) {
    int ret;
    (void)argv[0];

    if (argc > 1) {
        ret = init_amx_random();
        if (ret) {
            printf("AMX initialization failed\n");
            return ret;
        }
        printf("Initialized AMX to a random tile\n");
    }

    printf("Starting micro-benchmark... ");

    clock_t begin = clock();
    for (long i = 0; i < LOOPS; i++) {
        /* below syscall has 1:1 mapping to host syscall in Gramine-SGX (i.e., each sched_yield
         * leads to one EENTER and one EEXIT) */
        ret = sched_yield();
        if (ret) {
            printf("sched_yield failed?!\n");
            return ret;
        }
    }
    clock_t end = clock();

    double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
    printf("done in %f seconds\n", time_spent);
    return 0;
}
25 changes: 25 additions & 0 deletions CI-Examples/amxtest/amxtest.manifest.template
@@ -0,0 +1,25 @@
sgx.disable_amx = true # modify this to test with AMX / without AMX

loader.insecure__use_cmdline_argv = true

loader.preload = "file:{{ gramine.libos }}" # for compatibility with v1.0

loader.entrypoint = "file:{{ gramine.libos }}"
libos.entrypoint = "amxtest"
loader.log_level = "{{ log_level }}"
loader.argv0_override = "amxtest"

loader.env.LD_LIBRARY_PATH = "/lib"

fs.mount.lib.type = "chroot"
fs.mount.lib.path = "/lib"
fs.mount.lib.uri = "file:{{ gramine.runtimedir() }}"

sgx.debug = true
sgx.nonpie_binary = true

sgx.trusted_files = [
  "file:amxtest",
  "file:{{ gramine.libos }}",
  "file:{{ gramine.runtimedir() }}/",
]
11 changes: 11 additions & 0 deletions Documentation/devel/performance.rst
@@ -233,6 +233,17 @@ platform and use very slow functions, leading to 10-100x overhead over native
your case, enable the features in the manifest, e.g., set ``sgx.require_avx =
true``.

Gramine also allows explicitly disabling non-security-critical CPU features
using the following manifest options: ``sgx.disable_avx``,
``sgx.disable_avx512``, ``sgx.disable_amx``. By default, all of these options
are set to ``false`` -- this means that Gramine will enable the CPU feature if
it is available on the system. Setting any of these options to ``true`` disables
the corresponding CPU feature inside the SGX enclave even if the feature is
available on the system: this may improve enclave performance because the
feature's state will *not* be saved and restored during enclave entry/exit. But
be aware that if the graminized application relies on a disabled CPU feature,
the application will crash with "illegal instruction".

For more information on SGX logic regarding optional CPU features, see the Intel
Software Developer Manual, Table 38-3 ("Layout of ATTRIBUTES Structure") under
the SGX section.
38 changes: 33 additions & 5 deletions Documentation/manifest-syntax.rst
@@ -514,11 +514,39 @@ Optional CPU features (AVX, AVX512, MPX, PKRU, AMX)
    sgx.require_amx = [true|false]
    (Default: false)

This syntax ensures that the CPU features are available and enabled for the
enclave. If the options are set in the manifest but the features are unavailable
on the platform, enclave initialization will fail. If the options are unset,
enclave initialization will succeed even if these features are unavailable on
the platform.
    sgx.disable_avx = [true|false]
    sgx.disable_avx512 = [true|false]
    sgx.disable_amx = [true|false]
    (Default: false)

The ``sgx.require_[feature]`` syntax ensures that the corresponding CPU feature
is available and enabled for the SGX enclave. If the option is set in the
manifest but the corresponding CPU feature is unavailable on the platform,
enclave initialization will fail. If the option is unset, enclave initialization
will succeed even if the corresponding feature is unavailable on the platform.

The ``sgx.disable_[feature]`` syntax disables the corresponding CPU feature
inside the SGX enclave even if this CPU feature is available on the platform:
this may improve enclave performance because the feature's state will *not* be
saved and restored during enclave entry/exit. This syntax is provided to improve
performance of applications that are known to *not* rely on certain CPU
features. Be aware that if the application relies on a disabled CPU feature,
the application will fail with SIGILL ("illegal instruction"). For example, if
the application is built with AVX support, and AVX is disabled in the manifest,
the application will crash. Only non-security-critical CPU features may be
disabled (currently these are AVX, AVX512 and AMX).

It is meaningless to set a CPU feature as both required and disabled. Currently
Gramine doesn't disallow this, but the feature will be disabled in such a case.
For example, setting both ``sgx.require_avx = true`` and ``sgx.disable_avx =
true`` will result in the SGX enclave running with AVX disabled.

In case of doubt, it is recommended to keep the default values for these
features (e.g. ``sgx.require_avx = false`` and ``sgx.disable_avx =
false``). In this case, Gramine auto-detects the corresponding CPU features on
the platform and enables them if available, regardless of whether the
application uses them or not.


ISV Product ID and SVN
^^^^^^^^^^^^^^^^^^^^^^