Refactor sysfs to improve sanitization #355

vijaydhanraj · 2022-01-22T05:15:59Z

Description of the changes

This PR refactors sysfs to improve sanitization. It fixes issues 1, 2 listed as part of gramineproject/graphene#2105.

This supersedes PR #106 and I have closed the PR.

How to test this PR?

Please run sysfs_common and proc_common regression tests. Also, openvino benchmark using the TBB library can be used to validate the changes.

This change is

mkow · 2022-01-22T14:16:21Z

Jenkins, test this please

mkow

Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @vijaydhanraj)

a discussion (no related file):
From AddressSanitizer:

[LibOS.shim.test.regression.test_libos.TC_40_FileSystem.test_040_sysfs]

error: asan: heap-buffer-overflow while trying to store 8 bytes at 0xf766cb8
error: asan: the bad address is 0xf766cb8 (0 from beginning of access)
error: asan: location: create_socket_range_from_core at db_main.c, libpal.so+0x1149bd
error: asan: (for a full traceback, use GDB with a breakpoint at "pal_abort")
error: asan:
error: asan: shadow bytes around the bad address:
error: asan:   0x18001eecd50: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan:   0x18001eecd60: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan:   0x18001eecd70: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan:   0x18001eecd80: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan: =>0x18001eecd90: fa fa 00 00 fa fa 00[fa]fa fa fa fa fa fa fa fa
error: asan:   0x18001eecda0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan:   0x18001eecdb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan:   0x18001eecdc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan:   0x18001eecdd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan:
error: asan: shadow byte legend (1 shadow byte = 8 application bytes):
error: asan:           addressable: 00
error: asan: partially addressable: 01..07
error: asan:     heap left redzone: fa
error: asan:     freed heap region: fd
error: asan:    stack left redzone: f1
error: asan:     stack mid redzone: f2
error: asan:   stack right redzone: f3
error: asan:     stack after scope: f8
error: asan:   alloca left redzone: ca
error: asan:  alloca right redzone: cb
error: asan:        global redzone: f9
error: asan:         user-poisoned: f7

vijaydhanraj

Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mkow)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

From AddressSanitizer:

[LibOS.shim.test.regression.test_libos.TC_40_FileSystem.test_040_sysfs]

error: asan: heap-buffer-overflow while trying to store 8 bytes at 0xf766cb8
error: asan: the bad address is 0xf766cb8 (0 from beginning of access)
error: asan: location: create_socket_range_from_core at db_main.c, libpal.so+0x1149bd
error: asan: (for a full traceback, use GDB with a breakpoint at "pal_abort")
error: asan:
error: asan: shadow bytes around the bad address:
error: asan:   0x18001eecd50: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan:   0x18001eecd60: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan:   0x18001eecd70: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan:   0x18001eecd80: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan: =>0x18001eecd90: fa fa 00 00 fa fa 00[fa]fa fa fa fa fa fa fa fa
error: asan:   0x18001eecda0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan:   0x18001eecdb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan:   0x18001eecdc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan:   0x18001eecdd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan:
error: asan: shadow byte legend (1 shadow byte = 8 application bytes):
error: asan:           addressable: 00
error: asan: partially addressable: 01..07
error: asan:     heap left redzone: fa
error: asan:     freed heap region: fd
error: asan:    stack left redzone: f1
error: asan:     stack mid redzone: f2
error: asan:   stack right redzone: f3
error: asan:     stack after scope: f8
error: asan:   alloca left redzone: ca
error: asan:  alloca right redzone: cb
error: asan:        global redzone: f9
error: asan:         user-poisoned: f7

Thanks, @mkow. Can you please tell me how to reproduce this locally?

I tried the following:

1. export CC=clang-9 CXX=clang++-9 AS=clang-9
2. meson setup build/ --buildtype=debug -Ddirect=enabled -Dsgx=enabled -Dsgx_driver=upstream -Dsgx_driver_include_path="" -Dtests=enabled -Ddcap=enabled -Dasan=enabled

This results in the error below,

LibOS/shim/test/regression/meson.build:176:8: ERROR: Problem encountered: Compiling tests is currently unsupported with Musl and compilers other than GCC. You need to either disable Musl (-Dmusl=disabled) or use GCC (CC=gcc)

So as suggested in the error, added -Dmusl=disabled

meson setup build/ --buildtype=debug -Ddirect=enabled -Dsgx=enabled -Dsgx_driver=upstream -Dsgx_driver_include_path="" -Dtests=enabled -Ddcap=enabled -Dasan=enabled -Dmusl=disabled

This worked fine but ninja -C build/ failed with the following error,

ninja: build stopped: subcommand failed.

Also, I see messages like these in the log,

[34/721] Compiling C object Pal/regression/Pie.p/Pie.c.o
FAILED: Pal/regression/Pie.p/Pie.c.o 
clang-9 -IPal/regression/Pie.p -IPal/regression -I../Pal/regression -I../Pal/include/pal -I../Pal/include/arch/x86_64 -I../Pal/include/arch/x86_64/Linux -Icommon/include -I../common/include -I../common/include/arch/x86_64 -fcolor-diagnostics -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -std=c11 -O0 -g -Wa,--noexecstack -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Wwrite-strings -Wnull-dereference -DDEBUG -fPIE -fsanitize=address -fno-sanitize-link-runtime -mllvm -asan-mapping-offset=0x18000000000 -DASAN -mllvm -asan-use-after-return=0 -fno-stack-protector -fno-builtin -nostdlib -Wno-unused-parameter -fPIE -MD -MQ Pal/regression/Pie.p/Pie.c.o -MF Pal/regression/Pie.p/Pie.c.o.d -o Pal/regression/Pie.p/Pie.c.o -c ../Pal/regression/Pie.c
clang: error: unknown argument: '-fno-sanitize-link-runtime'

vijaydhanraj

Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mkow)

a discussion (no related file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Thanks, @mkow. Can you please tell me how to reproduce this locally?

I tried the following:

1. export CC=clang-9 CXX=clang++-9 AS=clang-9
2. meson setup build/ --buildtype=debug -Ddirect=enabled -Dsgx=enabled -Dsgx_driver=upstream -Dsgx_driver_include_path="" -Dtests=enabled -Ddcap=enabled -Dasan=enabled

This results in the error below,

LibOS/shim/test/regression/meson.build:176:8: ERROR: Problem encountered: Compiling tests is currently unsupported with Musl and compilers other than GCC. You need to either disable Musl (-Dmusl=disabled) or use GCC (CC=gcc)

So as suggested in the error, added -Dmusl=disabled

meson setup build/ --buildtype=debug -Ddirect=enabled -Dsgx=enabled -Dsgx_driver=upstream -Dsgx_driver_include_path="" -Dtests=enabled -Ddcap=enabled -Dasan=enabled -Dmusl=disabled

This worked fine but ninja -C build/ failed with the following error,

ninja: build stopped: subcommand failed.

Also, I see messages like these in the log,

[34/721] Compiling C object Pal/regression/Pie.p/Pie.c.o
FAILED: Pal/regression/Pie.p/Pie.c.o 
clang-9 -IPal/regression/Pie.p -IPal/regression -I../Pal/regression -I../Pal/include/pal -I../Pal/include/arch/x86_64 -I../Pal/include/arch/x86_64/Linux -Icommon/include -I../common/include -I../common/include/arch/x86_64 -fcolor-diagnostics -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -std=c11 -O0 -g -Wa,--noexecstack -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Wwrite-strings -Wnull-dereference -DDEBUG -fPIE -fsanitize=address -fno-sanitize-link-runtime -mllvm -asan-mapping-offset=0x18000000000 -DASAN -mllvm -asan-use-after-return=0 -fno-stack-protector -fno-builtin -nostdlib -Wno-unused-parameter -fPIE -MD -MQ Pal/regression/Pie.p/Pie.c.o -MF Pal/regression/Pie.p/Pie.c.o.d -o Pal/regression/Pie.p/Pie.c.o -c ../Pal/regression/Pie.c
clang: error: unknown argument: '-fno-sanitize-link-runtime'

I think the below should fix the issue, but want to test with both ASan and UBSan before pushing the changes.

diff --git a/Pal/src/host/Linux-SGX/db_main.c b/Pal/src/host/Linux-SGX/db_main.c
index dc41346f..144a69f5 100644
--- a/Pal/src/host/Linux-SGX/db_main.c
+++ b/Pal/src/host/Linux-SGX/db_main.c
@@ -372,8 +372,8 @@ static int create_socket_range_from_core(struct pal_res_range_info* socket_info_
 
     if (socket_id != prev_socket_id) {
         socket_info_arr[socket_id].range_cnt++;
-        size_t new_size = sizeof(new_size) * socket_info_arr[socket_id].range_cnt;
-        size_t old_size = new_size - sizeof(old_size);
+        size_t new_size = sizeof(struct pal_range_info) * socket_info_arr[socket_id].range_cnt;
+        size_t old_size = new_size - sizeof(struct pal_range_info);
         struct pal_range_info* tmp = malloc(new_size);
         if (!tmp)
             return -1;

mkow

Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @vijaydhanraj)

a discussion (no related file):

clang: error: unknown argument: '-fno-sanitize-link-runtime'

Try clang10, yours is probably too old and doesn't support this flag yet.

vijaydhanraj

Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

clang: error: unknown argument: '-fno-sanitize-link-runtime'
Try clang10, yours is probably too old and doesn't support this flag yet.

Thanks, installed clang-12 and was able to repro the issue. Tested with the above patch and it resolves the issue.

dimakuv · 2022-01-24T08:13:59Z

Jenkins, test this please

dimakuv

Reviewed 20 of 21 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 124 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)

-- commits, line 6 at r2:
Linux_SGX -> Linux-SGX

-- commits, line 7 at r2:
...unify -> ...and unify

-- commits, line 8 at r2:
PAL -> PALs

-- commits, line 19 at r2:
Actually, I would just remove this sentence and immediately start with 1., 2., 3.

LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):

/* Converts an integer to a string, optionally appending a given single-letter unit suffix
 * (see enum size_multiplier for possible values).
 * Note: This function adds a newline at end of the string. */

at the end

LibOS/shim/include/shim_fs_pseudo.h, line 216 at r2 (raw file):

 * Note: This function adds a newline at end of the string. */
int sys_convert_int_to_sizestr(size_t val, enum size_multiplier size_mult, char* str,
                               size_t buf_size);

We have inconsistency in naming: str and buf_size. I suggest to replace str -> buf.

LibOS/shim/include/shim_fs_pseudo.h, line 220 at r2 (raw file):

/* Converts struct pal_res_range_info to a string representation.
 * Example output when sep == ',': "10-63,68,70-127".
 * Note: This function adds a newline at end of the string. */

at the end

LibOS/shim/include/shim_fs_pseudo.h, line 222 at r2 (raw file):

 * Note: This function adds a newline at end of the string. */
int sys_convert_ranges_to_str(const struct pal_res_range_info* resource_range_info, char* str,
                              size_t buf_size, const char* sep);

ditto (buf)

LibOS/shim/include/shim_fs_pseudo.h, line 222 at r2 (raw file):

 * Note: This function adds a newline at end of the string. */
int sys_convert_ranges_to_str(const struct pal_res_range_info* resource_range_info, char* str,
                              size_t buf_size, const char* sep);

Can we move sep before str and buf_size (to become a second argument, not the last one)?

This is more easy on the eyes: we will have two input arguments first, and then two arguments refering to the buffer.

LibOS/shim/include/shim_fs_pseudo.h, line 227 at r2 (raw file):

 * on the possible cores count in the system.
 * Example output for 64 cores in total and ranges 0-15,48-55: "00ff0000,0000ffff".
 * Note: This function adds a newline at end of the string. */

at the end

LibOS/shim/include/shim_fs_pseudo.h, line 229 at r2 (raw file):

 * Note: This function adds a newline at end of the string. */
int sys_convert_ranges_to_cpu_bitmap_str(const struct pal_res_range_info* resource_range_info, char* str,
                                         size_t buf_size);

ditto (buf)

LibOS/shim/src/fs/proc/info.c, line 129 at r2 (raw file):

        if (g_pal_public_state->enable_sysfs_topology) {
            ADD_INFO("physical id\t: %zu\n", ti->core_topology_arr[i].socket_id);
        }

Can we have an else clause? It is very unexpected that if sysfs is disabled, then there will be no physical id string at all.

Maybe simply ADD_INFO("physical id\t: %zu\n", 0); is good enough for now (i.e., we have only one socket)?

LibOS/shim/src/fs/sys/cache_info.c, line 35 at r2 (raw file):

        ret = sys_convert_ranges_to_cpu_bitmap_str(&cache_info->shared_cpu_map, str, sizeof(str));
    } else if (strcmp(name, "level") == 0) {
        ret = sys_convert_int_to_sizestr(cache_info->level, MULTIPLIER_NONE, str, sizeof(str));

Why do you use your special function here, instead of a simple:

    ret = snprintf(str, sizeof(str), "%zu\n", cache_info->level);

Same for the below cases (coherency line size, number of sets, ...)

LibOS/shim/src/fs/sys/cache_info.c, line 39 at r2 (raw file):

        switch (cache_info->type) {
            case CACHE_TYPE_DATA:
                ret = snprintf(str, sizeof(str), "%s", "Data\n");

\n is typically added to the format string, so please change to this:

    ret = snprintf(str, sizeof(str), "%s\n", "Data");

Same for the below two cases.

LibOS/shim/src/fs/sys/cache_info.c, line 67 at r2 (raw file):

    if (ret < 0)
        return ret;

Please add an empty line after this, a bit hard to read otherwise

LibOS/shim/src/fs/sys/cpu_info.c, line 32 at r2 (raw file):

    if (ret < 0)
        return ret;

Please add an empty line after this, a bit hard to read otherwise

LibOS/shim/src/fs/sys/cpu_info.c, line 52 at r2 (raw file):

            return -ENOENT;
        ret = sys_convert_int_to_sizestr(core_topology->is_logical_core_online, MULTIPLIER_NONE,
                                         str, sizeof(str));

Why do you use your special function here, instead of a simple:

    ret = snprintf(str, sizeof(str), "%zu\n", core_topology->is_logical_core_online);

Same for the below cases (coherency line size, number of sets, ...)

LibOS/shim/src/fs/sys/cpu_info.c, line 69 at r2 (raw file):

    if (ret < 0)
        return ret;

Please add an empty line after this, a bit hard to read otherwise

LibOS/shim/src/fs/sys/fs.c, line 20 at r2 (raw file):

                               size_t buf_size) {
    int ret = 0;

Please add an assert that size_mult is one of NONE, KB, MB, GB.

LibOS/shim/src/fs/sys/fs.c, line 38 at r2 (raw file):

    if ((size_t)ret >= buf_size)
        ret = -EOVERFLOW;
    return ret;

Please combine three lines into one via return condition ? true : false.

LibOS/shim/src/fs/sys/fs.c, line 46 at r2 (raw file):

    size_t offset = 0;
    for (size_t i = 0; i < range_cnt; i++) {
        if (offset > buf_size)

>=

LibOS/shim/src/fs/sys/fs.c, line 65 at r2 (raw file):

        /* Truncation has occurred */
        if ((size_t)ret >= buf_size)

buf_size -> buf_size - offset (because this is the real "max size of string" argument passed to snprintf)

LibOS/shim/src/fs/sys/fs.c, line 90 at r2 (raw file):

        for (size_t j = start; j <= end; j++) {
            size_t index = j / BITS_IN_TYPE(int);

int -> uint32_t

LibOS/shim/src/fs/sys/fs.c, line 93 at r2 (raw file):

            assert(index < cpumask_cnt);

            bitmap[index] |= 1U << (j % BITS_IN_TYPE(int));

int -> uint32_t

LibOS/shim/src/fs/sys/fs.c, line 100 at r2 (raw file):

    size_t offset = 0;
    for (size_t j = cpumask_cnt; j > 0; j--) {
        if (offset > buf_size) {

``>=

LibOS/shim/src/fs/sys/fs.c, line 119 at r2 (raw file):

        /* Truncation has occurred */
        if ((size_t)ret >= buf_size) {

buf_size -> buf_size - offset (because this is the real "max size of string" argument passed to snprintf)

LibOS/shim/src/fs/sys/node_info.c, line 50 at r2 (raw file):

        if (strcmp(parent_name, "hugepages-2048kB") == 0) {
            ret = sys_convert_int_to_sizestr(numa_topology->nr_hugepages[HUGEPAGES_2M],
                                             MULTIPLIER_NONE, str, sizeof(str));

Why do you use your special function here, instead of a simple:

    ret = snprintf(str, sizeof(str), "%zu\n", ...);

Same for the below case.

LibOS/shim/src/fs/sys/node_info.c, line 64 at r2 (raw file):

    if (ret < 0)
        return ret;

Please add an empty line after this, a bit hard to read otherwise

Pal/include/arch/x86_64/pal_topology.h, line 3 at r2 (raw file):

/* SPDX-License-Identifier: LGPL-3.0-or-later */
/* Copyright (C) 2022 Intel Corporation
 *                    Michał Kowalczyk <mkow@invisiblethingslab.com>

Nit: you can add your copyright too

Pal/include/arch/x86_64/pal_topology.h, line 41 at r2 (raw file):

struct pal_res_range_info {
    /* Count of total number of resources present. Eg. if output of `/sys/devices/system/cpu/online`

Simply Total number of ... (remove count of, it is redundant)

Pal/include/arch/x86_64/pal_topology.h, line 45 at r2 (raw file):

    size_t resource_cnt;

    /* Count of total number of ranges present. Eg. if output of `/sys/devices/system/cpu/online`

Simply Total number of ... (remove count of, it is redundant)

Pal/include/arch/x86_64/pal_topology.h, line 49 at r2 (raw file):

    size_t range_cnt;

    /* Array to store each range and has `range_cnt` elements. Eg. if output of

I would rephrase the first sentence like this:

Array of ranges, with `range_cnt` items.

Pal/include/arch/x86_64/pal_topology.h, line 51 at r2 (raw file):

    /* Array to store each range and has `range_cnt` elements. Eg. if output of
     * `/sys/devices/system/cpu/online` were 0-15,21,32-63 then `ranges_arr` will be
     * [[0, 15], [21, 21], [32 64]]. Note: When there is only a single number instead of range both

You current example looks like a two-dimensional array (array of arrays). But in reality, you have an array of structs. So you should change to:

[ {0, 15}, {21, 21}, {32, 63} ]

Also, did you make a typo? Why do you have 64 instead of 63 in the last item?

Pal/include/arch/x86_64/pal_topology.h, line 52 at r2 (raw file):

     * `/sys/devices/system/cpu/online` were 0-15,21,32-63 then `ranges_arr` will be
     * [[0, 15], [21, 21], [32 64]]. Note: When there is only a single number instead of range both
     * start and end are assigned start value.

I would remove the Note sentence, it is redundant (you already give an example of [21, 21] -- no need to explain this representation in words).

Pal/include/arch/x86_64/pal_topology.h, line 53 at r2 (raw file):

     * [[0, 15], [21, 21], [32 64]]. Note: When there is only a single number instead of range both
     * start and end are assigned start value.
     * Note: The ranges_arr has non-overlapping elements */

I would rephrase to: The ranges should not overlap.

Pal/include/arch/x86_64/pal_topology.h, line 66 at r2 (raw file):

    size_t number_of_sets;
    size_t physical_line_partition;
};

Would it be possible to add same-line comments on each of these? With examples. For example, I have no idea what type of the cache could mean.

UPDATE: type must be of type enum cache_type, please fix. Actually, then you can ignore my request for same-line comments, the rest fields are recognizable.

Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):

struct pal_core_topo_info {
    /* [0] element is uninitialized because core 0 is always online */

Does this comment make sense now? Previously, the string is_logical_core_online could be uninitialized, but now it's a integer which is always initialized (moreover, it would be reasonable to always set is_logical_core_online = 1 for core 0).

I vote for removing this comment (unless I misunderstand smth).

Pal/include/arch/x86_64/pal_topology.h, line 99 at r2 (raw file):

    /* Number of physical packages in the system */
    size_t sockets_cnt;

How is this sockets_cnt different from online_nodes.resource_cnt? Are you sure you still need this sockets_cnt variable for anything?

Pal/include/arch/x86_64/pal_topology.h, line 101 at r2 (raw file):

    size_t sockets_cnt;
    /* Number of physical cores in a socket (physical package). */
    size_t physical_cores_per_socket;

Is this physical_cores_per_socket variable still relevant? You have all this info in online_nodes.ranges_arr. Please check if this variable is really needed.

Pal/src/host/Linux-common/topo_info.c, line 20 at r2 (raw file):

/* Opens a pseudo-file describing HW resources and simply reads the value stored in the file. If
 * `retval` and `size_mult` are passed, the value read and the size_qualifier if any are stored.

Mentioning retval in this sentence is misleading, it makes an impression that reval may be passed as NULL, but this is not true. In other words, retval must always be passed to this function. So please remove retval mention from this sentence (keep only size_mult).

Pal/src/host/Linux-common/topo_info.c, line 27 at r2 (raw file):

        /* `retval` must be passed to store the value read from buffer */
        return -EINVAL;
    }

Instead of the if check, simply assert(retval) -- this is a local function anyway, so we can make assumptions how it is called.

Pal/src/host/Linux-common/topo_info.c, line 32 at r2 (raw file):

        *size_mult = MULTIPLIER_NONE;

    int fd = DO_SYSCALL(open, filename, O_RDONLY | O_CLOEXEC, 0);

You can remove the last argument (0), it will still work.

Pal/src/host/Linux-common/topo_info.c, line 36 at r2 (raw file):

        return fd;

    char buf[64];

Instead of 64, cannot we use one of your FILESZ macros?

Pal/src/host/Linux-common/topo_info.c, line 40 at r2 (raw file):

    DO_SYSCALL(close, fd);
    if (ret < 0)
        return ret;

Isn't this whole open+read+close sequence exactly the same as what read_file_buffer() function does? Please replace this code with the call to this function.

Pal/src/host/Linux-common/topo_info.c, line 73 at r2 (raw file):

/* Opens a pseudo-file describing HW resources such as online CPUs and counts the number of
 * HW resources, ranges present in the file. The result is stored in `struct pal_res_range_info`.

HW resources, ranges -> HW resources and their ranges

Pal/src/host/Linux-common/topo_info.c, line 73 at r2 (raw file):

/* Opens a pseudo-file describing HW resources such as online CPUs and counts the number of
 * HW resources, ranges present in the file. The result is stored in `struct pal_res_range_info`.

...stored in `res_info`.

Pal/src/host/Linux-common/topo_info.c, line 75 at r2 (raw file):

 * HW resources, ranges present in the file. The result is stored in `struct pal_res_range_info`.
 * Returns UNIX error code on failure and 0 on success.
 * N.B: Understands complex formats like "1,3-5,6"

Add a dot at the end of this sentence.

Pal/src/host/Linux-common/topo_info.c, line 81 at r2 (raw file):

        /* `res_info` must be passed to store the range info read from buffer */
        return -EINVAL;
    }

Instead of the if check, simply assert(res_info) -- this is a local function anyway, so we can make assumptions how it is called.

Pal/src/host/Linux-common/topo_info.c, line 88 at r2 (raw file):

    res_info->ranges_arr = NULL;

    int fd = DO_SYSCALL(open, filename, O_RDONLY | O_CLOEXEC, 0);

You can remove the last argument (0), it will still work.

Pal/src/host/Linux-common/topo_info.c, line 92 at r2 (raw file):

        return fd;

    char buf[64];

Instead of 64, cannot we use one of your FILESZ macros?

Pal/src/host/Linux-common/topo_info.c, line 96 at r2 (raw file):

    DO_SYSCALL(close, fd);
    if (ret < 0)
        return ret;

Isn't this whole open+read+close sequence exactly the same as what read_file_buffer() function does? Please replace this code with the call to this function.

Pal/src/host/Linux-common/topo_info.c, line 101 at r2 (raw file):

    char* ptr = buf;
    size_t resource_cnt = 0;

Why do you need a helper variable? Simply use res_info->resource_cnt instead.

Pal/src/host/Linux-common/topo_info.c, line 119 at r2 (raw file):

        size_t range_end;

        if (*end == '\0' || *end == ',' || *end == '\n' || *end == ' ') {

Shouldn't you add also *end == '\t'?

Pal/src/host/Linux-common/topo_info.c, line 120 at r2 (raw file):

        if (*end == '\0' || *end == ',' || *end == '\n' || *end == ' ') {
            /* single HW resource index, count as one more */

This count as one more comment doesn't make much sense now. I suggest to simply remove this part of the comment.

Pal/src/host/Linux-common/topo_info.c, line 125 at r2 (raw file):

            resource_cnt++;
        } else if (*end == '-') {
            /* HW resource range, count how many HW resources are in range */

This count how many ... comment doesn't make much sense now. I suggest to simply remove this part of the comment.

Pal/src/host/Linux-common/topo_info.c, line 142 at r2 (raw file):

                goto out;
            }
            resource_cnt += end_val - start_val + 1; //inclusive (e.g., 0-7, or 8-16)

end_val - start_val -> diff

Pal/src/host/Linux-common/topo_info.c, line 142 at r2 (raw file):

                goto out;
            }
            resource_cnt += end_val - start_val + 1; //inclusive (e.g., 0-7, or 8-16)

The comment is weird. Why are there two examples? I would simply say /* +1 because of inclusive range */

Pal/src/host/Linux-common/topo_info.c, line 150 at r2 (raw file):

        /* Update range info */
        res_info->range_cnt++;

Please add an empty line after this line, easier to read

Pal/src/host/Linux-common/topo_info.c, line 151 at r2 (raw file):

        /* Update range info */
        res_info->range_cnt++;
        /* Realloc as we identify new range when parsing */

Simply Realloc the array of ranges (expand by one range)

Pal/src/host/Linux-common/topo_info.c, line 178 at r2 (raw file):

    return 0;
out:

You get to this label only if something failed. Please rename it to fail:.

Pal/src/host/Linux-common/topo_info.c, line 183 at r2 (raw file):

            free(res_info[i].ranges_arr);
    }
    /* Clear user supplied buffer */

I would remove this comment, it is obvious

Pal/src/host/Linux-common/topo_info.c, line 218 at r2 (raw file):

        /* `cache_indices_cnt` must be passed to store the number of cache indices */
        return -EINVAL;
    }

Instead of the if check, simply assert(cache_indices_cnt) -- this is a local function anyway, so we can make assumptions how it is called.

Pal/src/host/Linux-common/topo_info.c, line 220 at r2 (raw file):

    }

    char buf[1024];

Instead of 1024, cannot we use one of your FILESZ macros?

Pal/src/host/Linux-common/topo_info.c, line 221 at r2 (raw file):

    char buf[1024];
    int ret = 0;

Please don't use this pattern of assigning 0 to ret during init. Instead, add an explicit ret = 0; statement before the out: label.

Pal/src/host/Linux-common/topo_info.c, line 249 at r2 (raw file):

        ret = -ENOENT;
        goto out;
    }

Please add an empty line

Pal/src/host/Linux-common/topo_info.c, line 250 at r2 (raw file):

        goto out;
    }
    *cache_indices_cnt = dirs_cnt;

Please add ret = 0;

Pal/src/host/Linux-common/topo_info.c, line 266 at r2 (raw file):

    }

    char filename[128];

Instead of 128, cannot we use one of your FILESZ macros?

Pal/src/host/Linux-common/topo_info.c, line 270 at r2 (raw file):

        snprintf(filename, sizeof(filename),
                 "/sys/devices/system/cpu/cpu%zu/cache/index%zu/shared_cpu_list", core_idx,
                 cache_idx);

This is hard to read. You have the part /sys/devices/system/cpu/cpu%zu/cache/index%zu/ always the same -- please move it to a helper variable, and use the helper variable everywhere, like this:

char dirname[128];
snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu/cache/index%zu", core_idx, cache_idx);

snrpintf(filename, sizeof(filename), "%s/shared_cpu_list", dirname);
snrpintf(filename, sizeof(filename), "%s/level", dirname);
... and so on ...

Pal/src/host/Linux-common/topo_info.c, line 285 at r2 (raw file):

                 core_idx, cache_idx);

        ret = read_file_buffer(filename, type, ARRAY_SIZE(type)-1);

Why ARRAY_SIZE? Simply sizeof()

Pal/src/host/Linux-common/topo_info.c, line 336 at r2 (raw file):

    return 0;

out_cache:

There is no reason to have this specific name, please rename to simply out:

Pal/src/host/Linux-common/topo_info.c, line 363 at r2 (raw file):

    /* TODO: correctly support offline cores */
    if (possible_logical_cores_cnt > online_logical_cores_cnt) {
        log_error("Some CPUs seem to be offline; Gramine currently doesn't support core offling");

offling -> offlining

Pal/src/host/Linux-common/topo_info.c, line 377 at r2 (raw file):

    size_t current_max_socket = 0;
    char filename[128];

Instead of 128, cannot we use one of your FILESZ macros?

Pal/src/host/Linux-common/topo_info.c, line 381 at r2 (raw file):

        /* cpu0 is always online and thus the "online" file is not present. */
        if (idx != 0) {
            snprintf(filename, sizeof(filename), "/sys/devices/system/cpu/cpu%zu/online", idx);

This is hard to read. You have the part /sys/devices/system/cpu/cpu%zu/ always the same -- please move it to a helper variable, and use the helper variable everywhere, like this:

char dirname[128];
snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu", idx);

snrpintf(filename, sizeof(filename), "%s/online", dirname);
snrpintf(filename, sizeof(filename), "%s/topology/core_id", dirname);
... and so on ...

Pal/src/host/Linux-common/topo_info.c, line 386 at r2 (raw file):

            if (ret < 0)
                goto out;
        }

Please add else { core_topology_arr[idx].is_logical_core_online = 1; }

Pal/src/host/Linux-common/topo_info.c, line 425 at r2 (raw file):

    topo_info->sockets_cnt = current_max_socket + 1;
    topo_info->physical_cores_per_socket = core_topology_arr[0].core_siblings.resource_cnt /
                                           core_topology_arr[0].thread_siblings.resource_cnt;

As I mentioned previously, I think you don't need these two variables anymore (sockets_cnt and physical_cores_per_socket).

Pal/src/host/Linux-SGX/db_main.c, line 126 at r2 (raw file):

}

static int copy_resource_range_to_enclave(struct pal_res_range_info* untrusted_src,

Please add a comment that this function and all below functions do not care about correctly freeing memory.

Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):

static int copy_resource_range_to_enclave(struct pal_res_range_info* untrusted_src,
                                          struct pal_res_range_info* dest) {
    size_t range_cnt = untrusted_src->range_cnt;

To access untrusted base-type variables, you should always use READ_ONCE(). In this case:

size_t range_cnt = READ_ONCE(untrusted_src->range_cnt);

Pal/src/host/Linux-SGX/db_main.c, line 135 at r2 (raw file):

    }

    if (!sgx_copy_to_enclave(ranges_arr, range_cnt * sizeof(*ranges_arr), untrusted_src->ranges_arr,

READ_ONCE(untrusted_src->ranges_arr)

(For examples, you can check enclave_ocalls.c)

Pal/src/host/Linux-SGX/db_main.c, line 143 at r2 (raw file):

    dest->ranges_arr = ranges_arr;
    dest->range_cnt = range_cnt;
    dest->resource_cnt = untrusted_src->resource_cnt;

READ_ONCE

Pal/src/host/Linux-SGX/db_main.c, line 158 at r2 (raw file):

    for (size_t idx = 0; idx < online_logical_cores_cnt; idx++) {
        core_topo_arr[idx].is_logical_core_online = untrusted_src[idx].is_logical_core_online;

Here and everywhere: READ_ONCE()

Pal/src/host/Linux-SGX/db_main.c, line 165 at r2 (raw file):

                                                 &core_topo_arr[idx].core_siblings);
        if (ret < 0) {
            log_error("Copying core_topo[%zu].core_siblings failed", idx);

core_topo_arr, please fix all such issues in your logs

Pal/src/host/Linux-SGX/db_main.c, line 185 at r2 (raw file):

        for (size_t lvl = 0; lvl < cache_indices_cnt; lvl++) {
            cache_info_arr[lvl].level = untrusted_src[idx].cache_info_arr[lvl].level;

Can you introduce a helper variable, otherwise it's hard to read. Like this:

struct pal_core_cache_info* untrusted_cache_info = READ_ONCE(&untrusted_src[idx].cache_info_arr[lvl]);

cache_info_arr[lvl].type = READ_ONCE(untrusted_cache_info->type);
...

Pal/src/host/Linux-SGX/db_main.c, line 206 at r2 (raw file):

            }
        }
        core_topo_arr[idx].cache_info_arr = cache_info_arr;

Please add an empty line before this one.

Pal/src/host/Linux-SGX/db_main.c, line 208 at r2 (raw file):

        core_topo_arr[idx].cache_info_arr = cache_info_arr;
    }
    g_pal_public_state.topo_info.core_topology_arr = core_topo_arr;

I don't like that this generic function suddenly updates a global variable -- this is an unexpected side-effect of this function. Instead, please add a new function argument which will be a pointer to the topo_info object, and pass &g_pal_public_state.topo_info as this argument in the caller.

Pal/src/host/Linux-SGX/db_main.c, line 222 at r2 (raw file):

    for (size_t idx = 0; idx < online_nodes_cnt; idx++) {
        numa_topo_arr[idx].nr_hugepages[HUGEPAGES_2M] = untrusted_src[idx].nr_hugepages[HUGEPAGES_2M];

READ_ONCE()

Pal/src/host/Linux-SGX/db_main.c, line 228 at r2 (raw file):

                                                 &numa_topo_arr[idx].cpumap);
        if (ret < 0) {
            log_error("Copying numa_topo[%zu].core_siblings failed", idx);

numa_topo_arr

Pal/src/host/Linux-SGX/db_main.c, line 239 at r2 (raw file):

        }
    }
    g_pal_public_state.topo_info.numa_topology_arr = numa_topo_arr;

I don't like that this generic function suddenly updates a global variable -- this is an unexpected side-effect of this function. Instead, please add a new function argument which will be a pointer to the topo_info object, and pass &g_pal_public_state.topo_info as this argument in the caller.

Pal/src/host/Linux-SGX/db_main.c, line 247 at r2 (raw file):

 * 1. Ensures the resource as well as range count doesn't exceed limits.
 * 2. Ensures that ranges don't overlap like "1-5, 3-4".
 * 3. Ensures the ranges aren't malformed like "1-5, 7-1".

Can you add logic to verify that resource_cnt is correct and corresponds to ranges in res_info->ranges_arr? I mean, you can do a for loop over res_info->ranges_arr[idx] and sum up the range diffs. This will add protection against malicious resource_cnt.

Pal/src/host/Linux-SGX/db_main.c, line 261 at r2 (raw file):

    size_t range_cnt = res_info->range_cnt;
    if (!IS_IN_RANGE_INCL(range_cnt, 1, 1 << 7)) {
        log_error("Invalid range count: %zu", range_cnt);

Invalid number of ranges

Pal/src/host/Linux-SGX/db_main.c, line 277 at r2 (raw file):

        /* Ensure start and end fall within range limits */
        if (!IS_IN_RANGE_INCL(start, range_min_limit, range_max_limit)) {
            log_error("Invalid start range: %zu", start);

Invalid start of range

Pal/src/host/Linux-SGX/db_main.c, line 282 at r2 (raw file):

        if ((start != end) && !IS_IN_RANGE_INCL(end, start + 1, range_max_limit)) {
            log_error("Invalid end range: %zu", end);

Invalid end of range

Pal/src/host/Linux-SGX/db_main.c, line 289 at r2 (raw file):

         *`previous_end` is not yet initialized. */
        if (check_for_overlaps && previous_end >= start) {
            log_error("Malformed range: previous_end = %zu, current start = %zu", previous_end,

Malformed range -> Overlapping ranges

Pal/src/host/Linux-SGX/db_main.c, line 315 at r2 (raw file):

                cache_info_arr[lvl].type == CACHE_TYPE_INSTRUCTION) {
            /* Taking HT into account */
            max_limit = 2;

Please introduce a macro MAX_HYPERTHREADS_PER_CORE (or something like this) in one of the header files instead of hard-coding 2 here.

Pal/src/host/Linux-SGX/db_main.c, line 321 at r2 (raw file):

        }

        int ret = sanitize_hw_resource_range(&cache_info_arr[lvl].shared_cpu_map, 1, max_limit, 0,

Can you add a comment with an example of shared_cpu_map? I'm confused how this can be in range 1..2 for the case of DATA cache...

Pal/src/host/Linux-SGX/db_main.c, line 328 at r2 (raw file):

        }

        size_t level = cache_info_arr[lvl].level;

What's the point in this level helper variable?

Pal/src/host/Linux-SGX/db_main.c, line 329 at r2 (raw file):

        size_t level = cache_info_arr[lvl].level;
        if (!IS_IN_RANGE_INCL(level, 1, 3))      /* x86 processors have max of 3 cache levels */

Please introduce a macro MAX_CACHE_LEVELS (or something like this) in one of the header files instead of hard-coding 3 here.

Pal/src/host/Linux-SGX/db_main.c, line 337 at r2 (raw file):

                cache_info_arr[lvl].size_multiplier != MULTIPLIER_NONE) {
            return -1;
        }

Why not remove this big IF and just add another else {return -1} below?

Pal/src/host/Linux-SGX/db_main.c, line 354 at r2 (raw file):

            return -1;

        size_t coherency_line_size = cache_info_arr[lvl].coherency_line_size;

What's the point in this helper variable?

Pal/src/host/Linux-SGX/db_main.c, line 358 at r2 (raw file):

            return -1;

        size_t number_of_sets = cache_info_arr[lvl].number_of_sets;

What's the point in this helper variable?

Pal/src/host/Linux-SGX/db_main.c, line 362 at r2 (raw file):

            return -1;

        size_t physical_line_partition = cache_info_arr[lvl].physical_line_partition;

What's the point in this helper variable?

Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):

}

static int create_socket_range_from_core(struct pal_res_range_info* socket_info_arr,

I highly dislike this function. I don't understand what it does exactly, the name is definitely wrong (create? this func doesn't create anything). I don't know how many items are in the socket_info_arr, and in general this function has many assumptions on what its caller does.

Please move this function into its caller. Or make this function more self-contained and self-explanatory.

Pal/src/host/Linux-SGX/db_main.c, line 402 at r2 (raw file):

                                struct pal_res_range_info* socket_info_arr, size_t sockets_cnt) {
    /* core-siblings represent all the cores that are part of a socket. We cross-verify the
     * socket info against this. */

I suggest the comment:

For each socket, cross-verify that its set of cores is the same as the core topology's core-siblings:
  - Pick the first core in the socket
  - Find its core-siblings in the core topology
  - Verify that the array of cores in the socket object is exactly the same as core-siblings in the core topology object

Then you can remove the other comment (pick the first ...)

Pal/src/host/Linux-SGX/db_main.c, line 409 at r2 (raw file):

        /* pick the first core in the socket */
        size_t core_in_socket =  socket_info_arr[idx].ranges_arr[0].start;

You have double-space here

Pal/src/host/Linux-SGX/db_main.c, line 410 at r2 (raw file):

        /* pick the first core in the socket */
        size_t core_in_socket =  socket_info_arr[idx].ranges_arr[0].start;
        size_t core_sibling_cnt = core_topology_arr[core_in_socket].core_siblings.range_cnt;

Instead of core_sibling_cnt, I suggest to use:

    struct pal_res_range_info* core_siblings = &core_topology_arr[core_in_socket].core_siblings;

    /* and then use this helper variable */
    if (core_siblings->range_cnt != socket_info_arr[idx].range_cnt) { ...

Pal/src/host/Linux-SGX/db_main.c, line 433 at r2 (raw file):

                                       size_t sockets_cnt) {
    int ret;
    if (online_logical_cores_cnt == 0 || cache_indices_cnt == 0)

Shouldn't you check sockets_cnt as well?

Pal/src/host/Linux-SGX/db_main.c, line 449 at r2 (raw file):

        }

        size_t core_id = core_topology_arr[idx].core_id;

What's the point in this helper variable? Please remove.

Pal/src/host/Linux-SGX/db_main.c, line 456 at r2 (raw file):

        ret = sanitize_hw_resource_range(&core_topology_arr[idx].core_siblings, 1,
                                         online_logical_cores_cnt, 0, online_logical_cores_cnt);

Shouldn't the last argument be online_logical_cores_cnt - 1 (note the - 1)?

Pal/src/host/Linux-SGX/db_main.c, line 463 at r2 (raw file):

        /* Max. SMT siblings currently supported on x86 processors is 4 */
        ret = sanitize_hw_resource_range(&core_topology_arr[idx].thread_siblings, 1, 4, 0,

In some other place you said that the max SMT siblings are 2 :)

Please introduce a macro MAX_HYPERTHREADS_PER_CORE (or something like this) in one of the header files instead of hard-coding 4 here.

Pal/src/host/Linux-SGX/db_main.c, line 464 at r2 (raw file):

        /* Max. SMT siblings currently supported on x86 processors is 4 */
        ret = sanitize_hw_resource_range(&core_topology_arr[idx].thread_siblings, 1, 4, 0,
                                         online_logical_cores_cnt);

Shouldn't the last argument be online_logical_cores_cnt - 1 (note the - 1)?

Pal/src/host/Linux-SGX/db_main.c, line 486 at r2 (raw file):

         * #1. From the socket_id of each core, create a range of cores present in each socket.
         * #2. Compare range of cores in each socket core_siblings which basically is the all cores
         * in a physical package (socket).

Please use:

  * #2. Compare array of cores in each socket against the array of core-siblings from
        the core topology.

Pal/src/host/Linux-SGX/db_main.c, line 499 at r2 (raw file):

        log_error("Copying topo_info into the enclave failed");
        return -1;
    }

Why did you remove this shallow copy? This is important! Please restore this part and use temp_topo_info.

Pal/src/host/Linux-SGX/db_main.c, line 512 at r2 (raw file):

                                       size_t online_nodes_cnt, size_t online_logical_cores_cnt,
                                       size_t possible_logical_cores_cnt) {
    int ret = 0;

Please don't init with zero. Instead, keep int ret; and right-before the out label, do ret = 0;

Pal/src/host/Linux-SGX/db_main.c, line 521 at r2 (raw file):

    for (size_t idx = 0; idx < online_nodes_cnt; idx++) {
        ret = sanitize_hw_resource_range(&numa_topology_arr[idx].cpumap, 1,
                                         online_logical_cores_cnt, 0, online_logical_cores_cnt);

-1 in the last argument?

Pal/src/host/Linux-SGX/db_main.c, line 532 at r2 (raw file):

            size_t end = numa_topology_arr[idx].cpumap.ranges_arr[i].end;
            for (size_t j = start; j <= end; j++) {
                size_t index = j / BITS_IN_TYPE(int);

BITS_IN_TYPE(uint32_t)

Pal/src/host/Linux-SGX/db_main.c, line 533 at r2 (raw file):

            for (size_t j = start; j <= end; j++) {
                size_t index = j / BITS_IN_TYPE(int);
                assert(index < cpumask_cnt);

I think this shouldn't be an assert but instead a proper if (index >= cpumask_cnt) { fail }

Pal/src/host/Linux-SGX/db_main.c, line 535 at r2 (raw file):

                assert(index < cpumask_cnt);

                if (bitmap[index] & (1U << (j % BITS_IN_TYPE(int)))) {

BITS_IN_TYPE(uint32_t)

Pal/src/host/Linux-SGX/db_main.c, line 540 at r2 (raw file):

                    goto out_numa;
                }
                bitmap[index] |= 1U << (j % BITS_IN_TYPE(int));

BITS_IN_TYPE(uint32_t)

Pal/src/host/Linux-SGX/db_main.c, line 547 at r2 (raw file):

        size_t distances = numa_topology_arr[idx].distance.resource_cnt;
        if (distances != online_nodes_cnt) {
            log_error("Invalid numa_topology_arr[%zu].distance", idx);

Invalid numa_topology: number of distances is not the same as number of NUMA nodes

Pal/src/host/Linux-SGX/db_main.c, line 554 at r2 (raw file):

    if (total_cores_in_numa != online_logical_cores_cnt) {
        log_error("Invalid numa_topology: more cores in NUMA than online");

more -> not the same number of

Pal/src/host/Linux-SGX/db_main.c, line 558 at r2 (raw file):

        goto out_numa;
    }

Here must be ret = 0;

Pal/src/host/Linux-SGX/db_main.c, line 559 at r2 (raw file):

    }

out_numa:

Rename to simply out:

Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):

    memset(topo_info, 0, sizeof(*topo_info));

    ret = copy_resource_range_to_enclave(&uptr_topo_info->possible_logical_cores,

This is wrong... You're accessing somewhere inside the untrusted uptr_topo_info. You removed the verification-and-copy of this uptr_topo_info, so it may point anywhere (attacker can choose this address!) inside the enclave memory, and everything will break.

Pal/src/host/Linux-SGX/db_main.c, line 740 at r2 (raw file):

    /* Virtual environments such as QEMU may assign each core to a separate socket/package with
     * one or more NUMA nodes. So we check against the number of online logical cores. */
    if (!IS_IN_RANGE_INCL(topo_info->sockets_cnt , 1, online_logical_cores_cnt)) {

Remove space before ,

dimakuv

Reviewable status: all files reviewed, 125 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)

Pal/src/host/Linux-common/topo_info.c, line 181 at r2 (raw file):

    for (size_t i = 0; i < res_info->range_cnt; i++) {
        if (res_info[i].ranges_arr)
            free(res_info[i].ranges_arr);

??? How can this work? Instead of this for loop, it should be simply:

    if (res_info->ranges_arr)
        free(res_info->ranges_arr);

mkow

Reviewable status: all files reviewed, 125 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @vijaydhanraj)

Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):
Migrating the discussion from #106 again:

@mkow:

Why does this multiplier even exists? Wasn't the idea that what host passes should be as simple and normalized as possible? Why not just pass the number of bytes everywhere and just in that int->str function shorten it if it's divisible by some size unit? (for specific rounding rules please check the kernel source)

With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".

And another problem: what if the untrusted host gives you LONG_MAX gigabytes? I think in some places you'll just pass it to the app without erroring out, because you never needed to multiply it, so there was no chance for overflow detection (which will certainly happen in the user app then).

@vijaydhanraj:

Why does this multiplier even exists?

We have a multiplier because we exactly need to know how the normal kernel is representing the strings. For example, on reading /sys/devices/system/cpu/cpu2/cache/index3/size we get 49152K. Now this can be represented as 48M or 49152K and in this case, kernel chose the latter representation, and we need to give the same to the end-user.

With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".

Please see the earlier comment.

And another problem: what if the untrusted host gives you LONG_MAX gigabytes?

Please see here (lines 263-267) https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R263, I do multiply the size qualifier and check the limits when sanitizing so we shouldn't be running into overflow issues in this case. If you see the issue somewhere, please point me to it.

@mkow:

And finally my next reply: (i.e. from this review iteration)

in this case, kernel chose the latter representation

The problem is that it's the untrusted kernel (or actually just host, not necessarily kernel) which chose these. We can't rely on it choosing proper specifiers.

https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R263

This link doesn't work (or at least is not pointing me to any particular place in the PR).

@vijaydhanraj:

The problem is that it's the untrusted kernel (or actually just host, not necessarily kernel) which chose these. We can't rely on it choosing proper specifiers.

yes agree, that is why we take the values passed by the untrusted host and then run a sanitization check on it. This way we can ensure that the untrusted host is not passing garbage or malicious value and maintain the representation as the host.

This link doesn't work (or at least is not pointing me to any particular place in the PR).
Please see here, https://github.com/gramineproject/gramine/pull/106/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R259, lines 259-262

@mkow:

yes agree, that is why we take the values passed by the untrusted host and then run a sanitization check on it. This way we can ensure that the untrusted host is not passing garbage or malicious value and maintain the representation as the host.

But as I said, you aren't sanitizing the unit, e.g. 1000000000K passes the current sanitization if I'm not mistaken.

Please see here, https://github.com/gramineproject/gramine/pull/106/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R259, lines 259-262

Same, this links to the whole PR diff for me. Can you just give me the filename and the line in current revision?

[end of old quotes]

I still think we should normalize the data and just pass a single number instead of number + unit. The unit is untrusted, so you can't say that we need to follow the host kernel - we don't know what the host kernel really emits.

mkow

Reviewed 2 of 21 files at r1, all commit messages.
Reviewable status: all files reviewed, 128 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @vijaydhanraj)

a discussion (no related file):
@vijaydhanraj Could you move the first two commits into a separate PR? They are trivial and we should be able do merge them quickly.

-- commits, line 8 at r2:
direcly -> directly

-- commits, line 25 at r2:
I'd change this description, the whole idea of this change is not just to pass integers instead of strings, but to use more normalized and clean representation of the topology data on the trusted<->untrusted boundary, so that we can reason about its correctness easier.

vijaydhanraj

Reviewable status: all files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

@vijaydhanraj Could you move the first two commits into a separate PR? They are trivial and we should be able do merge them quickly.

Sure @mkow, I will create a new PR with those 2 commits. I will remove those commits from this PR before final rebase. There are lot of discussions here and so don't want to force push.

-- commits, line 6 at r2:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Linux_SGX -> Linux-SGX

Blocked myself. Will do this as part of the final rebase.

-- commits, line 7 at r2:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

...unify -> ...and unify

Blocked myself. Will do this as part of the final rebase.

-- commits, line 8 at r2:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

PAL -> PALs

Blocked myself. Will do this as part of the final rebase.

-- commits, line 8 at r2:

Previously, mkow (Michał Kowalczyk) wrote…

direcly -> directly

Blocked myself. Will do this as part of the final rebase.

-- commits, line 19 at r2:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Actually, I would just remove this sentence and immediately start with 1., 2., 3.

Blocked myself. Will do this as part of the final rebase.

-- commits, line 25 at r2:

Previously, mkow (Michał Kowalczyk) wrote…

I'd change this description, the whole idea of this change is not just to pass integers instead of strings, but to use more normalized and clean representation of the topology data on the trusted<->untrusted boundary, so that we can reason about its correctness easier.

How about the following?

This commit creates a clean representation of topology data on the trusted<->untrusted boundary, so that we can easily verify its correctness. The following are done:
        1. Refactor sysfs code to convert CPU/NUMA information from
           Linux-formatted strings to integers in untrusted PAL.
        2. Copy and sanitize CPU/NUMA information based on untrusted
           integers to trusted PAL.
        3. Convert the trusted CPU/NUMA information from integers back
           to Linux-formatted strings in LibOS.

LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

at the end

Done.

LibOS/shim/include/shim_fs_pseudo.h, line 216 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

We have inconsistency in naming: str and buf_size. I suggest to replace str -> buf.

I think it will be better to change buf_size -> str_size as it seems to fit with the function names. So have changed it this way. This will also align with what we had discussed before regarding the naming of the char buffers.

Let me know if you want to revert it to buf and buf_size.

LibOS/shim/include/shim_fs_pseudo.h, line 220 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

at the end

Done.

LibOS/shim/include/shim_fs_pseudo.h, line 222 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto (buf)

Done, but renamed buf_size -> str_size.

LibOS/shim/include/shim_fs_pseudo.h, line 222 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can we move sep before str and buf_size (to become a second argument, not the last one)?

This is more easy on the eyes: we will have two input arguments first, and then two arguments refering to the buffer.

Done.

LibOS/shim/include/shim_fs_pseudo.h, line 227 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

at the end

Done.

LibOS/shim/include/shim_fs_pseudo.h, line 229 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto (buf)

Done, but renamed buf_size -> str_size.

LibOS/shim/src/fs/proc/info.c, line 129 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can we have an else clause? It is very unexpected that if sysfs is disabled, then there will be no physical id string at all.

Maybe simply ADD_INFO("physical id\t: %zu\n", 0); is good enough for now (i.e., we have only one socket)?

This will be incorrect as most servers have more than 1 socket. We anyway only print a selected few values. So, unless an application is failing, or we are breaking something in Gramine, I wouldn't suggest adding a dummy value.

LibOS/shim/src/fs/sys/cache_info.c, line 35 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do you use your special function here, instead of a simple:
    ret = snprintf(str, sizeof(str), "%zu\n", cache_info->level);
Same for the below cases (coherency line size, number of sets, ...)

This was done for the cache size case where we check and append the size qualifier like K, M, and G and ended up using it for other cases like coherency line size, number of sets as well. Also felt it will be more uniform.

LibOS/shim/src/fs/sys/cache_info.c, line 39 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

\n is typically added to the format string, so please change to this:
    ret = snprintf(str, sizeof(str), "%s\n", "Data");
Same for the below two cases.

Done.

LibOS/shim/src/fs/sys/cache_info.c, line 67 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add an empty line after this, a bit hard to read otherwise

Done.

LibOS/shim/src/fs/sys/cpu_info.c, line 32 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add an empty line after this, a bit hard to read otherwise

Done.

LibOS/shim/src/fs/sys/cpu_info.c, line 52 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do you use your special function here, instead of a simple:
    ret = snprintf(str, sizeof(str), "%zu\n", core_topology->is_logical_core_online);
Same for the below cases (coherency line size, number of sets, ...)

Please see my earlier comment on this.

LibOS/shim/src/fs/sys/cpu_info.c, line 69 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add an empty line after this, a bit hard to read otherwise

Done.

LibOS/shim/src/fs/sys/fs.c, line 20 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add an assert that size_mult is one of NONE, KB, MB, GB.

Good point, but I think it is better to return an error instead of an assert.

LibOS/shim/src/fs/sys/fs.c, line 38 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please combine three lines into one via return condition ? true : false.

I added an additional check as ret can be negative in case of error and so didn't end up combining into a single return. But if you think it is better, I can do something like,

if (ret < 0)
    return ret;

return ((size_t)ret >= str_size) ? -EOVERFLOW : 0

LibOS/shim/src/fs/sys/fs.c, line 46 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

>=

Correct, offset doesn't account for the null terminator.

LibOS/shim/src/fs/sys/fs.c, line 65 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

buf_size -> buf_size - offset (because this is the real "max size of string" argument passed to snprintf)

you are right, fixed it.

LibOS/shim/src/fs/sys/fs.c, line 90 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

int -> uint32_t

Done.

LibOS/shim/src/fs/sys/fs.c, line 93 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

int -> uint32_t

Done.

LibOS/shim/src/fs/sys/fs.c, line 100 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

``>=

Done.

LibOS/shim/src/fs/sys/fs.c, line 119 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

buf_size -> buf_size - offset (because this is the real "max size of string" argument passed to snprintf)

Done.

LibOS/shim/src/fs/sys/node_info.c, line 50 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do you use your special function here, instead of a simple:
    ret = snprintf(str, sizeof(str), "%zu\n", ...);
Same for the below case.

Please see my earlier comment on this.

LibOS/shim/src/fs/sys/node_info.c, line 64 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add an empty line after this, a bit hard to read otherwise

Done.

Pal/include/arch/x86_64/pal_topology.h, line 3 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Nit: you can add your copyright too

Done.

Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Migrating the discussion from #106 again:

@mkow:

Why does this multiplier even exists? Wasn't the idea that what host passes should be as simple and normalized as possible? Why not just pass the number of bytes everywhere and just in that int->str function shorten it if it's divisible by some size unit? (for specific rounding rules please check the kernel source)

With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".

And another problem: what if the untrusted host gives you LONG_MAX gigabytes? I think in some places you'll just pass it to the app without erroring out, because you never needed to multiply it, so there was no chance for overflow detection (which will certainly happen in the user app then).

@vijaydhanraj:

Why does this multiplier even exists?

We have a multiplier because we exactly need to know how the normal kernel is representing the strings. For example, on reading /sys/devices/system/cpu/cpu2/cache/index3/size we get 49152K. Now this can be represented as 48M or 49152K and in this case, kernel chose the latter representation, and we need to give the same to the end-user.

With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".

Please see the earlier comment.

And another problem: what if the untrusted host gives you LONG_MAX gigabytes?

Please see here (lines 263-267) https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R263, I do multiply the size qualifier and check the limits when sanitizing so we shouldn't be running into overflow issues in this case. If you see the issue somewhere, please point me to it.

@mkow:

And finally my next reply: (i.e. from this review iteration)

in this case, kernel chose the latter representation

The problem is that it's the untrusted kernel (or actually just host, not necessarily kernel) which chose these. We can't rely on it choosing proper specifiers.

https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R263

This link doesn't work (or at least is not pointing me to any particular place in the PR).

@vijaydhanraj:

The problem is that it's the untrusted kernel (or actually just host, not necessarily kernel) which chose these. We can't rely on it choosing proper specifiers.

yes agree, that is why we take the values passed by the untrusted host and then run a sanitization check on it. This way we can ensure that the untrusted host is not passing garbage or malicious value and maintain the representation as the host.

This link doesn't work (or at least is not pointing me to any particular place in the PR).
Please see here, https://github.com/gramineproject/gramine/pull/106/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R259, lines 259-262

@mkow:

yes agree, that is why we take the values passed by the untrusted host and then run a sanitization check on it. This way we can ensure that the untrusted host is not passing garbage or malicious value and maintain the representation as the host.

But as I said, you aren't sanitizing the unit, e.g. 1000000000K passes the current sanitization if I'm not mistaken.

Please see here, https://github.com/gramineproject/gramine/pull/106/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R259, lines 259-262

Same, this links to the whole PR diff for me. Can you just give me the filename and the line in current revision?

[end of old quotes]

I still think we should normalize the data and just pass a single number instead of number + unit. The unit is untrusted, so you can't say that we need to follow the host kernel - we don't know what the host kernel really emits.

That is true but why do you say only the unit is untrusted? Even the value is untrusted right?

Also, for cache size, the expectation of the application will be a value followed by unit. By normalizing we are changing this format.

Pal/include/arch/x86_64/pal_topology.h, line 41 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Simply Total number of ... (remove count of, it is redundant)

Done.

Pal/include/arch/x86_64/pal_topology.h, line 45 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Simply Total number of ... (remove count of, it is redundant)

Done.

Pal/include/arch/x86_64/pal_topology.h, line 49 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I would rephrase the first sentence like this:
Array of ranges, with `range_cnt` items.

Done.

Pal/include/arch/x86_64/pal_topology.h, line 51 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You current example looks like a two-dimensional array (array of arrays). But in reality, you have an array of structs. So you should change to:
[ {0, 15}, {21, 21}, {32, 63} ]
Also, did you make a typo? Why do you have 64 instead of 63 in the last item?

yes, it should be 63. Also changed the example.

Pal/include/arch/x86_64/pal_topology.h, line 52 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I would remove the Note sentence, it is redundant (you already give an example of [21, 21] -- no need to explain this representation in words).

I wanted to make this explicit. After a few months, I myself might miss this :). But I can remove it if it is too obvious.

Pal/include/arch/x86_64/pal_topology.h, line 53 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I would rephrase to: The ranges should not overlap.

Done.

Pal/include/arch/x86_64/pal_topology.h, line 66 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Would it be possible to add same-line comments on each of these? With examples. For example, I have no idea what type of the cache could mean.

UPDATE: type must be of type enum cache_type, please fix. Actually, then you can ignore my request for same-line comments, the rest fields are recognizable.

yes, done.

Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):
By uninitialized, I mean some garbage value. We don't read this from the host as the file is not present. Also, when the application queries for this path, we return -ENOENT from LibOS.

$ cat /sys/devices/system/cpu/cpu0/online                                                                                 
cat: /sys/devices/system/cpu/cpu0/online: No such file or directory

reasonable to always set is_logical_core_online = 1

Given this why set is_logical_core_online to 1?

I put this as a reminder comment, but if it is confusing, we can remove it.

Pal/include/arch/x86_64/pal_topology.h, line 99 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

How is this sockets_cnt different from online_nodes.resource_cnt? Are you sure you still need this sockets_cnt variable for anything?

I think you are getting confused with sockets and NUMA nodes. A socket is a physical package, this is dictated by the hardware/VMM. A NUMA node is a software abstraction to split resources. We can have n NUMA nodes within a socket. See this, https://www.open-mpi.org/projects/hwloc/lstopo/images/2XeonSPv2+msc.v2.1.png, where we have 2 sockets and each socket, has 2 NUMA nodes.

yes, I use sockets_cnt during socket sanitization.

Pal/include/arch/x86_64/pal_topology.h, line 101 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Is this physical_cores_per_socket variable still relevant? You have all this info in online_nodes.ranges_arr. Please check if this variable is really needed.

They are different resources. Please see my earlier comment.

Pal/src/host/Linux-common/topo_info.c, line 20 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Mentioning retval in this sentence is misleading, it makes an impression that reval may be passed as NULL, but this is not true. In other words, retval must always be passed to this function. So please remove retval mention from this sentence (keep only size_mult).

Done.

Pal/src/host/Linux-common/topo_info.c, line 27 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Instead of the if check, simply assert(retval) -- this is a local function anyway, so we can make assumptions how it is called.

Done.

Pal/src/host/Linux-common/topo_info.c, line 32 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You can remove the last argument (0), it will still work.

Replaced open+read+close with read_file_buffer().

Pal/src/host/Linux-common/topo_info.c, line 36 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Instead of 64, cannot we use one of your FILESZ macros?

Done.

Pal/src/host/Linux-common/topo_info.c, line 40 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Isn't this whole open+read+close sequence exactly the same as what read_file_buffer() function does? Please replace this code with the call to this function.

Done.

Pal/src/host/Linux-common/topo_info.c, line 73 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

HW resources, ranges -> HW resources and their ranges

Done.

Pal/src/host/Linux-common/topo_info.c, line 73 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

...stored in `res_info`.

Done.

Pal/src/host/Linux-common/topo_info.c, line 75 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Add a dot at the end of this sentence.

Done.

Pal/src/host/Linux-common/topo_info.c, line 81 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Instead of the if check, simply assert(res_info) -- this is a local function anyway, so we can make assumptions how it is called.

Done.

Pal/src/host/Linux-common/topo_info.c, line 88 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You can remove the last argument (0), it will still work.

Replaced open+read+close with read_file_buffer().

Pal/src/host/Linux-common/topo_info.c, line 92 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Instead of 64, cannot we use one of your FILESZ macros?

Done.

Pal/src/host/Linux-common/topo_info.c, line 96 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Isn't this whole open+read+close sequence exactly the same as what read_file_buffer() function does? Please replace this code with the call to this function.

Done.

Pal/src/host/Linux-common/topo_info.c, line 101 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why do you need a helper variable? Simply use res_info->resource_cnt instead.

A few of the lines exceeded 100 chars and I needed to wrap them. So added this helper variable.

Removed it.

Pal/src/host/Linux-common/topo_info.c, line 119 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Shouldn't you add also *end == '\t'?

Unless we encounter files with \t, it is better not to add it. So far, I haven't seen the need.

Pal/src/host/Linux-common/topo_info.c, line 120 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This count as one more comment doesn't make much sense now. I suggest to simply remove this part of the comment.

Done.

Pal/src/host/Linux-common/topo_info.c, line 125 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This count how many ... comment doesn't make much sense now. I suggest to simply remove this part of the comment.

Done.

Pal/src/host/Linux-common/topo_info.c, line 142 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

end_val - start_val -> diff

Done.

Pal/src/host/Linux-common/topo_info.c, line 142 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

The comment is weird. Why are there two examples? I would simply say /* +1 because of inclusive range */

Done.

Pal/src/host/Linux-common/topo_info.c, line 150 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add an empty line after this line, easier to read

Done.

Pal/src/host/Linux-common/topo_info.c, line 151 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Simply Realloc the array of ranges (expand by one range)

Done.

Pal/src/host/Linux-common/topo_info.c, line 178 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You get to this label only if something failed. Please rename it to fail:.

Done.

Pal/src/host/Linux-common/topo_info.c, line 181 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

??? How can this work? Instead of this for loop, it should be simply:
    if (res_info->ranges_arr)
        free(res_info->ranges_arr);

oops my bad, yes fixed it.

Pal/src/host/Linux-common/topo_info.c, line 183 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I would remove this comment, it is obvious

Done.

Pal/src/host/Linux-common/topo_info.c, line 218 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Instead of the if check, simply assert(cache_indices_cnt) -- this is a local function anyway, so we can make assumptions how it is called.

Done.

Pal/src/host/Linux-common/topo_info.c, line 220 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Instead of 1024, cannot we use one of your FILESZ macros?

That would be too little to store the alllinux_dirent structs. Since this is a small function, felt it was fine to hardcode.

Pal/src/host/Linux-common/topo_info.c, line 221 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please don't use this pattern of assigning 0 to ret during init. Instead, add an explicit ret = 0; statement before the out: label.

OK, done.

Pal/src/host/Linux-common/topo_info.c, line 249 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add an empty line

Done.

Pal/src/host/Linux-common/topo_info.c, line 250 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add ret = 0;

Done.

Pal/src/host/Linux-common/topo_info.c, line 266 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Instead of 128, cannot we use one of your FILESZ macros?

PAL_SYSFS_BUF_FILESZ is only 64, but here we end up with strings with larger lengths. We could use PAL_SYSFS_MAP_FILESZ but the name is confusing in this context.

We can either leave it or put some other macro length of the filename.

Pal/src/host/Linux-common/topo_info.c, line 270 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This is hard to read. You have the part /sys/devices/system/cpu/cpu%zu/cache/index%zu/ always the same -- please move it to a helper variable, and use the helper variable everywhere, like this:
char dirname[128];
snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu/cache/index%zu", core_idx, cache_idx);

snrpintf(filename, sizeof(filename), "%s/shared_cpu_list", dirname);
snrpintf(filename, sizeof(filename), "%s/level", dirname);
... and so on ...

Done.

Pal/src/host/Linux-common/topo_info.c, line 285 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why ARRAY_SIZE? Simply sizeof()

yes used sizeof().

Pal/src/host/Linux-common/topo_info.c, line 336 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

There is no reason to have this specific name, please rename to simply out:

Let us rename it to fail to be consistent and it also makes more sense as these are failure cases.

Pal/src/host/Linux-common/topo_info.c, line 363 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

offling -> offlining

Done.

Pal/src/host/Linux-common/topo_info.c, line 377 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Instead of 128, cannot we use one of your FILESZ macros?

OK, created a macro PAL_SYSFS_PATH_LEN. Let me know if you have other suggestions or revert it back to using hardcoded value.

Pal/src/host/Linux-common/topo_info.c, line 381 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This is hard to read. You have the part /sys/devices/system/cpu/cpu%zu/ always the same -- please move it to a helper variable, and use the helper variable everywhere, like this:
char dirname[128];
snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu", idx);

snrpintf(filename, sizeof(filename), "%s/online", dirname);
snrpintf(filename, sizeof(filename), "%s/topology/core_id", dirname);
... and so on ...

Done.

Pal/src/host/Linux-common/topo_info.c, line 386 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add else { core_topology_arr[idx].is_logical_core_online = 1; }

Please see my other comment on this. I don't think we should even set this core_topology_arr[0].is_logical_core_online parameter as any access to it from LibOS will result in failure.

Pal/src/host/Linux-common/topo_info.c, line 425 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

As I mentioned previously, I think you don't need these two variables anymore (sockets_cnt and physical_cores_per_socket).

sockets and NUMA nodes are different hardware resources. Please see my other comment.

Pal/src/host/Linux-SGX/db_main.c, line 126 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add a comment that this function and all below functions do not care about correctly freeing memory.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

To access untrusted base-type variables, you should always use READ_ONCE(). In this case:
size_t range_cnt = READ_ONCE(untrusted_src->range_cnt);

Done, but I didn't understand the need to use READ_ONCE. Can you please clarify why we need to use READ_ONCE()?

I see some examples in this file where we don't use READ_ONCE() like https://github.com/gramineproject/gramine/blob/master/Pal/src/host/Linux-SGX/db_main.c#L721 and
https://github.com/gramineproject/gramine/blob/master/Pal/src/host/Linux-SGX/db_main.c#L735

Pal/src/host/Linux-SGX/db_main.c, line 135 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

READ_ONCE(untrusted_src->ranges_arr)

(For examples, you can check enclave_ocalls.c)

Done.

Pal/src/host/Linux-SGX/db_main.c, line 143 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

READ_ONCE

Done.

Pal/src/host/Linux-SGX/db_main.c, line 158 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Here and everywhere: READ_ONCE()

Done.

Pal/src/host/Linux-SGX/db_main.c, line 165 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

core_topo_arr, please fix all such issues in your logs

Done.

Pal/src/host/Linux-SGX/db_main.c, line 185 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can you introduce a helper variable, otherwise it's hard to read. Like this:
struct pal_core_cache_info* untrusted_cache_info = READ_ONCE(&untrusted_src[idx].cache_info_arr[lvl]);

cache_info_arr[lvl].type = READ_ONCE(untrusted_cache_info->type);
...

Done, expect for the initial address assignment which throws an error.

../Pal/src/host/Linux-SGX/db_main.c:190:17: error: cannot take the address of an rvalue of type 'struct pal_core_cache_info *'
                READ_ONCE(&untrusted_src[idx].cache_info_arr[lvl]);
                ^         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../common/include/api.h:295:69: note: expanded from macro 'READ_ONCE'
#define READ_ONCE(x) ({ __typeof__(x) y = *(volatile __typeof__(x)*)&(x); y;})

Pal/src/host/Linux-SGX/db_main.c, line 206 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add an empty line before this one.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 208 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I don't like that this generic function suddenly updates a global variable -- this is an unexpected side-effect of this function. Instead, please add a new function argument which will be a pointer to the topo_info object, and pass &g_pal_public_state.topo_info as this argument in the caller.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 222 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

READ_ONCE()

Done.

Pal/src/host/Linux-SGX/db_main.c, line 228 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

numa_topo_arr

Done.

Pal/src/host/Linux-SGX/db_main.c, line 239 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I don't like that this generic function suddenly updates a global variable -- this is an unexpected side-effect of this function. Instead, please add a new function argument which will be a pointer to the topo_info object, and pass &g_pal_public_state.topo_info as this argument in the caller.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 247 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can you add logic to verify that resource_cnt is correct and corresponds to ranges in res_info->ranges_arr? I mean, you can do a for loop over res_info->ranges_arr[idx] and sum up the range diffs. This will add protection against malicious resource_cnt.

yes, done.

Pal/src/host/Linux-SGX/db_main.c, line 261 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Invalid number of ranges

Isn't range_cnt interpreted as range count and num_ranges interpreted as number of ranges? And we decided to avoid the latter?

If so, I would prefer the existing comment as it aligns with what we intended.

Pal/src/host/Linux-SGX/db_main.c, line 277 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Invalid start of range

Done.

Pal/src/host/Linux-SGX/db_main.c, line 282 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Invalid end of range

Done.

Pal/src/host/Linux-SGX/db_main.c, line 289 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Malformed range -> Overlapping ranges

Done.

Pal/src/host/Linux-SGX/db_main.c, line 315 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please introduce a macro MAX_HYPERTHREADS_PER_CORE (or something like this) in one of the header files instead of hard-coding 2 here.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 321 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can you add a comment with an example of shared_cpu_map? I'm confused how this can be in range 1..2 for the case of DATA cache...

The data cache isn't shared with other cores, so we end-up with just the core/and its sibling if HT is enabled.
For example,

$ cat /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map                                                         
00000000,00000002

What comment will help here?

Pal/src/host/Linux-SGX/db_main.c, line 328 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

What's the point in this level helper variable?

Removed it.

Pal/src/host/Linux-SGX/db_main.c, line 329 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please introduce a macro MAX_CACHE_LEVELS (or something like this) in one of the header files instead of hard-coding 3 here.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 337 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why not remove this big IF and just add another else {return -1} below?

Sorry I didn't get your comment. I want to explicitly make sure the size_multiplier is one of the accepted values. Anything else is returned as an error.

Pal/src/host/Linux-SGX/db_main.c, line 354 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

What's the point in this helper variable?

Removed it.

Pal/src/host/Linux-SGX/db_main.c, line 358 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

What's the point in this helper variable?

Removed it.

Pal/src/host/Linux-SGX/db_main.c, line 362 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

What's the point in this helper variable?

Removed it.

Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):
The inputs to this function are socket_info_arr which is empty at first and gets populated as we identify socket<->core mapping, socket_id, the socket where the core is present and core_id is the core. From this, we create a socket map of type struct pal_res_range_info where range_cnt corresponds to each socket and ranges_arr stores the range of cores present in the socket.

So, it does create a socket map from the socket_id and core_id.

I don't know how many items are in the socket_info_arr

Yes, because we build this on the fly with socket_id and core_id

I did have this logic as part of the caller, but it looked very clumsy and so moved it as a separate function.

Does the below comment help?

/* Create a socket to core mapping from `socket_id` and `core_id` and store the info as part of
 * `socket_info_arr` */

Pal/src/host/Linux-SGX/db_main.c, line 402 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I suggest the comment:

For each socket, cross-verify that its set of cores is the same as the core topology's core-siblings:
  - Pick the first core in the socket
  - Find its core-siblings in the core topology
  - Verify that the array of cores in the socket object is exactly the same as core-siblings in the core topology object

Then you can remove the other comment (pick the first ...)

Done.

Pal/src/host/Linux-SGX/db_main.c, line 409 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You have double-space here

Done.

Pal/src/host/Linux-SGX/db_main.c, line 410 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Instead of core_sibling_cnt, I suggest to use:

    struct pal_res_range_info* core_siblings = &core_topology_arr[core_in_socket].core_siblings;

    /* and then use this helper variable */
    if (core_siblings->range_cnt != socket_info_arr[idx].range_cnt) { ...

Done.

Pal/src/host/Linux-SGX/db_main.c, line 433 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Shouldn't you check sockets_cnt as well?

I wanted to remove this as it is redundant. We do check these in the caller itself.

Pal/src/host/Linux-SGX/db_main.c, line 449 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

What's the point in this helper variable? Please remove.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 456 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Shouldn't the last argument be online_logical_cores_cnt - 1 (note the - 1)?

No, this is resource count. For example, if range was 0-63, the count is 64 and we check against this value.

Pal/src/host/Linux-SGX/db_main.c, line 463 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

In some other place you said that the max SMT siblings are 2 :)

Please introduce a macro MAX_HYPERTHREADS_PER_CORE (or something like this) in one of the header files instead of hard-coding 4 here.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 464 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Shouldn't the last argument be online_logical_cores_cnt - 1 (note the - 1)?

Please see my earlier comment on this.

Pal/src/host/Linux-SGX/db_main.c, line 486 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please use:

  * #2. Compare array of cores in each socket against the array of core-siblings from
        the core topology.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 499 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Why did you remove this shallow copy? This is important! Please restore this part and use temp_topo_info.

Now that we have added READ_ONCE() do we still want to restore it? If so, can you please explain why?

Pal/src/host/Linux-SGX/db_main.c, line 512 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please don't init with zero. Instead, keep int ret; and right-before the out label, do ret = 0;

Done.

Pal/src/host/Linux-SGX/db_main.c, line 521 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

-1 in the last argument?

This is count so -1 will be incorrect check. Please see my earlier comment on this.

Pal/src/host/Linux-SGX/db_main.c, line 532 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

BITS_IN_TYPE(uint32_t)

Done.

Pal/src/host/Linux-SGX/db_main.c, line 533 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I think this shouldn't be an assert but instead a proper if (index >= cpumask_cnt) { fail }

Done

Pal/src/host/Linux-SGX/db_main.c, line 535 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

BITS_IN_TYPE(uint32_t)

Done.

Pal/src/host/Linux-SGX/db_main.c, line 540 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

BITS_IN_TYPE(uint32_t)

Done.

Pal/src/host/Linux-SGX/db_main.c, line 547 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Invalid numa_topology: number of distances is not the same as number of NUMA nodes

Rephrased as "Distance count is not same as the NUMA nodes count

Pal/src/host/Linux-SGX/db_main.c, line 554 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

more -> not the same number of

Rephrased as Invalid numa_topology: Mismatch between NUMA cores and online cores count

Pal/src/host/Linux-SGX/db_main.c, line 558 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Here must be ret = 0;

Done.

Pal/src/host/Linux-SGX/db_main.c, line 559 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Rename to simply out:

Renamed to fail as in other places.

Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):
Instead of copying the whole structure, I am copying each member and verifying it. Even previously with temp_topo_info we just did a shallow copy but no verification was done. Then in this function we made a deep copy and sanitization.

You removed the verification-and-copy of this uptr_topo_info,

Am I missing something? Can you please clarify?

Pal/src/host/Linux-SGX/db_main.c, line 740 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Remove space before ,

Done.

vijaydhanraj

Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

a discussion (no related file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Sure @mkow, I will create a new PR with those 2 commits. I will remove those commits from this PR before final rebase. There are lot of discussions here and so don't want to force push.

Pulled out the first 2 commits and created #360

vijaydhanraj

Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

-- commits, line 6 at r2:

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Blocked myself. Will do this as part of the final rebase.

Done as part of PR 360

-- commits, line 7 at r2:

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Blocked myself. Will do this as part of the final rebase.

Done as part of PR 360

-- commits, line 8 at r2:

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Blocked myself. Will do this as part of the final rebase.

Done as part of PR 360

-- commits, line 8 at r2:

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Blocked myself. Will do this as part of the final rebase.

Done as part of PR 360

mkow · 2022-01-25T22:47:14Z

Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):

why do you say only the unit is untrusted? Even the value is untrusted right?

By unit I mean "in which unit we'll display this value to the user". And the unstrusted host may select some impossible combinations, like the ones I already listed in this very conversation.

Also, for cache size, the expectation of the application will be a value followed by unit. By normalizing we are changing this format.

I don't want you to change the format in sysfs for the app, but to change the format on the untrusted interface.

vijaydhanraj

Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):

But as I said, you aren't sanitizing the unit, e.g. 1000000000K passes the current sanitization if I'm not mistaken

So, this is the problem. And to address this you are suggesting removing the size multiplier and keeping the value. For example, if the value was 4K you want to store as 4096in the untrusted PAL. Copy to it trusted PAL and run sanitization on this and when application queries this value, LibOS will change 4096 to 4K format.

Is my understanding correct?

mkow · 2022-01-26T00:35:02Z

Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

But as I said, you aren't sanitizing the unit, e.g. 1000000000K passes the current sanitization if I'm not mistaken

So, this is the problem. And to address this you are suggesting removing the size multiplier and keeping the value. For example, if the value was 4K you want to store as 4096in the untrusted PAL. Copy to it trusted PAL and run sanitization on this and when application queries this value, LibOS will change 4096 to 4K format.

Is my understanding correct?

Yes, exactly.

vijaydhanraj

Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Yes, exactly.

Got it. So currently I store 4 and K that I read from untrusted PAL. But in the trusted PAL, I first check if the multiplier is such K is valid. Then I multiply the value which is 4 times K and then check we don't end with huge numbers like as you have pointed in your example. If the multiplied value (4096) is within range, I keep the value and the size multiplier else I throw an error.

Please see the below logic

        size_t multiplier = 1;
        if (cache_info_arr[lvl].size_multiplier == MULTIPLIER_KB)
            multiplier = 1024;
        else if (cache_info_arr[lvl].size_multiplier == MULTIPLIER_MB)
            multiplier = 1024 * 1024;
        else if (cache_info_arr[lvl].size_multiplier == MULTIPLIER_GB)
            multiplier = 1024 * 1024 * 1024;

        size_t cache_size;
        if (__builtin_mul_overflow(cache_info_arr[lvl].size, multiplier, &cache_size))
            return -1;

        if (!IS_IN_RANGE_INCL(cache_size, 1, 1 << 30))
            return -1;

mkow · 2022-01-26T01:05:38Z

Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):

Got it. So currently I store 4 and K [...]

Yes, you currently do it differently, that's why we have this whole discussion. What's your point/question?

vijaydhanraj

Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)

Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Got it. So currently I store 4 and K [...]

Yes, you currently do it differently, that's why we have this whole discussion. What's your point/question?

It turns out that Linux kernel always displays cache size in terms of KB. This makes things simpler for us and we can remove the size qualifier. As suggested by @mkow, cache size is sanitized after multiplying with 1024 and when the application requests, LibOS converts the value to xK format.

Please see here, https://elixir.bootlin.com/linux/v5.13/source/drivers/base/cacheinfo.c#L380
and https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu

size:
			the total cache size in kB

dimakuv

Reviewed 8 of 8 files at r4, all commit messages.
Reviewable status: all files reviewed, 29 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)

LibOS/shim/include/shim_fs_pseudo.h, line 216 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I think it will be better to change buf_size -> str_size as it seems to fit with the function names. So have changed it this way. This will also align with what we had discussed before regarding the naming of the char buffers.

Let me know if you want to revert it to buf and buf_size.

Ok, I'm fine with str_size of course. Just wanted consistency in naming.

LibOS/shim/src/fs/proc/info.c, line 129 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

This will be incorrect as most servers have more than 1 socket. We anyway only print a selected few values. So, unless an application is failing, or we are breaking something in Gramine, I wouldn't suggest adding a dummy value.

OK, I see your point. Hopefully we'll remove this enable_sysfs_topology soon anyway.

Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

It turns out that Linux kernel always displays cache size in terms of KB. This makes things simpler for us and we can remove the size qualifier. As suggested by @mkow, cache size is sanitized after multiplying with 1024 and when the application requests, LibOS converts the value to xK format.

Please see here, https://elixir.bootlin.com/linux/v5.13/source/drivers/base/cacheinfo.c#L380
and https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu
size:
			the total cache size in kB

LGTM, I like it, we now removed a lot of code.

Pal/include/arch/x86_64/pal_topology.h, line 52 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I wanted to make this explicit. After a few months, I myself might miss this :). But I can remove it if it is too obvious.

OK, let's keep it if it helps developers.

Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

By uninitialized, I mean some garbage value. We don't read this from the host as the file is not present. Also, when the application queries for this path, we return -ENOENT from LibOS.
$ cat /sys/devices/system/cpu/cpu0/online                                                                                 
cat: /sys/devices/system/cpu/cpu0/online: No such file or directory
reasonable to always set is_logical_core_online = 1

Given this why set is_logical_core_online to 1?

I put this as a reminder comment, but if it is confusing, we can remove it.

Maybe rephrase to a more verbose:

/* Core 0 is always online (and thus there is no file `/sys/devices/system/cpu/cpu0/online`),
 * so we leave is_logical_core_online for core 0 uninitialized */

What I constantly forget is that cpu0/online file doesn't exist, and I think this should be mentioned in this comment.

Pal/include/arch/x86_64/pal_topology.h, line 99 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I think you are getting confused with sockets and NUMA nodes. A socket is a physical package, this is dictated by the hardware/VMM. A NUMA node is a software abstraction to split resources. We can have n NUMA nodes within a socket. See this, https://www.open-mpi.org/projects/hwloc/lstopo/images/2XeonSPv2+msc.v2.1.png, where we have 2 sockets and each socket, has 2 NUMA nodes.

yes, I use sockets_cnt during socket sanitization.

OK, thanks. Yes, I was confusing sockets to be the same as NUMA nodes. Thanks for the clarification.

Pal/src/host/Linux-common/topo_info.c, line 101 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

A few of the lines exceeded 100 chars and I needed to wrap them. So added this helper variable.

Removed it.

OK, I understand the reasoning (I guess in the old code it was more useful). The new version looks better to me.

Pal/src/host/Linux-common/topo_info.c, line 119 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Unless we encounter files with \t, it is better not to add it. So far, I haven't seen the need.

But you have a check for \t 20 lines above (see my other comment).

Pal/src/host/Linux-common/topo_info.c, line 266 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

PAL_SYSFS_BUF_FILESZ is only 64, but here we end up with strings with larger lengths. We could use PAL_SYSFS_MAP_FILESZ but the name is confusing in this context.

We can either leave it or put some other macro length of the filename.

LGTM, you added a new macro.

Pal/src/host/Linux-common/topo_info.c, line 336 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Let us rename it to fail to be consistent and it also makes more sense as these are failure cases.

OK, yes, this is better.

Pal/src/host/Linux-common/topo_info.c, line 377 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

OK, created a macro PAL_SYSFS_PATH_LEN. Let me know if you have other suggestions or revert it back to using hardcoded value.

OK, this is good.

Pal/src/host/Linux-common/topo_info.c, line 44 at r4 (raw file):

        ret = __builtin_mul_overflow(val, 1024, &val);
        if (ret < 0)
            return -EINVAL;

__builtin_mul_overflow returns a boolean: true if overflow is detected and false if everything is fine.

So this should be:

    if (__builtin_mul_overflow(val, 1024, &val))
        return -EOVERFLOW;

Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):

        /* Ensure the size hasn't exceed max threshold after multiplying by size qualifier */
        if (val > INT_MAX)
            return -EINVAL;

Actually, isn't -EOVERFLOW is a better error here?

Pal/src/host/Linux-common/topo_info.c, line 77 at r4 (raw file):

    char* ptr = str;
    while (*ptr) {
        while (*ptr == ' ' || *ptr == '\t' || *ptr == ',')

FYI: Here you have check for \t, but in the below comment you say that you haven't encountered any files that use \t. Delete from here then maybe?

Pal/src/host/Linux-common/topo_info.c, line 112 at r4 (raw file):

            if (__builtin_add_overflow(res_info->resource_cnt, diff, &total_cnt) ||
                                       total_cnt >= INT_MAX) {
                ret = -EINVAL;

Actually, isn't -EOVERFLOW is a better error here?

Pal/src/host/Linux-common/topo_info.c, line 163 at r4 (raw file):

}

ssize_t read_file_buffer(const char* filename, char* buf, size_t count) {

Can you move this function to the very top of this file, now that you use it in other functions as well?

Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Done, but I didn't understand the need to use READ_ONCE. Can you please clarify why we need to use READ_ONCE()?

I see some examples in this file where we don't use READ_ONCE() like https://github.com/gramineproject/gramine/blob/master/Pal/src/host/Linux-SGX/db_main.c#L721 and
https://github.com/gramineproject/gramine/blob/master/Pal/src/host/Linux-SGX/db_main.c#L735

Oops, these are bugs. Actually, we are supposed to use READ_ONCE() on every read access of the untrusted variable.

The idea is to prevent the compiler from doing optimizations that break security guarantees. When you do a simple myvar = untrusted_var;, the compiler can do all sorts of weird stuff: the compiler may decide to read untrusted_var two or more times instead of just once. The compiler may say "This developer added a useless myvar alias to the perfectly good untrusted_var, the developer is stupid, let me just replace myvar with untrusted_var everywhere in the code".

With READ_ONCE(), we prevent the compiler from playing smart and accidentally accessing the untrusted variables many times. Also recall that if an untrusted variable is accessed two times, then the attacker can replace the untrusted-variable value in-between the two accesses, and suddenly one part of your code has value 1 and another part of your code has value 2 of the same variable -- and your program glitches, fails, or exposes a vulnerability. Adding READ_ONCE() is a simple way for the developer to be sure that untrusted_var is first copied into myvar, and then the stored in secure memory and never maliciously changing myvar is used.

P.S. Feel free to fix those bugs you found in a separate PR.

Pal/src/host/Linux-SGX/db_main.c, line 261 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Isn't range_cnt interpreted as range count and num_ranges interpreted as number of ranges? And we decided to avoid the latter?

If so, I would prefer the existing comment as it aligns with what we intended.

OK

Pal/src/host/Linux-SGX/db_main.c, line 321 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

The data cache isn't shared with other cores, so we end-up with just the core/and its sibling if HT is enabled.
For example,
$ cat /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map                                                         
00000000,00000002
What comment will help here?

Thanks, this example helps. Maybe this:

/* recall that `shared_cpu_map` shows this core + its siblings (if HT is enabled), for example:
 *     /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map: 00000000,00000002 */

Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):
No, this doesn't help me much. I still highly dislike the function.

I did have this logic as part of the caller, but it looked very clumsy and so moved it as a separate function.

I think the clumsiness stems from the fact that you have the realloc logic here (it is like 2/3 of this function). Maybe introduce a helper func realloc_from_size(void* buf, size_t old_size, size_t new_size)? And then this function will become like 6 lines in total, and then you can remove this function and inline these 6 lines of code into the caller.

Also, I think you have the same realloc logic somewhere else, so you could even make this realloc_from_size() function available as a common function, and use it in both cases.

Pal/src/host/Linux-SGX/db_main.c, line 499 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Now that we have added READ_ONCE() do we still want to restore it? If so, can you please explain why?

See my other comment.

Pal/src/host/Linux-SGX/db_main.c, line 559 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Renamed to fail as in other places.

But this is not a fail. You get to this label even in good cases. So it should be out.

Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Instead of copying the whole structure, I am copying each member and verifying it. Even previously with temp_topo_info we just did a shallow copy but no verification was done. Then in this function we made a deep copy and sanitization.

You removed the verification-and-copy of this uptr_topo_info,

Am I missing something? Can you please clarify?

At this point, you're accessing the untrusted pointer value (uptr_topo_info-> is transformed into asm mov (<untrusted mem>), %rdx). The pointer value that is stored at this address <untrusted mem> may contain an address in enclave memory, so you end up with giving a malicious first argument to copy_resource_range_to_enclave() -- this first argument is a pointer to some arbitrary enclave-memory address (specified by the attacker).

You removed the if (!sgx_copy_to_enclave(&temp_topo_info, sizeof(temp_topo_info), uptr_topo_info, sizeof(*uptr_topo_info))) operation, which is important here. This operation doesn't just perform a copy, it also performs a check on the untrusted pointer uptr_topo_info -- this operation verifies that the untrusted pointer does not point into the enclave memory. See this:

gramine/Pal/src/host/Linux-SGX/enclave_framework.c

Lines 139 to 140 in 32d2778

    
           !sgx_is_completely_outside_enclave(uptr, usize) || 
        
           !sgx_is_completely_within_enclave(ptr, usize)) {

Pal/src/host/Linux-SGX/db_main.c, line 126 at r4 (raw file):

}

/* This function and all the below functions doesn't clean up resources on failure.

doesn't -> don't

Pal/src/host/Linux-SGX/db_main.c, line 153 at r4 (raw file):

                                         size_t online_logical_cores_cnt,
                                         size_t cache_indices_cnt,
                                         struct pal_core_topo_info** out_core_topology_arr) {

Could you please add an assert:

assert(out_core_topology_arr);

Pal/src/host/Linux-SGX/db_main.c, line 160 at r4 (raw file):

        return -1;
    }

Now that I look at all these READ_ONCE() macros you added, I think that it will be just simpler to perform a shallow copy of *untrusted_src into a temporary variable and use this variable (and you will get rid of READ_ONCE because now you will be accessing a variable stored inside the enclave).

I definitely want this shallow copy-to-enclave implemented in this function. I'm not sure about the above copy_resource_range_to_enclave() -- that one is pretty trivial and doesn't really merit a temporary variable. But your choice, it makes everyone feel safer to use shallow copies :)

Pal/src/host/Linux-SGX/db_main.c, line 167 at r4 (raw file):

        core_topo_arr[idx].socket_id = READ_ONCE(untrusted_src[idx].socket_id);

        int ret = copy_resource_range_to_enclave(&untrusted_src[idx].core_siblings,

You're accessing the untrusted var here as well, you're supposed to use READ_ONCE().

Exactly because of such hard-to-reason-about cases, I suggest to do a shallow copy and not bother about these.

Pal/src/host/Linux-SGX/db_main.c, line 174 at r4 (raw file):

        }

        ret = copy_resource_range_to_enclave(&untrusted_src[idx].thread_siblings,

ditto

Pal/src/host/Linux-SGX/db_main.c, line 191 at r4 (raw file):

        for (size_t lvl = 0; lvl < cache_indices_cnt; lvl++) {
            struct pal_core_cache_info* untrusted_cache_info =
                &untrusted_src[idx].cache_info_arr[lvl];

This also merits a shallow copy.

Pal/src/host/Linux-SGX/db_main.c, line 221 at r4 (raw file):

static int sgx_copy_numa_topo_to_enclave(struct pal_numa_topo_info* untrusted_src,
                                         size_t online_nodes_cnt,
                                         struct pal_numa_topo_info** out_numa_topology_arr) {

ditto (please add an assert)

Pal/src/host/Linux-SGX/db_main.c, line 234 at r4 (raw file):

            READ_ONCE(untrusted_src[idx].nr_hugepages[HUGEPAGES_1G]);

        int ret = copy_resource_range_to_enclave(&untrusted_src[idx].cpumap,

READ_ONCE() or do a shallow copy of untrusted_src[idx]

Pal/src/host/Linux-SGX/db_main.c, line 241 at r4 (raw file):

        }

        ret = copy_resource_range_to_enclave(&untrusted_src[idx].distance,

ditto

Pal/src/host/Linux-SGX/db_main.c, line 286 at r4 (raw file):

        size_t end = res_info->ranges_arr[i].end;

        resource_cnt_from_ranges += end - start + 1;

I would move this line after the two checks -- otherwise end < start, and you'll get an integer overflow. But after the two checks, you are sure that end and start are good, and you can safely do this arithmetics.

Pal/src/host/Linux-SGX/db_main.c, line 401 at r4 (raw file):

 * - Find its core-siblings in the core topology.
 * - Verify that the cores in the socket info array is exactly the same as core-siblings
 *   present in core topology array.

This sentence is very hard to read. Maybe add " like this (if I understand correctly what you want to say here):

 * - Verify that the "cores in the socket info" array is exactly the same as the "core-siblings
 *   present in core topology" array.

Pal/src/host/Linux-SGX/db_main.c, line 418 at r4 (raw file):

        struct pal_range_info* core_sibling_ranges_arr =
            core_topology_arr[core_in_socket].core_siblings.ranges_arr;

Now you don't really need this helper variable core_sibling_ranges_arr.

Instead you could use smth like core_siblings->ranges_arr[j].start and core_siblings->ranges_arr[j].end.

Or are these too long and don't fit into 100 chars? In this case, you can at least change:

        struct pal_range_info* core_sibling_ranges_arr = core_siblings->ranges_arr;

Pal/src/host/Linux-SGX/db_main.c, line 532 at r4 (raw file):

                size_t index = j / BITS_IN_TYPE(uint32_t);
                if (index >= cpumask_cnt) {
                    log_error("Invalid numa topology: cpumap range, %zu - %zu", start, j);

I don't understand this message at all. Maybe simply ..."Invalid numa topology: Core %zu is beyond CPU mask limit", j);

mkow

Reviewed 12 of 21 files at r1, 1 of 8 files at r3, 7 of 8 files at r4, all commit messages.
Reviewable status: all files reviewed, 64 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

-- commits, line 25 at r2:

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

How about the following?

This commit creates a clean representation of topology data on the trusted<->untrusted boundary, so that we can easily verify its correctness. The following are done:
        1. Refactor sysfs code to convert CPU/NUMA information from
           Linux-formatted strings to integers in untrusted PAL.
        2. Copy and sanitize CPU/NUMA information based on untrusted
           integers to trusted PAL.
        3. Convert the trusted CPU/NUMA information from integers back
           to Linux-formatted strings in LibOS.

Looks better.

LibOS/shim/include/shim_fs_pseudo.h, line 222 at r4 (raw file):

 * Example output for 64 cores in total and ranges 0-15,48-55: "00ff0000,0000ffff".
 * Note: This function adds a newline at the end of the string. */
int sys_convert_ranges_to_cpu_bitmap_str(const struct pal_res_range_info* resource_range_info, char* str,

please fix wrapping

LibOS/shim/src/fs/proc/info.c, line 129 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

OK, I see your point. Hopefully we'll remove this enable_sysfs_topology soon anyway.

I also prefer skipping it than printing wrong value.

LibOS/shim/src/fs/sys/cache_info.c, line 45 at r4 (raw file):

                break;
            case CACHE_TYPE_UNIFIED:
                ret = snprintf(str, sizeof(str), "%s\n", "Unified");

What's the point of "%s\n", "Unified"? Please simplify here and above.

LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):

    size_t range_cnt = resource_range_info->range_cnt;
    size_t offset = 0;
    for (size_t i = 0; i < range_cnt; i++) {

You leave str uninitialized and return success for empty ranges list.

LibOS/shim/src/fs/sys/fs.c, line 56 at r4 (raw file):

BITS_TO_INTS

bitmap is not an array of ints

LibOS/shim/src/fs/sys/fs.c, line 75 at r4 (raw file):

    /* Convert cpumask to strings */
    size_t offset = 0;
    for (size_t j = cpumask_cnt; j > 0; j--) {

Please assert that cpumask_cnt > 0, it should always be true, but otherwise this function unintuitively returns incorrect results.

Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Maybe rephrase to a more verbose:
/* Core 0 is always online (and thus there is no file `/sys/devices/system/cpu/cpu0/online`),
 * so we leave is_logical_core_online for core 0 uninitialized */
What I constantly forget is that cpu0/online file doesn't exist, and I think this should be mentioned in this comment.

+1 to Dmitrii's proposal

Pal/include/arch/x86_64/pal_topology.h, line 4 at r3 (raw file):

/* Copyright (C) 2022 Intel Corporation
 *                    Michał Kowalczyk <mkow@invisiblethingslab.com>
 * Copyright (C) 2022 Intel Corporation

You don't have to duplicate this line, see how we do this in other files.

Pal/include/arch/x86_64/pal_topology.h, line 11 at r3 (raw file):

#define PAL_TOPOLOGY_H

/* Used to represent buffers having numeric values with unit suffixes. E.g "1024576K".

E.g -> E.g.

Pal/include/arch/x86_64/pal_topology.h, line 11 at r3 (raw file):

#define PAL_TOPOLOGY_H

/* Used to represent buffers having numeric values with unit suffixes. E.g "1024576K".

suffixes. E -> suffixes, e

Pal/include/arch/x86_64/pal_topology.h, line 11 at r4 (raw file):

#define PAL_TOPOLOGY_H

/* Used to represent buffers having numeric values with unit suffixes. E.g "1024576K".

Also, isn't this outdated now?

Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):

struct pal_range_info {
    size_t start;
    size_t end;

inclusive or exclusive?

Pal/include/arch/x86_64/pal_topology.h, line 43 at r4 (raw file):

struct pal_res_range_info {
    /* Total number of resources present. Eg. if output of `/sys/devices/system/cpu/online` were

Eg. -> E.g.

Pal/include/arch/x86_64/pal_topology.h, line 43 at r4 (raw file):

struct pal_res_range_info {
    /* Total number of resources present. Eg. if output of `/sys/devices/system/cpu/online` were

were -> was

Pal/include/arch/x86_64/pal_topology.h, line 47 at r4 (raw file):

    size_t resource_cnt;

    /* Total number of ranges present. Eg. if output of `/sys/devices/system/cpu/online` were

Eg. -> E.g.

Pal/include/arch/x86_64/pal_topology.h, line 47 at r4 (raw file):

    size_t resource_cnt;

    /* Total number of ranges present. Eg. if output of `/sys/devices/system/cpu/online` were

were -> was

Pal/include/arch/x86_64/pal_topology.h, line 49 at r4 (raw file):

    /* Total number of ranges present. Eg. if output of `/sys/devices/system/cpu/online` were
     * 0-15,21,32-63 then `range_cnt` will be 3 */
    size_t range_cnt;

range_cnt -> ranges_cnt

Pal/include/arch/x86_64/pal_topology.h, line 51 at r4 (raw file):

    size_t range_cnt;

    /* Array of ranges, with `range_cnt` items. Eg. if output of `/sys/devices/system/cpu/online`

Eg. -> E.g.

Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):

    /* Array of ranges, with `range_cnt` items. Eg. if output of `/sys/devices/system/cpu/online`
     * were 0-15,21,32-63 then `ranges_arr` will be [{0, 15}, {21, 21}, {32 63}]. When there is only

were -> was

Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):

[{0, 15}, {21, 21}, {32 63}]

Please unify formatting here.

Pal/include/arch/x86_64/pal_topology.h, line 53 at r4 (raw file):

When there is only a single number instead of range both start and end are assigned start value

I'd drop this sentence, it's already clearly explained by the example.

Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):

/* Note: The ranges should not overlap */

But can they touch each other? E.g. is [[0, 15], [16, 31]] a valid array of ranges?

Pal/include/arch/x86_64/pal_topology.h, line 62 at r4 (raw file):

So to get actual cache size multiply this value by 1024.

This is super confusing, please store this value in bytes instead.

Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):

struct pal_core_topo_info {
    /* [0] element is uninitialized because core 0 is always online */
    size_t is_logical_core_online;

This should be bool.

Pal/src/host/Linux-common/topo_info.c, line 21 at r4 (raw file):

/* Opens a pseudo-file describing HW resources and simply reads the value stored in the file.
 * Returns UNIX error code on failure and 0 on success. */
static int get_hw_resource_value(const char* filename, size_t* retval) {

retval -> out_value

Pal/src/host/Linux-common/topo_info.c, line 34 at r4 (raw file):

    long val = strtol(str, &end, 10);
    if (val < 0 || val > INT_MAX)
        return -ENOENT;

-ENOENT -> -EINVAL or something like that

Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):

"1,3-5,6"

This isn't a good example, it suggests that the ends are exclusive, which I think isn't the case.

Pal/src/host/Linux-common/topo_info.c, line 60 at r4 (raw file):

 * N.B: Understands complex formats like "1,3-5,6".
 */
static int get_hw_resource_range(const char* filename, struct pal_res_range_info* res_info) {

res_info -> out_info

Pal/src/host/Linux-common/topo_info.c, line 115 at r4 (raw file):

                goto fail;
            }
            res_info->resource_cnt += diff + 1; /* +1 because of inclusive range */

You have an overflow check here, but not on the incrementation above?

Pal/src/host/Linux-common/topo_info.c, line 125 at r4 (raw file):

        res_info->range_cnt++;

        /* Realloc the array of ranges (expand by one range) */

Doing realloc on every iteration is super slow, it probably will be non-negligible on large servers. Please do some overestimation with *2 growing when it's too small and then trim it with realloc after the loop.

Pal/src/host/Linux-common/topo_info.c, line 153 at r4 (raw file):

fail:
    if (res_info->ranges_arr)

no need for this if

Pal/src/host/Linux-common/topo_info.c, line 163 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Can you move this function to the very top of this file, now that you use it in other functions as well?

But please not in this PR, the diff is already large and moving code around is annoying to review :)

Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):

/* Returns number of cache levels present on this system by counting "indexX" dir entries under
 * `/sys/devices/system/cpu/cpuX/cache` on success and negative UNIX error code on failure. */

This comment is stale

Pal/src/host/Linux-common/topo_info.c, line 191 at r4 (raw file):

    char buf[1024];
    int ret;
    int dirs_cnt = 0;

size_t

Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):

PAL_SYSFS_PATH_LEN

Is this length or size?

Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):

        return ret;

    if (topo_info->online_logical_cores.resource_cnt > INT32_MAX)

Why can't it be larger?

Pal/src/host/Linux-common/topo_info.c, line 316 at r4 (raw file):

    size_t online_logical_cores_cnt = topo_info->online_logical_cores.resource_cnt;

    if (topo_info->possible_logical_cores.resource_cnt > INT32_MAX)

ditto

Pal/src/host/Linux-common/topo_info.c, line 341 at r4 (raw file):

        snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu", idx);

        /* cpu0 is always online and thus the "online" file is not present. */

Could we instead of leaving it uninitialized set this value to "true" for core0? There's a risk that someone will use these structures for something else than sysfs and will get hurt.

Pal/src/host/Linux-common/topo_info.c, line 401 at r4 (raw file):

        return -ENOMEM;

    char filename[PAL_SYSFS_PATH_LEN];

ditto, len or size?

vijaydhanraj

Reviewable status: 15 of 21 files reviewed, 64 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

LibOS/shim/include/shim_fs_pseudo.h, line 222 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

please fix wrapping

Are you suggesting having text until 100 chars? If so, I purposefully put "Example..." in a separate line so that it will be easier to read. Is this ok?

LibOS/shim/src/fs/sys/cache_info.c, line 45 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

What's the point of "%s\n", "Unified"? Please simplify here and above.

Done.

LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

You leave str uninitialized and return success for empty ranges list.

Done.

LibOS/shim/src/fs/sys/fs.c, line 56 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

BITS_TO_INTS
bitmap is not an array of ints

Changed it to BITS_TO_UINT32S

LibOS/shim/src/fs/sys/fs.c, line 75 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Please assert that cpumask_cnt > 0, it should always be true, but otherwise this function unintuitively returns incorrect results.

Done.

Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Maybe rephrase to a more verbose:
/* Core 0 is always online (and thus there is no file `/sys/devices/system/cpu/cpu0/online`),
 * so we leave is_logical_core_online for core 0 uninitialized */
What I constantly forget is that cpu0/online file doesn't exist, and I think this should be mentioned in this comment.

sure, let me add this comment.

Pal/include/arch/x86_64/pal_topology.h, line 4 at r3 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

You don't have to duplicate this line, see how we do this in other files.

Done.

Pal/include/arch/x86_64/pal_topology.h, line 11 at r3 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

E.g -> E.g.

Done.

Pal/include/arch/x86_64/pal_topology.h, line 11 at r3 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

suffixes. E -> suffixes, e

Done.

Pal/include/arch/x86_64/pal_topology.h, line 11 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Also, isn't this outdated now?

This is still valid when we read the cache size from the host which is numeric with suffix format.

Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

inclusive or exclusive?

This is inclusive.

Pal/include/arch/x86_64/pal_topology.h, line 43 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Eg. -> E.g.

Done.

Pal/include/arch/x86_64/pal_topology.h, line 43 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

were -> was

Done.

Pal/include/arch/x86_64/pal_topology.h, line 47 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Eg. -> E.g.

Done.

Pal/include/arch/x86_64/pal_topology.h, line 47 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

were -> was

Done.

Pal/include/arch/x86_64/pal_topology.h, line 49 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

range_cnt -> ranges_cnt

why? I want to mention the variable name which is range_cnt

Pal/include/arch/x86_64/pal_topology.h, line 51 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Eg. -> E.g.

Done.

Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

were -> was

Done.

Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

[{0, 15}, {21, 21}, {32 63}]
Please unify formatting here.

[{0, 15}, {21, 21}, {32, 63}] is this fine?

Pal/include/arch/x86_64/pal_topology.h, line 53 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

When there is only a single number instead of range both start and end are assigned start value

I'd drop this sentence, it's already clearly explained by the example.

OK, if both reviewers feel this way, I will drop it.

Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

/* Note: The ranges should not overlap */
But can they touch each other? E.g. is [[0, 15], [16, 31]] a valid array of ranges?

Yes, this is a valid case.

Pal/include/arch/x86_64/pal_topology.h, line 62 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

So to get actual cache size multiply this value by 1024.

This is super confusing, please store this value in bytes instead.

The comment is incorrect, I do store the value in bytes. Changed comment as below,

    /* We store cache size is in bytes. But cache size is usually displayed in KB format. Please do
     * the needful conversion.
     * Pls ref: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu */

Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

This should be bool.

yes, bool would be most appropriate here but we use a common function get_hw_resource_value() to read the values from files. This function takes size_t pointer as a parameter to store the read value(s). So set this variable as size_t instead of bool.

Pal/src/host/Linux-common/topo_info.c, line 119 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

But you have a check for \t 20 lines above (see my other comment).

yes, I had missed it earlier but have removed it.

Pal/src/host/Linux-common/topo_info.c, line 21 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

retval -> out_value

Done.

Pal/src/host/Linux-common/topo_info.c, line 34 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

-ENOENT -> -EINVAL or something like that

Changed it to -EINVAL

Pal/src/host/Linux-common/topo_info.c, line 44 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

__builtin_mul_overflow returns a boolean: true if overflow is detected and false if everything is fine.

So this should be:
    if (__builtin_mul_overflow(val, 1024, &val))
        return -EOVERFLOW;

ah, yes. Done.

Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Actually, isn't -EOVERFLOW is a better error here?

The overflow should be caught by the earlier function __builtin_mul_overflow right?

In this case, it is not overflowing, as val is of type long and can hold a larger value on a 64-bit system and if it is a 32-bit system, shouldn't have a value greater than INT_MAX.

This check was intended for 64-bit systems where we don't end up with unrealistically large values and so limit to INT_MAX. Note: In trusted PAL we do check for such limits, but this will also help in untrusted PAL case as well.

If you feel this is redundant and not needed for untrusted PAL case, I can remove this check itself.

Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

"1,3-5,6"

This isn't a good example, it suggests that the ends are exclusive, which I think isn't the case.

Is this 1-5,10-15,20,30-31 better?

Pal/src/host/Linux-common/topo_info.c, line 60 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

res_info -> out_info

Done.

Pal/src/host/Linux-common/topo_info.c, line 77 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

FYI: Here you have check for \t, but in the below comment you say that you haven't encountered any files that use \t. Delete from here then maybe?

yes, missed it. I have removed this as well.

Pal/src/host/Linux-common/topo_info.c, line 112 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Actually, isn't -EOVERFLOW is a better error here?

yes, changed to -EOVERFLOW

Pal/src/host/Linux-common/topo_info.c, line 115 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

You have an overflow check here, but not on the incrementation above?

good catch, done.

Pal/src/host/Linux-common/topo_info.c, line 125 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Doing realloc on every iteration is super slow, it probably will be non-negligible on large servers. Please do some overestimation with *2 growing when it's too small and then trim it with realloc after the loop.

Yes agree, but will do this optimization in a separate PR. Have added a TODO to capture this effort.
For this release let us target this basic version of adding each range at a time.

Pal/src/host/Linux-common/topo_info.c, line 153 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

no need for this if

Done.

Pal/src/host/Linux-common/topo_info.c, line 163 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

But please not in this PR, the diff is already large and moving code around is annoying to review :)

I moved the code and committed but will revert.

Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

This comment is stale

Reworded as

/* Updates `cache_indices_cnt` with number of cache levels present on the system by counting
 * "indexX" dir entries under `/sys/devices/system/cpu/cpuX/cache`. Returns 0 on success and
 * negative UNIX error code on failure. */

Pal/src/host/Linux-common/topo_info.c, line 191 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

size_t

Done.

Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

PAL_SYSFS_PATH_LEN
Is this length or size?

This is length as per our coding guidelines as it denotes characters in a C-string.

Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Why can't it be larger?

It can be but we need some upper limit, right? So, for now we have it as INT32_MAX. In our validation, if we hit issues, we can update this.

Pal/src/host/Linux-common/topo_info.c, line 316 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

ditto

same reason as above.

Pal/src/host/Linux-common/topo_info.c, line 341 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Could we instead of leaving it uninitialized set this value to "true" for core0? There's a risk that someone will use these structures for something else than sysfs and will get hurt.

We have big comments not to do so :). Also, if we think the other way since it is set to 1 some can end up using it as a valid value. This can be an issue. But if you insist, I can set it to 1.

Pal/src/host/Linux-common/topo_info.c, line 401 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

ditto, len or size?

length.

Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Oops, these are bugs. Actually, we are supposed to use READ_ONCE() on every read access of the untrusted variable.

The idea is to prevent the compiler from doing optimizations that break security guarantees. When you do a simple myvar = untrusted_var;, the compiler can do all sorts of weird stuff: the compiler may decide to read untrusted_var two or more times instead of just once. The compiler may say "This developer added a useless myvar alias to the perfectly good untrusted_var, the developer is stupid, let me just replace myvar with untrusted_var everywhere in the code".

With READ_ONCE(), we prevent the compiler from playing smart and accidentally accessing the untrusted variables many times. Also recall that if an untrusted variable is accessed two times, then the attacker can replace the untrusted-variable value in-between the two accesses, and suddenly one part of your code has value 1 and another part of your code has value 2 of the same variable -- and your program glitches, fails, or exposes a vulnerability. Adding READ_ONCE() is a simple way for the developer to be sure that untrusted_var is first copied into myvar, and then the stored in secure memory and never maliciously changing myvar is used.

P.S. Feel free to fix those bugs you found in a separate PR.

Thanks for the nice explanation! Sure, will create a separate PR and fix these.

Now, I have removed all the READ_ONCE() and took the shallow copy approach. The code looks more readable with shallow copy :)

Pal/src/host/Linux-SGX/db_main.c, line 321 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Thanks, this example helps. Maybe this:

/* recall that `shared_cpu_map` shows this core + its siblings (if HT is enabled), for example:
 *     /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map: 00000000,00000002 */

Done.

Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):

so you could even make this realloc_from_size() function available as a common function, and use it in both cases

We did go down this path here, gramineproject/graphene#2661 and finally decided we will just have it inline. So let us drop realloc_from_size() idea and I will inline it.

See the snippet below,

But in the other place you use realloc_size(). Now we have a strange state that some PAL sources use realloc_size and some sources don't use it (and instead do the malloc-memcpy-free pattern explicitly).

Sorry to go back and forth, but let's remove your newly-introduced realloc_size function and replace the only caller of realloc_size with the explicit malloc-memcpy-free pattern. It doesn't make sense to introduce a function for only one place in the code and (for some obscure reason) to not use this function in other places.

This means that you should drop the commit [Pal] Add realloc_size() in PAL slab manager.

Pal/src/host/Linux-SGX/db_main.c, line 499 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

See my other comment.

Got it. Brought back temp copy approach and removed READ_ONCE()

Pal/src/host/Linux-SGX/db_main.c, line 559 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

But this is not a fail. You get to this label even in good cases. So it should be out.

OK, changed it to out.

Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

At this point, you're accessing the untrusted pointer value (uptr_topo_info-> is transformed into asm mov (<untrusted mem>), %rdx). The pointer value that is stored at this address <untrusted mem> may contain an address in enclave memory, so you end up with giving a malicious first argument to copy_resource_range_to_enclave() -- this first argument is a pointer to some arbitrary enclave-memory address (specified by the attacker).

You removed the if (!sgx_copy_to_enclave(&temp_topo_info, sizeof(temp_topo_info), uptr_topo_info, sizeof(*uptr_topo_info))) operation, which is important here. This operation doesn't just perform a copy, it also performs a check on the untrusted pointer uptr_topo_info -- this operation verifies that the untrusted pointer does not point into the enclave memory. See this:

gramine/Pal/src/host/Linux-SGX/enclave_framework.c

Lines 139 to 140 in 32d2778

!sgx_is_completely_outside_enclave(uptr, usize) ||

!sgx_is_completely_within_enclave(ptr, usize)) {

Thanks for the clear explanation! The idea was to avoid 2 copies, but I see the issue now. I will revert it back to shallow-copy approach.

Pal/src/host/Linux-SGX/db_main.c, line 126 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

doesn't -> don't

Done.

Pal/src/host/Linux-SGX/db_main.c, line 153 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Could you please add an assert:
assert(out_core_topology_arr);

Done.

Pal/src/host/Linux-SGX/db_main.c, line 160 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Now that I look at all these READ_ONCE() macros you added, I think that it will be just simpler to perform a shallow copy of *untrusted_src into a temporary variable and use this variable (and you will get rid of READ_ONCE because now you will be accessing a variable stored inside the enclave).

I definitely want this shallow copy-to-enclave implemented in this function. I'm not sure about the above copy_resource_range_to_enclave() -- that one is pretty trivial and doesn't really merit a temporary variable. But your choice, it makes everyone feel safer to use shallow copies :)

yes switched to shallow copy approach.

Pal/src/host/Linux-SGX/db_main.c, line 167 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You're accessing the untrusted var here as well, you're supposed to use READ_ONCE().

Exactly because of such hard-to-reason-about cases, I suggest to do a shallow copy and not bother about these.

Did shallow copy to address the issue.

Pal/src/host/Linux-SGX/db_main.c, line 174 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto

Did shallow copy to address the issue.

Pal/src/host/Linux-SGX/db_main.c, line 191 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This also merits a shallow copy.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 221 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto (please add an assert)

Done.

Pal/src/host/Linux-SGX/db_main.c, line 234 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

READ_ONCE() or do a shallow copy of untrusted_src[idx]

Did shallow copy to address the issue.

Pal/src/host/Linux-SGX/db_main.c, line 241 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto

Did shallow copy to address the issue.

Pal/src/host/Linux-SGX/db_main.c, line 286 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I would move this line after the two checks -- otherwise end < start, and you'll get an integer overflow. But after the two checks, you are sure that end and start are good, and you can safely do this arithmetics.

Done.

Code quote:

resource_cnt_from_ranges

Pal/src/host/Linux-SGX/db_main.c, line 401 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This sentence is very hard to read. Maybe add " like this (if I understand correctly what you want to say here):
 * - Verify that the "cores in the socket info" array is exactly the same as the "core-siblings
 *   present in core topology" array.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 418 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Now you don't really need this helper variable core_sibling_ranges_arr.

Instead you could use smth like core_siblings->ranges_arr[j].start and core_siblings->ranges_arr[j].end.

Or are these too long and don't fit into 100 chars? In this case, you can at least change:
        struct pal_range_info* core_sibling_ranges_arr = core_siblings->ranges_arr;

yes, this was redundant. Removed it.

Pal/src/host/Linux-SGX/db_main.c, line 532 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I don't understand this message at all. Maybe simply ..."Invalid numa topology: Core %zu is beyond CPU mask limit", j);

Done.

dimakuv

Reviewed 6 of 6 files at r5, all commit messages.
Reviewable status: all files reviewed, 64 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)

LibOS/shim/src/fs/sys/fs.c, line 75 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Done.

I don't see Michal's requested change.

Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

This is inclusive.

I guess Michal wants a comment like this:

struct pal_range_info {
    size_t start;
    size_t end;  /* inclusive, e.g., start = 0 and end = 1 has two items (0 and 1) */

Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

yes, bool would be most appropriate here but we use a common function get_hw_resource_value() to read the values from files. This function takes size_t pointer as a parameter to store the read value(s). So set this variable as size_t instead of bool.

I wouldn't care much about the actual type here... So I'm fine with Vijay's approach.

Pal/include/arch/x86_64/pal_topology.h, line 60 at r5 (raw file):

    size_t level;
    enum cache_type type;
    /* We store cache size is in bytes. But cache size is usually displayed in KB format. Please do

is in bytes -> in bytes

Pal/include/arch/x86_64/pal_topology.h, line 60 at r5 (raw file):

    size_t level;
    enum cache_type type;
    /* We store cache size is in bytes. But cache size is usually displayed in KB format. Please do

Weird phrasing, I suggest:

/* We store cache size in bytes, but it is typically displayed in KB (e.g., "256K"), so for output
 * it should be converted to KBs. See https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu */

Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

The overflow should be caught by the earlier function __builtin_mul_overflow right?

In this case, it is not overflowing, as val is of type long and can hold a larger value on a 64-bit system and if it is a 32-bit system, shouldn't have a value greater than INT_MAX.

This check was intended for 64-bit systems where we don't end up with unrealistically large values and so limit to INT_MAX. Note: In trusted PAL we do check for such limits, but this will also help in untrusted PAL case as well.

If you feel this is redundant and not needed for untrusted PAL case, I can remove this check itself.

No, it's fine then. Resolving.

Pal/src/host/Linux-common/topo_info.c, line 125 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Yes agree, but will do this optimization in a separate PR. Have added a TODO to capture this effort.
For this release let us target this basic version of adding each range at a time.

Agreed with Vijay.

Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):

            }

            out_info->resource_cnt++;

Actually, I would prefer to have:

out_info->resource_cnt = total_cnt;

This way it's clear why you did the check for overflow.

Pal/src/host/Linux-common/topo_info.c, line 117 at r5 (raw file):

            size_t diff = end_val - start_val;
            size_t total_cnt;
            if (__builtin_add_overflow(out_info->resource_cnt, diff, &total_cnt) ||

Shouldn't you replace diff with diff + 1 here?

Alternatively, you can add + 1 to the diff = ... above.

Pal/src/host/Linux-common/topo_info.c, line 122 at r5 (raw file):

                goto fail;
            }
            out_info->resource_cnt += diff + 1; /* +1 because of inclusive range */

Actually, I would prefer to have:

out_info->resource_cnt = total_cnt;

This way it's clear why you did the check for overflow.

Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Thanks for the nice explanation! Sure, will create a separate PR and fix these.

Now, I have removed all the READ_ONCE() and took the shallow copy approach. The code looks more readable with shallow copy :)

You added the shallow copy? where? I don't see it.

Or did you add the shallow copy to the caller of this function, and now the first argument of this function (untrusted_src) points to the trusted in-enclave values (ranges_arr, range_cnt, resource_cnt)? If so, you should change the name of the argument (untrusted_src -> src).

UPDATE: Ok, so you indeed supply an safe in-enclave first argument. So you should rename untrusted_src to simply src.

But because struct pal_res_range_info contains in itself a ranges_arr field (pointer to some other memory), this field in untrusted. So you must have sgx_copy_to_enclave() on this field -- which you do already, that's good. So this function looks correct, nothing needs to be changed (except my small comments).

Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

so you could even make this realloc_from_size() function available as a common function, and use it in both cases

We did go down this path here, gramineproject/graphene#2661 and finally decided we will just have it inline. So let us drop realloc_from_size() idea and I will inline it.

See the snippet below,
But in the other place you use realloc_size(). Now we have a strange state that some PAL sources use realloc_size and some sources don't use it (and instead do the malloc-memcpy-free pattern explicitly).

Sorry to go back and forth, but let's remove your newly-introduced realloc_size function and replace the only caller of realloc_size with the explicit malloc-memcpy-free pattern. It doesn't make sense to introduce a function for only one place in the code and (for some obscure reason) to not use this function in other places.

This means that you should drop the commit [Pal] Add realloc_size() in PAL slab manager.

Thanks for this reminder. I forgot that we had this discussion several months ago :)

Thanks for inlining this function now. It looks more reasonable to me. And once we introduce realloc_size() or something similar in a follow-up PR, the code will be much easier to read.

Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Thanks for the clear explanation! The idea was to avoid 2 copies, but I see the issue now. I will revert it back to shallow-copy approach.

You're welcome. Yes, I understand the desire to optimize for performance and clarity (by avoiding 2 copies), but unfortunately security is often at odds with this desire :) For example, consider a classic case of I/O bounce buffers (used for security reasons) -- there are typically 4 redundant copies of data, and this performance loss is because of security requirement...

Pal/src/host/Linux-SGX/db_main.c, line 128 at r5 (raw file):

/* This function and all the below functions don't clean up resources on failure.
 * We terminate the process anyway. */
static int copy_resource_range_to_enclave(struct pal_res_range_info* untrusted_src,

untrusted_src -> src

Pal/src/host/Linux-SGX/db_main.c, line 130 at r5 (raw file):

static int copy_resource_range_to_enclave(struct pal_res_range_info* untrusted_src,
                                          struct pal_res_range_info* dest) {
    size_t range_cnt = untrusted_src->range_cnt;

There is no need for this helper variable now. Just use src->range_cnt everywhere in this func.

Pal/src/host/Linux-SGX/db_main.c, line 137 at r5 (raw file):

    }

    if (!sgx_copy_to_enclave(ranges_arr, range_cnt * sizeof(*ranges_arr),

Please add a comment like this:

/* even though `src` points to a safe in-enclave object, the `src->ranges_arr` pointer is
 * untrusted and may maliciously point to inside the enclave; thus need to use
 * `sgx_copy_to_enclave()` function */

Pal/src/host/Linux-SGX/db_main.c, line 138 at r5 (raw file):

    if (!sgx_copy_to_enclave(ranges_arr, range_cnt * sizeof(*ranges_arr),
                             untrusted_src->ranges_arr,

READ_ONCE()

Pal/src/host/Linux-SGX/db_main.c, line 156 at r5 (raw file):

    assert(out_core_topology_arr);

    struct pal_core_topo_info* temp_core_topo_arr =

You have two ways of writing "core topology": core_topo_arr and core_topology_arr. It's hard to read. Let's unify...

Meh, the fact that you sometimes use topo and sometimes topology drives me nuts :) I guess just use topo everywhere.

Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):

    /* Shallow copy contents of core_topology_arr (untrusted_src) into enclave */
    if (!sgx_copy_to_enclave(temp_core_topo_arr,
            online_logical_cores_cnt * sizeof(*temp_core_topo_arr), untrusted_src,

untrusted_src -> READ_ONCE(untrusted_src)

Pal/src/host/Linux-SGX/db_main.c, line 167 at r5 (raw file):

            online_logical_cores_cnt * sizeof(*temp_core_topo_arr), untrusted_src,
            online_logical_cores_cnt * sizeof(*untrusted_src))) {
        log_error("Shallow copy of temp core_topology_arr into the enclave failed");

Remove temp

Pal/src/host/Linux-SGX/db_main.c, line 171 at r5 (raw file):

    }

    struct pal_core_topo_info* core_topology_arr =

core_topology_arr -> core_topo_arr

Pal/src/host/Linux-SGX/db_main.c, line 198 at r5 (raw file):

        }

        /* Allocate enclave memory to store cache info */

This comment belongs to another malloc (below), not this one. Here it should be:

/* Shallow copy contents of cache_info_arr (untrusted pointer) into enclave */

Pal/src/host/Linux-SGX/db_main.c, line 202 at r5 (raw file):

            malloc(cache_indices_cnt * sizeof(*temp_cache_info_arr));
        if (!temp_cache_info_arr) {
            log_error("Allocation for temp cache_info_arr failed");

"Allocation for shallow copy of cache_info_arr failed"

Pal/src/host/Linux-SGX/db_main.c, line 208 at r5 (raw file):

        if (!sgx_copy_to_enclave(temp_cache_info_arr,
                                 cache_indices_cnt * sizeof(*temp_cache_info_arr),
                                 temp_core_topo_arr->cache_info_arr,

READ_ONCE()

Rule of thumb: always add READ_ONCE to the third argument of sgx_copy_to_enclave().

Pal/src/host/Linux-SGX/db_main.c, line 211 at r5 (raw file):

                                 cache_indices_cnt *
                                 sizeof(*temp_core_topo_arr->cache_info_arr))) {
            log_error("Shallow copy of temp cache_info_arr into the enclave failed");

Remove temp

Pal/src/host/Linux-SGX/db_main.c, line 247 at r5 (raw file):

    free(temp_core_topo_arr);
    return 0;
}

FYI: I like the new version, with shallow copies (bar my comments). I tried to review it carefully, and from the security point of you, it looks correct (in the sense that we don't use untrusted objects without first copying them into the enclave).

Pal/src/host/Linux-SGX/db_main.c, line 257 at r5 (raw file):

        malloc(online_nodes_cnt * sizeof(*temp_numa_topo_arr));
    if (!temp_numa_topo_arr) {
        log_error("Allocation for temp numa_topology_arr failed");

"Allocation for shallow copy of numa_topo_arr failed"

Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):

    /* Shallow copy contents of numa_topology_arr (untrusted_src) into enclave */
    if (!sgx_copy_to_enclave(temp_numa_topo_arr,
            online_nodes_cnt * sizeof(*temp_numa_topo_arr), untrusted_src,

READ_ONCE(untrusted_src)

Pal/src/host/Linux-SGX/db_main.c, line 265 at r5 (raw file):

            online_nodes_cnt * sizeof(*temp_numa_topo_arr), untrusted_src,
            online_nodes_cnt * sizeof(*untrusted_src))) {
        log_error("Shallow copy of temp numa_topology_arr into the enclave failed");

Remove temp

numa_topology_arr -> numa_topo_arr

Pal/src/host/Linux-SGX/db_main.c, line 300 at r5 (raw file):

    free(temp_numa_topo_arr);
    return 0;
}

FYI: I like the new version, with shallow copies (bar my comments). I tried to review it carefully, and from the security point of you, it looks correct (in the sense that we don't use untrusted objects without first copying them into the enclave).

Pal/src/host/Linux-SGX/db_main.c, line 739 at r5 (raw file):

    struct pal_topo_info temp_topo_info;
    if (!sgx_copy_to_enclave(&temp_topo_info, sizeof(temp_topo_info),
                             uptr_topo_info, sizeof(*uptr_topo_info))) {

READ_ONCE(uptr_topo_info)

mkow

Reviewed 5 of 6 files at r5, all commit messages.
Reviewable status: all files reviewed, 42 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

LibOS/shim/include/shim_fs_pseudo.h, line 222 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Are you suggesting having text until 100 chars? If so, I purposefully put "Example..." in a separate line so that it will be easier to read. Is this ok?

I mean only the function signature, char* str is wrapped. You should see this on Reviewable if you configured line length to 100 in it.

LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Done.

Why do both now?
Instead of what you did maybe just add str[0] = '\0'? There's no reason to fail on empty range if we can trivially make it work correctly for it.

Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I guess Michal wants a comment like this:

struct pal_range_info {
    size_t start;
    size_t end;  /* inclusive, e.g., start = 0 and end = 1 has two items (0 and 1) */

Yes, I ofc know which one is the case, this was the question from the point of a reader of our codebase. It's not clear from the header until you read the implementation.
Ad the comment: Please do as Dima suggested, but maybe without this example, "inclusive" is not a complex term which would require lengthy explanations.

Pal/include/arch/x86_64/pal_topology.h, line 49 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

why? I want to mention the variable name which is range_cnt

? I want you to change the variable name. I don't understand your response here...

Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

[{0, 15}, {21, 21}, {32, 63}] is this fine?

Looks good now.

Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Yes, this is a valid case.

As before, I've read the whole PR so I know, but other developers reading this header in the future won't and the current comment doesn't make it clear how this works.

Pal/include/arch/x86_64/pal_topology.h, line 62 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

The comment is incorrect, I do store the value in bytes. Changed comment as below,

    /* We store cache size is in bytes. But cache size is usually displayed in KB format. Please do
     * the needful conversion.
     * Pls ref: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu */

Then please just drop the whole comment, I don't see why "how we usually print this value in sysfs" could have any importance here.

Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I wouldn't care much about the actual type here... So I'm fine with Vijay's approach.

Then just use get_hw_resource_value() on a temporary variable and then assign to is_logical_core_online. size_t makes no sense here and this is a global structure, it should be as clean and intuitive as possible.

Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):

If you feel this is redundant and not needed for untrusted PAL case, I can remove this check itself.

I don't see any point in doing these checks here, especially that nothing should break if the value is bigger than this.

Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Is this 1-5,10-15,20,30-31 better?

Why not just change "6" to "7" in your original example? The new one is much more complicated but doesn't really provide any more information than the original.

Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Reworded as

/* Updates `cache_indices_cnt` with number of cache levels present on the system by counting
 * "indexX" dir entries under `/sys/devices/system/cpu/cpuX/cache`. Returns 0 on success and
 * negative UNIX error code on failure. */

It's not really correct now, "to update something" slightly implies reading and then refreshing some value. Here we just return via pointer and never use the old *cache_indices_cnt.

Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

This is length as per our coding guidelines as it denotes characters in a C-string.

Then the buffer is too small, you don't have space for the trailing null byte. I.e. you won't fit a C-string with PAL_SYSFS_PATH_LEN chars there.

Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):

It can be but we need some upper limit, right?

Why?

Pal/src/host/Linux-common/topo_info.c, line 341 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

We have big comments not to do so :). Also, if we think the other way since it is set to 1 some can end up using it as a valid value. This can be an issue. But if you insist, I can set it to 1.

But this is a valid value, the core 0 is actually online. It's only the sysfs printing code which skips printing it, but I see no reason to have this quirk reflected in a global structure.

Pal/src/host/Linux-common/topo_info.c, line 401 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

length.

see my reply above

Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Actually, I would prefer to have:
out_info->resource_cnt = total_cnt;
This way it's clear why you did the check for overflow.

Why not just remove this temporary variable and add directly to out_info->resource_cnt?

Pal/src/host/Linux-common/topo_info.c, line 117 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Shouldn't you replace diff with diff + 1 here?

Alternatively, you can add + 1 to the diff = ... above.

It's correct because of that >= below, although quite weird.

vijaydhanraj

Reviewable status: 12 of 21 files reviewed, 42 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

LibOS/shim/include/shim_fs_pseudo.h, line 222 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

I mean only the function signature, char* str is wrapped. You should see this on Reviewable if you configured line length to 100 in it.

yes got it, thanks.

LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Why do both now?
Instead of what you did maybe just add str[0] = '\0'? There's no reason to fail on empty range if we can trivially make it work correctly for it.

Sorry, I don't understand. Are you saying it is fine to pass an empty string back to the application? This function basically converts ranges to strings, and I feel if the range is empty, we should not return success.

Converted memset(str, 0, str_size) -> str[0] = '\0' although I still prefer memset() :).

LibOS/shim/src/fs/sys/fs.c, line 75 at r4 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I don't see Michal's requested change.

Sorry I put the check in a different place in db_main.c, but it is not actually needed there. Fixed it here.

Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Yes, I ofc know which one is the case, this was the question from the point of a reader of our codebase. It's not clear from the header until you read the implementation.
Ad the comment: Please do as Dima suggested, but maybe without this example, "inclusive" is not a complex term which would require lengthy explanations.

oh ok :). Added the comment.

This comment confused me :) I was not sure if it was a question or a suggestion. Next time just say, add a comment.

Pal/include/arch/x86_64/pal_topology.h, line 49 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

? I want you to change the variable name. I don't understand your response here...

I misunderstood your comment. I thought you meant to change the comment.
Done.

Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

As before, I've read the whole PR so I know, but other developers reading this header in the future won't and the current comment doesn't make it clear how this works.

Yes, please say "add an example in the comment". I was not sure if it was a question or a suggestion.

I have updated the example to show touching ranges. Is this [{0, 15}, {16, 30}, {31, 31}] example fine?

Pal/include/arch/x86_64/pal_topology.h, line 62 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Then please just drop the whole comment, I don't see why "how we usually print this value in sysfs" could have any importance here.

OK, done.

Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Then just use get_hw_resource_value() on a temporary variable and then assign to is_logical_core_online. size_t makes no sense here and this is a global structure, it should be as clean and intuitive as possible.

OK, done.

Pal/include/arch/x86_64/pal_topology.h, line 60 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

is in bytes -> in bytes

Done.

Pal/include/arch/x86_64/pal_topology.h, line 60 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Weird phrasing, I suggest:

/* We store cache size in bytes, but it is typically displayed in KB (e.g., "256K"), so for output
 * it should be converted to KBs. See https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu */

Done.

Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

If you feel this is redundant and not needed for untrusted PAL case, I can remove this check itself.

I don't see any point in doing these checks here, especially that nothing should break if the value is bigger than this.

Nothing will break but we can have bizarre values which will trickle down to trusted PAL and then fail. This was for catching the issue in the untrusted PAL itself and failing.

But I have removed the checks.

Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Why not just change "6" to "7" in your original example? The new one is much more complicated but doesn't really provide any more information than the original.

I thought having an individual value at the extremes made you think it was exclusive. So made it as a range. But I don't understand how changing 6->7 will help. Please explain.

I have changed it to 7 now.

Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

It's not really correct now, "to update something" slightly implies reading and then refreshing some value. Here we just return via pointer and never use the old *cache_indices_cnt.

How about now?

/* This function stores the number of cache levels present on the system by counting "indexX" dir
 * entries under `/sys/devices/system/cpu/cpuX/cache`in `out_cache_indices_cnt`. Returns 0 on
 * success and negative UNIX error code on failure. */

Also renamed name the out variable as out_cache_indices_cnt.

If this is not clear, please suggest your text and I will work it in.

Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Then the buffer is too small, you don't have space for the trailing null byte. I.e. you won't fit a C-string with PAL_SYSFS_PATH_LEN chars there.

Sorry not able to get your comment. Wait, are you saying that the C-string lengths (strlen) don't account for null byte and so this is not a LEN butSIZE?

With this assumption, I am changing PAL_SYSFS_PATH_LEN -> PAL_SYSFS_PATH_SIZE

But I read C-string as "Hello", which does include the null byte and not compare it with the return value of strlen().

Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

It can be but we need some upper limit, right?

Why?

Can we have larger than 2147483647 online cores? I set this as an upper limit to catch such ridiculously large values which will finally fail in trusted PAL because we have limit checks there. Idea was to catch such inconsistencies earlier and return.

But I have removed these checks.

Pal/src/host/Linux-common/topo_info.c, line 341 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

But this is a valid value, the core 0 is actually online. It's only the sysfs printing code which skips printing it, but I see no reason to have this quirk reflected in a global structure.

OK, I have initialized this to true.

Pal/src/host/Linux-common/topo_info.c, line 401 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

see my reply above

Done?

Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Why not just remove this temporary variable and add directly to out_info->resource_cnt?

This will exceed 100 chars so used a temp variable for better readability.

Pal/src/host/Linux-common/topo_info.c, line 117 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Shouldn't you replace diff with diff + 1 here?

Alternatively, you can add + 1 to the diff = ... above.

I do it below to account for inclusiveness, but I agree that we should do it as part of the diff because we can catch any potential overflows.

Pal/src/host/Linux-common/topo_info.c, line 122 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Actually, I would prefer to have:
out_info->resource_cnt = total_cnt;
This way it's clear why you did the check for overflow.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You added the shallow copy? where? I don't see it.

Or did you add the shallow copy to the caller of this function, and now the first argument of this function (untrusted_src) points to the trusted in-enclave values (ranges_arr, range_cnt, resource_cnt)? If so, you should change the name of the argument (untrusted_src -> src).

UPDATE: Ok, so you indeed supply an safe in-enclave first argument. So you should rename untrusted_src to simply src.

But because struct pal_res_range_info contains in itself a ranges_arr field (pointer to some other memory), this field in untrusted. So you must have sgx_copy_to_enclave() on this field -- which you do already, that's good. So this function looks correct, nothing needs to be changed (except my small comments).

Yes, your understanding of the flow is right.
Done, and added a comment that ranges_arr is untrusted.

Pal/src/host/Linux-SGX/db_main.c, line 128 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

untrusted_src -> src

Done.

Pal/src/host/Linux-SGX/db_main.c, line 130 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

There is no need for this helper variable now. Just use src->range_cnt everywhere in this func.

Done.

Pal/src/host/Linux-SGX/db_main.c, line 137 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add a comment like this:

/* even though `src` points to a safe in-enclave object, the `src->ranges_arr` pointer is
 * untrusted and may maliciously point to inside the enclave; thus need to use
 * `sgx_copy_to_enclave()` function */

Done.

Pal/src/host/Linux-SGX/db_main.c, line 138 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

READ_ONCE()

Done.

Pal/src/host/Linux-SGX/db_main.c, line 156 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

You have two ways of writing "core topology": core_topo_arr and core_topology_arr. It's hard to read. Let's unify...

Meh, the fact that you sometimes use topo and sometimes topology drives me nuts :) I guess just use topo everywhere.

OK :) done. I have also changed numa_topology_arr -> numa_topo_arr

Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

untrusted_src -> READ_ONCE(untrusted_src)

Done.

Pal/src/host/Linux-SGX/db_main.c, line 167 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Remove temp

Done.

Pal/src/host/Linux-SGX/db_main.c, line 171 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

core_topology_arr -> core_topo_arr

Done.

Pal/src/host/Linux-SGX/db_main.c, line 198 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This comment belongs to another malloc (below), not this one. Here it should be:
/* Shallow copy contents of cache_info_arr (untrusted pointer) into enclave */

Done.

Pal/src/host/Linux-SGX/db_main.c, line 202 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

"Allocation for shallow copy of cache_info_arr failed"

Done.

Pal/src/host/Linux-SGX/db_main.c, line 208 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

READ_ONCE()

Rule of thumb: always add READ_ONCE to the third argument of sgx_copy_to_enclave().

Done. I have also added TODOs to the missing ones that were not part of this PR. As discussed, I will fix these in a separate PR.

Pal/src/host/Linux-SGX/db_main.c, line 211 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Remove temp

Done.

Pal/src/host/Linux-SGX/db_main.c, line 257 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

"Allocation for shallow copy of numa_topo_arr failed"

Done.

Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

READ_ONCE(untrusted_src)

Done.

Pal/src/host/Linux-SGX/db_main.c, line 265 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Remove temp

numa_topology_arr -> numa_topo_arr

Done.

Pal/src/host/Linux-SGX/db_main.c, line 739 at r5 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

READ_ONCE(uptr_topo_info)

Done.

mkow

Reviewed 9 of 9 files at r6, all commit messages.
Reviewable status: all files reviewed, 35 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):

Are you saying it is fine to pass an empty string back to the application?

We probably won't do this because we sanitize the data, but it seems more intuitive to me that printing an empty list gives you an empty string, rather than erroring out. But I won't block on this.

Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

oh ok :). Added the comment.

This comment confused me :) I was not sure if it was a question or a suggestion. Next time just say, add a comment.

But I don't want you to blindly copy-paste my suggestions without thinking ;) You should read them, look at the code and think whether they make sense and what does they mean (e.g.: "oh, this is actually unclear, I should add a comment" or maybe "I already have a comment about this, but you probably just missed it" or "maybe instead it's better to rename this variable?").

Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Yes, please say "add an example in the comment". I was not sure if it was a question or a suggestion.

I have updated the example to show touching ranges. Is this [{0, 15}, {16, 30}, {31, 31}] example fine?

Better, but now the example has no holes in the ranges, which are allowed. Reading it you can get an impression that the ranges have to be adjacent.

Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):

This was for catching the issue in the untrusted PAL itself and failing.

But this is pointless, we need to duplicate these checks anyways in trusted PAL. And on non-TEE PALs these values are trusted.

Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I thought having an individual value at the extremes made you think it was exclusive. So made it as a range. But I don't understand how changing 6->7 will help. Please explain.

I have changed it to 7 now.

Because specifying 3-5,6 makes no sense if the ranges are inclusive, you'd have 3-6 instead then. But if the ends were exclusive, then 3-5,6 would make sense.

Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

How about now?
/* This function stores the number of cache levels present on the system by counting "indexX" dir
 * entries under `/sys/devices/system/cpu/cpuX/cache`in `out_cache_indices_cnt`. Returns 0 on
 * success and negative UNIX error code on failure. */
Also renamed name the out variable as out_cache_indices_cnt.

If this is not clear, please suggest your text and I will work it in.

Looks ok, but you're missing a space before in.

Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):

Wait, are you saying that the C-string lengths (strlen) don't account for null byte and so this is not a LEN but SIZE?

Yes, strlen() doesn't count the null byte, but it's a bit unrelated. It's just our convention that we use "len" for C-string length and "size" for C-string byte-size.

But I read C-string as "Hello", which does include the null byte and not compare it with the return value of strlen().

I don't understand what you wanted to say here ;)

Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Can we have larger than 2147483647 online cores? I set this as an upper limit to catch such ridiculously large values which will finally fail in trusted PAL because we have limit checks there. Idea was to catch such inconsistencies earlier and return.

But I have removed these checks.

It can only mask missing verification in the trusted code and is just code duplication because we already check it there.

Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

This will exceed 100 chars so used a temp variable for better readability.

Hmm, it easily fits in 100 chars for me?

            if (__builtin_add_overflow(out_info->resource_cnt, 1, &out_info->resource_cnt)) {

Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Done.

Waaait, what was wrong with the old code? Why READ_ONCE() here?

Pal/src/host/Linux-SGX/db_main.c, line 208 at r5 (raw file):

Rule of thumb: always add READ_ONCE to the third argument of sgx_copy_to_enclave().

Uhmmm?

Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Done.

?

Pal/src/host/Linux-SGX/db_main.c, line 739 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Done.

?

Pal/src/host/Linux-SGX/db_main.c, line 88 at r6 (raw file):

    }

    /* TODO: Fix untrusted pointer access by adding `READ_ONCE()`*/

I don't understand this TODO, what is incorrectly accessed here? You added it at a line which looks correct.

Pal/src/host/Linux-SGX/db_main.c, line 131 at r6 (raw file):

static int copy_resource_range_to_enclave(struct pal_res_range_info* src,
                                          struct pal_res_range_info* dest) {
    struct pal_range_info* ranges_arr = malloc(src->ranges_cnt * sizeof(*ranges_arr));

What if untrusted host sets src->ranges_cnt to 0x8000000000000001?

Pal/src/host/Linux-SGX/db_main.c, line 262 at r6 (raw file):

        malloc(online_nodes_cnt * sizeof(*temp_numa_topo_arr));
    if (!temp_numa_topo_arr) {
        log_error("Allocation for shallow copy of numa_topo_arr failed");

Why printing an error here? We don't do this in other cases when we run out of memory, we should just print this in a single place, somewhere top-level.
ditto for other places like this

Pal/src/host/Linux-SGX/db_main.c, line 485 at r6 (raw file):

}

/* This function doesn't clean up resources on failure: we terminate the process anyway. */

Please keep this comment at each function which doesn't do the clean-up. When verifying a specific function for memory leaks you don't search the whole 1k-LoC file if there's a comment somewhere saying that "here and below" some usual rules don't apply ;)

Pal/src/host/Linux-SGX/db_main.c, line 909 at r6 (raw file):

    /* At this point we don't yet have memory manager, so we cannot allocate memory dynamically. */
    static char libpal_path[1024 + 1];
    /* TODO: Fix untrusted pointer access by adding `READ_ONCE()`*/

Where's that untrusted access?

Pal/src/host/Linux-SGX/db_main.c, line 923 at r6 (raw file):

    /* We can't verify the following arguments from the urts. So we copy them directly but need to
     * be careful when we use them. */
     /* TODO: Fix untrusted pointer access by adding `READ_ONCE()`*/

ditto

vijaydhanraj

Reviewable status: 18 of 21 files reviewed, 35 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Better, but now the example has no holes in the ranges, which are allowed. Reading it you can get an impression that the ranges have to be adjacent.

OK, made the ranges discontinuous.

Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Because specifying 3-5,6 makes no sense if the ranges are inclusive, you'd have 3-6 instead then. But if the ends were exclusive, then 3-5,6 would make sense.

Thanks for the clarification.

Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Looks ok, but you're missing a space before in.

Done.

Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Wait, are you saying that the C-string lengths (strlen) don't account for null byte and so this is not a LEN but SIZE?

Yes, strlen() doesn't count the null byte, but it's a bit unrelated. It's just our convention that we use "len" for C-string length and "size" for C-string byte-size.

But I read C-string as "Hello", which does include the null byte and not compare it with the return value of strlen().

I don't understand what you wanted to say here ;)

I was trying to reason out why you would suggest _SIZE instead of _LEN :)

Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

It can only mask missing verification in the trusted code and is just code duplication because we already check it there.

OK.

Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Hmm, it easily fits in 100 chars for me?

            if (__builtin_add_overflow(out_info->resource_cnt, 1, &out_info->resource_cnt)) {

oh my bad, removed the temporary variable.

Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Waaait, what was wrong with the old code? Why READ_ONCE() here?

The issue here is temp_topo_info is a shallow copied object and so the core_topo_arr is still an untrusted pointer. So, to prevent compiler optimization READ_ONCE() is used.

Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

?

Same reason for using READ_ONCE() in sgx_copy_core_topo_to_enclave()

Pal/src/host/Linux-SGX/db_main.c, line 88 at r6 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

I don't understand this TODO, what is incorrectly accessed here? You added it at a line which looks correct.

Hereuptr_src is an untrusted pointer and we don't want the compiler to do any optimization to it. So added the TODO

Pal/src/host/Linux-SGX/db_main.c, line 131 at r6 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

What if untrusted host sets src->ranges_cnt to 0x8000000000000001?

Good point, this might result in a DOS attack. How about adding an upper limit check for ranges_cnt? But I am not sure what should be the upper limit?

 static int copy_resource_range_to_enclave(struct pal_res_range_info* src,
                                           struct pal_res_range_info* dest) {
+    if (src->ranges_cnt > UINT16_MAX)
+        return -1;
+
     struct pal_range_info* ranges_arr = malloc(src->ranges_cnt * sizeof(*ranges_arr));
     if (!ranges_arr) {
         log_error("Range allocation failed");

Pal/src/host/Linux-SGX/db_main.c, line 262 at r6 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Why printing an error here? We don't do this in other cases when we run out of memory, we should just print this in a single place, somewhere top-level.
ditto for other places like this

If we remove all the printing, and something fails we end up with this message Failed to copy and sanitize topology information. This is not helpful. I think these logs are really needed esp. for topology cases where we have multiple levels of function calls. The only place where I haven't added error prints is in the realloc flows as we plan to clean this up in a separate PR. If needed I can add error prints, there too.

The other function where we do malloc and don't print error on failure is make_argv_list(). This is a small function, and a top-level error message makes sense here.

But if reviewers strongly feel otherwise, I can remove it.

Pal/src/host/Linux-SGX/db_main.c, line 485 at r6 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Please keep this comment at each function which doesn't do the clean-up. When verifying a specific function for memory leaks you don't search the whole 1k-LoC file if there's a comment somewhere saying that "here and below" some usual rules don't apply ;)

OK, I have duplicated this comment on all functions where we use malloc/calloc

Pal/src/host/Linux-SGX/db_main.c, line 909 at r6 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Where's that untrusted access?

There isn't untrusted access. I have updated my comment.

Pal/src/host/Linux-SGX/db_main.c, line 923 at r6 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

ditto

Updated my comment.

dimakuv

Reviewed 6 of 9 files at r6, 3 of 3 files at r7, all commit messages.
Reviewable status: all files reviewed, 21 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)

Pal/src/host/Linux-SGX/db_main.c, line 138 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Done.

Sorry, I was wrong. You don't need it here, because you do not dereference src->ranges_arr (which is an untrusted pointer) and thus there is no untrusted memory access happenning. You only dereference src which is a trusted pointer. So you don't need READ_ONCE(), just remove it.

Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

The issue here is temp_topo_info is a shallow copied object and so the core_topo_arr is still an untrusted pointer. So, to prevent compiler optimization READ_ONCE() is used.

Yes, sorry, Vijay. Please revert READ_ONCE(untrusted_src) -> untrusted_src.

Pal/src/host/Linux-SGX/db_main.c, line 208 at r5 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Rule of thumb: always add READ_ONCE to the third argument of sgx_copy_to_enclave().

Uhmmm?

Please revert.

Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Same reason for using READ_ONCE() in sgx_copy_core_topo_to_enclave()

Please revert

Pal/src/host/Linux-SGX/db_main.c, line 739 at r5 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

?

Please revert

Pal/src/host/Linux-SGX/db_main.c, line 88 at r6 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Hereuptr_src is an untrusted pointer and we don't want the compiler to do any optimization to it. So added the TODO

See my other comment here, I was an idiot and explained you not 100% correctly :(

Pal/src/host/Linux-SGX/db_main.c, line 262 at r6 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

If we remove all the printing, and something fails we end up with this message Failed to copy and sanitize topology information. This is not helpful. I think these logs are really needed esp. for topology cases where we have multiple levels of function calls. The only place where I haven't added error prints is in the realloc flows as we plan to clean this up in a separate PR. If needed I can add error prints, there too.

The other function where we do malloc and don't print error on failure is make_argv_list(). This is a small function, and a top-level error message makes sense here.

But if reviewers strongly feel otherwise, I can remove it.

I was actually fine with this error.

Pal/src/host/Linux-SGX/db_main.c, line 909 at r6 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

There isn't untrusted access. I have updated my comment.

Please revert this comment (delete it)

Pal/src/host/Linux-SGX/db_main.c, line 923 at r6 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Updated my comment.

Please revert this comment (delete it)

Pal/src/host/Linux-SGX/db_main.c, line 90 at r7 (raw file):

    /* TODO: Fix potential compiler optimization of untrusted pointer `uptr_src` by adding
     * `READ_ONCE()`. */

Oops, sorry Vijay. This was my blunder. I got it wrong.

So, the rule should be: if the third argument of sgx_copy_to_enclave() contains a dereference of an untrusted pointer (i.e., contains memory access through an untrusted pointer), only then we need to add READ_ONCE(). (Because this is what the macro READ_ONCE() does -- it accesses memory only once).

In this case -- like in some other cases (I marked them all in this PR) -- I was wrong and I recommended you this redundant READ_ONCE(). In this particular case, the third argument is simply uptr_src -- which doesn't do any memory access, but simply this pointer value is passed to the function. So there is no need in READ_ONCE().

Please remove this TODO comment.

Pal/src/host/Linux-SGX/db_main.c, line 470 at r7 (raw file):

        if (idx != 0) {     /* core 0 is always online */
            bool is_core_online = core_topo_arr[idx].is_logical_core_online;
            if (is_core_online != false && is_core_online != true) {

Wait, this doesn't make sense anymore :) Booleans can only have false and true values, nothing else, so this if is meaningless now.

I suggest to simply remove this check on is_logical_core_online completely. All these 7 lines.

mkow

Reviewable status: all files reviewed, 6 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
I think I'm missing something, if we allow CPUs to be only from 0 to online_logical_cores_cnt - 1 then why do we even store /sys/devices/system/cpu/online as ranges? It always has to be just a single range, from 0 to online_logical_cores_cnt - 1, the we can just not store this redundant information at all?

Regarding inconsistent view between untrusted host and SGX, [...]

Yes, that's right, but that's not my point, my point is about internal inconsistencies between the data we expose to the app.

If you want, I can add this check but based on Gitter chat looks like you already have a proposal for this and plan to push a commit. So, I will hold off from pushing any changes.

Yes, I'll do this to speed things up for the release, but there are many things I don't understand here, so please watch discussions under this PR :)

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

    /* Virtual environments such as QEMU may assign each core to a separate socket/package with
     * one or more NUMA nodes. So we check against the number of online logical cores. */
    if (!IS_IN_RANGE_INCL(topo_info->sockets_cnt, 1, online_logical_cores_cnt)) {

Can't you have more sockets than online CPUs? E.g. when all cores from some socket are turned off.

mkow

Reviewable status: all files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

         * example: /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map: 00000000,00000002 */
        int ret = sanitize_hw_resource_range(&cache_info_arr[lvl].shared_cpu_map, 1, max_limit, 0,
                                             online_logical_cores_cnt);

cache_info_arr[lvl].shared_cpu_map is another list which may include non-existing cores

vijaydhanraj

Reviewable status: all files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

I think I'm missing something, if we allow CPUs to be only from 0 to online_logical_cores_cnt - 1 then why do we even store /sys/devices/system/cpu/online as ranges? It always has to be just a single range, from 0 to online_logical_cores_cnt - 1, the we can just not store this redundant information at all?

Regarding inconsistent view between untrusted host and SGX, [...]

Yes, that's right, but that's not my point, my point is about internal inconsistencies between the data we expose to the app.

If you want, I can add this check but based on Gitter chat looks like you already have a proposal for this and plan to push a commit. So, I will hold off from pushing any changes.

Yes, I'll do this to speed things up for the release, but there are many things I don't understand here, so please watch discussions under this PR :)

Sure, please go ahead.

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

cache_info_arr[lvl].shared_cpu_map is another list which may include non-existing cores

Yes, we don't check the range subsets/offlined cores. This was intended for core offlining PR. Currently, we make sure that the range of cores as part of the share_cpu_map is not greater than the total online_logical_cores_cnt.

For example, if we have shared_cpu_map (LLC) range as 0-31, and the online cores are 0-8, then we fail. But if shared_cpu_map were like 0-1, 7-8 (with cores 2-6 non-existing) we don't fail.

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Can't you have more sockets than online CPUs? E.g. when all cores from some socket are turned off.

I don't think the kernel will expose them. I did a quick test on my ICX system which has 2 sockets and each having 32 cores. I offlined all the cores except core 0 and this is the output of lscpu. Also, I don't think sysfs exposes any path to show the total sockets count. (other tool like dmidecode may expose the total socket count but we don't use this)

Architecture:         x86_64
CPU op-mode(s):       32-bit, 64-bit
Byte Order:           Little Endian
CPU(s):               64
On-line CPU(s) list:  0
Off-line CPU(s) list: 1-63
Thread(s) per core:   1
Core(s) per socket:   1
Socket(s):            1
NUMA node(s):         2
Vendor ID:            GenuineIntel
CPU family:           6
Model:                106
Model name:           Intel(R) Xeon(R) Platinum 8352Y CPU @ 2.20GHz
Stepping:             6
CPU MHz:              2200.000
BogoMIPS:             4400.00
Virtualization:       VT-x
L1d cache:            48K
L1i cache:            32K
L2 cache:             1280K
L3 cache:             49152K
NUMA node0 CPU(s):    0
NUMA node1 CPU(s):

mkow

Reviewable status: all files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Sure, please go ahead.

What about my question?

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

This was intended for core offlining PR.

But I thought the idea was that right now we'll refuse to start if some cores in topology are declared as offlined, not that we'll expose inconsistent data to the app. Especially exposing indexes which point to nothing.

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I don't think the kernel will expose them. I did a quick test on my ICX system which has 2 sockets and each having 32 cores. I offlined all the cores except core 0 and this is the output of lscpu. Also, I don't think sysfs exposes any path to show the total sockets count. (other tool like dmidecode may expose the total socket count but we don't use this)
Architecture:         x86_64
CPU op-mode(s):       32-bit, 64-bit
Byte Order:           Little Endian
CPU(s):               64
On-line CPU(s) list:  0
Off-line CPU(s) list: 1-63
Thread(s) per core:   1
Core(s) per socket:   1
Socket(s):            1
NUMA node(s):         2
Vendor ID:            GenuineIntel
CPU family:           6
Model:                106
Model name:           Intel(R) Xeon(R) Platinum 8352Y CPU @ 2.20GHz
Stepping:             6
CPU MHz:              2200.000
BogoMIPS:             4400.00
Virtualization:       VT-x
L1d cache:            48K
L1i cache:            32K
L2 cache:             1280K
L3 cache:             49152K
NUMA node0 CPU(s):    0
NUMA node1 CPU(s):    

I see, thanks for checking, seems they are indeed hidden. But other interesting thing is that the node1 cpu list is empty, so we should allow empty ranges?

mkow

Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):

struct pal_numa_topo_info {
    struct pal_res_range_info cpumap;
    struct pal_res_range_info distance;

What does this represent? I think on Linux it's not a list of ranges, but a list of integers? (distances to each other node)

mkow

Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

What does this represent? I think on Linux it's not a list of ranges, but a list of integers? (distances to each other node)

Actually, this field almost completely lacks sanitization, the only thing which is sanitized is resource_cnt field, which is a self-declared (not verified with the list of ranges) sum of the ranges which has to equal to online_nodes_cnt. Everything else can be arbitrary.
That's why it passes the tests... I was wondering how it's even possible that the tests are passing despite the ranges being unsorted here (that's the case now on my server).

Pal/src/host/Linux-SGX/db_main.c, line 601 at r10 (raw file):

    }

    if (total_cores_in_numa != online_logical_cores_cnt) {

That's not enough, this sanitization misses verification against online and possible cores list. Non-existent indices will pass it.

vijaydhanraj

Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Actually, this field almost completely lacks sanitization, the only thing which is sanitized is resource_cnt field, which is a self-declared (not verified with the list of ranges) sum of the ranges which has to equal to online_nodes_cnt. Everything else can be arbitrary.
That's why it passes the tests... I was wondering how it's even possible that the tests are passing despite the ranges being unsorted here (that's the case now on my server).

yes, these are integers like 10 20 or 20 10 basically denoting the numa distances. Unlike other ranges like core_siblings we don't call sanitize_hw_resource_range() to sanitize this. This is because distance can vary from system to system. All we want is to store the distance count. But again, this will raise the question of why store in pal_res_range_info type. It is because I wanted to use a common function like get_hw_resource_range() to read from the untrusted host which only returns a value of pal_res_range_info type.

Other fields like ranges_cnt and ranges_arr are don't care as far as distance is concerned.

This isn't a range although it is stored in pal_res_range_info. So, it will be treated as individual values instead of range and can be unsorted.

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

What about my question?

Sorry missed it, this was done to create the bare bones for supporting core offlining. I didn't want to keep changing the architecture.

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

This was intended for core offlining PR.

But I thought the idea was that right now we'll refuse to start if some cores in topology are declared as offlined, not that we'll expose inconsistent data to the app. Especially exposing indexes which point to nothing.

This case typically arises when cores are offline right, and as you pointed out, it might not lead to inconsistent data but maybe a performance issue. So, this was left until we have proper core offlining support.

We check resource_cnt and range_cnt are not 0 for shared_cpu_map. So, we will not have empty indexes but missing cores.

Pal/src/host/Linux-SGX/db_main.c, line 601 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

That's not enough, this sanitization misses verification against online and possible cores list. Non-existent indices will pass it.

Yes, we should add a range subset check which will ensure that each core in numa node is a subset of online_cores. Since we are planning to add online_logical_cores being a subset of possible_logical_cores check, the above should be good.

I think if we add a single function to check if one range is a subset of another, we could use it for all these cases.

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

I see, thanks for checking, seems they are indeed hidden. But other interesting thing is that the node1 cpu list is empty, so we should allow empty ranges?

yes, we can change the limits when we call sanitize_hw_resource_range() and also need to tweak this function to add a special check for numa which can have empty range_cnt. But again, let us keep this for core offlining PR.

mkow

Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

yes, these are integers like 10 20 or 20 10 basically denoting the numa distances. Unlike other ranges like core_siblings we don't call sanitize_hw_resource_range() to sanitize this. This is because distance can vary from system to system. All we want is to store the distance count. But again, this will raise the question of why store in pal_res_range_info type. It is because I wanted to use a common function like get_hw_resource_range() to read from the untrusted host which only returns a value of pal_res_range_info type.

Other fields like ranges_cnt and ranges_arr are don't care as far as distance is concerned.

This isn't a range although it is stored in pal_res_range_info. So, it will be treated as individual values instead of range and can be unsorted.

What if untrusted PAL sets the ranges to something where start != end? I think you'll print output in invalid format in /sys/devices/system/node/node<n>/distance?
Also, I really dislike this hack, i.e. "recycling" structures for something completely different than they were intended to. Here we just want a list of integers, so we should use a list of integers.

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Sorry missed it, this was done to create the bare bones for supporting core offlining. I didn't want to keep changing the architecture.

But right now we allow online_logical_cores ranges which are different than [0; online_logical_cores_cnt)]? So, such ranges pass sanitization but are actually unsupported and cause a lot of inconsistencies.

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

as you pointed out, it might not lead to inconsistent data but maybe a performance issue

I didn't say this. I said that right now we allow inconsistent data, e.g. containing invalid core indices.

We check resource_cnt and range_cnt are not 0 for shared_cpu_map. So, we will not have empty indexes but missing cores.

I didn't say empty (and btw, 0 is not "empty", "0" is just a zero index), I said indices pointing nowhere - e.g. indices which doesn't point to any core.

Pal/src/host/Linux-SGX/db_main.c, line 601 at r10 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Yes, we should add a range subset check which will ensure that each core in numa node is a subset of online_cores. Since we are planning to add online_logical_cores being a subset of possible_logical_cores check, the above should be good.

I think if we add a single function to check if one range is a subset of another, we could use it for all these cases.

Yup, I think it will solve at least this specific issue.

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

yes, we can change the limits when we call sanitize_hw_resource_range() and also need to tweak this function to add a special check for numa which can have empty range_cnt. But again, let us keep this for core offlining PR.

Why not just accept empty ranges?

vijaydhanraj

Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

Not yet, I'm still verifying the sanitization and checks in the trusted part.

@mkow I have captured the following changes:

Checkonline_logical_cores is a subset of possible_logical_cores.
Check shared_cpu_map is a subset of online_logical_cores.
Check cores in each numa node is a subset of online_logical_cores.
Change numa distance type from pal_res_range_info to list of integers
Allow empty numa nodes but I still prefer to fail for this case.

I'm not sure if you already have these changes, but if not please do let me know. I can get started and have something for you to review by tomorrow morning.

Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):

What if untrusted PAL sets the ranges to something where start != end? I think you'll print output in invalid format in /sys/devices/system/node/node/distance?

Yes, you are correct, we will print invalid format.

Also, I really dislike this hack, i.e. "recycling" structures for something completely different than they were intended to. Here we just want a list of integers, so we should use a list of integers.

I can clean this up and have it as a list of integers. This will also solve the invalid printing format.

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

But right now we allow online_logical_cores ranges which are different than [0; online_logical_cores_cnt)]? So, such ranges pass sanitization but are actually unsupported and cause a lot of inconsistencies.

I think we agreed that for this issue we need to add online_logical_cores being a subset of possible_logical_cores check. In addition, I suggest we also have a check for possible_logical_cores such that we only allow range_cnt to be 1 and the ranges_arr always start from 0 until max cores.

PS: possible_logical_cores will only have 0-max cores even if we offline cores. So resource_cnt will be n cores, range_cnt will be 1 and range_arr will only have one range starting from 0 until max cores.

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

I didn't say this. I said that right now we allow inconsistent data, e.g. containing invalid core indices.

True :) I interpreted it incorrectly. Typically applications don't check if a cpu shares a cache or not. It will end up using this inconsistent data, with the downside being performance due to incorrect cpu<->cache mapping. Here again, if we make sure shared_cpu_map is a subset of online_logical_cores we can make sure the data is consistent. But even with such a check, the attacker can make share_cpu_map a subset of online_logical_cores but map different caches to different cores which will still cause performance issues.

I didn't say empty (and btw, 0 is not "empty", "0" is just a zero index), I said indices pointing nowhere - e.g. indices which doesn't point to any core.

It can't be indices that don't point to any core as we create a bitmap from the range_arr. So, an index if set will either point to a core that truly shares the cache or point to a core that doesn't share the cache (compromised system) or that core is not for that cache.

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Why not just accept empty ranges?

The numa nodes will only be empty if we offline cores or the attacker has compromised the system. Right now, we don't have proper offlining support to properly validate this case and so I suggested we fail if the numa node is empty. This is the current behavior.

dimakuv

Reviewed all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)

mkow

Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

a discussion (no related file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

@mkow I have captured the following changes:

Checkonline_logical_cores is a subset of possible_logical_cores.

Check shared_cpu_map is a subset of online_logical_cores.

Check cores in each numa node is a subset of online_logical_cores.

Change numa distance type from pal_res_range_info to list of integers

Allow empty numa nodes but I still prefer to fail for this case.

I'm not sure if you already have these changes, but if not please do let me know. I can get started and have something for you to review by tomorrow morning.

I started working on this and will try to solve at least some of the issues. We discussed this with maintainers and will just merge what we have (because it's still much better than the old code) but will need to keep it under the insecure/experimental flag until we make sure that everything is really validated.
And these issues are probably not everything, I still keep finding issues and it's quite hard to tell from the current implementation whether everything is already consistent or not. I'll try to change the code/structures to make verification part simpler and easier to prove.

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I think we agreed that for this issue we need to add online_logical_cores being a subset of possible_logical_cores check. In addition, I suggest we also have a check for possible_logical_cores such that we only allow range_cnt to be 1 and the ranges_arr always start from 0 until max cores.

PS: possible_logical_cores will only have 0-max cores even if we offline cores. So resource_cnt will be n cores, range_cnt will be 1 and range_arr will only have one range starting from 0 until max cores.

Not sure if that's enough: you trim the range in sysfs printing to [0; online_logical_cores_cnt)but in sanitization allow other ranges as well. So, effectively we'll expose the more complex ranges (visible e.g. in counts of resources) but will trim the available info in cpu directory to ranges SET-AND [0; online_logical_cores_cnt).

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

It will end up using this inconsistent data, with the downside being performance

or memory corruption, depending on the app's implementation.

Performance is not an issue, the host can always deny resources.

It can't be indices that don't point to any core as we create a bitmap from the range_arr. So, an index if set will either point to a core that truly shares the cache or point to a core that doesn't share the cache (compromised system) or that core is not for that cache.

But where do we verify that cache_info_arr[lvl].shared_cpu_map contains only valid indices?

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

the attacker has compromised the system

There's no "compromised system" here, the host owner is the attacker.

we don't have proper offlining support to properly validate this case

I don't think there will be anything in the current sanitization that breaks if we remove this check.

vijaydhanraj

Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)

a discussion (no related file):

I started working on this and will try to solve at least some of the issues. We discussed this with maintainers and will just merge what we have (because it's still much better than the old code) but will need to keep it under the insecure/experimental flag until we make sure that everything is really validated.

OK, sounds good. So, do you want me to merge the fixups and rebase them to the latest master? or do you have some changes to push?

And these issues are probably not everything, I still keep finding issues and it's quite hard to tell from the current implementation whether everything is already consistent or not. I'll try to change the code/structures to make verification part simpler and easier to prove.

Sure, we can keep the discussions active.

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Not sure if that's enough: you trim the range in sysfs printing to [0; online_logical_cores_cnt)but in sanitization allow other ranges as well. So, effectively we'll expose the more complex ranges (visible e.g. in counts of resources) but will trim the available info in cpu directory to ranges SET-AND [0; online_logical_cores_cnt).

Can you please give an example of the failure case? It will be easier to understand your case.

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

It will end up using this inconsistent data, with the downside being performance

or memory corruption, depending on the app's implementation.

Performance is not an issue, the host can always deny resources.

It can't be indices that don't point to any core as we create a bitmap from the range_arr. So, an index if set will either point to a core that truly shares the cache or point to a core that doesn't share the cache (compromised system) or that core is not for that cache.

But where do we verify that cache_info_arr[lvl].shared_cpu_map contains only valid indices?

I don't think we can verify this without reading CPUID leaf 0x4 on each core.

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

There's no "compromised system" here, the host owner is the attacker.

By "compromised system", I mean the integrity of the system is lost. It could be due to an attacker who could be a host owner or not.

I don't think there will be anything in the current sanitization that breaks if we remove this check.

How do we differentiate the case where the attacker has willfully tampered and made the node empty Vs truly empty due to offlined cores? Might again lead to inconsistent data?

mkow

Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)

a discussion (no related file):

OK, sounds good. So, do you want me to merge the fixups and rebase them to the latest master? or do you have some changes to push?

I have some changes which I'm still working on, but I can submit them as a follow-up PR, this code is behind the "experimental" flag, so it should be fine to merge the non-polished version for now.

Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

What if untrusted PAL sets the ranges to something where start != end? I think you'll print output in invalid format in /sys/devices/system/node/node/distance?

Yes, you are correct, we will print invalid format.

Also, I really dislike this hack, i.e. "recycling" structures for something completely different than they were intended to. Here we just want a list of integers, so we should use a list of integers.

I can clean this up and have it as a list of integers. This will also solve the invalid printing format.

I'll do it in a separate PR.

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

Can you please give an example of the failure case? It will be easier to understand your case.

online_logical_cores = [[0, 1], [5,6]] and online_logical_cores_cnt = 4, the structures in PAL will say that there are 4 CPUs, but sysfs will show only two of them? Or maybe I'm missing something? Actually, don't you have 2 different numbering for cores? I.e. a different one in ranges and a different one in path component names? (i.e. "cpu").

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I don't think we can verify this without reading CPUID leaf 0x4 on each core.

By "valid" I meant "consistent with other information we present to the app", e.g. the list of online CPUs.

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

By "compromised system", I mean the integrity of the system is lost. It could be due to an attacker who could be a host owner or not.

We don't assume any integrity of the host, there's nothing to be "lost".

How do we differentiate the case where the attacker has willfully tampered and made the node empty Vs truly empty due to offlined cores? Might again lead to inconsistent data?

No, why would it have to lead to inconsistent data? We fully control what we present to the application, we can make it fully consistent.

vijaydhanraj

Reviewable status: 12 of 21 files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mkow)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

OK, sounds good. So, do you want me to merge the fixups and rebase them to the latest master? or do you have some changes to push?

I have some changes which I'm still working on, but I can submit them as a follow-up PR, this code is behind the "experimental" flag, so it should be fine to merge the non-polished version for now.

Rebased to the latest master and pushed the changes.

-- commits, line 25 at r2:

Previously, mkow (Michał Kowalczyk) wrote…

Looks better.

Done.

Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

I'll do it in a separate PR.

OK.

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

I think we agreed that for this issue we need to add online_logical_cores being a subset of possible_logical_cores check. In addition, I suggest we also have a check for possible_logical_cores such that we only allow range_cnt to be 1 and the ranges_arr always start from 0 until max cores.
PS: possible_logical_cores will only have 0-max cores even if we offline cores. So resource_cnt will be n cores, range_cnt will be 1 and range_arr will only have one range starting from 0 until max cores.

With the above changes, and online_logical_cores = [[0, 1], [5,6]] shouldn't gramine start up fail as possible_logical_cores_cnt > online_logical_cores_cnt?

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
OK, that we can make sure by checking the list of cores that are part of share_cpu_map is a subset of online_logical_cores. This we already established right?

But even with such a check, the attacker can make share_cpu_map a subset of online_logical_cores but map different caches to different cores which will still cause performance issues.

This we cannot verify without proper cpuid checks.

Pal/src/host/Linux-SGX/db_main.c, line 601 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Yup, I think it will solve at least this specific issue.

OK.

Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

By "compromised system", I mean the integrity of the system is lost. It could be due to an attacker who could be a host owner or not.

We don't assume any integrity of the host, there's nothing to be "lost".

How do we differentiate the case where the attacker has willfully tampered and made the node empty Vs truly empty due to offlined cores? Might again lead to inconsistent data?

No, why would it have to lead to inconsistent data? We fully control what we present to the application, we can make it fully consistent.

yes, I take back, there wouldn't be inconsistent data. We can add this check.

mkow · 2022-01-31T18:46:30Z

Jenkins, retest this please

dimakuv · 2022-01-31T18:57:34Z

@vijaydhanraj Jenkins fails like this:

In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../Pal/src/host/Linux-common/topo_info.c: In function ‘get_core_topo_info’:
../Pal/src/host/Linux-common/topo_info.c:334:50: error: ‘/online’ directive output may be truncated writing 7 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  334 |             snprintf(filename, sizeof(filename), "%s/online", dirname);
      |                                                  ^~~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:334:53: note: format string is defined here
  334 |             snprintf(filename, sizeof(filename), "%s/online", dirname);
      |                                                     ^~~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 8 and 135 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:334:13: note: in expansion of macro ‘snprintf’
  334 |             snprintf(filename, sizeof(filename), "%s/online", dirname);
      |             ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:345:46: error: ‘/topology/core_id’ directive output may be truncated writing 17 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  345 |         snprintf(filename, sizeof(filename), "%s/topology/core_id", dirname);
      |                                              ^~~~~~~~~~~~~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:345:49: note: format string is defined here
  345 |         snprintf(filename, sizeof(filename), "%s/topology/core_id", dirname);
      |                                                 ^~~~~~~~~~~~~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 18 and 145 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:345:9: note: in expansion of macro ‘snprintf’
  345 |         snprintf(filename, sizeof(filename), "%s/topology/core_id", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:350:46: error: ‘/topology/core_siblings_list’ directive output may be truncated writing 28 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  350 |         snprintf(filename, sizeof(filename), "%s/topology/core_siblings_list", dirname);
      |                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:350:49: note: format string is defined here
  350 |         snprintf(filename, sizeof(filename), "%s/topology/core_siblings_list", dirname);
      |                                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 29 and 156 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:350:9: note: in expansion of macro ‘snprintf’
  350 |         snprintf(filename, sizeof(filename), "%s/topology/core_siblings_list", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:355:46: error: ‘/topology/thread_siblings_list’ directive output may be truncated writing 30 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  355 |         snprintf(filename, sizeof(filename), "%s/topology/thread_siblings_list", dirname);
      |                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:355:49: note: format string is defined here
  355 |         snprintf(filename, sizeof(filename), "%s/topology/thread_siblings_list", dirname);
      |                                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 31 and 158 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:355:9: note: in expansion of macro ‘snprintf’
  355 |         snprintf(filename, sizeof(filename), "%s/topology/thread_siblings_list", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:360:46: error: ‘/topology/physical_package_id’ directive output may be truncated writing 29 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  360 |         snprintf(filename, sizeof(filename), "%s/topology/physical_package_id", dirname);
      |                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:360:49: note: format string is defined here
  360 |         snprintf(filename, sizeof(filename), "%s/topology/physical_package_id", dirname);
      |                                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 30 and 157 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:360:9: note: in expansion of macro ‘snprintf’
  360 |         snprintf(filename, sizeof(filename), "%s/topology/physical_package_id", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:240:46: error: ‘/shared_cpu_list’ directive output may be truncated writing 16 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  240 |         snprintf(filename, sizeof(filename), "%s/shared_cpu_list", dirname);
      |                                              ^~~~~~~~~~~~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:240:49: note: format string is defined here
  240 |         snprintf(filename, sizeof(filename), "%s/shared_cpu_list", dirname);
      |                                                 ^~~~~~~~~~~~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 17 and 144 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:240:9: note: in expansion of macro ‘snprintf’
  240 |         snprintf(filename, sizeof(filename), "%s/shared_cpu_list", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:245:46: error: ‘/level’ directive output may be truncated writing 6 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  245 |         snprintf(filename, sizeof(filename), "%s/level", dirname);
      |                                              ^~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:245:49: note: format string is defined here
  245 |         snprintf(filename, sizeof(filename), "%s/level", dirname);
      |                                                 ^~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 7 and 134 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:245:9: note: in expansion of macro ‘snprintf’
  245 |         snprintf(filename, sizeof(filename), "%s/level", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:251:46: error: ‘/type’ directive output may be truncated writing 5 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  251 |         snprintf(filename, sizeof(filename), "%s/type", dirname);
      |                                              ^~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:251:49: note: format string is defined here
  251 |         snprintf(filename, sizeof(filename), "%s/type", dirname);
      |                                                 ^~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 6 and 133 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:251:9: note: in expansion of macro ‘snprintf’
  251 |         snprintf(filename, sizeof(filename), "%s/type", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:268:46: error: ‘/size’ directive output may be truncated writing 5 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  268 |         snprintf(filename, sizeof(filename), "%s/size", dirname);
      |                                              ^~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:268:49: note: format string is defined here
  268 |         snprintf(filename, sizeof(filename), "%s/size", dirname);
      |                                                 ^~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 6 and 133 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:268:9: note: in expansion of macro ‘snprintf’
  268 |         snprintf(filename, sizeof(filename), "%s/size", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:273:46: error: ‘/coherency_line_size’ directive output may be truncated writing 20 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  273 |         snprintf(filename, sizeof(filename), "%s/coherency_line_size", dirname);
      |                                              ^~~~~~~~~~~~~~~~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:273:49: note: format string is defined here
  273 |         snprintf(filename, sizeof(filename), "%s/coherency_line_size", dirname);
      |                                                 ^~~~~~~~~~~~~~~~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 21 and 148 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:273:9: note: in expansion of macro ‘snprintf’
  273 |         snprintf(filename, sizeof(filename), "%s/coherency_line_size", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:278:46: error: ‘/number_of_sets’ directive output may be truncated writing 15 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  278 |         snprintf(filename, sizeof(filename), "%s/number_of_sets", dirname);
      |                                              ^~~~~~~~~~~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:278:49: note: format string is defined here
  278 |         snprintf(filename, sizeof(filename), "%s/number_of_sets", dirname);
      |                                                 ^~~~~~~~~~~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 16 and 143 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:278:9: note: in expansion of macro ‘snprintf’
  278 |         snprintf(filename, sizeof(filename), "%s/number_of_sets", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:283:46: error: ‘/physical_line_partition’ directive output may be truncated writing 24 bytes into a region of size between 1 and 128 [-Werror=format-truncation=]
  283 |         snprintf(filename, sizeof(filename), "%s/physical_line_partition", dirname);
      |                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
../common/include/api_fortified.h:54:86: note: in definition of macro ‘snprintf’
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |                                                                                      ^~~
../Pal/src/host/Linux-common/topo_info.c:283:49: note: format string is defined here
  283 |         snprintf(filename, sizeof(filename), "%s/physical_line_partition", dirname);
      |                                                 ^~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../common/include/api.h:446,
                 from ../Pal/src/host/Linux-common/topo_info.c:14:
../common/include/api_fortified.h:54:5: note: ‘__builtin___snprintf_chk’ output between 25 and 152 bytes into a destination of size 128
   54 |     __builtin___snprintf_chk(buf, buf_size, __USE_FORTIFY_LEVEL - 1, __api_bos(buf), fmt)
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../Pal/src/host/Linux-common/topo_info.c:283:9: note: in expansion of macro ‘snprintf’
  283 |         snprintf(filename, sizeof(filename), "%s/physical_line_partition", dirname);
      |         ^~~~~~~~
../Pal/src/host/Linux-common/topo_info.c: At top level:
cc1: error: unrecognized command line option ‘-Wno-gnu-alignof-expression’ [-Werror]

Please fix.

This commit creates a clean representation of topology data on the trusted<->untrusted boundary, so that we can easily verify its correctness. The following are done: 1. Refactor sysfs code to convert CPU/NUMA information from Linux-formatted strings to integers in untrusted PAL. 2. Copy and sanitize CPU/NUMA information based on untrusted integers to trusted PAL. 3. Convert the trusted CPU/NUMA information from integers back to Linux-formatted strings in LibOS. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>

mkow

Reviewed 9 of 9 files at r11, 1 of 1 files at r12, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I think we agreed that for this issue we need to add online_logical_cores being a subset of possible_logical_cores check. In addition, I suggest we also have a check for possible_logical_cores such that we only allow range_cnt to be 1 and the ranges_arr always start from 0 until max cores.
PS: possible_logical_cores will only have 0-max cores even if we offline cores. So resource_cnt will be n cores, range_cnt will be 1 and range_arr will only have one range starting from 0 until max cores.

With the above changes, and online_logical_cores = [[0, 1], [5,6]] shouldn't gramine start up fail as possible_logical_cores_cnt > online_logical_cores_cnt?

But we can set possible_logical_cores_cnt to something large? I didn't even use it in the example above.

Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):

OK, that we can make sure by checking the list of cores that are part of share_cpu_map is a subset of online_logical_cores. This we already established right?

I think so.

This we cannot verify without proper cpuid checks.

Yes, but as I said, we don't care about the untrusted host influencing performance, because it has much better ways to do it than this.

mkow · 2022-01-31T22:19:08Z

Jenkins, retest this please

vijaydhanraj · 2022-01-31T22:20:10Z

Thanks @dimakuv, tested with GCC 9.4 and was able to reproduce this issue. Fixed it now.
Thanks @mkow for restarting the tests.

vijaydhanraj

Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

But we can set possible_logical_cores_cnt to something large? I didn't even use it in the example above.

I mean without core offlining online_logical_core_cnt should be equal to possible_logical_cores_cnt. So, if we have holes in online_logical_cores as you mentioned in your example, it should fail.

mkow

Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, vijaydhanraj (Vijay Dhanraj) wrote…

I mean without core offlining online_logical_core_cnt should be equal to possible_logical_cores_cnt. So, if we have holes in online_logical_cores as you mentioned in your example, it should fail.

But we don't have such a check and I think we only discussed checking if online_logical_core_cnt is a subset of possible_logical_cores_cnt, not checking for equality.

vijaydhanraj

Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

But we don't have such a check and I think we only discussed checking if online_logical_core_cnt is a subset of possible_logical_cores_cnt, not checking for equality.

We already have this possible_logical_cores_cnt > online_logical_cores_cnt which will catch the example you showed. But we should change this to equality check until proper core offlining support is added.

mkow

Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)

Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):

We already have this possible_logical_cores_cnt > online_logical_cores_cnt which will catch the example you showed.

It won't, my example allows specifying more possible cores, so that this check passes.

But we should change this to equality check until proper core offlining support is added.

Better to just check for equality of both lists. But we can do this later.

dimakuv

Reviewed 1 of 1 files at r10, 2 of 9 files at r11, 1 of 1 files at r12, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved

vijaydhanraj mentioned this pull request Jan 22, 2022

Refactor sysfs to improve sanitization #106

Closed

mkow requested changes Jan 22, 2022

View reviewed changes

vijaydhanraj commented Jan 23, 2022

View reviewed changes

mkow reviewed Jan 23, 2022

View reviewed changes

vijaydhanraj commented Jan 24, 2022

View reviewed changes

dimakuv suggested changes Jan 24, 2022

View reviewed changes

mkow requested changes Jan 24, 2022

View reviewed changes

mkow requested changes Jan 25, 2022

View reviewed changes

vijaydhanraj commented Jan 25, 2022

View reviewed changes

vijaydhanraj commented Jan 26, 2022

View reviewed changes

dimakuv suggested changes Jan 26, 2022

View reviewed changes

mkow requested changes Jan 27, 2022

View reviewed changes

vijaydhanraj commented Jan 27, 2022

View reviewed changes

dimakuv suggested changes Jan 27, 2022

View reviewed changes

mkow requested changes Jan 27, 2022

View reviewed changes

vijaydhanraj commented Jan 27, 2022

View reviewed changes

mkow requested changes Jan 28, 2022

View reviewed changes

vijaydhanraj commented Jan 28, 2022

View reviewed changes

dimakuv suggested changes Jan 28, 2022

View reviewed changes

mkow reviewed Jan 30, 2022

View reviewed changes

mkow requested changes Jan 30, 2022

View reviewed changes

vijaydhanraj commented Jan 30, 2022

View reviewed changes

mkow reviewed Jan 30, 2022

View reviewed changes

mkow requested changes Jan 30, 2022

View reviewed changes

mkow requested changes Jan 31, 2022

View reviewed changes

vijaydhanraj commented Jan 31, 2022

View reviewed changes

mkow requested changes Jan 31, 2022

View reviewed changes

vijaydhanraj commented Jan 31, 2022

View reviewed changes

dimakuv previously approved these changes Jan 31, 2022

View reviewed changes

mkow requested changes Jan 31, 2022

View reviewed changes

vijaydhanraj commented Jan 31, 2022

View reviewed changes

mkow requested changes Jan 31, 2022

View reviewed changes

vijaydhanraj dismissed dimakuv’s stale review via 74dedd5 January 31, 2022 18:13

vijaydhanraj force-pushed the vdhanraj/refactor_sysfs_part2 branch from 9c491f4 to 74dedd5 Compare January 31, 2022 18:13

vijaydhanraj commented Jan 31, 2022

View reviewed changes

vijaydhanraj force-pushed the vdhanraj/refactor_sysfs_part2 branch from 74dedd5 to bc06c66 Compare January 31, 2022 22:06

vijaydhanraj force-pushed the vdhanraj/refactor_sysfs_part2 branch from bc06c66 to 66f58d0 Compare January 31, 2022 22:11

mkow approved these changes Jan 31, 2022

View reviewed changes

vijaydhanraj commented Jan 31, 2022

View reviewed changes

mkow reviewed Jan 31, 2022

View reviewed changes

vijaydhanraj commented Jan 31, 2022

View reviewed changes

mkow reviewed Jan 31, 2022

View reviewed changes

dimakuv approved these changes Feb 1, 2022

View reviewed changes

dimakuv merged commit 66f58d0 into gramineproject:master Feb 1, 2022

	!sgx_is_completely_outside_enclave(uptr, usize) \|\|
	!sgx_is_completely_within_enclave(ptr, usize)) {

Refactor sysfs to improve sanitization #355

Refactor sysfs to improve sanitization #355

Conversation

vijaydhanraj commented Jan 22, 2022 • edited by mkow Loading

Description of the changes

How to test this PR?

mkow commented Jan 22, 2022

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv commented Jan 24, 2022

dimakuv left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow commented Jan 25, 2022

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow commented Jan 26, 2022

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow commented Jan 26, 2022

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

mkow left a comment

Choose a reason for hiding this comment

vijaydhanraj left a comment

Choose a reason for hiding this comment

dimakuv left a comment

Choose a reason for hiding this comment

mkow left a comment

vijaydhanraj commented Jan 22, 2022 •

edited by mkow

Loading