-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor sysfs to improve sanitization #355
Refactor sysfs to improve sanitization #355
Conversation
Jenkins, test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @vijaydhanraj)
a discussion (no related file):
From AddressSanitizer:
[LibOS.shim.test.regression.test_libos.TC_40_FileSystem.test_040_sysfs]
error: asan: heap-buffer-overflow while trying to store 8 bytes at 0xf766cb8
error: asan: the bad address is 0xf766cb8 (0 from beginning of access)
error: asan: location: create_socket_range_from_core at db_main.c, libpal.so+0x1149bd
error: asan: (for a full traceback, use GDB with a breakpoint at "pal_abort")
error: asan:
error: asan: shadow bytes around the bad address:
error: asan: 0x18001eecd50: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan: 0x18001eecd60: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan: 0x18001eecd70: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan: 0x18001eecd80: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
error: asan: =>0x18001eecd90: fa fa 00 00 fa fa 00[fa]fa fa fa fa fa fa fa fa
error: asan: 0x18001eecda0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan: 0x18001eecdb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan: 0x18001eecdc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan: 0x18001eecdd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
error: asan:
error: asan: shadow byte legend (1 shadow byte = 8 application bytes):
error: asan: addressable: 00
error: asan: partially addressable: 01..07
error: asan: heap left redzone: fa
error: asan: freed heap region: fd
error: asan: stack left redzone: f1
error: asan: stack mid redzone: f2
error: asan: stack right redzone: f3
error: asan: stack after scope: f8
error: asan: alloca left redzone: ca
error: asan: alloca right redzone: cb
error: asan: global redzone: f9
error: asan: user-poisoned: f7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mkow)
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
From AddressSanitizer:
[LibOS.shim.test.regression.test_libos.TC_40_FileSystem.test_040_sysfs] error: asan: heap-buffer-overflow while trying to store 8 bytes at 0xf766cb8 error: asan: the bad address is 0xf766cb8 (0 from beginning of access) error: asan: location: create_socket_range_from_core at db_main.c, libpal.so+0x1149bd error: asan: (for a full traceback, use GDB with a breakpoint at "pal_abort") error: asan: error: asan: shadow bytes around the bad address: error: asan: 0x18001eecd50: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00 error: asan: 0x18001eecd60: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00 error: asan: 0x18001eecd70: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00 error: asan: 0x18001eecd80: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00 error: asan: =>0x18001eecd90: fa fa 00 00 fa fa 00[fa]fa fa fa fa fa fa fa fa error: asan: 0x18001eecda0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa error: asan: 0x18001eecdb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa error: asan: 0x18001eecdc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa error: asan: 0x18001eecdd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa error: asan: error: asan: shadow byte legend (1 shadow byte = 8 application bytes): error: asan: addressable: 00 error: asan: partially addressable: 01..07 error: asan: heap left redzone: fa error: asan: freed heap region: fd error: asan: stack left redzone: f1 error: asan: stack mid redzone: f2 error: asan: stack right redzone: f3 error: asan: stack after scope: f8 error: asan: alloca left redzone: ca error: asan: alloca right redzone: cb error: asan: global redzone: f9 error: asan: user-poisoned: f7
Thanks, @mkow. Can you please tell me how to reproduce this locally?
I tried the following:
1. export CC=clang-9 CXX=clang++-9 AS=clang-9
2. meson setup build/ --buildtype=debug -Ddirect=enabled -Dsgx=enabled -Dsgx_driver=upstream -Dsgx_driver_include_path="" -Dtests=enabled -Ddcap=enabled -Dasan=enabled
This results in the error below,
LibOS/shim/test/regression/meson.build:176:8: ERROR: Problem encountered: Compiling tests is currently unsupported with Musl and compilers other than GCC. You need to either disable Musl (-Dmusl=disabled) or use GCC (CC=gcc)
So as suggested in the error, added -Dmusl=disabled
meson setup build/ --buildtype=debug -Ddirect=enabled -Dsgx=enabled -Dsgx_driver=upstream -Dsgx_driver_include_path="" -Dtests=enabled -Ddcap=enabled -Dasan=enabled -Dmusl=disabled
This worked fine but ninja -C build/
failed with the following error,
ninja: build stopped: subcommand failed.
Also, I see messages like these in the log,
[34/721] Compiling C object Pal/regression/Pie.p/Pie.c.o
FAILED: Pal/regression/Pie.p/Pie.c.o
clang-9 -IPal/regression/Pie.p -IPal/regression -I../Pal/regression -I../Pal/include/pal -I../Pal/include/arch/x86_64 -I../Pal/include/arch/x86_64/Linux -Icommon/include -I../common/include -I../common/include/arch/x86_64 -fcolor-diagnostics -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -std=c11 -O0 -g -Wa,--noexecstack -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Wwrite-strings -Wnull-dereference -DDEBUG -fPIE -fsanitize=address -fno-sanitize-link-runtime -mllvm -asan-mapping-offset=0x18000000000 -DASAN -mllvm -asan-use-after-return=0 -fno-stack-protector -fno-builtin -nostdlib -Wno-unused-parameter -fPIE -MD -MQ Pal/regression/Pie.p/Pie.c.o -MF Pal/regression/Pie.p/Pie.c.o.d -o Pal/regression/Pie.p/Pie.c.o -c ../Pal/regression/Pie.c
clang: error: unknown argument: '-fno-sanitize-link-runtime'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mkow)
a discussion (no related file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Thanks, @mkow. Can you please tell me how to reproduce this locally?
I tried the following:
1. export CC=clang-9 CXX=clang++-9 AS=clang-9 2. meson setup build/ --buildtype=debug -Ddirect=enabled -Dsgx=enabled -Dsgx_driver=upstream -Dsgx_driver_include_path="" -Dtests=enabled -Ddcap=enabled -Dasan=enabled
This results in the error below,
LibOS/shim/test/regression/meson.build:176:8: ERROR: Problem encountered: Compiling tests is currently unsupported with Musl and compilers other than GCC. You need to either disable Musl (-Dmusl=disabled) or use GCC (CC=gcc)
So as suggested in the error, added
-Dmusl=disabled
meson setup build/ --buildtype=debug -Ddirect=enabled -Dsgx=enabled -Dsgx_driver=upstream -Dsgx_driver_include_path="" -Dtests=enabled -Ddcap=enabled -Dasan=enabled -Dmusl=disabled
This worked fine but
ninja -C build/
failed with the following error,ninja: build stopped: subcommand failed.
Also, I see messages like these in the log,
[34/721] Compiling C object Pal/regression/Pie.p/Pie.c.o FAILED: Pal/regression/Pie.p/Pie.c.o clang-9 -IPal/regression/Pie.p -IPal/regression -I../Pal/regression -I../Pal/include/pal -I../Pal/include/arch/x86_64 -I../Pal/include/arch/x86_64/Linux -Icommon/include -I../common/include -I../common/include/arch/x86_64 -fcolor-diagnostics -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Werror -std=c11 -O0 -g -Wa,--noexecstack -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes -Wwrite-strings -Wnull-dereference -DDEBUG -fPIE -fsanitize=address -fno-sanitize-link-runtime -mllvm -asan-mapping-offset=0x18000000000 -DASAN -mllvm -asan-use-after-return=0 -fno-stack-protector -fno-builtin -nostdlib -Wno-unused-parameter -fPIE -MD -MQ Pal/regression/Pie.p/Pie.c.o -MF Pal/regression/Pie.p/Pie.c.o.d -o Pal/regression/Pie.p/Pie.c.o -c ../Pal/regression/Pie.c clang: error: unknown argument: '-fno-sanitize-link-runtime'
I think the below should fix the issue, but want to test with both ASan
and UBSan
before pushing the changes.
diff --git a/Pal/src/host/Linux-SGX/db_main.c b/Pal/src/host/Linux-SGX/db_main.c
index dc41346f..144a69f5 100644
--- a/Pal/src/host/Linux-SGX/db_main.c
+++ b/Pal/src/host/Linux-SGX/db_main.c
@@ -372,8 +372,8 @@ static int create_socket_range_from_core(struct pal_res_range_info* socket_info_
if (socket_id != prev_socket_id) {
socket_info_arr[socket_id].range_cnt++;
- size_t new_size = sizeof(new_size) * socket_info_arr[socket_id].range_cnt;
- size_t old_size = new_size - sizeof(old_size);
+ size_t new_size = sizeof(struct pal_range_info) * socket_info_arr[socket_id].range_cnt;
+ size_t old_size = new_size - sizeof(struct pal_range_info);
struct pal_range_info* tmp = malloc(new_size);
if (!tmp)
return -1;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @vijaydhanraj)
a discussion (no related file):
clang: error: unknown argument: '-fno-sanitize-link-runtime'
Try clang10, yours is probably too old and doesn't support this flag yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 21 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow)
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
clang: error: unknown argument: '-fno-sanitize-link-runtime'
Try clang10, yours is probably too old and doesn't support this flag yet.
Thanks, installed clang-12
and was able to repro the issue. Tested with the above patch and it resolves the issue.
Jenkins, test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 20 of 21 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 124 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)
-- commits, line 6 at r2:
Linux_SGX
-> Linux-SGX
-- commits, line 7 at r2:
...unify
-> ...and unify
-- commits, line 8 at r2:
PAL
-> PALs
-- commits, line 19 at r2:
Actually, I would just remove this sentence and immediately start with 1., 2., 3.
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):
/* Converts an integer to a string, optionally appending a given single-letter unit suffix * (see enum size_multiplier for possible values). * Note: This function adds a newline at end of the string. */
at the end
LibOS/shim/include/shim_fs_pseudo.h, line 216 at r2 (raw file):
* Note: This function adds a newline at end of the string. */ int sys_convert_int_to_sizestr(size_t val, enum size_multiplier size_mult, char* str, size_t buf_size);
We have inconsistency in naming: str
and buf_size
. I suggest to replace str
-> buf
.
LibOS/shim/include/shim_fs_pseudo.h, line 220 at r2 (raw file):
/* Converts struct pal_res_range_info to a string representation. * Example output when sep == ',': "10-63,68,70-127". * Note: This function adds a newline at end of the string. */
at the end
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r2 (raw file):
* Note: This function adds a newline at end of the string. */ int sys_convert_ranges_to_str(const struct pal_res_range_info* resource_range_info, char* str, size_t buf_size, const char* sep);
ditto (buf
)
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r2 (raw file):
* Note: This function adds a newline at end of the string. */ int sys_convert_ranges_to_str(const struct pal_res_range_info* resource_range_info, char* str, size_t buf_size, const char* sep);
Can we move sep
before str
and buf_size
(to become a second argument, not the last one)?
This is more easy on the eyes: we will have two input arguments first, and then two arguments refering to the buffer.
LibOS/shim/include/shim_fs_pseudo.h, line 227 at r2 (raw file):
* on the possible cores count in the system. * Example output for 64 cores in total and ranges 0-15,48-55: "00ff0000,0000ffff". * Note: This function adds a newline at end of the string. */
at the end
LibOS/shim/include/shim_fs_pseudo.h, line 229 at r2 (raw file):
* Note: This function adds a newline at end of the string. */ int sys_convert_ranges_to_cpu_bitmap_str(const struct pal_res_range_info* resource_range_info, char* str, size_t buf_size);
ditto (buf
)
LibOS/shim/src/fs/proc/info.c, line 129 at r2 (raw file):
if (g_pal_public_state->enable_sysfs_topology) { ADD_INFO("physical id\t: %zu\n", ti->core_topology_arr[i].socket_id); }
Can we have an else
clause? It is very unexpected that if sysfs is disabled, then there will be no physical id
string at all.
Maybe simply ADD_INFO("physical id\t: %zu\n", 0);
is good enough for now (i.e., we have only one socket)?
LibOS/shim/src/fs/sys/cache_info.c, line 35 at r2 (raw file):
ret = sys_convert_ranges_to_cpu_bitmap_str(&cache_info->shared_cpu_map, str, sizeof(str)); } else if (strcmp(name, "level") == 0) { ret = sys_convert_int_to_sizestr(cache_info->level, MULTIPLIER_NONE, str, sizeof(str));
Why do you use your special function here, instead of a simple:
ret = snprintf(str, sizeof(str), "%zu\n", cache_info->level);
Same for the below cases (coherency line size, number of sets, ...)
LibOS/shim/src/fs/sys/cache_info.c, line 39 at r2 (raw file):
switch (cache_info->type) { case CACHE_TYPE_DATA: ret = snprintf(str, sizeof(str), "%s", "Data\n");
\n
is typically added to the format string, so please change to this:
ret = snprintf(str, sizeof(str), "%s\n", "Data");
Same for the below two cases.
LibOS/shim/src/fs/sys/cache_info.c, line 67 at r2 (raw file):
if (ret < 0) return ret;
Please add an empty line after this, a bit hard to read otherwise
LibOS/shim/src/fs/sys/cpu_info.c, line 32 at r2 (raw file):
if (ret < 0) return ret;
Please add an empty line after this, a bit hard to read otherwise
LibOS/shim/src/fs/sys/cpu_info.c, line 52 at r2 (raw file):
return -ENOENT; ret = sys_convert_int_to_sizestr(core_topology->is_logical_core_online, MULTIPLIER_NONE, str, sizeof(str));
Why do you use your special function here, instead of a simple:
ret = snprintf(str, sizeof(str), "%zu\n", core_topology->is_logical_core_online);
Same for the below cases (coherency line size, number of sets, ...)
LibOS/shim/src/fs/sys/cpu_info.c, line 69 at r2 (raw file):
if (ret < 0) return ret;
Please add an empty line after this, a bit hard to read otherwise
LibOS/shim/src/fs/sys/fs.c, line 20 at r2 (raw file):
size_t buf_size) { int ret = 0;
Please add an assert that size_mult
is one of NONE
, KB
, MB
, GB
.
LibOS/shim/src/fs/sys/fs.c, line 38 at r2 (raw file):
if ((size_t)ret >= buf_size) ret = -EOVERFLOW; return ret;
Please combine three lines into one via return condition ? true : false
.
LibOS/shim/src/fs/sys/fs.c, line 46 at r2 (raw file):
size_t offset = 0; for (size_t i = 0; i < range_cnt; i++) { if (offset > buf_size)
>=
LibOS/shim/src/fs/sys/fs.c, line 65 at r2 (raw file):
/* Truncation has occurred */ if ((size_t)ret >= buf_size)
buf_size
-> buf_size - offset
(because this is the real "max size of string" argument passed to snprintf)
LibOS/shim/src/fs/sys/fs.c, line 90 at r2 (raw file):
for (size_t j = start; j <= end; j++) { size_t index = j / BITS_IN_TYPE(int);
int
-> uint32_t
LibOS/shim/src/fs/sys/fs.c, line 93 at r2 (raw file):
assert(index < cpumask_cnt); bitmap[index] |= 1U << (j % BITS_IN_TYPE(int));
int
-> uint32_t
LibOS/shim/src/fs/sys/fs.c, line 100 at r2 (raw file):
size_t offset = 0; for (size_t j = cpumask_cnt; j > 0; j--) { if (offset > buf_size) {
``>=
LibOS/shim/src/fs/sys/fs.c, line 119 at r2 (raw file):
/* Truncation has occurred */ if ((size_t)ret >= buf_size) {
buf_size
-> buf_size - offset
(because this is the real "max size of string" argument passed to snprintf)
LibOS/shim/src/fs/sys/node_info.c, line 50 at r2 (raw file):
if (strcmp(parent_name, "hugepages-2048kB") == 0) { ret = sys_convert_int_to_sizestr(numa_topology->nr_hugepages[HUGEPAGES_2M], MULTIPLIER_NONE, str, sizeof(str));
Why do you use your special function here, instead of a simple:
ret = snprintf(str, sizeof(str), "%zu\n", ...);
Same for the below case.
LibOS/shim/src/fs/sys/node_info.c, line 64 at r2 (raw file):
if (ret < 0) return ret;
Please add an empty line after this, a bit hard to read otherwise
Pal/include/arch/x86_64/pal_topology.h, line 3 at r2 (raw file):
/* SPDX-License-Identifier: LGPL-3.0-or-later */ /* Copyright (C) 2022 Intel Corporation * Michał Kowalczyk <mkow@invisiblethingslab.com>
Nit: you can add your copyright too
Pal/include/arch/x86_64/pal_topology.h, line 41 at r2 (raw file):
struct pal_res_range_info { /* Count of total number of resources present. Eg. if output of `/sys/devices/system/cpu/online`
Simply Total number of ...
(remove count of
, it is redundant)
Pal/include/arch/x86_64/pal_topology.h, line 45 at r2 (raw file):
size_t resource_cnt; /* Count of total number of ranges present. Eg. if output of `/sys/devices/system/cpu/online`
Simply Total number of ...
(remove count of
, it is redundant)
Pal/include/arch/x86_64/pal_topology.h, line 49 at r2 (raw file):
size_t range_cnt; /* Array to store each range and has `range_cnt` elements. Eg. if output of
I would rephrase the first sentence like this:
Array of ranges, with `range_cnt` items.
Pal/include/arch/x86_64/pal_topology.h, line 51 at r2 (raw file):
/* Array to store each range and has `range_cnt` elements. Eg. if output of * `/sys/devices/system/cpu/online` were 0-15,21,32-63 then `ranges_arr` will be * [[0, 15], [21, 21], [32 64]]. Note: When there is only a single number instead of range both
You current example looks like a two-dimensional array (array of arrays). But in reality, you have an array of structs. So you should change to:
[ {0, 15}, {21, 21}, {32, 63} ]
Also, did you make a typo? Why do you have 64
instead of 63
in the last item?
Pal/include/arch/x86_64/pal_topology.h, line 52 at r2 (raw file):
* `/sys/devices/system/cpu/online` were 0-15,21,32-63 then `ranges_arr` will be * [[0, 15], [21, 21], [32 64]]. Note: When there is only a single number instead of range both * start and end are assigned start value.
I would remove the Note sentence, it is redundant (you already give an example of [21, 21]
-- no need to explain this representation in words).
Pal/include/arch/x86_64/pal_topology.h, line 53 at r2 (raw file):
* [[0, 15], [21, 21], [32 64]]. Note: When there is only a single number instead of range both * start and end are assigned start value. * Note: The ranges_arr has non-overlapping elements */
I would rephrase to: The ranges should not overlap.
Pal/include/arch/x86_64/pal_topology.h, line 66 at r2 (raw file):
size_t number_of_sets; size_t physical_line_partition; };
Would it be possible to add same-line comments on each of these? With examples. For example, I have no idea what type
of the cache could mean.
UPDATE: type
must be of type enum cache_type
, please fix. Actually, then you can ignore my request for same-line comments, the rest fields are recognizable.
Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):
struct pal_core_topo_info { /* [0] element is uninitialized because core 0 is always online */
Does this comment make sense now? Previously, the string is_logical_core_online
could be uninitialized, but now it's a integer which is always initialized (moreover, it would be reasonable to always set is_logical_core_online = 1
for core 0).
I vote for removing this comment (unless I misunderstand smth).
Pal/include/arch/x86_64/pal_topology.h, line 99 at r2 (raw file):
/* Number of physical packages in the system */ size_t sockets_cnt;
How is this sockets_cnt
different from online_nodes.resource_cnt
? Are you sure you still need this sockets_cnt
variable for anything?
Pal/include/arch/x86_64/pal_topology.h, line 101 at r2 (raw file):
size_t sockets_cnt; /* Number of physical cores in a socket (physical package). */ size_t physical_cores_per_socket;
Is this physical_cores_per_socket
variable still relevant? You have all this info in online_nodes.ranges_arr
. Please check if this variable is really needed.
Pal/src/host/Linux-common/topo_info.c, line 20 at r2 (raw file):
/* Opens a pseudo-file describing HW resources and simply reads the value stored in the file. If * `retval` and `size_mult` are passed, the value read and the size_qualifier if any are stored.
Mentioning retval
in this sentence is misleading, it makes an impression that reval
may be passed as NULL
, but this is not true. In other words, retval
must always be passed to this function. So please remove retval
mention from this sentence (keep only size_mult
).
Pal/src/host/Linux-common/topo_info.c, line 27 at r2 (raw file):
/* `retval` must be passed to store the value read from buffer */ return -EINVAL; }
Instead of the if check, simply assert(retval)
-- this is a local function anyway, so we can make assumptions how it is called.
Pal/src/host/Linux-common/topo_info.c, line 32 at r2 (raw file):
*size_mult = MULTIPLIER_NONE; int fd = DO_SYSCALL(open, filename, O_RDONLY | O_CLOEXEC, 0);
You can remove the last argument (0
), it will still work.
Pal/src/host/Linux-common/topo_info.c, line 36 at r2 (raw file):
return fd; char buf[64];
Instead of 64, cannot we use one of your FILESZ
macros?
Pal/src/host/Linux-common/topo_info.c, line 40 at r2 (raw file):
DO_SYSCALL(close, fd); if (ret < 0) return ret;
Isn't this whole open
+read
+close
sequence exactly the same as what read_file_buffer()
function does? Please replace this code with the call to this function.
Pal/src/host/Linux-common/topo_info.c, line 73 at r2 (raw file):
/* Opens a pseudo-file describing HW resources such as online CPUs and counts the number of * HW resources, ranges present in the file. The result is stored in `struct pal_res_range_info`.
HW resources, ranges
-> HW resources and their ranges
Pal/src/host/Linux-common/topo_info.c, line 73 at r2 (raw file):
/* Opens a pseudo-file describing HW resources such as online CPUs and counts the number of * HW resources, ranges present in the file. The result is stored in `struct pal_res_range_info`.
...stored in `res_info`.
Pal/src/host/Linux-common/topo_info.c, line 75 at r2 (raw file):
* HW resources, ranges present in the file. The result is stored in `struct pal_res_range_info`. * Returns UNIX error code on failure and 0 on success. * N.B: Understands complex formats like "1,3-5,6"
Add a dot at the end of this sentence.
Pal/src/host/Linux-common/topo_info.c, line 81 at r2 (raw file):
/* `res_info` must be passed to store the range info read from buffer */ return -EINVAL; }
Instead of the if check, simply assert(res_info)
-- this is a local function anyway, so we can make assumptions how it is called.
Pal/src/host/Linux-common/topo_info.c, line 88 at r2 (raw file):
res_info->ranges_arr = NULL; int fd = DO_SYSCALL(open, filename, O_RDONLY | O_CLOEXEC, 0);
You can remove the last argument (0
), it will still work.
Pal/src/host/Linux-common/topo_info.c, line 92 at r2 (raw file):
return fd; char buf[64];
Instead of 64, cannot we use one of your FILESZ
macros?
Pal/src/host/Linux-common/topo_info.c, line 96 at r2 (raw file):
DO_SYSCALL(close, fd); if (ret < 0) return ret;
Isn't this whole open
+read
+close
sequence exactly the same as what read_file_buffer()
function does? Please replace this code with the call to this function.
Pal/src/host/Linux-common/topo_info.c, line 101 at r2 (raw file):
char* ptr = buf; size_t resource_cnt = 0;
Why do you need a helper variable? Simply use res_info->resource_cnt
instead.
Pal/src/host/Linux-common/topo_info.c, line 119 at r2 (raw file):
size_t range_end; if (*end == '\0' || *end == ',' || *end == '\n' || *end == ' ') {
Shouldn't you add also *end == '\t'
?
Pal/src/host/Linux-common/topo_info.c, line 120 at r2 (raw file):
if (*end == '\0' || *end == ',' || *end == '\n' || *end == ' ') { /* single HW resource index, count as one more */
This count as one more
comment doesn't make much sense now. I suggest to simply remove this part of the comment.
Pal/src/host/Linux-common/topo_info.c, line 125 at r2 (raw file):
resource_cnt++; } else if (*end == '-') { /* HW resource range, count how many HW resources are in range */
This count how many ...
comment doesn't make much sense now. I suggest to simply remove this part of the comment.
Pal/src/host/Linux-common/topo_info.c, line 142 at r2 (raw file):
goto out; } resource_cnt += end_val - start_val + 1; //inclusive (e.g., 0-7, or 8-16)
end_val - start_val
-> diff
Pal/src/host/Linux-common/topo_info.c, line 142 at r2 (raw file):
goto out; } resource_cnt += end_val - start_val + 1; //inclusive (e.g., 0-7, or 8-16)
The comment is weird. Why are there two examples? I would simply say /* +1 because of inclusive range */
Pal/src/host/Linux-common/topo_info.c, line 150 at r2 (raw file):
/* Update range info */ res_info->range_cnt++;
Please add an empty line after this line, easier to read
Pal/src/host/Linux-common/topo_info.c, line 151 at r2 (raw file):
/* Update range info */ res_info->range_cnt++; /* Realloc as we identify new range when parsing */
Simply Realloc the array of ranges (expand by one range)
Pal/src/host/Linux-common/topo_info.c, line 178 at r2 (raw file):
return 0; out:
You get to this label only if something failed. Please rename it to fail:
.
Pal/src/host/Linux-common/topo_info.c, line 183 at r2 (raw file):
free(res_info[i].ranges_arr); } /* Clear user supplied buffer */
I would remove this comment, it is obvious
Pal/src/host/Linux-common/topo_info.c, line 218 at r2 (raw file):
/* `cache_indices_cnt` must be passed to store the number of cache indices */ return -EINVAL; }
Instead of the if check, simply assert(cache_indices_cnt)
-- this is a local function anyway, so we can make assumptions how it is called.
Pal/src/host/Linux-common/topo_info.c, line 220 at r2 (raw file):
} char buf[1024];
Instead of 1024, cannot we use one of your FILESZ
macros?
Pal/src/host/Linux-common/topo_info.c, line 221 at r2 (raw file):
char buf[1024]; int ret = 0;
Please don't use this pattern of assigning 0
to ret
during init. Instead, add an explicit ret = 0;
statement before the out:
label.
Pal/src/host/Linux-common/topo_info.c, line 249 at r2 (raw file):
ret = -ENOENT; goto out; }
Please add an empty line
Pal/src/host/Linux-common/topo_info.c, line 250 at r2 (raw file):
goto out; } *cache_indices_cnt = dirs_cnt;
Please add ret = 0;
Pal/src/host/Linux-common/topo_info.c, line 266 at r2 (raw file):
} char filename[128];
Instead of 128, cannot we use one of your FILESZ
macros?
Pal/src/host/Linux-common/topo_info.c, line 270 at r2 (raw file):
snprintf(filename, sizeof(filename), "/sys/devices/system/cpu/cpu%zu/cache/index%zu/shared_cpu_list", core_idx, cache_idx);
This is hard to read. You have the part /sys/devices/system/cpu/cpu%zu/cache/index%zu/
always the same -- please move it to a helper variable, and use the helper variable everywhere, like this:
char dirname[128];
snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu/cache/index%zu", core_idx, cache_idx);
snrpintf(filename, sizeof(filename), "%s/shared_cpu_list", dirname);
snrpintf(filename, sizeof(filename), "%s/level", dirname);
... and so on ...
Pal/src/host/Linux-common/topo_info.c, line 285 at r2 (raw file):
core_idx, cache_idx); ret = read_file_buffer(filename, type, ARRAY_SIZE(type)-1);
Why ARRAY_SIZE
? Simply sizeof()
Pal/src/host/Linux-common/topo_info.c, line 336 at r2 (raw file):
return 0; out_cache:
There is no reason to have this specific name, please rename to simply out:
Pal/src/host/Linux-common/topo_info.c, line 363 at r2 (raw file):
/* TODO: correctly support offline cores */ if (possible_logical_cores_cnt > online_logical_cores_cnt) { log_error("Some CPUs seem to be offline; Gramine currently doesn't support core offling");
offling
-> offlining
Pal/src/host/Linux-common/topo_info.c, line 377 at r2 (raw file):
size_t current_max_socket = 0; char filename[128];
Instead of 128, cannot we use one of your FILESZ
macros?
Pal/src/host/Linux-common/topo_info.c, line 381 at r2 (raw file):
/* cpu0 is always online and thus the "online" file is not present. */ if (idx != 0) { snprintf(filename, sizeof(filename), "/sys/devices/system/cpu/cpu%zu/online", idx);
This is hard to read. You have the part /sys/devices/system/cpu/cpu%zu/
always the same -- please move it to a helper variable, and use the helper variable everywhere, like this:
char dirname[128];
snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu", idx);
snrpintf(filename, sizeof(filename), "%s/online", dirname);
snrpintf(filename, sizeof(filename), "%s/topology/core_id", dirname);
... and so on ...
Pal/src/host/Linux-common/topo_info.c, line 386 at r2 (raw file):
if (ret < 0) goto out; }
Please add else { core_topology_arr[idx].is_logical_core_online = 1; }
Pal/src/host/Linux-common/topo_info.c, line 425 at r2 (raw file):
topo_info->sockets_cnt = current_max_socket + 1; topo_info->physical_cores_per_socket = core_topology_arr[0].core_siblings.resource_cnt / core_topology_arr[0].thread_siblings.resource_cnt;
As I mentioned previously, I think you don't need these two variables anymore (sockets_cnt
and physical_cores_per_socket
).
Pal/src/host/Linux-SGX/db_main.c, line 126 at r2 (raw file):
} static int copy_resource_range_to_enclave(struct pal_res_range_info* untrusted_src,
Please add a comment that this function and all below functions do not care about correctly freeing memory.
Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):
static int copy_resource_range_to_enclave(struct pal_res_range_info* untrusted_src, struct pal_res_range_info* dest) { size_t range_cnt = untrusted_src->range_cnt;
To access untrusted base-type variables, you should always use READ_ONCE()
. In this case:
size_t range_cnt = READ_ONCE(untrusted_src->range_cnt);
Pal/src/host/Linux-SGX/db_main.c, line 135 at r2 (raw file):
} if (!sgx_copy_to_enclave(ranges_arr, range_cnt * sizeof(*ranges_arr), untrusted_src->ranges_arr,
READ_ONCE(untrusted_src->ranges_arr)
(For examples, you can check enclave_ocalls.c
)
Pal/src/host/Linux-SGX/db_main.c, line 143 at r2 (raw file):
dest->ranges_arr = ranges_arr; dest->range_cnt = range_cnt; dest->resource_cnt = untrusted_src->resource_cnt;
READ_ONCE
Pal/src/host/Linux-SGX/db_main.c, line 158 at r2 (raw file):
for (size_t idx = 0; idx < online_logical_cores_cnt; idx++) { core_topo_arr[idx].is_logical_core_online = untrusted_src[idx].is_logical_core_online;
Here and everywhere: READ_ONCE()
Pal/src/host/Linux-SGX/db_main.c, line 165 at r2 (raw file):
&core_topo_arr[idx].core_siblings); if (ret < 0) { log_error("Copying core_topo[%zu].core_siblings failed", idx);
core_topo_arr
, please fix all such issues in your logs
Pal/src/host/Linux-SGX/db_main.c, line 185 at r2 (raw file):
for (size_t lvl = 0; lvl < cache_indices_cnt; lvl++) { cache_info_arr[lvl].level = untrusted_src[idx].cache_info_arr[lvl].level;
Can you introduce a helper variable, otherwise it's hard to read. Like this:
struct pal_core_cache_info* untrusted_cache_info = READ_ONCE(&untrusted_src[idx].cache_info_arr[lvl]);
cache_info_arr[lvl].type = READ_ONCE(untrusted_cache_info->type);
...
Pal/src/host/Linux-SGX/db_main.c, line 206 at r2 (raw file):
} } core_topo_arr[idx].cache_info_arr = cache_info_arr;
Please add an empty line before this one.
Pal/src/host/Linux-SGX/db_main.c, line 208 at r2 (raw file):
core_topo_arr[idx].cache_info_arr = cache_info_arr; } g_pal_public_state.topo_info.core_topology_arr = core_topo_arr;
I don't like that this generic function suddenly updates a global variable -- this is an unexpected side-effect of this function. Instead, please add a new function argument which will be a pointer to the topo_info
object, and pass &g_pal_public_state.topo_info
as this argument in the caller.
Pal/src/host/Linux-SGX/db_main.c, line 222 at r2 (raw file):
for (size_t idx = 0; idx < online_nodes_cnt; idx++) { numa_topo_arr[idx].nr_hugepages[HUGEPAGES_2M] = untrusted_src[idx].nr_hugepages[HUGEPAGES_2M];
READ_ONCE()
Pal/src/host/Linux-SGX/db_main.c, line 228 at r2 (raw file):
&numa_topo_arr[idx].cpumap); if (ret < 0) { log_error("Copying numa_topo[%zu].core_siblings failed", idx);
numa_topo_arr
Pal/src/host/Linux-SGX/db_main.c, line 239 at r2 (raw file):
} } g_pal_public_state.topo_info.numa_topology_arr = numa_topo_arr;
I don't like that this generic function suddenly updates a global variable -- this is an unexpected side-effect of this function. Instead, please add a new function argument which will be a pointer to the topo_info
object, and pass &g_pal_public_state.topo_info
as this argument in the caller.
Pal/src/host/Linux-SGX/db_main.c, line 247 at r2 (raw file):
* 1. Ensures the resource as well as range count doesn't exceed limits. * 2. Ensures that ranges don't overlap like "1-5, 3-4". * 3. Ensures the ranges aren't malformed like "1-5, 7-1".
Can you add logic to verify that resource_cnt
is correct and corresponds to ranges in res_info->ranges_arr
? I mean, you can do a for loop over res_info->ranges_arr[idx]
and sum up the range diffs. This will add protection against malicious resource_cnt
.
Pal/src/host/Linux-SGX/db_main.c, line 261 at r2 (raw file):
size_t range_cnt = res_info->range_cnt; if (!IS_IN_RANGE_INCL(range_cnt, 1, 1 << 7)) { log_error("Invalid range count: %zu", range_cnt);
Invalid number of ranges
Pal/src/host/Linux-SGX/db_main.c, line 277 at r2 (raw file):
/* Ensure start and end fall within range limits */ if (!IS_IN_RANGE_INCL(start, range_min_limit, range_max_limit)) { log_error("Invalid start range: %zu", start);
Invalid start of range
Pal/src/host/Linux-SGX/db_main.c, line 282 at r2 (raw file):
if ((start != end) && !IS_IN_RANGE_INCL(end, start + 1, range_max_limit)) { log_error("Invalid end range: %zu", end);
Invalid end of range
Pal/src/host/Linux-SGX/db_main.c, line 289 at r2 (raw file):
*`previous_end` is not yet initialized. */ if (check_for_overlaps && previous_end >= start) { log_error("Malformed range: previous_end = %zu, current start = %zu", previous_end,
Malformed range
-> Overlapping ranges
Pal/src/host/Linux-SGX/db_main.c, line 315 at r2 (raw file):
cache_info_arr[lvl].type == CACHE_TYPE_INSTRUCTION) { /* Taking HT into account */ max_limit = 2;
Please introduce a macro MAX_HYPERTHREADS_PER_CORE
(or something like this) in one of the header files instead of hard-coding 2
here.
Pal/src/host/Linux-SGX/db_main.c, line 321 at r2 (raw file):
} int ret = sanitize_hw_resource_range(&cache_info_arr[lvl].shared_cpu_map, 1, max_limit, 0,
Can you add a comment with an example of shared_cpu_map
? I'm confused how this can be in range 1..2
for the case of DATA cache...
Pal/src/host/Linux-SGX/db_main.c, line 328 at r2 (raw file):
} size_t level = cache_info_arr[lvl].level;
What's the point in this level
helper variable?
Pal/src/host/Linux-SGX/db_main.c, line 329 at r2 (raw file):
size_t level = cache_info_arr[lvl].level; if (!IS_IN_RANGE_INCL(level, 1, 3)) /* x86 processors have max of 3 cache levels */
Please introduce a macro MAX_CACHE_LEVELS
(or something like this) in one of the header files instead of hard-coding 3
here.
Pal/src/host/Linux-SGX/db_main.c, line 337 at r2 (raw file):
cache_info_arr[lvl].size_multiplier != MULTIPLIER_NONE) { return -1; }
Why not remove this big IF and just add another else {return -1}
below?
Pal/src/host/Linux-SGX/db_main.c, line 354 at r2 (raw file):
return -1; size_t coherency_line_size = cache_info_arr[lvl].coherency_line_size;
What's the point in this helper variable?
Pal/src/host/Linux-SGX/db_main.c, line 358 at r2 (raw file):
return -1; size_t number_of_sets = cache_info_arr[lvl].number_of_sets;
What's the point in this helper variable?
Pal/src/host/Linux-SGX/db_main.c, line 362 at r2 (raw file):
return -1; size_t physical_line_partition = cache_info_arr[lvl].physical_line_partition;
What's the point in this helper variable?
Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):
} static int create_socket_range_from_core(struct pal_res_range_info* socket_info_arr,
I highly dislike this function. I don't understand what it does exactly, the name is definitely wrong (create
? this func doesn't create anything). I don't know how many items are in the socket_info_arr
, and in general this function has many assumptions on what its caller does.
Please move this function into its caller. Or make this function more self-contained and self-explanatory.
Pal/src/host/Linux-SGX/db_main.c, line 402 at r2 (raw file):
struct pal_res_range_info* socket_info_arr, size_t sockets_cnt) { /* core-siblings represent all the cores that are part of a socket. We cross-verify the * socket info against this. */
I suggest the comment:
For each socket, cross-verify that its set of cores is the same as the core topology's core-siblings:
- Pick the first core in the socket
- Find its core-siblings in the core topology
- Verify that the array of cores in the socket object is exactly the same as core-siblings in the core topology object
Then you can remove the other comment (pick the first ...
)
Pal/src/host/Linux-SGX/db_main.c, line 409 at r2 (raw file):
/* pick the first core in the socket */ size_t core_in_socket = socket_info_arr[idx].ranges_arr[0].start;
You have double-space here
Pal/src/host/Linux-SGX/db_main.c, line 410 at r2 (raw file):
/* pick the first core in the socket */ size_t core_in_socket = socket_info_arr[idx].ranges_arr[0].start; size_t core_sibling_cnt = core_topology_arr[core_in_socket].core_siblings.range_cnt;
Instead of core_sibling_cnt
, I suggest to use:
struct pal_res_range_info* core_siblings = &core_topology_arr[core_in_socket].core_siblings;
/* and then use this helper variable */
if (core_siblings->range_cnt != socket_info_arr[idx].range_cnt) { ...
Pal/src/host/Linux-SGX/db_main.c, line 433 at r2 (raw file):
size_t sockets_cnt) { int ret; if (online_logical_cores_cnt == 0 || cache_indices_cnt == 0)
Shouldn't you check sockets_cnt
as well?
Pal/src/host/Linux-SGX/db_main.c, line 449 at r2 (raw file):
} size_t core_id = core_topology_arr[idx].core_id;
What's the point in this helper variable? Please remove.
Pal/src/host/Linux-SGX/db_main.c, line 456 at r2 (raw file):
ret = sanitize_hw_resource_range(&core_topology_arr[idx].core_siblings, 1, online_logical_cores_cnt, 0, online_logical_cores_cnt);
Shouldn't the last argument be online_logical_cores_cnt - 1
(note the - 1
)?
Pal/src/host/Linux-SGX/db_main.c, line 463 at r2 (raw file):
/* Max. SMT siblings currently supported on x86 processors is 4 */ ret = sanitize_hw_resource_range(&core_topology_arr[idx].thread_siblings, 1, 4, 0,
In some other place you said that the max SMT siblings are 2 :)
Please introduce a macro MAX_HYPERTHREADS_PER_CORE
(or something like this) in one of the header files instead of hard-coding 4
here.
Pal/src/host/Linux-SGX/db_main.c, line 464 at r2 (raw file):
/* Max. SMT siblings currently supported on x86 processors is 4 */ ret = sanitize_hw_resource_range(&core_topology_arr[idx].thread_siblings, 1, 4, 0, online_logical_cores_cnt);
Shouldn't the last argument be online_logical_cores_cnt - 1
(note the - 1
)?
Pal/src/host/Linux-SGX/db_main.c, line 486 at r2 (raw file):
* #1. From the socket_id of each core, create a range of cores present in each socket. * #2. Compare range of cores in each socket core_siblings which basically is the all cores * in a physical package (socket).
Please use:
* #2. Compare array of cores in each socket against the array of core-siblings from
the core topology.
Pal/src/host/Linux-SGX/db_main.c, line 499 at r2 (raw file):
log_error("Copying topo_info into the enclave failed"); return -1; }
Why did you remove this shallow copy? This is important! Please restore this part and use temp_topo_info
.
Pal/src/host/Linux-SGX/db_main.c, line 512 at r2 (raw file):
size_t online_nodes_cnt, size_t online_logical_cores_cnt, size_t possible_logical_cores_cnt) { int ret = 0;
Please don't init with zero. Instead, keep int ret;
and right-before the out label, do ret = 0;
Pal/src/host/Linux-SGX/db_main.c, line 521 at r2 (raw file):
for (size_t idx = 0; idx < online_nodes_cnt; idx++) { ret = sanitize_hw_resource_range(&numa_topology_arr[idx].cpumap, 1, online_logical_cores_cnt, 0, online_logical_cores_cnt);
-1
in the last argument?
Pal/src/host/Linux-SGX/db_main.c, line 532 at r2 (raw file):
size_t end = numa_topology_arr[idx].cpumap.ranges_arr[i].end; for (size_t j = start; j <= end; j++) { size_t index = j / BITS_IN_TYPE(int);
BITS_IN_TYPE(uint32_t)
Pal/src/host/Linux-SGX/db_main.c, line 533 at r2 (raw file):
for (size_t j = start; j <= end; j++) { size_t index = j / BITS_IN_TYPE(int); assert(index < cpumask_cnt);
I think this shouldn't be an assert but instead a proper if (index >= cpumask_cnt) { fail }
Pal/src/host/Linux-SGX/db_main.c, line 535 at r2 (raw file):
assert(index < cpumask_cnt); if (bitmap[index] & (1U << (j % BITS_IN_TYPE(int)))) {
BITS_IN_TYPE(uint32_t)
Pal/src/host/Linux-SGX/db_main.c, line 540 at r2 (raw file):
goto out_numa; } bitmap[index] |= 1U << (j % BITS_IN_TYPE(int));
BITS_IN_TYPE(uint32_t)
Pal/src/host/Linux-SGX/db_main.c, line 547 at r2 (raw file):
size_t distances = numa_topology_arr[idx].distance.resource_cnt; if (distances != online_nodes_cnt) { log_error("Invalid numa_topology_arr[%zu].distance", idx);
Invalid numa_topology: number of distances is not the same as number of NUMA nodes
Pal/src/host/Linux-SGX/db_main.c, line 554 at r2 (raw file):
if (total_cores_in_numa != online_logical_cores_cnt) { log_error("Invalid numa_topology: more cores in NUMA than online");
more
-> not the same number of
Pal/src/host/Linux-SGX/db_main.c, line 558 at r2 (raw file):
goto out_numa; }
Here must be ret = 0;
Pal/src/host/Linux-SGX/db_main.c, line 559 at r2 (raw file):
} out_numa:
Rename to simply out:
Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):
memset(topo_info, 0, sizeof(*topo_info)); ret = copy_resource_range_to_enclave(&uptr_topo_info->possible_logical_cores,
This is wrong... You're accessing somewhere inside the untrusted uptr_topo_info
. You removed the verification-and-copy of this uptr_topo_info
, so it may point anywhere (attacker can choose this address!) inside the enclave memory, and everything will break.
Pal/src/host/Linux-SGX/db_main.c, line 740 at r2 (raw file):
/* Virtual environments such as QEMU may assign each core to a separate socket/package with * one or more NUMA nodes. So we check against the number of online logical cores. */ if (!IS_IN_RANGE_INCL(topo_info->sockets_cnt , 1, online_logical_cores_cnt)) {
Remove space before ,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 125 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)
Pal/src/host/Linux-common/topo_info.c, line 181 at r2 (raw file):
for (size_t i = 0; i < res_info->range_cnt; i++) { if (res_info[i].ranges_arr) free(res_info[i].ranges_arr);
??? How can this work? Instead of this for loop, it should be simply:
if (res_info->ranges_arr)
free(res_info->ranges_arr);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 125 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @vijaydhanraj)
Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):
Migrating the discussion from #106 again:
Why does this multiplier even exists? Wasn't the idea that what host passes should be as simple and normalized as possible? Why not just pass the number of bytes everywhere and just in that int->str function shorten it if it's divisible by some size unit? (for specific rounding rules please check the kernel source)
With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".
And another problem: what if the untrusted host gives you
LONG_MAX
gigabytes? I think in some places you'll just pass it to the app without erroring out, because you never needed to multiply it, so there was no chance for overflow detection (which will certainly happen in the user app then).
Why does this multiplier even exists?
We have a multiplier because we exactly need to know how the normal kernel is representing the strings. For example, on reading
/sys/devices/system/cpu/cpu2/cache/index3/size
we get49152K
. Now this can be represented as48M
or49152K
and in this case, kernel chose the latter representation, and we need to give the same to the end-user.With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".
Please see the earlier comment.
And another problem: what if the untrusted host gives you
LONG_MAX
gigabytes?Please see here (lines 263-267) https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R263, I do multiply the size qualifier and check the limits when sanitizing so we shouldn't be running into overflow issues in this case. If you see the issue somewhere, please point me to it.
And finally my next reply: (i.e. from this review iteration)
in this case, kernel chose the latter representation
The problem is that it's the untrusted kernel (or actually just host, not necessarily kernel) which chose these. We can't rely on it choosing proper specifiers.
This link doesn't work (or at least is not pointing me to any particular place in the PR).
The problem is that it's the untrusted kernel (or actually just host, not necessarily kernel) which chose these. We can't rely on it choosing proper specifiers.
yes agree, that is why we take the values passed by the untrusted host and then run a sanitization check on it. This way we can ensure that the untrusted host is not passing garbage or malicious value and maintain the representation as the host.
This link doesn't work (or at least is not pointing me to any particular place in the PR).
Please see here, https://github.com/gramineproject/gramine/pull/106/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R259, lines 259-262
yes agree, that is why we take the values passed by the untrusted host and then run a sanitization check on it. This way we can ensure that the untrusted host is not passing garbage or malicious value and maintain the representation as the host.
But as I said, you aren't sanitizing the unit, e.g.
1000000000K
passes the current sanitization if I'm not mistaken.Please see here, https://github.com/gramineproject/gramine/pull/106/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R259, lines 259-262
Same, this links to the whole PR diff for me. Can you just give me the filename and the line in current revision?
[end of old quotes]
I still think we should normalize the data and just pass a single number instead of number + unit. The unit is untrusted, so you can't say that we need to follow the host kernel - we don't know what the host kernel really emits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 21 files at r1, all commit messages.
Reviewable status: all files reviewed, 128 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @vijaydhanraj)
a discussion (no related file):
@vijaydhanraj Could you move the first two commits into a separate PR? They are trivial and we should be able do merge them quickly.
-- commits, line 8 at r2:
direcly
-> directly
-- commits, line 25 at r2:
I'd change this description, the whole idea of this change is not just to pass integers instead of strings, but to use more normalized and clean representation of the topology data on the trusted<->untrusted boundary, so that we can reason about its correctness easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
@vijaydhanraj Could you move the first two commits into a separate PR? They are trivial and we should be able do merge them quickly.
Sure @mkow, I will create a new PR with those 2 commits. I will remove those commits from this PR before final rebase. There are lot of discussions here and so don't want to force push.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Linux_SGX
->Linux-SGX
Blocked myself. Will do this as part of the final rebase.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
...unify
->...and unify
Blocked myself. Will do this as part of the final rebase.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
PAL
->PALs
Blocked myself. Will do this as part of the final rebase.
Previously, mkow (Michał Kowalczyk) wrote…
direcly
->directly
Blocked myself. Will do this as part of the final rebase.
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, I would just remove this sentence and immediately start with 1., 2., 3.
Blocked myself. Will do this as part of the final rebase.
Previously, mkow (Michał Kowalczyk) wrote…
I'd change this description, the whole idea of this change is not just to pass integers instead of strings, but to use more normalized and clean representation of the topology data on the trusted<->untrusted boundary, so that we can reason about its correctness easier.
How about the following?
This commit creates a clean representation of topology data on the trusted<->untrusted boundary, so that we can easily verify its correctness. The following are done:
1. Refactor sysfs code to convert CPU/NUMA information from
Linux-formatted strings to integers in untrusted PAL.
2. Copy and sanitize CPU/NUMA information based on untrusted
integers to trusted PAL.
3. Convert the trusted CPU/NUMA information from integers back
to Linux-formatted strings in LibOS.
LibOS/shim/include/shim_fs_pseudo.h, line 214 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
at the end
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 216 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We have inconsistency in naming:
str
andbuf_size
. I suggest to replacestr
->buf
.
I think it will be better to change buf_size
-> str_size
as it seems to fit with the function names. So have changed it this way. This will also align with what we had discussed before regarding the naming of the char buffers.
Let me know if you want to revert it to buf
and buf_size
.
LibOS/shim/include/shim_fs_pseudo.h, line 220 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
at the end
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto (
buf
)
Done, but renamed buf_size
-> str_size
.
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Can we move
sep
beforestr
andbuf_size
(to become a second argument, not the last one)?This is more easy on the eyes: we will have two input arguments first, and then two arguments refering to the buffer.
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 227 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
at the end
Done.
LibOS/shim/include/shim_fs_pseudo.h, line 229 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto (
buf
)
Done, but renamed buf_size
-> str_size
.
LibOS/shim/src/fs/proc/info.c, line 129 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Can we have an
else
clause? It is very unexpected that if sysfs is disabled, then there will be nophysical id
string at all.Maybe simply
ADD_INFO("physical id\t: %zu\n", 0);
is good enough for now (i.e., we have only one socket)?
This will be incorrect as most servers have more than 1 socket. We anyway only print a selected few values. So, unless an application is failing, or we are breaking something in Gramine, I wouldn't suggest adding a dummy value.
LibOS/shim/src/fs/sys/cache_info.c, line 35 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do you use your special function here, instead of a simple:
ret = snprintf(str, sizeof(str), "%zu\n", cache_info->level);
Same for the below cases (coherency line size, number of sets, ...)
This was done for the cache size
case where we check and append the size qualifier like K
, M
, and G
and ended up using it for other cases like coherency line size
, number of sets
as well. Also felt it will be more uniform.
LibOS/shim/src/fs/sys/cache_info.c, line 39 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
\n
is typically added to the format string, so please change to this:ret = snprintf(str, sizeof(str), "%s\n", "Data");
Same for the below two cases.
Done.
LibOS/shim/src/fs/sys/cache_info.c, line 67 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add an empty line after this, a bit hard to read otherwise
Done.
LibOS/shim/src/fs/sys/cpu_info.c, line 32 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add an empty line after this, a bit hard to read otherwise
Done.
LibOS/shim/src/fs/sys/cpu_info.c, line 52 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do you use your special function here, instead of a simple:
ret = snprintf(str, sizeof(str), "%zu\n", core_topology->is_logical_core_online);
Same for the below cases (coherency line size, number of sets, ...)
Please see my earlier comment on this.
LibOS/shim/src/fs/sys/cpu_info.c, line 69 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add an empty line after this, a bit hard to read otherwise
Done.
LibOS/shim/src/fs/sys/fs.c, line 20 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add an assert that
size_mult
is one ofNONE
,KB
,MB
,GB
.
Good point, but I think it is better to return an error instead of an assert
.
LibOS/shim/src/fs/sys/fs.c, line 38 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please combine three lines into one via
return condition ? true : false
.
I added an additional check as ret
can be negative in case of error and so didn't end up combining into a single return. But if you think it is better, I can do something like,
if (ret < 0)
return ret;
return ((size_t)ret >= str_size) ? -EOVERFLOW : 0
LibOS/shim/src/fs/sys/fs.c, line 46 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
>=
Correct, offset doesn't account for the null terminator.
LibOS/shim/src/fs/sys/fs.c, line 65 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
buf_size
->buf_size - offset
(because this is the real "max size of string" argument passed to snprintf)
you are right, fixed it.
LibOS/shim/src/fs/sys/fs.c, line 90 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
int
->uint32_t
Done.
LibOS/shim/src/fs/sys/fs.c, line 93 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
int
->uint32_t
Done.
LibOS/shim/src/fs/sys/fs.c, line 100 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
``>=
Done.
LibOS/shim/src/fs/sys/fs.c, line 119 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
buf_size
->buf_size - offset
(because this is the real "max size of string" argument passed to snprintf)
Done.
LibOS/shim/src/fs/sys/node_info.c, line 50 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do you use your special function here, instead of a simple:
ret = snprintf(str, sizeof(str), "%zu\n", ...);
Same for the below case.
Please see my earlier comment on this.
LibOS/shim/src/fs/sys/node_info.c, line 64 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add an empty line after this, a bit hard to read otherwise
Done.
Pal/include/arch/x86_64/pal_topology.h, line 3 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Nit: you can add your copyright too
Done.
Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Migrating the discussion from #106 again:
Why does this multiplier even exists? Wasn't the idea that what host passes should be as simple and normalized as possible? Why not just pass the number of bytes everywhere and just in that int->str function shorten it if it's divisible by some size unit? (for specific rounding rules please check the kernel source)
With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".
And another problem: what if the untrusted host gives you
LONG_MAX
gigabytes? I think in some places you'll just pass it to the app without erroring out, because you never needed to multiply it, so there was no chance for overflow detection (which will certainly happen in the user app then).Why does this multiplier even exists?
We have a multiplier because we exactly need to know how the normal kernel is representing the strings. For example, on reading
/sys/devices/system/cpu/cpu2/cache/index3/size
we get49152K
. Now this can be represented as48M
or49152K
and in this case, kernel chose the latter representation, and we need to give the same to the end-user.With current version you have a problem that you may pass to the app things which normal kernel would never pass - e.g. "1024K" instead of "1M", or even things like "1000000000K".
Please see the earlier comment.
And another problem: what if the untrusted host gives you
LONG_MAX
gigabytes?Please see here (lines 263-267) https://github.com/gramineproject/graphene/pull/2661/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R263, I do multiply the size qualifier and check the limits when sanitizing so we shouldn't be running into overflow issues in this case. If you see the issue somewhere, please point me to it.
And finally my next reply: (i.e. from this review iteration)
in this case, kernel chose the latter representation
The problem is that it's the untrusted kernel (or actually just host, not necessarily kernel) which chose these. We can't rely on it choosing proper specifiers.
This link doesn't work (or at least is not pointing me to any particular place in the PR).
The problem is that it's the untrusted kernel (or actually just host, not necessarily kernel) which chose these. We can't rely on it choosing proper specifiers.
yes agree, that is why we take the values passed by the untrusted host and then run a sanitization check on it. This way we can ensure that the untrusted host is not passing garbage or malicious value and maintain the representation as the host.
This link doesn't work (or at least is not pointing me to any particular place in the PR).
Please see here, https://github.com/gramineproject/gramine/pull/106/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R259, lines 259-262yes agree, that is why we take the values passed by the untrusted host and then run a sanitization check on it. This way we can ensure that the untrusted host is not passing garbage or malicious value and maintain the representation as the host.
But as I said, you aren't sanitizing the unit, e.g.
1000000000K
passes the current sanitization if I'm not mistaken.Please see here, https://github.com/gramineproject/gramine/pull/106/files#diff-cc6f581d340e5e5a512e7693bb560ee623a287dab6d3d24c47410912a2ddd305R259, lines 259-262
Same, this links to the whole PR diff for me. Can you just give me the filename and the line in current revision?
[end of old quotes]
I still think we should normalize the data and just pass a single number instead of number + unit. The unit is untrusted, so you can't say that we need to follow the host kernel - we don't know what the host kernel really emits.
That is true but why do you say only the unit is untrusted? Even the value is untrusted right?
Also, for cache size, the expectation of the application will be a value followed by unit. By normalizing we are changing this format.
Pal/include/arch/x86_64/pal_topology.h, line 41 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Simply
Total number of ...
(removecount of
, it is redundant)
Done.
Pal/include/arch/x86_64/pal_topology.h, line 45 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Simply
Total number of ...
(removecount of
, it is redundant)
Done.
Pal/include/arch/x86_64/pal_topology.h, line 49 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I would rephrase the first sentence like this:
Array of ranges, with `range_cnt` items.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 51 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You current example looks like a two-dimensional array (array of arrays). But in reality, you have an array of structs. So you should change to:
[ {0, 15}, {21, 21}, {32, 63} ]
Also, did you make a typo? Why do you have
64
instead of63
in the last item?
yes, it should be 63. Also changed the example.
Pal/include/arch/x86_64/pal_topology.h, line 52 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I would remove the Note sentence, it is redundant (you already give an example of
[21, 21]
-- no need to explain this representation in words).
I wanted to make this explicit. After a few months, I myself might miss this :). But I can remove it if it is too obvious.
Pal/include/arch/x86_64/pal_topology.h, line 53 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I would rephrase to:
The ranges should not overlap.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 66 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Would it be possible to add same-line comments on each of these? With examples. For example, I have no idea what
type
of the cache could mean.UPDATE:
type
must be of typeenum cache_type
, please fix. Actually, then you can ignore my request for same-line comments, the rest fields are recognizable.
yes, done.
Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):
By uninitialized, I mean some garbage value. We don't read this from the host as the file is not present. Also, when the application queries for this path, we return -ENOENT
from LibOS.
$ cat /sys/devices/system/cpu/cpu0/online
cat: /sys/devices/system/cpu/cpu0/online: No such file or directory
reasonable to always set is_logical_core_online = 1
Given this why set is_logical_core_online
to 1?
I put this as a reminder comment, but if it is confusing, we can remove it.
Pal/include/arch/x86_64/pal_topology.h, line 99 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
How is this
sockets_cnt
different fromonline_nodes.resource_cnt
? Are you sure you still need thissockets_cnt
variable for anything?
I think you are getting confused with sockets
and NUMA nodes
. A socket is a physical package, this is dictated by the hardware/VMM. A NUMA node is a software abstraction to split resources. We can have n NUMA nodes within a socket. See this, https://www.open-mpi.org/projects/hwloc/lstopo/images/2XeonSPv2+msc.v2.1.png, where we have 2 sockets and each socket, has 2 NUMA nodes.
yes, I use sockets_cnt
during socket sanitization.
Pal/include/arch/x86_64/pal_topology.h, line 101 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Is this
physical_cores_per_socket
variable still relevant? You have all this info inonline_nodes.ranges_arr
. Please check if this variable is really needed.
They are different resources. Please see my earlier comment.
Pal/src/host/Linux-common/topo_info.c, line 20 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Mentioning
retval
in this sentence is misleading, it makes an impression thatreval
may be passed asNULL
, but this is not true. In other words,retval
must always be passed to this function. So please removeretval
mention from this sentence (keep onlysize_mult
).
Done.
Pal/src/host/Linux-common/topo_info.c, line 27 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Instead of the if check, simply
assert(retval)
-- this is a local function anyway, so we can make assumptions how it is called.
Done.
Pal/src/host/Linux-common/topo_info.c, line 32 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You can remove the last argument (
0
), it will still work.
Replaced open+read+close
with read_file_buffer()
.
Pal/src/host/Linux-common/topo_info.c, line 36 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Instead of 64, cannot we use one of your
FILESZ
macros?
Done.
Pal/src/host/Linux-common/topo_info.c, line 40 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Isn't this whole
open
+read
+close
sequence exactly the same as whatread_file_buffer()
function does? Please replace this code with the call to this function.
Done.
Pal/src/host/Linux-common/topo_info.c, line 73 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
HW resources, ranges
->HW resources and their ranges
Done.
Pal/src/host/Linux-common/topo_info.c, line 73 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
...stored in `res_info`.
Done.
Pal/src/host/Linux-common/topo_info.c, line 75 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Add a dot at the end of this sentence.
Done.
Pal/src/host/Linux-common/topo_info.c, line 81 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Instead of the if check, simply
assert(res_info)
-- this is a local function anyway, so we can make assumptions how it is called.
Done.
Pal/src/host/Linux-common/topo_info.c, line 88 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You can remove the last argument (
0
), it will still work.
Replaced open+read+close
with read_file_buffer()
.
Pal/src/host/Linux-common/topo_info.c, line 92 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Instead of 64, cannot we use one of your
FILESZ
macros?
Done.
Pal/src/host/Linux-common/topo_info.c, line 96 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Isn't this whole
open
+read
+close
sequence exactly the same as whatread_file_buffer()
function does? Please replace this code with the call to this function.
Done.
Pal/src/host/Linux-common/topo_info.c, line 101 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why do you need a helper variable? Simply use
res_info->resource_cnt
instead.
A few of the lines exceeded 100 chars and I needed to wrap them. So added this helper variable.
Removed it.
Pal/src/host/Linux-common/topo_info.c, line 119 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Shouldn't you add also
*end == '\t'
?
Unless we encounter files with \t
, it is better not to add it. So far, I haven't seen the need.
Pal/src/host/Linux-common/topo_info.c, line 120 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This
count as one more
comment doesn't make much sense now. I suggest to simply remove this part of the comment.
Done.
Pal/src/host/Linux-common/topo_info.c, line 125 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This
count how many ...
comment doesn't make much sense now. I suggest to simply remove this part of the comment.
Done.
Pal/src/host/Linux-common/topo_info.c, line 142 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
end_val - start_val
->diff
Done.
Pal/src/host/Linux-common/topo_info.c, line 142 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
The comment is weird. Why are there two examples? I would simply say
/* +1 because of inclusive range */
Done.
Pal/src/host/Linux-common/topo_info.c, line 150 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add an empty line after this line, easier to read
Done.
Pal/src/host/Linux-common/topo_info.c, line 151 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Simply
Realloc the array of ranges (expand by one range)
Done.
Pal/src/host/Linux-common/topo_info.c, line 178 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You get to this label only if something failed. Please rename it to
fail:
.
Done.
Pal/src/host/Linux-common/topo_info.c, line 181 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
??? How can this work? Instead of this for loop, it should be simply:
if (res_info->ranges_arr) free(res_info->ranges_arr);
oops my bad, yes fixed it.
Pal/src/host/Linux-common/topo_info.c, line 183 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I would remove this comment, it is obvious
Done.
Pal/src/host/Linux-common/topo_info.c, line 218 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Instead of the if check, simply
assert(cache_indices_cnt)
-- this is a local function anyway, so we can make assumptions how it is called.
Done.
Pal/src/host/Linux-common/topo_info.c, line 220 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Instead of 1024, cannot we use one of your
FILESZ
macros?
That would be too little to store the alllinux_dirent
structs. Since this is a small function, felt it was fine to hardcode.
Pal/src/host/Linux-common/topo_info.c, line 221 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please don't use this pattern of assigning
0
toret
during init. Instead, add an explicitret = 0;
statement before theout:
label.
OK, done.
Pal/src/host/Linux-common/topo_info.c, line 249 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add an empty line
Done.
Pal/src/host/Linux-common/topo_info.c, line 250 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add
ret = 0;
Done.
Pal/src/host/Linux-common/topo_info.c, line 266 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Instead of 128, cannot we use one of your
FILESZ
macros?
PAL_SYSFS_BUF_FILESZ
is only 64, but here we end up with strings with larger lengths. We could use PAL_SYSFS_MAP_FILESZ
but the name is confusing in this context.
We can either leave it or put some other macro length of the filename.
Pal/src/host/Linux-common/topo_info.c, line 270 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This is hard to read. You have the part
/sys/devices/system/cpu/cpu%zu/cache/index%zu/
always the same -- please move it to a helper variable, and use the helper variable everywhere, like this:char dirname[128]; snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu/cache/index%zu", core_idx, cache_idx); snrpintf(filename, sizeof(filename), "%s/shared_cpu_list", dirname); snrpintf(filename, sizeof(filename), "%s/level", dirname); ... and so on ...
Done.
Pal/src/host/Linux-common/topo_info.c, line 285 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why
ARRAY_SIZE
? Simplysizeof()
yes used sizeof()
.
Pal/src/host/Linux-common/topo_info.c, line 336 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
There is no reason to have this specific name, please rename to simply
out:
Let us rename it to fail
to be consistent and it also makes more sense as these are failure cases.
Pal/src/host/Linux-common/topo_info.c, line 363 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
offling
->offlining
Done.
Pal/src/host/Linux-common/topo_info.c, line 377 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Instead of 128, cannot we use one of your
FILESZ
macros?
OK, created a macro PAL_SYSFS_PATH_LEN
. Let me know if you have other suggestions or revert it back to using hardcoded value.
Pal/src/host/Linux-common/topo_info.c, line 381 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This is hard to read. You have the part
/sys/devices/system/cpu/cpu%zu/
always the same -- please move it to a helper variable, and use the helper variable everywhere, like this:char dirname[128]; snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu", idx); snrpintf(filename, sizeof(filename), "%s/online", dirname); snrpintf(filename, sizeof(filename), "%s/topology/core_id", dirname); ... and so on ...
Done.
Pal/src/host/Linux-common/topo_info.c, line 386 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add
else { core_topology_arr[idx].is_logical_core_online = 1; }
Please see my other comment on this. I don't think we should even set this core_topology_arr[0].is_logical_core_online
parameter as any access to it from LibOS will result in failure.
Pal/src/host/Linux-common/topo_info.c, line 425 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
As I mentioned previously, I think you don't need these two variables anymore (
sockets_cnt
andphysical_cores_per_socket
).
sockets
and NUMA nodes
are different hardware resources. Please see my other comment.
Pal/src/host/Linux-SGX/db_main.c, line 126 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add a comment that this function and all below functions do not care about correctly freeing memory.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
To access untrusted base-type variables, you should always use
READ_ONCE()
. In this case:size_t range_cnt = READ_ONCE(untrusted_src->range_cnt);
Done, but I didn't understand the need to use READ_ONCE
. Can you please clarify why we need to use READ_ONCE()
?
I see some examples in this file where we don't use READ_ONCE()
like https://github.com/gramineproject/gramine/blob/master/Pal/src/host/Linux-SGX/db_main.c#L721 and
https://github.com/gramineproject/gramine/blob/master/Pal/src/host/Linux-SGX/db_main.c#L735
Pal/src/host/Linux-SGX/db_main.c, line 135 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
READ_ONCE(untrusted_src->ranges_arr)
(For examples, you can check
enclave_ocalls.c
)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 143 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
READ_ONCE
Done.
Pal/src/host/Linux-SGX/db_main.c, line 158 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Here and everywhere:
READ_ONCE()
Done.
Pal/src/host/Linux-SGX/db_main.c, line 165 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
core_topo_arr
, please fix all such issues in your logs
Done.
Pal/src/host/Linux-SGX/db_main.c, line 185 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Can you introduce a helper variable, otherwise it's hard to read. Like this:
struct pal_core_cache_info* untrusted_cache_info = READ_ONCE(&untrusted_src[idx].cache_info_arr[lvl]); cache_info_arr[lvl].type = READ_ONCE(untrusted_cache_info->type); ...
Done, expect for the initial address assignment which throws an error.
../Pal/src/host/Linux-SGX/db_main.c:190:17: error: cannot take the address of an rvalue of type 'struct pal_core_cache_info *'
READ_ONCE(&untrusted_src[idx].cache_info_arr[lvl]);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../common/include/api.h:295:69: note: expanded from macro 'READ_ONCE'
#define READ_ONCE(x) ({ __typeof__(x) y = *(volatile __typeof__(x)*)&(x); y;})
Pal/src/host/Linux-SGX/db_main.c, line 206 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add an empty line before this one.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 208 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I don't like that this generic function suddenly updates a global variable -- this is an unexpected side-effect of this function. Instead, please add a new function argument which will be a pointer to the
topo_info
object, and pass&g_pal_public_state.topo_info
as this argument in the caller.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 222 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
READ_ONCE()
Done.
Pal/src/host/Linux-SGX/db_main.c, line 228 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
numa_topo_arr
Done.
Pal/src/host/Linux-SGX/db_main.c, line 239 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I don't like that this generic function suddenly updates a global variable -- this is an unexpected side-effect of this function. Instead, please add a new function argument which will be a pointer to the
topo_info
object, and pass&g_pal_public_state.topo_info
as this argument in the caller.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 247 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Can you add logic to verify that
resource_cnt
is correct and corresponds to ranges inres_info->ranges_arr
? I mean, you can do a for loop overres_info->ranges_arr[idx]
and sum up the range diffs. This will add protection against maliciousresource_cnt
.
yes, done.
Pal/src/host/Linux-SGX/db_main.c, line 261 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Invalid number of ranges
Isn't range_cnt
interpreted as range count
and num_ranges
interpreted as number of ranges
? And we decided to avoid the latter?
If so, I would prefer the existing comment as it aligns with what we intended.
Pal/src/host/Linux-SGX/db_main.c, line 277 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Invalid start of range
Done.
Pal/src/host/Linux-SGX/db_main.c, line 282 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Invalid end of range
Done.
Pal/src/host/Linux-SGX/db_main.c, line 289 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Malformed range
->Overlapping ranges
Done.
Pal/src/host/Linux-SGX/db_main.c, line 315 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please introduce a macro
MAX_HYPERTHREADS_PER_CORE
(or something like this) in one of the header files instead of hard-coding2
here.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 321 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Can you add a comment with an example of
shared_cpu_map
? I'm confused how this can be in range1..2
for the case of DATA cache...
The data cache isn't shared with other cores, so we end-up with just the core/and its sibling if HT is enabled.
For example,
$ cat /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map
00000000,00000002
What comment will help here?
Pal/src/host/Linux-SGX/db_main.c, line 328 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point in this
level
helper variable?
Removed it.
Pal/src/host/Linux-SGX/db_main.c, line 329 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please introduce a macro
MAX_CACHE_LEVELS
(or something like this) in one of the header files instead of hard-coding3
here.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 337 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why not remove this big IF and just add another
else {return -1}
below?
Sorry I didn't get your comment. I want to explicitly make sure the size_multiplier
is one of the accepted values. Anything else is returned as an error.
Pal/src/host/Linux-SGX/db_main.c, line 354 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point in this helper variable?
Removed it.
Pal/src/host/Linux-SGX/db_main.c, line 358 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point in this helper variable?
Removed it.
Pal/src/host/Linux-SGX/db_main.c, line 362 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point in this helper variable?
Removed it.
Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):
The inputs to this function are socket_info_arr
which is empty at first and gets populated as we identify socket<->core
mapping, socket_id
, the socket where the core is present and core_id
is the core. From this, we create a socket map of type struct pal_res_range_info
where range_cnt
corresponds to each socket and ranges_arr
stores the range of cores present in the socket.
So, it does create a socket map from the socket_id
and core_id
.
I don't know how many items are in the socket_info_arr
Yes, because we build this on the fly with socket_id
and core_id
I did have this logic as part of the caller, but it looked very clumsy and so moved it as a separate function.
Does the below comment help?
/* Create a socket to core mapping from `socket_id` and `core_id` and store the info as part of
* `socket_info_arr` */
Pal/src/host/Linux-SGX/db_main.c, line 402 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I suggest the comment:
For each socket, cross-verify that its set of cores is the same as the core topology's core-siblings: - Pick the first core in the socket - Find its core-siblings in the core topology - Verify that the array of cores in the socket object is exactly the same as core-siblings in the core topology object
Then you can remove the other comment (
pick the first ...
)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 409 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You have double-space here
Done.
Pal/src/host/Linux-SGX/db_main.c, line 410 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Instead of
core_sibling_cnt
, I suggest to use:struct pal_res_range_info* core_siblings = &core_topology_arr[core_in_socket].core_siblings; /* and then use this helper variable */ if (core_siblings->range_cnt != socket_info_arr[idx].range_cnt) { ...
Done.
Pal/src/host/Linux-SGX/db_main.c, line 433 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Shouldn't you check
sockets_cnt
as well?
I wanted to remove this as it is redundant. We do check these in the caller itself.
Pal/src/host/Linux-SGX/db_main.c, line 449 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
What's the point in this helper variable? Please remove.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 456 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Shouldn't the last argument be
online_logical_cores_cnt - 1
(note the- 1
)?
No, this is resource count
. For example, if range was 0-63, the count is 64
and we check against this value.
Pal/src/host/Linux-SGX/db_main.c, line 463 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
In some other place you said that the max SMT siblings are 2 :)
Please introduce a macro
MAX_HYPERTHREADS_PER_CORE
(or something like this) in one of the header files instead of hard-coding4
here.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 464 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Shouldn't the last argument be
online_logical_cores_cnt - 1
(note the- 1
)?
Please see my earlier comment on this.
Pal/src/host/Linux-SGX/db_main.c, line 486 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please use:
* #2. Compare array of cores in each socket against the array of core-siblings from the core topology.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 499 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why did you remove this shallow copy? This is important! Please restore this part and use
temp_topo_info
.
Now that we have added READ_ONCE()
do we still want to restore it? If so, can you please explain why?
Pal/src/host/Linux-SGX/db_main.c, line 512 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please don't init with zero. Instead, keep
int ret;
and right-before the out label, doret = 0;
Done.
Pal/src/host/Linux-SGX/db_main.c, line 521 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
-1
in the last argument?
This is count so -1
will be incorrect check. Please see my earlier comment on this.
Pal/src/host/Linux-SGX/db_main.c, line 532 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
BITS_IN_TYPE(uint32_t)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 533 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I think this shouldn't be an assert but instead a proper
if (index >= cpumask_cnt) { fail }
Done
Pal/src/host/Linux-SGX/db_main.c, line 535 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
BITS_IN_TYPE(uint32_t)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 540 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
BITS_IN_TYPE(uint32_t)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 547 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Invalid numa_topology: number of distances is not the same as number of NUMA nodes
Rephrased as "Distance count is not same as the NUMA nodes count
Pal/src/host/Linux-SGX/db_main.c, line 554 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
more
->not the same number of
Rephrased as Invalid numa_topology: Mismatch between NUMA cores and online cores count
Pal/src/host/Linux-SGX/db_main.c, line 558 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Here must be
ret = 0;
Done.
Pal/src/host/Linux-SGX/db_main.c, line 559 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Rename to simply
out:
Renamed to fail
as in other places.
Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):
Instead of copying the whole structure, I am copying each member and verifying it. Even previously with temp_topo_info
we just did a shallow copy but no verification was done. Then in this function we made a deep copy and sanitization.
You removed the verification-and-copy of this uptr_topo_info,
Am I missing something? Can you please clarify?
Pal/src/host/Linux-SGX/db_main.c, line 740 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Remove space before
,
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
a discussion (no related file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Sure @mkow, I will create a new PR with those 2 commits. I will remove those commits from this PR before final rebase. There are lot of discussions here and so don't want to force push.
Pulled out the first 2 commits and created #360
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Blocked myself. Will do this as part of the final rebase.
Done as part of PR 360
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Blocked myself. Will do this as part of the final rebase.
Done as part of PR 360
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Blocked myself. Will do this as part of the final rebase.
Done as part of PR 360
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Blocked myself. Will do this as part of the final rebase.
Done as part of PR 360
Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):
By unit I mean "in which unit we'll display this value to the user". And the unstrusted host may select some impossible combinations, like the ones I already listed in this very conversation.
I don't want you to change the format in sysfs for the app, but to change the format on the untrusted interface. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):
But as I said, you aren't sanitizing the unit, e.g. 1000000000K passes the current sanitization if I'm not mistaken
So, this is the problem. And to address this you are suggesting removing the size multiplier and keeping the value. For example, if the value was 4K
you want to store as 4096
in the untrusted PAL. Copy to it trusted PAL and run sanitization on this and when application queries this value, LibOS will change 4096
to 4K
format.
Is my understanding correct?
Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file): Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Yes, exactly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Yes, exactly.
Got it. So currently I store 4
and K
that I read from untrusted PAL. But in the trusted PAL, I first check if the multiplier is such K
is valid. Then I multiply the value which is 4 times K
and then check we don't end with huge numbers like as you have pointed in your example. If the multiplied value (4096) is within range, I keep the value and the size multiplier else I throw an error.
Please see the below logic
size_t multiplier = 1;
if (cache_info_arr[lvl].size_multiplier == MULTIPLIER_KB)
multiplier = 1024;
else if (cache_info_arr[lvl].size_multiplier == MULTIPLIER_MB)
multiplier = 1024 * 1024;
else if (cache_info_arr[lvl].size_multiplier == MULTIPLIER_GB)
multiplier = 1024 * 1024 * 1024;
size_t cache_size;
if (__builtin_mul_overflow(cache_info_arr[lvl].size, multiplier, &cache_size))
return -1;
if (!IS_IN_RANGE_INCL(cache_size, 1, 1 << 30))
return -1;
Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):
Yes, you currently do it differently, that's why we have this whole discussion. What's your point/question? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 13 of 21 files reviewed, 127 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)
Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Got it. So currently I store
4
andK
[...]Yes, you currently do it differently, that's why we have this whole discussion. What's your point/question?
It turns out that Linux kernel always displays cache size in terms of KB. This makes things simpler for us and we can remove the size qualifier
. As suggested by @mkow, cache size is sanitized after multiplying with 1024 and when the application requests, LibOS converts the value to xK
format.
Please see here, https://elixir.bootlin.com/linux/v5.13/source/drivers/base/cacheinfo.c#L380
and https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu
size:
the total cache size in kB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 8 files at r4, all commit messages.
Reviewable status: all files reviewed, 29 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)
LibOS/shim/include/shim_fs_pseudo.h, line 216 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I think it will be better to change
buf_size
->str_size
as it seems to fit with the function names. So have changed it this way. This will also align with what we had discussed before regarding the naming of the char buffers.Let me know if you want to revert it to
buf
andbuf_size
.
Ok, I'm fine with str_size
of course. Just wanted consistency in naming.
LibOS/shim/src/fs/proc/info.c, line 129 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
This will be incorrect as most servers have more than 1 socket. We anyway only print a selected few values. So, unless an application is failing, or we are breaking something in Gramine, I wouldn't suggest adding a dummy value.
OK, I see your point. Hopefully we'll remove this enable_sysfs_topology
soon anyway.
Pal/include/arch/x86_64/pal_topology.h, line 27 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
It turns out that Linux kernel always displays cache size in terms of KB. This makes things simpler for us and we can remove the
size qualifier
. As suggested by @mkow, cache size is sanitized after multiplying with 1024 and when the application requests, LibOS converts the value toxK
format.Please see here, https://elixir.bootlin.com/linux/v5.13/source/drivers/base/cacheinfo.c#L380
and https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpusize: the total cache size in kB
LGTM, I like it, we now removed a lot of code.
Pal/include/arch/x86_64/pal_topology.h, line 52 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I wanted to make this explicit. After a few months, I myself might miss this :). But I can remove it if it is too obvious.
OK, let's keep it if it helps developers.
Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
By uninitialized, I mean some garbage value. We don't read this from the host as the file is not present. Also, when the application queries for this path, we return
-ENOENT
from LibOS.$ cat /sys/devices/system/cpu/cpu0/online cat: /sys/devices/system/cpu/cpu0/online: No such file or directory
reasonable to always set is_logical_core_online = 1
Given this why set
is_logical_core_online
to 1?I put this as a reminder comment, but if it is confusing, we can remove it.
Maybe rephrase to a more verbose:
/* Core 0 is always online (and thus there is no file `/sys/devices/system/cpu/cpu0/online`),
* so we leave is_logical_core_online for core 0 uninitialized */
What I constantly forget is that cpu0/online
file doesn't exist, and I think this should be mentioned in this comment.
Pal/include/arch/x86_64/pal_topology.h, line 99 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I think you are getting confused with
sockets
andNUMA nodes
. A socket is a physical package, this is dictated by the hardware/VMM. A NUMA node is a software abstraction to split resources. We can have n NUMA nodes within a socket. See this, https://www.open-mpi.org/projects/hwloc/lstopo/images/2XeonSPv2+msc.v2.1.png, where we have 2 sockets and each socket, has 2 NUMA nodes.yes, I use
sockets_cnt
during socket sanitization.
OK, thanks. Yes, I was confusing sockets to be the same as NUMA nodes. Thanks for the clarification.
Pal/src/host/Linux-common/topo_info.c, line 101 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
A few of the lines exceeded 100 chars and I needed to wrap them. So added this helper variable.
Removed it.
OK, I understand the reasoning (I guess in the old code it was more useful). The new version looks better to me.
Pal/src/host/Linux-common/topo_info.c, line 119 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Unless we encounter files with
\t
, it is better not to add it. So far, I haven't seen the need.
But you have a check for \t
20 lines above (see my other comment).
Pal/src/host/Linux-common/topo_info.c, line 266 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
PAL_SYSFS_BUF_FILESZ
is only 64, but here we end up with strings with larger lengths. We could usePAL_SYSFS_MAP_FILESZ
but the name is confusing in this context.We can either leave it or put some other macro length of the filename.
LGTM, you added a new macro.
Pal/src/host/Linux-common/topo_info.c, line 336 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Let us rename it to
fail
to be consistent and it also makes more sense as these are failure cases.
OK, yes, this is better.
Pal/src/host/Linux-common/topo_info.c, line 377 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
OK, created a macro
PAL_SYSFS_PATH_LEN
. Let me know if you have other suggestions or revert it back to using hardcoded value.
OK, this is good.
Pal/src/host/Linux-common/topo_info.c, line 44 at r4 (raw file):
ret = __builtin_mul_overflow(val, 1024, &val); if (ret < 0) return -EINVAL;
__builtin_mul_overflow
returns a boolean: true
if overflow is detected and false
if everything is fine.
So this should be:
if (__builtin_mul_overflow(val, 1024, &val))
return -EOVERFLOW;
Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):
/* Ensure the size hasn't exceed max threshold after multiplying by size qualifier */ if (val > INT_MAX) return -EINVAL;
Actually, isn't -EOVERFLOW
is a better error here?
Pal/src/host/Linux-common/topo_info.c, line 77 at r4 (raw file):
char* ptr = str; while (*ptr) { while (*ptr == ' ' || *ptr == '\t' || *ptr == ',')
FYI: Here you have check for \t
, but in the below comment you say that you haven't encountered any files that use \t
. Delete from here then maybe?
Pal/src/host/Linux-common/topo_info.c, line 112 at r4 (raw file):
if (__builtin_add_overflow(res_info->resource_cnt, diff, &total_cnt) || total_cnt >= INT_MAX) { ret = -EINVAL;
Actually, isn't -EOVERFLOW
is a better error here?
Pal/src/host/Linux-common/topo_info.c, line 163 at r4 (raw file):
} ssize_t read_file_buffer(const char* filename, char* buf, size_t count) {
Can you move this function to the very top of this file, now that you use it in other functions as well?
Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done, but I didn't understand the need to use
READ_ONCE
. Can you please clarify why we need to useREAD_ONCE()
?I see some examples in this file where we don't use
READ_ONCE()
like https://github.com/gramineproject/gramine/blob/master/Pal/src/host/Linux-SGX/db_main.c#L721 and
https://github.com/gramineproject/gramine/blob/master/Pal/src/host/Linux-SGX/db_main.c#L735
Oops, these are bugs. Actually, we are supposed to use READ_ONCE()
on every read access of the untrusted variable.
The idea is to prevent the compiler from doing optimizations that break security guarantees. When you do a simple myvar = untrusted_var;
, the compiler can do all sorts of weird stuff: the compiler may decide to read untrusted_var
two or more times instead of just once. The compiler may say "This developer added a useless myvar
alias to the perfectly good untrusted_var
, the developer is stupid, let me just replace myvar
with untrusted_var
everywhere in the code".
With READ_ONCE()
, we prevent the compiler from playing smart and accidentally accessing the untrusted variables many times. Also recall that if an untrusted variable is accessed two times, then the attacker can replace the untrusted-variable value in-between the two accesses, and suddenly one part of your code has value 1
and another part of your code has value 2
of the same variable -- and your program glitches, fails, or exposes a vulnerability. Adding READ_ONCE()
is a simple way for the developer to be sure that untrusted_var
is first copied into myvar
, and then the stored in secure memory and never maliciously changing myvar
is used.
P.S. Feel free to fix those bugs you found in a separate PR.
Pal/src/host/Linux-SGX/db_main.c, line 261 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Isn't
range_cnt
interpreted asrange count
andnum_ranges
interpreted asnumber of ranges
? And we decided to avoid the latter?If so, I would prefer the existing comment as it aligns with what we intended.
OK
Pal/src/host/Linux-SGX/db_main.c, line 321 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
The data cache isn't shared with other cores, so we end-up with just the core/and its sibling if HT is enabled.
For example,$ cat /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map 00000000,00000002
What comment will help here?
Thanks, this example helps. Maybe this:
/* recall that `shared_cpu_map` shows this core + its siblings (if HT is enabled), for example:
* /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map: 00000000,00000002 */
Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):
No, this doesn't help me much. I still highly dislike the function.
I did have this logic as part of the caller, but it looked very clumsy and so moved it as a separate function.
I think the clumsiness stems from the fact that you have the realloc logic here (it is like 2/3 of this function). Maybe introduce a helper func realloc_from_size(void* buf, size_t old_size, size_t new_size)
? And then this function will become like 6 lines in total, and then you can remove this function and inline these 6 lines of code into the caller.
Also, I think you have the same realloc logic somewhere else, so you could even make this realloc_from_size()
function available as a common function, and use it in both cases.
Pal/src/host/Linux-SGX/db_main.c, line 499 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Now that we have added
READ_ONCE()
do we still want to restore it? If so, can you please explain why?
See my other comment.
Pal/src/host/Linux-SGX/db_main.c, line 559 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Renamed to
fail
as in other places.
But this is not a fail. You get to this label even in good cases. So it should be out
.
Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Instead of copying the whole structure, I am copying each member and verifying it. Even previously with
temp_topo_info
we just did a shallow copy but no verification was done. Then in this function we made a deep copy and sanitization.You removed the verification-and-copy of this uptr_topo_info,
Am I missing something? Can you please clarify?
At this point, you're accessing the untrusted pointer value (uptr_topo_info->
is transformed into asm mov (<untrusted mem>), %rdx
). The pointer value that is stored at this address <untrusted mem>
may contain an address in enclave memory, so you end up with giving a malicious first argument to copy_resource_range_to_enclave()
-- this first argument is a pointer to some arbitrary enclave-memory address (specified by the attacker).
You removed the if (!sgx_copy_to_enclave(&temp_topo_info, sizeof(temp_topo_info), uptr_topo_info, sizeof(*uptr_topo_info)))
operation, which is important here. This operation doesn't just perform a copy, it also performs a check on the untrusted pointer uptr_topo_info
-- this operation verifies that the untrusted pointer does not point into the enclave memory. See this:
gramine/Pal/src/host/Linux-SGX/enclave_framework.c
Lines 139 to 140 in 32d2778
!sgx_is_completely_outside_enclave(uptr, usize) || | |
!sgx_is_completely_within_enclave(ptr, usize)) { |
Pal/src/host/Linux-SGX/db_main.c, line 126 at r4 (raw file):
} /* This function and all the below functions doesn't clean up resources on failure.
doesn't
-> don't
Pal/src/host/Linux-SGX/db_main.c, line 153 at r4 (raw file):
size_t online_logical_cores_cnt, size_t cache_indices_cnt, struct pal_core_topo_info** out_core_topology_arr) {
Could you please add an assert:
assert(out_core_topology_arr);
Pal/src/host/Linux-SGX/db_main.c, line 160 at r4 (raw file):
return -1; }
Now that I look at all these READ_ONCE()
macros you added, I think that it will be just simpler to perform a shallow copy of *untrusted_src
into a temporary variable and use this variable (and you will get rid of READ_ONCE
because now you will be accessing a variable stored inside the enclave).
I definitely want this shallow copy-to-enclave implemented in this function. I'm not sure about the above copy_resource_range_to_enclave()
-- that one is pretty trivial and doesn't really merit a temporary variable. But your choice, it makes everyone feel safer to use shallow copies :)
Pal/src/host/Linux-SGX/db_main.c, line 167 at r4 (raw file):
core_topo_arr[idx].socket_id = READ_ONCE(untrusted_src[idx].socket_id); int ret = copy_resource_range_to_enclave(&untrusted_src[idx].core_siblings,
You're accessing the untrusted var here as well, you're supposed to use READ_ONCE()
.
Exactly because of such hard-to-reason-about cases, I suggest to do a shallow copy and not bother about these.
Pal/src/host/Linux-SGX/db_main.c, line 174 at r4 (raw file):
} ret = copy_resource_range_to_enclave(&untrusted_src[idx].thread_siblings,
ditto
Pal/src/host/Linux-SGX/db_main.c, line 191 at r4 (raw file):
for (size_t lvl = 0; lvl < cache_indices_cnt; lvl++) { struct pal_core_cache_info* untrusted_cache_info = &untrusted_src[idx].cache_info_arr[lvl];
This also merits a shallow copy.
Pal/src/host/Linux-SGX/db_main.c, line 221 at r4 (raw file):
static int sgx_copy_numa_topo_to_enclave(struct pal_numa_topo_info* untrusted_src, size_t online_nodes_cnt, struct pal_numa_topo_info** out_numa_topology_arr) {
ditto (please add an assert)
Pal/src/host/Linux-SGX/db_main.c, line 234 at r4 (raw file):
READ_ONCE(untrusted_src[idx].nr_hugepages[HUGEPAGES_1G]); int ret = copy_resource_range_to_enclave(&untrusted_src[idx].cpumap,
READ_ONCE()
or do a shallow copy of untrusted_src[idx]
Pal/src/host/Linux-SGX/db_main.c, line 241 at r4 (raw file):
} ret = copy_resource_range_to_enclave(&untrusted_src[idx].distance,
ditto
Pal/src/host/Linux-SGX/db_main.c, line 286 at r4 (raw file):
size_t end = res_info->ranges_arr[i].end; resource_cnt_from_ranges += end - start + 1;
I would move this line after the two checks -- otherwise end < start
, and you'll get an integer overflow. But after the two checks, you are sure that end
and start
are good, and you can safely do this arithmetics.
Pal/src/host/Linux-SGX/db_main.c, line 401 at r4 (raw file):
* - Find its core-siblings in the core topology. * - Verify that the cores in the socket info array is exactly the same as core-siblings * present in core topology array.
This sentence is very hard to read. Maybe add "
like this (if I understand correctly what you want to say here):
* - Verify that the "cores in the socket info" array is exactly the same as the "core-siblings
* present in core topology" array.
Pal/src/host/Linux-SGX/db_main.c, line 418 at r4 (raw file):
struct pal_range_info* core_sibling_ranges_arr = core_topology_arr[core_in_socket].core_siblings.ranges_arr;
Now you don't really need this helper variable core_sibling_ranges_arr
.
Instead you could use smth like core_siblings->ranges_arr[j].start
and core_siblings->ranges_arr[j].end
.
Or are these too long and don't fit into 100 chars? In this case, you can at least change:
struct pal_range_info* core_sibling_ranges_arr = core_siblings->ranges_arr;
Pal/src/host/Linux-SGX/db_main.c, line 532 at r4 (raw file):
size_t index = j / BITS_IN_TYPE(uint32_t); if (index >= cpumask_cnt) { log_error("Invalid numa topology: cpumap range, %zu - %zu", start, j);
I don't understand this message at all. Maybe simply ..."Invalid numa topology: Core %zu is beyond CPU mask limit", j);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 12 of 21 files at r1, 1 of 8 files at r3, 7 of 8 files at r4, all commit messages.
Reviewable status: all files reviewed, 64 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
How about the following?
This commit creates a clean representation of topology data on the trusted<->untrusted boundary, so that we can easily verify its correctness. The following are done: 1. Refactor sysfs code to convert CPU/NUMA information from Linux-formatted strings to integers in untrusted PAL. 2. Copy and sanitize CPU/NUMA information based on untrusted integers to trusted PAL. 3. Convert the trusted CPU/NUMA information from integers back to Linux-formatted strings in LibOS.
Looks better.
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r4 (raw file):
* Example output for 64 cores in total and ranges 0-15,48-55: "00ff0000,0000ffff". * Note: This function adds a newline at the end of the string. */ int sys_convert_ranges_to_cpu_bitmap_str(const struct pal_res_range_info* resource_range_info, char* str,
please fix wrapping
LibOS/shim/src/fs/proc/info.c, line 129 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
OK, I see your point. Hopefully we'll remove this
enable_sysfs_topology
soon anyway.
I also prefer skipping it than printing wrong value.
LibOS/shim/src/fs/sys/cache_info.c, line 45 at r4 (raw file):
break; case CACHE_TYPE_UNIFIED: ret = snprintf(str, sizeof(str), "%s\n", "Unified");
What's the point of "%s\n", "Unified"
? Please simplify here and above.
LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):
size_t range_cnt = resource_range_info->range_cnt; size_t offset = 0; for (size_t i = 0; i < range_cnt; i++) {
You leave str
uninitialized and return success for empty ranges list.
LibOS/shim/src/fs/sys/fs.c, line 56 at r4 (raw file):
BITS_TO_INTS
bitmap
is not an array of int
s
LibOS/shim/src/fs/sys/fs.c, line 75 at r4 (raw file):
/* Convert cpumask to strings */ size_t offset = 0; for (size_t j = cpumask_cnt; j > 0; j--) {
Please assert that cpumask_cnt > 0
, it should always be true, but otherwise this function unintuitively returns incorrect results.
Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Maybe rephrase to a more verbose:
/* Core 0 is always online (and thus there is no file `/sys/devices/system/cpu/cpu0/online`), * so we leave is_logical_core_online for core 0 uninitialized */
What I constantly forget is that
cpu0/online
file doesn't exist, and I think this should be mentioned in this comment.
+1 to Dmitrii's proposal
Pal/include/arch/x86_64/pal_topology.h, line 4 at r3 (raw file):
/* Copyright (C) 2022 Intel Corporation * Michał Kowalczyk <mkow@invisiblethingslab.com> * Copyright (C) 2022 Intel Corporation
You don't have to duplicate this line, see how we do this in other files.
Pal/include/arch/x86_64/pal_topology.h, line 11 at r3 (raw file):
#define PAL_TOPOLOGY_H /* Used to represent buffers having numeric values with unit suffixes. E.g "1024576K".
E.g
-> E.g.
Pal/include/arch/x86_64/pal_topology.h, line 11 at r3 (raw file):
#define PAL_TOPOLOGY_H /* Used to represent buffers having numeric values with unit suffixes. E.g "1024576K".
suffixes. E
-> suffixes, e
Pal/include/arch/x86_64/pal_topology.h, line 11 at r4 (raw file):
#define PAL_TOPOLOGY_H /* Used to represent buffers having numeric values with unit suffixes. E.g "1024576K".
Also, isn't this outdated now?
Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):
struct pal_range_info { size_t start; size_t end;
inclusive or exclusive?
Pal/include/arch/x86_64/pal_topology.h, line 43 at r4 (raw file):
struct pal_res_range_info { /* Total number of resources present. Eg. if output of `/sys/devices/system/cpu/online` were
Eg.
-> E.g.
Pal/include/arch/x86_64/pal_topology.h, line 43 at r4 (raw file):
struct pal_res_range_info { /* Total number of resources present. Eg. if output of `/sys/devices/system/cpu/online` were
were
-> was
Pal/include/arch/x86_64/pal_topology.h, line 47 at r4 (raw file):
size_t resource_cnt; /* Total number of ranges present. Eg. if output of `/sys/devices/system/cpu/online` were
Eg.
-> E.g.
Pal/include/arch/x86_64/pal_topology.h, line 47 at r4 (raw file):
size_t resource_cnt; /* Total number of ranges present. Eg. if output of `/sys/devices/system/cpu/online` were
were
-> was
Pal/include/arch/x86_64/pal_topology.h, line 49 at r4 (raw file):
/* Total number of ranges present. Eg. if output of `/sys/devices/system/cpu/online` were * 0-15,21,32-63 then `range_cnt` will be 3 */ size_t range_cnt;
range_cnt
-> ranges_cnt
Pal/include/arch/x86_64/pal_topology.h, line 51 at r4 (raw file):
size_t range_cnt; /* Array of ranges, with `range_cnt` items. Eg. if output of `/sys/devices/system/cpu/online`
Eg.
-> E.g.
Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):
/* Array of ranges, with `range_cnt` items. Eg. if output of `/sys/devices/system/cpu/online` * were 0-15,21,32-63 then `ranges_arr` will be [{0, 15}, {21, 21}, {32 63}]. When there is only
were
-> was
Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):
[{0, 15}, {21, 21}, {32 63}]
Please unify formatting here.
Pal/include/arch/x86_64/pal_topology.h, line 53 at r4 (raw file):
When there is only a single number instead of range both start and end are assigned start value
I'd drop this sentence, it's already clearly explained by the example.
Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):
/* Note: The ranges should not overlap */
But can they touch each other? E.g. is [[0, 15], [16, 31]]
a valid array of ranges?
Pal/include/arch/x86_64/pal_topology.h, line 62 at r4 (raw file):
So to get actual cache size multiply this value by 1024.
This is super confusing, please store this value in bytes instead.
Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):
struct pal_core_topo_info { /* [0] element is uninitialized because core 0 is always online */ size_t is_logical_core_online;
This should be bool
.
Pal/src/host/Linux-common/topo_info.c, line 21 at r4 (raw file):
/* Opens a pseudo-file describing HW resources and simply reads the value stored in the file. * Returns UNIX error code on failure and 0 on success. */ static int get_hw_resource_value(const char* filename, size_t* retval) {
retval
-> out_value
Pal/src/host/Linux-common/topo_info.c, line 34 at r4 (raw file):
long val = strtol(str, &end, 10); if (val < 0 || val > INT_MAX) return -ENOENT;
-ENOENT
-> -EINVAL
or something like that
Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):
"1,3-5,6"
This isn't a good example, it suggests that the ends are exclusive, which I think isn't the case.
Pal/src/host/Linux-common/topo_info.c, line 60 at r4 (raw file):
* N.B: Understands complex formats like "1,3-5,6". */ static int get_hw_resource_range(const char* filename, struct pal_res_range_info* res_info) {
res_info
-> out_info
Pal/src/host/Linux-common/topo_info.c, line 115 at r4 (raw file):
goto fail; } res_info->resource_cnt += diff + 1; /* +1 because of inclusive range */
You have an overflow check here, but not on the incrementation above?
Pal/src/host/Linux-common/topo_info.c, line 125 at r4 (raw file):
res_info->range_cnt++; /* Realloc the array of ranges (expand by one range) */
Doing realloc on every iteration is super slow, it probably will be non-negligible on large servers. Please do some overestimation with *2 growing when it's too small and then trim it with realloc after the loop.
Pal/src/host/Linux-common/topo_info.c, line 153 at r4 (raw file):
fail: if (res_info->ranges_arr)
no need for this if
Pal/src/host/Linux-common/topo_info.c, line 163 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Can you move this function to the very top of this file, now that you use it in other functions as well?
But please not in this PR, the diff is already large and moving code around is annoying to review :)
Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):
/* Returns number of cache levels present on this system by counting "indexX" dir entries under * `/sys/devices/system/cpu/cpuX/cache` on success and negative UNIX error code on failure. */
This comment is stale
Pal/src/host/Linux-common/topo_info.c, line 191 at r4 (raw file):
char buf[1024]; int ret; int dirs_cnt = 0;
size_t
Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):
PAL_SYSFS_PATH_LEN
Is this length or size?
Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):
return ret; if (topo_info->online_logical_cores.resource_cnt > INT32_MAX)
Why can't it be larger?
Pal/src/host/Linux-common/topo_info.c, line 316 at r4 (raw file):
size_t online_logical_cores_cnt = topo_info->online_logical_cores.resource_cnt; if (topo_info->possible_logical_cores.resource_cnt > INT32_MAX)
ditto
Pal/src/host/Linux-common/topo_info.c, line 341 at r4 (raw file):
snprintf(dirname, sizeof(dirname), "/sys/devices/system/cpu/cpu%zu", idx); /* cpu0 is always online and thus the "online" file is not present. */
Could we instead of leaving it uninitialized set this value to "true" for core0? There's a risk that someone will use these structures for something else than sysfs and will get hurt.
Pal/src/host/Linux-common/topo_info.c, line 401 at r4 (raw file):
return -ENOMEM; char filename[PAL_SYSFS_PATH_LEN];
ditto, len or size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 15 of 21 files reviewed, 64 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
please fix wrapping
Are you suggesting having text until 100 chars
? If so, I purposefully put "Example..." in a separate line so that it will be easier to read. Is this ok?
LibOS/shim/src/fs/sys/cache_info.c, line 45 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
What's the point of
"%s\n", "Unified"
? Please simplify here and above.
Done.
LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
You leave
str
uninitialized and return success for empty ranges list.
Done.
LibOS/shim/src/fs/sys/fs.c, line 56 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
BITS_TO_INTS
bitmap
is not an array ofint
s
Changed it to BITS_TO_UINT32S
LibOS/shim/src/fs/sys/fs.c, line 75 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Please assert that
cpumask_cnt > 0
, it should always be true, but otherwise this function unintuitively returns incorrect results.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 69 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Maybe rephrase to a more verbose:
/* Core 0 is always online (and thus there is no file `/sys/devices/system/cpu/cpu0/online`), * so we leave is_logical_core_online for core 0 uninitialized */
What I constantly forget is that
cpu0/online
file doesn't exist, and I think this should be mentioned in this comment.
sure, let me add this comment.
Pal/include/arch/x86_64/pal_topology.h, line 4 at r3 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
You don't have to duplicate this line, see how we do this in other files.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 11 at r3 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
E.g
->E.g.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 11 at r3 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
suffixes. E
->suffixes, e
Done.
Pal/include/arch/x86_64/pal_topology.h, line 11 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Also, isn't this outdated now?
This is still valid when we read the cache size from the host which is numeric with suffix format.
Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
inclusive or exclusive?
This is inclusive.
Pal/include/arch/x86_64/pal_topology.h, line 43 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Eg.
->E.g.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 43 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
were
->was
Done.
Pal/include/arch/x86_64/pal_topology.h, line 47 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Eg.
->E.g.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 47 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
were
->was
Done.
Pal/include/arch/x86_64/pal_topology.h, line 49 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
range_cnt
->ranges_cnt
why? I want to mention the variable name which is range_cnt
Pal/include/arch/x86_64/pal_topology.h, line 51 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Eg.
->E.g.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
were
->was
Done.
Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
[{0, 15}, {21, 21}, {32 63}]
Please unify formatting here.
[{0, 15}, {21, 21}, {32, 63}]
is this fine?
Pal/include/arch/x86_64/pal_topology.h, line 53 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
When there is only a single number instead of range both start and end are assigned start value
I'd drop this sentence, it's already clearly explained by the example.
OK, if both reviewers feel this way, I will drop it.
Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
/* Note: The ranges should not overlap */
But can they touch each other? E.g. is
[[0, 15], [16, 31]]
a valid array of ranges?
Yes, this is a valid case.
Pal/include/arch/x86_64/pal_topology.h, line 62 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
So to get actual cache size multiply this value by 1024.
This is super confusing, please store this value in bytes instead.
The comment is incorrect, I do store the value in bytes. Changed comment as below,
/* We store cache size is in bytes. But cache size is usually displayed in KB format. Please do
* the needful conversion.
* Pls ref: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu */
Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
This should be
bool
.
yes, bool
would be most appropriate here but we use a common function get_hw_resource_value()
to read the values from files. This function takes size_t pointer
as a parameter to store the read value(s). So set this variable as size_t
instead of bool
.
Pal/src/host/Linux-common/topo_info.c, line 119 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
But you have a check for
\t
20 lines above (see my other comment).
yes, I had missed it earlier but have removed it.
Pal/src/host/Linux-common/topo_info.c, line 21 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
retval
->out_value
Done.
Pal/src/host/Linux-common/topo_info.c, line 34 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
-ENOENT
->-EINVAL
or something like that
Changed it to -EINVAL
Pal/src/host/Linux-common/topo_info.c, line 44 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
__builtin_mul_overflow
returns a boolean:true
if overflow is detected andfalse
if everything is fine.So this should be:
if (__builtin_mul_overflow(val, 1024, &val)) return -EOVERFLOW;
ah, yes. Done.
Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, isn't
-EOVERFLOW
is a better error here?
The overflow should be caught by the earlier function __builtin_mul_overflow
right?
In this case, it is not overflowing, as val
is of type long
and can hold a larger value on a 64-bit system and if it is a 32-bit system, shouldn't have a value greater than INT_MAX
.
This check was intended for 64-bit systems where we don't end up with unrealistically large values and so limit to INT_MAX
. Note: In trusted PAL we do check for such limits, but this will also help in untrusted PAL case as well.
If you feel this is redundant and not needed for untrusted PAL case, I can remove this check itself.
Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
"1,3-5,6"
This isn't a good example, it suggests that the ends are exclusive, which I think isn't the case.
Is this 1-5,10-15,20,30-31
better?
Pal/src/host/Linux-common/topo_info.c, line 60 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
res_info
->out_info
Done.
Pal/src/host/Linux-common/topo_info.c, line 77 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
FYI: Here you have check for
\t
, but in the below comment you say that you haven't encountered any files that use\t
. Delete from here then maybe?
yes, missed it. I have removed this as well.
Pal/src/host/Linux-common/topo_info.c, line 112 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, isn't
-EOVERFLOW
is a better error here?
yes, changed to -EOVERFLOW
Pal/src/host/Linux-common/topo_info.c, line 115 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
You have an overflow check here, but not on the incrementation above?
good catch, done.
Pal/src/host/Linux-common/topo_info.c, line 125 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Doing realloc on every iteration is super slow, it probably will be non-negligible on large servers. Please do some overestimation with *2 growing when it's too small and then trim it with realloc after the loop.
Yes agree, but will do this optimization in a separate PR. Have added a TODO
to capture this effort.
For this release let us target this basic version of adding each range at a time.
Pal/src/host/Linux-common/topo_info.c, line 153 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
no need for this
if
Done.
Pal/src/host/Linux-common/topo_info.c, line 163 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
But please not in this PR, the diff is already large and moving code around is annoying to review :)
I moved the code and committed but will revert.
Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
This comment is stale
Reworded as
/* Updates `cache_indices_cnt` with number of cache levels present on the system by counting
* "indexX" dir entries under `/sys/devices/system/cpu/cpuX/cache`. Returns 0 on success and
* negative UNIX error code on failure. */
Pal/src/host/Linux-common/topo_info.c, line 191 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
size_t
Done.
Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
PAL_SYSFS_PATH_LEN
Is this length or size?
This is length as per our coding guidelines as it denotes characters in a C-string.
Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Why can't it be larger?
It can be but we need some upper limit, right? So, for now we have it as INT32_MAX
. In our validation, if we hit issues, we can update this.
Pal/src/host/Linux-common/topo_info.c, line 316 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
ditto
same reason as above.
Pal/src/host/Linux-common/topo_info.c, line 341 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Could we instead of leaving it uninitialized set this value to "true" for core0? There's a risk that someone will use these structures for something else than sysfs and will get hurt.
We have big comments not to do so :). Also, if we think the other way since it is set to 1 some can end up using it as a valid value. This can be an issue. But if you insist, I can set it to 1.
Pal/src/host/Linux-common/topo_info.c, line 401 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
ditto, len or size?
length.
Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Oops, these are bugs. Actually, we are supposed to use
READ_ONCE()
on every read access of the untrusted variable.The idea is to prevent the compiler from doing optimizations that break security guarantees. When you do a simple
myvar = untrusted_var;
, the compiler can do all sorts of weird stuff: the compiler may decide to readuntrusted_var
two or more times instead of just once. The compiler may say "This developer added a uselessmyvar
alias to the perfectly gooduntrusted_var
, the developer is stupid, let me just replacemyvar
withuntrusted_var
everywhere in the code".With
READ_ONCE()
, we prevent the compiler from playing smart and accidentally accessing the untrusted variables many times. Also recall that if an untrusted variable is accessed two times, then the attacker can replace the untrusted-variable value in-between the two accesses, and suddenly one part of your code has value1
and another part of your code has value2
of the same variable -- and your program glitches, fails, or exposes a vulnerability. AddingREAD_ONCE()
is a simple way for the developer to be sure thatuntrusted_var
is first copied intomyvar
, and then the stored in secure memory and never maliciously changingmyvar
is used.P.S. Feel free to fix those bugs you found in a separate PR.
Thanks for the nice explanation! Sure, will create a separate PR and fix these.
Now, I have removed all the READ_ONCE()
and took the shallow copy approach. The code looks more readable with shallow copy :)
Pal/src/host/Linux-SGX/db_main.c, line 321 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Thanks, this example helps. Maybe this:
/* recall that `shared_cpu_map` shows this core + its siblings (if HT is enabled), for example: * /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map: 00000000,00000002 */
Done.
Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):
so you could even make this realloc_from_size() function available as a common function, and use it in both cases
We did go down this path here, gramineproject/graphene#2661 and finally decided we will just have it inline. So let us drop realloc_from_size()
idea and I will inline it.
See the snippet below,
But in the other place you use realloc_size(). Now we have a strange state that some PAL sources use realloc_size and some sources don't use it (and instead do the malloc-memcpy-free pattern explicitly).
Sorry to go back and forth, but let's remove your newly-introduced realloc_size function and replace the only caller of realloc_size with the explicit malloc-memcpy-free pattern. It doesn't make sense to introduce a function for only one place in the code and (for some obscure reason) to not use this function in other places.
This means that you should drop the commit [Pal] Add realloc_size() in PAL slab manager.
Pal/src/host/Linux-SGX/db_main.c, line 499 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
See my other comment.
Got it. Brought back temp copy approach and removed READ_ONCE()
Pal/src/host/Linux-SGX/db_main.c, line 559 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
But this is not a fail. You get to this label even in good cases. So it should be
out
.
OK, changed it to out.
Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
At this point, you're accessing the untrusted pointer value (
uptr_topo_info->
is transformed into asmmov (<untrusted mem>), %rdx
). The pointer value that is stored at this address<untrusted mem>
may contain an address in enclave memory, so you end up with giving a malicious first argument tocopy_resource_range_to_enclave()
-- this first argument is a pointer to some arbitrary enclave-memory address (specified by the attacker).You removed the
if (!sgx_copy_to_enclave(&temp_topo_info, sizeof(temp_topo_info), uptr_topo_info, sizeof(*uptr_topo_info)))
operation, which is important here. This operation doesn't just perform a copy, it also performs a check on the untrusted pointeruptr_topo_info
-- this operation verifies that the untrusted pointer does not point into the enclave memory. See this:gramine/Pal/src/host/Linux-SGX/enclave_framework.c
Lines 139 to 140 in 32d2778
!sgx_is_completely_outside_enclave(uptr, usize) || !sgx_is_completely_within_enclave(ptr, usize)) {
Thanks for the clear explanation! The idea was to avoid 2 copies, but I see the issue now. I will revert it back to shallow-copy approach.
Pal/src/host/Linux-SGX/db_main.c, line 126 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
doesn't
->don't
Done.
Pal/src/host/Linux-SGX/db_main.c, line 153 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Could you please add an assert:
assert(out_core_topology_arr);
Done.
Pal/src/host/Linux-SGX/db_main.c, line 160 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Now that I look at all these
READ_ONCE()
macros you added, I think that it will be just simpler to perform a shallow copy of*untrusted_src
into a temporary variable and use this variable (and you will get rid ofREAD_ONCE
because now you will be accessing a variable stored inside the enclave).I definitely want this shallow copy-to-enclave implemented in this function. I'm not sure about the above
copy_resource_range_to_enclave()
-- that one is pretty trivial and doesn't really merit a temporary variable. But your choice, it makes everyone feel safer to use shallow copies :)
yes switched to shallow copy approach.
Pal/src/host/Linux-SGX/db_main.c, line 167 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You're accessing the untrusted var here as well, you're supposed to use
READ_ONCE()
.Exactly because of such hard-to-reason-about cases, I suggest to do a shallow copy and not bother about these.
Did shallow copy to address the issue.
Pal/src/host/Linux-SGX/db_main.c, line 174 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto
Did shallow copy to address the issue.
Pal/src/host/Linux-SGX/db_main.c, line 191 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This also merits a shallow copy.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 221 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto (please add an assert)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 234 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
READ_ONCE()
or do a shallow copy ofuntrusted_src[idx]
Did shallow copy to address the issue.
Pal/src/host/Linux-SGX/db_main.c, line 241 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
ditto
Did shallow copy to address the issue.
Pal/src/host/Linux-SGX/db_main.c, line 286 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I would move this line after the two checks -- otherwise
end < start
, and you'll get an integer overflow. But after the two checks, you are sure thatend
andstart
are good, and you can safely do this arithmetics.
Done.
Code quote:
resource_cnt_from_ranges
Pal/src/host/Linux-SGX/db_main.c, line 401 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This sentence is very hard to read. Maybe add
"
like this (if I understand correctly what you want to say here):* - Verify that the "cores in the socket info" array is exactly the same as the "core-siblings * present in core topology" array.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 418 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Now you don't really need this helper variable
core_sibling_ranges_arr
.Instead you could use smth like
core_siblings->ranges_arr[j].start
andcore_siblings->ranges_arr[j].end
.Or are these too long and don't fit into 100 chars? In this case, you can at least change:
struct pal_range_info* core_sibling_ranges_arr = core_siblings->ranges_arr;
yes, this was redundant. Removed it.
Pal/src/host/Linux-SGX/db_main.c, line 532 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I don't understand this message at all. Maybe simply
..."Invalid numa topology: Core %zu is beyond CPU mask limit", j);
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r5, all commit messages.
Reviewable status: all files reviewed, 64 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)
LibOS/shim/src/fs/sys/fs.c, line 75 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done.
I don't see Michal's requested change.
Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
This is inclusive.
I guess Michal wants a comment like this:
struct pal_range_info {
size_t start;
size_t end; /* inclusive, e.g., start = 0 and end = 1 has two items (0 and 1) */
Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
yes,
bool
would be most appropriate here but we use a common functionget_hw_resource_value()
to read the values from files. This function takessize_t pointer
as a parameter to store the read value(s). So set this variable assize_t
instead ofbool
.
I wouldn't care much about the actual type here... So I'm fine with Vijay's approach.
Pal/include/arch/x86_64/pal_topology.h, line 60 at r5 (raw file):
size_t level; enum cache_type type; /* We store cache size is in bytes. But cache size is usually displayed in KB format. Please do
is in bytes
-> in bytes
Pal/include/arch/x86_64/pal_topology.h, line 60 at r5 (raw file):
size_t level; enum cache_type type; /* We store cache size is in bytes. But cache size is usually displayed in KB format. Please do
Weird phrasing, I suggest:
/* We store cache size in bytes, but it is typically displayed in KB (e.g., "256K"), so for output
* it should be converted to KBs. See https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu */
Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
The overflow should be caught by the earlier function
__builtin_mul_overflow
right?In this case, it is not overflowing, as
val
is of typelong
and can hold a larger value on a 64-bit system and if it is a 32-bit system, shouldn't have a value greater thanINT_MAX
.This check was intended for 64-bit systems where we don't end up with unrealistically large values and so limit to
INT_MAX
. Note: In trusted PAL we do check for such limits, but this will also help in untrusted PAL case as well.If you feel this is redundant and not needed for untrusted PAL case, I can remove this check itself.
No, it's fine then. Resolving.
Pal/src/host/Linux-common/topo_info.c, line 125 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Yes agree, but will do this optimization in a separate PR. Have added a
TODO
to capture this effort.
For this release let us target this basic version of adding each range at a time.
Agreed with Vijay.
Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):
} out_info->resource_cnt++;
Actually, I would prefer to have:
out_info->resource_cnt = total_cnt;
This way it's clear why you did the check for overflow.
Pal/src/host/Linux-common/topo_info.c, line 117 at r5 (raw file):
size_t diff = end_val - start_val; size_t total_cnt; if (__builtin_add_overflow(out_info->resource_cnt, diff, &total_cnt) ||
Shouldn't you replace diff
with diff + 1
here?
Alternatively, you can add + 1
to the diff = ...
above.
Pal/src/host/Linux-common/topo_info.c, line 122 at r5 (raw file):
goto fail; } out_info->resource_cnt += diff + 1; /* +1 because of inclusive range */
Actually, I would prefer to have:
out_info->resource_cnt = total_cnt;
This way it's clear why you did the check for overflow.
Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Thanks for the nice explanation! Sure, will create a separate PR and fix these.
Now, I have removed all the
READ_ONCE()
and took the shallow copy approach. The code looks more readable with shallow copy :)
You added the shallow copy? where? I don't see it.
Or did you add the shallow copy to the caller of this function, and now the first argument of this function (untrusted_src
) points to the trusted in-enclave values (ranges_arr
, range_cnt
, resource_cnt
)? If so, you should change the name of the argument (untrusted_src
-> src
).
UPDATE: Ok, so you indeed supply an safe in-enclave first argument. So you should rename untrusted_src
to simply src
.
But because struct pal_res_range_info
contains in itself a ranges_arr
field (pointer to some other memory), this field in untrusted. So you must have sgx_copy_to_enclave()
on this field -- which you do already, that's good. So this function looks correct, nothing needs to be changed (except my small comments).
Pal/src/host/Linux-SGX/db_main.c, line 369 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
so you could even make this realloc_from_size() function available as a common function, and use it in both cases
We did go down this path here, gramineproject/graphene#2661 and finally decided we will just have it inline. So let us drop
realloc_from_size()
idea and I will inline it.See the snippet below,
But in the other place you use realloc_size(). Now we have a strange state that some PAL sources use realloc_size and some sources don't use it (and instead do the malloc-memcpy-free pattern explicitly). Sorry to go back and forth, but let's remove your newly-introduced realloc_size function and replace the only caller of realloc_size with the explicit malloc-memcpy-free pattern. It doesn't make sense to introduce a function for only one place in the code and (for some obscure reason) to not use this function in other places. This means that you should drop the commit [Pal] Add realloc_size() in PAL slab manager.
Thanks for this reminder. I forgot that we had this discussion several months ago :)
Thanks for inlining this function now. It looks more reasonable to me. And once we introduce realloc_size()
or something similar in a follow-up PR, the code will be much easier to read.
Pal/src/host/Linux-SGX/db_main.c, line 692 at r2 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Thanks for the clear explanation! The idea was to avoid 2 copies, but I see the issue now. I will revert it back to shallow-copy approach.
You're welcome. Yes, I understand the desire to optimize for performance and clarity (by avoiding 2 copies), but unfortunately security is often at odds with this desire :) For example, consider a classic case of I/O bounce buffers (used for security reasons) -- there are typically 4 redundant copies of data, and this performance loss is because of security requirement...
Pal/src/host/Linux-SGX/db_main.c, line 128 at r5 (raw file):
/* This function and all the below functions don't clean up resources on failure. * We terminate the process anyway. */ static int copy_resource_range_to_enclave(struct pal_res_range_info* untrusted_src,
untrusted_src
-> src
Pal/src/host/Linux-SGX/db_main.c, line 130 at r5 (raw file):
static int copy_resource_range_to_enclave(struct pal_res_range_info* untrusted_src, struct pal_res_range_info* dest) { size_t range_cnt = untrusted_src->range_cnt;
There is no need for this helper variable now. Just use src->range_cnt
everywhere in this func.
Pal/src/host/Linux-SGX/db_main.c, line 137 at r5 (raw file):
} if (!sgx_copy_to_enclave(ranges_arr, range_cnt * sizeof(*ranges_arr),
Please add a comment like this:
/* even though `src` points to a safe in-enclave object, the `src->ranges_arr` pointer is
* untrusted and may maliciously point to inside the enclave; thus need to use
* `sgx_copy_to_enclave()` function */
Pal/src/host/Linux-SGX/db_main.c, line 138 at r5 (raw file):
if (!sgx_copy_to_enclave(ranges_arr, range_cnt * sizeof(*ranges_arr), untrusted_src->ranges_arr,
READ_ONCE()
Pal/src/host/Linux-SGX/db_main.c, line 156 at r5 (raw file):
assert(out_core_topology_arr); struct pal_core_topo_info* temp_core_topo_arr =
You have two ways of writing "core topology": core_topo_arr
and core_topology_arr
. It's hard to read. Let's unify...
Meh, the fact that you sometimes use topo
and sometimes topology
drives me nuts :) I guess just use topo
everywhere.
Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):
/* Shallow copy contents of core_topology_arr (untrusted_src) into enclave */ if (!sgx_copy_to_enclave(temp_core_topo_arr, online_logical_cores_cnt * sizeof(*temp_core_topo_arr), untrusted_src,
untrusted_src
-> READ_ONCE(untrusted_src)
Pal/src/host/Linux-SGX/db_main.c, line 167 at r5 (raw file):
online_logical_cores_cnt * sizeof(*temp_core_topo_arr), untrusted_src, online_logical_cores_cnt * sizeof(*untrusted_src))) { log_error("Shallow copy of temp core_topology_arr into the enclave failed");
Remove temp
Pal/src/host/Linux-SGX/db_main.c, line 171 at r5 (raw file):
} struct pal_core_topo_info* core_topology_arr =
core_topology_arr
-> core_topo_arr
Pal/src/host/Linux-SGX/db_main.c, line 198 at r5 (raw file):
} /* Allocate enclave memory to store cache info */
This comment belongs to another malloc (below), not this one. Here it should be:
/* Shallow copy contents of cache_info_arr (untrusted pointer) into enclave */
Pal/src/host/Linux-SGX/db_main.c, line 202 at r5 (raw file):
malloc(cache_indices_cnt * sizeof(*temp_cache_info_arr)); if (!temp_cache_info_arr) { log_error("Allocation for temp cache_info_arr failed");
"Allocation for shallow copy of cache_info_arr failed"
Pal/src/host/Linux-SGX/db_main.c, line 208 at r5 (raw file):
if (!sgx_copy_to_enclave(temp_cache_info_arr, cache_indices_cnt * sizeof(*temp_cache_info_arr), temp_core_topo_arr->cache_info_arr,
READ_ONCE()
Rule of thumb: always add READ_ONCE
to the third argument of sgx_copy_to_enclave()
.
Pal/src/host/Linux-SGX/db_main.c, line 211 at r5 (raw file):
cache_indices_cnt * sizeof(*temp_core_topo_arr->cache_info_arr))) { log_error("Shallow copy of temp cache_info_arr into the enclave failed");
Remove temp
Pal/src/host/Linux-SGX/db_main.c, line 247 at r5 (raw file):
free(temp_core_topo_arr); return 0; }
FYI: I like the new version, with shallow copies (bar my comments). I tried to review it carefully, and from the security point of you, it looks correct (in the sense that we don't use untrusted objects without first copying them into the enclave).
Pal/src/host/Linux-SGX/db_main.c, line 257 at r5 (raw file):
malloc(online_nodes_cnt * sizeof(*temp_numa_topo_arr)); if (!temp_numa_topo_arr) { log_error("Allocation for temp numa_topology_arr failed");
"Allocation for shallow copy of numa_topo_arr failed"
Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):
/* Shallow copy contents of numa_topology_arr (untrusted_src) into enclave */ if (!sgx_copy_to_enclave(temp_numa_topo_arr, online_nodes_cnt * sizeof(*temp_numa_topo_arr), untrusted_src,
READ_ONCE(untrusted_src)
Pal/src/host/Linux-SGX/db_main.c, line 265 at r5 (raw file):
online_nodes_cnt * sizeof(*temp_numa_topo_arr), untrusted_src, online_nodes_cnt * sizeof(*untrusted_src))) { log_error("Shallow copy of temp numa_topology_arr into the enclave failed");
Remove temp
numa_topology_arr
-> numa_topo_arr
Pal/src/host/Linux-SGX/db_main.c, line 300 at r5 (raw file):
free(temp_numa_topo_arr); return 0; }
FYI: I like the new version, with shallow copies (bar my comments). I tried to review it carefully, and from the security point of you, it looks correct (in the sense that we don't use untrusted objects without first copying them into the enclave).
Pal/src/host/Linux-SGX/db_main.c, line 739 at r5 (raw file):
struct pal_topo_info temp_topo_info; if (!sgx_copy_to_enclave(&temp_topo_info, sizeof(temp_topo_info), uptr_topo_info, sizeof(*uptr_topo_info))) {
READ_ONCE(uptr_topo_info)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 6 files at r5, all commit messages.
Reviewable status: all files reviewed, 42 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Are you suggesting having text until
100 chars
? If so, I purposefully put "Example..." in a separate line so that it will be easier to read. Is this ok?
I mean only the function signature, char* str
is wrapped. You should see this on Reviewable if you configured line length to 100 in it.
LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done.
Why do both now?
Instead of what you did maybe just add str[0] = '\0'
? There's no reason to fail on empty range if we can trivially make it work correctly for it.
Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I guess Michal wants a comment like this:
struct pal_range_info { size_t start; size_t end; /* inclusive, e.g., start = 0 and end = 1 has two items (0 and 1) */
Yes, I ofc know which one is the case, this was the question from the point of a reader of our codebase. It's not clear from the header until you read the implementation.
Ad the comment: Please do as Dima suggested, but maybe without this example, "inclusive" is not a complex term which would require lengthy explanations.
Pal/include/arch/x86_64/pal_topology.h, line 49 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
why? I want to mention the variable name which is
range_cnt
? I want you to change the variable name. I don't understand your response here...
Pal/include/arch/x86_64/pal_topology.h, line 52 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
[{0, 15}, {21, 21}, {32, 63}]
is this fine?
Looks good now.
Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Yes, this is a valid case.
As before, I've read the whole PR so I know, but other developers reading this header in the future won't and the current comment doesn't make it clear how this works.
Pal/include/arch/x86_64/pal_topology.h, line 62 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
The comment is incorrect, I do store the value in bytes. Changed comment as below,
/* We store cache size is in bytes. But cache size is usually displayed in KB format. Please do * the needful conversion. * Pls ref: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu */
Then please just drop the whole comment, I don't see why "how we usually print this value in sysfs" could have any importance here.
Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I wouldn't care much about the actual type here... So I'm fine with Vijay's approach.
Then just use get_hw_resource_value()
on a temporary variable and then assign to is_logical_core_online
. size_t
makes no sense here and this is a global structure, it should be as clean and intuitive as possible.
Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):
If you feel this is redundant and not needed for untrusted PAL case, I can remove this check itself.
I don't see any point in doing these checks here, especially that nothing should break if the value is bigger than this.
Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Is this
1-5,10-15,20,30-31
better?
Why not just change "6" to "7" in your original example? The new one is much more complicated but doesn't really provide any more information than the original.
Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Reworded as
/* Updates `cache_indices_cnt` with number of cache levels present on the system by counting * "indexX" dir entries under `/sys/devices/system/cpu/cpuX/cache`. Returns 0 on success and * negative UNIX error code on failure. */
It's not really correct now, "to update something" slightly implies reading and then refreshing some value. Here we just return via pointer and never use the old *cache_indices_cnt
.
Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
This is length as per our coding guidelines as it denotes characters in a C-string.
Then the buffer is too small, you don't have space for the trailing null byte. I.e. you won't fit a C-string with PAL_SYSFS_PATH_LEN
chars there.
Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):
It can be but we need some upper limit, right?
Why?
Pal/src/host/Linux-common/topo_info.c, line 341 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
We have big comments not to do so :). Also, if we think the other way since it is set to 1 some can end up using it as a valid value. This can be an issue. But if you insist, I can set it to 1.
But this is a valid value, the core 0 is actually online. It's only the sysfs printing code which skips printing it, but I see no reason to have this quirk reflected in a global structure.
Pal/src/host/Linux-common/topo_info.c, line 401 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
length.
see my reply above
Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, I would prefer to have:
out_info->resource_cnt = total_cnt;
This way it's clear why you did the check for overflow.
Why not just remove this temporary variable and add directly to out_info->resource_cnt
?
Pal/src/host/Linux-common/topo_info.c, line 117 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Shouldn't you replace
diff
withdiff + 1
here?Alternatively, you can add
+ 1
to thediff = ...
above.
It's correct because of that >=
below, although quite weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 21 files reviewed, 42 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
LibOS/shim/include/shim_fs_pseudo.h, line 222 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
I mean only the function signature,
char* str
is wrapped. You should see this on Reviewable if you configured line length to 100 in it.
yes got it, thanks.
LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Why do both now?
Instead of what you did maybe just addstr[0] = '\0'
? There's no reason to fail on empty range if we can trivially make it work correctly for it.
Sorry, I don't understand. Are you saying it is fine to pass an empty string back to the application? This function basically converts ranges to strings, and I feel if the range is empty, we should not return success.
Converted memset(str, 0, str_size)
-> str[0] = '\0'
although I still prefer memset()
:).
LibOS/shim/src/fs/sys/fs.c, line 75 at r4 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I don't see Michal's requested change.
Sorry I put the check in a different place in db_main.c
, but it is not actually needed there. Fixed it here.
Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Yes, I ofc know which one is the case, this was the question from the point of a reader of our codebase. It's not clear from the header until you read the implementation.
Ad the comment: Please do as Dima suggested, but maybe without this example, "inclusive" is not a complex term which would require lengthy explanations.
oh ok :). Added the comment.
This comment confused me :) I was not sure if it was a question or a suggestion. Next time just say, add a comment.
Pal/include/arch/x86_64/pal_topology.h, line 49 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
? I want you to change the variable name. I don't understand your response here...
I misunderstood your comment. I thought you meant to change the comment.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
As before, I've read the whole PR so I know, but other developers reading this header in the future won't and the current comment doesn't make it clear how this works.
Yes, please say "add an example in the comment". I was not sure if it was a question or a suggestion.
I have updated the example to show touching ranges. Is this [{0, 15}, {16, 30}, {31, 31}]
example fine?
Pal/include/arch/x86_64/pal_topology.h, line 62 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Then please just drop the whole comment, I don't see why "how we usually print this value in sysfs" could have any importance here.
OK, done.
Pal/include/arch/x86_64/pal_topology.h, line 72 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Then just use
get_hw_resource_value()
on a temporary variable and then assign tois_logical_core_online
.size_t
makes no sense here and this is a global structure, it should be as clean and intuitive as possible.
OK, done.
Pal/include/arch/x86_64/pal_topology.h, line 60 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
is in bytes
->in bytes
Done.
Pal/include/arch/x86_64/pal_topology.h, line 60 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Weird phrasing, I suggest:
/* We store cache size in bytes, but it is typically displayed in KB (e.g., "256K"), so for output * it should be converted to KBs. See https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-system-cpu */
Done.
Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
If you feel this is redundant and not needed for untrusted PAL case, I can remove this check itself.
I don't see any point in doing these checks here, especially that nothing should break if the value is bigger than this.
Nothing will break but we can have bizarre values which will trickle down to trusted PAL and then fail. This was for catching the issue in the untrusted PAL itself and failing.
But I have removed the checks.
Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Why not just change "6" to "7" in your original example? The new one is much more complicated but doesn't really provide any more information than the original.
I thought having an individual value at the extremes made you think it was exclusive. So made it as a range. But I don't understand how changing 6
->7
will help. Please explain.
I have changed it to 7
now.
Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
It's not really correct now, "to update something" slightly implies reading and then refreshing some value. Here we just return via pointer and never use the old
*cache_indices_cnt
.
How about now?
/* This function stores the number of cache levels present on the system by counting "indexX" dir
* entries under `/sys/devices/system/cpu/cpuX/cache`in `out_cache_indices_cnt`. Returns 0 on
* success and negative UNIX error code on failure. */
Also renamed name the out variable as out_cache_indices_cnt
.
If this is not clear, please suggest your text and I will work it in.
Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Then the buffer is too small, you don't have space for the trailing null byte. I.e. you won't fit a C-string with
PAL_SYSFS_PATH_LEN
chars there.
Sorry not able to get your comment. Wait, are you saying that the C-string lengths (strlen
) don't account for null byte and so this is not a LEN
butSIZE
?
With this assumption, I am changing PAL_SYSFS_PATH_LEN
-> PAL_SYSFS_PATH_SIZE
But I read C-string as "Hello", which does include the null byte and not compare it with the return value of strlen()
.
Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
It can be but we need some upper limit, right?
Why?
Can we have larger than 2147483647
online cores? I set this as an upper limit to catch such ridiculously large values which will finally fail in trusted PAL because we have limit checks there. Idea was to catch such inconsistencies earlier and return.
But I have removed these checks.
Pal/src/host/Linux-common/topo_info.c, line 341 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
But this is a valid value, the core 0 is actually online. It's only the sysfs printing code which skips printing it, but I see no reason to have this quirk reflected in a global structure.
OK, I have initialized this to true
.
Pal/src/host/Linux-common/topo_info.c, line 401 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
see my reply above
Done?
Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Why not just remove this temporary variable and add directly to
out_info->resource_cnt
?
This will exceed 100 chars
so used a temp variable for better readability.
Pal/src/host/Linux-common/topo_info.c, line 117 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Shouldn't you replace
diff
withdiff + 1
here?Alternatively, you can add
+ 1
to thediff = ...
above.
I do it below to account for inclusiveness, but I agree that we should do it as part of the diff because we can catch any potential overflows.
Pal/src/host/Linux-common/topo_info.c, line 122 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Actually, I would prefer to have:
out_info->resource_cnt = total_cnt;
This way it's clear why you did the check for overflow.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 128 at r2 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You added the shallow copy? where? I don't see it.
Or did you add the shallow copy to the caller of this function, and now the first argument of this function (
untrusted_src
) points to the trusted in-enclave values (ranges_arr
,range_cnt
,resource_cnt
)? If so, you should change the name of the argument (untrusted_src
->src
).UPDATE: Ok, so you indeed supply an safe in-enclave first argument. So you should rename
untrusted_src
to simplysrc
.But because
struct pal_res_range_info
contains in itself aranges_arr
field (pointer to some other memory), this field in untrusted. So you must havesgx_copy_to_enclave()
on this field -- which you do already, that's good. So this function looks correct, nothing needs to be changed (except my small comments).
Yes, your understanding of the flow is right.
Done, and added a comment that ranges_arr
is untrusted.
Pal/src/host/Linux-SGX/db_main.c, line 128 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
untrusted_src
->src
Done.
Pal/src/host/Linux-SGX/db_main.c, line 130 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
There is no need for this helper variable now. Just use
src->range_cnt
everywhere in this func.
Done.
Pal/src/host/Linux-SGX/db_main.c, line 137 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Please add a comment like this:
/* even though `src` points to a safe in-enclave object, the `src->ranges_arr` pointer is * untrusted and may maliciously point to inside the enclave; thus need to use * `sgx_copy_to_enclave()` function */
Done.
Pal/src/host/Linux-SGX/db_main.c, line 138 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
READ_ONCE()
Done.
Pal/src/host/Linux-SGX/db_main.c, line 156 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
You have two ways of writing "core topology":
core_topo_arr
andcore_topology_arr
. It's hard to read. Let's unify...Meh, the fact that you sometimes use
topo
and sometimestopology
drives me nuts :) I guess just usetopo
everywhere.
OK :) done. I have also changed numa_topology_arr -> numa_topo_arr
Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
untrusted_src
->READ_ONCE(untrusted_src)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 167 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Remove
temp
Done.
Pal/src/host/Linux-SGX/db_main.c, line 171 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
core_topology_arr
->core_topo_arr
Done.
Pal/src/host/Linux-SGX/db_main.c, line 198 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
This comment belongs to another malloc (below), not this one. Here it should be:
/* Shallow copy contents of cache_info_arr (untrusted pointer) into enclave */
Done.
Pal/src/host/Linux-SGX/db_main.c, line 202 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
"Allocation for shallow copy of cache_info_arr failed"
Done.
Pal/src/host/Linux-SGX/db_main.c, line 208 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
READ_ONCE()
Rule of thumb: always add
READ_ONCE
to the third argument ofsgx_copy_to_enclave()
.
Done. I have also added TODOs
to the missing ones that were not part of this PR. As discussed, I will fix these in a separate PR.
Pal/src/host/Linux-SGX/db_main.c, line 211 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Remove
temp
Done.
Pal/src/host/Linux-SGX/db_main.c, line 257 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
"Allocation for shallow copy of numa_topo_arr failed"
Done.
Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
READ_ONCE(untrusted_src)
Done.
Pal/src/host/Linux-SGX/db_main.c, line 265 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Remove
temp
numa_topology_arr
->numa_topo_arr
Done.
Pal/src/host/Linux-SGX/db_main.c, line 739 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
READ_ONCE(uptr_topo_info)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r6, all commit messages.
Reviewable status: all files reviewed, 35 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
LibOS/shim/src/fs/sys/fs.c, line 21 at r4 (raw file):
Are you saying it is fine to pass an empty string back to the application?
We probably won't do this because we sanitize the data, but it seems more intuitive to me that printing an empty list gives you an empty string, rather than erroring out. But I won't block on this.
Pal/include/arch/x86_64/pal_topology.h, line 39 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
oh ok :). Added the comment.
This comment confused me :) I was not sure if it was a question or a suggestion. Next time just say, add a comment.
But I don't want you to blindly copy-paste my suggestions without thinking ;) You should read them, look at the code and think whether they make sense and what does they mean (e.g.: "oh, this is actually unclear, I should add a comment" or maybe "I already have a comment about this, but you probably just missed it" or "maybe instead it's better to rename this variable?").
Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Yes, please say "add an example in the comment". I was not sure if it was a question or a suggestion.
I have updated the example to show touching ranges. Is this
[{0, 15}, {16, 30}, {31, 31}]
example fine?
Better, but now the example has no holes in the ranges, which are allowed. Reading it you can get an impression that the ranges have to be adjacent.
Pal/src/host/Linux-common/topo_info.c, line 48 at r4 (raw file):
This was for catching the issue in the untrusted PAL itself and failing.
But this is pointless, we need to duplicate these checks anyways in trusted PAL. And on non-TEE PALs these values are trusted.
Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I thought having an individual value at the extremes made you think it was exclusive. So made it as a range. But I don't understand how changing
6
->7
will help. Please explain.I have changed it to
7
now.
Because specifying 3-5,6
makes no sense if the ranges are inclusive, you'd have 3-6
instead then. But if the ends were exclusive, then 3-5,6
would make sense.
Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
How about now?
/* This function stores the number of cache levels present on the system by counting "indexX" dir * entries under `/sys/devices/system/cpu/cpuX/cache`in `out_cache_indices_cnt`. Returns 0 on * success and negative UNIX error code on failure. */
Also renamed name the out variable as
out_cache_indices_cnt
.If this is not clear, please suggest your text and I will work it in.
Looks ok, but you're missing a space before in
.
Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):
Wait, are you saying that the C-string lengths (
strlen
) don't account for null byte and so this is not aLEN
butSIZE
?
Yes, strlen()
doesn't count the null byte, but it's a bit unrelated. It's just our convention that we use "len" for C-string length and "size" for C-string byte-size.
But I read C-string as "Hello", which does include the null byte and not compare it with the return value of
strlen()
.
I don't understand what you wanted to say here ;)
Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Can we have larger than
2147483647
online cores? I set this as an upper limit to catch such ridiculously large values which will finally fail in trusted PAL because we have limit checks there. Idea was to catch such inconsistencies earlier and return.But I have removed these checks.
It can only mask missing verification in the trusted code and is just code duplication because we already check it there.
Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
This will exceed
100 chars
so used a temp variable for better readability.
Hmm, it easily fits in 100 chars for me?
if (__builtin_add_overflow(out_info->resource_cnt, 1, &out_info->resource_cnt)) {
Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done.
Waaait, what was wrong with the old code? Why READ_ONCE()
here?
Pal/src/host/Linux-SGX/db_main.c, line 208 at r5 (raw file):
Rule of thumb: always add READ_ONCE to the third argument of sgx_copy_to_enclave().
Uhmmm?
Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done.
?
Pal/src/host/Linux-SGX/db_main.c, line 739 at r5 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done.
?
Pal/src/host/Linux-SGX/db_main.c, line 88 at r6 (raw file):
} /* TODO: Fix untrusted pointer access by adding `READ_ONCE()`*/
I don't understand this TODO, what is incorrectly accessed here? You added it at a line which looks correct.
Pal/src/host/Linux-SGX/db_main.c, line 131 at r6 (raw file):
static int copy_resource_range_to_enclave(struct pal_res_range_info* src, struct pal_res_range_info* dest) { struct pal_range_info* ranges_arr = malloc(src->ranges_cnt * sizeof(*ranges_arr));
What if untrusted host sets src->ranges_cnt
to 0x8000000000000001?
Pal/src/host/Linux-SGX/db_main.c, line 262 at r6 (raw file):
malloc(online_nodes_cnt * sizeof(*temp_numa_topo_arr)); if (!temp_numa_topo_arr) { log_error("Allocation for shallow copy of numa_topo_arr failed");
Why printing an error here? We don't do this in other cases when we run out of memory, we should just print this in a single place, somewhere top-level.
ditto for other places like this
Pal/src/host/Linux-SGX/db_main.c, line 485 at r6 (raw file):
} /* This function doesn't clean up resources on failure: we terminate the process anyway. */
Please keep this comment at each function which doesn't do the clean-up. When verifying a specific function for memory leaks you don't search the whole 1k-LoC file if there's a comment somewhere saying that "here and below" some usual rules don't apply ;)
Pal/src/host/Linux-SGX/db_main.c, line 909 at r6 (raw file):
/* At this point we don't yet have memory manager, so we cannot allocate memory dynamically. */ static char libpal_path[1024 + 1]; /* TODO: Fix untrusted pointer access by adding `READ_ONCE()`*/
Where's that untrusted access?
Pal/src/host/Linux-SGX/db_main.c, line 923 at r6 (raw file):
/* We can't verify the following arguments from the urts. So we copy them directly but need to * be careful when we use them. */ /* TODO: Fix untrusted pointer access by adding `READ_ONCE()`*/
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 18 of 21 files reviewed, 35 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Pal/include/arch/x86_64/pal_topology.h, line 54 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Better, but now the example has no holes in the ranges, which are allowed. Reading it you can get an impression that the ranges have to be adjacent.
OK, made the ranges discontinuous.
Pal/src/host/Linux-common/topo_info.c, line 58 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Because specifying
3-5,6
makes no sense if the ranges are inclusive, you'd have3-6
instead then. But if the ends were exclusive, then3-5,6
would make sense.
Thanks for the clarification.
Pal/src/host/Linux-common/topo_info.c, line 185 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Looks ok, but you're missing a space before
in
.
Done.
Pal/src/host/Linux-common/topo_info.c, line 239 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Wait, are you saying that the C-string lengths (
strlen
) don't account for null byte and so this is not aLEN
butSIZE
?Yes,
strlen()
doesn't count the null byte, but it's a bit unrelated. It's just our convention that we use "len" for C-string length and "size" for C-string byte-size.But I read C-string as "Hello", which does include the null byte and not compare it with the return value of
strlen()
.I don't understand what you wanted to say here ;)
I was trying to reason out why you would suggest _SIZE
instead of _LEN
:)
Pal/src/host/Linux-common/topo_info.c, line 312 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
It can only mask missing verification in the trusted code and is just code duplication because we already check it there.
OK.
Pal/src/host/Linux-common/topo_info.c, line 103 at r5 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Hmm, it easily fits in 100 chars for me?
if (__builtin_add_overflow(out_info->resource_cnt, 1, &out_info->resource_cnt)) {
oh my bad, removed the temporary variable.
Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Waaait, what was wrong with the old code? Why
READ_ONCE()
here?
The issue here is temp_topo_info
is a shallow copied object and so the core_topo_arr
is still an untrusted pointer. So, to prevent compiler optimization READ_ONCE()
is used.
Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
?
Same reason for using READ_ONCE()
in sgx_copy_core_topo_to_enclave()
Pal/src/host/Linux-SGX/db_main.c, line 88 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
I don't understand this TODO, what is incorrectly accessed here? You added it at a line which looks correct.
Hereuptr_src
is an untrusted pointer and we don't want the compiler to do any optimization to it. So added the TODO
Pal/src/host/Linux-SGX/db_main.c, line 131 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
What if untrusted host sets
src->ranges_cnt
to 0x8000000000000001?
Good point, this might result in a DOS attack. How about adding an upper limit check for ranges_cnt
? But I am not sure what should be the upper limit?
static int copy_resource_range_to_enclave(struct pal_res_range_info* src,
struct pal_res_range_info* dest) {
+ if (src->ranges_cnt > UINT16_MAX)
+ return -1;
+
struct pal_range_info* ranges_arr = malloc(src->ranges_cnt * sizeof(*ranges_arr));
if (!ranges_arr) {
log_error("Range allocation failed");
Pal/src/host/Linux-SGX/db_main.c, line 262 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Why printing an error here? We don't do this in other cases when we run out of memory, we should just print this in a single place, somewhere top-level.
ditto for other places like this
If we remove all the printing, and something fails we end up with this message Failed to copy and sanitize topology information
. This is not helpful. I think these logs are really needed esp. for topology cases where we have multiple levels of function calls. The only place where I haven't added error prints is in the realloc
flows as we plan to clean this up in a separate PR. If needed I can add error prints, there too.
The other function where we do malloc
and don't print error on failure is make_argv_list()
. This is a small function, and a top-level error message makes sense here.
But if reviewers strongly feel otherwise, I can remove it.
Pal/src/host/Linux-SGX/db_main.c, line 485 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Please keep this comment at each function which doesn't do the clean-up. When verifying a specific function for memory leaks you don't search the whole 1k-LoC file if there's a comment somewhere saying that "here and below" some usual rules don't apply ;)
OK, I have duplicated this comment on all functions where we use malloc/calloc
Pal/src/host/Linux-SGX/db_main.c, line 909 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Where's that untrusted access?
There isn't untrusted access. I have updated my comment.
Pal/src/host/Linux-SGX/db_main.c, line 923 at r6 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
ditto
Updated my comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 9 files at r6, 3 of 3 files at r7, all commit messages.
Reviewable status: all files reviewed, 21 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)
Pal/src/host/Linux-SGX/db_main.c, line 138 at r5 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Done.
Sorry, I was wrong. You don't need it here, because you do not dereference src->ranges_arr
(which is an untrusted pointer) and thus there is no untrusted memory access happenning. You only dereference src
which is a trusted pointer. So you don't need READ_ONCE()
, just remove it.
Pal/src/host/Linux-SGX/db_main.c, line 165 at r5 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
The issue here is
temp_topo_info
is a shallow copied object and so thecore_topo_arr
is still an untrusted pointer. So, to prevent compiler optimizationREAD_ONCE()
is used.
Yes, sorry, Vijay. Please revert READ_ONCE(untrusted_src)
-> untrusted_src
.
Pal/src/host/Linux-SGX/db_main.c, line 208 at r5 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Rule of thumb: always add READ_ONCE to the third argument of sgx_copy_to_enclave().
Uhmmm?
Please revert.
Pal/src/host/Linux-SGX/db_main.c, line 263 at r5 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Same reason for using
READ_ONCE()
insgx_copy_core_topo_to_enclave()
Please revert
Pal/src/host/Linux-SGX/db_main.c, line 739 at r5 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
?
Please revert
Pal/src/host/Linux-SGX/db_main.c, line 88 at r6 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Here
uptr_src
is an untrusted pointer and we don't want the compiler to do any optimization to it. So added theTODO
See my other comment here, I was an idiot and explained you not 100% correctly :(
Pal/src/host/Linux-SGX/db_main.c, line 262 at r6 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
If we remove all the printing, and something fails we end up with this message
Failed to copy and sanitize topology information
. This is not helpful. I think these logs are really needed esp. for topology cases where we have multiple levels of function calls. The only place where I haven't added error prints is in therealloc
flows as we plan to clean this up in a separate PR. If needed I can add error prints, there too.The other function where we do
malloc
and don't print error on failure ismake_argv_list()
. This is a small function, and a top-level error message makes sense here.But if reviewers strongly feel otherwise, I can remove it.
I was actually fine with this error.
Pal/src/host/Linux-SGX/db_main.c, line 909 at r6 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
There isn't untrusted access. I have updated my comment.
Please revert this comment (delete it)
Pal/src/host/Linux-SGX/db_main.c, line 923 at r6 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Updated my comment.
Please revert this comment (delete it)
Pal/src/host/Linux-SGX/db_main.c, line 90 at r7 (raw file):
/* TODO: Fix potential compiler optimization of untrusted pointer `uptr_src` by adding * `READ_ONCE()`. */
Oops, sorry Vijay. This was my blunder. I got it wrong.
So, the rule should be: if the third argument of sgx_copy_to_enclave()
contains a dereference of an untrusted pointer (i.e., contains memory access through an untrusted pointer), only then we need to add READ_ONCE()
. (Because this is what the macro READ_ONCE()
does -- it accesses memory only once).
In this case -- like in some other cases (I marked them all in this PR) -- I was wrong and I recommended you this redundant READ_ONCE()
. In this particular case, the third argument is simply uptr_src
-- which doesn't do any memory access, but simply this pointer value is passed to the function. So there is no need in READ_ONCE()
.
Please remove this TODO comment.
Pal/src/host/Linux-SGX/db_main.c, line 470 at r7 (raw file):
if (idx != 0) { /* core 0 is always online */ bool is_core_online = core_topo_arr[idx].is_logical_core_online; if (is_core_online != false && is_core_online != true) {
Wait, this doesn't make sense anymore :) Booleans can only have false
and true
values, nothing else, so this if is meaningless now.
I suggest to simply remove this check on is_logical_core_online
completely. All these 7 lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 6 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
I think I'm missing something, if we allow CPUs to be only from 0 to online_logical_cores_cnt - 1
then why do we even store /sys/devices/system/cpu/online
as ranges? It always has to be just a single range, from 0 to online_logical_cores_cnt - 1
, the we can just not store this redundant information at all?
Regarding inconsistent view between untrusted host and SGX, [...]
Yes, that's right, but that's not my point, my point is about internal inconsistencies between the data we expose to the app.
If you want, I can add this check but based on Gitter chat looks like you already have a proposal for this and plan to push a commit. So, I will hold off from pushing any changes.
Yes, I'll do this to speed things up for the release, but there are many things I don't understand here, so please watch discussions under this PR :)
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
/* Virtual environments such as QEMU may assign each core to a separate socket/package with * one or more NUMA nodes. So we check against the number of online logical cores. */ if (!IS_IN_RANGE_INCL(topo_info->sockets_cnt, 1, online_logical_cores_cnt)) {
Can't you have more sockets than online CPUs? E.g. when all cores from some socket are turned off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
* example: /sys/devices/system/cpu/cpu1/cache/index1/shared_cpu_map: 00000000,00000002 */ int ret = sanitize_hw_resource_range(&cache_info_arr[lvl].shared_cpu_map, 1, max_limit, 0, online_logical_cores_cnt);
cache_info_arr[lvl].shared_cpu_map
is another list which may include non-existing cores
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
I think I'm missing something, if we allow CPUs to be only from 0 to
online_logical_cores_cnt - 1
then why do we even store/sys/devices/system/cpu/online
as ranges? It always has to be just a single range, from 0 toonline_logical_cores_cnt - 1
, the we can just not store this redundant information at all?Regarding inconsistent view between untrusted host and SGX, [...]
Yes, that's right, but that's not my point, my point is about internal inconsistencies between the data we expose to the app.
If you want, I can add this check but based on Gitter chat looks like you already have a proposal for this and plan to push a commit. So, I will hold off from pushing any changes.
Yes, I'll do this to speed things up for the release, but there are many things I don't understand here, so please watch discussions under this PR :)
Sure, please go ahead.
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
cache_info_arr[lvl].shared_cpu_map
is another list which may include non-existing cores
Yes, we don't check the range subsets/offlined cores. This was intended for core offlining PR. Currently, we make sure that the range of cores as part of the share_cpu_map
is not greater than the total online_logical_cores_cnt
.
For example, if we have shared_cpu_map (LLC) range as 0-31
, and the online cores are 0-8
, then we fail. But if shared_cpu_map
were like 0-1, 7-8
(with cores 2-6 non-existing) we don't fail.
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Can't you have more sockets than online CPUs? E.g. when all cores from some socket are turned off.
I don't think the kernel will expose them. I did a quick test on my ICX system which has 2 sockets and each having 32 cores. I offlined all the cores except core 0 and this is the output of lscpu
. Also, I don't think sysfs
exposes any path to show the total sockets count. (other tool like dmidecode
may expose the total socket count but we don't use this)
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0
Off-line CPU(s) list: 1-63
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 106
Model name: Intel(R) Xeon(R) Platinum 8352Y CPU @ 2.20GHz
Stepping: 6
CPU MHz: 2200.000
BogoMIPS: 4400.00
Virtualization: VT-x
L1d cache: 48K
L1i cache: 32K
L2 cache: 1280K
L3 cache: 49152K
NUMA node0 CPU(s): 0
NUMA node1 CPU(s):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Sure, please go ahead.
What about my question?
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
This was intended for core offlining PR.
But I thought the idea was that right now we'll refuse to start if some cores in topology are declared as offlined, not that we'll expose inconsistent data to the app. Especially exposing indexes which point to nothing.
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I don't think the kernel will expose them. I did a quick test on my ICX system which has 2 sockets and each having 32 cores. I offlined all the cores except core 0 and this is the output of
lscpu
. Also, I don't thinksysfs
exposes any path to show the total sockets count. (other tool likedmidecode
may expose the total socket count but we don't use this)Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 64 On-line CPU(s) list: 0 Off-line CPU(s) list: 1-63 Thread(s) per core: 1 Core(s) per socket: 1 Socket(s): 1 NUMA node(s): 2 Vendor ID: GenuineIntel CPU family: 6 Model: 106 Model name: Intel(R) Xeon(R) Platinum 8352Y CPU @ 2.20GHz Stepping: 6 CPU MHz: 2200.000 BogoMIPS: 4400.00 Virtualization: VT-x L1d cache: 48K L1i cache: 32K L2 cache: 1280K L3 cache: 49152K NUMA node0 CPU(s): 0 NUMA node1 CPU(s):
I see, thanks for checking, seems they are indeed hidden. But other interesting thing is that the node1 cpu list is empty, so we should allow empty ranges?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):
struct pal_numa_topo_info { struct pal_res_range_info cpumap; struct pal_res_range_info distance;
What does this represent? I think on Linux it's not a list of ranges, but a list of integers? (distances to each other node)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
What does this represent? I think on Linux it's not a list of ranges, but a list of integers? (distances to each other node)
Actually, this field almost completely lacks sanitization, the only thing which is sanitized is resource_cnt
field, which is a self-declared (not verified with the list of ranges) sum of the ranges which has to equal to online_nodes_cnt
. Everything else can be arbitrary.
That's why it passes the tests... I was wondering how it's even possible that the tests are passing despite the ranges being unsorted here (that's the case now on my server).
Pal/src/host/Linux-SGX/db_main.c, line 601 at r10 (raw file):
} if (total_cores_in_numa != online_logical_cores_cnt) {
That's not enough, this sanitization misses verification against online and possible cores list. Non-existent indices will pass it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Actually, this field almost completely lacks sanitization, the only thing which is sanitized is
resource_cnt
field, which is a self-declared (not verified with the list of ranges) sum of the ranges which has to equal toonline_nodes_cnt
. Everything else can be arbitrary.
That's why it passes the tests... I was wondering how it's even possible that the tests are passing despite the ranges being unsorted here (that's the case now on my server).
yes, these are integers like 10 20
or 20 10
basically denoting the numa distances. Unlike other ranges like core_siblings
we don't call sanitize_hw_resource_range()
to sanitize this. This is because distance
can vary from system to system. All we want is to store the distance count. But again, this will raise the question of why store in pal_res_range_info
type. It is because I wanted to use a common function like get_hw_resource_range()
to read from the untrusted host which only returns a value of pal_res_range_info
type.
Other fields like ranges_cnt
and ranges_arr
are don't care as far as distance is concerned.
This isn't a range although it is stored in pal_res_range_info
. So, it will be treated as individual values instead of range and can be unsorted.
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
What about my question?
Sorry missed it, this was done to create the bare bones for supporting core offlining. I didn't want to keep changing the architecture.
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
This was intended for core offlining PR.
But I thought the idea was that right now we'll refuse to start if some cores in topology are declared as offlined, not that we'll expose inconsistent data to the app. Especially exposing indexes which point to nothing.
This case typically arises when cores are offline right, and as you pointed out, it might not lead to inconsistent data but maybe a performance issue. So, this was left until we have proper core offlining support.
We check resource_cnt
and range_cnt
are not 0
for shared_cpu_map
. So, we will not have empty indexes but missing cores.
Pal/src/host/Linux-SGX/db_main.c, line 601 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
That's not enough, this sanitization misses verification against online and possible cores list. Non-existent indices will pass it.
Yes, we should add a range subset check which will ensure that each core in numa node
is a subset of online_cores
. Since we are planning to add online_logical_cores
being a subset of possible_logical_cores
check, the above should be good.
I think if we add a single function to check if one range is a subset of another, we could use it for all these cases.
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
I see, thanks for checking, seems they are indeed hidden. But other interesting thing is that the node1 cpu list is empty, so we should allow empty ranges?
yes, we can change the limits when we call sanitize_hw_resource_range()
and also need to tweak this function to add a special check for numa
which can have empty range_cnt
. But again, let us keep this for core offlining PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
yes, these are integers like
10 20
or20 10
basically denoting the numa distances. Unlike other ranges likecore_siblings
we don't callsanitize_hw_resource_range()
to sanitize this. This is becausedistance
can vary from system to system. All we want is to store the distance count. But again, this will raise the question of why store inpal_res_range_info
type. It is because I wanted to use a common function likeget_hw_resource_range()
to read from the untrusted host which only returns a value ofpal_res_range_info
type.Other fields like
ranges_cnt
andranges_arr
are don't care as far as distance is concerned.This isn't a range although it is stored in
pal_res_range_info
. So, it will be treated as individual values instead of range and can be unsorted.
What if untrusted PAL sets the ranges to something where start != end
? I think you'll print output in invalid format in /sys/devices/system/node/node<n>/distance
?
Also, I really dislike this hack, i.e. "recycling" structures for something completely different than they were intended to. Here we just want a list of integers, so we should use a list of integers.
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Sorry missed it, this was done to create the bare bones for supporting core offlining. I didn't want to keep changing the architecture.
But right now we allow online_logical_cores
ranges which are different than [0; online_logical_cores_cnt)]
? So, such ranges pass sanitization but are actually unsupported and cause a lot of inconsistencies.
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
as you pointed out, it might not lead to inconsistent data but maybe a performance issue
I didn't say this. I said that right now we allow inconsistent data, e.g. containing invalid core indices.
We check
resource_cnt
andrange_cnt
are not0
forshared_cpu_map
. So, we will not have empty indexes but missing cores.
I didn't say empty (and btw, 0
is not "empty", "0" is just a zero index), I said indices pointing nowhere - e.g. indices which doesn't point to any core.
Pal/src/host/Linux-SGX/db_main.c, line 601 at r10 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Yes, we should add a range subset check which will ensure that each core in
numa node
is a subset ofonline_cores
. Since we are planning to addonline_logical_cores
being a subset ofpossible_logical_cores
check, the above should be good.I think if we add a single function to check if one range is a subset of another, we could use it for all these cases.
Yup, I think it will solve at least this specific issue.
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
yes, we can change the limits when we call
sanitize_hw_resource_range()
and also need to tweak this function to add a special check fornuma
which can have emptyrange_cnt
. But again, let us keep this for core offlining PR.
Why not just accept empty ranges?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
Not yet, I'm still verifying the sanitization and checks in the trusted part.
@mkow I have captured the following changes:
- Check
online_logical_cores
is a subset ofpossible_logical_cores
. - Check
shared_cpu_map
is a subset ofonline_logical_cores
. - Check cores in each numa node is a subset of
online_logical_cores
. - Change numa distance type from
pal_res_range_info
tolist of integers
- Allow empty numa nodes but I still prefer to fail for this case.
I'm not sure if you already have these changes, but if not please do let me know. I can get started and have something for you to review by tomorrow morning.
Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):
What if untrusted PAL sets the ranges to something where start != end? I think you'll print output in invalid format in /sys/devices/system/node/node/distance?
Yes, you are correct, we will print invalid format.
Also, I really dislike this hack, i.e. "recycling" structures for something completely different than they were intended to. Here we just want a list of integers, so we should use a list of integers.
I can clean this up and have it as a list of integers. This will also solve the invalid printing format.
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
But right now we allow
online_logical_cores
ranges which are different than[0; online_logical_cores_cnt)]
? So, such ranges pass sanitization but are actually unsupported and cause a lot of inconsistencies.
I think we agreed that for this issue we need to add online_logical_cores
being a subset of possible_logical_cores
check. In addition, I suggest we also have a check for possible_logical_cores
such that we only allow range_cnt
to be 1 and the ranges_arr
always start from 0
until max cores
.
PS: possible_logical_cores
will only have 0-max cores
even if we offline cores. So resource_cnt
will be n
cores, range_cnt
will be 1
and range_arr
will only have one range starting from 0
until max cores
.
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
I didn't say this. I said that right now we allow inconsistent data, e.g. containing invalid core indices.
True :) I interpreted it incorrectly. Typically applications don't check if a cpu shares a cache or not. It will end up using this inconsistent data, with the downside being performance due to incorrect cpu<->cache
mapping. Here again, if we make sure shared_cpu_map
is a subset of online_logical_cores
we can make sure the data is consistent. But even with such a check, the attacker can make share_cpu_map
a subset of online_logical_cores
but map different caches to different cores which will still cause performance issues.
I didn't say empty (and btw, 0 is not "empty", "0" is just a zero index), I said indices pointing nowhere - e.g. indices which doesn't point to any core.
It can't be indices that don't point to any core as we create a bitmap from the range_arr
. So, an index if set will either point to a core that truly shares the cache or point to a core that doesn't share the cache (compromised system) or that core is not for that cache.
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Why not just accept empty ranges?
The numa nodes will only be empty if we offline cores or the attacker has compromised the system. Right now, we don't have proper offlining support to properly validate this case and so I suggested we fail if the numa node is empty. This is the current behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
a discussion (no related file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
@mkow I have captured the following changes:
- Check
online_logical_cores
is a subset ofpossible_logical_cores
.- Check
shared_cpu_map
is a subset ofonline_logical_cores
.- Check cores in each numa node is a subset of
online_logical_cores
.- Change numa distance type from
pal_res_range_info
tolist of integers
- Allow empty numa nodes but I still prefer to fail for this case.
I'm not sure if you already have these changes, but if not please do let me know. I can get started and have something for you to review by tomorrow morning.
I started working on this and will try to solve at least some of the issues. We discussed this with maintainers and will just merge what we have (because it's still much better than the old code) but will need to keep it under the insecure/experimental flag until we make sure that everything is really validated.
And these issues are probably not everything, I still keep finding issues and it's quite hard to tell from the current implementation whether everything is already consistent or not. I'll try to change the code/structures to make verification part simpler and easier to prove.
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I think we agreed that for this issue we need to add
online_logical_cores
being a subset ofpossible_logical_cores
check. In addition, I suggest we also have a check forpossible_logical_cores
such that we only allowrange_cnt
to be 1 and theranges_arr
always start from0
untilmax cores
.PS:
possible_logical_cores
will only have0-max cores
even if we offline cores. Soresource_cnt
will ben
cores,range_cnt
will be1
andrange_arr
will only have one range starting from0
untilmax cores
.
Not sure if that's enough: you trim the range in sysfs printing to [0; online_logical_cores_cnt)
but in sanitization allow other ranges as well. So, effectively we'll expose the more complex ranges (visible e.g. in counts of resources) but will trim the available info in cpu directory to ranges SET-AND [0; online_logical_cores_cnt)
.
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
It will end up using this inconsistent data, with the downside being performance
or memory corruption, depending on the app's implementation.
Performance is not an issue, the host can always deny resources.
It can't be indices that don't point to any core as we create a bitmap from the
range_arr
. So, an index if set will either point to a core that truly shares the cache or point to a core that doesn't share the cache (compromised system) or that core is not for that cache.
But where do we verify that cache_info_arr[lvl].shared_cpu_map
contains only valid indices?
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
the attacker has compromised the system
There's no "compromised system" here, the host owner is the attacker.
we don't have proper offlining support to properly validate this case
I don't think there will be anything in the current sanitization that breaks if we remove this check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @mkow and @vijaydhanraj)
a discussion (no related file):
I started working on this and will try to solve at least some of the issues. We discussed this with maintainers and will just merge what we have (because it's still much better than the old code) but will need to keep it under the insecure/experimental flag until we make sure that everything is really validated.
OK, sounds good. So, do you want me to merge the fixups and rebase them to the latest master? or do you have some changes to push?
And these issues are probably not everything, I still keep finding issues and it's quite hard to tell from the current implementation whether everything is already consistent or not. I'll try to change the code/structures to make verification part simpler and easier to prove.
Sure, we can keep the discussions active.
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Not sure if that's enough: you trim the range in sysfs printing to
[0; online_logical_cores_cnt)
but in sanitization allow other ranges as well. So, effectively we'll expose the more complex ranges (visible e.g. in counts of resources) but will trim the available info in cpu directory toranges SET-AND [0; online_logical_cores_cnt)
.
Can you please give an example of the failure case? It will be easier to understand your case.
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
It will end up using this inconsistent data, with the downside being performance
or memory corruption, depending on the app's implementation.
Performance is not an issue, the host can always deny resources.
It can't be indices that don't point to any core as we create a bitmap from the
range_arr
. So, an index if set will either point to a core that truly shares the cache or point to a core that doesn't share the cache (compromised system) or that core is not for that cache.But where do we verify that
cache_info_arr[lvl].shared_cpu_map
contains only valid indices?
I don't think we can verify this without reading CPUID leaf 0x4
on each core.
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
There's no "compromised system" here, the host owner is the attacker.
By "compromised system", I mean the integrity of the system is lost. It could be due to an attacker who could be a host owner or not.
I don't think there will be anything in the current sanitization that breaks if we remove this check.
How do we differentiate the case where the attacker has willfully tampered and made the node empty Vs truly empty due to offlined cores? Might again lead to inconsistent data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv, @mkow, and @vijaydhanraj)
a discussion (no related file):
OK, sounds good. So, do you want me to merge the fixups and rebase them to the latest master? or do you have some changes to push?
I have some changes which I'm still working on, but I can submit them as a follow-up PR, this code is behind the "experimental" flag, so it should be fine to merge the non-polished version for now.
Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
What if untrusted PAL sets the ranges to something where start != end? I think you'll print output in invalid format in /sys/devices/system/node/node/distance?
Yes, you are correct, we will print invalid format.
Also, I really dislike this hack, i.e. "recycling" structures for something completely different than they were intended to. Here we just want a list of integers, so we should use a list of integers.
I can clean this up and have it as a list of integers. This will also solve the invalid printing format.
I'll do it in a separate PR.
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
Can you please give an example of the failure case? It will be easier to understand your case.
online_logical_cores = [[0, 1], [5,6]]
and online_logical_cores_cnt = 4
, the structures in PAL will say that there are 4 CPUs, but sysfs will show only two of them? Or maybe I'm missing something? Actually, don't you have 2 different numbering for cores? I.e. a different one in ranges and a different one in path component names? (i.e. "cpu").
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I don't think we can verify this without reading CPUID leaf
0x4
on each core.
By "valid" I meant "consistent with other information we present to the app", e.g. the list of online CPUs.
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
By "compromised system", I mean the integrity of the system is lost. It could be due to an attacker who could be a host owner or not.
We don't assume any integrity of the host, there's nothing to be "lost".
How do we differentiate the case where the attacker has willfully tampered and made the node empty Vs truly empty due to offlined cores? Might again lead to inconsistent data?
No, why would it have to lead to inconsistent data? We fully control what we present to the application, we can make it fully consistent.
9c491f4
to
74dedd5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 12 of 21 files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @mkow)
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
OK, sounds good. So, do you want me to merge the fixups and rebase them to the latest master? or do you have some changes to push?
I have some changes which I'm still working on, but I can submit them as a follow-up PR, this code is behind the "experimental" flag, so it should be fine to merge the non-polished version for now.
Rebased to the latest master and pushed the changes.
Previously, mkow (Michał Kowalczyk) wrote…
Looks better.
Done.
Pal/include/arch/x86_64/pal_topology.h, line 82 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
I'll do it in a separate PR.
OK.
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
I think we agreed that for this issue we need to add online_logical_cores being a subset of possible_logical_cores check. In addition, I suggest we also have a check for possible_logical_cores such that we only allow range_cnt to be 1 and the ranges_arr always start from 0 until max cores.
PS: possible_logical_cores will only have 0-max cores even if we offline cores. So resource_cnt will be n cores, range_cnt will be 1 and range_arr will only have one range starting from 0 until max cores.
With the above changes, and online_logical_cores = [[0, 1], [5,6]]
shouldn't gramine start up fail as possible_logical_cores_cnt > online_logical_cores_cnt
?
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
OK, that we can make sure by checking the list of cores that are part of share_cpu_map
is a subset of online_logical_cores
. This we already established right?
But even with such a check, the attacker can make share_cpu_map a subset of online_logical_cores but map different caches to different cores which will still cause performance issues.
This we cannot verify without proper cpuid
checks.
Pal/src/host/Linux-SGX/db_main.c, line 601 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Yup, I think it will solve at least this specific issue.
OK.
Pal/src/host/Linux-SGX/db_main.c, line 798 at r10 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
By "compromised system", I mean the integrity of the system is lost. It could be due to an attacker who could be a host owner or not.
We don't assume any integrity of the host, there's nothing to be "lost".
How do we differentiate the case where the attacker has willfully tampered and made the node empty Vs truly empty due to offlined cores? Might again lead to inconsistent data?
No, why would it have to lead to inconsistent data? We fully control what we present to the application, we can make it fully consistent.
yes, I take back, there wouldn't be inconsistent data. We can add this check.
Jenkins, retest this please |
@vijaydhanraj Jenkins fails like this:
Please fix. |
74dedd5
to
bc06c66
Compare
This commit creates a clean representation of topology data on the trusted<->untrusted boundary, so that we can easily verify its correctness. The following are done: 1. Refactor sysfs code to convert CPU/NUMA information from Linux-formatted strings to integers in untrusted PAL. 2. Copy and sanitize CPU/NUMA information based on untrusted integers to trusted PAL. 3. Convert the trusted CPU/NUMA information from integers back to Linux-formatted strings in LibOS. Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
bc06c66
to
66f58d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r11, 1 of 1 files at r12, all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I think we agreed that for this issue we need to add online_logical_cores being a subset of possible_logical_cores check. In addition, I suggest we also have a check for possible_logical_cores such that we only allow range_cnt to be 1 and the ranges_arr always start from 0 until max cores.
PS: possible_logical_cores will only have 0-max cores even if we offline cores. So resource_cnt will be n cores, range_cnt will be 1 and range_arr will only have one range starting from 0 until max cores.With the above changes, and
online_logical_cores = [[0, 1], [5,6]]
shouldn't gramine start up fail aspossible_logical_cores_cnt > online_logical_cores_cnt
?
But we can set possible_logical_cores_cnt
to something large? I didn't even use it in the example above.
Pal/src/host/Linux-SGX/db_main.c, line 401 at r10 (raw file):
OK, that we can make sure by checking the list of cores that are part of
share_cpu_map
is a subset ofonline_logical_cores
. This we already established right?
I think so.
This we cannot verify without proper
cpuid
checks.
Yes, but as I said, we don't care about the untrusted host influencing performance, because it has much better ways to do it than this.
Jenkins, retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
But we can set
possible_logical_cores_cnt
to something large? I didn't even use it in the example above.
I mean without core offlining online_logical_core_cnt
should be equal to possible_logical_cores_cnt
. So, if we have holes in online_logical_cores
as you mentioned in your example, it should fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, vijaydhanraj (Vijay Dhanraj) wrote…
I mean without core offlining
online_logical_core_cnt
should be equal topossible_logical_cores_cnt
. So, if we have holes inonline_logical_cores
as you mentioned in your example, it should fail.
But we don't have such a check and I think we only discussed checking if online_logical_core_cnt
is a subset of possible_logical_cores_cnt
, not checking for equality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
But we don't have such a check and I think we only discussed checking if
online_logical_core_cnt
is a subset ofpossible_logical_cores_cnt
, not checking for equality.
We already have this possible_logical_cores_cnt > online_logical_cores_cnt
which will catch the example you showed. But we should change this to equality check until proper core offlining support is added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL)
Pal/src/host/Linux-SGX/db_main.c, line 769 at r9 (raw file):
We already have this
possible_logical_cores_cnt > online_logical_cores_cnt
which will catch the example you showed.
It won't, my example allows specifying more possible cores, so that this check passes.
But we should change this to equality check until proper core offlining support is added.
Better to just check for equality of both lists. But we can do this later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r10, 2 of 9 files at r11, 1 of 1 files at r12, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved
Description of the changes
This PR refactors sysfs to improve sanitization. It fixes issues 1, 2 listed as part of gramineproject/graphene#2105.
This supersedes PR #106 and I have closed the PR.
How to test this PR?
Please run
sysfs_common
andproc_common
regression tests. Also,openvino
benchmark using the TBB library can be used to validate the changes.This change is