-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid all compiler optimization on embedded apphost hash #110554
base: main
Are you sure you want to change the base?
Conversation
1ebf110
to
81e3213
Compare
74ea3b1
to
aa103bb
Compare
aa103bb
to
265d78f
Compare
265d78f
to
8d4a245
Compare
We assume that there is a single copy of the apphost hash in the apphost binary. And that it hasn't been modified by the compiler. However, the compiler can optimize the hash multiple ways, including re-ordering elements of the hash or duplicating the contents of the hash. This can currently happen under certain compiler versions and optimization flags. Try and avoid that by marking the hash as a volatile string and implementing comparisons/copying/initialization that respects that. Fixes: dotnet#109611
8d4a245
to
4e33337
Compare
|
||
static const char hi_part[] = EMBED_HASH_HI_PART_UTF8; | ||
static const char lo_part[] = EMBED_HASH_LO_PART_UTF8; | ||
|
||
if (!pal::clr_palstring(embed, app_dll)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementing pal::clr_palstring per-platform to handle volatile correctly looked much more involved that creating a non-volatile copy instead.
Strangely enough, even the non-volatile copy still needed the no-opt memory compare to stop creating duplicate instances of the embedded hash. I wonder if the compiler ignored volatile data when it started optimizing the non-volatile copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that the compiler can see all operations on the value and that the value does not escape, and thus volatile can be safely ignored.
It may help to move the value to the global scope and mark it as extern
or extern "C"
. It should prevent the compiler from making assumptions the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@omajid Have you tried applying extern
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect it won't help after thinking about it some more. The main problem that this PR is trying to solve is suppressing undesirable optimizations in whole program optimization mode. The compiler can see all uses of the symbol in this mode. extern
won't prevent it from making this observation. We would have to artificially export the symbol from the binary to prevent the optimizations, but that is undesirable for other reasons.
@@ -48,18 +66,21 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll) | |||
// Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build". | |||
// Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length | |||
// where length is determined at compile time (=64) instead of the actual length of the string at runtime. | |||
static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string | |||
volatile static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious what happens if you change:
static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
to:
static const char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;
char embed[EMBED_MAX];
to_non_volatile(const_embed, embed, EMBED_MAX);
And keep the remainder of the file as it were (except for adding to_non_volatile
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment above says we can't use const
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this change as the full diff:
diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..d0c90f4bfcf 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -40,6 +40,24 @@
#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
#define EMBED_HASH_FULL_UTF8 (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
+void to_non_volatile(volatile const char* cstr, char* output, size_t length)
+{
+ for (size_t i = 0; i < length; i++)
+ {
+ output[i] = cstr[i];
+ }
+}
+
+bool compare_memory_nooptimization(volatile const char* a, volatile const char* b, size_t length)
+{
+ for (size_t i = 0; i < length; i++)
+ {
+ if (*a++ != *b++)
+ return false;
+ }
+ return true;
+}
+
bool is_exe_enabled_for_execution(pal::string_t* app_dll)
{
constexpr int EMBED_SZ = sizeof(EMBED_HASH_FULL_UTF8) / sizeof(EMBED_HASH_FULL_UTF8[0]);
@@ -48,11 +66,14 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
// Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build".
// Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length
// where length is determined at compile time (=64) instead of the actual length of the string at runtime.
- static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
+ volatile static char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
+ char embed[EMBED_MAX];
+ to_non_volatile(const_embed, embed, EMBED_MAX);
+
if (!pal::clr_palstring(embed, app_dll))
{
trace::error(_X("The managed DLL bound to this executable could not be retrieved from the executable image."));
The compiler embeds the string twice:
++ grep --byte-offset --only-matching --text c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca ./artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/home/omajid/runtime/src/native/corehost/corehost.cpp.o
320:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca
4816:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using const static char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;
produces the exact same result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment above says we can't use const.
Like this, it is also working around a compiler optimization. I thought we may tackle both together.
Using const static char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; produces the exact same result.
That is: you also tried without the volatile
?
I'm surprised it found the need to duplicate const data which shouldn't be optimized for a specific usage.
Thanks for giving it try!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean like this?
diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..f77fbe63347 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -40,6 +40,14 @@
#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
#define EMBED_HASH_FULL_UTF8 (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
+void to_non_volatile(volatile const char* cstr, char* output, size_t length)
+{
+ for (size_t i = 0; i < length; i++)
+ {
+ output[i] = cstr[i];
+ }
+}
+
bool is_exe_enabled_for_execution(pal::string_t* app_dll)
{
constexpr int EMBED_SZ = sizeof(EMBED_HASH_FULL_UTF8) / sizeof(EMBED_HASH_FULL_UTF8[0]);
@@ -48,11 +56,14 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
// Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build".
// Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length
// where length is determined at compile time (=64) instead of the actual length of the string at runtime.
- static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
+ volatile static const char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
+ char embed[EMBED_MAX];
+ to_non_volatile(const_embed, embed, EMBED_MAX);
+
if (!pal::clr_palstring(embed, app_dll))
{
trace::error(_X("The managed DLL bound to this executable could not be retrieved from the executable image."));
This leads to:
$ grep --byte-offset --only-matching --text c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca ./artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/home/omajid/runtime/src/native/corehost/corehost.cpp.o
256:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca
4752:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const vs non-const doesn't make a difference, as far as I can see:
diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..7c6c1244815 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -40,6 +40,14 @@
#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
#define EMBED_HASH_FULL_UTF8 (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
+void to_non_volatile(volatile const char* cstr, char* output, size_t length)
+{
+ for (size_t i = 0; i < length; i++)
+ {
+ output[i] = cstr[i];
+ }
+}
+
bool is_exe_enabled_for_execution(pal::string_t* app_dll)
{
constexpr int EMBED_SZ = sizeof(EMBED_HASH_FULL_UTF8) / sizeof(EMBED_HASH_FULL_UTF8[0]);
@@ -48,11 +56,14 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
// Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build".
// Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length
// where length is determined at compile time (=64) instead of the actual length of the string at runtime.
- static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
+ volatile static char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
+ char embed[EMBED_MAX];
+ to_non_volatile(const_embed, embed, EMBED_MAX);
+
if (!pal::clr_palstring(embed, app_dll))
{
trace::error(_X("The managed DLL bound to this executable could not be retrieved from the executable image."));
$ grep --byte-offset --only-matching --text c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca ./artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/home/omajid/runtime/src/native/corehost/corehost.cpp.o
256:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca
4752:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My conclusion is that I neither understand why all of these don't work, and why what is in the PR does work. 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't realize the duplication comes from the compares against the parts and not from the complete embed...
I wonder if introducing overlap in the parts wouldn't eliminate the issue too.
diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..e3d0ca72891 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -36,9 +36,10 @@
* o https://en.wikipedia.org/wiki/Comparison_of_file_systems
* has more details on maximum file name sizes.
*/
-#define EMBED_HASH_HI_PART_UTF8 "c3ab8ff13720e8ad9047dd39466b3c89" // SHA-256 of "foobar" in UTF-8
-#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
-#define EMBED_HASH_FULL_UTF8 (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
+#define EMBED_HASH_PART_1_UTF8 "c3ab8ff13720e8ad9047dd39" // SHA-256 of "foobar" in UTF-8
+#define EMBED_HASH_PART_2_UTF8 "466b3c8974e592c2fa383d4a"
+#define EMBED_HASH_PART_3_UTF8 "3960714caef0c4f2"
+#define EMBED_HASH_FULL_UTF8 (EMBED_HASH_PART_1_UTF8 EMBED_HASH_PART_2_UTF8 EMBED_HASH_PART_3_UTF8) // NUL terminated
bool is_exe_enabled_for_execution(pal::string_t* app_dll)
{
@@ -50,8 +51,8 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
// where length is determined at compile time (=64) instead of the actual length of the string at runtime.
static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
- static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
- static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
+ static const char hi_part[] = (EMBED_HASH_PART_1_UTF8 EMBED_HASH_PART_2_UTF8);
+ static const char lo_part[] = (EMBED_HASH_PART_2_UTF8 EMBED_HASH_PART_3_UTF8);
if (!pal::clr_palstring(embed, app_dll))
{
@@ -73,9 +74,9 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
// So use two parts of the string that will be unaffected by the edit.
size_t hi_len = (sizeof(hi_part) / sizeof(hi_part[0])) - 1;
size_t lo_len = (sizeof(lo_part) / sizeof(lo_part[0])) - 1;
- if (binding.size() >= (hi_len + lo_len)
+ if (binding.size() >= (EMBED_SZ - 1)
&& binding.compare(0, hi_len, &hi_part[0]) == 0
- && binding.compare(hi_len, lo_len, &lo_part[0]) == 0)
+ && binding.compare(EMBED_SZ - 1 - lo_len, lo_len, &lo_part[0]) == 0)
{
trace::error(_X("This executable is not bound to a managed DLL to execute. The binding value is: '%s'"), app_dll->c_str());
return false;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the above to would work because the two parts that the compiler optimizes against with the overlap are less likely to form the whole embed due to the compiler optimizations than they currently are. The embed is split in two parts to avoid them forming the whole, and this is adding to that approach.
If this works, it also means that the to_non_volatile
logic that is in the PR is not needed because the above is only about the comparison against the parts.
The difference between compare_memory_nooptimization
and the above is that the former leaves out any possibility of the compiler optimizing the comparison (and therefore potentially reintroducing the issue as compilers get "smarter").
Rewriting volatile-based implementations instead of diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..d5e5fc3fb61 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -40,6 +40,33 @@
#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
#define EMBED_HASH_FULL_UTF8 (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
+void to_non_volatile(volatile const char* cstr, char* output, size_t length)
+{
+ for (size_t i = 0; i < length; i++)
+ {
+ output[i] = cstr[i];
+ }
+}
+
+bool compare_memory_nooptimization(volatile const char* a, volatile const char* b, size_t length)
+{
+ for (size_t i = 0; i < length; i++)
+ {
+ if (*a++ != *b++)
+ return false;
+ }
+ return true;
+}
+
+size_t strlen_nooptimization(volatile const char* a)
+{
+ for (size_t i = 0; ; i++)
+ {
+ if (a[i] == 0)
+ return i;
+ }
+}
+
bool is_exe_enabled_for_execution(pal::string_t* app_dll)
{
constexpr int EMBED_SZ = sizeof(EMBED_HASH_FULL_UTF8) / sizeof(EMBED_HASH_FULL_UTF8[0]);
@@ -48,21 +75,24 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
// Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build".
// Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length
// where length is determined at compile time (=64) instead of the actual length of the string at runtime.
- static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
+ volatile static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
- if (!pal::clr_palstring(embed, app_dll))
+ char embed_non_volatile[EMBED_MAX];
+ to_non_volatile(embed, embed_non_volatile, EMBED_MAX);
+
+ if (!pal::clr_palstring(embed_non_volatile, app_dll))
{
trace::error(_X("The managed DLL bound to this executable could not be retrieved from the executable image."));
return false;
}
- std::string binding(&embed[0]);
+ size_t length = strlen_nooptimization(embed);
// Check if the path exceeds the max allowed size
- if (binding.size() > EMBED_MAX - 1) // -1 for null terminator
+ if (length > EMBED_MAX - 1) // -1 for null terminator
{
trace::error(_X("The managed DLL bound to this executable is longer than the max allowed length (%d)"), EMBED_MAX - 1);
return false;
@@ -73,9 +103,9 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
// So use two parts of the string that will be unaffected by the edit.
size_t hi_len = (sizeof(hi_part) / sizeof(hi_part[0])) - 1;
size_t lo_len = (sizeof(lo_part) / sizeof(lo_part[0])) - 1;
- if (binding.size() >= (hi_len + lo_len)
- && binding.compare(0, hi_len, &hi_part[0]) == 0
- && binding.compare(hi_len, lo_len, &lo_part[0]) == 0)
+ if (length >= (hi_len + lo_len)
+ && compare_memory_nooptimization(embed, hi_part, hi_len)
+ && compare_memory_nooptimization(embed + hi_len, lo_part, lo_len))
{
trace::error(_X("This executable is not bound to a managed DLL to execute. The binding value is: '%s'"), app_dll->c_str());
return false;
|
@AaronRobinsonMSFT Can you also take a look at this? Would appreciate your C++ knowledge. |
We assume that there is a single copy of the apphost hash in the apphost binary. And that it hasn't been modified by the compiler. However, the compiler can optimize the hash multiple ways, including re-ordering elements of the hash or duplicating the contents of the hash. This can currently happen under certain compiler versions and optimization flags.
Try and avoid that by marking the hash as a volatile string and implementing comparisons/copying/initialization that respects that.
Fixes: #109611