Avoid all compiler optimization on embedded apphost hash #110554
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -40,6 +40,24 @@ | |
#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2" | ||
#define EMBED_HASH_FULL_UTF8 (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated | ||
|
||
void to_non_volatile(volatile const char* cstr, char* output, size_t length) | ||
{ | ||
for (size_t i = 0; i < length; i++) | ||
{ | ||
output[i] = cstr[i]; | ||
} | ||
} | ||
|
||
bool compare_memory_nooptimization(volatile const char* a, volatile const char* b, size_t length) | ||
{ | ||
for (size_t i = 0; i < length; i++) | ||
{ | ||
if (*a++ != *b++) | ||
return false; | ||
} | ||
return true; | ||
} | ||
|
||
bool is_exe_enabled_for_execution(pal::string_t* app_dll) | ||
{ | ||
constexpr int EMBED_SZ = sizeof(EMBED_HASH_FULL_UTF8) / sizeof(EMBED_HASH_FULL_UTF8[0]); | ||
|
@@ -48,18 +66,21 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll) | |
// Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build". | ||
// Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length | ||
// where length is determined at compile time (=64) instead of the actual length of the string at runtime. | ||
static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string | ||
volatile static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string | ||
|
||
static const char hi_part[] = EMBED_HASH_HI_PART_UTF8; | ||
static const char lo_part[] = EMBED_HASH_LO_PART_UTF8; | ||
|
||
if (!pal::clr_palstring(embed, app_dll)) | ||
Implementing pal::clr_palstring per-platform to handle volatile correctly looked much more involved than creating a non-volatile copy instead. Strangely enough, even the non-volatile copy still needed the no-opt memory compare to stop creating duplicate instances of the embedded hash. I wonder if the compiler ignored volatile data when it started optimizing the non-volatile copy?

My guess is that the compiler can see all operations on the value and that the value does not escape, and thus volatile can be safely ignored. It may help to move the value to the global scope and mark it as

@omajid Have you tried applying

I suspect it won't help after thinking about it some more. The main problem that this PR is trying to solve is suppressing undesirable optimizations in whole program optimization mode. The compiler can see all uses of the symbol in this mode.
||
char working_copy_embed[EMBED_MAX]; | ||
to_non_volatile(embed, working_copy_embed, EMBED_MAX); | ||
|
||
if (!pal::clr_palstring(&working_copy_embed[0], app_dll)) | ||
{ | ||
trace::error(_X("The managed DLL bound to this executable could not be retrieved from the executable image.")); | ||
return false; | ||
} | ||
|
||
std::string binding(&embed[0]); | ||
std::string binding(&working_copy_embed[0]); | ||
|
||
// Check if the path exceeds the max allowed size | ||
if (binding.size() > EMBED_MAX - 1) // -1 for null terminator | ||
|
@@ -74,8 +95,8 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll) | |
size_t hi_len = (sizeof(hi_part) / sizeof(hi_part[0])) - 1; | ||
size_t lo_len = (sizeof(lo_part) / sizeof(lo_part[0])) - 1; | ||
if (binding.size() >= (hi_len + lo_len) | ||
&& binding.compare(0, hi_len, &hi_part[0]) == 0 | ||
&& binding.compare(hi_len, lo_len, &lo_part[0]) == 0) | ||
&& compare_memory_nooptimization(binding.c_str(), hi_part, hi_len) | ||
&& compare_memory_nooptimization(binding.substr(hi_len).c_str(), lo_part, lo_len)) | ||
{ | ||
trace::error(_X("This executable is not bound to a managed DLL to execute. The binding value is: '%s'"), app_dll->c_str()); | ||
return false; | ||
|
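The copy-out step in the diff above can be tried in isolation. The following is a minimal sketch, not the apphost code: `DEMO_EMBED_MAX`, `demo_embed`, `read_demo_embed` and the `"app.dll"` value are hypothetical stand-ins, and the defensive NUL termination is an addition for the sketch only. It shows the buffer being read element by element through a volatile pointer and then turned into a `std::string` whose length is determined at run time.

```cpp
#include <cassert>   // used by the usage checks
#include <cstddef>
#include <string>

// Hypothetical capacity and value; the real ones live in the source file under review.
constexpr std::size_t DEMO_EMBED_MAX = 32;

// Volatile storage: reads of this buffer are expected to be performed as written.
volatile static char demo_embed[DEMO_EMBED_MAX] = "app.dll";

// Same shape as to_non_volatile in the diff: element-wise copy out of volatile storage.
void to_non_volatile(volatile const char* cstr, char* output, std::size_t length)
{
    for (std::size_t i = 0; i < length; i++)
    {
        output[i] = cstr[i];
    }
}

// Builds a std::string from a non-volatile working copy. The string length is found
// at run time by scanning the copy for the first NUL.
std::string read_demo_embed()
{
    char working_copy[DEMO_EMBED_MAX];
    to_non_volatile(demo_embed, working_copy, DEMO_EMBED_MAX);
    working_copy[DEMO_EMBED_MAX - 1] = '\0'; // sketch-only safeguard: guarantee termination
    return std::string(&working_copy[0]);
}
```

Calling `read_demo_embed()` yields the embedded value without the trailing NUL padding, which is the property the `std::string(&working_copy[0])` line in the diff relies on.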
I'm curious what happens if you change:
to:
And keep the remainder of the file as it was (except for adding to_non_volatile).
The comment above says we can't use const.
I made this change as the full diff:
The compiler embeds the string twice:
Using const static char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; produces the exact same result.
Like this, it is also working around a compiler optimization. I thought we might tackle both together.
That is: you also tried without the volatile? I'm surprised it found the need to duplicate const data, which shouldn't be optimized for a specific usage.
Thanks for giving it a try!
You mean like this?
This leads to:
const vs non-const doesn't make a difference, as far as I can see:
My conclusion is that I understand neither why all of these don't work nor why what is in the PR does work. 😅
I didn't realize the duplication comes from the compares against the parts and not from the complete embed...
I wonder if introducing overlap in the parts wouldn't eliminate the issue too.
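One way to picture the overlap idea, as a sketch only and not the snippet originally posted in this thread: the two parts share a few characters, so concatenating them no longer reproduces the full value. The 16-character placeholder, `demo_hi_part`, `demo_lo_part`, `DEMO_LO_OFFSET` and `is_placeholder` are all hypothetical names and values chosen for illustration; the real hash is not used here.

```cpp
#include <cassert>   // used by the usage checks
#include <cstddef>
#include <string>

// Hypothetical placeholder, NOT the real embed hash.
// HI covers characters [0, 10) of the placeholder, LO covers characters [6, 16);
// the shared characters are "6789", so HI + LO is not the full value.
static const char demo_hi_part[] = "0123456789";
static const char demo_lo_part[] = "6789abcdef";
constexpr std::size_t DEMO_LO_OFFSET = 6; // where LO starts inside the full value

// Returns true when binding still begins with the unmodified placeholder.
bool is_placeholder(const std::string& binding)
{
    const std::size_t hi_len = sizeof(demo_hi_part) - 1; // -1 for NUL terminator
    const std::size_t lo_len = sizeof(demo_lo_part) - 1;
    return binding.size() >= DEMO_LO_OFFSET + lo_len
        && binding.compare(0, hi_len, demo_hi_part) == 0
        && binding.compare(DEMO_LO_OFFSET, lo_len, demo_lo_part) == 0;
}
```

Whether a given compiler would still materialize the full value from overlapping parts under whole program optimization is not established by this sketch; it only shows that the check itself stays correct.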
I assume the above would work because, with the overlap, the two parts that the compiler optimizes against are less likely to be combined into the whole embed than they currently are. The embed is split in two parts to avoid them forming the whole, and this adds to that approach.
If this works, it also means that the to_non_volatile logic in the PR is not needed, because the above only concerns the comparison against the parts.
The difference between compare_memory_nooptimization and the above is that the former leaves out any possibility of the compiler optimizing the comparison (and therefore potentially reintroducing the issue as compilers get "smarter").
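To make the contrast concrete, here is a small sketch that puts the byte-wise volatile comparison next to a plain `std::string::compare` prefix check. `demo_part`, `starts_with_part_volatile` and `starts_with_part_plain` are hypothetical names with a placeholder value; only `compare_memory_nooptimization` mirrors the function in the diff. Both checks return the same answer; the difference is that the volatile version reads each byte through a volatile pointer, which is expected to keep the compiler from folding the comparison into something that needs the whole literal.

```cpp
#include <cassert>   // used by the usage checks
#include <cstddef>
#include <string>

// Same shape as compare_memory_nooptimization in the diff: every byte is read
// through a volatile pointer and compared one at a time.
bool compare_memory_nooptimization(volatile const char* a, volatile const char* b, std::size_t length)
{
    for (std::size_t i = 0; i < length; i++)
    {
        if (*a++ != *b++)
            return false;
    }
    return true;
}

// Hypothetical placeholder part, NOT one of the real hash halves.
static const char demo_part[] = "0123abcd";

// Prefix check using the volatile byte-wise comparison.
bool starts_with_part_volatile(const std::string& s)
{
    const std::size_t len = sizeof(demo_part) - 1; // -1 for NUL terminator
    return s.size() >= len && compare_memory_nooptimization(s.c_str(), demo_part, len);
}

// Prefix check using std::string::compare, which the optimizer is free to transform.
bool starts_with_part_plain(const std::string& s)
{
    const std::size_t len = sizeof(demo_part) - 1;
    return s.size() >= len && s.compare(0, len, demo_part) == 0;
}
```

The size guard before the volatile comparison matters: unlike `std::string::compare`, the raw loop does not know where the string ends.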