Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid all compiler optimization on embedded apphost hash #110554

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

omajid
Copy link
Member

@omajid omajid commented Dec 9, 2024

We assume that there is a single copy of the apphost hash in the apphost binary. And that it hasn't been modified by the compiler. However, the compiler can optimize the hash multiple ways, including re-ordering elements of the hash or duplicating the contents of the hash. This can currently happen under certain compiler versions and optimization flags.

Try and avoid that by marking the hash as a volatile string and implementing comparisons/copying/initialization that respects that.

Fixes: #109611

We assume that there is a single copy of the apphost hash in the apphost
binary. And that it hasn't been modified by the compiler. However, the
compiler can optimize the hash multiple ways, including re-ordering
elements of the hash  or duplicating the contents of the hash. This can
currently happen under certain compiler versions and optimization flags.

Try and avoid that by marking the hash as a volatile string and
implementing comparisons/copying/initialization that respects that.

Fixes: dotnet#109611

static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;

if (!pal::clr_palstring(embed, app_dll))
Copy link
Member Author

@omajid omajid Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementing pal::clr_palstring per-platform to handle volatile correctly looked much more involved that creating a non-volatile copy instead.

Strangely enough, even the non-volatile copy still needed the no-opt memory compare to stop creating duplicate instances of the embedded hash. I wonder if the compiler ignored volatile data when it started optimizing the non-volatile copy?

Copy link
Member

@jkotas jkotas Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that the compiler can see all operations on the value and that the value does not escape, and thus volatile can be safely ignored.

It may help to move the value to the global scope and mark it as extern or extern "C". It should prevent the compiler from making assumptions the value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@omajid Have you tried applying extern?

Copy link
Member

@jkotas jkotas Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect it won't help after thinking about it some more. The main problem that this PR is trying to solve is suppressing undesirable optimizations in whole program optimization mode. The compiler can see all uses of the symbol in this mode. extern won't prevent it from making this observation. We would have to artificially export the symbol from the binary to prevent the optimizations, but that is undesirable for other reasons.

@@ -48,18 +66,21 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
// Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build".
// Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length
// where length is determined at compile time (=64) instead of the actual length of the string at runtime.
static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
volatile static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious what happens if you change:

    static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string

to:

    static const char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;
    char embed[EMBED_MAX];
    to_non_volatile(const_embed, embed, EMBED_MAX);

And keep the remainder of the file as it were (except for adding to_non_volatile).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above says we can't use const.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this change as the full diff:

diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..d0c90f4bfcf 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -40,6 +40,24 @@
 #define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
 #define EMBED_HASH_FULL_UTF8    (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
 
+void to_non_volatile(volatile const char* cstr, char* output, size_t length)
+{
+    for (size_t i = 0; i < length; i++)
+    {
+        output[i] = cstr[i];
+    }
+}
+
+bool compare_memory_nooptimization(volatile const char* a, volatile const char* b, size_t length)
+{
+    for (size_t i = 0; i < length; i++)
+    {
+        if (*a++ != *b++)
+            return false;
+    }
+    return true;
+}
+
 bool is_exe_enabled_for_execution(pal::string_t* app_dll)
 {
     constexpr int EMBED_SZ = sizeof(EMBED_HASH_FULL_UTF8) / sizeof(EMBED_HASH_FULL_UTF8[0]);
@@ -48,11 +66,14 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
     // Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build".
     // Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length
     // where length is determined at compile time (=64) instead of the actual length of the string at runtime.
-    static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;     // series of NULs followed by embed hash string
+    volatile static char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;     // series of NULs followed by embed hash string
 
     static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
     static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
 
+    char embed[EMBED_MAX];
+    to_non_volatile(const_embed, embed, EMBED_MAX);
+
     if (!pal::clr_palstring(embed, app_dll))
     {
         trace::error(_X("The managed DLL bound to this executable could not be retrieved from the executable image."));

The compiler embeds the string twice:

++ grep --byte-offset --only-matching --text c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca ./artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/home/omajid/runtime/src/native/corehost/corehost.cpp.o
320:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca
4816:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using const static char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; produces the exact same result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above says we can't use const.

Like this, it is also working around a compiler optimization. I thought we may tackle both together.

Using const static char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; produces the exact same result.

That is: you also tried without the volatile?

I'm surprised it found the need to duplicate const data which shouldn't be optimized for a specific usage.

Thanks for giving it try!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean like this?

diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..f77fbe63347 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -40,6 +40,14 @@
 #define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
 #define EMBED_HASH_FULL_UTF8    (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
 
+void to_non_volatile(volatile const char* cstr, char* output, size_t length)
+{
+    for (size_t i = 0; i < length; i++)
+    {
+        output[i] = cstr[i];
+    }
+}
+
 bool is_exe_enabled_for_execution(pal::string_t* app_dll)
 {
     constexpr int EMBED_SZ = sizeof(EMBED_HASH_FULL_UTF8) / sizeof(EMBED_HASH_FULL_UTF8[0]);
@@ -48,11 +56,14 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
     // Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build".
     // Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length
     // where length is determined at compile time (=64) instead of the actual length of the string at runtime.
-    static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;     // series of NULs followed by embed hash string
+    volatile static const char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;     // series of NULs followed by embed hash string
 
     static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
     static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
 
+    char embed[EMBED_MAX];
+    to_non_volatile(const_embed, embed, EMBED_MAX);
+
     if (!pal::clr_palstring(embed, app_dll))
     {
         trace::error(_X("The managed DLL bound to this executable could not be retrieved from the executable image."));

This leads to:

$ grep --byte-offset --only-matching --text c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca ./artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/home/omajid/runtime/src/native/corehost/corehost.cpp.o
256:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca
4752:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const vs non-const doesn't make a difference, as far as I can see:

diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..7c6c1244815 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -40,6 +40,14 @@
 #define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
 #define EMBED_HASH_FULL_UTF8    (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
 
+void to_non_volatile(volatile const char* cstr, char* output, size_t length)
+{
+    for (size_t i = 0; i < length; i++)
+    {
+        output[i] = cstr[i];
+    }
+}
+
 bool is_exe_enabled_for_execution(pal::string_t* app_dll)
 {
     constexpr int EMBED_SZ = sizeof(EMBED_HASH_FULL_UTF8) / sizeof(EMBED_HASH_FULL_UTF8[0]);
@@ -48,11 +56,14 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
     // Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build".
     // Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length
     // where length is determined at compile time (=64) instead of the actual length of the string at runtime.
-    static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;     // series of NULs followed by embed hash string
+    volatile static char const_embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;     // series of NULs followed by embed hash string
 
     static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
     static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
 
+    char embed[EMBED_MAX];
+    to_non_volatile(const_embed, embed, EMBED_MAX);
+
     if (!pal::clr_palstring(embed, app_dll))
     {
         trace::error(_X("The managed DLL bound to this executable could not be retrieved from the executable image."));
$ grep --byte-offset --only-matching --text c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca ./artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/home/omajid/runtime/src/native/corehost/corehost.cpp.o
256:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca
4752:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca

Copy link
Member

@tmds tmds Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My conclusion is that I neither understand why all of these don't work, and why what is in the PR does work. 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize the duplication comes from the compares against the parts and not from the complete embed...

I wonder if introducing overlap in the parts wouldn't eliminate the issue too.

diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..e3d0ca72891 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -36,9 +36,10 @@
  *        o https://en.wikipedia.org/wiki/Comparison_of_file_systems
  *          has more details on maximum file name sizes.
  */
-#define EMBED_HASH_HI_PART_UTF8 "c3ab8ff13720e8ad9047dd39466b3c89" // SHA-256 of "foobar" in UTF-8
-#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
-#define EMBED_HASH_FULL_UTF8    (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
+#define EMBED_HASH_PART_1_UTF8 "c3ab8ff13720e8ad9047dd39" // SHA-256 of "foobar" in UTF-8
+#define EMBED_HASH_PART_2_UTF8 "466b3c8974e592c2fa383d4a"
+#define EMBED_HASH_PART_3_UTF8 "3960714caef0c4f2"
+#define EMBED_HASH_FULL_UTF8    (EMBED_HASH_PART_1_UTF8 EMBED_HASH_PART_2_UTF8 EMBED_HASH_PART_3_UTF8) // NUL terminated
 
 bool is_exe_enabled_for_execution(pal::string_t* app_dll)
 {
@@ -50,8 +51,8 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
     // where length is determined at compile time (=64) instead of the actual length of the string at runtime.
     static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;     // series of NULs followed by embed hash string
 
-    static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
-    static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
+    static const char hi_part[] = (EMBED_HASH_PART_1_UTF8 EMBED_HASH_PART_2_UTF8);
+    static const char lo_part[] = (EMBED_HASH_PART_2_UTF8 EMBED_HASH_PART_3_UTF8);
 
     if (!pal::clr_palstring(embed, app_dll))
     {
@@ -73,9 +74,9 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
     // So use two parts of the string that will be unaffected by the edit.
     size_t hi_len = (sizeof(hi_part) / sizeof(hi_part[0])) - 1;
     size_t lo_len = (sizeof(lo_part) / sizeof(lo_part[0])) - 1;
-    if (binding.size() >= (hi_len + lo_len)
+    if (binding.size() >= (EMBED_SZ - 1)
         && binding.compare(0, hi_len, &hi_part[0]) == 0
-        && binding.compare(hi_len, lo_len, &lo_part[0]) == 0)
+        && binding.compare(EMBED_SZ - 1 - lo_len, lo_len, &lo_part[0]) == 0)
     {
         trace::error(_X("This executable is not bound to a managed DLL to execute. The binding value is: '%s'"), app_dll->c_str());
         return false;

Copy link
Member

@tmds tmds Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the above to would work because the two parts that the compiler optimizes against with the overlap are less likely to form the whole embed due to the compiler optimizations than they currently are. The embed is split in two parts to avoid them forming the whole, and this is adding to that approach.

If this works, it also means that the to_non_volatile logic that is in the PR is not needed because the above is only about the comparison against the parts.

The difference between compare_memory_nooptimization and the above is that the former leaves out any possibility of the compiler optimizing the comparison (and therefore potentially reintroducing the issue as compilers get "smarter").

@omajid
Copy link
Member Author

omajid commented Dec 12, 2024

Rewriting volatile-based implementations instead of std::string also seems to work:

diff --git a/src/native/corehost/corehost.cpp b/src/native/corehost/corehost.cpp
index 6de7acfbd08..d5e5fc3fb61 100644
--- a/src/native/corehost/corehost.cpp
+++ b/src/native/corehost/corehost.cpp
@@ -40,6 +40,33 @@
 #define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
 #define EMBED_HASH_FULL_UTF8    (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated
 
+void to_non_volatile(volatile const char* cstr, char* output, size_t length)
+{
+    for (size_t i = 0; i < length; i++)
+    {
+        output[i] = cstr[i];
+    }
+}
+
+bool compare_memory_nooptimization(volatile const char* a, volatile const char* b, size_t length)
+{
+    for (size_t i = 0; i < length; i++)
+    {
+        if (*a++ != *b++)
+            return false;
+    }
+    return true;
+}
+
+size_t strlen_nooptimization(volatile const char* a)
+{
+    for (size_t i = 0; ; i++)
+    {
+       if (a[i] == 0)
+               return i;
+    }
+}
+
 bool is_exe_enabled_for_execution(pal::string_t* app_dll)
 {
     constexpr int EMBED_SZ = sizeof(EMBED_HASH_FULL_UTF8) / sizeof(EMBED_HASH_FULL_UTF8[0]);
@@ -48,21 +75,24 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
     // Contains the EMBED_HASH_FULL_UTF8 value at compile time or the managed DLL name replaced by "dotnet build".
     // Must not be 'const' because std::string(&embed[0]) below would bind to a const string ctor plus length
     // where length is determined at compile time (=64) instead of the actual length of the string at runtime.
-    static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;     // series of NULs followed by embed hash string
+    volatile static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8;     // series of NULs followed by embed hash string
 
     static const char hi_part[] = EMBED_HASH_HI_PART_UTF8;
     static const char lo_part[] = EMBED_HASH_LO_PART_UTF8;
 
-    if (!pal::clr_palstring(embed, app_dll))
+    char embed_non_volatile[EMBED_MAX];
+    to_non_volatile(embed, embed_non_volatile, EMBED_MAX);
+
+    if (!pal::clr_palstring(embed_non_volatile, app_dll))
     {
         trace::error(_X("The managed DLL bound to this executable could not be retrieved from the executable image."));
         return false;
     }
 
-    std::string binding(&embed[0]);
+    size_t length = strlen_nooptimization(embed);
 
     // Check if the path exceeds the max allowed size
-    if (binding.size() > EMBED_MAX - 1) // -1 for null terminator
+    if (length > EMBED_MAX - 1) // -1 for null terminator
     {
         trace::error(_X("The managed DLL bound to this executable is longer than the max allowed length (%d)"), EMBED_MAX - 1);
         return false;
@@ -73,9 +103,9 @@ bool is_exe_enabled_for_execution(pal::string_t* app_dll)
     // So use two parts of the string that will be unaffected by the edit.
     size_t hi_len = (sizeof(hi_part) / sizeof(hi_part[0])) - 1;
     size_t lo_len = (sizeof(lo_part) / sizeof(lo_part[0])) - 1;
-    if (binding.size() >= (hi_len + lo_len)
-        && binding.compare(0, hi_len, &hi_part[0]) == 0
-        && binding.compare(hi_len, lo_len, &lo_part[0]) == 0)
+    if (length >= (hi_len + lo_len)
+        && compare_memory_nooptimization(embed, hi_part, hi_len)
+        && compare_memory_nooptimization(embed + hi_len, lo_part, lo_len))
     {
         trace::error(_X("This executable is not bound to a managed DLL to execute. The binding value is: '%s'"), app_dll->c_str());
         return false;
$ grep --byte-offset --only-matching --text c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca ./artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/home/omajid/runtime/src/native/corehost/corehost.cpp.o
5696:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714ca

@agocke
Copy link
Member

agocke commented Jan 13, 2025

@AaronRobinsonMSFT Can you also take a look at this? Would appreciate your C++ knowledge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Host community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failed to locate managed application c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2
5 participants