Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Enable MHR support on OSX #25716

Merged
merged 2 commits into from
Jul 17, 2019
Merged

Enable MHR support on OSX #25716

merged 2 commits into from
Jul 17, 2019

Conversation

janvorli
Copy link
Member

  • Move JIT_WriteBarrier that is modified at runtime to a dynamically
    allocated memory instead of making a page in libcoreclr.dylib RWX.

  • Update PAL to add MEM_JIT flag for allocations and reservations of
    executable memory.

  • Update native runtime in EH and stack unwinding areas so that it can
    unwind from the write barrier copy. That code has no unwind info, so
    without special handling, runtime would not be able to unwind from
    it.

* Move JIT_WriteBarrier that is modified at runtime to a dynamically
  allocated memory instead of making a page in libcoreclr.dylib RWX.

* Update PAL to add MEM_JIT flag for allocations and reservations of
  executable memory.

* Update native runtime in EH and stack unwinding areas so that it can
  unwind from the write barrier copy. That code has no unwind info, so
  without special handling, runtime would not be able to unwind from
  it.
@janvorli janvorli added this to the 3.next milestone Jul 16, 2019
@janvorli janvorli requested a review from jkotas July 16, 2019 09:36
@janvorli janvorli self-assigned this Jul 16, 2019
@janvorli
Copy link
Member Author

cc: @odhanson, @sergiy-k

{
static int isRunningOnMojaveHardenedRuntime = -1;

if (isRunningOnMojaveHardenedRuntime == -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this always be called from the same thread ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter. The worst case is that we would evaluate the stuff multiple times.

Copy link
Member

@jkotas jkotas Jul 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does matter. Note that the C/C++ spec allows compilers to duplicate the memory load and introduce a subtle race condition in the process. E.g. this can be transformed by the C/C++ compiler to:

static int isRunningOnMojaveHardenedRuntime = -1;

int result = isRunningOnMojaveHardenedRuntime;

if (isRunningOnMojaveHardenedRuntime == -1)
{
   ...
   result = ...;
}

return result;

@janvorli
Copy link
Member Author

@dotnet/dnceng the OSX legs in this PR have failed with error in an azure devops python script:

+ /Users/dotnet-bot/.vsts-env/bin/python /Users/dotnet-bot/dotnetbuild/work/9e793ed2-605d-4498-bea0-4a42ee6a79d6/Payload/reporter/run.py https://dev.azure.com/dnceng/ public 7286796 eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsIng1dCI6Im9PdmN6NU1fN3AtSGpJS2xGWHo5M3VfVjBabyJ9.eyJuYW1laWQiOiJlNDI4NWM4Yy0zZmQ1LTQyNjctOWIxYy00MjE5NWM0N2E1NTMiLCJzY3AiOiJMb2NhdGlvblNlcnZpY2UuQ29ubmVjdCBSZWFkQW5kUHVibGlzaFRlc3Q6OWVlNmQ0NzgtZDI4OC00N2Y3LWFhY2MtZjZlNmQwODJhZTZkIFJlYWRBbmRVcGRhdGVCdWlsZEJ5VXJpOjllZTZkNDc4LWQyODgtNDdmNy1hYWNjLWY2ZTZkMDgyYWU2ZC9kb3RuZXQvY29yZWNsci8yMjg6QnVpbGQvQnVpbGQvMjY3MTUzIiwiYXVpIjoiN2QzYjA3NzItMzZjYi00MmVhLWE4NjMtODhkZTBkZDZhYmIzIiwic2lkIjoiN2FlZDYyYjEtN2Q4NS00NzRkLWE3YmQtZWVlNjRlNTMzODE4IiwiaXNzIjoiYXBwLnZzdG9rZW4udmlzdWFsc3R1ZGlvLmNvbSIsImF1ZCI6ImFwcC52c3Rva2VuLnZpc3VhbHN0dWRpby5jb218dnNvOmI1NWRlNGVkLTRiNWEtNDIxNS1hOGU0LTBhMGE1ZjcxZTdkOCIsIm5iZiI6MTU2MzI2OTk2MCwiZXhwIjoxNTYzMjg1NTYwfQ.PvknUyWHsR7SIj2oT2waj3B8big-PloxTUxz-NHp_khBBpfTIhMIpuAbFLGt03w119jZVwZvNixRFiy7Mwe6WSWNdbJxMYExxdYWizdSq2XRhQSfdrf6VLw9IQ9k8lfmrsp5uJElwaJ7X5O1kDZLw5DiCWhcq8rTz5cNARRPX_2CKdfECl4wI8Fm3_2v3nbrtjFBzPuG22pS_EV4OAsmrYLpTAbVzRvKk10Z6mzk675qWeKZpaUuRWmks3SGVlqjbSyC2XO_izXvESJaM5l9DTbAKgkh_CMkSYjTRiDPJ5RUxQNMNAj90h8wEWYeEtJKMt3wkBjwRH6HVd9GMRGVqA
Traceback (most recent call last):
  File "/Users/dotnet-bot/dotnetbuild/work/9e793ed2-605d-4498-bea0-4a42ee6a79d6/Payload/reporter/run.py", line 9, in <module>
    from azure_devops_result_publisher import AzureDevOpsTestResultPublisher
  File "/Users/dotnet-bot/dotnetbuild/work/9e793ed2-605d-4498-bea0-4a42ee6a79d6/Payload/reporter/azure_devops_result_publisher.py", line 67
    def convert_results(self, results: Iterable[TestResult]) -> Iterable[TestCaseResult]:
                                     ^
SyntaxError: invalid syntax

Can someone please take a look?


if (isRunningOnMojaveHardenedRuntime == -1)
{
if (GetDarwinVersion() >= DARWIN_VERSION_MOJAVE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to bother with this version check at all? Given the Apple ecosystem upgrades very fast, this is going to be just an expensive way to compute "true" in like a year.

@MattGal
Copy link
Member

MattGal commented Jul 16, 2019

@janvorli I'm taking a look now and will file issue(s) as appropriate.

@MattGal
Copy link
Member

MattGal commented Jul 16, 2019

This isn't new, but an artifact of the fact that other than OSX 10.14, the OSX Helix machines are still running Python 2 client code while bits like the line you mentioned are now Python-3 specific.

As noted in the logs though, the work items failed with this assert:

Assert failure(PID 1255 [0x000004e7], Thread: 944993 [0xe6b61]): !"ClrVirtualAlloc of GC barrier code page failed"
    File: /Users/vsts/agent/2.154.1/work/1/s/src/vm/threads.cpp Line: 1095
    Image: /Users/dotnet-bot/dotnetbuild/work/9e793ed2-605d-4498-bea0-4a42ee6a79d6/Payload/corerun

and had they passed, your run would have passed (indeed this has been the state for some time). For instance, see the OSX run in #25691
(Sample log showing the same reporter problem: https://helixexternalresults.blob.core.windows.net/dotnet-coreclr-refs-pull-25691-merge-4177742c1f25422494/JIT.jit64.valuetypes/console.2676e495.log )

The challenge here is that updating our on-prem OSX machines is much more work than other types of systems, and most machines will be fully reimaged in the process.

@ilyas1974 is working on this and it's tracked via https://github.com/dotnet/core-eng/issues/7003. From chatting with him a rough ETA for when the 10.13 machines would be August (next month)

@janvorli
Copy link
Member Author

@MattGal thank you for digging out the assert, that's a problem related to my change.

@janvorli
Copy link
Member Author

Strange, I don't get the failure to allocate the memory page for memory barriers on my local Mac, all the pri 1 coreclr tests are passing. I wonder if it is related to the fact that the lab machines don't have Mojave installed yet. I'm looking into it.

@MattGal
Copy link
Member

MattGal commented Jul 16, 2019

@janvorli the CoreCLR tests are explicitly opted-in to running on High Sierra, that's what osx.1013.amd64.open effectively means. You can fix both the AzDO reporting error AND run on Mojave if you like by simply only testing on osx.1014.amd64.open, but do note that until 10.16 comes out, 10.13 is at least supported for security patches and should be considered a place we need to make sure .NET core works.

@janvorli
Copy link
Member Author

Ok, now the OSX tests in the lab are passing. There was actually a bug in the IsRunningOnMojaveHardenedRuntime that I have fixed when I've removed the check for the OS version in the last commit that was causing it.

@odhanson
Copy link
Member

Ok, now the OSX tests in the lab are passing. There was actually a bug in the IsRunningOnMojaveHardenedRuntime that I have fixed when I've removed the check for the OS version in the last commit that was causing it.

@janvorli , the OSX tests are only testing the non MHR case right ? We don't have any tests with Sandbox/MHR enabled ?

Copy link
Member

@odhanson odhanson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@janvorli
Copy link
Member Author

We don't have any tests with Sandbox/MHR enabled ?

No we don't. I am not sure how we would enable that in the lab. For some reason, I have to sign the libcoreclr.dylib from a terminal running in the UI of my Mac. If I try to run the same command line via SSH connection, it fails. But I guess there must be some way.

Anyways, I have tested my change with MHR enabled using a coreclr test that exercised all the code paths I've touched.

@odhanson
Copy link
Member

I will also try find some time in the next few days to backport this change (as we are targeting 2.2 right now) and will verify this change with our product.

@filipnavara
Copy link
Member

We don't have any tests with Sandbox/MHR enabled ?

No we don't. I am not sure how we would enable that in the lab. For some reason, I have to sign the libcoreclr.dylib from a terminal running in the UI of my Mac. If I try to run the same command line via SSH connection, it fails. But I guess there must be some way.

For reference, this is what Mono uses for the tests: https://github.com/mono/mono/pull/15589/files. Obviously the prerequisite is to have macOS Mojave or Catalina on CI which was not the case last time I checked.

@odhanson
Copy link
Member

For reference, this is what Mono uses for the tests: https://github.com/mono/mono/pull/15589/files. Obviously the prerequisite is to have macOS Mojave or Catalina on CI which was not the case last time I checked.

Not part of this change, but I think it would be quite important to have some automated test so we make sure we don't break this in future changes

@MarcWils
Copy link

Any chance this can be merged to 3.0? #18617

@janvorli
Copy link
Member Author

I am reverting this change now. I have found a couple of issues in it and I am leaving for a week of vacation, so I'll be able to fix it once I'm back. The issue is that the JIT_Stelem_Ref_xxxx functions accidentally call the original unpatched JIT_WriteBarrier.

janvorli added a commit that referenced this pull request Jul 20, 2019
jkotas pushed a commit that referenced this pull request Jul 20, 2019
@janvorli janvorli restored the add-mhr-support branch July 20, 2019 08:35
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Enable MHR support on OSX

* Move JIT_WriteBarrier that is modified at runtime to a dynamically
  allocated memory instead of making a page in libcoreclr.dylib RWX.

* Update PAL to add MEM_JIT flag for allocations and reservations of
  executable memory.

* Update native runtime in EH and stack unwinding areas so that it can
  unwind from the write barrier copy. That code has no unwind info, so
  without special handling, runtime would not be able to unwind from
  it.



Commit migrated from dotnet/coreclr@7a97084
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants