Disabled Job - Libraries Test Run release mono windows x64 Release #45524
Looking at logs, failures appear to be related to reflection. Can you point me to where you are seeing
A first-chance exception is expected here as a P/Invoke call is used to test for a DLL's presence. Are you seeing the exception leak out in other ways?
Even with an updated Windows, I would not expect any change in behavior unless the new version of Windows includes the required msquic.dll to light up the .NET feature (I have been told this is explicitly not the plan for MsQuic). CC @nibanks
One example is in this log:
It sounds like something goes wrong when loading the MsQuic library, but I don't see any related changes between the good and bad commits, so my current hunch is that something in the OS changed.
Hmm, after more digging this no longer looks related to MsQuic. I tried disabling loading it with #45520, but other work items still hit the reflection issue. We seem to be hitting it on the release/5.0 branch as well, so it definitely looks related to the recent Helix rollout.
@dotnet/dnceng for visibility. This looks like it started failing on all Windows queues as of the latest rollout.
I opened https://github.com/dotnet/core-eng/issues/11569 for tracking.
I hit the same NullReferenceException in https://dev.azure.com/dnceng/public/_build/results?buildId=906896&view=logs&j=d81dbc02-ec0b-5bd5-14c1-0072bd710d3b&t=780f1b30-8338-5e4d-6b72-c57eeb2413c9&l=76
Tagging subscribers to this area: @directhex
Issue Details
Almost all tests involving networking fail in that leg. The test runs have been disabled temporarily in #45529
An example in https://dnceng.visualstudio.com/public/_build/results?buildId=906071&view=ms.vss-test-web.build-test-results-tab
Exploratory PR in #45520
Investigating via https://github.com/dotnet/core-eng/issues/11569
@eerhardt those builds ran before the leg was disabled.
I think this might be related to the VS update since I can reproduce it locally, i.e. the cause is not the Helix test machine but |
Ok, I confirmed that this is indeed due to the VS update. Good:
Bad:
To reproduce:
.\build.cmd mono+libs+libs.pretest -c Release
.\dotnet.cmd build /p:RuntimeFlavor=mono /t:Test /p:Configuration=Release .\src\libraries\System.Private.Xml\tests\XmlSchema\XmlSchemaSet\System.Xml.XmlSchemaSet.Tests.csproj
BatchedCI build for 2ee13ec is also failing with NRE in |
The crash should be fixed by #46573. The PR also re-enables the Windows x64 Release CI lane.
After the CI bots were upgraded to a later MSVC version (see dotnet/runtime#45524 for details), Windows x64 Release builds started to crash in the libraries tests. Investigation showed that the new MSVC compiler handles one expression differently from the previous version. After the upgrade, the expression

int _amd64_width_temp = ((guint64)(imm) == (guint64)(int)(guint64)(imm));

implemented in amd64_mov_reg_imm and called from tramp-amd64.c@500 was transformed by the compiler into an expression that is always true:

```
amd64_mov_reg_imm (code, AMD64_R11, (guint8*)mono_get_rethrow_preserve_exception_addr ());
lea         rcx,[rethrow_preserve_exception_func (07FFB9E33A590h)]
mov         word ptr [rbx+0Dh],0BB41h
mov         byte ptr [rbx+0Fh],cl
mov         rax,rcx
shr         eax,8
mov         byte ptr [rbx+10h],al
mov         rax,rcx
shr         eax,10h
shr         ecx,18h
mov         byte ptr [rbx+11h],al
lea         rax,[rbx+13h]
mov         byte ptr [rbx+12h],cl
```

As seen above, the compiler has dropped both the condition and the handling of a 64-bit imm. This causes problems when the imm is a 64-bit value, since it always gets truncated to a 32-bit imm. In this case the imm was a pointer to a function within coreclr.dll (mono_get_rethrow_preserve_exception_addr) loaded at a high address (one that needs more than 32 bits).

This is most likely a compiler regression for this specific construction. I tried a simpler construction (using the same type conversions) on both the old and the new compiler version, and there the compiler makes the right optimization.

The fix is to switch to a macro already available in amd64-codegen (amd64_is_imm32) that detects whether an imm needs a 32-bit or a 64-bit sized value. The new MSVC compiler optimizes this correctly, and even though this is a workaround for what seems to be an optimization bug in the compiler, it is still cleaner and describes the intent better than the current code. The fix also re-enables the Windows x64 Release CI test lane.
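The difference between the two width checks can be sketched in standalone C. This is a minimal illustration and not Mono's actual code: the function names are hypothetical, the standard fixed-width types stand in for guint64, and the range check is only assumed to resemble what amd64_is_imm32 does.

```c
#include <stdint.h>

/* Cast round trip, same shape as the expression quoted in the thread:
   true when the value survives truncation to 32 bits followed by
   sign-extension back to 64 bits. (Hypothetical helper name.) */
static int fits_imm32_cast(uint64_t imm)
{
    return imm == (uint64_t)(int32_t)imm;
}

/* Explicit signed range check. Assumption: this resembles the intent of
   amd64_is_imm32; the real macro may be written differently. */
static int fits_imm32_range(uint64_t imm)
{
    int64_t v = (int64_t)imm;
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* What a 4-byte immediate keeps of a 64-bit value: the low 32 bits only.
   This is the truncation that happens when the width check is wrongly
   folded to "always true". */
static uint32_t truncate_to_imm32(uint64_t imm)
{
    return (uint32_t)imm;
}
```

For the address shown in the disassembly, 0x7FFB9E33A590, a correctly compiled version of either check returns 0, so a 64-bit immediate is required. Emitting only four bytes would keep 0x9E33A590, which no longer points at the function.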
Do we need to report this to the compiler team?
@danmosemsft Yes, that would be good. How do we best proceed with that?
Co-authored-by: lateralusX <lateralusX@users.noreply.github.com>
@lateralusX I'd start with https://docs.microsoft.com/en-us/cpp/overview/how-to-report-a-problem-with-the-visual-cpp-toolset, which essentially means opening an issue on Developer Community via https://aka.ms/feedback/report?space=62. If we don't get a response, we can ping someone internally.
Closing the issue since the problem on our end was fixed with #46573.