Assertion failed 'targetReg != op2Reg' during 'Generate code' #91209
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Issue Details
// Found by Antigen
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.Arm;
using System.Numerics;
public class TestClass
{
    public struct S1
    {
        public struct S1_D1_F2
        {
        }
    }
    static Vector64<int> s_v64_int_22 = Vector64.Create(6);
    static Vector64<uint> s_v64_uint_23 = Vector64.Create((uint)5);
    static S1 s_s1_52 = new S1();
    float float_63 = 5.125f;
    Vector64<int> v64_int_72 = Vector64.Create(5);
    S1.S1_D1_F2 s1_s1_d1_f2_101 = new S1.S1_D1_F2();
    static int s_loopInvariant = 1;
    public S1.S1_D1_F2 Method3(ref Vector64<uint> p_v64_uint_155, out short p_short_156, float p_float_157, S1 p_s1_158, bool p_bool_159)
    {
        unchecked
        {
            p_short_156 = 15&4;
            int __loopvar3 = 15-4;
            while (15>4)
            {
                if (__loopvar3 > s_loopInvariant)
                    break;
                switch (Vector64.Dot(AdvSimd.MultiplyAddByScalar(s_v64_int_22, s_v64_int_22, s_v64_int_22), v64_int_72 = v64_int_72 & s_v64_int_22))
                {
                    case 2:
                    {
                        break;
                    }
                    case -2:
                    {
                        break;
                    }
                    default:
                    {
                        break;
                    }
                }
            }
            return s1_s1_d1_f2_101;
        }
    }
    public void Method0()
    {
        unchecked
        {
            short short_211 = -5;
            float float_215 = 6.0416665f;
            S1.S1_D1_F2 s1_s1_d1_f2_220 = new S1.S1_D1_F2();
            s1_s1_d1_f2_220 = Method3(ref s_v64_uint_23, out short_211, float_63 -= float_215 *= 15/4, s_s1_52, 15==4);
            return;
        }
    }
    public static void Main(string[] args)
    {
        new TestClass().Method0();
    }
}
/*
Environment:
set COMPlus_TieredCompilation=0
Assert failure(PID 11200 [0x00002bc0], Thread: 10704 [0x29d0]): Assertion failed 'targetReg != op2Reg' in 'TestClass:Method3(byref,byref,float,TestClass+S1,bool):TestClass+S1+S1_D1_F2:this' during 'Generate code' (IL size 83; hash 0x8f2c057a; FullOpts)
File: D:\a\_work\1\s\src\coreclr\jit\hwintrinsiccodegenarm64.cpp Line: 320
Image: E:\kpathak\CORE_ROOT\corerun.exe
*/
|
Is this a duplicate of #91208? |
Simpler repro:
Doesn't look like there's anything preventing LSRA from assigning the same register to the target and multiple source arguments of these intrinsics which are marked as having RMW semantics. In the test case everything is assigned
If target and op1 were not already the same, say target=d1, op1=op2=op3=d0, we would get:
Are the asserts valid? It seems like the generated instruction reflects the intended semantics of the intrinsic -- at least in this case, since there is no cross-lane data corruption (for example). Or should a solution be to make the target interfere with op2/op3 (e.g., make op2/op3 "delayRegFree") so LSRA gives it a different register? @tannergooding Opinions? |
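To make the register-aliasing cases above concrete, here is a minimal sketch in C++ of a toy register file. It is not the JIT's emitter: `RegFile` and `emitMultiplyAdd` are hypothetical names, each "register" holds a single 32-bit lane, and the model assumes the RMW instruction reads all of its sources before writing the target. Codegen is modeled as an optional `mov target, op1` followed by the multiply-add.

```cpp
#include <array>
#include <cstdint>

// Hypothetical toy register file: one 32-bit lane per register.
using RegFile = std::array<int32_t, 4>;

// Models codegen for an RMW multiply-add: target = op1 + op2 * op3.
// The instruction itself accumulates into `target`, so when target != op1
// the value of op1 must first be copied into target.
inline int32_t emitMultiplyAdd(RegFile& regs, int target, int op1, int op2, int op3)
{
    if (target != op1)
    {
        // mov target, op1 -- clobbers op2/op3 if either aliases target
        regs[target] = regs[op1];
    }
    // mla target, op2, op3 -- sources are read before target is written
    regs[target] += regs[op2] * regs[op3];
    return regs[target];
}
```

With every operand in one register holding 6, no `mov` is emitted and the result is 6 + 6 * 6. With target=1 and op1=op2=op3=0, the `mov` is emitted and the result is the same. Only when the target aliases op2 or op3 but not op1 does the `mov` destroy a source: for example, register 0 holding 6, register 1 holding 2, and the call `emitMultiplyAdd(regs, 1, 0, 1, 1)` yields 6 + 6 * 6 instead of the intended 6 + 2 * 2.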
Actually, it does go through the "AddDelayFreeUses" code path, but decides not to add the delayFree setting due to it thinking one operator is a last use, so it is ok to reuse it:
This argues for the assert needing to incorporate that case. |
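The decision described in the comment above can be modeled roughly as follows. This is a simplified sketch, not the code in lsrabuild.cpp: `OperandUse` and `needsDelayFree` are hypothetical names, and the condition is one reading of the behavior described here (an operand on the same interval as the RMW operand is left without the delay-free setting when either is considered a last use).

```cpp
// Hypothetical, simplified stand-in for an operand use seen by the allocator.
struct OperandUse
{
    int  intervalId; // which interval (variable lifetime) this operand reads
    bool isLastUse;  // true if the allocator believes the value dies here
};

// Returns true if the operand should stay live until after the target is
// defined ("delay free"), which keeps it out of the target register.
inline bool needsDelayFree(const OperandUse& use, const OperandUse& rmwOperand)
{
    // A different interval from the RMW operand must not share the target
    // register, because the mov into the target could overwrite it.
    if (use.intervalId != rmwOperand.intervalId)
    {
        return true;
    }
    // Same interval as the RMW operand: if either use is a last use, the
    // register is treated as safe to reuse and delay-free is skipped.
    return !rmwOperand.isLastUse && !use.isLastUse;
}
```

Under this model, passing the same variable for every operand means every use shares the RMW operand's interval, so a single last-use flag is enough for the allocator to hand the target the same register as op2 and op3.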
Another oddity here:
Then, Next, This seems worrisome, that a decision (to not set |
Eliminating the optimization in runtime/src/coreclr/jit/lsrabuild.cpp Lines 3441 to 3456 in 736dabe
|
Testing diffs with removed optimization here: #92391 |
On x64, removing the optimization shows a few regressions due to extra moves. |
It seems odd that Due to the changing of the RP |
The regressions from removing the |
@kunalspathak Do you have any thoughts about the LSRA questions I mention above, especially #91209 (comment) and #91209 (comment)? |
With Antigen, we have lately been seeing a pattern of issues where we end up passing the same variable for different parameters to intrinsic methods. As a result, we end up having the same interval for the use and the rmw, i.e. runtime/src/coreclr/jit/lsrabuild.cpp Line 3456 in 13a97c8
It is uncommon to have
I agree that we should be setting the
I recently ran into a similar issue in #91798 and fixed it by updating the assert. However, that was a slightly different scenario where we were marking |
Sent out a prototype in #92496. |
I realized that for this assertion (and related similar assertions), I can avoid them reasonably by only asserting if |
That's what I ended up doing in https://github.com/dotnet/runtime/pull/92183/files#diff-6b4e0f32449f2f144e05699f59f74415a564693637f643084a896dfbd081830dR8612. However, the "last use" information that LSRA relies on becomes stale, so we want to fix that eventually. |
There are a bunch of asserts in arm64 hardware intrinsics codegen, for intrinsics with read-modify-write behavior, that the target register is not the same as the non-RMW operand registers, since we do a `mov` to the target register which could trash those argument registers. However, if the target register and the `op1` register are the same, then no `mov` is necessary, and also, the register allocator should already have ensured proper lifetime conflict resolution. So, put the asserts under a check that the `mov` is required. Fixes dotnet#91209
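The shape of that change can be sketched as below. This is an illustration under assumptions, not the code from hwintrinsiccodegenarm64.cpp: `prepareRmwTarget` is a hypothetical helper, and registers are modeled as plain integers.

```cpp
#include <cassert>

// Hypothetical helper: reports whether codegen must emit `mov target, op1`
// before the RMW instruction, and checks the aliasing invariant only when
// that mov is actually needed.
inline bool prepareRmwTarget(int targetReg, int op1Reg, int op2Reg, int op3Reg)
{
    const bool movRequired = (targetReg != op1Reg);
    if (movRequired)
    {
        // The mov overwrites targetReg, so no other source may live there.
        assert(targetReg != op2Reg);
        assert(targetReg != op3Reg);
    }
    // When targetReg == op1Reg there is no mov and nothing gets trashed,
    // so op2/op3 sharing the target register is not asserted against.
    return movRequired;
}
```

For the repro's allocation, where the target and all three operands share one register, the helper returns false without firing an assert. The asserts still catch the dangerous case where a `mov` is required and the target aliases op2 or op3.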
* Fix asserts about arm64 hardware intrinsic register selection

  There are a bunch of asserts in arm64 hardware intrinsics codegen, for intrinsics with read-modify-write behavior, that the target register is not the same as the non-RMW operand registers, since we do a `mov` to the target register which could trash those argument registers. However, if the target register and the `op1` register are the same, then no `mov` is necessary, and also, the register allocator should already have ensured proper lifetime conflict resolution. So, put the asserts under a check that the `mov` is required.

  Fixes #91209

* Fix CLRTestTargetUnsupported usage

  Also, exclude mono for Runtime_91209 test.