[libraries-pgo] arm64 failure in System.Tests.DoubleTests.ParsePatterns #71005
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch

Issue Details

This test has been failing off and on for a while with random PGO (and possibly other modes).
One common failure mode is
Though I have also seen
and GC heap corruption.
Repros fairly readily with just this test enabled (add `-parallel none -method System.Tests.DoubleTests.ParsePatterns` to xunit args). Binary search via `COMPlus_JitEnablePgoRange` indicates the problematic method is
Debugging this, I found it easiest to extract the method as a standalone repro:

    using System;
    using System.Globalization;
    using System.IO;
    using System.Linq;
    using System.Runtime.CompilerServices;

    // Minimal stand-in for xunit's Assert.Equal: reports a mismatch instead of throwing.
    class Assert
    {
        public static void Equal(string s1, string s2)
        {
            if (!s1.Equals(s2))
            {
                Console.WriteLine($"Assert failed, '{s1}' != '{s2}'");
            }
        }
    }

    class X
    {
        // Drops the '-' separators; on little-endian machines also reverses each two-character pair.
        internal static string SplitPairs(string input)
        {
            if (!BitConverter.IsLittleEndian)
            {
                return input.Replace("-", "");
            }
            return string.Concat(input.Split('-').Select(pair => Reverse(pair)));
        }

        internal static string Reverse(string s)
        {
            char[] charArray = s.ToCharArray();
            Array.Reverse(charArray);
            return new string(charArray);
        }

        public static void Main()
        {
            string path = Directory.GetCurrentDirectory();
            using (FileStream file = new FileStream(Path.Combine(path, "ibm-fpgen.txt"), FileMode.Open, FileAccess.Read))
            {
                using (var streamReader = new StreamReader(file))
                {
                    int count = 0;
                    string line = streamReader.ReadLine();
                    while (line != null)
                    {
                        count++;

                        // Each line: half bytes, float bytes, double bytes, then the value to parse.
                        string[] data = line.Split(' ');
                        string inputHalfBytes = data[0];
                        string inputFloatBytes = data[1];
                        string inputDoubleBytes = data[2];
                        string correctValue = data[3];

                        double doubleValue = double.Parse(correctValue, NumberFormatInfo.InvariantInfo);
                        string doubleBytes = BitConverter.ToString(BitConverter.GetBytes(doubleValue));
                        float floatValue = float.Parse(correctValue, NumberFormatInfo.InvariantInfo);
                        string floatBytes = BitConverter.ToString(BitConverter.GetBytes(floatValue));
                        Half halfValue = Half.Parse(correctValue, NumberFormatInfo.InvariantInfo);
                        string halfBytes = BitConverter.ToString(BitConverter.GetBytes(halfValue));

                        doubleBytes = SplitPairs(doubleBytes);
                        floatBytes = SplitPairs(floatBytes);
                        halfBytes = SplitPairs(halfBytes);

                        if (BitConverter.IsLittleEndian)
                        {
                            doubleBytes = Reverse(doubleBytes);
                            floatBytes = Reverse(floatBytes);
                            halfBytes = Reverse(halfBytes);
                        }

                        Assert.Equal(doubleBytes, inputDoubleBytes);
                        Assert.Equal(floatBytes, inputFloatBytes);
                        Assert.Equal(halfBytes, inputHalfBytes);

                        line = streamReader.ReadLine();
                    }
                    Console.WriteLine($"Passed {count} tests");
                }
            }
        }
    }

Then fix the JIT so it only puts a patchpoint at offset 0xED, and run with
and provide the input file that the test needs.

It looks like, for some reason, the untracked-lifetime GC info the runtime sees differs from what the JIT thinks it produces. In particular, the JIT places a GC ref at FP offset 0xE0 and consistently uses it this way in the code:
but at runtime the decoded GC info shows all the untracked slots 0x20 bytes lower, and the runtime complains that the 0xC0 slot is not a valid object ref.

Other oddities: the JIT reports 54 slots, but the runtime reports seeing only 50. So perhaps the GC info is not getting encoded properly, or is getting corrupted somehow?
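
For anyone reading the repro above without the input file at hand, here is a small sketch of what its string manipulation does to a single value. It reuses the repro's `SplitPairs` and `Reverse` helpers unchanged; the `HexDemo` class and the `ToBigEndianHex` wrapper are names made up for this illustration and are not part of the test. The sketch suggests the repro compares against most-significant-byte-first hex strings, though that is inferred from the code rather than from the input file itself.

```csharp
using System;
using System.Linq;

static class HexDemo
{
    // Same helper as in the repro: reverses the characters of a string.
    public static string Reverse(string s)
    {
        char[] charArray = s.ToCharArray();
        Array.Reverse(charArray);
        return new string(charArray);
    }

    // Same helper as in the repro: drops '-' and, on little-endian
    // machines, reverses each two-character pair.
    public static string SplitPairs(string input)
    {
        if (!BitConverter.IsLittleEndian)
        {
            return input.Replace("-", "");
        }
        return string.Concat(input.Split('-').Select(pair => Reverse(pair)));
    }

    // Hypothetical wrapper (not in the repro) bundling the steps the repro
    // applies to a double before comparing against the expected string.
    public static string ToBigEndianHex(double value)
    {
        string s = SplitPairs(BitConverter.ToString(BitConverter.GetBytes(value)));
        return BitConverter.IsLittleEndian ? Reverse(s) : s;
    }

    public static void Main()
    {
        // 1.0 has the IEEE 754 bit pattern 0x3FF0000000000000.
        Console.WriteLine(ToBigEndianHex(1.0)); // prints 3FF0000000000000
    }
}
```

On a little-endian machine the intermediate strings for 1.0 are `00-00-00-00-00-00-F0-3F`, then `0000000000000FF3` after `SplitPairs`, then `3FF0000000000000` after the final `Reverse`.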

Looking at the GC info right after the OSR method is created, it seems to match what the runtime reports later on. So there is a discrepancy between what the JIT thinks it is recording and what actually gets recorded.

Ah, I think the issue is that we are double reporting a stack slot
and this likely leads to the failures. It seems we can fix this by checking whether an untracked, on-frame, (dependently) promoted struct has tracked GC fields, but perhaps the intention was that this combination should be impossible. If so, perhaps OSR is missing some key bit of logic somewhere? @dotnet/jit-contrib does this sound familiar to anyone?

Semi-related: #67825 (comment) (also an untracked struct with GC fields). One possible fix would be in GC encoding: if we have an untracked, on-frame local struct with GC fields, and those fields are dependently promoted and tracked, we should report the tracked promoted fields and not the untracked GC offsets from the raw struct layout.

If there is a gc struct local that is dependently promoted, the struct local may be untracked while the promoted gc fields of the struct are tracked. If so, the jit will double report the stack offset for the gc field, first as an untracked slot, and then as a tracked slot. Detect this case and report the slot as tracked only. Closes dotnet#71005.
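
The double report and the fix described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `PromotedField`, `GcSlotSketch`, `ReportSlots`, and the `skipTrackedFields` switch are hypothetical names invented here, and the real JIT GC encoder is C++ with very different data structures. The 0xE0 offset echoes the one mentioned in the thread; 0xD8 is made up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of one GC-ref field of a dependently promoted struct local.
sealed class PromotedField
{
    public int FrameOffset; // FP-relative offset of the field's stack slot
    public bool Tracked;    // true if liveness of this field is tracked
}

static class GcSlotSketch
{
    // Models slot reporting for an untracked, on-frame struct local.
    // With skipTrackedFields == false, a tracked field is reported twice
    // (the bug); with true, it is reported as tracked only (the fix).
    public static List<(int Offset, string Kind)> ReportSlots(
        IReadOnlyList<PromotedField> gcFields, bool skipTrackedFields)
    {
        var slots = new List<(int Offset, string Kind)>();

        // Pass 1: untracked slots taken from the raw struct layout.
        foreach (PromotedField field in gcFields)
        {
            if (skipTrackedFields && field.Tracked)
            {
                continue; // will be reported as tracked in pass 2
            }
            slots.Add((field.FrameOffset, "untracked"));
        }

        // Pass 2: tracked slots for the promoted fields.
        foreach (PromotedField field in gcFields)
        {
            if (field.Tracked)
            {
                slots.Add((field.FrameOffset, "tracked"));
            }
        }
        return slots;
    }

    public static void Main()
    {
        var fields = new[]
        {
            new PromotedField { FrameOffset = 0xE0, Tracked = true },
            new PromotedField { FrameOffset = 0xD8, Tracked = false },
        };

        foreach (bool fix in new[] { false, true })
        {
            Console.WriteLine($"skipTrackedFields = {fix}");
            foreach (var slot in ReportSlots(fields, fix))
            {
                Console.WriteLine($"  0x{slot.Offset:X} {slot.Kind}");
            }
        }
    }
}
```

Without the check, offset 0xE0 shows up in both the untracked and the tracked list; with it, each offset is reported exactly once.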
This test has been failing off and on for a while with random PGO (and possibly other modes).
One common failure mode is
Though I have also seen
and GC heap corruption.
Repros fairly readily with just this test enabled (add `-parallel none -method System.Tests.DoubleTests.ParsePatterns` to xunit args). Binary search via `COMPlus_JitEnablePgoRange` indicates the problematic method is