-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Inlinable String.Equals for ref equality #16434
Conversation
Regressions (other than centralization of |
return false; | ||
|
||
if (this.Length != value.Length) | ||
if (value is null || value.Length != value.Length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops
Hadn't noticed
|
Perf? |
Needs some tweaks, e.g. |
Looks like changing static int CompareOrdinalIgnoreCase(ReadOnlySpan<char> strA, ReadOnlySpan<char> strB) to |
String -> ROS #16440 |
@dotnet-bot test Windows_NT arm Cross Checked corefx_baseline Build and Test |
throw GetStringComparisonException_NotSupported(); | ||
} | ||
|
||
[MethodImpl(MethodImplOptions.NoInlining)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't JIT automatically not-inline methods that unconditionally throw?
if (!m_IsForceInline && m_IsNoReturn && (basicBlockCount == 1))
{
SetNever(InlineObservation::CALLEE_DOES_NOT_RETURN);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method doesn't throw; it just constructs and returns the exception to the method that throws.
Issue is if the throw method does too much then basicBlockCount
can be greater than 1
then it no longer counts as a non-returning throw method. Moving the exception constructor, the resource lookup and marking that as non-inlining; and then just throwing the return guarantees that the basicBlockCount
is always 1
in the throw method.
NoInlining
probably isn't needed here; its a trait for older runtimes that don't know about non-returning methods; but its also not harming anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method doesn't throw; it just constructs and returns the exception to the method that throws.
am I misreading? I see throw new ArgumentException(...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eugh; should be return
rather than throw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add to and use ThrowHelper here instead (https://github.com/dotnet/coreclr/blob/master/src/mscorlib/src/System/ThrowHelper.cs)?
Errors still
|
That is a known issue: dotnet/corefx#27212 (comment) |
@dotnet-bot test Windows_NT arm Cross Checked corefx_baseline Build and Test |
@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test |
@dotnet-bot test Windows_NT x64_arm64_altjit Checked corefx_baseline |
@ahsonkhan its complaining
For internal static int CompareOrdinal(ReadOnlySpan<char> strA, ReadOnlySpan<char> strB)
{
// TODO: Add a vectorized code path, similar to SequenceEqual
// https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/SpanHelpers.byte.cs#L900
int minLength = Math.Min(strA.Length, strB.Length);
ref char first = ref MemoryMarshal.GetReference(strA);
ref char second = ref MemoryMarshal.GetReference(strB); And wants a extra using at top? using System.Runtime.InteropServices; |
Yes, we still need the InteropService using directive that you removed here: https://github.com/dotnet/coreclr/pull/16434/files#diff-a43bf8854ecf533cd7e5c7f6b221ea0cL10 That is where MemoryMarshal lives. |
lol, sorry :) |
57d2eef
to
165032d
Compare
|
||
case StringComparison.CurrentCultureIgnoreCase: | ||
return ReplaceCore(oldValue, newValue, CultureInfo.CurrentCulture, CompareOptions.IgnoreCase); | ||
return ReplaceCore(oldValue, newValue, CultureInfo.CurrentCulture, StringSpanHelpers.GetCaseCompareOfComparisonCulture(comparisonType)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: How much value does this change add (reducing the cases in the switch statement this way)?
I have span-based APIs that have similar structure and am wondering how fruitful it would be to do the same thing there.
return (CompareOptions)((int)comparisonType & (int)CompareOptions.IgnoreCase); | ||
} | ||
|
||
[StackTraceHidden] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am seeing this attribute for the first time. I can infer what it does, but can you explain why we should have it here?
Is it mainly to help debugging by avoiding an extra method showing up in the call stack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't output the method in the stacktrace; so it looks like it was thrown in the original method that calls it
using System.Runtime.CompilerServices; | ||
using System.Runtime.ConstrainedExecution; | ||
using System.Runtime.InteropServices; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Add back new line.
I have been separating using Internal.*
and the rest of the using directives with a new line. Most of the places I have seen do this but it is inconsistent.
Although, I don't think it is worth making the change given CI is mostly green already.
return false; | ||
|
||
if (this.Length != value.Length) | ||
if (value is null || this.Length != value.Length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is value is null
better than value == null
?
The disassembly is identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't affect string as its special cased; but in other cases can cause the compiler to do an implicit conversion to the type then use an overloaded ==
to compare; has caught me out before.
charA = *a; | ||
charB = *b; | ||
|
||
Debug.Assert((charA | charB) <= 0x7F, "strings have to be ASCII"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are now replacing calls to this method (CompareOrdinalIgnoreCaseHelper) with calls to CompareOrdinalIgnoreCase(ReadOnlySpan<char> strA, ReadOnlySpan<char> strB)
Should this be 0x7F instead of 0x80?
// in InvariantMode we support all range and not only the ascii characters.
char maxChar = (char) (GlobalizationMode.Invariant ? 0xFFFF : 0x80);
Also, this method doesn't do the (*a <= maxChar) && (*b <= maxChar)
checks for each character (which the new method does). Wouldn't we see perf regression now (including the overhead of casting string to span)?
Edit: Ignore the point about the loop having extra checks per character (we are replacing the IsAscii check which was doing this anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the check needs to be <= 0x7F
and not <= 0x80
to remain consistent with the IsAscii checks.
Line 1392 in ba1d5b2
if (c>=0x80) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that;s not right; either should be < maxChar
or maxChar
should be 0x7f
; since the alternative is 0xffff
guessing should be 0x7f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug.Assert((charA | charB) <= 0x7F, "strings have to be ASCII"); | ||
|
||
// Ordinal equals or lowercase equals if the result ends up in the a-z range | ||
if (charA == charB || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which way of writing this method is more optimal?
A: What we are removing:
// Ordinal equals or lowercase equals if the result ends up in the a-z range
if (charA == charB ||
((charA | 0x20) == (charB | 0x20) &&
(uint)((charA | 0x20) - 'a') <= (uint)('z' - 'a')))
{
a++;
b++;
length--;
}
else
{
return false;
}
if (charA == charB)
{
a++; b++;
length--;
continue;
}
// uppercase both chars - notice that we need just one compare per char
if ((uint)(charA - 'a') <= 'z' - 'a') charA -= 0x20;
if ((uint)(charB - 'a') <= 'z' - 'a') charB -= 0x20;
// Return the (case-insensitive) difference between them.
if (charA != charB)
return charA - charB;
// Next char
a++; b++;
length--;
It seems like (A) is better (at least smaller disassembly): 148 bytes vs 133 bytes (2 instructions).
https://www.diffchecker.com/0hhlUYlf
[MethodImpl(MethodImplOptions.NoInlining)]
public unsafe static int TestingA(ReadOnlySpan<char> strA, ReadOnlySpan<char> strB)
{
int length = Math.Min(strA.Length, strB.Length);
fixed (char* ap = &MemoryMarshal.GetReference(strA))
fixed (char* bp = &MemoryMarshal.GetReference(strB))
{
char* a = ap;
char* b = bp;
while (length != 0)
{
int charA = *a;
int charB = *b;
// Ordinal equals or lowercase equals if the result ends up in the a-z range
if (charA == charB ||
((charA | 0x20) == (charB | 0x20) &&
(uint)((charA | 0x20) - 'a') <= ('z' - 'a')))
{
a++;
b++;
length--;
}
else
{
return charA - charB;
}
}
return strA.Length - strB.Length;
}
}
[MethodImpl(MethodImplOptions.NoInlining)]
public unsafe static int TestingB(ReadOnlySpan<char> strA, ReadOnlySpan<char> strB)
{
int length = Math.Min(strA.Length, strB.Length);
fixed (char* ap = &MemoryMarshal.GetReference(strA))
fixed (char* bp = &MemoryMarshal.GetReference(strB))
{
char* a = ap;
char* b = bp;
while (length != 0)
{
int charA = *a;
int charB = *b;
if (charA == charB)
{
a++; b++;
length--;
continue;
}
// uppercase both chars - notice that we need just one compare per char
if ((uint)(charA - 'a') <= 'z' - 'a') charA -= 0x20;
if ((uint)(charB - 'a') <= 'z' - 'a') charB -= 0x20;
// Return the (case-insensitive) difference between them.
if (charA != charB)
return charA - charB;
// Next char
a++; b++;
length--;
}
return strA.Length - strB.Length;
}
}
Related to: https://github.com/dotnet/coreclr/issues/10293#issue-215273397
CompareInfo.CompareOrdinalIgnoreCase checks to see if the chars are equivalent before normalizing case, whereas String.CompareOrdinalIgnoreCaseHelper doesn't. This suggests that the former is optimizing for the chars being the same case, whereas the latter is optimizing for the case where the chars aren't the same case. It's not clear to me that there's a use-case difference between these helpers that would account for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A is probably a bit faster; B has more branching and does more work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benaadams can you please update CompareOrdinalIgnoreCase to look like (A). See my TestingA method above.: https://github.com/dotnet/coreclr/blob/master/src/mscorlib/shared/System/Globalization/CompareInfo.cs#L576-L593
Thanks!
System.Runtime.Extensions.Tests
Could this have something to do with a bug in https://github.com/dotnet/coreclr/blob/master/src/mscorlib/shared/System/Span.NonGeneric.cs#L337? Or is this related to: dotnet/corefx#27348 cc @JeremyKuhne Edit: Could be related to my change here - cb4dd60#diff-a43bf8854ecf533cd7e5c7f6b221ea0cR658 I am going to dig deeper. |
System.Data.SqlClient.Tests.SqlConnectionBasicTests.IntegratedAuthConnectionTest |
} | ||
|
||
Debug.Fail("StringComparison outside range"); | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return null; [](start = 12, length = 12)
this is wrong and we have to throw #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It throws at the start of the function; second throw here would be redundant and just bloat the method #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We throw in the beginning. We should never reach this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
|
||
Debug.Fail("StringComparison outside range"); | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return 0; [](start = 11, length = 10)
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. We should never reach here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than updating CompareOrdinalIgnoreCase, looks good to me.
@dotnet-bot test Windows_NT x64_arm64_altjit Checked corefx_baseline |
https://github.com/dotnet/corefx/issues/23435
Is this a known issue? cc @AndyAyersMS, @dotnet/jit-contrib |
Yes, it's a pre-existing issue, see #16377. |
Lol, I knew it seemed familiar. I searched the repo for "LongTuplesWithNull" but missed it (I didn't wait long enough to see the results under Issues). |
OK, @benaadams, can you please address the code review feedback, whenever possible, and we can get this merged :) Also:
|
Let me get some perf tests @jkotas could there be an issue for R2R that the initial slim String.Equal won't inline (crossing assembly boundries) so this would introduce two calls? |
It is like that today already, so this is regressing R2R. Right? |
Not sure; am wondering how to check; would I need to crossgen the performance test? |
@benaadams, any plans to get this in this week? Do you want me to take over and resolve the merge conflicts? |
@ahsonkhan I'd like to hold off on it a bit till I do a bit more perf analysis; as I think it will mean the whole of corefx will have to make 2 calls instead of 1 (as everything is crossgen'd to the runtime store?) If that's the case, there are still some improvements that could be incorporated from it without splitting to two calls (though I'm getting on a plane shortly) |
@benaadams, do you plan to update this PR as part of 2.1 or 2.2? |
This is not for 2.1. |
This change makes the code both smaller and faster. For example, the following is about 1.4x faster with this change: ``` ReadOnlySpan<char> s1 = "Hello world"; ReadOnlySpan<char> s2 = "world"; for (int i = 0; i < 100000000; i++) s1.EndsWith(s2, StringComparison.OrdinalIgnoreCase); ``` Also, I have ported GetCaseCompareOfComparisonCulture code size optimization from #16434 while I was on it because of it fit well with the rest of the changes.
Most of these changes got submitted via independent smaller PRs. If there is anything good still left, please submit a new PR with what is remaining. |
Allows String.Equals (of all forms) to inline for the null and ref equality parts.
Remove
case default
inline throw asStringSpanHelpers.CheckStringComparison
already deals with out of rangeDeduplicate calls to
StringSpanHelpers.CheckStringComparison
and always use it, rather than in some if branches (so thecase default
can be removed)Add
GetCaseCompareOfComparisonCulture
that convertsStringComparison
toCompareOptions
with a single&
when comparison is a culture.Dedupe the calls using it.
Remove the now other duplicated code (for equals)
Resolves #14320
"Consider removing String.IsAscii() fast paths" as
CompareOrdinalIgnoreCase(ReadOnlySpan<char>,ReadOnlySpan<char>)
has an ascii fast-path in itResolves #10293 "Differences between String.CompareOrdinalIgnoreCaseHelper and CompareInfo.CompareOrdinalIgnoreCase" by removing the
string
version in consolidation