-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use internal BitOperations.ResetLowestSetBit
#87798
Conversation
Tagging subscribers to this area: @dotnet/area-system-memory Issue Detailsruntime/src/libraries/System.Private.CoreLib/src/System/Numerics/BitOperations.cs Lines 941 to 945 in 05e25ff
Without any lowering, the implementation is still likely more efficient than
|
@@ -462,8 +462,7 @@ ref Unsafe.Add(ref valueRef, 1), valueTailLength)) | |||
// Do a full IgnoreCase equality comparison. SpanHelpers.IndexOf skips comparing the two characters in some cases, | |||
// but we don't actually know that the two characters are equal, since we compared with | 0x20. So we just compare | |||
// the full string always. | |||
int bitPos = BitOperations.TrailingZeroCount(mask); | |||
nint charPos = (nint)((uint)bitPos / 2); // div by 2 (shr) because we work with 2-byte chars | |||
nint charPos = (nint)((uint)BitOperations.TrailingZeroCount(mask) / sizeof(ushort)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I don't think the cast to uint is needed here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the uint
cast for consistency and to avoid a sign-extension in the expression offset + charPos
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
887ccac: uint.TrailingZeroCount(mask)
is easier to read though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, makes sense to me, godbolt example: https://godbolt.org/z/EYMfbr44K
Data from On Haswell the RThroughput ratio is 0.72 (an improvement). |
@@ -462,8 +462,7 @@ ref Unsafe.Add(ref valueRef, 1), valueTailLength)) | |||
// Do a full IgnoreCase equality comparison. SpanHelpers.IndexOf skips comparing the two characters in some cases, | |||
// but we don't actually know that the two characters are equal, since we compared with | 0x20. So we just compare | |||
// the full string always. | |||
int bitPos = BitOperations.TrailingZeroCount(mask); | |||
nint charPos = (nint)((uint)bitPos / 2); // div by 2 (shr) because we work with 2-byte chars | |||
nint charPos = (nint)(uint.TrailingZeroCount(mask) / sizeof(ushort)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nint charPos = (nint)(uint.TrailingZeroCount(mask) / sizeof(ushort)); | |
nint charPos = (nint)(uint.TrailingZeroCount(mask) / sizeof(char)); |
@stephentoub This would be slightly clearer, but might not be worth spinning CI again?
runtime/src/libraries/System.Private.CoreLib/src/System/Numerics/BitOperations.cs
Lines 941 to 945 in 05e25ff
Without any lowering, the implementation is still likely more efficient than
mask &= ~(uint)(0b11 << bitPos)
as it avoids a loop-carried dependency.