Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@ahsonkhan
Copy link

@ahsonkhan ahsonkhan commented Mar 31, 2018

The extra span.Length checks (required by the way string.LastIndexOf was implemented) ends up adding overhead for small spans.

TODO: Investigate a way to avoid that overhead without having two separate implementations.

image

Version 1 of the PR (out-dated)

image

Related PR: #17284

cc @jkotas, @tarekgh, @eerhardt

@ahsonkhan
Copy link
Author

@dotnet-bot test Windows_NT x64 Checked corefx_baseline
@dotnet-bot test Ubuntu x64 Checked corefx_baseline

}
}

public static unsafe int LastIndexOf(ref char firstChar, int startIndex, char value, int length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the startIndex argument? Isn't it always length - 1?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I think this should be length, just like for IndexOf.

Copy link
Author

@ahsonkhan ahsonkhan Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how that would work.

Let's say we change the caller to pass in length for startIndex instead:

public int LastIndexOf(char value) => SpanHelpers.LastIndexOf(ref _firstChar, Length, value, Length);

For a string of length 1, we would end up with the incorrect result (-1 instead of 0).

'a'.LastIndexOf('a'); // Expected: 0, Actual: -1

We have to pass in Length - 1 as the startIndex since the character at position startIndex is included in the search. Which means, we have to special case Length == 0, as well.

Copy link
Member

@jkotas jkotas Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IndexOf and LastIndexOf implementation should be pretty similar. IndexOf helper takes 3 arguments, so LastIndexOf helper should take 3 arguments as well.

The arguments of the SpanHelpers.LastIndexOf method should be the block of memory (ie pointer + length) and the value you are looking for. The method should always start searching at length - 1.

The naive non-vectorized algorithm would be:

static int LastIndexOf(ref char searchSpace, char value, int length)
{
    for (int i = length - 1; i >= 0; i--)
        if (Unsafe.Add(ref searchSpace, i) == value) return i;
    return -1;
}

And the call from int IndexOf(char value, int startIndex, int count) should be:

static int LastIndexOf(char value, int startIndex, int count)
{
...
    int endIndex = startIndex + 1 - count;
    int result = SpanHelpers.LastIndexOf(ref Unsafe.Add(_firstChar, endIndex), count, value);
    return result == -1 ? result : result + endIndex;
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks. That is what I ended up doing :)

public int LastIndexOf(char value)
{
return LastIndexOf(value, this.Length - 1, this.Length);
if (Length == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do not have Length == 0 check for IndexOf, we should not have one here either.

if (Vector.IsHardwareAccelerated && length >= Vector<ushort>.Count * 2)
{
const int elementsPerByte = sizeof(ushort) / sizeof(byte);
length = (((int)pCh & (Unsafe.SizeOf<Vector<ushort>>() - 1)) / elementsPerByte) + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this need to be Vector<byte>? I think either you don't need elementsPerByte, or you should use Vector<byte>.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't Vector<byte>.Count and Unsafe.SizeOf<Vector<ushort>>() equivalent in value? In either case, we would still need elementsPerByte, no?

What about IndexOf: https://github.com/dotnet/coreclr/blob/master/src/mscorlib/shared/System/SpanHelpers.Char.cs#L96?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct. I was thinking Vector<T>.Count, not Unsafe.SizeOf. Please ignore.

while (length > 0)
{
char* pStart = pCh - Vector<ushort>.Count + 1;
Vector<ushort> vMatches = Vector.Equals(vComparison, Unsafe.ReadUnaligned<Vector<ushort>>(pStart));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use Unsafe.Read in IndexOf, but Unsafe.ReadUnaligned here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if pStart is guaranteed to be vector aligned here (unlike the case of IndexOf). I will have to reason about it to convince myself. If it is aligned, I will change it to Unsafe.Read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be guaranteed, or else the preamble that we are doing is worthless. The whole point of the first SequentialScan is to get our pointer aligned. That's what:

length = (((int)pCh & (Unsafe.SizeOf<Vector<ushort>>() - 1)) / elementsPerByte) + 1;

that line is doing.

Once pCh is aligned, pStart will be aligned because:

char* pStart = pCh - Vector<ushort>.Count + 1;

while (length > Vector<ushort>.Count - 1)
{
char* pStart = pCh - Vector<ushort>.Count;
// Using Unsafe.ReadUnaligned instead of Read since it isn't gauranteed that pStart is vector aligned
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not same as in IndexOf?

     // Using Unsafe.Read instead of ReadUnaligned since the search space is pinned and pCh is always vector aligned
     Debug.Assert(((int)pCh & (Unsafe.SizeOf<Vector<ushort>>() - 1)) == 0);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ahsonkhan - I thought we discussed this last night, and I showed you why pStart is guaranteed to be aligned.

Copy link
Author

@ahsonkhan ahsonkhan Apr 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, char* pStart = pCh - Vector<ushort>.Count + 1; is always aligned, but in the current implementation, that causes test failures. For correctness, I had to change it to char* pStart = pCh - Vector<ushort>.Count; (note: no + 1) which isn't aligned.

I will investigate how to fix the test failures.

As part of this change, I also had to change the sequential scan (and the goto logic) for correctness. We decrement pch at the start of the loop before comparing with value rather than doing it at the end of the loop.

if (Vector.IsHardwareAccelerated && length >= Vector<ushort>.Count * 2)
{
const int elementsPerByte = sizeof(ushort) / sizeof(byte);
length = (((int)pCh & (Unsafe.SizeOf<Vector<ushort>>() - 1)) / elementsPerByte) + 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A case I think this is missing is when pCh is already at the correct position for alignment. With the current code, we will do a whole SequentialScan on the first vector's length of chars before we start the vectorized section. We can skip that SequentialScan when pCh is already at the correct position for alignment.

Using my back of the napkin pseudo code, I think you can accomplish that with adding the line:

length = length & (Vector<ushort>.Count - 1);

This is also what the SpanHelpers.Byte.LastIndexOf version is also doing -

nLength = (IntPtr)(((length & (Vector<byte>.Count - 1)) + unaligned) & (Vector<byte>.Count - 1));

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If pCh is already aligned, adding that line won't skip the SequentialScan (due to the + 1):

// ((int)pCh & (Unsafe.SizeOf<Vector<ushort>>() - 1)) = 0
// 0 / elementsPerByte = 0
// length =  0 + 1 = 1
length = (((int)pCh & (Unsafe.SizeOf<Vector<ushort>>() - 1)) / elementsPerByte) + 1; 

length = length & (Vector<ushort>.Count - 1); // length = 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not pCh that needs to be aligned. It's pStart below. Like I said above, the case where pCh is at the correct position for alignment.

A B C D | E F G H | I J K
          ^     ^
          |     pCh
          pStart

It's pStart that needs to be aligned.

So you are checking for when pCh is at the end of the alignment. i.e. length = Vector<ushort>.Count.

@ahsonkhan
Copy link
Author

@dotnet-bot test Windows_NT x64 Checked corefx_baseline
@dotnet-bot test Ubuntu x64 Checked corefx_baseline

@ahsonkhan
Copy link
Author

Any other feedback before I merge this?

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work. This is cleaner and more readable than the way I had it. I just needed to shift my thinking that pCh points to the char you've already checked through each iteration. But in doing that, it makes the math around it easier to understand.

@ahsonkhan ahsonkhan merged commit f561c10 into dotnet:master Apr 5, 2018
@ahsonkhan ahsonkhan deleted the UseSpanLastIndexOf branch April 5, 2018 18:36
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants