This repository was archived by the owner on Jan 23, 2023. It is now read-only.

@stephentoub
Member

  • Move most of the implementation to the platform-agnostic file, rather than having completely different implementations for Windows and Unix. Now the only logic in each platform-specific file is the logic around invoking the associated P/Invoke.
  • Optimize that implementation to take a fast path that doesn't allocate when no case change is needed, and to avoid the native call when the whole string is ASCII.
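The two optimizations described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `CaseSketch` and `ToLowerAsciiFastPath` are hypothetical names, `ToLowerInvariant` stands in for the platform-specific native call, and the ASCII shortcut assumes a culture without special ASCII casing rules (it would be wrong for Turkish-style `I` handling).

```csharp
using System;

public static class CaseSketch
{
    // Hypothetical helper for illustration only; not the PR's implementation.
    public static string ToLowerAsciiFastPath(string source)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));

        // Fast path: find the first char that is upper-case ASCII or non-ASCII.
        // If there is none, nothing changes and the input is returned as-is,
        // so no new string is allocated.
        int i = 0;
        for (; i < source.Length; i++)
        {
            char c = source[i];
            if (c >= 0x80 || (c >= 'A' && c <= 'Z'))
            {
                break;
            }
        }
        if (i == source.Length)
        {
            return source;
        }

        // If any remaining char is non-ASCII, defer to the general routine
        // (a stand-in here for the culture-aware native call).
        for (int j = i; j < source.Length; j++)
        {
            if (source[j] >= 0x80)
            {
                return source.ToLowerInvariant();
            }
        }

        // ASCII-only input that needs a change: convert in managed code.
        char[] buffer = source.ToCharArray();
        for (int j = i; j < buffer.Length; j++)
        {
            char c = buffer[j];
            if (c >= 'A' && c <= 'Z')
            {
                buffer[j] = (char)(c | 0x20); // set the ASCII lower-case bit
            }
        }
        return new string(buffer);
    }
}
```

With this shape, an input that is already lower-case comes back as the same string instance, which is consistent with the rows in the table below where the PR column reports 0 allocations.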

cc: @jkotas, @danmosemsft, @tarekgh, @ahsonkhan

Using the perf tests added in dotnet/corefx#28748:

Columns: Test (System.Runtime.Performance.Tests.dll), Metric, Current, PR, Diff (Current / PR; #DIV/0! means the PR value is 0)
"System.Tests.Perf_String.ToLower(size: 1, options: None, cultureName: ""en-US"")" GC Allocations 321520.944 0.00 #DIV/0!
"System.Tests.Perf_String.ToLower(size: 1, options: UniqueString, cultureName: ""en-US"")" GC Allocations 321709.776 0.00 #DIV/0!
"System.Tests.Perf_String.ToLower(size: 10, options: None, cultureName: ""en-US"")" GC Allocations 482977.472 0.00 #DIV/0!
"System.Tests.Perf_String.ToLower(size: 10, options: UniqueString, cultureName: ""en-US"")" GC Allocations 483279.328 0.00 #DIV/0!
"System.Tests.Perf_String.ToLower(size: 500, options: None, cultureName: ""en-US"")" GC Allocations 1.10E+07 0.00 #DIV/0!
"System.Tests.Perf_String.ToLower(size: 500, options: UniqueString, cultureName: ""en-US"")" GC Allocations 1.10E+07 0.00 #DIV/0!
"System.Tests.Perf_String.ToUpper(size: 1, options: None, cultureName: ""en-US"")" GC Allocations 321422.216 0.00 #DIV/0!
"System.Tests.Perf_String.ToUpper(size: 1, options: UniqueString, cultureName: ""en-US"")" GC Allocations 321722.096 0.00 #DIV/0!
"System.Tests.Perf_String.ToUpper(size: 10, options: None, cultureName: ""en-US"")" GC Allocations 482810.392 0.00 #DIV/0!
"System.Tests.Perf_String.ToUpper(size: 10, options: UniqueString, cultureName: ""en-US"")" GC Allocations 483392.68 0.00 #DIV/0!
"System.Tests.Perf_String.ToUpper(size: 500, options: None, cultureName: ""en-US"")" GC Allocations 1.10E+07 0.00 #DIV/0!
"System.Tests.Perf_String.ToUpper(size: 500, options: UniqueString, cultureName: ""en-US"")" GC Allocations 1.10E+07 0.00 #DIV/0!
"System.Tests.Perf_String.ToLower(size: 500, options: UniqueString, cultureName: ""en-US"")" Duration 22.196 3.36 6.61x
"System.Tests.Perf_String.ToUpper(size: 500, options: UniqueString, cultureName: ""en-US"")" Duration 21.12 3.37 6.26x
"System.Tests.Perf_String.ToLower(size: 1, options: None, cultureName: ""en-US"")" Duration 0.636 0.16 3.95x
"System.Tests.Perf_String.ToLower(size: 10, options: UniqueString, cultureName: ""en-US"")" Duration 0.836 0.21 3.92x
"System.Tests.Perf_String.ToUpper(size: 10, options: None, cultureName: ""en-US"")" Duration 0.776 0.21 3.75x
"System.Tests.Perf_String.ToUpper(size: 1, options: UniqueString, cultureName: ""en-US"")" Duration 0.59 0.16 3.66x
"System.Tests.Perf_String.ToUpper(size: 10, options: UniqueString, cultureName: ""en-US"")" Duration 0.77 0.21 3.63x
"System.Tests.Perf_String.ToLower(size: 10, options: None, cultureName: ""en-US"")" Duration 0.77 0.21 3.62x
"System.Tests.Perf_String.ToLower(size: 1, options: UniqueString, cultureName: ""en-US"")" Duration 0.571 0.16 3.61x
"System.Tests.Perf_String.ToUpper(size: 1, options: None, cultureName: ""en-US"")" Duration 0.57 0.16 3.48x
"System.Tests.Perf_String.ToLower(size: 500, options: None, cultureName: ""en-US"")" Duration 9.84 2.91 3.39x
"System.Tests.Perf_String.ToUpper(size: 500, options: None, cultureName: ""en-US"")" Duration 9.554 2.88 3.31x
"System.Tests.Perf_String.ToUpper(size: 1, options: MiddleDifferentCase, cultureName: ""en-US"")" Duration 0.647 0.23 2.88x
"System.Tests.Perf_String.ToLower(size: 1, options: MiddleDifferentCase, cultureName: ""en-US"")" Duration 0.583 0.22 2.70x
"System.Tests.Perf_String.ToUpper(size: 1, options: AllDifferentCase, cultureName: ""en-US"")" Duration 0.61 0.24 2.58x
"System.Tests.Perf_String.ToLower(size: 1, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" Duration 0.617 0.25 2.51x
"System.Tests.Perf_String.ToUpper(size: 10, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" Duration 0.936 0.38 2.47x
"System.Tests.Perf_String.ToUpper(size: 10, options: MiddleDifferentCase, cultureName: ""en-US"")" Duration 0.803 0.33 2.46x
"System.Tests.Perf_String.ToLower(size: 1, options: AllDifferentCase, cultureName: ""en-US"")" Duration 0.529 0.22 2.46x
"System.Tests.Perf_String.ToUpper(size: 1, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" Duration 0.59 0.25 2.37x
"System.Tests.Perf_String.ToLower(size: 1, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" Duration 0.563 0.24 2.34x
"System.Tests.Perf_String.ToLower(size: 10, options: MiddleDifferentCase, cultureName: ""en-US"")" Duration 0.734 0.32 2.31x
"System.Tests.Perf_String.ToUpper(size: 10, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" Duration 0.83 0.36 2.31x
"System.Tests.Perf_String.ToUpper(size: 1, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" Duration 0.568 0.25 2.27x
"System.Tests.Perf_String.ToUpper(size: 10, options: AllDifferentCase, cultureName: ""en-US"")" Duration 0.731 0.32 2.27x
"System.Tests.Perf_String.ToLower(size: 10, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" Duration 0.764 0.36 2.10x
"System.Tests.Perf_String.ToLower(size: 10, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" Duration 0.787 0.38 2.07x
"System.Tests.Perf_String.ToLower(size: 500, options: MiddleDifferentCase, cultureName: ""en-US"")" Duration 9.461 4.77 1.98x
"System.Tests.Perf_String.ToUpper(size: 500, options: MiddleDifferentCase, cultureName: ""en-US"")" Duration 9.477 4.86 1.95x
"System.Tests.Perf_String.ToLower(size: 10, options: AllDifferentCase, cultureName: ""en-US"")" Duration 0.687 0.35 1.94x
"System.Tests.Perf_String.ToLower(size: 500, options: AllDifferentCase, cultureName: ""en-US"")" Duration 10.095 5.20 1.94x
"System.Tests.Perf_String.ToUpper(size: 500, options: AllDifferentCase, cultureName: ""en-US"")" Duration 9.723 5.25 1.85x
"System.Tests.Perf_String.ToUpper(size: 500, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" Duration 21.61 15.60 1.39x
"System.Tests.Perf_String.ToLower(size: 500, options: MiddleTurkishI, cultureName: ""en-US"")" Duration 9.666 7.06 1.37x
"System.Tests.Perf_String.ToUpper(size: 500, options: MiddleTurkishI, cultureName: ""en-US"")" Duration 9.685 7.27 1.33x
"System.Tests.Perf_String.ToUpper(size: 500, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" Duration 20.475 15.43 1.33x
"System.Tests.Perf_String.ToLower(size: 500, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" Duration 20.164 15.97 1.26x
"System.Tests.Perf_String.ToLower(size: 500, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" Duration 21.567 17.10 1.26x
"System.Tests.Perf_String.ToUpper(size: 500, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" Duration 20.594 16.50 1.25x
"System.Tests.Perf_String.ToLower(size: 500, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" Duration 20.1 17.14 1.17x
"System.Tests.Perf_String.ToLower(size: 10, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 483082.664 482351.68 1.00x
"System.Tests.Perf_String.ToUpper(size: 10, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 483062.328 482354.55 1.00x
"System.Tests.Perf_String.ToUpper(size: 1, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" GC Allocations 321657.152 321233.96 1.00x
"System.Tests.Perf_String.ToLower(size: 1, options: AllDifferentCase, cultureName: ""en-US"")" GC Allocations 321521.112 321103.42 1.00x
"System.Tests.Perf_String.ToUpper(size: 1, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 321812.704 321470.34 1.00x
"System.Tests.Perf_String.ToLower(size: 1, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" GC Allocations 321766.96 321451.08 1.00x
"System.Tests.Perf_String.ToLower(size: 1, options: MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 321402 321180.61 1.00x
"System.Tests.Perf_String.ToUpper(size: 10, options: AllDifferentCase, cultureName: ""en-US"")" GC Allocations 482989.416 482681.72 1.00x
"System.Tests.Perf_String.ToUpper(size: 10, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" GC Allocations 483620.056 483379.00 1.00x
"System.Tests.Perf_String.ToLower(size: 10, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 483833.656 483599.46 1.00x
"System.Tests.Perf_String.ToUpper(size: 10, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 483827.64 483600.13 1.00x
"System.Tests.Perf_String.ToLower(size: 10, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" GC Allocations 483603.688 483377.29 1.00x
"System.Tests.Perf_String.ToLower(size: 10, options: MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 483135.88 482913.70 1.00x
"System.Tests.Perf_String.ToUpper(size: 10, options: MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 483222 483124.38 1.00x
"System.Tests.Perf_String.ToLower(size: 500, options: AllDifferentCase, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToLower(size: 500, options: MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToLower(size: 500, options: MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToLower(size: 500, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToLower(size: 500, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToLower(size: 500, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToUpper(size: 500, options: AllDifferentCase, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToUpper(size: 500, options: MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToUpper(size: 500, options: MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToUpper(size: 500, options: UniqueString, AllDifferentCase, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToUpper(size: 500, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToUpper(size: 500, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 1.10E+07 11000000.00 1.00x
"System.Tests.Perf_String.ToUpper(size: 10, options: MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 482485.296 482489.72 1.00x
"System.Tests.Perf_String.ToLower(size: 10, options: AllDifferentCase, cultureName: ""en-US"")" GC Allocations 482999.344 483005.01 1.00x
"System.Tests.Perf_String.ToLower(size: 1, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 321589.952 321600.74 1.00x
"System.Tests.Perf_String.ToLower(size: 10, options: MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 482488.08 482596.29 1.00x
"System.Tests.Perf_String.ToUpper(size: 1, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 320956.352 321086.74 1.00x
"System.Tests.Perf_String.ToLower(size: 1, options: MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 321201.952 321403.47 1.00x
"System.Tests.Perf_String.ToLower(size: 1, options: UniqueString, MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 320957.872 321182.16 1.00x
"System.Tests.Perf_String.ToUpper(size: 1, options: MiddleDifferentCase, cultureName: ""en-US"")" GC Allocations 321076.624 321396.39 1.00x
"System.Tests.Perf_String.ToUpper(size: 1, options: MiddleTurkishI, cultureName: ""en-US"")" GC Allocations 321188.84 321519.10 1.00x
"System.Tests.Perf_String.ToUpper(size: 1, options: AllDifferentCase, cultureName: ""en-US"")" GC Allocations 321421.744 321846.92 1.00x
"System.Tests.Perf_String.ToUpper(size: 1, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" Duration 0.7 0.71 0.98x
"System.Tests.Perf_String.ToUpper(size: 10, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" Duration 0.811 0.89 0.91x
"System.Tests.Perf_String.ToUpper(size: 10, options: MiddleTurkishI, cultureName: ""en-US"")" Duration 0.728 0.81 0.90x
"System.Tests.Perf_String.ToUpper(size: 1, options: MiddleTurkishI, cultureName: ""en-US"")" Duration 0.612 0.69 0.88x
"System.Tests.Perf_String.ToLower(size: 10, options: MiddleTurkishI, cultureName: ""en-US"")" Duration 0.686 0.80 0.86x
"System.Tests.Perf_String.ToLower(size: 1, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" Duration 0.572 0.67 0.86x
"System.Tests.Perf_String.ToLower(size: 1, options: MiddleTurkishI, cultureName: ""en-US"")" Duration 0.553 0.66 0.84x
"System.Tests.Perf_String.ToLower(size: 10, options: UniqueString, MiddleTurkishI, cultureName: ""en-US"")" Duration 0.766 0.91 0.84x

@danmoseley
Member

This affects char.ToLower/Upper aka TextInfo.ToLower(char) but Perf_Char is almost empty. Do we have enough coverage of the char overloads and if not would you consider adding enough to make sure you didn't regress them?

@stephentoub
Member Author

> Do we have enough coverage of the char overloads and if not would you consider adding enough to make sure you didn't regress them?

Added at dotnet/corefx#28765. The results are the same before/after this PR.

@stephentoub
Member Author

@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test please
@dotnet/dnceng, FYI:
https://ci.dot.net/job/dotnet_coreclr/job/master/job/x64_checked_osx10.12_innerloop_flow_prtest/2534/

FATAL: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from 131.107.58.4/131.107.58.4:63995

@dotnet-bot test Ubuntu arm Cross Checked Innerloop Build and Test please
@dotnet/dnceng, FYI:
https://ci.dot.net/job/dotnet_coreclr/job/master/job/arm_cross_checked_ubuntu_innerloop_flow_prtest/269/

FATAL: command execution failed
java.nio.channels.ClosedChannelException
	at org.jenkinsci.remoting.protocol.impl.ChannelApplicationLayer.onReadClosed(ChannelApplicationLayer.java:208)

@dleeapho
dleeapho commented Apr 3, 2018

@danmoseley danmoseley requested a review from tarekgh April 3, 2018 15:49
for (sourcePos = 0; sourcePos < source.Length; sourcePos++)
{
// If the character is lower-case, we're going to need to allocate a string.
char c = source[sourcePos];
Member

char c = source[sourcePos];

Would it be better to use a pointer to the string instead of indexing the string directly here? I think that would avoid the bounds checking when accessing the indexed characters.

Member Author

The JIT should be able to avoid the bounds checking here, as it can see that sourcePos is always within the bounds of the string based on the loop construction. I'd tried using pointers, but it didn't help with throughput and it made the code more unsafe, so I reverted it.
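The loop shape this reply refers to looks roughly like the sketch below: the index starts at 0 and is compared against `source.Length` in the loop condition, so every `source[i]` access is provably in range. `BoundsCheckSketch` and `IndexOfFirstAsciiUpper` are hypothetical names, and whether the check is actually elided depends on the JIT version in use.

```csharp
public static class BoundsCheckSketch
{
    // Hypothetical helper: returns the index of the first ASCII upper-case
    // letter in the string, or -1 if there is none.
    public static int IndexOfFirstAsciiUpper(string source)
    {
        // i runs from 0 up to (but not including) source.Length, which is the
        // pattern a JIT can use to prove the indexer access is in bounds.
        for (int i = 0; i < source.Length; i++)
        {
            char c = source[i];
            if (c >= 'A' && c <= 'Z')
            {
                return i;
            }
        }
        return -1;
    }
}
```

No `unsafe` code or pinning is needed for this form, which matches the stated reason for reverting the pointer-based version.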

@tarekgh
Member

tarekgh commented Apr 3, 2018

Thanks, @stephentoub, for optimizing this. I think the gain will be very valuable.

}

int ret;
Debug.Assert(pSourceLen == pResultLen);

Should we add debug asserts that check pSource, pResult is not null?

Debug.Assert(pSourceLen == pResultLen); => Debug.Assert(pSourceLen <= pResultLen); ?

Member Author

> Should we add debug asserts that check pSource, pResult is not null?

Sure

> Debug.Assert(pSourceLen == pResultLen); => Debug.Assert(pSourceLen <= pResultLen); ?

Why? Is there a situation where it'll be < rather than ==?

@ahsonkhan ahsonkhan Apr 3, 2018

This method is called for both strings and spans, correct?

The spans passed in to this method can come from the user who passed in a larger destination:
https://github.com/dotnet/coreclr/blob/master/src/mscorlib/shared/System/MemoryExtensions.Fast.cs#L207

At the call site, the source.Length - length could be <= destination.Length - length
ChangeCase(a, source.Length - length, b, destination.Length - length, toUpper);
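The situation described here, where a caller supplies a destination span longer than the source, can be illustrated with the sketch below. `SpanCaseSketch` and `ToUpperAscii` are hypothetical stand-ins rather than the actual `MemoryExtensions` or `TextInfo` code, and the sketch handles ASCII letters only.

```csharp
using System;

public static class SpanCaseSketch
{
    // Hypothetical stand-in for a span-based case conversion entry point.
    // Returns the number of chars written, or -1 if destination is too small.
    public static int ToUpperAscii(ReadOnlySpan<char> source, Span<char> destination)
    {
        // The destination only has to be at least as long as the source,
        // so the valid invariant is source.Length <= destination.Length.
        if (destination.Length < source.Length)
        {
            return -1;
        }

        for (int i = 0; i < source.Length; i++)
        {
            char c = source[i];
            destination[i] = (c >= 'a' && c <= 'z') ? (char)(c - 0x20) : c;
        }
        return source.Length;
    }
}
```

Calling this with a three-character source and an eight-character destination is legitimate, so an assert requiring the two lengths to be equal would fire on valid input, while `<=` would not.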

Member Author

Ah, you're right. Will fix.

return result;
}

internal unsafe void ChangeCase(ReadOnlySpan<char> source, Span<char> destination, bool toUpper)

Why is this method internal instead of private?

Member Author

Presumably because it's used from Span? Let's ask the person who added it:
1494667#diff-4724ac68e6d26f4724534fcbc2cf5374R68
😉

> Presumably because it's used from Span? Let's ask the person who added it:

Right, of course. I talked to him offline, he confirms that was the reason :)

Member Author

😄

@stephentoub stephentoub merged commit d1f49cc into dotnet:master Apr 4, 2018
@stephentoub stephentoub deleted the textinfochangecase branch April 4, 2018 10:54
dotnet-bot pushed a commit to dotnet/corert that referenced this pull request Apr 4, 2018

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
jkotas pushed a commit to dotnet/corert that referenced this pull request Apr 5, 2018

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
dotnet-bot pushed a commit to dotnet/corefx that referenced this pull request Apr 9, 2018

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>
Anipik pushed a commit to dotnet/corefx that referenced this pull request Apr 9, 2018

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>

5 participants