-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Add more functional and perf tests for string.ToLower/Upper #28748
Conversation
| bool differentCase = (options & ChangeCaseOptions.AllDifferentCase) != 0; | ||
| chars.Fill(upper != differentCase ? 'S' : 's'); | ||
|
|
||
| if ((options & ChangeCaseOptions.MiddleTurkishI) != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have the impl in front of me but it cuts over to a slow path when it hits a non ASCII character right? Why cut over in the middle? If you're testing perf perhaps you want to have entirely slow case and entirely fast case to not mix what you test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted the perf test to assume as little as possible about the implementation, e.g. whether it goes front to back or back to front. Figured I'd just change the middle then.
Don't have a strong opinion, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you expect us to change the implementation, I think it's fine for the perf test to attempt to highlight different aspects. But I don't care either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I'm going to leave it for now. The other reason for this is that I'm already bumping up against how many tests seem reasonable for this (i.e. don't take too much time), and if we wanted to have the extremes, we'd want tests for when the non-ASCII char occurs at the beginning and at the end.
As an aside, it'd be great if the perf test harness supported a filter like /p:XunitOptions="-method ...". That appears to be completely ignored, and I don't see any other way (other than deleting all the other tests in the project temporarily) to only run a subset of tests.
|
|
||
| public static IEnumerable<object[]> ChangeCaseMemberData() => | ||
| from size in new[] { 1, 10, 500 } | ||
| from culture in new[] { "en-US" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why parameterize culture but only ever use en-US? Is that the only culture we can rely on being installed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mainly because I had other cultures in there, including tr-TR and invariant, and decided it was too many perf tests to check-in, but wanted to make it easy to still add them in locally.
| from options in new[] | ||
| { | ||
| ChangeCaseOptions.None, | ||
| ChangeCaseOptions.MiddleTurkishI, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MiddleTurkishI [](start = 34, length = 14)
I am not getting this, why we have this option while not using the Turkish culture? it would make sense if you add the Turkish culture there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because if we encounter a non-ASCII character, it knocks us off the ASCII path, regardless of culture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am seeing you are creating the whole string as ascii. right? maybe I am missing something here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you saying you are trying to test the fast path even when having Turkish I?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am seeing you are creating the whole string as ascii. right? maybe I am missing something here.
When this option is set, it's creating the whole string as ASCII except for the middle character, which ends up being set to a non-ASCII character (\u0131). Thus when the code gets to that character, it'll be knocked off the ASCII path and will be forced to fall back to doing the OS-based culture-aware comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I understand this, but still this has nothing to do with Turkish. this is my point. it would be better either to add Turkish culture to test this Turkish I, or you change the name of this enum value to be something like NonAscii.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I understand this, but still this has nothing to do with Turkish. this is my point. it would be better either to add Turkish culture to test this Turkish I, or you change the name of this enum value to be something like NonAscii.
As long as I was choosing a non-ASCII character, I chose one that made it easy to then locally add "tr-TR" as another culture and see what, if any, impact there was.
I don't feel strongly about it, though. This has already been merged; you're welcome to submit a PR to change the name of the enum value if you want.
| new KeyValuePair<char, char>('\u0131', '\u0049')); | ||
|
|
||
| [Theory] | ||
| [InlineData("H\u0069 World", "H\u0049 WORLD")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"H\u0049 WORLD") [](start = 37, length = 16)
I would prefer having a string with Turkish I in the middle and not only the Turkish characters. This would be a real scenario rather than testing just one character.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer having a string with Turkish I in the middle and not only the Turkish characters.
I don't understand. What string do you want to see? These flavors of strings already existed in the tests; I'm not adding them in from scratch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never mind, I didn't notice what you are doing in ToUpper_TurkishI_MemberData
I thought you were constructing the whole string from only Turkish I's but I was wrong.
In reply to: 178873636 [](ancestors = 178873636)
…orefx#28748) * Add more functional tests for string.ToLower/Upper{Invariant} * Add more string.ToLower/Upper perf tests Commit migrated from dotnet/corefx@fd239bc
cc: @danmosemsft, @tarekgh