Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@stephentoub
Copy link
Member

cc: @danmosemsft, @tarekgh

bool differentCase = (options & ChangeCaseOptions.AllDifferentCase) != 0;
chars.Fill(upper != differentCase ? 'S' : 's');

if ((options & ChangeCaseOptions.MiddleTurkishI) != 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the impl in front of me but it cuts over to a slow path when it hits a non ASCII character right? Why cut over in the middle? If you're testing perf perhaps you want to have entirely slow case and entirely fast case to not mix what you test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted the perf test to assume as little as possible about the implementation, e.g. whether it goes front to back or back to front. Figured I'd just change the middle then.

Don't have a strong opinion, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you expect us to change the implementation, I think it's fine for the perf test to attempt to highlight different aspects. But I don't care either.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I'm going to leave it for now. The other reason for this is that I'm already bumping up against how many tests seem reasonable for this (i.e. don't take too much time), and if we wanted to have the extremes, we'd want tests for when the non-ASCII char occurs at the beginning and at the end.

As an aside, it'd be great if the perf test harness supported a filter like /p:XunitOptions="-method ...". That appears to be completely ignored, and I don't see any other way (other than deleting all the other tests in the project temporarily) to only run a subset of tests.


public static IEnumerable<object[]> ChangeCaseMemberData() =>
from size in new[] { 1, 10, 500 }
from culture in new[] { "en-US" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why parameterize culture but only ever use en-US? Is that the only culture we can rely on being installed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mainly because I had other cultures in there, including tr-TR and invariant, and decided it was too many perf tests to check-in, but wanted to make it easy to still add them in locally.

@stephentoub stephentoub merged commit fd239bc into dotnet:master Apr 3, 2018
@stephentoub stephentoub deleted the stringperftests branch April 3, 2018 11:02
from options in new[]
{
ChangeCaseOptions.None,
ChangeCaseOptions.MiddleTurkishI,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MiddleTurkishI [](start = 34, length = 14)

I am not getting this, why we have this option while not using the Turkish culture? it would make sense if you add the Turkish culture there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because if we encounter a non-ASCII character, it knocks us off the ASCII path, regardless of culture.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am seeing you are creating the whole string as ascii. right? maybe I am missing something here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you saying you are trying to test the fast path even when having Turkish I?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am seeing you are creating the whole string as ascii. right? maybe I am missing something here.

When this option is set, it's creating the whole string as ASCII except for the middle character, which ends up being set to a non-ASCII character (\u0131). Thus when the code gets to that character, it'll be knocked off the ASCII path and will be forced to fall back to doing the OS-based culture-aware comparison.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand this, but still this has nothing to do with Turkish. this is my point. it would be better either to add Turkish culture to test this Turkish I, or you change the name of this enum value to be something like NonAscii.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand this, but still this has nothing to do with Turkish. this is my point. it would be better either to add Turkish culture to test this Turkish I, or you change the name of this enum value to be something like NonAscii.

As long as I was choosing a non-ASCII character, I chose one that made it easy to then locally add "tr-TR" as another culture and see what, if any, impact there was.

I don't feel strongly about it, though. This has already been merged; you're welcome to submit a PR to change the name of the enum value if you want.

new KeyValuePair<char, char>('\u0131', '\u0049'));

[Theory]
[InlineData("H\u0069 World", "H\u0049 WORLD")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"H\u0049 WORLD") [](start = 37, length = 16)

I would prefer having a string with Turkish I in the middle and not only the Turkish characters. This would be a real scenario rather than testing just one character.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer having a string with Turkish I in the middle and not only the Turkish characters.

I don't understand. What string do you want to see? These flavors of strings already existed in the tests; I'm not adding them in from scratch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never mind, I didn't notice what you are doing in ToUpper_TurkishI_MemberData

I thought you were constructing the whole string from only Turkish I's but I was wrong.


In reply to: 178873636 [](ancestors = 178873636)

@karelz karelz modified the milestones: 2.2.0, 2.1.0 Apr 4, 2018
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…orefx#28748)

* Add more functional tests for string.ToLower/Upper{Invariant}

* Add more string.ToLower/Upper perf tests


Commit migrated from dotnet/corefx@fd239bc
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants