Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ordinal string comparison in string manipulation methods #4912

Merged
merged 5 commits into from
Jun 12, 2018

Conversation

auduchinok
Copy link
Member

Use overloads with StringComparison parameter where CurrentCulture would be used instead.

@auduchinok auduchinok force-pushed the stringComparison-ordinal branch from 0d334fc to 3b277e6 Compare May 15, 2018 11:20
@auduchinok auduchinok force-pushed the stringComparison-ordinal branch from 3b277e6 to 568acd3 Compare May 15, 2018 11:24
@auduchinok
Copy link
Member Author

@dotnet-bot test Windows_NT Release_fcs Build please

Copy link
Contributor

@dsyme dsyme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just requesting a change to use StartsWithOrdinal etc. extension methods, thanks :)

@@ -556,7 +556,7 @@ module String =
while not (isNull !line) do
yield !line
line := reader.ReadLine()
if str.EndsWith("\n") then
if str.EndsWith("\n", StringComparison.Ordinal) then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add StartsWithOrdinal/EndsWithOrdinal extension methods to illib.fs? Then it's easy to search and remove uses of StartsWith

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

String.dropPrefix r.Name FSharpOptimizationDataResourceName
elif r.Name.StartsWith FSharpOptimizationDataResourceName2 then
elif r.Name.StartsWith(FSharpOptimizationDataResourceName2, StringComparison.Ordinal) then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know what comparison resource and assembly names should use. My understanding is that it doesn't actually make a difference if the thing being compared with is ASCII

@KevinRansom
Copy link
Member

@dsyme

assembly names are case insensitive, however guidance is that they should be treated as though they were case sensitive. (Which is pretty scary to me.)

So StringComparison.Ordinal conforms to best practice for assembly names.

https://docs.microsoft.com/en-us/dotnet/framework/app-domains/assembly-names

Note
The runtime treats assembly names as case-insensitive when binding to an assembly, but preserves whatever case is used in an assembly name. Several tools in the Windows Software Development Kit (SDK) handle assembly names as case-sensitive. For best results, manage assembly names as though they were case-sensitive. 

ResourceManager is configurable:

if ignore case is false, then does Culture sensitiveComparison, if ignore case is true does InvariantCulture lookup.

https://docs.microsoft.com/en-us/dotnet/api/system.resources.resourcemanager.ignorecase?view=netframework-4.7.1#System_Resources_ResourceManager_IgnoreCase

Remarks
If the value of the IgnoreCase property is false, a resource with the name "Resource" is not equivalent to the resource with the name "resource". If IgnoreCase is true, a resource with the name "Resource" is equivalent to the resource with the name "resource". Note, however, that when IgnoreCase is true, the ResourceManager.GetString and ResourceManager.GetObject methods perform case-insensitive string comparisons by using the invariant culture. The advantage is that results of case-insensitive string comparisons performed by these methods are the same on all computers regardless of culture. The disadvantage is that the results are not consistent with the casing rules of all cultures. 
For example, the Turkish alphabet has two versions of the character I: one with a dot and one without a dot. In Turkish, the character I (Unicode 0049) is considered the uppercase version of a different character ı (Unicode 0131). The character i (Unicode 0069) is considered the lowercase version of yet another character İ (Unicode 0130). According to these casing rules, a case-insensitive string comparison of the characters i (Unicode 0069) and I (Unicode 0049) should fail for the culture "tr-TR" (Turkish in Turkey). However, because the comparison is conducted by using the casing rules of the invariant culture if IgnoreCase is true, this comparison succeeds. 

@KevinRansom
Copy link
Member

If the guidance in that page for resources is correct, then Culture Sensitive is the way to go for perf and working set.

@auduchinok auduchinok force-pushed the stringComparison-ordinal branch from 9d1a26b to 21567ce Compare May 15, 2018 18:40
@auduchinok auduchinok force-pushed the stringComparison-ordinal branch from 21567ce to c751112 Compare May 15, 2018 18:42
@KevinRansom
Copy link
Member

What I mean is, if the look up is going to be CultureSensitive in the resource manager, which it is, I don't think we set ignoreCase. I don't know whether A StringComparison.Ordinal check elsewhere can introduce bugs, my intuition is that it won't … but I truly don't know.

Copy link
Member

@KevinRansom KevinRansom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good.

@auduchinok
Copy link
Member Author

@dotnet-bot test Windows_NT Release_ci_part2 Build please.

@abelbraaksma
Copy link
Contributor

@KevinRansom I think you meant CultureInsensitive? Because that's the only way resources are read the same way on all systems. And i think that's what the guidance you referenced suggests there.

Changing that default may lead to builds failing on one system and succeeding on another (for instance, my build servers are in Germany and I develop in The Netherlands, I use resources a lot, marking the default Sensitive would possibly introduce hard to diagnose errors)

@KevinRansom KevinRansom merged commit ab48458 into dotnet:master Jun 12, 2018
@auduchinok auduchinok deleted the stringComparison-ordinal branch December 16, 2023 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants