Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CA1862 suggestion changing semantics of the comparison. #89740

Closed
buyaa-n opened this issue Jul 31, 2023 · 8 comments
Closed

CA1862 suggestion changing semantics of the comparison. #89740

buyaa-n opened this issue Jul 31, 2023 · 8 comments
Assignees
Labels
area-System.Globalization bug code-analyzer Marks an issue that suggests a Roslyn analyzer
Milestone

Comments

@buyaa-n
Copy link
Contributor

buyaa-n commented Jul 31, 2023

CA1862: Prefer using 'StringComparer'/'StringComparison' to perform case-insensitive string comparisons suggestion changing semantics of the comparison

Warning: Prefer using 'string.Equals(string, StringComparison)' to perform a case-insensitive comparison

using System;

// These are both lower case Greek Sigmas
string a = "σ";
string b = "ς";

// Prints True
Console.WriteLine(a.Equals(b, StringComparison.InvariantCultureIgnoreCase));

string lowerA = a.ToLowerInvariant();
string lowerB = b.ToLowerInvariant();

// Prints False
Console.WriteLine(lowerA == lowerB);

Originally posted by @MichalStrehovsky in #89630 (comment):
This is changing semantics of the comparison. Do we really have an analyzer that suggests people blindly do this?

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jul 31, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jul 31, 2023
@ghost
Copy link

ghost commented Jul 31, 2023

Tagging subscribers to this area: @dotnet/area-system-globalization
See info in area-owners.md if you want to be subscribed.

Issue Details
          This is changing semantics of the comparison. Do we really have an analyzer that suggests people blindly do this?
using System;

// These are both lower case Greek Sigmas
string a = "σ";
string b = "ς";

// Prints True
Console.WriteLine(a.Equals(b, StringComparison.InvariantCultureIgnoreCase));

string lowerA = a.ToLowerInvariant();
string lowerB = b.ToLowerInvariant();

// Prints False
Console.WriteLine(lowerA == lowerB);

Originally posted by @MichalStrehovsky in #89630 (comment)

Author: buyaa-n
Assignees: -
Labels:

area-System.Globalization, untriaged, needs-area-label

Milestone: -

@buyaa-n buyaa-n added bug code-analyzer Marks an issue that suggests a Roslyn analyzer and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jul 31, 2023
@buyaa-n
Copy link
Contributor Author

buyaa-n commented Jul 31, 2023

Created the issue in runtime in order to discuss the appropriate fix for this issue @carlossanlop @tarekgh PTAL

@buyaa-n buyaa-n added this to the 8.0.0 milestone Jul 31, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Jul 31, 2023
@tarekgh
Copy link
Member

tarekgh commented Jul 31, 2023

Using σ and ς is very-special case in the Unicode. If anyone is doing ToLower for comparisons in general will already be broken. For example, encountering Σ and using ToLower, will hold Σ == σ and Σ != ς. which would be wrong result. Actually, in general we recommend users to use ToUpper for compare casing because of such issues. Although, it is correct that Equals(b, StringComparison.InvariantCultureIgnoreCase) can have semantic change in this case, but it is better to have the analyzer flag this use cases of the ToLower anyway.

I prefer to keep the analyzer and we add something to the analyzer message clarify exceptional cases can hold different results. At least get users to think about it.

@MichalStrehovsky
Copy link
Member

Using σ and ς is very-special case in the Unicode. If anyone is doing ToLower for comparisons in general will already be broken. For example, encountering Σ and using ToLower, will hold Σ == σ and Σ != ς. which would be wrong result.

Yep, this is true - but for the code that I pointed out, this a behavior of the CoreCLR implementation of reflection (that is written in C++) that NativeAOT reflection (written in C#) is simply matching. "Fixing" it on NativeAOT would introduce a behavioral difference between the reflection stacks. Whenever this analyzer kicks in, people need to be aware it can change behavior. It is very subtle.

@tarekgh
Copy link
Member

tarekgh commented Jul 31, 2023

Whenever this analyzer kicks in, people need to be aware it can change behavior. It is very subtle.

That is what I suggested to address that we add something to the analyzer message clarify exceptional cases can hold different results. At least get users to think about it.

@Bellarmine-Head
Copy link

Bellarmine-Head commented Aug 1, 2023

Prefer using 'string.Equals(string, StringComparison)' to perform a case-insensitive comparison

It is to be preferred to use string.Equals(string, StringComparison) but the statement on its own doesn't go far enough.

What I have seen in related issues, e.g.

is a literal or rote translation of existing code to the equivalent StringComparison options. What you end up with is a StringComparison option with the word "Culture" in it... e.g. StringComparison.CurrentCultureIgnoreCase.

But programmers have an extra choice when using StringComparison (and StringComparer) wrt case-insensitivity... and that choice or option is OrdinalIgnoreCase.

The problem with an issue like #78606 is that the word "ordinal" doesn't appear once.

So we need to get OrdinalIgnoreCase on to people's radars, and this option needs to be taken into account by the analyzers that concern themselves with case-insensitive string comparisons.

If the programmer is comparing strings case-insensitively because the strings are identifiers, then OrdinalIgnoreCase is the only correct choice. Any option with the word "Culture" in it is wrong. This is in fact standard, established (and correct) Microsoft advice, as seen here: https://learn.microsoft.com/en-us/dotnet/standard/base-types/best-practices-strings#choosing-a-stringcomparison-member-for-your-method-call

For all the following types of strings, OrdinalIgnoreCase is what you must be using:-

  • Case-insensitive internal identifiers.
  • Case-insensitive identifiers in standards such as XML and HTTP.
  • File paths.
  • Registry keys and values.
  • Environment variables.
  • Resource identifiers (for example, handle names).
  • Case-insensitive security-related settings.

So when would you use a StringComparison option with the word "Culture" in it? For these:-

  • Some persisted, linguistically relevant data.
  • Display of linguistic data that requires a fixed sort order.
  • Data displayed to the user.
  • Most user input.

From experience (C# and .NET since 2002) I can say that comparing "linguistically relevant" strings case-insensitively is something I've never done, whereas comparing identifiers (etc.) using OrdinalIgnoreCase is something I do twenty times a day every day. It's the same for any/all code by others that I've reviewed.

The trouble with the table that seems to be at the heart of these new or revised analyzers, is that it the right-hand column is only correct if the string being operated on is cultural / linguistic. This is a dangerous assumption to make, I feel.

The analyzers in this space should be weighted towards recommending OrdinalIgnoreCase for all case-insensitive string comparisons, but may wish to mention the cultural string comparison options for the far less likely case that the string is cultural / linguistic.

@ericstj
Copy link
Member

ericstj commented Aug 15, 2023

@carlossanlop were you planning to fix this for 8.0?

@carlossanlop
Copy link
Member

Fixed by dotnet/roslyn-analyzers#6948

@ghost ghost locked as resolved and limited conversation to collaborators Oct 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Globalization bug code-analyzer Marks an issue that suggests a Roslyn analyzer
Projects
None yet
Development

No branches or pull requests

6 participants