UtilityAnalyzer: Move to a syntax based classification of identifiers in the token type utility analyzer #7369

martin-strecker-sonarsource · 2023-06-12T12:08:03Z

The token type analyzer calls TokenClassifierBase.ClassifyIdentifier for each identifier.

sonar-dotnet/analyzers/src/SonarAnalyzer.Common/Rules/Utilities/TokenTypeAnalyzerBase.cs

Lines 140 to 154 in 2728acd

    
private TokenInfo ClassifyIdentifier(SyntaxToken token)
{
    if (semanticModel.GetDeclaredSymbol(token.Parent) is { } declaration)
    {
        return ClassifyIdentifier(token, declaration);
    }
    else if (GetBindableParent(token) is { } parent && semanticModel.GetSymbolInfo(parent).Symbol is { } symbol)
    {
        return ClassifyIdentifier(token, symbol);
    }
    else
    {
        return null;
    }
}

This method calls semanticModel.GetDeclaredSymbol(token.Parent) for each identifier token and, if that yields no declaration, semanticModel.GetSymbolInfo on the token's bindable parent. As a result, an ISymbol is created, and a mapping from SyntaxNode to ISymbol is added to the semantic model. This puts a lot of pressure on any shared semantic model, because the ISymbol and the mapping need to be cached by the semantic model in a thread-safe manner. The sample below shows how many identifiers are present in even a simple file; the trailing comments count the identifiers on each line.

using System;                     // +1
using System.Collections.Generic; // +3

namespace A.B.C;                  // +3

public class D                    // +1
{
    public D()                    // +1
    {
    }
    public void M()               // +1
    {
        List<D> myList;           // +3
    }
}
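The count in the comments can be checked with the Roslyn syntax API alone, without a semantic model. The following is a minimal sketch, assuming the Microsoft.CodeAnalysis.CSharp package (version 4.x) is referenced; `IdentifierCounter` is a hypothetical helper for illustration and not part of sonar-dotnet.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// The sample from the issue, without the counting comments.
const string source = @"
using System;
using System.Collections.Generic;

namespace A.B.C;

public class D
{
    public D()
    {
    }
    public void M()
    {
        List<D> myList;
    }
}";

// Prints the number of identifier tokens; for the sample this is 13,
// the sum of the per-line counts in the comments above.
Console.WriteLine(IdentifierCounter.Count(source));

// Hypothetical helper (an assumption of this sketch, not sonar-dotnet code).
static class IdentifierCounter
{
    // Parses the text and counts identifier tokens. Keywords such as
    // 'public', 'class' and 'void' have their own token kinds and are skipped.
    public static int Count(string text) =>
        CSharpSyntaxTree.ParseText(text)
            .GetRoot()
            .DescendantTokens()
            .Count(token => token.IsKind(SyntaxKind.IdentifierToken));
}
```

Each of these tokens currently triggers at least one semantic model query in ClassifyIdentifier.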

TokenClassifierBase.ClassifyIdentifier can only have two outcomes:

The identifier is considered TokenType.TypeName (with some special casing for types that are classified as keywords).
The identifier is considered to be unknown.

This classification can often be done on a syntactical level. In the sample above, all identifiers can be classified without querying the semantic model. That saves 20 calls to the semantic model (13 identifiers, 6 of which are declarations: 13 GetDeclaredSymbol calls plus 7 GetSymbolInfo calls for the identifiers that are not declarations) and the allocation of 11 symbols.
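A syntax-only classifier could look roughly like the sketch below. It is an illustration under stated assumptions, not the implementation from #7775 or #7788: `SyntaxVerdict` and `SyntaxClassifier` are hypothetical names, the rules cover only the constructs in the sample, and anything the sketch cannot decide is reported as needing the semantic model. It assumes Microsoft.CodeAnalysis.CSharp 4.x (for `BaseNamespaceDeclarationSyntax`).

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

const string source = @"
using System;
using System.Collections.Generic;

namespace A.B.C;

public class D
{
    public D()
    {
    }
    public void M()
    {
        List<D> myList;
    }
}";

var root = CSharpSyntaxTree.ParseText(source).GetRoot();
foreach (var token in root.DescendantTokens().Where(t => t.IsKind(SyntaxKind.IdentifierToken)))
{
    Console.WriteLine($"{token.ValueText,-12} {SyntaxClassifier.Classify(token)}");
}

// Hypothetical result type: NotTypeName corresponds to the "unknown" outcome.
enum SyntaxVerdict { TypeName, NotTypeName, NeedsSemanticModel }

// Hypothetical classifier that looks only at the syntax tree.
static class SyntaxClassifier
{
    public static SyntaxVerdict Classify(SyntaxToken token) => token.Parent switch
    {
        BaseTypeDeclarationSyntax => SyntaxVerdict.TypeName,   // class D
        MethodDeclarationSyntax => SyntaxVerdict.NotTypeName,  // M
        VariableDeclaratorSyntax => SyntaxVerdict.NotTypeName, // myList
        SimpleNameSyntax name => ClassifyName(name),           // System, List, D, ...
        _ => SyntaxVerdict.NeedsSemanticModel,                 // e.g. the constructor name
    };

    private static SyntaxVerdict ClassifyName(SimpleNameSyntax name)
    {
        // Walk up through qualified names such as A.B.C to the outermost name node.
        SyntaxNode outermost = name;
        while (outermost.Parent is QualifiedNameSyntax qualified)
        {
            outermost = qualified;
        }
        var isUnqualified = outermost == name;

        return outermost.Parent switch
        {
            // Every part of a namespace declaration's name is a namespace.
            BaseNamespaceDeclarationSyntax => SyntaxVerdict.NotTypeName,
            // A plain using directive (no alias, no 'static') names a namespace.
            UsingDirectiveSyntax { Alias: null } u when u.StaticKeyword.IsKind(SyntaxKind.None)
                => SyntaxVerdict.NotTypeName,
            // Type arguments are always types.
            TypeArgumentListSyntax when isUnqualified => SyntaxVerdict.TypeName,
            // The type of a variable declaration, unless it is a contextual keyword.
            VariableDeclarationSyntax when isUnqualified
                && name.Identifier.ValueText is not ("var" or "dynamic")
                => SyntaxVerdict.TypeName,
            _ => SyntaxVerdict.NeedsSemanticModel,
        };
    }
}
```

On the sample, this sketch should settle every identifier except the constructor name, which it deliberately leaves open because the issue text does not state how constructor names are classified. A real analyzer would need rules for many more constructs and would fall back to the semantic model only for the remaining cases.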

A proper classification on the syntax level requires the test infrastructure to be extended and made more powerful. #7289 describes how to do it, and #7108 implements this infrastructure. This issue is therefore blocked by #7289.

Related:
#4217
#7288
#6674


martin-strecker-sonarsource · 2023-08-16T16:40:52Z

Closed as fixed by #7788 and #7775

martin-strecker-sonarsource added the Type: Performance It takes too long. label Jun 12, 2023

martin-strecker-sonarsource added the Sprint: UtilityAnalyzer label Aug 4, 2023

martin-strecker-sonarsource added this to the 9.8 milestone Aug 4, 2023

This was referenced Aug 7, 2023

TokenTypeAnalyzer: Create a DSL for token type analyzer #7726

Merged

TokenType: Add test cases #7747

Merged

TokenType: Move to Syntax based classification #7775

Merged

martin-strecker-sonarsource self-assigned this Aug 11, 2023

martin-strecker-sonarsource mentioned this issue Aug 14, 2023

TokenType: Move to Syntax based classification (Type part) #7788

Merged

martin-strecker-sonarsource closed this as completed Aug 16, 2023

This was referenced Aug 16, 2023

UtilityAnalyzer: Optimize hot-path of TokenType anaylzer #7805

Closed

UtilityAnalyzer: Optimize SimpleMemberAccessExpression handling for the TokenTypeAnalyzer #7806

Closed

Tim-Pohlmann modified the milestone: 9.8 Aug 21, 2023

This was referenced Sep 23, 2023

TokenType: Tests and optimization for pointer types #8060

Merged

TokenType: Tests and support for scoped type declarations #8061

Merged

martin-strecker-sonarsource mentioned this issue Nov 17, 2023

Reduce the number of symbols retrieved by TokenTypeAnalyzer #5559

Closed

martin-strecker-sonarsource mentioned this issue Nov 27, 2023

Improve TokenTypeAnalyzerBase performance #6975

Closed
