Improvements to classification perf with large string literals. #72217

CyrusNajmabadi · 2024-02-21T22:58:12Z

Followup to #72216.

Takes classification cost down from:

[before image not preserved]

to

[after image not preserved]

Basically, we were producing lots of tags for some very large strings, and processing all of those tags later was costly. This change produces far fewer tags (gated to just what is in view).

--

There are still further optimizations we can do. Specifically, for regex/json classification, we can limit how we walk those language subtrees to avoid classifying portions of the embedded language tree that are not in view.

CyrusNajmabadi · 2024-02-21T22:59:45Z

...e/Portable/EmbeddedLanguages/Classification/AbstractEmbeddedLanguageClassificationService.cs

                ClassifyToken(token);
-                ProcessTriviaList(token.TrailingTrivia);


we were also previously walking down into trailing trivia (which never has structure). we were also walking into all structured trivia, even though we don't have any embedded-lang classification in structured trivia except for string literals in directives. so this means no more walking into things like doc comments, which is nice.
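A rough sketch of the pruned walk described above, for illustration only. `Trivia`, `Token`, `Node`, and `Walker` are hypothetical stand-ins, not Roslyn types, and processing leading trivia before the token is an assumption about ordering. The point is just that trailing trivia is never visited and structured trivia is only entered when it is a directive.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical, simplified syntax model. None of these are Roslyn types.
public sealed class Trivia
{
    public bool IsDirective { get; init; }
    public Node? Structure { get; init; } // null for plain whitespace or comments
}

public sealed class Token
{
    public string Text { get; init; } = "";
    public List<Trivia> LeadingTrivia { get; } = new();
    public List<Trivia> TrailingTrivia { get; } = new();
}

public sealed class Node
{
    public List<object> Children { get; } = new(); // each child is a Node or a Token
}

public static class Walker
{
    // Returns the text of every token the walk actually visits.
    public static List<string> CollectTokens(Node root)
    {
        var visited = new List<string>();
        Visit(root, visited);
        return visited;
    }

    private static void Visit(Node node, List<string> visited)
    {
        foreach (var child in node.Children)
        {
            if (child is Node childNode)
            {
                Visit(childNode, visited);
            }
            else if (child is Token token)
            {
                // Only descend into structured trivia for directives;
                // doc comments and other structure are skipped.
                foreach (var trivia in token.LeadingTrivia)
                {
                    if (trivia.IsDirective && trivia.Structure is not null)
                        Visit(trivia.Structure, visited);
                }

                visited.Add(token.Text);
                // Trailing trivia is intentionally not walked.
            }
        }
    }
}
```

With a token that has one directive and one doc comment in its leading trivia, only the directive's tokens and the token itself end up in the visited list.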

CyrusNajmabadi · 2024-02-21T23:00:20Z

src/EditorFeatures/Core/Classification/Semantic/ClassificationUtilities.cs

-            var subTextSpan = service.GetMemberBodySpanForSpeculativeBinding(member);
-            if (subTextSpan.IsEmpty)
+            var memberBodySpan = service.GetMemberBodySpanForSpeculativeBinding(member);
+            if (memberBodySpan.IsEmpty)


view with whitespace off.

CyrusNajmabadi · 2024-02-21T23:03:06Z

...Features/Core/Portable/EmbeddedLanguages/Classification/EmbeddedLanguageClassifierContext.cs

-            => _result.Add(new ClassifiedSpan(classificationType, span));
+        // Ignore characters that don't intersect with the requested span.  That avoids potentially adding lots of
+        // classifications for portions of a large string that are out of view.
+        if (span.IntersectsWith(_spanToClassify))


note: in the case the user provided, while the string contains json, it's not a JsonTree. All we have are the embedded classifications for escapes like "" in the middle of the string. But there are literally thousands of these classifications. So this takes us down from many thousands of classifications to a few dozen.
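To make the effect concrete, here is a minimal self-contained sketch of the filtering idea. `SimpleSpan`, `ClassifierContextSketch`, and `Demo` are hypothetical names invented for this example, not the actual Roslyn types; the inclusive `IntersectsWith` is assumed to behave like the check in the diff above, and the classification label is just an illustrative string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a text span. Not the Roslyn TextSpan type.
public readonly record struct SimpleSpan(int Start, int Length)
{
    public int End => Start + Length;

    // Inclusive test: spans that merely touch at an endpoint count as intersecting.
    public bool IntersectsWith(SimpleSpan other) => other.Start <= End && other.End >= Start;
}

// Hypothetical context that only records classifications touching the requested span.
public sealed class ClassifierContextSketch
{
    private readonly SimpleSpan _spanToClassify;
    private readonly List<(string Type, SimpleSpan Span)> _result = new();

    public ClassifierContextSketch(SimpleSpan spanToClassify) => _spanToClassify = spanToClassify;

    public IReadOnlyList<(string Type, SimpleSpan Span)> Result => _result;

    public void AddClassification(string classificationType, SimpleSpan span)
    {
        // Drop anything outside the requested span so a huge literal
        // does not flood later stages with out-of-view results.
        if (span.IntersectsWith(_spanToClassify))
            _result.Add((classificationType, span));
    }
}

public static class Demo
{
    // Simulates a 100,000-character literal with a 2-character escape every 10 characters.
    public static int CountKept(SimpleSpan view)
    {
        var context = new ClassifierContextSketch(view);
        for (var position = 0; position < 100_000; position += 10)
            context.AddClassification("string - escape character", new SimpleSpan(position, 2));

        return context.Result.Count;
    }
}
```

In this toy setup 10,000 escapes are reported, and `Demo.CountKept(new SimpleSpan(5000, 200))` keeps only the 21 that touch the 200-character view.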

CyrusNajmabadi · 2024-02-22T00:51:16Z

@sharwell @ToddGrun this is ready for review.

ToddGrun

LGTM

CyrusNajmabadi added 2 commits February 21, 2024 14:37

Do less work

f792eef

Do less work

9dcb066

CyrusNajmabadi requested a review from a team as a code owner February 21, 2024 22:58

Update src/Features/Core/Portable/EmbeddedLanguages/Classification/AbstractFallbackEmbeddedLanguageClassifier.cs

bf04f79

Merge remote-tracking branch 'upstream/main' into classificationPerf2

71b4de6

CyrusNajmabadi requested review from sharwell and ToddGrun February 22, 2024 00:51

ToddGrun approved these changes Feb 22, 2024

CyrusNajmabadi merged commit de2db03 into dotnet:main Feb 22, 2024
27 checks passed

CyrusNajmabadi deleted the classificationPerf2 branch February 22, 2024 21:11

CyrusNajmabadi mentioned this pull request Feb 22, 2024

Quote completion improvements #71898

Closed

jjonescz added a commit to jjonescz/roslyn that referenced this pull request Feb 26, 2024

Revert "Merge pull request dotnet#72217 from CyrusNajmabadi/classificationPerf2" This reverts commit de2db03, reversing changes made to 8f67d64.

ab58402

jjonescz added this to the 17.10 P2 milestone Feb 27, 2024

dotnet-bot mentioned this pull request Feb 29, 2024

[Automated] PRs inserted in VS build main-34628.19 #72320

Closed
