-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements to classification perf with large string literals. #72217
Improvements to classification perf with large string literals. #72217
Conversation
ClassifyToken(token); | ||
ProcessTriviaList(token.TrailingTrivia); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we were also previous walking down into trailing trivia (which never has structure). we were also walking into all structure, even though we don't have any embedded-lang classification in structured-trivia except for the case of string literals
in directives. so this means no more walking into things like doc comments, which is nice.
...Core/Portable/EmbeddedLanguages/Classification/AbstractFallbackEmbeddedLanguageClassifier.cs
Outdated
Show resolved
Hide resolved
…stractFallbackEmbeddedLanguageClassifier.cs
var subTextSpan = service.GetMemberBodySpanForSpeculativeBinding(member); | ||
if (subTextSpan.IsEmpty) | ||
var memberBodySpan = service.GetMemberBodySpanForSpeculativeBinding(member); | ||
if (memberBodySpan.IsEmpty) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
view with whitespace off.
=> _result.Add(new ClassifiedSpan(classificationType, span)); | ||
// Ignore characters that don't intersect with the requested span. That avoids potentially adding lots of | ||
// classifications for portions of a large string that are out of view. | ||
if (span.IntersectsWith(_spanToClassify)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: in the case the user provided, while the string contains json, it's not a JsonTree. All we ahve are the embedded classifications for escapes like ""
in the middle of the string. But there are literally thousands of these classifications. So this takes us down from many thousands of classifications to a few dozen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Followup to #72216.
Takes classification cost down from:
to
Basically, we were producign lots of tags in the case of some very large strings. This ended up being costly with later processing. This change fixes us up to produce far fewer tags (gated to just what is in view).
--
There is still further optimizations we can do. Specifically, for regex/json classification, we can limit how we walk those language subtrees to avoid even classifying portions of the embedded language tree that is not in view.