Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce allocations in semantic tokens #9280

Merged
merged 6 commits into from
Sep 14, 2023

Conversation

davidwengier
Copy link
Contributor

Fixes https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1869868

Saw this in one of the slow completion traces. A little unorthodox method here, as our benchmarks weren't well suited for this sort of work. Instead I used the new ability to profile a test, and just added one test that produced a lot of tokens.

Results as reported by the VS profiler:
image

@@ -175,7 +183,7 @@ private static List<SemanticRange> CombineSemanticRanges(List<SemanticRange> ran
var tokenModifiers = csharpResponse[i + 4];

var semanticRange = CSharpDataToSemanticRange(lineDelta, charDelta, length, tokenType, tokenModifiers, previousSemanticRange);
if (_documentMappingService.TryMapToHostDocumentRange(generatedDocument, semanticRange.Range, out var originalRange))
if (_documentMappingService.TryMapToHostDocumentRange(generatedDocument, CopyValues(semanticRange, rangePool), out var originalRange))
Copy link
Contributor Author

@davidwengier davidwengier Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new API here that returned something other than Range would further reduce allocations, but seemed like a pretty big change for this late in the process. In future ideally we'll be able to move to auto-generated protocol types that aren't classes anyway, which will solve this too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked at this a bit more and its definitely a big change, just with how much churn it is in tests that use Moq 😁

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it back, wasn't too bad :)
#9285

@@ -40,11 +40,13 @@ public static List<SemanticRange> VisitAllNodes(RazorCodeDocument razorCodeDocum
rangeAsTextSpan = range.AsRazorTextSpan(sourceText);
}

var visitor = new TagHelperSemanticRangeVisitor(razorCodeDocument, rangeAsTextSpan, razorSemanticTokensLegend, colorCodeBackground);
using var _ = ArrayBuilderPool<SemanticRange>.GetPooledObject(out var builder);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The resizing of this array shows up in the profile as being a slight problem, but with only one test run being profiled there is no pooling benefit, so hopefully in the real world it goes away.

var lineCount = sourceText.Lines.Count;
if (line > lineCount ||
(line == lineCount && character > 0))
if (!sourceText.TryGetAbsolutePosition(line, character, out var absolutePosition))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic just moved to SourceTextExtensions, which is a better place anyway, as we can use it in more places to possibly solve some other PRISM bugs for out of range issues, since it's the only thing that currently does the correct "allow line to be count + 1" logic.

@davidwengier
Copy link
Contributor Author

A thought: Would it be a better change if SemanticRange just swapped to using Roslyn's LinePositionSpan instead of Range? It's the same shape, just a struct.

(which I eventually want to remove and just have it call the new one)
(line == lineCount && character > 0))
{
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would probably write this something like below (easier for my old brain to read and probably fewer conditions to evaluate in common case)

            if (line >= lineCount)
            {
                if (line > lineCount || character >0)
                {
                    return false;
                }

                // LSP spec allowed a Range to end one line past the end, and character 0. SourceText does not, so we adjust to the final char position
                absoluteIndex = sourceText.Length;
            }
            else
            {
                absoluteIndex = sourceText.Lines[line].Start + character;
            }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to leave this for now, since its just a move, and I find that harder to read 😛

I could be convinced about the less conditions though, so I'll have a go at this in a follow up PR, to clean up more Position and Range stuff. This probably needs more logic anyway - just because the line number is valid, doesn't mean the character position on that line is, and there is no check for line > 0.

@@ -159,13 +162,18 @@ private static List<SemanticRange> CombineSemanticRanges(List<SemanticRange> ran
return null;
}

var razorRanges = new List<SemanticRange>();
using var _ = ArrayBuilderPool<SemanticRange>.GetPooledObject(out var razorRanges);
razorRanges.SetCapacityIfLarger(csharpResponse.Length / TokenSize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TokenSize

Look at that beautiful constant! :)

@ToddGrun
Copy link
Contributor

ToddGrun commented Sep 13, 2023

    if (FromRazor && !other.FromRazor)

If we're embracing CompareTo, might as well use it here too #Closed


Refers to: src/Razor/src/Microsoft.AspNetCore.Razor.LanguageServer/Semantic/Services/SemanticRange.cs:64 in ae876c4. [](commit_id = ae876c4, deletion_comment = False)

@davidwengier
Copy link
Contributor Author

If we're embracing CompareTo, might as well use it here too

I prefer the readability of the current code. The equivalent logic would be (!FromRazor).CompareTo(!other.FromRazor) and I'd spend an hour trying to work out what it's doing. Hell, it took me half an hour of experimenting in SharpLab to work out thats what it would be.

newList.SetCapacityIfLarger(razorRanges.Length + csharpRanges.Length);

newList.AddRange(razorRanges);
newList.AddRange(csharpRanges);

// Because SemanticToken data is generated relative to the previous token it must be in order.
// We have a guarantee of order within any given language server, but the interweaving of them can be quite complex.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a guarantee of order within any given language server

Is this comment not accurate (and thus the need for you to sort in the just csharp case above?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking its not wrong, but certainly misleading. Will update.

Within any given language server the order is guaranteed, due to the nature of the offset based data, but that just means its in order when we get it back from the C# server. Once we've done the re-mapping, a using directive which would be at the top of the generated C# file, can produce a classified span in the middle of a .razor document (as the @using can appear anywhere).

Copy link
Contributor

@ToddGrun ToddGrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@davidwengier davidwengier merged commit 3282025 into dotnet:main Sep 14, 2023
@davidwengier davidwengier deleted the SemanticTokensAllocations branch September 14, 2023 00:30
@ghost ghost added this to the Next milestone Sep 14, 2023
davidwengier added a commit that referenced this pull request Sep 14, 2023
Follow up to #9280

This creates a struct based API for document mapping, and moves semantic
tokens to it. I did it in a source-compatible way so we can upgrade
existing features as necessary. I suspect most won't get the benefit
that semantic tokens gets, as most things just ferry ranges and
positions around, and don't process them much. Logged
#9284 to follow up though.

Commit-at-a-time might be easiest.

Results for semantic tokens are pretty good though:

![image](https://github.com/dotnet/razor/assets/754264/7316e5db-0e90-4b32-a807-d4ee5d40741a)
@Cosifne Cosifne modified the milestones: Next, 17.8 P3 Sep 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants