
Use isIncomplete flag to cap the LSP completion list size for faster serialization #53749

Merged
merged 28 commits into dotnet:main on Jun 18, 2021

Conversation

@dibarbet dibarbet commented May 28, 2021

Resolves #52813

Implements the isIncomplete flag on the completion list to cap the completion list size at ~1000 items. The client is then responsible for re-requesting completion results when it receives an incomplete list and the user continues typing.

The general approach this PR takes is:

  1. Calculate the entire list on the first request and cache it for subsequent requests carrying the incomplete trigger.
  2. Using the existing completion filtering APIs (which handle cases like returning all items when only one character is typed, pattern matching, and other things), filter down and sort the original completion list.
  3. Cap the filtered list at 1000 items plus any preselected items. This can result in a slightly weird looking list when the preselected item is not alphabetically near any of the 1k items we return.
  4. Set the isIncomplete flag if the originally calculated list has more than 1k items (it really should be the filtered list, but that is blocked by https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1335142)
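The capping and flag-setting steps above can be sketched roughly as follows. This is a minimal TypeScript illustration using LSP-like shapes; the names, the constant, and the exact preselection handling are assumptions for illustration, not the Roslyn implementation:

```typescript
interface CompletionItem {
  label: string;
  preselect?: boolean;
}

interface CompletionList {
  isIncomplete: boolean;
  items: CompletionItem[];
}

const MAX_COMPLETION_LIST_SIZE = 1000;

function capCompletionList(sorted: CompletionItem[]): CompletionList {
  // Step 3: take the first 1000 items, then append any preselected
  // items that would otherwise be cut off.
  const head = sorted.slice(0, MAX_COMPLETION_LIST_SIZE);
  const preselectedTail = sorted
    .slice(MAX_COMPLETION_LIST_SIZE)
    .filter(item => item.preselect);
  return {
    // Step 4: ask the client to re-query as the user keeps typing.
    isIncomplete: sorted.length > MAX_COMPLETION_LIST_SIZE,
    items: head.concat(preselectedTail),
  };
}
```

This also shows why the list can look odd: a preselected item past the cap ends up appended after the first 1000 alphabetically sorted items.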

Note that the existing in-proc completion does a lot of handling around which item has the best pattern match and should be selected, and around highlighting matching spans. However, in LSP that behavior should be handled client side, so I did not attempt to reuse that logic.

I had to refactor MatchResult a bit to be agnostic of editor types so as not to introduce an editor features layer dependency here.

@dibarbet dibarbet added the Area-IDE and LSP (issues related to the roslyn language server protocol implementation) labels May 28, 2021
@dibarbet dibarbet requested review from a team as code owners May 28, 2021 07:05
// We expect that Editor will introduce this support and we will get rid of relying on the "★" then.
// We check both the display text and the display text prefix to account for IntelliCode item providers
// that may be using the prefix to include the ★.
internal static bool IsPreferredItem(this RoslynCompletionItem completionItem)
Member Author

this was just moved

private static ImmutableArray<CompletionItemWithHighlight> GetHighlightedList(List<MatchResult> matchResults)
=> matchResults.SelectAsArray(matchResult =>
new CompletionItemWithHighlight(matchResult.VSCompletionItem, matchResult.HighlightedSpans));
private static ImmutableArray<CompletionItemWithHighlight> GetHighlightedList(
Member Author

Previously, highlighting was done during the calculation of MatchResults and stored as a property on the MatchResult. Now this is done when the model is being created (so just moving the computation slightly later) to avoid needing to store these spans on the MatchResult type.

The highlight calculation code was essentially moved unchanged, other than the result no longer being stored on MatchResult.

Member

can you clarify this though? aren't the highlighted spans specific to each match? so how can you store on the model and not the item?

Member

poke

//
// 2. They brought up completion with ctrl-j or through deletion. In these
// cases we just always keep all the items in the list.
private static bool KeepAllItemsInTheList(CompletionTriggerKind initialTriggerKind, string filterText)
Member Author

helper for computing match results, moved down
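The comment in the moved snippet above describes a short-circuit: keep the whole list when completion was explicitly invoked (ctrl-j) or triggered by deletion, or when too little has been typed to filter on. A rough TypeScript sketch of that predicate, with loosely modeled trigger-kind names (assumptions, not the Roslyn enum):

```typescript
type TriggerKind = "Invoke" | "Deletion" | "Insertion";

function keepAllItemsInTheList(initialTriggerKind: TriggerKind, filterText: string): boolean {
  // With zero or one character typed there is nothing meaningful to
  // filter on yet, so the whole list is kept.
  return filterText.length <= 1
    // Explicit invocation (e.g. ctrl-j) or deletion also keeps everything.
    || initialTriggerKind === "Invoke"
    || initialTriggerKind === "Deletion";
}
```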

{
var roslynItem = GetOrAddRoslynCompletionItem(item);

// Get the match of the given completion item for the pattern provided so far.
Member Author

As above, this code was moved down a layer. The only meaningful change was removing the call to GetHighlightedSpans when creating the MatchResult (it's done later).

Member

what's the benefit of doing it later?

}

// PERF: Create a singleton to avoid lambda allocation on hot path
private static readonly Func<TextSpan, RoslynCompletionItem, Span> s_highlightSpanGetter
= (span, item) => span.MoveTo(item.DisplayTextPrefix?.Length ?? 0).ToSpan();

private static bool MatchesFilterText(
Member Author

helper for matching, moved down

@@ -431,6 +478,41 @@ internal TestLspServer(TestWorkspace testWorkspace)
_executionQueue, methodName, request, clientCapabilities, clientName, cancellationToken);
}

public async Task OpenDocumentAsync(Uri documentUri)
Member Author

There were multiple implementations of document sync in tests; this unifies them all here.

@@ -358,5 +359,109 @@ private static int CompareExpandedItem(CompletionItem item1, PatternMatch match1

public static string ConcatNamespace(string? containingNamespace, string name)
=> string.IsNullOrEmpty(containingNamespace) ? name : containingNamespace + "." + name;

internal static bool TryCreateMatchResult<T>(
Member Author

As above, this is moved code; the main change is the removal of the call to get highlighted spans.

// Use the resultId to retrieve the completion list we cached for the original result.
Contract.ThrowIfFalse(_lastIncompleteResultId.HasValue, "Trigger for incomplete list missing previous result id");
var cachedList = _completionListCache.GetCachedCompletionList(_lastIncompleteResultId.Value);
Contract.ThrowIfNull(cachedList, "Trigger for incomplete list missing cached list");
Contributor

Presumably the platform requeries us for the list when there is an error? We don't need to return the list on a cache miss?

Member

this check feels weird. I'm ok with us being stateful as an optimization, but say we got recycled and completion still calls into us with this value. Either:

  1. this is ok, and we should be resilient and respond with some sensible message.
  2. this is not ok, and we should have caught this when we first got the message and were validating it for invariant correctness.

Contract calls are for catching when we make mistakes entirely in our own domain. In other words, this shouldn't be possible given everything we know about the code under our control.

Member Author

I can move the contract checks up. Being resilient here is difficult - the completion service only returns results for the first character typed in the word. So if we get an incomplete completion request in the middle of a word, even if we asked the completion service for more items, we wouldn't get any.

Perhaps one solution would be to figure out the filter text and just re-request completions on a document containing only the word-start character? I'll try something like that out.

Member Author

Updated to re-calculate the list using a document without the filter text.
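As a rough illustration of that rewind idea: delete the characters typed since completion started, so the completion service sees the document as it was at the word start. This is a TypeScript sketch; rewindSourceText, wordStart, and position are hypothetical names, not the actual implementation:

```typescript
function rewindSourceText(text: string, wordStart: number, position: number): string {
  // Remove the filter text (everything typed between the word start and
  // the current caret position) so completion can be re-requested as if
  // only the first character of the word had been typed.
  return text.slice(0, wordStart) + text.slice(position);
}
```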

@CyrusNajmabadi
Member

Unless urgent, can you hold off on merging until I can review on Tuesday?

@dibarbet
Member Author

Unless urgent, can you hold off on merging until I can review on Tuesday?

Yep that shouldn't be a problem

Contributor

@allisonchou allisonchou left a comment

Really excited about this change! Just have a few questions.

This can result in a slightly weird looking list when the preselected item is not alphabetically near any of the 1k items we return.

Out of curiosity what does this look like?

/// When the client asks us for more items using <see cref="LSP.CompletionTriggerKind.TriggerForIncompleteCompletions"/>
/// we use this result id to lookup the previously calculated list.
/// </summary>
private long? _lastIncompleteResultId;
Contributor

Does access of this variable need to be protected by some sort of lock?

Member Author

Good point - I guess in theory it should. Completion requests are not mutating, so we could get multiple requests at once. I wouldn't expect that to usually happen, but I'll do it to be safe.

Contributor

Ya, one interesting implication of this is that if the client sends a completion request and then cancels it, this could potentially reference a bad completion list. You may want to build in a path to handle / invalidate the "cancelled" state. You'll also want to fall back gracefully in the case that you can't find this resultId (I see below you currently Contract.fail).

Member

i assume we don't need a lock as LSP is not concurrent right? (or is that an incorrect assumption on my part?)
however, we do need to ensure that this is properly cleaned up and is only reused when we are certain it is safe to do so.

Contributor

i assume we don't need a lock as LSP is not concurrent right? (or is that an incorrect assumption on my part?)

textDocument/didChange requests are not concurrent (they can't be); however, everything else can be. It's up to the client how it wants to handle that. Also, in practice StreamJsonRpc will only dispatch one request at a time; it's up to how you all re-dispatch those requests internally.

Member Author

@dibarbet dibarbet Jun 1, 2021

Mutating requests are not concurrent, but non-mutating requests (like this one) can be. I wouldn't expect to get multiple completion requests though, as a completion request is generally preceded by a mutating didChange. A client is allowed to call us multiple times and we will run them concurrently as it stands. So to be careful, I think we should explicitly lock it.

Member Author

@dibarbet dibarbet Jun 1, 2021

So after looking at this for a while, I think I need to mark completion requests as mutating on our side (which means they're always executed in order and requests wait for the previous requests to finish). I do not see a way to have a stateless cache for previous incomplete results:

  1. I have no resultId I can pass between the original isIncomplete request and the client request for TriggerForIncompleteCompletion.
  2. Without passing data to the client, I would have to cache on something like document URI + location of the word start character. But I have no easy way to invalidate that cache (since the cache needs to persist when the document version changes).

The alternative I see is to recompute the list each time and filter down.

thoughts @CyrusNajmabadi @NTaylorMullen ?

Member Author

@dibarbet dibarbet Jun 2, 2021

Actually, even if I do enforce sequential ordering of completion requests, I think we could still have problems. Imagine the following:

void M1()
{
    B
}

void M2()
{
    C
}

Typing 'B' in M1 gave back an incomplete list, then the user typed 'C' in M2, also getting back an incomplete list. If the user goes back to 'B' in M1 and types 'o', the client will request incomplete completions, but we'll use the resultId from 'C'.

I think I have to recalculate. I'm also considering storing the position / text of the original completion request and verifying that the text matches before using the previous result, but I'm not sure that's better than recalculating.

Member Author

@dibarbet dibarbet Jun 2, 2021

So I think I have something mostly working:

  1. I now rewind the source text to what it was before completion started (i.e., delete the filter text). If the resultId is missing, I use this to re-calculate.
  2. I'm now storing the (rewound) source text with the resultId. When I attempt to use a cached list, I compare the rewound source text to the one stored with the last resultId, and only re-use the completion list if it matches.
  3. Added a semaphore around reading and writing the last resultId. Since we're more stateless now, we should be OK as long as we allow only one access to the last resultId at a time.
  4. Also added a test for the scenario above to verify that it failed on the old version and now passes.

LMK what you all think. In my opinion it would really simplify things if we could pass a resultId to the client for the completion list and have them pass it back to us when they re-request the incomplete items (similar to how it works with resolve).
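The caching steps above could be sketched roughly like this (TypeScript for illustration; the cache shape and names are assumptions, and the semaphore from step 3 is omitted): the cached list is reused only when the rewound source text matches what was stored with the last resultId, otherwise the caller recalculates.

```typescript
interface CachedCompletion {
  resultId: number;
  rewoundText: string;
  items: string[];
}

class CompletionCache {
  private last?: CachedCompletion;

  // Store the list together with the rewound source text it was computed from.
  store(resultId: number, rewoundText: string, items: string[]): void {
    this.last = { resultId, rewoundText, items };
  }

  // Return the cached list only if the rewound text matches; otherwise
  // the caller falls back to recalculating the list.
  tryGet(rewoundText: string): string[] | undefined {
    return this.last?.rewoundText === rewoundText ? this.last.items : undefined;
  }
}
```

This guards against the stale-resultId scenario above: a request triggered from M1 no longer matches the text cached for M2.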

Contributor

LMK what you all think. In my opinion it would really simplify things if we could pass a resultId to the client for the completion list and have them pass it back to us when they re-request for the incomplete items (similar to how it works with resolve)

That's quite an interesting idea. @dbaeumer have you seen similar scenarios with concurrent language servers trying to respect the isIncomplete -> TriggerForIncompleteCompletion flow, and if so, have they taken a similar approach to what David did above?

For the record @dibarbet I think what you landed on is great.

filterText,
completionTrigger.Kind,
GetFilterReason(completionTrigger),
ImmutableArray<string>.Empty,
Contributor

nit: Maybe an explicit param name here would help for clarity?

Member

also... why do we need this if it's just going to be empty.

Member Author

@dibarbet dibarbet Jun 1, 2021

We don't have an MRU in LSP completions as we don't know which items the client committed. Imho that could be client-side behavior to appropriately sort recently used items (or we could implement that feature using commands on commit). Either way, for now it has to be empty, but we need the parameter for the same method used by classic completion.

@CyrusNajmabadi
Member

Implements the isincomplete flag on the completion list to cap completion list size to ~1000 items. The client then is responsible for re-requesting completion results when it gets an incomplete list and the user continues typing.

Instead of a unilateral cap like that across all items, could we instead do:

  1. the normal set of completion items (i.e. everything but the 'import completion' items) is always sent in full.
  2. we break up the import completion items into groups of N (where 1k seems fine) to ensure it's not too much.

This seems like it would help with cases like:

Cap the filtered list at 1000 items plus any preselected items. This can result in a slightly weird looking list when the preselected item is not alphabetically near any of the 1k items we return.

Because for the normal list we're not doing anything weird. It's only the gargantuan list of import completions that is being trimmed down (hopefully in a way the user can't ever even be aware of).
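The grouping idea in this proposal could be sketched like this (TypeScript, purely illustrative; chunk is a hypothetical helper, not anything in Roslyn): the import-completion items would be split into fixed-size batches rather than hard-capping the whole list.

```typescript
// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}
```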

// Finally, truncate the list to 1000 items plus any preselected items that occur after the first 1000.
var filteredList = matchResultsBuilder
.Take(MaxCompletionListSize)
.Concat(matchResultsBuilder.Skip(MaxCompletionListSize).Where(match => match.RoslynCompletionItem.Rules.SelectionBehavior == CompletionItemSelectionBehavior.HardSelection))
Contributor

Can you have more than one hard selection?

Member Author

Potentially yes - this is the same as the current implementation.

Contributor

Interesting, out of curiosity what does it mean from the end-user's perspective to have more than one?

Member Author

@dibarbet dibarbet Jun 9, 2021

The client just picks the first one it matches. Your question made me look into it though - I think we actually need to be looking for this rule - https://sourceroslyn.io/#Microsoft.CodeAnalysis.Features/Completion/MatchPriority.cs,32 - but only when the item is also hard selected. Since LSP doesn't support specifying a soft-selected preselect, imho it's better to just not preselect in that case.

Updated
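For illustration, the preselection rule described here might look like this sketch (TypeScript, with loosely modeled values; the real MatchPriority.Preselect constant and rule shape are not reproduced here): an item is only preselected when it is both hard-selected and marked with the preselect priority.

```typescript
// Assumed sentinel modeled after Roslyn's MatchPriority.Preselect (value illustrative).
const MatchPriorityPreselect = 1;

interface ItemRules {
  selectionBehavior: "SoftSelection" | "HardSelection";
  matchPriority: number;
}

function shouldItemBePreselected(rules: ItemRules): boolean {
  // LSP cannot express a soft-selected preselect, so only hard-selected
  // preselect items are surfaced to the client.
  return rules.selectionBehavior === "HardSelection"
    && rules.matchPriority === MatchPriorityPreselect;
}
```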

Contributor

@allisonchou allisonchou left a comment

LGTM 🎉 Appreciate all the comments you left throughout the code, it was super helpful for understanding everything.

@@ -242,7 +256,7 @@ bool IsValidTriggerCharacterForDocument(Document document, char triggerCharacter
FilterText = item.FilterText,
Kind = GetCompletionKind(item.Tags),
Data = completionResolveData,
Preselect = item.Rules.SelectionBehavior == CompletionItemSelectionBehavior.HardSelection,
Preselect = ShouldItemBePreselected(item),
Member

so if an item should be preselected, have we considered setting things up so that the items around it are also sent down with it (so there's no weird gap between the items that are sent and the preselected item)? or does this work by the client continually pulling until it gets all the information necessary to get to that point?

Member Author

yeah that's definitely an option. I'm not sure exactly how much to send 'around' the item though - at some point there will be a weird gap. The client doesn't pull as you scroll, unfortunately. I could just choose a number of items to add before and after, but then the gap just moves. I'm not sure if that would be more confusing than one preselected item at the end.

Member Author

discussed a bit offline - a 1k-item window approach was suggested, where we take the preselected item plus the 1k items around it (while keeping the preselected item as close to the middle of the window as possible).

I'll take a look and see what that looks like in a followup
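The suggested window could be sketched like this (TypeScript, purely illustrative; windowAroundIndex is a hypothetical helper, not the follow-up implementation): center a fixed-size window on the preselected item and clamp it at the list boundaries.

```typescript
function windowAroundIndex<T>(sorted: T[], preselectedIndex: number, windowSize: number): T[] {
  const half = Math.floor(windowSize / 2);
  // Keep the preselected item near the middle of the window,
  // clamping at either end of the list.
  let start = Math.max(0, preselectedIndex - half);
  const end = Math.min(sorted.length, start + windowSize);
  start = Math.max(0, end - windowSize);
  return sorted.slice(start, end);
}
```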

// Next, we sort the list based on the pattern matching result.
matchResultsBuilder.Sort(MatchResult<CompletionItem?>.SortingComparer);

// Finally, truncate the list to 1000 items plus any preselected items that occur after the first 1000.
Member

oh ... that seems funky :)

@dibarbet
Member Author

@CyrusNajmabadi I think I've covered the feedback, let me know if there's anything else before I merge

@dibarbet
Member Author

Going to merge this; will address the preselect 'window' in a followup PR, along with any further feedback.

@dibarbet dibarbet merged commit 09b50ba into dotnet:main Jun 18, 2021
@ghost ghost added this to the Next milestone Jun 18, 2021
@dibarbet dibarbet deleted the isincomplete branch June 18, 2021 18:53
@RikkiGibson RikkiGibson modified the milestones: Next, 17.0.P2 Jun 29, 2021
Labels
Area-IDE LSP issues related to the roslyn language server protocol implementation
Successfully merging this pull request may close these issues.

[LSP] Implement IsInComplete flag in LSP completions
6 participants