Cache semantic classifications and display from the cache while a solution is loading. #46955

Merged · 60 commits · Aug 28, 2020

Conversation

@CyrusNajmabadi (Member) commented Aug 19, 2020

Draft idea around making VS feel less encumbered during initial solution load. Part of the caching working group experiments.

Note: this has a substantial impact on perceived classification perf on load. With Roslyn.sln itself, I go from anywhere between 30 seconds and over a minute down to just a couple of seconds before you have semantic classifications.

Tagging @mikadumont. This also ties into our discussions with platform about having some sort of good affordance in VS to let users know that this is initial solution-load and that values may be cached/stale, but will be available soon.

@@ -51,23 +51,8 @@ private VisualStudioRemoteHostClientProvider(HostWorkspaceServices services)
_lazyClient = new AsyncLazy<RemoteHostClient>(CreateHostClientAsync, cacheResult: true);
}

private async Task<RemoteHostClient> CreateHostClientAsync(CancellationToken cancellationToken)
CyrusNajmabadi (Member, Author):

Not part of this PR. I'm branched off of Tomas's change so I can use VS without OOP crashing. Once #46929 goes in, this will be gone.

if (_uniqueItems.Add(item))
_nextBatch.Add(item);
_nextBatchMap.Remove(item);
_nextBatchMap.Add(item, _nextIndex++);
CyrusNajmabadi (Member, Author):

safe to do ++ here as we're under the lock.
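The locking discipline here can be sketched outside C#. A minimal Java illustration (WorkQueue and its members are hypothetical names, not the Roslyn types) of deduplicating into a batch and bumping an index where every mutation happens under one lock, so the unsynchronized-looking increment is in fact safe:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch of the batching pattern: all state is mutated
// only while holding `gate`, so nextIndex++ cannot race.
class WorkQueue<T> {
    private final Object gate = new Object();
    private final Set<T> uniqueItems = new HashSet<>();
    private final Queue<T> nextBatch = new ArrayDeque<>();
    private final Map<T, Integer> nextBatchMap = new HashMap<>();
    private int nextIndex;

    void add(T item) {
        synchronized (gate) {
            // Only enqueue an item the first time we see it...
            if (uniqueItems.add(item))
                nextBatch.add(item);
            // ...but always refresh its ordering index.
            nextBatchMap.remove(item);
            nextBatchMap.put(item, nextIndex++);
        }
    }

    int indexOf(T item) {
        synchronized (gate) {
            return nextBatchMap.getOrDefault(item, -1);
        }
    }
}
```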

@@ -29,6 +29,7 @@ private class Tagger : ForegroundThreadAffinitizedObject, IAccurateTagger<IClass
private readonly SemanticClassificationBufferTaggerProvider _owner;
private readonly ITextBuffer _subjectBuffer;
private readonly ITaggerEventSource _eventSource;
private readonly SemanticClassifier _classifier;
CyrusNajmabadi (Member, Author):

changed from static helpers to an instance type now that it contains more interesting logic around caching/shutdown.

@@ -20,9 +28,54 @@

namespace Microsoft.CodeAnalysis.Editor.Implementation.Classification
{
internal static class SemanticClassificationUtilities
internal partial class SemanticClassifier
CyrusNajmabadi (Member, Author):

I purposefully didn't rename the file, as that causes GitHub to lose the diff; this makes it easier to understand the change. I will rename in a follow-up PR.

Reviewer (Member):

At this point there's more code that's new than existed... 😄

{
// We're reading and interpreting arbitrary data from disk. This may be invalid for any reason.
Logger.Log(FunctionId.SemanticClassifier_ExceptionInCacheRead);
return false;
CyrusNajmabadi (Member, Author):

general pattern we use elsewhere when reading from cache.
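That defensive pattern, treating anything deserialized from disk as potentially corrupt, catching broadly, and reporting failure rather than throwing, might look like this Java sketch (CacheReader and the on-disk format are invented for illustration; the C# code logs and returns false instead of null):

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of a defensive cache read: the whole read is
// wrapped, and any failure means "no cached data", never an exception.
class CacheReader {
    static int[] tryReadSpans(InputStream stream) {
        try (DataInputStream reader = new DataInputStream(stream)) {
            int count = reader.readInt();
            if (count < 0)
                return null; // corrupt header
            int[] spans = new int[count];
            for (int i = 0; i < count; i++)
                spans[i] = reader.readInt();
            return spans;
        } catch (IOException | RuntimeException e) {
            // We're reading and interpreting arbitrary data from disk;
            // a bad cache file must degrade to recomputing, not crash.
            return null;
        }
    }
}
```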

var seenIds = new HashSet<DocumentId>();
foreach (var document in documentsToClassify)
Contract.ThrowIfFalse(seenIds.Add(document.Id));
#endif
CyrusNajmabadi (Member, Author):

debug check to ensure that we are only processing a document once per batch.
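A standalone sketch of that invariant, in Java (BatchInvariants is a hypothetical name; the real code guards this with #if DEBUG and Contract.ThrowIfFalse):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the debug-only invariant: each document id
// must appear at most once per classification batch.
class BatchInvariants {
    static void assertNoDuplicates(List<String> documentIds) {
        Set<String> seenIds = new HashSet<>();
        for (String id : documentIds) {
            // Mirrors Contract.ThrowIfFalse(seenIds.Add(document.Id)).
            if (!seenIds.add(id))
                throw new IllegalStateException("duplicate document in batch: " + id);
        }
    }
}
```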

@jasonmalinowski (Member) left a comment:

There seems to be a bug where the cache maintained in the OOP process uses only the DocumentKey as the cache key, but the contents being cached implicitly depend on the textSpan being queried. Blocking only for that bug; the rest of this was both solid and well-structured, so it was nice and easy to read.

// Then place the cached information for this doc at the end.
_cachedData.AddLast((documentKey.Id, checksum, classifiedSpans));

// And ensure we don't cache too many docs.
Reviewer (Member):

So once we do get everything loaded, is the intent that this cache then falls away, or does it continue to be used for opening new documents even once we have full semantics?

CyrusNajmabadi (Member, Author):

So once we do get everything loaded, is the intent that this cache then falls away,

The cache falls away. This happens in two ways:

  1. the calls to CacheSemanticClassifications will clear it explicitly (since we are fully loaded at that point)
  2. the classifier doesn't call into the cache once we are fully loaded.
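The bounded, recency-ordered cache with an explicit clear once the solution is loaded could be sketched in Java like this (BoundedClassificationCache is a hypothetical name; the PR's C# implementation keeps a linked list of (id, checksum, spans) tuples and moves a hit to the end, which LinkedHashMap in access order models directly):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the bounded cache discussed in this thread.
class BoundedClassificationCache<K, V> {
    private final LinkedHashMap<K, V> cachedData;

    BoundedClassificationCache(int maxSize) {
        // accessOrder=true: gets and puts move an entry to the end,
        // so the eldest entry is the least recently used.
        this.cachedData = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // And ensure we don't cache too many docs.
                return size() > maxSize;
            }
        };
    }

    synchronized void put(K key, V value) { cachedData.put(key, value); }
    synchronized V get(K key) { return cachedData.get(key); }
    synchronized int size() { return cachedData.size(); }

    // Models the explicit teardown once the solution is fully loaded;
    // after this the classifier never consults the cache again.
    synchronized void clear() { cachedData.clear(); }
}
```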

Comment on lines 280 to 281
if (textSpan.IntersectsWith(classifiedSpan))
tempResult.Add(new ClassifiedSpan(classification, classifiedSpan));
Reviewer (Member):

Were the spans originally written in order? Can we abandon the loop once we're past the end of textSpan?

CyrusNajmabadi (Member, Author):

good point!
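The suggested early exit, assuming the cached spans are serialized sorted by start position, might look like this Java sketch (Span is a hypothetical stand-in for TextSpan; end is treated as inclusive here):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of abandoning the intersection loop once we're past the
// end of the query range.
class SpanFilter {
    record Span(int start, int end) {
        boolean intersectsWith(Span other) {
            return start <= other.end && other.start <= end;
        }
    }

    static List<Span> intersecting(List<Span> sortedSpans, Span query) {
        List<Span> result = new ArrayList<>();
        for (Span s : sortedSpans) {
            // Spans are sorted by start, so once one begins past the
            // query's end, nothing later can intersect: stop early.
            if (s.start() > query.end())
                break;
            if (query.intersectsWith(s))
                result.add(s);
        }
        return result;
    }
}
```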

@jasonmalinowski (Member) left a comment:

Looks good other than the JTF.Run() that I think you need to put back.

CyrusNajmabadi and others added 2 commits August 27, 2020 15:21
…lysisService_SemanticClassificationCache.cs

Co-authored-by: Jason Malinowski <jason@jason-m.com>
@ghost commented Aug 27, 2020

Hello @CyrusNajmabadi!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost left a comment:

Auto-approval

return workspaceLoadedService.WaitUntilFullyLoadedAsync(CancellationToken.None);
});
var workspaceLoadedService = w.Services.GetRequiredService<IWorkspaceStatusService>();
await workspaceLoadedService.WaitUntilFullyLoadedAsync(CancellationToken.None).ConfigureAwait(false);
Reviewer (Member):

Did this need to be async? Task.Run() should have worked fine if the lambda is just directly returning the Task.

@ghost commented Aug 27, 2020

Apologies, while this PR appears ready to be merged, I've been configured to only merge when all checks have explicitly passed. The following integrations have not reported any progress on their checks and are blocking auto-merge:

  1. Azure Pipelines

These integrations are possibly never going to report a check, and unblocking auto-merge likely requires a human being to update my configuration to exempt these integrations from requiring a passing check.


@CyrusNajmabadi CyrusNajmabadi dismissed sharwell’s stale review August 28, 2020 07:55

addressed perf concerns

@CyrusNajmabadi CyrusNajmabadi merged commit 01474d0 into dotnet:master Aug 28, 2020
@ghost ghost added this to the Next milestone Aug 28, 2020
@CyrusNajmabadi CyrusNajmabadi deleted the merged branch August 28, 2020 07:55
@allisonchou allisonchou modified the milestones: Next, 16.8.P3 Aug 31, 2020