Add new workspace event that gives handlers the opportunity to be processed immediately #76932
base: main
Conversation
…e processed immediately

This is in response to allocations seen when the OOP process attempts to operate on a source text version that has not been synchronized over. In those cases, the OOP process requests a full deserialization of the source text, causing full buffer allocations in both the VS and CA processes.

There are several points of asynchronicity in the current system around sending over buffer changes; reducing any of them would reduce the likelihood of needing this full serialization.

This PR removes one of those points of asynchronicity, specifically in the workspace eventing layer. Previously, all notifications were done on a delayed basis by wrapping each notification in the Workspace.ScheduleTask call from the RaiseWorkspaceChangedEventAsync method. This change allows callers to indicate they need to be notified immediately upon the workspace change. As noted by a doc comment in the PR, these handlers should be very fast.

Going to mark this as draft and get speedometer numbers off this to see if this helps enough, or if instead the other points of asynchronicity should be investigated (specifically, the usage of the _textChangeQueue ABWQ). There are other alternatives within the text syncing system, such as allowing OOP to request a text change instead of the full buffer contents, but previous investigations into that ended up complicated and incomplete.
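For readers skimming the thread, a minimal sketch of the idea (class and member names here are illustrative assumptions, not the PR's actual API): handlers that opt in to immediate notification run inline on the thread raising the workspace change, while all other handlers keep the previous delayed scheduling.

```csharp
// Hypothetical sketch only; the real Workspace/RaiseWorkspaceChangedEventAsync shape differs.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class MiniWorkspace
{
    private readonly List<Action<string>> _deferredHandlers = new();
    private readonly List<Action<string>> _immediateHandlers = new();

    public void AddWorkspaceChangedHandler(Action<string> handler, bool requiresImmediateNotification = false)
    {
        if (requiresImmediateNotification)
            _immediateHandlers.Add(handler);   // runs inline; must be very fast
        else
            _deferredHandlers.Add(handler);    // keeps the prior delayed behavior
    }

    public Task RaiseWorkspaceChangedEventAsync(string change)
    {
        // Immediate handlers observe the change before any queued work (e.g. text
        // synchronization to the OOP process) gets a chance to run.
        foreach (var handler in _immediateHandlers)
            handler(change);

        // Deferred handlers are still scheduled, as all handlers were previously.
        return Task.Run(() =>
        {
            foreach (var handler in _deferredHandlers)
                handler(change);
        });
    }
}
```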
PR validation insertion: https://dev.azure.com/devdiv/DevDiv/_git/VS/pullrequest/605941
src/Workspaces/SharedUtilitiesAndExtensions/Compiler/Core/Log/FunctionId.cs
Moving out of draft mode as the numbers look pretty good. Still some serialization happening during typing, but it's a lot less than before and this change was pretty trivial. If code reviewers think adding this to the workspace level is not desirable, can look into other approaches.
…r reduce the amount of serialization and deserialization
commit 5 PR validation insertion: https://dev.azure.com/devdiv/DevDiv/_git/VS/pullrequest/606025
…ositions to allow composition to include an IThreadingContext
…ecksumUpdater.DispatchSynchronizeTextChanges and RemoteAssetSynchronizationService.SynchronizeTextChangesAsync
...kspaces/Remote/ServiceHub/Services/AssetSynchronization/RemoteAssetSynchronizationService.cs
using var _ = ArrayBuilder<(DocumentId id, Checksum textChecksum, ImmutableArray<TextChange> changes, Checksum newTextChecksum)>.GetInstance(out var builder);

foreach (var (oldDocument, newDocument) in values)
    _ = _threadingContext.JoinableTaskFactory.RunAsync(async () =>
we should def doc the design here. make it clear why we've written it this way.
note: you can also use IAsyncToken here to track the work, allowing you to properly unit test this as well.
Is the intent here that this is fire-and-forget (and will it still always yield the thread, or not)? Because there's absolutely a chance in VS this might be on the UI thread, and we're calling those GetChangeRanges methods and such, which might (?) be expensive. Or maybe not.
This indeed might be worth documenting, since it's otherwise unclear to me why we're wrapping this in a JTF.RunAsync().
The intent is to fire-and-forget, but only yield the current thread if an async operation necessitates it.
Is GetChangeRanges expensive? It doesn't look expensive to me, but maybe I'm missing an override? If it is expensive, the two conditions that use it could probably be removed, as they are very minor and specialized optimizations.
Will add some doc here
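For context on the pattern under discussion, here is a hedged illustration of the fire-and-forget shape using Microsoft.VisualStudio.Threading (the wrapper method and its parameters are assumptions for the sketch): JoinableTaskFactory.RunAsync begins executing the delegate synchronously on the current thread and only yields once an await does not complete synchronously, and discarding the returned JoinableTask makes the work fire-and-forget. As noted above, an IAsyncToken-style listener could be used instead to make the work observable in unit tests.

```csharp
// Illustration only; exceptions should be handled inside the delegate so a faulted
// JoinableTask cannot go unobserved.
using System;
using System.Threading.Tasks;
using Microsoft.VisualStudio.Threading;

internal static class FireAndForgetExample
{
    public static void QueueSynchronization(JoinableTaskFactory jtf, Func<Task> synchronizeAsync)
    {
        // Runs synchronously up to the first non-synchronously-completing await,
        // then continues in the background; the discard makes it fire-and-forget.
        _ = jtf.RunAsync(async () =>
        {
            await synchronizeAsync().ConfigureAwait(false);
        });
    }
}
```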
using (Logger.LogBlock(FunctionId.Workspace_EventsImmediate, (s, p, d, k) => $"{s.Id} - {p} - {d} {kind.ToString()}", newSolution, projectId, documentId, kind, CancellationToken.None))
{
    args = new WorkspaceChangeEventArgs(kind, oldSolution, newSolution, projectId, documentId);
    ev.RaiseEvent(static (handler, arg) => handler(arg.self, arg.args), (self: this, args));
aside, are these exception safe if the attached handler throws an exception? i ask because the old system might have been ok if the Task went into a faulted state. i'm not sure about the new system.
Oops, asked this same question here: #76932 (comment). I think this might cause an issue now if this were to throw, so probably best to handle that better.
Any concern with me just changing RaiseEvent from using FatalError.ReportAndPropagate to instead use FatalError.ReportAndCatch (also, would it make sense to have the try/catch inside the foreach in that method?)
@jasonmalinowski -- any thoughts on changing RaiseEvent to instead use FatalError.ReportAndCatch (and to move the try/catch inside the loop)?
I went ahead and just made the change to switch to ReportAndCatch
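A rough sketch of the per-handler exception resilience this thread converges on: catch (and report) inside the loop so one throwing subscriber cannot stop the remaining handlers or corrupt workspace state. The RaiseEvent signature and reporting helper below are illustrative assumptions; the real code uses Roslyn's FatalError.ReportAndCatch.

```csharp
using System;
using System.Collections.Generic;

internal static class EventRaisingSketch
{
    public static void RaiseEvent<TEventArgs>(
        IEnumerable<EventHandler<TEventArgs>> handlers, object sender, TEventArgs args)
    {
        foreach (var handler in handlers)
        {
            try
            {
                handler(sender, args);
            }
            catch (Exception ex)
            {
                // Report non-fatally (e.g. FatalError.ReportAndCatch(ex)) and keep
                // notifying the remaining handlers instead of propagating.
                ReportNonFatal(ex);
            }
        }
    }

    private static void ReportNonFatal(Exception ex)
        => Console.Error.WriteLine($"Workspace event handler failed: {ex}");
}
```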
@@ -639,4 +639,9 @@ internal enum FunctionId
    VSCode_LanguageServer_Started = 860,
    VSCode_Project_Load_Started = 861,
    VSCode_Projects_Load_Completed = 862,

    // 900-999 for items that don't fit into other categories.
Suggested change:
- // 900-999 for items that don't fit into other categories.
+ // 900-999 things related to workspace and OOP solution sync
We're not restricted to three digits, so no reason to make this the final catch-all group.
I wanted a catch-all group as I was tired of scanning through trying to find the next available slot with all the other groups. Is the request here to change the starting range / size of the catch-all group, or just not to have it?
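For context, a sketch of how the new range might read; the comment text is the suggestion above, the FunctionId names appear elsewhere in this PR's snippets, and the specific numeric values are assumptions.

```csharp
internal enum FunctionId
{
    // ...existing entries elided...
    VSCode_Projects_Load_Completed = 862,

    // 900-999 things related to workspace and OOP solution sync
    Workspace_EventsImmediate = 900,
    ChecksumUpdater_SynchronizeTextChangesStatus = 901,
}
```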
using (Logger.LogBlock(FunctionId.Workspace_EventsImmediate, (s, p, d, k) => $"{s.Id} - {p} - {d} {kind.ToString()}", newSolution, projectId, documentId, kind, CancellationToken.None))
{
    args = new WorkspaceChangeEventArgs(kind, oldSolution, newSolution, projectId, documentId);
    ev.RaiseEvent(static (handler, arg) => handler(arg.self, arg.args), (self: this, args));
It appears RaiseEvent will report any exception being thrown but will still propagate it out. That might be dangerous here, because it now means any subscriber's mistake will now completely break the workspace and put the user in a really broken state. I admit I'm not sure why we aren't catching exceptions in RaiseEvent, so maybe that should be changed (or we have a variant that does catch).
use some local functions for cleanup
Update some comments
Signing mostly off. I do want Jason to approve as well. And there look to be a few places we can clean up. I would like exception resilience thought about. Thanks!
public SolutionChecksumUpdater(
    Workspace workspace,
    IAsynchronousOperationListenerProvider listenerProvider,
    IThreadingContext threadingContext,
    CancellationToken shutdownToken)
I'm curious if the shutdown token is needed, instead of just using the disposal token on the threading context.
I think they are different. VisualStudioWorkspaceServiceHubConnector has StartListening/StopListening methods, which drive the shutdownToken. However, I think the threadingContext's DisposalToken is driven by MEF and I don't think is disposed until VS shutdown.
Just saw your other comment about VisualStudioWorkspaceServiceHubConnector. Let me think about that.
It looks like right now, VisualStudioWorkspaceServiceHubConnector.StopListening is called on Workspace dispose. If I just send over the threadingContext and use its cancellation token, are we ok with essentially just using a cancellation token from MEF dispose and not workspace dispose?
yeah. let's keep this as is. i've looked at the code and i'm not sure how to reconcile the different Disposes/CTs. So let's not change that now.
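Not what the thread decided (the two tokens are kept as-is), but for completeness, one way to reconcile the lifetimes would be a linked token source that cancels when either the workspace-level shutdown token or the MEF-lifetime disposal token fires; everything below is a hypothetical sketch.

```csharp
using System;
using System.Threading;

internal sealed class LinkedShutdownTokenSketch : IDisposable
{
    private readonly CancellationTokenSource _linked;

    public LinkedShutdownTokenSketch(CancellationToken workspaceShutdownToken, CancellationToken mefDisposalToken)
    {
        // Cancelled when either source lifetime ends.
        _linked = CancellationTokenSource.CreateLinkedTokenSource(workspaceShutdownToken, mefDisposalToken);
    }

    public CancellationToken Token => _linked.Token;

    public void Dispose() => _linked.Dispose();
}
```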
@@ -38,7 +40,7 @@ public void StartListening(Workspace workspace, object serviceOpt)
    }

    // only push solution snapshot from primary (VS) workspace:
-   _checksumUpdater = new SolutionChecksumUpdater(workspace, _listenerProvider, _disposalCancellationSource.Token);
+   _checksumUpdater = new SolutionChecksumUpdater(workspace, _listenerProvider, _threadingContext, _disposalCancellationSource.Token);
If we have the threading context, we might be able to get rid of the disposal logic here and of passing that CT along.
using (Logger.LogBlock(functionId, (s, p, d, k) => $"{s.Id} - {p} - {d} {args.Kind.ToString()}", args.NewSolution, args.ProjectId, args.DocumentId, args.Kind, CancellationToken.None))
{
    handlers.RaiseEvent(static (handler, args) => handler(args.NewSolution.Workspace, args), args);
I def have concerns about exceptions. Can we try/catch with a fatalerror reporter?
From the other thread:
Any concern with me just changing RaiseEvent from using FatalError.ReportAndPropagate to instead use FatalError.ReportAndCatch (also, would it make sense to have the try/catch inside the foreach in that method?)
yes. please ReportAndCatch. We really do not want our eventing to ever get into a failed state (or corrupt the WS).
...kspaces/Remote/ServiceHub/Services/AssetSynchronization/RemoteAssetSynchronizationService.cs
Update comments
Use string constants instead of allocating
@jasonmalinowski -- would love to get your feedback on the changes and any remaining concerns you may have that haven't been addressed as desired
For this specific performance improvement, I'm curious about the viability of an alternative approach:
@sharwell def feel free to explore that idea. The approach here is very simple and effective and sits on a model we have a good understanding of. All this is doing is allowing the current approach to operate without unnecessary delays that impact its efficacy.
var metricName = wasSynchronized ? "SucceededCount" : "FailedCount";
TelemetryLogging.LogAggregatedCounter(FunctionId.ChecksumUpdater_SynchronizeTextChangesStatus, KeyValueLogMessage.Create(m =>
{
    m[TelemetryLogging.KeyName] = nameof(SolutionChecksumUpdater) + "." + metricName;
it feels like this will continually allocate. can we instead make these into constants?
Oops, I changed the other one to constants, but not this one. Will change
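A sketch of the constant-based shape being agreed on here, assuming the metric names stay "SucceededCount"/"FailedCount" (the holder class name is made up): const string concatenation is folded at compile time, so no string is allocated per telemetry event.

```csharp
internal static class ChecksumUpdaterMetricNames
{
    private const string Prefix = "SolutionChecksumUpdater.";

    // Folded to single literals at compile time; nothing allocates per call.
    public const string Succeeded = Prefix + "SucceededCount";
    public const string Failed = Prefix + "FailedCount";
}
```

Call sites could then pick the name with something like `var keyName = wasSynchronized ? ChecksumUpdaterMetricNames.Succeeded : ChecksumUpdaterMetricNames.Failed;`.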
TelemetryLogging.LogAggregatedCounter(FunctionId.ChecksumUpdater_SynchronizeTextChangesStatus, KeyValueLogMessage.Create(m =>
{
    m[TelemetryLogging.KeyName] = nameof(SolutionChecksumUpdater) + "." + metricName;
    m[TelemetryLogging.KeyValue] = 1L;
is the idea that this will add 1 each time this happens?
Yes, the LogAggregatedCounter will add the specified amount to the counter
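To make the aggregation idea concrete, a conceptual illustration (not Roslyn's TelemetryLogging implementation): each call adds an amount under a key, and one summed value per key is reported when the aggregate is flushed.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

internal sealed class AggregatedCounterSketch
{
    private readonly ConcurrentDictionary<string, long> _counts = new();

    // Adds 'amount' to the running total for 'key' (mirroring KeyValue = 1L above).
    public void Add(string key, long amount = 1L)
        => _counts.AddOrUpdate(key, amount, (_, existing) => existing + amount);

    // Emits a snapshot of the totals and resets, as an aggregated telemetry flush would.
    public IReadOnlyDictionary<string, long> Flush()
    {
        var snapshot = new Dictionary<string, long>(_counts);
        _counts.Clear();
        return snapshot;
    }
}
```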
{
    m[TelemetryLogging.KeyName] = nameof(SolutionChecksumUpdater) + "." + metricName;
    m[TelemetryLogging.KeyValue] = 1L;
    m[TelemetryLogging.KeyMetricName] = metricName;
why do we need this here, if it is encoded in KeyName above? is one of these redundant?
{
    var client = await RemoteHostClient.TryGetClientAsync(_workspace, _shutdownToken).ConfigureAwait(false);
    if (client == null)
        return false;
slightly odd that this would show up as 'failure'. this means a customer who has OOP off will show as massively failing this op. i think this should maybe count as 'success' or perhaps NA so we don't get the wrong idea about being in a bad state.
{
    m[TelemetryLogging.KeyName] = keyName;
    m[TelemetryLogging.KeyValue] = 1L;
    m[TelemetryLogging.KeyMetricName] = metricName;
same comment on whether we need the keyname and metric name, esp if the keyname contains the metric name.
with your latest changes, i feel fine with this going in (even without Jason signing off). he will eventually get around to looking at this.
This is in response to allocations seen when the CA process attempts to operate on a source text version that has not been synchronized over. In those cases, the OOP process requests a full deserialization of the source text, causing full buffer allocations in both the VS and CA processes.
There are several points of asynchronicity in the current system around sending over buffer changes, reducing any of which would reduce the likelihood of needing this full serialization.
This PR removes one of those points of asynchronicity, specifically in the workspace eventing layer. Previously, all notifications were done on a delayed basis by wrapping each notification in the Workspace.ScheduleTask call from the RaiseWorkspaceChangedEventAsync method. This change allows callers to indicate they need to be notified immediately upon the workspace change. As noted by a doc comment in the PR, these handlers should be very fast.
The numbers from speedometer came back looking pretty good (below). Another point of asynchronicity could be investigated around the usage of the _textChangeQueue ABWQ, but it might not be worth it as it looks like this change removes the majority of the serialization during typing. Other alternatives within the text syncing system such as allowing OOP to request a text change instead of the full buffer contents are also possible, but previous investigations into that ended up complicated and incomplete.
*** before allocations during typing in VS process ***
*** commit 3 allocations during typing in VS process ***
*** commit 5 allocations during typing in VS process ***
No allocations under SerializableSourceText.Serialize during typing period in profile
*** before allocations during typing in CA process ***
*** commit 3 allocations during typing in CA process ***
*** commit 5 allocations during typing in CA process ***
No allocations under SerializableSourceText.Deserialize during typing period in profile