Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent race in LSP pull diagnostics handling that causes stale diagnostics that do not update. #54558

Merged
merged 3 commits into from
Jul 2, 2021

Conversation

dibarbet
Copy link
Member

@dibarbet dibarbet commented Jul 2, 2021

I noticed this issue while investigating EnC diagnostics not updating correctly when the debug session ends (the EnC diagnostics linger after the debug session ends until a text change is made in the editor). See comment for details

@dibarbet dibarbet added Area-IDE LSP issues related to the roslyn language server protocol implementation labels Jul 2, 2021
@dibarbet dibarbet requested a review from a team as a code owner July 2, 2021 03:26
// Ensure the read to _documentIdToLastResultId to get the cached resultId and subsequent write
// to _documentIdToLastResultId after calculating new diagnostics happens in a single 'transaction'
// to prevent us from overwriting data in _documentIdToLastResultId with stale results.
using (await _gate.DisposableWaitAsync(cancellationToken).ConfigureAwait(false))
Copy link
Member Author

@dibarbet dibarbet Jul 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The particular issue I noticed is an issue where _documentIdToLastResultId is cleared for a document in between the read in DiagnosticsAreUnchanged and a write to it in ComputeAndReportCurrentDiagnosticsAsync for a single LSP pull request.

Consider the scenario

  1. A change in the editor triggers the OnDiagnosticsUpdated event to clear resultId for File.cs
  2. LSP sends a document pull for File.cs, we have no cached diagnostics (cleared by 1) so we begin calculation.
  3. A scenario like stop debugging causes OnDiagnosticsUpdated to be triggered and resultId is cleared before the calculation in 2) completes.
  4. The calculation in 2) completes and updates the resultId for File.cs, overwriting the clear that happened in 3) with stale results.
  5. The client gets the result from 4). Now the client sends the resultId from 4 for the next request, and we happily return them the stale cached results for all subsequent requests until we get a new OnDiagnosticsUpdated (like when the user types).

With my fix, OnDiagnosticsUpdated will always clear the cached resultId after or before a request has entirely completed.

  1. If the LSP pull request for a change comes in after the OnDiagnosticsUpdated for the same change, then it sees the cleared resultId and re-calculates.
  2. If the LSP pull request for a change comes in before the OnDiagnosticsUpdated for the same change, then the initial pull request will return the stale cached diagnostics, but subsequent polling by the client will soon query again at which point the stale cache will have been cleared by OnDiagnosticsUpdated and we re-calculate

@dibarbet dibarbet changed the base branch from main to release/dev17.0 July 2, 2021 03:45
@dibarbet dibarbet requested review from a team as code owners July 2, 2021 03:45
@dibarbet dibarbet removed the request for review from a team July 2, 2021 03:48
else
{
context.TraceInformation($"Diagnostics were changed for document: {document.FilePath}");
report = await ComputeAndReportCurrentDiagnosticsAsync(context, document, cancellationToken).ConfigureAwait(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having a lock around CmoputeAndReport is definitely scary. computation can take unbounded time. is it possible we can still compute outside of hte lock, but orderr things to always be consistent with teh results?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that make sense,, it could block us hearing about diagnostic updates for quite a while.

I wonder if I could update the _documentIdToLastResultId map before I do the computation instead of after, then release the lock and allow computation to proceed. Could potentially do that inside a try and clear out _documentIdToLastResultId if the computation throws an exception for some reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline, Moved caching before calculation as we will only report resultIds (per document) to client if the calculation succeeds. We do not report partial document results. If there is a cancellation or exception, the client will not see a resultId for the document at all, and therefore cannot ask us for it.

@dibarbet
Copy link
Member Author

dibarbet commented Jul 2, 2021

@vatsalyaagrawal for M2
edit: approved offline

@dibarbet dibarbet enabled auto-merge July 2, 2021 21:22
@dibarbet dibarbet merged commit 05d2055 into dotnet:release/dev17.0 Jul 2, 2021
@dibarbet dibarbet deleted the lsp_diagnostics_race branch July 2, 2021 23:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Approved to merge Area-IDE LSP issues related to the roslyn language server protocol implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants