Investigate a diagnostic pull model #112501

dbaeumer · 2020-12-14T20:41:56Z

Introduction

Diagnostics are currently the only data objects in the languages space that an extension pushed to VS Code. The reasoning behind that architecture is as follows:

VS Code wants to encourage extensions to provide workspace diagnostics and not only single file diagnostics. This is especially important for languages that have inter file dependencies and hence a change in file A can result in problems in a file B. In addition in a workspace diagnostic model the extension (e.g. LSP server) can decide when it is a good time to compute the diagnostics and to push them to the server.
diagnostics can easily be streamed since they can be delivered on a file by file basis.

However the approach has also some downsides:

the extension doesn't know which files to prioritize since VS Code has no API to query the visuals (e.g. which files are presented in tabs, ....).
the extension doesn't know if the client is presenting the diagnostics at all (e.g. a corresponding problem status or problems view is visible).
if a language is file based (e.g. all linters) then such an extension usually only validates the open files. But again the VS Code API doesn't allow to query if a file is visible hence extensions rely on the open / close events which fire when the content of a document is 'opened'. This event for example is also emitted if a workspace edit is applied or a hover preview is computed. In these cases extension shouldn't start computing diagnostics since their computation can be expensive.

Pull Model

Instead of letting the server push diagnostics a client could pull them. Such a model would be analogous to the other language features which are usually pull also (for example semantic tokens). However a pull model has some drawbacks as well:

how does a client decide when to pull for workspace diagnostics.
project configuration changes might require to re-pull all diagnostics (e.g. file and workspace based)

Requesting Diagnostics

Pulling for document scoped diagnostics should be done analogous to other providers that are document scoped. A good blueprint could be document symbols for which the pull frequency is adopted to computation time, file size, user typing, ... However it is unclear if the client should pull for all open visible documents or only for the focused one. If the client pulls for all visible documents the pull frequency should be higher for the active document.

Pulling for workspace diagnostics needs to address the following issues:

streaming to allow extensions to hand off computed diagnostics file by file.
delta encoding to avoid sending the same diagnostics over and over again.
ability to abort a current pull and ask the client to re-trigger.

These characteristics are addressed as follows (see API proposal below):

The result of a provider call is a WorkspaceDiagnosticResult to which the extension can add values later on. The result is consider to be open until the extension calls done on it.
Delta encoding is done analogous to semantic tokens by using result ids and two distinct provider functions (a) provideWorkspaceDiagnostics and (b) provideWorkspaceDiagnosticsEdits to pull the delta. An item in a result is identified using the resource property of a DiagnosticResultItem. This means a result item with a resource X replace one from a previous result with the same resource. Items are cleared for a specific resource by using a DiagnosticResultItem with the resource's uri and an empty diagnostics array.
An extension can at any time call done on a WorkspaceDiagnosticResult. The result items delivered are consider to be valid. However the extension can pass in a flag to ask the client to immediately re-trigger a diagnostic pull. Please note that such a re-trigger is even possible if no result items have been add. This allows an extension to signal a busy case indicating that it is not a good time right now to ask for workspace diagnostics.

API Proposal

A diagnostic pull model should be align with the current API design. A possible API could look like this:

	export interface DiagnosticProviderOptions {

		/**
		 * The name with which all diagnostics of a provider will be associated, for instance `typescript`.
		 * This is comparable to [DiagnosticCollection.name](#DiagnosticCollection.name)
		 */
		readonly name?: string;
	}

	export interface DocumentDiagnosticProvider {

		/**
		 * An optional event to signal that all diagnostics from this provider have changed
		 * and should be re-pulled.
		 */
		onDidChangeDiagnostics?: Event<void>;

		/**
		 * Provides diagnostics for the given document.
		 *
		 * @param document The text document to get diagnostics for.
		 * @param token A cancellation token.
		 */
		provideDocumentDiagnostics(document: TextDocument, token: CancellationToken): ProviderResult<Diagnostic[]>;
	}

	export interface DiagnosticResultItem {
		/**
		 * The resource identifer.
		 */
		readonly resource: Uri;

		/**
		 * The diagnostics of the resource. An empty array or `null` indicates
		 * that the diagnostics should be cleared.
		 */
		readonly diagnostics: [] | null;
	}

	export interface WorkspaceDiagnosticResult {

		/**
		 * The result id if available.
		 */
		readonly resultId?: string;

		/**
		 * Adds additional values to the workspace diagnostic result.
		 *
		 * @param values The values to add.
		 */
		add(values: DiagnosticResultItem[]): void;

		/**
		 * Signals that no additional result items will be added to this
		 * diagnostic result.
		 *
		 * @param retrigger Whether the client should immediately re-trigger the workspace pull.
		 */
		done(retrigger?: boolean): void;
	}

	export interface WorkspaceDiagnosticProvider {

		/**
		 * An optional event to signal that all diagnostics from this provider have changed
		 * and should be re-pulled.
		 */
		onDidChangeDiagnostics?: Event<void>;

		/**
		 * Provide diagnostics for the whole workspace.
		 *
		 * @param priorities If possible diagnostics for the provided Uris should be computed with higher priority.
		 * @param token The cancellation token.
		 */
		provideWorkspaceDiagnostics(priorities: Uri[], token: CancellationToken): ProviderResult<WorkspaceDiagnosticResult>;

		/**
		 * Provides a diagnostic delta for the whole workspace relative to a previous result.
		 *
		 * @param priorities If possible diagnostics for the provided Uris should be computed with higher priority.
		 * @param previousResultId The id of a previous result.
		 * @param token The cancellation token.
		 */
		provideWorkspaceDiagnosticsEdits?(priorities: Uri[], previousResultId: string, token: CancellationToken): ProviderResult<WorkspaceDiagnosticResult>;
	}

	export namespace languages {

		export function registerDocumentDiagnosticProvider(selector: DocumentSelector, provider: DocumentDiagnosticProvider, options?: DiagnosticProviderOptions): Disposable;

		export function registerWorkspaceDiagnosticProvider(provider: WorkspaceDiagnosticProvider, options?: DiagnosticProviderOptions): Disposable;
	}

Open questions

Can an extension register both a document and workspace diagnostic provider? IMO VS Code should support this and should treat problems from the document provider with a higher priority as from a workspace provider. This means diagnostics for a resource X from a document provider will shadow diagnostics for a resource X from a workspace provider.
Can an extension register a pull provider and at the same time push diagnostics? IMO VS Code should support such a case with the limitation that the pull and push model need to use a different diagnostic collection name.

The text was updated successfully, but these errors were encountered:

dbaeumer · 2020-12-14T20:43:01Z

@jrieken @aeschli a first writeup for a diagnostic pull model. Feedback highly welcome.

jrieken · 2020-12-15T08:43:47Z

The reasoning behind that architecture is as follows:

I think the most important reason for this being push, not pull hasn't been listed and that's that there is no way to know what files to ask diagnostics for. Typing a file A can introduce errors in file Z and that file might not be active and not be an open tab, however I still wanna see its errors in the error list.

jrieken · 2020-12-15T08:52:08Z

related to #15178

@dbaeumer I am actually surprised that you are proposing this for vscode.d.ts and not just LSP. We will push for an API that reflects the state of open editors (#15178) and that will allow to manage diagnostics for "documents" that are only open as tabs.

dbaeumer · 2020-12-15T10:05:57Z

@jrieken

Typing a file A can introduce errors in file Z and that file might not be active and not be an open tab, however I still wanna see its errors in the error list.

This is what I meant with the statement that VS Code pushed for a workspace diagnostic model. But I will clarify this by adding the file dependency example as well.

I am actually surprised that you are proposing this for vscode.d.ts and not just LSP.

The reason for this is that even if we have an open editor tab model I think it will not be enough. If we consistently decorate file with their errors in the UI how would an extension (e.g. a linter) decide to compute the errors and push them if a file is for example presented in a reference result list?

So I thought that a pull model would benefit VS Code as well and not only LSP. And I think that on an LSP layer I can't promise to implement a pull model correctly in all cases.

dbaeumer · 2020-12-15T10:08:22Z

Update the description with the inter file dependency example.

aeschli · 2020-12-15T10:10:59Z

I like the proposal. We need this for vscode.d.ts as well

it's the missing piece for to enable the programmatic computation for diagnostics on virtual documents in order to compose them in a main document (e.g. for documents with embedded languages, maybe also for notebooks)
the separation of provideDocumentDiagnostics and provideWorkspaceDiagnostics should solve the issue @jrieken mentiones: provideWorkspaceDiagnostics would return all diagnostics for resources related to the current workspace folders, no list of resources as input required.
It will help that we don't have diagnostics reported for virtual documents (e.g. in git diff or for vscode schema files) unless the document is open in an editor and the editor requests for it.
The separation will also be useful for large projects or complex compilers (thinking of cpp) so the editor gets the document diagnostics (using provideDocumentDiagnostics) first. The problems view would use provideWorkspaceDiagnostics.

I would suggest to try first without a provideWorkspaceDiagnosticsEdits variant. The normal number of diagnostics is normally in the hundreds. If it goes into the ten-thousands it doesn't really make sense to list them all, so a LS might decide to collapse some errors (e.g. have a single 'build-path' error on the project).

jrieken · 2020-12-15T10:40:21Z

It will also make sure we don't get diagnostics reported for virtual documents (e.g. in git diff or for vscode schema files) unless the document is open in an editor and the editor requests for it.

That's not correct. The existing push-API will remain forever and therefore you will always have diagnostics being reported for virtual documents.

AntonyBlakey · 2021-01-29T18:17:54Z

A pull model would be useful for embedded languages, which is a use-case I have right now.

dbaeumer · 2021-07-02T08:14:05Z

Will close the issue. A new next version of the LSP libraries containing a pull model implementation got published

dbaeumer self-assigned this Dec 14, 2020

dbaeumer added this to the January 2021 milestone Dec 14, 2020

dbaeumer added feature-request Request for new features or functionality under-discussion Issue is under discussion for relevance, priority, approach labels Dec 14, 2020

dbaeumer modified the milestones: January 2021, February 2021 Jan 28, 2021

dbaeumer mentioned this issue Feb 5, 2021

textDocument/references and textDocument/implementation occasionally opens files microsoft/vscode-languageserver-node#738

Closed

lszomoru modified the milestones: February 2021, March 2021 Feb 26, 2021

dbaeumer modified the milestones: March 2021, April 2021 Mar 24, 2021

alexdima modified the milestones: April 2021, May 2021 Apr 30, 2021

dbaeumer modified the milestones: May 2021, June 2021 Jun 3, 2021

dbaeumer modified the milestones: June 2021, July 2021 Jul 2, 2021

dbaeumer closed this as completed Jul 2, 2021

dbaeumer modified the milestones: July 2021, June 2021 Jul 2, 2021

karlhorky mentioned this issue Aug 13, 2021

Consider using LSP Diagnostic Pull? usernamehw/vscode-error-lens#77

Closed

karlhorky mentioned this issue Aug 13, 2021

Add support for explicitly requesting diagnostics microsoft/language-server-protocol#737

Closed

github-actions bot locked and limited conversation to collaborators Aug 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Investigate a diagnostic pull model #112501

Investigate a diagnostic pull model #112501

dbaeumer commented Dec 14, 2020 •

edited

Loading

dbaeumer commented Dec 14, 2020

jrieken commented Dec 15, 2020

jrieken commented Dec 15, 2020

dbaeumer commented Dec 15, 2020

dbaeumer commented Dec 15, 2020

aeschli commented Dec 15, 2020 •

edited

Loading

jrieken commented Dec 15, 2020

AntonyBlakey commented Jan 29, 2021

dbaeumer commented Jul 2, 2021

Investigate a diagnostic pull model #112501

Investigate a diagnostic pull model #112501

Comments

dbaeumer commented Dec 14, 2020 • edited Loading

Introduction

Pull Model

Requesting Diagnostics

API Proposal

Open questions

dbaeumer commented Dec 14, 2020

jrieken commented Dec 15, 2020

jrieken commented Dec 15, 2020

dbaeumer commented Dec 15, 2020

dbaeumer commented Dec 15, 2020

aeschli commented Dec 15, 2020 • edited Loading

jrieken commented Dec 15, 2020

AntonyBlakey commented Jan 29, 2021

dbaeumer commented Jul 2, 2021

dbaeumer commented Dec 14, 2020 •

edited

Loading

aeschli commented Dec 15, 2020 •

edited

Loading