Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove legacy symbol table to improve server responsiveness for large projects #4420

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

keyboardDrummer
Copy link
Member

While working on #4419, profiling showed that the majority of processing time was spent on migrating symbols from the legacy symbol table. The large file test takes 16s with that migration and 5s without it. The legacy symbol table migration migrates symbols from all files, regardless of whether these symbols are in files that have been edited or not, so it does not scale to large projects. Since we're phasing out the legacy symbol table anyways, this PR removes it altogether instead of improving the migration performance.

The legacy symbol table was still used for two features that are removed as part of this PR:

  • Code completion. Code completion has to work on files that don't parse. This was done by taking the symbol table from a previous resolution and migrating it syntactically, and then using heuristics to determine what is in scope on the .. I would consider this feature experimental, since I don't believe it was complete or reliable: it's not tied to the logic of the resolver and it operates on outdated resolution state.
  • Signature help (help when calling a method/function). For reasons similar as the above, I would consider this feature experimental.

Changes

  • Improve performance of large projects by no longer migrating all project symbols on each edit, with no cancellation.
  • Remove existing code completion support
  • Remove existing signature help support

Testing

No testing needed

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

@keyboardDrummer keyboardDrummer changed the title Remove legacy symbol table Improve server responsiveness for large projects Aug 15, 2023
@MikaelMayer
Copy link
Member

While working on #4419, profiling showed that the majority of processing time was spent on migrating symbols from the legacy symbol table. The large file test takes 16s with that migration and 5s without it. The legacy symbol table migration migrates symbols from all files, regardless of whether these symbols are in files that have been edited or not, so it does not scale to large projects. Since we're phasing out the legacy symbol table anyways, this PR removes it altogether instead of improving the migration performance.

The legacy symbol table was still used for two features that are removed as part of this PR:

  • Code completion. Code completion has to work on files that don't parse. This was done by taking the symbol table from a previous resolution and migrating it syntactically, and then using heuristics to determine what is in scope on the .. I would consider this feature experimental, since I don't believe it was complete or reliable: it's not tied to the logic of the resolver and it operates on outdated resolution state.
  • Signature help (help when calling a method/function). For reasons similar as the above, I would consider this feature experimental.

Changes

  • Improve performance of large projects by no longer migrating all project symbols on each edit, with no cancellation.
  • Remove existing code completion support
  • Remove existing signature help support

Testing

No testing needed

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

I'm not very comfortable to delete these two features either. I'm at least using signature help regularly.
Could we rather optimize the existing migration so that we don't migrate symbols unnecessarily?

@robin-aws
Copy link
Member

How about just disabling the migration and the two features by default instead? That way users that want to still use the features can opt in, and we can hopefully have them back on by default once we have them properly implemented in the future without using the legacy symbol table.

@keyboardDrummer
Copy link
Member Author

keyboardDrummer commented Aug 16, 2023

Robin: That way users that want to still use the features can opt in, and we can hopefully have them back on by default once we have them properly implemented in the future without using the legacy symbol table.

I think it may well take more than a year before these features are completed, because I plan on doing that using error correction in the parser and resolver changes so it supports expression holes and doesn't do early exits on errors, two complex features that likely won't get prioritized soon. I think doing it this way, as opposed to the current approach, is the only way to make it reliable.

I thought that indeed making these non-default opt-in features would be a safer change than this one, but honestly I doubt anyone will use it if it's non-default with the caveat that it might reduce IDE responsiveness. Removing the code altogether has the upside of reducing maintenance cost. My feeling is that if these features wouldn't have already been turned on, removing the code would be a much better choice than releasing them.

Mikael: I'm not very comfortable to delete these two features either. I'm at least using signature help regularly.
Could we rather optimize the existing migration so that we don't migrate symbols unnecessarily?

The change I could make is to scope the migration to URIs, so at least you only have to migrate symbols in the edited file, but I'm not sure how fast that'll make it. Customer codebases have files in the thousands of lines that contain over a thousand symbol locations per file, so it might still end up taking some milliseconds.

Those milliseconds might not seem like a lot but they have an additional downside.

UpdateDocument can not be called if it's currently running, so migration, which happens synchronously in UpdateDocument, prevents UpdateDocument from being called, which prevents cancellation from being triggered which means doing unnecessary work. In debugging large files I see a large latency between user edits and UpdateDocument calls and that causes a queue of scheduled calls to UpdateDocument to build up that can take a long time to clear. We've had users complain that the IDE becomes unresponsive and needs to be restarted, and the above is my best explanation for that behavior.

An earlier unmerged PR of mine (https://github.com/dafny-lang/dafny/pull/4002/files) changes UpdateDocument so it doesn't do any actual work but schedules everything onto a worker thread, and this means cancellation can be triggered reliably. I'm unsure of whether that's the right thing to do, and for now I'd rather not go that way since it adds additional complexity. I would rather go the way of making the work that UpdateDocument does be constant time. That means changing our in memory text buffer so it can handle incremental edits, and changing migration so it is also constant time.

Anyways, I will see if I can make a small PR to scope the migration to the changed URI, and we can see from there whether we need more changes.


Side note

For migration, I think it can be made constant time by not letting the IDE/compiler work with absolute positions but with migrating positions that can resolved to absolute positions and which update with the text changes. I used this technique when working on incremental parsing, where the source locations in the AST also needed to be updated efficiently when making small changes to a large file.

@keyboardDrummer keyboardDrummer changed the title Improve server responsiveness for large projects Remove legacy symbol table to improve server responsiveness for large projects Aug 18, 2023
@keyboardDrummer
Copy link
Member Author

Currently I'm running this test:

[Fact]
public async Task ManyFastEditsOnLargeFile() {
    string GetLineContent(int index) => $"method Foo{index}() {{ assume false; }}";
    var contentBuilder = new StringBuilder();
    for (int lineNumber = 0; lineNumber < 1000; lineNumber++) {
      contentBuilder.AppendLine(GetLineContent(lineNumber));
    }
    var documentItem = await CreateAndOpenTestDocument(contentBuilder.ToString(), "ManyFastEditsUsingLargeFiles.dfy");
    for (int i = 0; i < 100; i++) {
      ApplyChange(ref documentItem, new Range(0, 0, 0, 0), "// added this comment\n");
    }

    await client.WaitForNotificationCompletionAsync(documentItem.Uri, CancellationToken);
    await AssertNoDiagnosticsAreComing(CancellationToken);
}

Which takes 3 seconds to complete (with several unmerged optimizations), and ~500ms are spent migrating legacy symbol table state.

That's about 5ms per user edit, which I think is acceptable for now. In the long term though I think this is not good enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants