Skip to content

Conversation

gabritto
Copy link
Member

Follow up to #1767.

In that PR, we computed documentPositionMappers based on declaration map files and cached them at the language service level. This can be rather inefficient because a language service is created per request, so the cached position mappers were recomputed for every LS request.

In this PR, we cache position mappers in the source file structures themselves, and those get persisted across snapshots and therefore across different language services. We achieve this in the following way:

  1. When we get a request in the server, we get an LS from session that is based on the current snapshot. We use this LS to compute the response.
  2. We collect all file paths present in the response which could be subjected to declaration mapping.
  3. We ask session for a new LS that is based on the old snapshot but contains declaration maps for the files collected in the previous step.
  4. We compute a new response by mapping old positions to new positions using this new LS with declaration map information.

To implement step 3, we now have a new operation on snapshots, CloneWithSourceMaps, which clones a snapshot with additional declaration map information. This is done by creating a snapshotFSBuilder based on the current snapshot, and having the snapshot FS builder read additional files and compute declaration map information.

To persist these newly read files and declaration map information across future snapshots, we also add these files and computed information to the session's current snapshot. This is implemented in CloneWithDiskChanges. We also update file watchers to include any newly read disk file that was added as part of those changes.

Note: a snapshot FS has two kinds of files cached, overlays and disk files. Overlays are files open in the client, and their contents may or may not match the corresponding disk files, if those exist. To simplify the implementation here, we only compute declaration map information based on disk files. That means if a .d.ts file or .d.ts.map file is open in the client, we won't use the overlay contents, and will instead read from disk. If the overlay content matches the disk content, there's no noticeable difference. If it doesn't, there will be a mismatch in the mapped positions. But in this scenario, something has gone wrong already: if the user edits a .d.ts file in the client (but doesn't save it to disk), and that .d.ts had a declaration map generated for it, the mapping of positions is already going to be wrong.

)

func FileNameToDocumentURI(fileName string) lsproto.DocumentUri {
if bundled.IsBundled(fileName) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed to avoid crashes in the tests, since a file name for a bundled file doesn't round trip with the current implementations of FileNameToDocumentURI and documentUri.FileName().

}
}

func registerDefinitionHandler(handlers handlerMap) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm planning on making this generic when we add position mapping for other requests.


// GetECMALineInfo implements sourcemap.Host.
func (s *snapshotFSBuilder) GetECMALineInfo(fileName string) *sourcemap.ECMALineInfo {
if file := s.getDiskFile(fileName); file != nil {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argument for why it's safe to call file.ECMALineInfo() here is kind of subtle, but basically boils down to the fact that, during the process of computing a new snapshot with added map info, we only ever read files or add the computed map info to existing files. We never change a file's content.

if isInline {
// Store document mapper directly in disk file for an inline source map
docMapper := sourcemap.ConvertDocumentToSourceMapper(s, url, genFileName)
entry.Change(func(file *diskFile) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not atomic or following any concurrency safe pattern because we don't need to: these are always computed sequentially for now.

@gabritto gabritto marked this pull request as ready for review October 10, 2025 19:57
@Copilot Copilot AI review requested due to automatic review settings October 10, 2025 19:57
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements persistently cached declaration maps to improve performance. Previously, declaration maps were computed at the language service level and recomputed for each LS request, making the system inefficient.

Key changes include:

  • Caching source map information in source file structures that persist across snapshots
  • Adding CloneWithSourceMaps and CloneWithDiskChanges operations to snapshots
  • Modifying the definition handler to use source maps for accurate position mapping

Reviewed Changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
internal/sourcemap/source_mapper.go Split source map parsing logic to separate URL detection from mapper creation
internal/project/snapshotfs.go Added source map computation and caching logic to snapshot file system
internal/project/snapshot.go Added new snapshot cloning operations for source maps and disk changes
internal/project/session.go Updated language service creation to return snapshots and handle mapped files
internal/ls/definition.go Modified definition handling to extract files for mapping and apply source maps
internal/ls/source_map.go Updated source mapping logic to work with new architecture
internal/ls/languageservice.go Simplified to delegate source map operations to host
internal/ls/host.go Updated interface to provide document position mappers
internal/lsp/server.go Added specialized definition handler with source map support
internal/project/watch.go Extended file watching patterns to include declaration maps
testdata/baselines/reference/fourslash/goToDefinition/declarationMapGoToDefinitionChanges.baseline.jsonc Added test baseline for declaration map functionality

if sourceMapInfo.sourceMapPath != "" {
referencedToFile[s.toPath(sourceMapInfo.sourceMapPath)] = entry.Key()
} else if mapper := sourceMapInfo.documentMapper.m; mapper != nil {
for _, sourceFile := range mapper.GetSourceFiles() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How/when/why does a source mapper point to multiple files? I'm having trouble wrapping my head around what this means.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this use to be --out which we no longer support ?

@jakebailey
Copy link
Member

Do we know in general how much faster this is compared to the more naive but simple approach? I was kind of hoping we could get away with "caching within a single request" rather than retaining this memory in later snapshots...

@gabritto
Copy link
Member Author

Do we know in general how much faster this is compared to the more naive but simple approach? I was kind of hoping we could get away with "caching within a single request" rather than retaining this memory in later snapshots...

I'm working on a comparison, just happened to find a bug while doing this, so it's taking a while.

@gabritto
Copy link
Member Author

Do we know in general how much faster this is compared to the more naive but simple approach? I was kind of hoping we could get away with "caching within a single request" rather than retaining this memory in later snapshots...

Did a quick test with a library that had declaration maps which is used in the large repo I was investigating a while back.
Basically using this file:

import { CSSStyleDeclaration } from "happy-dom"

and going to definition on CSSStyleDeclaration.

With current per-request cache:
Time to compute the source map information we need for the request is ~17ms (i.e. total time of calling GetDocumentPositionMapper).
Definition response time was 20-22ms.

With persistent caches:
First definition request:
~17ms to clone the source maps with computed source map information.
Definition response time: 19ms.

Next definition requests (including after changing the file we're requesting definitions from):
Definition response time: 1ms.

Now, with go to definition, I don't think it's that big of a difference. However, we are going to soon add support for source mapping to find all references, and we are going to have cross-project find all references. So in this scenario, I can see these ~17ms per source map adding up to a noticeable difference.
Note the source map used for the test above is for a .d.ts file of ~1200 lines. I think this is on the larger side, but I could be wrong.

@sheetalkamat what do you think? Do you think we'll need this more efficient version of source map handling?

@gabritto gabritto requested a review from andrewbranch October 14, 2025 20:25
@jakebailey
Copy link
Member

jakebailey commented Oct 15, 2025

Now, with go to definition, I don't think it's that big of a difference. However, we are going to soon add support for source mapping to find all references, and we are going to have cross-project find all references. So in this scenario, I can see these ~17ms per source map adding up to a noticeable difference.

I feel like it has to be the case that the number of mappings we need in any given case has a low bound. When are we actually going to be in the situation where we are searching for say definitions and need to look past more than one .d.ts.map?

Find all refs I can understand, but only if source file redirection is disabled, right?

(I would also have sort of expected Lattice to have complained with the current implementation if it were slow but seemingly not. Maybe we need to test it in some internal repos that don't do references?)

@gabritto
Copy link
Member Author

I feel like it has to be the case that the number of mappings we need in any given case has a low bound. When are we actually going to be in the situation where we are searching for say definitions and need to look past more than one .d.ts.map?
Find all refs I can understand, but only if source file redirection is disabled, right?

Go to definition I don't think will be a problem, which is why I mentioned FAR, but maybe even that is not really a problem, I had forgotten about disableSourceOfProjectReferenceRedirect (and how many people even have disableSourceOfProjectReferenceRedirect on?). There's also navTo/workspace symbols which uses source mapping, though again, not sure if that could be problematic.

(I would also have sort of expected Lattice to have complained with the current implementation if it were slow but seemingly not. Maybe we need to test it in some internal repos that don't do references?)

Any ideas which repos we could test on?

I'm ok to just leave this PR here in case we find we need it later. At least we know how to do something like this, if we need it for other things.

@sheetalkamat
Copy link
Member

I think when we added this "source of project reference" didnt exist so it was like everyone using this. And all scenarios use be affected. If we can keep source map per file - and keep it till the file doesnt change - that should be good enough no given we dont have to handle "--out" scenario? - this also allows sharing across projects (*there use to be some projects but i dont remember which that didnt use project references but had files shared across project - which i think is case when you use disableSourceOfProjectReference?)

The main problem we use to see then use to be those temporary file opens that would loose the source mapper if we stored it with program in LS ( again there use to be problem that these would change the program which we later fixed for more efficient way and i think is present in current snapshot one so we dont have to worry about )

@sheetalkamat
Copy link
Member

and yes FAR does do source map jumping

@jakebailey
Copy link
Member

Go to definition I don't think will be a problem, which is why I mentioned FAR, but maybe even that is not really a problem

If we are caching things per request, then even multi-project FAR that may reread the same file would still work, since the cached thing would still be there (since this is happening a layer above). I feel like it's probable we could save a lot of memory / complexity by not doing any persistence.

If we can keep source map per file - and keep it till the file doesnt change

I think we can do that, I just wonder if it's worth 20ms to store the mapping in memory potentially forever, given it's likely you will never rebuild the dependent project. Right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants