Virtual file system support in TSServer #47600
Comments
@andrewbranch Here are the steps I've been using to test local ts changes on a local web build of VS Code:
|
I'm not sure I understand why a VFS would need an identifier - do we anticipate having multiple or mixing VFS and actual FS access? Why wouldn't this command just say "this is your view of the FS until further notice"? |
Possibly relevant: I believe the tests already virtualize FS access. |
Do we have a sense of how much slower this would be than a more specialized API saying "this tarball is your FS"? If it's substantial, we might want a "payload kind" property. |
On the basic web, I think everything will be on the VFS. But on desktop VS Code, you can end up in situations where some files exist on disk and some exist on a virtual file system. This could happen if you create a workspace that has one folder from disk and one from somewhere like GitHub or OneDrive, for example.
My suspicion is that in most workspaces, the number of js/ts files we need to send over will be pretty small (< 100). For something like the typescript project, we would have to send a lot of files though. I don't expect the data transfer to be the main bottleneck but we will need to test this (we could also try to optimize it using We can also have a cap of the number of files we send if we do run into issues |
A qualified guesstimate is "F1 > Measure Extension Host Latency". Take it with a grain of salt, but on vscode.dev with Safari I see up/download speeds of ~2000Mbs. It might also be noteworthy that a virtual file system is the more generic solution, e.g. vscode.dev also supports ADO, and there no tarball is available. A VFS can abstract that away and give room for other optimisations like a browser-side git clone etc. |
I'm going to read through the proposal and just plop a bunch of discussion points that I want to work through. Please note it's all constructive, I'm just trying to think through the scenarios!
One of the concerns here is deciding what is explicit and implicit. We should get a sense of when it is and isn't useful to make these distinctions. For example, you create a file called We can "do the right thing", but off the bat this can't support things like symlinks, other file-like abstractions, weird niche systems that could allow files and folders with the same name, etc. The more we see vscode.dev doing in the future, the more this sort of stuff will matter. I think for me, the biggest concern is symlinks. If you're planning on simulating
On this note, if we do go beyond In other words, what I'd hope for with this is that it's a discriminated union type. Maybe I'm going overkill.
I know this might just be a rough sketch now, but should this be
I dunno if @amcasey already asked this, but what else goes in place of
This leads me into a few questions:
That's cool! Still, we'd need to be a little careful I assume. Is a worker able to hold all this in memory? We'll need to make sure no client ever says "we just sent you 2 GB, good luck, have fun". Let me ask a weirder question: why wouldn't VS Code on desktop always talk to TypeScript through a VFS? We already have twice the number of file-watchers that we need. I think the answer is that language servers contain the necessary logic to filter down files - but if transfer speed is not a problem, can you just toss everything over to TS all the time, regardless of vscode.dev vs. VS Code desktop? Feels like a crazy suggestion, but maybe it's an assumption worth revisiting! |
Uh, and I guess editors also don't want to traverse and read every single file in a workspace. 🙄🤦♂️ But then if that's a concern, how would we get ATA to be speedy? |
I had similar questions about whether this can/should be enabled alongside a real FS on desktop. That could open up a lot of possibilities for VS Code extensions / LS plugins, but could also become a source of hard-to-diagnose bugs, and could make it difficult for us to update the API if it saw a surge of third-party adoption. |
Here's another thing - is this a new URI protocol style we would want to respond to? Would it make more sense for the request to talk about files being in
|
Thanks for the feedback. Just a few thoughts on some of these points:
I think a more explicit, tree-like structure for the file system has a few benefits but may be more difficult to implement/work with. For example, using a tree would let us express empty directories but then we'd also have to tell TS when a directory is deleted. Let's talk about this more
In general, I think we should try aligning with VS Code's model of file systems. That way we don't have to worry about aspects such as case-sensitive vs case-insensitive file names. Symlinks are interesting though. I believe (but haven't actually tested this yet) that VS Code's virtual file system supports them. Something to consider for the design even if they are not supported in a v1.
Interesting thought! With this proposal, the downside is that we have to eagerly sync over the VFS contents. In previous discussions however, we did talk about having TS be able to delegate file system operations back to VS Code. The main problem we ran into is that TS expects file reads to be sync, and there's not a good way to implement that since we'd have to communicate back and forth with the VS Code process
Yes I like that, although this protocol may need to completely change if we decide we need a more explicit api |
I want to dig into that more to better clarify how to model mixed virtual/non-virtual files in the same project. Let's say you have a project on disk with If you allow these files to mix, then my assumption is that the virtual file system has to take precedence for all LS operations. But I don't know what TSServer would do with the old To continue, let's say the client editor says " These might be weird edge-cases, but I want us to have a good mental model and some well-understood behavior here.
Yeah, that is weird - but this current idea might be less chatty than something like that. This isn't something we have to do for v1 of course. |
One thing that came up in our sync today was whether we had to proactively defend against extremely large files, or too many files that are irrelevant to TS. Our thinking is that we should stick to something simple at first, but that we have room to grow and improve the experience here. One idea was to have editors send over truncated files, or omit their contents entirely, and mark those files as "proactively omitted". The server could then signal "hey, we actually did need these files if they're still around", and an editor could choose to provide them. |
@mjbvz is it possible for me to work on this? We are planning to use VS Code in our products, and I have a good idea of what is being discussed in the thread. |
@ameerthehacker sorry it wasn't marked that way, but @sandersn is already working on this. |
cool thanks for the update @DanielRosenwasser |
Delegating file system operations to the client would be interesting, as you could then delegate the operation to more than just a virtual filesystem or node's fs. The possibility to have it operate on whatwg/fs would be neat too, so you can pick a folder from the user's own hard drive. |
Just a question, will it be possible to "connect" TypeScript to Docker. Thus developers could avoid duplicating "npm install" and linking workspace in docker and on local machine. So there will be no node_modules in local file system, but VS Code will show file tree from docker and provide type checking etc. ? |
@aspirisen, sounds like you just want to take advantage of VS Code's remote functionality to develop inside of a container. That's probably preferable for you since you want some backing computation and filesystem within a (Docker) container. If there's a use-case I'm missing I'd encourage filing a separate issue. |
I like that idea, it could even be implemented similar to how it's done already when TS Server watches a specific file or folder. I'm not sure if when you request to watch a file if you get an initial response with the contents, but we could do that in this API. Although it could end up being very chatty, it means that the LS decides what files it wants to watch (as it already does today) and it just tells VSCode what it wants to watch. This could also improve the file watch situation that others have mentioned when running in normal FS mode, TS Server can delegate the watch to VSCode instead of directly to the FS. |
Asynchronous VFS can be safely implemented with synchronous server.
I have implemented that mechanism years ago with synchronous LanguageService/Host APIs loading large codebases in browser. It's not too much work, and removes the need to pass WHOLE massive VFS up-front. And being generic solution, it removes all the scattered complexities of another level of abstraction. |
This method can also tackle super-massive files safely too, feeding it into server in repeated chunks. In fact I had implemented that too for loading As far as synchronous server is concerned, the file is starting as 10Kb, then in a second becomes 20Kb, then 30Kb and so on. Language server is already capable of handling malformed text, and it's doing a decent enough job of incremental parsing too. There are various caveats about lexical scopes that may turn this inefficient though. Intra-file async-to-sync streaming is not as big a win as whole-file. |
Update: instead of making tsserver support virtual file systems, we've been instead working with vscode to use SharedArrayBuffer in the browser only. This means that the ServerHost knows about the virtual filesystem but tsserver itself does not. This supports the web-based scenarios in the proposal, but not the desktop virtual filesystem one. |
I've been waiting for this proposal for months, but found out today that it was closed. It's very important for working on virtual filesystems like ftp filesystems, hopefully it can be reconsidered. |
@sandersn Is there a PR or some other link to this work that you can share? |
The host I'm writing is at: https://github.com/sandersn/vscode-wasm-typescript/ (published to npm under the same name) @hyb1996 I agree that real virtual filesystem support would be nice, but for my purposes, it makes more sense for the web-specific parts to be in the server host. That's a lot less true for integrating a virtual fs with a real one, but you might still be able to make it work there too. |
This is what I do on an LSP language server. :D (Just for reference: https://github.com/johnsoncodehk/volar/blob/aff3d7c0896a391412a605597adca7d796e9accf/packages/language-server/src/utils/webFileSystemHost.ts) |
Can the issue be re-opened for virtual filesystem support on desktop ? |
This proposal discusses adding support for virtual file systems (VFS) to TSServer. The contents of a virtual file system would be controlled by a client. Using virtual file systems, we believe we can deliver advanced features such as cross-file IntelliSense on vscode.dev and github.dev.
Context
The TypeScript server can currently work with two types of files: those on-disk and those in-memory (indicated by opening the file with a `^` prefix on the path). For the purposes of this discussion, on-disk files are files that the TSServer can independently read using nodejs file system apis, while the contents of in-memory files must always be synchronized with TSServer by a client.
Many IntelliSense features are only possible for on-disk files. This includes resolving imports across files, looking up typings, and constructing projects from a jsconfig or tsconfig. In all of these cases, TS implements these features by walking directories and reading files from the disk. None of this is currently possible for in-memory files.
However, on VS Code, users are increasingly using virtual workspaces that TSServer cannot read directly. On GitHub.dev and vscode.dev for example, the workspace is provided by a file system provider that reads the workspace contents directly from GitHub or other code storage services. While we can synchronize the opened editors over to the TS Server, IntelliSense support for them is still quite limited.
Bringing proper virtual file system support to TSServer seems like the best solution to enable a desktop-like IntelliSense experience on GitHub.dev and vscode.dev.
Motivating use cases
Cross-file IntelliSense on web
When a user opens a github.dev or vscode.dev workspace, we would like to provide cross-file IntelliSense by resolving imports. Eventually we would even like to provide project IntelliSense by parsing tsconfig/jsconfig files.
To implement this, we need to synchronize the workspace contents over to the TS Server so that the server can read files besides the ones that are currently opened.
Support for virtual workspaces on desktop
With desktop versions of VS Code, users can also open virtual workspaces. Working with JS/TS files in these virtual workspaces should be just like working with JS/TS files on-disk.
The requirements to implement this are almost identical to the web case listed above.
Automatic Type Acquisition (ATA) on web
When a user opens a JS/TS file from github.dev or vscode.dev, we would like to automatically download typings to provide better IntelliSense.
To implement this, we need a way to tell TS about typings files and where these `d.ts` files live within the project. Again, this is not possible today, but we believe it could be implemented using virtual file systems.
Additional goals
Do not introduce VS Code specific concepts even though VS Code will be the largest consumer.
Do not require a significant rewrite of the entire compiler/server. For example, the server is currently synchronous, so our proposal must not require converting it to be asynchronous.
Out of scope
This proposal only discusses virtual file system support. We will discuss the specifics of the individual use cases above in separate issues.
Proposal
For the purposes of this proposal, a virtual file system (VFS) is an in-memory representation of a file system. The structure and contents of the VFS are provided to TSServer by the client. TSServer will use its in-memory VFS to implement file system operations, such as file reads and directory walks. By routing these operations through the VFS, we should be able to implement features such as cross-file IntelliSense without having to rewrite the entire server.
Implementing virtual file system support will require:
This proposal focuses only on the protocol. I don't have enough knowledge of TSServer's internals to come up with a plan for actually implementing it.
Protocol
updateFileSystem
`updateFileSystem` is a new protocol request that clients use to update the contents of a VFS. It is inspired by `updateOpen` and would take a list of created, deleted, and updated files on the VFS.
Virtual file systems each have a unique identifier. This identifier is used in calls to `updateFileSystem` and also will be used to open a file against a specific VFS.
Here's an example request for a `memfs` VFS:
The above proposal takes a flat list of files similar to `updateOpen`. If we think it would be more convenient, we could instead take a tree-like structure.
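The example request itself is not reproduced in this copy of the issue. As a rough illustration only, a flat-list payload might look something like the sketch below. The `seq`/`type`/`command`/`arguments` envelope follows the existing tsserver request shape; every other name here (`fileSystem`, `created`, `updated`, `deleted`, `VfsFile`, `makeUpdateFileSystemRequest`) is an assumption made for illustration, not the field names from the proposal.

```typescript
// Hypothetical shapes: only the command name `updateFileSystem` and the
// `memfs` identifier come from the proposal text. All field names are assumed.
interface VfsFile {
  path: string; // path inside the VFS, e.g. "/workspace/index.ts"
  contents: string; // full text of the file
}

interface UpdateFileSystemArgs {
  fileSystem: string; // unique identifier of the VFS being updated
  created?: VfsFile[]; // files that did not exist before
  updated?: VfsFile[]; // files whose contents changed
  deleted?: string[]; // paths of removed files
}

interface UpdateFileSystemRequest {
  seq: number;
  type: "request";
  command: "updateFileSystem";
  arguments: UpdateFileSystemArgs;
}

// Wraps the arguments in the usual tsserver request envelope.
function makeUpdateFileSystemRequest(
  seq: number,
  args: UpdateFileSystemArgs
): UpdateFileSystemRequest {
  return { seq, type: "request", command: "updateFileSystem", arguments: args };
}

// A flat list of two files on the `memfs` VFS.
const exampleRequest = makeUpdateFileSystemRequest(1, {
  fileSystem: "memfs",
  created: [
    { path: "/workspace/index.ts", contents: 'import { abc } from "./sub/abc";' },
    { path: "/workspace/sub/abc.ts", contents: "export const abc = 1;" },
  ],
});
```

A flat list like this cannot express an empty directory, which is one of the trade-offs against a tree-like structure raised in the comments.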
When TSServer receives an `updateFileSystem` request, it must update its internal in-memory representation of this VFS. However, it should not yet start processing any of these files.
Open file on a given VFS
After initializing a VFS, clients also need to then open a specific file on the VFS. For this, I propose we introduce a new style of path that can be used to talk about resources on a VFS:
This style of path is inspired by VS Code's uris. We would need to add support for them to all places in the protocol where we take or return a path.
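The path example is missing from this copy of the issue, but the walkthrough later uses `memfs:/workspace/sub/abc.ts`, which suggests a `<identifier>:<absolute path>` form. Assuming that form, a minimal sketch of splitting and rebuilding such paths could look like this. `VfsPath`, `parseVfsPath`, and `formatVfsPath` are hypothetical helper names, and the rule that an identifier has at least two characters (so Windows drive letters such as `c:/` are not mistaken for a VFS) is also an assumption.

```typescript
// Hypothetical helpers for the `<identifier>:<absolute path>` form.
interface VfsPath {
  fileSystem: string; // VFS identifier, e.g. "memfs"
  path: string; // absolute path inside that VFS
}

// Returns undefined for plain on-disk paths such as "/home/user/a.ts" or "c:/a.ts".
function parseVfsPath(input: string): VfsPath | undefined {
  // Identifier: a letter followed by at least one more allowed character,
  // then ":" and an absolute path starting with "/".
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]+):(\/.*)$/.exec(input);
  if (!match) {
    return undefined;
  }
  return { fileSystem: match[1], path: match[2] };
}

// Inverse of parseVfsPath.
function formatVfsPath(vfsPath: VfsPath): string {
  return vfsPath.fileSystem + ":" + vfsPath.path;
}
```

Keeping the parse and format steps in one place would matter if, as the text says, every protocol message that takes or returns a path has to understand this form.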
Example
Let's walk through how VS Code could implement workspace-wide IntelliSense on vscode.dev using this proposal.
VS Code downloads and caches the entire contents of the workspace
This is already implemented on the VS Code side.
VS Code sends a static copy of the workspace over to TS Server using `updateFileSystem`
TS Server receives the file system contents and sets up its own representation of the virtual file system.
With the above request, TS server would construct an in-memory representation of the file system that looks like:
At this point, TS Server should not yet process any of these files or treat them as part of a TypeScript project. The files are only held in memory and can be read later.
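To make the "held in memory, read later" idea concrete, here is a minimal sketch of a server-side store that applies flat created/updated/deleted lists and answers synchronous file system queries. It is not how TSServer is implemented; `InMemoryFileSystem`, `applyUpdate`, and the update field names are all assumptions for illustration.

```typescript
// Hypothetical in-memory store for one VFS. Nothing here parses or type-checks
// the files; it only records them so later reads can be answered synchronously.
interface StoredFile {
  path: string;
  contents: string;
}

interface FileSystemUpdate {
  created?: StoredFile[];
  updated?: StoredFile[];
  deleted?: string[];
}

class InMemoryFileSystem {
  private readonly files = new Map<string, string>();

  applyUpdate(update: FileSystemUpdate): void {
    for (const path of update.deleted ?? []) {
      this.files.delete(path);
    }
    for (const file of (update.created ?? []).concat(update.updated ?? [])) {
      this.files.set(file.path, file.contents);
    }
  }

  fileExists(path: string): boolean {
    return this.files.has(path);
  }

  readFile(path: string): string | undefined {
    return this.files.get(path);
  }

  // Names of the immediate children of `dir`, sorted. Directories are implied
  // by file paths, so an empty directory cannot be represented.
  readDirectory(dir: string): string[] {
    const prefix = dir.endsWith("/") ? dir : dir + "/";
    const names: string[] = [];
    this.files.forEach((_contents, path) => {
      if (path.startsWith(prefix)) {
        const name = path.slice(prefix.length).split("/")[0];
        if (names.indexOf(name) === -1) {
          names.push(name);
        }
      }
    });
    return names.sort();
  }

  directoryExists(dir: string): boolean {
    return this.readDirectory(dir).length > 0;
  }
}
```

Because directories only exist implicitly here, deleting the last file in a folder also removes the folder, which is the limitation of the flat-list design mentioned in the discussion.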
VS Code opens `index.js` on the virtual file system
Let's assume this happens because the user clicked on `index.js` to view it.
At this point, VS Code uses a normal `updateOpen` call to tell TS server that the user has opened a JS or TS file. This file is part of the virtual file system.
TS constructs project representation
After `index.ts` is opened, TS processes it and starts building up a representation of the TS project. In this case, it sees the import `./sub/abc` in `index.ts` and attempts to resolve the import. Using the virtual file system and opened files, the server first checks if the file `memfs:/workspace/sub/abc.ts` exists. Here all file system operations need to be routed through the virtual file system instead of trying to go to disk.
User requests `go to definition` on a reference to `abc` in `index.js`
Here VS Code would send a `definitionAndBoundSpan` request:
The server uses the VFS to respond
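The import resolution step in the walkthrough above can be sketched as follows. This is a deliberately simplified stand-in for TypeScript's real module resolution: it only handles relative specifiers and tries a fixed list of extensions. The function name `resolveRelativeImport`, the extension order, and the `fileExists` callback are assumptions; the point is only that every existence check goes through a callback backed by the VFS rather than through disk access.

```typescript
// Callback that answers "does this path exist?" from the VFS, never from disk.
type FileExists = (path: string) => boolean;

// Simplified, hypothetical resolver for relative imports such as "./sub/abc".
function resolveRelativeImport(
  containingFile: string,
  specifier: string,
  fileExists: FileExists
): string | undefined {
  // Split "memfs:/workspace/index.ts" into "memfs:" and "/workspace/index.ts".
  const colon = containingFile.indexOf(":");
  const prefix = containingFile.slice(0, colon + 1);
  const segments = containingFile.slice(colon + 1).split("/").slice(0, -1);

  // Apply the specifier's segments to the containing directory.
  for (const part of specifier.split("/")) {
    if (part === "." || part === "") {
      continue;
    }
    if (part === "..") {
      segments.pop();
    } else {
      segments.push(part);
    }
  }

  // Try each candidate extension against the VFS.
  const base = prefix + segments.join("/");
  for (const extension of [".ts", ".tsx", ".d.ts", ".js"]) {
    if (fileExists(base + extension)) {
      return base + extension;
    }
  }
  return undefined;
}
```

With a VFS containing `memfs:/workspace/sub/abc.ts`, resolving `./sub/abc` from `memfs:/workspace/index.ts` yields that file, matching the first check described in the walkthrough.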
Alternatives considered
Delegate file system operations to the client
Instead of eagerly syncing the VFS over to TSServer, we could instead delegate individual file system operations back to the client.
This is likely not possible without a significant rewrite of the server. The server expects file system operations to be synchronous, and there is no good way to synchronously communicate from the TSServer worker process back to the main VS Code extension host process. Even if we could implement synchronous calls, doing so would not be ideal and would result in a large number of messages getting passed back and forth between the client and server.