Coloring of large files #45944

Closed
rebornix opened this issue Mar 16, 2018 · 0 comments

When users open a file, we try to restore the view state from the last time they closed it. This means that once the file is open, users are usually not looking at the top of the document. When the file is large, they first see uncolored (white) code, and the colored content only appears a few hundred ms later. This is because our tokenization is not fast enough.

To get a correct tokenization state, we have to tokenize the document from the first line to the last. Even so, we can try to guess the starting state of tokenization when viewing an arbitrary viewport. If our guess is accurate enough, users get a better experience than the current one, even if the colors are occasionally wrong for a short time.

There are several ways to guess:

  1. Go backwards 100 lines (or some other magic number) and start the tokenization from there.
  2. Guess by indentation. Say the first line of the viewport has indentation level 3: we can go backwards to find the nearest lines whose indentation levels are 2, 1, and 0, and then tokenize those three lines followed by the viewport.
  3. Go backwards a few lines to see whether we are inside a block comment or string. If not, tokenize directly from the first line of the viewport. This can work together with option 1.
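
The three options above can be sketched roughly as follows. This is a minimal illustration, not code from the VS Code codebase: the function names, the `tabSize` default, the 50-line lookback in option 3, and the restriction to C-style comments are all assumptions made for the example.

```typescript
// Hypothetical helpers; none of these names exist in VS Code.

// Width of the leading whitespace, counting a tab as `tabSize` columns.
function indentOf(line: string, tabSize = 4): number {
  let width = 0;
  for (let i = 0; i < line.length; i++) {
    if (line[i] === " ") width += 1;
    else if (line[i] === "\t") width += tabSize;
    else break;
  }
  return width;
}

// Option 1: start a fixed number of lines above the viewport.
function startByLookback(viewportStart: number, lookback = 100): number {
  return Math.max(0, viewportStart - lookback);
}

// Option 2: walk upward and pick the nearest line at each strictly smaller
// indentation, stopping once indentation 0 is reached. Returns the indices
// (ascending) of the context lines to tokenize before the viewport.
function contextLinesByIndent(lines: string[], viewportStart: number): number[] {
  const picked: number[] = [];
  let current = indentOf(lines[viewportStart] ?? "");
  for (let i = viewportStart - 1; i >= 0 && current > 0; i--) {
    if (lines[i].trim() === "") continue; // blank lines say nothing about nesting
    const indent = indentOf(lines[i]);
    if (indent < current) {
      picked.unshift(i);
      current = indent;
    }
  }
  return picked;
}

// Option 3 (C-style block comments only, assumed for this sketch): look at
// the most recent comment delimiter above the viewport. Delimiters that
// appear inside strings or line comments will fool this heuristic.
function likelyInsideBlockComment(lines: string[], viewportStart: number, lookback = 50): boolean {
  const stop = Math.max(0, viewportStart - lookback);
  for (let i = viewportStart - 1; i >= stop; i--) {
    const open = lines[i].lastIndexOf("/*");
    const close = lines[i].lastIndexOf("*/");
    if (open === -1 && close === -1) continue;
    return open > close; // the last delimiter on the line decides
  }
  return false;
}
```

Option 2 tends to select very few lines (one per nesting level), which is what makes it cheap; whether those lines alone give the tokenizer enough context depends on the grammar.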

To verify how good we are at guessing, we can run full tokenization against real files and compare the result with our guessing algorithm. We don't necessarily need the tokenization state to be the same: as long as the colors are correct, the guess is a good one. For example, we can run TypeScript tokenization against the VSCode, TypeScript, and Angular codebases, Ruby tokenization against Rails, Jekyll, and CocoaPods, and so on.
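
A comparison of this kind could look roughly like the sketch below. It uses a toy tokenizer that only distinguishes code from C-style block comments, standing in for a real grammar; `tokenizeLine`, `tokenizeRange`, and `guessAccuracy` are hypothetical names, and the guess is assumed to start in the "code" state. Note that it compares the resulting colors per line, not the tokenizer states.

```typescript
// Toy model, all names hypothetical. A tokenizer maps (line, state entering
// the line) to (colors for the line, state leaving the line).
type State = "code" | "comment";
interface LineResult { colors: string; endState: State; }

// One color letter per character: "c" for block-comment text, "p" for plain code.
function tokenizeLine(line: string, state: State): LineResult {
  let colors = "";
  let i = 0;
  while (i < line.length) {
    const two = line.slice(i, i + 2);
    if (state === "code" && two === "/*") {
      state = "comment"; colors += "cc"; i += 2;
    } else if (state === "comment" && two === "*/") {
      state = "code"; colors += "cc"; i += 2;
    } else {
      colors += state === "comment" ? "c" : "p"; i += 1;
    }
  }
  return { colors, endState: state };
}

// Tokenize lines[from..to) starting in `initial`; result[k] is line from + k.
function tokenizeRange(lines: string[], from: number, to: number, initial: State): string[] {
  const out: string[] = [];
  let state = initial;
  for (let i = from; i < to; i++) {
    const r = tokenizeLine(lines[i], state);
    out.push(r.colors);
    state = r.endState;
  }
  return out;
}

// Fraction of viewport lines [viewStart, viewEnd) whose guessed colors match
// a full tokenization from line 0. Assumes guessStart <= viewStart.
function guessAccuracy(lines: string[], guessStart: number, viewStart: number, viewEnd: number): number {
  const truth = tokenizeRange(lines, 0, viewEnd, "code");
  const guess = tokenizeRange(lines, guessStart, viewEnd, "code");
  let same = 0;
  for (let i = viewStart; i < viewEnd; i++) {
    if (truth[i] === guess[i - guessStart]) same++;
  }
  return viewEnd > viewStart ? same / (viewEnd - viewStart) : 1;
}
```

For the six lines `a`, `/* start`, `still comment`, `end */`, `b`, `c` with a viewport covering the last four, starting the guess at the viewport itself scores 0.5 (the two comment lines are miscolored), while starting one line earlier scores 1. Averaging this score over many files and viewport positions would give a rough measure of how well each guessing strategy performs.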

@rebornix rebornix added this to the March 2018 milestone Mar 16, 2018
@rebornix rebornix self-assigned this Mar 16, 2018
@vscodebot vscodebot bot locked and limited conversation to collaborators May 11, 2018