[Exploration] Tree-sitter tokenization exploration (Fixes #161256) #161479
…ng when arrow clicked.
… to also implement a system which keeps track of the changes.
…the actions. Adding rules to disable linting on certain imports.
…te call to parseTree in the constructor
Hi @aiday-mar, were you planning to continue experimenting with this? Is this something you or @alexdima needed help on? I know there were questions around Tree-sitter's parser ABI versioning. I reached out to Tree-sitter maintainer Andrew Hlynskyi, who responded:
|
Great work here. This way, adding a grammar just requires adding the queries folder predefined in the tree-sitter repo. |
The work here may have also been channeled into the Anycode extension or vice versa! |
BTW, GitHub can now render native interactive bar charts :) enjoy https://github.com/mermaid-js/mermaid#bar-chart-using-gantt-chart-docs---live-editor |
PR will be closed |
Sorry, any little clarification for us? Did this pull request fall off the deck in favour of other features and hotfixes? |
|
Hi, thank you for asking @zm-cttae. This PR was made to explore the use of tree-sitter for token colorization in VS Code. Currently our efforts have pivoted towards developing Copilot, and merging this work is not on the roadmap. |
Okay thank you for the answer & hopefully this work gets revisited 🤞 |
Notes about the draft PR
Feature request for the issue #161256.
… Toggle Synchronous Tree-Sitter Colorization and the other Toggle Asynchronous Tree-Sitter Colorization. The two differ only by the value of the asynchronous boolean, which is passed in as a parameter during the instantiation of the tree-sitter colorization trees. This boolean determines whether the subsequent tree-parsing operation will be synchronous or asynchronous. The colorization operation will always be synchronous. These actions log to the console the execution time of the parsing, querying and colorization operations. They also log the number of calls to the corresponding asynchronous methods. They were used to get the performance measurements below.
… test.skip().
Inside of the file treeSitterService.ts, there is a function getTreeSitterTree(). Its sole purpose is to retrieve the tree-sitter tree for testing (see the testing file). Similarly, inside of the file treeSitterTree.ts, there is a function parseTreeAndCountCalls(). It is used only in the testing file, to count the calls to the _parseTree() function. When the boolean asynchronous is set to true, the _parseTree() function will first try to parse the tree synchronously and, if this fails because of a timeout, it will parse the tree asynchronously. Otherwise, when the boolean is false, it will always parse the tree synchronously.
The treeSitterColorizationQueries.scm file contains the queries needed for colorizing the tokens in the editor. The names of the capture groups are mapped to the TextMate innermost scope names.
In the file colorThemeData.ts, the method getTokenColorIndex() was made public in order to be able to perform the colorization. For the same reason, in the file contiguousMultilineTokens.ts there is a new setter method for the _startLineNumber member.
Comparison of the current tokenization/colorization system with the tree-sitter exploratory implementation
memory
The tree-sitter-typescript.wasm file is 1300 kB.
performance
In order to get the results below, a new boolean parameter asynchronous has been created, which controls whether the tree parsing operation is synchronous or asynchronous. The colorization (set tokens operation) is always synchronous. Both the synchronous and asynchronous actions previously described are toggled 3 times on 6 different files (from the TypeScript repo), and the average as well as the median are displayed in the table below. Some clarifications about the data:
… Force Retokenize command from the command palette.
Current | Synchronous | Asynchronous
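To make the measurement setup easier to picture, here is a minimal TypeScript sketch of two ideas from the description: the sync-first parse that falls back to an asynchronous parse when the asynchronous flag is set, and the average/median summary of repeated timings. It is not the PR's code. The names TreeParser, parseTree, summarize and the budgetMs parameter are hypothetical, and the sketch assumes a timed-out synchronous parse is signalled by returning undefined.

```typescript
// Hypothetical parser interface; not taken from the PR or from web-tree-sitter.
interface TreeParser<T> {
  // Returns undefined when the parse does not finish within budgetMs (assumption).
  parseSync(text: string, budgetMs: number): T | undefined;
  parseAsync(text: string): Promise<T>;
}

// Sync-first parsing: with asynchronous = false the tree is always parsed
// synchronously; with asynchronous = true a timed-out synchronous attempt
// falls back to the asynchronous parser.
function parseTree<T>(
  parser: TreeParser<T>,
  text: string,
  asynchronous: boolean,
  budgetMs: number = 10
): T | Promise<T> {
  if (!asynchronous) {
    const tree = parser.parseSync(text, Infinity); // no time budget
    if (tree === undefined) {
      throw new Error("synchronous parse failed");
    }
    return tree;
  }
  const tree = parser.parseSync(text, budgetMs);
  return tree !== undefined ? tree : parser.parseAsync(text);
}

// Average and median of a set of timing samples in milliseconds.
function summarize(samplesMs: number[]): { mean: number; median: number } {
  if (samplesMs.length === 0) {
    throw new Error("no samples");
  }
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / sorted.length;
  return { mean, median };
}
```

With three toggles per file, summarize would be called once per file and mode on the three recorded times, for example summarize([12, 9, 15]) for one file in synchronous mode (the numbers here are made up).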