Support syntax highlighting with tree-sitter #50140
Comments
tree-sitter is cool technology, and we have our eyes on it. If you already have experience with specific grammars, e.g. the TypeScript grammar or the C grammar, and think they are superior to the TextMate grammars, let us know. That would be the criterion for us to invest. |
This may help in the future with the whole 'embedding one language in another', which is an enfant terrible when it comes to TextMate grammars. |
There's also a request in #5408 for .sublime-syntax, open since Apr 2016, which would also be a step up from .tmLanguage. While tree-sitter is an awesome concept, I can't say the idea of writing grammars in JavaScript is all that appealing. |
@omniomi tree-sitter also supports writing grammars in pure JSON if that's what you prefer. The main & dramatic advantage of tree-sitter is that it's a full parsing system and not an ad-hoc, underspecified, horrifyingly complex yet extremely limited regex contraption. |
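To make the "full parsing system" versus "regex contraption" point above concrete, here is a minimal sketch. It is not tree-sitter and uses no tree-sitter API; it only contrasts a single non-greedy regex with a small stateful scanner on nested block comments (which languages such as Rust and Swift allow). The function names and the sample input are made up for illustration.

```python
import re

# Hypothetical sample input: a nested block comment followed by code.
SOURCE = "/* outer /* inner */ still comment */ code"


def regex_comment_end(text):
    """Return the index where a non-greedy regex thinks the comment ends."""
    match = re.match(r"/\*.*?\*/", text, re.DOTALL)
    return match.end() if match else None


def nested_comment_end(text):
    """Return the index where the comment really ends, tracking nesting depth."""
    if not text.startswith("/*"):
        return None  # text does not start with a block comment
    depth = 0
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/":
            depth -= 1
            i += 2
            if depth == 0:
                return i  # outermost comment closed here
        else:
            i += 1
    return None  # unterminated comment


# The regex stops at the first "*/", so it treats comment text as code.
print(repr(SOURCE[regex_comment_end(SOURCE):]))
# The depth-tracking scanner only stops when the outermost comment closes.
print(repr(SOURCE[nested_comment_end(SOURCE):]))
```

A parser that keeps state handles arbitrary nesting depth with the same few lines, whereas a regex-only grammar needs extra machinery for each level or construct it wants to support.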
Integrating tree-sitter would help solve this issue dotnet/vscode-csharp#2461 |
@aeschli Atom has switched to tree-sitter for C++ and is no longer fixing issues with TextMate: atom/language-c#232 (comment). Please advise on how we should proceed to improve the C++ syntax highlighting experience. |
👋 Just to reiterate - the Atom team doesn't intend to disrupt other apps like VSCode that are using modules like The reason that we've been closing issues like that is just to be explicit about the fact that our team won't be prioritizing work on them in the future, since Atom is moving away from text-mate grammars. |
@sean-mcmanus we already have our own syntax highlighting stuff (shared with Visual Studio), but haven't been able to use it because we are waiting on an API that lets us turn off tmLanguage and provide the coloring ourselves: #585. Moving to tree-sitter is only relevant to us so long as #585 is incomplete. |
Tree-sitter is extensible for other programming languages, and in particular already supports Rust and Ruby as well. Are the Visual Studio APIs ready to be extended with new language support in those ways? |
I'm wondering if tree-sitter can solve this #51157 |
@bobbrow is that a final decision? It would have been nice to share the code with Atom here. |
No plans for this in 2018? |
It's going to be 2019!! |
Yeah, Atom 1.33 ships with tree sitter and most of the C/C++ colorization bugs have been fixed with it -- the Atom/language-c team is closing the non-tree sitter bugs. |
@aeschli I meanwhile re-implemented my TextMate grammar with tree-sitter because the former proved unmaintainable (templated regexes up to 400 characters long, etc.). Developing the tree-sitter grammar and highlighter from scratch took three days, compared to three weeks for the TextMate grammar. The new highlighter works better and is dramatically easier to maintain. I wish I could use it in VS Code as well. |
Is there real tangible data on the performance change from using Tree-sitter? |
@meche-gh #161479 |
Slightly tangential to this, I find tree-sitter interesting for more than syntax highlighting. Most importantly, external modules built around tree-sitter are extremely useful. I'd say we should approach including tree-sitter in VSCode more holistically:
These decisions will likely impact the first inclusion of tree-sitter into VSCode, be it syntax highlighting or something else. I don't know if it's clear or not, but including tree-sitter in VSCode is a huge benefit because it makes the editor aware of the code rather than treating it as text. It may start with syntax highlighting (which is already somewhat solved by TextMate grammars) but doesn't end there. If the benefit of tree-sitter syntax highlighting isn't very big, I'd say it would be better to start with other, simpler features that can live at the borders of VSCode as opposed to in the core (syntax highlighting isn't simple to get right, and it isn't critical since it is working relatively well atm). Once the basic setup of tree-sitter is done, a PR for syntax highlighting will be much easier to build, review and merge. |
Just giving my update and 2 cents. @haikyuu those are some interesting thoughts, and I agree it’s a huge benefit all round. I don’t agree that syntax highlighting is a solved problem: even though it “works”, the performance is hitting its ceiling. I wrote about it here https://jason-williams.co.uk/posts/speeding-up-vscode-extensions-in-2022/ (see Tree Sitter section). If VSCode wants to stay competitive it will eventually need to migrate towards this, in my opinion. Last time I looked at the performance of large files, a lot of time was attributed to parsing.
I do agree with starting simple, but this will need to be in the core. I don’t want to see us go down a path of “everyone needs a tree sitter extension” (not that that’s what you were suggesting), and it would be good to see some roadmap for actually having it be the primary syntax system. I did look into branching off #161479, but it’s a monumental effort as it touches so many parts of the code base. So it isn’t something I could take on alone, especially if the maintainers are already planning to work on this (we don’t know; they are quiet on this topic, although there are still positive signals that they’re interested in investigating).
ABI Stability
There was concern over stability, which may have been the reason progress in this area went quiet. @alexdima did raise concerns around the ABI potentially changing and causing extensions to break. However, I reached out to the Tree Sitter maintainers, who declared the library to be stable and said there shouldn’t be any backwards-incompatible changes. |
@jasonwilliams I agree this should land in core for the optimal experience. And the performance benefit is not to be neglected (I am personally using neovim at the moment and everything feels way faster). |
If the Tree-Sitter community wants to scale to existing themes, they need to plan their token names ahead of time and standardise them the way that Sublime and TextMate have done, and also the way Microsoft began to do with the LSP token format a couple of months in. |
Nvm, if the mapping is done by the grammar owner, that would be a small portion of the current effort needed to maintain TextMate grammars. It would suck more if there was no TM grammar, but even then the mapping would only be painful once. |
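As a rough sketch of the one-time mapping discussed above: a grammar owner could ship a small table from tree-sitter-style capture names to TextMate-style scopes, so existing themes keep working. The table below is purely illustrative; the specific capture names, scope choices, and the `scope_for` helper are assumptions, not an agreed standard or a VS Code API.

```python
from typing import Optional

# Illustrative table only: which capture maps to which scope is an assumption.
CAPTURE_TO_SCOPE = {
    "keyword": "keyword.control",
    "string": "string.quoted",
    "comment": "comment.line",
    "function": "entity.name.function",
    "function.builtin": "support.function",
    "type": "entity.name.type",
}


def scope_for(capture: str) -> Optional[str]:
    """Resolve a dotted capture name, falling back to shorter prefixes.

    "function.method" has no entry of its own, so it falls back to the
    "function" entry; a name with no matching prefix yields None.
    """
    parts = capture.split(".")
    while parts:
        scope = CAPTURE_TO_SCOPE.get(".".join(parts))
        if scope is not None:
            return scope
        parts.pop()  # drop the most specific segment and retry
    return None


for name in ("function.builtin", "function.method", "punctuation"):
    print(name, "->", scope_for(name))
```

The prefix fallback means a grammar can introduce more specific capture names later without updating the table, which is the kind of "painful only once" property the comment describes.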
While I agree with the fact that tree-sitter grammars also have their difficulties, the difference is that many of us need to define a tree-sitter grammar anyway, since you can use it for other things, like an LSP server, or a compiler. A textmate grammar would have to be maintained in parallel with whatever other parser generator you're using for other components. |
Am I reading #161479 (comment) correctly that tree-sitter support is not going to happen any time soon? :\ |
The team hasn't been active in any linked issues, hasn't publicly expressed intent to change direction, and arguably has competing interests with unifying VS Code syntax definitions with either VS and/or Monaco/Monarch. I consider this issue closed in practice. |
so ... where do we go to ask ... "either VS and/or Monaco/Monarch" to adopt treesitter :) |
Visual Studio
Monarch (Monaco)
|
#207416 for those who missed it. |
That issue is now locked. Has any further exploration occurred? |
yes |
Please consider supporting tree-sitter grammars in addition to TextMate grammars. TextMate grammars are incredibly difficult to author and maintain, and nearly impossible to get right. The more than 500 (!) issues reported against https://github.com/Microsoft/TypeScript-TmLanguage are living proof of this.
This presentation explains the motivation and goals for tree-sitter: https://www.youtube.com/watch?v=a1rC79DHpmY
tree-sitter already ships with Atom and is also used on github.com.