-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor tree-sitter related functionality #330
base: master
Are you sure you want to change the base?
Conversation
Thanks, this looking great. Will take me a bit to review. Have a couple questions to start
I still see it exampled on the nvim-treesitter readme.
|
nvim-treesitter/nvim-treesitter#4767 describes the plan for ditching modules framework. There’s no official deprecation notice just yet (I believe it’s waiting for neovim 0.10 release). This implementation does lead to intermittent misses on buffer changes. For up to 300ms after last change the parser can be out of sync. It hasn’t been a problem for me yet, but independent falsification would definitely help. Generally, asynchronicity in access to matches would solve this (e.g. reschedule matches processing for the next parser ready state). I’m not confident I can implement that cleanly, however - vimscript knowledge is not my strong suite. This PR should lead to responsiveness improvement due to match lookup getting disconnected from tree parsing. |
It's possible to skip It can result in a simpler code with no need to consider performance in presence of other plugins that could be running re-parsing routines. |
This seems reasonable, as match-up already has deferred mode (which could be migrated to lua on_bytes but one thing at a time). What's the mechanism to tell nvim that we require treesitter highlighting? |
It would rather be a user requirement. Basically, “to enable tree-sitter subsystem, enable tree-sitter highlighting on a buffer with |
nvim-treesitter is transitioning to being just a parsers manager, with plugins being encouraged to move to independent initialization and implementation. This commit extracts nvim-treesitter plugin dependencies, copying those too big to reimplement. Additionally, tree-sitter engine is reworked to be an async library. When buffer's filetype is set, a parser is initialized, a throttled reparser is attached to parser's on_bytes event, and schedule_wrapped match extractor is attached to parser tree change event. The latter extracts matches and stores them in a per-buffer dictionary. All requests to the treesitter backend now fetch data from the aforementioned dictionary. Extra precautions are taken in get_matches request, which now returns no matches if the buffer is known to have changed with a parser that haven't caught up yet.
ac2c32f
to
34d3362
Compare
Could you provide some instructions or a link to some to do that? I still think it's better to do things automatically for now but we'll want to move toward nvim convention more as time goes on so want to have an idea of the migration path. BTW I'm not sure why the vim workflow is failing, can probably just ignore that for now. |
I'll describe the whole thing to "reset" the context (if only for my sake). Current implementationTree-sitter module doesn't rely on tree-sitter highlighting. When vim-matchup initialises for the buffer, it calls
DownsidesThere's one, and it boils down to lack of a central convention on handling parse updates in neovim, or my ignorance of its existence. Specifically, if multiple plugins implement this behaviour, the timers will interleave, reducing the benefit of throttling parse calls. Simply typing in insert mode can then result in a high CPU usage in that scenario. Rely on
|
No. This starts treesitter highlighting. Each plugin is expected to handle its own logic independently (calling I recommend looking at https://github.com/nvim-treesitter/nvim-treesitter-context for an example of a standalone treesitter plugin.
It is not; it's a necessary performance optimization. |
Thanks @clason for giving guidance here! Sorry if this is a really basic question, it is not quite clear to me yet. Specifically, I don't understand how nvim-treesitter-context can guarantee that the tree is in such a state as to be queried, it does not contain any parse() command, although your statement "Each plugin is expected to handle its own logic independently (calling parse on a parsed tree is a no-op," makes me think all such plugins ought to. Per @Slotos's question (neovim/neovim#26220), it seems that neovim is moving to more incremental update parsing, which is surely a good thing. However per the commentary it seems even the highlighter "module" cannot be relied upon to have fully parsed the tree before invocation of dependent plugins like match-up or nvim-treesitter-context. |
The point here is that every "user" is responsible to doing whatever it needs to. The highlighter does not need to have a fully parsed state, as things outside the viewport do not need to be highlighted. Similar for context; other plugins may have different needs -- and why should people that don't use them pay for that? TL;DR: "parse if you need to, and only parse what you need". (And highlighting is just one functionality among others; it should not be considered the "main treesitter module" that others have to hook into. We're trying to get away from this initial monolithic design, which has not aged well.) |
I gather that for vim-matchup we’re looking at the full tree in the general case. We might be able to use previous matches to somewhat limit the range of update. E.g. by finding the range of the largest known match containing the changes. I worry that at that point that’s getting indistinguishable from languagetree’s own parsing with injections, however. I see that there’s tracking of changed lines present in its code, but I haven’t traced it far enough to have a complete mental model yet. |
I'd be very surprised if that were the case. You know where the cursor is, and you'd never search for matching braces in injected languages (unless you were already inside it). |
Let's take a step back and consider the actual issue -- if there even is one; this seems to me a very premature (and likely unwarranted) concern. Also, simply vendoring nvim-treesitter is the wrong way to do it -- much of this code we're getting rid of because it's obsolete. The main point of the rewrites is instead to rely on the functionality built into Neovim core (and possibly some convenience functions on top that are fine to steal). Lastly, it's important to point out that there's no real hurry here:
|
Good points @clason, let's not over-optimize for where we imagine nvim-treesitter is headed until we need it. On the other hand, I feel @Slotos 's work here is independently very important for a number of reasons (the title of this PR is inaccurate in my opinion):
There are a few issues and nits in this PR to address but on whole it looks great. I'll take another pass later and do a full review with some suggested edits. Few high level comments:
|
Fair points, but my general recommendation still stands: forget about nvim-treesitter and rely on Neovim core functionality as much as possible. (Any code from nvim-treesitter that is not a simple convenience wrapper is suspect; I've removed all of that for a reason.) Going forward, nvim-treesitter will just be a parser installer and curated query collection; you can take that as confirmation. Refactoring now (before nvim-treesitter 1.0) is possible, of course, but be aware that treesitter is still considered experimental even in Neovim, and a lot still changes -- breaking changes included. Nevertheless, I'd recommend targeting HEAD with this refactor, as a lot of performance improvements were made for 0.10 (specifically related to C injections, which is a well-known issue -- in fact, the "full parse" was removed exactly for that reason!) These issues are fundamental treesitter issues, though, and need to be solved by Neovim. (This is one of the things required for "graduating" treesitter support.) Regarding Regarding caching: queries are cached internally by Neovim (as opposed to nvim-treesitter; another reason to avoid that framework!), but if you do non-trivial things with the results, it would make sense to cache those via a (There are similar ones in nvim-treesitter 1.0, and in fact in Neovim: |
I've been thinking about the problem the cached query results is trying to solve, and I'm wondering about how to benchmark it. The goal itself is similar to that of nvim-treesitter-context's - as the cursor moves, we need to fetch related regions as fast as possible, preferably imperceptibly so. Now, the question then becomes "does lua cache perceptibly outperform a tree-sitter querying mechanism". Tree-sitter operates on trees, and I presume it is well optimised for querying on said trees, whereas local lua cache right now is a linear loop. If local lua cache doesn't show an order of magnitude speedup with average lookup times <1ms, using buftick cached individual queries is just ok and likely scales better into larger trees. Separately from that, I've been checking out performance on Neovim's own
I intent to iterate a few more times on this code, looking for a good combination of performance and simplicity. |
That's the goal. Code copies are probably masking the overall trend, but I specifically started this work to eliminate all and any reliance on |
Hey @Slotos, any further thoughts on this? Do you have any more ideas how to efficiently handle updates? |
On treesitter side, parsing a line as a region should give us all the local information that's interesting. Range parsing gets propagated to children trees, and will yield all the data that's relevant to the line. Highlighter is an example of this working, and it handles nested trees just ok. All of that should be mostly confined to I apologise for the silence, holidays season turned into a viral melting pot, followed by a set of tasks at work that consume all my attention. |
You might want to look at https://github.com/nvim-treesitter/nvim-treesitter/blob/main/lua/nvim-treesitter/locals.lua for another "self-contained" example for the nvim-treesitter 1.0 era. Feel free to just steal any code from the old |
Hey all - any further progress here? Currently |
I’m struggling to find time to dedicate for a proper implementation. I also fell into „good enough” trap - this implementation works ok-ish for me, even though it’s simply wrong. I hope for holidays to offer me some time to enjoy my vim tinkering. |
Here's my setup: {
"andymass/vim-matchup",
init = function()
vim.o.matchpairs = "(:),{:},[:],<:>"
-- Don't recognize anything in comments
vim.g.matchup_delim_noskips = 2
vim.g.matchup_matchparen_deferred = 1
vim.g.matchup_matchparen_offscreen = { method = "status_manual" }
end,
}, If I edit If I go one level down |
nvim-treesitter is transitioning to being just a parsers manager, with plugins being encouraged to move to independent initialization and implementation.
This commit extracts nvim-treesitter plugin dependencies, copying those too big to reimplement. Additionally, tree-sitter engine is reworked to be an async library. When buffer's filetype is set, a parser is initialized, a throttled reparser is attached to parser's on_bytes event, and schedule_wrapped match extractor is attached to parser tree change event. The latter extracts matches and stores them in a per-buffer dictionary.
All requests to the treesitter backend now fetch data from the aforementioned dictionary. Extra precautions are taken in get_matches request, which now returns no matches if the buffer is known to have changed with a parser that haven't caught up yet.
I removed old tests execution from neovim CI config, as new tests execute them already, and for some reason old tests require 0.5s delays before matchup matching becoms available. Same tests running with new framework need 10ms to 50ms.