Feature: option to set timeout for ripgrep task #110

Open
v1nh1shungry opened this issue Jan 16, 2025 · 5 comments

Comments

@v1nh1shungry

Working on a big project like llvm-project causes serious performance issues. There is obvious input latency, and it sometimes even blocks neovim.

I have tried following the guide and set prefix_min_len to 5 and context to 3, but it didn't help much. I guess there are just too many source files, so max_filesize would not help.

Then I came up with the idea of setting a timeout for ripgrep. Even if it takes 10+ seconds to get 10K+ results, I don't think we really need that many results. If the search takes too long, it would be nice to stop it; the results collected so far would be enough. However, when I tried following blink's configuration and set timeout_ms to 100 and async to true, somehow it didn't work.

I noticed that this plugin uses vim.system({ ... }, nil, function() ... end) to launch ripgrep, which means it waits until ripgrep exits. I wrote a rough prototype in my configuration. I borrowed most of the code from this plugin (Thanks, and I hope you won't mind 🙏 ) and changed the call slightly to vim.system({ ... }, { timeout = 100, stdout = on_stdout }, on_exit). I moved the parsing into on_stdout so that results are parsed as they arrive, and I commit the results in on_exit, whether ripgrep timed out or exited normally.

After the change I can still feel the latency, but it is better and basically tolerable for me. Most importantly, it no longer blocks my neovim.
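The streaming idea described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's code and not the exact prototype from this comment: `new_line_collector` and `search_with_timeout` are hypothetical names, the `rg --json` invocation and the explicit `.` path are assumptions, and it assumes a Neovim version that provides `vim.system` with the `timeout` and `stdout` options.

```lua
-- Sketch only: all function names here are hypothetical.

-- Splits incoming stdout chunks into complete lines. A trailing partial
-- line stays buffered until the next chunk completes it.
local function new_line_collector()
  local pending = ""
  local lines = {}
  local function feed(chunk)
    pending = pending .. chunk
    while true do
      local nl = pending:find("\n", 1, true)
      if not nl then
        break
      end
      lines[#lines + 1] = pending:sub(1, nl - 1)
      pending = pending:sub(nl + 1)
    end
  end
  return feed, lines
end

-- Runs ripgrep with a time limit and hands over whatever complete lines
-- arrived before it exited or was stopped. Requires Neovim (uses `vim`).
local function search_with_timeout(pattern, timeout_ms, callback)
  local feed, lines = new_line_collector()
  vim.system(
    -- Assumed invocation; the explicit "." avoids relying on rg's
    -- stdin/cwd heuristics when run as a subprocess.
    { "rg", "--json", "--", pattern, "." },
    {
      timeout = timeout_ms,
      -- Called with each chunk of output; `data` is nil at end of stream.
      stdout = function(err, data)
        if not err and data then
          feed(data)
        end
      end,
    },
    function(result)
      -- The vim.system documentation describes exit code 124 on timeout.
      local timed_out = result.code == 124
      vim.schedule(function()
        callback(lines, timed_out)
      end)
    end
  )
end
```

Each collected line would still need to be decoded (for example with vim.json.decode) before being turned into completion items; doing that per line inside `feed` is what spreads the parsing work over the lifetime of the search.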

Hope my experiment can help a bit 🙏 . And thank you for this plugin, I really enjoy this plugin!

@mikavilpas
Owner

I like this a lot. Thanks for taking the time to write this.

I'm not really happy with the performance in big projects right now. I actually tried to convert the ripgrep invocation to a streaming based approach where it wouldn't be required to wait for all results, but I think that may need to wait for Saghen/blink.cmp#395 to be fully implemented.

I'll make some time to go over your code, and maybe discuss a bit more.

By the way, I think using manual mode (https://github.com/mikavilpas/blink-ripgrep.nvim?tab=readme-ov-file#manual-mode) should remove most of the lag, although then the experience is not as fluent - so it may not be what you want. But it could be worth checking out.

@mikavilpas
Owner

I did some preliminary testing today and quickly noticed on my system it takes about a minute to perform a ripgrep search in that repo - it's absolutely huge. I'll have to think if this plugin can be useful in that context at all, I was not aware that it would perform this badly.

What is your experience like in that repository? Does the plugin provide any benefit, or is it simply too slow?

@v1nh1shungry
Author

> I did some preliminary testing today and quickly noticed on my system it takes about a minute to perform a ripgrep search in that repo - it's absolutely huge. I'll have to think if this plugin can be useful in that context at all, I was not aware that it would perform this badly.
>
> What is your experience like in that repository? Does the plugin provide any benefit, or is it simply too slow?

I'm not sure if I understand you correctly, but if you're talking about the llvm-project I mentioned, yes, a ripgrep search takes a long time on my system too. In that case, the plugin freezes the whole of neovim from time to time, which is the most unacceptable part to me.

That's why I came up with the idea of setting a timeout for the plugin. Even if a timed-out, incomplete search can't bring us all the results and is less useful, I suppose it still helps most of the time and is definitely better than blocking the whole world.

I don't think I have tested my own timeout-supported version for long enough, but for your information, it has provided enough information for me these past few days.

PS. In fact, my first idea was to limit the number of results. But since ripgrep does not provide (and is not likely to provide) an option to deduplicate, we can end up with far fewer distinct results than expected if a file contains many occurrences of the same match. The timeout version should have the same issue, but I think it is fairer, since it is hard to choose an appropriate limit.
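To illustrate the deduplication point: a cap applied to raw matches can be used up by repeats of one word, whereas a cap applied after deduplication counts distinct words only. A minimal sketch, assuming the matched words have already been extracted into a list; `collect_unique` is a hypothetical helper, not something ripgrep or the plugin provides.

```lua
-- Sketch only: `collect_unique` is a hypothetical helper.
-- Returns at most `max_unique` distinct words, in first-seen order.
local function collect_unique(matches, max_unique)
  local seen = {}
  local unique = {}
  for _, word in ipairs(matches) do
    if not seen[word] then
      seen[word] = true
      unique[#unique + 1] = word
      if #unique >= max_unique then
        break
      end
    end
  end
  return unique
end

-- The first three raw matches below are all the same word, so a raw cap
-- of 3 would yield one distinct completion; a unique cap of 3 yields three.
local raw = { "commit_id", "commit_id", "commit_id", "commit_msg", "commit_hash" }
local unique = collect_unique(raw, 3)
```

The catch is that the search still has to keep producing output until enough distinct words have been seen, so a count-based cap like this does not bound the running time the way a timeout does.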

Please correct me if I miss anything, thanks!

@mikavilpas
Owner

I just had an idea that might help increase performance. I noticed git grep seems to perform much better in this huge repo.

Here rg took 40 seconds, but git grep only took 5 seconds:

[Image: screenshot of the rg and git grep timings]

I think it has the following features and limitations:

  • obviously much faster
  • it only considers tracked files. I tested it and found out uncommitted changes are included in the results as long as the file is tracked by git.

Let me know what you think about this.

@v1nh1shungry
Author

First of all, I want to apologize for my shameful mistake. I didn't realize it was the shell printing that slowed ripgrep down. I believe I executed the same command as you did

$ rg "commit[\w_-]+" > /dev/null

And it takes only about 2~3 seconds to finish, which is a big difference from yours. I don't have a config for my ripgrep and I haven't touched the repo yet. The repo is at commit 3792b36234b6c87d728f0a905543e284bf961460 and the working tree is clean. My ripgrep is installed via cargo

$ rg --version
ripgrep 14.1.0 (rev e50df40a19)

features:-simd-accel,+pcre2
simd(compile):+SSE2,-SSSE3,-AVX2
simd(runtime):+SSE2,+SSSE3,+AVX2

PCRE2 10.42 is available (JIT is available)

But I'm not an expert on it so I have no idea why there is such a big difference.

What I originally suspected is that there are too many matches for vim.json.decode to process. That was another reason that inspired me to limit the number of matches, but I soon forgot it and fell in love with the timeout feature (sigh, what a shame again). So I decided to find a profiler to help me find out what happened.

I use folke's snacks profiler. I'm not an expert on it, but it is the best profiler I could find. I used the following configuration

blink configuration
  {
    "saghen/blink.cmp",
    build = "cargo build --release",
    dependencies = "mikavilpas/blink-ripgrep.nvim",
    event = "InsertEnter",
    opts = {
      appearance = { use_nvim_cmp_as_default = false, nerd_font_variant = "mono" },
      completion = {
        accept = { auto_brackets = { enabled = true } },
        menu = {
          draw = {
            align_to = "none",
            treesitter = { "lsp" },
            components = {
              kind_icon = {
                ellipsis = false,
                text = function(ctx)
                  local kind_icon, _, _ = require("mini.icons").get("lsp", ctx.kind)
                  return kind_icon
                end,
                highlight = function(ctx)
                  local _, hl, _ = require("mini.icons").get("lsp", ctx.kind)
                  return hl
                end,
              },
            },
          },
        },
        documentation = { auto_show = true, auto_show_delay_ms = 200 },
        trigger = { show_in_snippet = false },
      },
      sources = {
        default = { "lsp", "snippets", "path", "buffer", "ripgrep" },
        cmdline = {},
        providers = {
          rg = {
            module = "blink.rg",
            name = "rg",
            score_offset = -100,
          },
          ripgrep = {
            module = "blink-ripgrep",
            name = "Ripgrep",
            score_offset = -100,
            opts = { prefix_min_len = 5, context = 3 },
            transform_items = function(_, items)
              for _, item in ipairs(items) do
                item.labelDetails = { description = "(rg)" }
              end
              return items
            end,
          },
        },
      },
      keymap = { preset = "super-tab" },
      fuzzy = { prebuilt_binaries = { download = false } },
    },
    opts_extend = { "sources.default" },
  },

I opened the file clang/lib/Format/Format.cpp, enabled the profiler, and started to insert constexpr int value. By the time I finished typing constexpr, neovim was already stuck. I disabled the profiler once neovim became responsive again. Here's the profile result.

[Image: profile result for blink-ripgrep]

We can observe that blink-ripgrep.ripgrep_parser.parse takes most of the time, and I believe this basically confirms my guess.

So the reason my timeout version is faster is likely that it handles far fewer results. I ran a quick test on it: I just modified sources.default in the configuration above to enable mine and disable yours, and repeated the same steps.

[Image: profile result for mine]

Let's go back to git grep. I think the idea is good. Searching only tracked files does not bother me much, since I believe most of the results we want are usually in tracked files. However, I do have a few concerns

  • How do we decide which tool to use? Based on statistics such as the number of files? Or just an option that picks one tool for all projects?
  • It only works in a git repo. That means we have to fall back to ripgrep elsewhere. For example, if I edit files in my home directory, I still suffer from the perf issue.
  • If my profiling above is correct, the bottleneck may not be in ripgrep but in vim.json, in which case git grep may not help a lot.

I'm a bit busy these days and keep making silly mistakes. Please correct me if I missed anything. Thank you!
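One possible shape for the tool selection and fallback raised in the first two concerns, as a rough sketch only: `choose_command` and `is_inside_git_work_tree` are hypothetical names, and the exact `git grep` and `rg` arguments are assumptions rather than what the plugin uses. Note that `git grep -E` uses POSIX extended regular expressions, so a pattern written for ripgrep (for example one using `\w`) may need adjusting.

```lua
-- Sketch only: both helpers are hypothetical.

-- Pure decision step: picks the command from a yes/no answer.
local function choose_command(inside_git_work_tree, pattern)
  if inside_git_work_tree then
    -- git grep only searches files tracked by git.
    return { "git", "grep", "--line-number", "-E", "-e", pattern }
  end
  -- Fallback for directories that are not inside a git work tree.
  return { "rg", "--json", "--", pattern, "." }
end

-- Asks git whether `cwd` is inside a work tree. Requires Neovim.
-- :wait() blocks until git exits, so the answer is worth caching per directory.
local function is_inside_git_work_tree(cwd)
  local result = vim.system(
    { "git", "rev-parse", "--is-inside-work-tree" },
    { cwd = cwd, text = true }
  ):wait()
  return result.code == 0 and vim.trim(result.stdout or "") == "true"
end
```

This only addresses where the search runs. If the time is really spent decoding and parsing the results, as the third concern suggests, switching the search tool alone would not remove that cost.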
