Way to report total count of matches? #411
Comments
Have you tried the -c flag? (Which is also in grep.) |
That prints the number of matches per file (and grep -c seems to do the same). I'm interested in the total number of matches across all files. |
Okay, then I guess I don't understand why piping to wc -l doesn't work? |
It works fine, I just thought I'd check if there was a built-in option just in case. The only downside to piping is that you don't get the actual matches printed, only the count, but that can be worked around by storing the results in a variable first. |
I'm still confused. You want both the matches printed and the count? Could you please provide an example so that your request is more clear? |
What I'm interested in is something like
Basically the current functionality but with the total count printed at the end. I don't expect support for that because I don't think other tools support it either, and I can get that information by storing the results in a variable first (or by running rg twice) so it's not a big deal. I was just checking to make sure it indeed wasn't supported. Sorry for the confusion. |
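A minimal sketch of the workaround described above, assuming a POSIX shell with awk available. The helper name `with_total` is hypothetical and not part of ripgrep; it echoes every line it receives and then prints the line count, so the matches and the total come from a single run.

```shell
#!/bin/sh
# with_total: hypothetical helper (not a ripgrep feature).
# Copies stdin to stdout unchanged, then prints the number of lines read.
with_total() {
    awk '{ print } END { print NR }'
}

# Intended use (assumes rg is on PATH):
#   rg foo | with_total
```

Note that this counts output lines, so the total is the number of matching lines rather than the number of individual matches, and it relies on rg printing one line per matching line when its output is piped.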
@elirnm I believe you could just use the
$ rg blah blah | tee >(wc -l)
match
match
match
3
If you want to remove that tab before the output, you can do this:
$ rg blah blah | tee >(wc -l | xargs echo) |
@kale Not on Windows. |
PowerShell? rg blah blah | Measure-Object -Line |
Does that show the matches and the count? |
|
@DoumanAsh Please re-read this thread. The OP is looking for a way to print both the matches and the count of the matches in a single command. |
Oops, I'm sorry. |
I'm going to close this. I don't see an option like this being added to ripgrep proper. I think it's too niche and working around it is very simple by piping the output through a line counter. |
I ended up here looking for a way to do something like Using
Can you please re-open this issue, and consider adding a |
Did you actually try it though? If you pipe the output of ripgrep into another command, then it should revert to the standard output format of |
The situation is like I want to eat the cake (show results with headings) and have it too (show the match statistics too). With So I would need to run once to see the result and run second time to get the match count. But that is not as informative as the It's a different thing to investigate why the total matches are not the same when searched using ag vs rg. |
At what point does ripgrep have to solve every problem associated with displaying stats? It should be very simple for anyone to write a wrapper script that does what you want here, although the easiest path would run
I will re-open this for now, but at a certain point, I have to be allowed to say "No" to new feature requests and people have to respect that reasonable people can disagree where that line is drawn.
The silver searcher has a very very large number of bugs associated with its gitignore support. It's more surprising when the total number of results are the same. |
I don't want my suggestion to rub you the wrong way. It was as I said.. just a suggestion. I fully respect your project. So if it's your decision to never support this, I will respect that. This request came up when I had to find the total number of matches during a discussion, and that's where I realized that I needed to switch to ag for that. |
I think I've come around to this feature. Starting with what the silver searcher does seems reasonable:
I do have a question though: should stats be printed to stdout or to stderr? ag prints them to stdout. I think I'm fine either way, and probably lean slightly towards stdout. |
An argument in favor of stderr is that you can do this: |
Wouldn't printing I would favor STDOUT, as stats are not technically errors.
In that case, maybe a different switch could be added to not output the matched lines at all? Probably |
I'm not sure I follow. If you don't want to throw off wrapper scripts, then don't use
I try to prefer composition of existing tools. |
I don't have such a wrapper script, but suppose one is doing:
set -euo pipefail # http://redsymbol.net/articles/unofficial-bash-strict-mode
foo_match_count=$(rg foo --stats | grep -Po '\d+(?=\s+matches)')
The script will fail even when the match is positive. The STDERR output can cause confusion here. |
But in that case, the problem is immediately obvious because the stats will be dropped to stderr. And then it's easy to fix. |
I'd prefer stdout because if stats are part of stderr there's no easy way to separate the stuff related to results from the stuff related to actual errors while keeping the stats with the rest of the results-related stuff. If stats are printed to stderr, then |
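The stdout versus stderr trade-off discussed above can be illustrated with a small sketch. The `fake_search` function is a hypothetical stand-in for a search tool, not rg itself; it is assumed to write results to stdout and a stats line to stderr, which is the stderr variant under discussion.

```shell
#!/bin/sh
# fake_search: hypothetical stand-in for a search tool (not rg itself).
# It writes two result lines to stdout and one stats line to stderr.
fake_search() {
    printf 'a.txt:1:foo\nb.txt:7:foo\n'
    printf '2 matches\n' >&2
}

# Command substitution captures stdout only, so stats sent to stderr
# are absent from the captured text.
results_only=$(fake_search 2>/dev/null)

# Redirecting stderr into stdout keeps the stats with the results,
# but any real error messages would be mixed in as well.
results_and_stats=$(fake_search 2>&1)
```

With stats on stderr, a script that parses the stats has to merge the two streams first, at which point genuine errors can no longer be told apart from stats.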
I don't know if this is the correct place to ask for it, but could we have a way to print how many matches there are per file? I'm searching binary files, so I care about matches and not lines. Showing both is perfectly fine; it's just that -c doesn't give me any meaningful number. |
@santagada That seems like an orthogonal issue to what this is. Could you please open a new issue? Please also explain why you think
|
your example shows it, README.md has 17 occurrences of foo (on at least one line it shows 3 times). what -c is showing is how many lines matched the regex (or maybe we searched very different versions of the README.md). I will open a new issue |
filed a ticket for it in #566 |
I can pick this one up next if we still want this implemented. I don't see any other PRs open for this, and it looks pretty straightforward from a specification standpoint (duplicate what |
@balajisivaraman I'd be very grateful, thank you! This is one of the two reasons why I need to keep |
@balajisivaraman Thanks! Let me know if you want any help coming up with how to organize this in the code. It is likely that familiarizing yourself with Rust's support for atomics will be helpful. As another caveat, I would like to prevent the two search modules ( Here are some possible simplifications that you may elect to do:
|
@BurntSushi, Thanks for the pointers. I'll have an initial look this weekend and come up with a rough idea of how I want to go about it, and I'll post it here for vetting. 👍 |
@balajisivaraman Aye. Another idea that I might like even better is seeing if this could be done in the printer instead of the search code. That way it would work for both searchers. |
@BurntSushi, I get the feeling we should be able to easily do this in Here's my reasoning as to why:
My thought is that we should be able to do all of the above in The trickiest part will be tracking bytes searched. I haven't been able to come up with an easy way to do this that doesn't involve making changes to My current thought is that we could do two things about it:
Also, as you suggested, I had a look at seeing whether we could offload some of this to the This is bad because there's shared mutable state going on. Another pain point is that we create a new I also found the following quirks in
I'll have another look to see whether there would be any other way of doing this, but that's what I came up after going through the code today. |
@balajisivaraman Thanks for writing that up! The task of counting bytes is definitely an interesting one and I grant that it does appear to be a little tricky to do with the current code. My feeling is that the "best" way to do this would be as a new type that implements But yeah, we can definitely punt on the byte counting for now and do that at a later time. With that said, it is definitely a useful part of the stats output because it's what will let you compute a throughput statistic (which I suppose we should also include once we do byte counting).
ag has a lot of bugs. I think ripgrep can probably get stats right for stdin without claiming that it searched 17 other files. :-)
You're on to something here. I think it would be fine to ignore |
@BurntSushi, Ah that's a nifty little trick. Thanks for pointing that out. I'll see whether I'd be able to cook up something similar for counting bytes here. If you're OK with the rest of the suggestions in terms of tracking the existing stats and outputting them in |
@balajisivaraman Oh right, I forgot to respond to that part! Yes, doing those counts in |
@BurntSushi, I just realised that there are some similarities between this and #566. Although I have a WIP PR (#799) open for this, I realised that the I'll leave the pending PR open and look at ways I can work on the occurrence count issue. We should then be able to reuse that for computing stats, if that is fine. |
how to rg -c top 10 most words from a disk? |
Sorry for necro'ing this thread, but just wanted to say I really appreciate the work and effort in this feature! |
Count the total number of matches in ripgrep
For anyone still stumbling upon this thread, the Example output: the line you're looking for is the one where I put
If you want just that one line, do this: rg --stats "my search term" | tail -n 8 | head -n 1 Example output:
If you just want the rg --stats "my search term" | tail -n 8 | head -n 1 | awk '{print $1}' Output:
Alternative hack
As @clashman says below, you can also do this hack: rg "my search term" | rg -c "my search term" Sample output:
If you need this to run in Windows...
Use the Git Bash terminal which comes with Git for Windows. See my instructions here: Installing Git For Windows. For building software in Windows, or running the GCC or LLVM Clang compilers from the command-line, and to get other Linux tools in Windows, use the MSYS2 terminals. See my MSYS2 setup answer here: Installing & setting up MSYS2 from scratch, including adding all 7 profiles to Windows Terminal. |
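The `tail -n 8 | head -n 1` pipeline above depends on the stats block always having the same number of lines. A sketch that selects the line by its shape instead is shown here; `total_matches` is a hypothetical helper name, and it assumes the stats block contains a two-field line of the form `<N> matches`.

```shell
#!/bin/sh
# total_matches: hypothetical helper that reads `rg --stats` style output
# on stdin and prints the number from the last line shaped "<N> matches".
# Assumption: the stats block contains exactly such a two-field line.
total_matches() {
    awk 'NF == 2 && $2 == "matches" && $1 ~ /^[0-9]+$/ { n = $1 }
         END { print n + 0 }'
}

# Intended use (assumes rg is on PATH):
#   rg --stats "my search term" | total_matches
```

Because the last qualifying line wins, a matched line that happens to read like a stats line is overridden by the real stats block printed after the results. The helper prints 0 when no such line is found.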
A simpler, albeit dirty solution: |
For me the best answer is actually for what I needed I've not the same result |
what works for me rg [search term] -l | wc -l |
@taariksiers That will only report the total number of files that contain a match. The |
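To get a total rather than a file count, the per-file counts can be summed instead. The following is a sketch under the assumption that the input consists of `path:count` lines as printed by `rg -c`; `sum_counts` is a hypothetical helper name.

```shell
#!/bin/sh
# sum_counts: hypothetical helper that totals `path:count` lines on stdin.
# Taking the last colon-separated field tolerates paths that contain
# colons (for example Windows drive letters) and bare counts with no path.
sum_counts() {
    awk -F: '{ total += $NF } END { print total + 0 }'
}

# Intended use (assumes rg is on PATH):
#   rg -c foo | sum_counts               # total matching lines
#   rg --count-matches foo | sum_counts  # total individual matches
```

The distinction in the usage comments matters when a line contains the pattern more than once: `-c` counts that line once, while `--count-matches` counts every occurrence.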
Because I use Windows, I have turned to a small Python script (written by ChatGPT):
import subprocess
# Run the command and capture its output; an argument list avoids relying on shell parsing
command = ["rg", "-c", "foo"]
output = subprocess.run(command, capture_output=True, text=True).stdout.strip()
# Take the number after the last colon on each non-empty line and sum them up
# (using the last colon keeps paths such as C:\dir\file from breaking the parse)
total_count = sum(int(line.rsplit(":", 1)[-1]) for line in output.splitlines() if line)
# Print the per-file counts followed by the total
print(output)
print(f"Total Count: {total_count}") |
@TeaDrinkingProgrammer , FYI: all of the Bash/terminal solutions above work on Windows too. You just have to use the Git Bash Linux-like terminal that comes with Git for Windows, is all. You can also use the MSYS2 terminal. Here are my installation instructions for those terminals in Windows: |
You can also use WSL: rg foo | wsl wc -l |
|
Thank you, sir.
On Feb 28, 2024 8:13 PM, Victor Golovanenko wrote:
rg -c 'my search term' | cut -d: -f2 | awk '{s+=$1} END {print s}'
|
Is there a way to print the total count of matches? I can pipe the output to a line counter, but then I don't get the actual matches printed.