slowdown with many files on the command line, compared to no arguments #136
Comments
I think this is a dupe. On mobile. On Sep 29, 2016 5:10 PM, "oconnor663" notifications@github.com wrote:
Yeah, this is probably a dupe of #44. Your example is actually more embarrassing, because the positional arguments are files, not directories, which makes the reprocessing of parent ignore files even more strange. I have some grander plans in mind on refactoring the ignore code, so fixing this will probably get lumped in with that. For now, I think we can just track #44. Thanks for the report!
@BurntSushi I should've mentioned before,
Oh. Neat. I didn't check that. I'll look into it. On Sep 29, 2016 21:10, "oconnor663" notifications@github.com wrote:
You're right, something different is going on than what's in #44.
cc @patrickxb who actually found this.
Sadly (and embarrassingly), the blame for this can be placed squarely on my implementation of I think this means I should admit defeat and switch to clap. :-) cc @kbknapp
@BurntSushi if "admitting defeat" means having spent all your time writing some of the best parsers and libraries known to Rust (perhaps any language), and helping ensure the Rust ecosystem/users maintain or even exceed current levels of quality and professionalism....well then I aspire to one day "admit defeat" as well 😉 As for this issue, I'd like to do a build using clap just to compare the results. I'll post the results in the next day or so once I have time to make the build and do the tests.
@kbknapp Haha aww shucks. FWIW, clap is unbelievably well maintained. I hadn't looked at it for a while, but when I checked it the other day I was completely blown away. To give you more context: you probably won't notice much of a difference in standard usage. The problem appears specifically when there's a long list of positional file arguments. Docopt struggles with this because it uses a weird backtracking algorithm to match Converting
Woah. Maybe there's some hack we could stick on the front of the whole thing (either docopt, or ripgrep) to handle the specific case of a ton of positional args, that wouldn't need a complete rewrite? I've never looked at docopt internals though.
I'm sure there's probably something. I'm just not feeling up to digging around in Docopt internals. Everything about it is too complex. (Which is my failing.)
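To make the "backtracking" point above a bit more concrete, here is a toy sketch of why a matcher that saves state before each positional argument can get expensive on long argument lists. This is not Docopt's actual algorithm (the thread does not show it); `match_linear`, `match_with_snapshots`, and the step counting are hypothetical names and a simplified cost model invented for illustration.

```rust
/// Single pass: every argument is consumed exactly once.
/// Returns the collected values and a rough count of work steps.
fn match_linear(args: &[&str]) -> (Vec<String>, usize) {
    let mut files = Vec::new();
    let mut steps = 0;
    for a in args {
        files.push(a.to_string());
        steps += 1;
    }
    (files, steps)
}

/// Toy backtracking-style matcher (an assumption, NOT Docopt's code):
/// before consuming each argument it snapshots everything collected so
/// far, so that it could roll back if the match failed later.
fn match_with_snapshots(args: &[&str]) -> (Vec<String>, usize) {
    let mut files: Vec<String> = Vec::new();
    let mut steps = 0;
    for a in args {
        // Saved state for a possible rollback; copying it costs one
        // step per element already collected.
        let snapshot = files.clone();
        steps += snapshot.len() + 1;
        files.push(a.to_string());
        // Nothing fails in this toy, so the snapshot is simply dropped.
    }
    (files, steps)
}

fn main() {
    // 3,458 placeholder names, the file count mentioned later in the thread.
    let names: Vec<String> = (0..3458).map(|i| format!("f{}", i)).collect();
    let args: Vec<&str> = names.iter().map(|s| s.as_str()).collect();
    let (_, linear) = match_linear(&args);
    let (_, snapshots) = match_with_snapshots(&args);
    println!("linear steps: {}, snapshot steps: {}", linear, snapshots);
}
```

Under this simplified model the snapshotting matcher does work proportional to n(n+1)/2 for n arguments, while the single pass does work proportional to n. Whether Docopt's real cost has this shape is not established here.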
If you'd like to do the converting that's cool with me, I'll just stand by for any questions/suggestions! Instead of doing a full re-write I just did a simple test (called
I know it's not super scientific or perfect, but it gives a rough estimate. Here's what I used:

Docopt:

extern crate docopt;
extern crate rustc_serialize;

use docopt::Docopt;

const USAGE: &'static str = "
Lots of Files
Usage:
    lof <files>...
";

// Docopt fills this struct from USAGE; `arg_files` maps to `<files>...`.
#[derive(RustcDecodable)]
struct Args {
    arg_files: Vec<String>,
}

fn main() {
    // Parse argv against USAGE, exiting with a message on error.
    let args: Args = Docopt::new(USAGE)
        .and_then(|d| d.decode())
        .unwrap_or_else(|e| e.exit());
}

And clap:

extern crate clap;

use clap::{App, Arg};

fn main() {
    // One positional argument that accepts any number of values.
    let m = App::new("lof")
        .arg(Arg::with_name("files")
            .multiple(true))
        .get_matches();
}

I also did a version where I looped through and printed out the values both as
OK, so I finally have ripgrep wired up to clap and I can confirm this issue in particular gets fixed. My test case with GNU grep:
With current ripgrep master:
and finally with ripgrep wired up to clap:
Everything is run single-threaded on 3,458 files explicitly given on the command line. This test case was derived from the following
Yay clap!
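For anyone who wants to repeat this kind of comparison, below is a minimal timing harness sketch using only the Rust standard library. It is not the script used in this thread; `synthetic_paths`, `time_command`, and the `file_N.txt` naming are hypothetical, and the default count of 3458 is only borrowed from the file count mentioned above. It measures the wall-clock time of one process run, so it includes process startup as well as argument parsing.

```rust
use std::io;
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Build `n` placeholder file names (hypothetical; a real comparison
/// would pass paths that actually exist).
fn synthetic_paths(n: usize) -> Vec<String> {
    (0..n).map(|i| format!("file_{}.txt", i)).collect()
}

/// Run `program` with `args`, discard its output, and return the
/// elapsed wall-clock time for the whole process.
fn time_command(program: &str, args: &[String]) -> io::Result<Duration> {
    let start = Instant::now();
    Command::new(program)
        .args(args)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()?;
    Ok(start.elapsed())
}

fn main() {
    // Hypothetical usage: harness <program> [count]
    let mut argv = std::env::args().skip(1);
    let program = match argv.next() {
        Some(p) => p,
        None => {
            eprintln!("usage: harness <program> [count]");
            return;
        }
    };
    let count = argv.next().and_then(|s| s.parse().ok()).unwrap_or(3458);
    let paths = synthetic_paths(count);
    match time_command(&program, &paths) {
        Ok(elapsed) => println!("{} with {} args: {:?}", program, count, elapsed),
        Err(e) => eprintln!("failed to run {}: {}", program, e),
    }
}
```

Running it several times per binary and comparing the spread gives a rougher but more honest picture than a single measurement. Very large counts can also run into the operating system's limit on total argument size.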
There were two important reasons for the switch:

1. Performance. Docopt does poorly when the argv becomes large, which is a reasonably common use case for search tools (e.g., use with xargs).
2. Better failure modes. Clap knows a lot more about how a particular argv might be invalid, and can therefore provide much clearer error messages.

While both were important, (1) made it urgent.

Note that since Clap requires at least Rust 1.11, this will in turn increase the minimum Rust version supported by ripgrep from Rust 1.9 to Rust 1.11. It is therefore a breaking change, so the soonest release of ripgrep with Clap will have to be 0.3.

There is also at least one subtle breaking change in real usage. Previous to this commit, this used to work:

    rg -e -foo

Where this would cause ripgrep to search for the string `-foo`. Clap currently has problems supporting this use case (see: clap-rs/clap#742), but it can be worked around by using this instead:

    rg -e [-]foo

or even

    rg [-]foo

and this still works:

    rg -- -foo

This commit also adds Bash, Fish and PowerShell completion files to the release, fixes a bug that prevented ripgrep from working on file paths containing invalid UTF-8 and shows short descriptions in the output of `-h` but longer descriptions in the output of `--help`.

Fixes #136, #189, #210, #230
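For scripts that build `rg` invocations programmatically, the `[-]foo` workaround described in the commit message can be applied mechanically. The helper below is a small sketch, not part of ripgrep or clap; `protect_leading_dash` is a hypothetical name. It rewrites only a single leading dash into a one-character class and does not escape anything else in the pattern.

```rust
/// Wrap a leading `-` in a character class so the pattern cannot be
/// mistaken for a command-line flag. Patterns that do not start with
/// a dash are returned unchanged.
fn protect_leading_dash(pattern: &str) -> String {
    match pattern.strip_prefix('-') {
        Some(rest) => format!("[-]{}", rest),
        None => pattern.to_string(),
    }
}

fn main() {
    // Show the rewrite for a few sample patterns.
    for pat in &["-foo", "foo", "--bar"] {
        println!("{} -> {}", pat, protect_leading_dash(pat));
    }
}
```

The `rg -- -foo` form from the commit message avoids rewriting the pattern at all, so it may be the simpler choice when the pattern is passed as a positional argument.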
Example using the https://github.com/keybase/client repo:
It looks like rg is slower when given a (long) list of files on the command line than it is when it has to find those files itself. grep doesn't have this problem, so I don't think it's some confounder like my shell taking a long time to expand the glob.