Progress bar #740

Closed
MrYakobo opened this issue Mar 9, 2021 · 13 comments

Comments

@MrYakobo

MrYakobo commented Mar 9, 2021

Hello

Thank you for an amazing tool. It really outplays GNU find and parallel by a lot.

Today, I found myself converting a lot of mp3s

fd -j4 . ../some_dir -e mp3 -x lame -b 256

which works marvellously, but there is no easy way to tell how many files fd has left to process. A progress bar option (--progress-bar/-#?) would be nice.
Have a nice day!

@sharkdp
Owner

sharkdp commented Mar 14, 2021

Thank you for your request.

That's an interesting idea. Two problems/questions:

  • It's not really possible to show a progress bar (in all --exec cases). Consider this: someone might be running a very fast --exec command on a couple of search results in a huge directory. In this scenario, the time is largely determined by searching the files, not by process execution. Conversely, in your use case, the filesystem search is probably much faster than the process execution. But there is no way to tell how long each of these commands will take. Estimating it by keeping track of a running average or similar can be misleading if the files are of vastly different sizes.
  • What about the process output? Would we show that in addition (above/below the progress bar)? Or would --progress-bar switch off the process output?
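
To make the running-average concern concrete, here is a small POSIX shell sketch. It is not anything fd implements: the function name naive_eta and the durations (four 1-second files followed by one 100-second file, processed one at a time) are assumptions chosen purely for illustration.

```shell
# Hypothetical per-file durations in seconds: four small files, one huge one.
durations="1 1 1 1 100"

# naive_eta DONE DURATION...
# Estimates the remaining seconds as
#   (files left) * (average duration of the first DONE files).
# DONE must be at least 1. Integer arithmetic only.
naive_eta() {
    done_count=$1
    shift
    total=$#
    elapsed=0
    i=0
    for d in "$@"; do
        if [ "$i" -lt "$done_count" ]; then
            elapsed=$((elapsed + d))
        fi
        i=$((i + 1))
    done
    echo $(( (total - done_count) * elapsed / done_count ))
}

# Estimate after four of the five files have finished.
naive_eta 4 $durations
```

With these numbers the estimate after four files is 1 second, while the one remaining file actually takes 100 seconds.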

@MrYakobo
Author

I fully agree that estimating the total time accurately is difficult, but isn't that always the case? Progress bars have always been wonky. The same argument applies to GNU parallel, yet it has the option. It doesn't have to be a perfect estimate 😄

For process output, yeah, that's definitely also a tricky one. Maybe do what parallel does with its --bar?

$ mkdir test && cd test
$ touch $(seq 1 10)
$ echo 'import sys;import random;import time;time.sleep(random.random()*2);print("STDERR", file=sys.stderr);print("STDOUT")' > stderr_stdout.py
$ find -print0 | parallel --bar -0 -P 4 python stderr_stdout.py

[image: gnu]

i.e. copy both streams. And then maybe have some option for muting the subprocess output for a clean progress bar.

Or, don't show a bar at all, but rather show a fraction (file 42/100 done). Maybe that's less misleading than a bar with ETAs 🤔
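
The fraction idea can be sketched outside fd in a few lines of POSIX shell. This is only an illustration under assumptions: run_with_count is a hypothetical helper, it runs commands one at a time (unlike fd -j4), and it expects one path per line, so paths containing newlines would break it.

```shell
# run_with_count CMD [ARGS...]
# Reads newline-separated paths from stdin, runs CMD ARGS PATH for each one,
# and reports "N/TOTAL done" on stderr after each command finishes.
run_with_count() {
    list=$(mktemp) || return 1
    cat > "$list"                      # buffer the paths so the total is known up front
    total=$(wc -l < "$list" | tr -d ' ')
    n=0
    while IFS= read -r path; do
        "$@" "$path" < /dev/null       # keep the command from consuming the path list
        n=$((n + 1))
        printf '%s/%s done\n' "$n" "$total" >&2
    done < "$list"
    rm -f "$list"
}

# Demo with a harmless stand-in command instead of a real encoder.
printf '%s\n' a.mp3 b.mp3 c.mp3 | run_with_count echo converting
```

For the original use case this could be fed from fd, for example fd . ../some_dir -e mp3 | run_with_count lame -b 256, at the cost of losing fd's parallel execution.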

@lainisourgod

I would rather not estimate the total time, and just show the % of files processed and the time spent so far. That would be enough for users to estimate the progress themselves.

Thumbs up for the feature!
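
A status line of that kind (percentage, count, and elapsed time, with no estimate) could be formatted roughly as follows. The helper name progress_line and its output layout are assumptions for illustration, and date +%s is assumed to be available, as it is on GNU and BSD systems.

```shell
# progress_line DONE TOTAL ELAPSED_SECONDS
# Prints a line such as "42% (42/100), 1m05s elapsed" without any time estimate.
progress_line() {
    pct=0
    if [ "$2" -gt 0 ]; then
        pct=$(( $1 * 100 / $2 ))
    fi
    printf '%d%% (%d/%d), %dm%02ds elapsed\n' \
        "$pct" "$1" "$2" $(( $3 / 60 )) $(( $3 % 60 ))
}

start=$(date +%s)
# ... some files get processed here ...
progress_line 42 100 $(( $(date +%s) - start ))
```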

@sharkdp
Owner

sharkdp commented Aug 7, 2021

Maybe we should leave this to a separate tool. You can use GNU parallel with fd, for example:

fd … | parallel --bar cmd

In principle, using fd -x can be faster than piping to parallel, but only if the filesystem search is the limiting factor, which is not the case here.

I hope it's okay if we close this, but please feel free to comment.

sharkdp closed this as completed Aug 7, 2021

@lainisourgod

lainisourgod commented Aug 8, 2021

@sharkdp

I like @MrYakobo's idea of showing the number of files processed so far.

  • We won't lie about time
  • We have this information
  • It's really easy to render

Why not?

@sharkdp
Owner

sharkdp commented Aug 8, 2021

> Why not?

What about my second question above: #740 (comment)

@lainisourgod

@sharkdp

First case: many operations with logs

I guess I generally don't want my terminal to be cluttered with 10k+ lines of individual operations' logs, so I want it to be disabled by default. To get at these logs, I would either:

  • write them to a file and look/search from there
  • or do it the way docker build does: show a progress bar or (file count + time spent) at the top and show the last N lines of logs below
[video: terminal.mp4]

Second case: not many logs

It's OK to show logs when there aren't many files.

It's hard to define what "many" means in different cases, so I would just make one behaviour the default and the other optional.
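
The simpler of the two options above, writing logs to a file while keeping only a counter on the terminal, might look like this outside fd. It is a sketch under assumptions: run_quiet is a hypothetical helper, it is sequential, and it does not attempt the docker-build-style live view of the last N lines.

```shell
# run_quiet LOGFILE CMD [ARGS...]
# Reads newline-separated paths from stdin, appends each command's stdout and
# stderr to LOGFILE, and keeps only a running count on the terminal (stderr).
run_quiet() {
    log=$1
    shift
    n=0
    while IFS= read -r path; do
        "$@" "$path" >> "$log" 2>&1 < /dev/null
        n=$((n + 1))
        printf '\r%s files processed' "$n" >&2   # \r redraws the same terminal line
    done
    printf '\n' >&2
}

# Demo with echo standing in for a real command.
log=$(mktemp)
printf '%s\n' one two three | run_quiet "$log" echo removed
tail -n 2 "$log"    # inspect the last lines of the log afterwards
rm -f "$log"
```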

@sharkdp
Owner

sharkdp commented Oct 8, 2021

> I guess I generally don't want my terminal to be cluttered with 10k+ lines of individual operations' logs so I want it to be disabled by default.

We're definitely not going to disable it by default. Having the output of --exec/--exec-batch is very important for a lot of use cases.

@lainisourgod

@sharkdp Okay, I get why it should be the default and agree with you now. But the UX of not piping into parallel and having our own progress bar is cool. In my case I was (I guess) deleting lots of files, and piping this huge output is not the best choice.

@sharkdp
Owner

sharkdp commented Nov 10, 2021

Ok. I think I'd be okay with integrating this if it doesn't add a huge amount of new code.

@Anangaya

Anangaya commented May 18, 2023

Hey there,

I really think having a progress bar would be a fantastic addition to fd.

I recently had a task to delete over 600,000 HTML files from an HDD and chose to use fd, as some smaller hyperfine tests showed it was faster than combining it with rush. Even though it was faster, the entire process still took around 2 hours!

It would have been a smoother experience if I could have tracked the progress live in fd itself, similar to rush --eta. I ended up resorting to voidtools Everything to monitor the progress, but it would have been great to have that functionality within fd.

@tavianator
Collaborator

@Anangaya, by "sd" do you mean fd?

How did you actually delete the files? Hopefully -X rm, not -x rm?

@Anangaya

Anangaya commented May 18, 2023

> by "sd" do you mean fd?

Yeah, my bad. I corrected it.

> How did you actually delete the files? Hopefully -X rm, not -x rm?

Well, I did use -x rm in this case, because it didn't occur to me to try -X rm, which definitely seems a lot faster for cases like this.
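
The difference between the two modes can be illustrated with xargs standing in for fd (an analogy, not fd's implementation): fd -x starts one process per search result, while fd -X passes the results to a single invocation as a batch. The function names per_file and batched are made up for this sketch, and echo rm only prints the commands instead of deleting anything.

```shell
# One process per path, analogous to fd -x rm.
per_file() { xargs -n 1 echo rm; }

# One process for the whole batch, analogous to fd -X rm.
batched() { xargs echo rm; }

printf '%s\n' a.html b.html c.html | per_file   # prints three "rm ..." lines
printf '%s\n' a.html b.html c.html | batched    # prints one "rm a.html b.html c.html" line
```

With 600,000 files, the per-file variant pays the process start-up cost 600,000 times, which is why the batched form tends to be much faster for cheap commands like rm.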

5 participants