Support for excluding files from analysis #69

Closed
jvassev opened this issue Mar 1, 2021 · 3 comments
Labels
question A question, not a bug.

Comments

@jvassev

jvassev commented Mar 1, 2021

Hi,
I am using rdfind to find duplicate files in a single folder that is used by a downloader service. Occasionally, the downloader fetches the same file under a different name, and rdfind successfully deduplicates it.

I noticed that rdfind (which I run in a loop every 30s) does a lot of redundant work while a file is still being downloaded.

Is it possible to say "exclude files that were modified in the last X seconds from analysis"? Or maybe use a globbing pattern to exclude *.part files?

@bes-internal

bes-internal commented Mar 1, 2021

You can use the system find, as in this example from man rdfind:

 Search for duplicate files in directories called foo:
              find . -type d -name foo -print0 |xargs -0 rdfind

In your case, something like this:

find pathtodir -type f ! -name '*.part' -print0 | xargs -0 rdfind
find pathtodir -type f -mmin +1 -print0 | xargs -0 rdfind
                         ^-- +1: file's data was last modified more than 1 minute ago.
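
For the 30s loop described in the question, the two filters can be combined into a single pass. This is only a minimal sketch: the function names `eligible_files` and `dedup_settled` are made up for illustration, the one-minute default is an assumption, and `-mmin` and `xargs -r` are extensions (GNU and most BSD tools have them) rather than POSIX.

```shell
#!/bin/sh
# Hypothetical helper: list files under "$1" that look safe to analyse.
# Keeps regular files that are not named *.part and whose data was last
# modified more than "$2" minutes ago (assumed default: 1).
# Output is NUL-separated so unusual file names survive the pipe.
eligible_files() {
    dir=$1
    min_age=${2:-1}
    find "$dir" -type f ! -name '*.part' -mmin "+$min_age" -print0
}

# Hypothetical wrapper: hand the settled files to rdfind.
# -r (a GNU extension) skips the rdfind call when the list is empty.
dedup_settled() {
    eligible_files "$1" "$2" | xargs -0 -r rdfind
}
```

Calling `dedup_settled pathtodir 1` from the loop then skips anything the downloader is still writing, provided the downloader keeps touching the file at least once a minute.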

pauldreik added the question label Aug 12, 2021
@pauldreik
Owner

Sorry for the late answer. The suggestion by @bes-internal is excellent!

@entrity

entrity commented Mar 3, 2024

I think the xargs answer isn't sufficient for large collections. I want to do this on a large tree, and I get "xargs: argument line too long".
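
A related caveat on large trees: when the file list does not fit on one command line, xargs splits it and starts the command several times, so each rdfind run would only compare the files in its own batch. The sketch below is one way to check how many batches a given tree produces before trusting the result. The function name `count_batches` is made up for illustration, and the check does not run rdfind at all.

```shell
#!/bin/sh
# Hypothetical helper: count how many separate commands xargs would launch
# for the NUL-separated file list that find produces for "$1".
# A result above 1 means rdfind would be started several times, each time
# seeing only part of the list.
count_batches() {
    dir=$1
    find "$dir" -type f ! -name '*.part' -print0 |
        xargs -0 sh -c 'echo batch' sh |   # one "batch" line per invocation
        wc -l |
        tr -d ' '                          # some wc versions pad with spaces
}
```

If `count_batches pathtodir` prints anything other than 1, the piped approach will miss duplicates that land in different batches.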


4 participants