
fclones scans a ton of files that have no chance of matching #257

Open
KyleSanderson opened this issue Jan 14, 2024 · 4 comments

@KyleSanderson

KyleSanderson commented Jan 14, 2024

Command line: fclones group --hidden --no-ignore -s1M --cache /mnt/*T*

[2024-01-13 14:03:31.740] fclones:  info: Started grouping
[2024-01-13 14:03:48.012] fclones:  info: Scanned 633467 file entries
[2024-01-13 14:03:48.018] fclones:  info: Found 466729 (107.6 TB) files matching selection criteria
[2024-01-13 14:03:48.580] fclones:  info: Found 425599 (35.6 TB) candidates after grouping by size
[2024-01-13 14:03:49.004] fclones:  info: Found 425599 (35.6 TB) candidates after grouping by paths
[2024-01-13 14:22:18.464] fclones:  info: Found 73996 (5.6 TB) candidates after grouping by prefix
[2024-01-13 14:24:36.793] fclones:  info: Found 73858 (5.5 TB) candidates after grouping by suffix
[2024-01-13 22:14:39.468] fclones:  info: Found 73294 (5.5 TB) redundant files

The true hardlinked size when running link is ~350 GB.
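To see where the run time goes, the timestamps in the log above can be subtracted pairwise. This is a minimal sketch that only parses the log text quoted in this issue; the function name `stage_durations` is made up for illustration and is not part of fclones.

```python
import re
from datetime import datetime

# Log text copied verbatim from this issue.
LOG = """\
[2024-01-13 14:03:31.740] fclones:  info: Started grouping
[2024-01-13 14:03:48.012] fclones:  info: Scanned 633467 file entries
[2024-01-13 14:03:48.018] fclones:  info: Found 466729 (107.6 TB) files matching selection criteria
[2024-01-13 14:03:48.580] fclones:  info: Found 425599 (35.6 TB) candidates after grouping by size
[2024-01-13 14:03:49.004] fclones:  info: Found 425599 (35.6 TB) candidates after grouping by paths
[2024-01-13 14:22:18.464] fclones:  info: Found 73996 (5.6 TB) candidates after grouping by prefix
[2024-01-13 14:24:36.793] fclones:  info: Found 73858 (5.5 TB) candidates after grouping by suffix
[2024-01-13 22:14:39.468] fclones:  info: Found 73294 (5.5 TB) redundant files
"""

LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] fclones:\s+info: (?P<msg>.*)$")


def stage_durations(log_text):
    """Return (message, seconds since the previous log line) for each line after the first."""
    entries = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
            entries.append((ts, m["msg"]))
    return [
        (msg, (ts - prev_ts).total_seconds())
        for (prev_ts, _), (ts, msg) in zip(entries, entries[1:])
    ]


if __name__ == "__main__":
    for msg, seconds in stage_durations(LOG):
        print(f"{seconds:10.1f} s  {msg}")
```

By these timestamps, the step between "grouping by suffix" and "redundant files" takes about 7 h 50 min of a run lasting roughly 8 h 11 min, while the prefix step takes about 18.5 min and the suffix step a little over 2 min.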

@pkolaczk
Owner

Can you elaborate on why you think so? Which files should not be scanned? It basically scans all the files in the given directories and then compares them by hash.
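For readers following along, the stage names in the log (size, prefix, suffix, full contents) suggest a staged narrowing of candidates. The following is a generic sketch of that idea and not fclones' actual implementation: `CHUNK`, `find_duplicates`, and the helper names are hypothetical, SHA-256 is chosen only for convenience, and the "grouping by paths" step is left out.

```python
import hashlib
import os
from collections import defaultdict

CHUNK = 4096  # hypothetical prefix/suffix length; fclones' real values may differ


def _hash_range(path, offset, length):
    """Hash `length` bytes starting at `offset`; length=None means up to end of file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining is None or remaining > 0:
            n = 65536 if remaining is None else min(65536, remaining)
            block = f.read(n)
            if not block:
                break
            h.update(block)
            if remaining is not None:
                remaining -= len(block)
    return h.digest()


def _regroup(groups, key):
    """Split every group by `key` and keep only subgroups with at least two files."""
    out = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key(path)].append(path)
        out.extend(g for g in buckets.values() if len(g) > 1)
    return out


def find_duplicates(paths):
    """Narrow candidates stage by stage; only the last stage reads whole files."""
    groups = _regroup([list(paths)], os.path.getsize)               # by size
    groups = _regroup(groups, lambda p: _hash_range(p, 0, CHUNK))   # by prefix
    groups = _regroup(
        groups,
        lambda p: _hash_range(p, max(0, os.path.getsize(p) - CHUNK), CHUNK),
    )                                                               # by suffix
    groups = _regroup(groups, lambda p: _hash_range(p, 0, None))    # by full contents
    return groups
```

In this sketch, the prefix and suffix stages read at most `CHUNK` bytes per file, so a "candidates" figure after such a stage is the total size of the surviving files rather than the number of bytes read. Whether the figures in the fclones log follow the same convention is an assumption here.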

@KyleSanderson
Author

> Can you elaborate why do you think so? Which files should not be scanned? It basically scans all the files in the given directory, and then compares them by hashes.

The scan is 5.6 TB + 5.5 TB. If the prefix and suffix match, the total should be 5.5 TB or less, not 11 TB. Additionally, these files are not new, but fclones seems to scan them anyway for fun, even though --cache is specified.

@KyleSanderson
Author

Looks like half the problem is that the cache isn't updated when link runs, so fclones sees the linked files as new.

@Motophan

I've literally never had a problem.
