Faster hashes #141

Open
ido opened this issue Oct 7, 2023 · 1 comment
Comments

ido commented Oct 7, 2023

If you check the entire file's contents after comparing hashes, then it is probably worthwhile to explore faster hashes such as xxhash, or hardware-accelerated checksums such as CRC32C on x86_64 (where SSE4.2 is available), as a first pass before more expensive hashes or block-by-block file comparisons.

Here's a benchmark of fast hashes: https://github.com/rurban/smhasher#summary

Here are the CRC32C Intel intrinsics: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=1494,1494&text=crc32
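
To make the "cheap first pass" idea concrete, here is a minimal Python sketch, not rdfind's actual implementation. It filters by size, then by a fast checksum, and only then does a full byte-by-byte comparison. The function names `fast_checksum` and `find_duplicates` are hypothetical, and `zlib.crc32` (plain CRC-32, not the hardware CRC32C) is used only as a stand-in because it ships with the standard library; xxhash or a CRC32C binding could be swapped in.

```python
import filecmp
import os
import zlib
from collections import defaultdict


def fast_checksum(path, chunk_size=1 << 20):
    """Stream a file through zlib.crc32 (stand-in for xxhash or CRC32C)."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc


def find_duplicates(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    # Stage 1: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    duplicates = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        # Stage 2: a cheap checksum narrows down the candidates.
        by_crc = defaultdict(list)
        for p in same_size:
            by_crc[fast_checksum(p)].append(p)
        # Stage 3: confirm each candidate with a full content comparison,
        # so a checksum collision can never produce a false match.
        for candidates in by_crc.values():
            groups = []
            for p in candidates:
                for g in groups:
                    if filecmp.cmp(g[0], p, shallow=False):
                        g.append(p)
                        break
                else:
                    groups.append([p])
            duplicates.extend(g for g in groups if len(g) > 1)
    return duplicates
```

Because stage 3 always verifies the contents, the checksum in stage 2 only needs to be fast, not collision-resistant.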

@retrosapien

I would add an option to disable hashes altogether, too.

For example:
If I back up some files using rsync -a, I already know they are identical if the sizes are the same. If I later use rdfind to convert the backup files to hard links, hashing is a waste of time and resources.
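
A rough Python sketch of what such an opt-out could look like follows. The `skip_hash` parameter and the function names are hypothetical and do not correspond to any existing rdfind option. With `skip_hash=True`, files are grouped by size alone, which is only safe when, as in the rsync case above, the user already knows that same-sized files are copies.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def candidate_groups(paths, skip_hash=False):
    """Group paths by size, and also by digest unless skip_hash is set."""
    groups = defaultdict(list)
    for p in paths:
        size = os.path.getsize(p)
        # With skip_hash=True no file contents are read at all.
        key = (size,) if skip_hash else (size, file_digest(p))
        groups[key].append(p)
    # Only groups with at least two members are duplicate candidates.
    return [g for g in groups.values() if len(g) > 1]
```

For simplicity this sketch hashes every file when `skip_hash` is false; a real tool would hash only files that share a size with at least one other file.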


2 participants