Faster hash algorithm? #108

@alexcrichton

Running sccache against a fully cached LLVM build on S3, I've found that 82% of the server's runtime is spent in sha1 hashing. That's quite a lot! This being C++, it is indeed hashing megabytes and megabytes of output...

Have other hash algorithms been considered? Here's what I've learned so far:

  • The OpenSSL implementation of SHA1 is ~4x faster than the sha1 crate (presumably due to fancy SIMD bits).
  • The ring implementation of SHA1 is the same speed as the sha1 crate (presumably because both are written in stable Rust without SIMD).
  • The OpenSSL implementation of SHA256 is ~2x faster than the sha1 crate.
  • The ring implementation of SHA256 is also ~2x faster than the sha1 crate.
  • Various blake2b crates can get to about 3x faster than the sha1 crate.
  • I'm sure md5 can be even faster still!

Are there any thoughts on the hashing algorithm here? I'd be wary of using OpenSSL because it's difficult to get working on Windows. I think ring could be a good option, but it's not that much faster in SHA256 mode. Maybe blake2b is fast enough? (And I think it's still written in Rust.)
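
For anyone who wants to reproduce this kind of comparison, here's a rough sketch of a throughput harness. It is not how the numbers above were measured: it uses Python's standard `hashlib` (typically backed by OpenSSL or bundled C implementations), so it says nothing about the sha1, ring, or blake2b Rust crates. The function names, the 16 MB zero-filled buffer, and the best-of-five timing are all assumptions made for illustration.

```python
import hashlib
import time


def throughput_mb_per_s(name, data, rounds=5):
    """Return the best observed throughput in MB/s for one hashlib algorithm."""
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        hashlib.new(name, data).digest()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer on tiny inputs.
    return len(data) / (1024 * 1024) / max(best, 1e-9)


def compare(names, size_mb=16):
    """Hash the same zero-filled buffer with each algorithm; return {name: MB/s}."""
    data = bytes(size_mb * 1024 * 1024)
    results = {}
    for name in names:
        try:
            results[name] = throughput_mb_per_s(name, data)
        except ValueError:
            # hashlib raises ValueError for algorithms this build does not provide.
            continue
    return results


if __name__ == "__main__":
    results = compare(["sha1", "sha256", "blake2b", "md5"])
    baseline = results.get("sha1")
    for name, mbps in sorted(results.items(), key=lambda kv: -kv[1]):
        relative = f" ({mbps / baseline:.2f}x sha1)" if baseline else ""
        print(f"{name:8s} {mbps:10.1f} MB/s{relative}")
```

The ratios will vary with the CPU and with how the Python build was linked, so treat the output as a sanity check rather than a substitute for benchmarking the actual crates inside sccache.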
