-
Notifications
You must be signed in to change notification settings - Fork 761
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
exp: plumbling/hasher, Add ObjectHasher
and ImmutableHash
#920
Conversation
Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
The support for SHA256 will require the mapping between ObjectFormat and the correct hash.Hash to be used. Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
The new exp package will hold only experimental code, which may eventually make its way to other parts of go-git. One example of how it would be used is to support the breaking up of changes required for things like SHA256. That feature would require several packages to be changed, and introducing such changes to plumbing in one go would be too disruptive. The new changes would now be staged in the new package, and as they mature they would be either moved into their target destination (e.g. plumbing/hash), or removed if no longer needed. As per caveat on the documentation, this package API may introduce breaking changes to its subpackages in minor go-git releases, therefore beware when depending on them. Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
The introduction of both new types are a stepping stone to enable SHA256 support concurrently - without the need for a build tag. ImmutableHash provides a way to represent varied sized hashes (for SHA1 or SHA256) with a single type, while keeping similar performance of the existing plumbing.Hash. The names were picked in order to provide a clearer distinction on what they do. ImmutableHash is the result of a hash operation and can't be changed once calculated. ObjectHasher, adds the Git object header before a normal hash operation that hash.Hash or plumbing/hash.Hash do. The performance results shows that SHA1 operations could be slower, however for SHA256 it can be over 3 times faster: ~Hasher/hasher-sha1-100B-16 2202226 574.7 ns/op 174.00 MB/s 272 B/op 7 allocs/op ~Hasher/objecthash-sha1-100B-16 1511851 772.6 ns/op 129.43 MB/s 272 B/op 9 allocs/op ~Hasher/objecthash-sha256-100B-16 5057584 247.4 ns/op 404.21 MB/s 96 B/op 7 allocs/op ~Hasher/hasher-sha1-5000B-16 96476 12523 ns/op 399.25 MB/s 5536 B/op 7 allocs/op ~Hasher/objecthash-sha1-5000B-16 105376 10983 ns/op 455.27 MB/s 272 B/op 9 allocs/op ~Hasher/objecthash-sha256-5000B-16 420741 2828 ns/op 1767.77 MB/s 96 B/op 7 allocs/op ~HashFromHex/hasher-parse-sha1-valid-16 23243548 64.65 ns/op 618.69 MB/s 48 B/op 1 allocs/op ~HashFromHex/objecthash-fromhex-sha1-valid-16 18120699 79.62 ns/op 502.36 MB/s 72 B/op 2 allocs/op ~HashFromHex/objecthash-fromhex-sha256-valid-16 8969871 124.2 ns/op 515.22 MB/s 96 B/op 2 allocs/op Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
@@ -21,6 +21,7 @@ require ( | |||
github.com/pjbgf/sha1cd v0.3.0 | |||
github.com/sergi/go-diff v1.1.0 | |||
github.com/skeema/knownhosts v1.2.1 | |||
github.com/stretchr/testify v1.8.4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a new testing library or can we use the existing gopkg.in/check.v1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should not have two testing libraries. However, check.v1
is no longer maintained so we are better off removing it across the board. There is an issue already talking about it.
On go-git-fixtures, I already marked the check.v1
API as deprecated and on the new code for SHA256 the idea is to rely on either std Go or testify.
SHA1_Size = 20 | ||
SHA1_HexSize = SHA1_Size * 2 | ||
SHA256_Size = 32 | ||
SHA256_HexSize = SHA256_Size * 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SHA1_Size = 20 | |
SHA1_HexSize = SHA1_Size * 2 | |
SHA256_Size = 32 | |
SHA256_HexSize = SHA256_Size * 2 | |
SHA1Size = 20 | |
SHA1HexSize = SHA1Size * 2 | |
SHA256Size = 32 | |
SHA256HexSize = SHA256Size * 2 |
} | ||
} | ||
|
||
type objectHasherSHA256 struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both objectHasherSHA1
and objectHasherSHA256
share the same logic, can we combine both in one type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a single base type was my original design, which gave me a few options:
a) have two different sized arrays, to be used based on the hash at hand.
b) rely on a slice and size it based on the hash at hand.
c) use generics and rely on type constraints.
From the options above, the smallest impact in both performance and API changes was this. But I am not convinced this is it yet - hence why feedback on this is so important.
The original plumbing.Hash
idea of using a fixed sized array as the base Hash type was really good in so many different ways - including performance. So whatever comes to replace it will need to be quite good.
The introduction of both new types are a stepping stone to enable SHA256 support concurrently - without the need for a build tag.
ImmutableHash provides a way to represent varied sized hashes (for SHA1 or SHA256) with a single type, while keeping similar performance of the existing
plumbing.Hash
.The names were picked in order to provide a clearer distinction on what they do.
ImmutableHash
is the result of a hash operation and can't be changed once calculated.ObjectHasher
, adds the Git object header before a normal hash operation thathash.Hash
orplumbing/hash.Hash
do.The performance results shows that SHA1 operations could be slower, however for SHA256 it can be over 3 times faster:
Relates to #229 #899
The other commits (together with the one described above) lay the foundations for a follow-up PR which introduces a
exp/plumbing/format/idxfile
which supports both SHA1 and SHA256.