-
Notifications
You must be signed in to change notification settings - Fork 274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cryptographic primitives for hashing and HMAC #1712
Conversation
ex : Bytes ex = h = Sha3_512.new h2 = h `appendBytes` Bytes.fromList [1,2,3,4] Sha3_512.finish h2 > ex unison: pack bytes: non-natural closure: DataU1 589824 1 CallStack (from HasCallStack): error, called at src/Unison/Runtime/Machine.hs:1027:14 in unison-parser-typechecker-0.1-DahyA8UvBK0B5OVYJToEFT:Unison.Runtime.Machine
…e2b{512,256}, Blake2s{256}. Still pending work from @dolio for the universal hashing functions.
…ontexts. In prep for HMAC implementation.
…ializing / decompiling the hash / hmac state is a can of worms
…ts directory for transcripts that are only run with the new runtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self review.
-- Pure ForeignOp taking two boxed values | ||
pfopbb :: ForeignOp | ||
pfopbb instr | ||
= ([BX,BX],) | ||
. TAbss [b1,b2] | ||
$ TFOp instr [b1,b2] | ||
where | ||
[b1,b2] = freshes 2 | ||
|
||
pfopbbb :: ForeignOp | ||
pfopbbb instr | ||
= ([BX,BX,BX],) | ||
. TAbss [b1,b2,b3] | ||
$ TFOp instr [b1,b2,b3] | ||
where | ||
[b1,b2,b3] = freshes 3 | ||
|
||
-- Pure ForeignOp taking no values | ||
pfop0 :: ForeignOp | ||
pfop0 instr = ([],) $ TFOp instr [] | ||
|
||
-- Pure ForeignOp taking 1 boxed value and returning 1 boxed value | ||
pfopb :: ForeignOp | ||
pfopb instr | ||
= ([BX],) | ||
. TAbss [b] | ||
$ TFOp instr [b] | ||
where | ||
[b] = freshes 1 | ||
|
||
-- Pure ForeignOp taking 1 boxed value and returning 1 Either, both sides boxed | ||
pfopb_ebb :: ForeignOp | ||
pfopb_ebb instr | ||
= ([BX],) | ||
. TAbss [b] | ||
. TLet e UN (AFOp instr [b]) | ||
. TMatch e . MatchSum | ||
$ mapFromList | ||
[ (0, ([BX], TAbs ev $ TCon eitherReference 0 [ev])) | ||
, (1, ([BX], TAbs ev $ TCon eitherReference 1 [ev])) | ||
] | ||
where | ||
[e,b,ev] = freshes 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added these common cases and am recycling them in a few places for various builtins.
-- declareForeign ("crypto.hash") pfopbb . mkForeign $ \(HashAlgorithm _ref _alg, _a :: Closure) -> | ||
-- pure $ Bytes.empty -- todo : implement me | ||
|
||
declareForeign "crypto.hashBytes" pfopbb . mkForeign $ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice am using the pfopbb : ForeignOp
convention here - it's a pure foreign operation that takes 2 boxed values. Probably a lot of the foreign operations in this file could be cleaned up to use a small number of ForeignOp
types.
instance {-# overlappable #-} BuiltinForeign b => ForeignConvention b where | ||
readForeign = readForeignBuiltin | ||
writeForeign = writeForeignBuiltin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This saves you from needing to provide a ForeignConvention
for a type that's already a BuiltinForeign
.
} | ||
|
||
instance B.ByteArrayAccess bytes => Eq (View bytes) where | ||
v1 == v2 = viewSize v1 == viewSize v2 && unsafeDupablePerformIO ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, this (unsafeDupablePerformIO
) seems like the wrong way to be doing this now. GHC has primops for comparing byte arrays that don't require dangerous stuff like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things overall look all right to me.
I think it'd be better to get those unsafeDupablePerformIO
calls eliminated, but it's not really different than depending on a library that does it (but shouldn't).
I wonder if we could just build some utilities around Vector Word8 rather than using the ByteArray interface. I'm not really sold on that interface. But I'd also like to start moving away from ByteString as I think the use of pinned memory isn't what we want. Going to merge this but we can revisit the Bytes internals anytime. |
This PR adds support for hashing and HMAC. See the transcript for docs and tests. Note that the newly created
unison-src/new-runtime-transcripts
directory has transcripts that will only be run using the new runtime and there's a tweak to the test harness to support this.Closes #1694, but see loose ends tickets below.
Implementation notes
Followed the instructions in
adding-builtins.md
. Nothing too exciting. I hit a couple bugs that @dolio was kind enough to fix. I have some ideas on how to improve the handling of foreign data types in the runtime but will work on that separately.I revamped the
Bytes
implementation to use aByteArray
interface provided by the memory package, to get various encoding/decoding functions (like converting to and from base 16 and base 32) and to switch to using unpinned memory. That package is already a dependency of cryptonite which we are using for crypto. See the diff inBytes.hs
for details.See self-review.
Interesting/controversial decisions
I had wanted an incremental hashing (and HMAC) implementation but put that aside for now: you have to feed it the full
Bytes
or other value you are hashing. An incremental interface would look like:This will let you do things like hash a large file without having to load the whole file contents in memory at once. The cryptonite library we are using supports incremental hashing, but the data type it uses to support it isn't safely serializable - it's literally just an opaque byte array internally, and the hashing algorithm (which is implemented in C) then makes assumptions based on these bytes. I was concerned that if you just deserialize any old byte array and do hashing with it, you could get buffer overruns and so on.
We could support an incremental interface later.
Test coverage
The transcript has some basic tests. The various
Bytes
encoding functions could use more tests.Loose ends