-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ML-KEM: Bring encoding functions in line with the spec #160
base: master
Are you sure you want to change the base?
Conversation
- adds the bytedecode function - adds an inversion property about decode + encode - adds some properties about rewrites that I did for decoding
Also adds the relevant property
Also adds vector versions of encode/decode-12.
2d917b4
to
820991b
Compare
ByteDecode12 : [32 * 12]Byte -> [256](Z q) | ||
ByteDecode12 B = F' where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: this function is never actually used alone, but I don't feel like it's reasonable skip it and only make ByteDecode12_Vec
.
*/ | ||
EncodeBytes' : {ell, c} (fin ell, ell > 0, fin c) => [c * 8][ell] -> [c * ell]Byte | ||
EncodeBytes' = regroup | ||
* Encode an array of `d`-bit integers into a byte array, for `d < 12`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: I was also thinking of making all these private, since they're not part of the public API of ML-KEM.
deletes two functions that were previously used in encoding, and relocate one to the one place where it's used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The EncodeBytes and DecodeBytes implementations look quite a bit different from the spec, but the explanations for why that's the case help build confidence. I especially appreciate proving each of the claims that were made (e.g., multiplying by 2^j is the same as left shift).
* This is used in some places where the `ByteDecode` function is required in | ||
* the spec. It's a 3D version of `DecodeBytes'`. | ||
* The subtract-and-divide algorithm applied to `a` in `ByteEncode` is the | ||
* same as shifting right. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be helpful to add the step in the algorithm (5) where this comes up in the explanation.
|
||
/** | ||
* This is used in some places where the `ByteEncode` function is required in | ||
* the spec. It's a 3D version of `EncodeBytes'`. | ||
* Encode a set of `k` vectors of integers mod `q` into a byte array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is [FIPS-203] Section 2.4.8 relevant here like it is in ByteEncode_Vec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thanks.
* :prove CorrectnessEncodeBytes | ||
* :prove mod2IsFinalBit`{d_u} | ||
* :prove mod2IsFinalBit`{d_v} | ||
* :prove mod2IsFinalBit`{12} | ||
* ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be helpful to add some text about why we care that mod 2 is the final bit as it relates to the EncodeBytes algorithm.
// Step 1. | ||
b = BytesToBits B | ||
// Steps 2-4. The `mod m` is implicit in the type because the `[d]` type | ||
// always operates `mod 2^d`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I paused for a second here and had to look back at the spec. Maybe be explicit about how m=2^d?
|
||
/** | ||
* Proof that the efficient decode function is the same as the spec version. | ||
* Multiplying a value by `2^^j` is the same as bit-shifting it left by `j` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should 2^^j be 2^j in this line?
* ```repl | ||
* :prove ByteEncodeInvertsByteDecode`{1} | ||
* :prove ByteEncodeInvertsByteDecode`{d_u} | ||
* :prove ByteEncodeInvertsByteDecode`{d_v} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does d_u and d_v come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an explanation here and everywhere I use these without comment in the doctests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as well, note that #166 applies here as well.
Addresses part of #144, but does not complete it.
This handles all the encode / decode parts of #144. It was getting a little hefty as-is so I'm breaking it into its own PR.
This was mainly a project in mucking around with types. The spec assumes that all arithmetic operations can be done on all integral types, but that's not true in Cryptol. For example, you can't multiply a 1-bit vector by 2, and you can't mod an integer mod
q
by 2. These things "just happen" in the spec, but in this implementation we have to manually convert types between bit vectors and integers mod, and sometimes we have to use more width-independent operations (like bit shifting instead of multiplying by2^j
).The original plan for this PR was to combine the
ByteEncode
andByteEncode12
functions into one (and similarly for decoding). However in order to do so I would've had to representd
-bit vectors as integers mod2^d
(e.g. using theZ
type), and that was ugly, hard to understand, and slow to prove any properties about. So I kept them separate, as they were previously, and just changed the names to be more in line with the spec.The other overloaded name in the spec is that the encode/decode functions use the same name to apply to either one or many inputs. I added a
_Vec
suffix to distinguish these cases.This feels very cluttered to me with so many properties and conditional functions to handle just 2 algorithms in the spec. I welcome suggestions to reduce that mess.