GCC miscompiles Fp6 Frobenius with -flto flag #230
It seems the reference exponentiation algorithm gets miscompiled and zeroes out some data:
[Suite] 𝔽p6 Frobenius map: Frobenius(a, k) = a^(pᵏ) (mod p⁶) [64-bit words]
Frobenius(a) for Fp6[BN254_Nogami]
frob test ( a in): Fp6(
c0: Fp2(
c0: 0x10c399f626ceb62186cefecb6b3666da8db348e4fd28bb7f60c74a7ee5aea24a,
c1: 0x02cbc4df3bf4fdbed6be0cd9436a484b7e3294270ec0cc9f8d1378999d112334),
c1: Fp2(
c0: 0x073c6d67ce314a11b78c642631efd635efdbf4f4e436e00f7a7fdd9020bbe642,
c1: 0x089bb97f312d37833ba8b0c1ca2c60bd29c22e852b5b571398a5f1994fd9016d),
c2: Fp2(
c0: 0x040ff098efd9b9c8f0cd9f2925f2e2b869fcaa4cf50156192662ac606d1800ff,
c1: 0x0a4d8e2b6db81ef92a16d822e855cd38bc6fd8c9161c2b6206990c04bcc40d99))
frob test ( a out): Fp6(
c0: Fp2(
c0: 0x1d0240477bb4c3bb13c9e4f44c606fe9d730fc8f00630c2f0316b49beb606b5c,
c1: 0x0000000000000000000000000000000000000000000000000000000000000000),
c1: Fp2(
c0: 0x0000000000000000000000000000000000000000000000000000000000000000,
c1: 0x0000000000000000000000000000000000000000000000000000000000000000),
c2: Fp2(
c0: 0x0000000000000000000000000000000000000000000000000000000000000000,
c1: 0x0000000000000000000000000000000000000000000000000000000000000000))
frob test (fa out): Fp6(
c0: Fp2(
c0: 0x10c399f626ceb62186cefecb6b3666da8db348e4fd28bb7f60c74a7ee5aea24a,
c1: 0x22579fa3040b0242e37640a6bc95b7bce2ee6bd8f13f337419ec876662eedcdf),
c1: Fp2(
c0: 0x0d118e8c6ba1c6953de028c405b7ffd9b842d85366413f1ec1c49669f3bdc70b,
c1: 0x1ad1021b795ca7fec204540f55887869a937057ec588dbf9650ea71fac0be0bd),
c2: Fp2(
c0: 0x2098e7de1289bf414f9bbe13c7337da8c7cf89742a33eeadc8600a4c43f19de6,
c1: 0x172c6a37e2db93bd813bcd878a1ddc7d6f6f6b6272ad3596310c740cc17c9a5c))
/[...]/constantine/tests/math/t_fp_tower_frobenius_template.nim(89, 21): Check failed: bool(a == fa)
[FAILED] Frobenius(a) = a^p (mod p^6)
And, of course, compiling with UBSan makes the problem disappear:
test_fp6_frobenius xoshiro512** seed: 1681812781
[Suite] 𝔽p6 Frobenius map: Frobenius(a, k) = a^(pᵏ) (mod p⁶) [64-bit words]
Frobenius(a) for Fp6[BN254_Nogami]
frob test ( a in): Fp6(
c0: Fp2(
c0: 0x10c399f626ceb62186cefecb6b3666da8db348e4fd28bb7f60c74a7ee5aea24a,
c1: 0x02cbc4df3bf4fdbed6be0cd9436a484b7e3294270ec0cc9f8d1378999d112334),
c1: Fp2(
c0: 0x073c6d67ce314a11b78c642631efd635efdbf4f4e436e00f7a7fdd9020bbe642,
c1: 0x089bb97f312d37833ba8b0c1ca2c60bd29c22e852b5b571398a5f1994fd9016d),
c2: Fp2(
c0: 0x040ff098efd9b9c8f0cd9f2925f2e2b869fcaa4cf50156192662ac606d1800ff,
c1: 0x0a4d8e2b6db81ef92a16d822e855cd38bc6fd8c9161c2b6206990c04bcc40d99))
frob test ( a out): Fp6(
c0: Fp2(
c0: 0x10c399f626ceb62186cefecb6b3666da8db348e4fd28bb7f60c74a7ee5aea24a,
c1: 0x22579fa3040b0242e37640a6bc95b7bce2ee6bd8f13f337419ec876662eedcdf),
c1: Fp2(
c0: 0x0d118e8c6ba1c6953de028c405b7ffd9b842d85366413f1ec1c49669f3bdc70b,
c1: 0x1ad1021b795ca7fec204540f55887869a937057ec588dbf9650ea71fac0be0bd),
c2: Fp2(
c0: 0x2098e7de1289bf414f9bbe13c7337da8c7cf89742a33eeadc8600a4c43f19de6,
c1: 0x172c6a37e2db93bd813bcd878a1ddc7d6f6f6b6272ad3596310c740cc17c9a5c))
frob test (fa out): Fp6(
c0: Fp2(
c0: 0x10c399f626ceb62186cefecb6b3666da8db348e4fd28bb7f60c74a7ee5aea24a,
c1: 0x22579fa3040b0242e37640a6bc95b7bce2ee6bd8f13f337419ec876662eedcdf),
c1: Fp2(
c0: 0x0d118e8c6ba1c6953de028c405b7ffd9b842d85366413f1ec1c49669f3bdc70b,
c1: 0x1ad1021b795ca7fec204540f55887869a937057ec588dbf9650ea71fac0be0bd),
c2: Fp2(
c0: 0x2098e7de1289bf414f9bbe13c7337da8c7cf89742a33eeadc8600a4c43f19de6,
c1: 0x172c6a37e2db93bd813bcd878a1ddc7d6f6f6b6272ad3596310c740cc17c9a5c))
[OK] Frobenius(a) = a^p (mod p^6)
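The property this test checks can be illustrated on a much smaller field. The sketch below is not Constantine's code: it assumes a toy prime `P = 103` (chosen so that `P % 4 == 3`, which makes `i*i = -1` a valid construction of 𝔽p2), and the helper names `fp2_mul`, `fp2_pow` and `fp2_frobenius` are hypothetical. It compares the reference exponentiation a^p against the cheap Frobenius map, which in 𝔽p2 is simply conjugation.

```python
# Toy prime (assumption for illustration only): P % 4 == 3, so -1 is a
# quadratic non-residue and Fp2 can be built as Fp[i] / (i^2 + 1).
P = 103


def fp2_mul(x, y):
    """Multiply two Fp2 elements given as (real, imaginary) pairs."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def fp2_pow(x, e):
    """Reference square-and-multiply exponentiation in Fp2."""
    result = (1, 0)
    base = x
    while e > 0:
        if e & 1:
            result = fp2_mul(result, base)
        base = fp2_mul(base, base)
        e >>= 1
    return result


def fp2_frobenius(x):
    """Fast Frobenius map on Fp2: conjugation, since i^P = -i when P % 4 == 3."""
    a, b = x
    return (a, (-b) % P)


def check_frobenius(x):
    """True when the fast map agrees with the reference a^P."""
    return fp2_pow(x, P) == fp2_frobenius(x)
```

In the failing log above, the two sides of this comparison disagree because the value labelled `a out` is almost entirely zero while `fa out` holds nonzero coefficients.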
Further investigation shows that
The "obvious" difference between the working set (BN254_Snarks, BLS12_377) and the failing set (BN254_Nogami, BLS12_381) is that the failing set uses (1 + i) as the non-residue to build 𝔽p6 on top of 𝔽p2. Hence the bug is likely in
constantine/constantine/math/extension_fields/towers.nim Lines 449 to 497 in 93dac25
Even multiplying by 0 or 1, or squaring, gives wrong results:
test_fp6_BN254_Nogami xoshiro512** seed: 1681816001
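For context, multiplying by the non-residue (1 + i) is usually specialised so that it needs only additions: (a + b·i)(1 + i) = (a − b) + (a + b)·i. The sketch below checks that shortcut against a generic 𝔽p2 multiplication. It is an illustration under assumptions, not the code in `towers.nim`: the toy prime `P = 103` and the names `fp2_mul`, `fp2_mul_by_nonresidue` and `XI` are hypothetical.

```python
# Toy prime (assumption): P % 4 == 3, so Fp2 = Fp[i] / (i^2 + 1).
P = 103

# The non-residue 1 + i, as a (real, imaginary) pair.
XI = (1, 1)


def fp2_mul(x, y):
    """Generic Fp2 multiplication, used here as the reference."""
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def fp2_mul_by_nonresidue(x):
    """Multiply a + b*i by (1 + i) using only one subtraction and one addition."""
    a, b = x
    return ((a - b) % P, (a + b) % P)
```

A specialised routine like this is only exercised by curves whose tower uses (1 + i), which is consistent with only that set of curves failing.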
[Suite] 𝔽p6 = 𝔽p2[w] BN254_Nogami [64-bit words]
[OK] Comparison sanity checks
[OK] Addition, substraction negation are consistent
[OK] Division by 2
//[...]//constantine/tests/math/t_fp_tower_template.nim(149, 21): Check failed: bool(r == One)
[FAILED] Squaring 1 returns 1
//[...]//constantine/tests/math/t_fp_tower_template.nim(176, 21): Check failed: bool(r == Four)
[FAILED] Squaring 2 returns 4
//[...]//constantine/tests/math/t_fp_tower_template.nim(205, 21): Check failed: bool(u == Nine)
[FAILED] Squaring 3 returns 9
//[...]//constantine/tests/math/t_fp_tower_template.nim(234, 21): Check failed: bool(u == Nine)
[FAILED] Squaring -3 returns 9
fatal.nim(54) sysFatal
Unhandled exception: t_fp_tower_template.nim(262, 20) `bool(r == Z)`
Expected zero but got
(Fp6[BN254_Nogami]): Fp6(
c0: Fp2(
c0: 0x1bd2327ab3dce8c41b4e55237009615f5aefc528e3623dd2fb98dd56fa5beaa2,
c1: 0x20fd5f09fc035ab6ba1090a72a5e91a22b13058690761ab6f047f66115ffbd90),
c1: Fp2(
c0: 0x0d1d6f126f6ac9a158f67541e580591d14d5d12d47f0302e9f01e8c31743722f,
c1: 0x0250f912118422fca544a186ac502b314f388713905525bc42593672eeff6c39),
c2: Fp2(
c0: 0x1f56de2990d171dbd1062e7f17d2096c120234af97168498a19f8b592a3e2e35,
c1: 0x16b952c577acd4af661e7994168797bfebc9852189aa57368e54051e37e9794e)) [AssertionDefect]
[FAILED] Multiplication by 0 and 1
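The failing checks above are basic ring identities, so they can be reproduced on a toy model of the tower. The sketch below builds 𝔽p6 as 𝔽p2[v] / (v³ − ξ) with ξ = 1 + i over an assumed toy prime `P = 103`. All names are hypothetical and this is not the library's implementation. It also does not verify that 1 + i is a cubic non-residue for this prime. The identities tested hold in the quotient ring either way.

```python
P = 103  # toy prime (assumption), P % 4 == 3 so i*i = -1 defines Fp2
ZERO2 = (0, 0)


def fp2_add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)


def fp2_mul(x, y):
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def fp2_mul_by_xi(x):
    """Multiply by xi = 1 + i: (a - b) + (a + b)*i."""
    a, b = x
    return ((a - b) % P, (a + b) % P)


def fp6_mul(a, b):
    """Schoolbook product of a0 + a1*v + a2*v^2 and b0 + b1*v + b2*v^2, with v^3 = xi."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    # Terms of degree 3 and 4 in v are folded back using v^3 = xi.
    r0 = fp2_add(fp2_mul(a0, b0),
                 fp2_mul_by_xi(fp2_add(fp2_mul(a1, b2), fp2_mul(a2, b1))))
    r1 = fp2_add(fp2_add(fp2_mul(a0, b1), fp2_mul(a1, b0)),
                 fp2_mul_by_xi(fp2_mul(a2, b2)))
    r2 = fp2_add(fp2_add(fp2_mul(a0, b2), fp2_mul(a1, b1)),
                 fp2_mul(a2, b0))
    return (r0, r1, r2)


def fp6_from_int(n):
    """Embed an integer into Fp6 as a constant element."""
    return ((n % P, 0), ZERO2, ZERO2)


FP6_ZERO = fp6_from_int(0)
FP6_ONE = fp6_from_int(1)
```

With a correct multiplication, `fp6_mul(x, FP6_ZERO)` is `FP6_ZERO` and `fp6_mul(x, FP6_ONE)` is `x` for any `x`, which is what the "Multiplication by 0 and 1" and "Squaring 1 returns 1" checks expect.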
So, reverse engineering the code with Ghidra: the code has been constant-folded and only has 3 inputs. In particular M, the address of the statically known modulus, has been folded. It's possible that the code has been relocated and the address referred to is garbage. Potential fixes:
Unfortunately, switching back to a memory constraint (m or o) allows GCC to generate correct code, but then LLVM generates incorrect code. The intuition is that there is a miscompilation when the input is on the stack and we use the inline function:
constantine/constantine/math/extension_fields/assembly/fp2_asm_x86_adx_bmi2.nim Lines 49 to 67 in 9a71374
Trying to force a pointer input on the assembly function doesn't seem to help LLVM generate appropriate code.
Now, if the array is "on the stack", the compiler is allowed to keep it in registers. GCC probably sees the memory constraint and forces the array onto the stack, but possibly LLVM does not, and so the pointers to memory are invalid when the function is inlined. The question then becomes: if this is true, why does it work when the constraint is "PointerInReg" here:
constantine/constantine/math/arithmetic/assembly/limbs_asm_mul_x86_adx_bmi2.nim Lines 111 to 125 in 9a71374
It seems LLVM is broken with memory constraints, and Rust removed them altogether:
And pointers make GCC miscompile with LTO (due to folding "constant pointers to constants") unless the function is noInline.
After applying fixes for #229 in #228 548cf20
fails with
Note that this time this does not depend on --opt:size flag.
This likely explains the BLS verification failure on Windows.
This does not happen with Clang or without LTO.