While trying out the library on the new Apple MacBook Air with M1, I found that the native arm64 build is slower than the x86_64 build running under emulation.
After a bit of investigation, the performance difference comes from the arm64 build not using intrinsics. A small modification of the CMake header detection script and of the `clang.h`/`gcc.h` files, so that they include `arm_neon.h` instead of `x86intrinsics.h` when compiling for arm64, greatly improved performance; see the table. The speedup should carry over to other ARM platforms. All tests pass, but I'm not able to check whether everything works on Windows or Android, so I didn't open a pull request.
CKKS performance test, degree 8192, on a MacBook Air M1 with 8 GB of RAM; timings in microseconds.
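
| Operation | native arm w/o intrinsics | native arm with intrinsics |
| --- | --- | --- |
| encode | 683 | 441 |
| decode | 1309 | 900 |
| encrypt | 4356 | 2551 |
| decrypt | 245 | 112 |
| add | 38 | 37 |
| multiply | 808 | 280 |
| multiply plain | 368 | 106 |
| square | 597 | 203 |
| relinearize | 4253 | 1931 |
| rescale | 1014 | 474 |
| rotate 1 step | 4279 | 1972 |
| rotate rd | 17199 | 7821 |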
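
For anyone who wants to see the general shape of the change, here is a minimal sketch of selecting the intrinsics header by target architecture. It is not the actual SEAL patch: the `DEMO_*` macros and the `demo_add_u64` and `demo_backend` helpers are hypothetical names made up for illustration, and the x86 branch assumes the usual `x86intrin.h` (GCC/Clang) or `intrin.h` (MSVC) headers. A scalar loop is kept as a fallback so the code still builds on other targets.

```cpp
#include <cstddef>
#include <cstdint>

// Pick the intrinsics header from the target architecture.
// The DEMO_* macros are illustrative only, not SEAL's real configuration macros.
#if defined(__aarch64__) || defined(_M_ARM64)
#include <arm_neon.h>
#define DEMO_SIMD_NEON 1
#elif defined(__x86_64__) || defined(_M_X64)
#if defined(_MSC_VER)
#include <intrin.h>
#else
#include <x86intrin.h>
#endif
#define DEMO_SIMD_SSE2 1
#endif

// Hypothetical helper: element-wise wrapping addition, out[i] = a[i] + b[i].
inline void demo_add_u64(
    const std::uint64_t *a, const std::uint64_t *b, std::uint64_t *out, std::size_t n)
{
    std::size_t i = 0;
#if defined(DEMO_SIMD_NEON)
    // NEON path: two 64-bit lanes per iteration.
    for (; i + 2 <= n; i += 2)
    {
        vst1q_u64(out + i, vaddq_u64(vld1q_u64(a + i), vld1q_u64(b + i)));
    }
#elif defined(DEMO_SIMD_SSE2)
    // SSE2 path (always available on x86_64): two 64-bit lanes per iteration.
    for (; i + 2 <= n; i += 2)
    {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i *>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i *>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i *>(out + i), _mm_add_epi64(va, vb));
    }
#endif
    // Scalar tail, and the full fallback when no intrinsics header was selected.
    for (; i < n; ++i)
    {
        out[i] = a[i] + b[i];
    }
}

// Hypothetical helper: reports which path was compiled in.
inline const char *demo_backend()
{
#if defined(DEMO_SIMD_NEON)
    return "neon";
#elif defined(DEMO_SIMD_SSE2)
    return "sse2";
#else
    return "scalar";
#endif
}
```

Calling `demo_backend()` in a native arm64 build should report the NEON path, which is a quick way to confirm that the header detection picked the right branch before running the benchmarks.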
Thanks for sharing this! This could obviously be really valuable to get into SEAL. Could you submit the pull request to the contrib branch so we can help verify that it's all good to go?