[RFC] MXNet AArch64 wheels #20251

mseth10 · 2021-05-06T18:25:52Z

Problem statement

Currently, Apache MXNet does not publish wheels for AArch64-based platforms. I would like to propose adding AArch64 support to our CI/CD as well as to the stable release Jenkins pipelines.
MXNet already supports AArch64-based platforms. In CI, we cross-compile MXNet for AArch64 targets on Ubuntu and Android. For wheel generation and testing, we can use Amazon EC2 instances powered by AWS Graviton2 processors together with a native compilation toolchain. To get the best wheel performance, we can evaluate different build options and either ship the best configuration or offer several for our users to choose from. The build options include the choice of BLAS (OpenBLAS, Eigen BLAS, Arm Performance Libraries), the choice of performance library (OneDNN, Arm Compute Library, XNNPACK) and the build flag settings (-march, -mtune, -mcpu, -moutline-atomics) [1][2].
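To make the size of that evaluation concrete, here is a minimal sketch that enumerates the candidate configurations. The option names come from the paragraph above; the three flag sets, the dictionary layout and the function name `build_matrix` are assumptions made for illustration and do not correspond to any existing MXNet build interface.

```python
from itertools import product

# Choices named in the problem statement.
BLAS_CHOICES = ["OpenBLAS", "Eigen BLAS", "Arm Performance Libraries"]
PERF_LIB_CHOICES = ["OneDNN", "Arm Compute Library", "XNNPACK"]

# Hypothetical flag sets to compare; the actual candidates are still open.
FLAG_CHOICES = [
    "-march=armv8-a",
    "-march=armv8-a -moutline-atomics",
    "-march=armv8.1-a",
]


def build_matrix(blas=BLAS_CHOICES, perf_libs=PERF_LIB_CHOICES, flags=FLAG_CHOICES):
    """Return every (BLAS, performance library, compiler flags) combination."""
    return [
        {"blas": b, "perf_lib": p, "cflags": f}
        for b, p, f in product(blas, perf_libs, flags)
    ]


configs = build_matrix()
```

With three choices per axis this yields 27 candidate builds, so in practice we would probably benchmark one axis at a time rather than the full product.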

Proposed solutions

I have been able to build and test the MXNet v1.x branch with OpenBLAS and OneDNN. The binary runs on all AArch64 architectures (ARMv8-A, ARMv8.1-A, ARMv8.2-A, ...), but in order to make use of the Large System Extensions (LSE) introduced in ARMv8.1-A, I had to build with the GCC flag -moutline-atomics, which is supported in gcc-10 only. With this flag, atomic operations go through helper routines that use LSE instructions at runtime when the CPU provides them. Using this build in the CD pipelines would mean moving to a base Docker image that supports gcc-10 (we currently use Ubuntu:18.04). We can get rid of the -moutline-atomics flag (and the gcc-10 dependency) if we build for the ARMv8.1-A base architecture (-march=armv8.1-a), but then the binary won't execute on ARMv8-A based platforms. We can also optimize the build for a particular micro-architecture by using other build flags such as -mtune / -mcpu. Any suggestions are appreciated.
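The compatibility trade-off above can be checked on a given host. The sketch below assumes a Linux AArch64 machine, where the kernel lists LSE support as the `atomics` entry on the `Features` line of /proc/cpuinfo. The function names and the two sample strings are illustrative only, and the check covers LSE alone, not every feature that -march=armv8.1-a enables.

```python
def cpu_features(cpuinfo_text):
    """Return the feature flags from the first 'Features' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def has_lse_atomics(cpuinfo_text):
    """True if the CPU advertises Large System Extensions ('atomics')."""
    return "atomics" in cpu_features(cpuinfo_text)


def host_has_lse_atomics(path="/proc/cpuinfo"):
    """Read the host's cpuinfo (Linux only) and check for LSE support."""
    with open(path) as f:
        return has_lse_atomics(f.read())


# Illustrative samples: an ARMv8.2-A style core with LSE and an ARMv8-A core without.
SAMPLE_WITH_LSE = (
    "processor\t: 0\n"
    "Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm\n"
)
SAMPLE_WITHOUT_LSE = (
    "processor\t: 0\n"
    "Features\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid\n"
)
```

A host where `has_lse_atomics` returns False would not be able to run LSE instructions from a -march=armv8.1-a build, whereas a -moutline-atomics build would fall back to the non-LSE code path there.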
Arm has added experimental support for Arm Performance Libraries and Arm Compute Library to OneDNN [3]. The next step would be to evaluate this support and enable it in MXNet.

References

mseth10 · 2021-05-06T22:01:07Z

WIP PR: #20252

ddelange · 2023-01-08T23:10:28Z

hi @mseth10,

first of all, thanks a lot for the above PR. could you give an indication of how complex it would be to cherry-pick these aarch64 builds onto the cuda wheels?

would be cool to run GPU inference on g5g.xlarge

ddelange · 2023-01-11T09:26:11Z

first hiccup ^ libquadmath0 is missing for arm64

mseth10 added the RFC label on May 6, 2021
