This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[RFC] MXNet AArch64 wheels #20251

Open
mseth10 opened this issue May 6, 2021 · 3 comments
Labels
RFC Post requesting for comments

Comments

mseth10 commented May 6, 2021

Problem statement

Currently, Apache MXNet does not publish wheels for AArch64-based platforms. I would like to propose adding AArch64 support to our CI/CD pipelines as well as to the stable release Jenkins pipelines.
MXNet already supports AArch64-based platforms. In CI, we cross-compile MXNet for AArch64 targets on Ubuntu and Android. For wheel generation and testing, we can use Amazon EC2 instances powered by AWS Graviton2 processors together with a native compilation toolchain. For the best wheel performance, we can evaluate different build options and either use the best configuration or provide several variants for our users to choose from. The build options include the choice of BLAS (OpenBLAS, Eigen BLAS, Arm Performance Libraries), the choice of performance library (OneDNN, Arm Compute Library, XNNPACK), and the compiler flag settings (-march, -mtune, -mcpu, -moutline-atomics) [1][2].
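To make the evaluation concrete, here is a minimal sketch of how the candidate configurations could be enumerated before benchmarking. The dictionary keys, the `build_matrix` helper, and the specific flag combinations are illustrative assumptions, not part of MXNet's build scripts; `-mcpu=neoverse-n1` is included only as one possible tuning target for Graviton2.

```python
from itertools import product

# Options named in this RFC. How each one maps to an actual MXNet build
# setting is not shown here; these are labels for the evaluation matrix.
BLAS_CHOICES = ["OpenBLAS", "Eigen BLAS", "Arm Performance Libraries"]
PERF_LIBS = ["OneDNN", "Arm Compute Library", "XNNPACK"]

# Illustrative compiler flag sets (assumptions, to be refined):
FLAG_SETS = [
    ("-march=armv8-a", "-moutline-atomics"),  # portable baseline, LSE chosen at run time
    ("-march=armv8.1-a",),                    # requires ARMv8.1-A or newer
    ("-mcpu=neoverse-n1",),                   # tuned for one micro-architecture
]


def build_matrix():
    """Return every combination of BLAS, performance library and flag set."""
    return [
        {"blas": blas, "perf_lib": lib, "cflags": " ".join(flags)}
        for blas, lib, flags in product(BLAS_CHOICES, PERF_LIBS, FLAG_SETS)
    ]


if __name__ == "__main__":
    for config in build_matrix():
        print(config)
```

Not every combination is necessarily buildable, so in practice the list would be filtered before any benchmark run.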

Proposed solutions

I have been able to build and test the MXNet v1.x branch with OpenBLAS and OneDNN. The binary runs on all AArch64 architectures (ARMv8-A, ARMv8.1-A, ARMv8.2-A, ...), but to make use of the Large System Extensions (LSE) introduced in ARMv8.1-A, I had to build with the GCC flag -moutline-atomics, which is supported in gcc-10 only. Using this build in the CD pipelines would mean using a base Docker image that supports gcc-10 (we currently use Ubuntu:18.04). We can drop the -moutline-atomics flag (and the gcc-10 dependency) if we build for the ARMv8.1-A base architecture (-march=armv8.1-a), but then the binary won't execute on ARMv8-A based platforms. We can also optimize the build for a particular micro-architecture with other build flags such as -mtune / -mcpu. Any suggestions are appreciated.
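When testing the two variants, it helps to know whether the host actually has LSE. A minimal sketch, assuming a Linux host where `/proc/cpuinfo` lists the `atomics` feature flag on CPUs with LSE; the function names are hypothetical and this is not part of the MXNet test suite.

```python
import platform
from pathlib import Path


def has_lse_atomics(cpuinfo_text: str) -> bool:
    """Return True if any 'Features' line lists the 'atomics' flag (LSE)."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            if "atomics" in value.split():
                return True
    return False


def host_supports_lse(cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Check the current host; returns False on non-AArch64 or unreadable cpuinfo."""
    if platform.machine() not in ("aarch64", "arm64"):
        return False
    try:
        return has_lse_atomics(Path(cpuinfo_path).read_text())
    except OSError:
        return False


if __name__ == "__main__":
    print("LSE atomics available:", host_supports_lse())
```

A binary built with -march=armv8.1-a should only be run where this check returns True, while the -moutline-atomics build is meant to work in both cases.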
Arm has added experimental support for Arm Performance Libraries and Arm Compute Library to OneDNN [3]. The next step would be to evaluate this support and enable it in MXNet.

References

mseth10 added the RFC (Post requesting for comments) label on May 6, 2021

mseth10 commented May 6, 2021

WIP PR: #20252

ddelange commented Jan 8, 2023

hi @mseth10,

first of all, thanks a lot for the above PR. could you give an indication of how complex it would be to cherry-pick these aarch64 builds onto the cuda wheels?

would be cool to run GPU inference on g5g.xlarge

@ddelange

first hiccup ^ libquadmath0 is not available for arm64
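one possible workaround, as a rough sketch: make the dependency list depend on the target architecture and skip libquadmath0 on arm64. the package lists and the `apt_packages` helper below are hypothetical and are not taken from the MXNet CD scripts.

```python
import platform

# Hypothetical package lists for illustration only.
COMMON_PACKAGES = ["gfortran", "libopenblas-dev"]
# Packages reported in this thread as missing for arm64.
NOT_ON_ARM64 = ["libquadmath0"]


def apt_packages(machine=None):
    """Return the packages to install for the given (or current) machine type."""
    machine = machine or platform.machine()
    packages = list(COMMON_PACKAGES)
    if machine not in ("aarch64", "arm64"):
        packages += NOT_ON_ARM64
    return packages


if __name__ == "__main__":
    print("apt-get install -y " + " ".join(apt_packages()))
```

whether the cuda wheels still link correctly without that package on arm64 would need to be checked separately.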
