Build and container fixes for aarch64 #408
CUDA provides symlinks to the correct GDS lib and include directories, so there is no need to hardcode a full path that is architecture-dependent. Set the default GDS path to the CUDA install directory and use the 'include' and 'lib64' symlinks.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
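The symlink approach can be sketched roughly as follows. This is only an illustration, not the code from this PR: the `gds_paths` helper and the `/usr/local/cuda` fallback are assumptions; only the 'include' and 'lib64' symlink names come from the commit message.

```shell
# Hypothetical helper: derive the GDS include and lib directories from the
# CUDA install root instead of an architecture-specific directory.
# The /usr/local/cuda fallback is an assumption made for this sketch.
gds_paths() {
    cuda_home="${1:-${CUDA_HOME:-/usr/local/cuda}}"
    # 'include' and 'lib64' are the symlinks named in the commit message;
    # they are expected to resolve to the right directory on any architecture.
    printf '%s\n%s\n' "${cuda_home}/include" "${cuda_home}/lib64"
}
```

Calling `gds_paths /opt/cuda` prints the include directory on the first line and the lib directory on the second, with no architecture name in either path.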
Currently the build script and Docker build file assume that the container is built on an x86_64 Linux host: they do not specify a platform (so docker defaults to the host architecture), hardcode 'x86_64' in several places, and so on. This makes it impossible to create either an x86_64 or an aarch64 container on an Arm host, or an aarch64 container on an x86 host. Configure the target architecture via the 'ARCH' docker variable. Set it to x86 by default in the Dockerfile (for any users that use it directly) and to the host architecture in build-container.sh. Allow the user to specify the ARCH value via the '--arch' CLI parameter, set the docker build platform value accordingly, and pass the value to docker build as a build arg.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
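A minimal sketch of how an ARCH value could be translated into a docker build platform. The `arch_to_platform` function is hypothetical and not taken from build-container.sh; `linux/amd64` and `linux/arm64` are the standard Docker platform strings, and `--platform` and `--build-arg` are existing `docker build` flags.

```shell
# Hypothetical helper: map an ARCH value to a docker platform string.
arch_to_platform() {
    case "$1" in
        x86_64|amd64)  echo "linux/amd64" ;;
        aarch64|arm64) echo "linux/arm64" ;;
        *)
            echo "unsupported architecture: $1" >&2
            return 1
            ;;
    esac
}

# Possible use (assumed invocation, shown as a comment so the sketch has no side effects):
#   docker build --platform "$(arch_to_platform "$ARCH")" --build-arg ARCH="$ARCH" .
```

Passing both the platform and the build arg keeps the image platform and the paths used inside the Dockerfile in agreement.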
Similar to the nixl library container infrastructure, nixlbench does not specify a docker build platform, hardcodes 'x86_64' in several places, and assumes an x86 manylinux platform. This makes it impossible to build anything on an aarch64 host or for an aarch64 target. Configure the target architecture via the 'ARCH' docker variable. Set it to x86 by default in the nixlbench Dockerfile (for any users that use it directly) and to the host architecture in build-container.sh. Allow the user to specify the ARCH value via the '--arch' CLI parameter, set the docker build platform value accordingly, and pass the value to docker build as a build arg.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
👋 Hi w1ldptr! Thank you for contributing to ai-dynamo/nixl. Your PR reviewers will review your contribution then trigger the CI to test your changes. 🚀
A few hours ago I fixed the same issue: #407
@iyastreb @brminich this also fixes the nixl container build in addition to nixlbench, allows container cross-builds (i.e. lets the user build x86 on Arm and vice versa), and properly solves the GDS path issue by leveraging symlinks instead of continuing to rely on arch-specific paths.
Sure, I like your approach, let's sync up on Monday to remove the common part.
Set the default GDS path to the CUDA install directory and point lib and include to the 'lib64' and 'include' symlinks instead of hardcoding architecture-specific values.
Currently the build scripts and Docker build files assume that the container is built on an x86_64 Linux host: they do not specify a platform (so docker defaults to the host architecture), hardcode 'x86_64' in several places, and so on. This makes it impossible to create either an x86_64 or an aarch64 container on an Arm host, or an aarch64 container on an x86 host.
Configure the target architecture via the 'ARCH' docker variable. Set it to x86 by default in the docker files (for any users that use them directly) and to the host architecture in build-container.sh and the nixlbench build.sh. Allow the user to specify the ARCH value via the '--arch' CLI parameter, set the docker build platform value accordingly, and pass the value to docker build as a build arg.
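The '--arch' handling with a host-architecture default might look something like the sketch below. The `parse_arch` function is hypothetical and is not the actual option parsing from build-container.sh or build.sh; it only illustrates the behavior described above.

```shell
# Hypothetical sketch: pick the target architecture from the command line,
# falling back to the host architecture reported by `uname -m`.
parse_arch() {
    arch="$(uname -m)"   # default: host architecture
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --arch)
                # Reject a trailing --arch with no value.
                [ "$#" -ge 2 ] || { echo "--arch requires a value" >&2; return 1; }
                arch="$2"
                shift 2
                ;;
            *)
                # Other options are ignored in this sketch.
                shift
                ;;
        esac
    done
    echo "$arch"
}
```

For example, `parse_arch --arch aarch64` prints `aarch64`, while `parse_arch` with no arguments prints whatever `uname -m` reports on the build host.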
Fixes NIX-14