Add cmake flag USE_FATBIN_COMPRESSION, ON by default #19123
Hey @DickJC123, thanks for submitting the PR.
CI supported jobs: [website, windows-gpu, miscellaneous, edge, centos-cpu, unix-cpu, centos-gpu, sanity, unix-gpu, windows-cpu, clang]
Since we can change default options before the 2.0 release, would it be helpful to always enable USE_FATBIN_COMPRESSION? A couple of users have run into linking errors when GPU auto-detection fails and cmake defaults to building for all "common" GPU architectures.
I've adjusted the default to be always ON, regardless of CUDA version.
Thanks @DickJC123 for this great addition!
…pache#19123) (apache#19158) * [1.x] Backport Add cmake flag USE_FATBIN_COMPRESSION, ON by default (apache#19123) * Trigger CI * Appending to existing CMAKE_CUDA_FLAGS in all cases
…pache#19123) * Add cmake flag USE_FATBIN_COMPRESSION, ON by default for CUDA >= 11 * cmake flag USE_FATBIN_COMPRESSION default is ON for all builds
As the size of libmxnet.so grows to near 2GB, with increased functionality and the addition of cuda architectures, we're running into link failures, e.g. see issue #17045.
One technique that lowers lib size dramatically is 'fatbin compression', enabled by the nvcc options --fatbin-options -compress-all. This has always been a part of Makefile builds, but this PR adds it to the cmake builds. Specifically, this PR adds support to CMakeLists.txt for the cmake option -DUSE_FATBIN_COMPRESSION={ON,OFF}, with a default of ON for CUDA 11 builds and beyond. This PR proposes to leave existing cmake builds against 10.2 as they are, without fatbin compression, to avoid introducing unforeseen consequences to existing use cases.
Results of experiments building the 1.x branch with cuda11:
With cmake options -DMXNET_CUDA_ARCH="5.2 6.0 6.1 7.0 7.2 7.5 8.0" -DUSE_FATBIN_COMPRESSION=OFF, a cuda11 build fails with a link error.
With the same cmake options, but dropping arches 5.2 and 7.2, the build succeeds with a libmxnet.so size of 1.8GB.
Finally, with the first set of cmake options but without the explicit compression setting, i.e. -DMXNET_CUDA_ARCH="5.2 6.0 6.1 7.0 7.2 7.5 8.0", a cuda11 build (which then uses fatbin compression by default) succeeds with a libmxnet.so size of 750MB, so over a 2X decrease in size.
Both succeeding builds, one with fatbin compression and one without, ran the same command in the same time of 7.6 secs.
@samskalicky @anirudh2290 @ChaiBapchya @ptrendx
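For readers who want to see the shape of the change, here is a minimal CMake sketch of how an option like USE_FATBIN_COMPRESSION could be wired up. It is not the PR's actual diff: the project name is hypothetical, the plain option() calls are assumptions (the real CMakeLists.txt may declare its options differently), and USE_CUDA here is only a stand-in for the project's own CUDA switch. The only details taken from this page are the option name, its ON default, the nvcc flags, and the fact that the flags are appended to the existing CMAKE_CUDA_FLAGS.

```cmake
cmake_minimum_required(VERSION 3.13)
# Hypothetical project name, used only to make the sketch self-contained.
project(fatbin_compression_sketch LANGUAGES CXX)

# Stand-in for the project's own CUDA switch (assumption for this sketch).
option(USE_CUDA "Build with CUDA support" OFF)

# ON by default, matching the final behaviour described in this PR.
option(USE_FATBIN_COMPRESSION "Compress nvcc fatbin output to reduce library size" ON)

if(USE_CUDA)
  enable_language(CUDA)
  if(USE_FATBIN_COMPRESSION)
    # Append instead of overwriting, so flags already set by the user
    # or by earlier logic in the file are preserved.
    string(APPEND CMAKE_CUDA_FLAGS " --fatbin-options -compress-all")
  endif()
endif()

# Print the resulting flags so the effect of the option is visible at configure time.
message(STATUS "CMAKE_CUDA_FLAGS: ${CMAKE_CUDA_FLAGS}")
```

With a layout like this, passing -DUSE_FATBIN_COMPRESSION=OFF at configure time would leave CMAKE_CUDA_FLAGS untouched, which corresponds to the uncompressed builds in the experiments above.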