AWS OFI NCCL v1.13.2

arunkarthik-akkart released this 11 Dec 17:58
v1.13.2-aws

v1.13.2-aws (2024-12-06)

This release is intended only for use on AWS P* instances. A general release that supports other libfabric networks may be made in the near future.

With this release, building with platform-aws requires libfabric 1.22.0amzn4.0 or greater. AWS customers are generally advised to track the latest available EFA Installer for performance improvements and bug fixes.

The 1.13.x release series supports NCCL 2.23.4-1 while maintaining backward compatibility with older NCCL versions (NCCL v2.17.1 and later).

Bug Fixes:

  • Tuner Improvements:
    • Fixed algorithm selection for larger ranks and message sizes.
    • Re-calibrated the tuner for AllGather and ReduceScatter regions for 0x7 bitmask on P5en, optimizing performance for larger messages.
    • Added tuner support for AllGather and ReduceScatter regions for 0x0 bitmask on P5en.
  • Resolved a performance issue by preventing use of the eager protocol while RDMA writes are in flight, improving small AllReduce collective performance.

Note: dmabuf support is now turned off by default. Users can enable it explicitly using OFI_NCCL_DISABLE_DMABUF=0 if needed.
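
As an illustration of the note above, the following minimal Python sketch builds a process environment with OFI_NCCL_DISABLE_DMABUF=0 and passes it to a job launcher. The helper name `env_with_dmabuf_enabled` and the way the launch command is taken from the command line are assumptions made for this example; they are not part of the plugin.

```python
import os
import subprocess
import sys


def env_with_dmabuf_enabled(base_env=None):
    """Return a copy of the environment with dmabuf support re-enabled.

    Hypothetical helper: the only fact taken from the release notes is that
    OFI_NCCL_DISABLE_DMABUF=0 enables dmabuf explicitly.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OFI_NCCL_DISABLE_DMABUF"] = "0"
    return env


if __name__ == "__main__":
    # Everything after the script name is treated as the command to launch,
    # for example an mpirun or torchrun invocation supplied by the user.
    if len(sys.argv) > 1:
        subprocess.run(sys.argv[1:], env=env_with_dmabuf_enabled(), check=True)
```

Leaving the variable unset keeps the new default (dmabuf off), so the sketch only sets it when dmabuf is wanted.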

Checksum (sha512) for the release tarball:

4c0ac3144f178062fda9e86b50bb1784822e8fdbdffadf41cdbb30839456c4e912254ff12a5b0a8c63abbe910597fd14211a42572a451d10e01932100013971e  aws-ofi-nccl-1.13.2-aws.tar.gz
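
One way to check a downloaded tarball against the checksum above is sketched below using Python's standard `hashlib` module. The function names and the default local file path are assumptions for this example; the expected digest is copied from the line above.

```python
import hashlib
import sys

# Digest published in these release notes for aws-ofi-nccl-1.13.2-aws.tar.gz.
EXPECTED_SHA512 = (
    "4c0ac3144f178062fda9e86b50bb1784822e8fdbdffadf41cdbb30839456c4e9"
    "12254ff12a5b0a8c63abbe910597fd14211a42572a451d10e01932100013971e"
)


def sha512_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA512):
    """Return True if the file's digest equals the expected digest."""
    return sha512_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed default: the tarball sits in the current directory.
    tarball = sys.argv[1] if len(sys.argv) > 1 else "aws-ofi-nccl-1.13.2-aws.tar.gz"
    ok = matches_expected(tarball)
    print("checksum OK" if ok else "checksum MISMATCH")
    sys.exit(0 if ok else 1)
```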