Releases: oneapi-src/oneCCL

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.14

06 Nov 10:44
3afa1bb

What's New:

  • Optimizations on key-value store support to scale up to 3000 nodes
  • New APIs for Allgather, Broadcast, and group API calls (see the sketch after this list)
  • Performance optimizations for Allgather, Allreduce, and Reduce-Scatter, covering both scaleup and scaleout
  • Performance optimizations for single-node CPU execution
  • Optimizations to reuse Level Zero events
  • Changed the default IPC exchange mechanism to pidfd
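
A minimal sketch of how the new entry points might be combined. The names ccl::allgather, ccl::broadcast, ccl::group_start, and ccl::group_end come from the release notes; the exact overloads and argument order shown here are assumptions, so verify them against the oneCCL API reference.

```cpp
// Sketch only: batching a broadcast and an allgather with the new group API.
// Overloads and argument order are assumed; check the oneCCL API reference.
#include "oneapi/ccl.hpp"

void batched_collectives(ccl::communicator& comm,
                         float* send_buf, float* recv_buf, size_t count) {
    ccl::group_start();                               // begin aggregating operations
    ccl::broadcast(send_buf, count, /*root=*/0, comm);
    ccl::allgather(send_buf, recv_buf, count, comm);  // assumed overload
    ccl::group_end();                                 // submit the batch together
}
```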

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.13.1

08 Aug 08:58
c80317f

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.13

24 Jun 10:42
0eb5987

What's New:

  • Optimizations to limit the memory consumed by oneCCL
  • Optimizations to limit the number of file descriptors oneCCL keeps open
  • Aligned in-place support for the Allgatherv and Reduce-Scatter collectives with the behavior of NCCL (see the sketch after this list). In particular:
      • Allgatherv is in-place when send_buff == recv_buff + rank_offset, where rank_offset = sum(recv_counts[i]) for all i < rank
      • Reduce-Scatter is in-place when recv_buff == send_buff + rank * recv_count
  • When the environment variable CCL_WORKER_AFFINITY is set, oneCCL now requires the length of the list to equal the number of workers
  • Bug fixes
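
The two in-place conditions above reduce to plain pointer arithmetic. The helpers below are a standalone illustration, not part of the oneCCL API; the function and variable names are hypothetical.

```cpp
// Standalone illustration of the in-place conditions (not a oneCCL API).
#include <cstddef>
#include <numeric>
#include <vector>

// Allgatherv: in-place when send_buff == recv_buff + sum(recv_counts[i]), i < rank.
bool allgatherv_in_place(const float* send_buff, const float* recv_buff,
                         int rank, const std::vector<size_t>& recv_counts) {
    size_t rank_offset = std::accumulate(recv_counts.begin(),
                                         recv_counts.begin() + rank, size_t{0});
    return send_buff == recv_buff + rank_offset;
}

// Reduce-Scatter: in-place when recv_buff == send_buff + rank * recv_count.
bool reduce_scatter_in_place(const float* send_buff, const float* recv_buff,
                             int rank, size_t recv_count) {
    return recv_buff == send_buff + static_cast<size_t>(rank) * recv_count;
}
```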

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.12

27 Mar 12:52

What's New:

  • Performance improvements for scaleup across all message sizes for Allreduce, Allgather, and Reduce-Scatter, including the small message sizes that appear in inference applications
  • Performance improvements for scaleout for Allreduce, Reduce, Allgather, and Reduce-Scatter
  • Optimized memory usage of oneCCL
  • Support for PMIx 4.2.6
  • Bug fixes

Removals

  • oneCCL 2021.12 removes support for PMIx 4.2.2

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.11.2

21 Dec 18:16
8d18c7b

This update provides bug fixes to maintain driver compatibility for Intel® Data Center GPU Max Series.

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.11.1

29 Nov 15:50

This update addresses stability issues with distributed training and inference workloads on Intel® Data Center GPU Max Series.

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.11

28 Nov 16:07

  • Added point-to-point blocking communication operations for send and receive (see the sketch after this list)
  • Performance optimizations for Reduce-Scatter
  • Improved profiling with Intel® Instrumentation and Tracing Technology (ITT) profiling level
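
A minimal sketch of the new blocking send/receive pair between two ranks. ccl::send and ccl::recv are named in the release notes; the parameter lists shown here are assumptions, so check the oneCCL API reference before relying on them.

```cpp
// Sketch only: rank 0 sends a buffer to rank 1 with the blocking
// point-to-point operations. Parameter order is assumed; verify in the docs.
#include "oneapi/ccl.hpp"

void exchange(ccl::communicator& comm, float* buf, size_t count) {
    if (comm.rank() == 0) {
        ccl::send(buf, count, /*peer=*/1, comm);  // blocks until the send completes
    } else if (comm.rank() == 1) {
        ccl::recv(buf, count, /*peer=*/0, comm);  // blocks until the data arrives
    }
}
```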

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.10

20 Jul 09:50
b4c31ba

  • Improved scaling efficiency of the scaleup algorithms for Reduce-Scatter
  • Optimized performance of oneCCL scaleup collectives by utilizing the Intel® Data Streaming Accelerator embedded in 4th Gen Intel® Xeon® Scalable processors (formerly code-named Sapphire Rapids)

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.9

11 Apr 14:29
0db8329

  • Optimizations across the board, including improved scaling efficiency of the scaleup algorithms for Alltoall and Allgather
  • Added collective selection for the scaleout algorithm for device (GPU) buffers

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.8

19 Dec 15:44
bfa1e99

  • Optimized performance for Intel® Data Center GPU Max Series
  • Enabled support for Allreduce, Allgather, Reduce, and Alltoall connectivity for GPUs on the same node