@trevor-m
Contributor

This PR is the first step in integrating the NVSHMEM library into XLA. We are targeting two use cases for NVSHMEM in XLA: (1) host-side collectives, which are much faster than NCCL at small message sizes, and (2) generating fused communication and compute kernels with Pallas and Mosaic GPU.

This PR adds the library dependency, which follows the stub loading mechanism used by other NVIDIA libraries in XLA. We also add a singleton wrapper API around it and a unit test that checks that NVSHMEM can be initialized successfully. The build rules download the NVSHMEM headers during the XLA build and use the system-installed shared library at runtime.
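
To illustrate the general idea of stub loading (build against headers only, resolve the shared library at runtime), here is a minimal sketch using POSIX `dlopen`/`dlsym`. It is not the code from this PR: the helper names `ResolveSymbol` and `CallThroughStub`, the `int fn(void)` signature, and any library or symbol names passed in are hypothetical and chosen only for illustration.

```cpp
#include <dlfcn.h>

#include <cassert>
#include <string>

// Outcome of calling a function through a stub.
struct StubResult {
  bool loaded;  // false if the library or the symbol could not be resolved
  int value;    // return value of the real function when `loaded` is true
};

// Looks up `symbol` in the shared library `library_name` at runtime.
// Returns nullptr when the library is not installed or lacks the symbol.
void* ResolveSymbol(const std::string& library_name,
                    const std::string& symbol) {
  void* handle = dlopen(library_name.c_str(), RTLD_LAZY | RTLD_LOCAL);
  if (handle == nullptr) return nullptr;
  return dlsym(handle, symbol.c_str());
}

// Hypothetical stub for a function with signature `int fn(void)`.
// The caller never links against the library; a missing library is
// reported through `loaded` instead of failing at link time.
StubResult CallThroughStub(const std::string& library_name,
                           const std::string& symbol) {
  using Fn = int (*)();
  Fn fn = reinterpret_cast<Fn>(ResolveSymbol(library_name, symbol));
  if (fn == nullptr) return StubResult{false, 0};
  return StubResult{true, fn()};
}
```

With this shape, a binary can be built on a machine that has only the headers, and a missing shared library shows up as a recoverable runtime condition rather than a link error.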

cc @nvcastet @Tixxx

@nvcastet
Contributor

CC @ezhulenev

@trevor-m force-pushed the nvshmem-upstream-1 branch 15 times, most recently from fd31caf to 4e8c4dc, January 7, 2025 00:22
@ezhulenev requested a review from frgossen January 7, 2025 02:55
@trevor-m
Contributor Author

@frgossen Would you mind reviewing this when you have a chance?

@frgossen
Member

Generally looks good to me. I would like to hold off on merging this until we can test NVSHMEM internally, which depends on integrating the library.

nvshmemx_uniqueid_t nvshmem_id = NVSHMEMX_UNIQUEID_INITIALIZER;

// Initialize NVSHMEM
if (process_id_ == 0) {
Member

Would this work when the first collective doesn't use device 0?

Contributor Author

Initialization needs to be global, so all devices must participate. This is a bit different from NCCL, where you can initialize communicators independently.
After initialization, we could split the devices into subgroups.
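
The dependency on process 0 can be sketched with a small simulation. This is only an illustration of the pattern in the reviewed snippet (process 0 creates a unique ID, every other process needs that ID before it can initialize). `KeyValueStore`, `MakeUniqueId`, `TryJoinGlobalInit`, and the key name are hypothetical stand-ins, not NVSHMEM or XLA APIs, and the ID is modeled as a plain integer.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Stand-in for a key-value store that every process can reach.
class KeyValueStore {
 public:
  void Set(const std::string& key, std::uint64_t value) { data_[key] = value; }

  // Returns true and writes `*value` if `key` has been published.
  bool TryGet(const std::string& key, std::uint64_t* value) const {
    auto it = data_.find(key);
    if (it == data_.end()) return false;
    *value = it->second;
    return true;
  }

 private:
  std::map<std::string, std::uint64_t> data_;
};

// Stand-in for the library call that creates the unique ID on one process.
std::uint64_t MakeUniqueId() { return 0x5eedULL; }

// Every process calls this. Only process 0 creates and publishes the ID;
// all other processes can proceed only once process 0 has done so.
bool TryJoinGlobalInit(int process_id, KeyValueStore& kv,
                       std::uint64_t* unique_id) {
  const std::string key = "unique_id";  // hypothetical key name
  if (process_id == 0) {
    kv.Set(key, MakeUniqueId());
  }
  return kv.TryGet(key, unique_id);
}
```

In this model, a call from process 1 before process 0 has participated returns `false`, which is the situation the question above is asking about: the initial setup cannot complete for a subset of devices that excludes process 0.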

copybara-service bot pushed a commit that referenced this pull request Apr 10, 2025
Imported from GitHub PR #21683

Requires #20395 which adds the NVSHMEM library dependency.

This PR adds the following:
1. Adds an NVSHMEM flag to enable NVSHMEM.
2. Sets the NVSHMEM initialization info when the GPU PJRT client is created. NVSHMEM is initialized the first time it is used.
3. Uses the user buffer memory pool for NVSHMEM. If NVSHMEM is enabled, the pool is allocated using `nvshmem_malloc`. The same memory can be used by user buffers if NCCL user buffers are also enabled.
4. Updates the `CollectiveColorer` so that mosaic_gpu custom calls use the NVSHMEM memory space.
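
Items 3 and 4 can be read as a two-step decision: first assign a memory space to an instruction, then pick how the pool backing that space is allocated. The sketch below models only that decision logic under stated assumptions. The enum names, `ColorFor`, and `PoolAllocatorFor` are hypothetical, the target strings are taken from the commit titles, and no real allocation is performed.

```cpp
#include <cassert>
#include <string>

// Hypothetical memory spaces: ordinary device memory vs. the pool shared
// with collectives (the "user buffer" pool in the description above).
enum class MemorySpace { kDefault, kCollective };

// Hypothetical labels for how the collective pool gets its memory.
enum class PoolAllocator { kDeviceDefault, kExistingUserBuffer, kNvshmemMalloc };

// Step 1: color custom calls. The mosaic_gpu targets are placed in the
// collective memory space; everything else stays in the default space.
MemorySpace ColorFor(const std::string& custom_call_target) {
  if (custom_call_target == "mosaic_gpu" ||
      custom_call_target == "mosaic_gpu_v2") {
    return MemorySpace::kCollective;
  }
  return MemorySpace::kDefault;
}

// Step 2: choose the allocator for a memory space. With NVSHMEM enabled the
// collective pool comes from nvshmem_malloc; otherwise it is assumed to keep
// using whatever the user buffer pool used before.
PoolAllocator PoolAllocatorFor(MemorySpace space, bool nvshmem_enabled) {
  if (space != MemorySpace::kCollective) return PoolAllocator::kDeviceDefault;
  return nvshmem_enabled ? PoolAllocator::kNvshmemMalloc
                         : PoolAllocator::kExistingUserBuffer;
}
```

Keeping the two steps separate mirrors the description: the colorer only decides which buffers belong to the collective space, and the flag only changes where that space's memory comes from.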
Copybara import of the project:

--
aee3379 by Trevor Morris <tmorris@nvidia.com>:

Add nvshmem flag, memory allocation, and memory space assignment

Set Nvshmem env info during client creation

Rename flag and use absl::string_view

--
f8fca39 by Trevor Morris <tmorris@nvidia.com>:

Use explicit types in test

--
e41faa3 by Trevor Morris <tmorris@nvidia.com>:

Add user buffer allgather and allreduce tests with and without nvshmem alloc

Set nvshmem in XLA_FLAGS

test fixes

formatting

--
cf0c368 by Trevor Morris <tmorris@nvidia.com>:

Fixes

--
3b4d111 by Trevor Morris <tmorris@nvidia.com>:

Remove early dso check

--
359f2b2 by Trevor Morris <tmorris@nvidia.com>:

Add flag comment

--
fd15a7c by Trevor Morris <tmorris@nvidia.com>:

Also assign memory space for mosaic_gpu_v2

Merging this change closes #21683

FUTURE_COPYBARA_INTEGRATE_REVIEW=#21683 from trevor-m:nvshmem-upstream-2 fd15a7c
PiperOrigin-RevId: 740701134
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 10, 2025
Imported from GitHub PR openxla/xla#21683

Requires openxla/xla#20395 which adds the NVSHMEM library dependency.

This PR adds the following:
1. Adds an NVSHMEM flag to enable NVSHMEM.
2. Sets the NVSHMEM initialization info when the GPU PJRT client is created. NVSHMEM is initialized the first time it is used.
3. Uses the user buffer memory pool for NVSHMEM. If NVSHMEM is enabled, the pool is allocated using `nvshmem_malloc`. The same memory can be used by user buffers if NCCL user buffers are also enabled.
4. Updates the `CollectiveColorer` so that mosaic_gpu custom calls use the NVSHMEM memory space.
Copybara import of the project:

--
aee33791e16ab2149118de728dbb9e62f5e7cc31 by Trevor Morris <tmorris@nvidia.com>:

Add nvshmem flag, memory allocation, and memory space assignment

Set Nvshmem env info during client creation

Rename flag and use absl::string_view

--
f8fca39300b3915eb6320142f58fa9c0ec7a1eaa by Trevor Morris <tmorris@nvidia.com>:

Use explicit types in test

--
e41faa3f72b778fcf8ea8111d3cde59548b8f9f5 by Trevor Morris <tmorris@nvidia.com>:

Add user buffer allgather and allreduce tests with and without nvshmem alloc

Set nvshmem in XLA_FLAGS

test fixes

formatting

--
cf0c36865de8b8a010caaf62c3a36b64e36037bd by Trevor Morris <tmorris@nvidia.com>:

Fixes

--
3b4d11123cdb794d0a60e65b94d22ded04b7b2b4 by Trevor Morris <tmorris@nvidia.com>:

Remove early dso check

--
359f2b243ec97b1f8003c27f0b07dde82407ff6c by Trevor Morris <tmorris@nvidia.com>:

Add flag comment

--
fd15a7cac745adc1971bec63e148047b9b811729 by Trevor Morris <tmorris@nvidia.com>:

Also assign memory space for mosaic_gpu_v2

Merging this change closes #21683

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#21683 from trevor-m:nvshmem-upstream-2 fd15a7cac745adc1971bec63e148047b9b811729
PiperOrigin-RevId: 740701134
copybara-service bot pushed a commit that referenced this pull request Apr 11, 2025
Imported from GitHub PR #21683

copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 11, 2025
Imported from GitHub PR openxla/xla#21683

copybara-service bot pushed a commit that referenced this pull request Apr 13, 2025
Imported from GitHub PR #21683

Requires #20395 which adds the NVSHMEM library dependency.

This PR adds the following:
1. Adds a flag to enable NVSHMEM.
2. Sets the NVSHMEM initialization info when the GPU PJRT client is created. NVSHMEM itself is initialized lazily, the first time it is used.
3. Uses the user buffer memory pool for NVSHMEM. When NVSHMEM is enabled, the pool is allocated with `nvshmem_malloc`. The same memory can back NCCL user buffers if those are also enabled.
4. Updates the `CollectiveColorer` so that mosaic_gpu custom calls use the NVSHMEM memory space.
5. Does not register NCCL collectives when the no_nccl dependency is provided. This is needed when tools such as xla/codegen/tools/emitters_opt are built with RBE.
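
The coloring rule in item 4 can be pictured with a small, self-contained sketch. It is not the actual `CollectiveColorer` code: the `Instr` struct, the memory-space constants and their values, and the `nvshmem_enabled` parameter are all assumptions made for illustration. Only the two custom call targets, `mosaic_gpu` and `mosaic_gpu_v2`, come from the commit messages in this PR.

```cpp
#include <string>

// Hypothetical memory-space ids; the real values are defined inside XLA.
constexpr int kDefaultMemorySpace = 0;
constexpr int kCollectiveMemorySpace = 1;

// Simplified stand-in for an HLO instruction.
struct Instr {
  std::string opcode;
  std::string custom_call_target;
};

// True for custom calls whose target is one of the Mosaic GPU entry points
// named in this PR.
inline bool IsMosaicGpuCustomCall(const Instr& instr) {
  return instr.opcode == "custom-call" &&
         (instr.custom_call_target == "mosaic_gpu" ||
          instr.custom_call_target == "mosaic_gpu_v2");
}

// Returns the memory space ("color") for an instruction's buffers.
// Assumption: the special color is applied only when NVSHMEM is enabled.
// The real colorer also handles NCCL collectives, which this sketch omits.
inline int ColorFor(const Instr& instr, bool nvshmem_enabled) {
  if (nvshmem_enabled && IsMosaicGpuCustomCall(instr)) {
    return kCollectiveMemorySpace;
  }
  return kDefaultMemorySpace;
}
```

Under these assumptions, `ColorFor(Instr{"custom-call", "mosaic_gpu_v2"}, true)` yields the collective memory space, while any other instruction, or any instruction with NVSHMEM disabled, stays in the default space.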

Copybara import of the project:

--
aee3379 by Trevor Morris <tmorris@nvidia.com>:

Add nvshmem flag, memory allocation, and memory space assignment

Set Nvshmem env info during client creation

Rename flag and use absl::string_view

--
f8fca39 by Trevor Morris <tmorris@nvidia.com>:

Use explicit types in test

--
e41faa3 by Trevor Morris <tmorris@nvidia.com>:

Add user buffer allgather and allreduce tests with and without nvshmem alloc

Set nvshmem in XLA_FLAGS

test fixes

formatting

--
cf0c368 by Trevor Morris <tmorris@nvidia.com>:

Fixes

--
3b4d111 by Trevor Morris <tmorris@nvidia.com>:

Remove early dso check

--
359f2b2 by Trevor Morris <tmorris@nvidia.com>:

Add flag comment

--
fd15a7c by Trevor Morris <tmorris@nvidia.com>:

Also assign memory space for mosaic_gpu_v2

Merging this change closes #21683

FUTURE_COPYBARA_INTEGRATE_REVIEW=#21683 from trevor-m:nvshmem-upstream-2 fd15a7c
PiperOrigin-RevId: 740701134
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 13, 2025
copybara-service bot pushed a commit that referenced this pull request Apr 14, 2025
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 14, 2025
copybara-service bot pushed a commit that referenced this pull request Apr 15, 2025
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 15, 2025
copybara-service bot pushed a commit that referenced this pull request Apr 15, 2025
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 15, 2025
copybara-service bot pushed a commit that referenced this pull request Apr 15, 2025
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 15, 2025
copybara-service bot pushed a commit that referenced this pull request Apr 15, 2025
copybara-service bot pushed a commit that referenced this pull request Apr 15, 2025
COPYBARA_INTEGRATE_REVIEW=#21683 from trevor-m:nvshmem-upstream-2 fd15a7c
PiperOrigin-RevId: 747816712
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Apr 15, 2025
PiperOrigin-RevId: 747816712