Add NANOO FP8 support for collaborative communication unit tests #16938
@reedwm Could you take a look at this PR?
xla/tests/collective_ops_e2e_test.cc
@@ -54,6 +55,21 @@ DeviceAssignment MakeDeviceAssn(int64_t num_replicas) {

class CollectiveOpsTestE2E : public HloTestBase {
 public:
  CollectiveOpsTestE2E() {
    replacements_[kF8E4M3DatatypePlaceholder] =
#if GOOGLE_CUDA
We're trying to avoid using macros like GOOGLE_CUDA and instead check at runtime. Can you check this via the stream executor instead, similar to what is done in gemm_rewriter_test?
xla/xla/service/gpu/transforms/gemm_rewriter_test.cc
Lines 65 to 88 in e2110ae
  const auto& device_desc() const {
    return backend().default_stream_executor()->GetDeviceDescription();
  }

 protected:
  const se::GpuComputeCapability& Capability() const {
    return device_desc().gpu_compute_capability();
  }

  stream_executor::SemanticVersion GetToolkitVersion() const {
    return backend()
        .default_stream_executor()
        ->GetDeviceDescription()
        .runtime_version();
  }

  bool IsCuda() const {
    return std::holds_alternative<se::CudaComputeCapability>(Capability());
  }

  bool IsRocm() const {
    return std::holds_alternative<se::RocmComputeCapability>(Capability());
  }
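For illustration, here is a minimal, self-contained sketch of the runtime check suggested above, applied to choosing an FP8 type name without `#if GOOGLE_CUDA`. It is not the code merged in this PR. The two capability structs are simplified stand-ins for `se::CudaComputeCapability` and `se::RocmComputeCapability`, the helper names `F8E4M3TypeFor` and `F8E5M2TypeFor` are hypothetical, and the OCP names `f8e4m3fn` and `f8e5m2` are assumed to be the CUDA-side counterparts of the `fnuz` types named in this PR.

```cpp
#include <string_view>
#include <variant>

// Simplified stand-ins (assumption) for the stream executor capability types.
struct CudaComputeCapability {
  int major = 0;
  int minor = 0;
};
struct RocmComputeCapability {
  std::string_view gcn_arch_name;
};
using GpuComputeCapability =
    std::variant<CudaComputeCapability, RocmComputeCapability>;

// Runtime platform check: inspects which alternative the variant holds,
// mirroring the IsCuda() helper quoted above.
constexpr bool IsCuda(const GpuComputeCapability& cc) {
  return std::holds_alternative<CudaComputeCapability>(cc);
}

// Hypothetical helpers: pick the OCP type name on CUDA and the NANOO (fnuz)
// type name otherwise.
constexpr std::string_view F8E4M3TypeFor(const GpuComputeCapability& cc) {
  return IsCuda(cc) ? "f8e4m3fn" : "f8e4m3fnuz";
}
constexpr std::string_view F8E5M2TypeFor(const GpuComputeCapability& cc) {
  return IsCuda(cc) ? "f8e5m2" : "f8e5m2fnuz";
}
```

In a test fixture, the argument would come from the device description, as the `Capability()` accessor in the quoted snippet does, so a single test binary can select the right type name on either platform.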
@reedwm Hi, I have updated it!
… tests

Imported from GitHub PR #16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment: google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats: #10488

Copybara import of the project:

-- 0fc74cc by Wen Chen <Wen.Chen@amd.com>:
[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

-- d247af5 by scxfjiang <sc.xfjiang@gmail.com>:
refactor tests for collective comm ops

-- 6f8c418 by scxfjiang <sc.xfjiang@gmail.com>:
rafactor collective comm e2e tests

-- 8ecb6ec by scxfjiang <sc.xfjiang@gmail.com>:
update: replace str

-- 338d3af by scxfjiang <sc.xfjiang@gmail.com>:
get rid of macros

Merging this change closes #16938

FUTURE_COPYBARA_INTEGRATE_REVIEW=#16938 from ROCm:ci_dev_rccl_nanoo_fp8 338d3af
PiperOrigin-RevId: 675635116
Hi @reedwm, this PR hasn't been merged. Could you take a look at it? Many thanks!
… tests

Imported from GitHub PR openxla/xla#16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment: google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats: openxla/xla#10488

Copybara import of the project:

-- 0fc74ccae6cfcaf4e8627ea338ee03783af0626b by Wen Chen <Wen.Chen@amd.com>:
[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

-- d247af5cd33fe42698bb55ef1c18f32df8a02a21 by scxfjiang <sc.xfjiang@gmail.com>:
refactor tests for collective comm ops

-- 6f8c418b3052f7c531896bd5f8cbbc7a766ef7fc by scxfjiang <sc.xfjiang@gmail.com>:
rafactor collective comm e2e tests

-- 8ecb6ecf08a1536c5b3f8ba87e0e9f8813b1b359 by scxfjiang <sc.xfjiang@gmail.com>:
update: replace str

-- 338d3af2ca1a32302fdfe9d7abee335d24539ee9 by scxfjiang <sc.xfjiang@gmail.com>:
get rid of macros

Merging this change closes #16938

PiperOrigin-RevId: 676615012
This test verifies whether the API v2 packages can be imported from the current build. It uses the `_api/v2/api_packages.txt` list of packages from the local wheel file specified in `requirements_lock_<python_version>.txt`. The test should be run after the TF wheel has been built and placed in the `dist` directory inside the TensorFlow repository.

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#16938 from ROCm:ci_dev_rccl_nanoo_fp8 338d3af2ca1a32302fdfe9d7abee335d24539ee9
PiperOrigin-RevId: 673046193
This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
Added support of NANOO fp8. google/flax#3993 (comment)
PR #9531: Fp8 matmul support on AMD MI300 #10488