Added support of NANOO fp8. #3993

Merged 4 commits on Jul 2, 2024

wenchenvincent
Contributor

What does this PR do?

This PR adds support for the fp8 dot op with the NANOO fp8 data formats (an alternative family to the OCP fp8 data formats, which NVIDIA GPUs use).

Hardware vendors use several different families of fp8 formats. Two popular families are:

  • OCP fp8, which is used natively on NVIDIA H100
  • NANOO fp8, which is used natively on AMD MI300 and Graphcore HW.

The two families work very similarly. This PR enables support for NANOO fp8, which is now also supported in JAX and XLA, so that the fp8 dot op can be used on AMD MI300 GPUs.
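
To make the difference between the two families concrete, here is a minimal pure-Python sketch that decodes a raw 8-bit pattern under each convention. It is not code from this PR: `Fp8Format`, `FORMATS`, `decode` and `max_finite` are hypothetical names, and the exponent biases and NaN/inf conventions are assumptions based on how the `float8_e4m3fn`, `float8_e5m2`, `float8_e4m3fnuz` and `float8_e5m2fnuz` dtypes are commonly documented.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Fp8Format:
    """Hypothetical description of an 8-bit float layout (1 sign bit assumed)."""
    exp_bits: int
    man_bits: int
    bias: int
    fnuz: bool      # NANOO style: single NaN at 0x80, no inf, no negative zero
    ieee_inf: bool  # IEEE style inf/NaN at the top exponent (assumed for OCP E5M2)


# Assumed parameters for the four dtypes; not taken from the PR itself.
FORMATS = {
    "float8_e4m3fn": Fp8Format(4, 3, 7, False, False),    # OCP
    "float8_e5m2": Fp8Format(5, 2, 15, False, True),      # OCP
    "float8_e4m3fnuz": Fp8Format(4, 3, 8, True, False),   # NANOO
    "float8_e5m2fnuz": Fp8Format(5, 2, 16, True, False),  # NANOO
}


def decode(fmt: Fp8Format, byte: int) -> float:
    """Decode one raw byte (0..255) into a Python float under `fmt`."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> fmt.man_bits) & ((1 << fmt.exp_bits) - 1)
    man = byte & ((1 << fmt.man_bits) - 1)
    exp_max = (1 << fmt.exp_bits) - 1
    man_max = (1 << fmt.man_bits) - 1

    if fmt.fnuz:
        # The "negative zero" bit pattern is repurposed as the only NaN.
        if byte == 0x80:
            return math.nan
    elif fmt.ieee_inf:
        if exp == exp_max:
            return sign * math.inf if man == 0 else math.nan
    elif exp == exp_max and man == man_max:
        # E4M3FN: only the all-ones pattern is NaN, and there is no inf.
        return math.nan

    if exp == 0:
        # Subnormal numbers (and zero).
        return sign * man * 2.0 ** (1 - fmt.bias - fmt.man_bits)
    return sign * (1 + man / (1 << fmt.man_bits)) * 2.0 ** (exp - fmt.bias)


def max_finite(fmt: Fp8Format) -> float:
    """Largest finite value representable in `fmt`, found by enumeration."""
    values = (decode(fmt, b) for b in range(256))
    return max(v for v in values if math.isfinite(v))
```

Under these assumptions, `max_finite` gives 448 for `float8_e4m3fn` but 240 for `float8_e4m3fnuz`, and the byte `0x38` decodes to 1.0 in the former and 0.5 in the latter. This is why the same fp8 dot code path needs to know which family it is targeting when it computes scaling factors.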

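References:

  • OCP fp8 paper: https://arxiv.org/abs/2209.05433
  • NANOO fp8 paper: https://arxiv.org/abs/2206.02915
  • JAX PR: jax-ml/jax#21376
  • XLA PR: openxla/xla#9531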

@wenchenvincent
Contributor Author

@levskaya I noticed that you have reviewed several PRs regarding fp8. Could you take a look at this one?

@wenchenvincent
Contributor Author

@levskaya Could you kindly serve as the reviewer for this PR?

@levskaya left a comment
Collaborator

Sorry for the delay!

Looks OK, but I have a few requests to simplify the class configuration and to avoid breaking existing names in the public API.

Try resubmitting for tests after fixing that; the earlier failure came from a transient, unrelated breakage.

Review comments (resolved):

  • flax/linen/__init__.py
  • flax/linen/fp8_ops.py (3 comments, outdated)

@wenchenvincent
Contributor Author

@levskaya Thanks for the review. I have updated the PR to address the concerns. Could you take a look at the updates?

@levskaya left a comment
Collaborator

Thanks for the fixes! We may need to do some minor rebasing, as the codebase just migrated to a minimum Python version of 3.10.

@wenchenvincent
Contributor Author

> Thanks for the fixes! We may need to do some tiny rebasing of simple things as the codebase just migrated to a python minver of 3.10.

Thanks! Do you need me to rebase it to the tip of the tree?

@levskaya
Collaborator

Yes, rebasing to tip as of today should pick up the 3.10 minver updates. Also, I'm seeing this failure in the tests:

FAILED tests/linen/linen_test.py::Fp8Test::test_fp8_meta_dtype0 - TypeError: missing a required argument: 'amax_history'
FAILED tests/linen/linen_test.py::Fp8Test::test_fp8_meta_dtype1 - TypeError: missing a required argument: 'amax_history'

could you fix that?

There are several different genres of fp8 formats used by different
HW vendors. Two popular genres include
- OCP fp8, which is used natively on NVIDIA H100
- NANOO fp8, which is used natively on AMD MI300 and Graphcore HW.

These two genres of fp8 formats work very similarly. This PR is to
enable support of NANOO fp8 as it is also now supported in JAX and XLA.

References:
- OCP fp8 paper: https://arxiv.org/abs/2209.05433
- NANOO fp8 paper: https://arxiv.org/abs/2206.02915
- JAX PR: jax-ml/jax#21376
- XLA PR: openxla/xla#9531
@wenchenvincent
Contributor Author

> Yes to tip as of today should have the 3.10 minver updates. Also, I'm seeing this failure in the tests:
>
> FAILED tests/linen/linen_test.py::Fp8Test::test_fp8_meta_dtype0 - TypeError: missing a required argument: 'amax_history'
> FAILED tests/linen/linen_test.py::Fp8Test::test_fp8_meta_dtype1 - TypeError: missing a required argument: 'amax_history'
>
> could you fix that?

Sorry I missed this test.

I just rebased and fixed this test.

@codecov-commenter

codecov-commenter commented Jun 28, 2024

Codecov Report

Attention: Patch coverage is 0% with 17 lines in your changes missing coverage. Please review.

Project coverage is 0.00%. Comparing base (31adb00) to head (a6f52ae).
Report is 46 commits behind head on main.

Files                    Patch %   Lines
flax/linen/fp8_ops.py    0.00%     16 Missing ⚠️
flax/linen/__init__.py   0.00%     1 Missing ⚠️

Additional details and impacted files
@@          Coverage Diff           @@
##            main   #3993    +/-   ##
======================================
  Coverage   0.00%   0.00%            
======================================
  Files        106     107     +1     
  Lines      13582   13767   +185     
======================================
- Misses     13582   13767   +185     

@levskaya left a comment
Collaborator

Hey sorry, after importing, I just noticed two more things that need to be fixed.

Review comments (resolved):

  • tests/linen/linen_test.py
  • flax/linen/fp8_ops.py (outdated)

@copybara-service bot merged commit 5b8265c into google:main on Jul 2, 2024
16 checks passed
copybara-service bot pushed a commit to openxla/xla that referenced this pull request Sep 17, 2024
… tests

Imported from GitHub PR #16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment:
google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats:
#10488
Copybara import of the project:

--
0fc74cc by Wen Chen <Wen.Chen@amd.com>:

[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

--
d247af5 by scxfjiang <sc.xfjiang@gmail.com>:

refactor tests for collective comm ops

--
6f8c418 by scxfjiang <sc.xfjiang@gmail.com>:

rafactor collective comm e2e tests

--
8ecb6ec by scxfjiang <sc.xfjiang@gmail.com>:

update: replace str

--
338d3af by scxfjiang <sc.xfjiang@gmail.com>:

get rid of macros

Merging this change closes #16938

FUTURE_COPYBARA_INTEGRATE_REVIEW=#16938 from ROCm:ci_dev_rccl_nanoo_fp8 338d3af
PiperOrigin-RevId: 675635116
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Sep 17, 2024
… tests

Imported from GitHub PR openxla/xla#16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment:
google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats:
openxla/xla#10488
Copybara import of the project:

--
0fc74ccae6cfcaf4e8627ea338ee03783af0626b by Wen Chen <Wen.Chen@amd.com>:

[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

--
d247af5cd33fe42698bb55ef1c18f32df8a02a21 by scxfjiang <sc.xfjiang@gmail.com>:

refactor tests for collective comm ops

--
6f8c418b3052f7c531896bd5f8cbbc7a766ef7fc by scxfjiang <sc.xfjiang@gmail.com>:

rafactor collective comm e2e tests

--
8ecb6ecf08a1536c5b3f8ba87e0e9f8813b1b359 by scxfjiang <sc.xfjiang@gmail.com>:

update: replace str

--
338d3af2ca1a32302fdfe9d7abee335d24539ee9 by scxfjiang <sc.xfjiang@gmail.com>:

get rid of macros

Merging this change closes #16938

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#16938 from ROCm:ci_dev_rccl_nanoo_fp8 338d3af2ca1a32302fdfe9d7abee335d24539ee9
PiperOrigin-RevId: 675635116
copybara-service bot pushed a commit to openxla/xla that referenced this pull request Sep 19, 2024
… tests

Imported from GitHub PR #16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment:
google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats:
#10488
Copybara import of the project:

--
0fc74cc by Wen Chen <Wen.Chen@amd.com>:

[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

--
d247af5 by scxfjiang <sc.xfjiang@gmail.com>:

refactor tests for collective comm ops

--
6f8c418 by scxfjiang <sc.xfjiang@gmail.com>:

rafactor collective comm e2e tests

--
8ecb6ec by scxfjiang <sc.xfjiang@gmail.com>:

update: replace str

--
338d3af by scxfjiang <sc.xfjiang@gmail.com>:

get rid of macros

Merging this change closes #16938

FUTURE_COPYBARA_INTEGRATE_REVIEW=#16938 from ROCm:ci_dev_rccl_nanoo_fp8 338d3af
PiperOrigin-RevId: 676515264
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Sep 19, 2024
… tests

Imported from GitHub PR openxla/xla#16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment:
google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats:
openxla/xla#10488
Copybara import of the project:

--
0fc74ccae6cfcaf4e8627ea338ee03783af0626b by Wen Chen <Wen.Chen@amd.com>:

[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

--
d247af5cd33fe42698bb55ef1c18f32df8a02a21 by scxfjiang <sc.xfjiang@gmail.com>:

refactor tests for collective comm ops

--
6f8c418b3052f7c531896bd5f8cbbc7a766ef7fc by scxfjiang <sc.xfjiang@gmail.com>:

rafactor collective comm e2e tests

--
8ecb6ecf08a1536c5b3f8ba87e0e9f8813b1b359 by scxfjiang <sc.xfjiang@gmail.com>:

update: replace str

--
338d3af2ca1a32302fdfe9d7abee335d24539ee9 by scxfjiang <sc.xfjiang@gmail.com>:

get rid of macros

Merging this change closes #16938

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#16938 from ROCm:ci_dev_rccl_nanoo_fp8 338d3af2ca1a32302fdfe9d7abee335d24539ee9
PiperOrigin-RevId: 676515264
copybara-service bot pushed a commit to openxla/xla that referenced this pull request Sep 19, 2024
… tests

Imported from GitHub PR #16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment:
google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats:
#10488
Copybara import of the project:

--
0fc74cc by Wen Chen <Wen.Chen@amd.com>:

[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

--
d247af5 by scxfjiang <sc.xfjiang@gmail.com>:

refactor tests for collective comm ops

--
6f8c418 by scxfjiang <sc.xfjiang@gmail.com>:

rafactor collective comm e2e tests

--
8ecb6ec by scxfjiang <sc.xfjiang@gmail.com>:

update: replace str

--
338d3af by scxfjiang <sc.xfjiang@gmail.com>:

get rid of macros

Merging this change closes #16938

COPYBARA_INTEGRATE_REVIEW=#16938 from ROCm:ci_dev_rccl_nanoo_fp8 338d3af
PiperOrigin-RevId: 676615012
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Sep 20, 2024
… tests

Imported from GitHub PR openxla/xla#16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment:
google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats:
openxla/xla#10488
Copybara import of the project:

--
0fc74ccae6cfcaf4e8627ea338ee03783af0626b by Wen Chen <Wen.Chen@amd.com>:

[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

--
d247af5cd33fe42698bb55ef1c18f32df8a02a21 by scxfjiang <sc.xfjiang@gmail.com>:

refactor tests for collective comm ops

--
6f8c418b3052f7c531896bd5f8cbbc7a766ef7fc by scxfjiang <sc.xfjiang@gmail.com>:

rafactor collective comm e2e tests

--
8ecb6ecf08a1536c5b3f8ba87e0e9f8813b1b359 by scxfjiang <sc.xfjiang@gmail.com>:

update: replace str

--
338d3af2ca1a32302fdfe9d7abee335d24539ee9 by scxfjiang <sc.xfjiang@gmail.com>:

get rid of macros

Merging this change closes #16938

PiperOrigin-RevId: 676615012
ScXfjiang added a commit to ROCm/xla that referenced this pull request Sep 20, 2024
…on unit tests

Imported from GitHub PR openxla#16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment:
google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats:
openxla#10488
Copybara import of the project:

--
0fc74cc by Wen Chen <Wen.Chen@amd.com>:

[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

--
d247af5 by scxfjiang <sc.xfjiang@gmail.com>:

refactor tests for collective comm ops

--
6f8c418 by scxfjiang <sc.xfjiang@gmail.com>:

rafactor collective comm e2e tests

--
8ecb6ec by scxfjiang <sc.xfjiang@gmail.com>:

update: replace str

--
338d3af by scxfjiang <sc.xfjiang@gmail.com>:

get rid of macros

Merging this change closes openxla#16938

COPYBARA_INTEGRATE_REVIEW=openxla#16938 from ROCm:ci_dev_rccl_nanoo_fp8 338d3af
PiperOrigin-RevId: 676615012
ScXfjiang added a commit to ROCm/tensorflow-upstream that referenced this pull request Sep 20, 2024
…ation unit tests

Imported from GitHub PR openxla/xla#16938

This PR adds support for NANOO FP8 data format in the collaborative communication unit tests.
- For the context on OCP FP8 and NANOO FP8, please refer to this comment:
google/flax#3993 (comment)
- The unit tests in this PR are similar to GEMM unit test introduced in the following PR to be able to deal with both OCP and NANOO fp8 formats:
openxla/xla#10488
Copybara import of the project:

--
0fc74ccae6cfcaf4e8627ea338ee03783af0626b by Wen Chen <Wen.Chen@amd.com>:

[AMD] Added NCCL support for fp8e4m3fnuz and fp8e5m2fnuz.

--
d247af5cd33fe42698bb55ef1c18f32df8a02a21 by scxfjiang <sc.xfjiang@gmail.com>:

refactor tests for collective comm ops

--
6f8c418b3052f7c531896bd5f8cbbc7a766ef7fc by scxfjiang <sc.xfjiang@gmail.com>:

rafactor collective comm e2e tests

--
8ecb6ecf08a1536c5b3f8ba87e0e9f8813b1b359 by scxfjiang <sc.xfjiang@gmail.com>:

update: replace str

--
338d3af2ca1a32302fdfe9d7abee335d24539ee9 by scxfjiang <sc.xfjiang@gmail.com>:

get rid of macros

Merging this change closes tensorflow#16938

PiperOrigin-RevId: 676615012