
[Bug-fix][XLA:CPU][oneDNN] Fix BINARY_ADD fusion to Dot #13301

Closed · wants to merge 1 commit

Conversation


@mdfaijul (Contributor) commented Jun 1, 2024

This PR fixes a bug reported for JAX (#13054)

@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Jun 1, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Jun 1, 2024
@NaiyerRizz NaiyerRizz self-requested a review June 3, 2024 04:32

@penpornk penpornk left a comment


Thank you for the fix! Could you please explain what is causing the issue, and how this fix addresses it? Is it because of rank mismatch + wrong auto broadcasting or something?


ENTRY main {
constant.2 = f32[] constant(1e-06)
broadcast.3 = f32[1000000] broadcast(constant.2), dimensions={}
Member

I don't think the size needs to be this big to reproduce the failure. Would 10 work?

Contributor

The issue is not reproducible with a smaller size.

subtract.14 = f32[1000000,3] subtract(broadcast.8, broadcast.13)
constant.4 = f32[] constant(0)
broadcast.5 = f32[3,3] broadcast(constant.4), dimensions={}
dot.15 = f32[1000000,3] dot(subtract.14, broadcast.5), lhs_contracting_dims={1}, rhs_contracting_dims={0}
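For readers less familiar with HLO, the quoted fragment can be rendered roughly in NumPy as follows. The producers of broadcast.8 and broadcast.13 are not shown in the snippet, so the zero and 1e-6 stand-in values here are purely illustrative:

```python
import numpy as np

n = 1_000_000  # the reporter notes the failure does not reproduce at smaller sizes

# Stand-ins for broadcast.8 and broadcast.13 (their producers are not quoted above).
broadcast_8 = np.zeros((n, 3), dtype=np.float32)
broadcast_13 = np.full((n, 3), 1e-6, dtype=np.float32)

subtract_14 = broadcast_8 - broadcast_13  # subtract.14 = f32[n,3]

# constant.4 = 0, broadcast to a [3,3] matrix (broadcast.5).
broadcast_5 = np.zeros((3, 3), dtype=np.float32)

# dot.15 contracts dim 1 of subtract.14 against dim 0 of broadcast.5,
# i.e. an ordinary matrix multiply yielding f32[n,3].
dot_15 = subtract_14 @ broadcast_5
```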
Member

Can we reduce this to just the ops necessary to reproduce the failure? I don't think all the dots are needed.

Contributor

The bug only seems to manifest with this particular case.


@hawkinsp (Member) commented Jun 3, 2024

I'd like to get this fix in within the next day or so, so that I can incorporate it in the next JAX release, please.

@kanvi-nervana (Contributor) commented:

> Thank you for the fix! Could you please explain what is causing the issue, and how this fix addresses it? Is it because of rank mismatch + wrong auto broadcasting or something?

oneDNN expects a Matmul followed by a Bias-Add followed by a Binary-Add. Here, however, the Matmul is followed by a Binary-Add and then a Bias-Add, which oneDNN does not support. The fix extends the dimensions of the Bias-Add addend so that it becomes a supported Binary-Add, as seen below.

[Screenshot, 2024-06-03: graph before and after the fix]
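The commit summarizes the rewrite as "Make addend rank same to dot." A minimal NumPy sketch of that idea (illustrative only, not the actual XLA pass; the helper name is hypothetical):

```python
import numpy as np

def rank_extend_addend(dot_out, addend):
    """Pad the addend with leading unit dimensions until its rank matches
    the dot output, turning a bias-style add into a rank-matched binary
    add of the kind oneDNN can fuse. (Hypothetical helper, illustration only.)"""
    missing = dot_out.ndim - addend.ndim
    return addend.reshape((1,) * missing + addend.shape)

dot_out = np.zeros((4, 3), dtype=np.float32)  # stand-in for the dot result
bias = np.ones((3,), dtype=np.float32)        # rank-1 addend
extended = rank_extend_addend(dot_out, bias)  # now shape (1, 3), same rank as dot_out
result = dot_out + extended                   # broadcasts exactly like the original bias add
```

Because the padded dimensions are all of size 1, NumPy-style broadcasting makes the rank-extended add produce the same values as the original bias add; only the rank (and hence the fusion pattern oneDNN sees) changes.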

@penpornk penpornk left a comment


Thank you very much for the clarifications!

copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jun 4, 2024
Imported from GitHub PR openxla/xla#13301

This PR fixes a bug reported for JAX (openxla/xla#13054)
Copybara import of the project:

--
47d5bde8eab607d0fe9b60c6fd82d95365c8169f by mdfaijul <md.faijul.amin@intel.com>:

Make addend rank same to dot.

Merging this change closes #13301

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#13301 from Intel-tensorflow:amin/bug-fix-jax 47d5bde8eab607d0fe9b60c6fd82d95365c8169f
PiperOrigin-RevId: 640081553
copybara-service bot pushed a commit that referenced this pull request Jun 4, 2024
FUTURE_COPYBARA_INTEGRATE_REVIEW=#13301 from Intel-tensorflow:amin/bug-fix-jax 47d5bde
PiperOrigin-RevId: 638276915
@copybara-service copybara-service bot closed this in 7d12719 Jun 4, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jun 4, 2024
FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#13301 from Intel-tensorflow:amin/bug-fix-jax 47d5bde8eab607d0fe9b60c6fd82d95365c8169f
PiperOrigin-RevId: 638276915
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Jun 4, 2024
Imported from GitHub PR openxla/xla#13301

This PR fixes a bug reported for JAX (openxla/xla#13054)
Copybara import of the project:

--
47d5bde8eab607d0fe9b60c6fd82d95365c8169f by mdfaijul <md.faijul.amin@intel.com>:

Make addend rank same to dot.

Merging this change closes #13301

PiperOrigin-RevId: 640094871
5 participants