
Conversation

IlyasMoutawwakil
Contributor

This PR makes the ONNX SDPA export match the behavior of aten SDPA when a boolean attention mask is used. Before the fix, the following repro script:

import onnxruntime as ort
import torch


class ScaledDotProductAttention(torch.nn.Module):
    def forward(self, query, key, value, attn_mask):
        return torch.nn.functional.scaled_dot_product_attention(query, key, value, attn_mask=attn_mask)


model = ScaledDotProductAttention()
attn_mask = torch.ones(2, 4, 8, 8).bool()  # boolean mask for attention
attn_mask[0, 0, 0, :] = False  # masking an entire row (padding token)
query = key = value = torch.randn(2, 4, 8, 16)
output = model(query, key, value, attn_mask)

torch.onnx.export(
    model,
    (query, key, value, attn_mask),
    "scaled_dot_product_attention.onnx",
    input_names=["query", "key", "value", "attn_mask"],
    output_names=["output"],
    opset_version=18,
    dynamo=True, # or False
)
ort_session = ort.InferenceSession("scaled_dot_product_attention.onnx")

np_inputs = {"query": query.numpy(), "key": key.numpy(), "value": value.numpy(), "attn_mask": attn_mask.numpy()}
onnx_outputs = ort_session.run(None, np_inputs)[0]

torch.testing.assert_close(output, torch.tensor(onnx_outputs), equal_nan=True)

fails the assertion because the ORT model outputs NaNs for the fully masked query row.
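For context: a query row whose boolean mask is entirely False becomes a row of -inf scores, and a plain Softmax over all -inf is 0/0, i.e. NaN, which is what ONNX Runtime computes here. aten's SDPA zeroes out those NaN weights instead (see pytorch/pytorch#103749), and that is the behavior this PR reproduces in the exported graph. A minimal sketch of where the NaNs come from, using only stock PyTorch:

import torch

# A fully masked row becomes all -inf before the softmax...
scores = torch.full((1, 4), float("-inf"))

# ...and softmax over a row of -inf is 0/0, i.e. NaN for every element.
print(torch.softmax(scores, dim=-1))  # tensor([[nan, nan, nan, nan]])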

@IlyasMoutawwakil
Contributor Author

@titaiwangms @justinchuby

@titaiwangms titaiwangms enabled auto-merge (squash) August 7, 2025 15:55

codecov bot commented Aug 7, 2025

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 69.81%. Comparing base (32f2196) to head (0068e40).
⚠️ Report is 3 commits behind head on main.

Files with missing lines                        Patch %   Lines
onnxscript/function_libs/torch_lib/ops/nn.py    0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2479      +/-   ##
==========================================
- Coverage   69.81%   69.81%   -0.01%     
==========================================
  Files         209      209              
  Lines       25313    25314       +1     
  Branches     2525     2525              
==========================================
  Hits        17673    17673              
- Misses       6762     6763       +1     
  Partials      878      878              


@titaiwangms titaiwangms disabled auto-merge August 7, 2025 16:26
@titaiwangms titaiwangms merged commit ecb7677 into microsoft:main Aug 7, 2025
25 of 32 checks passed
@github-project-automation github-project-automation bot moved this from Todo to Done in ONNX Script Review Board Aug 7, 2025
titaiwangms added a commit that referenced this pull request Aug 8, 2025
@justinchuby
Collaborator

# This is because there's no safe/masked softmax implementation in ONNX, so we need to handle NaN values explicitly to match
# the behavior of PyTorch with boolean masks.
attn_weight = op.Where(op.IsNaN(attn_weight), zero, attn_weight)
attn_weight, _ = op.Dropout(attn_weight, dropout_p)

Collaborator

@titaiwangms we should probably conditionally skip this line (even though there is a rewrite rule already)

Contributor

OK

Collaborator

If you fix this, can you also please add a reference to pytorch/pytorch#103749 in the comments for the previous line fixing NaN?

Contributor

We skip when dropout_p is 0?
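For illustration, a minimal sketch of what the conditional skip could look like in the torch_lib function, assuming dropout_p is a plain Python float available at conversion time (hypothetical placement, not the merged code):

if dropout_p != 0.0:
    # Only emit a Dropout node when dropout is actually requested;
    # with dropout_p == 0 the node is a no-op and can be skipped.
    attn_weight, _ = op.Dropout(attn_weight, dropout_p)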

@justinchuby added the module: torchlib label (Related to the torch/aten function lib in development) Aug 8, 2025