Conversation

@tianmu-li commented Apr 22, 2025

Calling the op directly prevents neural-compressor from patching the FP8 module correctly, since its patching logic relies on a ModuleFusedSDPA module being present: https://github.com/intel/neural-compressor/blob/master/neural_compressor/torch/algorithms/fp8_quant/utils/patched_module_restore_registry.py#L128
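
For context, here is a minimal sketch of the kind of module-level wrapper the patcher matches on. This is not the actual PR diff; the `habana_frameworks.torch.hpex.kernels.FusedSDPA` import, its `apply` signature, and the CPU fallback are assumptions based on the Gaudi stack.

```python
# Sketch only: wrap the fused SDPA op in an nn.Module so neural-compressor's
# fp8_quant patching registry can find it by module type instead of having to
# intercept a direct function call.
import torch
import torch.nn as nn

try:
    # Assumed import path for the Gaudi fused SDPA kernel.
    from habana_frameworks.torch.hpex.kernels import FusedSDPA
except ImportError:
    FusedSDPA = None


class ModuleFusedSDPA(nn.Module):
    """nn.Module wrapper around the fused SDPA kernel.

    The FP8 patcher replaces submodules of this class with its patched
    variant, so the kernel must be reachable as a submodule rather than
    called directly as a function.
    """

    def __init__(self, fused_sdpa_kernel=FusedSDPA):
        super().__init__()
        self._kernel = fused_sdpa_kernel

    def forward(self, query, key, value, attn_mask=None, dropout_p=0.0,
                is_causal=False, scale=None):
        if self._kernel is not None:
            # Delegate to the fused HPU kernel (signature assumed).
            return self._kernel.apply(query, key, value, attn_mask,
                                      dropout_p, is_causal, scale)
        # Fallback for environments without the HPU kernel (e.g. CPU).
        return torch.nn.functional.scaled_dot_product_attention(
            query, key, value, attn_mask=attn_mask, dropout_p=dropout_p,
            is_causal=is_causal, scale=scale)
```

An attention layer would then hold something like `self.fused_sdpa = ModuleFusedSDPA(FusedSDPA)` and call it in `forward`, letting the quantizer swap that submodule for its patched FP8 version.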

@madamczyk-intel commented Apr 23, 2025

Hi. Thanks for the PR. This is already covered in #1086 and #1087, which are waiting to be merged.
