
[BUG] Crash in MLXFast.scaledDotProductAttention #172

Closed
tqtifnypmb opened this issue Dec 5, 2024 · 5 comments

tqtifnypmb commented Dec 5, 2024

Version: 0.21.1

Error: libc++abi: terminating due to uncaught exception of type std::runtime_error: [metal::Device] Unable to load function steel_attention_float32_bq32_bk32_bd64_wm4_wn1 Function steel_attention_float32_bq32_bk32_bd64_wm4_wn1 was not found in the library

It seems that update_mlx.sh does not include the steel_attn kernels.

davidkoski self-assigned this Dec 5, 2024
davidkoski (Collaborator) commented Dec 5, 2024

So far I am unable to reproduce it. I tried to force the types like this (in case it was particular to the dtype):

        let output = MLXFast.scaledDotProductAttention(
            queries: queries.asType(.float32), keys: keys.asType(.float32), values: values.asType(.float32), scale: scale, mask: mask
        )

However, I can see the files in the mlx directory (note that they are not built from here, so this matches the problem description):

./Source/Cmlx/mlx/mlx/backend/metal/kernels/steel/attn/kernels/steel_attention.metal:#include "mlx/backend/metal/kernels/steel/attn/kernels/steel_attention.h"
./Source/Cmlx/mlx/mlx/backend/metal/kernels/steel/attn/kernels/steel_attention.metal:  template [[host_name("steel_attention_" #tname "_bq" #bq "_bk" #bk "_bd" #bd "_wm" #wm "_wn" #wn)]] \
./Source/Cmlx/mlx/mlx/backend/metal/scaled_dot_product_attention.cpp:  kname << "steel_attention_" 
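The host_name macro above stitches the kernel name together from the dtype and tile parameters; a small Python sketch (parameter names taken from the macro) shows how the failing name in the error message is formed:

```python
def steel_attention_kernel_name(tname: str, bq: int, bk: int, bd: int,
                                wm: int, wn: int) -> str:
    # Mirrors the [[host_name(...)]] macro quoted above:
    # "steel_attention_" #tname "_bq" #bq "_bk" #bk "_bd" #bd "_wm" #wm "_wn" #wn
    return f"steel_attention_{tname}_bq{bq}_bk{bk}_bd{bd}_wm{wm}_wn{wn}"

# The name from the crash corresponds to float32, 32x32 tiles,
# head dim 64, and a 4x1 warp layout:
print(steel_attention_kernel_name("float32", 32, 32, 64, 4, 1))
# steel_attention_float32_bq32_bk32_bd64_wm4_wn1
```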

and I can see the header:

ls Source/Cmlx/mlx-generated/metal/steel/attn/kernels

steel_attention.h

and the metal file isn't there:

find Source/Cmlx/mlx-generated -name '*.metal' -print  

Source/Cmlx/mlx-generated/metal/arg_reduce.metal
Source/Cmlx/mlx-generated/metal/conv.metal
Source/Cmlx/mlx-generated/metal/rms_norm.metal
Source/Cmlx/mlx-generated/metal/random.metal
Source/Cmlx/mlx-generated/metal/scaled_dot_product_attention.metal
Source/Cmlx/mlx-generated/metal/gemv.metal
Source/Cmlx/mlx-generated/metal/layer_norm.metal
Source/Cmlx/mlx-generated/metal/rope.metal

davidkoski (Collaborator) commented:

What code are you using to trigger this failure?


davidkoski commented Dec 5, 2024

It looks like I need to hit the conditions for implementation_supports_use_case; otherwise it uses a fallback implementation. Additionally, it needs to be in "full attention" mode:

  const bool supports_sdpa_full = query_sequence_length >= threshold &&
      !mask.has_value() && sdpa_full_supported_head_dim &&
      stream.device == Device::gpu;
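The condition quoted above only routes to the fused kernel in a narrow case. A minimal Python sketch of the same predicate makes the gating explicit; note that the threshold value and the set of supported head dims below are assumptions for illustration, not the actual MLX values:

```python
# Sketch of the supports_sdpa_full gating quoted above. The threshold
# and the supported head dims are ASSUMED values for illustration only.
ASSUMED_THRESHOLD = 1024
ASSUMED_HEAD_DIMS = {64, 128}

def supports_sdpa_full(query_sequence_length: int, has_mask: bool,
                       head_dim: int, device_is_gpu: bool) -> bool:
    # Mirrors: query_sequence_length >= threshold && !mask.has_value()
    #          && sdpa_full_supported_head_dim && stream.device == Device::gpu
    return (query_sequence_length >= ASSUMED_THRESHOLD
            and not has_mask
            and head_dim in ASSUMED_HEAD_DIMS
            and device_is_gpu)

# A long, unmasked GPU query with head dim 64 takes the fused kernel path:
print(supports_sdpa_full(4096, False, 64, True))   # True
# Any mask (or a short sequence) falls back to the composed implementation,
# which is why the missing steel_attention kernel was hard to trigger:
print(supports_sdpa_full(4096, True, 64, True))    # False
```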

awni (Member) commented Dec 5, 2024

Yes, exactly. You can see some conditions that should hit it here

davidkoski (Collaborator) commented:

Ah, I see the issue. This kernel was added on 22 Nov and was never in the list of metal files to build, so it is a matter of (1) narrow trigger conditions, (2) a recently added kernel, and (3) a kernel that was never in the build.

davidkoski added a commit to davidkoski/mlx-swift that referenced this issue Dec 5, 2024
- steel_attention.metal (new) was missing from the build