Error caused by running flash_attn #2
Comments
flash-attn 2.6.3

I tried training with flash-attention 2 and got a similar error, so I didn't claim that this repo supports flash-attention 2.
I debugged what goes wrong when flash_attention_2 is enabled in finetune.py. Conclusion: fixed, see my latest commit ff383f7.
How:
Solution:

I closed this issue.
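For reference, the usual way to enable FlashAttention-2 through the `transformers` library is to pass `attn_implementation="flash_attention_2"` together with a half-precision dtype to `from_pretrained`. A minimal sketch of building those kwargs (an illustration only; the hypothetical helper `flash_attn2_kwargs` is not from the repo, and this is not necessarily what commit ff383f7 changed):

```python
def flash_attn2_kwargs(dtype: str = "bfloat16") -> dict:
    """Build kwargs for transformers' AutoModelForCausalLM.from_pretrained
    that turn on FlashAttention-2.

    FlashAttention kernels only support half-precision inputs, so reject
    fp32 up front instead of failing later inside the CUDA kernel.
    """
    if dtype not in ("float16", "bfloat16"):
        raise ValueError("flash_attention_2 requires float16 or bfloat16")
    return {"attn_implementation": "flash_attention_2", "torch_dtype": dtype}
```

The resulting dict would be unpacked into the `from_pretrained` call, e.g. `AutoModelForCausalLM.from_pretrained(model_id, **flash_attn2_kwargs())`.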
Got it, training works now, though GPU memory usage is still not low. Great project, training with native PyTorch.
@lonngxiang May I ask which GPU you are using? I'm on a 4090 with 24 GB of VRAM and get an out-of-memory error after one iteration. How large is your dataset? Could we discuss?
After reinstalling a newer transformers, training works. Also on a 4090; loading is just very slow.
out, q, k, v, out_padded, softmax_lse, S_dmask, rng_state = flash_attn_cuda.fwd(
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [298,0,0], thread: [64,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [298,0,0], thread: [65,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
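The assertion firing in IndexKernel.cu is a bounds check on an indexing operation: each index must satisfy `-sizes[i] <= index && index < sizes[i]` (negative indices wrap, as in Python). A common trigger during fine-tuning is a token id outside the embedding table, e.g. after adding special tokens without resizing the model's embeddings. A pure-Python sketch of the same check (the helper name `check_index_bounds` is ours, for illustration):

```python
def check_index_bounds(indices, size):
    """Mirror the CUDA kernel's assertion
    `-sizes[i] <= index && index < sizes[i]`.

    Negative indices down to -size are valid (they wrap around);
    anything else would trip the device-side assert. Returns the
    first offending index, or None if all indices are in range.
    """
    for idx in indices:
        if not (-size <= idx < size):
            return idx
    return None
```

Running input ids through such a check against the tokenizer's vocab size on the CPU, before the forward pass, gives a readable Python error instead of an asynchronous device-side assert.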