CUDA illegal memory access | AWS #3163
Comments
Same problem on an 8xA100 instance, using
Do you still get the error with
Same here on an AWS g5.12xlarge instance with 4xA10G GPUs, built with CMake.
Same issue attempting to split a model between a 4090 and an A6000.
No. If I stick to a single GPU, there's no issue.
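For anyone bisecting the same way, one route to forcing the single-GPU case without rebuilding is the standard CUDA runtime environment variable; the model path and flags below are placeholders, not commands from this thread:

```shell
# Hide all but the first GPU from the process; CUDA_VISIBLE_DEVICES is a
# standard CUDA runtime variable, not specific to llama.cpp.
export CUDA_VISIBLE_DEVICES=0
# Then launch as usual, e.g. (placeholder path and flags):
#   ./main -m ./models/model.gguf -ngl 35 -p "Hello"
```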
No, at least for myself, it works on. For example:
This does not work:
I have 4xA100-40G, so I have enough memory to run q5 Falcon 180B. This was also discussed in #2160 and #1341 some time back. I have tested the same commands on the same model but
Any update on this would be appreciated. Thanks.
I cannot reproduce the issue on my server with 3x P40.
I also get this error[1] after I run convert.py with
This is the command I run:
I suspect it has something to do with the 8888 context length param (although the model supports 200k context length), because if I try running it with
[1]
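On the context-length suspicion: llama.cpp sets the context with the -c (--ctx-size) flag, so halving the value is a quick way to check whether the crash tracks context size. The value and command below are placeholders, not from this issue:

```shell
# -c / --ctx-size sets the KV-cache context length. Try a smaller value to
# see whether the illegal access follows the context size; 4096 is a
# placeholder, not a recommendation.
CTX=4096
#   ./main -m model.gguf -c "$CTX" -ngl 99 -p "test"
```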
Expected Behavior
Running without errors and being able to utilise GPU
Current Behavior
Encountered a CUDA error during runtime:
CUDA error 700 at /llama.cpp/ggml-cuda.cu:6540: an illegal memory access was encountered
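To localize an error 700 like this, one option is NVIDIA's compute-sanitizer, which ships with the CUDA toolkit; the binary path, model, and flags below are assumptions about a typical llama.cpp checkout, not commands from this report:

```shell
# Re-run the failing command under the memcheck tool to get the offending
# kernel and faulting address. Guarded so this snippet is a no-op on machines
# without the toolkit or a built ./main binary.
TOOL=memcheck
if command -v compute-sanitizer >/dev/null 2>&1 && [ -x ./main ]; then
  compute-sanitizer --tool "$TOOL" ./main -m model.gguf -ngl 99 -p "test"
fi
```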
Environment and Context
Hardware: AWS instance ml.g5.12xlarge.
Operating System: x86_64 GNU/Linux
GPUs: 4xT4 16GB
Failure Information
While trying to run llama.cpp, I encountered a CUDA error. This happened both with models I quantised myself and with TheBloke's pre-quantised models. If I remove '-ngl', everything works fine.
Models like this are running OK with '-ngl'. The issue seems to be related to GGUF v2.
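For context on the flag: -ngl (--n-gpu-layers) controls how many layers llama.cpp offloads to CUDA, and on multi-GPU machines --tensor-split and --main-gpu steer how those layers are distributed, which is why dropping -ngl (pure CPU inference) sidesteps the crash. A placeholder invocation, with values that are assumptions rather than anything from this issue:

```shell
# Offload NGL layers, split evenly across four GPUs, with scratch buffers on
# --main-gpu 0. All values below are placeholders.
NGL=35
#   ./main -m ./models/model.gguf -ngl "$NGL" --tensor-split 1,1,1,1 --main-gpu 0 -p "test"
```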
Steps to Reproduce
Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.
Failure Logs
Example environment info:
Example command: