Hi @slaren
Thanks for the hint. After reducing CUDA_POOL_VMM_MAX_SIZE to 32GB, it works.
I created #4687 for this. Please let me know whether the PR looks fine or whether you need more tests on this device.
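For readers following along, here is a rough sketch of what the change amounts to in terms of sizes. Only the name CUDA_POOL_VMM_MAX_SIZE and the 32GB figure come from this thread. The constant names below are hypothetical, the 64 GB "before" value is my assumption about the b1697 default, and this is not the actual code from the PR.

```cpp
#include <cstdint>

// Illustrative only: these names are hypothetical stand-ins for
// CUDA_POOL_VMM_MAX_SIZE as mentioned in the thread.
constexpr std::uint64_t GiB = 1ull << 30;

// ASSUMPTION: the limit introduced in b1697 was 64 GB (1 << 36 bytes).
constexpr std::uint64_t kPoolVmmMaxSizeBefore = 1ull << 36;

// From the thread: reducing the limit to 32 GB (1 << 35 bytes) makes it work.
constexpr std::uint64_t kPoolVmmMaxSizeAfter = 1ull << 35;

// Convert a byte count to whole GiB for easier reading.
constexpr std::uint64_t to_gib(std::uint64_t bytes) {
    return bytes / GiB;
}

// Compile-time sanity checks on the arithmetic.
static_assert(to_gib(kPoolVmmMaxSizeBefore) == 64, "assumed old limit is 64 GiB");
static_assert(to_gib(kPoolVmmMaxSizeAfter) == 32, "reduced limit is 32 GiB");
```

The only point here is that the reduced limit is half of the assumed original, expressed as a power of two. Why the larger value fails on this device is not established in this thread.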
Summary
In b1696, everything works fine.
However, since b1697 introduced the CUDA VMM pool, it no longer works on this device.
Hardware
NVIDIA Jetson AGX Orin 64GB
uname -a
Linux jetson-orin 5.10.104-tegra #1 SMP PREEMPT Tue Jan 24 15:09:44 PST 2023 aarch64 aarch64 aarch64 GNU/Linux
OS
Steps to reproduce
./build/bin/main -m /disk/models/baichuan2-7b-base.Q5_K.gguf -n 512 -ngl 35 -p '這是一段中文測試'
Expected output (b1696)