
Illegal instruction (core dumped) when trying to load model #839


Open
R-Yordanov-AltScale opened this issue Oct 23, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@R-Yordanov-AltScale

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

For the model to load successfully.

Please provide a detailed written description of what you were trying to do, and what you expected llama-cpp-python to do.

Current Behavior

When I try to load the model with llm = Llama(model_path="./llama.cpp/models/llama-2-7b-chat.Q5_K_M.gguf")
it crashes with: Illegal instruction (core dumped)

This is from my syslog:
kernel: [1728595.660950] traps: python3[213941] trap invalid opcode ip:7f4aa44a4e94 sp:7ffceec92e60 error:0 in libllama.so[7f4aa448a000+9f000]
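
For completeness, here is a minimal script that reproduces the crash (a sketch; the model path is the same one as above and the prompt is only illustrative):

from llama_cpp import Llama

# The process dies with SIGILL inside this constructor, before any generation happens.
llm = Llama(model_path="./llama.cpp/models/llama-2-7b-chat.Q5_K_M.gguf", verbose=True)

# Never reached on this machine.
print(llm("Hello", max_tokens=8))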

Environment and Context

lscpu

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 40 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: AuthenticAMD
Model name: AMD Opteron 63xx class CPU
CPU family: 21
Model: 2
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 8
Stepping: 0
BogoMIPS: 5200.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c hypervisor lahf_lm svm abm sse4a misalignsse 3dnowprefetch xop fma4 tbm vmmcall arat npt nrip_save
Virtualization features:
Virtualization: AMD-V
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 512 KiB (8 instances)
L1i: 512 KiB (8 instances)
L2: 4 MiB (8 instances)
L3: 128 MiB (8 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-7

It is a virtual machine running Ubuntu 22.04.

$ uname -a
Linux trying-to-train-llama2 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

  • SDK version, e.g. for Linux:
$ python3 --version
Python 3.10.12

$ make --version
GNU Make 4.3

$ g++ --version
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0


m-from-space commented Oct 25, 2023

Your CPU is quite old and doesn't support certain instruction sets such as AVX2 (you can see it is missing from the "Flags" list in your lscpu output). Try reinstalling with:

CMAKE_ARGS="-DLLAMA_AVX2=OFF" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall


fgeo23 commented Oct 26, 2023

Same issue here.

After doing some digging, it turns out that CMAKE_ARGS are not being passed to the pip install command. I'm still trying to figure out why.
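
One thing worth trying (a sketch, not a confirmed fix): export the variables in the same shell instead of prefixing the command, and run pip with --verbose so you can see whether the CMake flag actually shows up in the build log:

export CMAKE_ARGS="-DLLAMA_AVX2=OFF"
export FORCE_CMAKE=1
pip install llama-cpp-python --no-cache-dir --force-reinstall --verbose
# The CMake command printed in the verbose output should contain -DLLAMA_AVX2=OFF.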

antoine-lizee pushed a commit to antoine-lizee/llama-cpp-python that referenced this issue Oct 30, 2023
@abetlen added the bug label Nov 8, 2023
@dimaioksha

@fgeo23 have you found out the reason and a solution?

@dimaioksha

Hello everyone.
I've solved this issue by upgrading nvidia-cuda-toolkit from 11.6 to 11.8 (the latest llama-cpp-python==0.2.29 works well).
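
A reinstall along these lines should make the wheel pick up the new toolkit (a sketch; -DLLAMA_CUBLAS=on was the CUDA switch for llama-cpp-python 0.2.x builds, adjust if your version differs):

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.29 --no-cache-dir --force-reinstall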
