
[build] ARMv8 build problem (OpenWrt) #620

Closed
3 of 4 tasks
IngwiePhoenix opened this issue Mar 30, 2023 · 3 comments

Comments

@IngwiePhoenix

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
    • git clone $url; cd llama.cpp; make
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

I expected the build to produce the basic llama.cpp bin/main binary, to check whether building works at all.

Current Behavior

root@FriendlyWrt /s/o/llama.cpp (master)# make
I llama.cpp build info:
I UNAME_S:  Linux
I UNAME_P:  unknown
I UNAME_M:  aarch64
I CFLAGS:   -I.              -O3 -DNDEBUG -std=c11   -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wno-unused-function -pthread -mcpu=native
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -pthread -mcpu=native
I LDFLAGS:
I CC:       cc (OpenWrt GCC 11.2.0) 11.2.0
I CXX:      g++ (OpenWrt GCC 11.2.0) 11.2.0

cc  -I.              -O3 -DNDEBUG -std=c11   -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wno-unused-function -pthread -mcpu=native   -c ggml.c -o ggml.o
ggml.c: In function 'dequantize_row_q4_1':
ggml.c:1041:13: note: use '-flax-vector-conversions' to permit conversions between vectors with differing element types or numbers of subparts
 1041 |             const uint16x8_t vi_0 = vmovl_s8(vget_low_u8 (vq));
      |             ^~~~~
ggml.c:1041:46: error: incompatible type for argument 1 of 'vmovl_s8'
 1041 |             const uint16x8_t vi_0 = vmovl_s8(vget_low_u8 (vq));
      |                                              ^~~~~~~~~~~~~~~~
      |                                              |
      |                                              uint8x8_t
In file included from ggml.c:164:
/usr/lib/gcc/aarch64-openwrt-linux-musl/11.2.0/include/arm_neon.h:7989:20: note: expected 'int8x8_t' but argument is of type 'uint8x8_t'
 7989 | vmovl_s8 (int8x8_t __a)
      |           ~~~~~~~~~^~~
ggml.c:1042:46: error: incompatible type for argument 1 of 'vmovl_s8'
 1042 |             const uint16x8_t vi_1 = vmovl_s8(vget_high_u8(vq));
      |                                              ^~~~~~~~~~~~~~~~
      |                                              |
      |                                              uint8x8_t
In file included from ggml.c:164:
/usr/lib/gcc/aarch64-openwrt-linux-musl/11.2.0/include/arm_neon.h:7989:20: note: expected 'int8x8_t' but argument is of type 'uint8x8_t'
 7989 | vmovl_s8 (int8x8_t __a)
      |           ~~~~~~~~~^~~
make: *** [Makefile:226: ggml.o] Error 1

Environment and Context

  • Physical (or virtual) hardware you are using, e.g. for Linux:
# lscpu
Architecture:           aarch64
  CPU op-mode(s):       32-bit, 64-bit
  Byte Order:           Little Endian
CPU(s):                 8
  On-line CPU(s) list:  0-7
Vendor ID:              ARM
  Model name:           Cortex-A55
    Model:              0
    Thread(s) per core: 1
    Core(s) per socket: 4
    Socket(s):          1
    Stepping:           r2p0
    CPU(s) scaling MHz: 56%
    CPU max MHz:        1800.0000
    CPU min MHz:        408.0000
    BogoMIPS:           48.00
    Flags:              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
  Model name:           Cortex-A76
    Model:              0
    Thread(s) per core: 1
    Core(s) per socket: 2
    Socket(s):          2
    Stepping:           r4p0
    CPU(s) scaling MHz: 22%
    CPU max MHz:        2352.0000
    CPU min MHz:        408.0000
    BogoMIPS:           48.00
    Flags:              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
Caches (sum of all):
  L1d:                  384 KiB (8 instances)
  L1i:                  384 KiB (8 instances)
  L2:                   2.5 MiB (8 instances)
  L3:                   3 MiB (1 instance)
Vulnerabilities:
  Itlb multihit:        Not affected
  L1tf:                 Not affected
  Mds:                  Not affected
  Meltdown:             Not affected
  Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:           Mitigation; __user pointer sanitization
  Spectre v2:           Vulnerable: Unprivileged eBPF enabled
  Srbds:                Not affected
  Tsx async abort:      Not affected
  • Operating System, e.g. for Linux:
root@FriendlyWrt /s/o/llama.cpp (master)# uname -a
Linux FriendlyWrt 5.10.110 #1 SMP Sat Dec 3 01:25:15 CST 2022 aarch64 GNU/Linux
root@FriendlyWrt /s/o/llama.cpp (master)# cat /etc/os-release
NAME="OpenWrt"
VERSION="22.03.2"
ID="openwrt"
ID_LIKE="lede openwrt"
PRETTY_NAME="OpenWrt 22.03.2"
VERSION_ID="22.03.2"
HOME_URL="https://openwrt.org/"
BUG_URL="https://bugs.openwrt.org/"
SUPPORT_URL="https://forum.openwrt.org/"
BUILD_ID="r19803-9a599fee93"
OPENWRT_BOARD="rockchip/armv8"
OPENWRT_ARCH="aarch64_generic"
OPENWRT_TAINTS="busybox"
OPENWRT_DEVICE_MANUFACTURER="OpenWrt"
OPENWRT_DEVICE_MANUFACTURER_URL="https://openwrt.org/"
OPENWRT_DEVICE_PRODUCT="Generic"
OPENWRT_DEVICE_REVISION="v0"
OPENWRT_RELEASE="OpenWrt 22.03.2 r19803-9a599fee93"
  • SDK version, e.g. for Linux:
root@FriendlyWrt /s/o/llama.cpp (master)# python --version
Python 3.11.2
root@FriendlyWrt /s/o/llama.cpp (master)# make --version
GNU Make 4.3
Built for aarch64-openwrt-linux-gnu
Copyright (C) 1988-2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
root@FriendlyWrt /s/o/llama.cpp (master)# gcc --version
gcc (OpenWrt GCC 11.2.0) 11.2.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Failure Information (for bugs)

Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.

Steps to Reproduce

Quite simple:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
make

Reading the Makefile, I noticed that most of the configuration is done automatically, so I assumed I could simply run make.

Failure Logs

See above. :)

@ashimokawa

@IngwiePhoenix

It literally told you what to do ;)

Add -flax-vector-conversions to CFLAGS in the Makefile and it will build.

But yeah, there should be a real fix.
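The suggested workaround can be scripted; a minimal sketch, assuming GNU sed and a CFLAGS assignment like the one in the build log above (demonstrated on a stand-in file rather than the real Makefile):

```shell
# Append -flax-vector-conversions to the CFLAGS assignment.
# Demonstrated on a copy; in the llama.cpp tree, point sed at the
# actual Makefile and rerun make afterwards.
printf 'CFLAGS   = -I. -O3 -DNDEBUG -std=c11 -fPIC\n' > Makefile.demo
sed -i 's/^CFLAGS.*/& -flax-vector-conversions/' Makefile.demo
grep '^CFLAGS' Makefile.demo
```

Note that -flax-vector-conversions silences the type check for all vector code rather than fixing the signed/unsigned mismatch itself, which is why a proper fix in ggml.c is still preferable.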

@IngwiePhoenix
Author

I have never worked with more complex "math stuff" (for lack of a better term...) before, so I wasn't sure whether this would be fine or not :) Gonna try it and see what happens!

@IngwiePhoenix
Author

Worked! Thanks :)

Normally I would've just added the flag, but I have honestly never seen that function or datatype before - hence I was more cautious ^^
