
master branch no longer works on AMD integrated #101

Open
unoexperto opened this issue Feb 1, 2025 · 1 comment

@unoexperto

Hi folks,

Unfortunately I made a mistake and pulled changes from master, and koboldcpp-rocm no longer works for me :( In the past I spent a significant amount of time getting it to work on my integrated AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics (gfx1103).

I've tried to do a clean build like this:

make LLAMA_HIPBLAS=1 AMDGPU_TARGETS=gfx1100 -j16

and I launch the app like this (with parameters that used to work before):

LD_PRELOAD=/home/xxx/work/sideprojects/force-host-alloction-APU/libforcegttalloc.so HSA_OVERRIDE_GFX_VERSION=11.0.0 AMD_SERIALIZE_KERNEL=3 python koboldcpp.py --threads 6 --blasthreads 6 --usecublas mmq lowvram --gpulayers 32 --blasbatchsize 256 --contextsize 8192 --model /home/xxx/jdata/models/dolphin-2.9-llama3-8b-q8_0.gguf
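For reference, the env-var part of that command line can be captured in a small wrapper. This is only a sketch: the `hsa_override_for` helper and the `launch.sh` name are hypothetical, and the gfx-to-override mapping reflects the usual convention that ROCm ships kernels for gfx1100 (RDNA3) and gfx1030 (RDNA2) and that nearby targets such as gfx1103 borrow them via `HSA_OVERRIDE_GFX_VERSION`:

```shell
#!/bin/sh
# Hypothetical wrapper (launch.sh): map a gfx target to the
# HSA_OVERRIDE_GFX_VERSION value that ROCm actually ships kernels for.
hsa_override_for() {
  case "$1" in
    gfx110[0-3]) echo "11.0.0" ;;  # RDNA3 GPUs/iGPUs reuse gfx1100 kernels
    gfx103[0-6]) echo "10.3.0" ;;  # RDNA2 reuses gfx1030 kernels
    *)           echo "" ;;        # unknown target: leave the override unset
  esac
}

GFX="${1:-gfx1103}"                # pass your target; defaults to the 780M
OVERRIDE="$(hsa_override_for "$GFX")"
[ -n "$OVERRIDE" ] && export HSA_OVERRIDE_GFX_VERSION="$OVERRIDE"
echo "HSA_OVERRIDE_GFX_VERSION=${HSA_OVERRIDE_GFX_VERSION:-<unset>}"
# ...then exec the same LD_PRELOAD/python koboldcpp.py command line as above.
```

The `LD_PRELOAD` of libforcegttalloc.so and the `AMD_SERIALIZE_KERNEL=3` setting from the original command would still be needed on the APU; they are orthogonal to the override above.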

It fails at runtime with the following error:

ggml_cuda_compute_forward: RMS_NORM failed
ROCm error: invalid device function
  current device: 0, in function ggml_cuda_compute_forward at ggml/src/ggml-cuda/ggml-cuda.cu:2207
  err
ggml/src/ggml-cuda/ggml-cuda.cu:73: ROCm error

Could you please advise what I'm missing?

@arturbac

arturbac commented Feb 3, 2025

It no longer works on my gfx1100 (RX 7900 XTX) either:

rocBLAS error: Tensile solution found, but exception thrown for { a_type: "f16_r", b_type: "f16_r", c_type: "f16_r", d_type: "f16_r", compute_type: "f16_r", transA: 'T', transB: 'N', M: 128, N: 4, K: 32, alpha: 1, row_stride_a: 1, col_stride_a: 4224, row_stride_b: 1, col_stride_b: 32, row_stride_c: 1, col_stride_c: 128, row_stride_d: 1, col_stride_d: 128, beta: 0, batch_count: 40, strided_batch: false, stride_a: 540672, stride_b: 128, stride_c: 512, stride_d: 512, atomics_mode: atomics_allowed }
Alpha value -1912 doesn't match that set in problem: 1
This message will be only be displayed once, unless the ROCBLAS_VERBOSE_TENSILE_ERROR environment variable is set.
ROCm error: CUBLAS_STATUS_INTERNAL_ERROR
  current device: 0, in function ggml_cuda_mul_mat_batched_cublas at ggml/src/ggml-cuda/ggml-cuda.cu:1719
  hipblasGemmBatchedEx(ctx.cublas_handle(), HIPBLAS_OP_T, HIPBLAS_OP_N, ne01, ne11, ne10, alpha, (const void **) (ptrs_src.get() + 0*ne23), HIPBLAS_R_16F, nb01/nb00, (const void **) (ptrs_src.get() + 1*ne23), HIPBLAS_R_16F, nb11/nb10, beta, ( void **) (ptrs_dst.get() + 0*ne23), cu_data_type, ne01, ne23, cu_compute_type, HIPBLAS_GEMM_DEFAULT)
ggml/src/ggml-cuda/ggml-cuda.cu:73: ROCm error
ptrace: Operation not permitted.
No stack.
The program is not being run.
fish: Job 1, 'python koboldcpp.py --threads 6…' terminated by signal SIGABRT (Abort)
