6600/6600 XT/6650 XT gfx1032 libraries for compilation of Kobold.cpp #655

jasyuiop · 2024-02-01T12:26:30Z

Information

I have an RX 6600 (gfx1032) video card. On Linux I can use rocBLAS by setting "export HSA_OVERRIDE_GFX_VERSION=10.3.0", but on Windows rocBLAS has no kernel or TensileLibrary support for gfx1032.

I had ROCm 5.5.1 installed on my system, so I used the rocm-5.5.1 branches of rocBLAS and Tensile.

I applied this patch to Tensile: https://raw.githubusercontent.com/ulyssesrr/docker-rocm-xtra/f25f12835c1d0a5efa80763b5381accf175b200e/rocm-xtra-rocblas-builder/patches/Tensile-fix-fallback-arch-build.patch

Resources I followed

ggerganov#1087 (comment)
#441
https://www.reddit.com/r/LocalLLaMA/comments/16d1hi0/guide_build_llamacpp_on_windows_with_amd_gpus_and/

Using the information there, I was able to create a "non-lazy" merged library for gfx1032. I could not create the "lazy" one no matter what I tried.

Results

Using the generated Kernels.so-000-gfx1032.hsaco and TensileLibrary.dat files, I was able to load a 7B LLM completely onto the GPU in koboldcpp-rocm and got an average speed of 25 t/s in a new chat.

Progress

I have now installed ROCm 5.7.1 and am trying to build lazy and non-lazy versions for gfx1032 without any patches, using the release/rocm-rel-5.7 branches of Tensile and rocBLAS. I don't know yet whether it will compile; if I succeed, I will add those files.

The last word

I would appreciate it if you added these files to the pre-built binaries in future releases. @YellowRoseCx

Attachments

gfx1032_none_lazy.zip

jasyuiop · 2024-02-01T15:48:40Z

Information

EDIT: The one I created as "lazy" seems to be incomplete, so I created a "non-lazy" one from the rocBLAS and Tensile rel-5.7.1 branches and I am attaching it.

With this commit, which was merged last week, I was able to generate the "lazy" library for gfx1032 without any patch. I used rocBLAS's develop branch. I will explain step by step how I did it below. ROCm/Tensile@efbe0c0

Setup

Install

Git for Windows
Visual Studio 2022 Build Tools

Tick “Desktop development with C++” workload.

ROCm Windows SDK (I used 5.7.1)
Strawberry Perl
Python 3.11

ADD PATH

Cmake and Ninja:

C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin
C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\Ninja

Git:

C:\Program Files\Git\bin

Perl:

C:\Strawberry\perl\bin

RC (when compiling koboldcpp I got an error saying rc was not found, so I added it to PATH):

C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\x64

ROCm:

C:\Program Files\AMD\ROCm\5.7\bin
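
Before starting the build, it can help to confirm that the PATH entries above actually resolve. The following is only a minimal sketch: the executable names in REQUIRED_TOOLS are assumptions derived from the folders listed above (they are not stated in this thread), and missing_tools is a hypothetical helper name.

```python
import shutil

# Executable names assumed from the PATH entries above; adjust to your setup.
REQUIRED_TOOLS = ["cmake", "ninja", "git", "perl", "rc", "clang"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be resolved on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

Run it from the same console you intend to build in, since the x64 native tools prompt sets up its own environment.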

vcpkg

Create a folder called github under C:\
cd github
git clone https://github.com/microsoft/vcpkg
.\vcpkg\bootstrap-vcpkg.bat

Rocblas

Go to another folder, for example Downloads.

git clone https://github.com/ROCmSoftwarePlatform/rocBLAS.git
python.exe -m pip install --upgrade pip
cd rocBLAS
python rdeps.py

Open the x64 native tools console as ADMIN and go to the rocBLAS folder

python rmake.py

Now let me explain: I struggled with the rmake.py command for two days. Even when I passed -a gfx1032 or other parameters, I still got an error. If you are luckier than me, you may not get an error here.

It doesn't matter if you do get an error; the command just needs to generate some things and put them in place.

After the rmake.py command has finished (continue in the x64 native tools console):

for non-lazy

.\build\release\virtualenv\Scripts\activate.bat
TensileCreateLibrary --architecture gfx1032 --code-object-version default --merge-files --library-format msgpack .\library\src\blas3\Tensile\Logic\asm_full C:\SomeOutputFolder HIP

for lazy

.\build\release\virtualenv\Scripts\activate.bat
TensileCreateLibrary --architecture gfx1032 --code-object-version default --merge-files --separate-architectures --lazy-library-loading --library-format msgpack .\library\src\blas3\Tensile\Logic\asm_full C:\SomeOutputFolder HIP

TensileCreateLibrary generated the kernel and TensileLibrary files without any errors.

We now have our kernel and TensileLibrary files in the C:\SomeOutputFolder folder.
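
To double-check the result of the non-lazy run, a small script can confirm that the two files mentioned earlier in this thread (Kernels.so-000-gfx1032.hsaco and TensileLibrary.dat) are present. This is a sketch under assumptions: missing_outputs is a hypothetical helper, and the files may sit in a subfolder of the output directory, so pass whichever folder they actually ended up in.

```python
from pathlib import Path


def missing_outputs(folder, arch="gfx1032"):
    """List the expected non-lazy output files that are absent from folder."""
    # File names follow the pattern reported in this thread for gfx1032.
    expected = [f"Kernels.so-000-{arch}.hsaco", "TensileLibrary.dat"]
    return [name for name in expected if not (Path(folder) / name).is_file()]
```

An empty list means both files were found; anything else names what is still missing.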

Attachments

Files I generated for gfx1032:
gfx1032_non-lazy-rocblas-dev-branch.zip
gfx1032_lazy-rocblas-dev-branch.zip
gfx1032_none_lazy-rocm-5.7.1.zip

jasyuiop · 2024-02-01T20:45:55Z

I use openhermes-2.5-mistral-7b.Q6_K.gguf. I put the kernel and TensileLibrary files I shared above (https://github.com/LostRuins/koboldcpp/files/14129073/gfx1032_none_lazy-rocm-5.7.1.zip) under AMD\ROCm\5.7\bin\rocblas\library. I compiled the latest koboldcpp-rocm version myself for gfx1032. I am using HIP SDK 5.7.1.

My launch parameters for openhermes are as follows (.kcpps):

{"model": null, "model_param": "D:/Ai/models/openhermes-2.5-mistral-7b.Q6_K.gguf", "port": 5001, "port_param": 5000, "host": "", "launch": false, "lora": null, "config": null, "threads": 8, "blasthreads": 8, "highpriority": false, "contextsize": 8192, "blasbatchsize": 512, "ropeconfig": [1.0, 10000.0], "smartcontext": false, "noshift": false, "bantokens": null, "forceversion": 0, "nommap": false, "usemlock": false, "noavx2": false, "debugmode": 0, "skiplauncher": false, "hordeconfig": null, "noblas": false, "useclblast": null, "usecublas": ["normal", "0"], "usevulkan": null, "gpulayers": 33, "tensor_split": null, "onready": "", "multiuser": 1, "remotetunnel": false, "foreground": false, "preloadstory": null, "quiet": false, "checkforupdates": 0, "ssl": null}

This is the result:

Processing Prompt [BLAS] (316 / 316 tokens)
Generating (250 / 250 tokens)
ContextLimit: 566/8192, Processing:2.01s (6.4ms/T), Generation:11.61s (46.4ms/T), Total:13.62s (54.5ms/T = 18.36T/s)

I see 7.7 GB of VRAM usage in Task Manager, and I think the result is great. I can say that I no longer need to dual-boot for LLMs :)
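
For anyone who wants to check how the per-token figures in that timing line relate to each other, here is a short worked example. It only re-derives the numbers from the log above; the seconds helper and the regular expression are illustrative assumptions, not part of koboldcpp.

```python
import re

LOG_LINE = (
    "ContextLimit: 566/8192, Processing:2.01s (6.4ms/T), "
    "Generation:11.61s (46.4ms/T), Total:13.62s (54.5ms/T = 18.36T/s)"
)


def seconds(label, line):
    """Extract the seconds value that follows 'label:' in the timing line."""
    return float(re.search(rf"{label}:([\d.]+)s", line).group(1))


prompt_tokens, generated_tokens = 316, 250  # from the two progress lines

processing = seconds("Processing", LOG_LINE)
generation = seconds("Generation", LOG_LINE)
total = seconds("Total", LOG_LINE)

print(round(processing / prompt_tokens * 1000, 1))     # ms per prompt token
print(round(generation / generated_tokens * 1000, 1))  # ms per generated token
print(round(generated_tokens / total, 2))              # overall tokens per second
```

Note that the overall T/s figure divides the generated tokens by the total time, so it includes prompt processing; generation alone is faster than 18.36 T/s.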

If you want to spare gfx1032 owners from compiling their own .exe like I did, you can add these files to the pre-built binaries 😄 @YellowRoseCx

YellowRoseCx · 2024-02-10T19:34:43Z

Adding them into KoboldCpp-ROCm 1.57.1.yr1, hopefully everything works as intended xD
Thanks!

jasyuiop · 2024-02-11T10:08:58Z

I realized later that the "lazy" one I shared was incomplete and even unusable, so I added a note at the top of this post #655 (comment) and then created and attached a "non-lazy" one for HIP SDK 5.7.1. The "non-lazy" one works smoothly and properly, so I recommend adding that one in the new version. I saw that the "lazy" one was added in the new version, which unfortunately will not work :( I am adding the link again to avoid confusion @YellowRoseCx
https://github.com/LostRuins/koboldcpp/files/14129073/gfx1032_none_lazy-rocm-5.7.1.zip

YellowRoseCx · 2024-02-11T12:22:27Z

I can't use the non-lazy one, because then I can't use the other files for gfx1031: it would overwrite the TensileLibrary.dat file.
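
The conflict described here can be spotted before copying anything. A minimal sketch, assuming each architecture's files live in their own folder; colliding_files is a hypothetical helper, and the gfx1031 kernel file name in the comment is assumed by analogy with the gfx1032 one.

```python
from pathlib import Path


def colliding_files(existing_dir, new_dir):
    """Return file names present in both folders, i.e. files a copy would overwrite."""
    existing = {p.name for p in Path(existing_dir).iterdir() if p.is_file()}
    incoming = {p.name for p in Path(new_dir).iterdir() if p.is_file()}
    return sorted(existing & incoming)


# With a gfx1031 folder (assumed: Kernels.so-000-gfx1031.hsaco, TensileLibrary.dat)
# and a gfx1032 folder (Kernels.so-000-gfx1032.hsaco, TensileLibrary.dat),
# only TensileLibrary.dat would be reported as a collision.
```

The kernel files carry the architecture in their names and can coexist; the shared TensileLibrary.dat name is what clashes.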

jasyuiop · 2024-02-11T13:16:29Z

Yes, that would be a problem; I didn't think about that. gfx1032 owners will have to compile it themselves then. I wrote up how to compile and create an exe on Discord, and I'll share it here:

Install HIP SDK 5.7.1 https://www.amd.com/en/developer/resources/rocm-hub/hip-sdk.html
Download https://github.com/LostRuins/koboldcpp/files/14129073/gfx1032_none_lazy-rocm-5.7.1.zip
Place the Kernels.so-000-gfx1032.hsaco and TensileLibrary.dat files (under the library folder after unzipping) in the C:\Program Files\AMD\ROCm\5.7\bin\rocblas\library directory
Open x64 native tools without admin (if you don't have Visual Studio, w64devkit will do; here is the information: https://github.com/YellowRoseCx/koboldcpp-rocm?tab=readme-ov-file#compiling-for-amd-on-windows)
git clone https://github.com/YellowRoseCx/koboldcpp-rocm
cd koboldcpp-rocm
mkdir build && cd build
cmake .. -G "Ninja" -DCMAKE_BUILD_TYPE=Release -DLLAMA_HIPBLAS=ON -DHIP_PLATFORM=amd -DCMAKE_C_COMPILER="C:/Program Files/AMD/ROCm/5.7/bin/clang.exe" -DCMAKE_CXX_COMPILER="C:/Program Files/AMD/ROCm/5.7/bin/clang++.exe" -DAMDGPU_TARGETS="gfx1032"
cmake --build . -j 16
Copy koboldcpp_hipblas.dll from koboldcpp-rocm\build\bin and put it in the main directory.
python -m venv .venv
.venv\Scripts\activate
pip install pyinstaller

Make a copy of make_pyinstaller_exe_rocm_only.bat as a new .bat file and change only the ROCm version in it from 5.5 to 5.7.

Then run that .bat file. It will create the exe under koboldcpp-rocm\dist
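
The long configure command is easy to mistype, so one option is to assemble it from its parts. This is only a sketch: cmake_configure_args is a hypothetical helper, the flags are the ones from the steps above, and it assumes the build folder sits directly inside the koboldcpp-rocm checkout so that the source directory is "..".

```python
def cmake_configure_args(rocm_bin, targets):
    """Build the configure command from the steps above as an argument list."""
    return [
        "cmake", "..", "-G", "Ninja",
        "-DCMAKE_BUILD_TYPE=Release",
        "-DLLAMA_HIPBLAS=ON",
        "-DHIP_PLATFORM=amd",
        f"-DCMAKE_C_COMPILER={rocm_bin}/clang.exe",
        f"-DCMAKE_CXX_COMPILER={rocm_bin}/clang++.exe",
        # Several targets are joined with ';', the usual CMake list separator.
        "-DAMDGPU_TARGETS=" + ";".join(targets),
    ]


args = cmake_configure_args("C:/Program Files/AMD/ROCm/5.7/bin", ["gfx1032"])
print(args)
```

Passing the list to subprocess.run(args, check=True) from inside the build folder avoids shell quoting problems with the space in "Program Files".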

YellowRoseCx · 2024-02-11T14:25:26Z

Could you try compiling for GPU targets gfx1031 and gfx1032 together? It should then output only one TensileLibrary.dat.

jasyuiop · 2024-02-11T19:43:57Z

I'm glad you told me that :) I compiled it without any problems, using the rocBLAS and Tensile rel-5.7.1 branches.

python rmake.py -a gfx1031;gfx1032 --merge-architectures --no-lazy-library-loading -t "D:\Ai\5-7-1\Tensile" -d -j 16 -v

I'll explain step by step how I compiled it a little later, just for information :)

Attachments

gfx1031_gfx1032_none-lazy-rocm.5.7.1.zip

jasyuiop · 2024-02-11T20:13:24Z

Install

Git for Windows
Visual Studio 2022 Build Tools

Tick “Desktop development with C++” workload.

ROCm Windows SDK (I used 5.7.1)
Strawberry Perl
Python 3.11

ADD PATH

Cmake and Ninja:

C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin
C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\Ninja

Git:

C:\Program Files\Git\bin

Perl:

C:\Strawberry\perl\bin

ROCm:

C:\Program Files\AMD\ROCm\5.7\bin

vcpkg

Create a folder called github under C:\
cd github
git clone -b 2023.06.20 https://github.com/microsoft/vcpkg
.\vcpkg\bootstrap-vcpkg.bat

Rocblas

Go to another folder, for example Downloads.

git clone -b release/rocm-rel-5.7.1.1 https://github.com/ROCmSoftwarePlatform/rocBLAS.git
Download this file and put it in the rocBLAS folder: https://github.com/ROCm/rocBLAS/files/13599164/vcpkg.json

Tensile

Go to another folder.

git clone -b release/rocm-rel-5.7.1.1 https://github.com/ROCm/Tensile
Download this patch file and put it in the Tensile folder: https://raw.githubusercontent.com/ulyssesrr/docker-rocm-xtra/f25f12835c1d0a5efa80763b5381accf175b200e/rocm-xtra-rocblas-builder/patches/Tensile-fix-fallback-arch-build.patch
cd Tensile
git apply Tensile-fix-fallback-arch-build.patch

Open x64 native tools (without admin) and go to the rocBLAS folder

python rdeps.py
python rmake.py -a gfx1031;gfx1032 --merge-architectures --no-lazy-library-loading -t "D:\Ai\5-7-1\Tensile" -d -j 16 -v

After the rmake.py command has finished, open the x64 native tools console as ADMIN

cmake --install build\release --prefix "C:\Program Files\AMD\ROCm\5.7"

jasyuiop · 2024-02-11T20:55:17Z

I always get this error when compiling with the parameters --lazy-library-loading --no-merge-architectures. If someone can tell me how to solve it, I can also compile the "lazy" one for gfx1031 and gfx1032.

I don't understand why; it compiles with --merge-architectures --no-lazy-library-loading without any error.

Reading logic files: Launching 16 threads for 108 tasks...
Reading logic files: Done.
[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||] 100% (0.4 secs elapsed)
Using fallback for arch: gfx1031
Using fallback for arch: gfx1032
# Writing Custom CMake
# Writing Kernels...
Generating kernels: Launching 16 threads...
Generating kernels: Done.
*
Compiling source kernels: Launching 16 threads...
Compiling source kernels: Done.
# Kernel Building elapsed time = 82.0 secs
# Tensile Library Writer DONE
################################################################################

[4/257] library\src\CMakeFiles\TENSILE_LIBRARY_TARGET.dir\utility.bat ecc6f16db1efb076
FAILED: library/src/CMakeFiles/TENSILE_LIBRARY_TARGET.util
library\src\CMakeFiles\TENSILE_LIBRARY_TARGET.dir\utility.bat ecc6f16db1efb076
Error copying file (if different) from "D:\Ai\5-7-1\rocBLAS\build\release\Tensile\library\TensileLibrary_lazy_gfx1032.dat" to "D:/Ai/5-7-1/rocBLAS/build/release/Tensile/library".
Batch file failed at line 61 with errorcode 1
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "D:\Ai\5-7-1\rocBLAS\rmake.py", line 512, in <module>
    main()
  File "D:\Ai\5-7-1\rocBLAS\rmake.py", line 505, in main
    if run_cmd(exe, opts):
       ^^^^^^^^^^^^^^^^^^
  File "D:\Ai\5-7-1\rocBLAS\rmake.py", line 468, in run_cmd
    proc = subprocess.run(program, check=True, stderr=subprocess.STDOUT, shell=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2288.0_x64__qbz5n2kfra8p0\Lib\subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'ninja.exe -j 16 --verbose all' returned non-zero exit status 1

YellowRoseCx · 2024-02-11T21:47:33Z

I'm building a new koboldcpp version now to see if it works

jasyuiop · 2024-02-11T21:50:54Z

By the way, one thing I noticed is that the TensileLibrary.dat file seems to depend on the Tensile version rather than on the cards.

When I check the SHA, it gives the same result as my previous build. I also compared it with the kernel and library files from your first build, where you added gfx1031 support. I think that build (rocBLAS, Tensile) used HIP SDK 5.5.1, and that's why neither the kernel nor the TensileLibrary SHAs match.

For new HIP versions and card support, if you pick a base version (SDK, Tensile, rocBLAS) and ask card owners to compile against that version and send the kernel file, that seems to work fine.
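
The SHA comparison mentioned above can be reproduced with a few lines of Python. The thread does not say which SHA variant was used, so SHA-256 here is an assumption, and sha256_of and same_file_contents are hypothetical helper names.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large kernel files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """True when both files hash to the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Comparing the TensileLibrary.dat from two builds this way shows whether they are byte-for-byte identical, which is the check described above.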

hiepxanh · 2024-02-18T10:06:55Z

I have a 6600 XT card now. Can I use the zip file, or do I have to go through the build steps like you did? @jasyuiop I think that's a bit too much overhead for me.

jasyuiop · 2024-02-18T10:46:14Z

You don't need to bother with compiling the kernel or koboldcpp. I compiled the kernel for gfx1032 and @YellowRoseCx added it to the new releases, so just do the following and you're good:

Install HIP SDK 5.7.1 https://www.amd.com/en/developer/resources/rocm-hub/hip-sdk.html
Use the latest koboldcpp-rocm exe https://github.com/YellowRoseCx/koboldcpp-rocm/releases/tag/v1.58.yr0-ROCm

hiepxanh · 2024-02-18T10:47:34Z

Aw so sweet, thank you so much @jasyuiop

GoldenNocturne · 2024-02-20T18:35:04Z

Now let me explain here, I have been struggling with the rmake.py command for two days, even if I pass -a with gfx1032 or other parameters, I still get an error. If you are not as unlucky as me, you may not get an error here.

It doesn't matter if you also get an error, the command just needs to generate some things and put them in place

@jasyuiop for me it is stuck at:
[0/2] Re-checking globbed directories...
[2/400] Generating prototypes from C:/AI/rocBLAS/library/src.

Should I try waiting even longer, or has the command finished doing what's needed?

jasyuiop · 2024-02-20T19:24:10Z

No, you should wait. But if you proceed as in the message you quoted, you may get an error.

If you follow all the steps as I describe here, you should not get any error: #655 (comment)

The reason I got an error there was that I was missing something; I realized it too late :)

GoldenNocturne · 2024-02-20T19:52:33Z

Thanks. I'm actually trying to build for gfx1010; how should I adapt the process in the comment you linked?

jasyuiop · 2024-02-21T08:38:34Z

If you followed exactly the same steps, you only need to change the architecture parameter to gfx1010 (don't forget to change the path to the Tensile folder and adjust the -j parameter to the number of cores you have):

python rmake.py -a gfx1010 --merge-architectures --no-lazy-library-loading -t "D:\Ai\5-7-1\Tensile" -d -j 16 -v
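
Since only the architecture list, the Tensile path and the job count change between these builds, the command can be generated instead of edited by hand. A minimal sketch: rmake_args is a hypothetical helper, the flags are copied from the commands in this thread, and defaulting -j to os.cpu_count() (logical processors) is an assumption about what suits your machine.

```python
import os


def rmake_args(archs, tensile_path, jobs=None):
    """Build the rmake.py command used in this thread as an argument list."""
    jobs = jobs or os.cpu_count() or 1
    return [
        "python", "rmake.py",
        "-a", ";".join(archs),          # e.g. "gfx1031;gfx1032" or "gfx1010"
        "--merge-architectures",
        "--no-lazy-library-loading",
        "-t", tensile_path,             # path to your patched Tensile checkout
        "-d",
        "-j", str(jobs),
        "-v",
    ]


print(rmake_args(["gfx1010"], r"D:\Ai\5-7-1\Tensile", jobs=16))
```

Run the resulting command from the rocBLAS folder in the x64 native tools console, as in the steps above.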
