Releases: YellowRoseCx/koboldcpp-rocm
KoboldCPP-v1.75.2.yr1-ROCm
- Recompiled with gfx906 support
- Disabled MMQ by default
- Updated make_pyinstaller.sh (used to create a single Linux executable)
KoboldCPP-v1.75.2.yr0-ROCm
Updated cmake-rocm-windows.yml to remove OpenBLAS
KoboldCPP-v1.74.yr0-ROCm
Merge remote-tracking branch 'upstream/concedo'
v1.73.1.yr1-ROCm v6.2.0
KoboldCPP-ROCm v1.73.yr1
KoboldCPP-ROCm v1.73.yr1 with rocBLAS from ROCm v6.2.0 (the latest, newer than the official Windows version).
I built rocBLAS and the Tensile library files for the following GPU architectures: gfx803;gfx900;gfx1010;gfx1030;gfx1031;gfx1032;gfx1100;gfx1101;gfx1102, using the code from the ROCm 6.2.0 release.
I was able to test gfx1010 (5600 XT) and gfx1030 (6800 XT); they worked both separately and together (multi-GPU seems to require the Low VRAM setting).
- NEW: Added dual-stack (IPv6) network support. KoboldCpp now runs properly on IPv6 networks; the same instance can serve both IPv4 and IPv6 addresses automatically on the same port. This should also fix problems with resolving `localhost` on some systems. Please report any issues you face.
- NEW: Pure CLI Mode - Added `--prompt`, allowing KoboldCpp to be used entirely from the command line alone. When running with `--prompt`, all other console output is suppressed, except for that prompt's response, which is piped directly to stdout. You can control the output length with `--promptlimit`. These two flags can also be combined with `--benchmark`, allowing benchmarking with a custom prompt and returning the response. Note that this mode is intended only for quick testing and simple usage; no sampler settings are configurable.
- Changed the default benchmark prompt to prevent a stack overflow on the old BPE tokenizer.
- Pre-filter to the top 5000 token candidates before sampling; this greatly improves sampling speed on models with massive vocabularies, with negligible changes to responses.
- Moved chat completions adapter selection to Model Files tab.
- Improve GPU layer estimation by accounting for in-use VRAM.
- `--multiuser` now defaults to true. Set `--multiuser 0` to disable it.
- Updated Kobold Lite with multiple fixes and improvements.
- Merged fixes and improvements from upstream, including Minitron and MiniCPM features (note: there are some broken minitron models floating around - if stuck, try this one first!)
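The top-5000 pre-filter above is ordinary top-k truncation applied before any samplers run; with a huge vocabulary, the discarded tail carries negligible probability mass. A minimal sketch of the idea (hypothetical helper, not KoboldCpp's actual code):

```python
import heapq

def prefilter_top_candidates(logits, k=5000):
    """Keep only the k highest-logit (token_id, logit) pairs.

    Run this before the samplers, so that top-p, repetition penalties,
    etc. operate on a small candidate set instead of the full vocabulary.
    """
    if len(logits) <= k:
        return list(enumerate(logits))
    return heapq.nlargest(k, enumerate(logits), key=lambda pair: pair[1])

# Toy vocabulary of 10 logits, keeping the top 3 candidates:
cands = prefilter_top_candidates(
    [0.1, 2.0, -1.0, 3.5, 0.0, 1.2, -0.5, 2.2, 0.3, 0.9], k=3)
print(sorted(token for token, _ in cands))  # → [1, 3, 7]
```

Because `heapq.nlargest` runs in O(vocab · log k), the cost of the pre-filter itself stays small even on 150k-token vocabularies.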
Hotfix 1.73.1 - Fixed the broken DRY sampler, fixed sporadic streaming issues, and added a letterboxing mode for images in Lite. The previous v1.73 release was buggy, so you are strongly encouraged to upgrade to this patch release.
To use MiniCPM:
- Download the GGUF model and mmproj file here: https://huggingface.co/openbmb/MiniCPM-V-2_6-gguf/tree/main
- Launch KoboldCpp, loading BOTH the main model file as the model and the mmproj file as mmproj
- Upload images and talk to the model
To use, download and run koboldcpp_rocm.exe, which is a one-file PyInstaller build.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI.
Once loaded, you can connect like this (or use the full KoboldAI client):
http://localhost:5001
For more information, be sure to run the program from the command line with the `--help` flag.
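Serving IPv4 and IPv6 on the same port, as the dual-stack feature above does, typically comes down to binding an AF_INET6 socket with `IPV6_V6ONLY` cleared, so IPv4 clients arrive as mapped `::ffff:a.b.c.d` addresses. A minimal illustrative sketch (not KoboldCpp's actual code):

```python
import socket

def make_dual_stack_listener(port: int) -> socket.socket:
    """Bind one listening socket that accepts both IPv4 and IPv6 clients."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Clearing IPV6_V6ONLY lets IPv4 clients reach the same port as
    # IPv4-mapped IPv6 addresses (::ffff:a.b.c.d).
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    sock.bind(("::", port))  # "::" = every IPv6 (and mapped IPv4) interface
    sock.listen(5)
    return sock

listener = make_dual_stack_listener(0)  # port 0: let the OS pick a free port
print("listening on port", listener.getsockname()[1])
listener.close()
```

On platforms where `IPV6_V6ONLY` defaults to on (Windows, some BSDs), skipping that `setsockopt` call would silently leave IPv4 clients unable to connect.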
Discussion: KoboldCPP-ROCm v1.73.1.yr1-ROCm v6.2.0 Discussion #64
rocBLAS 4.2.0 for ROCm 6.2.0 for Windows
GPU tensile library files for gfx803;gfx900;gfx1010;gfx1030;gfx1031;gfx1032;gfx1100;gfx1101;gfx1102
and rocBLAS.dll built with ROCm 6.2.0 code
KoboldCPP-v1.72.yr0-ROCm
Merge remote-tracking branch 'upstream/concedo'
**Ignore** 1.72.yr0 Test Build for Nemo Fix + beta rocm 6.1.2 build
The ROCm 6.1.2 build only supports officially supported GPUs: gfx906, gfx1030, gfx1100, gfx1101, gfx1102
KoboldCPP-v1.71.1.yr0-ROCm
Various bug fixes and improvements
Made with ROCm 5.7.1
ROCm 6.1.2 builds coming soon
KoboldCPP-v1.71.yr0-ROCm
Update koboldcpp.py
KoboldCPP-v1.70.yr0-ROCm
koboldcpp-1.70
mom: we have ChatGPT at home edition
- Updated Kobold Lite:
- Introducing Corpo Mode: a new beginner-friendly UI theme that aims to closely emulate the ChatGPT look and feel, providing a clean, simple, and minimalistic interface. It has a limited feature set compared to other UI themes, but should feel very familiar and intuitive to new users. Now available for Instruct mode!
- Settings Menu Rework: The settings menu has also been completely overhauled into 4 distinct panels, and should feel a lot less cramped now, especially on desktop.
- Sampler Presets and Instruct Presets have been updated and modernized.
- Added support for importing character cards from aicharactercards.com
- Added copy support for code blocks
- Added support for dedicated System Tag and System Prompt (you are still encouraged to use the Memory feature instead)
- Improved accessibility, keyboard tab navigation and screen reader support
- NEW: DRY dynamic N-gram anti-repetition sampler support has been added (credits @pi6am)
- Added --unpack, a new self-extraction feature that allows KoboldCpp binary releases to be unpacked into an empty directory. This allows easy modification and access to the files and contents embedded inside the PyInstaller. Can also be used in the GUI launcher.
- Fix for a Vulkan regression in Q4_K_S mistral models when offloading to GPU (thanks @0cc4m).
- Experimental support for OpenAI tools and function calling API (credits @teddybear082)
- Added a workaround for Deepseek crashing due to unicode decoding issues.
- `--chatcompletionsadapter` can now select included pre-bundled templates by filename, e.g. Llama-3.json; the pre-bundled templates have also been updated for correctness (thanks @xzuyn).
- The default `--contextsize` is finally increased to 4096, and the default Chat Completions API output length is also increased.
- Merged fixes and improvements from upstream, including multiple Gemma fixes.
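The DRY sampler listed above penalizes tokens that would extend a sequence already present in the context, with a penalty that grows exponentially in the length of the repeated match. The real implementation (credits @pi6am) is more elaborate, with sequence breakers and efficient matching; the following is an illustrative simplification only:

```python
def dry_penalties(context, candidates, multiplier=0.8, base=1.75,
                  allowed_length=2):
    """Toy DRY-style penalty: for each candidate token, find the longest
    context suffix that, followed by that candidate, already occurred
    earlier in the context; matches at least `allowed_length` long incur
    an exponentially growing penalty."""
    penalties = {}
    for tok in candidates:
        seq = context + [tok]
        match_len = 0
        for n in range(1, len(context) + 1):
            suffix = seq[-(n + 1):]  # n context tokens plus the candidate
            if any(context[i:i + n + 1] == suffix
                   for i in range(len(context) - n)):
                match_len = n
            else:
                break
        if match_len >= allowed_length:
            penalties[tok] = multiplier * base ** (match_len - allowed_length)
        else:
            penalties[tok] = 0.0
    return penalties

# Context ends in "1 2", which already appeared followed by 3: token 3 is
# penalized for extending the repetition, token 4 is not.
print(dry_penalties([1, 2, 3, 1, 2], [3, 4]))  # → {3: 0.8, 4: 0.0}
```

The exponential growth in `base ** (match_len - allowed_length)` is what distinguishes DRY from a flat repetition penalty: short incidental echoes are barely touched, while long verbatim loops become rapidly improbable.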
To use on Windows, download and run koboldcpp_rocm.exe, OR download koboldcpp_rocm_files.zip and run `python koboldcpp.py` from Windows Terminal or CMD (additional Python pip modules may need to be installed, such as customtkinter and tk or python-tk).
To use on Linux, clone the repo or download the Source Code (tar.gz or zip) and build with `make LLAMA_HIPBLAS=1 -j4` (-j4 can be adjusted to your number of CPU threads for faster build times).
Run it from the command line with the desired launch parameters (see `--help`), or use the GUI by launching with `python koboldcpp.py` (additional Python pip modules may need to be installed, such as customtkinter and tk or python-tk).
Once loaded, you can visit the following URL or use it as the API URL for other front-ends like SillyTavern: http://localhost:5001/
For more information, be sure to run the program from command line with the --help flag.