A simple extension for the text-generation-webui by oobabooga that uses Bark for audio output.
This repo is not being actively maintained, at least for a while. If you want a version that works and fixes some issues with my code, have a look at RandoInternetPreson's fork.
Assuming you already have the webui set up:
- Activate the conda environment with the cmd_xxx.bat script or by running conda activate textgen
- Enter the text-generation-webui/extensions/ directory and clone this repository
cd text-generation-webui/extensions/
git clone https://github.com/minemo/text-generation-webui-barktts bark_tts/
- Install the requirements (run this from the text-generation-webui root directory, since the path below is relative to it)
pip install -r extensions/bark_tts/requirements.txt
- Add --extensions bark_tts to your startup script, or enable the extension through the Interface Mode tab in the webui
The full version of Bark requires around 12 GB of memory to hold everything on the GPU at the same time. However, even smaller cards down to ~2 GB work with some additional settings. For this extension, open extensions/bark_tts/.env and set USE_SMALL_MODELS and USE_CPU to true:
# Whether to use small models
USE_SMALL_MODELS=true
# Whether to use CPU
USE_CPU=true
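
To make the format of these settings concrete, here is a minimal Python sketch of how KEY=value flags like the ones above could be read into booleans. It is not the extension's actual loader: the parse_env_flags helper, the TRUE_VALUES set, and the accepted spellings of "true" are assumptions made for illustration only.

```python
from typing import Dict

# Assumption: these spellings count as "enabled"; the extension may differ.
TRUE_VALUES = {"1", "true", "yes", "on"}


def parse_env_flags(text: str) -> Dict[str, bool]:
    """Parse KEY=value lines into booleans, skipping blanks and # comments."""
    flags: Dict[str, bool] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Ignore empty lines, comment lines, and anything without an '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes and compare case-insensitively.
        flags[key.strip()] = value.strip().strip("\"'").lower() in TRUE_VALUES
    return flags


# Same content as the .env example above.
EXAMPLE_ENV = """\
# Whether to use small models
USE_SMALL_MODELS=true
# Whether to use CPU
USE_CPU=true
"""

flags = parse_env_flags(EXAMPLE_ENV)
print(flags)  # {'USE_SMALL_MODELS': True, 'USE_CPU': True}
```

Setting either value to false (or removing the line) in this sketch simply yields False or a missing key; how the extension itself treats missing keys is not specified here.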