Releases: erew123/alltalk_tts

DeepSpeed 14.2 versions for Linux

30 May 18:52
7c7cb72

LINUX VERSION HERE (Not Windows)

Libaio Requirement

You need libaio-dev or libaio-devel (depending on your Linux flavour), otherwise DeepSpeed will fail, e.g. with an error at your terminal.

  • Debian-based systems: sudo apt install libaio-dev
  • RPM-based systems: sudo yum install libaio-devel

DeepSpeed setup/compilation

DeepSpeed is complicated at best. You have to install DeepSpeed that matches:

  • Your Python version in your Python virtual environment, e.g. 3.10, 3.11, 3.12 etc.
  • Your version of PyTorch within your Python virtual environment, e.g. 2.0.x, 2.1.x, 2.2.x, 2.3.x etc.
  • The version of CUDA that PyTorch was installed with in your Python virtual environment, e.g. 11.8 or 12.1.

If you change your version of Python, PyTorch or the CUDA version PyTorch uses within that virtual environment, you will need to uninstall DeepSpeed (pip uninstall deepspeed) and then install the correct matching version.

To understand a filename such as deepspeed-0.14.2+cu121torch2.3-cp312-cp312-manylinux_2_24_x86_64.whl:

  • deepspeed-0.14.2 is the version of DeepSpeed.
  • cu121 is the CUDA version of PyTorch; in this case cu121 means CUDA 12.1.
  • torch2.3 is the version of PyTorch that it works with.
  • cp312-cp312 is the version of Python it works with; in this case Python 3.12.
  • manylinux_2_24_x86_64 states it's a Linux version.
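
As a rough illustration of the naming scheme above, the filename can be split into its parts with a regular expression. This is only a sketch: the function name parse_wheel_name and the pattern are made up for this example, and it assumes every wheel follows exactly the layout shown above.

```python
import re

# Hypothetical pattern for the wheel naming layout described above.
WHEEL_PATTERN = re.compile(
    r"^deepspeed-(?P<deepspeed>[\d.]+)"  # DeepSpeed version, e.g. 0.14.2
    r"\+cu(?P<cuda>\d+)"                 # CUDA tag, e.g. cu121
    r"torch(?P<torch>[\d.]+)"            # PyTorch version, e.g. 2.3
    r"-cp(?P<python>\d+)-cp\d+"          # Python tag, e.g. cp312-cp312
    r"-(?P<platform>.+)\.whl$"           # platform tag, e.g. manylinux_2_24_x86_64
)


def parse_wheel_name(filename):
    """Split a DeepSpeed wheel filename into readable version strings."""
    match = WHEEL_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unrecognised wheel filename: {filename}")
    parts = match.groupdict()
    cuda = parts["cuda"]      # "121" -> "12.1"
    python = parts["python"]  # "312" -> "3.12"
    return {
        "deepspeed": parts["deepspeed"],
        "cuda": f"{cuda[:-1]}.{cuda[-1]}",
        "torch": parts["torch"],
        "python": f"{python[0]}.{python[1:]}",
        "platform": parts["platform"],
    }


if __name__ == "__main__":
    print(parse_wheel_name(
        "deepspeed-0.14.2+cu121torch2.3-cp312-cp312-manylinux_2_24_x86_64.whl"
    ))
```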

So you will start your Python virtual environment, then use something like AllTalk's diagnostics to find out what version of:

  • Python is installed
  • PyTorch is installed
  • CUDA version that PyTorch was installed with.
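
If you would rather check these three versions by hand, a short Python script run inside the virtual environment can report them. This is a minimal sketch and not AllTalk's diagnostics; the function name environment_versions is made up for this example. It reports None for PyTorch and CUDA if PyTorch is not installed, and None for CUDA on a CPU-only PyTorch build.

```python
import sys


def environment_versions():
    """Report the Python, PyTorch and PyTorch-CUDA versions of this environment."""
    versions = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch": None,
        "cuda": None,
    }
    try:
        import torch  # only available if PyTorch is installed in this environment
    except ImportError:
        return versions
    versions["torch"] = torch.__version__  # may carry a suffix such as +cu121
    versions["cuda"] = torch.version.cuda  # None on CPU-only builds
    return versions


if __name__ == "__main__":
    for name, value in environment_versions().items():
        print(f"{name}: {value}")
```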

You will then hunt through the files below and download the correct file to your folder. With your Python virtual environment still active, you will run pip install deepspeed-0.14.2+{version here}manylinux_2_24_x86_64.whl, where "version here" is the correct, matching version that you have downloaded.

To be clear, let's say your Python virtual environment is running Python 3.11.6 with PyTorch 2.2.1 and CUDA 12.1. You would want to download deepspeed-0.14.2+cu121torch2.2-cp311-cp311-manylinux_2_24_x86_64.whl
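
The same matching can be sketched in Python: given the three version numbers, build the filename you should be looking for. The function name expected_wheel_name and its default arguments are assumptions made for this example, and it assumes a wheel for your combination actually exists in the file list.

```python
def expected_wheel_name(python_version, torch_version, cuda_version,
                        deepspeed_version="0.14.2",
                        platform_tag="manylinux_2_24_x86_64"):
    """Build the wheel filename matching the given environment versions."""
    # Only major.minor matters for each component, so 3.11.6 becomes 311
    # and 2.2.1 becomes 2.2.
    py_major, py_minor = python_version.split(".")[:2]
    torch_major, torch_minor = torch_version.split(".")[:2]
    cuda_major, cuda_minor = cuda_version.split(".")[:2]
    py_tag = f"cp{py_major}{py_minor}"
    return (
        f"deepspeed-{deepspeed_version}"
        f"+cu{cuda_major}{cuda_minor}"
        f"torch{torch_major}.{torch_minor}"
        f"-{py_tag}-{py_tag}-{platform_tag}.whl"
    )


if __name__ == "__main__":
    # Python 3.11.6, PyTorch 2.2.1, CUDA 12.1
    print(expected_wheel_name("3.11.6", "2.2.1", "12.1"))
```

You can then pass the printed filename to pip install once the file has been downloaded.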

Note: You will need to install the Nvidia CUDA Development Toolkit. Version 12.1.0 has been tested and confirmed to work with PyTorch 2.2.1. Version 12.4 has been tried and found to be problematic. In Conda Python virtual environments, you can start the Python virtual environment and install this toolkit using the following command: conda install nvidia/label/cuda-12.1.0::cuda-toolkit=12.1

To be absolutely clear, the Nvidia CUDA Development Toolkit is separate from:

  • Your graphics card driver version.
  • The version of CUDA used by your graphics driver.
  • The version of PyTorch or Python on your system and their associated CUDA versions.

Think of the CUDA Development Toolkit like the engine diagnostics tools used by mechanics. These tools are necessary for the development, compilation and testing of CUDA applications (or in this case DeepSpeed). Just as a mechanic's tools are separate from the engine, car model, or the type of fuel the car uses, the CUDA Development Toolkit is separate from your graphics driver, the CUDA version your driver uses, and the versions of PyTorch or Python installed on your system. In other words, the versions do not all have to match exactly.
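
One way to confirm the toolkit is visible from inside your Python virtual environment is to look for its compiler, nvcc, on the PATH. This is a small sketch, not part of AllTalk; the function name nvcc_version_output is made up for this example, and it assumes the toolkit puts nvcc on the PATH of the active environment.

```python
import shutil
import subprocess


def nvcc_version_output():
    """Return the text printed by nvcc --version, or None if nvcc is not on PATH."""
    nvcc_path = shutil.which("nvcc")
    if nvcc_path is None:
        return None
    result = subprocess.run(
        [nvcc_path, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    output = nvcc_version_output()
    if output is None:
        print("nvcc not found: the CUDA Development Toolkit is not on PATH")
    else:
        print(output)
```

The version reported here is the toolkit version, which is separate from the CUDA version PyTorch reports.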

Also note, you will see this warning message when AllTalk starts up and DeepSpeed for Linux is installed. It is safe to ignore, as far as AllTalk is concerned.

image

AllTalk v1.9c

28 Mar 21:57
6c20f45

⚠️ This is AllTalk v1 which is now outdated. Please use AllTalk v2 ⚠️

AllTalk v2 link here

Quite a large update, in preparation for a more structured application & future possibilities.

  • TTS Generator - Various interface bugs & filtering options cleaned up.
  • TTS Generator - TTSDiff now scans generated text and TTS for errors.
  • TTS Generator - TTSSRT now creates subtitle files for video production e.g. a Youtube video.
  • Finetune - Now uses a customised tokenizer to deal with Japanese.
  • Finetune - Pre flight check and warning messages.
  • Finetune - Extra documentation and warnings.
  • Entire file structure has been re-organised to simplify management and future changes.
  • Documentation (built in and Github) has been rewritten/tidied up.
  • Requirements files have been cleaned up and simplified.
  • ATsetup has been re-written as necessary with additional options.
  • Diagnostics now performs some other checks.
  • DeepSpeed moved up to version 14.
  • Standalone Application moved to PyTorch 2.2.1.
  • Nvidia CUDA Toolkit installation is NO LONGER needed (other than to compile DeepSpeed on Linux)

Tested on Linux and Windows.

65 changed files with 10,298 additions and 300 deletions.

If you download and use the ZIP file from here, it will NOT be linked to this Github repository and so CANNOT be automatically updated with a git pull in future.

DeepSpeed v14.0 for PyTorch 2.2.1 & Python 3.11

08 Mar 22:12
9bb0d1f

Before you install DeepSpeed, it's recommended that you confirm AllTalk works without it.

This version has been built for PyTorch 2.2.1 and also Python 3.11.x

For CUDA v12.1 - WINDOWS - Download

For CUDA v11.8 - WINDOWS - Download

For CUDA v12.1 - LINUX - Download

For versions that support PyTorch 2.1.x please look at the main releases page

If you need to check your CUDA version within Text-generation-webui run cmd_windows.bat and then: python --version to get the Python version and pip show torch to get the CUDA version.

NOTE: You DO NOT need to set Text-generation-webUI's --deepspeed setting for AllTalk to be able to use DeepSpeed. These are two completely separate things and incorrectly setting that on Text-generation-webUI may cause other complications.


AllTalk v1.9

13 Jan 21:43
180adb4
  • Added Streaming endpoint API & Server Status API
  • Added SillyTavern support
  • Added ATSetup utility to help simplify Text-gen-webui and Standalone installations.
  • Updated TTS API generation endpoint to correctly name _combined files
  • Updated TTS API generation endpoint to correctly add a short uuid to timestamped files (non-narrator) to avoid dual file generation on the same tick.
  • Cleaned up some console outputs.
  • Additional documentation and documentation cleaning (along with the Github)
  • Added cutlet and unidic-lite to help with Japanese support on non-Japanese enabled computers.
  • Transformers requirements bumped to 4.37.1
  • Kobold is also now supported thanks to @illtellyoulater
  • Minor updates to Finetuning
  • Minor updates to documentation

DeepSpeed v12.7 wheel file

10 Jan 21:09
dd45008

Before you install DeepSpeed, it's recommended that you confirm AllTalk works without it.

THESE ARE AS YET UNTESTED VERSIONS - DO NOT USE THESE
PLEASE USE DeepSpeed v11.2 here

Back to the DeepSpeed install instructions

Python 3.11.x and CUDA 12.1 COMPILED FOR PyTorch 2.1.x
DeepSpeed v12.7 for CUDA 12.1 and Python 3.11.x

If you are after DeepSpeed for CUDA 11.8 and/or Python 3.10.x please see DeepSpeed v11.2 here

If you need to check your CUDA version within Text-generation-webui run cmd_windows.bat and then: python --version to get the Python version and pip show torch to get the CUDA version.

NOTE: You DO NOT need to set Text-generation-webUI's --deepspeed setting for AllTalk to be able to use DeepSpeed. These are two completely separate things and incorrectly setting that on Text-generation-webUI may cause other complications.


AllTalk 1.8d

06 Jan 20:58
4f80b39
  • Added the AllTalk TTS Generator, which is designed for creating TTS of any length from as large an amount of text as you want. You are able to individually edit/regenerate sections after all the TTS is produced and export out to 1x wav file. You can also stream TTS if you just want to play back text, or even push audio output to the terminal/command prompt where AllTalk is currently running.
  • Added a greedy option to avoid apostrophes being removed. Added accented characters for foreign languages. (Thanks to @nicobubulle)
  • Updated filtering to allow Hungarian ő and ű characters and Cyrillic characters to pass through correctly.
  • MS Word Add-in - Added a proof-of-concept MS Word add-in to stream selected text to speech from within documents. This is purely POC and not production, hence support on this will be limited.

AllTalk 1.8c

01 Jan 21:04
ba13283

Streaming audio now supported on the built in documentation page demo (thanks to @rbruels)
Documentation separated out of main code, cleaned and spelling corrected.
Narrator has had its final upgrade/improvement and passed all tests. Details here

AllTalk 1.8b

30 Dec 14:56
b32df6a

Tidied up finetune interface.
Added multiple buttons to help with finetune final stages.
Added a compaction routine into finetune to compact legacy finetuned models.
New API JSON return value output_cache_url: the HTTP location for accessing the generated WAV file as a pushed download, with CORS support.
Updated documentation.

NOTE: I probably will be doing more work on the Narrator function. So if you are using that, you may want to git pull an update to get the latest.

AllTalk 1.8a

29 Dec 20:40
fa82870
  • Adds 3x new API endpoints supporting Ready status, Voices (list all available voices), Preview voice (generate hard coded short voice preview).
  • Playing of generated TTS via the API is now supported in the terminal/prompt where the script is running from.
  • All documentation relevant to the above is updated.
  • Adds a 4th model loader "XTTSv2 FT" for loading of finetuned models.

The models have to be stored in /models/trainedmodel/ (which is the default location the finetune process will move a model to). On start-up, if a model is detected there, a 4th loader becomes available.

AllTalk 1.8

28 Dec 18:45
1331824

Finetuning has been made simpler at the final step (3x buttons now)
A compact.py script has been created for people who already have finetuned models.
Narrator function has been improved with its splitting, though there are still minor outlier situations to resolve.