Releases: raccoonML/Real-Time-Voice-Cloning

RTVC upstream 2/3/22 (English)

03 Feb 10:02
1cd2565
Pre-release

Note: Release RTVC-7 is recommended over this one. It uses the same pretrained models.

Description

This is a collection of the files currently in the original CorentinJ/Real-Time-Voice-Cloning repo, which my rtvc_upstream branch mirrors. I provide it for users who want a convenient way to set up the CorentinJ version of the RTVC code.

File information

File location                         Filesize
-----------------------------------------------
saved_models/default/encoder.pt         17 MB
saved_models/default/synthesizer.pt    370 MB
saved_models/default/vocoder.pt         53 MB

Model information

The pretrained model files come from Corentin's Google Drive link and are identical to those provided in RTVC-7. The SHA-1 checksums of the files, as reported by sha1sum, are:

d44d60cbb47362c3c99216576ddc9796aad69366  encoder.pt
733b0c983f8a1cdaba0144d248f085d158c1775f  synthesizer.pt
2ec56c93f219da3229ee40950c979e689aaa58d8  vocoder.pt
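
For users without sha1sum (for example on Windows), the checksums above can also be checked from Python. The following is a minimal sketch, not part of the release: it assumes the files sit in saved_models/default as listed under File information, and the function names are made up for illustration.

```python
import hashlib
from pathlib import Path

# SHA-1 values copied from the list above.
EXPECTED_SHA1 = {
    "encoder.pt": "d44d60cbb47362c3c99216576ddc9796aad69366",
    "synthesizer.pt": "733b0c983f8a1cdaba0144d248f085d158c1775f",
    "vocoder.pt": "2ec56c93f219da3229ee40950c979e689aaa58d8",
}


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_models(model_dir="saved_models/default"):
    """Return {filename: True/False}; False means missing or checksum mismatch."""
    results = {}
    for name, expected in EXPECTED_SHA1.items():
        path = Path(model_dir) / name
        results[name] = path.is_file() and sha1_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_models().items():
        print(f"{name}: {'OK' if ok else 'MISSING OR MISMATCH'}")
```

Run it from the root folder of the repo, or pass a different model_dir if the files are stored elsewhere.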

Audio samples

https://raccoonML.github.io/bluefish_experiments/RTVC-7.html

RTVC Swedish-1 (tensorflow)

19 Jan 11:18
Pre-release

This Swedish pretrained model originates from @ViktorAlm. I have assembled the files necessary to run the models.

Setup instructions

  1. Install Python 3.7. This version is required for tensorflow 1.15 to work. Following this guide is highly recommended: Installing Python 3.7.9 on Windows.
  2. Download and extract RTVC-Swedish.zip.
  3. Open a Windows command prompt and set up a Python virtual environment. GUIDE: Python virtual environments in Windows
cd C:\path\to\RTVC\files
python -m venv venv
venv\Scripts\activate.bat
  4. Install dependencies.
pip install --upgrade pip
pip install torch -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt
pip install webrtcvad-wheels
  5. Start the toolbox.
python demo_toolbox.py
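
Because the first step depends on a specific Python version, it can help to confirm which interpreter the virtual environment is actually using before installing dependencies. This is a small sketch under that assumption; the script and the function name is_supported_python are hypothetical and not part of the release.

```python
import sys

# The setup instructions call for Python 3.7 (needed for tensorflow 1.15).
REQUIRED = (3, 7)


def is_supported_python(version_info=sys.version_info):
    """Return True when the major.minor version matches the required one."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    found = f"{sys.version_info.major}.{sys.version_info.minor}"
    if is_supported_python():
        print(f"Python {found} detected.")
    else:
        print(f"Python {found} detected; the setup instructions call for 3.7.")
```

Run it with the venv activated so that it reports the interpreter the toolbox will use.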

File information

RTVC-Swedish.zip contains all repo files, pretrained models and a standalone ffmpeg.exe to help load mp3 files.

RTVC Spanish-1

21 Dec 13:34
Pre-release

This is a Spanish model. It is experimental and there are a number of issues with it.

Known issues

The model suffers from an issue first pointed out by bluefish in CorentinJ/Real-Time-Voice-Cloning#879 (comment):

If there are problems with the synthesizer generating extra sounds, the stop threshold can be lowered to help prevent this. A threshold of 0.00001 seems to work well.

The stop threshold is left at the default of 0.5 because I was unable to find a satisfactory value. If it is too low, generation ends prematurely. If it is too high, the model produces extra sounds before stopping.
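
The trade-off can be illustrated with a toy example. This is not the synthesizer code from the repo: it only assumes that the decoder emits one stop value per step and that generation ends at the first step whose value exceeds the threshold. The stop values and the function name frames_generated are made up for illustration.

```python
def frames_generated(stop_values, threshold):
    """Return the number of decoder steps run before the stop condition fires.

    Generation stops at the first step whose stop value exceeds the threshold.
    If no step exceeds it, every step is generated.
    """
    for step, value in enumerate(stop_values, start=1):
        if value > threshold:
            return step
    return len(stop_values)


# Made-up stop values that rise as the utterance comes to an end.
stop_values = [0.0001, 0.002, 0.01, 0.05, 0.3, 0.7, 0.95]

for threshold in (0.01, 0.5, 0.9):
    steps = frames_generated(stop_values, threshold)
    print(f"threshold={threshold}: stops after {steps} steps")
```

With these values a lower threshold stops earlier and a higher threshold stops later, which is why a single value may not suit every utterance.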

Model information

The source data for training this model is Multilingual LibriSpeech (MLS) Spanish. It was trained for 278k steps at a batch size of 26 using a reduction factor r=5. The speaker encoder and vocoder are the same as in the RTVC-7 release (trained on English).

File information

  • RTVC_Spanish.zip contains all repo files, pretrained models and a standalone ffmpeg.exe to help load mp3 files. For the easiest setup experience on Windows, follow these instructions.
  • pretrained_spanish.zip includes only the pretrained models. Copy the files to the following locations, relative to the root folder of the repo:
File location                                        Filesize
--------------------------------------------------------------
encoder/saved_models/pretrained.pt                      17 MB
synthesizer/saved_models/pretrained/pretrained.pt      134 MB
vocoder/saved_models/pretrained/pretrained.pt           53 MB

RTVC-7 (English)

15 Dec 12:34
4a4952f

File information

  • RTVC_Windows.zip contains all repo files, pretrained models and a standalone ffmpeg.exe to help load mp3 files. For the easiest setup experience on Windows, follow these instructions.
  • pretrained_models.zip includes only the pretrained models. Copy the files to the following locations, relative to the root folder of the repo:
File location                                        Filesize
--------------------------------------------------------------
encoder/saved_models/pretrained.pt                      17 MB
synthesizer/saved_models/pretrained/pretrained.pt      370 MB
vocoder/saved_models/pretrained/pretrained.pt           53 MB
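
After copying the files, a quick check that each one landed in the right place can save a confusing startup error. The following is a minimal sketch, assuming it is run from the root folder of the repo; the paths come from the table above, while the function name check_model_files is hypothetical.

```python
from pathlib import Path

# Relative paths and approximate sizes in MB, taken from the table above.
EXPECTED_FILES = {
    "encoder/saved_models/pretrained.pt": 17,
    "synthesizer/saved_models/pretrained/pretrained.pt": 370,
    "vocoder/saved_models/pretrained/pretrained.pt": 53,
}


def check_model_files(repo_root=".", expected=EXPECTED_FILES):
    """Return {relative path: size in MB, or None if the file is missing}."""
    report = {}
    for rel_path in expected:
        path = Path(repo_root) / rel_path
        report[rel_path] = round(path.stat().st_size / 1e6) if path.is_file() else None
    return report


if __name__ == "__main__":
    for rel_path, size_mb in check_model_files().items():
        wanted = EXPECTED_FILES[rel_path]
        status = "MISSING" if size_mb is None else f"{size_mb} MB"
        print(f"{rel_path}: {status} (table lists about {wanted} MB)")
```

The sizes are only printed for comparison, since the table values are rounded. The same function can be reused for other releases by passing a different expected mapping.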

Audio samples

https://raccoonML.github.io/bluefish_experiments/RTVC-7.html