AllTalk V2 QuickStart Guide
To start using AllTalk, you'll use the various `start_xxx` files:
- `start_alltalk` - Launches the main AllTalk TTS application.
- `start_environment` - Activates the AllTalk Python environment.
- `start_finetune` - Starts the Coqui XTTS finetuning process.
- `start_diagnostics` - Generates a diagnostics file for troubleshooting.
After running `start_alltalk`, the application displays key details in the console, including the GitHub update status, the API address that external applications can use to talk to AllTalk, and the Gradio links that open the main AllTalk web user interface.
- GitHub Updated: Shows the last time an update was made to AllTalk on GitHub.
- Pull new changes by running `git pull` in the `alltalk_tts` folder, then run `atsetup` to update the installation requirements.
- API Address: `127.0.0.1:7851` - the API endpoint for TTS calls and a basic web interface.
- Gradio Interface: The main web interface for AllTalk.
- Light Mode: `127.0.0.1:7852`
- Dark Mode: `127.0.0.1:7852?__theme=dark`
💡 Tip: Open the Gradio links by holding `CTRL` and then left-clicking with your mouse.
💡 Tip: Errors and issues are displayed in the terminal/console. A Known Errors page is available on the Wiki.
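As a rough illustration of the addresses listed above, the following Python sketch builds the three URLs and checks whether the two ports accept TCP connections. It is not part of AllTalk: the `http://` scheme and the helper names `alltalk_urls` and `port_is_open` are assumptions made for this example, and only the host and port numbers come from this guide.

```python
import socket

API_PORT = 7851     # API endpoint and basic web interface (from this guide)
GRADIO_PORT = 7852  # main Gradio web interface (from this guide)


def alltalk_urls(host: str = "127.0.0.1") -> dict[str, str]:
    """Build the three addresses for a given host (http:// is assumed)."""
    return {
        "api": f"http://{host}:{API_PORT}",
        "gradio_light": f"http://{host}:{GRADIO_PORT}",
        "gradio_dark": f"http://{host}:{GRADIO_PORT}?__theme=dark",
    }


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, url in alltalk_urls().items():
        print(f"{name}: {url}")
    print("API port reachable:", port_is_open("127.0.0.1", API_PORT))
    print("Gradio port reachable:", port_is_open("127.0.0.1", GRADIO_PORT))
```

If either port is reported as unreachable, `start_alltalk` is probably not running yet, or the console output shows an error.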
There are multiple areas to AllTalk's Gradio web user interface. The primary areas of interest are:
- Generation Tab - Here you can generate TTS and choose which TTS engine is loaded, along with the model it is using.
- Some TTS Engines do not have multiple model files, as their actual voice files are the models.
- Global Settings - In here you can configure/manage many features that are core to AllTalk.
- Some features/settings are only for advanced users or specific use cases.
- You can tweak lots of settings to change the behaviour of AllTalk.
- If you wish to use RVC voices, you need to enable RVC in the `RVC Settings` tab.
- TTS Engine Settings - In here you can make changes to each TTS engine, download its model files & find help about that TTS engine.
- Typically you will want to download and set up your chosen TTS engine when you have freshly installed AllTalk.
- If you wish to install your own models/voices for a TTS engine, you can read that engine's documentation here.
- An overview page for each TTS engine can also be found here, along with links to the Developer of that TTS engine.
- Help Accordions - Throughout the AllTalk interface are expandable help accordions with information about the page you are on and its settings.
- Some help/information isn't in the interface but on the GitHub Wiki which you are on right now.
- A Known errors and issues list is maintained on the GitHub Wiki here
The Generation tab is where you control which TTS engine AllTalk has loaded for generating TTS, and where you can generate basic TTS.
- Change TTS Engine: To switch between different TTS engines, go to "Generate TTS" > "Generate" tab.
- Swap TTS Engine: Use the "Swap TTS Engine" button to select different TTS engines (e.g. XTTS, Piper, VITS).
- Load Different Model: Click "Load Different Model" to change the model for the chosen TTS engine
- Advanced Engine/Model settings: This is where you can control/change other settings like TTS speed, language etc.
- The settings here will change depending on what the currently loaded TTS engine supports. Some features may be greyed out.
💡 Tip: Click the Refresh Server Settings button to update all the voice lists etc.
💡 Tip: To download models for each TTS engine and manage each engine's settings, go to the TTS Engine Settings tab.
💡 Tip: Voice cloning TTS engines store the wav/mp3 files used for cloning in the `alltalk_tts/voices/` folder.
💡 Tip: RVC will not be enabled until you enable it in the `Global Settings > RVC Settings` area.
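To see which cloning samples are present in the voices folder mentioned in the tip above, a small sketch like the following can list them. It is an illustration only, not AllTalk code: the helper name `list_voice_samples` is hypothetical, and the example assumes it is run from the directory that contains `alltalk_tts/`.

```python
from pathlib import Path

AUDIO_SUFFIXES = {".wav", ".mp3"}  # file types named in the tip above


def list_voice_samples(voices_dir: Path) -> list[str]:
    """Return sorted names of wav/mp3 files directly inside voices_dir."""
    if not voices_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in voices_dir.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES
    )


if __name__ == "__main__":
    # Assumption: the current directory contains the alltalk_tts folder.
    for name in list_voice_samples(Path("alltalk_tts") / "voices"):
        print(name)
```

The suffix check is case-insensitive, so `sample.WAV` is listed alongside `sample.wav`. Files in subfolders are not included.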
Adjust AllTalk's central default behaviour and settings. Examples include:
- Audio Transcoding: Convert output audio to different file formats as TTS is generated, e.g. MP3, Ogg, FLAC.
- Delete Old WAVs: Automatically delete old generated TTS files on start-up.
- Disk Space Use: Find out how much disk space is being used and where it is being used.
- RVC Pipeline: Enable RVC and set its default settings in the "RVC Settings" tab.
- Enabling RVC will download a few model files that it needs and also set up the `rvc_voices` folder where you can place RVC voice files for use.
⚠️ Warning: Some features in the Global Settings may be for advanced uses or special cases. Whilst you cannot damage anything, you can affect/impact how AllTalk behaves. As such if you are uncertain what you are doing, please read the help for each section.
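The Disk Space Use feature reports this information inside the interface. For readers who want a comparable figure from outside AllTalk, the sketch below totals file sizes per top-level subfolder. It is an independent approximation and not how AllTalk computes its numbers. The function names are hypothetical, and the `alltalk_tts` path assumes the script is run from its parent directory.

```python
from pathlib import Path


def folder_size_bytes(folder: Path) -> int:
    """Sum the sizes of all regular files below folder (recursively)."""
    return sum(p.stat().st_size for p in folder.rglob("*") if p.is_file())


def usage_by_subfolder(root: Path) -> dict[str, int]:
    """Map each immediate subfolder of root to its total size in bytes."""
    if not root.is_dir():
        return {}
    return {
        child.name: folder_size_bytes(child)
        for child in sorted(root.iterdir())
        if child.is_dir()
    }


if __name__ == "__main__":
    # Assumption: the current directory contains the alltalk_tts folder.
    usage = usage_by_subfolder(Path("alltalk_tts"))
    for name, size in sorted(usage.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<24} {size / 1_048_576:10.1f} MiB")
```

Files that sit directly in the root folder are not counted, since the sketch only reports per-subfolder totals.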
The TTS Engine Settings area is where you set the custom settings for each TTS engine that AllTalk works with. You can also download models/voices (some engines use voice cloning models and others use individual voice models) and find out about each TTS engine and its settings. If you wish to use the OpenAI-compatible TTS endpoint, you can map the voices that OpenAI's API uses to the voices that the underlying TTS engine will use.
- Engine Information: Detailed descriptions of each engine (F5-TTS, Piper, XTTS, Parler, etc.) and links to developer sites.
- Models/Voices Download: Download models or voices specific to the chosen TTS engine.
- Default Settings: Set default parameters, including temperature, pitch, and repetition.
- Engine Help: Instructions on using each engine, managing models, and troubleshooting.
Model | DeepSpeed | Pitch | Speed | RepPen | MultiLang | Streaming | Low VRAM | Temp | Multi Model | Notes |
---|---|---|---|---|---|---|---|---|---|---|
F5-TTS | No | No | Yes | No | *Yes | No | Yes | No | Yes | * |
Parler-TTS | No | No | No | No | No | No | Yes | No | Yes | ** |
Piper | No | No | Yes | No | *No | No | No | No | Yes | *** |
Coqui VITS | No | No | No | No | *No | No | Yes | No | Yes | *** |
Coqui XTTS | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | **** |
- F5-TTS: Voice Cloning from an audio sample. Supports Chinese and English only.
- Coqui XTTS: Voice Cloning from an audio sample. Multi-language.
- Parler-TTS: Voices created by written text instructions. English language only.
- Piper TTS: Individual single voice model files. Multi-language (depends on the voice model file).
- Coqui VITS: Individual single voice model files. Multi-language (depends on the voice model file).
💡 Tip: Information on each TTS engine is available in the Gradio Interface for each model.
💡 Tip: For more detailed information, links to the TTS engine developers' websites are on the chart above.
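When choosing an engine, the feature chart above can be treated as a simple lookup. The sketch below copies the Yes/No values from the chart into a Python dictionary and filters engines by required features. The asterisk qualifiers on some MultiLang entries are dropped here, so check the notes for those engines. The names `CAPABILITIES` and `engines_with` are illustrative and do not correspond to anything in AllTalk.

```python
FEATURES = (
    "DeepSpeed", "Pitch", "Speed", "RepPen", "MultiLang",
    "Streaming", "Low VRAM", "Temp", "Multi Model",
)

# One letter per feature column, in the order above (Y = Yes, N = No).
# Values are copied from the chart; "*Yes"/"*No" are stored without the asterisk.
ROWS = {
    "F5-TTS":     "NNYNYNYNY",
    "Parler-TTS": "NNNNNNYNY",
    "Piper":      "NNYNNNNNY",
    "Coqui VITS": "NNNNNNYNY",
    "Coqui XTTS": "YNYYYYYYY",
}

CAPABILITIES = {
    engine: {feature: flag == "Y" for feature, flag in zip(FEATURES, flags)}
    for engine, flags in ROWS.items()
}


def engines_with(*features: str) -> list[str]:
    """Return the engines that support every requested feature."""
    return [
        engine
        for engine, caps in CAPABILITIES.items()
        if all(caps[feature] for feature in features)
    ]


if __name__ == "__main__":
    print("Streaming:", engines_with("Streaming"))
    print("Speed control:", engines_with("Speed"))
    print("Low VRAM + MultiLang:", engines_with("Low VRAM", "MultiLang"))
```

Passing an unknown feature name raises a `KeyError`, which is a deliberate choice so that typos are caught and do not silently return an empty list.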
AllTalk organizes files in the following structure:
alltalk_tts/
├── .GitHub/ # Git's version management tracking folder
├── alltalk_environment/ # AllTalk's Python environment folder
├── finetune/ # Coqui XTTS finetuning dataset files
├── models/ # Engine model files are stored in here
│ ├── f5tts/ # F5-TTS's model files/folders
│ ├── piper/ # Piper TTS's voice files/folders
│ ├── parler/ # Parler's model files/folders
│ ├── rvc_base/ # RVC's core model files
│ ├── rvc_voices/ # RVC's voice models (where you can place them)
│ ├── xtts/ # Coqui XTTS's model files/folders
│ ├── vits/ # Coqui VITS's voice files/folders
│ └── etc.../
├── system/
│ ├── espeak-ng/ # Windows installer for espeak-ng
│ ├── gradio_pages/
│ ├── requirements/ # Requirement files
│ ├── TGWUI Extension/ # TGWUI remote extension
│ └── tts_engines/ # Individual TTS engine's core code
│ ├── tts_engines.json # TTS engine configuration file
│ ├── new_engines.json # New TTS engine configuration file
│ ├── f5tts/
│ ├── parler/
│ ├── piper/
│ ├── rvc/
│ ├── template-tts-engine/ # Template code for adding a new TTS engine
│ ├── vits/
│ └── xtts/
├── voices/ # Audio samples for voice cloning engines are stored in here.
├── outputs/ # TTS output audio files
├── confignew.json # AllTalk's central configuration file
├── atsetup.bat # Windows setup file
├── atsetup.sh # Linux setup file
├── Other Files...
├── script.py # Main start-up script
└── tts_server.py # Engine management script
- Auto-Delete WAVs: Set in Global Settings; controls automatic deletion of old output files.
For detailed help, refer to the relevant tabs in the Gradio interface or the AllTalk Wiki.