This guide explains how to run inference with the AdaSpeech model.
- Clone the repository:

  ```sh
  git clone git@github.com:Cocii/AdaSpeech.git
  cd AdaSpeech
  ```
- Install the required Python packages:

  ```sh
  pip install -r requirements.txt
  ```
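If you prefer to keep the dependencies isolated, the install step can be run inside a virtual environment first. This is a general Python convention, not something the repository requires; a minimal sketch:

```sh
# Optional: create and activate a virtual environment,
# then install the dependencies into it.
python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
```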
Before running inference, you need:
- AdaSpeech model checkpoint
- Vocoder checkpoint (default BigVGAN)
- Configuration files:
  - `preprocess.yaml`
  - `model.yaml`
  - `train.yaml`
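As a quick pre-flight check, you can verify that the expected config files are present before launching inference. The `config/` paths below are assumptions; adjust them to wherever your checkout keeps these files:

```sh
# Warn about any missing prerequisite file; paths are illustrative.
for f in config/preprocess.yaml config/model.yaml config/train.yaml; do
    [ -f "$f" ] || echo "missing: $f"
done
```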
```sh
CUDA_VISIBLE_DEVICES=0 python inference.py \
    --language_id <LANG_ID> \
    --speaker_id <SPEAKER_ID> \
    --reference_audio <REF_AUDIO_PATH> \
    --text "$(cat test.txt)" \
    -p <PREPROCESS_CONFIG> \
    -m <MODEL_CONFIG> \
    -t <TRAIN_CONFIG> \
    --restore_step <CHECKPOINT_STEP> \
    --vocoder_checkpoint <VOCODER_PATH> \
    --vocoder_config <VOCODER_CONFIG>
```
- `language_id`: Language identifier (0 for English, 1 for Chinese)
- `speaker_id`: Target speaker identifier
- `reference_audio`: Path to reference audio file for speaker embedding
- `text`: Input text for synthesis (can be read from file)
- `restore_step`: Checkpoint step number to load
- `vocoder_checkpoint`: Path to vocoder model weights
- `vocoder_config`: Path to vocoder configuration file
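If you drive inference from a script rather than the shell, the flags above can be assembled programmatically. The sketch below is not part of the repository, and every concrete value passed in (IDs, step number, paths) is a hypothetical placeholder:

```python
# Build the argv list for inference.py using the flags documented above.
import os
import subprocess

def build_inference_cmd(language_id, speaker_id, reference_audio, text,
                        preprocess_cfg, model_cfg, train_cfg,
                        restore_step, vocoder_ckpt, vocoder_cfg):
    """Return the command as a list, ready for subprocess.run."""
    return [
        "python", "inference.py",
        "--language_id", str(language_id),
        "--speaker_id", str(speaker_id),
        "--reference_audio", reference_audio,
        "--text", text,
        "-p", preprocess_cfg,
        "-m", model_cfg,
        "-t", train_cfg,
        "--restore_step", str(restore_step),
        "--vocoder_checkpoint", vocoder_ckpt,
        "--vocoder_config", vocoder_cfg,
    ]

# All values here are illustrative examples, not shipped defaults.
cmd = build_inference_cmd(
    language_id=0,                       # 0 = English
    speaker_id=0,
    reference_audio="ref.wav",
    text="Hello world.",
    preprocess_cfg="config/preprocess.yaml",
    model_cfg="config/model.yaml",
    train_cfg="config/train.yaml",
    restore_step=100000,
    vocoder_ckpt="vocoder/generator.pt",
    vocoder_cfg="vocoder/config.json",
)
# To run on GPU 0:
# subprocess.run(cmd, env={**os.environ, "CUDA_VISIBLE_DEVICES": "0"})
```

Passing the command as a list (rather than one shell string) avoids quoting problems when the synthesis text contains spaces or special characters.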