[TTS] Update VITS to support VITS and its voice cloning training on AIShell-3 #2268
PR types
New features | Bug fixes
PR changes
APIs | Docs
Describe
New features:
Support training VITS and its voice cloning variant (VITS-VC) on AISHELL-3.
Bug fixes:
Fixed a tensor construction bug in the VITS inference code:
feats_lengths = paddle.to_tensor([paddle.shape(feats)[2]])
-> feats_lengths = paddle.to_tensor(paddle.shape(feats)[2])
Extra:
I added docs for training VITS and VITS-VC on the AISHELL-3 dataset, but left a
TODO
label for the pretrained models. To verify that training runs normally, I trained the two new examples, with minor modifications, on a single AI Studio V100 16GB card with batch_size=24
for 25000 steps. I hope the team can release official pretrained models trained on 4 cards in the future.
Here are some outputs of my 25000 step models:
test_vits.zip
test_e2e_vits.zip
test_vits_vc.zip
vc_syn_vits_vc_src_text.zip
vc_syn_vits_vc_src_audio.zip