Implementation of VITS-2 #1508
Comments
Can you try The error shows you need to specify the dim argument for torch.min(), though your code looks correct to me. |
Same issue as coqui-ai/TTS#2555. It comes from a bad data file that doesn't align properly. |
I suggest that you use Otherwise, it may be difficult, if not impossible, to deploy the trained model with C++. You can find pre-built wheels for Linux and Windows at Do you have any code to share about using piper-phonemize to convert text to tokens? |
@csukuangfj Let me try |
Ok, but we are switching to piper-phonemize for converting text to tokens. Hope that @yaozengwei can push the new tokenizer soon. |
I just uploaded the code here #1511. |
@csukuangfj Now I am getting:
I will try with the new tokenizer to see if it fixes the issue |
@yaozengwei |
Seems the tensor |
>>> import torch
>>> a = torch.empty((0,))
>>> torch.min(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: min(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
An empty tensor will indeed throw the same error. |
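One way to make the failure easier to track down is to check for an empty tensor before reducing it. The snippet below is only a sketch: `safe_min` and its `default` argument are hypothetical names introduced for illustration and are not part of the recipe. Returning a default value hides the symptom, so in a real training loop it is probably better to log and skip the offending utterance.

```python
import torch


def safe_min(x: torch.Tensor, default: float = 0.0) -> torch.Tensor:
    """Return the minimum of x, or `default` when x has no elements.

    Hypothetical helper: torch.min(x) without `dim` raises a RuntimeError
    when x.numel() == 0, so the empty case is handled explicitly.
    """
    if x.numel() == 0:
        # Nothing to reduce over; fall back to the caller-supplied value.
        return torch.tensor(default, dtype=x.dtype, device=x.device)
    return torch.min(x)


empty = torch.empty((0,))
print(safe_min(empty))                          # falls back to the default
print(safe_min(torch.tensor([3.0, 1.0, 2.0])))  # regular reduction
```

If the empty tensor comes from a misaligned data file, printing the utterance ID at the point where `x.numel() == 0` should point to the bad sample.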
@csukuangfj I might have some good news but it needs a bit more testing. I will let you know next week |
Unrelated to VITS-2 (please tell me if you prefer that I open a proper issue), it seems that for the VITS recipes, you are using spectrogram which is using |
Hmm, I think we didn't choose this setup on purpose. @yaozengwei, am I right? |
We just follow the VITS paper (https://arxiv.org/pdf/2106.06103.pdf), which uses the linear spectrogram as input to the posterior encoder (Sec 2.1.3 and Fig. 1) and mel-scale spectrograms to compute the reconstruction loss (Sec 2.1.2). |
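To make the distinction concrete, here is a rough sketch of the two features in plain PyTorch. It is not the code from the recipe: the function names, the STFT settings (`n_fft=1024`, `hop=256`), the 80 mel bins, the 22050 Hz sampling rate and the HTK-style filterbank are all assumptions chosen for illustration. The linear spectrogram is the STFT magnitude, and the mel spectrogram is obtained by projecting it onto a triangular mel filterbank.

```python
import math

import torch


def linear_spectrogram(wav: torch.Tensor, n_fft: int = 1024, hop: int = 256) -> torch.Tensor:
    """Magnitude STFT of a 1-D waveform, shape (n_fft // 2 + 1, frames)."""
    window = torch.hann_window(n_fft)
    stft = torch.stft(wav, n_fft=n_fft, hop_length=hop, window=window, return_complex=True)
    return stft.abs()


def mel_filterbank(n_fft: int, n_mels: int, sample_rate: int) -> torch.Tensor:
    """Triangular HTK-style mel filters, shape (n_mels, n_fft // 2 + 1)."""
    def hz_to_mel(f: float) -> float:
        return 2595.0 * math.log10(1.0 + f / 700.0)

    n_freqs = n_fft // 2 + 1
    freqs = torch.linspace(0.0, sample_rate / 2, n_freqs)
    # n_mels + 2 points equally spaced on the mel scale, converted back to Hz.
    mel_pts = torch.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2), n_mels + 2)
    hz_pts = 700.0 * (10.0 ** (mel_pts / 2595.0) - 1.0)
    left, center, right = hz_pts[:-2, None], hz_pts[1:-1, None], hz_pts[2:, None]
    rising = (freqs[None, :] - left) / (center - left)
    falling = (right - freqs[None, :]) / (right - center)
    # Each filter is the positive part of the smaller of the two slopes.
    return torch.clamp(torch.minimum(rising, falling), min=0.0)


wav = torch.randn(22050)                         # one second of noise as a stand-in
linear = linear_spectrogram(wav)                 # would feed the posterior encoder
mel = mel_filterbank(1024, 80, 22050) @ linear   # would be used for the reconstruction loss
log_mel = torch.log(torch.clamp(mel, min=1e-5))
print(linear.shape, log_mel.shape)
```

In this sketch the reconstruction loss would compare `log_mel` of the target with the same feature computed from the generated waveform, for example with `torch.nn.functional.l1_loss`.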
@yaozengwei Yes my bad, I misunderstood part of the code |
Hello, I am trying to implement VITS2 but I am getting the following error:
Do you have an idea where it might come from? I know it is difficult to tell without code; I will open a PR with the implementation later this week. Thank you