Problems using OpenVoice with cuda and >5s source audio #234

Open
eginhard opened this issue Dec 23, 2024 Discussed in #232 · 0 comments
@eginhard
Member

Discussed in #232

Originally posted by CiobanuPaul December 23, 2024
I have just upgraded coqui-tts to 0.25.1 so that I can use the OpenVoice voice converter.
The first issue is that an exception is raised when I use "cuda"; it works only on "cpu".
The second issue is that the voice conversion output always contains only 5 seconds of content; if the source wav is longer than 5 seconds, the rest is white noise.

I am using Python 3.10.
This is part of the exception message for the first issue:

 File "/home/catalin/Documents/virtual_envs/venv/lib/python3.10/site-packages/TTS/vc/models/openvoice.py", line 288, in extract_se
    y = torch.FloatTensor(audio_ref)
TypeError: expected TensorOptions(dtype=float, device=cpu, layout=Strided, requires_grad=false (default), pinned_memory=false (default), memory_format=(nullopt)) (got TensorOptions(dtype=float, device=cuda:0, layout=Strided, requires_grad=false (default), pinned_memory=false (default), memory_format=(nullopt)))
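
The traceback appears consistent with `torch.FloatTensor(...)` being handed a tensor that already lives on the GPU: the legacy `torch.FloatTensor` constructor only builds CPU tensors. Below is a minimal sketch of a device-agnostic conversion, not the actual fix applied in coqui-tts. The helper name `to_float_tensor` and its `device` parameter are hypothetical and exist only for illustration; `torch.as_tensor` and `Tensor.to` are the only library calls used.

```python
import torch


def to_float_tensor(audio_ref, device=None):
    """Hypothetical helper: convert array-like or tensor input to float32.

    torch.as_tensor accepts lists, NumPy arrays, and tensors on any device.
    For tensor input it keeps the tensor on its current device, unlike the
    legacy torch.FloatTensor constructor, which only builds CPU tensors.
    """
    y = torch.as_tensor(audio_ref, dtype=torch.float32)
    if device is not None:
        # Optionally move the result to an explicitly requested device.
        y = y.to(device)
    return y


# Plain Python data (or a NumPy array) ends up as a float32 CPU tensor.
samples = [0.0, 0.5, -0.5]
print(to_float_tensor(samples).dtype)

# A tensor that is already on the GPU stays there, if a GPU is available.
if torch.cuda.is_available():
    gpu_samples = torch.zeros(16000, device="cuda")
    print(to_float_tensor(gpu_samples).device)
```

Until a fix is released, running the voice converter on "cpu", as the report notes, avoids the exception.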
@eginhard self-assigned this Dec 23, 2024
@eginhard added the bug (Something isn't working) label Dec 23, 2024