Odd result from current version of the model #9

Closed
hexylena opened this issue Dec 15, 2020 · 7 comments

@hexylena

Hey, I'm not sure whether this is the best place to report this, or whether it's a bug in the Docker version or in the upstream models, but I figured I'd ask you.

I have a pretty simple input:

Some important terms you should know.

This generates a very strange result: the sentence, followed by the last word repeated endlessly until it runs out of processing time.

I ran a couple of tests with the latest available container (7118de96680b) and one from 2 days ago (ae5bf234ed9b); both generate this result:

$ curl --silent -G --output - --data-urlencode 'text=Some important terms you should know.' 'http://localhost:5002/api/tts' > out
$ ffprobe -show_streams out 2>&1 | grep duration
duration_ts=754944
duration=34.237823
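The curl and ffprobe check above can also be scripted. The following is a minimal sketch, assuming the `/api/tts` endpoint returns a complete WAV file (the response format is an assumption, not something stated in this thread). The function names `tts_url`, `wav_duration_seconds`, and `synthesize_duration` are hypothetical helpers introduced for illustration.

```python
import io
import urllib.parse
import urllib.request
import wave


def tts_url(text, base="http://localhost:5002"):
    """Build the GET URL for the /api/tts endpoint, URL-encoding the text."""
    return base + "/api/tts?" + urllib.parse.urlencode({"text": text})


def wav_duration_seconds(wav_bytes):
    """Return the playback length of an in-memory WAV file in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def synthesize_duration(text, base="http://localhost:5002"):
    """Request synthesis from a running server and return the audio duration.

    Assumes the server answers with WAV data; adjust if it does not.
    """
    with urllib.request.urlopen(tts_url(text, base)) as response:
        return wav_duration_seconds(response.read())
```

With a container listening on port 5002, calling `synthesize_duration("Some important terms you should know.")` should report roughly the same duration that ffprobe prints.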

The server also logs an unexpectedly high run-time:

[INFO] Synthesizing (37 char(s))...
[INFO] initializing backend espeak-1.48.03
 > Run-time: 9.952790260314941
 > Real-time factor: 0.2906955861025097
 > Time per step: 1.3183479293013637e-05
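As a sanity check, the logged numbers are consistent with each other if the real-time factor is read as run-time divided by audio duration. That reading is an assumption inferred from the values above, not taken from the server's source. The same arithmetic suggests the ffprobe time base corresponds to a 22050 Hz sample rate.

```python
# Values copied from the ffprobe output and the server log above.
run_time = 9.952790260314941   # seconds spent synthesizing
audio_duration = 34.237823     # seconds of audio produced
duration_ts = 754944           # duration in stream time-base units

# Assumed definition: seconds of compute per second of audio.
real_time_factor = run_time / audio_duration

# If one time-base unit is one sample, this is the sample rate.
sample_rate = duration_ts / audio_duration

print(f"real-time factor ~ {real_time_factor:.4f}")
print(f"sample rate ~ {sample_rate:.0f} Hz")
```

So the synthesis itself ran faster than real time; the run-time is high only because the model produced about 34 seconds of audio for a 37-character sentence.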

However, an old version of the container that I happened to have (483c72abc233), ~5 months old, works as expected.

$ curl --silent -G --output - --data-urlencode 'text=Some important terms you should know.' 'http://localhost:5004/api/tts' > out
$ ffprobe -show_streams out 2>&1 | grep duration
duration_ts=49920
duration=2.263946

Any ideas what could be causing this? Removing the . from the end of the sentence seems to cause it to work properly. Adding a . to the end likewise returns a short time.
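When checking many sentences or punctuation variants, runaway output can be flagged automatically by comparing audio duration to input length. This is a rough sketch: the helper names and the 0.25 seconds-per-character limit are hypothetical choices, picked only because they separate the two durations reported above.

```python
def seconds_per_char(text, duration):
    """Audio seconds produced per input character."""
    return duration / max(len(text), 1)


def find_runaway(results, limit=0.25):
    """Return labels of results whose audio is suspiciously long.

    results: list of (label, text, duration_seconds) tuples.
    limit: hypothetical threshold in seconds of audio per character.
    """
    return [
        label
        for label, text, duration in results
        if seconds_per_char(text, duration) > limit
    ]


sentence = "Some important terms you should know."
measured = [
    ("recent container", sentence, 34.237823),  # duration from ffprobe above
    ("483c72abc233", sentence, 2.263946),       # duration from ffprobe above
]
print(find_runaway(measured))  # → ['recent container']
```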

@synesthesiam
Owner

Wow, this is bizarre. Have you listened to the audio? She repeats "know know know..." over and over. But if I change the sentence simply to "Some important terms you should know now." it works fine.

This seems like a problem with the upstream model, specifically the stopnet (my guess). Maybe @erogol has some clue what's happening here.

Here is an audio example of the pre-trained LJSpeech model with the sentence "Some important terms you should know." -- https://drive.google.com/file/d/1gQMMU-UUlhL5fRW76UcGL2CdUbbdMdWd/view?usp=sharing

@hexylena
Author

Right? So funny and unsettling, and among my ~300 other sentences none triggered that behaviour.

Thanks for knowing who to ping! Happy to recreate this issue somewhere else if it's more helpful for the mozilla team!

@synesthesiam
Owner

It would be helpful to follow up in their forum: https://discourse.mozilla.org/c/tts/

Maybe there's some extra setting I should be using somewhere.

@hexylena
Author

Sure, can do. I'll see if I can write up a minimal example to help them and post it there.

Additionally, I just wanted to say a huge thank you for writing such an easy-to-use container for this. It's making a difference in a project I'm working on where we generate videos from slides with narration. Before, only those with an AWS account could test their scripts. Now, thanks to this container (Mozilla TTS would be too complicated for most people to configure), it's open to everyone, and it's such an improvement!

@synesthesiam reopened this Dec 16, 2020

@synesthesiam
Owner

I'm honored to be able to contribute to your project :)

Is there anything I could add to the container to help more?

@hexylena
Author

Honestly no, it was so easy to add support for this as a backend.

An AWS Polly-compatible API would've made it easier, but after 10 minutes of work to add a --backend [aws|mozilla] flag it wasn't necessary, so I'm pretty sure it would not be worth the implementation effort.
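For readers wondering what such a flag might look like, here is a minimal sketch using `argparse`. It is not the project's actual code: the parser layout, the default value, and the stub functions `synthesize_aws` and `synthesize_mozilla` are all hypothetical, and the stubs only return a description instead of calling a real service.

```python
import argparse


def synthesize_aws(text):
    """Hypothetical stub standing in for an AWS Polly request."""
    return f"aws:{text}"


def synthesize_mozilla(text):
    """Hypothetical stub standing in for a request to the TTS container."""
    return f"mozilla:{text}"


# Map each --backend choice to the function that handles it.
BACKENDS = {"aws": synthesize_aws, "mozilla": synthesize_mozilla}


def build_parser():
    parser = argparse.ArgumentParser(description="Narrate text with a chosen TTS backend.")
    parser.add_argument("--backend", choices=sorted(BACKENDS), default="aws",
                        help="which TTS service to use (default is an assumption)")
    parser.add_argument("text", help="sentence to synthesize")
    return parser


def run(argv):
    """Parse arguments and dispatch to the selected backend."""
    args = build_parser().parse_args(argv)
    return BACKENDS[args.backend](args.text)
```

Because both backends sit behind the same function signature, the rest of a narration pipeline does not need to know which service produced the audio.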
