Adds real-time TTS with Piper 2.0 #21008

Closed
ynot01 wants to merge 45 commits

ynot01
Contributor

ynot01 commented Nov 30, 2023

Duplicate of #20977 for technical reasons

Document the changes in your pull request

At present, there are 12 voices:

  • 5 female British
  • 1 male British
  • 3 female American
  • 3 male American

There are several filters that get applied in different ways. Robots and machines will have a silicon filter applied. Lizards, aliens, and ethereals have filters attached to their tongues. Gas masks and sechailers have their own filters as well. Radio will sound more radioy than talking to someone in person.
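As a rough illustration of how those filters could stack for a single spoken line, here is a minimal Python sketch. The filter names, the `Speaker` fields, and the ordering are all assumptions made up for the example; the real logic lives in the game code and the yogs-tts backend and may look quite different.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical filter names; the PR does not name its filters.
TONGUE_FILTERS = {
    "lizard": "lizard_tongue",
    "alien": "alien_tongue",
    "ethereal": "ethereal_tongue",
}
MASK_FILTERS = {
    "gas_mask": "gas_mask",
    "sechailer": "sechailer",
}


@dataclass
class Speaker:
    species: str = "human"
    is_silicon: bool = False      # robots and machines
    mask: Optional[str] = None    # worn mask, if any


def filters_for(speaker: Speaker, over_radio: bool) -> List[str]:
    """Return the ordered filter names to apply to one line of speech."""
    filters: List[str] = []
    if speaker.is_silicon:
        filters.append("silicon")
    tongue = TONGUE_FILTERS.get(speaker.species)
    if tongue:
        filters.append(tongue)
    mask = MASK_FILTERS.get(speaker.mask) if speaker.mask else None
    if mask:
        filters.append(mask)
    if over_radio:
        # Applied last so radio speech sounds different from in-person speech.
        filters.append("radio")
    return filters
```

Under these assumptions, a lizard wearing a gas mask and talking over radio would get the tongue, mask, and radio filters in that order.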

Once I figure out model training (after this is merged preferably) we can start adding user-provided voices and maybe species-specific voices.

Backend info at https://github.com/yogstation13/yogs-tts

  • Do the tgui for the hear-tts pref and voice choice pref
  • Move the 750+ binary files to a containerized backend with build process instead of this repo
  • Testmerge
  • Add pitch preference
  • config don't work
  • make TTS into a subsystem
  • keep an array of ongoing phrases and deny any incoming requests that are in that
  • radio too loud - make a volume pref for both radio and in-person
  • slight delay in speech
  • gas mask + radio amplifies volume
  • kill spam violently
  • ai stating laws sounds awful - make it slower
  • preview TTS button does not apply filters correctly
  • Implement a filter to correct phonetic pronunciations i.e. IPC -> I.P.C.
  • Fix ghosts hearing non-radio TTS from other z levels (would break multi-z TTS)
  • Fix silicons not being able to choose their voice
  • Fix AIs hearing themselves twice over radio
  • Support all characters via JSON instead of sanitizing them
  • fix preview TTS being available before it should be and some potential errors from that
  • Figure out and resolve loads of HTTP 500 errors
  • Update backend on live to apply the JSON change
  • Update chameleon mask with some sick ass TGUI to be able to select your voice
  • Kill tachyon-doppler research array
  • Remove asterisk spam on comms whispering/ionospheric
  • Add filter options to chameleon mask
  • Fix lizards being abnormally quiet
  • API fetch voices (backend)
  • API fetch voices (frontend)
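Two of the checklist items above (denying requests for phrases that are already in progress, and correcting pronunciations such as IPC -> I.P.C.) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names are hypothetical, a set is used in place of the array the checklist mentions, and only the IPC entry comes from this PR.

```python
import re

# Only the IPC entry is taken from the PR; add others as needed.
PRONUNCIATIONS = {"IPC": "I.P.C."}

# Match whole words only, so "IPCs" or "RIPCORD" are left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in PRONUNCIATIONS) + r")\b"
)


def fix_pronunciation(text: str) -> str:
    """Rewrite known abbreviations into a form the TTS voice reads correctly."""
    return _PATTERN.sub(lambda match: PRONUNCIATIONS[match.group(1)], text)


class OngoingPhrases:
    """Tracks phrases being synthesized and rejects exact repeats (hypothetical helper)."""

    def __init__(self) -> None:
        self._ongoing = set()

    def try_begin(self, speaker_id: str, text: str) -> bool:
        """Return True and record the phrase, or False if it is already ongoing."""
        key = (speaker_id, text)
        if key in self._ongoing:
            return False
        self._ongoing.add(key)
        return True

    def finish(self, speaker_id: str, text: str) -> None:
        """Forget the phrase once synthesis has completed or failed."""
        self._ongoing.discard((speaker_id, text))
```

A caller would check `try_begin` before sending a request to the backend and call `finish` when the audio comes back, so the same speaker repeating the same line cannot queue duplicate requests.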

Why is this good for the game?

Muh immersion

Testing

tts_uhoh.mp4
tts_cat_poly.mp4

Changelog

🆑
rscadd: Added TTS
/:cl:

@ynot01 requested a review from a team as a code owner November 30, 2023 03:12
@Yogbot-13 added the DME Edit and Feature labels Nov 30, 2023
@ynot01 added the DO NOT MERGE, Test Merge - Scheduled, and TESTMERGED labels Nov 30, 2023
@ynot01 changed the title from "Adds real-time TTS with Piper" to "Adds real-time TTS with Piper 2.0" Nov 30, 2023
@Spacemanspark

ok I fucking burst out laughing when I heard Poly, god fucking damnit

@github-actions bot added the Config and tgui labels Nov 30, 2023
Yogbot-13 added a commit that referenced this pull request Nov 30, 2023
@ynot01 added the Literally the best PR label and removed the TESTMERGED label Nov 30, 2023
Yogbot-13 added a commit that referenced this pull request Dec 1, 2023
Yogbot-13 added a commit that referenced this pull request Jan 11, 2024
Yogbot-13 added a commit that referenced this pull request Jan 13, 2024
@Cartlord
Contributor

I've taken a look at the projects cited & their datasets, and I must thank & congratulate you for (seemingly) managing to only use A.I. trained on ethically collected data. The only issues I have are that two (US-amy and US-danny) link to https://github.com/MycroftAI/mimic3-voices without saying exactly which of the entries on the page they are (meaning i can't check how the data was collected), and that one (US-kusal) links to https://github.com/MycroftAI/mimic2 , which is not a voice repository or dataset at all but a model for training A.I.s off them.

@ynot01
Contributor Author

ynot01 commented Jan 15, 2024

I've taken a look at the projects cited & their datasets, and I must thank & congratulate you for (seemingly) managing to only use A.I. trained on ethically collected data. The only issues I have are that two (US-amy and US-danny) link to https://github.com/MycroftAI/mimic3-voices without saying exactly which of the entries on the page they are (meaning i can't check how the data was collected), and that one (US-kusal) links to https://github.com/MycroftAI/mimic2 , which is not a voice repository or dataset at all but a model for training A.I.s off them.

Voices amy, danny, kusal retrieved from:
https://huggingface.co/rhasspy/piper-voices/tree/v1.0.0/en/en_US

Licensing information was not changed when porting files

All files state "* License: See URL", linking to the two repositories below

amy & danny reference MycroftAI/mimic3-voices, which is licensed under Creative Commons Attribution Share Alike 4.0 International

kusal references MycroftAI/mimic2, which is licensed under Apache License 2.0

It should be noted that amy and kusal are referenced in the MycroftAI/mimic2 repository, but no differing licenses or datasets are indicated

Because the datasets used to train these voices are not available on the repositories linked, it can only be assumed that the repository LICENSE file must be used because of the instruction * License: See URL

CC BY-SA 4.0 DEED is complied with by the links provided on the repository README and supplied unadulterated MODEL_CARD files

@Cartlord
Contributor

I never meant to imply the license was the problem - that's all clearly above-board. I was only concerned with looking through the datasets to see how the data was collected, and my issue was just that not all of them were wholly transparent with where they got their data - I was trying to make sure it was all either from the public domain or collected from people who volunteered to train A.I..

Yogbot-13 added a commit that referenced this pull request Jan 17, 2024
Yogbot-13 added a commit that referenced this pull request Jan 18, 2024
Yogbot-13 added a commit that referenced this pull request Jan 19, 2024
Yogbot-13 added a commit that referenced this pull request Jan 22, 2024
Yogbot-13 added a commit that referenced this pull request Jan 23, 2024
@JamieD1
Contributor

JamieD1 commented Feb 11, 2024

I like em

@ToasterBiome
Contributor

Maybe some other time, nice experiment.
