Azure Speech Service

VRCWizard edited this page Jun 24, 2023 · 13 revisions

Consider becoming a VoiceWizardPro member

The VoiceWizardPro API allows you to access Microsoft Azure, Amazon Polly, and the new Google Cloud voices without the need to create and manage multiple accounts. By choosing a tier and becoming a member on https://ko-fi.com/ttsvoicewizard/tiers, you will receive an allotted amount of TTS characters and Translation characters that refresh monthly.

For more information on VoiceWizardPro visit: https://github.com/VRCWizard/TTS-Voice-Wizard/wiki/VoiceWizardPro

VoiceWizardPro is not required to use Azure Services with TTS Voice Wizard.

Buy Me a Coffee at ko-fi.com

Setup Tutorial Video

TTS Voice Wizard Setup Tutorial
This video just runs through the instructions on this page.

How to get your Microsoft Azure Key and Region

  1. For Speech Recognition and TTS to work you must have an Azure Subscription Key.
  • Option 1: Free Azure Account
    • Completely free for the first month. After the first month you will be asked to upgrade your account to "pay as you go", but you will still have access to the free monthly limits.


  1. After making your account, you will need to create a Speech service to get your Key and Region. You will enter this information in the "Microsoft Azure Cognitive Service" tab located in "Settings".

  2. Follow this video to get your key and region information:
    How to get your Key and Region

    • The pricing tier for your Speech service in Azure should be set to F0 Free if you wish to take advantage of Azure's free monthly limits and not be charged.

    • I am not responsible for any charges you receive if you upgrade from a Free Azure Account and use S0 Standard pricing! It is up to you to monitor your own usage if you are using a pay-as-you-go Azure account.

  3. Your key and region go in the "Microsoft Azure Cognitive Service" tab located in the "Speech Provider" tab

    • Make sure to click the change button for both key and region.
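
If the key or region is rejected, one way to check the pair outside of TTS Voice Wizard is to ask Azure for a short-lived access token. The sketch below is a minimal example, assuming the standard Speech service token endpoint (`https://<region>.api.cognitive.microsoft.com/sts/v1.0/issueToken`) and a region given as its short identifier (for example `eastus`). The function names are hypothetical, and this is not necessarily how TTS Voice Wizard itself validates credentials.

```python
import urllib.request


def build_token_request(key: str, region: str) -> urllib.request.Request:
    """Build (but do not send) a token request for the given key and region."""
    # Assumption: region is the short identifier shown in the Azure portal, e.g. "eastus".
    url = f"https://{region.strip().lower()}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
    headers = {"Ocp-Apim-Subscription-Key": key.strip()}
    # The token endpoint expects a POST with an empty body.
    return urllib.request.Request(url, data=b"", headers=headers, method="POST")


def key_and_region_work(key: str, region: str, timeout: float = 10.0) -> bool:
    """Return True if Azure issues a token, False on an HTTP or network error."""
    try:
        with urllib.request.urlopen(build_token_request(key, region), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError (e.g. 401 for a wrong key) and timeouts are all OSError subclasses.
        return False
```

Calling `key_and_region_work("YOUR_KEY_HERE", "eastus")` with your own values should return `True` when the key belongs to a Speech resource in that region. A `False` result usually points to a typo in the key, a mismatched region, or a network problem.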

Features

Text to Speech Tab

  • Many Azure voices have options for selecting Speaking Styles
    • These can drastically change how a voice sounds (try them out)
  • Spoken Language is the language that you speak natively
  • Translation Language is the language that you wish to translate to.
    • It should be set to No Translation (Default) when not in use
    • Speech to text hours and translation hours are separate; you get 5 free hours of each per month
      • Pro Tip: You can technically use up all 5 of your free monthly speech to text hours and then set Translation Language to your Spoken Language for an extra 5 hours
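
For readers curious about what a Speaking Style looks like at the Azure level, styles are requested through SSML using the `mstts:express-as` element. The following is a minimal sketch of building such a document. The function name is hypothetical, `en-US-JennyNeural` and `cheerful` are only example values, and whether TTS Voice Wizard builds its requests this way internally is an assumption.

```python
from xml.sax.saxutils import escape, quoteattr


def build_styled_ssml(text: str, voice: str, style: str, lang: str = "en-US") -> str:
    """Wrap text in SSML that requests a speaking style for an Azure neural voice."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>"
        f"<mstts:express-as style={quoteattr(style)}>"
        f"{escape(text)}"  # escape &, < and > so the text cannot break the markup
        "</mstts:express-as></voice></speak>"
    )


# Example values only; not every voice supports every style.
ssml = build_styled_ssml("Hello & welcome", "en-US-JennyNeural", "cheerful")
print(ssml)
```

Swapping the `style` value is the SSML equivalent of picking a different Speaking Style in the Text to Speech tab.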

Azure Settings

  • Profanity filter is on by default; turn it off in Azure Settings.
  • Dictionary takes advantage of Azure's Phrase List feature to allow users to add new words to be recognized.
    • For instance, it can be used for words like "Pogchamp" or user names that Azure wouldn't know otherwise
    • Separate different words or phrases with commas.
    • Phrase List Example:
VRChat, Sippbox, Poiyomi, Pogchamp, Suss E Baka
  • Continuous Recognition (Azure) allows a user to speak continuously and have their words transcribed without constantly pressing the speech to text button
    • WARNING: You will quickly use up your free Azure limit with this feature enabled.
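
To illustrate the comma-separated Dictionary format described above, here is a minimal sketch of how such an entry could be split into individual phrases before they are handed to Azure's Phrase List feature. The function name and the clean-up rules (trimming spaces, dropping empty entries and case-insensitive duplicates) are assumptions for illustration, not the app's actual behavior.

```python
def parse_phrase_list(raw: str) -> list[str]:
    """Split a comma-separated dictionary entry into clean, de-duplicated phrases."""
    phrases: list[str] = []
    seen: set[str] = set()
    for part in raw.split(","):
        phrase = " ".join(part.split())  # trim and collapse inner whitespace
        if phrase and phrase.lower() not in seen:
            seen.add(phrase.lower())
            phrases.append(phrase)
    return phrases


phrases = parse_phrase_list("VRChat, Sippbox, Poiyomi, Pogchamp, Suss E Baka")
print(phrases)
# ['VRChat', 'Sippbox', 'Poiyomi', 'Pogchamp', 'Suss E Baka']
```

Note that a multi-word entry such as "Suss E Baka" stays together as one phrase, because only commas separate entries.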

Additional Notes

image

Need Help / Have Questions / Wanna make suggestions?

Donate

  • Leave me a GitHub Star ⭐ (it's free) or

Buy Me a Coffee at ko-fi.com
