[Feature] use the Azure TTS API? #1553
Comments
The existing forvo/lingualibre are essentially the same. We could merge them into "Online TTS". Related art: this popular Anki add-on that provides TTS from many services (Azure included). Maybe we can copy its UI. The left side can select a service and add various related parameters. |
What I really mean is that we shouldn't limit this feature to one specific service provider. The implementation should make adding new service providers easy 😅 Adding new parameters shouldn't be much harder, because it pretty much comes down to combining new query URLs. |
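As a rough illustration of the "combining query URLs" idea above, here is a minimal Python sketch of a provider registry. Everything in it is hypothetical: the `TtsProvider` class, the `register` helper, the provider names, and the `.invalid` URLs are placeholders, not part of the project or of any real service.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class TtsProvider:
    """One online TTS service: a base URL plus default query parameters."""
    name: str
    base_url: str  # hypothetical endpoint, not a real service URL
    defaults: dict = field(default_factory=dict)

    def build_url(self, text: str, **params: str) -> str:
        # Later values override earlier ones: defaults, then caller params, then text.
        query = {**self.defaults, **params, "text": text}
        return f"{self.base_url}?{urlencode(query)}"


# Registry keyed by provider name; adding a service is one more entry.
PROVIDERS: dict[str, TtsProvider] = {}


def register(provider: TtsProvider) -> None:
    PROVIDERS[provider.name] = provider


register(TtsProvider("example-a", "https://tts-a.invalid/speak", {"lang": "en"}))
register(TtsProvider("example-b", "https://tts-b.invalid/v1/audio", {"voice": "default"}))

url = PROVIDERS["example-a"].build_url("hello world", lang="de")
print(url)
```

With this shape, a new parameter is just another key in `defaults` or another keyword argument, which matches the claim that parameters are cheap to add.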
Though a bit ambitious at first, I have no objection to this. :-) |
After some investigation, I find this feature should not be implemented with the current Websites/Programs/TTS/Transliteration are inherently different from other local storage-based dictionaries. It was a mistake to merge them into one. All implementations of those “dictionary but actually not” are messy AF. Websites/Programs/TTS/Transliteration are the afterthought of designing
I find doing Leaky abstraction in action: For example, how to extend the properties of a dictionary with
Lines 448 to 824 in 6a91c6b
|
I agree with this. Azure TTS can be used across dictionaries and act on its own. It can be exposed as a single function (for example, in the right-click context menu). |
Not sure about the experience. Azure TTS's endpoint depends on the region, so a user needs to copy both the endpoint and the API key into a super condensed interface 😅
Using this hurl file (https://hurl.dev/):
POST {{endpoint}}/cognitiveservices/v1
Ocp-Apim-Subscription-Key: ${Your key here}
X-Microsoft-OutputFormat: ogg-48khz-16bit-mono-opus
Content-Type: application/ssml+xml
User-Agent: WhatEver
<speak version='1.0' xml:lang='en-US'>
<voice name='en-US-LunaNeural'>
{{sentence}}
</voice>
</speak>
with
hurl ./voice.hurl --variable endpoint="https://eastus.api.cognitive.microsoft.com/" --variable sentence="This is nice!" --output nice.ogg
will yield an audio file.
It seems all cloud TTS services support the same "SSML" thing: https://cloud.google.com/text-to-speech/docs/ssml |
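For readers who prefer code to a hurl file, the same request can be sketched in Python. This is only an illustration under assumptions: `build_azure_tts_request` is a hypothetical helper name, the headers and voice are copied from the hurl file above, and the XML escaping of the sentence is an addition that the hurl file does not do.

```python
from xml.sax.saxutils import escape, quoteattr


def build_azure_tts_request(endpoint: str, api_key: str, sentence: str,
                            voice: str = "en-US-LunaNeural",
                            lang: str = "en-US") -> tuple[str, dict, bytes]:
    """Return (url, headers, body) for the POST shown in the hurl file."""
    # Avoid a double slash when the endpoint already ends with "/".
    url = endpoint.rstrip("/") + "/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": api_key,
        "X-Microsoft-OutputFormat": "ogg-48khz-16bit-mono-opus",
        "Content-Type": "application/ssml+xml",
        "User-Agent": "WhatEver",
    }
    # Escape the sentence so characters like & or < cannot break the SSML.
    ssml = (
        f"<speak version='1.0' xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>{escape(sentence)}</voice>"
        "</speak>"
    )
    return url, headers, ssml.encode("utf-8")


url, headers, body = build_azure_tts_request(
    "https://eastus.api.cognitive.microsoft.com/", "YOUR_KEY", "Tom & Jerry")
print(url)
print(body.decode("utf-8"))
```

Nothing is sent here. To actually fetch audio, the three values could be passed to `urllib.request.Request(url, data=body, headers=headers, method="POST")` with a real key.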
A little UI improvement: users can pick the region from a dropdown list, since the regions have fixed values known in advance.
|
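A sketch of the dropdown idea, assuming the endpoint can be derived from the region using the same URL shape as the `eastus` example earlier in the thread. The region list is an illustrative subset and `endpoint_for` is a hypothetical helper; both would need checking against the Azure documentation.

```python
# Illustrative subset of Azure regions; the real list is an assumption to verify.
REGIONS = ["eastus", "westus2", "westeurope", "southeastasia"]


def endpoint_for(region: str) -> str:
    """Derive the endpoint from a region, following the URL shape used in this thread."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    return f"https://{region}.api.cognitive.microsoft.com/"


# What a dropdown would offer: (label, endpoint) pairs.
choices = [(r, endpoint_for(r)) for r in REGIONS]
print(choices[0])
```

With this, the user only pastes the API key; the endpoint follows from the selected region.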
Move them to Edit -> Preferences? |
This can be considered. Azure TTS can have its own config file. |
It is not really difficult to replicate AwesomeTTS for an audio preview pane 😅 Progress for today, a little app https://github.com/SourceReviver/temp_ctts_impl |
Do you have time to implement this feature? |
I think https://github.com/SourceReviver/temp_ctts_impl is complete for the initial version of this feature. However, I need to prepare for an exam on Friday, so I will prepare a PR this weekend 😅 |
Exam first. The PR can wait. |
Refactoring the current "Pronounce" button on the toolbar is needed, I think? Currently, the length of selection of pronunciation is limited to I don't have another chunk of time until at least August 13. #1685 is usable as of now. Not sure how we should proceed. 😅 |
That is ok, I can continue to work on it when I'm available. |
https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-started-text-to-speech?tabs=windows%2Cterminal&pivots=programming-language-cli#prerequisites
Microsoft TTS offers very high-quality audio, so it may be worth implementing as a feature.
Users could select text and use the right-click menu to pronounce it with the above engine.
The implementation could wrap the CLI command or use the provided C++ SDK.
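To make the "wrap the CLI command" option concrete, here is a minimal Python sketch. It assumes the Speech CLI is installed as `spx` and accepts the `synthesize --text ... --audio output ...` form from the linked quickstart; `spx_command` and `pronounce` are hypothetical helper names, and the exact flags should be checked against the CLI's own help.

```python
import shutil
import subprocess


def spx_command(text: str, out_path: str) -> list[str]:
    """Build the argument list for the Speech CLI; nothing is executed here."""
    return ["spx", "synthesize", "--text", text, "--audio", "output", out_path]


def pronounce(text: str, out_path: str = "selection.wav") -> bool:
    """Run the CLI if it is installed; return False when `spx` is not on PATH."""
    if shutil.which("spx") is None:
        return False
    result = subprocess.run(spx_command(text, out_path), check=False)
    return result.returncode == 0


print(spx_command("This is nice!", "nice.wav"))
```

Passing the arguments as a list rather than a shell string means the selected text needs no quoting, which matters when the text comes straight from a user selection.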