Switch to microsoft-cognitiveservices-speech-sdk for SpeechSynthesis #173

sbiaudet opened this issue Feb 9, 2022 · 5 comments

sbiaudet commented Feb 9, 2022

Hello Compulim,

Through botframework-webchat, I use the web-speech-cognitive-services module. I have a fully customized UX with an animated character, and I need the onwordboundary event to synchronize the display of subtitles with the character animations.

Currently you use the REST API for SpeechSynthesis. If you used the SDK directly, which communicates over websocket, we would get the onwordboundary event, and we would also have access to the viseme event for lipsync.
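
For context, here is a rough sketch of what subscribing to those events looks like with the SDK directly (untested as written, with placeholder credentials):

```ts
import * as sdk from "microsoft-cognitiveservices-speech-sdk";

const speechConfig = sdk.SpeechConfig.fromSubscription("<key>", "<region>");
const synthesizer = new sdk.SpeechSynthesizer(speechConfig);

// Fired for each word as audio is produced; offsets are in 100 ns ticks.
synthesizer.wordBoundary = (_sender, e) => {
  console.log(`word "${e.text}" at ${e.audioOffset / 10_000} ms`);
};

// Fired for each viseme, which is what we need for lipsync.
synthesizer.visemeReceived = (_sender, e) => {
  console.log(`viseme ${e.visemeId} at ${e.audioOffset / 10_000} ms`);
};

synthesizer.speakTextAsync(
  "Hello world",
  () => synthesizer.close(),
  (error) => {
    console.error(error);
    synthesizer.close();
  }
);
```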

Do you think you could use the SDK instead of the REST API?

sbiaudet commented

@compulim we've been working on integrating the websocket-based SDK for SpeechSynthesis in place of the REST calls.

Response times are much better and the onStart event is better synchronized. We've also added support for the onBoundary event, with both word and viseme types.
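
To give an idea, our integration forwards the SDK events through the utterance's onboundary handler, along these lines (a simplified sketch, not the exact code; the "viseme" type and visemeId field are our own non-standard extension to the Web Speech API event):

```ts
import * as sdk from "microsoft-cognitiveservices-speech-sdk";

// Hypothetical ponyfill internals: `synthesizer` is the SDK synthesizer,
// `utterance` the SpeechSynthesisUtterance currently being spoken.
function wireBoundaryEvents(
  synthesizer: sdk.SpeechSynthesizer,
  utterance: SpeechSynthesisUtterance
) {
  synthesizer.wordBoundary = (_sender, e) => {
    utterance.onboundary?.({
      name: "word",
      charIndex: e.textOffset,
      elapsedTime: e.audioOffset / 10_000, // 100 ns ticks -> ms
    } as any);
  };

  synthesizer.visemeReceived = (_sender, e) => {
    utterance.onboundary?.({
      name: "viseme",       // non-standard boundary type added by the fork
      visemeId: e.visemeId, // extra field consumers can use for lipsync
      elapsedTime: e.audioOffset / 10_000,
    } as any);
  };
}
```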

Are you open to us proposing a pull request?

sbiaudet commented Jul 3, 2023

@compulim a little bump to remind you of my request. We are ready to submit a pull request. Is that OK with you?

vladmaraev commented

@sbiaudet I would be very interested in this. Do you want to collaborate on this change?

sbiaudet commented

@vladmaraev I never got a response from @compulim. We've forked the repo and published a package here: https://www.npmjs.com/package/@davi-ai/web-speech-cognitive-services-davi.

I'm ready to merge here; it's silly to maintain a fork just for this. @compulim, is that OK with you?

vladmaraev commented

@sbiaudet That's really nice! I tried your package, but unfortunately it fails to synthesise SSML (even though I can see in the generated JS code that SSML is still supported)... Maybe there are some caveats? I would be happy to contribute either to a PR here or to your fork (is it public?). Many thanks for your work!
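
In case it helps narrow things down, this is roughly the direct-SDK call I would expect the ponyfill to make under the hood for SSML (a sketch only; the voice name is just an example):

```ts
import * as sdk from "microsoft-cognitiveservices-speech-sdk";

const speechConfig = sdk.SpeechConfig.fromSubscription("<key>", "<region>");
const synthesizer = new sdk.SpeechSynthesizer(speechConfig);

const ssml = `
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice name="en-US-JennyNeural">Hello from SSML</voice>
</speak>`;

// speakSsmlAsync takes the raw SSML string instead of plain text.
synthesizer.speakSsmlAsync(
  ssml,
  (result) => {
    console.log(`synthesis finished, reason: ${result.reason}`);
    synthesizer.close();
  },
  (error) => {
    console.error(error);
    synthesizer.close();
  }
);
```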

vladmaraev added a commit to vladmaraev/speechstate that referenced this issue Jun 28, 2024

This change switches the TTS ponyfill to the @Davi-ai fork. See the reasons for creating the fork here: compulim/web-speech-cognitive-services#173 (comment)

This change enables sending VISEME events by SpeechState, to control external avatars.

In addition, the @Davi-ai fork was patched to remove excessive logging.
vladmaraev added a commit to vladmaraev/speechstate that referenced this issue Aug 13, 2024
vladmaraev added a commit to vladmaraev/speechstate that referenced this issue Oct 24, 2024

* This change switches the TTS ponyfill to the @Davi-ai fork. See the reasons for creating the fork here: compulim/web-speech-cognitive-services#173 (comment)

* The @Davi-ai fork was modified to adjust typing and ASR final results; additionally, excessive logging was removed.

* This change enables sending VISEME events inside SpeechState, to control external avatars. For now, the VISEME events get transformed into a stream of FURHAT_BLENDSHAPES events that controls Furhat lipsync.

* Extensive test coverage for ASR and TTS (including streaming). To test streaming, one needs to run an SSE server (~test/server.js~).
vladmaraev added a commit to vladmaraev/speechstate that referenced this issue Oct 24, 2024