
Proposal: Implement getting audio output from window.speechSynthesis.speak() call as Blob or other data type #2823
Labels: proposal/addition, needs implementer interest

Closed
guest271314 opened this issue Jul 9, 2017 · 1 comment

Comments

@guest271314
Contributor

Get audio output from window.speechSynthesis.speak() call as ArrayBuffer, AudioBuffer, Blob, MediaSource, MediaStream, ReadableStream, other object or data types.

Ideally, the implementation can be performed without the workaround of using navigator.mediaDevices.getUserMedia() and MediaRecorder().

Relevant bug for Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1377816. Feature request for Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=733051#c3. Workaround so far on GitHub: https://github.com/guest271314/SpeechSynthesisRecorder. It took a while to determine that "Monitor of Built-in Audio" had to be selected instead of "Built-in Audio" at the .getUserMedia() prompt.
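The workaround described above can be sketched as follows. This is a browser-only sketch under the assumptions stated in this issue (the user must select the "Monitor of Built-in Audio" loopback device at the getUserMedia() prompt); the function name recordSpokenText is hypothetical, not part of any specification.

```javascript
// Hypothetical sketch of the current workaround: capture speechSynthesis
// output via a getUserMedia() loopback device and a MediaRecorder.
// Assumes the user grants the permission prompt and selects the
// "Monitor of Built-in Audio" device rather than the microphone.
async function recordSpokenText(text) {
  // 1. Capture the audio loopback that getUserMedia() exposes.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks = [];
  return new Promise((resolve, reject) => {
    recorder.ondataavailable = (event) => chunks.push(event.data);
    recorder.onstop = () =>
      resolve(new Blob(chunks, { type: recorder.mimeType || 'audio/webm' }));
    recorder.onerror = reject;
    // 2. Speak the text and stop recording when synthesis ends.
    const utterance = new SpeechSynthesisUtterance(text);
    utterance.onend = () => recorder.stop();
    recorder.start();
    speechSynthesis.speak(utterance);
  });
}
```

Note how much machinery is involved just to obtain a Blob: two extra APIs, a permission prompt, and a device-selection step that is easy to get wrong.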

Three widely applicable and appropriate use cases at the forefront are:

  1. Persons who have issues speaking, e.g., persons who have suffered a
    stroke or other communication-inhibiting afflictions. They could convert
    text to an audio file and send the file to another individual or group.
    This feature would help them communicate with other persons, similar to
    the technologies which assisted Stephen Hawking in communicating;

  2. Presently, the only person who can hear the audio output is the person
    in front of the browser; in essence, the full potential of the
    text-to-speech functionality goes unused. The audio result could be used
    as an attachment within an email, a media stream, a chat system, or
    another communication application. That is, it would provide control
    over the generated audio output;

  3. Another application would be to provide a free, libre, open source audio
    dictionary and translation service - client to client and client to server,
    server to client.

Those are the main three use cases. There are others one can fathom,
though the above should be adequate to cover a wide range of users of the
implementation.

If, in your or your organization's view, those use cases are not compelling
or detailed enough, please advise and I will compose a more thorough
analysis and proposal.

The current workaround is cumbersome. Why should we have to use
navigator.mediaDevices.getUserMedia() and MediaRecorder to capture the
audio output? The workaround is not impossible to achieve, but why should
two additional APIs be needed just to get the audio as a static file?

At a minimum we should be able to get a Blob or ArrayBuffer of the
generated audio. The Blob or ArrayBuffer could, generally, be converted to
other formats if necessary. For example, meSpeak.js already provides the
described functionality: http://plnkr.co/edit/ZShBbiFGEKIJX2WgErkl?p=preview
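To illustrate the conversion point, raw synthesized audio exposed as an ArrayBuffer could be wrapped in a minimal WAV container and returned as a Blob. This is a minimal sketch assuming 16-bit mono PCM samples; pcmToWavBlob is a hypothetical helper, not part of meSpeak.js or any browser API.

```javascript
// Hypothetical helper: wrap raw 16-bit mono PCM samples in a minimal
// WAV (RIFF) container and return the result as an audio/wav Blob.
function pcmToWavBlob(samples, sampleRate = 22050) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeString = (offset, s) => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };
  writeString(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true);  // RIFF chunk size
  writeString(8, 'WAVE');
  writeString(12, 'fmt ');
  view.setUint32(16, 16, true);            // fmt chunk size
  view.setUint16(20, 1, true);             // audio format: PCM
  view.setUint16(22, 1, true);             // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true);              // block align
  view.setUint16(34, 16, true);            // bits per sample
  writeString(36, 'data');
  view.setUint32(40, dataSize, true);
  for (let i = 0; i < samples.length; i++) {
    view.setInt16(44 + i * bytesPerSample, samples[i], true);
  }
  return new Blob([buffer], { type: 'audio/wav' });
}
```

A Blob produced this way could then be attached to an email, fed to a download link via URL.createObjectURL(), or posted to a server, which is exactly the kind of control over the output this proposal asks for.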
Re: MediaStream, ArrayBuffer, Blob audio result from speak() for recording?

@domenic
Member

domenic commented Jul 14, 2017

This specification does not define speechSynthesis, so I'll close this issue. Please feel free to open it on the repository of whatever specification does define that API.

@domenic domenic closed this as completed Jul 14, 2017