window.speechSynthesis.speak() audio output is untestable at Chromium or Chrome #23084

guest271314 · 2020-04-18T14:01:35Z

There is no way (without non-standard workarounds) to programmatically test if window.speechSynthesis.speak() outputs audio at all at Chromium or Chrome browsers.

It is possible to programmatically test if window.speechSynthesis.speak() outputs audio at Firefox and Nightly.

AFAICT there is currently no manual WPT test for audio output for window.speechSynthesis.speak().

This issue is for the purposes of at least adding a manual test for window.speechSynthesis.speak(), and further to track efforts to find a means to programmatically test the audio output of window.speechSynthesis.speak(), or to conclusively determine that such a test is not possible at Chromium and Chrome browsers.


guest271314 · 2020-04-18T14:09:57Z

cc @foolip @sjdallst

stephenmcgruer · 2020-04-19T11:11:17Z

> It is possible to programmatically test if window.speechSynthesis.speak() outputs audio at Firefox and Nightly.

Can you explain, or give a short code snippet showing, how one can programmatically test this in Firefox?

guest271314 · 2020-04-19T14:45:29Z

@stephenmcgruer

At Firefox or Nightly it is possible to select a Monitor of <device> source at navigator.mediaDevices.getUserMedia()

https://wiki.archlinux.org/index.php/PulseAudio/Examples

ALSA monitor source

To be able to record from a monitor source (a.k.a. "What-U-Hear", "Stereo Mix")

<!DOCTYPE html>

<html>

  <head>
  </head>

  <body>
    <script>
      let track,
        recorder,
        u,
        voices,
        voice,
        // create 1 speechSynthesis object reference
        // instead of using multiple literal window.speechSynthesis.speak()
        // which could result in unexpected behaviour
        synth = window.speechSynthesis;
      navigator.mediaDevices
        .getUserMedia({
          audio: true
        })
        .then(stream => {
          // make sure to clear the queue
          synth.cancel();
          [track] = stream.getAudioTracks();
          track.contentHint = 'speech';
          recorder = new MediaRecorder(stream);
          recorder.ondataavailable = e => {
            // use FileReader here to avoid Firefox Blob URL bug
            // https://bugzilla.mozilla.org/show_bug.cgi?id=1628906#c7
            const fr = new FileReader();
            fr.onload = _ => {
              // analyze audio output here
              var audio = new Audio(fr.result);
              document.body.appendChild(audio);
              audio.controls = true;
              track.stop();
            };
            fr.readAsDataURL(e.data);
          };
          // start here to avoid clipping initial audio output
          recorder.start();
          voices = synth.getVoices();
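          // getVoices() returns an array, which is always truthy,
          // so check its length instead
          if (voices.length === 0) {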
            synth.onvoiceschanged = e => {
              voices = synth.getVoices();
            };
          }
          // escape the parentheses so they match literally
          voice = voices.find(({
            name
          }) => /^English_\(America\)/.test(name));
          u = new SpeechSynthesisUtterance();
          u.text = 'test '.repeat(50);
          u.voice = voice;
          console.log(u);
          // can clip initial audio output,
          // which could be ongoing for 
          // fraction of 1 second when start() is called
          // u.onstart = _ => recorder.start();
          u.onend = _ => recorder.stop();
          synth.speak(u);
        })
        .catch(console.trace);

    </script>
  </body>

</html>
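The `// analyze audio output here` step in the snippet above is left open. One possible way to fill it, as a minimal sketch: decode the recording to PCM samples and check whether the root-mean-square level exceeds a silence threshold. The names `rms` and `hasAudibleOutput` and the default threshold of `0.01` are assumptions for illustration, not something from this thread.

```javascript
// Root-mean-square level of PCM samples in the range [-1, 1].
function rms(samples) {
  if (samples.length === 0) return 0;
  let sumOfSquares = 0;
  for (const sample of samples) {
    sumOfSquares += sample * sample;
  }
  return Math.sqrt(sumOfSquares / samples.length);
}

// Hypothetical pass/fail check. The default threshold is an assumption
// and would need tuning against real recordings.
function hasAudibleOutput(samples, threshold = 0.01) {
  return rms(samples) > threshold;
}
```

In a browser the samples could come from `AudioContext.decodeAudioData()` applied to `await e.data.arrayBuffer()`, followed by `getChannelData(0)` on the resulting `AudioBuffer`, assuming the browser can decode the container that `MediaRecorder` produced.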

Chrome and Chromium do not offer Monitor of <device> at the getUserMedia() prompt, and there is no way to select that device with constraints.

Technically, though, it is possible to select Monitor of <device> for Chromium on *nix in the native OS (PulseAudio) Volume Control, under Recording, while audio is being output (see guest271314/SpeechSynthesisRecorder#14). At first run, the input text "test" generally will not provide enough time to select the option, so use 'test'.repeat(50) for the first run and select the option in the local volume control dialog. Afterwards the MediaStreamTrack from subsequent getUserMedia() calls for audio will stream from that device, which allows capturing only audio output, not the microphone, even when headphones are plugged in and the user has background noise playing during capture.
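Where a browser does list the monitor source among its audio inputs, picking it by label could look roughly like the sketch below. It assumes the label starts with "Monitor of" (as in the PulseAudio naming above) and that labels are populated, which normally requires permission to have been granted already. The helper names `findMonitorDevice` and `monitorConstraints` are hypothetical. Per the comment above, this is not expected to find anything in Chrome or Chromium.

```javascript
// Return the first audio input whose label looks like a PulseAudio
// monitor source ("Monitor of <device>"), or undefined if none is listed.
function findMonitorDevice(devices) {
  return devices.find(
    ({ kind, label }) => kind === 'audioinput' && /^Monitor of /i.test(label)
  );
}

// Build getUserMedia() constraints that request that device, or fall back
// to the default audio input when no monitor source is exposed.
function monitorConstraints(devices) {
  const monitor = findMonitorDevice(devices);
  return monitor
    ? { audio: { deviceId: { exact: monitor.deviceId } } }
    : { audio: true };
}
```

Usage would be along the lines of `navigator.mediaDevices.enumerateDevices().then(monitorConstraints).then(c => navigator.mediaDevices.getUserMedia(c))`.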

guest271314 · 2020-04-19T15:00:00Z

@stephenmcgruer For completeness, note that voices are loaded asynchronously from a socket connection to speech-dispatcher, so the call to getVoices() in the above code should probably look more like

    const synth = window.speechSynthesis;
    synth.cancel();
    // same text as in the earlier snippet
    const text = 'test '.repeat(50);
    let voices, voice;
    const handleVoicesChanged = event => {
      if (event) {
        alert(event.type);
      }
      // re-read the list: it may have been empty before voiceschanged fired
      voices = synth.getVoices();
      alert(voices.length);
      // filter voice, here with the pattern from the earlier snippet
      voice = voices.find(({name}) => /^English_\(America\)/.test(name));
      const utterance = new SpeechSynthesisUtterance();
      utterance.text = text;
      utterance.voice = voice;
      synth.speak(utterance);
    };
    onload = e => {
      document.getElementById("test").onclick = _ => {
        voices = synth.getVoices();
        if (voices.length === 0) {
          synth.onvoiceschanged = handleVoicesChanged;
        } else {
          handleVoicesChanged();
        }
      };
    };

https://bugs.chromium.org/p/chromium/issues/detail?id=959362#c4
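The same "wait for voiceschanged if the list is empty" logic could also be wrapped in a Promise, which may be easier to use from a test harness. This is only a sketch: the function name `loadVoices` and the 5000 ms default timeout are assumptions, and it relies on the synthesis object supporting `addEventListener('voiceschanged', …)`.

```javascript
// Resolve with the voice list, waiting for `voiceschanged` if the list is
// initially empty. Rejects if no voices arrive within `timeoutMs`.
function loadVoices(synth, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const initial = synth.getVoices();
    if (initial.length > 0) {
      resolve(initial);
      return;
    }
    const timer = setTimeout(() => {
      synth.removeEventListener('voiceschanged', onChange);
      reject(new Error('No voices loaded within ' + timeoutMs + ' ms'));
    }, timeoutMs);
    function onChange() {
      const voices = synth.getVoices();
      if (voices.length === 0) return; // still empty, keep waiting
      clearTimeout(timer);
      synth.removeEventListener('voiceschanged', onChange);
      resolve(voices);
    }
    synth.addEventListener('voiceschanged', onChange);
  });
}
```

A caller would then write something like `loadVoices(window.speechSynthesis).then(voices => { /* pick a voice and speak */ })`.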

guest271314 · 2020-04-19T15:09:11Z

Note also that *oogle Chrome is shipped with custom voices created by *oogle. Chromium is not, and requires enabling speech-dispatcher and having a local speech synthesis engine installed. Firefox and Chrome do not both set the lang attribute of a voice, so /^English_\(America\)/.test(name) may not be applicable where that is an espeak or espeak-ng voice, and if neither of those native applications is installed the default voice could be selected. That should not necessarily matter in this instance, where the goal is to capture the output in any case. Just another note that Chrome and Chromium could and will behave differently here. (Am at a 32-bit machine, so have not tried Chrome since version 46 or 47, when 32-bit support for Chrome was deprecated; Chrome may circumvent the ordinary call to speech-dispatcher to substitute its own voices.)
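Given those differences, voice selection for a capture test could fall back through several criteria rather than rely on one name pattern. A rough sketch, where the function name `pickVoice`, the fallback order, and the `'en'` default prefix are all assumptions:

```javascript
// Try the name pattern first, then a language prefix, then the voice
// flagged as default, then whatever voice is listed first.
function pickVoice(voices, namePattern, langPrefix = 'en') {
  return (
    voices.find(({ name }) => namePattern.test(name)) ||
    voices.find(
      ({ lang }) =>
        typeof lang === 'string' && lang.toLowerCase().startsWith(langPrefix)
    ) ||
    voices.find(voice => voice.default) ||
    voices[0] ||
    null
  );
}
```

Returning `null` when the list is empty leaves `utterance.voice` unset, so the browser would use its default voice.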

guest271314 changed the title ~~window.speechSynthesis.speak() audio output is untestable at Chromium or Chrome [type: untestable][Missing coverage]~~ window.speechSynthesis.speak() audio output is untestable at Chromium or Chrome [type:untestable][Missing coverage] Apr 18, 2020

guest271314 changed the title ~~window.speechSynthesis.speak() audio output is untestable at Chromium or Chrome [type:untestable][Missing coverage]~~ window.speechSynthesis.speak() audio output is untestable at Chromium or Chrome [type:untestable][type:missing-coverage] Apr 18, 2020

stephenmcgruer added speech-api type:untestable labels Apr 19, 2020

guest271314 changed the title ~~window.speechSynthesis.speak() audio output is untestable at Chromium or Chrome [type:untestable][type:missing-coverage]~~ window.speechSynthesis.speak() audio output is untestable at Chromium or Chrome Apr 19, 2020
