Update the on-device speech recognition methods #143

Open

evanbliu (Collaborator) commented Feb 20, 2025

Closes #141

This PR contains the following changes:

  • Introduce a new AvailabilityStatus enum to indicate the availability of on-device speech recognition for a given language.
  • Update the availableOnDevice method to return a Promise that resolves to an AvailabilityStatus instead of a boolean.
  • Rewrite the availableOnDevice and installOnDevice methods in the modern, algorithmic style.
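
To illustrate what the enum gives callers that a boolean could not, here is a minimal TypeScript sketch. The four status values are the ones in this PR's enum; the `nextAction` helper and its return labels are hypothetical and not part of the proposal.

```typescript
// The four values mirror the AvailabilityStatus enum proposed in this PR.
type AvailabilityStatus = "unavailable" | "downloadable" | "downloading" | "available";

// Hypothetical labels for what a page might do next (not part of the spec).
type NextAction = "use-on-device" | "wait" | "offer-install" | "fall-back";

// Hypothetical helper: a boolean could only separate "available" from the
// rest, whereas the enum lets the caller tell the other three cases apart.
function nextAction(status: AvailabilityStatus): NextAction {
  switch (status) {
    case "available":
      return "use-on-device";
    case "downloading":
      return "wait";
    case "downloadable":
      return "offer-install";
    case "unavailable":
      return "fall-back";
  }
}
```

How a page reacts to each status is up to the page; the mapping above is only one plausible choice.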

Preview | Diff

@evanbliu evanbliu requested a review from padenot February 20, 2025 22:01
@evanbliu
Collaborator Author

@padenot - Can you please take a look at this PR when you get a chance? Thanks!


</dl>
<p>When the <dfn>on-device availability algorithm</dfn> with <var>lang</var> is invoked, the user agent MUST run the following steps:
1. If the [=current settings object=]'s [=relevant global object=]'s [=associated Document=] is NOT [=fully active=], throw an {{InvalidStateError}} and abort these steps.
Member

You can't do this because you're in parallel at this point. The check needs to run synchronously, with the exception thrown right then, and the rest of the steps can run in parallel afterwards.

We can do it like so, in the algorithm:

  1. Check whether doc is fully active. If not, return a DOMException of type ... and abort these steps
  2. Check whether lang is valid per BCP 47. If not, return a DOMException of type ... and abort these steps
  3. In parallel, do the following steps:
    1. step a
    2. step b
    3. step c
    4. step d
    5. Queue a task back to the main thread to resolve the promise with the value we've determined in step a/b/c/d

And above at the call site of the algorithm:

  1. Let p be a new promise
  2. Run the algorithm. If it returns an exception, throw it and abort these steps
  3. Return p
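
The structure described above (synchronous checks first, then parallel work that resolves the promise from a queued task) can be sketched in TypeScript as follows. This is an illustration only: `SketchEnvironment`, `determineStatus`, and `looksLikeLanguageTag` are hypothetical names, the regular expression is a rough stand-in for a real BCP 47 check, and `setTimeout` merely approximates "in parallel" plus "queue a task". Because the review leaves the exception type for an invalid `lang` open, the sketch throws a plain `Error` there.

```typescript
type AvailabilityStatus = "unavailable" | "downloadable" | "downloading" | "available";

// Hypothetical stand-in for user-agent state; none of these names come from the spec.
interface SketchEnvironment {
  documentFullyActive: boolean;
  // Stands in for the in-parallel steps a/b/c/d that determine the status.
  determineStatus(lang: string): AvailabilityStatus;
}

// Rough approximation of "lang is valid per BCP 47". This is an assumption
// made for the sketch, not a full well-formedness check.
function looksLikeLanguageTag(lang: string): boolean {
  return /^[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*$/.test(lang);
}

function availableOnDevice(env: SketchEnvironment, lang: string): Promise<AvailabilityStatus> {
  // Steps 1 and 2: synchronous checks, so the exception reaches the caller
  // immediately instead of being raised from parallel work.
  if (!env.documentFullyActive) {
    const error = new Error("Document is not fully active");
    error.name = "InvalidStateError";
    throw error;
  }
  if (!looksLikeLanguageTag(lang)) {
    // The exception type is left open in the review, so a plain Error is used.
    throw new Error("lang is not a valid language tag");
  }

  // Call site: create the promise and return it once the checks have passed.
  return new Promise<AvailabilityStatus>((resolve) => {
    // Approximates "in parallel" followed by "queue a task" in a single hop;
    // a real user agent would do the work off the main thread.
    setTimeout(() => resolve(env.determineStatus(lang)), 0);
  });
}
```

The point of the split is that a caller sees the failure at the call itself and never receives a promise in that case, while a successful call always gets a promise that settles later.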


<dt><dfn method for=SpeechRecognition>installOnDevice({{DOMString}} lang)</dfn> method</dt>
<dd>The installOnDevice method returns a Promise that resolves to a boolean indicating whether the installation of on-device speech recognition for a given BCP 47 language tag initiated successfully. [[!BCP47]]</dd>
<dd>
The installOnDevice method returns a Promise that resolves to a boolean indicating whether the installation of on-device speech recognition for a given [[!BCP47]] language tag initiated successfully.
Member

Similarly, use markup to autolink.

@@ -314,12 +321,39 @@ See <a href="https://lists.w3.org/Archives/Public/public-speech-api/2012Sep/0072
If the abort method is called on an object which is already stopped or aborting (that is, start was never called on it, the <a event for=SpeechRecognition>end</a> or <a event for=SpeechRecognition>error</a> event has fired on it, or abort was previously called on it), the user agent must ignore the call.</dd>

<dt><dfn method for=SpeechRecognition>availableOnDevice({{DOMString}} lang)</dfn> method</dt>
<dd>The availableOnDevice method returns a Promise that resolves to a boolean indicating whether on-device speech recognition is available for a given BCP 47 language tag. [[!BCP47]]</dd>
<dd>
The availableOnDevice method returns a Promise that resolves to an {{AvailabilityStatus}} indicating the on-device speech recognition availability for a given [[!BCP47]] language tag.
Member

Markup around availableOnDevice. After merging my PR, it could simply be:

{{SpeechRecognition/installOnDevice}}

This allows linking back to the IDL.

Similarly for promise, boolean, etc.

1. If the <a>on-device availability algorithm</a> returns {{AvailabilityStatus/unavailable}}, {{AvailabilityStatus/downloading}}, or {{AvailabilityStatus/available}}, resolve <var>promise</var> with <code>false</code> and skip the rest of these steps.
1. Initiate the download of the on-device speech recognition language for <var>lang</var>.
<p class=note>
Note: The user agent may prompt the user for explicit permission to download the on-device speech recognition language pack.
Member

You cannot use MAY in a Note: it is an RFC 2119 keyword with normative force, and a Note is by definition informative (in other words, non-normative).

I'll turn on the Bikeshed option that warns about this shortly. I should have done it already, sorry.

We can simply rephrase here and say "can". That often works and is close in meaning to what we want to say.
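
As a rough illustration of the kind of check being discussed, the TypeScript sketch below scans the text of a note for RFC 2119 keywords. It is not how Bikeshed implements its warning; `findRfc2119Keywords` is a hypothetical helper, and matching case-insensitively is an assumption made so that the lowercase "may" in the note under review is caught.

```typescript
// RFC 2119 keywords. Two-word forms come first so that "MUST NOT" is
// reported as one keyword rather than as "MUST".
const RFC2119_KEYWORDS: string[] = [
  "MUST NOT", "SHALL NOT", "SHOULD NOT",
  "REQUIRED", "RECOMMENDED", "OPTIONAL",
  "MUST", "SHALL", "SHOULD", "MAY",
];

// Hypothetical helper: return every RFC 2119 keyword found in the text of an
// informative note, normalized to upper case with single spaces.
function findRfc2119Keywords(noteText: string): string[] {
  const alternatives = RFC2119_KEYWORDS.map((keyword) => keyword.replace(" ", "\\s+"));
  const pattern = new RegExp(`\\b(${alternatives.join("|")})\\b`, "gi");
  const matches = noteText.match(pattern);
  if (matches === null) {
    return [];
  }
  return matches.map((match) => match.toUpperCase().replace(/\s+/g, " "));
}
```

Applied to the note above, the sketch reports `MAY`; after rephrasing with "can" it reports nothing. A case-insensitive match will also flag ordinary English uses of these words, so a real check would need to be more careful.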

1. If the on-device speech recognition language pack for <var>lang</var> is unsupported, return {{AvailabilityStatus/unavailable}}.
1. If the on-device speech recognition language pack for <var>lang</var> is supported but not installed, return {{AvailabilityStatus/downloadable}}.
1. If the on-device speech recognition language pack for <var>lang</var> is downloading, return {{AvailabilityStatus/downloading}}.
1. If the on-device speech recognition language pack for <var>lang</var> is installed, return {{AvailabilityStatus/available}}.
Member

When do we know it has finished downloading?
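
For reference, the four steps under review can be sketched as a single TypeScript function. The `LanguagePackState` shape and its field names are hypothetical. One detail is a choice made by this sketch rather than something the steps state: a pack that is mid-download is also "supported but not installed", so the sketch tests for downloading before falling through to downloadable.

```typescript
type AvailabilityStatus = "unavailable" | "downloadable" | "downloading" | "available";

// Hypothetical description of a language pack's state; the field names are
// assumptions made for this sketch.
interface LanguagePackState {
  supported: boolean;
  downloading: boolean;
  installed: boolean;
}

function onDeviceAvailability(pack: LanguagePackState): AvailabilityStatus {
  if (!pack.supported) {
    return "unavailable";
  }
  if (pack.installed) {
    return "available";
  }
  // Checked before "downloadable" because a pack that is being downloaded is
  // also supported and not yet installed.
  if (pack.downloading) {
    return "downloading";
  }
  return "downloadable";
}
```

Nothing in these steps notifies the page when a download completes, so in this sketch a caller would only observe the change from downloading to available by asking again.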

"downloadable",
"downloading",
"available"
};
Member

I was thinking maybe we could reuse Translations' enum but maybe there's no point.

padenot (Member) commented Feb 28, 2025

Sorry to be picky, but I think it's good to get things right from the get-go to get the ball rolling on the first few PRs. Apart from the "when do we know downloading has finished", my comments are on the form rather than the substance of the PR, but unfortunately that matters in normative writing.

Development

Successfully merging this pull request may close these issues.

Consider changing return type of availableOnDevice()
3 participants