Web Translation API #948
Comments
I like this idea, especially giving the user the ability to choose between on-device and cloud-based models. Given how performant and efficient on-device AI models have become using WASM/WebGPU, we can definitely vouch for translations taking place on-device, which also minimizes the risk of any data being sent to the cloud. |
I think having an auto-detect option could be really cool. Scenario: imagine you are in a Google Meet video call and everyone is from a different country. You could make a Chrome Extension that auto-detects the language and translates for each person speaking, even if the language changes between detection sessions. That would be super useful. If you are considering a hybrid approach, it would be good for the programmer to be able to explicitly require offline, on-device model inference, in case there are privacy requirements their application needs to adhere to. I think 3 options are useful:
|
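A minimal sketch of the auto-detect scenario above, under stated assumptions: `detectLanguage` and `createTranslator` are hypothetical callbacks supplied by the caller (they stand in for whatever detection and translation API ends up being available), and `makeAutoTranslator` is an illustrative name, not part of the proposal. It detects the language of each utterance and reuses one translator per source language, so the language can change from one speaker to the next.

```javascript
// Hypothetical helper: every name here is illustrative, not a proposed API.
// detectLanguage(text)            -> Promise<string> (a language tag such as "fr")
// createTranslator(source, target) -> Promise<{ translate(text): Promise<string> }>
function makeAutoTranslator({ detectLanguage, createTranslator, targetLanguage }) {
  // Cache one translator (as a promise) per detected source language.
  const translators = new Map();

  return async function translateUtterance(text) {
    const sourceLanguage = await detectLanguage(text);

    // Nothing to do when the speaker already uses the target language.
    if (sourceLanguage === targetLanguage) return text;

    if (!translators.has(sourceLanguage)) {
      translators.set(sourceLanguage, createTranslator(sourceLanguage, targetLanguage));
    }
    const translator = await translators.get(sourceLanguage);
    return translator.translate(text);
  };
}
```

Caching the promise rather than the resolved translator means two utterances in the same language that arrive close together share a single creation request.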
I think we could also recommend to the user, based on their hardware, which type of model inference to use (cloud vs. on-device). For instance, if the user's hardware doesn't have a GPU and has limited RAM, it might be better suited to cloud inference. Let me know what you think @jasonmayes |
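One way such a recommendation could look, as a rough sketch only: the function names and the 8 GB cutoff are assumptions made for illustration, and nothing like this exists in the proposed API. The `requireOnDevice` flag reflects the earlier point that an application with privacy constraints may need to rule out cloud inference regardless of hardware.

```javascript
// Illustrative heuristic only; the threshold is an assumed value, not a measured one.
const MIN_ON_DEVICE_RAM_GB = 8;

function recommendInferenceMode({ hasGpu, ramGb, requireOnDevice = false }) {
  // A privacy requirement overrides any hardware-based recommendation.
  if (requireOnDevice) return "on-device";
  // No GPU or limited RAM: cloud inference is likely the better fit.
  if (!hasGpu || ramGb < MIN_ON_DEVICE_RAM_GB) return "cloud";
  return "on-device";
}

// Builds the input from a navigator-like object. `navigator.gpu` (WebGPU) and
// `navigator.deviceMemory` are only rough proxies and are not available in
// every browser, so missing values fall back to the conservative choice.
function describeHardware(nav) {
  return {
    hasGpu: "gpu" in nav,
    ramGb: typeof nav.deviceMemory === "number" ? nav.deviceMemory : 0,
  };
}
```

In a page this might be called as `recommendInferenceMode(describeHardware(navigator))`.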
We agree with and support the user need. Here are our thoughts...
|
Thanks for the review!
I believe these are listed in the first paragraph of the explainer. https://github.com/WICG/translation-api/blob/main/README.md#explainer-for-the-web-translation-and-language-detection-apis
It is possible for the developer to avoid downloading the model, if the browser intends to support on-device translation, by checking if We haven't yet exposed whether the translation is done entirely on-device or through cloud services, because doing so could possibly cause developers to write code that excludes certain browsers. But, we understand this could be worthwhile. This is mentioned in https://github.com/WICG/translation-api/blob/main/README.md#goals . We'll closely monitor this space, to find out if there are developers who need this ability, and/or whether any browsers actually plan to implement using cloud services.
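As a hedged sketch of the "avoid downloading the model" check described above: it assumes the capabilities object exposes a `languagePairAvailable(source, target)` method returning `"readily"`, `"after-download"`, or `"no"`, and that `create()` accepts `sourceLanguage`/`targetLanguage` options. Both are assumptions about the draft API shape and may not match the current explainer. The API object is passed in as a parameter so the function can be exercised with a stub.

```javascript
// Sketch under assumptions: `languagePairAvailable` and its return values,
// and the create() option names, are assumed rather than confirmed here.
async function createTranslatorWithoutDownload(translatorApi, sourceLanguage, targetLanguage) {
  const capabilities = await translatorApi.capabilities();
  const availability = capabilities.languagePairAvailable(sourceLanguage, targetLanguage);

  // Only proceed when no download would be triggered.
  if (availability !== "readily") return null;

  return translatorApi.create({ sourceLanguage, targetLanguage });
}
```

With the API shape discussed in this thread, a call might look like `await createTranslatorWithoutDownload(ai.translator, "en", "ja")`, falling back to another strategy when it returns `null`.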
Thanks for the kind words, although at least the fake downloads idea isn't looking too promising at the moment. webmachinelearning/translation-api#10
This is related to some ongoing work on other AI model-based APIs which are not yet at the stage of being ready for TAG review. We want them all to share a namespace and a set of common API patterns (e.g. sibling I understand it can be hard to judge this in the absence of other reviewable explainers, so we can revisit this later when we make more progress on those. Stay tuned!!
I thought about this avenue as well. First, to clarify, we do need a separate So I think what you're suggesting ends up converting the API from something like

const capabilities = await ai.translator.capabilities();
const translator = await ai.translator.create();

to something like

const capabilities = await AITranslatorCapabilities.get();
const translator = await AITranslator.create();

I think this is a viable direction. A bit uglier in my opinion, but if the goal is to minimize the number of namespaces, then it does work. We can keep it as a possibility, and see which web developers prefer, or if other arguments appear on either side.
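To make the comparison concrete, here is a stub-only sketch showing that the two shapes can be thin wrappers over each other. The class bodies are placeholders invented for illustration and say nothing about how a browser would implement either form.

```javascript
// Placeholder classes: the bodies are illustrative stubs, not real behavior.
class AITranslatorCapabilities {
  static async get() {
    return new AITranslatorCapabilities();
  }
}

class AITranslator {
  constructor(options) {
    this.options = options;
  }
  static async create(options = {}) {
    return new AITranslator(options);
  }
}

// Namespace-style shape, expressed as delegation to the static methods above.
const ai = {
  translator: {
    capabilities: () => AITranslatorCapabilities.get(),
    create: (options) => AITranslator.create(options),
  },
};
```

Either entry point yields the same kinds of objects, so the choice between them is about naming and namespace count rather than capability.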
The exact UI signals for when these APIs are in use is definitely worth exploring. Browser UI teams are not always excited about adding "noise" to what the user sees, but if we end up needing a permission prompt or similar anyway for privacy reasons, maybe we could convince them to add in some progress measures.
To some extent yes. Before webmachinelearning/translation-api@2cb6637 the APIs were more tightly coupled, both existing on a We separated the APIs even more once we looked into the possible implementation strategies. It turns out that language detector models and translation models are generally quite different. And we wanted to allow browsers to take advantage of these differences, instead of forcing them to unify to a lowest-common-denominator, or expose strange inconsistencies to web developers. For example, you can find small off-the-shelf language detector models supporting over 80 languages. (If I am reading this MDN page correctly, both Firefox and Chrome use such a model for the Web Extensions A related question is discussed in https://github.com/WICG/translation-api/blob/main/README.md#allowing-unknown-source-languages-for-translation. |
Just sharing an example I was tagged in recently showing how JS users are using Web AI already to do real-time translation in browser entirely client side - may be useful to see use cases by real existing users that could help shape discussion here: |
Capabilities could simply be a static method on the translator, no?
|
Sure, either way, although that's less symmetric than having each class vend its own instances, and takes us back toward kinda using classes as namespaces (just this time with static methods). |
One wonders if, in the future, this will be as meaningless as an API called |
IMO It’s about what makes more sense in terms of entity-relationships. The human mental model is that we’re querying the capabilities of the translator; creating a Which actually makes me wonder if we need this object at all. Why not simply an async getter and an async function on Footnotes
|
Agreed, if authors care about getting the capabilities of the translator, querying the translator directly makes more sense. Having each class vend its own instances is an example of putting theoretical purity of an arbitrary design pattern over user needs |
What type would that async getter/function return? |
Hi Domenic - Our feedback that we sent above stands, particularly regarding the namespace of this API. We don't think it belongs in a |
Hello, TAG!
I'm requesting a TAG review of Web Translation API.
Browsers are increasingly offering language translation to their users. Such translation capabilities can also be useful to web developers. This is especially the case when the browser's built-in translation abilities cannot help, such as:
To perform translation in such cases, web sites currently have to either call out to cloud APIs, or bring their own translation models and run them using technologies like WebAssembly and WebGPU. This proposal introduces a new JavaScript API for exposing a browser's existing language translation abilities to web pages, so that if present, they can serve as a simpler and less resource-intensive alternative.
Further details: