[FR] Input Language Parameter #63

Open
KenjiBaheux opened this issue Dec 3, 2024 · 4 comments

@KenjiBaheux

To improve the accuracy and reliability of the AI APIs, it might be useful to add an optional language parameter for specifying the input's language.

This would allow the implementation to select the optimal model, fine-tuning, and prompting strategy for the given language, resulting in higher quality output. It would also enable the implementation to clearly communicate supported languages (i.e. languages that have been tested for quality and safety), preventing unexpected behavior with unsupported input.

@domenic
Collaborator

domenic commented Dec 3, 2024

> optional

I'm unsure how best to make it optional. Can we have it be required instead?

Having web API behavior depend on the user's language is not common and is prone to test vs. production mismatches.

Having a web API depend on the page's language is possible (e.g. via <html lang="en">), but web developers don't always provide that attribute, and there's no good answer inside service workers or shared workers, which can be associated with multiple documents.
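To make the page-language concern concrete, here is a minimal sketch of what such a fallback might look like. The helper `pageLanguageHint` and the `MaybeDocument` type are hypothetical names used only for illustration; nothing like them is proposed in this thread. The point is only that the fallback has two common "no answer" cases.

```typescript
// Hypothetical helper, not part of any proposed API: tries to infer an
// input language from the page when the caller did not pass one.
type MaybeDocument = { documentElement?: { lang?: string } } | undefined;

function pageLanguageHint(doc: MaybeDocument): string | null {
  // Service workers and shared workers have no document to consult.
  if (doc === undefined) return null;
  // Authors often omit <html lang>, in which case `lang` is the empty string.
  const lang = doc.documentElement?.lang ?? "";
  return lang === "" ? null : lang;
}

// A page that declares its language yields a usable hint.
const declared = pageLanguageHint({ documentElement: { lang: "en" } });
// A page without <html lang>, or a worker context, yields no hint.
const undeclared = pageLanguageHint({ documentElement: { lang: "" } });
const inWorker = pageLanguageHint(undefined);
```

In a window context one would pass the global `document`; in a worker there is nothing to pass, so the caller ends up with `null` either way.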

@christianliebel

Authors may not know what the input language is, for example, in a chatbot scenario. Users may even switch between languages within a single prompt, so I think this can only be an optional hint.

@KenjiBaheux
Author

My hope was that by making it optional, we could add this in a follow-up.
If that proves challenging, would it be possible to make it required but with a default?
I think this could be the second-cheapest option:

  • We only need to support the default value initially.
  • We can gradually add support for specific input languages.

I'm not too sure what the default should be, though:

  • "mixed" to capture the idea that the input could be in any language, although this might strongly suggest "multilingual input"...
  • "unspecified" / "undefined" might be better to convey the idea that no extra effort was (or could be) taken, and that the output quality might not be the best it could be.

@domenic
Collaborator

domenic commented Dec 3, 2024

"required but with a default" = "optional" :)

But both messages above are a helpful reality check for me: we probably do need some sort of default. I like the idea of calling it "unspecified" (probably null in the API).
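As a rough sketch of how an "unspecified" default represented by null could behave, consider the following. All names here (`InputLanguageOptions`, `inputLanguage`, `resolveInputLanguage`, the `tested` set) are assumptions made for illustration and do not come from any spec or implementation.

```typescript
// Hypothetical shapes for illustration only; none of these names come from a spec.
interface InputLanguageOptions {
  // A language tag such as "en" or "ja"; null or omitted means "unspecified".
  inputLanguage?: string | null;
}

type Resolution =
  | { kind: "unspecified" }
  | { kind: "supported"; language: string }
  | { kind: "unsupported"; language: string };

function resolveInputLanguage(
  options: InputLanguageOptions,
  tested: ReadonlySet<string>,
): Resolution {
  // Omitted and explicit null are treated the same way.
  const language = options.inputLanguage ?? null;
  if (language === null) return { kind: "unspecified" };
  // `tested` stands in for languages checked for quality and safety.
  return tested.has(language)
    ? { kind: "supported", language }
    : { kind: "unsupported", language };
}

// Initially only the default is handled; languages can be added to the set later.
const initial = resolveInputLanguage({}, new Set<string>());
const later = resolveInputLanguage({ inputLanguage: "ja" }, new Set(["ja"]));
```

Under these assumptions, an implementation could ship with an empty `tested` set and grow it over time, while callers that never pass a language keep getting the "unspecified" behavior.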

This is not a great situation for a web standard. For example, I think it will be de-facto expected that the language model works mostly on English, with maybe a bit extra. That would effectively prevent other browsers from entering the market with a different primary language for their language model.

But, maybe this is the best we can do.
