Support src_lang/tgt_lang in InferenceClient.translation() #1763

Closed
5 tasks
Wauplin opened this issue Oct 18, 2023 · 5 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@Wauplin
Contributor

Wauplin commented Oct 18, 2023

Originally from @LysandreJik :

For some models (and some models only), it is possible to specify a source and target language in transformers and InferenceAPI. It would be nice to support it in InferenceClient as well. Example: facebook/mbart-large-50-many-to-many-mmt

from huggingface_hub import InferenceClient

data = InferenceClient().post(
    json={
        "inputs": "My name is Sarah Jessica Parker but you can call me Jessica",
        "parameters": {"src_lang": "en_XX", "tgt_lang": "fr_XX"},
    },
    model="https://api-inference.huggingface.co/models/facebook/mbart-large-50-many-to-many-mmt",
)

print(data)
# b'[{"translation_text":"Mon nom est Sarah Jessica Parker mais vous pouvez m\'appeler Jessica"}]'

TODO:

  • add both parameters src_lang: Optional[str] = None and tgt_lang: Optional[str] = None to InferenceClient.translation
  • raise a ValueError if only one of them is set but not the other (?)
  • send them as "parameters" only if not None
  • add test with facebook/mbart-large-50-many-to-many-mmt
  • update docstring
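
A minimal sketch of the validation and payload logic described in the TODO list above. The helper name `build_translation_payload` and the error message are hypothetical, chosen only for illustration; this is not the code that was eventually merged into `InferenceClient.translation`, just one way the three rules (both-or-neither, send `"parameters"` only when set, plain input otherwise) could fit together.

```python
from typing import Any, Dict, Optional


def build_translation_payload(
    text: str,
    src_lang: Optional[str] = None,
    tgt_lang: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the JSON body for a translation request."""
    # Reject the case where exactly one of the two languages is provided.
    if (src_lang is None) != (tgt_lang is None):
        raise ValueError("src_lang and tgt_lang must be provided together.")

    payload: Dict[str, Any] = {"inputs": text}
    # Only send "parameters" when the languages are actually set.
    if src_lang is not None and tgt_lang is not None:
        payload["parameters"] = {"src_lang": src_lang, "tgt_lang": tgt_lang}
    return payload
```

With both languages set, the result has the same shape as the JSON body passed to `post()` in the example above; with neither set, only `"inputs"` is sent, so models that do not accept these parameters are unaffected.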
@Wauplin Wauplin added enhancement New feature or request good first issue Good for newcomers labels Oct 18, 2023
@AkshathRaghav

@Wauplin I'd like to work on this. Could you tell me how I can find all the models which support these params?

@Wauplin
Contributor Author

Wauplin commented Nov 6, 2023

Hi @AkshathRaghav, thanks for offering to help with this one! 🙏 Unfortunately, there is no easy way to list the models that support these parameters. This is why the docstring should make clear that it is at the user's discretion to select a model that handles them. In general, this information can be found in the model card on the Hub (along with the list of supported language codes).

Therefore, as a first step, I would simply implement the feature in InferenceClient for users who want to use it. Then, as a second step and if there is demand, we could try to find a way to suggest models that support them, but it's not trivial (that's the flip side of having a versatile transformers library).

@ceferisbarov
Contributor

@AkshathRaghav are you still working on this? I have implemented this in my fork. I can open a PR.

@AkshathRaghav

@ceferisbarov Hi, I haven't had a chance to work on it. Please go ahead!

ceferisbarov added a commit to ceferisbarov/huggingface_hub that referenced this issue Nov 27, 2023
@Wauplin
Contributor Author

Wauplin commented Nov 28, 2023

Closing this issue thanks to #1869 by @ceferisbarov ! 🚀

>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient()
>>> client.translation("My name is Sarah Jessica Parker but you can call me Jessica", model="facebook/mbart-large-50-many-to-many-mmt", src_lang="en_XX", tgt_lang="fr_XX")
"Mon nom est Sarah Jessica Parker mais vous pouvez m'appeler Jessica"
>>> client.translation("My name is Sarah Jessica Parker but you can call me Jessica", model="facebook/mbart-large-50-many-to-many-mmt", src_lang="en_XX", tgt_lang="es_XX")
'Mi nombre es Sarah Jessica Parker pero puedes llamarme Jessica'

@Wauplin Wauplin closed this as completed Nov 28, 2023
ceferisbarov added a commit to ceferisbarov/huggingface_hub that referenced this issue Nov 28, 2023
…uggingface#1869)

* add language support to translation client, solves huggingface#1763

* Update tests/test_inference_client.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update tests/test_inference_client.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Update src/huggingface_hub/inference/_client.py

Co-authored-by: Lucain <lucainp@gmail.com>

* update the async client to match

* add cassette for translation tests

* Update src/huggingface_hub/inference/_client.py

* Apply suggestions from code review

* Update src/huggingface_hub/inference/_generated/_async_client.py

---------

Co-authored-by: Lucain <lucainp@gmail.com>