Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align I/O with Inference API #99

Merged
merged 26 commits into from
Dec 12, 2024
Merged

Align I/O with Inference API #99

merged 26 commits into from
Dec 12, 2024

Conversation

alvarobartt
Copy link
Member

@alvarobartt alvarobartt commented Nov 13, 2024

Description

This PR aligns the I/O payloads on Inference Endpoints with the definitions for the Inference API, with the I/O payloads currently available via the transformers.pipeline, diffusers.pipeline, and sentence-transformer interfaces.

Additionally, this PR also closes #72 as it includes the handling and validation of both the current sentence-ranking approach, as well as the approach compatible with the TEI /rank endpoint.

@alvarobartt
Copy link
Member Author

Hi here @hanouticelina as far as I could check the naming affects the following tasks:

  • "text-generation"
  • "image-to-text"
  • "automatic-speech-recognition"
  • "text-to-audio"
  • "text-to-speech"
  • any model with "translation"

AFAIK for "image-to-text", "automatic-speech-recognition", "text-to-audio", and "text-to-speech" the generate_kwargs should be left as is, as those are part of the pipelines and an argument per se i.e. just generate_kwargs not **generate_kwargs; and for the rest of the tasks AFAIK we need to flatten the dict so that the **generate_kwargs are provided instead of generate_kwargs. This being said, we also need to include renames from generate (former spec) and generation_parameters (updated spec but not updated yet on transformers AFAIK), is that correct @hanouticelina?

Thanks a ton 🤗

@hanouticelina
Copy link
Contributor

Hi here @hanouticelina as far as I could check the naming affects the following tasks:

  • "text-generation"
  • "image-to-text"
  • "automatic-speech-recognition"
  • "text-to-audio"
  • "text-to-speech"
  • any model with "translation"

AFAIK for "image-to-text", "automatic-speech-recognition", "text-to-audio", and "text-to-speech" the generate_kwargs should be left as is, as those are part of the pipelines and an argument per se i.e. just generate_kwargs not **generate_kwargs; and for the rest of the tasks AFAIK we need to flatten the dict so that the **generate_kwargs are provided instead of generate_kwargs

yes, correct!

This being said, we also need to include renames from generate (former spec) and generation_parameters (updated spec but not updated yet on transformers AFAIK), is that correct @hanouticelina?

yes, currently Inference API does not support (for "image-to-text", "automatic-speech-recognition", "text-to-audio", and "text-to-speech) generate (which should have been generate_kwargs, see this issue for more context) and generation_parameters is not supported yet, only the specs have been updated as you mentioned, waiting for generate_kwargs to be replaced by generation_parameters in transformers, you can find the related PR here.

I recommend following the current transformers implementation as you're using directly the pipelines. If you want to be forward compatible, you can handle generation_parameters, but no need to do that with generate as it's not used at all in the Inference API.

@alvarobartt
Copy link
Member Author

I recommend following the current transformers implementation as you're using directly the pipelines. If you want to be forward compatible, you can handle generation_parameters, but no need to do that with generate as it's not used at all in the Inference API.

Yes makes sense, then I'll just handle the current as we're pinning the transformers version and update that once generation_parameters is included, as atm I'd say there's no need to, let me re-review and I'll get back to you!

Thank you so much for your time 🤗

Copy link
Member

@ErikKaum ErikKaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be good as far as I can tell 👍

@alvarobartt alvarobartt merged commit e0abd4b into main Dec 12, 2024
6 checks passed
@alvarobartt alvarobartt deleted the inference-api-alignment branch December 12, 2024 21:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants