Align I/O with Inference API #99

alvarobartt · 2024-11-13T15:27:57Z

Description

This PR aligns the I/O payloads on Inference Endpoints with the definitions for the Inference API, and with the I/O payloads currently available via the transformers.pipeline, diffusers.pipeline, and sentence-transformers interfaces.

This PR also closes #72, as it handles and validates both the current sentence-ranking approach and the approach compatible with the TEI /rerank endpoint.
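As a rough illustration of what handling both ranking payloads could involve, here is a minimal sketch. It is not the code from this PR: the function name `to_sentence_pairs` is hypothetical, and the pair-based `inputs` shape for the current approach is an assumption. Only the `query` plus `texts` shape is hinted at by the "query-texts" commit in this PR.

```python
from typing import Any, Dict, List, Tuple


def to_sentence_pairs(payload: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Normalize a ranking payload into (query, text) pairs.

    Assumed (hypothetical) shapes:
      * {"inputs": [a, b]} or {"inputs": [[a, b], ...]}  (pair-based)
      * {"query": q, "texts": [t1, t2, ...]}              (query-texts style)
    """
    if not isinstance(payload, dict):
        raise ValueError("payload must be a dict")

    if "query" in payload and "texts" in payload:
        query, texts = payload["query"], payload["texts"]
        if not isinstance(query, str):
            raise ValueError("'query' must be a string")
        if not isinstance(texts, list) or not all(isinstance(t, str) for t in texts):
            raise ValueError("'texts' must be a list of strings")
        return [(query, text) for text in texts]

    if "inputs" in payload:
        inputs = payload["inputs"]
        # Wrap a single pair so both cases share one validation path.
        if (
            isinstance(inputs, list)
            and len(inputs) == 2
            and all(isinstance(s, str) for s in inputs)
        ):
            inputs = [inputs]
        if not isinstance(inputs, list) or not all(
            isinstance(pair, (list, tuple))
            and len(pair) == 2
            and all(isinstance(s, str) for s in pair)
            for pair in inputs
        ):
            raise ValueError("'inputs' must be a pair or a list of pairs of strings")
        return [(a, b) for a, b in inputs]

    raise ValueError("payload must contain 'inputs' or both 'query' and 'texts'")
```

Normalizing both shapes into one list of pairs keeps the downstream scoring call identical regardless of which payload the client sent.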

Apparently, the parameters are indeed supported via the `__call__` method of the `TokenClassificationPipeline` even though the docs say otherwise: they are passed internally to the `_sanitize_parameters` function and then used within the `__call__` method rather than via `__init__`.

src/huggingface_inference_toolkit/handler.py

Co-authored-by: Célina <hanouticelina@users.noreply.github.com>

alvarobartt · 2024-12-03T20:48:24Z

Hi here @hanouticelina as far as I could check the naming affects the following tasks:

"text-generation"
"image-to-text"
"automatic-speech-recognition"
"text-to-audio"
"text-to-speech"
any model with "translation"

AFAIK, for "image-to-text", "automatic-speech-recognition", "text-to-audio", and "text-to-speech" the generate_kwargs should be left as is, since those pipelines take it as an argument per se, i.e. just generate_kwargs, not **generate_kwargs. For the rest of the tasks, AFAIK we need to flatten the dict so that **generate_kwargs is provided instead of generate_kwargs. That said, we also need to include renames from generate (former spec) and generation_parameters (updated spec, but not yet updated in transformers AFAIK). Is that correct, @hanouticelina?
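To make the nested-versus-flattened distinction concrete, here is a minimal sketch under the assumptions stated in this comment. The names `NESTED_GENERATE_KWARGS_TASKS` and `prepare_parameters` are hypothetical and are not taken from the handler code.

```python
from typing import Any, Dict

# Tasks whose pipelines are assumed (per the comment above) to take
# `generate_kwargs` as a single dict argument rather than as **kwargs.
NESTED_GENERATE_KWARGS_TASKS = {
    "image-to-text",
    "automatic-speech-recognition",
    "text-to-audio",
    "text-to-speech",
}


def prepare_parameters(task: str, parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Return pipeline parameters, flattening `generate_kwargs` when needed."""
    params = dict(parameters)  # shallow copy so the caller's dict is untouched
    if task in NESTED_GENERATE_KWARGS_TASKS:
        # Leave `generate_kwargs` nested: the pipeline expects the dict itself.
        return params
    generate_kwargs = params.get("generate_kwargs")
    if isinstance(generate_kwargs, dict):
        params.pop("generate_kwargs")
        # Explicit top-level parameters win over flattened ones
        # (an arbitrary choice made for this sketch).
        params = {**generate_kwargs, **params}
    return params
```

The result would then be passed as `pipeline(inputs, **params)` in both cases; only the position of the generation arguments differs.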

Thanks a ton 🤗

hanouticelina · 2024-12-04T13:35:11Z

> Hi here @hanouticelina as far as I could check the naming affects the following tasks:
>
> "text-generation"
> "image-to-text"
> "automatic-speech-recognition"
> "text-to-audio"
> "text-to-speech"
> any model with "translation"
>
> AFAIK for "image-to-text", "automatic-speech-recognition", "text-to-audio", and "text-to-speech" the generate_kwargs should be left as is, as those are part of the pipelines and an argument per se i.e. just generate_kwargs not **generate_kwargs; and for the rest of the tasks AFAIK we need to flatten the dict so that the **generate_kwargs are provided instead of generate_kwargs

yes, correct!

> This being said, we also need to include renames from generate (former spec) and generation_parameters (updated spec but not updated yet on transformers AFAIK), is that correct @hanouticelina?

Yes. Currently the Inference API does not support generate for "image-to-text", "automatic-speech-recognition", "text-to-audio", and "text-to-speech" (it should have been generate_kwargs; see this issue for more context). generation_parameters is not supported yet either: as you mentioned, only the specs have been updated, and we are waiting for generate_kwargs to be replaced by generation_parameters in transformers. You can find the related PR here.

I recommend following the current transformers implementation as you're using directly the pipelines. If you want to be forward compatible, you can handle generation_parameters, but no need to do that with generate as it's not used at all in the Inference API.
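The forward-compatible option could look roughly like the sketch below. It assumes the toolkit would simply treat generation_parameters as an alias of generate_kwargs and ignore generate; the helper name `normalize_generation_key` is hypothetical.

```python
from typing import Any, Dict


def normalize_generation_key(parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Map the newer `generation_parameters` key onto `generate_kwargs`."""
    params = dict(parameters)  # shallow copy so the caller's dict is untouched
    if "generation_parameters" in params:
        value = params.pop("generation_parameters")
        # If both keys are sent, keep the explicit `generate_kwargs`
        # (an arbitrary tie-break chosen for this sketch).
        params.setdefault("generate_kwargs", value)
    return params
```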

alvarobartt · 2024-12-04T14:35:51Z

> I recommend following the current transformers implementation as you're using directly the pipelines. If you want to be forward compatible, you can handle generation_parameters, but no need to do that with generate as it's not used at all in the Inference API.

Yes, makes sense. Then I'll just handle the current naming, as we're pinning the transformers version, and update it once generation_parameters is included; atm I'd say there's no need to. Let me re-review and I'll get back to you!

Thank you so much for your time 🤗


ErikKaum

this should be good as far as I can tell 👍

alvarobartt added 3 commits November 12, 2024 10:02

Fix task type-hint and remove extra space in logging

d20528a

Align transformers and diffusers inputs with Inference API

be146c8

Remove duplicated sentencepiece extra requirement

0b61436

alvarobartt added the improvement label Nov 13, 2024

alvarobartt requested review from ErikKaum and philschmid November 13, 2024 15:27

alvarobartt self-assigned this Nov 13, 2024

alvarobartt added 16 commits November 13, 2024 11:50

Remove pipeline.task check for sentence-transformers

49254e9

Add warning and pop unsupported parameters

b45c40a

Fix sentence-transformers pipeline type-hints

b9dec32

Update sentence-ranking type-hints

77c2bb2

Add missing type-hints and clear code a bit

9b4fc67

Fix failing sentence-transformers tests due to input parsing

c1d519a

Fix "table-question-answering" payload check

7f0d84d

Fix "zero-shot-classification" payload check

307b27f

Check that payload is dict in advance

d3d2b5e

Fix HuggingFaceHandler errors and checks

64cbeb1

Fix sentence-transformers pipelines as those don't have parameters

8cbd4be

Fix INPUT to input_data fixture

0053e97

Fix quality in tests/unit/test_handler.py

b9dbf58

Make parameters default to empty dict instead of None

d764e44

Update version in setup.py

21ab873

hanouticelina reviewed Dec 2, 2024


src/huggingface_inference_toolkit/handler.py (two review comments, both outdated and resolved)

alvarobartt and others added 3 commits December 3, 2024 18:13

Fix generate_kwargs payload handling for text2text-based tasks

7a225e2

Fix generate_kwargs handling to move to flatten first-level dict

cd9ebe7

Co-authored-by: Célina <hanouticelina@users.noreply.github.com>

Update generate_kwargs handling as sometimes required

280101d

alvarobartt added 3 commits December 4, 2024 18:14

Remove generate from supported generation kwargs key names

42cd852

Update SentenceRankingPipeline to handle query-texts pipelines

01cd7a8

Also adds some extra validation steps

Update typing and fix sentence-transformers tests

4ffcdfd

ErikKaum approved these changes Dec 5, 2024


Upgrade transformers, sentence-transformers and peft dependencies

9d87331

alvarobartt merged commit e0abd4b into main Dec 12, 2024
6 checks passed

alvarobartt deleted the inference-api-alignment branch December 12, 2024 21:34
