Implement img2text widget #290

Closed
wants to merge 1 commit into from

mishig25
Collaborator

@mishig25 mishig25 commented Sep 2, 2022

Implement Image-to-Text widget

huggingface/transformers#18821 (comment)

input (identical to ImageClassificationWidget):

const requestBody = { file }; // img file

output (identical to TextGenerationWidget):

Array<{ generated_text: string; }>

example:

[
  {
    "generated_text": "Some text"
  }
]
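A minimal sketch of how a client might type and validate the output shape described above. The names `ImageToTextOutputItem`, `ImageToTextOutput`, and `isImageToTextOutput` are hypothetical and are not taken from the widget code; only the `generated_text` field comes from this PR.

```typescript
// Output shape described in this PR: an array of objects,
// each carrying a generated_text string.
interface ImageToTextOutputItem {
  generated_text: string;
}
type ImageToTextOutput = ImageToTextOutputItem[];

// Hypothetical helper (not part of the widget code): runtime check
// that an unknown parsed response matches the shape above.
function isImageToTextOutput(value: unknown): value is ImageToTextOutput {
  return (
    Array.isArray(value) &&
    value.every(
      (item) =>
        typeof item === "object" &&
        item !== null &&
        typeof (item as { generated_text?: unknown }).generated_text === "string"
    )
  );
}

// Parse the example response from this PR and read the first caption, if any.
const sample: unknown = JSON.parse('[{"generated_text": "Some text"}]');
const firstCaption: string | undefined =
  isImageToTextOutput(sample) ? sample[0]?.generated_text : undefined;
```

Validating at runtime keeps the widget from rendering a malformed response, which matters here because the thread below notes that the exact output shape was still being discussed.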

Note:

In the transformers implementation, I see that the pipeline is called image-to-text-generation, while on the hub we already defined it as image-to-text, without the generation part. Please let me know if this is an issue: @Narsil @osanseviero

todos:

  • test when api-inference is up @Narsil
  • document widget input sample

@mishig25 mishig25 changed the title from "Implement ing2text widget" to "Implement img2text widget" Sep 2, 2022
@osanseviero
Contributor

Related to:

> In the transformers implementation, I see that the pipeline is called image-to-text-generation, while on the hub we already defined it as image-to-text, without the generation part. Please let me know if this is an issue:

This should take care of it: huggingface/transformers#18864

@Narsil

Narsil commented Sep 2, 2022

LGTM. I don't think we chose List[{"generated_text": str}] for the output but {"generated_text": str} only (I think).

I think being aligned with what we already have is better.

@OlivierDehaene In general, unfortunately since the pipelines were created at different times by different people originally, the exact output types are not super normalized, especially on the List, List of List stuff. This is something I would like to change on v5 (whenever we decide to go for it and I don't think we have plans) so that the pipeline code could be more regular.

For text-generation it can really generate multiple texts (it's controlled by num_return_sequences), and I think enabling the list here also enables more use cases (even for captioning you might be interested in generating several at once and choosing the best).

However, I don't think we should be highly strict in v4, since every pipeline is ever so slightly different from its neighbor.
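Since the comment above leaves open whether the pipeline returns a single `{"generated_text": str}` object or a list of them, a consumer could accept both. The following is only a sketch under that assumption; `GeneratedText` and `normalizeGenerated` are hypothetical names, not part of the widget or of transformers.

```typescript
interface GeneratedText {
  generated_text: string;
}

// Hypothetical helper: accept either a single object or a list and
// always hand back a list, so downstream code has one shape to deal with.
function normalizeGenerated(
  output: GeneratedText | GeneratedText[]
): GeneratedText[] {
  return Array.isArray(output) ? output : [output];
}

// A single result is wrapped into a one-element list.
const single = normalizeGenerated({ generated_text: "a cat" });

// A list (for example, several candidate captions) passes through unchanged.
const many = normalizeGenerated([
  { generated_text: "a cat" },
  { generated_text: "a dog" },
]);
```

Normalizing to a list keeps the widget compatible with either choice and leaves room for returning several captions at once.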

@osanseviero
Contributor

Closing as this now lives in https://github.com/huggingface/hub-docs.
