Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BaseImageProcessor #26

Merged
merged 62 commits into from
Aug 17, 2022
Merged

Conversation

amyeroberts
Copy link
Owner

What does this PR do?

Introduces the BaseImageProcessor class for all other model image processors to inherit from.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Aug 8, 2022

The documentation is not available anymore as the PR was closed or merged.

@amyeroberts amyeroberts changed the base branch from base-image-processor-class to image-processor-mixin August 8, 2022 11:23
amyeroberts and others added 2 commits August 8, 2022 12:31
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Copy link
Collaborator

@alaradirik alaradirik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, excited to see this moving forward :)

data (`dict`):
Dictionary of lists/arrays/tensors returned by the __call__/pad methods ('pixel_values', 'attention_mask',
etc.).
tensor_type (`Union[None, str, TensorType]`, *optional*):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick question - are we going to default to "np" ? If so, maybe we can remove None from the accepted argument types or make it a non-optional argument

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends partly on whether we merge in: huggingface#18499

Defaulting to "np" isn't necessary to be able to use different combinations of e.g. do_resize and do_normalize. As we're aliasing the previous feature extractors with the new image processors, change the default would still be a breaking change.

If we decided to default to "np", then we'd have to include additional checks on the processed images. At the moment, because resize resizes the images to multiples of size_divisor, they are not guaranteed to all be the same size. This means calls the BatchFeature will fail if any of "np", "tf", "pt"or"jax"` are passed in as the images can't be batched together.

My preference would be to keep return_tensors=None as this more closely matches the behaviour of our tokenizers. However, our tokenizer provides arguments such that batches can be created e.g. padding=True. Not sure if an equivalent makes sense here.

What do you think? If we want to set "np" as default we should discuss how to handle introducing the image processors versus introducing that change.
cc @NielsRogge

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Base automatically changed from image-processor-mixin to image-transforms-library August 16, 2022 15:40
@amyeroberts amyeroberts merged commit 62c6e55 into image-transforms-library Aug 17, 2022
@amyeroberts amyeroberts deleted the image-batch-feature branch August 17, 2022 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants