Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make
pipeline
able to loadprocessor
#32514Make
pipeline
able to loadprocessor
#32514Changes from 26 commits
a3465ce
e377b29
500098d
bab9a57
7fd209f
3db0e0b
71c2d5b
a010357
c526e08
d45d7f6
a95d556
94f5616
2bd4e0e
49ec283
5686833
0a2349e
7d507a6
3873c1c
92d2be8
01d6040
ab7a229
6e0dde8
4c834b3
6a8e590
0533995
e06053d
d553dab
6ff4e68
a6993b5
5799775
9793e01
e712717
bcce4dc
64f002c
cbb813f
6996695
b67d24f
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are now a few overlapping inputs:
I believe it would be nice to highlight somewhere visible (like in the documentation above) what attribute is necessary for what: at no point should a user specify all four of them, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a separate
Note
section to highlight that we should not provide all types ofprocessors
at oncee712717 and refer to a specific pipeline in case one would like to provide them explicitly.
For each specific pipeline we have only required processors args in docs section configured with docs decorator. e.g. here
transformers/src/transformers/pipelines/image_classification.py
Line 53 in e782e95
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, updated pipeline doc to more relevant one in 6996695
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this conditional is redundant - if
feature_extractor
is astr
then it will always benot None
as well.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in d553dab
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For backward compatibility, we can control with
Pipeline
class if we need to load specific processors/tokenizers. For example, for zero-shot object detection, we will need to load only theprocessor
, and do not need to loadimage_processor
and tokenizer separately. Other legacy pipelines might load only tokenizer and image_processor, even if they have processor class.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Piggy-backing on the comment above, this is likely something we want to highlight very clearly in each pipeline's documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With transformers already having too many warnings, I'd be cautious about the ones we add.
What purpose does this warning serve? Is it sufficiently actionable? Does it concern users (the ones that will see it), or repo owners/creators that have not configured their processors/feature extractors correctly (that will likely not see this warning)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the questions!
There was a discussion here. The main idea is that
AutoProcessor
can load not only processors but also basic processing classes, liketokenizer
.Indeed, a misconfiguration in the model-pipeline-processor setup could trigger this and raising a warning + dropping the
processor
might be sufficient only when theprocessor
isn't needed in thepipeline
at all.However, with granular control, such a case shouldn't occur. Therefore, it seems more appropriate to replace the warning with an error to clearly indicate a misconfiguration. Otherwise, the error will happen later with a less clear message because the
processor
is None.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to error raise here bcce4dc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get this comment 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried to make it clearer in 9793e01
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
granular control for loading, see comment in the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense and I appreciate us being explicit. This repeats what is said above, but this should be extremely clear in the pipelines documentation if possible