Don't use LayoutLMv2 and LayoutLMv3 in some pipeline tests #22774
Conversation
The documentation is not available anymore as the PR was closed or merged.
# `DocumentQuestionAnsweringPipeline` is expected to work with this model, but it combines the text and visual
# embedding along the sequence dimension (dim 1), which causes an error during post-processing as `p_mask` has
# the sequence dimension of the text embedding only.
# (see the line `embedding_output = torch.cat([embedding_output, visual_embeddings], dim=1)`)
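For illustration, a minimal sketch of the shape mismatch described in the comment above; the tensor sizes are made up for this example and do not come from the actual LayoutLMv2 code:

```python
import torch

# Hypothetical sizes, for illustration only.
text_seq_len = 512      # length of the tokenized text, which is what `p_mask` is built from
num_visual_tokens = 49  # e.g. a 7x7 grid of visual patch embeddings
hidden_size = 768

embedding_output = torch.rand(1, text_seq_len, hidden_size)
visual_embeddings = torch.rand(1, num_visual_tokens, hidden_size)

# The model concatenates text and visual embeddings along the sequence dimension,
# so its outputs cover text_seq_len + num_visual_tokens positions ...
embedding_output = torch.cat([embedding_output, visual_embeddings], dim=1)
print(embedding_output.shape)  # torch.Size([1, 561, 768])

# ... while `p_mask`, built from the tokenized text only, has text_seq_len entries,
# which is why masking the model's logits with it fails during post-processing.
p_mask = torch.zeros(1, text_seq_len)
print(p_mask.shape)  # torch.Size([1, 512])
```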
cc @NielsRogge. We might need to discuss this at some point.
return super().is_pipeline_test_to_skip(
    pipeline_test_casse_name, config_class, model_architecture, tokenizer_name, processor_name
)
Remove all these - we just don't add them to be tested in the first place.
if model_mapping is not None:
    model_mapping = {config: model for config, model in model_mapping.items() if config.__name__ not in _TO_SKIP}
if tf_model_mapping is not None:
    tf_model_mapping = {
        config: model for config, model in tf_model_mapping.items() if config.__name__ not in _TO_SKIP
    }
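For context, a hedged sketch of how this filtering pattern is meant to sit in a pipeline test class; the class name and the question-answering auto-mappings are illustrative choices, not necessarily the exact files touched by this PR:

```python
import unittest

from transformers import MODEL_FOR_QUESTION_ANSWERING_MAPPING, TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING

# These 2 model types require image inputs in addition to text, so they are
# filtered out before the pipeline tests are parameterized.
_TO_SKIP = {"LayoutLMv2Config", "LayoutLMv3Config"}


class QAPipelineTests(unittest.TestCase):
    model_mapping = MODEL_FOR_QUESTION_ANSWERING_MAPPING
    tf_model_mapping = TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING

    # The filtering happens at class-definition time, so the skipped configs never
    # become pipeline test cases in the first place.
    if model_mapping is not None:
        model_mapping = {config: model for config, model in model_mapping.items() if config.__name__ not in _TO_SKIP}
    if tf_model_mapping is not None:
        tf_model_mapping = {
            config: model for config, model in tf_model_mapping.items() if config.__name__ not in _TO_SKIP
        }
```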
This is to avoid the changes to `pipeline_model_mapping` in this PR being reverted the next time we update the repository with the script in `add_pipeline_model_mapping_to_test.py`.
This also makes things more explicit: those models are not for this pipeline (test)
Thanks for cleaning this up. It looks like the models shouldn't have been added to the auto-mappings then, if they don't have a consistent API to be used in the pipeline. But it's too late to change this now!
Merging now as this PR only touches tests. Feel free to leave a comment if you have any, @NielsRogge.
What does this PR do?
These 2 models require a different input format than that of the usual text models. See the relevant code block at the end.
The offline discussion with @NielsRogge concluded that these 2 models are only meant for the DocQA pipeline, despite having implementations for other head tasks.
Therefore, this PR removes these 2 models from the pipeline tests in the first place, instead of skipping them at a later point.
IMO, we should also stop using these models in the pipeline classes (except DocQA) if they are not going to work, but this PR doesn't touch that.
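To make the "remove up front vs. skip later" distinction concrete, a rough sketch of the two approaches follows; the mixin stub and the skip-hook body are simplified stand-ins, not the actual code removed or added by this PR:

```python
class PipelineTesterMixin:
    """Minimal stand-in for the real test mixin, just to keep this sketch self-contained."""

    def is_pipeline_test_to_skip(
        self, pipeline_test_casse_name, config_class, model_architecture, tokenizer_name, processor_name
    ):
        return False


class SomePipelineTests(PipelineTesterMixin):
    # Old approach: LayoutLMv2/v3 stay in the tested mapping and each of their cases
    # is skipped at run time through this hook.
    def is_pipeline_test_to_skip(
        self, pipeline_test_casse_name, config_class, model_architecture, tokenizer_name, processor_name
    ):
        if config_class is not None and config_class.__name__ in ("LayoutLMv2Config", "LayoutLMv3Config"):
            return True
        return super().is_pipeline_test_to_skip(
            pipeline_test_casse_name, config_class, model_architecture, tokenizer_name, processor_name
        )


# New approach (this PR): the two configs never enter `model_mapping` at all, via the
# `_TO_SKIP` filtering shown earlier, so no per-case skip logic is needed for them.
```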
LayoutLMv3 with `DocumentQuestionAnsweringPipeline` (and the pipeline test) is still not working due to some issue. We need to discuss with @NielsRogge to see if it could be fixed, but it's out of this PR's scope.

Relevant code block:
transformers/src/transformers/models/layoutlmv3/tokenization_layoutlmv3.py
Lines 610 to 625 in daf5324