
Add ONNX support for LayoutLMv3 #17953

Merged

merged 6 commits into huggingface:main from layoutlmv3_onnx on Jun 30, 2022

Conversation

regisss
Contributor

@regisss regisss commented Jun 29, 2022

What does this PR do?

This PR adds ONNX support for LayoutLMv3. Linked to #16308.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@regisss regisss requested a review from lewtun June 29, 2022 19:00
@regisss
Contributor Author

regisss commented Jun 29, 2022

@NielsRogge The supported tasks are question answering, token classification and sequence classification. Is there any other use case that should be supported?

Also, the order of input arguments to the forward method of LayoutLMv3ForSequenceClassification and LayoutLMv3ForQuestionAnswering differs from that of LayoutLMv3ForTokenClassification and LayoutLMv3Model. This is taken care of in the ONNX config, because I guess modifying it in modeling_layoutlmv3.py is not an option, since it would break backward compatibility, right?

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jun 29, 2022

The documentation is not available anymore as the PR was closed or merged.

@regisss
Contributor Author

regisss commented Jun 29, 2022

@lewtun All slow tests passed

Comment on lines +198 to +215
if self.task in ["question-answering", "sequence-classification"]:
    return OrderedDict(
        [
            ("input_ids", {0: "batch", 1: "sequence"}),
            ("attention_mask", {0: "batch", 1: "sequence"}),
            ("bbox", {0: "batch", 1: "sequence"}),
            ("pixel_values", {0: "batch", 1: "sequence"}),
        ]
    )
else:
    return OrderedDict(
        [
            ("input_ids", {0: "batch", 1: "sequence"}),
            ("bbox", {0: "batch", 1: "sequence"}),
            ("attention_mask", {0: "batch", 1: "sequence"}),
            ("pixel_values", {0: "batch", 1: "sequence"}),
        ]
    )
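As a side note (a minimal sketch, not from the PR; the helper name is hypothetical), the reason two branches are needed is that the ONNX exporter binds inputs positionally, so the OrderedDict keys must follow each head's forward argument order:

```python
from collections import OrderedDict

def layoutlmv3_onnx_inputs(task):
    # Hypothetical helper mirroring the config above: axis 0 is the batch
    # dimension, axis 1 the sequence length (pixel_values omitted for brevity).
    if task in ["question-answering", "sequence-classification"]:
        # These heads take (input_ids, attention_mask, bbox, ...) positionally.
        return OrderedDict(
            [
                ("input_ids", {0: "batch", 1: "sequence"}),
                ("attention_mask", {0: "batch", 1: "sequence"}),
                ("bbox", {0: "batch", 1: "sequence"}),
            ]
        )
    # Token classification and the base model take (input_ids, bbox, attention_mask, ...).
    return OrderedDict(
        [
            ("input_ids", {0: "batch", 1: "sequence"}),
            ("bbox", {0: "batch", 1: "sequence"}),
            ("attention_mask", {0: "batch", 1: "sequence"}),
        ]
    )

# Same key set, different order: a single shared ordering would silently
# feed bbox where attention_mask is expected for two of the heads.
print(list(layoutlmv3_onnx_inputs("question-answering")))
print(list(layoutlmv3_onnx_inputs("token-classification")))
```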
Contributor

This is unfortunate, but we probably can't change it due to backwards compatibility.

cc @LysandreJik

Member

I wonder if we could add a deprecation warning to notify users that the order of arguments will change in v5? (Or is that type of breaking change too extreme?)

Member

@lewtun lewtun left a comment

Thanks a lot for adding another novel ONNX config @regisss - you're on 🔥 !

This PR LGTM, so gently pinging @LysandreJik or @sgugger for final approval 🙏

Edit: I just saw that some of the CI tests failed. I've re-run them as they don't seem to be related to your changes. If they fail again, I suggest rebasing on main and pushing again to see if that resolves it. We should wait until the CI is green before merging ;-)

@@ -40,6 +40,7 @@

if TYPE_CHECKING:
from ..feature_extraction_utils import FeatureExtractionMixin
from ..processing_utils import ProcessorMixin
Member

Nice use of the type checks!

@regisss
Contributor Author

regisss commented Jun 30, 2022

CI fails because of the following error:

Traceback (most recent call last):
  File "utils/check_repo.py", line 768, in <module>
    check_repo_quality()
  File "utils/check_repo.py", line 762, in check_repo_quality
    check_all_objects_are_documented()
  File "utils/check_repo.py", line 675, in check_all_objects_are_documented
    + "\n - ".join(undocumented_objs)
Exception: The following objects are in the public init so should be documented:
 - OptionalDependencyNotAvailable
 - dummy_scatter_objects
 - sys

It seems to come from the following line in configuration_layoutlmv3.py:

from ...processing_utils import ProcessorMixin

Collaborator

@sgugger sgugger left a comment

That is what happens when you introduce new cyclical imports, it creates hard-to-debug errors everywhere ;-)

So let's contain imports for type checking as forward references and not do top-level imports :-)
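For readers following along, the suggested pattern looks roughly like this (a sketch with a hypothetical module path, not the actual patch): the import is guarded by TYPE_CHECKING so it only runs under a static type checker, and the annotation becomes a string forward reference, so no runtime import (and no import cycle) occurs.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated only by static type checkers (mypy, pyright), never at
    # runtime, so it cannot contribute to a circular import at load time.
    from processing_utils import ProcessorMixin  # hypothetical module path

def export_with(processor: "ProcessorMixin") -> str:
    # "ProcessorMixin" is a string forward reference, resolved lazily;
    # the real class is never needed when this module is imported.
    return type(processor).__name__
```

At runtime any object can be passed; the annotation only matters to the type checker.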

@regisss
Contributor Author

regisss commented Jun 30, 2022

Wow thanks a lot @sgugger for the clear explanation, it makes complete sense!

@regisss
Contributor Author

regisss commented Jun 30, 2022

CI and slow tests all passed. It should be ready now @sgugger @lewtun

@sgugger
Collaborator

sgugger commented Jun 30, 2022

Thanks!

@sgugger sgugger merged commit 9cb7cef into huggingface:main Jun 30, 2022
@regisss regisss deleted the layoutlmv3_onnx branch June 30, 2022 16:11
@githublsk

@regisss Thank you for your great work. When converting the LayoutXLM model LayoutLMv2ForRelationExtraction to ONNX, we are blocked by the relation extraction layer for some reason. Could you try to export the LayoutLMv2ForRelationExtraction model to ONNX and give us some help? Great thanks to you!

@githublsk

@NielsRogge Thanks for your great work. When I convert LayoutLMv2ForRelationExtraction to ONNX, I cannot export the relation extraction layer. Can you help me solve this? The deadline for my project is coming up, so I hope you can help me. Thank you very much.

@regisss
Contributor Author

regisss commented Jul 1, 2022

@regisss Thank you for your great work. When converting the LayoutXLM model LayoutLMv2ForRelationExtraction to ONNX, we are blocked by the relation extraction layer for some reason. Could you try to export the LayoutLMv2ForRelationExtraction model to ONNX and give us some help? Great thanks to you!

@githublsk Where did you find LayoutLMv2ForRelationExtraction? I cannot find it in Transformers.

@githublsk

@regisss It is in microsoft/unilm; please refer to the link:
https://github.com/microsoft/unilm/blob/master/layoutlmft/layoutlmft/models/layoutlmv2/modeling_layoutlmv2.py
[screenshot of the model code]
It is useful for relation extraction, but when running the ONNX export, an error occurs as below:
[screenshot of the error]
The ONNX graph is as below:
[screenshot of the ONNX graph]
I cannot find the reason, which confuses me. I am facing the deadline for my project, so it is quite urgent.

@githublsk

@regisss If you have time, please help us. I am new to this; great thanks to you!

@regisss
Contributor Author

regisss commented Jul 2, 2022

@regisss If you have time, please help us. I am new to this; great thanks to you!

@githublsk Please open an issue, since this is not related to this PR. Also provide the command/script you ran along with the complete error message; screenshots are not very helpful.

@githublsk

@regisss Thank you for your great help. I opened an issue at the link below; can you help me? The deadline is coming and this bothers me a lot, so we hope you can help us resolve it. Great thanks.

#17999

viclzhu pushed a commit to viclzhu/transformers that referenced this pull request Jul 18, 2022
* Add ONNX support for LayoutLMv3

* Update docstrings

* Update empty description in docstring

* Fix imports and type hints
@gjj123

gjj123 commented Sep 8, 2022

@githublsk
how did you solve the ONNX conversion error:
Exporting the operator bilinear to ONNX opset version 13 is not supported

    super(BiaffineAttention, self).__init__()
    self.in_features = in_features
    self.out_features = out_features
    self.bilinear = torch.nn.Bilinear(in_features, in_features, out_features, bias=False)

@pbcquoc

pbcquoc commented Jan 10, 2023

@gjj123 replace torch.nn.Bilinear with this one:

import torch
import torch.nn as nn

class Bilinear(nn.Module):
    def __init__(self, in1_features, in2_features, out_features):
        super(Bilinear, self).__init__()
        self.weight = torch.nn.Parameter(torch.zeros((in1_features, in2_features, out_features)))
        self.bias = torch.nn.Parameter(torch.zeros(out_features))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, x, y):
        # x: (N, in1), y: (N, in2); weight.permute(2, 0, 1) has shape (out, in1, in2)
        t = x @ self.weight.permute(2, 0, 1)  # (out, N, in2)
        output = (t * y).sum(dim=2).t()       # (N, out)
        # Drop the bias term to exactly mirror nn.Bilinear(..., bias=False),
        # which is what BiaffineAttention uses.
        return output + self.bias
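As a quick sanity check (a NumPy sketch, not part of the thread), the permute/multiply/sum trick above computes the same bilinear form as an explicit einsum over x, W, and y:

```python
import numpy as np

rng = np.random.default_rng(0)
N, in1, in2, out = 4, 3, 5, 2
x = rng.normal(size=(N, in1))
y = rng.normal(size=(N, in2))
W = rng.normal(size=(in1, in2, out))

# The trick: (N, in1) @ (out, in1, in2) broadcasts to (out, N, in2),
# then multiplying by y and summing the last axis gives (out, N).
t = x @ W.transpose(2, 0, 1)
trick = (t * y).sum(axis=2).T  # (N, out)

# Reference: output[n, o] = sum_{i,j} x[n, i] * W[i, j, o] * y[n, j]
ref = np.einsum("ni,ijo,nj->no", x, W, y)

assert np.allclose(trick, ref)
```

Because the trick uses only matmul, elementwise multiply, sum, and transpose, every operator in it has a standard ONNX mapping, unlike the `bilinear` operator itself.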


8 participants