
Adds IBERT to models exportable with ONNX #14868

Merged: 6 commits into huggingface:master on Jan 11, 2022

Conversation

MaximovaIrina (Contributor)

Adds IBERT to models exportable with ONNX

@LysandreJik (Member) left a comment:

Nice!! Pinging @michaelbenayoun and @lewtun

@lewtun (Member) left a comment:

Thank you for this very clean contribution, @MaximovaIrina!

I've left a minor remark, but otherwise this is looking really good. Could you please check that the "slow" tests pass for this model:

RUN_SLOW=1 pytest tests/test_onnx_v2.py -k "ibert" -rp

These tests aren't run until the PR is merged, so it's good to know in advance if there are any problems :)

@@ -123,3 +127,22 @@ def __init__(
self.position_embedding_type = position_embedding_type
self.quant_mode = quant_mode
self.force_dequant = force_dequant


class IBertOnnxConfig(OnnxConfigWithPast):
Member:

For encoder-based models like IBERT, I think it makes more sense to use OnnxConfig instead of OnnxConfigWithPast. The OnnxConfigWithPast is usually reserved for decoder-based and encoder-decoder models where having access to the past key values is important for text generation.

Is there a reason why you chose this particular config?

MaximovaIrina (Contributor, PR author):

I followed the recommendations in this tutorial. In particular, I was looking at this patch.

@lewtun (Member), Dec 23, 2021:

Thanks for the information @MaximovaIrina - we're in the process of revamping the docs / tutorial, so hopefully it will be less confusing in future!

The patch you linked to is for MBART, which is an encoder-decoder model where having access to past key values is important. In your case with IBERT, I recommend looking at the DistilBERT example which inherits from OnnxConfig.

So my suggestion would be to replace your use of OnnxConfigWithPast with OnnxConfig.

As @michaelbenayoun notes elsewhere, you can also use DistilBERT as a guide for adding all the supported features.
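
For reference, a minimal sketch of what such an encoder-style config could look like, modeled on the DistilBERT example. The dynamic-axis names and the standalone import path are assumptions for illustration, not the merged diff:

# Sketch of an encoder-style ONNX config for IBERT, modeled on DistilBERT's.
# Inside configuration_ibert.py the import would be the relative
# `from ...onnx import OnnxConfig`; the absolute form is used here so the
# snippet is self-contained.
from collections import OrderedDict
from typing import Mapping

from transformers.onnx import OnnxConfig


class IBertOnnxConfig(OnnxConfig):
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # Both inputs have dynamic batch and sequence dimensions, matching
        # the other encoder configs.
        return OrderedDict(
            [
                ("input_ids", {0: "batch", 1: "sequence"}),
                ("attention_mask", {0: "batch", 1: "sequence"}),
            ]
        )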

)

@property
def outputs(self) -> Mapping[str, Mapping[int, str]]:
Member:

You do not need to override the outputs property: it is already handled by OnnxConfig for each of the tasks supported by IBERT (the outputs depend on the task at hand, and the base class outputs property returns the proper outputs for a given task).
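
As a hedged illustration (this assumes the OnnxConfig(config, task=...) constructor and that the class ends up exposed from configuration_ibert.py), the task-specific outputs then come from the base class:

from transformers import IBertConfig
from transformers.models.ibert.configuration_ibert import IBertOnnxConfig

# Instantiate the ONNX config for a specific task; the base OnnxConfig
# resolves the output names and dynamic axes from the task, so no
# `outputs` override should be needed.
onnx_config = IBertOnnxConfig(IBertConfig(), task="masked-lm")
print(onnx_config.outputs)  # e.g. OrderedDict([("logits", {0: "batch", 1: "sequence"})])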

@@ -63,6 +64,7 @@ class FeaturesManager:
"bart": supported_features_mapping("default", onnx_config_cls=BartOnnxConfig),
"mbart": supported_features_mapping("default", onnx_config_cls=MBartOnnxConfig),
"bert": supported_features_mapping("default", onnx_config_cls=BertOnnxConfig),
"ibert": supported_features_mapping("default", onnx_config_cls=IBertOnnxConfig),
Member:

Can you please add all the supported features for IBERT? Those are (a sketch of the resulting entry follows this list):

  • masked-lm
  • sequence-classification
  • multiple-choice (this one is not supported yet)
  • token-classification
  • question-answering
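
A sketch of what the resulting FeaturesManager entry could look like, mirroring the neighbouring encoder entries in the diff above (multiple-choice omitted since it is not supported yet); treat it as an illustration rather than the final diff:

# In the FeaturesManager mapping (transformers/onnx/features.py):
"ibert": supported_features_mapping(
    "default",
    "masked-lm",
    "sequence-classification",
    "token-classification",
    "question-answering",
    onnx_config_cls=IBertOnnxConfig,
),

Once registered, each feature should then be selectable from the export CLI with something like python -m transformers.onnx --model=kssteven/ibert-roberta-base --feature=masked-lm onnx/ (the checkpoint name here is only illustrative).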

@lewtun (Member) commented Dec 23, 2021:

Hey @MaximovaIrina, we recently switched all of our documentation from RST to MDX, so you can rebase on master to eliminate the conflict with docs/source/serialization.rst.

MaximovaIrina (Contributor, PR author):

Checking slow tests for ibert
[screenshot of the test output]

@@ -0,0 +1,433 @@
..
Copyright 2020 The HuggingFace Team. All rights reserved.
Member:

This file shouldn't be included (must be a Git problem with the switch from RST to MDX). Could you please delete it from the PR?

MaximovaIrina (Contributor, PR author):

As you say

Member:

Hmm I still see the file in the PR - is it possible you only reverted the diff? If a normal rm docs/source/serialization.rst + git commit doesn't fix it, perhaps running

git rm docs/source/serialization.rst

does?

MaximovaIrina (Contributor, PR author), Dec 23, 2021:

Now it should be okay
TU

@lewtun (Member) left a comment:

Thank you very much for this last round of changes @MaximovaIrina!

Once the CI tests pass, this PR LGTM 🚀 - OK for you too, @LysandreJik?

@lewtun (Member) commented Jan 10, 2022:

Hey @MaximovaIrina, would you mind rebasing your branch on master and resolving the merge conflicts?

Gently pinging @LysandreJik for his blessing on this PR too :)

@LysandreJik (Member) left a comment:

Looks good to me! Thanks for your PR, @MaximovaIrina!

@LysandreJik (Member):

Could you just run the code quality tools to ensure that the code quality checks pass? You can install them with the following, from the root of your clone:

pip install -e ".[quality]"

And then run them with:

make fixup

@lewtun (Member) commented Jan 11, 2022:

Merging this since all the CI checks now pass and we have approval from a core maintainer. Thank you for your contribution @MaximovaIrina 🤗 !

@lewtun merged commit c4fa908 into huggingface:master on Jan 11, 2022