
[RFC] Laying down building stone for more flexible ONNX export capabilities #11786

Merged · 69 commits merged from onnx_export_v2 into master on Jul 8, 2021

Conversation

@mfuntowicz (Member) commented on May 20, 2021:

This PR reworks the way the ONNX export tool works by introducing a static, checked description format that provides the ONNX exporters (PyTorch almost done, TF will follow) with all the required knobs.

More specifically, this PR introduces the following concepts (a minimal sketch follows the list):

  • An OnnxConfig dataclass, which requires each supported model to describe all the properties needed to generate a proper export
  • An OnnxVariable namedtuple, which describes a variable by its name, its shape and, potentially, how many times it is "repeated" => useful for past_keys
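To make the two concepts concrete, here is a minimal sketch of what they amount to. The field names are illustrative assumptions, not the actual definitions from this PR:

```python
from dataclasses import dataclass
from typing import NamedTuple, Tuple

class OnnxVariable(NamedTuple):
    name: str                # e.g. "input_ids"
    axes: Tuple[str, ...]    # symbolic shape, e.g. ("batch", "sequence")
    repeated: int = 1        # how many times the variable occurs (past_keys)

@dataclass(frozen=True)
class OnnxConfig:
    inputs: Tuple[OnnxVariable, ...]
    outputs: Tuple[OnnxVariable, ...]
    use_past: bool = False

# Example: a BERT-like encoder description
bert_config = OnnxConfig(
    inputs=(
        OnnxVariable("input_ids", ("batch", "sequence")),
        OnnxVariable("attention_mask", ("batch", "sequence")),
    ),
    outputs=(
        OnnxVariable("last_hidden_state", ("batch", "sequence", "hidden")),
    ),
)
```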

The initial test case was done on the BART model, without use_cache=True support.

For the sake of completeness: dropping support for use_cache=True is currently needed because the past_keys output structure is a doubly nested tuple, which would require multiple levels of dynamic axes, something ONNX does not currently support.

This is something we might work on in the future, potentially introducing an ONNX-compatible output structure that gets rid of the nested-tuple layout and can be activated from a config property (to be discussed further later on).
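To illustrate the nesting problem: past_key_values is typically a tuple (one entry per layer) of (key, value) tensor pairs, while ONNX dynamic axes are declared per flat, uniquely named tensor. A sketch of the flattening idea (not code from this PR; the naming scheme is an assumption):

```python
# Sketch: flatten the doubly nested past_key_values layout into flat,
# uniquely named tensors that ONNX dynamic axes can describe.
def flatten_past_key_values(past_key_values):
    # Assumed layout: tuple(num_layers) of (key, value) tensors.
    flat = {}
    for layer_idx, (key, value) in enumerate(past_key_values):
        flat[f"past_key_values.{layer_idx}.key"] = key
        flat[f"past_key_values.{layer_idx}.value"] = value
    return flat
```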

Update 1:

  • I managed to enable exporting with nested structures such as past_key_values for GPT2.
  • Still need to enable the same when feeding such values back as inputs to the model.

Supported models:

  • ALBERT
  • BART (with & without past)
  • BERT
  • DistilBERT
  • Longformer => support is written, but the export fails because of missing ops; needs investigation.
  • GPT2 (with & without past)
  • Roberta
  • T5
  • XLM-Roberta

@mfuntowicz (Member, Author) commented on May 25, 2021:

Example of a potential command line to export bert-base-cased:

python3 -m transformers.onnx -f pytorch --model=bert-base-cased --features=default --optimize --optimization-level=all onnx/bert-base-cased/
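For reference, consuming the exported model with ONNX Runtime would then look roughly like this (a sketch; the output filename model.onnx and the exact input names are assumptions about what the exporter produces):

```python
import onnxruntime as ort
from transformers import AutoTokenizer

# Assumed output location; the exporter's actual file naming may differ.
session = ort.InferenceSession("onnx/bert-base-cased/model.onnx")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

encoded = tokenizer("Hello, ONNX!", return_tensors="np")
outputs = session.run(None, dict(encoded))
print(outputs[0].shape)  # (batch, sequence, hidden)
```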

@Narsil (Contributor) left a comment:


Late to the party, but just a suggestion:

Why doesn't OnnxConfig take the real Config object as an argument? It would make all the string-based inference like $config.hidden_size unnecessary, no?
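For illustration, the suggestion amounts to something like this (a hypothetical sketch, not the PR's actual code):

```python
from dataclasses import dataclass
from transformers import AutoConfig, PretrainedConfig

@dataclass
class OnnxConfig:
    config: PretrainedConfig  # the real model config, as suggested

    @property
    def hidden_size(self) -> int:
        # Read the value directly from the config object instead of
        # resolving a string template such as "$config.hidden_size".
        return self.config.hidden_size

onnx_config = OnnxConfig(config=AutoConfig.from_pretrained("bert-base-cased"))
print(onnx_config.hidden_size)  # 768
```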

@mfuntowicz (Member, Author) commented:

Idea: rename convert_pytorch to export so we have exactly the same hierarchy as PyTorch:

  • PyTorch: torch.onnx.export
  • Transformers: transformers.onnx.export

wdyt?

@LysandreJik (Member) commented:

That's a great idea!

@mfuntowicz (Member, Author) commented:

@Narsil we moved forward with your suggestion, can you have a look (one more time 😄)? 🙏🏻

@LysandreJik LysandreJik mentioned this pull request Jul 8, 2021
@LysandreJik LysandreJik merged commit 2aa3cd9 into master Jul 8, 2021
@LysandreJik LysandreJik deleted the onnx_export_v2 branch July 8, 2021 14:54
@leoozy commented on Jul 20, 2021:

Hello, when can we use transformers.onnx?

@LysandreJik (Member) commented:

You already can when installing from source:

pip install git+https://github.com/huggingface/transformers

We'll do a release this week (probably Thursday or Friday) and it will then be available on PyPI.

Excerpt from the diff (the dynamic-axes declaration for the BART inputs):

```python
[
    ("input_ids", {0: "batch", 1: "encoder_sequence"}),
    ("attention_mask", {0: "batch", 1: "encoder_sequence"}),
    ("decoder_input_ids", {0: "batch"}),
    # (excerpt truncated)
```
@kagrze commented on Aug 11, 2021:


I see that the decoder_input_ids length is fixed. I guess this makes sense when use_past is True, because we feed only one token (the one generated in the previous step). However, when use_past is False we need to feed all the previously generated tokens, don't we?
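For illustration, the fix implied here would be to declare a dynamic sequence axis on decoder_input_ids in the use_past=False case (a sketch; the axis name "decoder_sequence" is an assumption):

```python
# When use_past is False, all previously generated tokens are fed back in,
# so decoder_input_ids needs a dynamic sequence dimension too.
inputs_without_past = [
    ("input_ids", {0: "batch", 1: "encoder_sequence"}),
    ("attention_mask", {0: "batch", 1: "encoder_sequence"}),
    ("decoder_input_ids", {0: "batch", 1: "decoder_sequence"}),
]
```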

@Avi-avidan commented:

Hi, this thread is super important. Is there support for exporting BART text2text_generation to ONNX (more specifically, for summarization tasks)?
