[RFC] Laying down building stone for more flexible ONNX export capabilities #11786
Conversation
Example of potential command line to export
Late to the party, but just a suggestion: why doesn't OnnxConfig take the real Config object as an argument? It would make all the string-based lookups like $config.hidden_size unnecessary, no?
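A minimal sketch of the difference being suggested, assuming a string template of the form "$config.hidden_size" resolved via getattr (the DummyConfig class and resolve helper are hypothetical, for illustration only):

```python
from dataclasses import dataclass

# Hypothetical stand-in for transformers.PretrainedConfig.
@dataclass
class DummyConfig:
    hidden_size: int = 768

def resolve(template: str, config: DummyConfig) -> int:
    # String-based indirection: store "$config.hidden_size" and look
    # the attribute up later on the real config object.
    prefix = "$config."
    assert template.startswith(prefix)
    return getattr(config, template[len(prefix):])

config = DummyConfig()
# With the real config object in hand, the lookup becomes direct:
assert resolve("$config.hidden_size", config) == config.hidden_size == 768
```

Passing the config object to OnnxConfig would let it read config.hidden_size directly instead of deferring through such string templates.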
See the contributed docs here: https://235542-155220641-gh.circle-artifacts.com/0/docs/_build/html/serialization.html
Idea: Rename the
wdyt?
That's a great idea!
@Narsil we moved forward on your suggestion, can you have a look (one more time 😄) 🙏🏻
This reverts commit f665efb.
Hello, when can we use transformers.onnx?
You already can when installing from source.
We'll do a release this week (probably Thursday or Friday), and it will then be available on PyPI.
[
    ("input_ids", {0: "batch", 1: "encoder_sequence"}),
    ("attention_mask", {0: "batch", 1: "encoder_sequence"}),
    ("decoder_input_ids", {0: "batch"}),
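These (name, {axis_index: axis_name}) pairs map directly onto the dynamic_axes argument of torch.onnx.export. A small sketch of that wiring (the export call itself is only commented, since it needs a real model and sample inputs):

```python
# (input_name, {axis_index: axis_name}) pairs as in the diff above.
inputs = [
    ("input_ids", {0: "batch", 1: "encoder_sequence"}),
    ("attention_mask", {0: "batch", 1: "encoder_sequence"}),
    ("decoder_input_ids", {0: "batch"}),
]

# torch.onnx.export expects parallel input_names and a dynamic_axes dict.
input_names = [name for name, _ in inputs]
dynamic_axes = {name: axes for name, axes in inputs}

# torch.onnx.export(model, sample_inputs, "model.onnx",
#                   input_names=input_names,
#                   dynamic_axes=dynamic_axes)

assert dynamic_axes["decoder_input_ids"] == {0: "batch"}
```

Axes not listed in dynamic_axes (here, axis 1 of decoder_input_ids) are baked into the exported graph as fixed sizes, which is what the comment below is about.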
I see that the decoder_input_ids length is fixed. I guess this makes sense when use_past is True, because we feed only one token (the one generated in the previous step). However, when use_past is False we need to feed all the previously generated tokens, don't we?
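A tiny sketch of the shape distinction raised above (the decoder_input_length helper is hypothetical, not part of the PR): with past key/values cached, each decoding step feeds only the newest token; without them, the full generated prefix must be fed.

```python
def decoder_input_length(generated_so_far: int, use_past: bool) -> int:
    # Hypothetical illustration of the sequence-axis size of
    # decoder_input_ids at one decoding step.
    return 1 if use_past else generated_so_far

assert decoder_input_length(10, use_past=True) == 1    # only the new token
assert decoder_input_length(10, use_past=False) == 10  # full prefix so far
```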
Hi, this thread is super important.
This PR aims at reworking the way the ONNX export tool works by introducing a static, checked description format that provides the ONNX exporters (PyTorch almost done, TF will follow) all the required knobs.
More specifically, this PR introduces the following concepts:
- OnnxConfig: a dataclass which a model must provide to be supported, describing all the properties needed to generate a proper export.
- OnnxVariable: a namedtuple which describes a variable w.r.t. its name, its shape, and potentially how many times it is "repeated" => useful for past_keys.
The test case was done initially for the BART model, without use_cache=True support. For the sake of completeness: dropping support for use_cache=True is currently needed because we have a doubly nested tuple at the core of the past_keys output structure, which would require multiple levels of dynamic axes, not currently supported by ONNX. This might be something we can work on in the future, potentially introducing an ONNX-compatible output structure that gets rid of the nested tuple layout and can be activated from a config property (to be discussed further later on).
Update 1:
- past_key_values for GPT2.
Supported models: