[Example] Use .fuse() primitive when possible #42
Conversation
Otherwise LGTM
LGTM
Please do not merge first. I'll also remove the epoi dependency.
Or do you think it is better to create a separate PR? @comaniac
I'm fine with both, so just for your convenience.
I've changed all the …
Thanks @chhzh123
Description
This PR fixes several issues in the example models:

- Use the `.fuse()` primitive to fuse bias+GeLU in the MLP module (a fusion sketch follows this list). Since a TorchScript module cannot be hooked and does not work properly with DeepSpeed ZeRO-3, this feature is disabled by default.
- Refactor `.replace()` and support bias+LayerNorm fusion.
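For readers unfamiliar with these fusions, here is a minimal sketch in plain PyTorch of what the fused patterns compute. This is not this repository's actual implementation of `.fuse()` or `.replace()`: the function names, shapes, and the use of `torch.jit.script` are assumptions based on the TorchScript behavior described above.

```python
# Illustrative sketch only; names are assumptions, not this repo's code.
import torch
import torch.nn.functional as F


@torch.jit.script
def fused_bias_gelu(x: torch.Tensor, bias: torch.Tensor) -> torch.Tensor:
    # Folding the bias add into the GeLU lets TorchScript's fuser combine
    # the two elementwise ops into a single kernel.
    return F.gelu(x + bias)


def fused_bias_layernorm(
    x: torch.Tensor,
    bias: torch.Tensor,
    ln_weight: torch.Tensor,
    ln_bias: torch.Tensor,
    eps: float = 1e-5,
) -> torch.Tensor:
    # Hypothetical bias+LayerNorm fusion in the same spirit: the preceding
    # layer's bias add is merged into the LayerNorm input computation.
    return F.layer_norm(x + bias, [x.shape[-1]], ln_weight, ln_bias, eps)


if __name__ == "__main__":
    x = torch.randn(8, 1024)  # e.g., output of a Linear layer with bias=False
    b = torch.randn(1024)     # the bias that would normally live in the Linear
    y = fused_bias_gelu(x, b)
    print(y.shape)            # torch.Size([8, 1024])
```

Note that the scripted function is a compiled TorchScript object, so PyTorch forward hooks cannot be attached to it; this is the reason the PR leaves the fusion disabled by default.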