[Relay][Training] fix first-order AD tuple/projection expr duplication #8318

Merged (1 commit) on Jun 24, 2021

@altanh altanh commented Jun 23, 2021

Tuple constructions and projections were not handled correctly: they were not reconstructed from the let bindings of their inputs, which led to expression duplication. When first-order AD is paired with ToGNF, the CSE pass can often eliminate this erroneous duplication, but not always. We observed such a case in BERT training, where it left many duplicated matmuls.
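To illustrate the kind of duplication described above, here is a minimal standalone sketch. It is a toy string-based model, not TVM code: `Bound`, `EmitProgram`, and `CountOccurrences` are hypothetical names invented for this example. It contrasts rebuilding a tuple from the defining expressions of its fields with rebuilding it from the let-bound variables that already name those fields.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model (hypothetical, not the TVM IR): a value together with the
// let-bound variable that names it.
struct Bound {
  std::string var;   // e.g. "%x"
  std::string expr;  // e.g. "matmul(%a, %b)"
};

// Emits the let bindings followed by a tuple of the fields.
// use_vars == false: the tuple repeats each field's defining expression,
//                    so the work appears a second time.
// use_vars == true:  the tuple refers to the let-bound variables instead.
std::string EmitProgram(const std::vector<Bound>& fields, bool use_vars) {
  std::string out;
  for (const Bound& f : fields) {
    out += "let " + f.var + " = " + f.expr + ";\n";
  }
  out += "(";
  for (std::size_t i = 0; i < fields.size(); ++i) {
    if (i > 0) out += ", ";
    out += use_vars ? fields[i].var : fields[i].expr;
  }
  out += ")";
  return out;
}

// Counts non-overlapping occurrences of needle in text.
std::size_t CountOccurrences(const std::string& text,
                             const std::string& needle) {
  if (needle.empty()) return 0;
  std::size_t count = 0;
  for (std::size_t pos = text.find(needle); pos != std::string::npos;
       pos = text.find(needle, pos + needle.size())) {
    ++count;
  }
  return count;
}
```

For a field bound as `let %x = matmul(%a, %b)`, the program emitted with `use_vars == false` contains `matmul` twice, while the one emitted with `use_vars == true` contains it once. In this toy model a later deduplication step would have to recognize the two copies as identical to undo the damage, which loosely mirrors relying on CSE after the fact.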

cc @MarisaKirisame @jroesch

TupleGetItem orig = TupleGetItem(tup->get<ADTensor>().forward, idx);
orig->checked_type_ = op->checked_type();
auto ret = std::make_shared<ADTensor>(ll, orig, diag_ctx);
// for orig = pi(tup, i), pi_grad(tup, i, g) = G where pi(G, i) = g and pi(G, j) = 0 for j != i
Member

Explain more here or delete

Contributor Author

sure, this is just describing how the gradient for a projection is propagated back to the original tuple
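The rule in the code comment can be sketched in isolation. The following is a simplified model rather than the code in this PR: it assumes every tuple field is a scalar `double`, and `ProjectionGrad` and `AccumulateProjectionGrad` are hypothetical helper names. Given the gradient `g` arriving at `pi(tup, i)`, it builds the tuple gradient `G` with `pi(G, i) = g` and `pi(G, j) = 0` for `j != i`, then adds it into the gradient already accumulated for the tuple, as is usual in reverse-mode AD.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds G for orig = pi(tup, index): field `index` receives g and every
// other field receives zero. Scalar fields are an assumption of this sketch.
std::vector<double> ProjectionGrad(std::size_t arity, std::size_t index,
                                   double g) {
  assert(index < arity);
  std::vector<double> G(arity, 0.0);
  G[index] = g;
  return G;
}

// Adds the projection's contribution into the tuple's running gradient, so
// several projections of the same tuple combine by summation.
void AccumulateProjectionGrad(std::vector<double>& tuple_grad,
                              std::size_t index, double g) {
  const std::vector<double> G = ProjectionGrad(tuple_grad.size(), index, g);
  for (std::size_t i = 0; i < tuple_grad.size(); ++i) {
    tuple_grad[i] += G[i];
  }
}
```

For example, `ProjectionGrad(3, 1, 2.5)` yields `{0.0, 2.5, 0.0}`. In a tensor setting the zero entries would be zero tensors of each field's shape, which this scalar sketch does not model.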

@jroesch jroesch merged commit 4f9e614 into apache:main Jun 24, 2021
ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Mar 4, 2022