[Relay][Training] fix first-order AD tuple/projection expr duplication #8318

altanh · 2021-06-23T20:06:37Z

Tuple constructions and projections were not handled correctly: they were not reconstructed using the let bindings of their inputs, which led to expression duplication. The CSE pass can often eliminate this erroneous duplication when first-order AD is paired with ToGNF, but sometimes it cannot (we observed this in BERT training, where it led to many duplicated matmuls).
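To illustrate the kind of duplication described above, here is a minimal sketch. It does not use TVM's actual classes: `LetList`, `CountOccurrences`, and `BuildTuple` are hypothetical, string-based stand-ins, and the variable names (`%x`, `%w`, `%y`) are made up for the example. It only shows why rebuilding a tuple from the original field expression, rather than from the variable that field was bound to, emits the field's computation twice.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, string-based stand-in for a let list (not TVM's LetList):
// each Push binds an expression to a fresh variable and returns its name.
struct LetList {
  std::vector<std::string> bindings;

  std::string Push(const std::string& expr) {
    std::string var = "%v" + std::to_string(bindings.size());
    bindings.push_back("let " + var + " = " + expr + ";");
    return var;
  }
};

// Counts non-overlapping occurrences of `needle` across all bindings.
std::size_t CountOccurrences(const LetList& ll, const std::string& needle) {
  assert(!needle.empty());
  std::size_t count = 0;
  for (const std::string& b : ll.bindings) {
    for (std::size_t pos = b.find(needle); pos != std::string::npos;
         pos = b.find(needle, pos + needle.size())) {
      ++count;
    }
  }
  return count;
}

// Emits bindings for the tuple (matmul(%x, %w), %y).
// reuse_bindings = false mimics the buggy behavior: the tuple is rebuilt from
// the original field expression, so the matmul appears a second time.
// reuse_bindings = true rebuilds the tuple from the let-bound variable.
LetList BuildTuple(bool reuse_bindings) {
  LetList ll;
  const std::string matmul = "matmul(%x, %w)";
  const std::string field = ll.Push(matmul);  // the visited field is let-bound
  ll.Push("(" + (reuse_bindings ? field : matmul) + ", %y)");
  return ll;
}
```

In this toy model, `BuildTuple(false)` contains the matmul in two bindings, while `BuildTuple(true)` contains it once and refers to it through `%v0`.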

cc @MarisaKirisame @jroesch

jroesch · 2021-06-23T23:27:44Z

src/relay/transforms/first_order_gradient.cc

+    TupleGetItem orig = TupleGetItem(tup->get<ADTensor>().forward, idx);
+    orig->checked_type_ = op->checked_type();
+    auto ret = std::make_shared<ADTensor>(ll, orig, diag_ctx);
+    // for orig = pi(tup, i), pi_grad(tup, i, g) = G where pi(G, i) = g and pi(G, j) = 0 for j != i


Explain more here or delete

Sure, this just describes how the gradient for a projection is propagated back to the original tuple.
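The rule in that code comment can be sketched on its own, outside TVM. The following is an illustration only: `Grad` and `ProjectionGrad` are hypothetical names, gradients are modeled as flat vectors of doubles rather than Relay expressions, and `field_sizes` is an assumed way of describing the tuple's shape. Given the incoming gradient `g` for field `i`, it builds the tuple gradient `G` with `pi(G, i) = g` and `pi(G, j) = 0` for `j != i`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor gradient: a flat vector of doubles.
using Grad = std::vector<double>;

// Backward rule for orig = pi(tup, i).
// field_sizes[j] is the element count of tuple field j; g is the gradient
// flowing into the projected field i. Returns G, one gradient per field.
std::vector<Grad> ProjectionGrad(const std::vector<std::size_t>& field_sizes,
                                 std::size_t i, const Grad& g) {
  assert(i < field_sizes.size());
  assert(g.size() == field_sizes[i]);

  std::vector<Grad> G;
  G.reserve(field_sizes.size());
  for (std::size_t j = 0; j < field_sizes.size(); ++j) {
    if (j == i) {
      G.push_back(g);  // the projected field receives g unchanged
    } else {
      G.push_back(Grad(field_sizes[j], 0.0));  // all other fields receive zeros
    }
  }
  return G;
}
```

For a three-field tuple with sizes 2, 3, and 1, projecting field 1 with `g = {1, 2, 3}` yields zeros for fields 0 and 2 and `g` itself for field 1.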

fix first-order AD tuple/projection expr duplication

d561b0a

jroesch reviewed Jun 23, 2021


jroesch approved these changes Jun 24, 2021


jroesch merged commit 4f9e614 into apache:main Jun 24, 2021

ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021

fix first-order AD tuple/projection expr duplication (apache#8318)

1316fbe

junrushao mentioned this pull request Nov 1, 2021

Apache TVM v0.8 Release Note Candidate #9416

Closed

zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Mar 4, 2022

fix first-order AD tuple/projection expr duplication (apache#8318)

5dc71af
