Amendments for gradients #5941

t-vi · 2020-06-27T08:54:14Z

- We fix the dtype handling of consts in generated gradients.
- We add a collapse_sum_to instruction mirroring collapse_sum_like.
  For general definitions (potentially dynamic shapes),
  collapse_sum_like is the first choice; once shapes are static,
  using collapse_sum_to will greatly simplify the graph.
  (This simplification is not part of the PR.)
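For readers unfamiliar with the op, here is a minimal pure-Python sketch of the collapse-sum semantics as I understand them: the input is summed over every axis that broadcasting would have introduced or expanded, so the result has the target shape. This is not the TVM implementation. The helper `reduce_axes` and the flat row-major list representation are assumptions made only for illustration. Under this reading, collapse_sum_like would be the same operation with the target shape read from a second tensor instead of being given directly.

```python
from itertools import product


def reduce_axes(data_shape, target_shape):
    """Axes of data_shape that must be summed to reach target_shape.

    Hypothetical helper: shapes are aligned from the right, as in broadcasting.
    """
    offset = len(data_shape) - len(target_shape)
    if offset < 0:
        raise ValueError("target has more dimensions than data")
    axes = []
    for axis, dim in enumerate(data_shape):
        if axis < offset:
            axes.append(axis)  # leading axis that the target does not have
            continue
        target_dim = target_shape[axis - offset]
        if target_dim == dim:
            continue  # axis is kept unchanged
        if target_dim == 1:
            axes.append(axis)  # axis was expanded from size 1
        else:
            raise ValueError("target shape does not broadcast to data shape")
    return axes


def collapse_sum_to(data, data_shape, target_shape):
    """Sum a flat row-major list of shape data_shape down to target_shape."""
    offset = len(data_shape) - len(target_shape)
    summed = set(reduce_axes(data_shape, target_shape))
    out_size = 1
    for dim in target_shape:
        out_size *= dim
    out = [0] * out_size
    # product() walks the indices in row-major order, matching the flat list.
    for flat, index in enumerate(product(*(range(d) for d in data_shape))):
        target_flat = 0
        for axis in range(offset, len(data_shape)):
            coord = 0 if axis in summed else index[axis]
            target_flat = target_flat * target_shape[axis - offset] + coord
        out[target_flat] += data[flat]
    return out


grad = [1, 2, 3, 4, 5, 6]                      # shape (2, 3)
print(collapse_sum_to(grad, (2, 3), (3,)))     # [5, 7, 9]
print(collapse_sum_to(grad, (2, 3), (2, 1)))   # [6, 15]
```

With a static target shape, the axes to reduce are known up front (`reduce_axes` needs only the two shapes), which is presumably why the `_to` form lends itself to later graph simplification.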

t-vi · 2020-06-27T08:56:21Z

@MarisaKirisame @tqchen If I can interest you in this.

I have more gradient work coming up.

t-vi · 2020-06-27T08:56:49Z

@junrushao1994 too. :)

MarisaKirisame · 2020-06-29T03:43:55Z

src/relay/op/tensor/transform.cc

@@ -1713,6 +1713,54 @@ RELAY_REGISTER_OP("collapse_sum_like")
    .set_attr<FTVMCompute>("FTVMCompute", CollapseSumLikeCompute)
    .set_attr<TOpPattern>("TOpPattern", kCommReduce);

+// CollapseSumTo: <A, B> -> B where CollapseSumTo(A, B) = B


I am confused by this line. BroadCast(A, B) = B?

Oh, right. But Broadcast(A, B) = A, right? Thanks for spotting this. I must admit that I'm not 100% sure I understand the notation.

Broadcast is a symmetric function that takes two tensor types A and B and returns the broadcast type (I am conflating the type and term levels a bit). The equation after `where` just lists the constraint between the two arguments.
What constraint exists between the two arguments of CollapseSumTo? If I understand correctly, I can 'collapse' A into B, which means the type B can be broadcast to A, right?
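The constraint discussed here can be sketched at the shape level. The snippet below is an illustration only, not TVM's type relation: `broadcast_shape` and `can_collapse` are hypothetical names, and shapes are plain tuples of ints. It encodes the reading that A can be collapsed into B exactly when Broadcast(A, B) = A.

```python
def broadcast_shape(a, b):
    """Broadcast two shapes, aligning them from the right (symmetric in a, b)."""
    result = []
    for i in range(1, max(len(a), len(b)) + 1):
        dim_a = a[-i] if i <= len(a) else 1
        dim_b = b[-i] if i <= len(b) else 1
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"shapes {a} and {b} do not broadcast")
    return tuple(reversed(result))


def can_collapse(a, b):
    """True if shape a can be collapsed into shape b, i.e. Broadcast(a, b) == a."""
    try:
        return broadcast_shape(a, b) == tuple(a)
    except ValueError:
        return False


print(broadcast_shape((2, 3), (3,)))   # (2, 3)
print(can_collapse((2, 3), (1, 3)))    # True
print(can_collapse((1, 3), (2, 3)))    # False
```

Note that `broadcast_shape` is symmetric while `can_collapse` is not, which matches the distinction drawn in the thread between Broadcast and CollapseSumTo.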

MarisaKirisame

See my comment above, but I think it is good.

MarisaKirisame · 2020-06-29T23:46:07Z

> We add a collapse_sum_to instruction mirroring the collapse_sum_like.
> While for general definitions (potentially dynamic shapes),
> collapse_sum_like is the first choice, when moving to static,
> using collapse_sum_to will greatly simplify the graph.
> (This simplification is not part of the PR.)

Sounds cool. I assume you have a plan for a PR that implements a pass to turn the `_like` functions into `_to` functions? There are a lot of them.

tqchen added the status: need review label Jun 27, 2020

MarisaKirisame reviewed Jun 29, 2020

Fix Broadcast rel description in comment

c3dec29

Thank you, @MarisaKirisame

MarisaKirisame approved these changes Jun 29, 2020

tqchen assigned MarisaKirisame Jun 30, 2020

MarisaKirisame merged commit 2e04393 into apache:master Jun 30, 2020

ZihengJiang mentioned this pull request Sep 25, 2020

TVM v0.7 Release Note Candidate #6486

Closed
