
Amendments for gradients #5941

Merged · 2 commits · merged Jun 30, 2020
Conversation

@t-vi (Contributor) commented Jun 27, 2020

  • We fix the dtype handling of consts in generated gradients.
  • We add a collapse_sum_to instruction mirroring collapse_sum_like.
    While collapse_sum_like remains the first choice for general
    definitions (potentially dynamic shapes), switching to
    collapse_sum_to once shapes are static greatly simplifies the graph.
    (That simplification is not part of this PR; see the sketch below.)

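Not from the PR itself, but a minimal sketch of how the two ops relate, assuming a TVM build that already includes this change (the variable names and shapes are made up for illustration):

```python
from tvm import relay

# In a generated gradient, the incoming grad of a broadcast has the
# broadcast shape and must be summed back to the shape of the input x.
x = relay.var("x", shape=(3, 1), dtype="float32")
grad = relay.var("grad", shape=(3, 4), dtype="float32")

# General/dynamic form: collapse to the shape of x; x stays in the
# expression purely to supply that shape.
g_like = relay.collapse_sum_like(grad, x)

# Static form added by this PR: collapse directly to a concrete shape,
# so no extra tensor argument is needed just for its shape.
g_to = relay.collapse_sum_to(grad, (3, 1))

print(relay.Function([x, grad], relay.Tuple([g_like, g_to])))
```

In the printed function only the collapse_sum_like branch still references x; dropping that dependency once shapes are static is the simplification the description alludes to.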
@t-vi (Contributor, Author) commented Jun 27, 2020

@MarisaKirisame @tqchen If I can interest you in this.

I have more gradient work coming up.

@t-vi (Contributor, Author) commented Jun 27, 2020

@junrushao1994 too. :)

@@ -1713,6 +1713,54 @@ RELAY_REGISTER_OP("collapse_sum_like")
.set_attr<FTVMCompute>("FTVMCompute", CollapseSumLikeCompute)
.set_attr<TOpPattern>("TOpPattern", kCommReduce);

// CollapseSumTo: <A, B> -> B where BroadCast(A, B) = B
@MarisaKirisame (Contributor) commented:

I am confused by this line. BroadCast(A, B) = B?

@t-vi (Contributor, Author) commented Jun 29, 2020:

Oh, right. But Broadcast(A, B) = A right? Thanks for spotting this. I must admit that I'm not 100% sure I understand the notation.

@MarisaKirisame (Contributor) commented:

BroadCast is a symmetric function that takes two tensor types A and B and returns the broadcast type (I am conflating the type and term level a bit). I am just stating the constraint between the two elements with the equation after the 'where'.
What constraint exists between the two arguments of CollapseSumTo? If I understand it correctly, I can 'collapse' A into B, which means the type B can be broadcast to A, right?
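
A small aside (not from the thread) that illustrates the constraint, assuming the collapse_sum_to op from this PR is available and using made-up shapes: collapse_sum_to(A, shape) is well-typed exactly when a tensor B of the target shape broadcasts to A, i.e. Broadcast(A, B) = A.

```python
import tvm
from tvm import relay

# A has shape (2, 3, 4); the target shape (3, 1) broadcasts to (2, 3, 4),
# so Broadcast(A, B) = A holds and the collapse is well-typed.
a = relay.var("a", shape=(2, 3, 4), dtype="float32")
out = relay.collapse_sum_to(a, (3, 1))

mod = tvm.IRModule.from_expr(relay.Function([a], out))
mod = relay.transform.InferType()(mod)
print(mod["main"].body.checked_type)  # Tensor[(3, 1), float32]
```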

@MarisaKirisame (Contributor) left a comment

See my comment above, but I think it is good.

@MarisaKirisame (Contributor) commented:

> We add a collapse_sum_to instruction mirroring the collapse_sum_like.
> While for general definitions (potentially dynamic shapes),
> collapse_sum_like is the first choice, when moving to static,
> using collapse_sum_to will greatly simplify the graph.
> (This simplification is not part of the PR.)

Sounds cool. I assume you have a plan for a PR implementing a pass that turns the like ops into to ops? There are a lot of them.

@MarisaKirisame merged commit 2e04393 into apache:master on Jun 30, 2020
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jun 30, 2020
* Amendments for gradients

* Fix Broadcast rel description in comment

Thank you, @MarisaKirisame
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Jul 2, 2020