forked from pytorch/pytorch
Commit
Support broadcasts of predicated tensors within thread blocks (#100)
This change supports cases where a reduced tensor is an input to a parallelized tensor. Broadcasts after reductions, such as the following, should now work within a thread block:

    t1 = sum(t0, {1});
    t2 = broadcast(t1, {false, true});

The major changes include:
- Add the blockBroadcast device function, which is used for broadcasting to dimensions parallelized with TIDx/y/z.
- Update the softmax test. It now matches the ATen output (within a relaxed threshold).
- Add a simplified softmax test, which does not normalize the input with max.
- Refactor the thread predicate computation. Thread predicate information is needed for both lowering and printing, so it is extracted from the lowering into a more independent class.

Limitations and concerns:
- Broadcasting to BID-parallelized dimensions is not supported.
- Thread predicates are computed twice, which might be a performance concern, but the cost should still be trivially small compared to, e.g., the computeAt implementation.
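To make the reduce-then-broadcast pattern concrete, here is a minimal host-side C++ sketch of the idea. It is not the blockBroadcast device function from this commit: the names `modelBlockBroadcast` and `sumThenBroadcast` are hypothetical, the "threads" are loop iterations, and the "shared memory" is a single local variable. The sketch assumes that, after a block reduction, only one predicated thread (thread 0 here) holds the valid sum, and that the broadcast must make that value visible to every thread in the block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<float>;

// Hypothetical host-side model of a block-wide broadcast.
// regs[tid] stands in for the per-thread register of thread `tid`;
// only the thread at `src_tid` is assumed to hold a valid value.
Row modelBlockBroadcast(const Row& regs, std::size_t src_tid) {
  assert(src_tid < regs.size());
  float smem = 0.0f;  // stands in for one shared-memory slot

  // Phase 1: only the predicated thread writes its value.
  for (std::size_t tid = 0; tid < regs.size(); ++tid) {
    if (tid == src_tid) {
      smem = regs[tid];
    }
  }

  // On a GPU, a block-wide barrier would separate the two phases.

  // Phase 2: every thread reads the shared slot.
  Row out(regs.size());
  for (std::size_t tid = 0; tid < regs.size(); ++tid) {
    out[tid] = smem;
  }
  return out;
}

// Models t1 = sum(t0, {1}); t2 = broadcast(t1, {false, true});
// with one modeled thread block per row and one thread per column.
std::vector<Row> sumThenBroadcast(const std::vector<Row>& t0) {
  std::vector<Row> t2;
  for (const Row& row : t0) {
    // Sentinel -1 marks threads whose register is not valid.
    Row regs(row.size(), -1.0f);
    if (row.empty()) {
      t2.push_back(regs);
      continue;
    }
    float sum = 0.0f;
    for (float v : row) {
      sum += v;
    }
    regs[0] = sum;  // assumed: the reduction result lives in thread 0 only
    t2.push_back(modelBlockBroadcast(regs, 0));
  }
  return t2;
}
```

For a 2x3 input with rows {1, 2, 3} and {4, 5, 6}, every element of the first output row holds the first row's sum and every element of the second output row holds the second row's sum, which is the behavior the commit message describes for broadcasts within a thread block.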
Showing 14 changed files with 810 additions and 277 deletions.