Allow inlining past loop broadcasts #3416
base: mma_predicate_elimination
After this, we can actually generate a proper kernel and run it. I will rebase #3406 onto this and modify the test to compile and run in that PR so we can inspect the generated kernel there. We can keep this PR for discussing the inlining changes only.
Does this only apply to broadcast IDs added by |
Yes, that's the intention. I am using |
Yes. @zasdfgbnm, when you added this, were you thinking about having non-broadcast IDs in |
To be safe I'll check the IterType when skipping.
for ([[maybe_unused]] auto [expr, dir] : IRBFS::getExprsBetween(
         {tv->domain()->additionalIDs().begin(),
          tv->domain()->additionalIDs().end()},
         {tv->getLoopDomain().begin(), tv->getLoopDomain().end()},
         /*require_all_to_visited=*/false)) {
This includes all IDs that are between additionalIDs() and the loop domain. However, we could have something like this:
tv->broadcast(0, 16);
tv->merge(0);
In this case, we'll be merging the new broadcast ID with a pre-existing loop ID, so we should not ignore that. I think instead maybe what we should do is traverse from the root domain to the loop domain instead and the complement will then be the "pure" loop broadcasts which we can ignore.
Actually I suppose that would also automatically allow us to inline past regular broadcasts that are created using BroadcastOp since those new Broadcast IDs are not reachable from the root domain either, but we already inline past those IDs anyway I believe.
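A minimal sketch of the root-to-loop traversal suggested above, on a toy graph rather than the real nvFuser IR. Everything here (`Id`, `Expr`, `reachableFrom`, `pureLoopBroadcasts`) is a hypothetical stand-in, not an nvFuser API, and the traversal is forward-only, with an output counted as reachable once any of its inputs is reachable. That reachability rule is an assumption made for illustration.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for an IterDomain: just an integer id.
using Id = int;

// Hypothetical stand-in for a transform such as merge or split.
struct Expr {
  std::vector<Id> inputs;
  std::vector<Id> outputs;
};

// Forward traversal from the root IDs. Assumption: an expression's
// outputs become reachable as soon as any one of its inputs is reachable.
std::set<Id> reachableFrom(
    const std::vector<Id>& roots,
    const std::vector<Expr>& exprs) {
  std::set<Id> visited(roots.begin(), roots.end());
  bool changed = true;
  while (changed) {
    changed = false;
    for (const Expr& e : exprs) {
      bool any_input_visited = false;
      for (Id in : e.inputs) {
        if (visited.count(in) > 0) {
          any_input_visited = true;
        }
      }
      if (!any_input_visited) {
        continue;
      }
      for (Id out : e.outputs) {
        if (visited.insert(out).second) {
          changed = true;
        }
      }
    }
  }
  return visited;
}

// Loop IDs that are NOT reachable from the root domain. In this toy
// model these are the "pure" loop broadcasts (and their transforms).
std::vector<Id> pureLoopBroadcasts(
    const std::vector<Id>& roots,
    const std::vector<Id>& loop,
    const std::vector<Expr>& exprs) {
  const std::set<Id> reachable = reachableFrom(roots, exprs);
  std::vector<Id> result;
  for (Id id : loop) {
    if (reachable.count(id) == 0) {
      result.push_back(id);
    }
  }
  return result;
}
```

In this model, a loop broadcast that is merged with a pre-existing loop ID produces an output that is reachable from the root domain, so it is not skipped. A loop broadcast that is only split stays in the complement together with its split outputs.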
Stacked on #3414
This PR enables us to inline an MmaOp properly when its inputs are missing broadcast dimensions. We do this by always allowing inlining past loop broadcasts or their transforms. For example
As long as the operation foo properly maps its arguments despite the missing logical dimensions (as MmaOp does as of #3391), then we should be able to fully inline this case because the loop broadcasts bS5 and bS6 are imaginary in the sense that they don't impact indexing.