-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Relay][Passes] Iterative A-normal Traversals #7374
Conversation
Hi @mbrookhart , This looks great! In the code clean up, would you be considering factor out let-chain traversal possibly as an utility ?. I see a pattern that the current implementation of Let node is had "pre" and "post" processing stage that is being done prior and after the pushing them to the stack. So I was wondering, whether its possible to expose a generic interface just to provide the pre processing functionality (going down the let chain) and post processing functionality (coming up the let chain). I m not sure how the actual implementation should look like -- do let me know what you have in mind :) . |
@manupa-arm Thanks for the suggestion. @altanh and I were talking about doing something similar late last week using higher order functions. I'll let you know what I can figure out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall LGTM except for a few nits, clean solution!
I stopped repeating my comment but basically I'd prefer all implicit uses of captured this
be made explicit- currently it's not consistent across the passes.
auto post_visit = [this](const LetNode* op) { | ||
Expr expr = GetRef<Expr>(op); | ||
Var var = Downcast<Var>(this->VisitExpr(op->var)); | ||
Expr value = this->VisitExpr(op->value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to visit var and value again in the post?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We just need to pull the values out of the cache. Instead of maintaining a cache shared by the two lambdas, I'm using the memorization cache in the Mutator. The second time visit is called, it will short circuit and return the previously computed value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i see. thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mbrookhart exactly what I had in mind as well :) Thanks!
Thanks @mbrookhart @manupa-arm @zhiics @altanh the PR has been merged! |
* [WIP][Relay][Passes] non-recursive a-normal traversals * fix clang warning * Refactor ANormal Iterative traversal into a higher order function utility with lambdas * refactor missed pass * add explict use of to lamdbas
* [WIP][Relay][Passes] non-recursive a-normal traversals * fix clang warning * Refactor ANormal Iterative traversal into a higher order function utility with lambdas * refactor missed pass * add explict use of to lamdbas
* [WIP][Relay][Passes] non-recursive a-normal traversals * fix clang warning * Refactor ANormal Iterative traversal into a higher order function utility with lambdas * refactor missed pass * add explict use of to lamdbas
* [WIP][Relay][Passes] non-recursive a-normal traversals * fix clang warning * Refactor ANormal Iterative traversal into a higher order function utility with lambdas * refactor missed pass * add explict use of to lamdbas
On very large models, we can have ASTs that stack overflow because we recursively traverse tens of thousands of chained Let nodes. In A-normal form, we can simply iterate over that chain instead.
These changes find every pass that segfaults with a stack overflow in ONNX SSD-Mobilenet (~20k nodes) and hijacks the visitor to do iterative traversals over chains of lets. This lets me run that model without setting
ulimit -s unlimited
by significantly reducing the stack size.I also needed to convert FuseOps to mixed mode.
Thanks!
cc @jroesch @tqchen @tristan-arm @altanh @zhiics @masahi @kevinthesun