[RELAY][BACKEND] Enable PlanMemory in the graph runtime. #2120
Conversation
@tqchen quick question - are there plans to allow dynamic memory allocation in the graph runtime, which would allow variable shapes? I believe that's not currently supported, but was curious if you had plans there.
@ajtulloch yes, I think the plan is to write a new runtime system soon ™️. A few of us are working on a PLDI submission, and expect to ship a bunch of improvements/fixes/features post deadline.
Overall looks good to me; a bit tired from the PLDI push, so maybe someone else should do a pass.
@tqchen Quite busy recently, but I will try my best to spend some time to do a round tonight if it is not too late.
When we start to move into NNVMv2 (Relay), we have this clear separation of compiler and runtime. The migration starts as a two-phase process, and we are in the first step that moves the compiler but keeps the old graph runtime. I think we can expect the static graph runtime to exist for a while, but we can also explore the possibility of new backends that break different assumptions (e.g. dynamic memory allocation, control flow). Luckily the IR is expressive enough to represent all these workloads. There is also a tradeoff here, depending on whether we want to allow JIT, how big the runtime is, etc. So I can imagine it could be possible that we build several of them. @ajtulloch I think it is a good time to hear opinions from everyone on what we need.
There is not an existing RFC; how about we open a new one?
opened in #2122
@ajtulloch An RFC seems like a great idea; I would look forward to figuring out what everyone is interested in and what people are looking to do.
@tqchen I only have some nit comments. Overall LGTM.
void VisitExpr_(const TupleNode* op) final {
  // Do nothing.
remove the comment?
std::unordered_map<const ExprNode*, std::vector<StorageToken*> > token_map_;

/*!
 * \brief call get token to get the necessary token.
Call token
}
// create token for the call node.
CreateToken(op, true);
// check if there is orphaned output that can be released immediately/
immediately.
struct StorageToken {
  /*! \brief Reference counter */
  int ref_counter{0};
  /*! \brief numbe of bytes */
typo number
void VisitExpr_(const OpNode* op) final {
  // Do nothing.
Just trying to learn: what is the default behavior if such a function is not defined?
By default, it will recursively visit the children, which is fine; this is just to make it explicit.
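The point about default recursion can be illustrated with a minimal standalone sketch. This is not TVM's actual `ExprVisitor`; the classes below are hypothetical stand-ins showing why an explicit "do nothing" override differs from inheriting the recursive default.

```cpp
#include <memory>
#include <vector>

// Hypothetical mini expression tree (not TVM's real node classes).
struct Expr {
  std::vector<std::shared_ptr<Expr>> children;
  bool is_op = false;  // stands in for an OpNode-like leaf we may skip
};

struct Visitor {
  int visited = 0;
  virtual ~Visitor() = default;
  // Default behavior: count the node, then recurse into its children.
  virtual void Visit(const Expr& e) {
    ++visited;
    for (const auto& c : e.children) Visit(*c);
  }
};

// Subclass that makes the "do nothing" choice explicit for op nodes,
// so they are neither counted nor recursed into.
struct NoOpOnOps : Visitor {
  void Visit(const Expr& e) override {
    if (e.is_op) return;  // explicit no-op instead of the inherited default
    Visitor::Visit(e);
  }
};
```

With the default visitor every node is reached, while the overriding visitor stops at op nodes; both behaviors are valid, which is why making the choice explicit in the PR aids readability.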
<< ttype->shape;
size *= static_cast<size_t>(pval[0]);
}
size *= (ttype->dtype.bits() * ttype->dtype.lanes() + 7) / 8;
add comments for magic number 7 & 8?
IMO this should be refactored into a round_up/div_round_up function.
+1, it might be necessary to have an "alignment" function which takes byte, word, or dword, etc.
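The refactor suggested above could look roughly like the following sketch. The helper names are illustrative assumptions, not code from the PR; the point is that `(bits * lanes + 7) / 8` is just round-up integer division by 8, converting a bit count to a byte count.

```cpp
#include <cstddef>

// Round-up integer division: how many chunks of `divisor` cover `x`.
inline size_t DivRoundUp(size_t x, size_t divisor) {
  return (x + divisor - 1) / divisor;
}

// Bytes needed to hold `bits * lanes` bits of data.
// The original expression (bits * lanes + 7) / 8 is DivRoundUp with
// divisor = 8: adding 7 before dividing rounds any partial byte up.
inline size_t BytesFor(size_t bits, size_t lanes) {
  return DivRoundUp(bits * lanes, 8);
}
```

An "alignment" variant as suggested would simply pass a different divisor (e.g. 4 for word, 8 for dword alignment) and multiply back.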
Thanks @ajtulloch @yzhliu @zhiics, I have addressed the comments.
Thanks @ajtulloch @yzhliu @zhiics, this is merged.
This PR implements PlanMemory for the graph runtime codegen backend of Relay. It also contains a few other improvements.
The algorithm is basically the same as in NNVM. We do need to introduce a storage token and have an initialization phase that propagates and calculates the expected reference count of each token before we run the greedy allocation algorithm.
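The two phases described above, reference counting followed by greedy reuse, can be sketched as follows. This is an illustrative simplification under assumed names (`Request`, `Release`, `StoragePlanner` are not the PR's actual API): the initialization phase sets each token's `ref_counter` to its number of consumers, and the greedy phase returns a buffer to a free list once its count reaches zero so later allocations can reuse it.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Simplified storage token, mirroring the struct quoted in the diff above.
struct StorageToken {
  int ref_counter{0};   // pending consumers that still need this buffer
  size_t max_bytes{0};  // size of the largest tensor ever mapped here
  int storage_id{-1};
};

class StoragePlanner {
 public:
  // Greedy allocation: reuse the smallest free buffer that fits,
  // otherwise create a fresh token.
  StorageToken* Request(size_t size, int num_consumers) {
    StorageToken* tok = nullptr;
    auto it = free_.lower_bound(size);  // smallest free buffer >= size
    if (it != free_.end()) {
      tok = it->second;
      free_.erase(it);
    } else {
      all_.push_back(std::make_unique<StorageToken>());
      tok = all_.back().get();
      tok->storage_id = static_cast<int>(all_.size()) - 1;
    }
    if (size > tok->max_bytes) tok->max_bytes = size;
    tok->ref_counter = num_consumers;  // computed in the init phase
    return tok;
  }

  // Called as each consumer is emitted; at zero the buffer is reusable.
  void Release(StorageToken* tok) {
    if (--tok->ref_counter == 0) {
      free_.emplace(tok->max_bytes, tok);
    }
  }

  size_t num_storages() const { return all_.size(); }

 private:
  std::multimap<size_t, StorageToken*> free_;
  std::vector<std::unique_ptr<StorageToken>> all_;
};
```

The real planner also has to respect device placement and can grow an undersized free buffer rather than only reusing larger ones; this sketch keeps just the token and free-list mechanics.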