-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rpc] Implemented rpc logging #10967
Conversation
6ff5271
to
b3a00c5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking pretty good, but there is one more issue.
src/runtime/minrpc/minrpc_server.h
Outdated
ret_handler_(new MinRPCReturns<TIOHandler>(io_)), | ||
exec_handler_(new MinRPCExecute<TIOHandler>(io_, ret_handler_)) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two members are allocated, but never deleted. In the logging classes you pass addresses of members to the MinRPCServer
constructor, so it doesn't have a way of knowing whether it should delete these objects or not. This needs to be fixed or it will cause memory leaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for taking the time to read through the code, and for valuable comments.
Initially, I wanted to address this using unique_ptrs, but including "memory" resulted in "redefinition of operator new" error in rpc_server.cc. So, I simply use the "ret_handler_" variable as a flag to infer which constructor is used, and delete the allocated variables when needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i see. i think we could likely remove operator new in rpc_server.cc if it's defined in . it's been long enough that I don't remember why i needed to explicitly override that anyhow--i do think though that we should verify by testing on Zephyr/Arduino to make sure behaves the same way there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs a better solution---where the object ownership is fixed, i.e. either this class owns these objects or not. This will need a few more changes to the code creating objects of this class. I'm not sure what solution is the best for this, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the implementation to use smart pointer for exec_handler_, I hope that addresses the object ownership concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @mkatanbaf ! i have a few suggestions around the overall interface, mainly fixes to arguments/names and the way things are passed around to remove reinterpret_casts. otherwise this all looks pretty good.
it would be nice to find a way to test the logger output. right now i think we could merge without that explicitly, since we at least have a test that it doesn't crash, though.
python/tvm/rpc/client.py
Outdated
@@ -505,7 +508,7 @@ def connect(url, port, key="", session_timeout=0, session_constructor_args=None) | |||
client_via_proxy = rpc.connect( | |||
proxy_server_url, proxy_server_port, proxy_server_key, | |||
session_constructor_args=[ | |||
"rpc.Connect", internal_url, internal_port, internal_key]) | |||
"rpc.Connect", internal_url, internal_port, internal_key, enable_logging]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think here you mean
"rpc.Connect", internal_url, internal_port, internal_key, enable_logging]) | |
"rpc.Connect", internal_url, internal_port, internal_key], | |
enable_logging=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I understand your comment. This is not a function call, it's a list of parameters that would later pass to rpc.Connect on the device. so, I believe we cannot write down the param_name=value, it only has to be the value.
class MinRPCExecuteWithLog : public ExecInterface { | ||
public: | ||
explicit MinRPCExecuteWithLog(ExecInterface* next) : next_(next) { | ||
ret_handler_ = reinterpret_cast<MinRPCReturnsWithLog*>(next_->GetReturnInterface()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this is the logging version of the ExecInterface, it's reasonable that the constructor should explicitly take a Logger*
. i think you could add that argument here and that would remove the need to reinterpret_cast here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, but I still need to keep the reinterpret_cast, to access functions such as GetHandleName, UpdateHandleName, ...
src/runtime/minrpc/minrpc_logger.cc
Outdated
|
||
if ((*tcodes)[i] == kTVMStr) { | ||
if (strlen((*values)[i].v_str) > 0) { | ||
ret_handler_->UpdateCurrHandleName((*values)[i].v_str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this is specific to ModuleGetFunction
and GetGlobalFunc
and CallFunc
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we talked, I kept the code as is for the moment (i.e. it adds all the strings in parameters to the handle name). In future we can change it once we have a better vision of what should or should not be logged.
3759a62
to
5fcf577
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @mkatanbaf, i think just one last minor (but probably important to address) thing and then this is ready to go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kparzysz-quic do you want to take another look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Thanks @mkatanbaf @areusch @kparzysz-quic @alanmacd! This PR is merged now. |
This reverts commit aa3bcd9, because it fails on Windows CI as reported in issue apache#11220. PR apache#11223 tries to address it but is is failing in the regular CI with testing issue on Hexagon.
Co-authored-by: Mohamad <mkatanbaf@users.noreply.github.com>
This reverts commit aa3bcd9, because it fails on Windows CI as reported in issue apache#11220. PR apache#11223 tries to address it but is is failing in the regular CI with testing issue on Hexagon.
Co-authored-by: Mohamad <mkatanbaf@users.noreply.github.com>
This reverts commit aa3bcd9, because it fails on Windows CI as reported in issue apache#11220. PR apache#11223 tries to address it but is is failing in the regular CI with testing issue on Hexagon.
This adds logging to rpc communications, by adding a wrapper around "RPCChannel". The wrapper, "RPCChannelLogging", has a "NanoRPCListener" that only listens to ongoing communication and log them in a human readable way. "NanoRPCListener" uses a "MinRPCSniffer" to decode the rpc commands. "MinRPCSniffer" is based on "MinRPCServer" and shares the same code to decode the packets, but replaces the execution and return handling with no-operation functions. The log is disabled by default for backward compatibility, and a test case ("test_rpc_simple_wlog") is added to the "test_runtime_rpc.py" with logging enabled.