[OpPerf] Fixed Python profiler bug #17642
Conversation
@mxnet-label-bot add [pr-awaiting-review]
@@ -248,12 +248,11 @@ def python_profile(func):
    @functools.wraps(func)
    def python_profile_it(*args, **kwargs):
        runs = args[1]
        modified_args = (args[0], 1, args[2])
I don't remember why we needed to write it this way in the first place. @ChaiBapchya could you please review? @connorgoggins Have you run the existing tests to make sure this does not break any existing usage?
@apeforest yes, ran full OpPerf suite with Python profiler with this change and everything passed (see results here).
So the args that are passed to this function are:
args[0] = op
args[1] = warmup / runs (the number of warmup iterations or the number of timed runs)
args[2] = the rest of the args
With the native MXNet C++ profiler, you could pass the number of runs and it would capture the time for each run, along with the mean, max, etc.
But with Python's timing function, we had to run the for loop manually for the number of runs.
So that's what I did there:
- We copy the number of runs into the variable runs and then call the function that many times.
- For each run, we time the call with Python's timer (time.perf_counter) and then compute the average, max, etc. over those individual timings.
Makes sense? @apeforest
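The per-run timing loop described in this comment can be sketched roughly as follows. This is a minimal illustration, not the opperf source: python_profile_sketch, fake_forward, and the summary keys are hypothetical names, and the sketch assumes runs is at least 1 and that the remaining op inputs arrive as keyword arguments.

```python
import functools
import statistics
import time


def python_profile_sketch(func):
    """Hypothetical stand-in for the decorator under review (not the opperf code)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        runs = args[1]                   # number of timed runs requested by the caller
        single_run_args = (args[0], 1)   # ask the wrapped function for exactly one run
        times_ms = []
        res = None
        for _ in range(runs):
            start = time.perf_counter()
            res = func(*single_run_args, **kwargs)
            times_ms.append((time.perf_counter() - start) * 1000.0)
        # Aggregate the individual timings (assumes runs >= 1).
        summary = {
            "avg_ms": statistics.mean(times_ms),
            "max_ms": max(times_ms),
            "min_ms": min(times_ms),
            "runs": len(times_ms),
        }
        return res, summary

    return wrapper


@python_profile_sketch
def fake_forward(op, runs, **kwargs):
    # Hypothetical stand-in for a profiled forward function: applies op `runs` times.
    return [op(**kwargs) for _ in range(runs)]


result, stats = fake_forward(lambda x: x * 2, 5, x=21)
```

Here the decorator owns the repetition and the wrapped function is always asked for a single run, so every entry in times_ms corresponds to one execution of the op.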
@ChaiBapchya So basically you are saying we don't need modified_args for the Python profiler, right? So @connorgoggins's change is valid?
We do need to modify the args to meet the requirement of capturing per-run timing info with Python's time_it function. This needs to be handled in a way that doesn't break the existing native profiler.
preloaded and multi_* ops aren't being tracked for some reason. Could you fix that too? @connorgoggins
Let's fix that in a separate PR.
        times = []

        for _ in range(runs):
            start_time = time.perf_counter()  # 1
            res = func(*modified_args, **kwargs)
@connorgoggins @apeforest
If we pass *args as is, it will still have
args[0] as op
args[1] as runs
For example, if the user passed runs as 10, the native profiler would run 10 times, and the for loop (which does the timing for the Python profiler) would also run 10 times. That happens because the func here is nd_forward_backward_profile or nd_forward_profile, and both take runs as a parameter, so each of the 10 loop iterations would itself run the op 10 times.
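The double-counting concern raised here can be reproduced with a small self-contained sketch. All names below (fake_forward_profile, loop_passing_args_through, loop_forcing_single_run, count_op_executions) are hypothetical stand-ins and not part of opperf; the only assumption carried over from the discussion is that the wrapped function itself repeats the op runs times.

```python
import functools


def fake_forward_profile(op, runs, **kwargs):
    # Hypothetical stand-in for a profiled function that repeats the op `runs` times.
    for _ in range(runs):
        op()


def loop_passing_args_through(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        runs = args[1]
        for _ in range(runs):
            func(*args, **kwargs)  # inner function still receives the original runs
    return wrapper


def loop_forcing_single_run(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        runs = args[1]
        modified_args = (args[0], 1)  # inner function is told to run once
        for _ in range(runs):
            func(*modified_args, **kwargs)
    return wrapper


def count_op_executions(decorator, runs):
    # Count how many times the op actually executes under a given decorator.
    counter = {"n": 0}

    def fake_op():
        counter["n"] += 1

    decorator(fake_forward_profile)(fake_op, runs)
    return counter["n"]


passthrough_count = count_op_executions(loop_passing_args_through, 10)
single_run_count = count_op_executions(loop_forcing_single_run, 10)
```

With runs set to 10, the pass-through variant executes the op 100 times, while the variant that overrides runs with 1 executes it 10 times, which is the behavior the timing loop relies on.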
@@ -248,7 +248,7 @@ def python_profile(func):
     @functools.wraps(func)
     def python_profile_it(*args, **kwargs):
         runs = args[1]
-        modified_args = (args[0], 1, args[2])
+        modified_args = (args[0], 1)
Now it looks good!
Looks good now! Thanks for the fix!
c4711c4 to 850260a
* Changed arg structure in op func call
* Length check to prevent index out of bounds error
* Dropping args[2] as it is no longer used (only using kwargs)
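One way the argument handling named in this commit message might look is sketched below. The merged code is not shown on this page, so split_profile_args and its default_runs fallback are hypothetical; in particular, falling back to a default when runs is not passed positionally is an assumption made only for illustration.

```python
def split_profile_args(args, default_runs=1):
    """Hypothetical helper: return (runs, modified_args) for the timing loop."""
    if not args:
        raise ValueError("expected at least the op as a positional argument")
    if len(args) > 1:
        runs = args[1]       # length check avoids an index-out-of-bounds error
    else:
        runs = default_runs  # assumption: fallback when runs is not positional
    # args[2] is dropped; any remaining inputs are expected to arrive as kwargs.
    modified_args = (args[0], 1)
    return runs, modified_args


with_extra = split_profile_args(("op", 10, "unused"))
op_only = split_profile_args(("op",))
```

The helper keeps the op, reads runs only when it is present, and always tells the wrapped function to run once, so the outer loop controls the repetition.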
Description
Currently, the Python profiler for the opperf utility is broken. This fix changes the way args are passed to the underlying op during testing.
Fixes #17640
Checklist
Essentials
Changes
Comments
Test suite was run with Python profiler on Mac OS X, building the latest version of MXNet (with my fix) from source.
Full OpPerf test suite - CPU (Native profiler)
Full OpPerf test suite - CPU (Python profiler)
@apeforest