[OpPerf] Fixed native output ordering, added warmup & runs command line args #17571

Merged: apeforest merged 4 commits into apache:master from connorgoggins:opperf_profiler_reorder on Feb 12, 2020
Conversation
@mxnet-label-bot add [pr-awaiting-review]
apeforest approved these changes on Feb 11, 2020
ChaiBapchya approved these changes on Feb 12, 2020
Good job! Can we get results in markdown please, for both CPU and GPU?
@ChaiBapchya Added MD results for CPU and GPU
The old ordering (opname-fwd-bwd-mem-inputs) persists?
https://gist.github.com/connorgoggins/9837ea88017bef2f0d889ee79ca5aacf
@ChaiBapchya very good point, need to add some logic for markdown. Updating now.
@ChaiBapchya MD results have been updated - thanks again!
Looks great now :D
connorgoggins force-pushed the opperf_profiler_reorder branch from 41efb19 to 0bce879 on February 12, 2020
zheyuye pushed a commit to zheyuye/incubator-mxnet that referenced this pull request on Feb 19, 2020:
…ne args (apache#17571)
* Fixed ordering, added warmup & runs to argparse and individual benchmark function calls
* Dropped unused ChainMap
* Added newline for consistency with previous changes
* Adjusted markdown output ordering
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this pull request on May 29, 2020:
…ne args (apache#17571)
* Fixed ordering, added warmup & runs to argparse and individual benchmark function calls
* Dropped unused ChainMap
* Added newline for consistency with previous changes
* Adjusted markdown output ordering
Description
Here, I fix the ordering issue that arises when operator perf results are generated with the native profiler. Previously, the ordering of keys in the generated output dict was arbitrary and illogical. There were cases where `avg_time_forward_` perf results were displayed after `avg_time_backward_` perf results (when running with `opperf.py`), and where memory consumption was displayed between forward and backward perf results (when running categories of operators individually). Furthermore, `inputs` were never displayed first in the output.

To solve these problems, I traced the construction of the results dictionary back to the saving of forward/backward perf results and memory consumption data in `profiler_utils.py`. I found that the entry for `avg_time_backward_` was consistently being added to the dictionary before the entry for `avg_time_forward_` (since it appeared first in the raw profile results, and was consequently encountered first in the iteration). I implemented a fix that saves each result to a variable during the iteration, and adds the variable values to the dict in the correct order after the iteration has finished.
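A minimal sketch of that approach, assuming Python 3.7+ insertion-ordered dicts; the function, key prefixes, and data layout here are illustrative, not the exact `profiler_utils.py` code:

```python
# Sketch: stash values while iterating the raw profile output (whose
# order we don't control), then build the result dict in a fixed,
# logical order afterwards. Names are illustrative assumptions.
def build_operator_profile(raw_profile_entries, operator_name):
    avg_forward, avg_backward, max_mem = None, None, None
    for key, value in raw_profile_entries:
        if key.startswith("avg_time_backward_"):
            avg_backward = value  # encountered first in the raw output
        elif key.startswith("avg_time_forward_"):
            avg_forward = value
        elif key.startswith("max_storage_mem_alloc_"):
            max_mem = value
    # Inserting in the desired order fixes the displayed ordering,
    # since Python 3.7+ dicts preserve insertion order.
    return {
        "avg_time_forward_" + operator_name: avg_forward,
        "avg_time_backward_" + operator_name: avg_backward,
        "max_storage_mem_alloc_" + operator_name: max_mem,
    }
```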
After making this change, I found another issue: `merge_map_list()` constructed the output map in a way that was insensitive to the order of the list of dictionaries (and the keys within those dictionaries) that it was passed. I rewrote `merge_map_list()` to be sensitive to the order of its input list, as well as the order of the keys within the dictionaries of each list element. Then, in `benchmark_utils.py`, I passed the `input` map to `merge_map_list()` before the other entries to ensure its position as the first key in the resulting dictionary.
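A sketch of what an order-preserving merge can look like (again relying on Python 3.7+ dict ordering; the example data is made up and this is not the exact rewritten function):

```python
# Order-preserving merge: maps are merged in the order given, and keys
# within each map keep their insertion order.
def merge_map_list(map_list):
    merged = {}
    for single_map in map_list:
        for key, value in single_map.items():
            merged[key] = value  # insertion order preserved
    return merged

# Usage sketch: passing the inputs map first makes it the first key.
result = merge_map_list([{"inputs": {"shape": (1024, 1024)}},
                         {"avg_time_forward_add": 0.05},
                         {"avg_time_backward_add": 0.07}])
```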
Even after making these changes, I found that the output of `opperf.py` was still ordered incorrectly. The root cause: in the `json.dump` call in `common_utils.py`, the `sort_keys` parameter was set to `True`, which sorted the keys of all dictionaries in the output JSON alphabetically (overwriting my ordering of the output dictionary). The purpose of that sorting was to display the operator keys alphabetically in the resulting JSON. To retain this alphabetical operator sorting while preserving the ordering I had established within each operator's output, I disabled the `sort_keys` parameter and sorted only the highest-level keys in the final output map in `opperf.py`.
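A sketch of the idea — sort only the top level, then dump without `sort_keys`. The function name and file handling are illustrative assumptions, not the exact `opperf.py`/`common_utils.py` code:

```python
import json

def save_results(final_benchmark_results, output_file):
    # Sort operators alphabetically at the top level only.
    sorted_results = {op: final_benchmark_results[op]
                      for op in sorted(final_benchmark_results)}
    with open(output_file, "w") as f:
        # sort_keys=False (the default) keeps the nested, per-operator
        # key ordering intact.
        json.dump(sorted_results, f, indent=4)
```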
When I finished making these changes, I wanted to quickly test them with `opperf.py`, but found that there was no way to specify `warmup` and `runs` from the command line when calling opperf. I thought this would be a very useful feature for users testing operator performance, so I added it.
Checklist

Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
Tested on Mac OS and Ubuntu 16.04 (on p2.16xl) with:
- individual categories of operators (e.g. `run_mx_misc_operators_benchmarks`)
- `opperf.py` (full run of all ops) with JSON output

Performance Results
- Full OpPerf test (CPU) - JSON format
- Full OpPerf test (GPU) - JSON format
- Full OpPerf test (CPU) - MD format
- Full OpPerf test (GPU) - MD format
@apeforest @access2rohit @ChaiBapchya