This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[OpPerf] Fixed native output ordering, added warmup & runs command line args #17571

Merged

4 commits merged into apache:master on Feb 12, 2020

Conversation

@connorgoggins
Contributor

connorgoggins commented Feb 11, 2020

Description

Here, I fix the ordering issue that arises when operator perf results are generated with the native profiler. Previously, the ordering of keys in the generated output dict was arbitrary and illogical. There were cases where avg_time_forward_ perf results were displayed after avg_time_backward_ perf results (when running with opperf.py), and where memory consumption was displayed between forward and backward perf results (when running categories of operators individually). Furthermore, the operator inputs were never displayed first in the output.
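
For illustration, a per-operator result in the intended order looks roughly like the sketch below (a minimal example; the memory key name and the values are assumptions, only the inputs / forward / backward / memory ordering comes from this description):

```python
# Illustrative only: intended key ordering for one operator's results.
# The memory key name is an assumption; "inputs", avg_time_forward_*, and
# avg_time_backward_* are the keys discussed in this PR.
desired_result = {
    "inputs": {"data": (1024, 1024)},           # operator inputs first
    "avg_time_forward_broadcast_add": 0.0512,   # forward perf result
    "avg_time_backward_broadcast_add": 0.0718,  # backward perf result
    "mem_alloc": 2048.0,                        # memory consumption last
}
```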

To solve these problems, I traced the construction of the results dictionary back to the saving of forward/backward perf results and memory consumption data in profiler_utils.py. I found that the entry for avg_time_backward_ was consistently being added to the dictionary before the entry for avg_time_forward_ (since it appeared first in the raw profile results, and was consequently encountered first in the iteration). I implemented a fix that saved each result to a variable during the iteration, and added the variable values to the dict in the correct order after the iteration had finished.
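
A minimal sketch of that approach (not the PR's actual code; only the avg_time_forward_/avg_time_backward_ prefixes come from the description above, everything else is assumed):

```python
# Sketch: buffer forward/backward/memory entries while iterating over the raw
# profile results, then insert them into the output dict in the desired order.
def build_operator_profile(raw_profile_results):
    forward_entry = backward_entry = memory_entry = None

    for key, value in raw_profile_results.items():
        if key.startswith("avg_time_backward_"):
            backward_entry = (key, value)   # appears first in the raw results
        elif key.startswith("avg_time_forward_"):
            forward_entry = (key, value)
        elif "mem" in key:                  # memory key name is an assumption
            memory_entry = (key, value)

    ordered = {}
    # Python 3.7+ dicts preserve insertion order, so forward lands before
    # backward and memory comes last regardless of the iteration order above.
    for entry in (forward_entry, backward_entry, memory_entry):
        if entry is not None:
            ordered[entry[0]] = entry[1]
    return ordered
```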

After making this change, I found another issue: merge_map_list() constructed the output map in a way that was insensitive to the order of the list of dictionaries (and the keys within those dictionaries) that it was passed. I rewrote merge_map_list() to be sensitive to the order of its input list, as well as the order of the keys within the dictionaries of each list element. Then, in benchmark_utils.py, I passed the input map to merge_map_list() before the other entries to ensure its position as the first key in the resulting dictionary.
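
A minimal, order-preserving sketch of that idea (not the PR's actual implementation of merge_map_list()):

```python
# Sketch: merge a list of dicts into one dict while preserving both the order
# of the list and the key order within each dict (relies on Python 3.7+
# insertion-ordered dicts).
def merge_map_list(map_list):
    merged = {}
    for single_map in map_list:
        for key, value in single_map.items():
            merged[key] = value
    return merged

# Passing the input map first ensures "inputs" becomes the first key.
result = merge_map_list([
    {"inputs": {"data": (1024, 1024)}},
    {"avg_time_forward_add": 0.05},
    {"avg_time_backward_add": 0.07},
])
```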

Even after making these changes, I found that the output of opperf.py was still ordered incorrectly. I found the root cause of the issue: in the json.dump call in common_utils.py, the sort_keys parameter was set to True, which would sort the keys of all dictionaries in the output JSON alphabetically (overwriting my ordering of the output dictionary). I saw that the purpose of sorting the output dictionary was to display the operator keys alphabetically in the resulting JSON. To retain this alphabetical operator sorting while preserving the ordering I had established for the output of each operator, I disabled the sort_keys parameter and sorted only the highest-level keys in the final output map in opperf.py.
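
Conceptually, the change amounts to something like this sketch (the file name and variable names are illustrative, not taken from common_utils.py):

```python
import json

# Illustrative results: operator names as top-level keys, each mapping to an
# already-ordered per-operator result dict.
final_benchmark_results = {
    "subtract": {"inputs": {}, "avg_time_forward_subtract": 0.04,
                 "avg_time_backward_subtract": 0.06},
    "add": {"inputs": {}, "avg_time_forward_add": 0.05,
            "avg_time_backward_add": 0.07},
}

# Sort only the highest-level keys (operator names) alphabetically...
sorted_results = dict(sorted(final_benchmark_results.items()))

# ...then dump without sort_keys so the per-operator ordering is preserved.
with open("benchmark_results.json", "w") as handle:
    json.dump(sorted_results, handle, indent=4, sort_keys=False)
```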

When I finished making these changes, I wanted to quickly test them with opperf.py, but found that there was no way to specify warmup and runs from the command line when calling opperf. I thought this would be a very useful feature for users testing operator performance, so I added it.
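
A minimal sketch of exposing these knobs on the command line (the flag names, defaults, and help text below are assumptions, not quoted from opperf.py):

```python
import argparse

# Sketch: warmup and runs as command line arguments for opperf.py.
parser = argparse.ArgumentParser(description="Run MXNet operator benchmarks")
parser.add_argument("--warmup", type=int, default=25,
                    help="number of warmup iterations before timing each operator")
parser.add_argument("--runs", type=int, default=100,
                    help="number of timed runs per operator")
args = parser.parse_args()

# The parsed values would then be forwarded to the individual benchmark
# function calls, e.g. warmup=args.warmup, runs=args.runs.
```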

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • M benchmark/opperf/opperf.py
  • M benchmark/opperf/utils/benchmark_utils.py
  • M benchmark/opperf/utils/common_utils.py
  • M benchmark/opperf/utils/profiler_utils.py

Comments

Tested on Mac OS and Ubuntu 16.04 (on p2.16xl) with:

  1. Check out the branch and call individual operator category testing functions (e.g. run_mx_misc_operators_benchmarks; see the sketch after this list)
  2. Check out the branch and run opperf.py (full run of all ops) with JSON output.
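
As a rough illustration of step 1 (a sketch only; the import path and keyword arguments are assumptions based on the warmup/runs change described above):

```python
import mxnet as mx

# Import path is an assumption; the category benchmark helpers live under
# benchmark/opperf/ in the MXNet repo.
from benchmark.opperf.opperf import run_mx_misc_operators_benchmarks

# The ctx/warmup/runs keywords are assumptions, not quoted from the code.
results = run_mx_misc_operators_benchmarks(ctx=mx.cpu(), warmup=10, runs=50)
print(results)
```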

Performance Results

Full OpPerf test (CPU) - JSON format
Full OpPerf test (GPU) - JSON format

Full OpPerf test (CPU) - MD format
Full OpPerf test (GPU) - MD format

@apeforest @access2rohit @ChaiBapchya

@connorgoggins
Contributor Author

@mxnet-label-bot add [pr-awaiting-review]

@lanking520 added the pr-awaiting-review (PR is waiting for code review) label on Feb 11, 2020
@ChaiBapchya
Contributor

ChaiBapchya left a comment

Good job!

@ChaiBapchya
Contributor

Can we get results in markdown please, for both CPU and GPU?
Easier to read in a gist.

@connorgoggins
Contributor Author

@ChaiBapchya Added MD results for CPU and GPU

@ChaiBapchya
Contributor

ChaiBapchya commented Feb 12, 2020

(Screenshot attached: Screen Shot 2020-02-11 at 4 37 02 PM)

Old Ordering (opname-fwd-bwd-mem-inputs) persists?

https://gist.github.com/connorgoggins/9837ea88017bef2f0d889ee79ca5aacf

@connorgoggins
Contributor Author

@ChaiBapchya very good point, need to add some logic for markdown. Updating now.

@connorgoggins
Contributor Author

@ChaiBapchya MD results have been updated - thanks again!

@ChaiBapchya
Contributor

Looks great now :D

@apeforest merged commit 81d26cc into apache:master on Feb 12, 2020
zheyuye pushed a commit to zheyuye/incubator-mxnet that referenced this pull request Feb 19, 2020
[OpPerf] Fixed native output ordering, added warmup & runs command line args (apache#17571)

* Fixed ordering, added warmup & runs to argparse and individual benchmark function calls

* Dropped unused ChainMap

* Added newline for consistency with previous changes

* Adjusted markdown output ordering
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this pull request May 29, 2020
[OpPerf] Fixed native output ordering, added warmup & runs command line args (apache#17571)

* Fixed ordering, added warmup & runs to argparse and individual benchmark function calls

* Dropped unused ChainMap

* Added newline for consistency with previous changes

* Adjusted markdown output ordering