[MXNET-91] Added unittest for benchmarking metric performance #9705
Conversation
@szha Please review. |
Why not include a small batch size like 64? 100k is huge. |
The intention is to observe a measurable elapsed time (hence the large data size) and to amplify the difference between CPU and GPU processing (hence processing all the data in one batch). A valid alternative is to use a small batch size and iterate over multiple batches. |
(force-pushed from 11a59cc to 5d19aef)
@safrooze This is the wrong reasoning - the fact that the GPU is faster than the CPU when processing a million elements does not mean you should use the GPU when adding 2 numbers together. You should only test on batch sizes that are realistic (and do multiple runs to get a measurable time difference). |
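A minimal sketch of what this suggestion could look like (the metric, shapes, and repetition count below are illustrative assumptions, not code from this PR):

```python
# Hypothetical illustration: time metric.update() at a realistic batch size
# over many repetitions, rather than over one huge batch.
import timeit
import mxnet as mx

metric = mx.metric.Accuracy()
labels = mx.nd.random.randint(0, 10, shape=(64,))  # 64 class labels in [0, 10)
preds = mx.nd.random.uniform(shape=(64, 10))       # 64 x 10 prediction scores

# Average over 1000 updates so the elapsed time is measurable and stable.
elapsed = timeit.timeit(lambda: metric.update([labels], [preds]), number=1000)
print('mean update time: %.6fs' % (elapsed / 1000))
```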
Please make sure to use a fixed seed in order to provide reproducibility between different runs. |
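For example (a minimal sketch, assuming both numpy and MXNet randomness are in play):

```python
import random
import numpy as np
import mxnet as mx

# Fix all relevant seeds so repeated runs generate identical data.
random.seed(42)
np.random.seed(42)
mx.random.seed(42)
```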
OK, I think I addressed all the feedback. The modified code output looks like this: |
Wouldn't a nightly test be a better place for performance tests like this? This unit test doesn't verify correctness at all. |
@eric-haibin-lin You're correct that nightly would be a more suitable place. One concern with nightly was that the community wouldn't be able to see the results of the benchmark. |
It is planned to move the nightly tests to the public CI very soon. |
When will nightly tests be moved to public CI? |
It's planned for the end of Q1. |
I'll move this to nightly tests then. |
Hi, the community vote on associating code changes with JIRA has passed (https://lists.apache.org/thread.html/ab22cf0e35f1bce2c3bf3bec2bc5b85a9583a3fe7fd56ba1bbade55f@%3Cdev.mxnet.apache.org%3E). We have updated the guidelines for contributors at https://cwiki.apache.org/confluence/display/MXNET/Development+Process. Please ensure that you have created a JIRA issue at https://issues.apache.org/jira/projects/MXNET/issues/ describing your work in this pull request, and include the JIRA title in your PR as [MXNET-xxxx] your title, where MXNET-xxxx is the JIRA id. Thanks! |
Checking in on the public nightly build results - is this still on track? |
I don't think so - at least not from my side. We have been resource-constrained, and managing the nightly CI does not fit into my schedule, especially since all jobs have to be refactored. I will ask Bhavin's team to do it and I will do the reviews, but I am personally not able to refactor that part as well. On the other hand, we've got additional headcount approved for CI, but it will take some time until everybody is ramped up. We will have to see how and when we can continue. |
In that case, let's put the test in unittest for now. @safrooze could you resolve conflict? |
One last request: would you put the performance tests in a separate test file, such as test_metric_perf.py, so that it's easier to move to nightly later? |
- Output of the benchmark is sent to stderr
- Random number generators are seeded
- nd.waitall() is called before starting and before ending the timing
- Added batch-size values of 16, 64, 256, and 1024
- Data size varies by the number of output channels to keep total runtime down to a few minutes
(force-pushed from 33e161d to 4d369e5)
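For reference, a hedged sketch of the timing pattern described in the commit message above (the metric choice and shapes are assumptions for illustration, not the exact PR code):

```python
import sys
import time
import mxnet as mx

mx.random.seed(1)
metric = mx.metric.MSE()

for batch_size in [16, 64, 256, 1024]:
    preds = mx.nd.random.uniform(shape=(batch_size, 100))
    labels = mx.nd.random.uniform(shape=(batch_size, 100))
    mx.nd.waitall()  # exclude async data generation from the measurement
    start = time.time()
    metric.update([labels], [preds])
    mx.nd.waitall()  # flush any pending async work before stopping the clock
    elapsed = time.time() - start
    # Benchmark results go to stderr so they don't pollute the test output.
    print('batch_size=%d elapsed=%.6fs' % (batch_size, elapsed), file=sys.stderr)
```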
Why does this test only print numbers and not actually enforce anything? |
Output of the benchmark is sent to stderr. |
Description
The benchmark loops through two batch sizes (100,000 and 1,000,000) and two output dimensions (100 and 500), generates random data on CPU and GPU, and calls metric.update() on a list of metrics with the generated data.
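A minimal sketch of that loop structure (the metric list and use of mx.gpu() are assumptions for illustration; the GPU branch requires a GPU-enabled build and device):

```python
import mxnet as mx

metrics = [mx.metric.Accuracy(), mx.metric.MSE()]
for data_size in [100000, 1000000]:
    for output_dim in [100, 500]:
        for ctx in [mx.cpu(), mx.gpu()]:
            # Random class labels and prediction scores on the target device.
            labels = mx.nd.random.randint(0, output_dim, shape=(data_size,), ctx=ctx)
            preds = mx.nd.random.uniform(shape=(data_size, output_dim), ctx=ctx)
            for metric in metrics:
                metric.reset()
                metric.update([labels], [preds])
```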
Checklist
Essentials
Passed code style checking (make lint)
Changes
Comments