initial stab at the numba.typed.List benchmarks #10

esc · 2020-09-18T14:57:49Z

This is an initial stab at the ASV tests for `numba.typed.List`.

Things that still need to be decided on:

* Do we want to measure compile time too? If so, how?
* Is compile time included in these benchmarks or not?

Lastly, here is a snapshot of what it looks like when run on my laptop:

 💣 zsh» asv run --show-stderr  -b 'bench_typed_list*'
· Fetching recent changes.
· Creating environments
· Discovering benchmarks
· Running 4 total benchmarks (1 commits * 1 environments * 4 benchmarks)
[  0.00%] · For numba commit 1949c62e <master>:
[  0.00%] ·· Benchmarking conda-py3.6-cudatoolkit-llvmlite-numpy
[ 12.50%] ··· Running (bench_typed_list.ConstructionSuite.time_construct_from_python_list--)....
[ 62.50%] ··· bench_typed_list.ConstructionSuite.time_construct_from_python_list                                                                               97.8±1ms
[ 75.00%] ··· bench_typed_list.ConstructionSuite.time_construct_in_njit_function                                                                             2.61±0.1ms
[ 87.50%] ··· bench_typed_list.ReductionSuite.time_reduction_sum                                                                                                203±9μs
[100.00%] ··· bench_typed_list.SortSuite.time_sort                                                                                                              117±4ms
asv run --show-stderr -b 'bench_typed_list*'  23.89s user 1.37s system 85% cpu 29.624 total


sklam · 2020-09-18T15:08:14Z

benchmarks/bench_typed_list.py

+class ConstructionSuite:
+
+    def setup(self):
+        self.pl = make_random_python_list(SIZE)


It should seed the RNG.
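A minimal sketch of what seeding could look like, using only the standard library. The body of `make_random_python_list`, the `seed` parameter, and the values of `SIZE` and `SEED` are assumptions for illustration; the helper in the PR may be implemented differently.

```python
import random

SIZE = 1000  # hypothetical size; the value used in the PR is not shown here
SEED = 0     # hypothetical fixed seed


def make_random_python_list(n, seed=SEED):
    # A private, seeded generator yields the same values on every call,
    # so each benchmark run sees identical input data.
    rng = random.Random(seed)
    return [rng.randint(0, 100) for _ in range(n)]


class ConstructionSuite:

    def setup(self):
        self.pl = make_random_python_list(SIZE)


suite = ConstructionSuite()
suite.setup()
```

Using a dedicated `random.Random` instance rather than the module-level functions keeps the benchmark data independent of any other code that draws random numbers.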

sklam · 2020-09-18T15:08:22Z

benchmarks/bench_typed_list.py

+class ReductionSuite:
+
+    def setup(self):
+        self.tl = make_random_typed_list(SIZE)


Needs to seed the RNG.

sklam · 2020-09-18T15:09:30Z

We do want to measure compile time. The benchmark can make a fresh dispatcher and call its `.compile` method explicitly.

stuartarchibald · 2020-09-18T15:12:52Z

benchmarks/bench_typed_list.py

+                agg += i
+            return agg
+
+        self.reduction_sum = njit(reduction_sum)


Without fastmath this won't vectorize (should probably test both with and without). I also wonder if the iterator will vectorize vs an explicit induced loop, something to look at. The iterator loop will likely be full of refops.

I am testing both now. They seem to have very similar run-time characteristics, so I will need to look at vectorization. But testing both is worthwhile in its own right. I am testing compilation time too now.


esc · 2020-09-25T13:05:46Z

I have updated the tests to seed the RNG and to time compilation. Here is a snapshot of a run on my machine:

[ 56.25%] ··· bench_typed_list.ConstructionSuite.time_construct_from_python_list                                                                               96.9±2ms
[ 62.50%] ··· bench_typed_list.ConstructionSuite.time_construct_in_njit_function                                                                            2.17±0.09ms
[ 68.75%] ··· bench_typed_list.ReductionSuite.time_compile_reduction_sum_fastmath                                                                            21.1±0.4μs
[ 75.00%] ··· bench_typed_list.ReductionSuite.time_compile_reduction_sum_no_fastmath                                                                         21.8±0.1μs
[ 81.25%] ··· bench_typed_list.ReductionSuite.time_execute_reduction_sum_fastmath                                                                               195±6μs
[ 87.50%] ··· bench_typed_list.ReductionSuite.time_execute_reduction_sum_no_fastmath                                                                            201±2μs
[ 93.75%] ··· bench_typed_list.SortSuite.time_compile_sort                                                                                                      216±3ms
[100.00%] ··· bench_typed_list.SortSuite.time_execute_sort                                                                                                      116±3ms


esc · 2020-10-09T09:49:28Z

With recent updates, these are the current benchmarks for the changes introduced by: numba/numba#6278

All benchmarks:

       before           after         ratio
     [3b3eab89]       [05ce51c6]
     <pull/5543/merge~1>       <pull/6278/head~1>
          103±3ms          103±3ms     1.00  bench_typed_list.ConstructionSuite.time_construct_from_python_list
      2.37±0.09ms      2.39±0.07ms     1.01  bench_typed_list.ConstructionSuite.time_construct_in_njit_function
  3.603527119826277e-05±4e-06  2.7983062694041507e-05±9.5e-07    ~0.78  bench_typed_list.ForLoopReductionSuite.time_compile_reduction_sum_fastmath
  4.161973561003964e-05±9.1e-06  2.9423185928547246e-05±3.7e-06    ~0.71  bench_typed_list.ForLoopReductionSuite.time_compile_reduction_sum_no_fastmath
  0.0035349028767086565±0.00034  0.0025243946991395207±2e-05    ~0.71  bench_typed_list.ForLoopReductionSuite.time_execute_reduction_sum_fastmath
  0.0031963562505552545±0.00021  0.0027488698993693105±0.0002    ~0.86  bench_typed_list.ForLoopReductionSuite.time_execute_reduction_sum_no_fastmath
         62.6±2ms         62.7±2ms     1.00  bench_typed_list.ForLoopReductionSuiteFloat.time_compile_reduction_sum_fastmath
         61.9±1ms       61.2±0.7ms     0.99  bench_typed_list.ForLoopReductionSuiteFloat.time_compile_reduction_sum_no_fastmath
      2.59±0.03ms      2.52±0.07ms     0.97  bench_typed_list.ForLoopReductionSuiteFloat.time_execute_reduction_sum_fastmath
      2.58±0.04ms      2.50±0.04ms     0.97  bench_typed_list.ForLoopReductionSuiteFloat.time_execute_reduction_sum_no_fastmath
       29.1±0.8μs       28.0±0.6μs     0.96  bench_typed_list.ForLoopReductionSuiteInt.time_compile_reduction_sum_fastmath
       29.3±0.5μs       31.6±0.4μs     1.08  bench_typed_list.ForLoopReductionSuiteInt.time_compile_reduction_sum_no_fastmath
       2.64±0.1ms      2.55±0.04ms     0.97  bench_typed_list.ForLoopReductionSuiteInt.time_execute_reduction_sum_fastmath
       2.69±0.1ms      2.47±0.07ms     0.92  bench_typed_list.ForLoopReductionSuiteInt.time_execute_reduction_sum_no_fastmath
  2.6585843983129173e-05±2.1e-06  2.2718447455970037e-05±2.8e-07    ~0.85  bench_typed_list.IteratorReductionSuite.time_compile_reduction_sum_fastmath
  2.4821427593867003e-05±2.2e-06  2.3370140742376312e-05±3.2e-07     0.94  bench_typed_list.IteratorReductionSuite.time_compile_reduction_sum_no_fastmath
  0.00023403224115879442±5.1e-06  0.00021722948935967772±3.8e-06     0.93  bench_typed_list.IteratorReductionSuite.time_execute_reduction_sum_fastmath
  0.0002068479817932133±1.5e-05  0.00022312129993224517±7e-06     1.08  bench_typed_list.IteratorReductionSuite.time_execute_reduction_sum_no_fastmath
+      23.3±0.8μs       26.7±0.9μs     1.14  bench_typed_list.IteratorReductionSuiteFloat.time_compile_reduction_sum_fastmath
-      27.0±0.8μs       24.4±0.3μs     0.90  bench_typed_list.IteratorReductionSuiteFloat.time_compile_reduction_sum_no_fastmath
         227±20μs          271±4μs    ~1.19  bench_typed_list.IteratorReductionSuiteFloat.time_execute_reduction_sum_fastmath
         231±20μs         246±10μs     1.07  bench_typed_list.IteratorReductionSuiteFloat.time_execute_reduction_sum_no_fastmath
       22.3±0.4μs       26.1±0.7μs    ~1.17  bench_typed_list.IteratorReductionSuiteInt.time_compile_reduction_sum_fastmath
+      22.8±0.5μs       27.2±0.2μs     1.19  bench_typed_list.IteratorReductionSuiteInt.time_compile_reduction_sum_no_fastmath
          212±7μs          235±4μs    ~1.11  bench_typed_list.IteratorReductionSuiteInt.time_execute_reduction_sum_fastmath
          198±5μs          231±4μs    ~1.17  bench_typed_list.IteratorReductionSuiteInt.time_execute_reduction_sum_no_fastmath
              n/a              n/a      n/a  bench_typed_list.ReductionSuite.time_compile_reduction_sum_fastmath
              n/a              n/a      n/a  bench_typed_list.ReductionSuite.time_compile_reduction_sum_no_fastmath
              n/a              n/a      n/a  bench_typed_list.ReductionSuite.time_execute_reduction_sum_fastmath
              n/a              n/a      n/a  bench_typed_list.ReductionSuite.time_execute_reduction_sum_no_fastmath
          235±6ms          233±5ms     0.99  bench_typed_list.SortSuite.time_compile_sort
          120±2ms          113±2ms     0.94  bench_typed_list.SortSuite.time_execute_sort
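As a quick aid for reading the table above, the `ratio` column is consistent with `after / before`, so values below 1 mean the `after` commit was faster. The short check below recomputes the ratio for the three rows whose timings are printed at full precision in seconds; the dictionary keys are just the benchmark names with the module prefix dropped.

```python
# (before, after) timings in seconds, copied from the table above.
timings = {
    "ForLoopReductionSuite.time_compile_reduction_sum_fastmath":
        (3.603527119826277e-05, 2.7983062694041507e-05),
    "ForLoopReductionSuite.time_compile_reduction_sum_no_fastmath":
        (4.161973561003964e-05, 2.9423185928547246e-05),
    "ForLoopReductionSuite.time_execute_reduction_sum_fastmath":
        (0.0035349028767086565, 0.0025243946991395207),
}

# Ratio of the "after" timing to the "before" timing for each benchmark.
ratios = {name: after / before for name, (before, after) in timings.items()}

for name, ratio in ratios.items():
    print(f"{ratio:.2f}  {name}")
```

Rows that are printed with rounded values (for example `23.3±0.8μs`) may not reproduce the listed ratio exactly, because the ratio is presumably computed from the unrounded measurements.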

stuartarchibald · 2020-11-09T11:25:09Z

README.md

+Run `asv` on the first commit:
+
+```bash
+asv run "-1 abcdefg


Suggested change

asv run "-1 abcdefg

asv run "-1 abcdefg"

Also, that's hexadecimal? Why is there a g in it?

Not necessarily, it can be any commit-ish - which includes tags and branches. Though I understand that this is misleading because it looks like hex.

Fixed in 5c870d3

stuartarchibald · 2020-11-09T11:25:15Z

README.md

+Run `asv`  on the second commit:
+
+```bash
+asv run "-1 1235567


Suggested change

asv run "-1 1235567

asv run "-1 1235567"

Fixed in 5c870d3

stuartarchibald · 2020-11-09T11:26:14Z

README.md

+```bash
+asv run --verbose --show-stderr  -b 'bench_typed_list' "-1 abcdefg"
+```
+
+```bash
+asv compare --machine machine.local "abcdefg" "1234567"
+```


Add 1 line to say what they do/why they are useful?

Fixed in 5c870d3


esc added 7 commits September 23, 2020 16:00

adding seeding for PRNG

d0558e5

To ensure the reproducibility of the results, we seed the (pseudo-) random number generator.

this measures compile-time for sort

c997b5f

Since we want to catch compile time regressions too, we benchmark the compile-time.

benchmark with fastmath and w/o

5eeeb6c

make sure we really do recompile

7234402

Implement the steps in `dispatcher.compile` to clear out the cache etc. to ensure we really do get a fresh compile during every benchmark.

increase the min_run_count of this suite to 5

255a119

extract clearing the dispatcher

5d0fce2

This makes the function reusable.

implement benchmarking compilation time too for the reduction case

8732ef8

enhance the typed-list benchmarks some more

cbf3e8c

Distinguish between for-loop based and iterator based iteration and between integers and floats.

esc added 6 commits October 14, 2020 15:45

add benchmarks for getitem_unchecked

ca0011f

adding the ArrayListSuite

921321b

fix benchmarking parameters, elim warm-up time

6957bbc

fix signature for float suites

27fb3d8

fixup the array list tests

db6103e

upgrade the README with instructions on how to compare results

fb90310

esc added 2 commits November 9, 2020 15:57

add missing benchmarks.json file

a6be44c

respond to PR feedback and improve docs

5c870d3

Fixed some syntax issues, made the metasyntactic placeholders for commit-ishes less ambiguous and expanded the descriptions.
