initial stab at the numba.typed.List benchmarks #10

Open
wants to merge 17 commits into
base: master

Member

esc commented Sep 18, 2020

This is an initial stab at the ASV tests for the `numba.typed.List`.

Things that still need to be decided on:

* Do we want to measure compile time too? If so, how?
* Is compile time included in these benchmarks or not?

Lastly, here is a snapshot of what it looks like when run on my laptop:

```
 💣 zsh» asv run --show-stderr  -b 'bench_typed_list*'
· Fetching recent changes.
· Creating environments
· Discovering benchmarks
· Running 4 total benchmarks (1 commits * 1 environments * 4 benchmarks)
[  0.00%] · For numba commit 1949c62e <master>:
[  0.00%] ·· Benchmarking conda-py3.6-cudatoolkit-llvmlite-numpy
[ 12.50%] ··· Running (bench_typed_list.ConstructionSuite.time_construct_from_python_list--)....
[ 62.50%] ··· bench_typed_list.ConstructionSuite.time_construct_from_python_list                                                                               97.8±1ms
[ 75.00%] ··· bench_typed_list.ConstructionSuite.time_construct_in_njit_function                                                                             2.61±0.1ms
[ 87.50%] ··· bench_typed_list.ReductionSuite.time_reduction_sum                                                                                                203±9μs
[100.00%] ··· bench_typed_list.SortSuite.time_sort                                                                                                              117±4ms
asv run --show-stderr -b 'bench_typed_list*'  23.89s user 1.37s system 85% cpu 29.624 total
```
class ConstructionSuite:

    def setup(self):
        self.pl = make_random_python_list(SIZE)
Member

It should seed the RNG.

Member Author

done
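
A minimal sketch of what the seeding could look like, assuming the helper draws from Python's `random` module. The body of `make_random_python_list`, its `seed` parameter, and the values of `SIZE` and `SEED` are hypothetical; this thread only shows the helper being called.

```python
import random

SIZE = 1000  # hypothetical; the real value is not shown in this thread
SEED = 0     # hypothetical fixed seed


def make_random_python_list(size, seed=SEED):
    """Return `size` pseudo-random integers, reproducibly."""
    # A dedicated Random instance keeps the seed local to this helper
    # instead of mutating the module-level generator.
    rng = random.Random(seed)
    return [rng.randint(0, size) for _ in range(size)]


class ConstructionSuite:
    def setup(self):
        # asv calls setup() before timing; with a fixed seed every run
        # benchmarks the same input data.
        self.pl = make_random_python_list(SIZE)
```

Using a local `random.Random` rather than `random.seed` is a design choice in this sketch, not something the PR prescribes; either gives identical data on every run.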

class ReductionSuite:

    def setup(self):
        self.tl = make_random_typed_list(SIZE)
Member

Needs to seed the RNG.

Member Author

done too.

Member

sklam commented Sep 18, 2020

We do want compile time. The benchmark can make a fresh dispatcher and call its `.compile` method explicitly.

            agg += i
        return agg

    self.reduction_sum = njit(reduction_sum)
Contributor

Without fastmath this won't vectorize (should probably test both with and without). I also wonder if the iterator will vectorize vs an explicit induced loop, something to look at. The iterator loop will likely be full of refops.

Member Author

I am testing both now; they seem to have very similar run-time characteristics, so I will need to look at vectorization. But testing both is worthwhile in its own right. I am also testing compilation time now.

esc added 7 commits September 23, 2020 16:00
* To ensure the reproducibility of the results, we seed the (pseudo-)random number generator.
* Since we want to catch compile-time regressions too, we benchmark the compile time.
* Implement the steps in `dispatcher.compile` to clear out the cache, etc., to ensure we really do get a fresh compile during every benchmark.
* This makes the function reusable.
Member Author

esc commented Sep 25, 2020

I have updated the tests to seed the RNG and to time compilation. Here is a snapshot of a run on my machine:

[ 56.25%] ··· bench_typed_list.ConstructionSuite.time_construct_from_python_list                                                                               96.9±2ms
[ 62.50%] ··· bench_typed_list.ConstructionSuite.time_construct_in_njit_function                                                                            2.17±0.09ms
[ 68.75%] ··· bench_typed_list.ReductionSuite.time_compile_reduction_sum_fastmath                                                                            21.1±0.4μs
[ 75.00%] ··· bench_typed_list.ReductionSuite.time_compile_reduction_sum_no_fastmath                                                                         21.8±0.1μs
[ 81.25%] ··· bench_typed_list.ReductionSuite.time_execute_reduction_sum_fastmath                                                                               195±6μs
[ 87.50%] ··· bench_typed_list.ReductionSuite.time_execute_reduction_sum_no_fastmath                                                                            201±2μs
[ 93.75%] ··· bench_typed_list.SortSuite.time_compile_sort                                                                                                      216±3ms
[100.00%] ··· bench_typed_list.SortSuite.time_execute_sort                                                                                                      116±3ms

Distinguish between for-loop based and iterator based iteration and
between integers and floats.
Member Author

esc commented Oct 9, 2020

With recent updates, these are the current benchmarks for the changes introduced by: numba/numba#6278

All benchmarks:

       before           after         ratio
     [3b3eab89]       [05ce51c6]
     <pull/5543/merge~1>       <pull/6278/head~1>
          103±3ms          103±3ms     1.00  bench_typed_list.ConstructionSuite.time_construct_from_python_list
      2.37±0.09ms      2.39±0.07ms     1.01  bench_typed_list.ConstructionSuite.time_construct_in_njit_function
  3.603527119826277e-05±4e-06  2.7983062694041507e-05±9.5e-07    ~0.78  bench_typed_list.ForLoopReductionSuite.time_compile_reduction_sum_fastmath
  4.161973561003964e-05±9.1e-06  2.9423185928547246e-05±3.7e-06    ~0.71  bench_typed_list.ForLoopReductionSuite.time_compile_reduction_sum_no_fastmath
  0.0035349028767086565±0.00034  0.0025243946991395207±2e-05    ~0.71  bench_typed_list.ForLoopReductionSuite.time_execute_reduction_sum_fastmath
  0.0031963562505552545±0.00021  0.0027488698993693105±0.0002    ~0.86  bench_typed_list.ForLoopReductionSuite.time_execute_reduction_sum_no_fastmath
         62.6±2ms         62.7±2ms     1.00  bench_typed_list.ForLoopReductionSuiteFloat.time_compile_reduction_sum_fastmath
         61.9±1ms       61.2±0.7ms     0.99  bench_typed_list.ForLoopReductionSuiteFloat.time_compile_reduction_sum_no_fastmath
      2.59±0.03ms      2.52±0.07ms     0.97  bench_typed_list.ForLoopReductionSuiteFloat.time_execute_reduction_sum_fastmath
      2.58±0.04ms      2.50±0.04ms     0.97  bench_typed_list.ForLoopReductionSuiteFloat.time_execute_reduction_sum_no_fastmath
       29.1±0.8μs       28.0±0.6μs     0.96  bench_typed_list.ForLoopReductionSuiteInt.time_compile_reduction_sum_fastmath
       29.3±0.5μs       31.6±0.4μs     1.08  bench_typed_list.ForLoopReductionSuiteInt.time_compile_reduction_sum_no_fastmath
       2.64±0.1ms      2.55±0.04ms     0.97  bench_typed_list.ForLoopReductionSuiteInt.time_execute_reduction_sum_fastmath
       2.69±0.1ms      2.47±0.07ms     0.92  bench_typed_list.ForLoopReductionSuiteInt.time_execute_reduction_sum_no_fastmath
  2.6585843983129173e-05±2.1e-06  2.2718447455970037e-05±2.8e-07    ~0.85  bench_typed_list.IteratorReductionSuite.time_compile_reduction_sum_fastmath
  2.4821427593867003e-05±2.2e-06  2.3370140742376312e-05±3.2e-07     0.94  bench_typed_list.IteratorReductionSuite.time_compile_reduction_sum_no_fastmath
  0.00023403224115879442±5.1e-06  0.00021722948935967772±3.8e-06     0.93  bench_typed_list.IteratorReductionSuite.time_execute_reduction_sum_fastmath
  0.0002068479817932133±1.5e-05  0.00022312129993224517±7e-06     1.08  bench_typed_list.IteratorReductionSuite.time_execute_reduction_sum_no_fastmath
+      23.3±0.8μs       26.7±0.9μs     1.14  bench_typed_list.IteratorReductionSuiteFloat.time_compile_reduction_sum_fastmath
-      27.0±0.8μs       24.4±0.3μs     0.90  bench_typed_list.IteratorReductionSuiteFloat.time_compile_reduction_sum_no_fastmath
         227±20μs          271±4μs    ~1.19  bench_typed_list.IteratorReductionSuiteFloat.time_execute_reduction_sum_fastmath
         231±20μs         246±10μs     1.07  bench_typed_list.IteratorReductionSuiteFloat.time_execute_reduction_sum_no_fastmath
       22.3±0.4μs       26.1±0.7μs    ~1.17  bench_typed_list.IteratorReductionSuiteInt.time_compile_reduction_sum_fastmath
+      22.8±0.5μs       27.2±0.2μs     1.19  bench_typed_list.IteratorReductionSuiteInt.time_compile_reduction_sum_no_fastmath
          212±7μs          235±4μs    ~1.11  bench_typed_list.IteratorReductionSuiteInt.time_execute_reduction_sum_fastmath
          198±5μs          231±4μs    ~1.17  bench_typed_list.IteratorReductionSuiteInt.time_execute_reduction_sum_no_fastmath
              n/a              n/a      n/a  bench_typed_list.ReductionSuite.time_compile_reduction_sum_fastmath
              n/a              n/a      n/a  bench_typed_list.ReductionSuite.time_compile_reduction_sum_no_fastmath
              n/a              n/a      n/a  bench_typed_list.ReductionSuite.time_execute_reduction_sum_fastmath
              n/a              n/a      n/a  bench_typed_list.ReductionSuite.time_execute_reduction_sum_no_fastmath
          235±6ms          233±5ms     0.99  bench_typed_list.SortSuite.time_compile_sort
          120±2ms          113±2ms     0.94  bench_typed_list.SortSuite.time_execute_sort
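
As a reading aid for the table, the ratio column is the `after` timing divided by the `before` timing, so values below 1.00 mean the newer commit was faster. The sketch below recomputes it for the four rows that asv printed as raw seconds; the `ROWS` mapping and the `ratio` helper are illustrative names only.

```python
# (before, after) timings in seconds, copied from the table rows that
# asv printed without unit scaling.
ROWS = {
    "ForLoopReductionSuite.time_compile_reduction_sum_fastmath":
        (3.603527119826277e-05, 2.7983062694041507e-05),
    "ForLoopReductionSuite.time_compile_reduction_sum_no_fastmath":
        (4.161973561003964e-05, 2.9423185928547246e-05),
    "ForLoopReductionSuite.time_execute_reduction_sum_fastmath":
        (0.0035349028767086565, 0.0025243946991395207),
    "ForLoopReductionSuite.time_execute_reduction_sum_no_fastmath":
        (0.0031963562505552545, 0.0027488698993693105),
}


def ratio(before, after):
    """Return after / before, as shown in the ratio column."""
    return after / before


for name, (before, after) in ROWS.items():
    print(f"{ratio(before, after):.2f}  {name}")
```

As far as I understand asv's output, a leading `~` on a ratio marks a change that exceeds the threshold but is not statistically significant, while `+` and `-` at the start of a row mark significant regressions and improvements.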

README.md Outdated
Run `asv` on the first commit:

```bash
asv run "-1 abcdefg
```
Contributor

Suggested change:
- asv run "-1 abcdefg
+ asv run "-1 abcdefg"

Also, is that meant to be hexadecimal? Why is there a g in it?

Member Author

Not necessarily; it can be any commit-ish, which includes tags and branches. I understand that this is misleading, though, because it looks like hex.

Member Author

Fixed in 5c870d3

README.md Outdated
Run `asv` on the second commit:

```bash
asv run "-1 1235567
```
Contributor

Suggested change:
- asv run "-1 1235567
+ asv run "-1 1235567"

Member Author

fixed in 5c870d3

README.md Outdated
Comment on lines 51 to 57
```bash
asv run --verbose --show-stderr -b 'bench_typed_list' "-1 abcdefg"
```

```bash
asv compare --machine machine.local "abcdefg" "1234567"
```
Contributor

Add 1 line to say what they do/why they are useful?

Member Author

Fixed in 5c870d3

esc added 2 commits November 9, 2020 15:57
Fixed some syntax issues, made the metasyntactic placeholders for
commit-ishes less ambiguous, and expanded the descriptions.