
Optimize IO's encoding #90

Merged (11 commits into typelevel:master, Nov 30, 2017)

Conversation

@alexandru (Member) commented Nov 28, 2017

Fixes #89 and improves performance.

This implementation is inspired by the internal encoding of Monix's Task. It is not an exact port, because Monix's Task is cancelable and has to do more work because of that. Some minor optimizations were also left out for now, pending further benchmarks (e.g. a separate state for IO.apply), but the gist is here.
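
For context, the general technique behind such an encoding is an algebraic data type of IO states plus a trampolined run-loop that keeps pending flatMap continuations in an explicit stack. Below is a heavily simplified, hedged sketch of that idea; all names (SketchIO, Pure, Delay, Bind, run) are invented for illustration and are not the actual cats-effect internals:

object RunLoopSketch {
  sealed trait SketchIO[+A] {
    def flatMap[B](f: A => SketchIO[B]): SketchIO[B] = Bind(this, f)
  }
  final case class Pure[+A](value: A) extends SketchIO[A]
  final case class Delay[+A](thunk: () => A) extends SketchIO[A]
  final case class Bind[A, +B](source: SketchIO[A], f: A => SketchIO[B])
      extends SketchIO[B]

  // Interprets the structure in a flat while-loop; continuations live
  // in a heap-allocated stack instead of nested runtime calls.
  def run[A](io: SketchIO[A]): A = {
    var current: SketchIO[Any] = io
    var stack: List[Any => SketchIO[Any]] = Nil
    var result: Option[Any] = None

    while (result.isEmpty) {
      current match {
        case Bind(source, f) =>
          // Push the continuation, keep unwinding the source.
          stack = f.asInstanceOf[Any => SketchIO[Any]] :: stack
          current = source
        case Delay(thunk) =>
          current = Pure(thunk())
        case Pure(value) =>
          stack match {
            case f :: rest => stack = rest; current = f(value)
            case Nil       => result = Some(value)
          }
      }
    }
    result.get.asInstanceOf[A]
  }

  // Example: run(Pure(1).flatMap(i => Delay(() => i + 1))) == 2
}

Materializing the continuations on the heap is what avoids both stack overflows and the overhead of nested runtime calls, which is exactly what the benchmarks below exercise.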

Benchmarking (last update: 2017-11-29)

The PR includes a JMH setup with benchmarks/vPrev and benchmarks/vNext as sub-projects for measuring the performance impact of changes, compared with whatever previous version we want.

To run the benchmarks, execute the script:

./benchmarks/run-benchmarks-all

The results will be dumped in benchmarks/results.
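
For a single benchmark, a direct sbt-jmh invocation should also work. A hedged example, assuming the standard sbt-jmh plugin on the vNext sub-project (project name taken from the setup above, iteration counts illustrative):

sbt "vNext/jmh:run -i 20 -wi 10 -f 1 .*DeepBindBenchmark.*"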

ShallowBindBenchmark

This measures a plain tail-recursive flatMap loop. Previous version:

Benchmark                   (size)   Mode  Cnt     Score    Error  Units
ShallowBindBenchmark.async   10000  thrpt   20    49.247 ±  4.749  ops/s
ShallowBindBenchmark.delay   10000  thrpt   20  1589.975 ± 32.753  ops/s
ShallowBindBenchmark.pure    10000  thrpt   20  2556.298 ± 35.352  ops/s

PR changes:

Benchmark                   (size)   Mode  Cnt     Score    Error  Units
ShallowBindBenchmark.async   10000  thrpt   20    82.844 ± 14.171  ops/s
ShallowBindBenchmark.delay   10000  thrpt   20  3619.257 ± 49.135  ops/s
ShallowBindBenchmark.pure    10000  thrpt   20  6291.561 ± 52.802  ops/s

Over twice the throughput.
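
For reference, the shape of loop this benchmark measures looks roughly like the following hedged sketch (names are illustrative, not the benchmark's actual code):

import cats.effect.IO

def shallowLoop(i: Int, size: Int): IO[Int] =
  IO.pure(i).flatMap { j =>
    // The recursive call is the final result of the bind, so the
    // run-loop's continuation stack stays constant-sized.
    if (j < size) shallowLoop(j + 1, size) else IO.pure(j)
  }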

DeepBindBenchmark

This one measures a non-tail-recursive flatMap loop (like in issue #89). Previous version:

Benchmark                (size)   Mode  Cnt     Score     Error  Units
DeepBindBenchmark.async    3000  thrpt   20     2.704 ±   0.062  ops/s
DeepBindBenchmark.delay    3000  thrpt   20     4.354 ±   0.076  ops/s
DeepBindBenchmark.pure     3000  thrpt   20  2883.429 ± 103.256  ops/s

After PR changes:

Benchmark                (size)   Mode  Cnt     Score     Error  Units
DeepBindBenchmark.async    3000  thrpt   20   375.433 ±  19.089  ops/s
DeepBindBenchmark.delay    3000  thrpt   20  5485.506 ±  71.269  ops/s
DeepBindBenchmark.pure     3000  thrpt   20  7089.853 ± 111.027  ops/s

The differences are dramatic, mostly due to the reduced memory usage.
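
For contrast with the shallow case, a hedged sketch of a loop of this shape (illustrative names, not the benchmark's actual code):

import cats.effect.IO

def deepLoop(i: Int, size: Int): IO[Int] =
  if (i >= size) IO.pure(i)
  else
    // The continuation wraps the recursive call, so `size` pending
    // binds accumulate before any of them can execute.
    deepLoop(i + 1, size).flatMap(j => IO.pure(j + 1))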

AttemptBenchmark

This one measures the performance of attempt, both for the happy path and for handling errors.

Previous version:

Benchmark                     (size)   Mode  Cnt     Score    Error  Units
AttemptBenchmark.errorRaised   10000  thrpt   20   349.944 ±  3.795  ops/s
AttemptBenchmark.happyPath     10000  thrpt   20  2282.834 ± 23.092  ops/s

After the PR changes:

Benchmark                     (size)   Mode  Cnt     Score    Error  Units
AttemptBenchmark.errorRaised   10000  thrpt   20  2150.224 ± 25.714  ops/s
AttemptBenchmark.happyPath     10000  thrpt   20  2284.496 ± 39.242  ops/s

The difference is dramatic when errors get handled.
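
As a reminder of what the two measured paths exercise, a hedged sketch (not the benchmark's actual code):

import cats.effect.IO

// Happy path: attempt wraps the value in Right, never touching
// the error channel.
val happyPath: IO[Either[Throwable, Int]] =
  IO.pure(1).attempt

// Error path: attempt catches the raised error as a Left.
val errorRaised: IO[Either[Throwable, Int]] =
  IO.raiseError[Int](new RuntimeException("boom")).attempt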

HandleErrorBenchmark

This one measures the performance of handleErrorWith, which is optimized in the new version.

Previous version:

Benchmark                         (size)   Mode  Cnt    Score     Error  Units
HandleErrorBenchmark.errorRaised   10000  thrpt   20  282.673 ±   3.525  ops/s
HandleErrorBenchmark.happyPath     10000  thrpt   20  979.575 ± 130.618  ops/s

After PR changes:

Benchmark                         (size)   Mode  Cnt     Score    Error  Units
HandleErrorBenchmark.errorRaised   10000  thrpt   20  1939.384 ± 14.190  ops/s
HandleErrorBenchmark.happyPath     10000  thrpt   20  3068.821 ± 37.069  ops/s
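
Again a hedged sketch of the two measured paths (not the benchmark's actual code); the handler syntax assumes cats' ApplicativeError syntax is in scope:

import cats.effect.IO
import cats.implicits._

// Happy path: the error handler is never invoked; the optimized
// encoding can skip over it cheaply in the run-loop.
val happyPath: IO[Int] =
  IO.pure(1).handleErrorWith(_ => IO.pure(0))

// Error path: the handler recovers the raised error.
val errorRaised: IO[Int] =
  IO.raiseError[Int](new RuntimeException("boom"))
    .handleErrorWith(_ => IO.pure(0))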

Moving Forward

More optimizations are possible, but this already provides a good baseline; other micro-optimizations can come in separate PRs, along with proof that they work.

@alexandru alexandru changed the title Optimize IO's encoding WIP: Optimize IO's encoding Nov 28, 2017
@alexandru alexandru changed the title WIP: Optimize IO's encoding Optimize IO's encoding Nov 28, 2017
@alexandru alexandru mentioned this pull request Nov 28, 2017
@codecov-io commented Nov 28, 2017

Codecov Report

Merging #90 into master will increase coverage by 0.83%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master      #90      +/-   ##
==========================================
+ Coverage   85.93%   86.77%   +0.83%     
==========================================
  Files          19       20       +1     
  Lines         384      378       -6     
  Branches       21       27       +6     
==========================================
- Hits          330      328       -2     
+ Misses         54       50       -4

}

/** Pops the next bind function from the stack, but filters out
* `Mapping.OnError` references, because we know they won't do
Review comment (Member) on the diff above:

s/Mapping.OnError/IOFrame.ErrorHandler/g

@mpilquist mpilquist self-requested a review November 29, 2017 00:25
@mpilquist (Member) left a comment:

This looks great. Interested to see some updated benchmarks.

@pchlupacek

@alexandru this is excellent news. Btw, does the new IO encoding support flatMap-ing over the result, i.e. flatMap { result: Either[Throwable, A] => ??? }?

@alexandru (Member Author)

@pchlupacek the new internal encoding is optimized for flatMap-ing over both successful values and errors; however, we are not exposing it in the API.

Monix's Task has a transformWith[B](f: A => Task[B], g: Throwable => Task[B]): Task[B] operator. I would like that in IO as well for efficiency reasons, but this PR is already too big, so I'd propose it in a separate PR.
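
For illustration, the semantics such an operator would provide can be expressed today via attempt; this is only a hedged sketch of the behavior (with an extra Either allocation per step), not the proposed optimized implementation:

import cats.effect.IO

def transformWith[A, B](io: IO[A])(f: A => IO[B], g: Throwable => IO[B]): IO[B] =
  io.attempt.flatMap {
    case Right(a) => f(a)
    case Left(e)  => g(e)
  }

// Example: recover a failure into a pure value.
// transformWith(IO.raiseError[Int](new RuntimeException("boom")))(IO.pure, _ => IO.pure(0))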

@pchlupacek commented Nov 29, 2017

@alexandru yeah, it would be excellent to have that flatMap-ish design if possible, with the same performance as a normal flatMap.

@alexandru (Member Author) commented Nov 29, 2017

@pchlupacek note that these changes make handleErrorWith much, much lighter on the happy path.

A transformWith would be better still, but even without it you should see big improvements with this PR for defensive code.

See the HandleErrorBenchmark, which is relevant here.

@alexandru alexandru changed the title WIP: Optimize IO's encoding Optimize IO's encoding Nov 29, 2017
@alexandru alexandru requested a review from djspiewak November 29, 2017 17:46
@djspiewak (Member)

@alexandru This looks very impressive at first glance. Thanks for taking the time to do this! I potentially won't have a chance to review it for a few days. @mpilquist has already given his stamp of approval, so if you guys want to move forward, feel free to do so. :-) Otherwise, I'll get to it asap.

@non commented Nov 29, 2017

👍 This looks great, thanks @alexandru!

@mpilquist (Member)

@alexandru Could you take a look at the compilation failure on the 2.10 build? Once that's fixed, I think we can go ahead and merge this given Daniel's response from yesterday.

@mpilquist mpilquist merged commit 964e8d0 into typelevel:master Nov 30, 2017
@LukaJCB (Member) commented Nov 30, 2017

Amazing work!

@alexandru (Member Author)

@mpilquist thanks for the review and the merge.

@djspiewak when you have the time, please publish a hash version for testing purposes.

This might not be the last PR for performance optimizations. I'm tormenting myself with some profiling tools from Intel whose UI was made in 1994, and I'm running experiments, but as I said, it would be better to introduce further optimizations piecemeal, with some proof that they work.
