-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize traverse #3283
Optimize traverse #3283
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3283 +/- ##
==========================================
- Coverage 93.14% 93.14% -0.01%
==========================================
Files 378 378
Lines 7576 7575 -1
Branches 203 194 -9
==========================================
- Hits 7057 7056 -1
Misses 519 519
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I am now curious, are there other opportunities for optimization by inlining foldRight
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first glance, seems like we could potentially shave even more off of this with some dirtier internal tricks. This is a great start though.
Performance improvements on generic typeclass operations generally don't particularly interest me, since they should never be in the hot path anyway, but faster is always better than slower, regardless of the context.
tl;dr: Traversing a
List
orVector
is probably the most common operation people do with this library, and the current implementations for these types have some room for optimization, with the changes in this PR giving up to 20% more throughput forList
and 201% forVector
.I've put together a benchmark that compares the current
traverse
forList
with two new implementations:The first uses the standard library's
foldRight
with an explicitEval
accumulator, instead of theEval
-basedfoldRight
onFoldable
. The second effectively just inlines the call the Cat'sfoldRight
in the current implementation.Both of these seem substantially faster than the current implementation when traversing with
Right(_)
(results shown for list sizes 101, 102, 103, and 104; higher numbers are better; all results are shown for Scala 2.13, but 2.12 is similar):The
loop
implementation also allocates less:The results for a more complex parsing operation in
ValidatedNel
are similar.I've done a similar comparison for
Vector
, but with an additional new candidate:I've also included implementations of all three new approaches for
Vector
that accumulate the result in aList
and then convert at the end.Again the
loop
implementation is fastest (but the version that accumulates in a list, not the one that builds a vector directly).I've made these changes for
List
,Vector
, and 2.13'sArraySeq
, but not forChain
,Stream
, orLazyList
.