-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add combineAllOption to Foldable #2380
Merged
Merged
Changes from 1 commit
Commits
Show all changes
11 commits
Select commit
Hold shift + click to select a range
e3bd620
Added combineAllOption to Foldable
barambani 94fcff1
Attempt
barambani 7f252c7
Merge branch 'master' into issue-2363
barambani 7f8e044
prePR
barambani 200d9c0
Merge branch 'master' into issue-2363
barambani 654aae5
Added overrides for iterator, adapted tests, deprecated Stream
barambani 64053b5
Removed tc iterator and changed with LazyList for 2.13 and Stream for
barambani 427c7f5
Restored original iterator in tests, added iterable syntax and test
barambani 38cbe68
Removed `override` non needed, refined tests
barambani f951eef
Implemented fold in terms of iterable
barambani 2d835f1
Changed to Iterable
barambani File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be using
reduceLeftOption
andSemigroup.combine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can't use
Monoid.combineAll
orSemigroup.combineAllOption
we are missing the optimization. This is for cases when we can use internal mutability to aggregate many fast, which is pretty critical in some applications (e.g. spark/scalding).If we just call out to
combine
we are destroying optimizations.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's tricky here, we don't want to impose
Monoid
and I wasn't aware there is optimization forSemigroup.combineAllOption
(actually I still can't find it), but even it does, we are in a syntax extension, so unlikefold
, we can't optimizecombineAllOption
inFoldable
instances.So to retain the optimization we probably need the keep the
Monoid
requirement which is rather unfortunate, we'd better document the rationale. Another option is to wait until Cats 2.0 with which we'll be able to add a combineAllOption toFoldable
In the meantime @johnynek would you mind point me to the where the
Semigroup.combineAllOption
's optimization is? we should probably document it somewhere.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, here is the method:
https://github.com/typelevel/cats/blob/master/kernel/src/main/scala/cats/kernel/Semigroup.scala#L39
note any semigroup can override that (as we do in other typeclasses for things like map2, product, etc.. in Applicative).
In algebird, which uses cats.kernel, we do this very frequently.
e.g.:
https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/HyperLogLog.scala#L582
https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/CountMinSketch.scala#L141
there are more. The point is the person implementing the Semigroup knows if there is a more efficient way to combine many of them, and should control that. Other typeclasses ideally should delegate to that to not undo the optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know about this optimization and the practice of overriding
Semigroup
'scombineAllOption
(and others) to address specific domain's performance gains. Thanks. What I can think about to exploit it is something likebit it still adds a traversal of the foldable for the conversion to
TraversableOnce
. I'm not sure it's acceptable.Anyway I'm also ok to abort this and wait for Cats 2.0 like @kailuowang mentioned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's tough in this case since here you are trading off the potential added cost of materializing a list with the potential benefit of reducing an entire group together.
Many Foldables could actually create an
Iterator
to pass intocombineAllOption
on Semigroup, but since we don't have a method like that we can only make the List.We could imagine:
on Foldable in cats 2.0. In this world, if you can make an Iterator for the caller to consume. Alternatively, we could add
Fold[A, B]
to cats:https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/Fold.scala#L40
so we could have
def foldWith[A, B](fa: F[A])(fold: Fold[A, B]): B
onFoldable
. Since folds ca potentially have internal mutable state, they can do much the same optimization of combineAllOption, although, not in parallel (since you can't split over trees).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot say if we could add
Fold
so I would leave the decision to others. About the other option, how could I use theonIterator
if it were available ? Still with the semigroupcombineAllOption
or that would be alternative to depending on overriding that in Semigroup ? Because ifonIterator
wouldn't need to be overridden, could it be another syntax ? Thanks for the explanation.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Right. Still aiming at mitigating the possible cost of building a list, it could be delayed up to when the iterator is traversing it. Something like
I'm not sure actually it be a step forward and is getting late. I will keep looking.