Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[proposal] optimize Alternative (part 1): introduce NonEmptyAlternative with prependK and appendK methods #4014

Merged
merged 3 commits into from
Nov 22, 2021

Conversation

satorg
Copy link
Contributor

@satorg satorg commented Oct 9, 2021

Rationale

There's a pretty common pattern can be found for Alternative even in the Cats sources:
F.combineK(F.pure(a), fa) and F.combineK(fa, F.pure(a))
which basically means prepending or appending an element a lifted into F[_].
But when it is used for some standard collections like List or Vector it basically results in creating an interim singleton collection on each F.pure which gets discarded right after F.combineK is completed.

The proposed PR allows to avoid creating the interim singleton collections by various Alternative implementations in many cases.

Description

This PR does the following:

  1. Introduces a new type class NonEmptyAlternative as a combination of Applicative and SemigroupK.
  2. Adds prependK and appendK methods to NonEmptyAlternative. These methods have default implementations as shown above. But they can be overridden for a particular Alternative implementation.
  3. Makes Alternative inheriting NonEmptyAlternative instead of just Applicative. The MonoidK mix-in type class remains in place.
  4. Switches existing F.pure + F.combineK patterns found in core Cats (apart from cats.data) to F.prependK or F.appendK appropriately where it is safe and reasonable.
  5. Introduces NonEmptyAlternativeLaws which becomes a base for AlternativeLaws.
  6. Both the laws above get an ability to test composed (via compose) type-classes (see composed constructor in the corresponding companion objects).

Things to be done in a subsequent PR (see the discussion below):

  1. Add NonEmptyAlternative instance to non-empty containers in cats.data.
  2. Override F.prependK and F.appendK in the Alternative and NonEmptyAlternative implementations where it reasonable.

Therefore this PR along with the next one will allow to utilize more efficient methods in some collections (e.g. :: in List) while preserving binary and source compatibility with existing Alternative implementations. Moreover they should enable further optimizations for non-empty data types too.

Caveats

  • Simulacrum's ScalaFix rules do not recognize the prependK method because it is defined in this way:
    def prependK(a: A, fa: F[A]): F[A] // seems Simulacrum cannot understand `fa` as a second parameter
    I believe it is an issue with Simulacrum itself and not a big deal anyway since neither Alternative nor NonEmptyAlternative make use of extension methods generated by Simulacrum actually.

@satorg
Copy link
Contributor Author

satorg commented Oct 11, 2021

Oh wait. I've just realized that it is not going to work with non-empty containers like NonEmptyList.
Apparently, NE containers cannot be MonoidK while Alternative inherits to MonoidK along with Applicative.

Not sure what would be the best way to improve it. One way could be to move appendK and prependK into SemigroupK, but it would require to add an implicit Applicative to these methods as a constraint. Another way could be introducing a new typeclass like NonEmptyAlternative which would inherit to Applicative and SemigroupK.

The latter seems to be more "radical", but it would allow other cool things like describing an additional variant of unite in terms of Reducible (rather than Foldable for the current Alternative#unite)

The former seems to be less invasive, but I am a bit concerned about these additional Applicative constraints.

So wdyt?

UPD: decided to introduce NonEmptyAlternative eventually.

Copy link
Contributor

@johnynek johnynek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this pr!

I agree the order of prependK should be changed.

Also, can we update all the Alternative instances? LazyList, Stream, ZipStream, ZipLazyList, Composed, Kleisli, ArraySeq, Queue, Chain. I think that's about it.

core/src/main/scala/cats/instances/option.scala Outdated Show resolved Hide resolved
johnynek
johnynek previously approved these changes Oct 24, 2021
Copy link
Contributor

@johnynek johnynek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@satorg
Copy link
Contributor Author

satorg commented Oct 24, 2021

@johnynek sorry, it is not complete yet. I'm still working on the changes you requested.

@satorg
Copy link
Contributor Author

satorg commented Oct 24, 2021

@johnynek turns out that ZipList and ZipVector don't have Alternative instances defined (although ZipLazyList and ZipStream both do). I cannot understand why – perhaps just an oversight. But I could also add Alternative for them in this PR and then define prependK and appendK as well. Does it make sense to add it to this PR, wdyt?

@satorg
Copy link
Contributor Author

satorg commented Oct 25, 2021

@johnynek regarding Alternative[Chain]: seems it is not very useful to specialize prependK and appendK for Chain for now since the Chain's own prepend and append are implemented this way:

final def prepend[A2 >: A](a: A2): Chain[A2] = Chain.concat(one(a), this)
final def append[A2 >: A](a: A2): Chain[A2] = Chain.concat(this, one(a))

I.e. they do exactly the same job as combineK along with pure do.

Moreover, due to the nature of Chain I am not certain if it is even possible to make them better than that.
One of the ways to optimize them could be to handle a case when Chain is actually a Wrap, i.e.:

final def prepend[A2 >: A](a: A2): Chain[A2] =
  this match {
    case Chain.Wrap(seq) => Chain.Wrap(a +: seq)
    case _ => Chain.concat(one(a), this)
  }

but I am not sure if it is really better than just to do a concat for everything.

I'd suppose it may require additional investigations and benchmarking and is beyond the scope of this PR.

@johnynek
Copy link
Contributor

Yeah good point about Chain. Definitely doing the wrap thing would be bad because Seq has no guarantee about either being fast.

Although you could imagine pattern matching on List or Vector in there and then using optimized prepend or append respectively.

But that could be costly to put that test in there. Maybe worth a benchmark.

johnynek
johnynek previously approved these changes Oct 29, 2021
Copy link
Contributor

@johnynek johnynek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one comment, take or leave.

Thanks for doing this!

core/src/main/scala-2.13+/cats/instances/stream.scala Outdated Show resolved Hide resolved
@satorg
Copy link
Contributor Author

satorg commented Oct 29, 2021

Just an update: although the initial goal seems achieved in general from my prospective, now I'm working on addressing my aforementioned concern regarding non-empty containers. It is going to be a new type class, namely NonEmptyAlternative, which should cover such cases. It will be a bunch of commits on top of this PR so if the community doesn't like that, it will be easy to discard.

@satorg
Copy link
Contributor Author

satorg commented Oct 29, 2021

@johnynek regarding ZipList and ZipVector. Seems that these types didn't get their Alternative instances intentionally. I'm not certain about it, but since Alternative[ZipLazyList] relies on LazyList laziness and ability to produce infinite sequences, I'd guess that Alternative for non-lazy sequences cannot be implemented.

Unfortunately, seems there're neither laws nor docs for Zip* wrappers available so I still don't really understand their purpose. So I would appreciate any help or hint with that :)

cc @LukaJCB @johnynek

johnynek
johnynek previously approved these changes Nov 3, 2021
@satorg satorg marked this pull request as draft November 3, 2021 20:34
@satorg satorg force-pushed the alternative-prepend-append branch 2 times, most recently from 3f7c9bf to 9e6ac0d Compare November 6, 2021 07:04
@satorg
Copy link
Contributor Author

satorg commented Nov 6, 2021

Turned out it was not an easy lift and required a way more changes that I initially anticipated. But seems the PR is competed now. Finally) I've also updated the description of the PR.

There were several bin-compat issues which I had to tackle. The solutions don't look esthetically great but they're the best I managed to come up with. And seems MiMa is happy now. But if someone has got an idea on how to make it better, I'd really appreciate any help/hint on that.

cc @LukaJCB, @johnynek

@satorg satorg marked this pull request as ready for review November 6, 2021 07:29
Comment on lines +25 to +29

def composed[M[_], N[_]](implicit M: Alternative[M], N: Applicative[N]): AlternativeLaws[λ[α => M[N[α]]]] =
new AlternativeLaws[λ[α => M[N[α]]]] {
def F: Alternative[λ[α => M[N[α]]]] = M.compose[N]
}
Copy link
Contributor Author

@satorg satorg Nov 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a call-out: this introduces a new ability to validate laws for composed instances. This feature is totally new, it didn't exist before. I decided to add it because composed instances were not covered with any tests at all.

But currently only composed Alternative and NonEmptyAlternative instances (see below) are supported to keep the PR scope narrow.

@johnynek
Copy link
Contributor

This is really good work.

It did get very large.

I'm now wondering if we can break it into two:

  1. introduce NonEmptyAlternative and the laws.
  2. override prependK/appendK to be specialized for more instances.

It is much harder to review larger PRs and if we have to revert, we revert more code at once. I would be on the lookout for ways we can decompose into separately mergeable parts.

What do you think?

@satorg
Copy link
Contributor Author

satorg commented Nov 19, 2021

I'm now wondering if we can break it into two:

  1. introduce NonEmptyAlternative and the laws.
  2. override prependK/appendK to be specialized for more instances.

Agree, that makes sense to me too. I'll take on this.

"introduce NonEmptyAlternative and the laws" – do you mean to add NonEmptyAlternative typeclass along with prependK/appendK methods all together (but without their specialized versions)? Because otherwise the new typeclass is going to be just empty...

@johnynek
Copy link
Contributor

yeah, I was thinking add the default implementation of prependK/appendK using pure + combineK. And then a separate pr with all the specializations.

Just as one example way to split the PR.

@satorg satorg marked this pull request as draft November 20, 2021 02:12
@@ -14,6 +14,7 @@ abstract class AllSyntaxBinCompat

trait AllSyntax
extends AlternativeSyntax
with NonEmptyAlternativeSyntax
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, where to insert this new trait. I'd prefer to maintain all mix-in traits in some particular order, but I cannot figure out any specific order here.

@@ -71,4 +71,5 @@ package object syntax {
object vector extends VectorSyntax
object writer extends WriterSyntax
object set extends SetSyntax
object nonEmptyAlternative extends NonEmptyAlternativeSyntax
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that the initial intent was to keep all inner objects of object syntax just in alphabetic order by their names. But later the order was broken (perhaps due to an oversight). So I'm putting the new object to the end of all existing objects for now.

@satorg satorg marked this pull request as ready for review November 22, 2021 09:13
@satorg satorg changed the title [proposal] optimize Alternative: prependK and appendK methods [proposal] optimize Alternative (part 1): introduce NonEmptyAlternative with prependK and appendK methods Nov 22, 2021
@satorg
Copy link
Contributor Author

satorg commented Nov 22, 2021

@johnynek FYI: I truncated this PR to make it containing basic prependK/appendK changes only.

Copy link
Member

@LukaJCB LukaJCB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much!

@johnynek johnynek merged commit 1fed125 into typelevel:main Nov 22, 2021
@satorg satorg deleted the alternative-prepend-append branch November 22, 2021 19:56
@satorg
Copy link
Contributor Author

satorg commented Nov 22, 2021

@johnynek @LukaJCB thank you guys for following up this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants