Optimize traverseWithChain for StackSafeMonad(s) #3790
Forgive the verbose reply (much of this is known to @djspiewak, but I am adding it for context for other readers). This is an important question.

Arguably, if we want fast, we generally want to avoid laziness, since laziness generally adds boxing and wrapping. But we also want to preserve "short circuiting" error semantics when we can. Since we want fast, cats defaults to strict arguments, thus [...]. The problem comes when [...].

Here I want to point out a big challenge we have: optimizing traverse is hard. Generally it depends on both the particular data structure we are traversing (say [...]).

Also, it is key to note that, due to our laws, the result must be the same whether we use `map2Eval` or `map2`; the only possible difference is the cost to compute in the presence of some short circuits. So, ideally, we would want the applicative to be able to signal whether it should use `map2` or `map2Eval`, but currently there is no way to do that.

What is missing is a kind of complete taxonomy of applicatives that might shed some light. E.g. we have lazy, function-like applicatives (`Function0`, `Function1`, `Kleisli`, `State`, `Writer`, `Eval`, `TailRec`, Parsers, ...); we have error-like applicatives: `Either`, `Try`, `Future`; we have effectful applicatives: `IO`, `IO.Par`, ...; we have possibly-empty collection applicatives (`List`, `Vector`, `ZipList`, ...); and we have non-empty collections (`NonEmptyList`, `NonEmptyVector`, ...). It seems to me our current choice is helping the strict error-like and possibly-empty collection applicatives at the expense of the other kinds. Now, you can argue that the other kinds are the main uses: `State`, `Writer`, Parsers, `IO`, `IO.Par`. Further, you can ask why we are trying so hard to avoid calling the [...].

Lastly, @djspiewak suggests a very practical solution: if we check that the `Applicative` is a `StackSafeMonad`, we have a very good hint that we are not in the cases above that we are optimizing for, e.g. strict but error-like monads (`Try`, `Either`, possibly-empty collections). I believe a strict monad can never be a `StackSafeMonad`, but I don't see a proof immediately.

A last note: a few years ago, I noticed a typeclass called [...]. I think this is an interesting area for discussion and research on FP in strict languages. Maybe something is already known related to what we are currently groping in the dark for.

I am not opposed to Daniel's suggestion, but given we have waited this long, I do think it is worth a bit of discussion and thinking to see if we can synthesize any new knowledge here in 2021 that we didn't really have when cats made the `map2Eval` design in the first place.
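(To make the laws point concrete, here is a minimal, self-contained sketch; `expensiveRhs` and the object name are illustrative only, and it assumes a recent cats (2.2+) where `Apply[Option]` resolves without extra imports and the `Option` instance overrides `map2Eval` to short-circuit on `None`, which it does.)

import cats.{Apply, Eval}

object ShortCircuitSketch {
  def main(args: Array[String]): Unit = {
    val ap = Apply[Option]

    // Stand-in for an expensive-to-build right-hand side.
    def expensiveRhs(): Option[Int] = { println("building rhs"); Some(2) }

    // Strict: the right-hand side is fully built before map2 runs,
    // even though the None on the left already determines the answer.
    val strict: Option[Int] = ap.map2(Option.empty[Int], expensiveRhs())(_ + _)

    // Lazy: the None on the left short-circuits, so the Eval is never forced
    // and "building rhs" is not printed a second time.
    val lazily: Option[Int] =
      ap.map2Eval(Option.empty[Int], Eval.later(expensiveRhs()))(_ + _).value

    // The laws demand the same result either way; only the cost differs.
    assert(strict == lazily)
  }
}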
Just on the surface I think trying to unify [...]

(That said, if it proves thorny, I think @djspiewak's tactical suggestion also works.)
See #3962 for a stab at some of what I wrote.
What do you think of something like this:

trait TraverseStrategy[F[_]] {
  type Rhs[_]
  def map2[A, B, C](fa: F[A], fb: Rhs[B])(fn: (A, B) => C): Rhs[C]
  def build[A, B](a: A)(fn: A => F[B]): Rhs[B]
  def toF[A](f: Rhs[A]): F[A]
}

object TraverseStrategy {
  def viaEval[F[_]](ap: Apply[F]): TraverseStrategy[F] =
    new TraverseStrategy[F] {
      type Rhs[A] = Eval[F[A]]
      def map2[A, B, C](fa: F[A], fb: Rhs[B])(fn: (A, B) => C): Rhs[C] =
        ap.map2Eval(fa, fb)(fn)
      def build[A, B](a: A)(fn: A => F[B]): Rhs[B] =
        Eval.always(fn(a))
      def toF[A](f: Rhs[A]): F[A] = f.value
    }

  def direct[F[_]](ap: Apply[F]): TraverseStrategy[F] =
    new TraverseStrategy[F] {
      type Rhs[A] = F[A]
      def map2[A, B, C](fa: F[A], fb: Rhs[B])(fn: (A, B) => C): Rhs[C] =
        ap.map2(fa, fb)(fn)
      def build[A, B](a: A)(fn: A => F[B]): Rhs[B] = fn(a)
      def toF[A](f: Rhs[A]): F[A] = f
    }

  def default[F[_]](ap: Apply[F]): TraverseStrategy[F] =
    ap match {
      case _: StackSafeMonad[F] @unchecked => direct(ap)
      case _                               => viaEval(ap)
    }
}

trait Apply[F[_]] {
  // override this to change the traverse strategy
  lazy val traverseStrategy: TraverseStrategy[F] = TraverseStrategy.default(this)
}

Or something close to that. Then we could override the traverseStrategy for some non-stacksafe monads that [...]

I'm not in love with the circular dependency here (which I think the lazy val resolves).
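(Purely as a hypothetical illustration of that override hook, building on the TraverseStrategy and traverseStrategy members sketched above, neither of which exists in cats today: a function-like applicative such as a toy Reader never short-circuits, so it might opt into the strict strategy even though it is not a StackSafeMonad. Whether that trade-off is actually right for a given F is exactly the open question in this thread.)

// Hypothetical: pretends cats' Apply has gained the traverseStrategy member from the sketch above.
object TraverseStrategyOverrideSketch {
  type Reader[A] = Int => A

  implicit val applyForReader: Apply[Reader] =
    new Apply[Reader] {
      def map[A, B](fa: Reader[A])(f: A => B): Reader[B] = i => f(fa(i))
      def ap[A, B](ff: Reader[A => B])(fa: Reader[A]): Reader[B] = i => ff(i)(fa(i))

      // Reader never short-circuits, so the Eval suspension buys it nothing:
      // opt into the strict strategy even though Reader is not a StackSafeMonad.
      override lazy val traverseStrategy: TraverseStrategy[Reader] =
        TraverseStrategy.direct(this)
    }
}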
I don't believe you need the lazy val, actually; [...]
Say more about your thoughts on the ad-hoc-ness... My goal here is to allow [...]

In some sense, I think that's the main problem: `Apply` defines both `map2` and `map2Eval`, but can't say anything about which strategy is better for `F`, and in an abstract context you have no idea which one is the right one to choose.
The `traverseWithChain` implementation is really good and performs well in benchmarks on downstream datatypes, but it has a meaningful flaw: it forces added allocations through `Eval` at every step of the iteration. This is something that is definitely necessary when you're trying to implement it on an arbitrary applicative, but that really isn't the case most of the time.

The common scenario I'm thinking about is traversing `List` with `IO`, which happens both in the context of `parTraverse` and the more conventional `traverse`, and it happens very frequently. When you instantiate `G = IO` within `traverseWithChain`, the `map2Eval` and corresponding machinery is no longer necessary, because `IO` itself is stack-safe, and thus we can entirely remove that wrapping allocation in each step.

My proposal would be to do this by checking to see if `G.isInstanceOf[StackSafeMonad[G]]`. The reasoning here is that most such implementations will take advantage of the `StackSafeMonad` utility mixin. For those that don't, the old `Eval` implementation will still work, but for those that do, we have the guarantee (based on their own `tailRecM` laws) that their `flatMap` is capable of suspending the laziness just as safely as `Eval`'s. This covers the common case of `IO` as well as several other similar cases, and achieves a significant speed-up in downstream performance without sacrificing anything other than ugliness of implementation.
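(A rough sketch of what that check could look like, written against `List` rather than the real `Chain`-based internals; `traverseSketch` and the two loop shapes are placeholders rather than the actual cats implementation, and it assumes cats 2.2+ so the standard-library instances resolve without extra imports.)

import cats.{Applicative, Eval, Foldable, StackSafeMonad}

object TraverseDispatchSketch {
  def traverseSketch[G[_], A, B](la: List[A])(f: A => G[B])(implicit G: Applicative[G]): G[List[B]] =
    G match {
      case ssm: StackSafeMonad[G] @unchecked =>
        // Strict loop: a stack-safe monad's flatMap can suspend on its own,
        // so no per-element Eval allocation is needed.
        ssm.map(
          la.foldLeft(ssm.pure(List.empty[B])) { (acc, a) =>
            ssm.flatMap(acc)(bs => ssm.map(f(a))(b => b :: bs))
          }
        )(_.reverse)
      case _ =>
        // Fallback: the existing lazy shape, deferring the right-hand side with
        // Eval so that short-circuiting applicatives can stop early.
        Foldable[List]
          .foldRight(la, Eval.always(G.pure(List.empty[B]))) { (a, evalAcc) =>
            G.map2Eval(f(a), evalAcc)(_ :: _)
          }
          .value
    }
}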