-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TailRecM does not make sense for lists / streams #1329
Comments
I'm solving this for now by configuring ScalaCheck to weaken the tests (setMaxSize etc) |
The List and Stream examples are correct but just like a recursive flatMap If you loop on Lists you need the mean size of the resulting list to be You can see the discussion about this. If you don't have tailRecM people We have so far seen three classes of scala monad that can implement
Without tailRecM (or recursive flatMap) being safe, libraries like We made the switch in 0.7.0 to always have tailRecM since consistency with It sounds like your issue is that Observable, like List has a |
@alexandru the real issue may be this: I made that change to exercise deeper recursion (but not too deep). Maybe that was a bad call. We could walk it back to With that change, you may never see the |
Hi @johnynek, I appreciate your answer. To describe my problem in more details, please have patience with me :-) I've measured what's going on, so to put some numbers on this, I have these ScalaCheck settings:
With these settings, that property test is generating:
Generating/streaming 186 million items in a test is way too much. And you could say that the max size is too high, but 100 is a low count for testing streams. So indeed, if you could do Following is more personal opinion :-)
I think we think of a different notion of stack. For me, a tail recursive algorithm is one that uses constant space, either stack or heap. Especially because in functional programming you can't rely on the normal C call-stack that much. From what I've heard Haskell manages its own "stack" by keeping linked lists in heap. And we are basically replacing Java's call stack by trampolines or thread-pools. Either way, for streams, no tail recursive algorithm is possible for implementing F.tailRecM(1) { a =>
Stream.continuously(Left(a+1))
} That's because, by my definition at least, this isn't tail recursive, as on each frame we aren't discarding the current frame and we keep work around, in our trampoline or our thread-pool or whatever. But it gets worse. Creating streams of just one single element cannot be tail-recursive either and it will also crash the process: // This will also crash
F.tailRecM(1) { a => Stream( Left(a+1) ) } Of course, this is counter intuitive, because the equivalent with a // This is tail recursive
F.tailRecM(1) { a => Task( Left(a+1) ) } The reason is that any stream implementation needs a way to signal the end, as a final event, be it as a final
I understand that, but what's the use of shoving a My opinion here is that Of course, I might be missing something. It does seem like an interesting way to generate data.
No, Observable is lazy, space growth is linear, but it blows up exponentially in time. And it doesn't have a stack safe Anyway, if anybody could point out to some docs where a good usage of Cheers, |
What we mean by stack-safe is that it consumes a constant amount of jvm stack, not that it consumes Note that: scala> import cats.instances.stream._
import cats.instances.stream._
scala> Monad[Stream].tailRecM(1) { a => Stream( Left(a+1) ) } does not crash, in fact. It simply does not terminate (I have been running it for quite some time). This also just spins: scala> Monad[List].tailRecM(1) { a => List( Left(a+1) ) } without crashing or OOMing. This is not a lot different than the function: @annotation.tailrec
def go[A, B](a: A, fn: A => Either[A, B]): B =
fn(a) match {
case Right(b) => b
case Left(a1) => go(a1, fn)
} then calling This is precisely what Here is a case of using tailRecM: def readAll(in: InputStream): Try[Vector[Array[Byte]]] = {
val buffer = new Array[Byte](1024)
def step(acc: Vector[Array[Byte]): Try[Either[Vector[Array[Byte]], Vector[Array[Byte]]]] =
Try(in.read(buffer)).map {
case x if x < 0 => Right(acc)
case x => Left(acc :+ buffer.slice(0, x))
}
Monad[Try].tailRecM(Vector.empty[Array[Byte]])(step)
} Without This is also common in the iteratee library to do this kind of monadic recursion. It may not be common, or safe, for all monads to recursively flatMap or call tailRecM, but for most it actually is (again, all of them in cats are safe to use in this way). Incidentally, using |
@johnynek I'm still not convinced about its utility for lists and I'd do more arguing about tail recursivity, but then I was wrong before, so I'll shut up for now. That test fix is sufficient, thanks for it. |
Here is a (contrived) example for list: import cats.Monad
def iterateM[M[_]: Monad, T](m: M[T], depth: Int)(fn: T => M[T]): M[T] = {
def step(a: (Int, T)): M[Either[(Int, T), T]] = {
val (d, t) = a
if (d <= 0) Monad[M].pure(Right(t))
else Monad[M].map(fn(t)) { nextt => Left((d - 1, nextt)) }
}
Monad[M].flatMap(m) { t => Monad[M].tailRecM((depth, t))(step _) }
}
import cats.instances.list._
scala> iterateM[List, Int](List(1, 2, 3), 0) { x => (0 to x).toList }
res9: List[Int] = List(1, 2, 3)
scala> iterateM[List, Int](List(1, 2, 3), 1) { x => (0 to x).toList }
res10: List[Int] = List(0, 1, 0, 1, 2, 0, 1, 2, 3)
scala> iterateM[List, Int](List(1, 2, 3), 2) { x => (0 to x).toList }
res11: List[Int] = List(0, 0, 1, 0, 0, 1, 0, 1, 2, 0, 0, 1, 0, 1, 2, 0, 1, 2, 3)
scala> iterateM[List, Int](List(1, 2, 3), 3) { x => (0 to x).toList }
res12: List[Int] = List(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 0, 0, 1, 0, 1, 2, 0, 1, 2, 3) you can imagine other examples that are not as silly (like doing a search where we can find 0, 1, or more than one candidates to recursively try, and recursion on a list is nice there, using tailRecM, you can do that safely). |
Thanks @johnynek, warming up to it :-) I think this is one of those cases where users need some education. Hopefully some examples will happen soon in those docs. |
Cleanup. |
The new tailRecM in
FlatMap
probably does not make sense for lists / streams.I promise to read the paper (have some difficulty parsing Haskell/Purescript code), however this operation only makes sense for monads for which this can be implemented in a tail-recursive way (i.e. constant stack AND heap).
Looking for example at the List implementation the call stack is effectively replaced with a heap buffer that balloons in size with each iteration. This may be stack safe, but it is not tail recursive, which I think misses the point. Unless I'm not reading it right, the growth is exponential. And actually, the possibility of such a space leak is why we have stack-overflow errors (the idea being to crash a single thread instead of the whole process or system) and I cannot see the utility of such a dangerous operation on lists / streams.
And what this operation does is enforced by the tailRecMConsistentFlatMap law, because of which we cannot provide an alternative interpretation. Because I tried doing basically a
.take(1)
on all streams emitted byf
.My current problem is that in Monix I have to provide a
Monad
instance forObservable
. And naturally I noticed the law-tests going from 2 seconds to 60 seconds, because for each observable emitted byf
, we get an arbitrarily fresh observable that's then flat-mapped in that loop and observables are made of arbitrary lists of numbers and so on. And to give an idea of why this happens, I fired up a profiler and saw that thedefaultTailRecM
implementation was called over 4.5 million times, with over 35 million elements being streamed by theFlatMapLaws
tests, which along with the tricks I'm doing to test this synchronously and cross-platform, it gets to be freakishly slow.But besides this ... given its current laws, given that it is unsafe, I do NOT want to provide this function for any lists or streams I can think of. And I can't just provide the default implementation because that default is untestable. I cannot have a test that takes 60 seconds and reconfiguring ScalaCheck to emit smaller lists of numbers or to do less checks isn't OK at all, because I actually cared about the flatMap/monad tests.
The text was updated successfully, but these errors were encountered: