Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with recursion #89

Closed
guillaumebort opened this issue Nov 27, 2017 · 13 comments
Closed

Issues with recursion #89

guillaumebort opened this issue Nov 27, 2017 · 13 comments
Labels

Comments

@guillaumebort
Copy link

guillaumebort commented Nov 27, 2017

Hi,

I discussed this issue with @tpolecat on gitter and he advised me to report it.

As I was looking at different ways of expressing an infinite (or very large) recursion I stumbled upon this snippet that produces strange result:

  def loop(i: Int = 0): IO[Int] =
    for {
      _ <- IO { if(i % 100 == 0) println(i) }
      _ <- if(i > 50000) IO.raiseError(new Exception("oops")) else loop(i + 1)
    }
    yield i
  loop().unsafeRunSync // Expect oops exception to be thrown at some point

This works but takes a very long time and allocates several GB of memory (it becomes exponentially slower and slower so we can suspect that it allocates exponentially more and more memory at each iteration). You can entirely reclaim the memory after all so there is no leak.

Also using IO.suspend the iteration itself works perfectly, but in case of error the error is not reported as expected because of a StackOverflowError in AndThen.runLoop.

  def loop(i: Int = 0): IO[Int] =
    for {
      _ <- IO { if(i % 100 == 0) println(i) }
      _ <- if(i > 50000) IO.raiseError(new Exception("oops")) else IO.suspend(loop(i + 1))
    }
    yield i
  loop().unsafeRunSync // Expect oops exception to be thrown at some point but got StackOverflowError
@alexandru
Copy link
Member

That looks bad. From what you're saying the first sample leaks memory and it should not leak — the expected behavior is for it to use a constant amount of heap memory.

You shouldn't need to wrap the recursive call in a suspend in that second sample either, because flatMap already suspends the execution. That a StackOverflow happens in that second sample is very worrying.

I haven't played with the samples yet.

@mpilquist
Copy link
Member

mpilquist commented Nov 27, 2017

Note that the recursive call to loop (in both cases) is not in tail position due to the yield i, which will push a map(i => i) on to the internal structure for each iteration. The SOE shouldn't be possible though -- the yield i will result in a memory leak but not an SOE.

@guillaumebort
Copy link
Author

I have run this on the 0.5.0 release btw.

@tpolecat
Copy link
Member

Note that this runs instantaneously with Eval. Not sure it's a fair comparison but I'm skeptical that IO is doing that much more work. This thing takes like 30 seconds to run on my machine.

@djspiewak
Copy link
Member

I think what a lot of this comes back to is the fact that AndThen is really slow and hacky. I need to look into it further, tbh, but there's a lot of performance issues that come out of that. The SOE is obviously terrifying.

@alexandru
Copy link
Member

alexandru commented Nov 28, 2017

Note that this runs instantaneously with Eval. Not sure it's a fair comparison but I'm skeptical that IO is doing that much more work. This thing takes like 30 seconds to run on my machine.

For what is worth Monix's Task is also instant.

The Cats IO doesn't do more work than Eval in that loop, Monix's Task does due to its cancelable nature. IO should not be less performant than Eval without actual asynchronous boundaries being triggered in that loop.

@djspiewak @tpolecat @mpilquist I would like to submit a PR with the IO implementation changed to use an internal encoding similar to Monix's Task. These are internals, we don't need an elegant AndThen and we wouldn't break binary compatibility either.

But it might take some effort on my part, one or two days of work, so I need confirmation that it is something you're interested in, before committing to it.

@alexandru
Copy link
Member

Actually I'll just go ahead and do it, because it's easier to show then tell. Cats's IO can be the fastest implementation because it does the least amount of work.

@mpilquist
Copy link
Member

Sounds great @alexandru

@djspiewak
Copy link
Member

@alexandru I'm definitely interested in a better encoding. There are things I like about the one we have, but ultimately it's just a relatively naive optimization on top of ContT[Free[() => ?, ?], Unit, Either[Throwable, A]] (with stack-safe function composition on the suspensions).

@tpolecat
Copy link
Member

Yep, I'd like to see a fix. It's unusable in current form.

@alexandru
Copy link
Member

alexandru commented Nov 28, 2017 via email

@alexandru
Copy link
Member

Couldn't let it go for today 😀

So I pushed PR #90 — it might need more tests, waiting on code coverage report (couldn't get it to work locally today), but it definitely fixes this issue.

@alexandru
Copy link
Member

I've added some benchmarking results in the PR and I'm happy to report dramatic differences 😀
#90

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants