
Add a test with a large DAG #27

Merged
merged 13 commits into master
Nov 23, 2018
Conversation

johnynek
Collaborator

@dieu take a look...

This reproduces the stack overflow we have seen in scalding.

@dieu

dieu commented Nov 17, 2018

@johnynek nice, I have something like that, but as separate tests for toLiteral and for Dag.

like:

def genHugeLiteral: Gen[(Literal[Box, Int], Box[Int])] =
  Gen.listOfN(100000, genConst).map { literals =>
    val zeroLit = literals.head
    val zeroVal = zeroLit.evaluate

    literals.tail.foldLeft((zeroLit, zeroVal)) {
      case ((left, Box(value)), right) =>
        val rightVal = right.evaluate.get
        val fn = mk(_ + _)
        val bfn = { case (Box(l), Box(r)) => Box(fn(l, r)) }: (Box[Int], Box[Int]) => Box[Int]
        (Binary(left, right, bfn), Box(fn(value, rightVal)))
    }
  }

property("stack overflow regression check") =
  forAll(genHugeLiteral) { case (l, v) =>
    l.evaluate == v
  }

and for Dag

  test("stack overflow regression check") {
    def it(i: Int): Iterator[Int] = Iterator.single(i)

    val iters = (0 to 10000).map(it)
    val expected = iters.reduce(_ ++ _)

    val f = iters.map(Flow.apply).reduce(_ ++ _)

    val opt =
      try {
        Dag.applyRule(f, Flow.toLiteral, Flow.allRules)
      } catch {
        case e: StackOverflowError => e.printStackTrace()
      }

    opt match {
      case Flow.IteratorSource(it) =>
        assert(it.toList == expected.toList)
      case nonSrc =>
        fail(s"expected total evaluation $nonSrc")
    }
  }

@johnynek
Collaborator Author

@dieu @non please take a look.

This passes (after a very long time) on my laptop. We should be able to optimize the giant DAG case. Right now, we spin for a long time constantly recomputing the fanOut function.

This won't throw, but it can get very slow. I think we have some quadratic algorithms in play here currently (or possibly worse). We have a number of places where we do a linear search at each step (this should be N^2 or worse if we are looking at each position in the graph to apply a rule).

We could possibly keep the memoizations across modifications to the DAG, or possibly find a way to avoid checking every node.

@johnynek
Collaborator Author

three issues with this PR currently:

  1. JavaScript does not throw StackOverflowError; it throws a different exception we would need to figure out how to catch.
  2. Scala 2.10's TailCalls does not have map/flatMap, so we can't use it for the stack safety, but I think it is fine to discard 2.10 support (2.13 is about to be released, after all).
  3. we time out, since it is super expensive to recompute fanOut constantly.

Item 3 seems the most intractable. Knowing the fanout of a node, or at least whether it is used exactly once, not at all, or more than once, seems necessary for many optimizations. However, I have not yet found a good algorithm to incrementally update this property. Currently, we just recompute the entire reverse dependency graph every time the graph changes.

If we could find an incremental algorithm to compute fanout (or the three-state fanout: 0, 1, >1), I think we will have solved our problems.
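The batch version of this three-state fanout is at least a single linear pass over the reverse edges; the open problem above is doing it incrementally. A minimal sketch of the batch computation, with illustrative names (the `Fanout` ADT and the `dependencies` encoding are not dagon's API):

```scala
// Illustrative sketch, not dagon's API: classify each node's fanout
// (how many nodes consume it) as Zero, One, or Many in one O(V + E) pass.
object FanoutSketch {
  sealed trait Fanout
  case object Zero extends Fanout
  case object One extends Fanout
  case object Many extends Fanout

  // `dependencies` maps each node to the list of nodes it consumes.
  def fanouts[N](dependencies: Map[N, List[N]]): Map[N, Fanout] = {
    val counts = scala.collection.mutable.Map.empty[N, Int].withDefaultValue(0)
    // count how many times each node appears as someone's input
    dependencies.valuesIterator.flatten.foreach { dep => counts(dep) += 1 }
    dependencies.keysIterator.map { n =>
      n -> (counts(n) match {
        case 0 => Zero
        case 1 => One
        case _ => Many
      })
    }.toMap
  }
}
```

This is exactly the recomputation being paid on every graph change; an incremental variant would adjust only the counts touched by a rewrite.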

@johnynek
Collaborator Author

I ran this with YourKit... I think the only way we will get a lot better is by maintaining data structures and not recomputing them. Right now, basically all the time is spent on gets from HashMaps.

It's pretty hard to maintain many of the views, since if we change an Id from one node to another due to a rule, that can rewrite a lot of the graph.

@johnynek
Collaborator Author

Okay, this change gets the giant test passing, but other tests still fail, which I need to fix.

This graph turns out to be highly pathological for the current code. We were going in order from smallest Id to largest, but that tends to work on nodes from source to sink. Working from sinks to sources is a better direction, since the rules can see bigger subgraphs and potentially work on those.

By leveraging a recursive rule, I took the complexity from O(N^2) to O(N), I think (or maybe even N^3 to N^2). This is because we no longer modify the graph N times, with each of those modifications doing O(N) work. Instead, we modify it once, do O(N) work on that modification, and then follow up with O(N) more work.

This hints at a better way to write rules: always apply them top down. That is a bigger change.

@johnynek
Collaborator Author

@non @dieu could you review this?

I think it is mergeable after we figure out the CI issues that are currently making the giant test time out on Scala.js.

I think the usual approach here is to make the test smaller on Scala.js, but that seems to require some sbt hijinks, which of course will cause me to spin out on how much I hate having to deal with that crap.

actually maybe just adding a test dependency on catalysts works:
https://github.com/typelevel/catalysts/blob/master/platform/js/src/main/scala/catalysts/Platform_js.scala

@johnynek
Collaborator Author

okay, this passes. @dieu can I get a review?

private val nodeToId: HCache[N, Lambda[t => Option[Id[t]]]] =
HCache.empty[N, Lambda[t => Option[Id[t]]]]
// Caches polymorphic functions of type Literal[N, T] => Option[Id[T]]
private val litToId: HCache[Literal[N, ?], Lambda[t => Option[Id[t]]]] =

this can potentially be a problem, because Literal represents the structure of a user graph and can be as deep as the user graph. And if HCache uses hashCode and equals for lookups, it can lead to a StackOverflow.

Collaborator Author

I don't think so, since we are not triggering it now. The trick is that we memoize construction of Literal, so we can leverage reference equality on them for fast equals.

This works as long as the user uses Memoize, as we have done here, to construct the Literals.

I don't think we need to go nuts being defensive; we just need to make sure there is a usable path for users that need giant-graph support.
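The "memoize construction" trick can be sketched as a tiny interner (hypothetical names; dagon's actual Memoize is more general): structurally equal values come back as the same reference, so reference equality becomes a safe fast path for equals.

```scala
// Hypothetical interning sketch (not dagon's Memoize): constructing through
// a cache guarantees structurally equal values share one reference.
object InternSketch {
  final class Lit private (val value: Int)
  object Lit {
    private val cache = scala.collection.mutable.Map.empty[Int, Lit]
    // every structurally equal request returns the cached instance
    def apply(value: Int): Lit =
      cache.getOrElseUpdate(value, new Lit(value))
  }
}
```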

Collaborator Author

Note, hashCode is cached for Literal nodes.
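The caching itself can be as simple as overriding the generated hashCode with a lazy val, sketched here on a hypothetical node type (not dagon's Literal):

```scala
// Sketch of hashCode caching on an AST node (hypothetical type): the
// structural hash is computed once per node and reused on every lookup.
object HashCacheSketch {
  sealed trait Node
  final case class Const(i: Int) extends Node {
    override lazy val hashCode: Int = ("Const", i).hashCode
  }
  final case class Unary(input: Node) extends Node {
    // reuses input.hashCode, which is itself cached after first use
    override lazy val hashCode: Int = ("Unary", input.hashCode).hashCode
  }
}
```

Note this caches repeated lookups but the first computation on a deep chain is still recursive; stack safety is a separate concern.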


some weird stuff:

if I run the test as test:testOnly *DataFlowTest, then the tests pass.

but if I run it as test:testOnly *DataFlowTest -- -t "test a giant graph", then I get a StackOverflow:

[info]   at com.stripe.dagon.DataFlowTest$Flow$ComposedOM.hashCode(DataFlowTest.scala:196)
[info]   at scala.runtime.Statics.anyHash(Statics.java:115)
[info]   at scala.util.hashing.MurmurHash3.productHash(MurmurHash3.scala:64)
[info]   at scala.util.hashing.MurmurHash3$.productHash(MurmurHash3.scala:211)
[info]   at scala.runtime.ScalaRunTime$._hashCode(ScalaRunTime.scala:145)
[info]   at com.stripe.dagon.DataFlowTest$Flow$ComposedOM.hashCode(DataFlowTest.scala:196)
[info]   at scala.runtime.Statics.anyHash(Statics.java:115)
[info]   at scala.util.hashing.MurmurHash3.productHash(MurmurHash3.scala:64)
[info]   at scala.util.hashing.MurmurHash3$.productHash(MurmurHash3.scala:211)
[info]   at com.stripe.dagon.DataFlowTest$Flow.<init>(DataFlowTest.scala:33)
[info]   at com.stripe.dagon.DataFlowTest$Flow$OptionMapped.<init>(DataFlowTest.scala:108)
[info]   at com.stripe.dagon.DataFlowTest$Flow$composeOptionMapped$.compose(DataFlowTest.scala:279)
[info]   at com.stripe.dagon.DataFlowTest$Flow$composeOptionMapped$.$anonfun$apply$5(DataFlowTest.scala:285)
[info]   at com.stripe.dagon.Rule$$anon$1.$anonfun$apply$1(Rule.scala:27)
[info]   at com.stripe.dagon.Rule$$anon$1.$anonfun$apply$1(Rule.scala:27)
[info]   at com.stripe.dagon.Rule$$anon$1.$anonfun$apply$1(Rule.scala:27)
[info]   at com.stripe.dagon.Rule$$anon$1.$anonfun$apply$1(Rule.scala:27)
[info]   at com.stripe.dagon.Rule$$anon$1.$anonfun$apply$1(Rule.scala:27)
[info]   at com.stripe.dagon.Dag$$anon$2.$anonfun$toFunction$1(Dag.scala:109)
[info]   at com.stripe.dagon.Dag.go$1(Dag.scala:155)
[info]   at com.stripe.dagon.Dag.$anonfun$applyOnce$1(Dag.scala:157)
[info]   at scala.collection.Iterator$$anon$10.next(Iterator.scala:448)
[info]   at scala.collection.TraversableOnce.collectFirst(TraversableOnce.scala:145)
[info]   at scala.collection.TraversableOnce.collectFirst$(TraversableOnce.scala:132)
[info]   at scala.collection.AbstractIterator.collectFirst(Iterator.scala:1417)
[info]   at com.stripe.dagon.Dag.applyOnce(Dag.scala:159)
[info]   at com.stripe.dagon.Dag.loop$1(Dag.scala:79)
[info]   at com.stripe.dagon.Dag.apply(Dag.scala:84)
[info]   at com.stripe.dagon.DataFlowTest.$anonfun$new$87(DataFlowTest.scala:863)

Collaborator Author

That is weird. What do you propose we do about it? The test passes as part of the suite.

Collaborator Author

By the way, the stack depth could be different depending on how we run it, and it may be that this way of running increases the depth just enough... Note that the stack overflow is in the hashCode method of the test code. We know about this issue already: your AST nodes should have stack-safe hashCode and equals. As mentioned before, this can be made somewhat easier in a follow-up PR.
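For chain-shaped graphs like the one in this test, "stack safe hashCode" roughly means replacing the generated recursive hash with a loop. A hedged sketch on a hypothetical node type (not the PR's Flow; the mixing constant 31 is an arbitrary choice):

```scala
// Sketch: hashCode that walks a deep chain iteratively instead of letting
// each node's hashCode recurse into its input.
object StackSafeHashSketch {
  sealed trait Chain
  final case class Source(n: Int) extends Chain
  final case class Step(input: Chain) extends Chain {
    override def hashCode: Int = {
      // walk to the bottom of the chain without recursion
      var depth = 0
      var cur: Chain = this
      while (cur.isInstanceOf[Step]) {
        depth += 1
        cur = cur.asInstanceOf[Step].input
      }
      // fold the depth back into the base (Source) hash
      var h = cur.hashCode
      var i = 0
      while (i < depth) { h = 31 * h + 1; i += 1 }
      h
    }
  }
}
```

The default case-class equals on Step still recurses, so it would need the same treatment to be fully safe on deep chains.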

val lit = toLiteral(node)
try ensureFast(lit)
catch {
case _: Throwable => //StackOverflowError should work, but not on scala.js

could we find out which error is happening in scala.js?

Collaborator Author

yes, but it isn't the same error on the JVM (it was in the Travis CI output). I don't think it is worth making some special build that catches one kind of error (which does not exist on the JVM) in one branch, and another for Scala.js.

You could do this with a custom build of this method, but honestly, I don't think it is a problem. If there is a user error, they will hit it in both branches, and there will be an exception in the second branch.


you're right, it's probably not that important.

/*
* We *non-recursively* use either the fast approach or the slow approach
*/
Memoize.functionK[Literal[N, ?], N](new Memoize.RecursiveK[Literal[N, ?], N] {

potentially the same problem as with caching Literal

/*
* You need a custom equals to avoid stack overflow
*/
override def equals(that: Any) = {

We probably want to give our users some easier way to define this, or avoid using equals and hashCode in our code.

Collaborator Author

I have a follow-up PR to make this easier, but honestly, this library is basically for compiler-like things. It is not some general-purpose thing that comes up all the time. I don't mind if those users have to think for a few minutes about best practices. Giving up performance across the board (which almost any generic solution will do) to make things easier might not be a good trade.

Note, the current changes are about making it possible to use this with arbitrary graphs, i.e. fixing the internal issues IF the user writes sane equals and hashCode methods on their type.

@dieu

dieu commented Nov 21, 2018

@johnynek I need more time to play with it; I will come back to you in the next 2 days. I want to double-check that we don't have a problem with caching Literal.

@johnynek
Collaborator Author

Thanks. Note we pass the tests which exercise the caching.

@codecov-io

codecov-io commented Nov 22, 2018

Codecov Report

Merging #27 into master will decrease coverage by 0.79%.
The diff coverage is 77.24%.


@@            Coverage Diff            @@
##           master      #27     +/-   ##
=========================================
- Coverage   85.01%   84.22%   -0.8%     
=========================================
  Files          13       13             
  Lines         267      374    +107     
  Branches       18       21      +3     
=========================================
+ Hits          227      315     +88     
- Misses         40       59     +19
Impacted Files Coverage Δ
core/src/main/scala/com/stripe/dagon/Id.scala 100% <ø> (ø) ⬆️
core/src/main/scala/com/stripe/dagon/Memoize.scala 100% <100%> (ø) ⬆️
core/src/main/scala/com/stripe/dagon/HMap.scala 95.65% <100%> (+0.19%) ⬆️
core/src/main/scala/com/stripe/dagon/Literal.scala 58.82% <57.14%> (-3.09%) ⬇️
core/src/main/scala/com/stripe/dagon/Expr.scala 65.51% <65%> (-4.49%) ⬇️
core/src/main/scala/com/stripe/dagon/Dag.scala 85.64% <79.16%> (-5.69%) ⬇️
core/src/main/scala/com/stripe/dagon/Graphs.scala 100% <0%> (+45.45%) ⬆️
...re/src/main/scala/com/stripe/dagon/FunctionK.scala 100% <0%> (+66.66%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 616eb76...a1ecfd4.

@dieu

dieu commented Nov 22, 2018

one side note: in one of my iterations to avoid recursion, I came up with something like:

  class MutableStack[A[_]](private var stack: List[A[_]]) {
    def pop[T](): A[T] = {
      val res = stack.head
      stack = stack.tail
      res.asInstanceOf[A[T]]
    }

    def pop[T](n: Int): List[A[T]] = {
      val (res, tail) = stack.splitAt(n)
      stack = tail
      res.asInstanceOf[List[A[T]]]
    }

    def push(items: List[A[_]]): Unit =
      stack = items ::: stack

    def push(item: A[_]): Unit =
      stack = item :: stack

    def at[T](index: Int): A[T] =
      stack(index).asInstanceOf[A[T]]

    def nonEmpty: Boolean = stack.nonEmpty
  }

  trait Foldable[A[_], B[_]] {
    def unfold[T](a: A[T]): List[A[_]]
    def fold[T](stack: MutableStack[B])(a: A[T]): B[T]
  }

  def functionKF[A[_], B[_]](foldable: Foldable[A, B]): FunctionK[A, B] =
    new FunctionK[A, B] {
      private val cache = HCache.empty[A, B]

      private def widen[S[_], T](s: S[_]): S[T] = s.asInstanceOf[S[T]]

      override def toFunction[U]: A[U] => B[U] = { a =>
        cache.getOrElseUpdate(a, {
          var (tree, index) = (List(a): List[A[_]], 0)

          while (index >= 0) {
            val next = foldable.unfold(widen(tree(index)))

            index -= 1

            if (next.nonEmpty) {
              index += next.length
              tree = next ::: tree
            }
          }

          val stack = new MutableStack[B](List.empty)

          tree.foreach { next =>
            val value = foldable.fold(stack)(widen(next))

            stack.push(value)

            cache.insert(widen(next), value)
          }

          cache(a)
        })
      }
    }

and usage:

def toLiteralFold: FunctionK[Flow, Literal[Flow, ?]] =
      Memoize.functionKF[Flow, Literal[Flow, ?]](new Foldable[Flow, Literal[Flow, ?]] {
        override def unfold[T](a: Flow[T]): List[Flow[_]] = a match {
          case it: IteratorSource[T] => List.empty
          case o: OptionMapped[s, T] => List(o.input)
          case c: ConcatMapped[s, T] => List(c.input)
          case t: Tagged[a, s] => List(t.input)
          case f: Fork[s] => List(f.input)
          case m: Merge[s] => List(m.left, m.right)
          case m: Merged[s] => m.inputs
        }

        override def fold[T](stack: Memoize.MutableStack[Literal[Flow, ?]])(a: Flow[T]): Literal[Flow, T] = a match {
          case it: IteratorSource[T] => Literal.Const(it)
          case o: OptionMapped[s, T] => Literal.Unary(stack.pop[s](), { f: Flow[s] => OptionMapped(f, o.fn) })
          case c: ConcatMapped[s, T] => Literal.Unary(stack.pop[s](), { f: Flow[s] => ConcatMapped(f, c.fn) })
          case t: Tagged[a, s] => Literal.Unary(stack.pop[s](), { f: Flow[s] => Tagged(f, t.tag) })
          case f: Fork[s] => Literal.Unary(stack.pop[s](), { f: Flow[s] => Fork(f) })
          case m: Merge[s] => Literal.Binary(stack.pop[s](), stack.pop[s](), { (l: Flow[s], r: Flow[s]) => Merge(l, r) })
          case m: Merged[s] =>  Literal.Variadic(stack.pop[s](m.inputs.length), { fs: List[Flow[s]] => Merged(fs) })
        }
      })

the general idea is to linearize the tree into a list of items and then process them one by one; in the same way, we can do equals and hashCode as well.

not sure if you like it, but for the user it looks easy to write this kind of toLiteral.

PS: instead of the stack, we can use the cache directly.
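The same linearization idea applied to equality can be sketched with an explicit worklist (hypothetical Tree type, not the PR's Flow): the comparison depth lives on the heap instead of the call stack, so arbitrarily deep trees are safe.

```scala
// Worklist-based structural equality (illustrative): no call-stack
// recursion, each pair of subtrees is pushed onto a heap-allocated list.
object IterEqSketch {
  sealed trait Tree
  final case class Leaf(n: Int) extends Tree
  final case class Branch(l: Tree, r: Tree) extends Tree

  def structEq(a: Tree, b: Tree): Boolean = {
    var work: List[(Tree, Tree)] = (a, b) :: Nil
    while (work.nonEmpty) {
      val (x, y) = work.head
      work = work.tail
      (x, y) match {
        case (Leaf(m), Leaf(n)) =>
          if (m != n) return false
        case (Branch(l1, r1), Branch(l2, r2)) =>
          // defer children instead of recursing into them
          work = (l1, l2) :: (r1, r2) :: work
        case _ =>
          return false // shape mismatch
      }
    }
    true
  }
}
```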

/*
 * This does recursion on the stack, which is faster, but can overflow
 */
protected def ensureFast[T](lit: Literal[N, T]): (Dag[N], Id[T]) =
findLiteral(lit, lit.evaluate) match {

should we use a saved version of Literal.evaluateMemo here to maintain the cache of evaluated literals?

Collaborator Author

This is a great suggestion. Thanks.

/*
* This does recursion on the stack, which is faster, but can overflow
*/
protected def ensureFast[T](lit: Literal[N, T]): (Dag[N], Id[T]) =

ensure is basically converting Literal to Expr, right? What do you think about removing the Expr layer and merging Literal and Expr together?

Collaborator Author

Expr has Ids, which allow us to rewrite nodes; Literal has no Ids. It's not totally obvious to me how to merge them in a useful way. The point of adding the Ids is so we can rewrite the graph without losing the types. If users had to construct the Expr directly, they would have a hard problem on their hands (basically, they would need to write ensure and the Id tracking in user code).

Do you have a concrete idea of how to merge them?

@johnynek
Collaborator Author

@dieu I have taken your suggestion about sharing the memoization of the Literal to N mapping across the entire ensure operation.

What do you think about merging this and you can follow up with more improvements along the lines you are suggesting. I think this addresses the core problem: making it possible to use dagon with giant graphs. Note it also does so in a binary compatible way.

@dieu

dieu commented Nov 23, 2018

@dieu I have taken your suggestion about sharing the memoization of the Literal to N mapping across the entire ensure operation.

What do you think about merging this and you can follow up with more improvements along the lines you are suggesting. I think this addresses the core problem: making it possible to use dagon with giant graphs. Note it also does so in a binary compatible way.

sounds good to me.

@johnynek johnynek merged commit a0e248e into master Nov 23, 2018