From 1d9ced6277899fc13a33fff1bc2f73dfa030eb30 Mon Sep 17 00:00:00 2001
From: Adelbert Chang <adelbertc@gmail.com>
Date: Mon, 24 Oct 2016 21:13:04 -0700
Subject: [PATCH] New blog post: EDSLs as functions

---
 posts/2016-09-21-edsls-part-1.md |   3 +
 posts/2016-10-26-edsls-part-2.md | 450 +++++++++++++++++++++++++++++++
 2 files changed, 453 insertions(+)
 create mode 100644 posts/2016-10-26-edsls-part-2.md

diff --git a/posts/2016-09-21-edsls-part-1.md b/posts/2016-09-21-edsls-part-1.md
index 9a48fdc2..3b4929b2 100644
--- a/posts/2016-09-21-edsls-part-1.md
+++ b/posts/2016-09-21-edsls-part-1.md
@@ -78,3 +78,6 @@ axes:
   interpret our mini-programs. How EDSLs are encoded will affect
   how they perform and therefore affect any downstream consumers of our
   programs, be it other programs or end users.
+
+In the [next post]({% post_url 2016-10-26-edsls-part-2 %}) we'll take a look
+at the first of these approaches.
diff --git a/posts/2016-10-26-edsls-part-2.md b/posts/2016-10-26-edsls-part-2.md
new file mode 100644
index 00000000..4118dbb8
--- /dev/null
+++ b/posts/2016-10-26-edsls-part-2.md
@@ -0,0 +1,450 @@
+---
+layout: post
+title: EDSLs as functions
+
+meta:
+  nav: blog
+  author: adelbertc
+  pygments: true
+---
+
+*This is the second of a series of articles on "Monadic EDSLs in Scala."*
+
+Perhaps the most direct way to start writing an EDSL is to start writing
+functions. Let's say we want a language for talking about sets of integers.
+
+```scala
+trait SetLang {
+  def add(i: Int, set: Set[Int]): Set[Int]
+  def remove(i: Int, set: Set[Int]): Set[Int]
+  def exists(i: Int, set: Set[Int]): Boolean
+}
+```
+
+This works... to the extent that we want only to work with
+`scala.collection.Set`s. As it stands we cannot talk about
+other sets such as bloom filters or sets controlled by other threads.
+Our language isn't *abstract* enough, so let's remove
+all traces of `Set`.
+
+```scala
+trait SetLang[F[_]] {
+  def add(i: Int, set: F[Int]): F[Int]
+  def remove(i: Int, set: F[Int]): F[Int]
+  def exists(i: Int, set: F[Int]): Boolean
+
+  // Given unknown F we no longer know how to create an empty set
+  // so we add the capability to our language
+  def empty: F[Int]
+}
+```
+
+We've parameterized our language with a [higher-kinded type][hkt] which
+represents the context of our set. A similar parameterization could be
+done with a `*`-kinded type (e.g. `SetLang[A]`) but since this series
+focuses on **monadic** EDSLs, the choice is made for us.
+
+Now we can write mini-programs which talk about some abstract set
+yet to be determined.
+
+```scala
+def program[F[_]](lang: SetLang[F]): Boolean = {
+  import lang._
+  exists(10, remove(5, add(10, add(5, empty))))
+}
+```
+
+Interpretation of our program is done by implementing `SetLang` and
+passing an instance into `program`.
+
+However, our language is still not abstract enough. Replacing `Set`
+with `F` allows us to swap in implementations of sets, but doesn't
+allow us to talk about the context. Consider the behavior of `exists` if `F`
+represents some remote set. Since `exists` returns a `Boolean`,
+checking membership must be a synchronous operation despite the set living
+on another node.
+
+It's also tedious to thread the set through each method manually.
+
+We can solve both problems by generalizing the use of `F` to some
+context that is able to read from and write to some set
+(think `Set[Int] => (Set[Int], A)`).
+
+```scala
+trait SetLang[F[_]] {
+  def add(i: Int): F[Unit]
+  def remove(i: Int): F[Unit]
+  def exists(i: Int): F[Boolean]
+
+  // No longer need `empty` since the "context" has it already
+}
+```
+
+`SetLang` can now talk about the **effects** around interpretation, such as
+asynchrony.
+
+```scala
+import scala.concurrent.Future
+
+type AsyncSet[A] = Set[Int] => Future[(Set[Int], A)]
+
+object AsyncSet extends SetLang[AsyncSet] {
+  def add(i: Int): Set[Int] => Future[(Set[Int], Unit)] = ???
+
+  def remove(i: Int): Set[Int] => Future[(Set[Int], Unit)] = ???
+
+  def exists(i: Int): Set[Int] => Future[(Set[Int], Boolean)] = ???
+}
+```
+
+This new encoding introduces a new but important problem: how do we
+combine the results of multiple calls to `SetLang` methods? In the previous
+encoding we could add and remove by threading the set from one call to
+the next. With this change to represent a context, it's not clear how to do
+that.
+
+Fortunately we are now in a position to leverage a powerful tool:
+[monads][monads]. By extending our set language to be monadic
+we recover composition in an elegant way. The [Cats][cats] library is used
+for demonstration purposes, but the discussion applies equally to
+[Scalaz][scalaz].
+
+```scala
+import cats.Monad
+import cats.implicits._
+
+trait SetLang[F[_]] {
+  // See the post "Subtype type classes don't work" for why the `Monad` instance
+  // is defined as a member as opposed to through inheritance.
+  def monad: Monad[F]
+
+  def add(i: Int): F[Unit]
+  def remove(i: Int): F[Unit]
+  def exists(i: Int): F[Boolean]
+}
+
+def program[F[_]](lang: SetLang[F]): F[Boolean] = {
+  import lang._
+  implicit val monadInstance = monad
+  for {
+    _ <- add(5)
+    _ <- add(10)
+    _ <- remove(5)
+    b <- exists(10)
+  } yield b
+}
+```
+
+Defining an interpreter starts by identifying a target context. Since the context
+computes values while updating state, this suggests the state monad.
+
+```scala
+import cats.data.State
+
+object ScalaSet extends SetLang[State[Set[Int], ?]] {
+  val monad = Monad[State[Set[Int], ?]]
+
+  def add(i: Int): State[Set[Int], Unit] =
+    State.modify(_ + i)
+
+  def remove(i: Int): State[Set[Int], Unit] =
+    State.modify(_ - i)
+
+  def exists(i: Int): State[Set[Int], Boolean] =
+    State.inspect(_(i))
+}
+```
+
+```scala
+val state = program[State[Set[Int], ?]](ScalaSet)
+// state: cats.data.StateT[cats.Eval,scala.collection.immutable.Set[Int],Boolean] = cats.data.StateT@ce9f626
+
+state.run(Set.empty).value
+// res5: (scala.collection.immutable.Set[Int], Boolean) = (Set(10),true)
+```
+
+Note that calling `program` did not require any context-specific knowledge -
+we could define another interpreter, perhaps one that talks to a set
+asynchronously.
+
+```scala
+import cats.data.StateT
+import scala.concurrent.{ExecutionContext, Future}
+
+// Asynchronous state
+def AsyncSet(implicit ec: ExecutionContext): SetLang[StateT[Future, Set[Int], ?]] =
+  new SetLang[StateT[Future, Set[Int], ?]] {
+    val monad = Monad[StateT[Future, Set[Int], ?]]
+
+    def add(i: Int): StateT[Future, Set[Int], Unit] =
+      StateT.modify(_ + i)
+
+    def remove(i: Int): StateT[Future, Set[Int], Unit] =
+      StateT.modify(_ - i)
+
+    def exists(i: Int): StateT[Future, Set[Int], Boolean] =
+      StateT.inspect(_(i))
+  }
+```
+
+```scala
+// No changes to `program` required
+val result = program(AsyncSet(ExecutionContext.global))
+// result: cats.data.StateT[scala.concurrent.Future,scala.collection.immutable.Set[Int],Boolean] = cats.data.StateT@1c029382
+```
+
+`SetLang` captures the *structure* of a computation, but leaves open
+its *interpretation*.
+
+# Monad transformers and classes
+
+As it turns out, `SetLang` is an example of an encoding often referred to as
+[MTL-style][mtl].
+
+## Monads in monads
+
+One of the motivations for monad classes is removing the need to specify
+monad transformer stacks. The following example is adapted from
+[Functional Programming with Overloading and Higher-Order Polymorphism][mtlPaper]
+by Professor Mark P. Jones.
+
+Consider a program that is open to failure and computes with some state. This
+suggests a combination of `Either` and `State`, both of which have
+monad transformers. All that is left is to decide which transformer to use.
+
+```scala
+type App1[A] = EitherT[State[S, ?], Error, A]
+            // State[S, Either[Error, A]]
+            // S => (S, Either[Error, A])
+
+type App2[A] = StateT[Either[Error, ?], S, A]
+            // S => Either[Error, (S, A)]
+```
+
+While `App1` and `App2` are both valid compositions, the
+semantics of the compositions differ. `App1` describes a program where
+the computation of a *value* at each transition may fail, but any changes
+to the state are preserved, whereas `App2` describes a program where the
+*entire* transition may fail, discarding the state along with the value.
+
+We can abstract away the difference by creating type classes which provide
+the relevant operations we need.
+
+```scala
+trait MonadError[F[_], E] {
+  def monad: Monad[F]
+
+  def raiseError[A](e: E): F[A]
+  def handleErrorWith[A](fa: F[A])(f: E => F[A]): F[A]
+}
+
+trait MonadState[F[_], S] {
+  def monad: Monad[F]
+
+  def get: F[S]
+  def set(s: S): F[Unit]
+}
+```
+
+Similar type classes exist for the `Reader` and `Writer` data types.
+These type classes are provided in both [Cats][cats] and [Scalaz][scalaz],
+[with some caveats][typeClassSubType].
+
+With these type classes in place we can write functions against them, as
+opposed to specific transformer stacks. Furthermore, our functions can specify
+exactly which operations they need, which helps with correctness and
+[parametricity][parametricity].
+
+```scala
+import cats.{MonadError, MonadState}
+import cats.data.{EitherT, State, StateT}
+
+def program[F[_]](implicit F0: MonadError[F, String],
+                           F1: MonadState[F, Int]): F[Int] =
+  F0.flatMap(F1.get) { i =>
+    F0.raiseError[Int]("fail")
+  }
+```
+
+Our program can then be instantiated with either transformer stack.
+
+```scala
+import cats.implicits._
+
+// At the time of this writing Cats does not have these instances
+// so they are defined here.
+//
+// Additionally, both Cats and Scalaz 7 have encoding issues
+// with these MTL type classes which require us to redefine Monad when
+// defining MonadState and MonadError instances, despite there already being one.
+implicit def eitherTMonadState[F[_], E, S](implicit F: MonadState[F, S]): MonadState[EitherT[F, E, ?], S] =
+  new MonadState[EitherT[F, E, ?], S] {
+    def get: EitherT[F, E, S] =
+      EitherT(F.get.map(Right(_)))
+
+    def set(s: S): EitherT[F, E, Unit] =
+      EitherT(F.set(s).map(Right(_)))
+
+    def flatMap[A, B](fa: EitherT[F, E, A])
+                     (f: A => EitherT[F, E, B]): EitherT[F, E, B] =
+      fa.flatMap(f)
+
+    def pure[A](x: A): EitherT[F, E, A] =
+      EitherT.pure(x)
+
+    def tailRecM[A, B](a: A)(f: A => EitherT[F, E, Either[A, B]]): EitherT[F, E, B] =
+      EitherT.catsDataMonadErrorForEitherT[F, E].tailRecM(a)(f)
+  }
+
+implicit def stateTMonadError[F[_], E, S](implicit F: MonadError[F, E]): MonadError[StateT[F, S, ?], E] =
+  new MonadError[StateT[F, S, ?], E] {
+    def handleErrorWith[A](fa: StateT[F, S, A])(f: E => StateT[F, S, A]): StateT[F, S, A] =
+      StateT[F, S, A] { (s: S) =>
+        val state: F[(S, A)] = fa.run(s)
+        F.handleErrorWith(state)(e => f(e).run(s))
+      }
+
+    def raiseError[A](e: E): StateT[F, S, A] =
+      StateT.lift(F.raiseError(e))
+
+    def flatMap[A, B](fa: StateT[F, S, A])(f: A => StateT[F, S, B]): StateT[F, S, B] =
+      fa.flatMap(f)
+
+    def pure[A](x: A): StateT[F, S, A] = StateT.pure(x)
+
+    def tailRecM[A, B](a: A)(f: A => StateT[F, S, Either[A, B]]): StateT[F, S, B] =
+      StateT.catsDataMonadStateForStateT[F, S].tailRecM(a)(f)
+  }
+
+type App1[A] = EitherT[State[Int, ?], String, A]
+
+type App2[A] = StateT[Either[String, ?], Int, A]
+```
+
+```scala
+val app1 = program[App1]
+// app1: App1[Int] = EitherT(cats.data.StateT@5fdc056d)
+
+val app2 = program[App2]
+// app2: App2[Int] = cats.data.StateT@72493a33
+```
+
+# Composing languages
+
+From one angle we can view our set language, or more generally any EDSL
+in MTL-style, as an effect like `MonadError` and `MonadState`. From another
+angle we can view `MonadError` and `MonadState` as EDSLs that talk about errors
+and stateful computations. We can eliminate the distinction by renaming
+`SetLang` to `MonadSet` and treating it as a type class.
+
+```scala
+import cats.Monad
+import cats.implicits._
+
+trait MonadSet[F[_]] {
+  def monad: Monad[F]
+
+  def add(i: Int): F[Unit]
+  def remove(i: Int): F[Unit]
+  def exists(i: Int): F[Boolean]
+}
+```
+
+Composing multiple languages then becomes adding constraints to functions, and
+interpretation becomes instantiating type parameters that satisfy the
+constraints.
+
+```scala
+trait MonadCalc[F[_]] {
+  def monad: Monad[F]
+
+  def lit(i: Int): F[Int]
+  def plus(l: F[Int], r: F[Int]): F[Int]
+}
+
+def setProgram[F[_]: MonadSet](i: Int): F[Boolean] =
+  implicitly[MonadSet[F]].exists(i)
+
+def calcProgram[F[_]: MonadCalc]: F[Int] = {
+  val calc = implicitly[MonadCalc[F]]
+  calc.plus(calc.lit(1), calc.lit(2))
+}
+
+def composedProgram[F[_]: MonadCalc: MonadSet]: F[Boolean] = {
+  implicit val monad: Monad[F] = implicitly[MonadCalc[F]].monad
+  for {
+    i <- calcProgram[F]
+    b <- setProgram(i)
+  } yield b
+}
+
+// Instance
+
+// Instances are defined together but nothing is stopping us from defining
+// these separately, perhaps one in the MonadSet object and another in the
+// SetState object.
+implicit val stateInstance: MonadSet[State[Set[Int], ?]] with MonadCalc[State[Set[Int], ?]] =
+  new MonadSet[State[Set[Int], ?]] with MonadCalc[State[Set[Int], ?]] {
+    val monad = Monad[State[Set[Int], ?]]
+
+    def add(i: Int): State[Set[Int], Unit] = State.modify(_ + i)
+
+    def remove(i: Int): State[Set[Int], Unit] = State.modify(_ - i)
+
+    def exists(i: Int): State[Set[Int], Boolean] = State.inspect(_(i))
+
+    def lit(i: Int): State[Set[Int], Int] = State.pure(i)
+    def plus(l: State[Set[Int], Int], r: State[Set[Int], Int]): State[Set[Int], Int] =
+      (l |@| r).map(_ + _)
+  }
+```
+
+```scala
+val result = composedProgram[State[Set[Int], ?]].run(Set.empty[Int]).value
+// result: (scala.collection.immutable.Set[Int], Boolean) = (Set(),false)
+```
+
+As before, `composedProgram`, `calcProgram`, and `setProgram` are defined
+independently of interpretation, so alternative interpretations simply require
+defining appropriate instances.
+
+# A note about laws
+
+Type classes should come with laws - this lets us give meaning to their use.
+The `Monoid` type class requires data types to have an **associative** binary
+operation and a corresponding identity element. These laws allow us to
+parallelize batch operations, such as partitioning a `List[A]` into
+multiple chunks to be scattered across threads or machines and gathered
+back.
+
+Since our EDSLs are type classes, we should think about what laws we expect
+to hold. Below are some possible candidates for laws:
+
+```
+// MonadSet
+set *> add(i)    *> remove(i) = set   // assuming i is not already in set
+set *> remove(i) *> exists(i) = false
+set *> add(i)    *> exists(i) = true
+
+// MonadCalc - these are just the Monoid laws
+plus(lit(0), x) = plus(x, lit(0)) = x
+plus(x, plus(y, z)) = plus(plus(x, y), z)
+```
+
+Next up we'll take a look at some pitfalls of this approach, and a modified
+encoding that solves some of them.
+
+*This article was tested with Scala 2.11.8, Cats 0.7.2, kind-projector 0.9.0,
+and si2712fix-plugin 1.2.0 using [tut][tut].*
+
+[cats]: https://github.com/typelevel/cats "Typelevel Cats"
+[hkt]: http://typelevel.org/blog/2016/08/21/hkts-moving-forward.html "Higher-kinded types: the difference between giving up, and moving forward"
+[hkt]: {% post_url 2016-08-21-hkts-moving-forward %} "Higher-kinded types: the difference between giving up, and moving forward"
+[mtl]: https://hackage.haskell.org/package/mtl "Monad classes"
+[mtlPaper]: http://web.cecs.pdx.edu/~mpj/pubs/springschool.html "Functional Programming with Overloading and Higher-Order Polymorphism"
+[monads]: http://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf "Monads for functional programming"
+[parametricity]: https://www.mpi-sws.org/~dreyer/tor/papers/wadler.pdf "Theorems for free!"
+[scalaz]: https://github.com/scalaz/scalaz/tree/series/7.3.x "Scalaz 7"
+[tagless]: http://okmij.org/ftp/tagless-final/ "Typed final (tagless-final) style"
+[tut]: https://github.com/tpolecat/tut "tut: doc/tutorial generator for scala"
+[typeClassSubType]: {% post_url 2016-09-30-subtype-typeclasses %} "Subtype type classes don't work"