April 19, 2014
"Let me take you on an adventure which will give you superpowers." --
@bitemyapp
https://github.com/nuttycom/lambdaconf-2014-workshop
Trail of Awesome
Perfect Programs
Not on my board
Haskell and Scala
- Mention utility of types. - Piper
- Going to start with a very simple program.
True
This program doesn't have any bugs.
This program has only a single possible result.
data Bool = True | False
bool :: Bool
-- "::" is pronounced, "has type"
1 + 1 = 2
There are two possible states of the program implied by a boolean variable - the state in which it is true, and the state in which it is false.
i :: Int32
There are 2^32 possible states of the program implied by a 32-bit integer.
This one is just awful...
s :: String
The number of states representable by a string is basically all the possible memory states of your computer, more or less. It is effectively unbounded. And hence, almost useless. Strings should only ever be used for humans to read, never for computers to dispatch on.
data Maybe a = Just a | Nothing
inhabitants (Maybe a) = inhabitants a + 1
data Either a b = Left a | Right b
inhabitants (Either a b) = inhabitants a + inhabitants b
Bool, Maybe and Either are called sum types for the obvious reason.
tuple :: (Bool, Int32)
-
2^32^ inhabitants if the value on the left is True
-
2^32^ inhabitants if it's False
-
2^32^ + 2^32^ = 2 * 2^32^ = 2^33^
-- if haskell had unsafe type-level pattern matching...
inhabitants :: Type -> Nat
inhabitants Bool = 2
inhabitants Int32 = 2^32
inhabitants (a, b) = inhabitants a * inhabitants b
inhabitants (Int32, Int32) = 2^32 * 2^32 = 2^64
inhabitants (a, b, c) = inhabitants a * inhabitants b * inhabitants c
inhabitants (Bool, Bool, Int32) = 2 * 2 * 2^32 = 2^34
With tuples, we always multiply.
We call these "product" types.
2^32^ + 2^32^ = 2 * 2^32^ = 2^33^
either :: Either Int32 Int32
-- 2^33 inhabitants
tuple :: (Bool, Int32)
-- 2^33 inhabitants
These types are isomorphic.
Choose whichever one is most convenient.
inhabitants (Maybe Int32) = 2^32^ + 1
Most languages emphasize products.
Bad ones don't let you define a type
with 2^32^ + 1 inhabitants easily.
Sapir-Whorf
Most mainstream languages today make products easy, sums hard.
Too easy to define types with too many inhabitants.
interface EitherVisitor<A, B, C> {
public C visitLeft(Left<A, B> left);
public C visitRight(Right<A, B> right);
}
interface Either<A, B> {
public <C> C accept(EitherVisitor<A, B, C> visitor);
}
public final class Left<A, B> implements Either<A, B> {
public final A value;
public Left(A value) {
this.value = value;
}
public <C> C accept(EitherVisitor<A, B, C> visitor) {
return visitor.visitLeft(this);
}
}
public final class Right<A, B> implements Either<A, B> {
public final B value;
public Right(B value) {
this.value = value;
}
public <C> C accept(EitherVisitor<A, B, C> visitor) {
return visitor.visitRight(this);
}
}
"Bug fixing strategy: forbid yourself to fix the bug. Instead, render the bug impossible by construction." --Paul Phillips
To do this, we need to minimize the state space of our program.
What types should we not let in?
The type
String
should only ever appear in your program when a value is being shown to a human being.
- Strings as dictionary keys
- Strings as serialized form
Strings are worse than any other potentially infinite data structure because they're convenient.
A very common misfeature is to use strings as a serialization mechanism.
Appealing because human readable.
Introduce bugs so that you can have a debugging tool?
Good for being read by humans. But not for machines. Don't turn your value into a string until it's about to be turned into photons headed at eyeballs.
Use newtypes liberally.
newtype Name = Name { strValue :: String }
-- don't export strValue unless you really, really need it
case class Name(strValue: String) extends AnyVal
Never, ever pass bare String values
unless it's to putStrLn
or equivalent.
Never, ever return bare String values
except from readLn
or equivalent
If a string is serving any purpose other than display in your system, then it has semantic meaning that should be tracked by the type system.
Hide your newtype constructor behind validation.
case class IPAddr private (addr: String) extends AnyVal
object IPAddr {
val pattern = """(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})""".r
def parse(s: String): Option[IPAddr] = for {
xs <- pattern.unapplySeq(s) if xs.forall(_.toInt <= 255)
} yield IPAddr(s)
}
We've shrunk down an infinite number of states to (256^4^ + 1). Given the inputs, that's the best we can do.
This approach applies to virtually every primitive type.
Three very useful types:
-
A \/ B
-
Validation[A, B]
-
EitherT[M[_], A, B]
Sum types where the error type conventionally comes on the left.
- Going to quickly look at each of them.
MyError \/ B
-
\/
(Disjunction) has a Monad biased to the right -
We can sequentially compose operations that might fail
-
for
comprehension syntax is useful for this
import scalaz.std.syntax.option._
def parseJson(s: String): ParseError \/ JValue = ???
def ipField(jv: JValue): ParseError \/ String = ???
def parseIP(addrStr: String): ParseError \/ IPAddr =
IPAddr.parse(addrStr) toRightDisjunction {
ParseError(s"$addrStr is not a valid IP address")
}
val ipV: ParseError \/ IPAddr = for {
jv <- parseJson("""{"hostname": "nuttyland", "ipAddr": "127.0.0.1"}""")
addrStr <- ipField(jv)
ipAddr <- parseIP(addrStr)
} yield ipAddr
Validation[NonEmptyList[MyError], B]
-
Validation does not have a Monad instance.
-
Composition uses Applicative: conceptually parallel!
-
if you need sequencing,
.disjunction
type VPE[B] = Validation[NonEmptyList[ParseError], B]
def hostname(jv: JValue): VPE[String] = ???
def ipField(jv: JValue): ParseError \/ String = ???
def parseIP(addrStr: String): ParseError \/ IPAddr = ???
def host(jv: JValue) = ^[VPE, String, IPAddr, Host](
hostname(jv),
(ipField(jv) >>= parseIP).validation.leftMap(nels(_))
) { Host.apply _ }
There is a functor isomorphism between \/
and validation.
Types have the same information content, but compose differently.
EitherT[M[_], MyErrorType, B]
-
EitherT layers together two effects:
-
The "outer" monadic effect of the type constructor M[_]
-
The disjunctive effect of
\/
-
// EitherT[M, A, _] <~> M[A \/ _]
def findDocument(key: DocKey): EitherT[IO, DBErr, Document] = ???
def storeWordCount(key: DocKey, wordCount: Long): EitherT[IO, DBErr, Unit] = ???
val program: EitherT[IO, DBErr, Unit] = for {
doc <- findDocument(myDocKey)
words = wordCount(doc)
_ <- storeWordCount(myDocKey, words)
} yield ()
"Real world" interaction.
in other words, what was that IO thingy?
What is IO?
A value of type IO[String] represents some action that can be performed to return a String.
IO is kind of like if you had only a single sensory organ.
It tells you that there's something going on, but it's not very specific about what.
You couldn't, for example, determine whether it was your hair or your hand that was on fire. That sort of thing.
We want to be more specific. Sum types give us a way to do that.
Minimizing the number of inhabitants of our program includes minimizing side effects.
We want to be specific.
Sounds like a job for a sum type!
// def findDocument(key: DocKey): EitherT[IO, DBErr, Document] = ???
// def storeWordCount(key: DocKey, wordCount: Long): EitherT[IO, DBErr, Unit] = ???
sealed trait DocAction
case class FindDocument(key: DocKey) extends DocAction
case class StoreWordCount(key: DocKey, wordCount: Long) extends DocAction
How can we build a program out of these actions?
object DocAction {
type Program = List[DocAction]
def runProgram(actions: Program): IO[Unit] = ???
}
- Obviously, this doesn't work.
- can't interleave pure computations
- no way to represent return values
- can't produce a new DocAction from a Program
- can't make later action a function of an earlier one
- But, it is conceptually related to what we want.
- a data structure an ordered sequence of actions
- an interpreter that evaluates this data structure
Let's, see, what is good for sequencing effects?
trait Monad[M[_]] {
def pure[A](a: => A): M[A]
def bind[A, B](ma: M[A])(f: A => M[B]): M[B]
}
- State threads state through a computation...
- Reader gives the same inputs to all...
- Writer keeps a log...
- Not going to be Option, List, ...
-
restrict the client to only permit actions in our sum
-
produce later actions as a function of earlier ones
-
interleave pure computations with effectful
Restate relationship of minimization of states to correctness.
sealed trait Free[F[_], A]
case class Pure[F[_], A](a: A) extends Free[F, A]
case class Bind[F[_], A, B](s: Free[F, A], f: A => Free[F, B]) extends Free[F, A]
case class Suspend[F[_], A](s: F[Free[F, A]]) extends Free[F, A]
for a complete derivation of this type, see Functional Programming In Scala
Exercise 1: Write the Monad instance for Free.
This data structure will take the place of List in our broken Program type.
Instead of walking through the derivation, I'm going to show you how to use it, and I think that through the process of using it it'll become obvious why it is useful for our purpose.
object Free {
def monad[F[_]] = new Monad[({ type λ[α] = Free[F, α] })#λ] {
def pure[A](a: => A): Free[F, A] = Pure(a)
// def bind[A, B](ma: Free[F, A])(f: A => Free[F, B]): Free[F, B] = Bind(ma, f)
def bind[A, B](ma: Free[F, A])(f: A => Free[F, B]): Free[F, B] = ma match {
case Bind(ma0, f0) => bind(ma0) { a => Bind(f0(a), f) }
case other => Bind(other, f)
}
}
}
Back to our requirements.
Need to chain together
Here's a little trick: store the rest of the program in each action.
// sealed trait DocAction
// case class FindDocument(key: DocKey) extends DocAction
// case class StoreWordCount(key: DocKey, wordCount: Long) extends DocAction
sealed trait DocAction[A]
case class FindDocument(key: DocKey, cont: Option[Document] => A) extends DocAction[A]
case class StoreWordCount(key: DocKey, wordCount: Long, cont: () => A) extends DocAction[A]
This is an implementation detail, really, but it's a critical one.
Remember, a program is a value representing a sequence of operations for an interpreter to follow. So, capturing the rest of the program is just capturing that description - there's not a lot of overhead to it.
Earlier, we used a list to represent this sequence, and a couple of slides back I introduced the Free data structure (which is itself a sum type) as the one we'd use to capture the sequence of operations. Let's do that now.
sealed trait DocAction[A]
object DocAction {
case class FindDocument[A] private[DocAction] (
key: DocKey,
cont: Option[Document] => A
) extends DocAction[A]
case class StoreWordCount[A] private[DocAction] (
key: DocKey, wordCount: Long,
cont: () => A
) extends DocAction[A]
type Program[A] = Free[DocAction, A]
def findDocument(key: DocKey): Program[Option[Document]] =
Suspend(FindDocument(key, Pure(_)))
def storeWordCount(key: DocKey, wordCount: Long): Program[Unit] =
Suspend(StoreWordCount(key, wordCount, () => Pure(())))
}
Private constructors, no need to expose them.
We now have a way to create programs, and we know in the abstract that because Free has a Monad, that we can chain those programs together.
So let's see what that looks like.
import DocAction._
val program: Program[Unit] = for {
document <- findDocument(docKey)
_ <- document match {
case Some(d) => storeWordCount(docKey, countWords(d))
case None => Free.monad[DocAction].pure(())
}
} yield ()
While this looks like an effectful program, it's just chaining together calls to the Program constructors that we defined previously. So, all that we've done is built up a data structure, like our original List, but much more powerful!
The next thing to do is interpret this data structure.
type Memory = Map[DocKey, (Document, Option[Long])]
@tailrec def run[A](program: Program[A], memory: Memory): A = {
program match {
case Pure(a) => a
case Suspend(FindDocument(key, cont)) => ???
case Suspend(StoreWordCount(key, n, cont)) => ???
case Bind(s, f) => ???
}
}
type Memory = Map[DocKey, (Document, Option[Long])]
@tailrec def run[A](program: Program[A], memory: Memory): A = {
program match {
case Pure(a) => a
case Suspend(FindDocument(key, cont)) =>
run(cont(memory.get(key).map(_._1)), memory)
case Suspend(StoreWordCount(key, n, cont)) => ???
case Bind(s, f) => ???
}
}
type Memory = Map[DocKey, (Document, Option[Long])]
@tailrec def run[A](program: Program[A], memory: Memory): A = {
program match {
case Pure(a) => a
case Suspend(FindDocument(key, cont)) =>
run(cont(memory.get(key).map(_._1)), memory)
case Suspend(StoreWordCount(key, n, cont)) => ???
val newValue = memory.get(key).map(_.copy(_2 = Some(n)))
val newMemory = memory ++ newValue.map(key -> _)
run(cont(), newMemory)
case Bind(s, f) => ???
}
}
type Memory = Map[DocKey, (Document, Option[Long])]
@tailrec def run[A](program: Program[A], memory: Memory): A = {
program match {
case Pure(a) => a
case Suspend(FindDocument(key, cont)) => // implemented
case Suspend(StoreWordCount(key, n, cont)) => //implemented
case Bind(s, f) =>
s match {
case Pure(a) => run(f(a), memory)
case Suspend(FindDocument(key, cont)) =>
run(Bind(cont(memory.get(key).map(_._1)), f), memory)
case Suspend(StoreWordCount(key, n, cont)) =>
val newValue = memory.get(key).map(_.copy(_2 = Some(n)))
val newMemory = memory ++ newValue.map(key -> _)
run(Bind(cont(), f), newMemory)
case Bind(s0, f0) =>
run(Bind(s0, (a: Any) => Bind(f0(a), f)), memory)
}
}
}
Exercise 2: Implement a pure interpreter for ASDF
After this, we'll work on side-effecting interpreter.
Exercise 3: Refactor the tests to allow reuse against other interpreters.
Exercise 4: Implement an effectful interpreter for ASDF