
Commit 73b43dc

Squashed commit of the following:
commit 41ba8c36a90cc11723b14ce6c45599eabdcfaa53
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sun Jan 18 21:02:57 2015 -0500

    type provenance

commit be5eda941bb4c44b4c4af0ddbbd793643938f4ff
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sun Jan 18 20:13:06 2015 -0500

    provenance prototype

commit 7aa958b9c279e7571f7c4887f6aa19443e16f6fb
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sun Jan 18 19:35:08 2015 -0500

    fix misc typos

commit 52d60b3b2630e50ef0cd6ea5f0fa1f308d92e26d
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sun Jan 18 15:15:58 2015 -0500

    license badge

commit 7d34274afe6f05a0002c8f87e5077b6a130b42b4
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sun Jan 18 15:07:28 2015 -0500

    fix resolution for llvm cfg graphs

commit 14d9bc836ecc64f8e9acc60bcbd2da02335255b9
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sun Jan 18 13:12:39 2015 -0500

    added codegen dsl stub

commit 0f74cdd6f95d0a1fe1cafd73e45cb1407709efd8
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sun Jan 18 13:01:14 2015 -0500

    llvm cfg graphs

commit a199d721503985954060e7670c1d2f5e1a65dd11
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sun Jan 18 10:56:54 2015 -0500

    source code font

commit c7db0c5d67b73d8633f08be093971877e2d6ede0
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sun Jan 18 09:59:37 2015 -0500

    change phrasing around recursion

commit 6903700db482524233262e722df54b1066218250
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sat Jan 17 18:20:06 2015 -0500

    contributors.md

commit 14d90a3f2ebf7ddf1229c084fe4a1e9fa13f2e41
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sat Jan 17 17:35:41 2015 -0500

    added llvm logo

commit d270df6d94cbf1ef9eddfdd64af5aabc36ebca72
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sat Jan 17 15:50:28 2015 -0500

    initial llvm chapter

commit e71b189c057ea9e399e90e47d9d49bb4cf12cda8
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Sat Jan 17 12:21:00 2015 -0500

    system-f typing rules

commit 2a7d5c7f137cf352eeae64836df634c98118f594
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Thu Jan 15 15:21:14 2015 -0500

    flesh out system-f

commit 7b3b2f0a2aea5e1102abe093cf5e0559090720aa
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Wed Jan 14 22:22:14 2015 -0500

    started on extended parser

commit cdeaf1a2658f15346fe1dc665ca09e954cce6c2e
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Wed Jan 14 17:25:02 2015 -0500

    creative commons license

commit f09d210be253a05fc8ad0827cd72ffa32404e2ba
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Wed Jan 14 16:54:10 2015 -0500

    higher res images

commit 8555eadfea8843f5683621e6652857e4259fa896
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Wed Jan 14 14:48:44 2015 -0500

    cover page

commit e5e542e92610f4bb4c5ac726ffa86cd1e07753e3
Author: Stephen Diehl <stephen.m.diehl@gmail.com>
Date: Tue Jan 13 17:31:01 2015 -0500

    initial happy/alex parser
1 parent b81a2c9 commit 73b43dc

101 files changed: +3971 -166 lines changed

000_introduction.md

+100-55
@@ -1,14 +1,19 @@
1+
<div class="pagetitle">
12
![](img/titles/introduction.png)
3+
</div>
24

3-
******
5+
<p class="halfbreak">
6+
</p>
47

5-
> When the limestone of imperative programming is worn away, the granite of
6-
> functional programming will be observed.
8+
<!--
9+
> *When the limestone of imperative programming is worn away, the granite of
10+
> functional programming will be observed.*
711
>
812
> <cite>— Simon Peyton Jones</cite>
913
1014
<p class="halfbreak">
1115
</p>
16+
-->
1217

1318
Introduction
1419
============
@@ -151,42 +156,71 @@ refines the space of allowable behavior and degree of expressible programs for
151156
the language. Types are the world's most popular formal method for analyzing
152157
programs.
153158

154-
$$
155-
\begin{aligned}
156-
1 &: \t{Nat} \\
157-
(\lambda x . x) &: \forall a. a \to a \\
158-
(\lambda x y . x) &: \forall a b. a \to b \to a \\
159-
\end{aligned}
160-
$$
159+
In a language like Python all expressions have the same type at compile time,
160+
and all syntactically valid programs can be evaluated. In the case where the
161+
program is nonsensical the interpreter will raise exceptions at runtime. The
162+
Python interpreter makes no attempt to analyze the given program for soundness
163+
at all before running it.
164+
165+
```bash
166+
>>> True & "false"
167+
Traceback (most recent call last):
168+
File "<stdin>", line 1, in <module>
169+
TypeError: unsupported operand type(s) for &: 'bool' and 'str'
170+
```
161171

162-
In more sophisticated languages types and terms will commingle either with
163-
explicit annotations on binders, or even as first class values themselves.
172+
By comparison Haskell will do quite a bit of work to try to ensure that the
173+
program is well-defined before running it. The language that we use to
174+
prescribe and analyze the static semantics of the program is that of *static
175+
types*.
164176

165-
$$
166-
\t{Pair} \ u \ v = \Lambda X . \lambda x^{U \rightarrow V \rightarrow X} . x u v
167-
$$
177+
```bash
178+
Prelude> True && "false"
179+
180+
<interactive>:2:9:
181+
Couldn't match expected type `Bool' with actual type `[Char]'
182+
In the second argument of `(&&)', namely `"false"'
183+
In the expression: True && "false"
184+
In an equation for `it': it = True && "false"
185+
```
168186

169-
In all the languages which we will implement the types present during compilation are
170-
*erased*. Although they are present in the evaluation semantics, the runtime
171-
cannot dispatch on types of values at runtime. Types by definition only exist at
172-
compile-time in the static semantics of the language.
187+
Catching minor type mismatch errors is the simplest example of usage, although
188+
they occur extremely frequently as we humans are quite fallible in our reasoning
189+
about even the simplest of program constructions! This is just the tip of
190+
the iceberg: the gradual trend over the last 20 years has been toward more *expressive
191+
types* in modern type systems, which are capable of guaranteeing a large variety
192+
of program correctness properties.
193+
194+
* Preventing resource allocation errors.
195+
* Enforcing security access for program logic.
196+
* Side effect management.
197+
* Preventing buffer overruns.
198+
* Ensuring cryptographic properties for network protocols.
199+
* Modeling and verify theorems in mathematics and logic.
200+
* Preventing data races and deadlocks in concurrent systems.
201+
202+
Type systems can never capture all possible behavior of the program, although
203+
more sophisticated type systems are increasingly able to model a large space of
204+
behavior; this is one of the most exciting areas of modern computer science
205+
research. Put most bluntly, **static types let you be dumb** and offload the
206+
checking that you would otherwise have to do in your head to a system that can
207+
do the reasoning for you and work with you to interactively build your program.
173208

174209
Functional Compilers
175210
--------------------
176211

177-
A compiler is typically divided into parts, a *frontend* and a *backend*. These
178-
are loose terms but the frontend typically deals with converting the human
179-
representation of the code into some canonicalized form while the backend
180-
converts the canonicalized form into another form that is suitable for
181-
evaluation.
212+
A *compiler* is a program for turning a high-level representation of ideas in a
213+
human readable language into another form. A compiler is typically divided into
214+
parts, a *frontend* and a *backend*. These are loose terms but the frontend
215+
typically deals with converting the human representation of the code into some
216+
canonicalized form while the backend converts the canonicalized form into
217+
another form that is suitable for evaluation.
182218

183219
The high level structure of our functional compiler is described by the
184220
following *block diagram*. Each describes a *phase* which is a sequence of
185221
transformations composed to transform the input program.
186222

187-
<p class="center">
188223
![](img/pipeline1.png)
189-
</p>
190224

191225
* **Source** - The frontend textual source language.
192226
* **Parsing** - Source is parsed into an abstract syntax tree.
@@ -200,15 +234,13 @@ A *pass* may transform the input program from one form into another or alter the
200234
internal state of the compiler context. The high level description of the forms
201235
our final compiler will go through is the following sequence:
202236

203-
<p class="center">
204237
![](img/pipeline2.png)
205-
</p>
206238

207239
Internal forms used during compilation are *intermediate representations* and
208240
typically any non-trivial language will involve several.
209241

210-
Lexing
211-
------
242+
Parsing
243+
-------
212244

213245
The source code is simply the raw sequence of text that specifies the program.
214246
Lexing splits the text stream into a sequence of *tokens*. Only the presence of
@@ -222,22 +254,22 @@ let f x = x + 1
222254
For instance the previous program might generate a token stream like the
223255
following:
224256

225-
Token Value
226-
----- -----
227-
reserved let
228-
var f
229-
var x
230-
reservedOp =
231-
var x
232-
reservedOp +
233-
integer 1
234-
235-
Parsing
236-
-------
257+
```haskell
258+
[
259+
TokenLet,
260+
TokenSym "f",
261+
TokenSym "x",
262+
TokenEq,
263+
TokenSym "x",
264+
TokenAdd,
265+
TokenNum 1
266+
]
267+
```
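
A minimal hand-written lexer producing this kind of token stream might look like the following sketch. The `Token` constructors mirror the listing above; the function name `tokenize` and its rules for whitespace, identifiers and numbers are assumptions made for illustration, and it is not the alex-generated lexer mentioned in the commit log.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)

-- Token type mirroring the constructors in the listing above.
data Token
  = TokenLet
  | TokenSym String
  | TokenEq
  | TokenAdd
  | TokenNum Int
  deriving (Eq, Show)

-- Hypothetical lexer: walks the input once, skipping whitespace and
-- emitting one token per keyword, identifier, operator or integer.
tokenize :: String -> [Token]
tokenize [] = []
tokenize (c:cs)
  | isSpace c = tokenize cs
  | c == '='  = TokenEq  : tokenize cs
  | c == '+'  = TokenAdd : tokenize cs
  | isDigit c =
      let (digits, rest) = span isDigit (c:cs)
      in TokenNum (read digits) : tokenize rest
  | isAlpha c =
      let (name, rest) = span isAlphaNum (c:cs)
          tok = if name == "let" then TokenLet else TokenSym name
      in tok : tokenize rest
  | otherwise = error ("unexpected character: " ++ [c])
```

Applied to `"let f x = x + 1"` this yields the seven tokens shown above. Whitespace only separates tokens and never appears in the output.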
237268

238-
A datatype for the *abstract syntax tree* (AST) is constructed by traversal of
239-
the input stream and generation of the appropriate syntactic construct using a
240-
parser.
269+
We can then scan the token stream and dispatch on predefined patterns of
270+
tokens called *productions* and recursively build up a datatype for the
271+
*abstract syntax tree* (AST) by traversal of the input stream and generation of
272+
the appropriate syntactic construct.
241273

242274
```haskell
243275
type Name = String
@@ -319,6 +351,19 @@ Let "f" []
319351
(Lit (LitInt 1))))
320352
```
321353

354+
Transformation
355+
--------------
356+
357+
The type core representation is often suitable for evaluation, but quite often
358+
different intermediate representations are more amenable to certain
359+
optimizations and make semantic properties of the language explicit.
360+
These kinds of intermediate forms will often attach information about free
361+
variables, allocations, and usage information directly onto the AST to make it
362+
363+
The most important form we will use is called the *Spineless Tagless G-Machine*
364+
(STG), an abstract machine that makes many of the properties of lazy
365+
evaluation explicit directly in the AST.
366+
322367
Code Generation
323368
---------------
324369

@@ -356,11 +401,11 @@ resulting module.
356401

357402
```perl
358403
f:
359-
movl %edi, -4(%rsp)
360-
movl -4(%rsp), %edi
361-
addl $1, %edi
362-
movl %edi, %eax
363-
ret
404+
movl %edi, -4(%rsp)
405+
movl -4(%rsp), %edi
406+
addl $1, %edi
407+
movl %edi, %eax
408+
ret
364409

365410
```
366411

@@ -370,11 +415,11 @@ instructions defined by the processor specification.
370415

371416
```perl
372417
0000000000000000 <f>:
373-
0: 89 7c 24 fc mov %edi,-0x4(%rsp)
374-
4: 8b 7c 24 fc mov -0x4(%rsp),%edi
375-
8: 81 c7 01 00 00 00 add $0x1,%edi
376-
e: 89 f8 mov %edi,%eax
377-
10: c3 retq
418+
0: 89 7c 24 fc mov %edi,-0x4(%rsp)
419+
4: 8b 7c 24 fc mov -0x4(%rsp),%edi
420+
8: 81 c7 01 00 00 00 add $0x1,%edi
421+
e: 89 f8 mov %edi,%eax
422+
10: c3 retq
378423
```
379424

380425
\pagebreak

001_basics.md

+36-27
@@ -1,6 +1,6 @@
1+
<div class="pagetitle">
12
![](img/titles/basics.png)
2-
3-
******
3+
</div>
44

55
<!--
66
<blockquote>
@@ -33,14 +33,12 @@ add :: Integer -> Integer -> Integer
3333
add x y = x + y
3434
```
3535

36-
```haskell
37-
add (x,y) = x + y
38-
```
36+
In Haskell all functions are pure; the only thing a function may do is return a
37+
value.
3938

40-
In Haskell all functions are pure, the only thing a function may do is return a value.
41-
42-
All functions in Haskell are curried, for example a function of three arguments takes up to three arguments and for
43-
anything less than three it yields a partially applied function which when given additional arguments yields a
39+
All functions in Haskell are curried, for example a function of three arguments
40+
takes up to three arguments and for anything less than three it yields a
41+
partially applied function which when given additional arguments yields
4442
another function or the resulting value if saturated.
4543

4644
```haskell
@@ -51,7 +49,8 @@ h :: Int -> Int
5149
h = g 2 3
5250
```
5351

54-
Haskell supports higher-order functions, functions which take functions and yield other functions.
52+
Haskell supports higher-order functions, functions which take functions and
53+
yield other functions.
5554

5655
```haskell
5756
compose f g = \x -> f (g x)
@@ -86,10 +85,7 @@ constructors also generates special set of functions known as *selectors* which
8685
extract the values of a specific field from the record.
8786

8887
```haskell
89-
data Prod = Prod
90-
{ a :: Int
91-
, b :: Bool
92-
}
88+
data Prod = Prod { a :: Int , b :: Bool }
9389

9490
-- a :: Prod -> Int
9591
-- b :: Prod -> Bool
@@ -158,7 +154,7 @@ Tuples are allowed (with compiler support) up to 15 fields in GHC.
158154
Pattern matching
159155
----------------
160156

161-
Pattern matching allows us to discriminate on the constructor(s) of a datatype,
157+
Pattern matching allows us to discriminate on the constructors of a datatype,
162158
mapping separate cases to separate code paths.
163159

164160
```haskell
@@ -216,12 +212,17 @@ Recursion
216212
In Haskell all iteration over data structures is performed by recursion.
217213
Entering a function in Haskell does not create a new stack frame, the logic of
218214
the function is simply entered with the arguments on the stack and yields result
219-
to the register. The resulting logic is compiled identically to ``while`` loops
220-
in other languages, via a ``jmp`` instruction instead of a ``call``.
215+
to the register. In the case where a function returns an invocation of itself
216+
in the *tail position* the resulting logic is compiled identically to
217+
``while`` loops in other languages, via a ``jmp`` instruction instead of a
218+
``call``.
221219

222220
```haskell
223-
factorial 0 = 1
224-
factorial n = n * factorial (n - 1)
221+
sum :: [Int] -> Int
222+
sum ys = go ys 0
223+
where
224+
go (x:xs) i = go xs (i+x)
225+
go [] i = i
225226
```
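
To make the tail-position distinction concrete, the sketch below contrasts the directly recursive `factorial` removed in this diff with an accumulator-passing variant. The name `factorialAcc` and the use of `seq` to force the accumulator are illustrative assumptions, and both functions assume a non-negative argument.

```haskell
-- Not tail recursive: the multiplication is still pending
-- when the recursive call returns.
factorial :: Integer -> Integer
factorial 0 = 1
factorial n = n * factorial (n - 1)

-- Tail recursive: the recursive call to go is the last thing
-- the function does, so nothing is left to do after it returns.
factorialAcc :: Integer -> Integer
factorialAcc n = go n 1
  where
    go 0 acc = acc
    go k acc =
      let acc' = k * acc
      in acc' `seq` go (k - 1) acc'  -- force the accumulator before looping
```

Both definitions compute the same result. The `seq` is there because in a lazy language an unforced accumulator can otherwise build up a chain of unevaluated multiplications.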
226227

227228
Functions can be defined to recurse mutually on each other.
@@ -421,29 +422,29 @@ all monad instances must satisfy.
421422
**Law 1**
422423

423424
```haskell
424-
return a >>= f f a
425+
return a >>= f = f a
425426
```
426427

427428
**Law 2**
428429

429430
```haskell
430-
m >>= return m
431+
m >>= return = m
431432
```
432433

433434
**Law 3**
434435

435436
```haskell
436-
(m >>= f) >>= g m >>= (\x -> f x >>= g)
437+
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
437438
```
438439

439440
Haskell has a level of syntactic sugar for monads known as do-notation. In this
440441
form binds are written sequentially in block form which extract the variable
441442
from the binder.
442443

443444
```haskell
444-
do { a <- f ; m } f >>= \a -> do { m }
445-
do { f ; m } f >> do { m }
446-
do { m } m
445+
do { a <- f ; m } = f >>= \a -> do { m }
446+
do { f ; m } = f >> do { m }
447+
do { m } = m
447448
```
448449

449450
So for example the following are equivalent.
@@ -504,6 +505,16 @@ discards the left while ``<*`` discards the right. For example in a monadic
504505
parser combinator library the ``*>`` would parse with first parser argument but
505506
return the second.
506507

508+
Monoids
509+
-------
510+
511+
```haskell
512+
class Monoid a where
513+
mempty :: a
514+
mappend :: a -> a -> a
515+
mconcat :: [a] -> a
516+
```
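
A short usage sketch of this interface, relying only on instances that ship with `base` (lists and the `Sum` wrapper from `Data.Monoid`). The names `joined`, `total` and `leftIdentity` are made up for the example.

```haskell
import Data.Monoid (Sum(..))

-- Lists are a monoid: mempty is [] and mappend is (++).
joined :: String
joined = mconcat ["foo", "bar", "baz"]

-- Integers under addition, via the Sum newtype wrapper.
total :: Int
total = getSum (mconcat (map Sum [1, 2, 3]))

-- mempty is an identity element for mappend.
leftIdentity :: Bool
leftIdentity = mappend mempty [1, 2, 3] == ([1, 2, 3] :: [Int])
```

Here `mconcat` folds a list of values with `mappend`, starting from `mempty`.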
517+
507518
Deriving
508519
--------
509520

@@ -638,9 +649,7 @@ evalStack m = execWriterT (evalStateT (unStack m) 0)
638649

639650
As illustrated by the following stack diagram:
640651

641-
<p class="center">
642652
![](img/stack.png)
643-
</p>
644653

645654
Using mtl and ``GeneralizedNewtypeDeriving`` we can produce the same stack but with
646655
a simpler forward facing interface to the transformer stack. Under the hood mtl
