1
+ <div class =" pagetitle " >
1
2
![ ] ( img/titles/introduction.png )
3
+ </div >
2
4
3
- ******
5
+ <p class =" halfbreak " >
6
+ </p >
4
7
5
- > When the limestone of imperative programming is worn away, the granite of
6
- > functional programming will be observed.
8
+ <!--
9
+ > *When the limestone of imperative programming is worn away, the granite of
10
+ > functional programming will be observed.*
7
11
>
8
12
> <cite>— Simon Peyton Jones</cite>
9
13
10
14
<p class="halfbreak">
11
15
</p>
16
+ -->
12
17
13
18
Introduction
14
19
============
@@ -151,42 +156,71 @@ refines the space of allowable behavior and degree of expressible programs for
151
156
the language. Types are the world's most popular formal method for analyzing
152
157
programs.
153
158
154
- $$
155
- \begin{aligned}
156
- 1 &: \t{Nat} \\
157
- (\lambda x . x) &: \forall a. a \to a \\
158
- (\lambda x y . x) &: \forall a b. a \to b \to a \\
159
- \end{aligned}
160
- $$
159
+ In a language like Python all expressions have the same type at compile time,
160
+ and all syntactically valid programs can be evaluated. In the case where the
161
+ program is nonsensical the runtime will bubble up exceptions at runtime. The
162
+ Python interpreter makes no attempt to analyze the given program for soundness
163
+ at all before running it.
164
+
165
+ ``` bash
166
+ >>> True & " false"
167
+ Traceback (most recent call last):
168
+ File " <stdin>" , line 1, in < module>
169
+ TypeError: unsupported operand type(s) for & : ' bool' and ' str'
170
+ ```
161
171
162
- In more sophisticated languages types and terms will commingle either with
163
- explicit annotations on binders, or even as first class values themselves.
172
+ By comparison Haskell will do quite a bit of work to try to ensure that the
173
+ program is well-defined before running it. The language that we use to
174
+ predescribe and analyze static semantics of the program is that of * static
175
+ types* .
164
176
165
- $$
166
- \t{Pair} \ u \ v = \Lambda X . \lambda x^{U \rightarrow V \rightarrow X} . x u v
167
- $$
177
+ ``` bash
178
+ Prelude> True && " false"
179
+
180
+ < interactive> :2:9:
181
+ Couldn' t match expected type `Bool' with actual type ` [Char]'
182
+ In the second argument of `(&&)' , namely ` " false" '
183
+ In the expression: True && "false"
184
+ In an equation for `it' : it = True && " false"
185
+ ```
168
186
169
- In all the languages which we will implement the types present during compilation are
170
- * erased* . Although they are present in the evaluation semantics, the runtime
171
- cannot dispatch on types of values at runtime. Types by definition only exist at
172
- compile-time in the static semantics of the language.
187
+ Catching minor type mismatch errors is the simplest example of usage, although
188
+ they occur extremely frequently as we humans are quite fallible in our reasoning
189
+ about even the simplest of program constructions! Although this just the tip of
190
+ the iceberg, the gradual trend over the last 20 years toward more * expressive
191
+ types* in modern type systems; which are capable of guaranteeing a large variety
192
+ of program correctness properties.
193
+
194
+ * Preventing resource allocation errors.
195
+ * Enforcing security access for program logic.
196
+ * Side effect management.
197
+ * Preventing buffer overruns.
198
+ * Ensuring cryptographic properties for network protocols.
199
+ * Modeling and verify theorems in mathematics and logic.
200
+ * Preventing data races and deadlocks in concurrent systems.
201
+
202
+ Type systems can never capture all possible behavior of the program. Although
203
+ more sophisticated type systems are increasingly able to model a large space of
204
+ behavior and is one of the most exciting areas of modern computer science
205
+ research. Put most bluntly, ** static types let you be dumb** and offload the
206
+ checking that you would otherwise have to do in your head to a system that can
207
+ do the reasoning for you and work with you to interactively build your program.
173
208
174
209
Functional Compilers
175
210
--------------------
176
211
177
- A compiler is typically divided into parts, a * frontend* and a * backend* . These
178
- are loose terms but the frontend typically deals with converting the human
179
- representation of the code into some canonicalized form while the backend
180
- converts the canonicalized form into another form that is suitable for
181
- evaluation.
212
+ A * compiler* is a program for turning high-level representation of ideas in a
213
+ human readable language into another form. A compiler is typically divided into
214
+ parts, a * frontend* and a * backend* . These are loose terms but the frontend
215
+ typically deals with converting the human representation of the code into some
216
+ canonicalized form while the backend converts the canonicalized form into
217
+ another form that is suitable for evaluation.
182
218
183
219
The high level structure of our functional compiler is described by the
184
220
following * block diagram* . Each describes a * phase* which is a sequence of
185
221
transformations composed to transform the input program.
186
222
187
- <p class =" center " >
188
223
![ ] ( img/pipeline1.png )
189
- </p >
190
224
191
225
* ** Source** - The frontend textual source language.
192
226
* ** Parsing** - Source is parsed into an abstract syntax tree.
@@ -200,15 +234,13 @@ A *pass* may transform the input program from one form into another or alter the
200
234
internal state of the compiler context. The high level description of the forms
201
235
our final compiler will go through is the following sequence:
202
236
203
- <p class =" center " >
204
237
![ ] ( img/pipeline2.png )
205
- </p >
206
238
207
239
Internal forms used during compilation are * intermediate representations* and
208
240
typically any non-trivial language will involve several.
209
241
210
- Lexing
211
- ------
242
+ Parsing
243
+ -------
212
244
213
245
The source code is simply the raw sequence of text that specifies the program.
214
246
Lexing splits the text stream into a sequence of * tokens* . Only the presence of
@@ -222,22 +254,22 @@ let f x = x + 1
222
254
For instance the previous program might generate a token stream like the
223
255
following:
224
256
225
- Token Value
226
- ----- -----
227
- reserved let
228
- var f
229
- var x
230
- reservedOp =
231
- var x
232
- reservedOp +
233
- integer 1
234
-
235
- Parsing
236
- -------
257
+ ``` haskell
258
+ [
259
+ TokenLet ,
260
+ TokenSym " f" ,
261
+ TokenSym " x" ,
262
+ TokenEq ,
263
+ TokenSym " x" ,
264
+ TokenAdd ,
265
+ TokenNum 1
266
+ ]
267
+ ```
237
268
238
- A datatype for the * abstract syntax tree* (AST) is constructed by traversal of
239
- the input stream and generation of the appropriate syntactic construct using a
240
- parser.
269
+ We can then scan the token stream via and dispatch on predefined patterns of
270
+ tokens called * productions* and recursively build up a datatype for the
271
+ * abstract syntax tree* (AST) by traversal of the input stream and generation of
272
+ the appropriate syntactic.
241
273
242
274
``` haskell
243
275
type Name = String
@@ -319,6 +351,19 @@ Let "f" []
319
351
(Lit (LitInt 1 ))))
320
352
```
321
353
354
+ Transformation
355
+ --------------
356
+
357
+ The type core representation is often suitable for evaluation, but quite often
358
+ different intermediate representations are more amenable to certain
359
+ optimizations and make explicit semantic properties of the language explicit.
360
+ These kind of intermediate forms will often attach information about free
361
+ variables, allocations, and usage information directly onto the AST to make it
362
+
363
+ The most important form we will use is called the * Spineless Tagless G-Machine*
364
+ ( STG ), an abstract machine that makes many of the properties of lazy
365
+ evaluation explicit directly in the AST.
366
+
322
367
Code Generation
323
368
---------------
324
369
@@ -356,11 +401,11 @@ resulting module.
356
401
357
402
``` perl
358
403
f:
359
- movl %edi , -4(%rsp )
360
- movl -4(%rsp ), %edi
361
- addl $1 , %edi
362
- movl %edi , %eax
363
- ret
404
+ movl %edi , -4(%rsp )
405
+ movl -4(%rsp ), %edi
406
+ addl $1 , %edi
407
+ movl %edi , %eax
408
+ ret
364
409
365
410
```
366
411
@@ -370,11 +415,11 @@ instructions defined by the processor specification.
370
415
371
416
``` perl
372
417
0000000000000000 <f>:
373
- 0: 89 7c 24 fc mov %edi ,-0x4(%rsp )
374
- 4: 8b 7c 24 fc mov -0x4(%rsp ),%edi
375
- 8: 81 c7 01 00 00 00 add $0x1,%edi
376
- e: 89 f8 mov %edi ,%eax
377
- 10: c3 retq
418
+ 0: 89 7c 24 fc mov %edi ,-0x4(%rsp )
419
+ 4: 8b 7c 24 fc mov -0x4(%rsp ),%edi
420
+ 8: 81 c7 01 00 00 00 add $0x1,%edi
421
+ e: 89 f8 mov %edi ,%eax
422
+ 10: c3 retq
378
423
```
379
424
380
425
\pagebreak
0 commit comments