- Originated at MIT.
- Used the interpreter from TECO.
- TECO must have been a horrible command language.
- Later implementations ran on Lisp Machines (i.e. used Lisp for extension commands).
- 1984: Rewritten in C and Emacs Lisp for the GNU project (GNU Emacs).
- Low-level code written in C, often exposed to Lisp.
- Higher level functionality written in Lisp.
- Thus, Emacs is not just an editor, it is a Lisp VM.
- 1960: McCarthy devised a representation of functions based on lambda calculus.
- Programs as lists with symbols.
- Prefix notation.
- A syntax layer on top (M-expressions) was planned, but never implemented.
- Many types exist.
- Two are of supreme importance:
  - List
    - Singly-linked
    - Made of cons-cells (ordered pairs)
    - Heterogeneous
    - They have syntax!
    - When evaluated, represents an operation.
  - Symbol
    - Needed in a compiler anyway
    - Exposed to the user (because … why not?)
    - They have syntax too!
    - When evaluated, returns the appropriate slot.
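The two types can be tried out directly. The following is a minimal sketch; all `my-demo-` names are made up for illustration.

```elisp
;; A list is a chain of cons cells; its elements may have different types.
(defvar my-demo-list (cons 1 (cons "two" (cons 'three nil))))
;; my-demo-list is now equal to (1 "two" three)

;; A list has read syntax; evaluating it performs an operation.
(defvar my-demo-sum (eval '(+ 1 2)))    ; the list (+ 1 2) evaluates to 3

;; A symbol has several slots, among them a value and a function slot.
(defvar my-demo-answer 42)
(eval 'my-demo-answer)                  ; looks up the value slot
(symbol-function 'car)                  ; looks up the function slot
```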
- Many dynamic languages (Perl, Python, Ruby) can create code at runtime and run it.
- Most cannot parse code at runtime.
- This makes code manipulation/rewriting very difficult (embedded DSLs).
- In Lisp, code is represented by Lisp datastructures, thus can be walked.
- This makes macros possible and powerful.
- Macros receive their arguments unevaluated, i.e. as code objects, not as the result of running the code.
(dolist (i (number-sequence 0 10))
  (insert (format "%s\n" i)))
- The list =(i (number-sequence 0 10))= is not evaluated.
- The macro =dolist= evaluates the second element of that list once.
- All arguments but the first are evaluated repeatedly.
- An element from the sequence is bound to the symbol =i= for each iteration.
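A small sketch of both points: code can be taken apart with ordinary list functions, and a macro sees the form it was given rather than its value. The names `my-demo-form` and `my-show-form` are hypothetical and exist only for this illustration.

```elisp
;; Code is a list, so list functions can walk it.
(defvar my-demo-form '(dolist (i (number-sequence 0 10))
                        (insert (format "%s\n" i))))
(car my-demo-form)     ; => dolist
(cadr my-demo-form)    ; => (i (number-sequence 0 10))

;; A macro receives its argument as a code object.
(defmacro my-show-form (form)
  "Return a list of FORM itself and the value FORM evaluates to."
  `(list ',form ,form))

(my-show-form (+ 1 2)) ; => ((+ 1 2) 3)
```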
- Source code in any language must be well-formed.
- Editing tools should try to enforce this.
- Should allow editing of structure, not of characters.
- Lispy and Paredit do so.
Task: Place `cl-values’ around the call to `apply’.
(defun el-reader//read-hash-table (stream _char)
  (cl-values
   (let ((k-v (el-reader/read-delimited-list ?\} stream t)))
     (if (= (mod (length k-v) 2) 1)
         (error "Invalid syntax: {}")
       (apply #'el-reader//ht k-v)))))
- What this is all about!
- Lisp enables extensible readers.
- Syntax is for data, not language constructs:
  - (): Lists
  - []: Vectors
  - "": Strings
- Reader consumes characters, produces Lisp objects.
- The compiler consumes Lisp objects, returns (byte-)code.
- Why not use the word “parser”?
- Parsers presume a lexer.
- Lisp does both at the same time, but exposes the reader to the user of the language.
- A user can also manipulate where the lexer separates tokens—we’ll see an example of that later with hashtables.
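The reader's job can be observed with the stock Emacs reader, without el-reader. This sketch uses the built-in `read-from-string`, which returns a cons of the object read and the position where reading stopped; `my-demo-object` is a made-up name.

```elisp
;; Characters go in, a Lisp object comes out.
(defvar my-demo-object
  (car (read-from-string "(a [1 2] \"s\")")))

(listp my-demo-object)            ; the parentheses produced a list
(symbolp (nth 0 my-demo-object))  ; the token a became a symbol
(vectorp (nth 1 my-demo-object))  ; the brackets produced a vector
(stringp (nth 2 my-demo-object))  ; the double quotes produced a string
```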
- Need to replace the built-in function =read= altogether.
- Yet still want to keep it around.
- Advice to the rescue!
(define-advice read
    (:around (oldfun &optional stream)
             el-reader//replace-read)
  (if use-el-reader
      (el-reader/read stream)
    (funcall oldfun stream)))
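The same pattern can be tried on a harmless function first. This is a toy sketch; `my-greet`, `my-shout` and the advice name are invented for the example and have nothing to do with el-reader.

```elisp
(defvar my-shout nil
  "When non-nil, the advice below changes what `my-greet' returns.")

(defun my-greet (name)
  "Return a greeting for NAME."
  (format "hello %s" name))

;; The advice decides per call whether to alter the result of the
;; original function, which it receives as OLDFUN.
(define-advice my-greet (:around (oldfun name) my-shout-advice)
  (if my-shout
      (upcase (funcall oldfun name))
    (funcall oldfun name)))

(my-greet "emacs")                      ; => "hello emacs"
(let ((my-shout t)) (my-greet "emacs")) ; => "HELLO EMACS"
```

The original definition stays available; `(advice-remove 'my-greet #'my-greet@my-shout-advice)` would restore it.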
- Elisp’s read:
(read &optional stream)
- CL’s read:
(read &optional input-stream eof-error-p eof-value recursive-p)
- el-reader’s read:
(cl-defun el-reader/read (&optional input-stream
                                    (eof-error-p t)
                                    eof-value
                                    recursive-p
                                    keys))
- =read= reads one expression from the stream.
- If the first character encountered is a macro character, execute that function and use the result.
- If not, read characters into a token (symbol or number).
- End the token when whitespace or a macro character is encountered.
- Escape characters may be used to prevent macro execution or token termination (may include whitespace in a symbol name).
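The steps above can be sketched as a toy reader that works on the current buffer. This is not el-reader's implementation: every `toy-` name is hypothetical, all macro characters are treated as terminating, and only integers, symbols and lists are supported.

```elisp
(defvar toy-macro-characters nil
  "Alist of (CHAR . FUNCTION); FUNCTION is called with CHAR.")

(defun toy-read-token ()
  "Accumulate characters at point into a token; return a number or symbol."
  (let ((chars '()))
    (while (and (not (eobp))
                (not (memq (char-after) '(?\s ?\t ?\n)))
                (not (assq (char-after) toy-macro-characters)))
      (when (eq (char-after) ?\\)       ; single escape: take next char as is
        (forward-char)
        (when (eobp) (error "Escape character at end of input")))
      (push (char-after) chars)
      (forward-char))
    (let ((name (apply #'string (nreverse chars))))
      (if (string-match-p "\\`-?[0-9]+\\'" name)
          (string-to-number name)
        (intern name)))))

(defun toy-read ()
  "Read one object starting at point in the current buffer."
  (skip-chars-forward " \t\n")
  (when (eobp) (error "End of input"))
  (let ((macro (assq (char-after) toy-macro-characters)))
    (if macro
        (let ((char (char-after)))
          (forward-char)                ; consume the macro character
          (funcall (cdr macro) char))
      (toy-read-token))))

(defun toy-read-list (_char)
  "Read macro function for the opening paren: read until the closing paren."
  (let ((items '()))
    (skip-chars-forward " \t\n")
    (while (not (eq (char-after) ?\)))
      (push (toy-read) items)
      (skip-chars-forward " \t\n"))
    (forward-char)                      ; consume the closing paren
    (nreverse items)))

(setq toy-macro-characters
      (list (cons ?\( #'toy-read-list)
            (cons ?\) (lambda (_char) (error "Unexpected closing paren")))))

(defun toy-read-from-string (string)
  "Read one object from STRING with the toy reader."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (toy-read)))

(toy-read-from-string "(foo (bar 42) baz)") ; => (foo (bar 42) baz)
```

Registering the closing paren as a macro character is what makes a token such as `42` end in front of it.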
- Terminating macro character
  - Calls the user-supplied function if it is the first character of a token; otherwise ends the token.
- Non-terminating macro character
  - Calls the user-supplied function if it is the first character of a token; otherwise is read as part of the token.
- Read macro
  - A pair of a macro character and a function to be called when this character is encountered.
- Syntax type
  - Every instance of every character has exactly one syntax type. Terminating and non-terminating macro characters are syntax types.
- Token
  - An atomic unit of text. Reads as a symbol or number.
- Constituent
  - Part of a token (symbol or number).
- Macro character
  - Can be terminating or non-terminating.
- Single escape character
  - Causes the next character to be treated as a constituent (even if it was a macro character).
    'foo\(bar ;; => A symbol with the name "foo(bar"
- Multiple escape character
  - Also escapes characters to be constituents, but does so for a stretch of characters until another multiple escape character is encountered.
- Whitespace
  - Characters which end the accumulation of a token, but are otherwise skipped.
- Invalid
  - Characters which may not occur (unused by el-reader).
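The effect of the single escape character can be checked with the stock Emacs reader, which also uses the backslash for this purpose. A minimal sketch using only built-in functions:

```elisp
;; Without the escape, the opening paren ends the token after "foo".
(read-from-string "foo(bar")                        ; => (foo . 3)

;; With the escape, the paren becomes part of the symbol name.
(symbol-name (car (read-from-string "foo\\(bar")))  ; => "foo(bar"

;; The same works for whitespace.
(symbol-name (car (read-from-string "foo\\ bar")))  ; => "foo bar"
```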
- alphabetic
- digit
- plus sign
- minus sign
- dot
- decimal point
- ratio marker
- exponent marker
- invalid (unused)
(with-current-buffer (get-buffer "el-reader.el")
  (occur "(\\(?:defclass\\|cl-defgeneric\\)"))
- Use buffer-local variables.
- Add the following to the beginning of a file:
(eval-and-compile
  (setf use-el-reader t))
- Sets the variable =use-el-reader= to be true, but only for the current buffer (i.e. file).
- The advice around =read= honors this variable.
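How a buffer-local switch behaves can be sketched with a stand-in variable. `my-use-reader` is hypothetical; whether el-reader declares `use-el-reader` with `defvar-local` is an assumption not confirmed here.

```elisp
(defvar-local my-use-reader nil
  "Hypothetical stand-in for `use-el-reader'.")

(defvar my-demo-result
  (with-temp-buffer
    (setq my-use-reader t)              ; affects this buffer only
    (list my-use-reader                 ; value seen inside the buffer
          (default-value 'my-use-reader)))) ; value seen everywhere else
;; my-demo-result is now (t nil)
```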
(defun el-reader//ht (&rest args)
  "Create and return a hashtable.
Keys and values are given alternating in args."
  (let ((h (make-hash-table)))
    (cl-loop for (key value) on args by #'cddr
             do (if (and key value) (puthash key value h)
                  (error "Odd number of arguments passed")))
    h))
(defun el-reader//read-hash-table (stream _char)
  (cl-values
   (let ((k-v (el-reader/read-delimited-list ?\} stream t)))
     (if (= (mod (length k-v) 2) 1)
         (error "Invalid syntax: {}")
       (apply #'el-reader//ht k-v)))))
(el-reader/set-macro-character ?\{ #'el-reader//read-hash-table)
(cl-multiple-value-bind (fun _term)
    (el-reader/get-macro-character ?\))
  (el-reader/set-macro-character ?\} fun))
{:foo "foo" :bar 5}
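For comparison, the stock Emacs reader already has a (more verbose) read syntax for hash tables. The sketch below uses only built-in functions and does not involve el-reader; `my-demo-table` is a made-up name.

```elisp
;; The built-in #s(hash-table ...) syntax, read without el-reader.
(defvar my-demo-table
  (car (read-from-string "#s(hash-table data (:foo \"foo\" :bar 5))")))

(gethash :foo my-demo-table)       ; => "foo"
(gethash :bar my-demo-table)       ; => 5
(hash-table-count my-demo-table)   ; => 2
```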
- set-macro-character
- get-macro-character
- make-dispatch-macro-character
- set-dispatch-macro-character
- get-dispatch-macro-character
- copy-readtable
- getch
- peek-char
- read
- read-preserving-whitespace
- The syntax type of a character is now directly settable.
- Same for traits.
- Mapping from characters to numbers can be manipulated.
- Read macro procedures must manually wrap the return value in a list.
- No package support.
- No support for fractions.
- Non-terminating dispatching macro character (‘#’)
- No case conversion.
- Slight name and signature differences
- getch
- peek-char
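The wrapping of return values follows from how cl-lib emulates multiple values: `cl-values` simply builds a list, and `cl-multiple-value-bind` takes it apart again. A minimal sketch; `my-read-macro-result` is a hypothetical helper, not part of el-reader.

```elisp
(require 'cl-lib)

(defun my-read-macro-result (object)
  "Wrap OBJECT the way a read macro function would return it."
  (cl-values object))

(my-read-macro-result 'foo)            ; => (foo)

;; The caller unpacks the list again.
(cl-multiple-value-bind (value) (my-read-macro-result 'foo)
  value)                               ; => foo

;; No values at all is just the empty list.
(cl-values)                            ; => nil
```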
- Not all constructs can be read yet.
- Because of this, no compilation.
- Rewrite in C as part of Emacs.
- Write a C module.
- As editing tools are not aware of el-reader, operations on custom syntaxes often do not work. This is already true of Common Lisp code.