monadic-parsing - Monadic parsing in Scheme

Normally boasted in Haskell, but this repository goes to show that the same can be done in Scheme.

Usage

A parser written in Haskell such as the following:

my_parser :: Parser (Char, Char)
my_parser = do a <- item
               item
               b <- item
               return (a, b)

May be converted to the following Scheme equivalent (using the doM* macro, which mimics Haskell).

(define my_parser
  (doM* (a <- item)
        item
        (b <- item)
        (return `(,a . ,b))))

We can then use our parser, without needing to worry about all the monadic magic happening underneath.

(my-parser "hello") ;; => ((#\h . #\l) . "lo") ;; "lo" was not consumed.
(my-parser "ah") ;; => ()

You'll know that your parser failed when it returns the empty list. Dive into the well-commented parsing.scm file to see operators like <:>, many, oneof, token and more, with examples included!

Included Parsers

expr: Demonstration of parsing arithmetic expressions into s-exps preserving operator precedence. (i.e. * has higher precedence than +).
- e.g. 2+(3*4), 2+3+10*8 etc.
read-num-list: Read lists of numbers such as [3,1,4,1,5] and returns them as Scheme lists (3 1 4 1 5).
- Fails nicely (i.e. returns () on malformed input like [1,2,] etc.)
bnf-syntax: Backus–Naur form parser. BNF is a notation that is used to describe context-free grammars.
lisp-expr: a reader for Lisp notation. Supports numbers, booleans, strings, nested lists and dotted lists. For instance: "((1 2) \"hello\" #t 'foo)" => ((1 2) "hello" #t foo) (don't forget to escape " in strings!).

Demo of the BNF parser below.

(bnf-syntax "<integer> ::= <digit>|<integer><digit>")
;; => ((bnf-rule integer (+++ (digit) (integer digit))) . "")

Context-Free Grammars

Because we can parse (and likely convert) BNF notation into our parser syntax, we should be able to parse all unambiguous context-free grammars. Of course, one could still write an ambiguous grammar but this parser will not return all the possible parse trees.

Consider the following grammar, written in BNF notation, that describes a language containing an equal amount of 0s and 1s, in no particular order.

<S> ::= "0" <A> | "1" <B> | ""
<A> ::= "1" <S> | "0" <A> <A>
<B> ::= "0" <S> | "1" <B> <B>

We can convert this into Scheme. Note that we annotate the list passed to return with an annotation keeping track of which rule was applied.

(define eq-01
  (<:> (doM* (char #\0)
             (xs <- 01-a)
             (return `(s1 0 ,xs)))
       (doM* (char #\1)
             (xs <- 01-b)
             (return `(s2 1 ,xs)))
       (return `s3)))
(define 01-a
  (<:> (doM* (char #\1)
             (x <- eq-01)
             (return `(a1 ,1 ,x)))
       (doM* (char #\0)
             (x <- 01-a)
             (y <- 01-a)
             (return `(a2 (0 ,x ,y))))))
(define 01-b
  (<:> (doM* (char #\0)
             (x <- eq-01)
             (return `(b1 1 ,x)))
       (doM* (char #\1)
             (x <- 01-b)
             (y <- 01-b)
             (return `(b2 1 ,x ,y)))))

The output can be hard to read. Fortunately, Guile provides a pretty-print REPL directive.

;; scheme@(guile-user)> ,pretty-print (parse 01-s "110000101001011011")
(s2 1
    (b2 1
        (b1 1 s3)
        (b1 1
            (s1 0
                (a2 (0
                     (a1 1
                         (s1 0
                             (a1 1
                                 (s1 0
                                     (a2 (0
                                          (a1 1 (s1 0 (a1 1 (s2 1 (b1 1 s3)))))
                                          (a1 1 s3)))))))
                     (a1 1 s3)))))))

Exercise: come up with an alternate context-free grammar describing this language of equal occurrences of 0s and 1s.

Papers and Resources

Most of these functions come from papers on monads and monadic parsing:

Functional Pearl: Monadic Parsing in Haskell
Monads for Functional Programming
Chapter 8 of the book Programming in Haskell by Graham Hutton.
Oleg Kiselyov's blog post on monads in Scheme

Most of the functions were re-written in Scheme, so functions like >>= (bind) and <:> (choice) which are normally infixed in Haskell must be applied prefix-style in Scheme.

There were differences in the naming of functions in these resources (e.g. return (a -> M a) is called unit in Wadler's paper). I use my own conventions here, yours may differ.

To be implemented

Support ambiguous grammars
- Done in Haskell with list comprehensions and the List monad, which means we may need to implement monad transformers.
XML parser.
Parsing BNF into Scheme/Haskell parsers!
Separate code to take advantage of the Scheme module system.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
LICENSE		LICENSE
README.md		README.md
parsing.scm		parsing.scm

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

monadic-parsing - Monadic parsing in Scheme

Usage

Included Parsers

Context-Free Grammars

Papers and Resources

To be implemented

About

Releases

Packages

Languages

License

siraben/monadic-parsing

Folders and files

Latest commit

History

Repository files navigation

monadic-parsing - Monadic parsing in Scheme

Usage

Included Parsers

Context-Free Grammars

Papers and Resources

To be implemented

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages