richard-lyman edited this page Sep 18, 2012 · 98 revisions

Potentially close-to-done (future-valid) Amotoen grammar for edn (in edn):

add the concept that a match must succeed

#com.lithinos/amotoen {
:Edn [:_* (+ (| [#com.lithinos/amotoen-regex "^(?=#)" :Dispatched] :Element :Comment)) :_*]
:Dispatched (#com.lithinos/amotoen-must (| :Discard :TaggedElement [#com.lithinos/amotoen-regex "^(?=#\{)" :Set]))
:TaggedElement [\# (| :ProvidedTag :LimitedSymbol) :_ :Element]
  :ProvidedTag (| "inst" "uuid")
:Element (| :DelimitedElement :SpacedElement)
:DelimitedElement (| :List :Vector :Map :Set)
:SpacedElement [:NonDelimitedElement (| :RequiredSpace :OptionalSpaceFromDelimiter [:_* :$])]
  :NonDelimitedElement (| :Nil :Boolean :String :Character :Symbol :Keyword :Float :Int)
  :RequiredSpace #com.lithinos/amotoen-regex "^[ \t\n\r]++(?!(?>[{\([])|(?>#\{))" ; Still feels wrong
  :OptionalSpaceFromDelimiter #com.lithinos/amotoen-regex "^[ \t\n\r]*+(?=(?>[{\([])|(?>#\{))"
:Comment [\; (* (% \newline)) \newline]
:Discard ["#_" :_* :Edn]
:List       (|  [\( :_* \)]  [\( :_* :Edn            :_* \)]  )
:Vector     (|  [\[ :_* \]]  [\[ :_* :Edn            :_* \]]  )
:Set    [\# (|  [\{ :_* \}]  [\{ :_* :Edn            :_* \}]  )]
:Map        (|  [\{ :_* \}]  [\{ :_* (+ [:Edn :Edn]) :_* \}]  )
:Nil "nil"
:Boolean (| "true" "false")
                                   ; Poor Github... there is a not-this-bad-looking regex under here...
:String #com.lithinos/amotoen-regex "^\"(?>\\\\|\\\"|(?>\P{M}\p{M}*)(?<!\"))++\""
:Character [\\ (| :WordedCharacter :UnicodeGrapheme)]
  :WordedCharacter #com.lithinos/amotoen-regex "^(?>tab|space|newline|return)"
  :UnicodeGrapheme #com.lithinos/amotoen-regex "^\P{M}\p{M}*"
:LimitedSymbol [#com.lithinos/amotoen-regex "^[a-zA-Z]" :SymbolOtherChars (? [\/ :SymbolOtherChars])]
:Symbol (| \/ [:SymbolPartOrAll (? [\/ :SymbolOtherChars])])
:SymbolPartOrAll (| :DashedSymbol :DottedSymbol :PlusedSymbol :VanillaSymbol)
  :DashedSymbol [\- :VanillaSymbol]
  :DottedSymbol [\. :VanillaSymbol]
  :PlusedSymbol [\+ :VanillaSymbol]
  :VanillaSymbol [:SymbolFirstChar :SymbolOtherChars]
    :SymbolFirstChar #com.lithinos/amotoen-regex "^[^0-9:#](?<!\/)"
    :SymbolOtherChars #com.lithinos/amotoen-regex "^(?>[a-zA-Z0-9:*+!_?.#$&%=-](?<!\/))*+"
:Keyword [\: :LimitedSymbol]
:Int [:BasicInt (? \N)]
  :BasicInt #com.lithinos/amotoen-regex "^[+-]?(?>0|[1-9][0-9]*)"
:Float (| [:BasicInt \M] [:BasicInt :Frac :Exp] [:BasicInt :Exp] [:BasicInt :Frac])
  :Frac #com.lithinos/amotoen-regex "^\.[0-9]+" ; digits may start with 0, e.g. 1.05
  :Exp #com.lithinos/amotoen-regex "^[eE][+-]?[0-9]+"
:_* (* :Whitespace)
:_ [:Whitespace :_*]
  :Whitespace (| \space \tab \newline \return \,)
}
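
The :UnicodeGrapheme rule relies on \P{M}\p{M}*: one non-mark code point followed by any number of combining marks. The sketch below shows what that pattern consumes, assuming the grammar's regexes follow java.util.regex syntax (an assumption; the page does not say which engine Amotoen uses). The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GraphemeSketch {
    // Same pattern as :UnicodeGrapheme, escaped for a Java string literal.
    static final Pattern GRAPHEME = Pattern.compile("^\\P{M}\\p{M}*");

    // Returns the leading base character plus its combining marks,
    // or null if the input does not start with a non-mark character.
    static String firstGrapheme(String input) {
        Matcher m = GRAPHEME.matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        // "e" + COMBINING ACUTE ACCENT, followed by "x"
        String first = firstGrapheme("e\u0301x");
        System.out.println(first.length()); // 2: the base letter and its accent
    }
}
```

An input that begins with a combining mark yields null here, because the pattern insists on a base character first.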

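Whitespace after a non-delimited element is only required when the text that follows does not open a delimited element, that is, when the text after the whitespace does not match #"^(?>[{\([])|(?>#\{)". For example, foo[1] needs no space between the symbol and the vector, but foo bar does.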
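
The whitespace rule can be sketched in Java, using patterns modelled on :RequiredSpace and :OptionalSpaceFromDelimiter. The escaping is adapted for java.util.regex, which is an assumption about the engine; the class and method names are hypothetical, and the end-of-input alternative of :SpacedElement is not covered.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpacingSketch {
    // One or more whitespace chars NOT followed by an opening delimiter.
    static final Pattern REQUIRED_SPACE =
        Pattern.compile("^[ \\t\\n\\r]++(?!(?>[{(\\[])|(?>#\\{))");
    // Zero or more whitespace chars that ARE followed by an opening delimiter.
    static final Pattern OPTIONAL_SPACE_FROM_DELIMITER =
        Pattern.compile("^[ \\t\\n\\r]*+(?=(?>[{(\\[])|(?>#\\{))");

    // Number of chars accepted as a separator at the start of `rest`
    // (the input right after a non-delimited element), or -1 if neither matches.
    static int separatorLength(String rest) {
        Matcher required = REQUIRED_SPACE.matcher(rest);
        if (required.lookingAt()) return required.end();
        Matcher optional = OPTIONAL_SPACE_FROM_DELIMITER.matcher(rest);
        if (optional.lookingAt()) return optional.end();
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(separatorLength(" bar"));  // 1: space before a non-delimiter
        System.out.println(separatorLength("[1 2]")); // 0: no space needed before "["
        System.out.println(separatorLength(" #{1}")); // 1: space allowed before "#{"
        System.out.println(separatorLength("bar"));   // -1: nothing separates the elements
    }
}
```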

Tag handlers must be one-in, one-out: each handler receives exactly one element and returns exactly one result. The result is not required to be edn.
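
One way to picture the tag handler constraint is a table of single-argument functions. This is only an illustrative sketch: the handler table, the applyTag helper, and the choice of java.time.Instant and java.util.UUID as result types are assumptions, not part of Amotoen or the grammar above.

```java
import java.time.Instant;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;

public class TagHandlerSketch {
    // Hypothetical handler table: each handler takes exactly one parsed element
    // and returns exactly one value, which may be any object (not necessarily edn).
    static final Map<String, Function<Object, Object>> HANDLERS = Map.of(
        "inst", element -> Instant.parse((String) element),
        "uuid", element -> UUID.fromString((String) element)
    );

    // Applies the handler registered for `tag` to the single tagged element.
    static Object applyTag(String tag, Object element) {
        Function<Object, Object> handler = HANDLERS.get(tag);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for tag: " + tag);
        }
        return handler.apply(element);
    }

    public static void main(String[] args) {
        Object id = applyTag("uuid", "f81d4fae-7dec-11d0-a765-00a0c91e6bf6");
        Object when = applyTag("inst", "1985-04-12T23:20:50.52Z");
        System.out.println(id.getClass().getSimpleName());   // UUID
        System.out.println(when.getClass().getSimpleName()); // Instant
    }
}
```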

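Whether .1 is a valid number is still open. The grammar above rejects it, since every :Float alternative starts with :BasicInt.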
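
The question about .1 can be made concrete with a small Java sketch. The integer and fraction patterns here are assumed shapes (optional sign, then 0 or a digit run not starting with 0; a dot followed by one or more digits), and the relaxed variant is hypothetical. It does not settle whether edn should allow .1; it only shows what each shape accepts.

```java
import java.util.regex.Pattern;

public class NumberSketch {
    // Assumed integer shape: optional sign, then "0" or a run not starting with 0.
    static final String INT = "[+-]?(?>0|[1-9][0-9]*)";
    // Assumed fraction shape: a dot followed by one or more digits.
    static final String FRAC = "\\.[0-9]+";

    // Float shaped like the grammar: the integer part is always required.
    static final Pattern INT_THEN_FRAC = Pattern.compile("^" + INT + FRAC + "$");
    // Hypothetical relaxed form that also accepts a bare fraction such as ".1".
    static final Pattern RELAXED = Pattern.compile("^(?:" + INT + ")?" + FRAC + "$");

    static boolean grammarAccepts(String token) {
        return INT_THEN_FRAC.matcher(token).matches();
    }

    static boolean relaxedAccepts(String token) {
        return RELAXED.matcher(token).matches();
    }

    public static void main(String[] args) {
        System.out.println(grammarAccepts("0.1")); // true
        System.out.println(grammarAccepts(".1"));  // false: no integer part
        System.out.println(relaxedAccepts(".1"));  // true
    }
}
```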
