Add ABNF description of TOML. #236
Changes from 9 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,187 @@ | ||
;; This is an attempt to define TOML in ABNF according to the grammar defined | ||
;; in RFC 4234 (http://www.ietf.org/rfc/rfc4234.txt). | ||
|
||
;; TOML | ||
|
||
toml = expression *( newline expression ) | ||
expression = ( | ||
ws / | ||
ws comment / | ||
ws keyval ws [ comment ] / | ||
ws table ws [ comment ] | ||
) | ||
|
||
;; Newline | ||
|
||
newline = ( | ||
%x0A / ; LF | ||
%x0D.0A ; CRLF | ||
) | ||
|
||
newlines = 1*newline | ||
|
||
;; Whitespace | ||
|
||
ws = *( | ||
%x20 / ; Space | ||
%x09 ; Horizontal tab | ||
) | ||
|
||
;; Comment | ||
|
||
comment-start-symbol = %x23 ; # | ||
non-eol = %x09 / %x20-10FFFF | ||
comment = comment-start-symbol *non-eol | ||
|
||
;; Key-Value pairs | ||
|
||
keyval-sep = ws %x3D ws ; = | ||
keyval = key keyval-sep val | ||
|
||
key = unquoted-key / quoted-key | ||
unquoted-key = 1*( ALPHA / DIGIT / %x2D / %x5F ) ; A-Z / a-z / 0-9 / - / _ | ||
quoted-key = quotation-mark 1*basic-char quotation-mark ; See Basic Strings | ||
|
||
val = integer / float / string / boolean / date-time / array / inline-table | ||
|
||
;; Table | ||
|
||
table = std-table / array-table | ||
|
||
;; Standard Table | ||
|
||
std-table-open = %x5B ws ; [ Left square bracket | ||
std-table-close = ws %x5D ; ] Right square bracket | ||
table-key-sep = ws %x2E ws ; . Period | ||
|
||
std-table = std-table-open key *( table-key-sep key) std-table-close | ||
|
||
;; Array Table | ||
|
||
array-table-open = %x5B.5B ws ; [[ Double left square bracket | ||
array-table-close = ws %x5D.5D ; ]] Double right square bracket | ||
|
||
array-table = array-table-open key *( table-key-sep key) array-table-close | ||
|
||
;; Integer | ||
|
||
integer = [ minus / plus ] int | ||
minus = %x2D ; - | ||
plus = %x2B ; + | ||
digit1-9 = %x31-39 ; 1-9 | ||
underscore = %x5F ; _ | ||
int = DIGIT / digit1-9 1*( DIGIT / underscore DIGIT ) | ||
|
||
;; Float | ||
|
||
float = integer ( frac / frac exp / exp ) | ||
zero-prefixable-int = DIGIT *( DIGIT / underscore DIGIT ) | ||
frac = decimal-point zero-prefixable-int | ||
decimal-point = %x2E ; . | ||
exp = e integer | ||
e = %x65 / %x45 ; e E | ||
|
||
;; String | ||
|
||
string = basic-string / ml-basic-string / literal-string / ml-literal-string | ||
|
||
;; Basic String | ||
|
||
basic-string = quotation-mark *basic-char quotation-mark | ||
|
||
quotation-mark = %x22 ; " | ||
|
||
basic-char = basic-unescaped / escaped | ||
escaped = escape ( %x22 / ; " quotation mark U+0022 | ||
%x5C / ; \ reverse solidus U+005C | ||
%x2F / ; / solidus U+002F | ||
Why EDIT: I see that v0.4.0 had removed this rule. |
||
%x62 / ; b backspace U+0008 | ||
%x66 / ; f form feed U+000C | ||
%x6E / ; n line feed U+000A | ||
%x72 / ; r carriage return U+000D | ||
%x74 / ; t tab U+0009 | ||
%x75 4HEXDIG / ; uXXXX U+XXXX | ||
%x55 8HEXDIG ) ; UXXXXXXXX U+XXXXXXXX | ||
|
||
basic-unescaped = %x20-21 / %x23-5B / %x5D-10FFFF | ||
|
||
escape = %x5C ; \ | ||
|
||
;; Multiline Basic String | ||
|
||
ml-basic-string-delim = quotation-mark quotation-mark quotation-mark | ||
ml-basic-string = ml-basic-string-delim ml-basic-body ml-basic-string-delim | ||
ml-basic-body = *( ml-basic-char / newline / ( escape newline )) | ||
|
||
ml-basic-char = ml-basic-unescaped / escaped | ||
ml-basic-unescaped = %x20-5B / %x5D-10FFFF | ||
|
||
;; Literal String | ||
|
||
literal-string = apostraphe *literal-char apostraphe | ||
This should be spelled "apostrophe". |
||
|
||
apostraphe = %x27 ; ' Apostraphe | ||
|
||
literal-char = %x09 / %x20-26 / %x28-10FFFF | ||
|
||
;; Multiline Literal String | ||
|
||
ml-literal-string-delim = apostraphe apostraphe apostraphe | ||
ml-literal-string = ml-literal-string-delim ml-literal-body ml-literal-string-delim | ||
|
||
ml-literal-body = *( ml-literal-char / newline ) | ||
ml-literal-char = %x09 / %x20-10FFFF | ||
|
||
;; Boolean | ||
|
||
boolean = true / false | ||
true = %x74.72.75.65 ; true | ||
false = %x66.61.6C.73.65 ; false | ||
|
||
;; Datetime (as defined in RFC 3339) | ||
|
||
date-fullyear = 4DIGIT | ||
date-month = 2DIGIT ; 01-12 | ||
date-mday = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on month/year | ||
time-hour = 2DIGIT ; 00-23 | ||
time-minute = 2DIGIT ; 00-59 | ||
time-second = 2DIGIT ; 00-58, 00-59, 00-60 based on leap second rules | ||
time-secfrac = "." 1*DIGIT | ||
time-numoffset = ( "+" / "-" ) time-hour ":" time-minute | ||
time-offset = "Z" / time-numoffset | ||
It should be clarified that
Same with
Agreed. |
||
|
||
partial-time = time-hour ":" time-minute ":" time-second [time-secfrac] | ||
full-date = date-fullyear "-" date-month "-" date-mday | ||
full-time = partial-time time-offset | ||
|
||
date-time = full-date "T" full-time | ||
Since non-technical users are part of the target audience, the improved readability might be appreciated. Should I make a separate issue for this (the specification is somewhat ambiguous here)?
Yes, please make a new issue. |
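The date-time rules above constrain syntax only: each field is a fixed count of DIGIT, and the ranges appear solely in the trailing comments. A minimal Python sketch of that reading follows. The regex translation and the helper name `matches_date_time` are illustrative assumptions, not part of the PR.

```python
import re

# Hypothetical regex translation of the date-time rules in the diff.
# RFC 4234 quoted literals such as "T" and "Z" are case-insensitive,
# hence re.IGNORECASE. re.ASCII restricts \d to 0-9, matching DIGIT.
PARTIAL_TIME = r"\d{2}:\d{2}:\d{2}(?:\.\d+)?"  # partial-time with optional time-secfrac
TIME_OFFSET = r"(?:Z|[+-]\d{2}:\d{2})"         # time-offset
DATE_TIME = re.compile(
    rf"\d{{4}}-\d{{2}}-\d{{2}}T{PARTIAL_TIME}{TIME_OFFSET}",
    re.IGNORECASE | re.ASCII,
)


def matches_date_time(text: str) -> bool:
    """Return True if text matches the date-time rule as written (syntax only)."""
    return DATE_TIME.fullmatch(text) is not None
```

Under this translation, `1979-05-27T07:32:00Z` matches, a space in place of `T` does not, and a string such as `1979-13-45T25:61:61Z` also matches, because the month, day, and time ranges are stated only in comments and would need a separate semantic check.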
||
|
||
;; Array | ||
|
||
array-open = %x5B ws ; [ | ||
array-close = ws %x5D ; ] | ||
|
||
array = array-open array-values array-close | ||
|
||
array-values = [ val [ array-sep ] [ ( comment newlines) / newlines ] / | ||
val array-sep [ ( comment newlines) / newlines ] array-values ] | ||
This disallows a newline before the first element, e.g.
but allows it after:
Is that intended?
Hopefully not. Could this be fixed? (It's surprising behavior)
It also disallows newlines between values and commas. Is it ok?
@mojombo Ping...
Any news on this? |
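The asymmetry raised in this thread can be checked mechanically. Below is a rough Python sketch that translates the array rules into a regular expression. As a simplifying assumption, `val` is restricted to the integer rule, and the names `ARRAY` and `matches_array` are hypothetical.

```python
import re

# Hypothetical regex translation of the array rules in the diff.
WS = r"[ \t]*"                                   # ws
NEWLINES = r"(?:\r?\n)+"                         # newlines = 1*newline (LF or CRLF)
COMMENT = r"#[\t\x20-\U0010FFFF]*"               # comment = "#" *non-eol
TAIL = rf"(?:{COMMENT}{NEWLINES}|{NEWLINES})?"   # [ ( comment newlines ) / newlines ]
SEP = rf"{WS},{WS}"                              # array-sep
# Assumption: val is limited to integers to keep the sketch short.
INT = r"[+-]?(?:[1-9](?:_?[0-9])+|[0-9])"
# The recursive array-values rule unrolls to: zero or more "val sep tail",
# then an optional final "val [sep] tail".
VALUES = rf"(?:{INT}{SEP}{TAIL})*(?:{INT}(?:{SEP})?{TAIL})?"
ARRAY = re.compile(rf"\[{WS}{VALUES}{WS}\]")


def matches_array(text: str) -> bool:
    """Return True if text matches the array rule as written in the diff."""
    return ARRAY.fullmatch(text) is not None
```

With this translation, `matches_array("[1, 2\n]")` and `matches_array("[1,\n2]")` are true, while `matches_array("[\n1, 2]")` and `matches_array("[1\n, 2]")` are false. That agrees with both observations in the thread: no newline is accepted before the first element or between a value and its comma.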
||
|
||
array-sep = ws %x2C ws ; , Comma | ||
|
||
;; Inline Table | ||
|
||
inline-table-open = %x7B ws ; { | ||
inline-table-close = ws %x7D ; } | ||
inline-table-sep = ws %x2C ws ; , Comma | ||
|
||
inline-table = inline-table-open inline-table-keyvals inline-table-close | ||
|
||
inline-table-keyvals = [ inline-table-keyvals-non-empty ] | ||
inline-table-keyvals-non-empty = key keyval-sep val / | ||
key keyval-sep val inline-table-sep inline-table-keyvals-non-empty | ||
|
||
;; Built-in ABNF terms, reproduced here for clarity | ||
|
||
; ALPHA = %x41-5A / %x61-7A ; A-Z / a-z | ||
Why not use unicode |
||
; DIGIT = %x30-39 ; 0-9 | ||
; HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F" |
This rule will match [], which is expressly forbidden in the spec.
Using <key> in this way does not follow the spec, which makes no mention of quoted table names.
@flaviut I don't see how it will match []. The <key> rule mandates that at least 1 character be present. Also, see #283 which clarifies key names to match the rules present here.
My bad then, sorry. It seems like I read over the first <key> without noticing it.
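The conclusion of this thread, that std-table cannot match [], can be confirmed with a small Python sketch. As an assumption made for brevity, only unquoted-key is translated; quoted-key also requires at least one character (1*basic-char), so the result is the same. The names `STD_TABLE` and `matches_std_table` are hypothetical.

```python
import re

# Hypothetical regex translation of std-table, limited to unquoted keys.
WS = r"[ \t]*"                     # ws
UNQUOTED_KEY = r"[A-Za-z0-9_-]+"   # 1*( ALPHA / DIGIT / "-" / "_" )
STD_TABLE = re.compile(
    # std-table-open key *( table-key-sep key ) std-table-close
    rf"\[{WS}{UNQUOTED_KEY}(?:{WS}\.{WS}{UNQUOTED_KEY})*{WS}\]"
)


def matches_std_table(text: str) -> bool:
    """Return True if text matches std-table with unquoted keys only."""
    return STD_TABLE.fullmatch(text) is not None
```

Here `matches_std_table("[]")` and `matches_std_table("[ ]")` are false because the first key is mandatory, while `matches_std_table("[a]")` and `matches_std_table("[ a . b ]")` are true.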