Defining custom tokens in parser options #312
-
I was wondering whether it's possible, given a list of tokens (e.g. operators), to create a rule that tries to match from that list. For example:

```
{
  const operators = [ '**', '==', '+', '-', '*', '/' ]
}

Operator = token:$SourceCharacter+ &{ return operators.includes(token) } { return token }

SourceCharacter = .
```

This version doesn't work, though, since the parser just keeps matching characters and doesn't know when to stop. I've tried several approaches, but I'm not sure how to match a list of tokens like that when there isn't necessarily a restriction on the characters those tokens can contain (except perhaps a space, but that still leads to issues when you don't want spaces between every morphological unit of the language). The difficulty I'm having really makes me wish for metaprogramming functionality in Peggy, so I could just generate rules from a list like this. Perhaps there's something I can do with the Plugin API? But I'm not familiar enough with it to tell.
-
If you don't know which characters are not allowed in your tokens, the only possible approach is to check the match after parsing each character. It would probably be better to generate a concrete grammar instead of trying to write a generic one. |
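The check-after-each-character idea amounts to maximal munch: extend the candidate one character at a time and remember the last prefix that was itself a complete token. A plain-JavaScript sketch (the `longestToken` helper is hypothetical, not part of Peggy):

```javascript
// Maximal-munch match against a token list: extend the candidate one
// character at a time, stop once no token can start with the current
// prefix, and return the longest prefix that was a complete token.
function longestToken(input, pos, tokens) {
  let candidate = "";
  let best = null;
  for (let i = pos; i < input.length; i++) {
    candidate += input[i];
    // No token continues this prefix, so extending further is pointless.
    if (!tokens.some((t) => t.startsWith(candidate))) break;
    if (tokens.includes(candidate)) best = candidate;
  }
  return best; // null when nothing matched
}

longestToken("**=", 0, ["**", "==", "*", "="]); // → "**"
```

This is what a character-by-character semantic-predicate version of the grammar would effectively have to compute, which is why generating a concrete rule per token tends to be simpler and faster.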
-
Hi,

```
Scope
  = "public"
  / "protected"
  / "private"
```

Is there a way to write something like the following, with low performance impact?

```
Scope
  = option.kw.Public
  / option.kw.Protected
  / option.kw.Private
```

This is doable at the moment:

```
Scope = type:$SourceCharacter+ &{ return [option.kw.Public, option.kw.Protected, option.kw.Private].includes(type) }
```

By supplying the keywords through options, we can eliminate a duplicate source of truth, which means fewer places to change and thus a safer parser. |
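For reference, Peggy exposes the object passed to `parser.parse(input, options)` inside actions and predicates as the `options` variable (note the plural), so a version of this idea might look like the following sketch. The `kw` shape is hypothetical, and the character class is narrowed to `[a-z]+` so the match terminates instead of consuming every remaining character:

```
// Hypothetical sketch; call the parser as
//   parser.parse(input, { kw: { Public: "public", Protected: "protected", Private: "private" } })
Scope
  = type:$[a-z]+ &{ return [options.kw.Public, options.kw.Protected, options.kw.Private].includes(type) } { return type }
```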