Provide reservedWord parser? #334

Closed
Anrock opened this issue Nov 26, 2018 · 5 comments

Anrock commented Nov 26, 2018

Hi, I'm mainly using Megaparsec for toy languages and was wondering why it doesn't provide some kind of parser for reserved words. I think it would fit nicely into the lexer module.

The tutorial even provides an example implementation:

rword :: String -> Parser ()
rword w = (lexeme . try) (string w *> notFollowedBy alphaNumChar)

rws :: [String] -- list of reserved words
rws = ["if","then","else","while","do","skip","true","false","not","and","or"]

Can we get that into the library so I can stop copy-pasting it everywhere?

Owner

mrkkrp commented Nov 26, 2018

Well, the amount of code you copy-paste is not that big, and different languages have different rules about what counts as an identifier. So even if we add it, it would have to take at least two parameters, and at that point it's easier to simply "inline" them and have your own definition. My opinion is that this is too specific to a given language and won't help much if we add it to the library.
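For illustration, here is a minimal sketch of what such a two-parameter version might look like. The name `reservedWord`, the space consumer `sc`, and the identifier-character rule are assumptions made up for this example; none of them are part of Megaparsec.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (empty, (<|>))
import Data.Text (Text)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (alphaNumChar, char, space1, string)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void Text

-- Hypothetical helper (not a Megaparsec function): the caller supplies
-- the space consumer and the parser for characters that may continue
-- an identifier, since both differ from language to language.
reservedWord
  :: Parser ()    -- space consumer
  -> Parser Char  -- characters that may continue an identifier
  -> Text         -- the reserved word itself
  -> Parser ()
reservedWord spc identChar w =
  L.lexeme spc (try (string w *> notFollowedBy identChar))

-- Assumed space consumer: whitespace only, no comments.
sc :: Parser ()
sc = L.space space1 empty empty

-- A language whose identifiers may also contain '_' and '\''.
rword :: Text -> Parser ()
rword = reservedWord sc (alphaNumChar <|> char '_' <|> char '\'')
```

With these definitions, `rword "if"` accepts `"if  "` but rejects `"if_x"`, because `_` is treated as an identifier character in this assumed language.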

Author

Anrock commented Nov 26, 2018

@mrkkrp thanks for the quick response. However, I have to disagree: your point applies to any function in Text.Megaparsec.Lexer. All of them take more than two parameters, (arguably) all of them are shorter than the proposed parser, and some of them are more or less language-specific.
So I don't see why the proposed function doesn't fit here.

Collaborator

TikhonJelvis commented Nov 26, 2018

Is your rword significantly different from symbol from one of the Lexer modules? That's what a colleague and I used in a recent project:

symbol :: Text -> Parser Text
symbol = L.symbol whitespace

This is what keywords looked like:

definition = do
  void $ symbol "type"
  typeName <- fullName
  void $ symbol "="
  ...

The only place we ended up using a list of reserved words was in parsing identifiers:

identifier = do
  ident <- identifierToken
  when (ident `elem` reservedWords) $
    fail $ "Keyword " <> Text.unpack ident <> " cannot be an identifier."
  pure ident

(If we were more responsible, we'd use a set instead of a list here >.>)
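A set-based version of that check could look roughly like the sketch below. The definitions of `whitespace`, `identifierToken`, and the contents of `reservedWords` are assumptions filled in to make the example self-contained; the project's real definitions are not shown in this thread. The outer `try` is also an addition, so that a rejected keyword does not consume input.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (empty)
import Control.Monad (when)
import Data.Set (Set)
import qualified Data.Set as Set
import Data.Text (Text)
import qualified Data.Text as Text
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (alphaNumChar, letterChar, space1)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void Text

-- Assumed space consumer: whitespace only, no comments.
whitespace :: Parser ()
whitespace = L.space space1 empty empty

-- Assumed keyword set; membership tests are O(log n) instead of O(n).
reservedWords :: Set Text
reservedWords = Set.fromList ["type", "if", "then", "else"]

-- Assumed identifier shape: a letter followed by letters or digits.
identifierToken :: Parser Text
identifierToken = L.lexeme whitespace $
  Text.pack <$> ((:) <$> letterChar <*> many alphaNumChar)

-- Parse a full identifier first, then reject it if it is a keyword.
identifier :: Parser Text
identifier = try $ do
  ident <- identifierToken
  when (ident `Set.member` reservedWords) $
    fail $ "Keyword " <> Text.unpack ident <> " cannot be an identifier."
  pure ident
```

Because the whole identifier is read before the lookup, a name such as `typeName` is accepted even though it starts with the keyword `type`.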

I'm not sure this is the best approach, but it worked well for us and did not require us to write much extra code.

Author

Anrock commented Nov 26, 2018

@TikhonJelvis

> Is your rword significantly different from symbol from one of the Lexer modules?

Not really. symbol "reserved" would successfully parse "reservedabc", while rword would fail because of the unexpected "abc" piece.
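The difference can be checked with a small sketch like the one below. The space consumer `sc` and the helper `accepts` are assumptions introduced only for this example; `symbol` and `rword` follow the definitions quoted earlier in the thread, adapted to `Text` input.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (empty)
import Data.Text (Text)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (alphaNumChar, space1, string)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void Text

-- Assumed space consumer: whitespace only, no comments.
sc :: Parser ()
sc = L.space space1 empty empty

-- Matches the given text and skips trailing whitespace.
symbol :: Text -> Parser Text
symbol = L.symbol sc

-- Same, but additionally requires that no alphanumeric character follows.
rword :: Text -> Parser ()
rword w = (L.lexeme sc . try) (string w *> notFollowedBy alphaNumChar)

-- Hypothetical helper: True when the parser succeeds on a prefix of the input.
accepts :: Parser a -> Text -> Bool
accepts p = either (const False) (const True) . parse p "<example>"
```

Here `accepts (symbol "reserved") "reservedabc"` is `True`, while `accepts (rword "reserved") "reservedabc"` is `False` and `accepts (rword "reserved") "reserved abc"` is `True`.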

If I understand correctly, you and your colleague are mixing lexing and parsing together, while I prefer to separate these phases: write a dumb lexer to get [Token] first, and then write a parser that operates on those Tokens instead of raw Chars. I find this approach easier overall (at least for my toy languages): a trivial lexer and an easy parser instead of a single phase of medium difficulty.

So basically rword would improve the Lexer module for my use case by providing a common combinator, just like the already provided lexeme, space, and others.
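As a rough sketch of the lexing phase described above, assuming a tiny language with three keywords, identifiers, and integer literals: the type `Tok` (named that way to avoid clashing with Megaparsec's own `Token` type family), the keyword list, and the helper names are all hypothetical. The second phase, a parser over `[Tok]`, is not shown.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative (empty)
import Data.Text (Text)
import qualified Data.Text as Text
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (alphaNumChar, letterChar, space1, string)
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void Text

-- Hypothetical token type produced by the lexing phase.
data Tok = TKeyword Text | TIdent Text | TNumber Integer
  deriving (Eq, Show)

-- Assumed keyword list for this toy language.
keywords :: [Text]
keywords = ["if", "then", "else"]

-- Assumed space consumer: whitespace only, no comments.
sc :: Parser ()
sc = L.space space1 empty empty

-- One token; keywords are tried first and must not be followed by
-- an alphanumeric character, which is the rword idea from this thread.
tok :: Parser Tok
tok = L.lexeme sc $ choice
  [ TKeyword <$> choice (map keyword keywords)
  , TIdent . Text.pack <$> ((:) <$> letterChar <*> many alphaNumChar)
  , TNumber <$> L.decimal
  ]
  where
    keyword w = try (string w <* notFollowedBy alphaNumChar)

-- The whole input as a flat token list.
lexer :: Parser [Tok]
lexer = sc *> many tok <* eof
```

Running `parseMaybe lexer "if x1 then 42 else ifx"` yields `Just [TKeyword "if",TIdent "x1",TKeyword "then",TNumber 42,TKeyword "else",TIdent "ifx"]`; note that `ifx` is lexed as an identifier, not as the keyword `if` followed by `x`.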

Owner

mrkkrp commented Mar 4, 2019

I'm going to close this one. I'm not a fan of this new function, but if you absolutely want it, I think we could merge a PR implementing it.

@mrkkrp mrkkrp closed this as completed Mar 4, 2019