Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using a more general parser tookit (nom?) instead of pest #1

Open
ncalexan opened this issue Dec 24, 2016 · 3 comments
Open

Comments

@ncalexan
Copy link

Following the discussion in mozilla/mentat#139, I expect we'll be parsing EDN and then layering on additional parsers (query parser, transaction parser, perhaps others). It would be nice if we could use the same parser generation toolkit for parsing strings and EDN token streams.

From http://dragostis.github.io/pest/pest/trait.Input.html, I see that pest's intention is to parse strings (or byte streams). (e.g., http://dragostis.github.io/pest/pest/trait.Input.html#tymethod.slice accepts a general Input but requires it to effectively be a string.)

I see that nom is more general. (Doubtless there are other, more general, PEG generators as well.) From https://github.com/Geal/nom#parser-combinators:

A parser in nom is a function which, for an input type I, an output type O, and an optional error type E, will have the following signature: fn parser(input: I) -> IResult<I, O, E>;

@ncalexan
Copy link
Author

@rnewman @grigoryk worth considering as we build our libraries.

@ncalexan
Copy link
Author

So, it appears that nom doesn't do a great job with this: see rust-bakery/nom#266, which is harder than our problem, but similar. I think nom could be bent to our purposes, but it's clear that bytes are the first-class thing and arbitrary tokens second-class.

Perhaps combine is a better option: https://docs.rs/crate/combine

@ncalexan
Copy link
Author

ncalexan commented Jan 4, 2017

Perhaps combine is a better option: https://docs.rs/crate/combine

Indeed there's a JSON parser in the combine test suite: https://github.com/Marwes/combine/blob/master/benches/json.rs. It wouldn't be too hard to turn that into an EDN parser.

A few more notes after playing with combine a bit: generally, folks migrate from nom to combine to get better error reporting. I can't speak to the differences yet.

There are some kinks when defining higher order parsers using combine; the obvious const p: ... = many(...) doesn't work. I think this is actually rooted in Rust's need to know the type and size of p statically, which doesn't play that nicely with combine in some cases. This may just be my Rust ignorance, though!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant