Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing future opam files take 2 #44

Merged
merged 6 commits into from
May 20, 2021
Merged

Conversation

dra27
Copy link
Member

@dra27 dra27 commented Apr 23, 2021

This is complete reimplementation of #43 with the following changes:

The first attempt couldn't cope with, say, opam-version: "42" < because the parsing error occurs before the Variable has been completely parsed. This revised version instead reads three tokens from the lexer and so "sniff" the opam-version before parsing starts. With this approach, there's no need to do all of the global state mucking around - we already know if the opam file is a "future" one and so any exception can instead just return the opam-version line we successfully parsed. In order to allow a future version of opam to intentionally stick with an old parser (e.g. because opam 2.2 adds no new lexing/parsing rules), as well as the opam-version item, a sentinel section of kind # is returned which opam can then use to determine that there was actually a parse error.

Finally, I thought of a devious, yet curiously actually valid, way of raising OpamLexer.Error from OpamBaseParser.main which means that opam-version in the wrong place now includes a description of the problem rather than just "parse error".

Completely reimplements the best-effort parsing mode. The previous
attempt failed if a lexing or parsing error occured before the
opam-version item had been fully parsed.

This revised version begins by reading three tokens from the lexer
directly to parse the opam-version, if it's present. It _then_ calls the
ocamlyacc parser (feeding the three tokens back to it).

If the opam-version parsed is greater than the library's internal
version, then _any_ exception causes just the parsed opam-version header
to be returned (which is sufficient for the client to display that the
file is newer than itself).

Finally, in order to permit, for example, opam 2.2 to have new fields
but no new lexer or parser, exceptions cause both the opam-version
variable to be returned and also a sentinel group with kind `#` which
can be used by the client to determine that a parsing/lexing error
occurred (and where).
@AltGr
Copy link
Member

AltGr commented May 5, 2021

Wow, quite complex but seems to do the job — I trust you have tested it in various scenarios ;)

The parsing part is LGTM, but I think you should also have a look at the printer to enforce the invariants there too: may make some mistakes much easier to spot and fix.
Also the Printer.Normalise / Printer.FullPos.Normalise I think needs to be patched to still be valid syntax with this change ?

@dra27
Copy link
Member Author

dra27 commented May 5, 2021

Oh, I hadn't checked Normalise. For the others (Preserved, etc.) I don't think anything wants doing - there's a certain of "garbage in garbage out" - so if you feed an invalid file to the printer (e.g. where you've put the opam-version: "2.1" field further down the list then it will print it. I wasn't sure if we want it to fail (or even silently correct) the file at that point?

@dra27
Copy link
Member Author

dra27 commented May 5, 2021

For the correctness, I added some additional tests for it 🙂

opam-version with a value greater than 2.1 always comes first.
@dra27
Copy link
Member Author

dra27 commented May 6, 2021

Extra commit pushed - I should concoct some tests for it, though.

@AltGr
Copy link
Member

AltGr commented May 6, 2021

Does the Normalise module outside of FullPos not need the change because it is older anyway ? Just making sure.

@dra27
Copy link
Member Author

dra27 commented May 6, 2021

🤦

@dra27
Copy link
Member Author

dra27 commented May 6, 2021

We should probably refactor the printer code completely to convert the old format to the full pos one (with dummy locations) and then print that...

@dra27
Copy link
Member Author

dra27 commented May 19, 2021

That extra commit causes all of the opamfile and items functions to raise Invalid_argument if they're passed a list of items which violate the opam-version placement rules.

I think this is ready to go now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants