Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse tokens, not strings #147

Merged
merged 4 commits into from
May 31, 2017
Merged

Conversation

alexcrichton
Copy link
Collaborator

This commit builds off @mystor's work to finish porting the parser to working over tokens, not strings as it was previously. The tokens here come directly from the proc_macro2 crate which is the intended-to-be-stabilized interface of proc_macro itself.

In the end this should mean that conversion from a TokenStream to a syn AST and back should be a lossless transformation in terms of contextual information like spans. This'll lead to far better error messages from the compiler and such.

This comes with quite heavy changes to the synom crate, including deletion of a number of macros that were no longer uses and aren't as useful in this "lossless" world. Comments are of course always welcome!

Closes #137
Closes #142

mystor and others added 3 commits May 24, 2017 21:12
@alexcrichton
Copy link
Collaborator Author

Ah and it's worth noting that of course parsing strings is still supported. The TokenStream type can be parsed from a string and then from there we can parse those tokens. It's also worth noting that this currently has what is likely an excessively inefficient representation of the "slice of what to parse", especially around delimited token streams. I was hoping that this could be improved before 0.12 but perhaps after this PR.

@mystor
Copy link
Collaborator

mystor commented May 31, 2017

Thanks for rebasing this @alexcrichton (And also introducing the Synom trait / rewriting most of it :S which simplifies things a lot).

Ah and it's worth noting that of course parsing strings is still supported. The TokenStream type can be parsed from a string and then from there we can parse those tokens. It's also worth noting that this currently has what is likely an excessively inefficient representation of the "slice of what to parse", especially around delimited token streams. I was hoping that this could be improved before 0.12 but perhaps after this PR.

I have an idea of how to improve this, and fix the None-delimited Sequence and operator associativity problem with Expressions, but I think I'm going to wait until this gets merged to work on it.

Unfortunately, even once proc-macro is in nightly, we'll probably need something like proc-macro2 even in nightly builds for a while, as right now you can't parse a string into a TokenStream with the real proc-macro unless you're running inside the compiler. There are a few use cases for syn right now which require being outside of the compiler, such as in build scripts. (hopefully this will be changed before the new API is stable though).

@alexcrichton alexcrichton force-pushed the new-parse branch 3 times, most recently from 9463b2b to 16bb9b0 Compare May 31, 2017 21:34
Copy link
Owner

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not going to review this thoroughly right now but based on a spot check, this is exactly what I had in mind. Merge when ready!

Any idea what the future looks like for combining adjacent spans? Ideally the += token would have a single span. As I understand it, our parsing guarantees that + and = are adjacent bytes in the input, so we could even make do with an implementation that panics when merging spans that are not adjacent.

@dtolnay
Copy link
Owner

dtolnay commented May 31, 2017

There is a circular dependency here between proc-macro2 and synom. I will make a PR to drop the dependency of proc-macro2 on synom.

@alexcrichton
Copy link
Collaborator Author

Aw shucks, I was hoping we'd have an infinite bootstrap chain where if you compile proc-macro 0.N it requires syn 0.M which requires proc-macro 0.N-1 which requires syn 0.M-1 which requires...

@alexcrichton
Copy link
Collaborator Author

Any idea what the future looks like for combining adjacent spans?

Oh I haven't thought about this much, but I think it'd be quite plausible! If proc_macro itself exposes and API for this it shouldn't be hard to take advantage. Right now tokens like <<= keep track of three spans but there's only one helper function to change to convert that all to one span.

@alexcrichton alexcrichton merged commit 105beb8 into dtolnay:master May 31, 2017
@alexcrichton alexcrichton deleted the new-parse branch May 31, 2017 23:00
@dtolnay
Copy link
Owner

dtolnay commented May 31, 2017

I was hoping we'd have an infinite bootstrap chain where if you compile proc-macro 0.N it requires syn 0.M which requires proc-macro 0.N-1 which requires syn 0.M-1 which requires...

I hope you're not serious. dtolnay/proc-macro2#4

@Arnavion
Copy link
Contributor

@mystor
Copy link
Collaborator

mystor commented May 31, 2017

@Arnavion It definitely fixes #114 once we switch to using the real proc-macro (IIRC proc-macro2 doesn't support 128-bit integers yet, but that shouldn't be too hard to change).

Not sure about the others, but I can test. I might fix some of them as part of #148.

@Arnavion
Copy link
Contributor

Arnavion commented Jun 1, 2017

Also, I'm converting one of my proc macro crates to master to try it out. Shall I file bugs for useful / necessary features that were removed (no more Eq on syn::Path, no more way to check if a syn::Lit is specifically a string literal and get the string out of it, etc) ? Or should I wait?

@mystor
Copy link
Collaborator

mystor commented Jun 1, 2017

The second is a known issue. We'll probably solve it either in proc-macro2 (if it looks like proc-macro proper is going go gain that functionality), or I'll make a separate LiteralExt crate to add the methods I want which doesn't have to be built unless you need the feature dtolnay/proc-macro2#9

@mystor
Copy link
Collaborator

mystor commented Jun 1, 2017

(we should probably have an issue for both of them on syn though - go ahead and file :))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants