Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port to syn 0.15 #42

Merged
merged 7 commits into from
Sep 18, 2018
Merged

Port to syn 0.15 #42

merged 7 commits into from
Sep 18, 2018

Conversation

ogoffart
Copy link
Collaborator

Porting to newer syn is required in order to support new rust connstruct.

Fixes issue #41

syn and proc_macro2 do not allow to get the line number or the verbatim
captrued text: dtolnay/proc-macro2#110
This therefore has to redo the lexing manually to extract the relevant pieces
for which we need these information. Only the more comples parsing is still
done by syn.

This is a major change, so i took the opportunity to also run rustfmt

Porting to newer syn is required in order to support new rust connstruct.

Fixes issue #41

syn and proc_macro2 do not allow to get the line number or the verbatim
captrued text:  dtolnay/proc-macro2#110
This therefore has to redo the lexing manually to extract the relevant pieces
for which we need these information. Only the more comples parsing is still
done by syn.
@ogoffart ogoffart requested a review from mystor August 30, 2018 06:42
@ogoffart

This comment has been minimized.

For some reason, the TokenStream::to_stirng can contains \n when using nightlies
@ogoffart
Copy link
Collaborator Author

ogoffart commented Sep 5, 2018

@mystor any comments?

@mystor
Copy link
Owner

mystor commented Sep 5, 2018

Hey, sorry I haven't had much time to look at this patch yet, especially with how large it is. Ill try to look at it this weekend but that's probably the earliest I can get to it.

@mystor
Copy link
Owner

mystor commented Sep 5, 2018

I should also mention that syn 0.15 should be releasing soon (dtolnay/syn#476), and it might be worthwhile to directly port to it. The entire parser engine has been replaced.

@ogoffart
Copy link
Collaborator Author

ogoffart commented Sep 5, 2018

This means that cpp_common::parsing will need to be rewritten (again! :-))

However, I guess most of the rest of the code should not be affected, in particular the manual lexing will have to stay since proc_macro2 still does not allow to get the line information or offset.

I'll have a look at syn 0.15

@ogoffart
Copy link
Collaborator Author

ogoffart commented Sep 6, 2018

I made the port to syn 0.15: https://github.com/ogoffart/rust-cpp/pull/1 (It is based on this work)

@ogoffart ogoffart changed the title Port to syn 0.14 Port to syn 0.15 Sep 8, 2018
Copy link
Owner

@mystor mystor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just my initial review - I'm going to spend a bit of time thinking about the parsing changes & how we can make it cleaner.

cpp_build/src/lib.rs Outdated Show resolved Hide resolved
@@ -702,7 +640,6 @@ In order to provide a better error message, the build script will exit successfu
if !self.std_flag_set {
self.cc.flag_if_supported("-std=c++11");
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unnecessary whitespace change

@@ -0,0 +1,558 @@
use cpp_common::{Class, Closure, Macro, RustInvocation};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're going to have to do some level of custom parsing to keep working without a cpp fork, but I think we might be able to do it in a cleaner way. I'm probably going to take a bit to think about this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking forward for your input.

Note that strnom.rs is a slightly modified version of the equivalent code from proc_macro2 (maybe this should be commented more)

cpp_common/src/lib.rs Outdated Show resolved Hide resolved
let std_body = body
.to_string()
.chars()
.filter(|x| *x != ' ' && *x != '\n')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably use 'is_whitespace' here. It might also be nice to do this filtering inside of ClosureSig instead, and have a single entry point.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single entry point to what?

@@ -176,7 +205,8 @@ pub fn expand_internal(input: TokenStream) -> TokenStream {
let source = input.to_string();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the awesome things about the new proc_macro2 is that we don't have to do this to-string dance anymore :-) - Just call syn::parse<cpp_common::Closure>(input). Syn even provides an awesome parse_macro_input macro which even does error reporting for us! https://docs.rs/syn/0.15.3/syn/macro.parse_macro_input.html

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, i was aware of that but keep it this way because i wasn't sure what is the base way to extract the stringify! macro.

Should I parse a a syn::ItemEnum and then use visit_item_enum to find the strignify macro within it. Or should i manually try somethjing like item_enum.variants.first().value().discriminent().unwrap().1../* ??? */...mac.tts ?
Or not use ItemEnum but try to parse the thing manually?

cpp_macros/src/lib.rs Outdated Show resolved Hide resolved
.expect("rust! macro");
use std::iter::FromIterator;
use syn::parse::Parser;
let input = proc_macro2::TokenStream::from_iter(vec![closure.body].into_iter());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It's unfortunate you have to do from_iter here, but you should be able to just use a fixed size array rather than vec!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to do it.

I get this error.

error[E0271]: type mismatch resolving `<&[proc_macro2::TokenTree; 1] as std::iter::IntoIterator>::Item == proc_macro2::TokenTree`
   --> /home/rust/rust-cpp/cpp_macros/src/lib.rs:314:17
    |
314 |     let input = proc_macro2::TokenStream::from_iter(&[closure.body]);
    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected reference, found enum `proc_macro2::TokenTree`
    |
    = note: expected type `&proc_macro2::TokenTree`
               found type `proc_macro2::TokenTree`
    = note: required by `std::iter::FromIterator::from_iter`

Same if i use [closure.body].iter()

Actually. I got it worked with [closure.body].iter().map(|x|x.clone()))

let mut tokens : &str = &source[s..].trim();
let s = source
.find("stringify!(")
.expect("expected 'strignify!' token in class content") + 11;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: stringify!

cpp_macros/src/lib.rs Outdated Show resolved Hide resolved
@ogoffart
Copy link
Collaborator Author

About the extraction of stringify! from cpp_macro, i implemented both alternative using explicit de-structuring (what is currently in this pull request) or use syn::visit to find the macro: https://github.com/ogoffart/rust-cpp/commit/0a75608e3e71c16bb7149de813454901fc12b778
I'm not sure what is the best. Using visit is less code and is cleaner. Probably resist better to changes in the cpp! and cpp_class! implementation, or in the syn AST. However it is probably less efficient, as it needs to re-visit the whole tree. And maybe it'd be a problem if one can have other macro (cpp_class!(#[foo(stringify!(bar)) struct F as "F") is not currently allowed, but maybe in the future?)

Overall, i think it is pretty important to upgrade syn as currently, using any of the new language constructs in a crate that wants to use cpp_build is impossible. That means already things like impl Traitor dyn Trait, use of the recently unreserved keywords, and probably more.
The situation is quite bad, as we only get a warning from cpp_build that there was a parse error, without telling you where or what, and the rustc compilation works only that the cpp! function are not found. or just link error if it was already compiled before.

For this reason, I think it is quite important to release a new version fast with an updated syn.

Regarding the parsing done in cpp_build, I think we should not wait for proc_macro2 to have what we need (verbatim source code, and line/column info), as this can take a long time. Forking is not a good idea as we would have the same problem again.
The custom parsing code done in cpp_build is quite simple and limited to lexing. Originally, I thought to simply find occurence of the regexpt \bcpp\s*! within the source code, and then finding matching parentheses to get the end. But that last part is impossible without proper lexing, as it can be quite complicated (example: cpp!( ... { /* } */ " \" ) /*"; })).
So I imported the lexing code from proc_macro2. Change in the lexing rules might change un future rust versions, but are unlikely to change as much as the full grammar.

@mystor
Copy link
Owner

mystor commented Sep 16, 2018

I'm fine with using the visitor, and it does seem quite a bit cleaner.

I think my worry with the need for manual tokenization is largely around the large complexity of having a complete copy of the lexing code within the crate. Perhaps we could make a separate crate which just handles string lexing and is used by rust-cpp, like the cpp_synmap crate used to be?

@ogoffart
Copy link
Collaborator Author

@mystor What is the advantage to put it in a separated crate compared to a separated module?
Puting it in a different create means uploading another crate on crates.io, and also that means it needs somehow a public API and make it a full lexer.
Because right now, this is not even a full lexer. I do not do keywords, or numbers or lifetime or symbols, these are just ignored.

What exactly do you want in this crate? Just put strnom.rs in its own crate? Add skip_literal and find_delimited?

@mystor
Copy link
Owner

mystor commented Sep 18, 2018

I think I'm okay with this grossness for now, but I'll think on how to clean it up more. Merge away and I can publish new versions

@ogoffart ogoffart merged commit 595fcf1 into mystor:master Sep 18, 2018
@ogoffart
Copy link
Collaborator Author

Thank you.

@mystor
Copy link
Owner

mystor commented Sep 21, 2018

Published as cpp 0.5 https://crates.io/crates/cpp. Thanks @ogoffart :-)

@mystor mystor mentioned this pull request Sep 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants