Proposal: Use rustdoc JSON instead of parsing with syn #906
Comments
The infrastructure we use is all open-source and based on the Trustfall query engine and the If you're interested in reusing some of that work, I'd be happy to point you in the right direction and help you get started.
Is this true? We also need to know If rustdoc provides all that information, that might be an interesting approach. I'd be happy to take a patch that prototypes it assuming no regressions... cc #287 for things that would allow us to handle macros.
We generate rustdoc JSON with the
Cool! Curious, do you know how fast / slow rustdoc happens to be compared to syn? Also, thinking a bit about what cbindgen features we use in Firefox I came up with:
Rustdoc is a component of rustc, so to my knowledge it behaves very similarly: it runs build scripts, resolves dependencies, requires the code to type-check, handles incremental builds, etc. I've never benchmarked it versus syn but I'd expect it's often slower because syn isn't doing nearly as much work.
I have no experience with trustfall, but a trustfall adapter for syn seems appropriate for the cbindgen case. At a later stage, the internal data could get a trustfall adapter that would be used by the language backends.
If you're interested in giving it a shot, I'd be happy to help you get started with Trustfall and answer any questions!
I don't mind trying, but only after someone who actually maintains cbindgen says that this path is acceptable to them. I'm just a random user/programmer. =~
cbindgen only supporting macro expansion on nightly Rust is a problem for AccessKit's C bindings. In the short term, I think we'll have to manually generate our header file (still using nightly) and commit it to the repo, but obviously that's not ideal. Anything we can do to help this get solved? cc @DataTriny
I was looking into this, and trustfall looks like a super cool idea. I was looking at the rustdoc integration that appears to be used in cargo-semver-checks, and I'm stumped on two things.
I did successfully only pull out the repr C items in my codebase, so I could get that working, and I could choose structs or enums, and list the names of the types and their fields, but I couldn't get the type information that would be useful. I would like to do the sort of "copy-and-paste monomorphization" that cbindgen appears to do out of the box. @obi1kenobi could you point me in the correct direction for how to do these things?
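For readers unfamiliar with the term, here is a minimal sketch of what "copy-and-paste monomorphization" could look like: a generic struct is turned into a concrete C struct by substituting the type argument into each field. This is not cbindgen's implementation; `Field`, `c_type`, and `monomorphize` are hypothetical names, the `Name_Arg` naming scheme is an assumption, and the primitive type table is deliberately tiny.

```rust
// Illustrative only: every name below is hypothetical and not part of
// cbindgen, Trustfall, or rustdoc.

/// One field of a generic struct: its name and its type as written in Rust.
pub struct Field {
    pub name: &'static str,
    pub ty: &'static str,
}

/// Map a few Rust primitive names to C types; anything else passes through.
fn c_type(rust_ty: &str) -> &str {
    match rust_ty {
        "u8" => "uint8_t",
        "u32" => "uint32_t",
        "i32" => "int32_t",
        "f32" => "float",
        "f64" => "double",
        other => other,
    }
}

/// Emit a C struct for `name<param>` instantiated with `arg`, replacing
/// every field whose type is exactly the type parameter.
pub fn monomorphize(name: &str, param: &str, arg: &str, fields: &[Field]) -> String {
    let mut out = format!("typedef struct {}_{} {{\n", name, arg);
    for f in fields {
        // "Copy and paste": swap the parameter for the concrete argument.
        let ty = if f.ty == param { arg } else { f.ty };
        out.push_str(&format!("  {} {};\n", c_type(ty), f.name));
    }
    out.push_str(&format!("}} {}_{};\n", name, arg));
    out
}
```

Calling `monomorphize("Pair", "T", "f64", &fields)` with a `first: T` field and a `count: u32` field yields a `Pair_f64` struct with `double first;` and `uint32_t count;`. The sketch only handles a bare type parameter; nested uses such as pointers to `T` would need real type information, which is the gap discussed in this thread.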
Currently, the Trustfall adapter for rustdoc doesn't expose type information for struct fields / enum variant components / function arguments / generics etc. It can, it will just take a bunch of work to model everything properly given Rust's very expressive type system (impl trait + generics + lifetimes etc.). I'm currently looking for funding to make this happen. In the meantime, I recommend a hybrid approach: using Trustfall, output the How does that sound? Happy to help however I can, so let me know if anything is still confusing!
Yes, that is very helpful thank you! I hope you receive your funding! I think for my use case (and probably for a lot of other people's use cases) I won't need the full expressiveness of the type system (not having generics across the FFI boundary drastically simplifies things), so this is a great solution, thank you for your detailed response!
Awesome! I'd love to hear about what you build and your experience along the way, so please keep me in the loop :)
Parsing Rust code with syn is useful for macro authors, but has its limitations for whole-crate tools such as cbindgen: it has no access to type information, so it cannot properly support namespacing (and the support for macro expansion is also a bit spotty).

In an ideal world, I think cbindgen should be implemented using rustc_driver, and distributed alongside the Rust compiler in rustup (perhaps with a cargo cbindgen subcommand included); this would give it all the power of rustc, and allow it to do full semantic analysis of the code; weird quirks begone!

But I recognize that that's a huge ask from several teams, and that the Rust project may not want to take ownership of this project (which is effectively a blocker for using rustc_driver, as doing that out-of-tree / without support from upstream is a huge hassle).

Luckily, there exists something else we can use! Enter rustdoc's JSON output (documentation for the format here; it can be accessed conveniently via the rustdoc-types crate). This fully describes a crate's public API, which is more than enough for cbindgen's needs (it only needs to know all pub extern "C" fn, and their dependent types).

While the format itself is still unstable, it is already depended on in the ecosystem, most prominently by cargo-semver-checks, so I'd say there is a high likelihood that it will not outright go away without some sort of replacement [1].

So: I propose that cbindgen transitions to using this instead of syn, which would solve all of the aforementioned problems with namespacing (and likely make the implementation of cbindgen smaller, and capable of more advanced semantic type analysis in the future, for example knowledge of auto traits).

What do you think?

[1]: If cbindgen ends up going this route, we should of course state so on the tracking issue.
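To make "all pub extern "C" fn, and their dependent types" concrete, here is a small sketch of a library crate and which of its items would matter to a binding generator. The crate contents (`Point`, `point_add`, `point_norm`) are made up for illustration and do not come from cbindgen or from this thread.

```rust
// Hypothetical library crate, used only to illustrate the FFI surface.

/// Reachable from an exported function's signature, so a binding generator
/// would have to emit a C definition for it ("dependent type").
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Public and declared with the C ABI: the kind of item the proposal says
/// the generator needs to discover. A real crate would normally also mark
/// it `#[no_mangle]` so the symbol name is predictable; that attribute is
/// left out here to keep the sketch independent of the Rust edition.
pub extern "C" fn point_add(a: Point, b: Point) -> Point {
    Point { x: a.x + b.x, y: a.y + b.y }
}

/// Public, but uses the Rust ABI, so it is part of the crate's Rust API
/// and not part of the C surface.
pub fn point_norm(p: Point) -> f64 {
    (p.x * p.x + p.y * p.y).sqrt()
}
```

Under these assumptions, a generator working from a description of the public API would keep `point_add` and pull in `Point` through its signature, while skipping `point_norm`.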