Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utils: Add scope config for visitors and add for_each_expr fn #232

Merged
merged 4 commits into from
Sep 2, 2023

Conversation

xFrednet
Copy link
Member

This PR makes the Visitor trait a bit more flexible and adds the for_each_expr function, to handle most use cases with very little boilerplate code. The contains_return function is used for simple testing and should also be useful in the future.

Inspired by:


Closes: #159

@xFrednet xFrednet added C-enhancement Category: New feature or request S-waiting-on-review Status: Awaiting review A-marker-utils Area: Utility functions usually provided in `marker_utils` or `marker_api` labels Aug 29, 2023
@xFrednet xFrednet added this to the v0.3.0 milestone Aug 29, 2023
/// This function is useful, for lints which suggest moving code snippets into
/// a closure or different function. Return statements might prevent the suggested
/// refactoring.
pub fn contains_return<'ast>(cx: &'ast AstContext<'ast>, node: impl Traverseable<'ast, bool>) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider making this a method on AstContext? Meaning this would be an extension trait

Copy link
Member Author

@xFrednet xFrednet Sep 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not specifically for this method. IMO, it wouldn't quite fit the AstContext struct, since it's intended for driver callbacks and fetching context data. Making it a function on the traversable note seems better to me. 👍

Edit: I moved the for_each_expr into the Traversable trait. I'm not sure, if I can do that for this function, since it requires the Visitor break value to be bool


Generally speaking, I've been considering adding traits in marker_utils to expand the functionality of existing API nodes. The split between marker_api and marker_utils is not clearly defined yet. My idea was, that the marker_api crate will continue to grow with new nodes and language features, which will result in longer build times. Sadly, it's also hard to split the crate. Rustc has separated the semantic type representation into a separate crate (See: TyKind), but I find the documentation very hard to read, as important types are only defined as const generics on a trait.

So, the idea was, that marker_api will host all nodes, items, and traits needed to create lints or further tools. marker_utils should extend the API with useful functions and traits, which make the API pleasant to use. The split should be similar to Rust's core and std separation.

The problem with this plan is the high dependence on the AstContext. marker_api has access to a globally stored instance, which is accessed via the with_cx function. marker_utils currently has no equivalent. I'm considering, adding a cfg-ed function to marker_api which would allow the access for marker_utils. However, I'm a bit worried, that users will find this feature and will start to use the global instance as well.

Do you maybe have some ideas, how this could be solved?

Copy link
Contributor

@Veetaha Veetaha Sep 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my mind marker_api and marker_utils should just be a single crate. As you've already mentionned it's not even obvious how you should qualify the util for the inclusion in marker_api or marker_utils.

I can't tell if compile times is such a big problem mainly because I'm used to working in a codebase that has 2000 aggregate crate dependencies 📦. What I can tell is that there is no benefit for the compile times to have a separate marker_utils crate if you are going to use that one because compiling it requires waiting for marker_api to compile in the first place.

For sure it is theoretically possible that there could be crates that depend on marker_api only and they could be compiled in parallel with marker_utils (for example if someone out there creates another set of utils for marker_api), but does it woth it? I'm not sure.

To me this could be solved by merging marker_utils into marker_api and using cargo features. For example

marker_api = { version = "0.3.0", features = ["extra-sugar-please"] }

For example, syn follows this approach. This also removes the need of having extension traits, since all the code resides in the same crate and everything could be an inherent impl. There is now also no question about exposing cfgd private functions.

But don't get me wrong, I don't think that having an unstable cargo feature to expose a private doc(hidden) function is a bad idea. I think even just having it doc(hidden) and pub without a cfg(feature) is still enough to make people acknowledge that this barn door doesn't open in their direction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell if compile times is such a big problem mainly because I'm used to working in a codebase that has 2000 aggregate crate dependencies 📦. What I can tell is that there is no benefit for the compile times to have a separate marker_utils crate if you are going to use that one because compiling it requires waiting for marker_api to compile in the first place.

There is actually some benefit. The compilation happens in multiple stages. The frontend first collects all the information etc and then passes the info to the backend for machine/byte code generation. Other dependence can already start compiling at this point, since all required information has been collected at this point. Only the final linking part has to wait on all crates. That's one reason why splitting crates is such a common recommendation, to speed up compile times.

I'll think a bit more about this. Something is still bothering me about merging these crates into one, but I don't quite know if this gut feeling is rational based on some experience or if it's just an architecture change 🤔

Copy link
Contributor

@Veetaha Veetaha Sep 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand how you image the compilation of marker_api and marker_utils could happen in parallel.
I'm used to think that once you have a dependency in Cargo.toml then your crate won't even start compiling before the object file for the dependency is produced, or maybe some small preparation stages may be done, but the bulk of the compiling of the dependency still blocks the compilation of your crate.

For example, there's been a debate about not having a derive feature in serde and depending on serde_derive directly to make them both be compiled in parallel. See the build timings report in this issue (unstable build timings feature in cargo).

Offtopic

David Tolnay agreed that not having a derive feature would improve the parallelism of compilation, but this way the version of serde_derive could go desynchronized with serde crate, but a workaround for that problem was found

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand how you image the compilation of marker_api and marker_utils could happen in parallel.

My understanding is, that the blue parts of the compilation have to happen in sequence, while the purple parts can overlap. This is what I read in the past and seems to also be supported by the linked discussion. (See also endler.dev, just the first related blog post, with decent sources) Though in serde's case, the purple part is relatively insignificant and not the main point of the discussion.

Copy link
Member Author

@xFrednet xFrednet Sep 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the timing report for compiling marker_uilints:

cargo clean
cargo build -p marker_uilints --timings

image

rn the speedup is insignificant. My worry is mostly concerned with the future. With the goal of stability, it will be difficult to remove anything. This could pile up over time

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I didn't know about that, I suppose the problem of parallelism applies mainly to proc macros

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reached out to @flip1995 to ask him if he remembers why the clippy_utils split was done and if it delivered on the hoped benefits.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't read the entire thread, I'm just here answering the question about the split:

So clippy_utils was always "split" in Clippy from the rest of the code. However, back in the day it was pretty much just a mod.rs file inside of clippy_lints/src/utils/, together with a few other files.

The original request came from the https://github.com/trailofbits/dylint maintainers, that wanted to use the util functions in Clippy. Since we didn't want to publish clippy_lints for people to rely on, the suggestion of just splitting out clippy_utils came up. Since our utils grew more and more back then (the mod.rs file grew way larger than 1 kLoC), splitting them out into their own crate didn't seem like a bad idea. So this was the decision everyone could live with the best: dylint could reuse all of our utils functions, without having to compile our lints, and the Clippy repo had a nice split between shared utils and lint implementations.

We're still not publishing the clippy_utils crate on crates.io, because it is way too unstable.

Not sure if a split makes sense for marker though. Are the utils useful as a standalone lib, without the main part of marker? If not, pulling them together might make sense, if it makes development easier. All other questions are more stylistic and how you want to structure the project.

marker_utils/src/visitor.rs Outdated Show resolved Hide resolved
marker_utils/src/visitor.rs Outdated Show resolved Hide resolved
marker_utils/src/visitor.rs Outdated Show resolved Hide resolved
@xFrednet
Copy link
Member Author

xFrednet commented Sep 2, 2023

Btw. I just noticed that I misspelled every Traversable 😓

Co-authored-by: Veetaha <gersoh3@gmail.com>
@xFrednet xFrednet added this pull request to the merge queue Sep 2, 2023
Merged via the queue into rust-marker:master with commit 62e0aa6 Sep 2, 2023
@xFrednet xFrednet deleted the 159-search-dog branch September 4, 2023 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-marker-utils Area: Utility functions usually provided in `marker_utils` or `marker_api` C-enhancement Category: New feature or request S-waiting-on-review Status: Awaiting review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a Visitor implementation to Marker
3 participants