Utils: Add scope config for visitors and add `for_each_expr` fn #232

xFrednet · 2023-08-29T20:54:02Z

This PR makes the Visitor trait a bit more flexible and adds the for_each_expr function, to handle most use cases with very little boilerplate code. The contains_return function is used for simple testing and should also be useful in the future.

Inspired by:

Closes: #159

Veetaha · 2023-09-01T17:52:37Z

marker_utils/src/search.rs

+/// This function is useful, for lints which suggest moving code snippets into
+/// a closure or different function. Return statements might prevent the suggested
+/// refactoring.
+pub fn contains_return<'ast>(cx: &'ast AstContext<'ast>, node: impl Traverseable<'ast, bool>) -> bool {


Did you consider making this a method on AstContext? Meaning this would be an extension trait

Not specifically for this method. IMO, it wouldn't quite fit the AstContext struct, since it's intended for driver callbacks and fetching context data. Making it a function on the traversable note seems better to me. 👍

Edit: I moved the for_each_expr into the Traversable trait. I'm not sure, if I can do that for this function, since it requires the Visitor break value to be bool

Generally speaking, I've been considering adding traits in marker_utils to expand the functionality of existing API nodes. The split between marker_api and marker_utils is not clearly defined yet. My idea was, that the marker_api crate will continue to grow with new nodes and language features, which will result in longer build times. Sadly, it's also hard to split the crate. Rustc has separated the semantic type representation into a separate crate (See: TyKind), but I find the documentation very hard to read, as important types are only defined as const generics on a trait.

So, the idea was, that marker_api will host all nodes, items, and traits needed to create lints or further tools. marker_utils should extend the API with useful functions and traits, which make the API pleasant to use. The split should be similar to Rust's core and std separation.

The problem with this plan is the high dependence on the AstContext. marker_api has access to a globally stored instance, which is accessed via the with_cx function. marker_utils currently has no equivalent. I'm considering, adding a cfg-ed function to marker_api which would allow the access for marker_utils. However, I'm a bit worried, that users will find this feature and will start to use the global instance as well.

Do you maybe have some ideas, how this could be solved?

To my mind marker_api and marker_utils should just be a single crate. As you've already mentionned it's not even obvious how you should qualify the util for the inclusion in marker_api or marker_utils.

I can't tell if compile times is such a big problem mainly because I'm used to working in a codebase that has 2000 aggregate crate dependencies 📦. What I can tell is that there is no benefit for the compile times to have a separate marker_utils crate if you are going to use that one because compiling it requires waiting for marker_api to compile in the first place.

For sure it is theoretically possible that there could be crates that depend on marker_api only and they could be compiled in parallel with marker_utils (for example if someone out there creates another set of utils for marker_api), but does it woth it? I'm not sure.

To me this could be solved by merging marker_utils into marker_api and using cargo features. For example

marker_api = { version = "0.3.0", features = ["extra-sugar-please"] }

For example, syn follows this approach. This also removes the need of having extension traits, since all the code resides in the same crate and everything could be an inherent impl. There is now also no question about exposing cfgd private functions.

But don't get me wrong, I don't think that having an unstable cargo feature to expose a private doc(hidden) function is a bad idea. I think even just having it doc(hidden) and pub without a cfg(feature) is still enough to make people acknowledge that this barn door doesn't open in their direction.

I can't tell if compile times is such a big problem mainly because I'm used to working in a codebase that has 2000 aggregate crate dependencies 📦. What I can tell is that there is no benefit for the compile times to have a separate marker_utils crate if you are going to use that one because compiling it requires waiting for marker_api to compile in the first place.

There is actually some benefit. The compilation happens in multiple stages. The frontend first collects all the information etc and then passes the info to the backend for machine/byte code generation. Other dependence can already start compiling at this point, since all required information has been collected at this point. Only the final linking part has to wait on all crates. That's one reason why splitting crates is such a common recommendation, to speed up compile times.

I'll think a bit more about this. Something is still bothering me about merging these crates into one, but I don't quite know if this gut feeling is rational based on some experience or if it's just an architecture change 🤔

I'm not sure I understand how you image the compilation of marker_api and marker_utils could happen in parallel.
I'm used to think that once you have a dependency in Cargo.toml then your crate won't even start compiling before the object file for the dependency is produced, or maybe some small preparation stages may be done, but the bulk of the compiling of the dependency still blocks the compilation of your crate.

For example, there's been a debate about not having a derive feature in serde and depending on serde_derive directly to make them both be compiled in parallel. See the build timings report in this issue (unstable build timings feature in cargo).

Offtopic

David Tolnay agreed that not having a derive feature would improve the parallelism of compilation, but this way the version of serde_derive could go desynchronized with serde crate, but a workaround for that problem was found

I'm not sure I understand how you image the compilation of marker_api and marker_utils could happen in parallel.

My understanding is, that the blue parts of the compilation have to happen in sequence, while the purple parts can overlap. This is what I read in the past and seems to also be supported by the linked discussion. (See also endler.dev, just the first related blog post, with decent sources) Though in serde's case, the purple part is relatively insignificant and not the main point of the discussion.

Here is the timing report for compiling marker_uilints:

cargo clean cargo build -p marker_uilints --timings

rn the speedup is insignificant. My worry is mostly concerned with the future. With the goal of stability, it will be difficult to remove anything. This could pile up over time

I see, I didn't know about that, I suppose the problem of parallelism applies mainly to proc macros

I've reached out to @flip1995 to ask him if he remembers why the clippy_utils split was done and if it delivered on the hoped benefits.

I haven't read the entire thread, I'm just here answering the question about the split:

So clippy_utils was always "split" in Clippy from the rest of the code. However, back in the day it was pretty much just a mod.rs file inside of clippy_lints/src/utils/, together with a few other files.

The original request came from the https://github.com/trailofbits/dylint maintainers, that wanted to use the util functions in Clippy. Since we didn't want to publish clippy_lints for people to rely on, the suggestion of just splitting out clippy_utils came up. Since our utils grew more and more back then (the mod.rs file grew way larger than 1 kLoC), splitting them out into their own crate didn't seem like a bad idea. So this was the decision everyone could live with the best: dylint could reuse all of our utils functions, without having to compile our lints, and the Clippy repo had a nice split between shared utils and lint implementations.

We're still not publishing the clippy_utils crate on crates.io, because it is way too unstable.

Not sure if a split makes sense for marker though. Are the utils useful as a standalone lib, without the main part of marker? If not, pulling them together might make sense, if it makes development easier. All other questions are more stylistic and how you want to structure the project.

marker_utils/src/visitor.rs

xFrednet · 2023-09-02T10:54:06Z

Btw. I just noticed that I misspelled every Traversable 😓

Co-authored-by: Veetaha <gersoh3@gmail.com>

xFrednet added 2 commits August 29, 2023 17:33

Utils: Add visitor scope setting

cd28934

Utils: Add for_each_expr and contains_return utils

85cd7fa

xFrednet added C-enhancement Category: New feature or request S-waiting-on-review Status: Awaiting review A-marker-utils Area: Utility functions usually provided in `marker_utils` or `marker_api` labels Aug 29, 2023

xFrednet added this to the v0.3.0 milestone Aug 29, 2023

xFrednet force-pushed the 159-search-dog branch from bacf61d to 9b17703 Compare August 29, 2023 21:15

Chore: Make Clippy happy

bc3a2c7

xFrednet force-pushed the 159-search-dog branch from 9b17703 to bc3a2c7 Compare August 30, 2023 10:10

Veetaha approved these changes Sep 1, 2023

View reviewed changes

Chore: Address PR review <3

cda7b7c

Co-authored-by: Veetaha <gersoh3@gmail.com>

xFrednet force-pushed the 159-search-dog branch from 798cc2f to cda7b7c Compare September 2, 2023 11:11

Veetaha approved these changes Sep 2, 2023

View reviewed changes

xFrednet added this pull request to the merge queue Sep 2, 2023

Merged via the queue into rust-marker:master with commit 62e0aa6 Sep 2, 2023

xFrednet deleted the 159-search-dog branch September 4, 2023 16:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Utils: Add scope config for visitors and add `for_each_expr` fn #232

Utils: Add scope config for visitors and add `for_each_expr` fn #232

xFrednet commented Aug 29, 2023

Veetaha Sep 1, 2023

xFrednet Sep 2, 2023 •

edited

Loading

Veetaha Sep 2, 2023 •

edited

Loading

xFrednet Sep 2, 2023

Veetaha Sep 2, 2023 •

edited

Loading

xFrednet Sep 2, 2023

xFrednet Sep 2, 2023 •

edited

Loading

Veetaha Sep 2, 2023

xFrednet Sep 2, 2023

flip1995 Sep 4, 2023

xFrednet commented Sep 2, 2023

Utils: Add scope config for visitors and add for_each_expr fn #232

Utils: Add scope config for visitors and add for_each_expr fn #232

Conversation

xFrednet commented Aug 29, 2023

Veetaha Sep 1, 2023

Choose a reason for hiding this comment

xFrednet Sep 2, 2023 • edited Loading

Choose a reason for hiding this comment

Veetaha Sep 2, 2023 • edited Loading

Choose a reason for hiding this comment

xFrednet Sep 2, 2023

Choose a reason for hiding this comment

Veetaha Sep 2, 2023 • edited Loading

Choose a reason for hiding this comment

xFrednet Sep 2, 2023

Choose a reason for hiding this comment

xFrednet Sep 2, 2023 • edited Loading

Choose a reason for hiding this comment

Veetaha Sep 2, 2023

Choose a reason for hiding this comment

xFrednet Sep 2, 2023

Choose a reason for hiding this comment

flip1995 Sep 4, 2023

Choose a reason for hiding this comment

xFrednet commented Sep 2, 2023

Utils: Add scope config for visitors and add `for_each_expr` fn #232

Utils: Add scope config for visitors and add `for_each_expr` fn #232

xFrednet Sep 2, 2023 •

edited

Loading

Veetaha Sep 2, 2023 •

edited

Loading

Veetaha Sep 2, 2023 •

edited

Loading

xFrednet Sep 2, 2023 •

edited

Loading