unnecessary_join lint #8579
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Manishearth (or someone else) soon. Please see the contribution instructions for more information. |
r? @flip1995 |
@flip1995 this is the same PR only with a squashed commit |
Thanks! @bors r+ |
@bors r+ Seems like bors lost track of this PR. I re-synced bors, let's hope it works this time. |
📌 Commit b60a7fb has been approved by |
☀️ Test successful - checks-action_dev_test, checks-action_remark_test, checks-action_test |
Great first contribution! I noticed today that I never asked first-time contributors about our dev experience. So: How was your experience as a first time contributor to Clippy? Did you find the docs helpful? How was your experience with our tooling? Is there anything you found in the docs/tooling that was unclear? Any other things we could improve on? It would be great if you could give me/us a bit of feedback on (some of) those questions. You can also write me on Zulip/Discord/email, if you prefer. However, don't feel obligated to give us feedback, if you don't have time / don't want to :) |
Thank you @flip1995! It was a good experience! The docs were helpful for getting the project working and working with VSCode / Clion, and Here's a list of some things I had to gather from context, some might be due to being relatively new to Rust / compilers (some of these might be in the docs and I missed them):
General suggestions:
Hopefully this helps! I had fun contributing and this is mostly a list of what came to mind and the items aren't in a particular order or were all necessarily large hurdles. I might be able to expand on the list later |
Thanks for the summary! I will include some of your review once I finally finish the Clippy book. I copied your comment to #6628 to not forget about it. I can give you some answers now though:
I guess we should monitor and respond to issues better. I guess we don't have the capacity to triage our issues as well as I would like.
Common Tools for writing lints should contain common operations for lints. "Common lints" is basically looking at existing lints. |
You're welcome @flip1995! Thanks for the answers!
In what context would it be included?
Didn't mean this as an accusation, more in the direction of possibly creating an index of sorts of existing lints (rather than just a list of names, with which items they relate to etc.) and possibly a playground to check if a piece of code is linted using all rules. The Rust playground has Clippy but I'm not sure it turns everything on by default. Edit: The Clippy lint list seems to work with keywords like "iterator" so this might already be covered
Not sure I saw this when working on the lint, will take a deeper look. Regarding common lints - You can look at existing code, but having a few examples that are indexed by what they're doing would probably be helpful (rather than trying to find lints which do what you want) (the previous point might tie into this using the indexed list of lints). |
I plan to rewrite some of our dev documentation to make it easier to find things. I will use your feedback as pointers for where I might want to start with that.
No worries, I didn't take it that way. We have the Lint documentation where searching for some keywords like "join" would show you that such a lint didn't exist. Maybe we could improve the search, but I'm not sure how to further improve that.
If you find things are missing, that you feel would have been useful there, feel free to open a PR adding those or leave a comment in #6628 |
Sounds good!
I just noticed that that works and edited the original, that does cover the general use case! |
@flip1995 something else I ran into today and might be worth considering: It seems like the |
Thanks for pointing this out! We have that kind of on the radar and will also improve on that with the Clippy book. (If only the day had more than 24 hours, I would've already finished that 😐) It's done by adding |
Yeah the first comment on that question had the answer (which has eerily similar wording to your comment here haha): I ran into https://reddit.com/r/rust/comments/a4wblu/how_to_configure_clippy_to_be_as_annoying_as/ a while back when looking up how to add more lint categories (this isn't my comment), but thought to share that with you since it may indicate a certain lack of visibility for non-default lints.
Not at all meant as criticism! Open source is hard, often thankless work done for free and from good intentions, my point is to bring new information to help the project rather than imply anything |
Thanks a lot for this, that really helps, even though I can't act on it right away! I definitely do not take your comments as ill-intended. I just have a 10 hour work day behind me and don't have the patience to put more time into formulating my replies better. 😅 Thanks again for your feedback. That will really help! 🚀 |
That's completely fine, just wanted to make sure I'm not sending across the wrong message. Have a great weekend! |
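For readers wondering how a lint outside the default set gets switched on at all, here is a minimal sketch of one general mechanism: a lint-level attribute naming the lint. This is not necessarily what the comments above referred to. The function `initials` is a hypothetical example, and the assumption is that the code is checked with `cargo clippy` (plain `rustc` ignores `clippy::` lint names).

```rust
// Assumption: item-level opt-in for a single Clippy lint. The crate-wide form
// would be the inner attribute `#![warn(clippy::unnecessary_join)]` at the top
// of main.rs or lib.rs.
#[warn(clippy::unnecessary_join)]
fn initials(names: &[&str]) -> String {
    // Hypothetical example body: collects into a Vec<String> and joins with "",
    // which is the pattern this PR's lint is meant to detect.
    names
        .iter()
        .filter_map(|name| name.chars().next())
        .map(|c| c.to_string())
        .collect::<Vec<String>>()
        .join("")
}

fn main() {
    println!("{}", initials(&["Ada", "Grace"]));
}
```

The same attribute syntax accepts a lint group name instead of a single lint, which is how whole non-default categories are usually enabled.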
I found this via discu.eu - very useful lint indeed. May I ask: This seems to only be applicable if the
iter.collect::<Vec<String>>().join(", ");
// or
iter.intersperse(", ").collect::<String>()
I like to believe that the second version is more performant, but who am I to just guess this? Either way: Maybe there's potential for another lint? |
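Since `Iterator::intersperse` was only available on nightly at the time of this thread, a stable-Rust sketch of the same idea may help frame the question. `join_via_vec` and `join_streaming` are hypothetical helper names, not part of Clippy or the standard library, and nothing here says which variant is faster.

```rust
/// Joins by first collecting into an intermediate Vec<String>.
fn join_via_vec<I: Iterator<Item = String>>(iter: I, sep: &str) -> String {
    iter.collect::<Vec<String>>().join(sep)
}

/// Hypothetical helper: appends each item directly to one String,
/// inserting the separator between items, with no intermediate Vec.
fn join_streaming<I: Iterator<Item = String>>(mut iter: I, sep: &str) -> String {
    let mut out = String::new();
    if let Some(first) = iter.next() {
        out.push_str(&first);
        for item in iter {
            out.push_str(sep);
            out.push_str(&item);
        }
    }
    out
}

fn main() {
    let words = ["a", "b", "c"];
    let via_vec = join_via_vec(words.iter().map(|s| s.to_string()), ", ");
    let streaming = join_streaming(words.iter().map(|s| s.to_string()), ", ");
    // Both approaches produce the same string.
    assert_eq!(via_vec, streaming);
    println!("{}", streaming);
}
```

Whether skipping the intermediate `Vec` actually pays off would need measuring, as the following comments do.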
Interesting! You may be onto something. I'm not next to a computer right now but I would benchmark the two using a few different variations to check. Notice that Edit: regarding your question, the current version supports |
Ah, I didn't even notice this. Maybe though even some other iterator-chaining is more performant. I will also do some benchmarks (will be a nice learning experience) and post them here. |
So writing the benchmark was less complex than I thought. Maybe I did something wrong, because I have never written benchmarks for Rust code before. Here's the playground, it is implemented on nightly. I get varying results though:
On my machine, the "manual join" implementation (the one with |
Ah yes, a bigger problem yields better visibility in the results:
with |
The first result might be due to loop unrolling, does it work differently with |
Might be, but a quick test (using |
I'll be able to check this later on but we did run into cases where the compiler unrolls the manual join but not the collect join, makes sense to me that this is the case here |
@matthiasbeyer I get correct results:
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn criterion_benchmark(criterion: &mut Criterion) {
    let mut benchmark_group = criterion.benchmark_group("unnecessary_join");
    benchmark_group.bench_function("collect and join", |bencher| {
        bencher.iter(|| {
            black_box(["1", "2", "3", "4", "5"])
                .into_iter()
                .map(String::from)
                .collect::<Vec<String>>()
                .join("")
        })
    });
    benchmark_group.bench_function("collect only", |bencher| {
        bencher.iter(|| {
            black_box(["1", "2", "3", "4", "5"])
                .into_iter()
                .map(String::from)
                .collect::<String>()
        })
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
|
@matthiasbeyer with the other benchmarks: (MacBook Pro (14-inch, 2021), 8 core M1 Pro, 16 GB RAM)
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn criterion_benchmark(criterion: &mut Criterion) {
    let mut benchmark_group = criterion.benchmark_group("unnecessary_join");
    benchmark_group.bench_function("array_into_iter/collect_and_join", |bencher| {
        bencher.iter(|| {
            black_box(["1", "2", "3", "4", "5"])
                .into_iter()
                .map(String::from)
                .collect::<Vec<String>>()
                .join("")
        })
    });
    benchmark_group.bench_function("array_into_iter/collect_only", |bencher| {
        bencher.iter(|| {
            black_box(["1", "2", "3", "4", "5"])
                .into_iter()
                .map(String::from)
                .collect::<String>()
        })
    });
    benchmark_group.bench_function("vec_iter/collect_and_join", |bencher| {
        bencher.iter(|| {
            let vector = black_box(vec!["hello", "world"]);
            let _output = vector
                .iter()
                .map(|item| item.to_uppercase())
                .collect::<Vec<String>>()
                .join("");
        })
    });
    benchmark_group.bench_function("vec_iter/collect_only", |bencher| {
        bencher.iter(|| {
            let vector = black_box(vec!["hello", "world"]);
            let _output = vector
                .iter()
                .map(|item| item.to_uppercase())
                .collect::<String>();
        })
    });
    benchmark_group.bench_function("deno/collect_and_join", |bencher| {
        bencher.iter(|| {
            let url = black_box("https://google.com".to_owned());
            let split_specifier = url.as_str().split(':');
            let _ = split_specifier.skip(1).collect::<Vec<_>>().join("");
        })
    });
    benchmark_group.bench_function("deno/collect_only", |bencher| {
        bencher.iter(|| {
            let url = black_box("https://google.com".to_owned());
            let split_specifier = url.as_str().split(':');
            let _ = split_specifier.skip(1).collect::<String>();
        })
    });
    benchmark_group.bench_function("not_unrolled/collect_and_join", |bencher| {
        bencher.iter(|| {
            let _ = black_box(std::iter::repeat("'a, "))
                .take(10)
                .collect::<Vec<_>>()
                .join("");
        })
    });
    benchmark_group.bench_function("not_unrolled/collect_only", |bencher| {
        bencher.iter(|| {
            let _ = black_box(std::iter::repeat("'a, "))
                .take(10)
                .collect::<String>();
        })
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
The last benchmark being an example where the loop doesn't unroll when using |
It should yield a ~22% improvement in normal cases according to this benchmark |
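For anyone who wants a quick sanity check without setting up criterion, a rough std-only sketch could look like the following. It is not a substitute for the criterion runs above: a single wall-clock measurement is noisy, and `time_it` is a hypothetical helper. It does pass each result through `std::hint::black_box`, since discarding the output can let the optimizer remove the work being measured.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Variant the lint flags: builds a Vec<String>, then joins with "".
fn collect_and_join(parts: &[&str]) -> String {
    parts
        .iter()
        .map(|part| part.to_uppercase())
        .collect::<Vec<String>>()
        .join("")
}

/// Variant the lint suggests: collects straight into a String.
fn collect_only(parts: &[&str]) -> String {
    parts.iter().map(|part| part.to_uppercase()).collect::<String>()
}

/// Hypothetical helper: total wall-clock time for `iterations` calls of `f`.
fn time_it<F: FnMut() -> String>(iterations: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box keeps the optimizer from discarding the unused result.
        black_box(f());
    }
    start.elapsed()
}

fn main() {
    let parts: &[&str] = &["hello", "world"];
    // Both variants must agree before their timings are worth comparing.
    assert_eq!(collect_and_join(parts), collect_only(parts));

    let joined = time_it(100_000, || collect_and_join(black_box(parts)));
    let direct = time_it(100_000, || collect_only(black_box(parts)));
    println!("collect + join: {:?}", joined);
    println!("collect only:   {:?}", direct);
}
```

Run it with `--release`; debug builds tell you little about either variant.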
@matthiasbeyer It seems that when compiling these to
This is the assembly diff for an example that behaves like the first benchmark on x86_64: https://rust.godbolt.org/z/GKqhq5o3a @flip1995 let me know if you think we should change the wording |
Thanks for helping me understand what is going on here! 👍 |
Maybe "usually more performant" -> "might be more performant" and "in most cases" -> "sometimes". |
@flip1995 should I create a new PR? |
That would be great! |
@flip1995 I know that some people / projects use the It's sort of contrary to the description of the category but I want to make sure we don't need to do anything specific regarding those cases |
update unnecessary_join documentation
changelog: none
Updates the description of `unnecessary_join` in accordance with #8579 (comment). I've also added a line regarding differences in assembly output, please let me know if it should also make it in.
Adds a lint called `unnecessary_join` that detects cases of `.collect::<Vec<String>>().join("")` or `.collect::<Vec<_>>().join("")` on an iterator, suggesting `.collect::<String>()` instead.
Fixes: #8570
This is a reopen of #8573
changelog: add lint [`unnecessary_join`]
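As a small illustration of the description above, here is a before/after sketch of the pattern and the suggested replacement. The function names `with_join` and `with_collect` are made up for this example and do not come from the PR.

```rust
/// The pattern the lint is described as detecting: collect into a
/// Vec<String>, then join with an empty separator.
fn with_join(words: &[&str]) -> String {
    words
        .iter()
        .map(|word| word.to_uppercase())
        .collect::<Vec<String>>()
        .join("")
}

/// The suggested form: collect directly into a String.
fn with_collect(words: &[&str]) -> String {
    words.iter().map(|word| word.to_uppercase()).collect::<String>()
}

fn main() {
    let words = ["hello", "world"];
    // Both forms produce the same string; only the intermediate Vec differs.
    assert_eq!(with_join(&words), with_collect(&words));
    println!("{}", with_collect(&words));
}
```

The two forms are only interchangeable when the separator is empty, which is why the description limits the lint to `.join("")`.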