Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Iterator::array_chunks (take N+1) #100026

Merged
merged 12 commits into from
Aug 14, 2022
Merged

Conversation

WaffleLapkin
Copy link
Member

A revival of #92393.

r? @Mark-Simulacrum
cc @rossmacarthur @scottmcm @the8472

I've tried to address most of the review comments on the previous attempt. The only thing I didn't address is try_fold implementation, I've left the "custom" one for now, not sure what exactly should it use.

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Aug 1, 2022
@rustbot
Copy link
Collaborator

rustbot commented Aug 1, 2022

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 1, 2022
@rust-log-analyzer

This comment has been minimized.

@the8472
Copy link
Member

the8472 commented Aug 1, 2022

Note that iter.next_chunk() exists now (#98326). And I expect these things to optimize poorly by default, so we'll need specialized implementations (as in #98553) for all sources that can yield chunks more efficiently and then plumb that through adapters that don't alter lengths like Map, Copied, Take etc. otherwise one won't get the autovectorized code one might expect.

I'm not sure if this the right approach and some design work is needed (#81615).

@scottmcm
Copy link
Member

scottmcm commented Aug 1, 2022

+1 that this should just use https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.next_chunk -- conveniently, that should mean that this PR gets substantially simpler, since the unsafe can live over there instead of here.

@scottmcm scottmcm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 1, 2022
Comment on lines 113 to 122
let mut acc = init;
let mut iter = self.iter.by_ref().rev();

// NB remainder is handled by `next_back_remainder`, so
// `next_chunk` can't return `Err` with non-empty remainder
// (assuming correct `I as ExactSizeIterator` impl).
while let Ok(mut chunk) = iter.next_chunk() {
chunk.reverse();
acc = f(acc, chunk)?
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by_ref AND a double-reverse? 😬 that must be horrendously slow.

Imo we either need a next_chunk_back or only limit it to forward iteration. The former could be done in a separate PR, but we need to decide on an approach.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry you had to see this impl 😅

@WaffleLapkin
Copy link
Member Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 5, 2022
self.next_back_remainder();

let mut acc = init;
let mut iter = self.iter.by_ref().rev();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: You have a Sized iterator here (since it's a field), so you can avoid some of the by_ref optimization penalty by using https://github.com/rust-lang/rust/blob/master/library/core/src/iter/adapters/by_ref_sized.rs instead of by_ref().

@Mark-Simulacrum
Copy link
Member

r? @scottmcm

Self: Sized,
F: FnMut(B, Self::Item) -> B,
{
self.try_fold(init, |acc, x| NeverShortCircuit(f(acc, x))).0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid making a closure type generic over I too:

Suggested change
self.try_fold(init, |acc, x| NeverShortCircuit(f(acc, x))).0
self.try_fold(init, NeverShortCircuit::wrap_mut_2(f)).0

(and the same in rfold)

Copy link
Member

@scottmcm scottmcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few comments, but overall I think this is essentially fine for nightly. (Lots of open questions for stabilization, but that's ok.)

Can you please open a tracking issue, add it to the unstable attributes, and address my other comments?

@scottmcm scottmcm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 10, 2022
// Take the last `rem` elements out of `self.iter`.
let mut remainder =
// SAFETY: `unwrap_err` always succeeds because x % N < N for all x.
unsafe { self.iter.by_ref().rev().take(rem).next_chunk().unwrap_err_unchecked() };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I'll block the PR on this, but unwrap_err_unchecked here scares me -- I added a note to the tracking issue.

My instinct for now is that %N is definitely trustworthy, so that combined with take is sufficient to say this is ok. But I'm still afraid, since it's not TrustedLen, and wonder if someone can come up with an evil example here where it somehow gets a full chunk and thus is UB in safe code.

(At least the obvious things, like a len() that always says usize, is protected against because of the modulo.)

Copy link
Member

@the8472 the8472 Aug 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already noted in another comment I don't like this entire method. Either we should add next_chunk_back or remove the DEI impl.

Copy link
Member Author

@WaffleLapkin WaffleLapkin Aug 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, an evil impl can return less than rem elements, but actual <= rem < N still holds (I don't think there is any way to trick Take into returning more than rem elements) so the chunk will be an error no matter what.

An evil impl can just return a wrong length, so after this iterator returns a number of elements not divisible by N. But this is fine too, an Err will just be ignored:

// NB remainder is handled by `next_back_remainder`, so
// `next_chunk` can't return `Err` with non-empty remainder
// (assuming correct `I as ExactSizeIterator` impl).
while let Ok(mut chunk) = iter.next_chunk() {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That being said, as @the8472 already said, we should probably rewrite the DEI impl anyway (I'll try to do this, after this lands).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the % is certainly fine. What I'm not yet convinced of is that there's no way to make take misbehave somehow -- for example, Take::try_fold uses the inner try_fold, so maybe there'd be a way for that to be implemented wrongly-but-not-UB that could result in too many things getting put in the array somehow.

But I agree that just making a nicer implementation without the unsafe{} is the right plan.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually you are right 😨

Overriding try_fold with some nasty unsafe to skip over Break(_) allows you to trick Take into returning more elements that it should (playground).

n can underflow, if a bad iterator impl skips over Break(_):

*n -= 1;
let r = fold(acc, x);
if *n == 0 { ControlFlow::Break(r) } else { ControlFlow::from_try(r) }

The fact that unsafe code can't trust Take is scary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem there is still in the unsafe in Misbehave -- it protects against double-drop, but duplicating arbitrary values can violate safety invariants in general. (For example, if the type is !Clone + !Default, I can use you moving it into me to track resource consumption, like a ZST tracker for a global semaphore.) So that specific example is just "well unsound unsafe-using code breaks everything".

But yeah, even though I've not been able to come up with a safe trick that would do it, things along those lines are what make me worried about it. (And certainly if specialization existed then it would be possible to do particularly weird things in safe code because it can violate parametricity even worse than in normal code.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example could check, that type_id::<B>() == type_id::<usize>() or something similar, which would IMO make the code sound.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought about that one, but TypeId::of::<B>() needs B: 'static, which we don't have here. Regardless, it's yet another chink in the armour that makes be scared we'll break through eventually.

@scottmcm
Copy link
Member

Running over this again I noticed one thing I'm wondering about, but I don't think it needs to block going to nightly. Thanks for pushing this through!

@bors r+

@bors
Copy link
Contributor

bors commented Aug 13, 2022

📌 Commit 5fbcde1 has been approved by scottmcm

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 13, 2022
@the8472
Copy link
Member

the8472 commented Aug 13, 2022

(take N+1)

Clearly a bug! :D

Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Aug 13, 2022
Add `Iterator::array_chunks` (take N+1)

A revival of rust-lang#92393.

r? `@Mark-Simulacrum`
cc `@rossmacarthur` `@scottmcm` `@the8472`

I've tried to address most of the review comments on the previous attempt. The only thing I didn't address is `try_fold` implementation, I've left the "custom" one for now, not sure what exactly should it use.
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 14, 2022
Rollup of 6 pull requests

Successful merges:

 - rust-lang#99582 (Delay a span bug if we see ty/const generic params during writeback)
 - rust-lang#99861 (orphan check: rationalize our handling of constants)
 - rust-lang#100026 (Add `Iterator::array_chunks` (take N+1))
 - rust-lang#100115 (Suggest removing `let` if `const let` or `let const` is used)
 - rust-lang#100126 (rustc_target: Update some old naming around self contained linking)
 - rust-lang#100487 (`assert_{inhabited,zero_valid,uninit_valid}` intrinsics are safe)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 482a6ea into rust-lang:master Aug 14, 2022
@rustbot rustbot added this to the 1.65.0 milestone Aug 14, 2022
@WaffleLapkin WaffleLapkin deleted the array-chunks branch August 14, 2022 19:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants