Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

correctly drop Array::IntoIter #1773

Merged
merged 3 commits into from
Jan 24, 2024
Merged

correctly drop Array::IntoIter #1773

merged 3 commits into from
Jan 24, 2024

Conversation

Radvendii
Copy link
Member

Since #1683 has been open a while, and I'm not sure when I'm going to have the time to work out what's going on with Interner, I wanted to separate out the changes to Array::IntoIter.

If you called Array::into_iter(), you could leak elements in two circumstances:

  1. If you did not iterate through all elements before dropping, since they had been converted to ManuallyDrop
  2. If the Array had multiple references when the iterator was created, but the other references were dropped before the iterator was dropped, then the existence of the iterator would prevent those Rcs from dropping their contents, but then because the iterator is of type ManuallyDrop, it wouldn't drop those elements either

If you called Array::into_iter() you could leak elements in two circumstances:
1. If you did not iterate through all elements before dropping, since they had been converted to `ManuallyDrop`
2. If the Array had multiple references when the iterator was created, but the other references were dropped before the iterator was dropped, then the existence of the iterator would prevent those Rcs from dropping their contents, but then because the iterator is of type ManuallyDrop, it wouldn't drop those elements either
@github-actions github-actions bot temporarily deployed to pull request January 23, 2024 15:34 Inactive
Copy link
Member

@yannham yannham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing safety comment (but beside other remarks/questions, LGTM)

core/src/term/array.rs Outdated Show resolved Hide resolved
core/src/term/array.rs Outdated Show resolved Hide resolved
core/src/term/array.rs Outdated Show resolved Hide resolved
@yannham
Copy link
Member

yannham commented Jan 23, 2024

Revisiting this, I think like we're going through a lot of trouble for no obvious gain. IIUC, what this iterator does is to drop stuff live as it is iterating if we're the only reference to the backing array. I guess the original motivation was that you could be the only remaining live reference, but have a very small view into the array (say, the [10,15] interval in a 10k elements array).

But, I wonder if we should really care at this point about such optimizations (we can wait for the iterator be consumed or dropped). Besides, I'm more concerned with running time than with memory consumption for now. And it's not obvious (or even probably false in a lot of common cases) than dropping things in many steps is faster than dropping everything at once.

This is why I'm tempted to get rid of this complexity altogether and just a normal iterator. What do you think @Radvendii ?

Co-authored-by: Yann Hamdaoui <yann.hamdaoui@tweag.io>
@Radvendii
Copy link
Member Author

IIUC, what this iterator does is to drop stuff live as it is iterating if we're the only reference to the backing array.

I thought the point was more that if you're, say, mapping over an Array and you're the only reference, you don't have to clone all the elements, you can modify-in-place. But I'm not super confident in that understanding.

This is why I'm tempted to get rid of this complexity altogether and just a normal iterator.

Oh, hmm... I'm not opposed. What's the status of testing the real-world speed? Can we run that medium-sized code base someone provided and see if simplifying this to a normal iterator slows it down?

@github-actions github-actions bot temporarily deployed to pull request January 24, 2024 18:27 Inactive
@yannham
Copy link
Member

yannham commented Jan 24, 2024

I thought the point was more that if you're, say, mapping over an Array and you're the only reference, you don't have to clone all the elements, you can modify-in-place. But I'm not super confident in that understanding.

Like this, from the top of my head, I would say that the fact that we can consume the backing memory of the array and reuse the allocation doesn't depend on this code, as this code seems to be concerned with dropping the individual elements or not, not the enclosing array. Given that we have a hand-written iterator, I wonder if there's any chance that array.into_iter().map(f).collect() can reuse the inner array - I can't see how the compiler could do so. But I must admit I don't have a good idea of how it works for Rust stdlib structures like Vec, so let me investigate a bit. That might actually be a better performance goal (than cloning or not the individual elements).

As for individual elements, it's true that the current code avoid cloning - but honestly, cloning terms is cheap. Unless it prevents some reuse-in-place later, maybe, but at that point the iterator should have been consumed and thus the Rc count be decremented. Though, about the initial motivation, maybe you're right: ManuallyDrop might have been used just for ManuallyDrop::take in next(), more than for early dropping of terms.

Oh, hmm... I'm not opposed. What's the status of testing the real-world speed? Can we run that medium-sized code base someone provided and see if simplifying this to a normal iterator slows it down?

Yes, I have a couple sizeable real world codebases, although they have been given to me as private, I can try with and without and see if there's any noticeable difference. With the proviso that variance is always huge - I've always suspected the hashmap randomization and the fact that we stress hashmaps a lot, but have never come to actually test this hypothesis.

Thanks for your input. I think the situation isn't all black-and-white, so while waiting for further investigations on whether the current implementation is worth it, or if we could do better with respect to mutating-in-place, I don't see a good reason to block this PR which strictly improves the current situation.

@Radvendii Radvendii added this pull request to the merge queue Jan 24, 2024
Merged via the queue into master with commit 220733c Jan 24, 2024
5 checks passed
@Radvendii Radvendii deleted the fix_array_iter_leak branch January 24, 2024 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants