RFC: remove weak pointers #1232

Gankra · 2015-07-31T17:26:07Z

Abandon support for Weak on Rc and Arc in favour of external crates that
provide the functionality. This resolves some open API questions, while making
them genuinely "pay for what you use".

Also stabilize the uniqueness stuff on them at the same time.

rendered

bluss · 2015-07-31T17:56:02Z

text/0000-rust-is-strong.md

+# Alternatives
+
+Accept the tiny overhead and stabilize Weak. This involves answering the
+question of what uniqueness means in the context of Weak pointers.


Alternative not mentioned: Provide both weakless and weakable Rc and Arc in rust. For example using default type parameters, Arc<T> = Arc<T, NoWeak> vs. Arc<T, AllowWeak>. This alternative is also pay for what you use.

I would suggest also designing this parameter to be able to configure atomicity and ref-count size, as well, if at all desired in the future, to avoid type parameter bloat.

nikomatsakis · 2015-07-31T18:20:35Z

👍 the weak support feels somewhat niche to me, though admittedly the cost is small

SimonSapin · 2015-07-31T22:40:50Z

Weak pointers are essential when reference cycles are expected, for example in a tree with parent node references. This is not all use cases, but it happens.

Yes, Rc-with-Weak can be defined outside of the standard library, but so can Rc-without-Weak. What justifies inclusion in std of one but not the other? Conversely, if one is moving out, why not the other as well? “Pay for what you use” is an argument for having both, not for where they should be.

To clarify:

I am fine with stabilizing the status quo, Rc-with-Weak only. (We have yet to hear of a use case where the extra cost is both significant and unnecessary)
I am fine with having both Rc-without-Weak and Rc-with-Weak in std, either through type parameters or in separate types.
I am fine with having neither in std, and both on crates.io. This is tricky right now since some of relevant optimizations or features (NonZero, unsafe_no_drop_flag, CoerceUnsized, assume) are not stable yet, but it would strengthen the story of “the core language is small but powerful, you can do everything in libraries”.
I don’t see a reason to have Rc-without-Weak in std and Rc-with-Weak on crates.io.

The uniqueness questions can be worked out (and are independent of inclusion or not in std):

get_mut and make_unique need to ensure there are no weak references which could be upgraded while the &mut T borrow is outstanding, causing mutable aliasing.
try_unwrap does not have this issue: when the last strong reference is dropped/consumed, any remaining weak reference is invalidated as usual.
The above shows that both possible meanings of is_unique are useful. Perhaps both should be provided, in two boolean methods with names to be ~~determined~~ bikeshedded.

metajack · 2015-07-31T23:19:17Z

👎

Do external crates already exist for this? These are used in Servo, and we were trying to move more things to libstd types that have them (Arc<Flow> instead of FlowRef).

alexcrichton · 2015-08-01T00:38:59Z

@metajack to clarify this RFC intends to move the support for weak pointers into an external crate on crates.io. Support for something like crates_io_rc::Rc<Trait> will not be stable for some time but it can be enabled with a cargo feature of the crate. Servo is already using nightly Rust anyway so I don't think this would harm Servo's usage of Rc or Arc at all (apart from switching crates).

metajack · 2015-08-01T01:43:13Z

@alexcrichton I agree with @SimonSapin that if Rc is in libstd, then Weak probably should be.

However, my issues with external crates is mostly around whether they will be created and maintained. If the answer is that the Rust team is committed to maintaining those, I agree this shouldn't impact Servo all that much.

seanmonstar · 2015-08-01T01:43:59Z

I must be missing something obvious, but I don't see why the weak_count is
needed at all. The WeakPtr can continue its drop check, checking if the
_ptr is null or post_drop, and remove the decrement completely.

On Fri, Jul 31, 2015, 5:39 PM Alex Crichton notifications@github.com
wrote:

@metajack https://github.com/metajack to clarify this RFC intends to
move the support for weak pointers into an external crate on crates.io.
Support for something like crates_io_rc::Rc will not be stable for
some time but it can be enabled with a cargo feature of the crate. Servo is
already using nightly Rust anyway so I don't think this would harm Servo's
usage of Rc or Arc at all (apart from switching crates).

—
Reply to this email directly or view it on GitHub
#1232 (comment).

Gankra · 2015-08-01T01:45:39Z

@seanmonstar you need the weak count for a weak pointer to know if there are more weak pointers -- if it thinks it's the last it will free the memory causing all other Weaks to use-after-free.

Gankra · 2015-08-01T01:47:11Z

@metajack it could certainly be a rust-lang crate.

seanmonstar · 2015-08-01T03:12:55Z

@gankro couldn't the drop fn of Weak just be:

let ptr = *self._ptr;
if !(*(&ptr as *const _ as *const *const ())).is_null() &&
    ptr as *const () as usize != mem::POST_DROP_USIZE {
    if self.strong() == 0 {
        deallocate(ptr as *mut u8, size_of_val(&*ptr),
                   align_of_val(&*ptr))
    }
}

When the last Rc is dropped, self._ptr.value can be dropped, as well as the RcBox. When a Weak is dropped, it'll see that RcBox is null/dropped.

Gankra · 2015-08-01T04:36:12Z

@seanmonstar You can make arbitrarily many weak and strong pointers. Also the strong pointers need to know if there's any weak pointers when deciding whether to free.

seanmonstar · 2015-08-01T04:54:16Z

@gankro I know you can make many. Once the pointer is null, the other destructors should just check is_null and end. In my example, the strong pointers shouldn't care if there are any weak. When count reaches 0, drop it on the floor. The Weak destructors will check that.

I may be wrongly assuming that a Weak might not be able to upgrade to an Rc, if all strongs are gone. If it can, I'd think that another alternative for this RFC is to make it instead that Weak has fn get(&self) -> Option<&T>, and thus removing any need for a weak count.

Gankra · 2015-08-01T05:12:34Z

@seanmonstar How does anyone know to free the RcBox?

Sorry to be clear, there's two distinct steps:
Dropping T, and Freeing the RcBox. You can't free the RcBox until all strongs and weaks are gone (because they have a pointer to it), but you have to drop T as soon as all the strongs are gone.

seanmonstar · 2015-08-01T05:29:46Z

In drop of Rc, if strong count becomes 0, free RcBox.

On Fri, Jul 31, 2015, 10:12 PM Alexis Beingessner notifications@github.com
wrote:

@seanmonstar https://github.com/seanmonstar How does anyone know to
free the RcBox?

—
Reply to this email directly or view it on GitHub
#1232 (comment).

Gankra · 2015-08-01T05:39:46Z

@seanmonstar This will make every Weak pointer use-after-free.

bluss · 2015-08-01T11:15:24Z

text/0000-rust-is-strong.md

+conditions regardless.
+
+Note that we have *never* received complaints about this, to my knowledge. Servo
+and rustc happily use Rc without Weak. This isn't so much a "this is tragedy"


Note: Servo uses arc::Weak.

Servo has both some data in Arcs that may have weak references, and some other data in Arcs that never do.

From the RFC it sounds like rustc and Servo don't use any weak pointers and are happy like that. I just don't want that we take decisions based on incorrect information.

Right, it’s a bit misleading. But there are also some cases where we have unused weak counters where the extra memory usage hasn’t bothered us yet.

rustc uses Weak in at least one place, as a parent pointer in the rustc_resolve custom tree of Modules.

So stop lieing in the RFC please!

Gankra · 2015-08-01T13:16:37Z

(seanmonstar and I hashed it out on IRC -- they failed to heed my warning about the implementation details being a bit confusing and looked at them anyway -- Filling drop and NonZero are a hell of a drug)

dgrunwald · 2015-08-01T16:12:41Z

The choice of which Rc to use isn't local: The whole point of Rc is shared ownership; possibly even shared between multiple modules in a program. If any those modules needs weak references; all modules must use a weak-enabled Rc. So the only time a module really can use Rc without weak reference support is if it doesn't use them itself and is sure the references are not passed to anyone who might.

So I don't think it's a good idea to get people to use Rc without weak references by default; at least until we have a good story for writing modules that are generic over the choice of smart pointer.

bluss · 2015-08-01T19:15:17Z

@dgrunwald I think that's a reasonably convincing argument for why Weak should be opt-out, not opt-in.

SimonSapin · 2015-08-01T21:25:34Z

The above [try_unwrap v.s. get_mut / make_unique] shows that both possible meanings of is_unique are useful. Perhaps both should be provided, in two boolean methods with names to be ~~determined~~ bikeshedded.

Proposal:

Leave is_unique unchanged. (Namely, strong_count() == 1 && weak_count() == 0.) A reference is not truly unique if there are other references, even weak. make_unique causes is_unique to be true.
Add Rc::can_unwrap(&Rc<T>) -> bool that says whether try_unwrap would succeed. (Namely, strong_count() == 1. Weak references should be invalidated by try_unwrap. (Can this be done race-free on Arc?)) This is “can I safely take ownership of the T value referenced by this Rc<T>?” and also “would the T value be dropped if I drop this Rc<T>?”. The docs should mention both scenarios.

(Like is_unique now, these two methods are only for Rc. On Arc, what they tell you might no longer be true by the time they return.)

nagisa · 2015-08-02T00:31:12Z

Add Rc::can_unwrap(&Rc) -> bool that says whether try_unwrap would succeed. (Namely, strong_count() == 1. Weak references should be invalidated by try_unwrap. (Can this be done race-free on Arc?)) This is “can I safely take ownership of the T value referenced by this Rc?” and also “would the T value be dropped if I drop this Rc?”. The docs should mention both scenarios.

What the Result/Option in return value cannot do that can_unwrap can? Sounds pretty low-gain addition for all warts can_unwrap has.

SimonSapin · 2015-08-02T09:53:08Z

What the Result/Option in return value cannot do that can_unwrap can? Sounds pretty low-gain addition for all warts can_unwrap has.

try_unwrap is destructive. In this code for example, I want to know whether dropping this Rc<T> would drop T without actually dropping or unwrapping it yet.

milibopp · 2015-08-02T19:09:03Z

Funny coincidence, I just opened an issue on stabilizing Weak, when @bluss pointed to this RFC. As mentioned in rust-lang/rust#27465 I use Arc with Weak in a project (as an implementation detail, not as part of the API). Not using Weak is not an option for that project.

I even tried to reimplement it outside std to bridge the phase until its stabilization, which I considered to be only a matter of time. And at that point the building blocks required weren't stable either. It appears to me that rebuilding Weak outside std is not even possible without duplicating a significant chunk of code from std.

Given all that the idea presented in this RFC does not feel right, especially as it is being considered for philosophical reasons. Altogether that sounds like premature optimization. Note that you want to remove functionality that people actually use. I disagree with the dismissal of the drawbacks in the RFC. Everybody using Weak being on nightly in my case means I am on nightly precisely because there are no Weak pointers on stable.

In my opinion we should at least obtain solid data, that there is a tangible benefit to removing it, to even consider this course of action. Of course, I admit that I have a biased perspective on this as a user of Weak pointers, but from that perspective the drawbacks clearly outweigh the performance advantages.

milibopp · 2015-08-02T19:16:19Z

Perhaps following the argument of @dgrunwald it is inadvisable to use a ref-counted pointer without weak-ref support in APIs and such. As an alternative proposal, maybe a strong-only pointer type could be added to std as a tool for microoptimization, when you are certain, that you'll never create reference cycles or expose it to the outside world. Of course the APIs should be consistent so that swapping from one to the other is as seamless as possible.

eddyb · 2015-08-03T00:22:13Z

I really like @bluss' parametrization alternative, allowing the std to expose both only-strong references and pairs of strong/weak reference types, without violating DRY or zero-cost principles.

bluss · 2015-08-03T21:18:29Z

Crates on crates.io that use Weak<T> according to a quick grep:

bacon_rajan_cc-0.1.0
carboxyl-0.1.1
crust-0.2.3
glr-parser-0.0.1
gstreamer-0.2.5
html5ever-0.2.1
html5ever_dom_sink-0.2.0
mm_video-0.1.5
oil-0.1.0
rclist-0.0.1
rctree-0.1.1
shared_slice-0.0.4
thread_isolated-0.1.0

brson · 2015-08-03T22:01:32Z

I agree that some evidence of the negative impact on space and perf would be very good to have. My feeling is that the RC we have is violating the pay-for-what-you-use principle.

I'd like to remind people that when weak pointers were originally proposed they were a completely separate RC type, but we merged them because we didn't want a proliferation of RC types. I think this was a mistake we made in the spirit of compromise and we should not have merged weak pointers at all.

huonw · 2015-08-03T22:15:36Z

On the topic of pay-for-what-you-use: C++ is generally regarded as the poster-child for this, and yet shared_ptr supports weak pointers unconditionally (AFAIK). This doesn't mean it is right for Rc/Arc to have them, but it is evidence that it's not necessarily problematic.

btw, what motivates this bias to move things out to crates.io?

Crates.io is much, much more flexible: (breaking) changes can be made independently of the language version, unlike things in std, e.g. there can be a crate foo with versions 1.0.0 and 2.0.0 which have different APIs, but both run against Rust 1.0 and, in fact, both can be maintained (its easy to continue to release 1.y.z versions even while the 2.y.z releases are the latest ones). If the functionality was in std, one can only make direct breaking changes with major version bumps to the language.

steveklabnik · 2015-08-03T22:35:01Z

shared_ptr also has to use atomics always, so that overhead is going to be more than the extra count would be, no? I guess what I mean is that shared_ptr already has lots of overhead.

huonw · 2015-08-03T22:37:37Z

Maybe we should only consider that factor on Arc, then.

eddyb · 2015-08-03T22:38:00Z

@huonw C++ also has no non-atomic ref-counted pointers, it's in a really bad spot, actually, when it comes to pay-for-what-you-use.
We could do better. The extra costs are possibly negligible for Rc, but less so for Arc.

EDIT: it appears that @steveklabnik made the same point while I was writing my comment.

huonw · 2015-08-03T22:44:50Z

it's in a really bad spot, actually, when it comes to pay-for-what-you-use.

This is kinda my point ;) C++'s core version of this abstraction isn't so great, and yet it is still the only version offered. Sure, other people will write their own weak-less/non-atomic versions, but apparently not enough to get into the standard (maybe tracking down the C++ standards committee's/boost's discussion/proposals for shared_ptr would be informative for this RFC).

ghost · 2015-08-03T23:27:42Z

I liked the suggestion by @bluss (and @eddyb's reply) so I scrambled together a quick half implementation that seems to work.

https://play.rust-lang.org/?gist=0d53766a9a224fe27fe4&version=nightly

Basically RcBox gets a generic reference count, rather than being set at 2 usize values.

I don't really understand atomic operations though, and testing for race conditions is hard, so it's probably full of holes. Not sure of the overhead in calling closures either. All of the tests from Rc and Arc passed (other than the functions I didn't implement) fwiw.

I kind of like that it's just one type this way but it seems to cause issues with type inference (e.g. let x = Rc::new(5u32) wouldn't work, but let x: Rc<u32> = Rc::new(5) did).

The RefCount trait is pretty awful though.

eddyb · 2015-08-04T03:31:19Z

@whataloadofwhat The issues with inference will be solved once #213 is ungated (if you enable the feature, you should be able to have let x = Rc::new(5u32) working right now).

Gankra · 2015-08-04T03:38:42Z

I still don't understand how the trait version is supposed to be a better api than 4 types. It's the exact same combinatoric shuffling as having a function take an enum, but at the type level. It will just make everything more complicated and require describing everything 6 times instead of 4.

bluss · 2015-08-04T12:54:13Z

@gankro It should simplify implementation by avoiding copy-pasting code. The API result should be exactly the same or better, otherwise it's not working correctly. The better part would be simpler ways to switch from one configuration to another.

pythonesque · 2015-08-04T19:26:25Z

It's hardly shocking that people aren't using Weak<T> given that it's not stabilized. I'm not in favor of this change; it's rare that I want to use Rc at all, but when I did I would almost certainly want Weak as well.

nikomatsakis · 2015-08-05T09:59:54Z

My opinion is starting to waver. There seems to be a balance between "pay for what you use" and "full featured". I think C++ goes too far with shared_ptr in that atomic reference counting is worth calling out explicitly, but it may be that weak pointers are cheap enough that they fall under the "full featured" category.

It's true that I've never personally had much use for weak pointers, but I know that Cocoa programs make use of them to represent pointers "up the tree" (and also for some specific situations, like a backpointer from delegate to UI node). We know Servo is using them. Hence, I can imagine that it comes up pretty regularly when programs grow to a certain size and rely extensively on Rc. And I guess it's one of those features that, when you need it, you really need it.

I am not sure how I feel about using generics. It's a neat trick, and it seems better than having two totally distinct types, but in the end it's still two types (or 1.5, at least), and hence carries similar downsides (e.g., the point @dgrunwald made that you may not be in a good position to choose which flavor of Rc to use). Of course, it's possible for everyone to write code generically over the flavor of Rc, but that seems unlikely to occur in practice.

Ericson2314 · 2015-08-08T01:56:23Z

I think parameters alleviates part of @dgrunwald's point, because we can write generic code on such parameters today, but not with totally separate types until HKT.

glaebhoerl · 2015-08-08T14:07:19Z

I agree with @aepsil0n that this feels like premature optimization. I would consider this proposal compelling only if there were benchmarks of heavy real-world users of Rc or Arc actually showing a significant performance improvement. Are there any heavy users which don't also use Weak?

Gankra · 2015-08-08T16:12:27Z

I'm going to propose the exact opposite of this RFC (but without is_unique).

nrc · 2015-08-09T19:25:03Z

Sorry to comment on a closed thread, but for future reference, this really needs data - someone needs to sit down and find out the cost of the extra pointer in benchmarks before we can make an informed decision here.

SimonSapin · 2015-08-09T22:32:53Z

Well, we know the cost, right? It’s size_of::<usize>() extra bytes of memory for the weak reference count per RcBox.

Whether that’s significant entirely depends on how big is T in Rc<T> and how memory constrained is a given application. And I believe that can only be anecdotal evidence.

Gankra · 2015-08-09T23:15:01Z

@SimonSapin it incurs extra branches/logic in some places. For instance when strong count hits 0 you need to branch on the weak count to decide if you also free the memory. Probably negligible -- but not free.

eddyb · 2015-08-10T05:51:36Z

Like I said before, the size is the least of your worries: the main significant cost is that of atomic operations in the case of arc::Weak - I believe it's worse now, as it has been changed to use a "locking" loop of atomic operations to plug a soundness hole.

In the case of rc::Weak, I guess that LLVM might be able to remove redundant refcount increment/decrement pairs on Rc, but the extra logic for the weak count throws it off.
But you need to inspect the generated IR, before you know whether that's an actual problem (though it's not like it could elide allocations, AFAIK LLVM doesn't currently understand our heap alloc/free functions).

Gankra · 2015-08-10T06:47:38Z

@eddyb it understands jemalloc enough to elide some useless allocations per rust-lang/rust#22159

But rust-lang/rust#24194 suggests it's maybe not great at that.

Storyyeller · 2017-02-02T06:21:17Z

Was there ever any update on this? It seems like it is impossible to build a "weak-less" Rc outside of std with the same performance because the standard Rc depends on unstable features.

comex · 2017-02-02T06:49:02Z

Weak is stable now.

But I can't think of any unstable features that would prevent implementing a custom Rc. What are you referring to?

bluss · 2017-02-02T09:27:57Z

unsized coerce and access to the memory allocator is missing (The latter could be possible to overcome with just using box, but I'm not 100% certain of the details).

NonZero is missing.

bluss · 2017-02-02T19:37:10Z

Seems like you can get most of a regular weakless Rc working on stable (code link). It's missing the unsized coerce and Option<Rc<T>> size optimization, but that's not so bad.

Storyyeller · 2017-02-02T19:51:17Z

Given that the motivation for removing the weak count is to improve performance, I think it's counter productive to remove the nullable optimization. Giving up the rest is tolerable, but IMO, NonZero is essential.

SimonSapin · 2017-02-02T20:13:12Z

I think that is an argument for NonZero outside of std rust-lang/rust#27730 (which is desirable anyway), not for weakless Rc in std.

remove weak pointers

e2a0adb

Gankra changed the title ~~remove weak pointers~~ RFC: remove weak pointers Jul 31, 2015

Gankra self-assigned this Jul 31, 2015

Gankra added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Jul 31, 2015

bluss reviewed Jul 31, 2015
View reviewed changes

bluss reviewed Aug 1, 2015
View reviewed changes

bluss mentioned this pull request Aug 2, 2015

Stabilization of weak pointers rust-lang/rust#27465

Closed

glaebhoerl mentioned this pull request Aug 8, 2015

Make Rc and Arc mem::forget safe rust-lang/rust#27174

Merged

Gankra closed this Aug 8, 2015

aturon mentioned this pull request Aug 12, 2015

Tracking issue for Rc/Arc stabilization rust-lang/rust#27718

Closed

RFC: remove weak pointers #1232

RFC: remove weak pointers #1232

Conversation

Gankra commented Jul 31, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nikomatsakis commented Jul 31, 2015

SimonSapin commented Jul 31, 2015

metajack commented Jul 31, 2015

alexcrichton commented Aug 1, 2015

metajack commented Aug 1, 2015

seanmonstar commented Aug 1, 2015

Gankra commented Aug 1, 2015

Gankra commented Aug 1, 2015

seanmonstar commented Aug 1, 2015

Gankra commented Aug 1, 2015

seanmonstar commented Aug 1, 2015

Gankra commented Aug 1, 2015

seanmonstar commented Aug 1, 2015

Gankra commented Aug 1, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Gankra commented Aug 1, 2015

dgrunwald commented Aug 1, 2015

bluss commented Aug 1, 2015

SimonSapin commented Aug 1, 2015

nagisa commented Aug 2, 2015

SimonSapin commented Aug 2, 2015

milibopp commented Aug 2, 2015

milibopp commented Aug 2, 2015

eddyb commented Aug 3, 2015

bluss commented Aug 3, 2015

brson commented Aug 3, 2015

huonw commented Aug 3, 2015

steveklabnik commented Aug 3, 2015

huonw commented Aug 3, 2015

eddyb commented Aug 3, 2015

huonw commented Aug 3, 2015

ghost commented Aug 3, 2015

eddyb commented Aug 4, 2015

Gankra commented Aug 4, 2015

bluss commented Aug 4, 2015

pythonesque commented Aug 4, 2015

nikomatsakis commented Aug 5, 2015

Ericson2314 commented Aug 8, 2015

glaebhoerl commented Aug 8, 2015

Gankra commented Aug 8, 2015

nrc commented Aug 9, 2015

SimonSapin commented Aug 9, 2015

Gankra commented Aug 9, 2015

eddyb commented Aug 10, 2015

Gankra commented Aug 10, 2015

Storyyeller commented Feb 2, 2017

comex commented Feb 2, 2017

bluss commented Feb 2, 2017 • edited Loading

bluss commented Feb 2, 2017

Storyyeller commented Feb 2, 2017

SimonSapin commented Feb 2, 2017

bluss commented Feb 2, 2017 •

edited

Loading