Guaranteed non-static destructors #1094
Edit: This attempt is unsound because an Rc can outlive its guard. See this post for a more correct Here's an initial attempt at implementing an Use it like this:
use std::rc::{Rc, RcGuard};
use std::cell::RefCell;
struct Cycle<'a> {
next: RefCell<Option<ScopedRc<'a, Cycle<'a>>>>,
val: &'a char,
}
impl<'a> Drop for Cycle<'a> {
fn drop(&mut self) {
println!("{}", self.val);
}
}
fn main() {
let a_val = 'a';
let b_val = 'b';
let c_val = 'c';
let guard = RcGuard::new();
let a = guard.new_rc(Cycle { next: RefCell::new(None), val: &a_val });
let b = guard.new_rc(Cycle { next: RefCell::new(None), val: &b_val });
let c = guard.new_rc(Cycle { next: RefCell::new(None), val: &c_val });
*a.next.borrow_mut() = Some(b.clone());
*b.next.borrow_mut() = Some(c.clone());
*c.next.borrow_mut() = Some(a.clone());
// Cycle is cleaned up when guard is dropped
} |
Here's an experiment with a
use nocyclerc::{NoCycleRc, NoCycleMarker};
struct Cycle<'a, T> {
next: Option<NoCycleRc<'a, T>>,
val: &'a char,
}
impl<'a, T> Drop for Cycle<'a, T> {
fn drop(&mut self) {
println!("{}", self.val);
}
}
fn main() {
// Rc's created with marker1 must only contain borrows that live longer than
// marker1 and must not themselves outlive marker1.
// NoCycleMarker is a zero-size type, and thus may be passed and cloned for
// free
let marker1 = NoCycleMarker::new();
let a_val = 'a';
let b_val = 'b';
let c_val = 'c';
// Error: a_val does not outlive marker1
//let a = marker1.new_rc(Cycle { next: RefCell::new(None), val: &a_val });
let marker1b = marker1.clone(); // Clones have the same lifetime
// Error: a_val does not outlive marker1b (same lifetime as marker1)
//let a = marker1b.new_rc(Cycle { next: RefCell::new(None), val: &a_val });
let marker2 = NoCycleMarker::new(); // New, shorter lifetime
// Ok: a_val and b_val outlive marker2
let mut a = marker2.new_rc(Cycle::<Cycle<()>> { next: None, val: &a_val });
let mut b = marker2.new_rc(Cycle::<()> { next: None, val: &b_val });
// Error: b doesn't outlive marker2
//nocyclerc::get_mut(&mut a).unwrap().next = Some(b.clone());
let marker3 = NoCycleMarker::new();
let mut c = marker3.new_rc(Cycle { next: None, val: &c_val });
// Ok: b outlives marker3
nocyclerc::get_mut(&mut c).unwrap().next = Some(b.clone());
} |
Guarantee that the destructor for data that contains a borrow is run before any code after the borrow’s lifetime is executed.
c816b61 to 06963fd
Panicking in Given that that panic can only be triggered in a destructor (I think?) which will end up being a process abort, it might be worth it to explicitly abort there to avoid panic bloat at deref callsites. |
@sfackler It's also possible to pass an |
with a shorter lifetime, and research the possibility of a reference-counted
type that statically disallows cycles.

Specifically, the scoped cycle collector would operate as follows:
It would really help to have an implementation, or at least the signatures and types of all the parts of this API. It's quite difficult to judge the ergonomics and such of this proposal without them.
Read the comments: that helps, some of that stuff should move into the RFC.
How could this RFC be adapted to |
Conceptually, this feels like making Rc just nullable arena pointers, since dereferencing them can panic if the RcGuard has been dropped (which is a dynamic, not static, condition). I'm not really sure yet how I feel about this, I have to mull it over some more. |
Doesn't this describe weak pointers? It's not clear to me whether under this proposal |
After further thought, allowing Rc's to outlive the guard is, in fact, unsound (you could deref and then drop the guard). However, I think I have now found a way to allow cycles while keeping the Rc objects from outliving their guard. Let me update my code to test. |
The whole idea of Not only that, but who owns the
I believe that's known to be impossible, at least not without unacceptably limiting what values can be held in the |
That's the plan, though I don't understand why it's terrible. It seems no different than RefCell::borrow_mut panicking when the cell is already borrowed. It can only ever happen if the user creates a cycle, doesn't clean it up themselves, and then tries to access the next link in the cycle in the destructor. It seems reasonable to call this a programming error and panic.
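For readers unfamiliar with the `RefCell` behaviour used as the comparison here, the following is a minimal sketch using only the standard library. It shows that a second mutable borrow is treated as a programming error: `try_borrow_mut` reports it as an `Err`, and `borrow_mut` would panic in the same situation. The function name is illustrative and not part of the RFC.

```rust
use std::cell::RefCell;

/// Returns true if a second mutable borrow is rejected while the first is live.
/// (Illustrative helper; not part of the RFC or of std.)
fn second_borrow_is_rejected() -> bool {
    let cell = RefCell::new(5);

    // Keep the first mutable borrow alive for the rest of the function.
    let _first = cell.borrow_mut();

    // `try_borrow_mut` returns `Err` here; `borrow_mut` would panic instead.
    let rejected = cell.try_borrow_mut().is_err();
    rejected
}

fn main() {
    println!("second borrow rejected: {}", second_borrow_is_rejected());
}
```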
If the contents are
In the general case, yes, but it can be useful for some cases, which is why I'm also suggesting a |
The idea is that all non-cycle
The programmer is free to use weak pointers or avoid cycles altogether to get all the benefits of reference counting. I don't expect a graph containing a bunch of cycles that stick around the entire time to be common (though it could be done). The advantage of having a guard is to ensure that if the programmer does create cycles of non- |
Specifying "potential for data revalidation on destruction" semantics for "objects with destructors and which are borrowing said data" is something I've previously discussed with @nikomatsakis.
fn attract(a: &Entity, b: &Entity) -> Vec3 {...}
let mut vec: Vec<Entity> = ...;
for a in &mut vec {
let mut accel = Vec3(0.0, 0.0, 0.0);
for b in &vec { // inside this loop, *a is immutable
accel += attract(a, b);
}
a.velocity += accel * dt;
}
Another thing to note is that data races wouldn't be possible, because But not if you can safely forget that resource. Guaranteeing non-static destructors (or the revalidation subset, at least) always run is a must if you want to extend the borrow checker to understand more safe patterns, IMHO. Which... reminds me, don't we have a cycle detector in the compiler? Can't we mix it with checking for interior mutability and non-static references and expose it as a trait? Or has that been proposed already? |
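The nested loop in this comment is rejected by today's borrow checker, because `vec` cannot be borrowed immutably while an element is borrowed mutably. As a point of comparison, here is a sketch of one way the same update is usually written in safe Rust today: compute all accelerations under shared borrows, then apply them in a second pass. The one-dimensional `Entity` and the body of `attract` are assumptions made up for this illustration, and the two-pass form is only equivalent to the loop above if `attract` does not read `velocity`.

```rust
// Hypothetical one-dimensional entity, for illustration only.
struct Entity {
    position: f64,
    velocity: f64,
}

// Placeholder force law (an assumption): pull proportional to the distance.
fn attract(a: &Entity, b: &Entity) -> f64 {
    b.position - a.position
}

fn step(entities: &mut [Entity], dt: f64) {
    // Pass 1: only shared borrows are needed to compute every acceleration.
    let accels: Vec<f64> = entities
        .iter()
        .map(|a| entities.iter().map(|b| attract(a, b)).sum::<f64>())
        .collect();

    // Pass 2: the shared borrows have ended, so mutation is allowed.
    for (entity, accel) in entities.iter_mut().zip(accels) {
        entity.velocity += accel * dt;
    }
}

fn main() {
    let mut entities = vec![
        Entity { position: 0.0, velocity: 0.0 },
        Entity { position: 2.0, velocity: 0.0 },
    ];
    step(&mut entities, 0.5);
    for e in &entities {
        println!("position {} velocity {}", e.position, e.velocity);
    }
}
```

The cost of this workaround is the temporary `Vec` of accelerations, which is the kind of overhead the proposed borrow-checker extension would avoid.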
Okay, here we go: this gist correctly enforces lifetimes and I believe it is sound. Sample usage:
use scopedrc::{ScopedRc, ScopedRcGuard};
use std::cell::{Cell, RefCell};
struct Cycle<'a> {
next: RefCell<Option<ScopedRc<'a, Cycle<'a>>>>,
val: &'a char,
}
impl<'a> Drop for Cycle<'a> {
fn drop(&mut self) {
println!("{}", self.val);
}
}
fn main() {
//let guard1 = ScopedRcGuard::new();
let a_val = 'a';
let b_val = 'b';
let c_val = 'c';
// Error: a_val does not outlive guard1
//let a = guard1.new_rc(Cycle { next: RefCell::new(None), val: &a_val });
// Error: guard2 does not outlive target
//let target;
let guard2 = ScopedRcGuard::new();
// Ok
let target;
let a = guard2.new_rc(Cycle { next: RefCell::new(None), val: &a_val });
let b = guard2.new_rc(Cycle { next: RefCell::new(None), val: &b_val });
let c = guard2.new_rc(Cycle { next: RefCell::new(None), val: &c_val });
*a.next.borrow_mut() = Some(c.clone()); // Cycle
*b.next.borrow_mut() = Some(a.clone()); // Holds cycle, but isn't one
*c.next.borrow_mut() = Some(a.clone()); // Cycle
// Error: guard2 is borrowed
//drop(guard2);
target = b;
// At end of scope, &'b' is dropped immediately. The rest are cleaned up by
// the cycle collector.
} |
Weak pointers and avoiding cycles do not combine to give you all the benefits of reference counting. The whole point of reference counting cycles is to be able to break them dynamically--you want all the references to be strong most of the time. You can replicate the functionality in this RFC in Rust already with a This RFC also adds potentially failing operations on A strong -1 for me, sorry. |
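To make the point about breaking cycles dynamically concrete, here is a small sketch using today's `std::rc::Rc` (not the guarded `Rc` from the RFC). It counts how many destructors run when a two-node cycle goes out of scope, with and without clearing one link first. The `Node` type and the `drops_after_scope` helper are illustrative names, not anything from the RFC.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

struct Node {
    next: RefCell<Option<Rc<Node>>>,
    drops: Rc<Cell<u32>>, // shared counter bumped by every destructor
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Builds a two-node cycle, optionally breaks it, and reports how many
/// destructors ran once both handles went out of scope.
fn drops_after_scope(break_cycle: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Rc::new(Node { next: RefCell::new(None), drops: Rc::clone(&drops) });
        let b = Rc::new(Node {
            next: RefCell::new(Some(Rc::clone(&a))),
            drops: Rc::clone(&drops),
        });
        *a.next.borrow_mut() = Some(Rc::clone(&b)); // a -> b -> a

        if break_cycle {
            // Dynamically break the cycle by clearing one strong link.
            a.next.borrow_mut().take();
        }
    } // both handles are dropped here
    drops.get()
}

fn main() {
    println!("cycle left intact: {} destructors ran", drops_after_scope(false));
    println!("cycle broken first: {} destructors ran", drops_after_scope(true));
}
```

With the cycle intact, neither destructor runs and the nodes are leaked, which is exactly the case the RFC's guard is meant to clean up.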
I think you misunderstand the point of this proposal. The guarded Rc proposed here can be used exactly like a normal Rc. The only difference usage-wise is that cycles get cleaned up automatically if you neglect to do it manually, allowing the presented destructor guarantee to be upheld. The destructor guarantee is the point of this proposal. Everything about Rc is to demonstrate that it can still exist and do all of the same things in a world with this guarantee.
There is absolutely nothing in this proposal that prevents you from doing that.
This just isn't true. To quote myself from Reddit: This proposal doesn't change the use or operation of reference counting in any way. The only thing it does is make sure that in the case of cycles, which are not the norm, the contained objects are still cleaned up before their lifetime ends. I want to be clear, because it seems like there is some confusion: This proposal doesn't make Rc's stay around until the guard is executed. When the strong count reaches zero, the contained object is dropped, like always. When the weak count reaches zero, the allocation is freed, like always. This proposal doesn't change that in any way. This proposal doesn't require you to know when things will be dropped, only the maximum amount of time they could live, which you have to know anyway due to Rust's lifetime system.
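The "like always" behaviour referred to here can be observed with the existing standard-library types. The sketch below uses `std::rc::Rc` and `std::rc::Weak` only; it does not use the RFC's guarded types, and the helper name is made up for the example. Once the last strong reference is dropped, the value is destroyed and a remaining `Weak` can no longer be upgraded.

```rust
use std::rc::{Rc, Weak};

/// Reports whether a weak pointer can be upgraded before and after the last
/// strong reference is dropped. (Illustrative helper name.)
fn upgrade_before_and_after_drop() -> (bool, bool) {
    let strong = Rc::new(String::from("payload"));
    let weak: Weak<String> = Rc::downgrade(&strong);

    // One strong reference exists, so the value is still alive.
    let before = weak.upgrade().is_some();

    // Dropping the last strong reference destroys the value immediately,
    // even though the weak pointer still exists.
    drop(strong);
    let after = weak.upgrade().is_some();

    (before, after)
}

fn main() {
    let (before, after) = upgrade_before_and_after_drop();
    println!("upgradable before drop: {before}, after drop: {after}");
}
```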
Rust has runtime failures all over the place. In general, Rust asks, is this error a programming/logic error or a valid runtime condition? If it's the former, the function panics or aborts. If it's the latter, the function returns a
First, I'll note that You also wouldn't have to worry about race conditions during cycle collection because all external references are guaranteed to have been dropped by then due to lifetime restrictions, so the only thread with access would be the one destructing the guard. |
Big +1. |
I'd be so happy if we could get something like this in pre-1.0 |
I'm looking at @pcwalton's comments in #1066 and trying to grasp how the competing proposals may address his concerns. It sounds like there is consensus on retaining a version of Is this a fair very high level description of the proposal: There will be a version of On the other hand, one can use another variant of I think this brings me to my question. I know everyone here wants hard guarantees about destructors being run, including me. However, I'm wondering if it would be sufficient for the compiler to issue a strong warning on today's |
@heiseda This also allows cyclic non- |
@rkjnsn @Ericson2314 In practice, how does this differ from |
@heiseda if you look to comments above (#1094 (comment)) @rkjnsn just addressed this. But to summarize:
|
I posted a similar comment on Reddit, but to summarize my current thoughts: Firstly, I would vastly prefer if this RFC took advantage of dropck (statically avoiding the possibility of using a cyclic reference within a destructor), because as it is it's going to induce overhead like bounds checking does, where it's usually not necessary for safety (except that unlike bounds checking, the compiler won't be good at eliding it). I think this should be doable by just limiting each Secondly, there is definitely going to be substantial overhead with any proposal I can think of that works with cycles in Most importantly, the requirement for a My suspicion is that rather than do this, in practice most people will just bound One might argue that the above also applies to the The third point is the biggest reason I'm against this RFC. I can't see a way around it, because allowing cycles and not having a scoped guard for cycle collection fundamentally means you're accepting leaks, and I don't want types with references to turn into second-class citizens. |
@rkjnsn that is some fantastic ergonomic work. Good job! |
I updated the RFC with the revised |
A few points:
By the way, the RFC would be easier to understand and evaluate if it had the type signatures of |
@rkjnsn, thanks for your hard work on this RFC, and everyone for the great discussion! (Comment below copied from #1085 (comment)): Of course, this needs to be settled prior to the 1.0 release next week, and the core team met yesterday to come to a final decision on the matter. This is truly a thorny problem with multiple reasonable paths to take, but in the end the analysis of the tradeoffs presented in Niko's blog post and the follow up represents the core team's consensus, which emerged through the discussion on this thread and others. As such, we are going to close this RFC, to settle the matter for 1.0. There is still discussion to be had about what to do with the |
@aturon I made a number of changes to the RFC in response to the feedback in the posts you mention, among others, and there has been no discussion here from any of the core team regarding them, or whether and why they are still insufficient. Specifically, I significantly reduced the overhead for unguarded I put a lot of effort into specifically addressing the discussions you mention as the reason for closing this, so it's very disheartening to see it closed with no further feedback. |
No, they could only be merged if they had the exact same lifetime, in which case the programmer could just use a single
I discuss this some in this comment.
I agree. The RFC calls it
There is a sample implementation and usage linked to from the RFC, but perhaps some of the details should be added to the RFC itself. |
I'm sorry, in the discussion in the core team yesterday the understanding was that these latest changes were covered by the follow-up blog posts and further replies in the thread there and elsewhere. Let me try to lay that out in more detail. In particular, the follow-up post lays out concerns about integration with external reference-counting systems, and @nikomatsakis also discusses a basic disagreement about the effective mental model around On the other side, as @nikomatsakis lays out in the blog post, all of these downsides have to be weighed against the problem the proposal is trying to solve. As @kballard nicely articulated, the use of RFC for cases like I imagine @nikomatsakis can give you some additional feedback and pointers, though he's away at the moment. Finally, while we've tried to focus this debate on the merits of various proposals in abstract, it's also worth remembering that 1.0 is slated for release in one week. As we've discussed in this thread, the blog post, and elsewhere, the fundamental problem being solved here is relatively narrow, and the potential risks of the solution potentially high. Putting all of those factors together, it seems quite imprudent to delay the release at this point, as would be needed to fully carry out and evaluate a solution like the one proposed in this RFC. |
@aturon I'm with @rkjnsn here. The RFC and implementation have seen significant changes even since the follow-up blog post. Obviously there have been few others besides us pushing for this, so I can't complain if you reject it on grounds of lack of community support. But I too feel that the reasons as given miss the mark or are out of date. I am going to address the core team's points as summarized by you in a few minutes. Ideally at this point I'd like to see both non-static rc and scoped threads made unstable for 1.0 so we can actually verify the claims made by both sides. |
I feel like the external reference counting was addressed nicely by @glaebhoerl, with whom @nikomatsakis seemed to agree. Different mental models regarding I disagree that this RFC would cause additional implementation details to leak for channel. The problem with "And there is still an overhead imposed when The goal of this RFC isn't to save As for the limited timeline, this is unfortunate, but understandable. While I believe the benefits of this RFC far outweigh the downsides, I understand the strong desire to release on schedule, and that one week is probably not enough time to fully evaluate and implement it given everyone's busy pre-1.0 schedules. Ideally, this would have come up sooner, but I'm not sure my solution would be possible without @pnkfelix's dropck work. If this RFC isn't accepted due to lack of time, what would the steps be for trying to get it included in 2.0? |
It also would be nice to have feedback from @alexcrichton, who proposed the original |
Specifically, this RFC allows both creating cycles and leaking |
@rkjnsn now that you mention 2.0, I sometimes wish Core Team would decide on a roadmap for it soon. I think there are going to be more issues, especially since Rust is going to see more usage with 1.0. Maybe a Github label like An example of such a roadmap would be:
1.0 would perhaps be the shortest major release. All others would last several years each. |
@aturon IMHO this usually is not a leaky abstraction (channels are a weird case: I'm still trying to figure out the solution). Most code that uses
|
Some solutions for channels:
|
First, I'd like to offer an apology: I did skim over the most recent changes to the RFC, but I didn't fully appreciate the work you had done to reduce cost and offer more specialization for specific scenarios. This was clever work. I'm glad you raised an objection so that we can talk it out a bit more. All that said, the newer draft of the RFC doesn't really change my final opinion. There are several factors here, but I think the root cause is this: Ultimately, this proposal is aimed at making unsafe code easier to reason about, which is a laudable goal. But that goal is achieved by making safe code more complicated, and I believe that this problem leaks up into surrounding APIs that will employ reference counting, as @pythonesque has noted. This is about more than the I have this strong feeling that simple, straight-forward reference-counting should not be unsafe, and should not require extra runtime overhead. Shared ownership is a fact of life and a very common pattern, even if less common than single ownership. We already place limitations on mutability, which is required for any form of safety, but further complicating the As far as overhead goes, I appreciate that your scheme can sometimes rule out cycles statically, but in so doing it prohibits recursive To go further, tying guarantees around leaks to a All this is not to say that I don't like your proposal. In fact, I like it quite a bit! But I don't think it's a pure win: it imposes real costs. In short, I think there are good arguments in favor of the status quo, and it has the advantage of being the devil we know. We have a lot of experience with the current system and a good notion of how well it scales. Now, the elephant in the room is obviously the 1.0 release. For the most part, I've been trying to think about this problem from first principles and decide what's best for Rust as a language. If we have to delay the release, so be it. 
But obviously we have very publicly announced that we will release 1.0 on May 15th, and if we decide we want to make changes to Finally, with regard to your point that people will write unsafe code that relies on destructors even though they shouldn't, I certainly agree that will probably happen. But I also think people are going to create leaks, even if we say that they are unsafe (along with a lot of other bugs to boot). In the end, we have to publish clear guidelines on unsafe code (working on it) and make progress on tools to help people validate their unsafe code (thinking about it, at least). |
@nikomatsakis is going to reply more fully, but I wanted to elaborate a bit on some of the points I raised as well.
I'll let @nikomatsakis follow up on this, but I think the basic point remains: with this RFC, it will simply not be possible to provide or interface to "simple" reference counting (without cycle detection or prevention) that is generic over borrowed data.
I agree that the mental model point can be reasonably argued in a few directions here, but want to be clear that in all schemes, memory unsafety can only arise through explicit use of unsafe code; this is all about the assumptions you can make in unsafe code. I believe that @kballard's point about the typical RAII pattern is a crucial one: the vast majority of the time, when RAII is used, the guard is actually a guard: you have to go through it to access the data. For such cases, experience shows that leakage poses little risk. (I've pointed to this comment before, but @nikomatsakis offers a nice analysis here). There's broad agreement that providing stronger guarantees and a stronger mental model, all else aside, is great. But it has to be balanced against the costs.
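As an illustration of the "guard is actually a guard" pattern described above, here is a sketch using `std::sync::Mutex`. The protected data is reachable only through the `MutexGuard`, so leaking the guard with `mem::forget` leaves the mutex locked forever but gives nobody unsynchronized access to the data. The helper name is made up for the example, and the check is done with `try_lock` so that it cannot deadlock.

```rust
use std::mem;
use std::sync::{Mutex, TryLockError};

/// Leaks a `MutexGuard` and reports whether the mutex is still locked afterwards.
/// (Illustrative helper name.)
fn lock_stays_held_after_forget() -> bool {
    let mutex = Mutex::new(0u32);
    let guard = mutex.lock().unwrap();

    // Leaking the guard skips its destructor, so the mutex is never unlocked.
    // This is a resource leak, not a memory-safety problem: the data behind
    // the mutex can only be reached through a guard.
    mem::forget(guard);

    let still_locked = matches!(mutex.try_lock(), Err(TryLockError::WouldBlock));
    still_locked
}

fn main() {
    println!("still locked after forget: {}", lock_stays_held_after_forget());
}
```

The contested cases in this thread are the other kind of RAII type, where safety depends on the destructor running rather than on access going through the guard.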
Whether you consider them implementation details or not, the fact is, this RFC would force the channel API to change, and would likely have impact on other APIs as well. Certainly anything that uses
The problem is with the phrase "as long as cycles aren't needed" -- it's not that simple. The proposed API requires you to "prove" the lack of cycles by means of dropck-enforced markers, but that is inherently tied to the stack structure of the program. It's not at all clear how many uses of I think it's fair to say that in this scheme, overhead will be required at least some of the time, and we don't know where the line is. Furthermore, to escape overhead, you have to program with markers, which is an additional programming burden we have little experience with. We have been pushing Rust toward lower and lower-level safe systems programming, and part of that push has been to guarantee safety without overhead. Reference counting is a basic part of that domain, and imposing overhead and/or extra programming complexity feels like a step backwards -- the benefits would need to be very strong indeed.
I'm very sympathetic to this point, but it's a global cost/benefit analysis. In my opinion, the style of RAII usage that leads to problems with leakage is fairly niche. Bugs will happen no matter what we do. So we have to weigh the likelihood and severity of a given class of bugs against the complexity cost imposed in trying to rule it out. FWIW, our experience elsewhere in the standard library over time has shown that complicating basic APIs to try to force programmers to account for corner cases is often a losing proposition. Further, the API cost here isn't fully known -- certainly
Yes, I feel that even if we could devote many people evaluating this full time, a week is far too little time to truly understand the fallout of a fundamental change to the guarantees that unsafe code must provide, or even to be sure that the proposal completely works. Frankly, I'm still worried that we have leaks elsewhere in the language or library that we haven't discovered yet. Regarding the 2.0 question, it's not at all clear what the timeline or impetus for such a release would be, given that we haven't put out our initial release yet; I don't want to speculate on that. I think we need to make a decision about 1.0 now, and then let the dust settle for a while. |
There's something I'm trying to understand about the details here. In particular, when you have a But since subtyping allows us to shorten lifetimes arbitrarily, doesn't that violate the idea that the structs containing them are dropped by the time the |
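The lifetime shortening this question relies on can be seen in ordinary safe Rust. In the sketch below (the function names are illustrative), `pick` requires both arguments to share one lifetime `'a`, so the borrow of the longer-lived string is implicitly shortened to match the shorter-lived one; no annotation or conversion is needed at the call site.

```rust
/// Both arguments must share the lifetime 'a, so the caller's longer borrow
/// is shortened through subtyping to match the shorter one.
fn pick<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

/// Illustrative driver: mixes a long-lived and a short-lived borrow.
fn shortened_borrow_demo() -> String {
    let long_lived = String::from("outer");
    let result;
    {
        let short_lived = String::from("in");
        // `&long_lived` could live longer, but here it is treated as having
        // the same (shorter) lifetime as `&short_lived`.
        let chosen = pick(&long_lived, &short_lived);
        result = chosen.to_string();
    }
    result
}

fn main() {
    println!("{}", shortened_borrow_demo());
}
```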
@aturon @nikomatsakis Thank you both for your detailed responses. That helps me understand your reasoning a lot better. |
I disagree with this argument.
I believe that's the only problem with this RFC (which is sad). |
I believe that the proposed API does not require you to prove the lack of cycles. You either
|
The context here is for cases where you have borrowed data (so
It's unclear how far the no cycle marker goes in terms of expressiveness, but it certainly imposes some new programming complexity. So the point here is just that, when working with |
I think I was not quite clear in explaining my point; but, also, what you wrote here is not strictly correct. You state the |
@nikomatsakis I've always assumed that all abstractions could take a ScopedRcGuard, and let the user pass in STATIC_GUARD as required. However, it seems like you feel that is too complex. I guess I am underestimating the complexity (which is subjective anyway, so I guess I should stop arguing now) @aturon I sort of forgot that Rust is about "zero cost abstractions", after all. I thought that the overhead is going to be low, but maybe that's not low enough. Sorry for beating the dead horse. |
Thanks again for taking time to respond. Since this obviously can't be part of 1.0 at this point, I'm going to take a little break from it, but I would definitely like to refine the design a bit to see if it would be practical as a future addition. I think there are a few language features that could help, here. One would be a way to specify additional dropck-related restrictions for a type in where clauses of generic methods. Specifically, it would be nice to be able to require not only that Another useful feature would be syntax to specify that a given Are there other concerns that I should try to address? Any good real-world heavy users of |
@rkjnsn Just wanted to say -- things are very hectic on my end right now, but I will try to give some feedback on this comment soon. |
Indeed, I had started a reply to this comment and then it got lost in the shuffle. Let me note down some quick thoughts on your comments. First, the additional dropck restrictions you mentioned are things that @pnkfelix and I have talked about from time to time as well. This relates to an (in-progress) RFC I am working on buttoning up some aspects of the type system oriented around guaranteeing that whenever you have access to data in the lifetime Second, adding the ability to have a "strict outlives" relation does seem useful, but I am wary, because I think it could potentially impact other aspects of the language design. For example, we assume today that, during code generation, we can erase lifetimes without affecting trait selection -- but if one could have a strict outlives relation, that might not always be true (it'll depend on how things like coherence and specialization treat regions as well). In general, this points to the need for more formal modeling, which is definitely something that is in progress. I'll be honest, it's unclear to me whether this decision will be reversible in a practical sense. Even if we were to issue Rust 2.0, I could imagine that it is either impractical to deprecate To that end, one obvious starting place is to look at abstractions like channels and try to modify them so as to ensure that they cannot leak (some thoughts were already put in place on this thread, iirc). Another thought that I have had concerns integrating with other kinds of shared ownership beyond Finally, I just want to note that even if we never fully deprecated leaks, I think that |
@nikomatsakis Interesting stuff, thanks for writing it!
|