Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ManuallyDrop type gives precise control of dtors of inline data. #197

Closed
wants to merge 5 commits into from

Conversation

huonw
Copy link
Member

@huonw huonw commented Aug 13, 2014

This is a replacement for zeroing, acting like *_ T wrt destructors (i.e. they need to be run manually).

@glaebhoerl
Copy link
Contributor

(Warning: thread hijack.)

I've been thinking about a proposal which seems related and might subsume this one. I'm not at the point of writing an RFC for it, but I'll describe it briefly as food for thought:

  • Add a built-in type UnsafeUnion<A, B>.
  • This is essentially like unions in C: a type compatible with the size and alignment of both A and B. So sizeof(UnsafeUnion<A, B>) = max(sizeof(A), sizeof(B)), and alignof(UnsafeUnion<A, B>) = lcm(alignof(A), alignof(B)).
  • Unlike C, it doesn't have fields, rather "construction" and "deconstruction" both simply happen using transmute(). So transmute::<A, UnsafeUnion<A, B>>, transmute::<B, UnsafeUnion<A, B>>, transmute::<UnsafeUnion<A, B>, A>, and transmute::<UnsafeUnion<A, B>, B> all have well-defined behavior, in basically the same circumstances as in C. (This means that the input and output of transmute would not have the same size/alignment in these cases. I don't know whether this is an issue, provided that they are both known.)
  • transmutes of pointers/references, on the other hand, should only be valid in one direction: e.g. you can legally transmute &UnsafeUnion<A, B> to &A, but not vice versa.
  • Because the representation of UnsafeUnions is idempotent: repr(UnsafeUnion<A, A>) = repr(A), commutative: repr(UnsafeUnion<A, B>) = repr(UnsafeUnion<B, A>), and associative: repr(UnsafeUnion<A, UnsafeUnion<B, C>>) = repr(UnsafeUnion<UnsafeUnion<A, B>, C>), larger unions can simply be composed from the binary one. The use of transmute for {,de}construction is also completely impervious to this nesting. We could then also potentially provide synonyms like type UnsafeUnion3<A, B, C> = UnsafeUnion<A, UnsafeUnion<B, C>> for convenience.
  • It should also be layout-compatible with C, and so should be usable to represent C unions in the FFI.

And the connection to this proposal is that UnsafeUnion<T, T> is very close to ManuallyDrop<T>. Even more unsafe, as you note. But given that UnsafeUnion would be more general, I'm not sure whether it would be worthwhile to have both of them as separate primitives.

(With the respect to the built-in traits, this is an interesting question. Right now I'm thinking we should probably avoid unnecessary unsafety, and have impl<A: Copy, B: Copy> Copy for UnsafeUnion<A, B>, and likewise for Send (and Share?). If only one type in the union is e.g. Copy and you know that that's the type currently contained in it, use transmute on a ref or something; likewise, for Send, just transmute it to the contained type that implements Send when you want to send it.)

@lilyball
Copy link
Contributor

👍 on ManuallyDrop. I could have used precisely this in my local_data rewrite in order to use Rc instead of reimplementing a new smart pointer.

@ben0x539
Copy link

Maybe only tangentially relevant, but we could really stand to have a more obviously in-place way to drop something than the ptr::read() thing...

@huonw
Copy link
Member Author

huonw commented Aug 17, 2014

@ben0x539 rust-lang/rust#16242

@glaebhoerl on a first reading, I think that idea isn't crazy.

Could you explain? Wouldn't e.g. UninterpretedBytesOfSize<UnsafeCell> let the compiler know that it contains an UnsafeCell?

It would literally just be an untyped chunk of bytes. If it did inform the compiler about the contained types, it seems like it is essentially identical to ManuallyDrop and thus isn't an alternative. (This was a suggestion @pnkfelix made on IRC as a possible alternative, not something I really fleshed out.)

@gereeter
Copy link

gereeter commented Sep 2, 2014

+1.

Also, this means that the mem::forget intrinsic can be removed, as that could just be implemented by creating a ManuallyDrop wrapper and then dropping that wrapper.

@Gankra
Copy link
Contributor

Gankra commented Sep 5, 2014

+1, this is definitely the right choice to sanely handle SmallVec-ish cases. The massive unsafety it introduces is in my mind acceptable, as it's not something you would use casually, but ideal for efficiently fiddling with fixed-size inline buffers. Ideally, this would only be used internally by libraries to expose an efficient mechanism with a safe interface. e.g. Small/Inline Vecs

I have no strong opinion on more general mechanisms, as destructors are the only thing that I've found troubling.

@huonw
Copy link
Member Author

huonw commented Sep 7, 2014

I'm "blocked" on this because I don't know if going with UnsafeUnion is a good idea now, or if that can be a future extension (because it does seem like the correct way to model, e.g., C unions).

I'm inclined to think we should start simple, but I'm also inclined to think UnsafeUnion isn't particularly complicated. (The only "interesting" thing part of the basic design is the size_of/align_of, and that's pretty simple.)

@Kimundi
Copy link
Member

Kimundi commented Sep 20, 2014

+1, I think having a unsafe ManualDrop<T> building block is a useful thing to have in all cases where you want to deal with low level optimized data structures.

@mahkoh
Copy link
Contributor

mahkoh commented Sep 20, 2014

This could be done (except for static initialization) without adding a special case to the compiler if Rust had CTFE (at least for size_of.)

struct ManuallyDrop<T> {
    #[aligned(alignof(T))]
    data: [u8, ..std::mem::size_of::<T>()],
    _nocopy: std::kinds::marker::NoCopy,
}

impl<T> ManuallyDrop<T> {
    pub unsafe fn new(t: T) -> ManuallyDrop<T> {
        let ret = ManuallyDrop {
            data: std::mem::transmute(t)
        };
        std::mem::forget(t);
        ret
    }

    pub fn unwrap(self) -> T {
        unsafe { std::mem::transmute(self.data); }
    }
}

impl<T> Deref<T> for ManuallyDrop<T> {
    fn deref(&self) -> &T {
        unsafe { std::mem::transmute(&self.data) }
    }
}

impl<T> DerefMut<T> for ManuallyDrop<T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { std::mem::transmute(&mut self.data) }
    }
}

@huonw
Copy link
Member Author

huonw commented Sep 20, 2014

Along with CTFE, it also needs the alignment attribute (minor but it's not something we have in Rust now), and, @glaebhoerl's UnsafeUnion seems harder to manually encode. And that implementation loses all type information, i.e. a &ManuallyDrop<UnsafeCell<uint>> would not be regarded as containing mutable data and so aliasing-based optimisations would break.

@mahkoh
Copy link
Contributor

mahkoh commented Sep 20, 2014

Off topic: I expect the aligned attribute to appear before 1.0 because, afaik, the only way to currently align a struct at a 16 bytes boundary is to add a [f64x2, ..0] array to it. IIRC this caused segfaults on Windows before.

@mahkoh
Copy link
Contributor

mahkoh commented Sep 20, 2014

I've really no idea how the special treatment for UnsafeCell works, but maybe the following would solve the problem?

struct ManuallyDrop<T> {
    #[aligned(alignof(T))]
    data: [u8, ..std::mem::size_of::<T>()],
    _ty: [T, ..0],
}

Then &ManuallyDrop<UnsafeCell<uint>> would contain a UnsafeCell<uint> array.

(Furthermore this means that the struct is at least alignof(T) aligned. But since the struct layout is undefined this doesn't imply that data is properly aligned. So the attribute is necessary.)

@pczarn
Copy link

pczarn commented Sep 25, 2014

+1, this enables us to make Rc non-nullable and neatly implemented:

pub struct Rc<T> {
    _ptr: ManuallyDrop<Box<RcBox<T>>>,
    _nosend: marker::NoSend,
    _noshare: marker::NoSync
}

@glaebhoerl glaebhoerl mentioned this pull request Oct 8, 2014
@ben0x539
Copy link

It would be nice if the lang item could be specialcased so that implicitly dropping ManuallyDrop (or, having its drop glue appear) in user code is a compile-time error. That way the only way for it to go out of scope is to call .unwrap() or mem::forget(), so whether or not the contained value is dropped, it happens explicitly.

That plan doesn't seem to work so nicely with the hypothetical in-place drop facility though. You'd need to write ptr::drop_in_place(manually_drop.get_mut()); mem::forget(manually_drop); and can't write a wrapper function for that since then it wouldn't be in place anymore.

@huonw
Copy link
Member Author

huonw commented Nov 13, 2014

I thought about this a bit more, and I do like @glaebhoerl UnsafeUnion type, but I think that adds enough unanswered questions and additional complications that I'm continuing to support this version of the RFC.

@ben0x539 ManuallyDrop (by definition) has no drop glue, and, it may also be Copy (see the unresolved question). It does seem like it may be possible to have a lint to detect it being "dropped", but this would essentially be an ad-hoc linear type, and it may be better to have a more structured/general approach to linear typing.

The ManuallyDrop API could also include

impl<T> ManuallyDrop<T> {
    unsafe fn run_destructor(&mut self) {
        ptr::drop_in_place(self.get_mut())
    }
    unsafe fn discard(self) { mem::forget(self) }
}

so your snippet becomes value.run_destructor(); value.discard() which isn't so bad, IMO. (Definitely a little worse than value.destroy_and_discard() in one, but, as you say, this is no longer in place.)

@mahkoh
Copy link
Contributor

mahkoh commented Feb 16, 2015

Still relevant.

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Jun 27, 2017
This PR is an implementation of [RFC 1974] which specifies a new method of
defining a global allocator for a program. This obsoletes the old
`#![allocator]` attribute and also removes support for it.

[RFC 1974]: rust-lang/rfcs#197

The new `#[global_allocator]` attribute solves many issues encountered with the
`#![allocator]` attribute such as composition and restrictions on the crate
graph itself. The compiler now has much more control over the ABI of the
allocator and how it's implemented, allowing much more freedom in terms of how
this feature is implemented.

cc rust-lang#27389
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Jul 3, 2017
This PR is an implementation of [RFC 1974] which specifies a new method of
defining a global allocator for a program. This obsoletes the old
`#![allocator]` attribute and also removes support for it.

[RFC 1974]: rust-lang/rfcs#197

The new `#[global_allocator]` attribute solves many issues encountered with the
`#![allocator]` attribute such as composition and restrictions on the crate
graph itself. The compiler now has much more control over the ABI of the
allocator and how it's implemented, allowing much more freedom in terms of how
this feature is implemented.

cc rust-lang#27389
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Jul 5, 2017
This PR is an implementation of [RFC 1974] which specifies a new method of
defining a global allocator for a program. This obsoletes the old
`#![allocator]` attribute and also removes support for it.

[RFC 1974]: rust-lang/rfcs#197

The new `#[global_allocator]` attribute solves many issues encountered with the
`#![allocator]` attribute such as composition and restrictions on the crate
graph itself. The compiler now has much more control over the ABI of the
allocator and how it's implemented, allowing much more freedom in terms of how
this feature is implemented.

cc rust-lang#27389
cuviper pushed a commit to rayon-rs/rayon-hash that referenced this pull request Nov 14, 2017
This PR is an implementation of [RFC 1974] which specifies a new method of
defining a global allocator for a program. This obsoletes the old
`#![allocator]` attribute and also removes support for it.

[RFC 1974]: rust-lang/rfcs#197

The new `#[global_allocator]` attribute solves many issues encountered with the
`#![allocator]` attribute such as composition and restrictions on the crate
graph itself. The compiler now has much more control over the ABI of the
allocator and how it's implemented, allowing much more freedom in terms of how
this feature is implemented.

cc #27389
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.