Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Box::take() method #93653

Closed
wants to merge 1 commit into from
Closed

Added Box::take() method #93653

wants to merge 1 commit into from

Conversation

Kixunil
Copy link
Contributor

@Kixunil Kixunil commented Feb 4, 2022

The new Box::take() method allows to read from Box and reuse the
allocation safely. This commit also changes some unsafe code in the
compiler to use this method instead and becomes safe.

This was suggested in #63291 (comment) where it got two thumbs ups and no objections and is pretty simple, so should be OK without RFC. It reuses the feature new_uninit, hope it's OK.

Related: I noticed number of unsafe casts of Box<T>, A to Box<U, A>, maybe we should have a pub common unsafe method cast_unchecked for these? It'd be clearer and less boilerplate-y. Also cast (From) Box<T, A> -> Box<MaybeUninit<T>, A> could be safe (though leaking). Can open a new PR for this if there's interest.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Feb 4, 2022
@rust-highfive
Copy link
Contributor

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 4, 2022
@rust-log-analyzer

This comment has been minimized.

@Kixunil Kixunil force-pushed the box_take branch 2 times, most recently from aea701b to b54257f Compare February 4, 2022 17:24
@rust-log-analyzer

This comment has been minimized.

@Kixunil

This comment was marked as resolved.

@SkiFire13
Copy link
Contributor

I think you might want to add it here, between these two:

#![feature(const_ptr_read)]
#![feature(const_maybe_uninit_write)]

@Kixunil
Copy link
Contributor Author

Kixunil commented Feb 4, 2022

Looks legit, thanks!

@rust-log-analyzer

This comment has been minimized.

The new `Box::take()` method allows to read from `Box` and reuse the
allocation safely. This commit also changes some `unsafe` code in the
compiler to use this method instead and becomes safe.
@wwylele
Copy link
Contributor

wwylele commented Feb 4, 2022

I wonder how this would affect learning rust. As a beginner, I was looking for a Box::<T>::take(self) -> T function, unaware of the box magic *boxed ("DerefMove") that can achieve the same thing. If we add this function with the name take, people might use it as a general-purpose unboxing method. I guess it does work that way if you simply drop the second return value, but this could encourage non-idiomatic code.

@Kixunil
Copy link
Contributor Author

Kixunil commented Feb 4, 2022

@wwylele interesting point. I was also considering the names read or into_parts. There's already into_inner() too. We could also add a doc note and/or a clippy lint.

#[unstable(feature = "new_uninit", issue = "63291")]
#[rustc_const_unstable(feature = "const_box", issue = "92521")]
#[inline]
pub const fn take(this: Self) -> (T, Box<mem::MaybeUninit<T>, A>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

take to me has a very strong implication of mem::take (or Option::take or ...) to me, with &mut self and such, so I would encourage a different name here.

👍 to the functionality, though! Makes good sense to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually picked the name because it is similar to mem::take in the sense of "taking the value out and replacing it with something", in this case replacing it with nothing and tracking it in the state. However if people feel it's strongly associated with &mut self I'm open to a better name, I'd just like to hear some suggestions on alternatives or a feedback on my own ides for alternatives (see above).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do see the parallel, though to me this API doesn't do the "and replacing it with something" part that distinguishes take from into_inner kinds of things.

But more importantly to me, mem::take(&mut my_box) works today, so I think Box::take doing something pretty different would be surprising. If anything, I might expect it to be mem::take(&mut *my_box) -- note the deref -- like how RefCell::take works. (And note that mem::take(&mut my_option) does exactly the same thing as Option::take(&mut my_option), as another example.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, what do you think bout into_parts()? That one feels like the next best to me now.

@Mark-Simulacrum Mark-Simulacrum added I-libs-api-nominated Nominated for discussion during a libs-api team meeting. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 6, 2022
@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Feb 6, 2022

I would like to see a somewhat stronger use case for this API, ideally in the wild. I don't think the compiler code pointed to is necessarily a good case -- that code is more closely representative not of reusing an underlying allocation, but rather expressing roughly a mem::map(&mut T, impl FnOnce(T) -> T) function.

I'm somewhat weakly concerned by the continued growth of API surface area around the notion of containers and their 'untracked empty' states (i.e., MaybeUninit) -- it feels like we've added a lot of surface area here (cc #63291) and it's not clear to me that we want all of it. In general, most of these APIs are relatively simple under the hood (3-4 lines of unsafe), and while providing abstractions over unsafe code is generally positive, I'm not convinced it warrants the prominent surface area in std's documentation this PR and those APIs are placing.

This need not block this PR, necessarily, but I would like to bring this up with T-libs-api and see if there's thoughts on the surface area being added over time here. I would be particularly interested in seeing if we can design some better interfaces -- having every container in std contain functions for construction (fallible allocation * uninit/zeroed * with/without a custom allocator * with/without pin * with [T] or not) adds a ton of surface area -- that's at least 32 functions if we fill out the matrix completely.

On the deconstruction side, there's less, but APIs like Box::write which 'complement' the Box::take added by this PR also add surface area. I'm not sure it's quite the right fit.

I would also like to consider whether there's language support that may be warranted here -- this particular PR, as #93653 (comment) sort of mentions, feels like it could be more 'native' -- e.g., if you move out of a Box, it naturally becomes a Box<MaybeUninit<T>>. I suspect it's not that simple, but there's something to that thought, particularly as it may provide options for some of the constructors discussed above.

cc https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Can.20we.20stabilize.20.60Box.3A.3Anew_uninit.60.20.26.20friends.3F

@Kixunil
Copy link
Contributor Author

Kixunil commented Feb 6, 2022

@Mark-Simulacrum fair points. When approaching unsafe I try to boil things down into fundamental operations that are easy to reason about and create building blocks for safe code. This should hopefully minimize potential bugs. I also think that it's important for the language/library to provide good, verified tools to deal with otherwise unsafe stuff like uninitialized memory. There were already cases when even skilled people got things wrong so it'd be great to prevent that.

If there are better ways to deal with this then great. Your suggestion to extend the language rather than the library sounds interesting. OTOH I'm unsure if it's really better to extend the language or library. A possible obstacle is that MaybeUninit is already a library type and it's not too hard to build on top of it. OTOH, first-class support for uninitialized memory would be great.

@Mark-Simulacrum
Copy link
Member

In general I agree with expressing fundamental operations as building blocks, but I think it depends on whether you consider this particular API a building block or not -- the use cases I asked for may help push me to see this as more of a building block. For this particular API, I could see us supporting a fn replace(self: Box<T>, with: U) -> Box<U> where T and U are 'safe transmute' compatible in the future, which seems more general and provides for both directions (i.e., to and from uninit states). That API might also be something that could be generalized beyond Box -- some kind of pointer/field-projection concept, somewhat akin to CoerceUnsized. Obviously, isn't something we necessarily have today, but I think it's worth thinking about. We could have a minimal form (limited to uninit) pretty quickly in a way that's mostly future compatible, I suspect.

@Kixunil
Copy link
Contributor Author

Kixunil commented Feb 8, 2022

I've opened a discussion that'll hopefully help gather some feedback.

Generalized API sounds good.

@Michael-F-Bryan
Copy link

I think an API like Box::take() wouldn't pull its weight.

If I know I'm going to move out of the box and reuse it then I'd probably use Box::from_raw() to "cast" it between Box<T> and Box<MaybeUninit<T>>. Something like this is niche enough with such a trivial implementation that people can implement it themselves. The standard library should be trying to provide abstractions around subtle unsafe code, not everything including the kitchen sink.

We could always turn this into a function or extension trait and iterate in the ecosystem if you want.

trait BoxTake<T> {
    fn take_(this: Box<T>) -> (T, Box<MaybeUninit<T>>);
    fn initialize(this: Box<MaybeUninit<T>>, value: T) -> Box<T>;
}

impl<T> BoxTake<T> for Box<T> {
    fn take(this: Box<T>) -> (T, Box<MaybeUninit<T>>) {
        unsafe {
            let ptr = Box::into_raw(this);
            let value = ptr.read();
            (value, Box::from_raw(ptr.cast()))
        }
    }

    fn replace(mut this: Box<MaybeUninit<T>>, value: T) -> Box<T> {
        unsafe {
            this.write(value);
            Box::from_raw(Box::into_raw(this).cast())
        }
    }
}

fn main() {
    let boxed = Box::new(42);
    let (fourty_two, allocation) = Box::take(boxed);
    assert_eq!(fourty_two, 42);

    let initialized = Box::replace(allocation, 7);
    assert_eq!(initialized, 7);
}

(playground)

I would also like to consider whether there's language support that may be warranted here -- this particular PR, as #93653 (comment) sort of mentions, feels like it could be more 'native' -- e.g., if you move out of a Box, it naturally becomes a Box<MaybeUninit>. I suspect it's not that simple, but there's something to that thought, particularly as it may provide options for some of the constructors discussed above.

Rust doesn't have any mechanism where calling a method or executing an operator on a value will implicitly change its type so implementing a general "move out of pointer changes the original variable's type to MaybeUninit" feature would be a big change.

Do we really want to introduce another subtle unsafe language feature for people to shoot themselves in the foot with?

@Kixunil
Copy link
Contributor Author

Kixunil commented Feb 9, 2022

@Michael-F-Bryan I think that Box::write() is frequent enough - many people want to avoid copying from stack. Box::take() feels like a natural opposite operation but sure, maybe it's not strong enough justification for adding it.

@Mark-Simulacrum
Copy link
Member

The libs-api team discussed this today, and particularly in the context of concerns around the design of the standard library surface area around MaybeUninit (cc #63291, zulip, and rust-lang/wg-allocators#90), we don't want to move ahead with adding this API at this time. @m-ou-se is going to leave a comment elsewhere elaborating on that discussion a little.

It's also the case that this particular API likely does not pull its weight in terms of providing a sufficient abstraction to merit inclusion even without those larger concerns; if there are more extensive use cases (and in particular, ones that move the needle from completely avoiding unsafe code to reducing it a little bit), we might reconsider. There was also some discussion around the fact that the API does not leverage any std-special bits, so any crate in the ecosystem could provide a reusable API layer here to experiment with the options for exposing good APIs.

I'm going to go ahead and close this PR as such, but if the users.rust-lang.org thread results in good use cases, feel free to poke on Zulip (or reopen a PR) to see if they move the needle on accepting this.

@Mark-Simulacrum Mark-Simulacrum removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Feb 9, 2022
@Kixunil
Copy link
Contributor Author

Kixunil commented Feb 9, 2022

OK, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants