Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

alloc: Add unstable Cfg feature `no_global_oom_handling #84266

Merged

Conversation

Ericson2314
Copy link
Contributor

@Ericson2314 Ericson2314 commented Apr 17, 2021

For certain sorts of systems, programming, it's deemed essential that
all allocation failures be explicitly handled where they occur. For
example, see Linus Torvald's opinion in 1. Merely not calling global
panic handlers, or always try_reserving first (for vectors), is not
deemed good enough, because the mere presence of the global OOM handlers
is burdens static analysis.

One option for these projects to use rust would just be to skip alloc,
rolling their own allocation abstractions. But this would, in my
opinion be a real shame. alloc has a few try_* methods already, and
we could easily have more. Features like custom allocator support also
demonstrate and existing to support diverse use-cases with the same
abstractions.

A natural way to add such a feature flag would a Cargo feature, but
there are currently uncertainties around how std library crate's Cargo
features may or not be stable, so to avoid any risk of stabilizing by
mistake we are going with a more low-level "raw cfg" token, which
cannot be interacted with via Cargo alone.

Note also that since there is no notion of "default cfg tokens" outside
of Cargo features, we have to invert the condition from
global_oom_handling to to not(no_global_oom_handling). This breaks
the monotonicity that would be important for a Cargo feature (i.e.
turning on more features should never break compatibility), but it
doesn't matter for raw cfg tokens which are not intended to be
"constraint solved" by Cargo or anything else.

To support this use-case we create a new feature, "global-oom-handling",
on by default, and put the global OOM handler infra and everything else
it that depends on it behind it. By default, nothing is changed, but
users concerned about global handling can make sure it is disabled, and
be confident that all OOM handling is local and explicit.

For this first iteration, non-flat collections are outright disabled.
Vec and String don't yet have try_* allocation methods, but are
kept anyways since they can be oom-safely created "from parts", and we
hope to add those try_ methods in the future.

@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 17, 2021
@Ericson2314 Ericson2314 changed the title alloc: Add default Cargo feature "global-oom-handling" WIP: alloc: Add default Cargo feature "global-oom-handling" Apr 17, 2021
@rust-log-analyzer

This comment has been minimized.

@Ericson2314
Copy link
Contributor Author

I added the WIP because something in alloc is requiring exchange_malloc, yet I don't see any box expressions except for in tests, and when I dumped MIR for the module I didn't see it used.

@rust-log-analyzer

This comment has been minimized.

@Ericson2314
Copy link
Contributor Author

Also not sure why that failure happened, because I don't see any default-features = false or similar for alloc.

@bjorn3
Copy link
Member

bjorn3 commented Apr 17, 2021

I added the WIP because something in alloc is requiring exchange_malloc, yet I don't see any box expressions except for in tests, and when I dumped MIR for the module I didn't see it used.

Box::new needs it.

@the8472
Copy link
Member

the8472 commented Apr 17, 2021

This is a lot of cfg annotations. Would it make sense to group functions those into internal modules instead? E.g. vec::fallible and vec::infallible both containing impl blocks for the same types and then putting the cfg on the whole module.

Of course that'll require a lot of code motion now, but but it provides clearer separation and it'll be more obvious in the future when people add things in the wrong place.

@phil-opp
Copy link
Contributor

Also not sure why that failure happened, because I don't see any default-features = false or similar for alloc.

The test seems to use rustc directly. I think that (default) features are normally handled by cargo and only passed to rustc as --cfg argument, so we probably need to adjust the test to enable the global_oom_handling feature explicitly.

@Aaron1011
Copy link
Member

What happens if someone forgets to put #[cfg(feature = "global-oom-handling")] on an API that depends on global OOM handling? Will compilation fail with the feature disabled, or will the user get a weird runtime problem?

@bjorn3
Copy link
Member

bjorn3 commented Apr 17, 2021

It will fail with a compilation error as handle_alloc_error is cfg'ed away.

@Ericson2314
Copy link
Contributor Author

Indeed that is what's happening with right now.

I should have those 2 things fixed in a bit. Thanks for the tips, everyone.

@Ericson2314 Ericson2314 force-pushed the statically-disallow-global-oom-handling branch from c7e5e77 to e32dc47 Compare April 18, 2021 00:02
@Ericson2314
Copy link
Contributor Author

@the8472 It is a lot of cfg, but I think the "boundaries" between e.g. vec::fallible and vec::infallible would change a lot as more try_* methods are created. That sort of churn will cause massive headaches with PR conflicts, so I think gratuitous cfg is in fact the far lesser evil, at least until we decide how many try_* methods we want and stabilize around that.

@Ericson2314
Copy link
Contributor Author

The PR is passing, but it would be nice to have a test that the build with global-oom-handling disabled doesn't break. How ought I to do that?

@@ -2,4 +2,4 @@

all:
$(RUSTC) fakealloc.rs
$(RUSTC) --edition=2018 --crate-type=rlib ../../../../library/alloc/src/lib.rs --cfg feature=\"external_crate\" --extern external=$(TMPDIR)/$(shell $(RUSTC) --print file-names fakealloc.rs)
$(RUSTC) --edition=2018 --crate-type=rlib ../../../../library/alloc/src/lib.rs --cfg feature=\"external_crate\" --cfg=feature=\"global-oom-handling\" --extern external=$(TMPDIR)/$(shell $(RUSTC) --print file-names fakealloc.rs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't seem to test anything anymore: https://github.com/rust-lang/rust/search?q=external_crate

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way I think you can use the same pattern to test liballoc without the global-oom-handling feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed it does seem that way.

@the8472
Copy link
Member

the8472 commented Apr 19, 2021

@the8472 It is a lot of cfg, but I think the "boundaries" between e.g. vec::fallible and vec::infallible would change a lot as more try_* methods are created. That sort of churn will cause massive headaches with PR conflicts, so I think gratuitous cfg is in fact the far lesser evil, at least until we decide how many try_* methods we want and stabilize around that.

I'm not sure what you're concerned about. Potential churn in the long-term once those modules are established or the way to creating those modules?

If it's about the latter then we can probably minimize the conflict potential by only creating a fallible module and moving the try_ code and its fallible callees there. That should be a fairly small API surface right now and thus not need too much code movement. Once that is done we only need to slap a cfg on top of the impl blocks which contain infallible APIs to disable them.

@Ericson2314
Copy link
Contributor Author

Ericson2314 commented Apr 19, 2021

I'm not sure what you're concerned about. Potential churn in the long-term once those modules are established or the way to creating those modules?

Churn in the short term why try_ methods are being created.

If it's about the latter then we can probably minimize the conflict potential by only creating a fallible module and moving the try_ code and its fallible callees there. That should be a fairly small API surface right now and thus not need too much code movement. Once that is done we only need to slap a cfg on top of the impl blocks which contain infallible APIs to disable them.

When a try_ method is made, usually what happens is the existing implementation is refactored into the try_ version, and then the old one becomes a small wrapper around that. So with even just fallible module, we have a bunch of churn when the implementation gets refactored and moved from the original module to fallible one. And git handles refactor + moves extremely poorly.

@the8472
Copy link
Member

the8472 commented Apr 19, 2021

Churn in the short term why try_ methods are being created.

I don't understand what you mean by that. The try methods have to be created either way. They're not just created for a short term reason and then removed later. So "churn in the short term" cannot be their raison d'être.

When a try_ method is made, usually what happens is the existing implementation is refactored into the try_ version, and then the old one becomes a small wrapper around that. So with even just fallible module, we have a bunch of churn when the implementation gets refactored and moved from the original module to fallible one. And git handles refactor + moves extremely poorly.

I don't see the difference here. You need to create a new method which is sortof but not quite copy-pasted from the old one. And then redirect the old one to the new one. This has to happen either way whether they live in different files or not and whether you annotate on the function level or not. And if necessary can be split into multiple commits to make it easier for git to follow the movement.

E.g. you could implement the try method first in the base module and then once it is done use a separate commit to move it into the fallible module.

Yes, in the short-term this means a little more work, but in the long-term it makes it clearer which part belongs to the restricted fallible API set and which to the more lax infallible APIs.

@bjorn3
Copy link
Member

bjorn3 commented Apr 19, 2021

And if necessary can be split into multiple commits to make it easier for git to follow the movement.

Git blame doesn't track moves between files at all unless you pass it -C. The github ui doesn't have any option to enable this and most people likely don't know about it at all.

I don't see the difference here. You need to create a new method which is sortof but not quite copy-pasted from the old one. And then redirect the old one to the new one. This has to happen either way whether they live in different files or not and whether you annotate on the function level or not.

When doing it within the same file, git will consider most lines of non-try function in the previous version to be part of the try function in the current version. Take for example 92bfcd2#diff-0e10470fe128e30b8710381cb322bfe266884aa8d969619a9920fc6970cdceffL406-R450:

@@ -403,7 +405,9 @@ impl<T, A: Alloc> RawVec<T, A> {
     /// # Aborts
     ///
     /// Aborts on OOM
-    pub fn reserve_exact(&mut self, used_cap: usize, needed_extra_cap: usize) {
+    pub fn try_reserve_exact(&mut self, used_cap: usize, needed_extra_cap: usize)
+           -> Result<(), CollectionAllocErr> {
+
         unsafe {
             // NOTE: we don't early branch on ZSTs here because we want this
             // to actually catch "asking for more than usize::MAX" in that case.
@@ -413,16 +417,15 @@ impl<T, A: Alloc> RawVec<T, A> {
             // Don't actually need any more capacity.
             // Wrapping in case they gave a bad `used_cap`.
             if self.cap().wrapping_sub(used_cap) >= needed_extra_cap {
-                return;
+                return Ok(());
             }
 
             // Nothing we can really do about these checks :(
-            let new_cap = used_cap.checked_add(needed_extra_cap).expect("capacity overflow");
-            let new_layout = match Layout::array::<T>(new_cap) {
-                Some(layout) => layout,
-                None => panic!("capacity overflow"),
-            };
-            alloc_guard(new_layout.size());
+            let new_cap = used_cap.checked_add(needed_extra_cap).ok_or(CapacityOverflow)?;
+            let new_layout = Layout::array::<T>(new_cap).ok_or(CapacityOverflow)?;
+
+            alloc_guard(new_layout.size())?;
+
             let res = match self.current_layout() {
                 Some(layout) => {
                     let old_ptr = self.ptr.as_ptr() as *mut u8;
@@ -430,26 +433,34 @@ impl<T, A: Alloc> RawVec<T, A> {
                 }
                 None => self.a.alloc(new_layout),
             };
-            let uniq = match res {
-                Ok(ptr) => Unique::new_unchecked(ptr as *mut T),
-                Err(e) => self.a.oom(e),
-            };
-            self.ptr = uniq;
+
+            self.ptr = Unique::new_unchecked(res? as *mut T);
             self.cap = new_cap;
+
+            Ok(())
         }
     }
 
+    pub fn reserve_exact(&mut self, used_cap: usize, needed_extra_cap: usize) {
+        match self.try_reserve_exact(used_cap, needed_extra_cap) {
+            Err(CapacityOverflow) => panic!("capacity overflow"),
+            Err(AllocErr(e)) => self.a.oom(e),
+            Ok(()) => { /* yay */ }
+         }
+     }
+

@Ericson2314
Copy link
Contributor Author

Thanks @bjorn3 for laying this out much more clearly than I would have.

@the8472
Copy link
Member

the8472 commented Apr 19, 2021

Git blame doesn't track moves between files at all unless you pass it -C. The github ui doesn't have any option to enable this and most people likely don't know about it at all.

I assume you're using -C already, otherwise the occasional refactor that splits large modules into smaller ones (as happend not too long ago with Vec) would cause the very same issues.
And since there's no ban on such split operations (in fact they're encouraged by tidy's file length limit) why not do another one to isolate the fallible methods?

When doing it within the same file, git will consider most lines of non-try function in the previous version to be part of the try function in the current version.

Yes, hence do that in one commit, then do the motion in another so that git history browsing can follow it more easily.

@Ericson2314
Copy link
Contributor Author

Ericson2314 commented Apr 19, 2021

@the8472 I'm not against ever rearranging code --- quite in support in fact. But I imagine the try_ methods will be added incrementally over a long period of time. (We could do them all at once, but I'm afraid it's rather controversial, so I don't think that will happen.)

Assuming I'm correct in saying there will be many commits adding try_ methods, I rather have one big reorg at the end, then constant reorging, because it's much less a burden on others.

@the8472
Copy link
Member

the8472 commented Apr 19, 2021

Ok, that might work. But on the other hand the current approach also means people have to reason about fallibility on a per-method level. And if methods are added incrementally that would also keep church limited in impact, so moving wouldn't be that bad either?

@safinaskar
Copy link
Contributor

Important news! HashMap got method try_insert ( https://doc.rust-lang.org/std/collections/struct.HashMap.html#method.try_insert ), where try_ means something completely different to Box::try_new. Please, prevent somehow this try_insert from becoming stable

ojeda added a commit to ojeda/rust that referenced this pull request Mar 22, 2022
  - The edition should be 2021 to avoid warnings.

  - The `external_crate` feature was removed in commit 45bf1ed
    ("rustc: Allow changing the default allocator").

    Note that commit d620ae1 ("Auto merge of rust-lang#84266") removed
    the old test, but the new one introduced passed the `--cfg` like
    in the old one.

Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Mar 23, 2022
…test, r=Dylan-DPC

Modernize `alloc-no-oom-handling` test

  - The edition should be 2021 to avoid warnings.

  - The `external_crate` feature was removed in commit 45bf1ed ("rustc: Allow changing the default allocator").

    Note that commit d620ae1 ("Auto merge of rust-lang#84266") removed the old test, but the new one introduced passed the `--cfg` like in the old one.

Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

---

This is intended to align this test to the new `no_rc` and `no_sync` ones being added in rust-lang#89891, but it makes sense on its own too.
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Sep 21, 2022
`alloc`: add unstable cfg features `no_rc` and `no_sync`

In Rust for Linux we are using these to make `alloc` a bit more modular.

See rust-lang#86048 and rust-lang#84266 for similar requests.

Of course, the particular names are not important.
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this pull request Sep 21, 2022
`alloc`: add unstable cfg features `no_rc` and `no_sync`

In Rust for Linux we are using these to make `alloc` a bit more modular.

See rust-lang#86048 and rust-lang#84266 for similar requests.

Of course, the particular names are not important.
dpaoliello added a commit to dpaoliello/rust that referenced this pull request Oct 10, 2022
…hout infallible allocation methods

As a part of the work for rust-lang#86942 the `no_global_oom_handling` cfg was added in rust-lang#84266, however enabling that cfg makes it difficult to use `Vec`: the `vec!` macro is missing and there is no way to insert or push new elements, nor to create a `Vec` from an iterator (without resorting to `unsafe` APIs to manually expand and write to the underlying buffer).

This change adds `try_` equivalent methods for all methods in `Vec` that are disabled by `no_global_oom_handling` as well as a `try_vec!` as the equivalent of `vec!`.

Performance and implementation notes: A goal of this change was to make sure to NOT regress the performance of the existing infallible methods - a naive approach would be to move the actual implementation into the fallible method, then call it from the infallible method and `unwrap()`, but this would add extra compare+branch instructions into the infallible methods. It would also be possible to simply duplicate the code for the fallible version - I did opt for this in a few cases, but where the code was larger and more complex this would lead to a maintenance nightmare. Instead, I moved the implementation into an `*_impl` method that was then specialized based on a new `VecError` trait and returned `Result<_, VecError>`, this trait also provided a pair of methods for raising errors. Never (`!`) was used as the infallible version of this trait (and always panics) and `TryReserveError` as the fallible version (and returns the correct error object). All these `VecError` method were marked with `#[inline]`, so after inlining the compiler could see for the infallible version always returns `Ok()` on success (and panics on error) thus the `?` operator or `unwrap()` call was a no-op, and so the same non-branching instructions as before are generated.

I also added `try_from_iter` and `try_expand` methods for completeness, even though their infallible equivalents are trait implementations.
thomcc pushed a commit to tcdi/postgrestd that referenced this pull request Feb 10, 2023
`alloc`: add unstable cfg features `no_rc` and `no_sync`

In Rust for Linux we are using these to make `alloc` a bit more modular.

See rust-lang/rust#86048 and rust-lang/rust#84266 for similar requests.

Of course, the particular names are not important.
@gurry gurry mentioned this pull request Nov 25, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.