Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow references to interior mutable data behind a feature gate #80418

Merged
merged 13 commits into from
Jan 4, 2021

Conversation

oli-obk
Copy link
Contributor

@oli-obk oli-obk commented Dec 27, 2020

supercedes #80373 by simply not checking for interior mutability on borrows of locals that have StorageDead and thus can never be leaked to the final value of the constant

tracking issue: #80384

r? @RalfJung

@rust-log-analyzer
Copy link
Collaborator

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
configure: rust.channel         := nightly
configure: rust.debug-assertions := True
configure: llvm.assertions      := True
configure: dist.missing-tools   := True
configure: build.configure-args := ['--enable-sccache', '--disable-manage-submodu ...
configure: writing `config.toml` in current directory
configure: 
configure: run `python /checkout/x.py --help`
configure: 
---
skip untracked path cpu-usage.csv during rustfmt invocations
skip untracked path src/doc/book/ during rustfmt invocations
skip untracked path src/doc/rust-by-example/ during rustfmt invocations
skip untracked path src/llvm-project/ during rustfmt invocations
Diff in /checkout/compiler/rustc_mir/src/transform/check_consts/validation.rs at line 3:
 use rustc_errors::{struct_span_err, Applicability, Diagnostic, ErrorReported};
 use rustc_hir::def_id::DefId;
 use rustc_hir::{self as hir, HirId, LangItem};
+use rustc_index::bit_set::BitSet;
 use rustc_infer::infer::TyCtxtInferExt;
 use rustc_infer::traits::{ImplSource, Obligation, ObligationCause};
 use rustc_middle::mir::visit::{MutatingUseContext, NonMutatingUseContext, PlaceContext, Visitor};
Diff in /checkout/compiler/rustc_mir/src/transform/check_consts/validation.rs at line 16:
 use rustc_span::{sym, Span, Symbol};
Running `"/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/rustfmt" "--config-path" "/checkout" "--edition" "2018" "--unstable-features" "--skip-children" "--check" "/checkout/compiler/rustc_mir/src/transform/check_consts/validation.rs"` failed.
If you're running `tidy`, try again with `--bless`. Or, if you just want to format code, run `./x.py fmt` instead.
 use rustc_trait_selection::traits::error_reporting::InferCtxtExt;
 use rustc_trait_selection::traits::{self, SelectionContext, TraitEngine};
-use rustc_index::bit_set::BitSet;
 use std::mem;
 use std::ops::Deref;
 use std::ops::Deref;
Diff in /checkout/compiler/rustc_mir/src/transform/check_consts/validation.rs at line 289:
 
     fn local_has_storage_dead(&mut self, local: Local) -> bool {
         let ccx = self.ccx;
-        self.local_has_storage_dead.get_or_insert_with(|| {
-            struct StorageDeads {
-                locals: BitSet<Local>,
-            }
-            impl Visitor<'tcx> for StorageDeads {
-                fn visit_statement(&mut self, stmt: &Statement<'tcx>, _: Location) {
-                    if let StatementKind::StorageDead(l) = stmt.kind {
-                        self.locals.insert(l);
+        self.local_has_storage_dead
+            .get_or_insert_with(|| {
+                struct StorageDeads {
+                    locals: BitSet<Local>,
+                }
+                impl Visitor<'tcx> for StorageDeads {
+                    fn visit_statement(&mut self, stmt: &Statement<'tcx>, _: Location) {
+                        if let StatementKind::StorageDead(l) = stmt.kind {
+                            self.locals.insert(l);
                     }
                 }
-            }
-            let mut v = StorageDeads {
-            let mut v = StorageDeads {
-                locals: BitSet::new_empty(ccx.body.local_decls.len()),
-            };
-            v.visit_body(ccx.body);
-            v.locals
-        }).contains(local)
+                let mut v = StorageDeads { locals: BitSet::new_empty(ccx.body.local_decls.len()) };
+                v.visit_body(ccx.body);
+                v.locals
+            })
+            .contains(local)
 
 
     pub fn qualifs_in_return_place(&mut self) -> ConstQualifs {
Build completed unsuccessfully in 0:00:15

@RalfJung
Copy link
Member

by simply not checking for interior mutability on borrows of locals that have StorageDead and thus can never be leaked to the final value of the constant

That's only a partial solution though, right? It means we cannot support any getter-style methods such as

fn fst_ref<T, U>(x: &(T, U)) -> &T { &x.0 }

(and same for &mut I guess)

So, I am not sure if this is worth it, since it seems rather restrictive. (I have no idea what your previous plan was in #80373 since I hadn't gotten around to reading it before you closed it.)

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 27, 2020

That's only a partial solution though, right? It means we cannot support any getter-style methods such as

Uh... no. If the feature gate is active, we only error on locals that have no StorageDead. The only locals that do not have StorageDead is uninhabited locals or locals in the trailing expression of a constant. This is the static analysis that goes in hand with the dynamic interior mutability behind reference check in validation. The only way a local can end up getting interned is by it being part of the trailing expression. So we only need to check Rvalue::Ref of locals that do not get erased before the end of the CTFE run, as all other pointers would become dangling, and we do not/cannot protect against dangling pointers anyway beyond the dynamic check.

@rust-log-analyzer

This comment has been minimized.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 28, 2020

Note that we can just outright not check anything in const fn, as the borrow checker will already do that for us. You're right though, as implemented there will be problems with (re)borrowing function arguments.

@RalfJung
Copy link
Member

RalfJung commented Dec 28, 2020

Uh... no. If the feature gate is active, we only error on locals that have no StorageDead. The only locals that do not have StorageDead is uninhabited locals or locals in the trailing expression of a constant. This is the static analysis that goes in hand with the dynamic interior mutability behind reference check in validation. The only way a local can end up getting interned is by it being part of the trailing expression. So we only need to check Rvalue::Ref of locals that do not get erased before the end of the CTFE run, as all other pointers would become dangling, and we do not/cannot protect against dangling pointers anyway beyond the dynamic check.

I think I am lost, I have no idea what this means. Having no StorageDead can also arise because there was no StorageLive, either.

And how does the referent matter, isn't if the reference for which we care if it ends up in the final value? When we see an expression let _x = &var.field.field, we care about whether the reference ends up in the final value, so how does looking at var tell us anything about that...?!? I am confused.^^

Unless you mean we are checking the _x (I haven't though that one though), but you keep saying "reference of locals that have no StorageDead", which to me means you are checking var, not _x.

Some examples could help. :)

@RalfJung
Copy link
Member

Oh, I think I get it -- if the referent has a StorageDead, then there simply is no way that it can still live during interning, so it cannot be part of the final value.

But I am a bit worried about drawing ourselves into a corner here, since it doesn't work for reborrowing getters.

@RalfJung
Copy link
Member

That said, that is a concern for stabilization -- for experimentation, this approach seems fine.

I hope I'll have tome to review the code changes tomorrow.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 28, 2020

Having no StorageDead can also arise because there was no StorageLive, either.

Afaict, in constants, that's not an issue, and even if it were, all that would mean is that our check forbids more things than we want. That's ok, we can be more permissive later.

And how does the referent matter, isn't if the reference for which we care if it ends up in the final value? When we see an expression let _x = &var.field.field, we care about whether the reference ends up in the final value, so how does looking at var tell us anything about that...?!? I am confused.^^

Unless you mean we are checking the _x (I haven't though that one though), but you keep saying "reference of locals that have no StorageDead", which to me means you are checking var, not _x.

Some examples could help. :)

Heh, ok, yea, I see the misunderstanding, and it is exactly the problem I had with building a static analyis for this until eddyb gave me the enlightening clue. Let's start with an example:

const FOO: &Cell<i32> = &Cell::new(42);

expands to something like the following MIR:

StorageLive(_2)
_2 = 42;
StorageLIve(_1)
_1 = Cell::new(42) -> bb2

bb2:
StorageDead(_2)
_0 = &_1
return

Now, you are right, _0 has neither StorageLive nor StorageDead, as its storage is handled by the "caller" (at least in a normal function). Similarly function arguments have no storage markers. The problematic operation in the above example is the &_1, as that creates a &'static Cell<i32> within a constant. And we know it's 'static, because of the escaping scopes rule: _1's scope is 'static, because it is a temporary in the trailing expression of a constant. Thus it also has no StorageDead marker. All this PR does is forbid Rvalue::Ref if the local (_1 in our case) has no StorageDead markers and the local's type (Cell::<i32> in this case) has interior mutability. By the way MIR is built, we know that this can only happen for the escaping locals. Our dynamic checks for dangling pointers do the same thing, but dynamically. They check whether the allocation still exists at the end. The only way that can happen is if they do not have a StorageDead statement.

So with this new analysis, you either get a hard error because you tried to take a reference to a local that has interior mutability and lives forever, or you get a validation error later because you created a dangling pointer. We can talk about dangling pointers independently, they are entirely orthogonal to preventing references to interior mutable data in constants. You can create the easily without having to do anything related to (interior) mutability.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 28, 2020

But I am a bit worried about drawing ourselves into a corner here, since it doesn't work for reborrowing getters.

I added this to a test, and it works. We don't actually need to check for such borrows in const fn at all, because the borrow checker will already prevent you from returning references to local data (and there is no enlosing scope rule for const fn).

@RalfJung
Copy link
Member

RalfJung commented Dec 28, 2020

I added this to a test, and it works. We don't actually need to check for such borrows in const fn at all, because the borrow checker will already prevent you from returning references to local data (and there is no enlosing scope rule for const fn).

Uh, I don't see the obvious argument for why this is sound. The returned reference could come from various places, including things like Box::leak.

I also saw this in the code:

A set that stores for each local whether it has a StorageDead for it somewhere.

The soundness argument above only works if StorageDead is executed on the referent in some block that post-dominates the block where the reference happens. It is not enough for there to be a StorageDead just anywhere. So this seems suspicious to me.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 28, 2020

Uh, I don't see the obvious argument for why this is sound. The returned reference could come from various places, including things like Box::leak.

For a const fn by itself, this is a non-issue. It's totally fine for a const fn to produce &'static Cell<i32>. Of course, now you can create a const item and call such a const fn there. I am not sure yet how to handle this, and we will need to consider cases like this anyway when stabilizing const heap. But right now there is nothing that could possibly be problematic.

The only places other than a heap allocation that you can return references to is things that the function arguments point to (if function arguments are references or pointers), or static items. const fn cannot refer to static items, thus we are fine here. Anything that you can point to at the argument site must obviously be something at the caller site, and thus follows whatever rules apply to the caller. Either it's a const item, then it can't possibly be anything but a promoted value or a non-escaping local (as the enclosing scope rule does not apply to function arguments).

The soundness argument above only works if StorageDead is executed on the referent in some block that post-dominates the block where the reference happens. It is not enough for there to be a StorageDead just anywhere. So this seems suspicious to me.

I... don't know how we could have MIR like that. It's a valid concern that we can resolve by adding MIR validation checks I believe. It should never happen afaik.

We can additionally count the number of assignments to the local, and if there is only one assignment and only one StorageDead, the StorageDead must come after the assignment (by construction, not by possible values of mir::Body).

@RalfJung
Copy link
Member

For a const fn by itself, this is a non-issue. It's totally fine for a const fn to produce &'static Cell. Of course, now you can create a const item and call such a const fn there. I am not sure yet how to handle this, and we will need to consider cases like this anyway when stabilizing const heap. But right now there is nothing that could possibly be problematic.

I think there is something that could possibly be problematic: a const using such a const fn to put an &Cell<i32> in its result.

The only places other than a heap allocation that you can return references to is things that the function arguments point to (if function arguments are references or pointers), or static items. const fn cannot refer to static items, thus we are fine here. Anything that you can point to at the argument site must obviously be something at the caller site, and thus follows whatever rules apply to the caller. Either it's a const item, then it can't possibly be anything but a promoted value or a non-escaping local (as the enclosing scope rule does not apply to function arguments).

Ah I see. That argument makes sense, but it very fundamentally relies on "no heap allocations". So if we stabilize this we say goodbye to some ways of supporting heap allocations -- IOW, we shouldn't stabilize this before having figured out the heap allocation story.

We can additionally count the number of assignments to the local, and if there is only one assignment and only one StorageDead, the StorageDead must come after the assignment (by construction, not by possible values of mir::Body).

I don't follow this argument. "By construction" sounds very similar to "by the possible shapes of a MIR body" to me.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 28, 2020

I don't follow this argument. "By construction" sounds very similar to "by the possible shapes of a MIR body" to me.

Oop, I meant by mir building ("mir construction"), not the formal "by construction".

we shouldn't stabilize this before having figured out the heap allocation story.

I can absolutely live with that. We should probably explore this feature with heap on nightly. Actually, this PR is necessary to even be able to experiment with generic heap abstractions, as that requires borrowing generic types.

@RalfJung
Copy link
Member

Oop, I meant by mir building ("mir construction"), not the formal "by construction".

Hm, that still seems like a rather fragile invariant.

At the very least, the code (and tracking issue) should explicitly spell out what we are relying on: if a StorageDead exists for a local, then for every assignment to the local, on each pass from that assignment to any return, there will be a StorageDead for this local.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 29, 2020

if a StorageDead exists for a local, then for every assignment to the local, on each pass from that assignment to any return, there will be a StorageDead for this local.

Added to the unresolved questions in the tracking issue.

@oli-obk
Copy link
Contributor Author

oli-obk commented Dec 29, 2020

I think there is something that could possibly be problematic: a const using such a const fn to put an &Cell<i32> in its result

yes, but that's what I mean. right now there is no way to write such a const fn. So we need to find a solution that considers const heap and references to interior mutable data at the same time. Adding that to the tracking issues of both features right now.

@bors
Copy link
Contributor

bors commented Dec 31, 2020

☔ The latest upstream changes (presumably #80459) made this pull request unmergeable. Please resolve the merge conflicts.

@oli-obk oli-obk force-pushed the this_could_have_been_so_simple branch from 9e9fe1d to 9c9b605 Compare January 1, 2021 14:14
oli-obk and others added 3 commits January 3, 2021 15:15
@oli-obk oli-obk mentioned this pull request Jan 3, 2021
@RalfJung
Copy link
Member

RalfJung commented Jan 3, 2021

Beautiful, thanks for taking care of my nitpicking. :)
@bors r+

@bors
Copy link
Contributor

bors commented Jan 3, 2021

📌 Commit 90b56b9 has been approved by RalfJung

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 3, 2021
@bors
Copy link
Contributor

bors commented Jan 3, 2021

⌛ Testing commit 90b56b9 with merge d3bc943b9a4dfe86ea7c56c5b4af86a1aba1b08b...

@rust-log-analyzer
Copy link
Collaborator

The job test-various failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
---- [ui] ui/consts/const-binops.rs stdout ----

error: test run failed!
status: exit code: 6
command: "/node-v14.4.0-linux-x64/bin/node" "/checkout/src/etc/wasm32-shim.js" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/consts/const-binops/a.wasm"
------------------------------------------

------------------------------------------
stderr:
stderr:
------------------------------------------
internal/per_context/primordials.js:102
  class SafeSet extends Set {}


TypeError: object is not iterable (cannot read property Symbol(Symbol.iterator))
------------------------------------------



---

Some tests failed in compiletest suite=ui mode=ui host=x86_64-unknown-linux-gnu target=wasm32-unknown-unknown


command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/wasm32-unknown-unknown/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/ui" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui" "--stage-id" "stage2-wasm32-unknown-unknown" "--suite" "ui" "--mode" "ui" "--target" "wasm32-unknown-unknown" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/checkout/obj/build/x86_64-unknown-linux-gnu/llvm/build/bin/FileCheck" "--nodejs" "/node-v14.4.0-linux-x64/bin/node" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/wasm32-unknown-unknown/native/rust-test-helpers" "--docck-python" "/usr/bin/python3" "--lldb-python" "/usr/bin/python3" "--gdb" "/usr/bin/gdb" "--llvm-version" "11.0.0-rust-1.51.0-nightly" "--llvm-components" "aarch64 aarch64asmparser aarch64codegen aarch64desc aarch64disassembler aarch64info aarch64utils aggressiveinstcombine all all-targets analysis arm armasmparser armcodegen armdesc armdisassembler arminfo armutils asmparser asmprinter avr avrasmparser avrcodegen avrdesc avrdisassembler avrinfo binaryformat bitreader bitstreamreader bitwriter cfguard codegen core coroutines coverage debuginfocodeview debuginfodwarf debuginfogsym debuginfomsf debuginfopdb demangle dlltooldriver dwarflinker engine executionengine extensions frontendopenmp fuzzmutate globalisel gtest gtest_main hexagon hexagonasmparser hexagoncodegen hexagondesc hexagondisassembler hexagoninfo instcombine instrumentation interpreter ipo irreader jitlink libdriver lineeditor linker lto mc mca mcdisassembler mcjit mcparser mips mipsasmparser mipscodegen mipsdesc mipsdisassembler mipsinfo mirparser msp430 msp430asmparser msp430codegen msp430desc msp430disassembler msp430info native nativecodegen nvptx nvptxcodegen nvptxdesc nvptxinfo objcarcopts object objectyaml option orcerror orcjit passes powerpc powerpcasmparser powerpccodegen powerpcdesc powerpcdisassembler powerpcinfo profiledata remarks riscv riscvasmparser riscvcodegen riscvdesc riscvdisassembler riscvinfo riscvutils runtimedyld scalaropts selectiondag sparc sparcasmparser sparccodegen sparcdesc sparcdisassembler sparcinfo support symbolize systemz systemzasmparser systemzcodegen systemzdesc systemzdisassembler systemzinfo tablegen target testingsupport textapi transformutils vectorize webassembly webassemblyasmparser webassemblycodegen webassemblydesc webassemblydisassembler webassemblyinfo windowsmanifest x86 x86asmparser x86codegen x86desc x86disassembler x86info xray" "--cc" "" "--cxx" "" "--cflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"


failed to run: /checkout/obj/build/bootstrap/debug/bootstrap --stage 2 test --host= --target wasm32-unknown-unknown src/test/run-make src/test/ui src/test/mir-opt src/test/codegen-units library/core
Build completed unsuccessfully in 0:22:00

@bors
Copy link
Contributor

bors commented Jan 3, 2021

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jan 3, 2021
@oli-obk
Copy link
Contributor Author

oli-obk commented Jan 4, 2021

@bors retry

I just ran the wasm tests locally and they passed, so I call spurious

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 4, 2021
@bors
Copy link
Contributor

bors commented Jan 4, 2021

⌛ Testing commit 90b56b9 with merge 8989689...

@bors
Copy link
Contributor

bors commented Jan 4, 2021

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 8989689 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 4, 2021
@bors bors merged commit 8989689 into rust-lang:master Jan 4, 2021
@rustbot rustbot added this to the 1.51.0 milestone Jan 4, 2021
@usbalbin usbalbin mentioned this pull request Jan 18, 2021
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants