Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix BTreeMap UB #56648

Merged
merged 3 commits into from
Dec 16, 2018
Merged

Fix BTreeMap UB #56648

merged 3 commits into from
Dec 16, 2018

Conversation

RalfJung
Copy link
Member

@RalfJung RalfJung commented Dec 9, 2018

BTreeMap currently causes UB by created a shared reference to a too-small allocation. This PR fixes that by introducing a NodeHeader type and using that until we really need access to the key/value arrays. Avoiding run-time checks in into_key_slice was somewhat tricky, see the comments embedded in the code.

I also adjusted as_leaf_mut to return a raw pointer, because creating a mutable reference asserts that there are no aliases to the pointee, but that's not always correct: We use as_leaf_mut twice to create two mutable slices for keys and values; the second call overlaps with the first slice and hence is not a unique pointer.

Fixes #54957

Cc @nikomatsakis @gankro

@rust-highfive
Copy link
Collaborator

r? @sfackler

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 9, 2018
@sfackler
Copy link
Member

sfackler commented Dec 9, 2018

Would it be worth putting NodeHeader fields in the node types directly to avoid accidentally desynchronizing them?

The K2 shenanigans seem unfortunate but I'll defer to your expertise that they're necessary :)

@RalfJung
Copy link
Member Author

RalfJung commented Dec 9, 2018

Would it be worth putting NodeHeader fields in the node types directly to avoid accidentally desynchronizing them?

Unfortunately that will increase their size, because there'll be padding after len inside NodeHeader. (This is what I meant with the comment "LeafNode does not just contain a NodeHeader because we do not want unnecessary padding between len and the keys" in the code.)

@sfackler
Copy link
Member

sfackler commented Dec 9, 2018

Ah right.

Copy link
Contributor

@Gankra Gankra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shrug I guess

src/liballoc/collections/btree/node.rs Outdated Show resolved Hide resolved
Co-Authored-By: RalfJung <post@ralfj.de>
@RalfJung
Copy link
Member Author

shrug I guess

I'm always open for suggestions for better solutions ;)

@sfackler
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Dec 12, 2018

📌 Commit d9c64e5 has been approved by sfackler

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 12, 2018
Centril added a commit to Centril/rust that referenced this pull request Dec 16, 2018
Fix BTreeMap UB

BTreeMap currently causes UB by created a shared reference to a too-small allocation.  This PR fixes that by introducing a `NodeHeader` type and using that until we really need access to the key/value arrays.  Avoiding run-time checks in `into_key_slice` was somewhat tricky, see the comments embedded in the code.

I also adjusted `as_leaf_mut` to return a raw pointer, because creating a mutable reference asserts that there are no aliases to the pointee, but that's not always correct: We use `as_leaf_mut` twice to create two mutable slices for keys and values; the second call overlaps with the first slice and hence is not a unique pointer.

Fixes rust-lang#54957

Cc @nikomatsakis @gankro
bors added a commit that referenced this pull request Dec 16, 2018
Rollup of 20 pull requests

Successful merges:

 - #53506 (Documentation for impl From for AtomicBool and other Atomic types)
 - #56343 (Remove not used mod)
 - #56439 (Clearer error message for dead assign)
 - #56640 (Add FreeBSD unsigned char platforms to std::os::raw)
 - #56648 (Fix BTreeMap UB)
 - #56672 (Document time of back operations of a Linked List)
 - #56706 (Make `const unsafe fn` bodies `unsafe`)
 - #56742 (infer: remove Box from a returned Iterator)
 - #56761 (Suggest using `.display()` when trying to print a `Path`)
 - #56781 (Update LLVM submodule)
 - #56789 (rustc: Add an unstable `simd_select_bitmask` intrinsic)
 - #56790 (Make RValue::Discriminant a normal Shallow read)
 - #56793 (rustdoc: look for comments when scraping attributes/crates from doctests)
 - #56826 (rustc: Add the `cmpxchg16b` target feature on x86/x86_64)
 - #56832 (std: Use `rustc_demangle` from crates.io)
 - #56844 (Improve CSS rule)
 - #56850 (Fixed issue with using `Self` ctor in typedefs)
 - #56855 (Remove u8 cttz hack)
 - #56857 (Fix a small mistake regarding NaNs in a deprecation message)
 - #56858 (Fix doc of `std::fs::canonicalize`)

Failed merges:

 - #56741 (treat ref-to-raw cast like a reborrow: do a special kind of retag)

r? @ghost
@bors bors merged commit d9c64e5 into rust-lang:master Dec 16, 2018
// ever take a pointer past the first key.
static EMPTY_ROOT_NODE: LeafNode<(), ()> = LeafNode {
static EMPTY_ROOT_NODE: NodeHeader<(), ()> = NodeHeader {
parent: ptr::null(),
parent_idx: MaybeUninit::uninitialized(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uninitialized in a static is surprising, isn't it, it doesn't have much purpose other than if miri can potentially use it in analysis? Can't we easily just initialize this at no cost, since it's in a static?

Copy link
Member Author

@RalfJung RalfJung Dec 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The header is also used at run-time, and this data is not always initialized, so it needs to have type MaybeUninit.

I don't see the point of trying to initialize this to anything else.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only point would be to code defensively, sort of like the safe practices one learns to use in C. Requires saying that a bug with use of uninit value would be worse than a bug where we access an initialized but bogus value. (So this is where Miri comes in? It could diagnose the former better!?)

This is absolutely not a big deal, but I like to remember that we need to be defensive in Rust and not just have one line of defence.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If MaybeUninit::zeroed() was const fn I'd be happy to r+ a PR that uses that, but currently I don't see a good way to do anything more defensive. Do you have a concrete suggestion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if Maybeuninit::new(0) would work, that would be it, but that's all. I think it works?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to try if it works?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have only one project in rust-lang/rust right now, that's #56440, so I can try after that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BTreeSet causes UB by having (partially) dangling shared reference
6 participants