Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Safe uninitialized byte arrays. #1222

Closed
wants to merge 2 commits into from

Conversation

Stebalien
Copy link
Contributor

@alexcrichton
Copy link
Member

Using uninitialized memory is undefined behavior in terms of the optimization passes in LLVM. For example:

pub fn byte() -> u8 {                             
    unsafe {                                      
        let mut v = Vec::<u8>::with_capacity(100);
        v.set_len(100);                           
        v[10]                                     
    }                                             
}                                                 

Will generate this IR:

define i8 @_ZN4byte20hcb9dde8b210369c6eaaE() unnamed_addr #0 {
"_ZN31collections..vec..Vec$LT$u8$GT$9drop.115617hdcd6a559cef13decE.exit3":
  ret i8 undef
}

This undef can leak into other functions it's inlined into, possibly eliminating bounds checks or other various usages which can lead to undefined behavior.

I think it's possible to get around this by forcing LLVM to consider the memory initialized, but it's unfortunately not as simple as "just making it safe"


Every addressable byte in allocated memory is a valid u8 (byte) by definition.
On Linux at least, one can read `/proc/self/mem` into a buffer so the following
two functions are (virtually) indistinguishable at runtime (on Linux):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for the fact that /proc is not necessarily mounted, e.g. a strict chroot might opt out of it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. But my point was that uninitialized bytes are still bytes.

@eddyb
Copy link
Member

eddyb commented Jul 21, 2015

I don't understand the "fast IO code" argument. In what cases do you have to expose uninitialized memory for efficiency?

@Stebalien
Copy link
Contributor Author

@eddyb rust-lang/rust#26950.

@huonw
Copy link
Member

huonw commented Jul 21, 2015

@eddyb, also, there's been a few discussion threads about it recently, e.g. https://users.rust-lang.org/t/reading-from-stdin-performance/2025 (and maybe some on reddit too).

@Stebalien
Copy link
Contributor Author

@alexcrichton I updated the RFC to introduce an intrinsic for asking LLVM to treat some memory as initialized. Unfortunately, I can't find LLVM documentation explaining how to do this.

@eddyb
Copy link
Member

eddyb commented Jul 21, 2015

Oh, I see, the read method takes &mut [u8] instead of some kind of fixed-capacity buffer.
I remember some discussion about zeroing back when the new I/O traits were introduced, but no mentions of how the read method had a suboptimal signature.

@eddyb
Copy link
Member

eddyb commented Jul 21, 2015

@Stebalien You don't need an intrinsic, the inline assembly in test::black_box should work just fine: it tells LLVM that anything could happen, including initialization of that memory.

@Stebalien
Copy link
Contributor Author

@eddyb, yes but I think the one we want is @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

@Gankra
Copy link
Contributor

Gankra commented Jul 21, 2015

IIRC the I/O design discussion featured this problem as a security issue: if you don't zero bytes, then you can leak arbitrary things like private keys that were previously allocated. As such, even if this intrinsic was introduced, it's not clear to me that the standard library would be able to leverage it.

@eddyb
Copy link
Member

eddyb commented Jul 21, 2015

@Stebalien no, that's for lifetimes of allocas within a function, for stack allocation.

@Stebalien
Copy link
Contributor Author

IIRC the I/O design discussion featured this problem as a security issue: if you don't zero bytes, then you can leak arbitrary things like private keys that were previously allocated. As such, even if this intrinsic was introduced, it's not clear to me that the standard library would be able to leverage it.

@gankro, I see. I remember seeing some security argument but I thought it was about reading one's own uninitialized memory (which didn't make sense), not accidentally leaking it. I agree, that's a very good reason not to allow this.

@Stebalien Stebalien closed this Jul 22, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants