Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Fix unsoundness in ethash's unsafe code #6140

Merged
merged 5 commits into from
Jul 28, 2017

Conversation

eira-fransham
Copy link
Contributor

@eira-fransham eira-fransham commented Jul 25, 2017

some_vec.resize(n, mem::uninitialized()) is unsound since resize reads the value it is passed. Since it just writes it to the allocated block, this will currently (probably) just lead to LLVM optimising only the copies out and leaving the self.len = n, making this call equivalent to set_len. Since theoretically LLVM could do anything with these reads using set_len is more correct. This also relies on resize not, for example, branching on the passed value.

Additionally, reading beyond the end of an &mut [T; N] is unsound, although there is data there in this case. This is less likely to cause issues but if we're going to use unsafety it'd be better to keep it sound. I think that the Rust team is within rights to take advantage of this as an optimisation opportunity (which would, of course, make this problematic), but I don't know.

@eira-fransham eira-fransham changed the title Fix unsoundness in uses of unsafety Fix unsoundness in unsafety in ethash Jul 25, 2017
@eira-fransham eira-fransham changed the title Fix unsoundness in unsafety in ethash Fix unsoundness in ethash's unsafe code Jul 25, 2017
mem::transmute(&nonce),
s_mix.get_unchecked_mut(0).bytes[32..].as_mut_ptr(),
8
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe use copy_from_slice instead?

Copy link
Contributor Author

@eira-fransham eira-fransham Jul 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing is that unsafe code (mem::uninitialized) cannot trust safe code (copy_from_slice). copy_from_slice can assume that its data is initialized, whereas copy_nonoverlapping, as an unsafe function, cannot.

EDIT: This could be elided in some cases, but this is partially initialized in sha3_512 and therefore we explicitly use uninit. I'll add a comment.

let (f_mix, mut mix) = s_mix.split_at_mut(1);
let (f_mix, mix) = s_mix.split_at_mut(1);
let mix_words: &mut [u32; MIX_WORDS] = make_const_array!(MIX_WORDS, mix);

for w in 0..MIX_WORDS {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also unroll! this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can! It's probably a good candidate, too.

f_mix.get_unchecked(0).as_words().get_unchecked(0) ^ i,
*mix.get_unchecked(0).as_words().get_unchecked(i as usize % MIX_WORDS)
) % num_full_pages;
for i in 0..ETHASH_ACCESSES as u32 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you remove unroll! here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because ETHASH_ACCESSES is 64 and 64 * 32 = 2048. We don't want 2048 copies of the same code. Best to leave the outer loop as a real loop.

mix.get_unchecked_mut(0).bytes.as_ptr(),
mix_hash.as_mut_ptr(),
32,
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy_from_slice?

Copy link
Contributor Author

@eira-fransham eira-fransham Jul 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of our ptr::copy_nonoverlapping calls can be replaced with copy_from_slice, I'll rewrite all our code to use it in a seperate PR (unless you'd consider this a blocker).

EDIT: Ok, your other comment mentioning this got a 👍, I'll change the calls in this PR to use copy_from_slice and have a PR safe-ifying the rest of the code later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code was written before copy_from_slice was stabilized, but now we can use it.

let hash = [0xf5, 0x7e, 0x6f, 0x3a, 0xcf, 0xc0, 0xdd, 0x4b, 0x5b, 0xf2, 0xbe, 0xe4, 0x0a, 0xb3, 0x35, 0x8a, 0xa6, 0x87, 0x73, 0xa8, 0xd0, 0x9f, 0x5e, 0x59, 0x5e, 0xab, 0x55, 0x94, 0x05, 0x52, 0x7d, 0x72];
let nonce = 0xd7b3ac70a301a249;
let light = Light::new(486382);
let light = Light::new(env::temp_dir(), 486382);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to use tempdir crate

Copy link
Contributor Author

@eira-fransham eira-fransham Jul 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just copied this from another place in the code that did this. We don't already use tempdir, and this is just for a benchmark. I don't think it's a problem.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do use it in several places, but you are right env::temp_dir is way more common. I will create a separate issue for this

@debris debris added A0-pleasereview 🤓 Pull request needs code review. M4-core ⛓ Core client code / Rust. labels Jul 25, 2017
unsafe fn make_const_array<'a, T, U>(val: &'a mut [T]) -> &'a mut [U; $n] {
use ::std::mem;

debug_assert_eq!(val.len() * mem::size_of::<T>(), $n * mem::size_of::<U>());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: dependent typing :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If only, if only.

for w in 0..MIX_WORDS {
*mix.get_unchecked_mut(0).as_words_mut().get_unchecked_mut(w) = *f_mix.get_unchecked(0).as_words().get_unchecked(w % NODE_WORDS);
*mix_words.get_unchecked_mut(w) =
Copy link
Contributor

@rphmeier rphmeier Jul 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to clarify: what was happening before was that it read into mix[2] from a pointer to the words of mix[1]? and now it explicitly indexes into the one it should?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It got a pointer to mix[1], transmuted it to a [u32; NODE_WORDS], and then read between index 0 and NODE_WORDS * 2. Basically reading into mix[2], yeah. Now it transmutes to a [u32; NODE_WORDS * 2].

fn sha3_512(input: &[u8], output: &mut [u8]) {
unsafe { sha3::sha3_512(output.as_mut_ptr(), output.len(), input.as_ptr(), input.len()) };
}

#[inline]
fn sha3_512_inplace(input: &mut [u8]) {
unsafe { sha3::sha3_512(input.as_mut_ptr(), input.len(), input.as_ptr(), input.len()) };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with a comment explaining why it's safe? couldn't a sha3 implementation technically optimize by using the output buffer for absorbing + squeezing instead of an internal one which is later copied?

// `sha3_512`.
let mut buf: [u8; 64 + 32] = mem::uninitialized();

ptr::copy_nonoverlapping(header_hash.as_ptr(), buf.as_mut_ptr(), 32);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy_from_slice is exactly the same thing as copy_nonoverlapping link

Copy link
Contributor Author

@eira-fransham eira-fransham Jul 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for uninit. The docs for mem::uninitialized specifically mention that the ptr::* functions are the only valid way to initialise uninitialised memory. In general, you cannot call any safe function with mem::uninitialized (this is what was wrong with vec.resize(n, mem::uninitialized()), which was part of the original PR), since safe functions must be allowed to trust their input. copy_from_slice would work but would be unsound and future optimisations may break it.

@debris debris added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Jul 27, 2017
@arkpar
Copy link
Collaborator

arkpar commented Jul 27, 2017

please revert changes to ethcore/res/ethereum/tests

@arkpar arkpar added A7-looksgoodcantmerge 🙄 Pull request is reviewed well, but cannot be merged due to conflicts. and removed A8-looksgood 🦄 Pull request is reviewed well. labels Jul 27, 2017
Vurich added 5 commits July 28, 2017 12:12
This commit also includes a completely absurd optimisation that I
promise is an honest win. You can check the benchmarks, I barely
believe it myself.
@eira-fransham
Copy link
Contributor Author

Fixed, should be ready to merge after CI. Sorry about that.

@debris debris added A8-looksgood 🦄 Pull request is reviewed well. and removed A7-looksgoodcantmerge 🙄 Pull request is reviewed well, but cannot be merged due to conflicts. labels Jul 28, 2017
@arkpar arkpar merged commit e84f308 into openethereum:master Jul 28, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A8-looksgood 🦄 Pull request is reviewed well. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants