Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[move-vm] Removed most RCs from runtime #160

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Conversation

tnowacki
Copy link
Contributor

Motivation

  • RCs are no longer necessary due to Move's reference safety checks
  • Some RCs are still needed to update clean/dirty status or hold onto external memory safely
  • I feel this greatly simplifies the layout of values in value_impl.rs
  • It does however make write_ref unsafe, as it could lead to memory issues if the reference safety checks are wrong or not run
  • I do not have a reference heavy bench mark to test if this makes any perf difference. My guess is it won't, but might help with memory usage

Test Plan

  • ran tests
  • I'm a bit nervous that our tests might not be exhaustive enough though....

Copy link
Contributor

@vgao1996 vgao1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to take a closer look, but here are some early feedbacks regarding unsafe fn.

I think one thing to point out is that write_safe is probably not the only thing that's unsafe -- read_ref and more broadly, anything that dereferences a CheckedRef (a.k,a, raw pointers) are also unsafe. Consider this case:

let locals = Locals::new(4);

locals.store_loc(0, Value::u64(0));
locals.store_loc(1, Value::u64(0));

let r1 = locals.borrow_loc(0).unwrap();
let r2 = locals.borrow_loc(1).unwrap();

std::mem::drop(locals);

r1.equals(r2);

I'm aware this is a terrible way to use the value APIs, but this is exactly the point of marking them as unsafe -- forcing the Move VM to annotate its usages, making it easier for us to review them and make sure they are indeed doing sane things.

language/move-vm/types/src/values/values_impl.rs Outdated Show resolved Hide resolved
language/move-vm/types/src/values/values_impl.rs Outdated Show resolved Hide resolved
language/move-vm/types/src/values/values_impl.rs Outdated Show resolved Hide resolved
@tnowacki
Copy link
Contributor Author

tnowacki commented Apr 1, 2022

Still need to take a closer look, but here are some early feedbacks regarding unsafe fn.

I think one thing to point out is that write_safe is probably not the only thing that's unsafe -- read_ref and more broadly, anything that dereferences a CheckedRef (a.k,a, raw pointers) are also unsafe. Consider this case:

let locals = Locals::new(4);

locals.store_loc(0, Value::u64(0));
locals.store_loc(1, Value::u64(0));

let r1 = locals.borrow_loc(0).unwrap();
let r2 = locals.borrow_loc(1).unwrap();

std::mem::drop(locals);

r1.equals(r2);

I'm aware this is a terrible way to use the value APIs, but this is exactly the point of marking them as unsafe -- forcing the Move VM to annotate its usages, making it easier for us to review them and make sure they are indeed doing sane things.

Very perceptive :) I saw exactly this after working on #175, seems you just saw it a bit earlier than myself

@vgao1996
Copy link
Contributor

vgao1996 commented Apr 8, 2022

Not sure why the hardhat-tests are failing... Could you try to rebase?

Todd Nowacki and others added 4 commits April 8, 2022 13:39
- RCs are no longer necessary due to Move's reference safety checks
- Some RCs are still needed to update clean/dirty status or hold onto external memory safely
Copy link
Contributor

@vgao1996 vgao1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but let's leave some time for @wrwg to have a look, just to make sure he's on board with the general direction.

Copy link
Contributor

@wrwg wrwg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerned about the spread of unsafe across the code.

@sblackshear @davidiw

@@ -75,7 +75,7 @@ fn native_send(
let ext = context.extensions_mut().get_mut::<AsyncExtension>();
let mut bcs_args = vec![];
while args.len() > 2 {
bcs_args.push(pop_arg!(args, Vector).to_vec_u8()?);
bcs_args.push(pop_arg!(args, Vec<u8>));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! That actually works, great. Problem with that kind of magic is that the APIs aren't clearly documented and buried somewhere in all those trait impls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed its a bit hard to discover, not sure if @vgao1996 has any thoughts on that (to be clear the VMValueCast stuff has been around a while, not new to this PR)

.unwrap()
.value_as::<Reference>()?
.read_ref()?;
let key = unsafe { args.pop_back().unwrap().value_as::<Reference>()?.read_ref() };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering: can't we isolate the unsafe somewhere in the value impl? Really don't like to see this in user code.

//! However, it is **very important** that any instance of `Locals` is not dropped until after
//! all Values are consumed. If this is not followed, reading/writing to Values *will* read from
//! invalid pointers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is kind of bummer for usage of the VM. We want to use references in adapters (doesn't Sui does so) and custom embeddings of the VM, as well as native functions implementations, avoiding the need for using unsafe all over the code. Unsafe Rust should be isolated. Can you wrap the unsafe usage somewhere inside the VM?

@tnowacki
Copy link
Contributor Author

I think this is kind of bummer for usage of the VM. We want to use references in adapters (doesn't Sui does so) and custom embeddings of the VM, as well as native functions implementations, avoiding the need for using unsafe all over the code. Unsafe Rust should be isolated. Can you wrap the unsafe usage somewhere inside the VM?

Anything with references in this implementation will be inherently unsafe. Even reading from them could be unsafe, if an improper write occurs. So there isn't really way to encapsulate the logic, which we already do for adapters, could do some endpoints, and can't really do for native functions:

On native functions:
And there really isn't anything that can be done from the native function side of things, since native functions get references as inputs (and the native function implementation is not bound by Move's safety rules, the implementation could be wrong). Native functions already could have messed this up, but it would have been a graceful invariant violation. Now it will be some deeply disturbing, silent pointer failure. Which is potentially an issue. However, to me this doesn't feel like an issue in the sense that native functions are stating "I am extending the VM runtime", and as such you need to be just as safe as the VM runtime.

On the adapter:
None of the safety concerns are exposed to the adapter. Value is only used inside of the runtime. The adapter talks to the VM with BCS serialized values, both for inputs and outputs. So this "wrapping" is already done indirectly by the usage of serialized values.

For other endpoints (testing infra and such):
Its a bit annoying, and I could try to split the Value enum into reference and non reference. It would bloat the API but it would clean up at least a few endpoints... maybe...

@wrwg
Copy link
Contributor

wrwg commented Apr 12, 2022

I think this is kind of bummer for usage of the VM. We want to use references in adapters (doesn't Sui does so) and custom embeddings of the VM, as well as native functions implementations, avoiding the need for using unsafe all over the code. Unsafe Rust should be isolated. Can you wrap the unsafe usage somewhere inside the VM?

Anything with references in this implementation will be inherently unsafe. Even reading from them could be unsafe, if an improper write occurs. So there isn't really way to encapsulate the logic, which we already do for adapters, could do some endpoints, and can't really do for native functions:

I wonder whether we differentiate here between unsafeness regards the Rust and the Move language? Accessing a dangling Move reference may result in an undefined value, but it should not result in a process crash. In contrast, wherever you use unsafe Rust, a process crash can happen because of programmer mistakes. It is acceptable to do this in some isolated internal contexts, but not Rustonian to require unsafe in public APIs.

Perhaps we need a safe wrapper around the Value implementation? If it costs a bit extra, I think this is acceptable. We can also keep the unsafe API open for the implementation of certain critical native functions, but one should not be forced into using it.

@wrwg
Copy link
Contributor

wrwg commented Apr 12, 2022

Just a further note, not actionable but related to this PR.

If we want to have a more efficient implementation of values, I think we could go a different approach. What I sketch is motivated both by implementing the Move memory model on top of the EVM and for the Prover.

(a) Linear Memory

Borrow semantics is a model for linear addressable RAM, including pointer indirection and address arithmetic, and I think will be most efficiently implemented by reflecting those aspects. So why not represent any value as a byte-addressable region of memory. You can borrow offsets in this region, read values, and mutate them. An unsafe, untyped value representation would then be Value(*[u8]).

The lifetime of the region the value points to is guaranteed to be valid by our own implementation of memory allocation. We can also skip the GC (as Solidity does) assuming that transactions are short running. I.e. we use arena style allocation for *[u8]. The stack can still be shrinking and growing, but once allocated stack memory will stay in the VMs possession. In any case we can assume that *[u8] is always Rust-safe but possibly Move-unsafe, as we never give memory back to Rust.

As to how we layout memory, this was investigated as part of EVM Move. Assuming a field in a struct has a fixed offset, field borrowing simply amounts to Value(r) => Value(r[offs..offs+size]). Similar, borrowing a local amounts to slicing the stack frame, For globals we can give them their own region.

Nested structs can be inlined, or represented via indirections like vectors, which inherently need indirections because they have no fixed size.

(b) Generics and Variants

We can avoid enums for different value variants all together. A Value(*[u8]) can be only interpreted by providing a type. This type is either statically known to the program (no generics), or passed as a parameter to a function (with generics).

Since *[u8] always points to valid Rust memory and has a given size, we cannot destroy Rust integrity by using the wrong type when interpreting a value, including when mutating it.

There is one exception regards integrity, that is when we have an indirection to a vector (or struct if we choose so) embedded in *[u8]. In this case we must create a new Value from the data inside the region, which might be corrupted if we have the wrong type to interpret it. There are multiple ways how to deal with this:

  • Trust bytecode verification for internal VM code, and assume type soundness. For the external API, use ExtValue(TypeTag, Value)
  • Validate the indirection to be well-defined. As we know all memory we ever allocated, we can check whether the Value(*[u8]) constructed from the content of some other value actually points to a valid region. Valid means in the Rust sense, it can be still invalid in the Move sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants