Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug report: Due to incorrect use of implicit conversion(as), some cases do not work in 32-bit operating systems. #971

Closed
dyxushuai opened this issue Jan 11, 2024 · 5 comments · Fixed by #978
Labels
bug Something isn't working

Comments

@dyxushuai
Copy link
Contributor

dyxushuai commented Jan 11, 2024

Root cause

I found revm used the implicit conversion as without checking the overflow, like:

as_u64_saturated!($v) as usize

and so on.

In revm we usually convert u64 to usize. Therefore, it's perfectly fine when running on a 64-bit OS. However, in a 32-bit OS, the rules are different, and it will silently overflow and return unsound values.

Case study;TLDR

Failed case from zeth: risc0/zeth#52

What happened?

When you track the traces of the revm execution, you will find:

depth:1, PC:0, gas:0x9ffffadf8(42949651960), OPCODE: "PUSH15"(110)  refund:0x0(0) Stack:[], Data size:0
depth:1, PC:16, gas:0x9ffffadf5(42949651957), OPCODE: "PUSH1"(96)  refund:0x0(0) Stack:[0x0000000000000000000000000000000000112233445566778899aabbccddeeff_U256], Data size:0
depth:1, PC:18, gas:0x9ffffadf2(42949651954), OPCODE: "MSTORE"(82)  refund:0x0(0) Stack:[0x0000000000000000000000000000000000112233445566778899aabbccddeeff_U256, 0x0_U256], Data size:0
depth:1, PC:19, gas:0x9ffffadec(42949651948), OPCODE: "PUSH1"(96)  refund:0x0(0) Stack:[], Data size:32
depth:1, PC:21, gas:0x9ffffade9(42949651945), OPCODE: "PUSH1"(96)  refund:0x0(0) Stack:[0x000000000000000000000000000000000000000000000000000000000000003f_U256], Data size:32
depth:1, PC:23, gas:0x9ffffade6(42949651942), OPCODE: "EXP"(10)  refund:0x0(0) Stack:[0x000000000000000000000000000000000000000000000000000000000000003f_U256, 0x0000000000000000000000000000000000000000000000000000000000000002_U256], Data size:32
depth:1, PC:24, gas:0x9ffffadaa(42949651882), OPCODE: "PUSH1"(96)  refund:0x0(0) Stack:[0x0000000000000000000000000000000000000000000000008000000000000000_U256], Data size:32
depth:1, PC:26, gas:0x9ffffada7(42949651879), OPCODE: "PUSH1"(96)  refund:0x0(0) Stack:[0x0000000000000000000000000000000000000000000000008000000000000000_U256, 0x000000000000000000000000000000000000000000000000000000000000003f_U256], Data size:32
depth:1, PC:28, gas:0x9ffffada4(42949651876), OPCODE: "EXP"(10)  refund:0x0(0) Stack:[0x0000000000000000000000000000000000000000000000008000000000000000_U256, 0x000000000000000000000000000000000000000000000000000000000000003f_U256, 0x0000000000000000000000000000000000000000000000000000000000000002_U256], Data size:32
depth:1, PC:29, gas:0x9ffffad68(42949651816), OPCODE: "PUSH1"(96)  refund:0x0(0) Stack:[0x0000000000000000000000000000000000000000000000008000000000000000_U256, 0x0000000000000000000000000000000000000000000000008000000000000000_U256], Data size:32
depth:1, PC:31, gas:0x9ffffad65(42949651813), OPCODE: "RETURNDATACOPY"(62)  refund:0x0(0) Stack:[0x0000000000000000000000000000000000000000000000008000000000000000_U256, 0x0000000000000000000000000000000000000000000000008000000000000000_U256, 0x0_U256], Data size:32
+ depth:1, PC:32, gas:0x9ffffad62(42949651810), OPCODE: "PUSH1"(96)  refund:0x0(0) Stack:[], Data size:32
+ depth:1, PC:34, gas:0x9ffffad5f(42949651807), OPCODE: "MLOAD"(81)  refund:0x0(0) Stack:[0x0_U256], Data size:32
+ depth:1, PC:35, gas:0x9ffffad5c(42949651804), OPCODE: "PUSH1"(96)  refund:0x0(0) Stack:-[0x0000000000000000000000000000000000112233445566778899aabbccddeeff_U256], Data size:32
+ depth:1, PC:37, gas:0x9ffffad59(42949651801), OPCODE: "SSTORE"(85)  refund:0x0(0) Stack:[0x0000000000000000000000000000000000112233445566778899aabbccddeeff_U256, 0x0_U256], Data size:32

The binary running in the risc0-vm has over 4 traces, but it will terminate and return OutOfGas when reaching the RETURNDATACOPY opcode if run outside of it. Both of them have the same program, but what the difference between them?
The RETURNDATACOPY has 3 parameters1:

Parameter Value
destOffset 0x0
offset 0x0000000000000000000000000000000000000000000000008000000000000000
size 0x0000000000000000000000000000000000000000000000008000000000000000

The gas cost of this opcode will be 864691128455135253, which is far greater than the gas limit.
We discovered that the subsequent PUSH1 opcode has 42949651810 remaining gas, whereas the previous one had 42949651813, indicating a cost of only three gases. This is equivalent to the static gas of RETURNDATACOPY. Therefore, the total gas was not calculated using the size value (dynamic gas was missing).

pub fn returndatacopy<H: Host, SPEC: Spec>(interpreter: &mut Interpreter, _host: &mut H) {
check!(interpreter, BYZANTIUM);
pop!(interpreter, memory_offset, offset, len);
let len = as_usize_or_fail!(interpreter, len);
gas_or_fail!(interpreter, gas::verylowcopy_cost(len as u64));
let data_offset = as_usize_saturated!(offset);
let (data_end, overflow) = data_offset.overflowing_add(len);
if overflow || data_end > interpreter.return_data_buffer.len() {
interpreter.instruction_result = InstructionResult::OutOfOffset;
return;
}
if len != 0 {
let memory_offset = as_usize_or_fail!(interpreter, memory_offset);
memory_resize!(interpreter, memory_offset, len);
interpreter.memory.set(
memory_offset,
&interpreter.return_data_buffer[data_offset..data_end],
);
}
}

The only reason here is that gas::verylowcopy_cost(len as u64) returns 3, which should not be. Unless there are some issues with the risc0-vm or riscv32im-risc0-zkvm-elf target. However, no matter how the code is modified, this bug can be reproduced stably and can largely eliminate some UBs.

pub fn verylowcopy_cost(len: u64) -> Option<u64> {
let wordd = len / 32;
let wordr = len % 32;
VERYLOW.checked_add(COPY.checked_mul(if wordr == 0 { wordd } else { wordd + 1 })?)
}

So, I found the line get len return 0.

let len = as_usize_or_fail!(interpreter, len);

macro_rules! as_usize_or_fail {
($interp:expr, $v:expr) => {
as_usize_or_fail!($interp, $v, InstructionResult::InvalidOperandOOG)
};
($interp:expr, $v:expr, $reason:expr) => {{
let x = $v.as_limbs();
if x[1] != 0 || x[2] != 0 || x[3] != 0 {
$interp.instruction_result = $reason;
return;
}
x[0] as usize
}};
}

Since the value of len is 9223372036854776000, which exceeds usize::MAX in a 32-bit OS, an overflow occurred and returned a value of 0.

How to fix?

We cannot resolve this issue without making significant modifications to the codebase. Because revm used lots of usize, like the index&offset of Memory, all opcodes that use the len and so on.

Approach 1

I think a good way is to align with the implementation of geth2, all of these places, we use the u64 instead of usize, like use a u64 indexed Memory.

https://github.com/ethereum/go-ethereum/blob/master/core/vm/memory.go#L35

Approach 2

Let it fail fast, only return an error in cases of casting overflow, such as CastOVerflow.
To prevent developers from using the as keyword for casting, we can add a #![warn(clippy::cast_lossless)] compile attribute to perform a Clippy lint check3.
And it should be:

usize::try_from(x[0]).map_error(|_|CastOverflow)

However, this will cause inconsistent behavior between 32-bit and 64-bit operating systems.

Conclusion

In any case, it is crucial to avoid using as.

Footnotes

  1. https://www.evm.codes/#3e?fork=shanghai

  2. https://github.com/ethereum/go-ethereum

  3. https://rust-lang.github.io/rust-clippy/master/#/cast_lossless

@Wollac
Copy link
Contributor

Wollac commented Jan 11, 2024

I did a quick test and noticed that the eth tests also fail when using --target=i686-unknown-linux-gnu. This is definitely related and might be a good thing to check.

@rakita
Copy link
Member

rakita commented Jan 11, 2024

This is a nice catch!

I quickly searched for as usize and I think there are only few places where sensitive conversion happens and they are encapsulated inside those macros. So what I would do is just make those changes in revm and make conversion safe.

@rakita rakita added the bug Something isn't working label Jan 11, 2024
@dyxushuai
Copy link
Contributor Author

dyxushuai commented Jan 11, 2024

This is a nice catch!

I quickly searched for as usize and I think there are only few places where sensitive conversion happens and they are encapsulated inside those macros. So what I would do is just make those changes in revm and make conversion safe.

If you use the try_from to avoid overflow, this will encounter the problem that I mentioned in approach 2:

However, this will cause inconsistent behavior between 32-bit and 64-bit operating systems.

Different running results will be generated in 32-bit and 64-bit systems, which violates the consensus of Ethereum.

For approach 1

The only difficulty is that in Memory use Vec<u8> as memory layout, but Vec has usize as its indexer type.


We need to convert the usize indexer into u64 indexer. However, the addressing space of usize on a 32-bit machine is only u32::MAX. Anyway, we are not really allocating u64::MAX memory on a 32-bit machine because of the MMU.
In golang, the slice can accept an uint64 type numeric as indexer. Unfortunately, rust doesn't have this feature.
https://github.com/ethereum/go-ethereum/blob/daa2e5d6a66833b9834b60a3a46835610bbde99a/core/vm/memory.go#L35

// Set sets offset + size to value
func (m *Memory) Set(offset, size uint64, value []byte) {
	// It's possible the offset is greater than 0 and size equals 0. This is because
	// the calcMemSize (common.go) could potentially return 0 when size is zero (NO-OP)
	if size > 0 {
		// length of store may never be less than offset + size.
		// The store should be resized PRIOR to setting the memory
		if offset+size > uint64(len(m.store)) {
			panic("invalid memory: store empty")
		}
		copy(m.store[offset:offset+size], value)
	}
}

So, my suggestion is We can use two vectors as high and low bits.

pub struct Memory {
    low: Vec<u8>,
	high: Vec<u8>,
}

impl Memory {
	pub fn set_byte(&mut self, index: u64, byte: u8) {
		if index > usize::MAX {
			// store byte in high field
		} else {
			// store byte in low field
		}
    }
}

@rakita
Copy link
Member

rakita commented Jan 12, 2024

Different running results will be generated in 32-bit and 64-bit systems, which violates the consensus of Ethereum.

@dyxushuai I think you are overthinking things. It is not possible to use more than u32 size of memory as the gas calculation limits it (and would throw OutOfGas), that is why we are casting U256 to usize or saturated usize. Those macros are made especially for this.

The idea would be to modify these macros:

macro_rules! as_usize_saturated {
($v:expr) => {
as_u64_saturated!($v) as usize
};
}
macro_rules! as_usize_or_fail {
($interp:expr, $v:expr) => {
as_usize_or_fail!($interp, $v, InstructionResult::InvalidOperandOOG)
};
($interp:expr, $v:expr, $reason:expr) => {{
let x = $v.as_limbs();
if x[1] != 0 || x[2] != 0 || x[3] != 0 {
$interp.instruction_result = $reason;
return;
}
x[0] as usize
}};
}

@dyxushuai
Copy link
Contributor Author

The idea would be to modify these macros:

Cool, I can help with this fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants