You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Memory is one of the core elements of the EVM. Having it as optimized as possible would reduce the overall costs of all transactions ran through the EVM.
Here is my proposal: We can refactor the memory to be 32-bytes words based instead of the current 16-bytes words based implementation.
I started to implement the new version to run some benchmarks. Below are the related code snippets.
16-bytes words memory (current implementation)
fnstore(refself:Memory,element:u256,offset:usize){let x = testing::get_available_gas();
gas::withdraw_gas().unwrap();let new_min_bytes_len = helpers::ceil_bytes_len_to_next_32_bytes_word(offset + 32);let new_bytes_len = if new_min_bytes_len > self.bytes_len{
new_min_bytes_len
}else{self.bytes_len};self.bytes_len = new_bytes_len;// Check alignment of offset to bytes16 chunkslet(chunk_index, offset_in_chunk) = u32_safe_divmod(offset,u32_as_non_zero(16));if offset_in_chunk == 0{// Offset is aligned. This is the simplest and most efficient case,// so we optimize for it.self.items.store_u256(element, chunk_index);return();}// Offset is misaligned.// | W0 | W1 | w2 |// | EL_H | EL_L |// ^---^// |-- mask = 256 ** offset_in_chunklet mask:u256 = helpers::pow256_rev(offset_in_chunk);let mask_c:u256 = utils::pow(256,16).into() / mask;// Split the 2 input bytes16 chunks at offset_in_chunk.let(el_hh, el_hl) = u256_safe_div_rem(u256{low: element.high,high:0},u256_as_non_zero(mask_c));let(el_lh, el_ll) = u256_safe_div_rem(u256{low: element.low,high:0},u256_as_non_zero(mask_c));// Read the words at chunk_index, chunk_index + 2.let w0:u128 = self.items.get(chunk_index.into());let w2:u128 = self.items.get(chunk_index.into() + 2);// Compute the new wordslet w0_h:u256 = (w0.into() / mask);let w2_l:u256 = (w2.into() / mask);// We can convert them back to felt252 as we know they fit in one word.let new_w0:u128 = (w0_h.into()* mask + el_hh).try_into().unwrap();let new_w1:u128 = (el_hl.into()* mask + el_lh).try_into().unwrap();let new_w2:u128 = (el_ll.into()* mask + w2_l).try_into().unwrap();// Write the new wordsself.items.insert(chunk_index.into(), new_w0);self.items.insert(chunk_index.into() + 1, new_w1);self.items.insert(chunk_index.into() + 2, new_w2);(x - testing::get_available_gas()).print();}
32-bytes words memory (optimisation proposal)
fnstore(refself:Memory,element:u256,offset:usize){let x = testing::get_available_gas();
gas::withdraw_gas().unwrap();let new_min_bytes_len = helpers::ceil_bytes_len_to_next_32_bytes_word(offset + 32);let new_bytes_len = if new_min_bytes_len > self.bytes_len{
new_min_bytes_len
}else{self.bytes_len};self.bytes_len = new_bytes_len;// Check alignment of offset to 32-bytes chunkslet(chunk_index, offset_in_chunk) = u32_safe_divmod(offset,u32_as_non_zero(32));if offset_in_chunk == 0{// Offset is aligned. This is the simplest and most efficient case,// so we optimize for it.self.items.insert(chunk_index.into(),NullableExtensionTrait::new(element));return();}// Offset is misaligned.// | W0 | W1 |// | EL |// ^---^// |-- mask = 256 ** offset_in_chunklet mask:u256 = helpers::pow256_rev(offset_in_chunk);let mask_c:u256 = utils::pow(256,32).into() / mask;// Split the input two chunks at offset_in_chunk.let(el_h, el_l) = u256_safe_div_rem(element,u256_as_non_zero(mask_c));// Read the words at chunk_index, chunk_index + 1.let w0:u256 = self.items.get(chunk_index.into()).deref_or_0();let w1:u256 = self.items.get(chunk_index.into() + 1).deref_or_0();// Compute the new wordslet w0:u256 = (w0.into() / mask);let w1:u256 = (w1.into() / mask);// We can convert them back to felt252 as we know they fit in one word.let new_w0:u256 = (w0.into()* mask + el_h);let new_w1:u256 = (el_l.into()* mask + w1);// Write the new wordsself.items.insert(chunk_index.into(),NullableExtensionTrait::new(new_w0));self.items.insert(chunk_index.into() + 1,NullableExtensionTrait::new(new_w1));(x - testing::get_available_gas()).print();}}
Benchmarks
I ran tests on both implementations and computed the actual gas usage of both functions. Here is the output
Important update: we are gathering some bugs in the Kakarot v0 codebase, we need to make sure each issue and each PR in Kakarot-ssj is aware of the lists of known bugs. Look at this link everytime you take an issue and check your issue isn't targeted by a known bug.
Important update: we are gathering some bugs in the Kakarot v0 codebase, we need to make sure each issue and each PR in Kakarot-ssj is aware of the lists of known bugs. Look at this tracking issue everytime you take an issue and check your issue isn't targeted by a known bug. Will add this reminder in many places to make sure we keep track of known bugs.
Description
Memory is one of the core elements of the EVM. Having it as optimized as possible would reduce the overall costs of all transactions ran through the EVM.
Here is my proposal: We can refactor the memory to be 32-bytes words based instead of the current 16-bytes words based implementation.
I started to implement the new version to run some benchmarks. Below are the related code snippets.
16-bytes words memory (current implementation)
32-bytes words memory (optimisation proposal)
Benchmarks
I ran tests on both implementations and computed the actual gas usage of both functions. Here is the output
❯ scarb test -f test_store_should_add_an_element_to_the_memory_with_offset testing kakarot ... running 2 tests [DEBUG(memory_32_bytes_chunks)] (raw: 0x36cfe [DEBUG(memory_16_bytes_chunks)] (raw: 0x38e78
This leads to a ~4% gas decrease for each
store
operationProposal
Replace the 16-bytes words implementation with the 32-bytes words one.
The text was updated successfully, but these errors were encountered: