Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement 0x20 - SHA3 Opcode #115

Closed
Eikix opened this issue Aug 17, 2023 · 9 comments · Fixed by #281
Closed

feat: implement 0x20 - SHA3 Opcode #115

Eikix opened this issue Aug 17, 2023 · 9 comments · Fixed by #281
Assignees
Labels
enhancement New feature or request opcode Implementation of an opcode

Comments

@Eikix
Copy link
Member

Eikix commented Aug 17, 2023

SinceGroup
FrontierSHA3

Index 1 is top of the stack. See PUSH.

Stack input

  1. offset: byte offset in the memory.
  2. size: byte size to read in the memory.

Stack output

  1. hash: Keccak-256 hash of the given data in memory.

Example

Memory
0xFFFFFFFF
InputOutput
100x29045A592007D0C246EF02C2223570DA9522D0CF0F73282C79A1BC8F0BB2C238
24

Reproduce in playground.

Error cases

The state changes done by the current context are reverted in those cases:

  • Not enough gas.
  • Not enough values on the stack.

Gas

minimum_word_size = (size + 31) / 32

static_gas = 30
dynamic_gas = 6 * minimum_word_size + memory_expansion_cost

The memory expansion cost explanation can be found here.

Estimate your gas cost

Input: offset
Input: size
State: current memory size
Static gas + dynamic gas = 30

@Eikix Eikix added the enhancement New feature or request label Aug 17, 2023
@github-project-automation github-project-automation bot moved this to 🆕 Backlog in Kakarot on Starknet Aug 17, 2023
@Eikix Eikix added the opcode Implementation of an opcode label Aug 23, 2023
@Eikix Eikix added this to the Kakarot Cairo Migration milestone Aug 29, 2023
@Quentash
Copy link
Contributor

Can I have this one please ?

@Quentash
Copy link
Contributor

Quentash commented Aug 31, 2023

EDIT : Here is a corrected version of my first solution. This function returns the exact same hash than the EVM playground for many cases. I hope not forgetting some.
The main idea is to format the memory bytes into a u64 Array that I feed to cairo_keccak().
Using this, I manage to retrieve the exact same result obtained via the EVM playground whatever are the inputs.
It may be very badly written(tested a lot of things) so any feedback would be welcome to improve the code !

The test cases I'm thinking about are ;

  • 32 bytes size w/ and w/o offset
  • !=32 bytes size w/ and w/o offset
  • 0 byte size w/ and w/o offset
  • offset beyond memory_len
  • offset < memory_len but offset + size > memory_len
  • big size ( 3200 )
  • ... ?
fn exec_sha3(ref self: ExecutionContext) -> Result<(), EVMError> {
        let offset: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;
        let mut size: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;

        let mut toHash: Array<u64> = ArrayTrait::<u64>::new();
        let mut last_input: u64 = 0;
        let mut counter = 0;
        loop {
            if size < 32 {
                break;
            }

            let mut mem = self.memory.load_internal(offset+(32*counter));
            mem.low = integer::u128_byte_reverse(mem.low);
            mem.high = integer::u128_byte_reverse(mem.high);
            let tmp = mem.low;
            mem.low = mem.high;
            mem.high = tmp;

            let (highL, lowL) = u128_split(mem.low);
            let (highH, lowH) = u128_split(mem.high);
            toHash.append(lowL);
            toHash.append(highL);
            toHash.append(lowH);
            toHash.append(highH);

            counter += 1;
            size -= 32;
        };
        if size > 0 {
            let mut mem = self.memory.load_internal(offset+(32*counter));
            mem.low = integer::u128_byte_reverse(mem.low);
            mem.high = integer::u128_byte_reverse(mem.high);
            let tmp = mem.low;
            mem.low = mem.high;
            mem.high = tmp;
            let (highL, lowL) = u128_split(mem.low);
            let (highH, lowH) = u128_split(mem.high);

            if size < 8 {
                last_input = lowL;
            } else if size < 16 {
                size -= 8;
                toHash.append(lowL);
                last_input = highL;
            } else if size < 24 {
                size -= 16;
                toHash.append(lowL);
                toHash.append(highL);
                last_input = lowH;
            } else {
                size -= 24;
                toHash.append(lowL);
                toHash.append(highL);
                toHash.append(lowH);
                last_input = highH;
            }
        }

        let mut hash = cairo_keccak(ref toHash,last_input,size);
        hash.low = integer::u128_byte_reverse(hash.low);
        hash.high = integer::u128_byte_reverse(hash.high);
        let tmp = hash.low;
        hash.low = hash.high;
        hash.high = tmp;

        self.stack.push(hash)
    }

What it passes actually :
image

@Quentash
Copy link
Contributor

Quentash commented Sep 2, 2023

Here are my thoughts about either implementing Herodotus keccak or use the one suggested above.

TL;DR : Herodotus keccak seems to be a little bit more gas efficient but the fact of having to format the u256 read from memory into a Span.u8 that will itself be formated into a Span.u64 may acutally "compenstate the gain from Herodotus' keccak".
Both implemenation seems to be able to handle all edges cases I thought about until now.

keccak_thoughts.odt

@Eikix
Copy link
Member Author

Eikix commented Sep 2, 2023

Putting here the content of the word file for @Quentash:

I pulled Herodotus' keccak and added several test cases to be able to test some additional edge cases and compare them with my own implementation. I noticed that the tests consume slightly less gas in Herodotus than in my own implementation, but I think it's important to consider that Kakarot's tests set up a context and manipulate the stack/memory, while Herodotus' tests only test the hashing function. So, I also ran each test by isolating the hash function admitting that test_complete_cost - test_without_hash_cost = hash_cost.
Here are the results:

Use case 1:
Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000
Result: 0x1b00e6bf2213698a78d172620d2852921faf54cf9645342bdb34ca455f69d44
Herodotus implementation:
Test complete: (gas usage estimate: 854,892)
Test without hash: (gas usage estimate: 6,500)
Estimated hash cost: 854,892 - 6,500 = 848,392

"Kakarot" implementation:
Test complete: (gas usage estimate: 1,250,798)
Test without hash: (gas usage estimate: 308,330)
Estimated hash cost: 1,250,798 - 308,330 = 942,468

Use case 2:
Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000 * 1000
Result: 0x2022ae07f3a362b08ac0a4bcb785c830cb5c368dc0ce6972249c6abbc68a5291
Herodotus implementation:
Test complete: (gas usage estimate: 46,123,022)
Test without hash: (gas usage estimate: 1,246,580)
Estimated hash cost: 46,123,022 - 1,246,580 = 44,876,442

"Kakarot" implementation:
Test complete: (gas usage estimate: 72,120,176)
Test without hash: (gas usage estimate: 20,960,470)
Estimated hash cost: 72,120,176 - 20,960,470 = 51,159,706
In the end, the difference is not huge (although it becomes more significant with larger inputs), considering that the algorithm I am currently using probably still has room for improvement in terms of gas consumption. It is also important to consider that if we use the Herodotus library, we will still need to format the u256 read from memory into their Bytes type, which may ultimately incur a similar gas cost.
The current behavior of exec_sha3 is as follows:
• Pop the inputs from the stack
• For each 32-byte chunk to be hashed (if size > 32), read 32 bytes from memory.
• Reverse the 32 bytes and append them in groups of 8 to an array (input for carrio_keccak).
• If size < 32 but > 0, load the last 32 bytes, reverse them, and append the missing bytes.
• Call cairro_keccak with inputs = array, last_input = the last 8-byte slice that is not used, nbr_bytes = the number of bytes to take from last_input.
• Reverse the result bytes.
The behavior of exec_sha3 with the Herodotus library would be:
• Pop the inputs from the stack
• For each 32-byte chunk to be hashed (if size > 32), read 32 bytes from memory.
• Reverse the 32 bytes, transform the u256 into Array
• Concatenate with the already extracted data (loop append? Concat()?)
• If size < 32 but > 0, load the last 32 bytes, reverse them, and concatenate the last bytes.
• Call Herodotus::keccak_cairo with the filled array.span() Herodotus::keccak_cairo will then format the span
• Reverse the result bytes.

I have highlighted in green the differences (gas-consuming) between the two algorithms. I think that having to format the memory inputs into the type created by Herodotus (Bytes), which will then format this type into Span for carrio_keccak, may make the process more gas-consuming than the current one. Please let me know if I'm wrong or if I've missed something.
I haven't been able to test all edge cases with the Herodotus function yet, but it seems to be bug-proof.
My implementation also seems to be, I haven't found/thought of any breaking cases so far. All the cases I've tried return results consistent with the EVM playground, and I'm willing to test any case. I'm fairly confident in my implementation (in terms of results, not gas efficiency though).

@Eikix
Copy link
Member Author

Eikix commented Sep 2, 2023

Putting here the content of the word file for @Quentash:

I pulled Herodotus' keccak and added several test cases to be able to test some additional edge cases and compare them with my own implementation. I noticed that the tests consume slightly less gas in Herodotus than in my own implementation, but I think it's important to consider that Kakarot's tests set up a context and manipulate the stack/memory, while Herodotus' tests only test the hashing function. So, I also ran each test by isolating the hash function admitting that test_complete_cost - test_without_hash_cost = hash_cost. Here are the results:

Use case 1: Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000 Result: 0x1b00e6bf2213698a78d172620d2852921faf54cf9645342bdb34ca455f69d44 Herodotus implementation: Test complete: (gas usage estimate: 854,892) Test without hash: (gas usage estimate: 6,500) Estimated hash cost: 854,892 - 6,500 = 848,392

"Kakarot" implementation: Test complete: (gas usage estimate: 1,250,798) Test without hash: (gas usage estimate: 308,330) Estimated hash cost: 1,250,798 - 308,330 = 942,468

Use case 2: Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000 * 1000 Result: 0x2022ae07f3a362b08ac0a4bcb785c830cb5c368dc0ce6972249c6abbc68a5291 Herodotus implementation: Test complete: (gas usage estimate: 46,123,022) Test without hash: (gas usage estimate: 1,246,580) Estimated hash cost: 46,123,022 - 1,246,580 = 44,876,442

"Kakarot" implementation: Test complete: (gas usage estimate: 72,120,176) Test without hash: (gas usage estimate: 20,960,470) Estimated hash cost: 72,120,176 - 20,960,470 = 51,159,706 In the end, the difference is not huge (although it becomes more significant with larger inputs), considering that the algorithm I am currently using probably still has room for improvement in terms of gas consumption. It is also important to consider that if we use the Herodotus library, we will still need to format the u256 read from memory into their Bytes type, which may ultimately incur a similar gas cost. The current behavior of exec_sha3 is as follows: • Pop the inputs from the stack • For each 32-byte chunk to be hashed (if size > 32), read 32 bytes from memory. • Reverse the 32 bytes and append them in groups of 8 to an array (input for carrio_keccak). • If size < 32 but > 0, load the last 32 bytes, reverse them, and append the missing bytes. • Call cairro_keccak with inputs = array, last_input = the last 8-byte slice that is not used, nbr_bytes = the number of bytes to take from last_input. • Reverse the result bytes. The behavior of exec_sha3 with the Herodotus library would be: • Pop the inputs from the stack • For each 32-byte chunk to be hashed (if size > 32), read 32 bytes from memory. • Reverse the 32 bytes, transform the u256 into Array • Concatenate with the already extracted data (loop append? Concat()?) • If size < 32 but > 0, load the last 32 bytes, reverse them, and concatenate the last bytes. • Call Herodotus::keccak_cairo with the filled array.span() Herodotus::keccak_cairo will then format the span • Reverse the result bytes.

I have highlighted in green the differences (gas-consuming) between the two algorithms. I think that having to format the memory inputs into the type created by Herodotus (Bytes), which will then format this type into Span for carrio_keccak, may make the process more gas-consuming than the current one. Please let me know if I'm wrong or if I've missed something. I haven't been able to test all edge cases with the Herodotus function yet, but it seems to be bug-proof. My implementation also seems to be, I haven't found/thought of any breaking cases so far. All the cases I've tried return results consistent with the EVM playground, and I'm willing to test any case. I'm fairly confident in my implementation (in terms of results, not gas efficiency though).

Could we check I think that having to format the memory inputs into the type created by Herodotus (Bytes), which will then format this type into Span for carrio_keccak, may make the process more gas-consuming than the current one this statement?
By importing the Herodotus SHA3 implem (from our fork of their cair-lib) and running a test from our repo, using their lib? What I'm saying is, try to make this type transformation?

@Quentash
Copy link
Contributor

Quentash commented Sep 2, 2023

I have taken the Herodotus impl and tried it as explained above. Using the sames use cases, here are the results ;

Using this code :

        let offset: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;
        let mut size: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;

        let mut toHash: Array<u8> = ArrayTrait::<u8>::new();
        let mut counter = 0;

        loop {
            if size < 32 {
                break;
            }

            let mut mem = self.memory.load_internal(offset+(32*counter));
            let memu8: Bytes = u256_to_bytes_array(mem).span();

            let mut i = 0;
            loop {
                if i == 32 {
                    break;
                }

                toHash.append(*memu8[i]);
                i+=1;
            };

            counter += 1;
            size -= 32;
         };
        if size > 0 {
            let mut mem = self.memory.load_internal(offset+(32*counter));
            let memu8: Bytes = u256_to_bytes_array(mem).span();
            let mut i = size;
            loop {
                if i == 0 {
                    break;
                }

                toHash.append(*memu8[size-i]);
                i-=1;
            };
        }

        let mut hash = KeccakTrait::keccak_cairo(toHash.span());
        hash.low = integer::u128_byte_reverse(hash.low);
        hash.high = integer::u128_byte_reverse(hash.high);
        let tmp = hash.low;
        hash.low = hash.high;
        hash.high = tmp;

        self.stack.push(hash)

I got those results ;

Use case 1:
Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000
Result: 0x1b00e6bf2213698a78d172620d2852921faf54cf9645342bdb34ca455f69d44

Herodotus into "Kakarot" implementation:
Test complete: (gas usage estimate: 2,870,752)
Test without hash: (gas usage estimate: 310,930)
Estimated hash cost: 2,870,752 - 310,930 = 2,559,822

Which represent 1 617 354 more gas used compared to the 'Kakarot' impl.

Use case 2:
Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000 * 1000
Result: 0x2022ae07f3a362b08ac0a4bcb785c830cb5c368dc0ce6972249c6abbc68a5291

"Kakarot" implementation:
Test complete: (gas usage estimate: 224,120,682)
Test without hash: (gas usage estimate: 20,960,470)
Estimated hash cost: 224,120,682 - 20,960,470 = 203,160,212

Which represent 152,000,506 more gas used compared to the 'Kakarot' impl.

I even tried to take a shorcut for the 32 bytes sized test doing this :

        let offset: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;
        let mut size: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;

        let mut toHash: Array<u8> = ArrayTrait::<u8>::new();
        let mut counter = 0;

        if size == 32 {
            let mut mem = self.memory.load_internal(offset+(32*counter));
            let toHash: Bytes = u256_to_bytes_array(mem).span();

            let mut hash = KeccakTrait::keccak_cairo(toHash);
            hash.low = integer::u128_byte_reverse(hash.low);
            hash.high = integer::u128_byte_reverse(hash.high);
            let tmp = hash.low;
            hash.low = hash.high;
            hash.high = tmp;

            return self.stack.push(hash);
            
        }

But it still consumed 2,274,852 gas which is only ~300k less than without the shorcut.
Although I expected the cost to be higher, I don't understand why it gets so high.
If you have insight, that would be very welcomed !

@Quentash
Copy link
Contributor

Quentash commented Sep 5, 2023

I have analysed tests found at https://github.com/ethereum/tests/blob/develop/src/GeneralStateTestsFiller/VMTests/vmTests/sha3Filler.yml#L52 .
I mostly manage to have obtain the same results but some edge cases gave/give me problems.

  1. Case : offset = 1000 & size = 0xFFFFF :
    This problem was giving me trouble as I was allocating the memory every time we tried to read from an unalloc place. Which resulted in out of gaz errors.
    I bypassed this by puting a " (offset + (32 * counter)) > self.memory.bytes_len " which basically verify if we're trying to read beyond the last memory slot. If it's the case, I will simply fill the array to hash with 0. AND THEN, I use context.memory.expand( offset + size ) to expand the memory to the last slot we "read".
    Is this a valid approach ?

  2. Case : offset = 0xFFFFFFFFF & size = 100:

  3. Case : offset = 1000 & size = 0xFFFFFFFFF :

  4. Case : offset = max256u & size = max256u :
    Those 3 cases are unable to test right now. They are supposed to return out of gas error according to the official tests doc but are unable to test in our environment because memory.bytes_len is a usize/u32 type. ( which doesn't hold 0xFFFFFFFFF )
    As already discussed with Eikix, those case will be ignored unless asked otherwise meanwhile.

Beside that, all the other tests pass and return the correct values.
If you don't have any other suggestions/remark, I will slowly push my code and create the associated PR.

@Eikix
Copy link
Member Author

Eikix commented Sep 8, 2023

Important update: we are gathering some bugs in the Kakarot v0 codebase, we need to make sure each issue and each PR in Kakarot-ssj is aware of the lists of known bugs. Look at this link everytime you take an issue and check your issue isn't targeted by a known bug.

@Eikix
Copy link
Member Author

Eikix commented Sep 8, 2023

Important update: we are gathering some bugs in the Kakarot v0 codebase, we need to make sure each issue and each PR in Kakarot-ssj is aware of the lists of known bugs. Look at this tracking issue everytime you take an issue and check your issue isn't targeted by a known bug. Will add this reminder in many places to make sure we keep track of known bugs.

@github-project-automation github-project-automation bot moved this from 🆕 Backlog to ✅ Done in Kakarot on Starknet Sep 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request opcode Implementation of an opcode
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants