feat: implement 0x20 - SHA3 Opcode #115

Eikix · 2023-08-17T09:56:48Z

Since	Group
Frontier	SHA3

Index 1 is top of the stack. See PUSH.

Stack input

offset: byte offset in the memory.
size: byte size to read in the memory.

Stack output

hash: Keccak-256 hash of the given data in memory.

Example

Memory
`0xFFFFFFFF`

	Input	Output
`1`	`0`	`0x29045A592007D0C246EF02C2223570DA9522D0CF0F73282C79A1BC8F0BB2C238`
`2`	`4`

Reproduce in playground.

Error cases

The state changes done by the current context are reverted in those cases:

Not enough gas.
Not enough values on the stack.

Gas

minimum_word_size = (size + 31) / 32
static_gas = 30

dynamic_gas = 6 * minimum_word_size + memory_expansion_cost

The memory expansion cost explanation can be found here.

Estimate your gas cost

Input: offset

Input: size

State: current memory size

Static gas + dynamic gas = 30

The text was updated successfully, but these errors were encountered:

Quentash · 2023-08-30T13:53:49Z

Can I have this one please ?

Quentash · 2023-08-31T12:39:01Z

EDIT : Here is a corrected version of my first solution. This function returns the exact same hash than the EVM playground for many cases. I hope not forgetting some.
The main idea is to format the memory bytes into a u64 Array that I feed to cairo_keccak().
Using this, I manage to retrieve the exact same result obtained via the EVM playground whatever are the inputs.
It may be very badly written(tested a lot of things) so any feedback would be welcome to improve the code !

The test cases I'm thinking about are ;

32 bytes size w/ and w/o offset
!=32 bytes size w/ and w/o offset
0 byte size w/ and w/o offset
offset beyond memory_len
offset < memory_len but offset + size > memory_len
big size ( 3200 )
... ?

fn exec_sha3(ref self: ExecutionContext) -> Result<(), EVMError> {
        let offset: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;
        let mut size: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;

        let mut toHash: Array<u64> = ArrayTrait::<u64>::new();
        let mut last_input: u64 = 0;
        let mut counter = 0;
        loop {
            if size < 32 {
                break;
            }

            let mut mem = self.memory.load_internal(offset+(32*counter));
            mem.low = integer::u128_byte_reverse(mem.low);
            mem.high = integer::u128_byte_reverse(mem.high);
            let tmp = mem.low;
            mem.low = mem.high;
            mem.high = tmp;

            let (highL, lowL) = u128_split(mem.low);
            let (highH, lowH) = u128_split(mem.high);
            toHash.append(lowL);
            toHash.append(highL);
            toHash.append(lowH);
            toHash.append(highH);

            counter += 1;
            size -= 32;
        };
        if size > 0 {
            let mut mem = self.memory.load_internal(offset+(32*counter));
            mem.low = integer::u128_byte_reverse(mem.low);
            mem.high = integer::u128_byte_reverse(mem.high);
            let tmp = mem.low;
            mem.low = mem.high;
            mem.high = tmp;
            let (highL, lowL) = u128_split(mem.low);
            let (highH, lowH) = u128_split(mem.high);

            if size < 8 {
                last_input = lowL;
            } else if size < 16 {
                size -= 8;
                toHash.append(lowL);
                last_input = highL;
            } else if size < 24 {
                size -= 16;
                toHash.append(lowL);
                toHash.append(highL);
                last_input = lowH;
            } else {
                size -= 24;
                toHash.append(lowL);
                toHash.append(highL);
                toHash.append(lowH);
                last_input = highH;
            }
        }

        let mut hash = cairo_keccak(ref toHash,last_input,size);
        hash.low = integer::u128_byte_reverse(hash.low);
        hash.high = integer::u128_byte_reverse(hash.high);
        let tmp = hash.low;
        hash.low = hash.high;
        hash.high = tmp;

        self.stack.push(hash)
    }

What it passes actually :

Quentash · 2023-09-02T12:51:45Z

Here are my thoughts about either implementing Herodotus keccak or use the one suggested above.

TL;DR : Herodotus keccak seems to be a little bit more gas efficient but the fact of having to format the u256 read from memory into a Span.u8 that will itself be formated into a Span.u64 may acutally "compenstate the gain from Herodotus' keccak".
Both implemenation seems to be able to handle all edges cases I thought about until now.

keccak_thoughts.odt

Eikix · 2023-09-02T13:22:34Z

Putting here the content of the word file for @Quentash:

I pulled Herodotus' keccak and added several test cases to be able to test some additional edge cases and compare them with my own implementation. I noticed that the tests consume slightly less gas in Herodotus than in my own implementation, but I think it's important to consider that Kakarot's tests set up a context and manipulate the stack/memory, while Herodotus' tests only test the hashing function. So, I also ran each test by isolating the hash function admitting that test_complete_cost - test_without_hash_cost = hash_cost.
Here are the results:

Use case 1:
Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000
Result: 0x1b00e6bf2213698a78d172620d2852921faf54cf9645342bdb34ca455f69d44
Herodotus implementation:
Test complete: (gas usage estimate: 854,892)
Test without hash: (gas usage estimate: 6,500)
Estimated hash cost: 854,892 - 6,500 = 848,392

"Kakarot" implementation:
Test complete: (gas usage estimate: 1,250,798)
Test without hash: (gas usage estimate: 308,330)
Estimated hash cost: 1,250,798 - 308,330 = 942,468

Use case 2:
Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000 * 1000
Result: 0x2022ae07f3a362b08ac0a4bcb785c830cb5c368dc0ce6972249c6abbc68a5291
Herodotus implementation:
Test complete: (gas usage estimate: 46,123,022)
Test without hash: (gas usage estimate: 1,246,580)
Estimated hash cost: 46,123,022 - 1,246,580 = 44,876,442

"Kakarot" implementation:
Test complete: (gas usage estimate: 72,120,176)
Test without hash: (gas usage estimate: 20,960,470)
Estimated hash cost: 72,120,176 - 20,960,470 = 51,159,706
In the end, the difference is not huge (although it becomes more significant with larger inputs), considering that the algorithm I am currently using probably still has room for improvement in terms of gas consumption. It is also important to consider that if we use the Herodotus library, we will still need to format the u256 read from memory into their Bytes type, which may ultimately incur a similar gas cost.
The current behavior of exec_sha3 is as follows:
• Pop the inputs from the stack
• For each 32-byte chunk to be hashed (if size > 32), read 32 bytes from memory.
• Reverse the 32 bytes and append them in groups of 8 to an array (input for carrio_keccak).
• If size < 32 but > 0, load the last 32 bytes, reverse them, and append the missing bytes.
• Call cairro_keccak with inputs = array, last_input = the last 8-byte slice that is not used, nbr_bytes = the number of bytes to take from last_input.
• Reverse the result bytes.
The behavior of exec_sha3 with the Herodotus library would be:
• Pop the inputs from the stack
• For each 32-byte chunk to be hashed (if size > 32), read 32 bytes from memory.
• Reverse the 32 bytes, transform the u256 into Array
• Concatenate with the already extracted data (loop append? Concat()?)
• If size < 32 but > 0, load the last 32 bytes, reverse them, and concatenate the last bytes.
• Call Herodotus::keccak_cairo with the filled array.span() Herodotus::keccak_cairo will then format the span
• Reverse the result bytes.

I have highlighted in green the differences (gas-consuming) between the two algorithms. I think that having to format the memory inputs into the type created by Herodotus (Bytes), which will then format this type into Span for carrio_keccak, may make the process more gas-consuming than the current one. Please let me know if I'm wrong or if I've missed something.
I haven't been able to test all edge cases with the Herodotus function yet, but it seems to be bug-proof.
My implementation also seems to be, I haven't found/thought of any breaking cases so far. All the cases I've tried return results consistent with the EVM playground, and I'm willing to test any case. I'm fairly confident in my implementation (in terms of results, not gas efficiency though).

Eikix · 2023-09-02T16:34:44Z

Putting here the content of the word file for @Quentash:

I pulled Herodotus' keccak and added several test cases to be able to test some additional edge cases and compare them with my own implementation. I noticed that the tests consume slightly less gas in Herodotus than in my own implementation, but I think it's important to consider that Kakarot's tests set up a context and manipulate the stack/memory, while Herodotus' tests only test the hashing function. So, I also ran each test by isolating the hash function admitting that test_complete_cost - test_without_hash_cost = hash_cost. Here are the results:

Use case 1: Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000 Result: 0x1b00e6bf2213698a78d172620d2852921faf54cf9645342bdb34ca455f69d44 Herodotus implementation: Test complete: (gas usage estimate: 854,892) Test without hash: (gas usage estimate: 6,500) Estimated hash cost: 854,892 - 6,500 = 848,392

"Kakarot" implementation: Test complete: (gas usage estimate: 1,250,798) Test without hash: (gas usage estimate: 308,330) Estimated hash cost: 1,250,798 - 308,330 = 942,468

Use case 2: Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000 * 1000 Result: 0x2022ae07f3a362b08ac0a4bcb785c830cb5c368dc0ce6972249c6abbc68a5291 Herodotus implementation: Test complete: (gas usage estimate: 46,123,022) Test without hash: (gas usage estimate: 1,246,580) Estimated hash cost: 46,123,022 - 1,246,580 = 44,876,442

"Kakarot" implementation: Test complete: (gas usage estimate: 72,120,176) Test without hash: (gas usage estimate: 20,960,470) Estimated hash cost: 72,120,176 - 20,960,470 = 51,159,706 In the end, the difference is not huge (although it becomes more significant with larger inputs), considering that the algorithm I am currently using probably still has room for improvement in terms of gas consumption. It is also important to consider that if we use the Herodotus library, we will still need to format the u256 read from memory into their Bytes type, which may ultimately incur a similar gas cost. The current behavior of exec_sha3 is as follows: • Pop the inputs from the stack • For each 32-byte chunk to be hashed (if size > 32), read 32 bytes from memory. • Reverse the 32 bytes and append them in groups of 8 to an array (input for carrio_keccak). • If size < 32 but > 0, load the last 32 bytes, reverse them, and append the missing bytes. • Call cairro_keccak with inputs = array, last_input = the last 8-byte slice that is not used, nbr_bytes = the number of bytes to take from last_input. • Reverse the result bytes. The behavior of exec_sha3 with the Herodotus library would be: • Pop the inputs from the stack • For each 32-byte chunk to be hashed (if size > 32), read 32 bytes from memory. • Reverse the 32 bytes, transform the u256 into Array • Concatenate with the already extracted data (loop append? Concat()?) • If size < 32 but > 0, load the last 32 bytes, reverse them, and concatenate the last bytes. • Call Herodotus::keccak_cairo with the filled array.span() Herodotus::keccak_cairo will then format the span • Reverse the result bytes.

I have highlighted in green the differences (gas-consuming) between the two algorithms. I think that having to format the memory inputs into the type created by Herodotus (Bytes), which will then format this type into Span for carrio_keccak, may make the process more gas-consuming than the current one. Please let me know if I'm wrong or if I've missed something. I haven't been able to test all edge cases with the Herodotus function yet, but it seems to be bug-proof. My implementation also seems to be, I haven't found/thought of any breaking cases so far. All the cases I've tried return results consistent with the EVM playground, and I'm willing to test any case. I'm fairly confident in my implementation (in terms of results, not gas efficiency though).

Could we check I think that having to format the memory inputs into the type created by Herodotus (Bytes), which will then format this type into Span for carrio_keccak, may make the process more gas-consuming than the current one this statement?
By importing the Herodotus SHA3 implem (from our fork of their cair-lib) and running a test from our repo, using their lib? What I'm saying is, try to make this type transformation?

Quentash · 2023-09-02T18:52:37Z

I have taken the Herodotus impl and tried it as explained above. Using the sames use cases, here are the results ;

Using this code :

        let offset: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;
        let mut size: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;

        let mut toHash: Array<u8> = ArrayTrait::<u8>::new();
        let mut counter = 0;

        loop {
            if size < 32 {
                break;
            }

            let mut mem = self.memory.load_internal(offset+(32*counter));
            let memu8: Bytes = u256_to_bytes_array(mem).span();

            let mut i = 0;
            loop {
                if i == 32 {
                    break;
                }

                toHash.append(*memu8[i]);
                i+=1;
            };

            counter += 1;
            size -= 32;
         };
        if size > 0 {
            let mut mem = self.memory.load_internal(offset+(32*counter));
            let memu8: Bytes = u256_to_bytes_array(mem).span();
            let mut i = size;
            loop {
                if i == 0 {
                    break;
                }

                toHash.append(*memu8[size-i]);
                i-=1;
            };
        }

        let mut hash = KeccakTrait::keccak_cairo(toHash.span());
        hash.low = integer::u128_byte_reverse(hash.low);
        hash.high = integer::u128_byte_reverse(hash.high);
        let tmp = hash.low;
        hash.low = hash.high;
        hash.high = tmp;

        self.stack.push(hash)

I got those results ;

Use case 1:
Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000
Result: 0x1b00e6bf2213698a78d172620d2852921faf54cf9645342bdb34ca455f69d44

Herodotus into "Kakarot" implementation:
Test complete: (gas usage estimate: 2,870,752)
Test without hash: (gas usage estimate: 310,930)
Estimated hash cost: 2,870,752 - 310,930 = 2,559,822

Which represent 1 617 354 more gas used compared to the 'Kakarot' impl.

Use case 2:
Inputs : 0xFAFAFAFA00000000000000000000000000000000000000000000000000000000 * 1000
Result: 0x2022ae07f3a362b08ac0a4bcb785c830cb5c368dc0ce6972249c6abbc68a5291

"Kakarot" implementation:
Test complete: (gas usage estimate: 224,120,682)
Test without hash: (gas usage estimate: 20,960,470)
Estimated hash cost: 224,120,682 - 20,960,470 = 203,160,212

Which represent 152,000,506 more gas used compared to the 'Kakarot' impl.

I even tried to take a shorcut for the 32 bytes sized test doing this :

        let offset: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;
        let mut size: u32 = Into::<u256, Result<u32, EVMError>>::into((self.stack.pop()?))?;

        let mut toHash: Array<u8> = ArrayTrait::<u8>::new();
        let mut counter = 0;

        if size == 32 {
            let mut mem = self.memory.load_internal(offset+(32*counter));
            let toHash: Bytes = u256_to_bytes_array(mem).span();

            let mut hash = KeccakTrait::keccak_cairo(toHash);
            hash.low = integer::u128_byte_reverse(hash.low);
            hash.high = integer::u128_byte_reverse(hash.high);
            let tmp = hash.low;
            hash.low = hash.high;
            hash.high = tmp;

            return self.stack.push(hash);
            
        }

But it still consumed 2,274,852 gas which is only ~300k less than without the shorcut.
Although I expected the cost to be higher, I don't understand why it gets so high.
If you have insight, that would be very welcomed !

Quentash · 2023-09-05T09:19:35Z

I have analysed tests found at https://github.com/ethereum/tests/blob/develop/src/GeneralStateTestsFiller/VMTests/vmTests/sha3Filler.yml#L52 .
I mostly manage to have obtain the same results but some edge cases gave/give me problems.

Case : offset = 1000 & size = 0xFFFFF :
This problem was giving me trouble as I was allocating the memory every time we tried to read from an unalloc place. Which resulted in out of gaz errors.
I bypassed this by puting a " (offset + (32 * counter)) > self.memory.bytes_len " which basically verify if we're trying to read beyond the last memory slot. If it's the case, I will simply fill the array to hash with 0. AND THEN, I use context.memory.expand( offset + size ) to expand the memory to the last slot we "read".
Is this a valid approach ?
Case : offset = 0xFFFFFFFFF & size = 100:
Case : offset = 1000 & size = 0xFFFFFFFFF :
Case : offset = max256u & size = max256u :
Those 3 cases are unable to test right now. They are supposed to return out of gas error according to the official tests doc but are unable to test in our environment because memory.bytes_len is a usize/u32 type. ( which doesn't hold 0xFFFFFFFFF )
As already discussed with Eikix, those case will be ignored unless asked otherwise meanwhile.

Beside that, all the other tests pass and return the correct values.
If you don't have any other suggestions/remark, I will slowly push my code and create the associated PR.

Eikix · 2023-09-08T05:06:02Z

Important update: we are gathering some bugs in the Kakarot v0 codebase, we need to make sure each issue and each PR in Kakarot-ssj is aware of the lists of known bugs. Look at this link everytime you take an issue and check your issue isn't targeted by a known bug.

Eikix · 2023-09-08T05:12:37Z

Important update: we are gathering some bugs in the Kakarot v0 codebase, we need to make sure each issue and each PR in Kakarot-ssj is aware of the lists of known bugs. Look at this tracking issue everytime you take an issue and check your issue isn't targeted by a known bug. Will add this reminder in many places to make sure we keep track of known bugs.

Eikix added the enhancement New feature or request label Aug 17, 2023

github-project-automation bot added this to Kakarot on Starknet Aug 17, 2023

github-project-automation bot moved this to 🆕 Backlog in Kakarot on Starknet Aug 17, 2023

Eikix added the opcode Implementation of an opcode label Aug 23, 2023

Eikix added this to the Kakarot Cairo Migration milestone Aug 29, 2023

Eikix assigned Quentash Aug 30, 2023

Quentash mentioned this issue Sep 5, 2023

feat: implement 0x20 - SHA3 Opcode #281

Merged

9 tasks

Eikix closed this as completed in #281 Sep 25, 2023

github-project-automation bot moved this from 🆕 Backlog to ✅ Done in Kakarot on Starknet Sep 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: implement 0x20 - SHA3 Opcode #115

feat: implement 0x20 - SHA3 Opcode #115

Eikix commented Aug 17, 2023 •

edited by danilowhk

Loading

Stack input

Stack output

Example

Error cases

Gas

Estimate your gas cost

Quentash commented Aug 30, 2023

Quentash commented Aug 31, 2023 •

edited

Loading

Quentash commented Sep 2, 2023 •

edited

Loading

Eikix commented Sep 2, 2023

Eikix commented Sep 2, 2023

Quentash commented Sep 2, 2023 •

edited

Loading

Quentash commented Sep 5, 2023 •

edited

Loading

Eikix commented Sep 8, 2023

Eikix commented Sep 8, 2023

feat: implement 0x20 - SHA3 Opcode #115

feat: implement 0x20 - SHA3 Opcode #115

Comments

Eikix commented Aug 17, 2023 • edited by danilowhk Loading

Stack input

Stack output

Example

Error cases

Gas

Estimate your gas cost

Quentash commented Aug 30, 2023

Quentash commented Aug 31, 2023 • edited Loading

Quentash commented Sep 2, 2023 • edited Loading

Eikix commented Sep 2, 2023

Eikix commented Sep 2, 2023

Quentash commented Sep 2, 2023 • edited Loading

Quentash commented Sep 5, 2023 • edited Loading

Eikix commented Sep 8, 2023

Eikix commented Sep 8, 2023

Eikix commented Aug 17, 2023 •

edited by danilowhk

Loading

Quentash commented Aug 31, 2023 •

edited

Loading

Quentash commented Sep 2, 2023 •

edited

Loading

Quentash commented Sep 2, 2023 •

edited

Loading

Quentash commented Sep 5, 2023 •

edited

Loading