Change definition of m in EXCHANGE #174

Open
wants to merge 2 commits into
base: main
from



@frangio frangio commented Dec 12, 2024

The encoding of the operands in the bytecode remains the same, but conceptually m is defined as an absolute stack depth rather than relative to n.

This is meant to be reflected in the textual representation of the instruction in assembly, which should be EXCHANGE n m. The notation should be more intuitive with this change. In particular:

  1. The order of immediates is irrelevant as one would expect. EXCHANGE n m and EXCHANGE m n are equivalent.
  2. EXCHANGE n m can be understood as exactly equivalent to SWAPN n SWAPN m SWAPN n (minus different valid ranges for n,m).
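The two properties listed above can be checked with a small stack simulation. This is only a sketch under the proposed reading, where `n` and `m` are absolute depths counted from the top of the stack at depth 0; the helper names `swapn`, `exchange`, and `exchange_via_swapn` are illustrative and do not come from any client or test framework.

```python
def swapn(stack, n):
    """Swap the top item with the item n positions below it (swapn(s, 1) acts like SWAP1)."""
    s = list(stack)
    s[0], s[n] = s[n], s[0]
    return s


def exchange(stack, n, m):
    """Swap the items at absolute depths n and m, counted from the top at depth 0."""
    s = list(stack)
    s[n], s[m] = s[m], s[n]
    return s


def exchange_via_swapn(stack, n, m):
    """Apply the three-instruction sequence SWAPN n, SWAPN m, SWAPN n."""
    return swapn(swapn(swapn(stack, n), m), n)


stack = ["a", "b", "c", "d", "e"]  # index 0 is the top of the stack
print(exchange(stack, 1, 2))
print(exchange_via_swapn(stack, 1, 2))
```

The simulation ignores the valid ranges of the immediates, which is exactly the caveat stated in item 2.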

@frangio changed the title from "Change definition of EXCHANGE immediates" to "Change definition of m in EXCHANGE" on Dec 12, 2024
Contributor

@shemnon shemnon left a comment


LGTM

@pdobacz
Member

pdobacz commented Dec 12, 2024

Interesting that EEST's Op.EXCHANGE[x, y] already behaves like that (pls someone double check), but it names these args x and y, not n and m. Maybe we can align with EEST: keep n and m as the "raw" args (the "nibble + 1"s) and use x and y as the "user friendly" args, being the 1-based indices of the stack items being exchanged. Then we establish a convention which tells the reader what the pair of args means.

Making a self note to propose a change for evmone to align, if we decide to go ahead here.

And in general, I think I support this change, but I'd like to hear from others too.
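To make the "raw" versus "user friendly" distinction concrete, here is a hedged sketch that decodes an EXCHANGE immediate both ways. It assumes the pre-change spec wording (n and m are each "nibble + 1", and the `n + 1`th item is swapped with the `n + m + 1`th item, 1-based); `decode_exchange` is a hypothetical helper, not an EEST or evmone function.

```python
def decode_exchange(imm: int):
    """Return (n, m, x, y) for a one-byte EXCHANGE immediate.

    n, m: "raw" args as in the pre-change spec text (nibble + 1).
    x, y: 1-based indices of the two stack items that get swapped.
    """
    if not 0 <= imm <= 0xFF:
        raise ValueError("immediate must fit in one byte")
    # Parentheses matter in Python: `imm >> 4 + 1` would parse as `imm >> 5`,
    # and `imm & 0x0F + 1` would parse as `imm & 0x10`.
    n = (imm >> 4) + 1
    m = (imm & 0x0F) + 1
    x = n + 1
    y = n + m + 1
    return n, m, x, y


print(decode_exchange(0x00))
print(decode_exchange(0xFF))
```

Under these assumptions the stack validation rule `stack_height >= n + m + 1` is the same as `stack_height >= y`.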

- `n = imm >> 4 + 1`, `m = imm & 0x0F + 1`
- `n + 1`th stack item is swapped with `n + m + 1`th stack item (1-based).
- Stack validation: `stack_height >= n + m + 1`
- `i = imm >> 4`, `j = imm & 0x0F`
Member


ah, I didn't see this has been changed to i/j already. I propose to align with EEST (and be compatible with past version), so:

  • n and m remain what they were
  • x and y are the new args
  • end with: "xth stack item is swapped with yth stack item (1-based)."

This way, any EVM/test/whatever code, which doesn't update to your convention, still uses the same names for the same things.

Member


given the SWAPn/SWAPN/EXCHANGE consistency argument, I take the above comment back. Pls make sure then that conventions are sound within this document (I think they are as of 44b5781), and I think we'll work from there

Member

@pdobacz pdobacz Dec 16, 2024


welp, there's no equivalence with the old SWAP1..16, but that I think is not fixable, so

EXCHANGE x y ≡ SWAPN x SWAPN y SWAPN x ≡ SWAP[x-1] SWAP[y-1] SWAP[x-1]

encoded in bytecode as

0xe8nibble[x-2]nibble[y-x-1] ≡ 0xe7[x-2] 0xe7[y-2] 0xe7[x-2] ≡ 0x[90+x-2][90+y-2][90+x-2]

Concretely

EXCHANGE 2 3 ≡ SWAPN 2 SWAPN 3 SWAPN 2 ≡ SWAP1 SWAP2 SWAP1

0xe800 ≡ 0xe700e701e700 ≡ 0x909190

The unfixable part can be swept under the rug by the assembler, which would not allow verbatim SWAP1..16 in assembly code but would emit them in output bytecode for low instances of SWAPN[...] as an optimization.

I'd venture to say that DUP/DUPN are aligned (and similarly "unfixable", and it's similarly "fine"). But please double check my ramblings above, I made like 3 errors in the +-1's there
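The encoding in this comment can be written out as a short sketch. It follows the formulas exactly as stated above (x and y as 1-based stack indices with x < y, opcode bytes 0xe7, 0xe8, and 0x90 taken from the comment); the function names are made up for illustration, and the convention itself is the commenter's, not necessarily the one the PR proposes.

```python
SWAPN_OPCODE = 0xE7
EXCHANGE_OPCODE = 0xE8
SWAP1_OPCODE = 0x90


def encode_swapn(x: int) -> bytes:
    """SWAPN x under the comment's convention: immediate byte is x - 2."""
    if not 2 <= x <= 0x101:
        raise ValueError("x out of range for a one-byte immediate")
    return bytes([SWAPN_OPCODE, x - 2])


def encode_exchange(x: int, y: int) -> bytes:
    """EXCHANGE x y: high nibble is x - 2, low nibble is y - x - 1."""
    hi, lo = x - 2, y - x - 1
    if not (0 <= hi <= 0xF and 0 <= lo <= 0xF):
        raise ValueError("x, y not encodable in two nibbles")
    return bytes([EXCHANGE_OPCODE, (hi << 4) | lo])


def encode_legacy_swap(x: int) -> bytes:
    """SWAP[x-1], i.e. opcode byte 0x90 + x - 2, valid for SWAP1..SWAP16."""
    if not 2 <= x <= 17:
        raise ValueError("no legacy SWAP opcode for this index")
    return bytes([SWAP1_OPCODE + x - 2])


print(encode_exchange(2, 3).hex())
print(b"".join(encode_swapn(i) for i in (2, 3, 2)).hex())
print(b"".join(encode_legacy_swap(i) for i in (2, 3, 2)).hex())
```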

Author


This is not the notation/encoding I have in mind. For example, I'd write SWAPN 1 (n=1, encoded 0xe700) as equivalent to SWAP1 (0x90).

I'm going to write some more comprehensive notes to try to get everyone aligned.

@frangio
Author

frangio commented Dec 12, 2024

> pls someone double check

Based on this test it looks like Op.EXCHANGE[x, y] corresponds to n=(x - 1), m=(y - 1) in my proposed notation. x and y are the (1-based) stack heights that will be touched by the instruction.

The problem with this is that it doesn't match the established notation for SWAPn and SWAPN n. I think for user-friendliness there should be consistency across all of SWAPn, SWAPN, and EXCHANGE. For example, SWAP1 should be equivalent to SWAPN 1. More generally:

  • SWAPn should be equivalent to SWAPN n
  • EXCHANGE n m should be equivalent to SWAPN n SWAPN m SWAPN n

My understanding is that in EEST Op.SWAPN[x] corresponds to SWAPN with n=(x + 1) in the spec. You get that Op.SWAP1 is equivalent to Op.SWAPN[0], and Op.EXCHANGE[2, 3] is equivalent to Op.SWAPN[0], Op.SWAPN[1], Op.SWAPN[0]. So in my opinion the convention established in Op.EXCHANGE shouldn't be followed for user-friendly notation.

It's true that the current definitions of n,m are used all over the place though... For example, revm, evmone. So perhaps we should be more careful about the change in the spec, maybe leave the definition of m, define a new variable and add "suggested notation".
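The index conversions described in this comment can be sketched as follows. The two mapping functions encode only what the comment states (Op.EXCHANGE[x, y] corresponds to n = x - 1, m = y - 1 in the proposed notation, and Op.SWAPN[x] corresponds to spec n = x + 1); they are hypothetical helpers and do not call the real EEST API. The stack model assumes SWAPN with spec value n swaps the top with the item n positions below it.

```python
def eest_exchange_to_proposed(x: int, y: int):
    """Map EEST-style 1-based stack indices to the proposed absolute depths (n, m)."""
    return x - 1, y - 1


def eest_swapn_to_spec(x: int) -> int:
    """Map an EEST-style Op.SWAPN[x] argument to the spec's n."""
    return x + 1


def swap_with_top(stack, depth):
    """Model SWAPN with spec value `depth`: swap the top with the item `depth` below it."""
    s = list(stack)
    s[0], s[depth] = s[depth], s[0]
    return s


stack = ["a", "b", "c", "d"]  # index 0 is the top of the stack

# Op.EXCHANGE[2, 3] read as a direct swap of the 2nd and 3rd items.
n, m = eest_exchange_to_proposed(2, 3)
direct = list(stack)
direct[n], direct[m] = direct[m], direct[n]

# Op.SWAPN[0], Op.SWAPN[1], Op.SWAPN[0] applied in sequence.
sequence = list(stack)
for arg in (0, 1, 0):
    sequence = swap_with_top(sequence, eest_swapn_to_spec(arg))

print(direct, sequence)
```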

@frangio frangio mentioned this pull request Dec 13, 2024
@wjmelements

wjmelements commented Dec 13, 2024

> EXCHANGE n m can be understood as exactly equivalent to SWAPN n SWAPN m SWAPN n (minus different valid ranges for n,m).

This seems to refer to the SWAPN indexing, whereby EXCHANGE[1,2] would mean e600. SWAP1 SWAP2 SWAP1 swaps the 2nd and 3rd items.

But I would expect EXCHANGE[2,3] to mean e600, swapping the 2nd and 3rd items. If we're using absolute indices, we should be using DUP indices rather than SWAP indices. DUPN is one-indexed while SWAPN is two-indexed. It makes sense for SWAPN to be two-indexed, to be consistent with the original SWAP series, and because the first item is already implicit. But neither of these is true for EXCHANGE. Please consider using 1-indexing instead of 2-indexing if you are going to be using absolute indices.

Edited to fix the encoding.

@frangio
Author

frangio commented Dec 13, 2024

I think I'm open to EXCHANGE using DUP-like indexing. My assumption was that expectations about SWAPn would carry over to EXCHANGE n m but I'm not sure it's true. And it's not unreasonable to expect people to have to learn a new opcode.

> SWAPN is two indexed

By the way I think it's the other way around, SWAPN is 0-indexed but the 0th item isn't reachable.

@pdobacz
Member

pdobacz commented Dec 16, 2024

> consistency across all of SWAPn, SWAPN, and EXCHANGE

Yes, good point. I think this should take precedence and EEST/code needs to be cleaned up if necessary.

@frangio
Author

frangio commented Dec 16, 2024

I described 3 possible approaches in this document:

https://hackmd.io/@frangio/Bk4Vjj6V1l

My original proposal was the Traditional approach. @wjmelements is vouching for the Exact approach.

I feel pretty strongly that DUPN and SWAPN should follow the Traditional approach. I have a slight preference for using the Traditional approach with EXCHANGE, but I could accept the Exact approach for it as well.

@pdobacz
Member

pdobacz commented Dec 17, 2024

> I described 3 possible approaches in this document:

Thanks, this is really helpful. I personally prefer Exact on all 3 EIP-663 instructions, departing from both SWAPn/DUPn instructions (which, as I proposed in the other comment, can possibly be removed from the assembly language). To me SWAPn/DUPn were always confusing and inconsistent, to the point I don't really treat the "n"s there seriously. If you look at them in conjunction with PUSHn and (taking the argument to the extreme - CREATEn, MSTOREn, LOGn), n is something else every time. We should detach from this convention when we have the chance.

I even prefer Verbatim over Traditional

4 participants