reorder opcodes on nvm3 #212

igormcoelho · 2019-09-27T13:57:04Z

As discussed on #208, it could be a good idea to reorganize opcodes on nvm3.
One thing I would propose is to move PushX opcodes to the beggining, so 05 would be number 5, and so on... this could be connected to a low-level serialization standard.

erikzhang · 2019-09-27T15:27:39Z

0x05 is PUSHBYTES5. So 0x051234567890 is to push 0x1234567890 onto the stack, and 0x05 will be the length prefix.

I don't think moving them is a good idea.

igormcoelho · 2019-09-27T16:58:02Z

PUSHBYTESX (1-75) pattern is not very efficient Erik... I'll explain why.

Currently, we can use PUSHBYTES from 0 to 75 (opcodes 0x00-0x4b), an inheritance from Bitcoin VM (https://en.bitcoin.it/wiki/Script), also the following opcodes PUSHDATA1, 2 and 4 (that goes up to 4GiB). Regarding numbers, we can directly PUSH 1 to 16 only (opcodes 0x51-0x60), another inheritance from Bitcoin. On total, 96 opcodes are dedicated to this.

Yet, if we want to push number 17, which is a small number, this would require two ops: PUSHBYTES1 + 17, and this happens to all small numbers after 16. I am constructing a serialization standard for Neo for a while, inspired on real "Neo" stuff, and I believe the VarBytes pattern on Neo is more efficient than any of these (inherited from Bitcoin).

I propose a simplification that would be very likely more efficient than any of these.

Proposed table:

Opcodes 0-127 (0x00-0x7f): PUSH NUMBER X. So, we would start with PUSHX (push numbers) instead of PUSHBYTES. We would directly push any number from 0 to 127.
Opcode 128 (0x80) : PUSH NUMBER -1. This is good for two reasons: we provide direct access to numbers/symbols up to half-byte; and 0x80 would be negative bigint (-128).
Opcode 129 (0x81) : PUSHVARBYTES. This opcode would replace all existing PUSHBYTES, in currently existing VarBytes pattern: first byte indicates size, up to 253 (small - 2B), then 254 (medium - 3B), then 255 (huge - 4B). This simplifies development a lot. Regarding drawbacks, if we want to push a string with 5 bytes (PUSHBYTES5), now we would need one extra byte: PUSHVARBYTES + 5 + 5 bytes. One bad situation would be to push a tiny number such as 255... one would need: PUSHVARBYTES + 1 + 0xff (3 bytes to push a one byte value).
Opcode 130 (0x82) : PUSHBYTE. Now, if we want to push 255, we do: PUSHBYTE + 0xff (two bytes for one byte). But note that, on NeoVM2 we also needed this after value 16... to push byte 17, we would need PUSHBYTES1 + 17, and I don't think there's a good reason to swap logic after 16, but there's a good reason to swap logic after 127, as proposed here (one good example is BigInteger serialization).
Opcode 131 (0x83): PUSHBYTES2. To push byte 128, we would need to : PUSHBYTE + 0x80, but to push bignumber 128, we would need: PUSHVARBYTES + 2 + 0x8000 (4 bytes for small number). So, probably we will want to have a PUSHBYTES2 + 0x8000 (3 bytes for two, same on NVM2). Perhaps we extend this to 3, 4 maximum, but not 75 bytes (this makes no sense).
Opcode 132 (0x84) : PUSHBYTES3 : good for short/int numbers (can be removed)
Opcode 133 (0x85) : PUSHBYTES4 : good for int numbers (pushbytes5 may be good for 0x00 prefix residual on 32-bit BigIntegers)
Opcode 134 (0x86) : PUSHBYTES8 : good for long numbers (pushbytes9 may be good for 0x00 prefix residual on 64-bit BigIntegers
Opcode 135 (0x87) : PUSHBYTES20 : good for hashes (scripthash)
Opcode 136 (0x88) : PUSHBYTES32 : good for hashes (sha256)
Opcode 137 (0x89) : PUSHBYTES33: good for 32-byte compressed ecc pubkeys
Opcode 138 (0x8A) : PUSHBYTES64: good for 32-byte ecc signatures (or hash512)
Opcode 139 (0x8B) : PUSHBYTES65: good for 32-byte uncompressed ecc pubkeys
Opcodes 140 - 255. What else? Now we need to fit all the rest of opcodes in ~115 slots. I did a rough calculation, and we can do that, and in an even better organized format.

We have:

7 control opcodes
17 stack opcodes
5 string opcodes
5 bitwise opcodes
25 int opcodes
14 array/map opcodes
2 exception opcodes
TOTAL: 75 opcodes

Efficieny on stack operations:
For efficiency, we need more DUPs and more SWAPs (I would guess, at least 4-8), as this would allow a strong reduction on XSWAP patterns on parameter passing (this happens all the time, in all contracts!). This requires a specific study (that we can do), on existing contracts.
example: DUP1,DUP2,DUP3,DUP4 (at least 4, or more).

extra ops:
One LOG operation directly on VM would be nice too, as this is required on all exsiting upper layers (System.Log), and this is likely to happen in all nvm-based blockchain machines.

Another PUSHVARHUGE would be interesting to cover memory allocations after 2^32 bytes, as this may happen if NVM is used on local systems which are 64-bit based (more than 4GiB memory). This can be done after PUSHVARBYTES opcode, or as a special case for value 0x00 (as we will already have PUSH 0).

Some espace reserved for cryptographic/big int modular computations would also be nice, although I would prefer this as an EXTENSION opcode (as discussed in another issue).

=======

Please think about this Erik... I know NVM opcodes by head now, reimplemented these on C++, JavaScript, so I can assure you this is not a crazy man talking 😂 I studied Bitcoin VM, Ethereum VM, and took a look at all proposals of VM for blockchains out of there, although not as much details as these two. I also studied WASM quite a lot. So, we have an opportunity here, since we are open to reorganizing NVM3, this can be an amazing shot.

igormcoelho · 2019-09-27T17:27:56Z

@erikzhang @shargon @lightszero please consider the above proposal.

lightszero · 2019-09-28T01:09:07Z

About reorder opcodes, I agree.
If people want check the nvm,they can run a DASM tool,no need to check bytes.
No need to care about "0x05= PUSHBytes5"

And in my view.
we have too mush PUSHs

I think keep 4 is enough.

pushbytes1
pushbytes2
pushbytes4
pushdata [with dynamic data length]

for number keep
push0
push1
pushM1
is enough.

igormcoelho · 2019-09-28T01:32:12Z

And in my view.
we have too mush PUSHs

@lightszero precisely that.

The proposal is to reduce to few: pushbyte, pushbytes2, 4, 8, 20, 32, 33, 64. These as commonly used. The rest can be varbytes.

Regarding direct number push, I think we need many, otherwise, to push a single byte (such as number 7), we need two opcodes...

shargon · 2019-09-28T09:13:58Z

I think keep 4 is enough.
pushbytes1
pushbytes2
pushbytes4
pushdata [with dynamic data length]

pushbytes20, 16 and 32 are useful for hashes

shargon · 2019-09-28T09:15:01Z

But take in mind that this opcodes never reduce the performance. Maybe with push bytes64, is enough.

igormcoelho · 2019-09-28T13:26:46Z

I didnt understand your last point Shargon, can you clarify? you mean we only need pushbytes64, and the rest can be varbytes? The difference indeed is just one extra byte for size,but if we know we will use it (a lot!),its good to have. E,g, 0x21 is currently pushbytes33... if we dont have that, we will use a lot: pushvarbyte + 33 + key (one byte extra,but it works).
The critical for me is pushbytes2... otherwise number 255 will require 4 bytes (super ugly!): pushvarbyte + 2 + 0xff00.

igormcoelho · 2019-09-30T22:17:11Z

@shargon @lightszero @erikzhang please help me update this comparison table, to compare approaches:

LEGEND:
NeoVM Ops: current implementation of test
NeoVM+ Ops: new implementation of test (current proposal)
Optimal Size: estimated minimum size in bytes for the task (minimum is 1)
Overhead and Overhead+: extra percentual of bytes spent for the task, regarding Optimal Size

Test	Optimal Size	NeoVM Ops	Overhead (%)	NeoVM+ Ops	Overhead+ (%)
push empty bytearray	1	PUSH0	0%	PUSH0	0%
push number -1	1	PUSHM1	0%	PUSHM1	0%
push number 0	1	PUSH0	0%	PUSH0	0%
push number 1-16	1	PUSH1	0%	PUSH1	0%
push number 17-127	1	(127) PUSHBYTES1 + 0x7f	1/2=50%	PUSH127	0%
push number 128-255	2	(128) PUSHBYTES2 + 0x8000	(3-2)/2=50%	(128) PUSHBYTES2 + 0x8000	1/2=50%
push number 256-65k	2-4	(65535) PUSHBYTES3 + 0xffff00	(4-3)/4=25%	PUSHBYTES3 + 0xffff00	1/4=25%
push scripthash	20	PUSHBYTES20 + scripthash	1/20	PUSHBYTES20 + scripthash	1/20
push hash	32	PUSHBYTES32 + hash	1/32	PUSHBYTES32 + hash	1/32
push pubkey (compressed)	33	PUSHBYTES33 + pubkey	1/33	PUSHBYTES33 + pubkey	1/33
push 75 bytes	75	PUSHBYTES75 + 75bytes	1/75	PUSHVARBYTES + 76 + 76bytes	2/75
push 76 bytes	76	PUSHDATA1 + 76 + 76bytes	2/76	PUSHVARBYTES + 76 + 76bytes	2/76
push 255 bytes	255	PUSHDATA1 + 255 + 255bytes	2/255	PUSHVARBYTES + 253 + 2B(255) + 255bytes	4/255
push 65k bytes	65k	PUSHDATA2 + 65k + 65k bytes	2/65k	PUSHVARBYTES + 254 + 2B(65k) + 65kbytes	4/65k

Igor: From this table, it's very positive in my opinion that the new approach is able to drastically reduce overheads in most numeric operations. Please add your conclusions here too, and feel free to update the table with other highlights.

shargon · 2019-10-01T07:54:05Z

PUSHDATAX to PUSHVARBYTES is a very good move.

igormcoelho · 2019-10-01T14:20:28Z

PUSHDATAX to PUSHVARBYTES is a very good move.

It's slightly less efficient (on the table, with ever decreasing impact for bigger values), but much more "standard" regarding current Neo serialization model. I think this counts a lot, to have a strong identity... better than being perfectly efficient, I prefer being simple to understand and easily attract more devs/adopters to the protocol.

shargon · 2019-10-01T15:08:53Z

Agree, one opcode for dynamic size is enough following the neo standard

roman-khimov · 2019-10-01T20:14:45Z

I'd prefer keeping compatibility as much as possible, just changing opcodes (like it was done in Add PUSHNULL and ISNULL #208) doesn't look right to me, these are just numbers that almost no one really cares about. I mean, yeah, all three people implementing NEO VM do care, but other than that, sorry, but no.
If we're to change anything here, then yes, it's better to make something more than just moving opcodes around, which leads me to the next point...
Has anyone tried to analyze currently used verification/application scripts in Testnet and Mainnet? I mean we have quite a lot of real-world scripts there and they probably can answer a lot of questions. It shouldn't be hard to get real statistics for opcode usage, it should also be possible to try to find some common patterns that can be optimized. Of course all of you have a lot of experience with this VM and you kinda 'feel' what is the right opcode set for it, but still using some real data (that we have!) can only help.

igormcoelho · 2019-10-01T20:22:18Z

@roman-khimov that's precisely the point! If you look at the table up there, it's not just reordering, it's a total rewrite of neo-vm push system. It's completely different, focusing on performance and simplicity first, based on lessons learned from neo-vm2. Just to highlight: that table is just for PUSH.
I did some studies already, and the critical part, in my opinion, is having more direct DUPs and SWAPs. But to keep discussions easier, I was focusing on PUSH first (which is much easier), then moving to the rest, afterwards. All contributions are welcome here, it's hard to unite everybody into very small, but impacting changes, like these on the vm. I'm positive that very good things will come into a neovm3.

P.S.: Some studies I did last year, and are available on nvm-optimizer project: https://github.com/NeoResearch/nvm-opt (The Inlining example is the best of all.. reduction of dozens opcodes to just three).
On general, pattern for methods is recurrent, and NEP-5 ICO template is a very good example on how new opcodes could benefit runtime smart contract execution. This happens in all languages I know, C#, Python, Go... and the way to optimize it is via more DUPs and SWAPs, like I just mentioned (EVM has a similar strategy as well).

roman-khimov · 2019-10-01T21:06:59Z

it's a total rewrite of neo-vm push system.

Yeah, I do see it and, to be fair, I do like it. It's just that I'm not sure we're properly taking into account the data that we have in Mainnet/Testnet, real scripts and contracts that we have. You obviously have way more experience with this VM than I do and most probably your suggestions are correct, but for me it would be very interesting to check these suggestions against real data to either prove them or improve them.

For example, PUSHX (0-127) --- do we really need that many of them? What are the most commonly pushed values? I don't know. But checking against real-world scripts can help here. And if we have a VM design very close to Bitcoin's one then we can also consider some data from it.

igormcoelho · 2019-10-05T14:57:38Z

@roman-khimov this proposal has two big aspects involved: (1) simplicity (2) efficiency.

PUSHX0-127 do we really need that many of them? What are the most commonly pushed values?

From simplification perspective we need them, as the idea is to have a nvm much closer to Neo serialization format, where 05 means number 05, etc. It would be nice to know which are often used, but I guess some may never been used (on Neo blockchain, but could be used for local heavy processing using nvm...). We need to separate blockchain applications of nvm (like Neo MainNet / Bitcoin), from general purpose computing on nvm (see snake game on https://neoresearch.io/nvm-learn). So, from an efficiency perspective on blockchain, few people may indeed use PUSH67, perhaps almost never, but its implementation costs the same as any other opcode, and doesnt bring any overhead to the machine, since 0 to 127 will implemented as a subtraction, as it is now. In this sense, the overhead of supporting PUSH0-16 and PUSH0-127 is exactly the same, so if we don't improve anything, we won't make it worse, but we will surely make it simpler.
Numbers 128-255 always confused people, because Neo has bigint support, so it's a good strategy to make people use its bigint version (PUSHBYTES2 ff00). It's also not less efficient than current opcodes, as 128-256 is also currently pushed in same manner.
PUSHVARBYTES is a major simplification over PUSHDATA1, 2 and 4, and it makes us so much closer to Neo serialization standard, adopting the same implementation everywhere (good for global understanding of the platform).

That being said, I'm not against analyzing the whole chain to watch pros and cons of this proposal, but I don't think it will be measurable in any performance metric, as the biggest gain of 50% happens on push numbers 17-127, which are not very common I guess (we would need to infer from PUSHBYTES2 pattern, as these didn't even exist...).
Other changes, like multiple DUPS, will impact more deeply the smart contracts, but we would need to rewrite them to test, and since executions are also very rare in a blockchain (low avg tps), it won't directly affect time. Integrating it on nvm-optimizer is the best shot to demonstrate that we can reduce smart contracts by 70% for example, will something like this convice people perhaps? I think we can easily design some standard benchmarks, for example backtracking over nvm (since it's not just for blockchains!), and this could help demonstrate the advantages of multiple dups (perhaps massive gains for heavy computation).
From simplicity perspective, this proposal helps, I think it should be considered. We need a standard Neo3, that fits easily an specification model, and I think this proposal helps a lot in this sense.

erikzhang · 2019-12-03T02:29:09Z

Please review #235.

erikzhang mentioned this issue Dec 4, 2019

New PUSH Opcodes #247

Merged

roman-khimov mentioned this issue Dec 5, 2019

Redesign PUSH opcodes #235

Closed

erikzhang closed this as completed in #247 Dec 6, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

reorder opcodes on nvm3 #212

reorder opcodes on nvm3 #212

igormcoelho commented Sep 27, 2019

erikzhang commented Sep 27, 2019

igormcoelho commented Sep 27, 2019 •

edited

Loading

igormcoelho commented Sep 27, 2019

lightszero commented Sep 28, 2019 •

edited

Loading

igormcoelho commented Sep 28, 2019 •

edited

Loading

shargon commented Sep 28, 2019

shargon commented Sep 28, 2019

igormcoelho commented Sep 28, 2019 •

edited

Loading

igormcoelho commented Sep 30, 2019 •

edited

Loading

shargon commented Oct 1, 2019

igormcoelho commented Oct 1, 2019 •

edited

Loading

shargon commented Oct 1, 2019

roman-khimov commented Oct 1, 2019 •

edited

Loading

igormcoelho commented Oct 1, 2019 •

edited

Loading

roman-khimov commented Oct 1, 2019

igormcoelho commented Oct 5, 2019 •

edited

Loading

erikzhang commented Dec 3, 2019

reorder opcodes on nvm3 #212

reorder opcodes on nvm3 #212

Comments

igormcoelho commented Sep 27, 2019

erikzhang commented Sep 27, 2019

igormcoelho commented Sep 27, 2019 • edited Loading

igormcoelho commented Sep 27, 2019

lightszero commented Sep 28, 2019 • edited Loading

igormcoelho commented Sep 28, 2019 • edited Loading

shargon commented Sep 28, 2019

shargon commented Sep 28, 2019

igormcoelho commented Sep 28, 2019 • edited Loading

igormcoelho commented Sep 30, 2019 • edited Loading

shargon commented Oct 1, 2019

igormcoelho commented Oct 1, 2019 • edited Loading

shargon commented Oct 1, 2019

roman-khimov commented Oct 1, 2019 • edited Loading

igormcoelho commented Oct 1, 2019 • edited Loading

roman-khimov commented Oct 1, 2019

igormcoelho commented Oct 5, 2019 • edited Loading

erikzhang commented Dec 3, 2019

igormcoelho commented Sep 27, 2019 •

edited

Loading

lightszero commented Sep 28, 2019 •

edited

Loading

igormcoelho commented Sep 28, 2019 •

edited

Loading

igormcoelho commented Sep 28, 2019 •

edited

Loading

igormcoelho commented Sep 30, 2019 •

edited

Loading

igormcoelho commented Oct 1, 2019 •

edited

Loading

roman-khimov commented Oct 1, 2019 •

edited

Loading

igormcoelho commented Oct 1, 2019 •

edited

Loading

igormcoelho commented Oct 5, 2019 •

edited

Loading