Assembly language feature: add new levels for Yul #13247
Comments
In fact we already did this 2-3 years ago, when we discussed "Yul++" (see this hackmd) at the prompting of @SilentCicero, which eventually turned into yulp. What other Yul+ do you have in mind? Regarding the lower level, we have had similar discussions over the years, most recently when the new stack allocator was created and it turned out to be unneeded. |
Thanks, I'm aware of yulp, which however is not maintained by the Solidity team, if I understand correctly, and there is also no active development going on (the repo is even archived). In terms of design, I must admit I actually like yulp, and it could be a good starting point for Yul+. Regarding the low-level version I have the following points to mention:
|
I’m in favor of more access to lower-level stuff, unless switch statements can be auto-optimized to be as efficient as jump tables. I’ve written a Yul-based sort (https://gist.github.com/Vectorized/7b3a1fff3832bad126fdcba0ae785275) that costs 2x more gas than the 1st and 2nd place entries of 2018’s Solidity golfing contest (the top 2 entries abuse jump tables extensively to implement sorting networks efficiently). |
I support giving devs access to lower level functionality specifically access to stack and JUMPs. |
I also favor a Yul+/Yul++ variant. This would be immensely beneficial for writing code where a manual high optimization is needed (e.g. in cryptography) that neither Solidity nor current Yul can provide. weierstrudel by AztecProtocol is a good example - they had to write huff lang for weierstrudel implementation to achieve that kind of optimization. Like @pcaversaccio said - I think it'd be nice to have another Yul version (Yul+/Yul++) that has access to stack-related opcodes ( |
With regard to a higher-level intermediate representation, I think it would be ideal to go beyond Yul++ and use something that does not access memory/storage directly, à la slithIR. This is more conducive to analysis/optimizations that are prohibitive or difficult to perform on Yul. More broadly, I'm in favor of moving towards a multi-level representation that is progressively lowered. Ideally, this would be very modular and allow third parties to write optimization passes or do source-to-source translation with greater ease. |
+1, Solidity needs a mechanism to rip out all the guard rails when necessary or the incentive to eject entirely becomes too great when maximal efficiency is required. |
I think Yul++ makes the most sense name-wise for this extension: Yul+ already exists (although archived), and Yul++ would differentiate it from that. (For example, I have a Yul+ <-> Foundry repo which would show up when people search for this Yul extension.) I do favor adding this though, especially as features like #9889 and many from Yul+ like |
+1 for allowing jumps and labels. I am trying to reverse a contract based on some pseudocode from the ethervm decompiler. |
One feature that would be incredibly impactful would be the ability to insert pure opcodes/mnemonics. This would give the ultimate flexibility and control, and many other feature requests could be solved by this one feature. Furthermore, this change would allow for interoperability with Huff or any other language or tool that compiles to bytecode. As for the actual implementation, one idea would be that opcodes can only be inserted inside Yul functions, or perhaps there's a special type of function just for this, where the args get put on the stack and the return values are whatever gets added to the stack.
function mySolidityFunc() {
    assembly {
        opcodes myOpcodeFunc(arg1, arg2) => ret {
            DUP1 // [arg1, arg1, arg2]
            MUL  // [arg1^2, arg2]
            SUB  // [arg1^2 - arg2]
        }
        let ans := myOpcodeFunc(2, 1)
    }
}
|
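The stack comments in the proposed `opcodes` block can be checked with a toy model. The Python sketch below is not an EVM implementation: it assumes the arguments arrive with `arg1` on top of the stack (as the comments in the proposal suggest), it supports only the three opcodes used there, and the helper name `run` is made up for illustration.

```python
WORD = 2 ** 256  # EVM arithmetic wraps modulo 2**256


def run(ops, stack):
    """Execute a list of opcode mnemonics on a stack; stack[0] is the top."""
    stack = list(stack)
    for op in ops:
        if op == "DUP1":
            # Duplicate the top stack item.
            stack.insert(0, stack[0])
        elif op == "MUL":
            a, b = stack.pop(0), stack.pop(0)
            stack.insert(0, (a * b) % WORD)
        elif op == "SUB":
            # SUB computes (top - second), wrapping on underflow.
            a, b = stack.pop(0), stack.pop(0)
            stack.insert(0, (a - b) % WORD)
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")
    return stack


# myOpcodeFunc(2, 1): arg1 = 2 on top, arg2 = 1 below it.
result = run(["DUP1", "MUL", "SUB"], [2, 1])
print(result)
```

With `arg1 = 2` and `arg2 = 1` the model leaves a single value, `arg1^2 - arg2 = 3`, on the stack, which agrees with the stack layouts written next to each opcode.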
@devtooligan how is that different from a normal Yul function? You already have access to |
In my mind it's about being able to paste in bytecode. It can be hundreds of SLOC. You don't need or want to rewrite it in Yul if you're maintaining it with Huff. You don't like the idea, @leonardoalt? What are your concerns? |
Most of the Yul built-in functions translate directly into the opcodes, so I feel like it's just redundant. |
Right now Huff can only deploy whole contracts. Would be awesome if we could find a way for Huff to interact with Solidity. |
I guess you could use Yul's verbatim for that? |
I think it's about interoperability on the opcode level here. The same would probably apply to Fe. The question we need to ask ourselves is whether we consider this an isolated Solidity/Yul issue or whether we think about transpilation/interoperability. IMHO when Huff becomes more mature you will work only with Huff and not consider porting it to Solidity/Yul. So maybe such a feature request is a lifecycle problem. Also, as pointed out by @leonardoalt above, it's already possible to create bytecode sequences that will not be modified by the optimizer. |
@leonardoalt Thanks this looks promising 🫡 |
It looks like the exact thing you need for your initial comment haha |
What about jumps and labels? |
Ser my initial research has shown that Found this related issue: |
This issue has been marked as stale due to inactivity for the last 90 days. |
This is still relevant. Please don't close. |
@cameel could you please remove the |
Sure, but don't worry, the bot would remove the label anyway, it just runs only once per day. So the issue would not be closed since you commented on it. EDIT: Ah, I see it even already did. |
Since Solidity doesn't support verbatim, I had to be creative. If you, like me, need this feature TODAY, this is how you can do it:
contract ExampleImpl {
// Workaround for calling arbitrary code from Solidity
function _verbatim() private pure returns (uint256 output) {
assembly ("memory-safe") {
let ptr := mload(0x40)
// Force a constant to be represented as 32 repeated '7E' in the runtime code.
mstore(ptr, 0x7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E7E)
output := mload(ptr)
}
}
function add(uint256, uint256) external pure returns (uint256) {
// Calls the injected bytecode (reusable)
return _verbatim();
}
function sub(uint256, uint256) external pure returns (uint256) {
// Calls the injected bytecode that replaces the `PUSH32 0x7F7F...5B` constant
return 0x7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F7F5B;
}
}
library CodeInjection {
/**
* @dev Finds 32 consecutive occurrences of the same byte in a byte sequence.
* Returns the memory pointer of the match, or zero if not found.
**/
function indexOfBytes32(bytes memory bytecode, uint8 haystack) internal pure returns (uint256 pos) {
assembly ("memory-safe") {
// Transform `0x7E` into `0x81818181818181...`
haystack := or(haystack, shl(8, haystack))
haystack := or(haystack, shl(16, haystack))
haystack := or(haystack, shl(32, haystack))
haystack := or(haystack, shl(64, haystack))
haystack := or(haystack, shl(128, haystack))
haystack := not(haystack)
let size := mload(bytecode)
pos := add(bytecode, 32)
// Efficient algorithm to find 32 consecutive repeated bytes in a byte sequence.
// It looks at chunks of 32 bytes, and works even if the constant is not aligned.
// `end` is declared outside the loop so it is still in scope for the final check.
let end := add(pos, size)
for {
let chunk := 1
} gt(chunk, 0) { pos := add(pos, chunk) } {
// Transform all `0x7E` bytes into `0xFF`
// 0x81 ^ 0x7E == 0xFF
// Also transform all other bytes into something different from `0xFF`
chunk := xor(mload(pos), haystack)
// Find the right-most unset bit, which is equivalent to finding the
// right-most byte different from `0x7E`.
// ex: (0x12345678FFFFFF + 1) & (~0x12345678FFFFFF) == 0x00000001000000
chunk := and(add(chunk, 1), not(chunk))
// Round down to the nearest power of 256, i.e. a whole-byte boundary
// (2 ** 8 == 1 mod 0xff, so the modulus isolates the sub-byte part).
// Ex: 2 ** 18 becomes 2 ** 16
chunk := div(chunk, mod(chunk, 0xff))
// Compute how many bytes to advance so that the next window starts just
// past the right-most byte different from `0x7E`.
// Rationale:
// Multiplying a number by a power of 2 is the same as shifting the bits to the left
// 1337 * (2 ** 16) == 1337 << 16
// Once the chunk is a power of 256 it always shifts entire bytes; we use this to
// select a specific byte in a byte sequence.
chunk := shr(248, mul(0x201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a090807060504030201, chunk))
// Stop the loop if we go out of bounds
// obs: can remove this check if you are 100% sure the constant exists
chunk := mul(chunk, lt(pos, end))
}
// The loop also stops when it runs out of bounds, so return zero unless a
// full 32-byte window starting at `pos` fits inside the bytecode.
if gt(add(pos, 32), end) { pos := 0 }
}
}
}
contract Example is ExampleImpl {
// NOTE: Each replacement code MUST be exactly 32 bytes in size (31 + opcode),
// take no inputs, and push exactly ONE value onto the stack.
// PUSH26 0x04 CALLDATALOAD PUSH1 0x24 CALLDATALOAD ADD
bytes32 private constant SUM_BYTECODE = 0x7900000000000000000000000000000000000000000000000000043560243501;
// PUSH26 0x24 CALLDATALOAD PUSH1 0x04 CALLDATALOAD SUB
bytes32 private constant SUB_BYTECODE = 0x7900000000000000000000000000000000000000000000000000243560043503;
constructor() payable {
// In Solidity, base constructors run before the derived constructor,
// so `ExampleImpl` is fully set up by the time this constructor body executes.
// Copy `ExampleImpl` runtime code into memory.
bytes memory runtimeCode = type(ExampleImpl).runtimeCode;
// Find the location of the constant `0x7E7E7E...`
uint256 sumPos = CodeInjection.indexOfBytes32(runtimeCode, 0x7E);
require(sumPos > 0, "code injection failed, constant 0x7E7E7E7E.. not found");
uint256 subPos = CodeInjection.indexOfBytes32(runtimeCode, 0x7F);
require(subPos > 0, "code injection failed, constant 0x7F7F7F7F.. not found");
// Replace the first occurrence of `0x7E7E..` in the runtime code with `SUM_BYTECODE`
assembly ("memory-safe") {
// Replace '0x7E7E...' by some arbitrary code
mstore(sumPos, SUM_BYTECODE)
// Replace '0x7F7F...' by some arbitrary code
mstore(subPos, SUB_BYTECODE)
// Return the modified bytecode
return (add(runtimeCode, 32), mload(runtimeCode))
}
}
} |
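The skip arithmetic inside `indexOfBytes32` is easier to follow when modelled outside the EVM. The Python sketch below mirrors the same steps on 256-bit integers: broadcast and invert the byte, XOR, isolate the lowest unset bit, round down to a power of 256, then look up the step in the `0x201f…0201` table. It is a simplified model, not the contract code: the names `skip` and `index_of_run` are made up, it returns a byte offset (or `-1`) instead of a memory pointer, and it only inspects windows that fit entirely inside the input.

```python
MASK = (1 << 256) - 1  # emulate 256-bit EVM words
TABLE = 0x201f1e1d1c1b1a191817161514131211100f0e0d0c0b0a090807060504030201


def skip(window: bytes, byte: int) -> int:
    """Return 0 if all 32 bytes equal `byte`, else how many bytes to advance."""
    assert len(window) == 32
    # Broadcast `byte` to all 32 positions and invert (0x7E -> 0x8181...81).
    pattern = int.from_bytes(bytes([byte]) * 32, "big") ^ MASK
    # Matching bytes become 0xFF, everything else becomes something different.
    chunk = int.from_bytes(window, "big") ^ pattern
    # Isolate the lowest unset bit; wraps to 0 when every bit is set.
    chunk = ((chunk + 1) & MASK) & (~chunk & MASK)
    if chunk == 0:
        return 0  # on the EVM, div(0, 0) == 0 yields the same result
    # Round the power of two down to a power of 256.
    chunk //= chunk % 0xFF
    # Shift the table by whole bytes and read its top byte.
    return ((TABLE * chunk) & MASK) >> 248


def index_of_run(code: bytes, byte: int) -> int:
    """Offset of the first run of 32 consecutive `byte` values, or -1."""
    pos = 0
    while pos + 32 <= len(code):
        step = skip(code[pos:pos + 32], byte)
        if step == 0:
            return pos
        pos += step
    return -1


print(index_of_run(b"\x60\x00" + b"\x7e" * 32 + b"\x00", 0x7E))
```

The step is always one past the right-most mismatching byte of the window, so no candidate start position containing that byte is ever examined twice.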
v cool! same vibes |
Abstract
Add new levels for Yul: Yul+ being a higher-level version and Yul- being a lower-level version.
Motivation
The current version of Yul can be viewed as a pretty high-level assembly language, since it provides, for example, no access to the stack or to control-flow instructions. I am opening this issue to initiate a discussion on adding potential new levels to the current version of Yul. The idea started with my Twitter thread here and feedback by @leonardoalt. The current idea is to add new levels of Yul: Yul+ being a higher-level version and Yul- being a lower-level version, giving the compilation stack more layers. Building something similar to LLVM IR would be really nice.