Intrinsics for the VM's storage opcodes. #2508
We should certainly error out at compile-time if storage intrinsics are used in a script!
I think that already happens. The problem arises when you take a local variable's address in a script and then call a (wrapper) function in the contract that in turn calls one of these intrinsics with that address: that gives a runtime error.
This is great! Thanks Vaivas :) Three small requests:
I meant to do the 3rd one in this PR and forgot all about it 🤦♂️. I'll do all three of these and update back here. Thanks Mo.
One thing I am curious about is the portability of these intrinsics. Given that the EVM's word size is different, the quad word intrinsic would have to load |
+1 for this.
Aren't we going to have different intrinsics for different targets? So maybe since these are Fuel specific they should be I was 🤔 about this the other day actually -- platform-specific stuff in the IR should probably be generalised with wrappers. We're going to have to do this with ASM blocks, which are currently FuelVM specific in the IR, and so any platform-specific stuff will need the same. Storage counts here -- it's very, very similar between FuelVM and EVM but not the same.
Hmm, my mental model has been that we will use the same intrinsics and accomplish portability by implementing the codegen differently. Otherwise we aren't increasing portability with intrinsics relative to asm blocks, since you'd have to rewrite both of them. Platform-specific stuff can live in asm perhaps? How do portable intrinsics work in e.g. Rust? I guess they're platform specific too, thinking about things like the AVX intrinsics... maybe there's a middle ground where some are disallowed for certain backends, like the ones related to the Fuel tx frame -- those would need to be disallowed for EVM. Just having a stream of consciousness here...
I was thinking yeah, a certain subset of intrinsics will just not compile, with an error, if your target isn't right for them. Some intrinsics simply won't exist for certain targets, especially if we wrap basic ASM ops (in the ongoing effort to remove ASM blocks from the libs in the name of type safety). ASM blocks themselves are going to have to change drastically, though this isn't really the place for that discussion. But they're going to have to be more opaque than they are now, and maybe intrinsics should be the same. We could actually have just one 'intrinsic' which is parameterised with an enum. I'm not sure if this is less work and/or easier to implement though.
We also need to standardize how we define what intrinsics are allowable for what backends, and it should be thoroughly considered.
What would be the benefit of restructuring in this way?
There is one intrinsic whose behaviour is entirely data driven based on the target. You could add or modify targets without having to change the actual compiler core/type checker.
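To make the "one intrinsic parameterised with an enum, driven by target data" idea concrete, here is a minimal Rust sketch. Everything in it is hypothetical: the `Target` and `IntrinsicKind` enums, the `supported_targets` lookup and the `check_intrinsic` function are illustrative names, and the support entries are assumptions rather than decisions made in this PR.

```rust
/// Hypothetical compilation targets.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Target {
    FuelVm,
    Evm,
}

/// Hypothetical intrinsic kinds; a single parameterised intrinsic would carry one of these.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IntrinsicKind {
    StateLoadWord,
    StateStoreWord,
    StateLoadQuad,
    StateStoreQuad,
    SizeOf,
}

/// Support lookup. The entries are illustrative assumptions; in a data-driven
/// design this information could be loaded from a per-target description.
pub fn supported_targets(kind: IntrinsicKind) -> &'static [Target] {
    match kind {
        IntrinsicKind::StateLoadWord
        | IntrinsicKind::StateStoreWord
        | IntrinsicKind::StateLoadQuad
        | IntrinsicKind::StateStoreQuad => &[Target::FuelVm],
        IntrinsicKind::SizeOf => &[Target::FuelVm, Target::Evm],
    }
}

/// Rejects an intrinsic at compile time when the current target does not support it.
pub fn check_intrinsic(kind: IntrinsicKind, target: Target) -> Result<(), String> {
    if supported_targets(kind).contains(&target) {
        Ok(())
    } else {
        Err(format!(
            "intrinsic {:?} is not available for target {:?}",
            kind, target
        ))
    }
}
```

With this shape, adding a target or changing what it supports only touches the lookup, not the type checker that calls `check_intrinsic`.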
I've now updated So that leaves only the question of names for these intrinsics (the discussion going on above related to different targets). I'll update the names once that is finalized, but the rest of the code can be reviewed now I think.
Comments, below, all good otherwise.
let val_reg = if matches!(
    &self.context.values[val.0].value,
    ValueDatum::Instruction(Instruction::IntToPtr(..))
Rather than special-casing `IntToPtr` here, would it work instead to update `resolve_ptr()` to handle both `GetPointer` and `IntToPtr`?
I considered that, but there's some computation that's done later which isn't needed for `IntToPtr` (but is needed if resolved as a stack pointer in `resolve_ptr`). So it wouldn't serve any purpose since I'll need to handle them differently here anyway.
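The distinction being discussed can be sketched roughly as follows. This is not the compiler's actual code: the `Instruction` enum here is a simplified, hypothetical stand-in, `address_register` is an invented helper, and the emitted strings are pseudo-assembly. It only illustrates why an `IntToPtr` value needs no further computation while a stack pointer does.

```rust
/// Simplified, hypothetical stand-ins for the IR's value kinds.
#[derive(Debug, Clone, PartialEq)]
pub enum Instruction {
    /// An integer reinterpreted as a pointer; the register already holds the address.
    IntToPtr { int_reg: String },
    /// A pointer to a local, resolved to a byte offset from the stack base.
    GetPointer { stack_offset: u64 },
}

/// Returns the pseudo-assembly needed to materialise the address, together
/// with the name of the register that ends up holding it.
pub fn address_register(inst: &Instruction) -> (Vec<String>, String) {
    match inst {
        // Nothing to compute: reuse the integer's register directly.
        Instruction::IntToPtr { int_reg } => (Vec::new(), int_reg.clone()),
        // A stack pointer needs base + offset computed into a fresh register.
        Instruction::GetPointer { stack_offset } => {
            let reg = String::from("r_addr");
            let op = format!("add {} base {}", reg, stack_offset);
            (vec![op], reg)
        }
    }
}
```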
@@ -0,0 +1,486 @@
contract {
    fn main<ea1a0f91>() -> u64, !3 {
As per #2505 we really need to move these tests away from here and use `FileCheck`. This is a bit ridiculous. 😃
let loaded_word = __state_load_word(key);
asm (l: loaded_word) {
    l: T
}
Can we not turbofish this one here? Was that also part of the discussion around this issue?
You mean using `std::mem::read<T>`? That wouldn't work as it ends up actually doing a load, assuming the value to be an address (in case of non-reference types).
The asm block is just casting the loaded word to `T` -- if it's a copy type then we should be able to use `__state_load_word::<T>(key)` to get it in the right type from the start, no? I didn't check, is `__state_load_word()` just returning a `u64` right now? Maybe it should return `T`.
Or is this just the problem we can't solve until we have trait constraints?
It really only returns `u64`, so declaring it to return `T` would be incorrect. But if it returns only `u64`, then when it is used with `is_reference_type<T>` in a generic function, we end up with the problem we have here. If we could restrict it with trait constraints, then we wouldn't have this call in a non-`u64` context at all.
Right, OK. I kinda feel like we should only have two intrinsics -- read and write, and whether they need to use `SRW`/`SWW` or `SRWQ`/`SWWQ` can be decided by the back end. Then the read functions could just return `T`.
But I'm sure that won't work either for some reason. OK, this'll do for now but I can feel the technical debt growing weekly. 😄
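As a rough illustration of the "back end decides" idea, the opcode choice could hinge on the size of `T`. The sketch below is an assumption-laden Rust model, not compiler code: `StorageReadOp`, `select_read_op` and `select_read_op_for` are hypothetical names, and it assumes an 8-byte word and a 32-byte quad word, with anything larger left out of scope.

```rust
use std::mem::size_of;

/// The two storage read opcodes named in the discussion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StorageReadOp {
    Srw,
    Srwq,
}

/// Assumed sizes: one 64-bit word and one quad word (four words).
const WORD_BYTES: usize = 8;
const QUAD_BYTES: usize = 32;

/// Picks the opcode from the size of the value being read.
pub fn select_read_op(size_in_bytes: usize) -> Result<StorageReadOp, String> {
    if size_in_bytes <= WORD_BYTES {
        Ok(StorageReadOp::Srw)
    } else if size_in_bytes <= QUAD_BYTES {
        Ok(StorageReadOp::Srwq)
    } else {
        // Values spanning several slots are not handled in this sketch.
        Err(format!("{} bytes does not fit in one storage slot", size_in_bytes))
    }
}

/// Generic front door: a single `read<T>` could defer to this in the back end.
pub fn select_read_op_for<T>() -> Result<StorageReadOp, String> {
    select_read_op(size_of::<T>())
}
```

For example, `select_read_op_for::<u64>()` selects the word opcode and `select_read_op_for::<[u8; 32]>()` the quad opcode, so the caller never names either.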
asm() {
    cfei i32;
    srwq v k;
};
Next is an `__alloca()` intrinsic to wrap `CFEI` since it's a bit weird to grab the stack pointer with a call but then allocate some space with an ASM block. I guess it could go in a library function for now...
sway-lib-std/src/alloc.sw
Outdated
pub fn alloca<T>() -> u64 {
    let current_pointer = stack_ptr();
    asm() {
        cfei i32;
This is allocating 32 bytes, where it needs to allocate `__size_of<T>()`.
Oops, yes. Let me fix this.
Dropping this (and reverting the related commits) as it needs to be an intrinsic. The argument to `cfei` must be an immediate :-(
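The immediate requirement is why this has to live in the compiler: only there is the size of `T` a known constant when the instruction text is produced. A minimal Rust sketch of what such an intrinsic's codegen might do follows. The names `emit_alloca` and `emit_alloca_for` are hypothetical, and the 24-bit immediate width is an assumption that should be checked against the FuelVM specification.

```rust
/// Assumed width of the `cfei` immediate field; verify against the FuelVM spec.
const IMM_BITS: u32 = 24;

/// Emits the `cfei` instruction text for a size known at compile time,
/// rejecting sizes that would not fit in the assumed immediate field.
pub fn emit_alloca(size_in_bytes: u64) -> Result<String, String> {
    let max = (1u64 << IMM_BITS) - 1;
    if size_in_bytes > max {
        return Err(format!(
            "allocation of {} bytes exceeds the immediate limit of {}",
            size_in_bytes, max
        ));
    }
    // The size is baked into the instruction as an immediate operand.
    Ok(format!("cfei i{}", size_in_bytes))
}

/// Generic helper: the size of `T` is a constant by the time code is generated.
pub fn emit_alloca_for<T>() -> Result<String, String> {
    emit_alloca(std::mem::size_of::<T>() as u64)
}
```

A library function cannot do the same thing because, inside an `asm` block, the size would have to arrive in a register rather than as a literal.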
Alright. Approving and then I'm out of here. Have a good weekend!
Thank you, and you too!
This PR provides the following intrinsics:
__state_load_word (key: b256) -> u64;
__state_store_word (key: b256, value: u64);
__state_load_quad (key: b256, addr: u64);
__state_store_quad (key: b256, addr: u64);
All of them compile to equivalent IR instructions. The last two (quad versions) required some changes to asm generation since they use `int_to_ptr` as inputs to the IR instructions for quad storage.

While adding the tests, I discovered a flaw: if the intrinsics are called from a script, passing an address that belongs to the script, we hit a runtime error (because the address is now referenced in the contract, not the calling script). This indicates that we'll need some static analysis around passing addresses to flag such issues. Another issue would be taking the address of a `const`, if that is allowed at all; I haven't checked. These are issues with the `__addr_of` intrinsic rather than this one, so I suppose I can file a separate issue for that?

Closes #2514
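For readers unfamiliar with the shape of these four intrinsics, here is a toy Rust model of their signatures. It is only a sketch under stated assumptions: the `Vm` struct and its methods are invented for illustration, storage slots are assumed to be 32 bytes, a word is assumed to occupy the first 8 bytes of a slot in big-endian order, and unset slots are assumed to read as zero. It does not model the script-versus-contract memory ownership problem described above; an out-of-range address is simply reported as an error.

```rust
use std::collections::HashMap;

/// A 32-byte storage key, standing in for `b256`.
pub type Key = [u8; 32];

/// Toy VM state: flat memory plus key/value storage with 32-byte slots.
pub struct Vm {
    pub memory: Vec<u8>,
    pub storage: HashMap<Key, [u8; 32]>,
}

impl Vm {
    pub fn new(mem_size: usize) -> Self {
        Vm { memory: vec![0u8; mem_size], storage: HashMap::new() }
    }

    /// Model of `__state_store_word(key, value)`.
    pub fn state_store_word(&mut self, key: Key, value: u64) {
        let mut slot = [0u8; 32];
        slot[..8].copy_from_slice(&value.to_be_bytes());
        self.storage.insert(key, slot);
    }

    /// Model of `__state_load_word(key) -> u64`; unset slots read as 0 here.
    pub fn state_load_word(&self, key: Key) -> u64 {
        match self.storage.get(&key) {
            Some(slot) => {
                let mut word = [0u8; 8];
                word.copy_from_slice(&slot[..8]);
                u64::from_be_bytes(word)
            }
            None => 0,
        }
    }

    /// Model of `__state_store_quad(key, addr)`: copies 32 bytes from memory into storage.
    pub fn state_store_quad(&mut self, key: Key, addr: usize) -> Result<(), String> {
        let end = addr.checked_add(32).ok_or("address overflow")?;
        let bytes = self.memory.get(addr..end).ok_or("address out of range")?;
        let mut slot = [0u8; 32];
        slot.copy_from_slice(bytes);
        self.storage.insert(key, slot);
        Ok(())
    }

    /// Model of `__state_load_quad(key, addr)`: copies a 32-byte slot into memory.
    pub fn state_load_quad(&mut self, key: Key, addr: usize) -> Result<(), String> {
        let slot = self.storage.get(&key).copied().unwrap_or([0u8; 32]);
        let end = addr.checked_add(32).ok_or("address overflow")?;
        let dst = self.memory.get_mut(addr..end).ok_or("address out of range")?;
        dst.copy_from_slice(&slot);
        Ok(())
    }
}
```

The word variants pass the value directly, while the quad variants take an address, which is why only the quad variants are exposed to the address-passing problem.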