-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow specifying storage locations #597
Comments
not so! |
See Nick Johnson's Library on upgradeability :) https://gist.github.com/Arachnid/4ca9da48d51e23e5cfe0f0e14dd6318f#file-upgradeable-sol |
Especially with contract upgrades in mind, wouldn't it be better to copy the storage layout and "disable" unused state variables by e.g. prefixing them? Otherwise I don't see how you would practically verify that the storage layout is consistent between upgrades. |
Is there documentation on how storage layout is determined? On Wed, May 25, 2016, 1:54 AM chriseth notifications@github.com wrote:
|
Ok, so after reading up on storage layouts...
In this example, Now, consider I upgrade it to the following.
This would end up with So, instead, I propose being able to do the following.
The solidity compiler would see that Does that make sense? Is this possible? |
I was looking for a complementary/similar feature: the ability to disable packing. (i.e. currently if two storage parameters are each < 256 bits and together they fit into one slot, they are packed together.) Ultimately the compiler could optimise the packing based on the frequency of changes to one ore more variables within. With your suggestion this is a given, each marked variable gets its own slot. I would use a different markup though:
|
the |
I think the tradeoff between introducing errors and decreasing readability is much better when just adding |
^ 👍 for the inheritance structure...it overall is cheaper and more cost effective to do it that way. I envision a lot of modularity around dapps in the future in regards to storage to better handle updates and save gas. |
This came up again as a discussion with @federicobond and I think a good middle ground could be to have an annotation (as proposed in #597 (comment)), but instead of marking a storage slot, it would rather have a string literal as a key, which is hashed to produce a 256-bit key for storage. This would be more expensive (due to the fact of using 32-byte long constants and one couldn't combine multiple variables into a single slot), but might be justified for some. When this annotation is missing, it would default to the current behaviour. For syntax I propose:
|
|
I really don't think solidity should have such low-level impact on the storage location. If you want to dislocate storage variables, why not use structs or a mapping to structs? |
One more possible other solution: contract MyContract {
storage("some-collection") {
uint foo;
uint bar;
}
storage("other-collection") {
mapping (uint => bool) qux;
MyStruct baz;
}
} The advantage of this is that contracts could define blocks of variables that are colocated in storage, but providing gaps, to extend structs later, etc. |
Just throwing this as an idea: given that this need arises from avoiding clashes when working with upgradeability, wouldn't it make sense to just avoid clashing by storing all variables in a hashed location, similar to how a mapping works? We could either store all variables from the same contract/struct together (the hash being a contract identifier, and variables are stored at offsets of that hash), or all individual variables in sparse hashed locations. The issue remains on how to generate an identifier for a contact, to ensure there are no clashes between different contracts, but that identifier is more robust than a simple name. Maybe requiring a special constant with a random value for every contract that will use this approach, similar to old Java's |
Adds an EIP for getting logs by a block hash.
There was also a lengthy related discussion in #4017. |
This came up again with #7891 . If we want to expose really general control we need three components:
Natural restrictions would apply (violating would result in compile time errors):
I would suggest to make all such specifiers optional. Variables without specifiers before any variables with specifiers will be assigned slots as before.
For the purpose of inheritance: locations are assigned just as if it was one flat contract containing all variables in the order of C3 linearization. Example (we can always decide on a different syntax):
An alternative notation-wise would be to merge slot and offset into a single byte offset that is then split into
|
Another alternative would be to require specifying the location for all variables, if the location is specified for any variable. Also we could at a later point allow compile time evaluated expressions in the specifier, i.e.:
Although we'd need to consider that one could construct those to specifically collide with some mapping key, so this would be dangerous. Although that's also true for choosing some specific value for |
Maybe we should gather some data about how this feature would be used. One use is avoiding clashes during upgrades, another is having more efficient use of storage by combining small variables in a certain way. I think just providing full flexibility all the time might not be the way to go as it is too easy to get wrong. So it could already be enough to only allow hashed locations and another way to specify which variables to combine (without specifying the offset exactly) or when to insert "start a new slot here". |
What can "go wrong"? Or in particular, what can go wrong that we can't easily detect at compile time? That way we can always extend the very same solution to support more cases, instead of needing breaking changes and new language features... |
For the sake of upgrades, it'd seem that the only requirement is to be able to assign an immutable As for EIP2330 linked above, the requirements are pretty much the same. As long as there is a deterministic process for calculating the storage slot, the actual slot can then be just exposed in the ABI for any consumers. |
An alternative, maybe more complex, approach we've been wondering about would be to already move further towards generally decoupling storage from C3 linearization. In particular, the idea is based on the fact that conceptually the https://eips.ethereum.org/EIPS/eip-7201 pattern could be adjusted to abstract contract Example {
struct MainStorage {
uint256 x;
uint256 y;
}
function _getExampleMainStorage() internal virtual view returns (MainStorage storage);
function _getXTimesY() internal view returns (uint256) {
MainStorage storage $ = _getMainStorage();
return $.x * $.y;
}
}
contract FinalContract is Example {
// keccak256(abi.encode(uint256(keccak256("example.main")) - 1)) & ~bytes32(uint256(0xff));
bytes32 private constant EXAMPLE_MAIN_STORAGE_LOCATION =
0x183a6125c38840424c4a85fa12bab2ab606c4b6d0e7cc73c0c06ba5300eab500;
function _getExampleMainStorage() internal override view returns (MainStorage storage $) {
assembly {
$.slot := EXAMPLE_MAIN_STORAGE_LOCATION
}
}
} Now we're not suggesting to do just that, since it has the same issues of the compiler being oblivious to the actual storage layout. contract FinalContract is Example {
storage{slot: keccak256(abi.encode(uint256(keccak256("example.main")) - 1)) & ~bytes32(uint256(0xff))}
Example.MainStorage exampleStorage;
function _getExampleMainStorage() internal override view returns (MainStorage storage) {
return exampleStorage;
}
} However, having to access storage via indirection of internal functions is still generally inconvenient, but the concept could be further turned into compiler-builtin mechansisms by introducing abstract contract ExampleA {
struct MainStorage {
uint256 x;
uint256 y;
}
// A virtual storage variable does *not* get a storage location itself directly.
// It forces the contract to become `abstract` and requires an override
// providing the actual location in the most-derived contract
MainStorage virtual $;
function _getXTimesY() internal view returns (uint256) {
return $.x * $.y;
}
}
// Alternatively even:
abstract contract ExampleB {
uint256 virtual x;
uint256 virtual y;
function _getXTimesY() internal view returns (uint256) {
return x + y;
}
}
In turn, we could allow and require the most-derived (and - at least for now - only the most derived contract) to "override" the inherited state variables, providing them with a particular location, while allowing to specify explicit storage locations. As in: function erc7201_slot(string memory _id) pure returns (bytes32) {
return keccak256(abi.encode(uint256(keccak256(_id)) - 1)) & ~bytes32(uint256(0xff));
}
contract FinalContractA is ExampleA {
storage{ slot: erc7201_slot("example.main") }
MainStorage override ExampleA.$;
}
// or in the alternative version:
contract FinalContractB is ExampleB {
storage{ slot: erc7201_slot("example.main") }
uint256 override ExampleB.x;
uint256 override ExampleB.y; // would be laid out after `Example.x` following the usual rules
} To make this safer, the logic could be that if any inherited state variable is However, this would mean that each and every state variable would have to be laid out explicitly in the most derived contract, which may be overly verbose. Conversely, we could consider doing something similar on the contract-level instead of the level of state variables, along the lines of contract FinalContractB2 is ExampleB {
storage{ slot: erc7201_slot("example.main") }
storage(ExampleB) override; // good syntax is a bit tricky here, though
/// storage ExampleB.* override; // there may be better options or variants syntactically
// TODO: what about bases of bases?
} This would lead to something similar as already suggested: contracts could declare their entire storage "virtual" (pending syntax - whether reusing
So, in particular, we're wondering about more opinions on the following:
|
Yes it is desirable to declare a variable and specify where it should be placed in storage. ERC-1967 is the simplest example where this would be used. Ideally the compiler should be able to check for clashes though, and I don't know if this is possible in the general case. It is possible if you restrict to locations of the form of ERC-1967 (for single-slot values) or ERC-7201, and I think these are general enough for EIP-7702 purposes, but there are other conventions that people might want to use. One point about per-state-variable control of layout is that it should not be applicable to
I agree on the comment in parentheses. This incorrect assumption should not be a motivation for any changes.
Should
I would consider it a burden but because it leaks implementation details of base contracts. Authors of reusable contracts should be able to ship changes to their code, including to storage variables, without breaking downstream code!
This also leaks implementation details (especially bases-of-bases), although it is something library authors may already be familiar with (via forced override in functions, see Thoughts on override(A, B) syntax).
Not sure if this is what you meant by "occurs linearly" but it may be intuitive to say that if a base contract and its own base contracts form an independent subtree in the inheritance DAG (i.e., there are no edges into this subtree other than to its root) it should be possible to remove it from storage linearization and put it elsewhere in storage. You could refine this further by saying that you only consider contracts that have linear/non-relocated storage. I've written too much at this point but I can elaborate later if this sounds interesting. |
Yeah, I definitely understand the desire of not leaking implementation details - there is some friction, though, between having 1. fine-grained control over individual state variables, 2. safely ensuring that all state variables are at "safe" locations and 3. hiding implementation details. E.g. if two bases, which are supposed to be located at distinct storage areas, add a shared base with a state variable (as an implementation detail), I don't see a choice other than being explicit about the location of that shared base (resp. its state variables) similarly to the existing disliked override logic for virtual functions (or just disallowing the relocation in that case). Similarly, relocating individual state variables while keeping others private may be dangerous - there may be sane solutions to this (just as an example, relocation of specific storage variables could only be allowed if simultaneously being explicit about the storage location of the containing structure, i.e. contract or inheritance subtree). However, we don't need to fully solve this for an initial version that allows for EIP-7702 safety (for which relocating the entire inheritance graph in bulk in (only) the most-derived contract should be enough initially), so for the time being this is mainly a concern for ensuring that our method/syntax ideally remains future-proof.
Here the main question is whether we should make this a hard requirement or whether the expectation is rather to ship explicitly adjusted EIP-7702-versions of contracts. I'd assume that for use in the context of 7702 contract code would need to be at least reinspected (e.g. against potential clashes due to storage use via inline assembly, resp. the 7201 pattern) in any case - however, if there is a strong desire to be able to use fully unmodified base contracts that are merely relocated via inheritance, we can take that into account.
It's clear what you mean there and yes, under that condition relocating a full subtree is safe. So for the time being, I'd summarize our state of discussion as follows:
|
Hey all, I'm not sure how wildly incorrect I might be (fairly new to this part of ethdev), but I guess it's worth floating this idea I had, although it breaks account storage persisting through different code delegations. Let me know if that's an issue. Would it be possible to slightly modify what SLOAD/SSTORE does for accounts with set code (7702), such that instead of affecting slot X for the account, it affects slot x for a fictional account at address bytes20(uint160(keccak256(account . 7702_address_field)))? This way, we maintain the benefit that contracts don't need to be rewritten with namespaced storage, and app devs don’t have to worry about it. All the while maintaining guarantees that whatever contract a user delegates control to won't have any storage collisions regardless of how other contracts the user may have delegated to managed storage (i.e. you don't have to be aware of previous contracts' potential messups or users accidentally corrupting their storage). |
I think there are pros and cons to both ways of implementing it:
The global storage position feels "easier", but it also feels a bit more like a temporary solution to a very specific usecase rather than an end goal, which makes the engineer in me feel icky. |
Given that ERC-7201 is now well defined, in the |
Thank you everyone for the input so far. Looks like we have more clarity on the possible use cases so it's time to start moving things forward on getting the syntax finalized. The biggest obstacle here is still choosing something that will be easy to extend in the right direction without making contract definitions misleading, unreadable or too repetitive. We discussed it at length and all the options so far have drawbacks. Since we need to decide something anyway, here's my proposal that I wanted to put up as a more concrete basis for further discussion: https://notes.ethereum.org/@solidity/explicit-storage-layout-syntax This is by no means final, but I think it reconciles our own concerns with many of the requirements posted here so far. Not all of them could be incorporated though and we're still open to alternative proposals or amendments. The proposal presents 3 variants, with the first one being only just enough to cover EIP-7702 and the other two trying to accommodate other use cases. Note that in either case we'd only implement enough to allow specifying the location for the most derived contract, which would basically be this, respectively:
|
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
Thanks for sharing the proposal! I like variant 3, specifically layout blocks, but What do you think is the subset that should be implemented initially? One thing that I see listed as a potential requirement (great summary, by the way) but not addressed in the proposals is the issue of collisions. Ideally the compiler would be able to guarantee that a layout is collision-free. I don't think this can be done for arbitrary expressions, but it should be possible if restricted to known formulas, e.g. native Solidity formulas for arrays and mappings, or ERC-7201. Constants pose a problem because they could be hash outputs. Do others agree that this is an important requirement? |
I think collision detection would be quite valuable. It's not clear to me why 1) would not support
be exactly enough to conform to 7201? Could someone elaborate? |
All of the variants support ERC-7201 but also support other expressions and that's one way that collisions could come in. But I don't think we'd want to restrict to ERC-7201 only, for example there is ERC-1967, and there may be other schemes people want to use. That said this is something that was considered in ERC-7201 for the design of the annotation, whose format is |
Just specifying the location for the whole hierachy. If you mean beyond that, then as @ekpyron said above, the next step would be to allow relocating specific base contracts or subtrees. But how quickly we proceed with that strongly depends on how certain we feel that this design is robust and extensible enough. For now we need anyone interested to try to poke holes in it and bring up stuff that we may be overlooking :)
TBH we assumed that people would actually be pushing for more expressivity here rather than for restricting it. This is exactly the kind of feedback we need here :) This is something we could consider if there's clear consensus that this is what everyone wants, but for now it seems to me like it might be a bit too restrictive. I'd be worried about missing some common formulas from pre-ERC-7201 schemes being in active use. So far we were not planning to enshrine any particular method of selecting locations, rather leaving that up to ERCs/conventions/libraries. The idea was to allow any expression that can be evaluated at compilation time (as long as the resulting locations don't overlap in an obvious way within a single contract hierarchy). Currently that set is pretty limited, but still includes constants. With time it would be extended to include enough things to cover ERC-7201 (keccak, conversions, some subset of ABI encoding), and eventually we'd get full compile-time evaluation support (but that might only ship on top of the new type system). With that, common formulas for locations could be considered for inclusion in the standard library.
Because it will relocate the whole inheritance hierarchy, including |
@frangio We discussed the idea of limiting the available expressions on the design call today, but so far we're not really convinced this is something that should be enforced by the compiler. We're open to being convinced otherwise, but for now we'd still rather leave that up to convention. Guaranteeing safety against collisions may be hard to reconcile with allowing more than one scheme and we really don't want to limit this feature to a single convention, even if it's ERC-7201. Also, this would have to work nicely not only with the current scheme used by the compiler but also the future one, which is likely to be introduced in response to Verkle Trees. The current scheme will need to change in certain ways because Verkle will make not colocating data together (as is the case for dynamic types now) more expensive. I also updated the proposal to address your comments there. I removed the override from the ERC-7201 example. In place of an annotation I added a |
We decided to go with variant 2. Here's the spec for what exactly will be implemented: Initial syntax for explicit storage locations. As I said before, it's just the minimal version of it, though we did include a compile-time helper for the ERC-7201 formula to nudge people towards using the established conventions rather than arbitrary expressions. This is the new syntax in a nutshell: contract C layout at 0x1234 {} contract FinalContract
layout at erc7201("FinalContract")
is BaseA, BaseB
{} From the feedback we got here so far, it seems to me that it should be a generally acceptable base to extend later as needed. We're planning to start working on it after we're done with some other features which have kept us busy so far. If anyone has some reservations and can provide strong arguments for why this may be bad or insufficient, this is the moment to speak up. |
A potential issue is that contracts that manually implement custom storage locations will be unaffected by Solidity's native layout rerooting. In a sense this should be expected because it's a very low level operation. But it would be good if Solidity provided some way for this kind of logic to adapt to a potentially rerooted layout... For example, if the base of the layout was available as a constant. Maybe something like |
@frangio Sounds like a reasonable feature. We can add something like this, though I can't promise it will make its way into the initial implementation (which is almost ready at this point). There are also some things that need to be clarified about how it would work in corner cases related to possible future extensions:
|
I think keep pointing at the empty spot.
My intuition is it makes more sense to point at the storage of only |
Inline assembly has now made fully up upgradable contracts possible. One of the main hangups with this is that the storage locations have to stay the same across upgrades. Would it be possible to introduce support for specifying the storage locations for storage variables?
The text was updated successfully, but these errors were encountered: