Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to new address generation #1000

Closed
alpe opened this issue Sep 13, 2022 · 18 comments · Fixed by #1014
Closed

Improvements to new address generation #1000

alpe opened this issue Sep 13, 2022 · 18 comments · Fixed by #1014
Milestone

Comments

@alpe
Copy link
Contributor

alpe commented Sep 13, 2022

Feedback on the new address generation highlighted 2 cases that we could improve:
A) (existing) factory contracts that create new instances may run into address collusions with static labels
B) Overloading the label to be used as a salt may damage the label use case

See #942 for context

Some solutions to discuss would be:
A) adding the Init Message to the hash generation
B) adding an optional salt or nonce field to the MsgInstantiateContract. When sent from a contract or not set by the CLI we default to
B1) a constant or empty string
B2) sender, label or init message again

// tag: @the-frey , @ethanfrey , @webmaster128

@ethanfrey
Copy link
Member

I fully support (A) and had raised that issue before. I never saw a reason explaining why it was not included, so adding it in before tagging v0.29.0 seems very obvious improvement.

For (B) I am open to adding a field, but it should be optional and would only be set by the client, as it would be api breaking on the contract interface. I have a contract which uses a different label with the same initMsg to create multiple sub-contracts. I have been recommending this pattern to get multiple addresses controlled by one contract, and cw1-whitelist doesn't really offer meaningful differences in initMsg in such cases.

In order to maintain the control of the prehash space, I strongly suggest (B2) with label.

I think we need (A) + (B2) with label in order to avoid breaking existing factory contracts.

@webmaster128
Copy link
Member

webmaster128 commented Sep 13, 2022

Issue A is a good point. I'm not convinced Issue B is a problem. The label is an instance description and any automated system can put some nice human readable counter in there.

Adding the init message does not help to prevent Issue A. It might make it less likely but you can have a factory that uses constant init messages. Trying to add a binary representation of the init message can lead to all sorts of problems as there is an infinite amount of valid binary representations of the init message. You need to have certainty to know the exact init message not only in object form but byte-by-byte when generating the address. That's easy to get wrong. Now, even if it works, you cannot pre-generated addresses and remain flexible with the init message later.

Adding a field to a protobuf message is a breaking change (as discussed here: https://github.com/cosmos/cosmos-sdk/blob/main/docs/architecture/adr-044-protobuf-updates-guidelines.md).

Adding fields to Msgs. Adding fields is a not a Protobuf spec-breaking operation. However, when adding new fields to Msgs, the unknown field rejection will throw an error when sending the new Msg to an older node.

If we want to do that, we should create a second MsgInstantiateContract message type.

@ethanfrey
Copy link
Member

Adding the init message does not help to prevent Issue A. It might make it less likely but you can have a factory that uses constant init messages. Trying to add a binary representation of the init message can lead to all sorts of problems as there is an infinite amount of valid binary representations of the init message.

This is the semantics of CREATE2, which we had been following in the issue and seems to cover the Ethereum use cases well. I did not hear any disagreement of it until it was somehow missing in the implementation.

You need to have certainty to know the exact init message not only in object form but byte-by-byte when generating the address. That's easy to get wrong.

If you use the same client code to pre-generate the address as you use to claim it, it should make the same JSON and will work. For contracts, we don't really care if you can pregenerate it, just that it is different. Some factory contracts I have seen use different initMsg, but a constant label. Others use the same initMsg but custom labels. I haven't seen one in the wild that leaves them both the same.

The above argument is in practice a negligible problem and much outweighed by not breaking a large number of existing factory contracts.

Now, even if it works, you cannot pre-generated addresses and remain flexible with the init message later.

That is actually a point in favor. If you try to pre-generate one on the client side (not the contract side), you are committing to the actual contract that will be placed there. If you say it will be some escrow contract with certain rules, you can show someone your sender, label, the code hash and the initMsg and they can verify that it generates the address and thus you have a provable content without deploying. That is a feature, not a bug.

@ethanfrey
Copy link
Member

For the record, I am against adding the salt. I think it is unnecessary complexity with minimal utility. But I figured adding an optional field couldn't hurt and if someone wanted it, my feeling wasn't strong enough to block it.

@webmaster128
Copy link
Member

I'm not suggesting to ignore the issue. But I don't think it is reasonable to break instantiations with constant init message. So I guess we need to stick to the old behaviour for MsgInstantiateContract and have a different message for the predictable address generation.

You have a good point there regarding fixing the init message. It can be done, sure. But I think we first need to address the bigger issue.

@hussein-aitlahcen
Copy link

I'm not suggesting to ignore the issue. But I don't think it is reasonable to break instantiations with constant init message. So I guess we need to stick to the old behaviour for MsgInstantiateContract and have a different message for the predictable address generation.

You have a good point there regarding fixing the init message. It can be done, sure. But I think we first need to address the bigger issue.

If introducing a new message keep deployed contracts behavior intact, I'm in favor of that. Is there any drawback in this case?

@alpe alpe added this to the v0.29.0 milestone Sep 16, 2022
@alpe
Copy link
Contributor Author

alpe commented Sep 16, 2022

Thanks for the feedback and discussions!
IMHO we should accept for A) that the creator needs to submit the exact same bytes for the init message. There is risk to get it wrong as @webmaster128 pointed out but we are in good company doing the same as CREATE2.
We could look into normalization of the init message later, to address this risk better.

I would expect predictable addresses are not the main use case and only a very small number of accounts will exists before the contract instantiation in practice. I want to assume that in these cases people know what they do and tested it proper before.
There are some good benefits this feature brings and we can iterate on this.

About B) let's stick to the plan and use the label only

If introducing a new message keep deployed contracts behavior intact ...

This would be the cleanest solution but is not compatible with the wasmvm v1 api.

@alpe
Copy link
Contributor Author

alpe commented Sep 16, 2022

We can use SortJSON to normalize the initMsg. It is also used in getSignBytes

@ethanfrey
Copy link
Member

Great idea. hash(normalize(initMsg)) makes a lot of sense

@hussein-aitlahcen
Copy link

We can use SortJSON to normalize the initMsg. It is also used in getSignBytes

Is this function doing a deep sort? I think it's really hard in practice to enforce that with nested structures. Perhaps better to treat the initMsg as opaque json blob.

@webmaster128
Copy link
Member

I don't understand why we putting any focus on details of a solution that is not solving the full problem described in this ticket. If we do not restore the old behavior for MsgInstantiateContract, we break instantiations with constant init message and label that worked before. This can have disastrous efects for deployed contracts. If you agree to ignore those, please be explicit about it.

If introducing a new message keep deployed contracts behavior intact ...

This would be the cleanest solution but is not compatible with the wasmvm v1 api.

Why? I don't think so. We need to restore the old behaviour for MsgInstantiateContract. Then we can create a second MsgInstantiateContract2 which implements the new address derivation. Initially MsgInstantiateContract2 will not be usable from a contract but support for it can be added. A client then has to decide which message they prefer to use.

@hussein-aitlahcen
Copy link

I don't understand why we putting any focus on details of a solution that is not solving the full problem described in this ticket. If we do not restore the old behavior for MsgInstantiateContract, we break instantiations with constant init message and label that worked before. This can have disastrous efects for deployed contracts. If you agree to ignore those, please be explicit about it.

If introducing a new message keep deployed contracts behavior intact ...

This would be the cleanest solution but is not compatible with the wasmvm v1 api.

Why? I don't think so. We need to restore the old behaviour for MsgInstantiateContract. Then we can create a second MsgInstantiateContract2 which implements the new address derivation. Initially MsgInstantiateContract2 will not be usable from a contract but support for it can be added. A client then has to decide which message they prefer to us

I agree, this approach has less friction with the existing semantic. Similarly to the new EVM (disclaimer: I'm definitely not a fan of EVM but think that their approach to extending the VM is great) opcode, we would then have a new WasmMsg::Instantiate2 variant right?

@alpe
Copy link
Contributor Author

alpe commented Sep 19, 2022

Based on the feedback we had another round of discussion (@webmaster128 , @webmaster128 and me) an came up with the plan to:

  • support 2 ways of address generation: classic and predictable addresses (not available to contracts, yet)
  • add new MsgInstantiate2 for predictable addresses with fields: salt, include_init_msg (default false)
  • init msg is taken as raw bytes without normalization for MsgInstantiate2
  • not reused/repurposed label for address generation, resolving the nice-to-have critique point (B)

@webmaster128
Copy link
Member

webmaster128 commented Sep 20, 2022

The implementation work was started in #1014 💪

Biggest open question to me is the shape and size of the salt. Bytes can be inconvenient to work with. Looking at the CLI and test code, I wonder if we better make this a string instead of bytes. Then the spec allows no salt > 255 bytes. I think we should fix this and don't make it configurable by the app because the rest of the spec is also not configurable.
On Ethereum the salt is a fixed length binary value (bytes32). It is often provided in the integer form (uint is the same as uint256, which again is the same as bytes32). Using an integer is nice in CLIs. We don't have easy access to uint256 here and I guess uint64 would be too small if you want collision free salt generation using an RNG.

Any thoughts on that matter?

@alpe
Copy link
Contributor Author

alpe commented Sep 21, 2022

size of the salt.... don't make it configurable

👍 Any preferences on the value? I would start with 255 (to not be blocked on this) but this can be modified easily

string instead of bytes.

A string would be more convenient on the CLI but in the API I see strings for humans while bytes are for machines. There should not be any semantic attached to the salt other than raw data.
Using an integer is a smart way for the CLI! I would still postpone this for a quick solution, now. There are other places where people enter encoded bytes, too.

@webmaster128
Copy link
Member

👍 Any preferences on the value? I would start with 255 (to not be blocked on this) but this can be modified easily

I'd say 64 bytes is a reasonable max size. This allows you to encode counters up to uint512 or use the hash result of both sha256 and sha512 as the value. Anything above that just encourages doing strange things.

@alpe
Copy link
Contributor Author

alpe commented Sep 21, 2022

Do we need predictable address generation for gov proposals in this first iteration? I would prefer to add this later if needed

@webmaster128
Copy link
Member

Please also note https://medium.com/cosmwasm/dev-note-3-limitations-of-instantiate2-and-how-to-deal-with-them-a3f946874230 for important updates on Instantiate2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants