-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extensible address format #5694
Comments
So this would certainly break existing addresses so I want to make sure we get this right and allows us to be future adaptable and flexible (i.e. we can deal with future use-cases without necessarily knowing them right now). As for the actual proposal, I'm curious what pre-hash "conditions" provide over just prepending a byte prefix apart from not revealing the condition (e.g. key type)? It seems like the byte-prefix is much simpler and quite flexible. Although I'd advise against the app constructor. We already are aware of the need to refactor the config usage in the SDK so we might as well tie this into that refactor. Thoughts @ethanfrey @zmanian? |
It made sense when designing from the ground up (weave), to avoid collisions and not require any prefix on the address (just the preimage). Also we know what an issue it can become reserving prefix bytes and ensuring no collisions. For the current case of the sdk, backwards compatibility is key. So secp256k1 keys must continue to work as is. For other curves, you can prepend a "type byte" on the addresses (making them 21 bytes), or prepend a byte/string/etc on the preimage (public key) before hashing - this would be I think it is essential that this does not break existing addresses, thus there must be a special case for secp256k1 of a "nil" prefix. Other than that, I have no opinion on which route you choose to take - I think the developers who will be writing and maintaining this should choose which approach works best for them. |
ACK on pre-image prefix with an empty byte slice for SECP256K1 keys. |
So I like the conceptual integrity of uniform 20 byte addresses. And prefixing the key type with the preimage. I don't see why information about the key needs to be in both the preimage and the digest("Address"). |
So it sounds like we're basically aligning on something similar to the weave condition approach. Which means all new pubkeys (besides secp256k1) will have their type prepended to the pubkey bytes before applying sha256 and taking 20 bytes, i.e. Weave also includes an extension prefix: https://github.com/iov-one/weave/blob/master/conditions.go#L32. Not sure that's necessary but we should document the approach clearly and maybe a registry of prefixes somewhere. Clients will need to know this stuff. Groups would likely just have the I do wish we knew more about the actual collision resistance of this approach. The best info we have is what @ethanfrey shared above and also @alpe mentioned that this was not flagged as an issue in weave's audit report (https://github.com/iov-one/weave/blob/master/audit/IOV_Audit_Report_2019.1-Final%20with%20Release.pdf). So I guess that's validation. I just feel horribly unqualified to make this sort of call myself. |
Yes, correct. Simple and clean. We will have an endpoint that returns all registered key types. |
One other consideration is whether or not Address() should stay part of the PubKey interface. I would actually argue that apps should specify how they convert pubkeys into addresses so they can control the set of address prefixes. Thoughts?
With the current PubKey interface, I think a registry of prefixes as @alexanderbez suggests would be hard.
|
Not necessarily. We could keep the Tendermint logic intact, no? If left intact, yes, we would not be able to call and use |
Okay, makes sense to not change anything Tendermint side and have a different route for obtaining the address. In that case though I would suggest an |
Yeah, I'm not opposed to having curves/keys (re)-defined in the SDK (properly -- no amino). Any opposition to this @zmanian @marbar3778? |
And even without redefining all of them, we could simplify the requirements for signature algorithms just needed in the SDK and not tendermint - just relaxing the interface. |
Sha256(KeyTypePrefix || Length || Keybytes) truncated to 20 bytes seems like the basic answer for address generation that almost any signature system can use. |
@aaronc do you think there is a capacity to do this in the 0.39 release? I'm a bit skeptical. Perhaps this should be slated for 0.40? |
Well it may not be that involved. If we're moving keys to the sdk it should probably be done at the same time. @alpe do you possibly have any bandwidth to raise a PR to address this? It seems like what we've aligned on here is pretty close to the weave approach. |
I can start the basic implementation early next week but I guess migration and tests will make this a big package of work. Let's see |
Given the discussions in #7737 about the guidelines in the current ADR 028 draft being insufficient, I want to push for more in depth analysis of this and getting the opinion of 1 or more people who are professional cryptographers. I know that it will be a heavy lift to not use 20 byte addresses everywhere in the SDK because of the staking store key format. But I don't think that should be the reason we don't consider longer addresses. Cryptography should be the motivating reason IMHO and if we need to refactor staking store keys or whatever to accommodate that, then we have to do it sooner or later. Anyway, my request is simply that we come to a decision on this that involves some more in depth cryptography to know we're making good choices for today and the future. @alessio any luck getting input from the cryptographer at AiB you mentioned? @ebuchman is there anyone on your team that could lean in? @andrey-kuprianov I know you're not a cryptographer but you seem to have a good grasp of some of this and I would appreciate your input. |
CC'ing @alain2sf |
Addresses should definitely move to 32 bytes imo, as collision resistance is a necessary property for many applications. #7737 highlights the need for this in a sub-area of IBC, and smart contracts and bitcoin scripts require this. (Its already a problem that they don't offer collision resistance at the moment) Also collision attacks are really powerful, its going to be extremely hard to consider all the possible cross-overs where collisions on addresses are possible that will cause critical failures, and its highly likely some will be missed. Also fully agreed that given collision resistant address hashes, the address hash should take as input the pubkey type enum (encoded in a non-malleable way) as Zaki mentioned in #5694 (comment) With collision resistant hashes + prefixing, you get the guarantee that a pubkey -> privkey attack on one signature algorithm doesn't break all other signature algorithms. Without collision resistant hashes, you must use post-fixing to get the same guarantee. |
A very insightful discussion happening here, as well as in #3685, #4789; I will be happy to contribute my view on this. I completely agree with @ValarDragon that extending the address length is the only way to allow addresses of different types, various signature types, etc. I don't quite understand how concatenating different types of public keys in the pre-image of the hash function allows to achieve the goal. Suppose we have If changing the address scheme at all, I have another suggestion which might seem radical, but I hope is actually reasonable, given a wide range of usage scenarios the address should cover. I propose to
What will this bring?
It is important that this structure is in the address post-image, not in the pre-image. This provides the separation of security between various address kinds. The only drawback will be the increased bandwidth, but as correctly pointed out by @ValarDragon:
If the bandwidth is really an issue, this can be addressed e.g. as follows:
|
Thanks @andrey-kuprianov. I agree we should re-consider post-image prefixing as it is the only way that guarantees no attacks between different address types. I am not sure whether or not addresses should be fixed length or variable length. In general, variable length makes sense to me. The two reasons not to do it that I see are really:
There are various ways to address 1, but a simple way is to simply not use the address as part of the key at all, but instead assign every address a unique As for 2, yeah it's annoying but UI designers can deal with it and it shouldn't limit changes needed for security. |
This only holds once you've increased hash length to be collision resistant. (Which imo should be made default regardless of any other decision, and that things not being collision resistant should require lots of thought) Then assuming security of the hash function, the desired property holds. No need for post-image prefixing here. I'd prefer to frame post-image prefixing as keeping different address types in different address spaces (Then adding a general method for combining different address spaces into one single one, and this method must be the same across all applications). I'm pretty opposed to this as a standalone solution, without guarantees that each address type is itself collision resistant. Collision attacks are a really powerful tool for the attacker, and the lack of collision resistance will make the system very mis-use vulnerable. From the developer perspective, its extremely plausible that people are going to make mistakes in choosing the right address space (or using an address space in incorrect ways, e.g. making dependent txs, scripting / contracts, etc.), which will then be a catastrophic failure without collision resistance. Making our API misuse resistant is in my opinion a key requirement for any framework building high security software. Its an anti-feature to force developers to have to think about collision resistance vulnerabilities whenever they want to work with addresses. Separately, I also view namespacing each address type as high complexity. You need all different smart contracts, and software interacting with the chain to reason about the chains address space methods (which themselves are very liable to update). Further, if you want to interact with multiple chains, they may each have different address spacing methods, or you enforce a single namespacing mechanism for chains that communicate via IBC. |
So if we were going to do post-image prefixing, I think it makes sense to have a global prefix registry like https://github.com/multiformats/multicodec. That's not actually that hard - we just have a csv file in a repository. Also, we can assume all module accounts have the same prefix and address space because they are always chain dependent. So the prefix table should only include prefixes for real public keys plus the single prefix for module accounts. In that way, it might not actually be such a big undertaking and hopefully developer UX will be reasonable. I agree we need to make sure each address type is itself collision resistant. Maybe pre-image prefixing is enough if you have a long enough address, but again it's still a "one rotten apple spoils the whole bunch" type setup. Either way, looks like we're definitely looking at variable address lengths because I don't think there would be much appetite for rewriting all the existing 20 byte address as 32 or 40 bytes. |
I also completely agree with @andrey-kuprianov and @ValarDragon , extending the address length is the best way to allow different type of addresses, it's important to offer new signature schemes and a better diversity and support in cryptographic algos/models. |
Let's start investigating all things which will break when changing the address format. My initial list is in the issue below, let's edit that issue (or move it to other place where everyone can edit it): #8041 |
+1 for adapting the Algorand algorithm. I was also proposing it in the ADR-028 review: #7086 (comment) 3 months ago. In general, with big enough address space and good hash function we shouldn't need post-image prefixes/suffixes. The problem with prefixes / suffixes is that we need to keep track of them and they expand the address space even more (either we create some enums, or verbose codes). It's worth to note that we plan for the future where we will have many complex accounts (composed accounts) with different rules and different composition mechanism. Adding a post-image prefix/suffix for each variation might be cumbersome. WDYT? |
I am a bit confused here, reading through the Algorand specification linked in the referenced comment above they are using an entire 32 byte public key plus a checksum in base 32 to make addresses more resilient for use by users (copy/paste, etc). Specifically which part of the Algorand specification are you proposing here? It seems there are two contexts involved with regards to the Algorand info... on chain/state store binary formats and those exposed to users, roughly "transit formats". The bech32 address format in use today seems to be sufficient for transit [base 32 + checksum] and the extra checksum specified in the Algorand documentation doesn't seem very useful for internal system functions that are not taking input from a user directly. |
We are not going to use it, it was just an idea on how they hash the address.
|
Use Cases and Goals
Currently we are planning to add new address types for:
Currently secp256k1 keys and multisig accounts occupy the same overlapping 20 byte address space. We would like an extensible approach that:
Potential Approaches
Weave Conditions
As described in #3685 by @ethanfrey:
And as @ethanfrey explained this approach should be sufficient with 20 bytes of address space:
21-byte addresses
This approach would add a 1 byte address type prefix to addresses, for up to 256 potential address types. Apps would then need to register individual address prefixes likely in the app constructor.
This has the advantage of explicitly eliminating the chance of address collision, although maybe that is not an issue as @ethanfrey explained above. There is the added overhead of initializing prefixes in the app constructor.
I'm really open to either approach as I'm not an expert in this area. Either way, it would be good to have a standardized well-documented approach.
What do people think? /cc @alexanderbez
The text was updated successfully, but these errors were encountered: