diff --git a/EIPS/eip-712.md b/EIPS/eip-712.md
index 16e14a9065e5fc..6fbb24cfc5f249 100644
--- a/EIPS/eip-712.md
+++ b/EIPS/eip-712.md
@@ -41,43 +41,7 @@ Here we outline a scheme to encode data along with its structure which allows it
 
 ## Specification
 
-### Signatures and Hashing overview
-
-A signature scheme consists of hashing algorithm and a signing algorithm. The signing algorithm of choice in Ethereum is `secp256k1`. The hashing algorithm of choice is `keccak256`, this is a function from bytestrings, 𝔹⁸ⁿ, to 256-bit strings, 𝔹²⁵⁶.
-
-A good hashing algorithm should satisfy security properties such as determinism, second pre-image resistance and collision resistance. The `keccak256` function satisfies the above criteria _when applied to bytestrings_. If we want to apply it to other sets we first need to map this set to bytestrings. It is critically important that this encoding function is deterministic and injective. If it is not deterministic then the hash might differ from the moment of signing to the moment of verifying, causing the signature to incorrectly be rejected. If it is not injective then there are two different elements in our input set that hash to the same value, causing a signature to be valid for a different unrelated message.
-
-### Transactions and bytestrings
-
-An illustrative example of the above breakage can be found in Ethereum. Ethereum has two kinds of messages, transactions `𝕋` and bytestrings `𝔹⁸ⁿ`. These are signed using `eth_sendTransaction` and `eth_sign` respectively. Originally the encoding function `encode : 𝕋 ∪ 𝔹⁸ⁿ → 𝔹⁸ⁿ` was defined as follows:
-
-* `encode(t : 𝕋) = RLP_encode(t)`
-* `encode(b : 𝔹⁸ⁿ) = b`
-
-While individually they satisfy the required properties, together they do not. If we take `b = RLP_encode(t)` we have a collision. This is mitigated in ethereum/go-ethereum#2940 by modifying the second leg of the encoding function:
-
-* `encode(b : 𝔹⁸ⁿ) = "\x19Ethereum Signed Message:\n" ‖ len(b) ‖ b` where `len(b)` is the ascii-decimal encoding of the number of bytes in `b`.
-
-This solves the collision between the legs since `RLP_encode(t : 𝕋)` never starts with `\x19`. There is still the risk of the new encoding function not being deterministic or injective. It is instructive to consider those in detail.
-
-As is, the definition above is not deterministic. For a 4-byte string `b` both encodings with `len(b) = "4"` and `len(b) = "004"` are valid. This can be solved by further requiring that the decimal encoding of the length has no leading zeros and `len("") = "0"`.
-
-The above definition is not obviously collision free. Does a bytestring starting with `"\x19Ethereum Signed Message:\n42a…"` mean a 42-byte string starting with `a` or a 4-byte string starting with `2a`?. This was pointed out in ethereum/go-ethereum#14794 and motivated Trezor to not implement the standard as-is (see trezor/trezor-mcu#163). Fortunately this does not lead to actual collisions as the total length of the encoded bytestring provides sufficient information to disambiguate the cases.
-
-Both determinism and injectiveness would be trivially true if `len(b)` was left out entirely. The point is, it is difficult to map arbitrary sets to bytestrings without introducing security issues in the encoding function. Yet the current design of `eth_sign` still takes a bytestring as input and expects implementors to come up with an encoding.
-
-### Arbitrary messages
-
-The `eth_sign` call assumes messages to be bytestrings. In practice we are not hashing bytestrings but the collection of all semantically different messages of all different DApps `𝕄`. Unfortunately, this set is impossible to formalize. Instead we approximate it with the set of typed named structures `𝕊`. This standard formalizes the set `𝕊` and provides a deterministic injective encoding function for it.
-
-Just encoding structs is not enough. It is likely that two different DApps use identical structs. When this happens, a signed message intended for one DApp would also be valid for the other. The signatures are compatible. This can be intended behaviour, in which case everything is fine as long as the DApps took replay attacks into consideration. If it is not intended, there is a security problem.
-
-The way to solve this is by introducing a domain separator, a 256-bit number. This is a value unique to each domain that is 'mixed in' the signature. It makes signatures from different domains incompatible. The domain separator is designed to include bits of DApp unique information such as the name of the DApp, the intended validator contract address, the expected DApp domain name, etc. The user and user-agent can use this information to mitigate phishing attacks, where a malicious DApp tries to trick the user into signing a message for another DApp.
-
-## Specification
-
 The set of signable messages is extended from transactions and bytestrings `𝕋 ∪ 𝔹⁸ⁿ` to also include structured data `𝕊`. The new set of signable messages is thus `𝕋 ∪ 𝔹⁸ⁿ ∪ 𝕊`. They are encoded to bytestrings suitable for hashing and signing as follows:
-
 * `encode(transaction : 𝕋) = RLP_encode(transaction)`
 * `encode(message : 𝔹⁸ⁿ) = "\x19Ethereum Signed Message:\n" ‖ len(message) ‖ message` where `len(message)` is the _non-zero-padded_ ascii-decimal encoding of the number of bytes in `message`.
 * `encode(domainSeparator : 𝔹²⁵⁶, message : 𝕊) = "\x19\x01" ‖ domainSeparator ‖ hashStruct(message)` where `domainSeparator` and `hashStruct(message)` are defined below.
@@ -165,9 +129,7 @@ The method `eth_signTypedData` is added to the Ethereum JSON-RPC. The method par
 
 #### eth_signTypedData
 
-The sign method calculates an Ethereum specific signature with: `sign(keccak256("\x19Ethereum Signed Message:\n" + len(message) + message)))`.
-
-By adding a prefix to the message makes the calculated signature recognisable as an Ethereum specific signature. This prevents misuse where a malicious DApp can sign arbitrary data (e.g. transaction) and use the signature to impersonate the victim.
+The sign method calculates an Ethereum specific signature with: `sign(keccak256("\x19\x01" ‖ domainSeparator ‖ hashStruct(message)))`, as defined above.
 
 **Note**: the address to sign with must be unlocked.