Add specific unicode string literal type (and make ascii the default) #5167
Comments
See OpenZeppelin/openzeppelin-contracts#1090 (comment) to learn how OpenZeppelin was affected by this.
An obvious way to introduce them is via a prefix (like in
If we do not want a breaking change: If we are happy to do a breaking change, then the current strings would need to be prefixed with
Note that things like
Why?
Because the idea is that the source representation does not have any "weird" characters, but the internal representation can be anything. @maraoz would you agree?
@chriseth agreed!
Should this go to the backlog?
Preliminary vote: make ascii strings the default and require a prefix for unicode strings.
Decision on meeting:
Does an ascii string allow unicode escape?
Yes, we said unicode escapes in default strings are fine, but not unicode characters.
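To make the distinction between escapes and characters concrete, here is a rough Python sketch (not the compiler's scanner). The helper name `is_ascii_source` and the sample literals are made up for illustration. The check looks only at what was typed in the source, so an escape sequence passes while the same character typed directly does not, even though both end up as the same internal value.

```python
def is_ascii_source(literal_source: str) -> bool:
    """Return True if the literal, as typed in the source, is printable ASCII only."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in literal_source)


# What the author typed: c, a, f, backslash, u, 0, 0, e, 9 (all ASCII).
escaped_source = r"caf\u00e9"

# The same word with U+00E9 typed directly into the source.
raw_source = "caf" + chr(0xE9)

# Internal value after the escape is resolved (Python's own escape rules are
# used here as a stand-in for the scanner's).
decoded = escaped_source.encode("ascii").decode("unicode_escape")

print(is_ascii_source(escaped_source))  # escape form is accepted
print(is_ascii_source(raw_source))      # direct character is rejected
print(decoded == raw_source)            # both denote the same internal string
```

The point of the rule is that a reviewer reading the source can see every character that ends up in the string.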
While implementing, I realised a few things: allowing escapes in non-unicode literals is quite a large change, because the scanner just turns the escape into code points. At first I thought the implementation should have no effect on the design, but I think this is a useful consideration:
Assigning any literal to a string type should check for UTF-8 encoding (this is something we have now).
The rules described in #5167 (comment) were implemented.
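As a rough illustration of that UTF-8 check (a Python sketch, not the compiler's actual code; `is_valid_utf8` is a hypothetical helper), the idea is to look at the literal's final bytes, after escapes have been resolved, and reject byte sequences that are not well-formed UTF-8:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte sequence is well-formed UTF-8."""
    try:
        data.decode("utf-8")  # strict decoding raises on malformed input
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(b"caf\xc3\xa9"))  # complete two-byte sequence for U+00E9
print(is_valid_utf8(b"caf\xc3"))      # truncated multi-byte sequence
print(is_valid_utf8(b"\xff"))         # byte that never appears in UTF-8
```

This matters because a literal whose source text is pure ASCII can still produce arbitrary bytes through escapes, so the source-level check and the encoding check cover different cases.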
From the zeppelin audit:
Strings in Solidity are not only used for displaying information: for example, it is very common to have them be a key of a mapping. Because UTF-8 allows for multiple invisible characters (e.g. ZERO WIDTH SPACE), and for characters that look almost like common characters (e.g. GREEK QUESTION MARK), this usage can be extremely problematic, and lead to underhanded backdoors, exploits, etc. OpenZeppelin’s main access-control contracts are affected by this, as are multiple other string-based implementations.
Consider adding a non-UTF-8 string type to prevent these situations from arising in the first place.
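A small Python sketch of the hazard described in the audit, using a dictionary as a stand-in for a Solidity mapping. The `roles` table and the helper `is_printable_ascii` are hypothetical and only serve to show that visually identical strings are different keys:

```python
# Hypothetical role table keyed by strings, standing in for a mapping.
roles = {"admin": True}

ZERO_WIDTH_SPACE = "\u200b"      # invisible when rendered
GREEK_QUESTION_MARK = "\u037e"   # renders like an ASCII semicolon

# Looks exactly like "admin" on screen, but is a different key.
spoofed_admin = "admin" + ZERO_WIDTH_SPACE


def is_printable_ascii(text: str) -> bool:
    """Return True if every character is in the printable ASCII range."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


print(spoofed_admin in roles)             # the lookalike key is not found
print(GREEK_QUESTION_MARK == ";")         # lookalike characters are not equal
print(is_printable_ascii("admin"))        # ordinary key passes the check
print(is_printable_ascii(spoofed_admin))  # invisible character is caught
```

Restricting default literals to printable ASCII makes this kind of lookalike key visible in review, because it would have to be written as an explicit escape.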