Current ~str UTF-8 behavior allows for denial-of-service attack with args, environ #7188
Comments
This is a plausible default handler. We don't presently support default handlers, but I might be ok with it if/when we do.
AFAICT, using a condition handler lets you replace the string wholesale, not replace the offending character. What sort of default handler would we use to let clients figure out the difference between a replaced string, and one that was, say, empty to begin with?
You can easily change the signature of the condition to use an enum that expresses these differences, if they matter.
That doesn't help if you rely on the default condition handler though, because in the end it just replaces the bad string with another one.
Isn't it entirely incorrect to assume that the arguments and the environment are UTF-8 encoded? I.e. these two functions should return
I don't think this is really an issue in the
In general we have no or only poorly sketched interfaces for accepting other encodings presently. It would be good to grow some.
nominating well-covered milestone
Just a bug, declining
Fixing this is presumably going to require changing API (to provide byte vectors as the primary value type and perhaps a secondary method that gives Option<~str>).
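A minimal sketch of the API shape described in the comment above, written in present-day Rust rather than the `~str`-era language of this thread. `RawArg`, `as_bytes`, and `as_str` are hypothetical names made up for illustration; they are not part of any proposed or existing API.

```rust
/// Hypothetical wrapper: the raw bytes are the primary representation,
/// so constructing a value can never fail on invalid UTF-8.
struct RawArg(Vec<u8>);

impl RawArg {
    /// Primary accessor: always succeeds.
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }

    /// Secondary accessor: `Some` only when the bytes are valid UTF-8.
    fn as_str(&self) -> Option<&str> {
        std::str::from_utf8(&self.0).ok()
    }
}

fn main() {
    let good = RawArg(b"--verbose".to_vec());
    let bad = RawArg(vec![b'-', 0xFF]); // 0xFF never appears in valid UTF-8

    for arg in [&good, &bad] {
        match arg.as_str() {
            Some(s) => println!("text argument: {s}"),
            None => println!("non-UTF-8 argument: {:?}", arg.as_bytes()),
        }
    }
}
```

With this shape the caller decides what to do with undecodable input, instead of the task failing inside the accessor. Today's standard library follows a comparable idea with `std::env::args_os`, which yields `OsString` values.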
Apparently even an empty program (eg., At least
IMO
The current behavior of ~str is that it unilaterally rejects any invalid UTF-8 sequence (modulo #3787). Unfortunately, this opens up Rust programs to denial-of-service attacks where maliciously crafted user input can cause unexpected task failure. Two cases that exist right now are invalid UTF-8 in the args list and in the environment. The mere presence of the invalid UTF-8 will cause os::args() and os::env() to immediately raise the str::not_utf8 condition, which is unlikely to be handled by callers of these functions.

I've suggested this before on the IRC channel, but I think it's worth suggesting again, that when parsing UTF-8 we should consider simply translating the first byte of any invalid sequence into the Replacement Character (U+FFFD) instead of failing outright.
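As a rough illustration of the replacement idea, here is a sketch in present-day Rust using `String::from_utf8_lossy` from the standard library. Its replacement granularity may differ in detail from the "first byte of any invalid sequence" rule proposed above, and the helper name `decode_lossy` is hypothetical. The returned flag is one possible answer to the question in the comments of how to tell a replaced string from an untouched one.

```rust
use std::borrow::Cow;

/// Hypothetical helper: decode bytes as UTF-8, substituting U+FFFD for
/// invalid sequences. The flag reports whether any substitution happened.
fn decode_lossy(bytes: &[u8]) -> (String, bool) {
    match String::from_utf8_lossy(bytes) {
        // Input was already valid UTF-8, so nothing was replaced.
        Cow::Borrowed(s) => (s.to_owned(), false),
        // Input contained invalid sequences, so a new string was built.
        Cow::Owned(s) => (s, true),
    }
}

fn main() {
    let raw = [b'a', 0xFF, b'b']; // 0xFF is never valid in UTF-8

    // Strict decoding rejects the input outright.
    assert!(std::str::from_utf8(&raw).is_err());

    // Lossy decoding keeps the valid parts and marks the bad byte.
    let (text, replaced) = decode_lossy(&raw);
    assert_eq!(text, "a\u{FFFD}b");
    assert!(replaced);

    // An empty input stays distinguishable from a replaced one.
    assert_eq!(decode_lossy(b""), (String::new(), false));
}
```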