PARQUET-430: Change to use Locale parameterized version of String.toUpperCase()/toLowerCase #312
@rdblue @liancheng Would you please take a look at this? Cheers.
LGTM, although I'm not quite sure whether it's absolutely necessary...
@liancheng
For instance, this code snippet reproduces an exception when the default locale is tr:
Exception is:
Thanks.
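To illustrate how a failure of this kind can arise (this is not the snippet from the comment above), here is a minimal sketch. It assumes a hypothetical enum `Codec` whose constants are looked up by upper-casing a user-supplied name; the class and method names are invented for the example and are not taken from the Parquet code base.

```java
import java.util.Locale;

public class EnumLookupDemo {
    // Hypothetical enum standing in for any constant that is looked up by name.
    public enum Codec { UNCOMPRESSED, SNAPPY, GZIP }

    // Upper-cases the name with the given locale, then resolves the enum constant.
    public static Codec parse(String name, Locale locale) {
        return Codec.valueOf(name.toUpperCase(locale));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // With Locale.ENGLISH, 'i' maps to 'I' and the lookup succeeds.
        System.out.println(parse("gzip", Locale.ENGLISH)); // prints GZIP

        // With a Turkish locale, 'i' maps to U+0130 (capital I with dot above),
        // so the upper-cased name no longer matches any constant.
        try {
            parse("gzip", turkish);
        } catch (IllegalArgumentException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

Calling `name.toUpperCase()` with no argument behaves like the second call whenever the JVM default locale is Turkish, which is why the failure depends on the machine the code runs on.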
Agreed, +1. @rdblue Would you mind also taking a look?
+1
…pperCase()/toLowerCase

A String is being converted to upper or lowercase, using the platform's default encoding. This may result in improper conversions when used with international characters.

For instance, "TITLE".toLowerCase() in a Turkish locale returns "tıtle", where 'ı' -- without a dot -- is the LATIN SMALL LETTER DOTLESS I character. To obtain correct results for locale insensitive strings, we'd better use toLowerCase(Locale.ENGLISH).

For more information on this, please see:
- http://stackoverflow.com/questions/11063102/using-locales-with-javas-tolowercase-and-touppercase
- http://lotusnotus.com/lotusnotus_en.nsf/dx/dotless-i-tolowercase-and-touppercase-functions-use-responsibly.htm
- http://java.sys-con.com/node/46241

This PR changes our use of String.toUpperCase()/toLowerCase() to String.toUpperCase(Locale.ENGLISH)/toLowerCase(Locale.ENGLISH)

Author: proflin <proflin.me@gmail.com>

Closes apache#312 from proflin/PARQUET-430 and squashes the following commits:

ed55822 [proflin] PARQUET-430
A String is being converted to upper or lowercase using the platform's default locale. This may result in improper conversions when the default locale has special case-mapping rules.
For instance, "TITLE".toLowerCase() in a Turkish locale returns "tıtle", where 'ı' (without a dot) is the LATIN SMALL LETTER DOTLESS I character. To obtain correct results for locale-insensitive strings, we should use toLowerCase(Locale.ENGLISH).
For more information on this, please see:
- http://stackoverflow.com/questions/11063102/using-locales-with-javas-tolowercase-and-touppercase
- http://lotusnotus.com/lotusnotus_en.nsf/dx/dotless-i-tolowercase-and-touppercase-functions-use-responsibly.htm
- http://java.sys-con.com/node/46241
This PR changes our use of String.toUpperCase()/toLowerCase() to String.toUpperCase(Locale.ENGLISH)/toLowerCase(Locale.ENGLISH).
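The difference between the two calls can be sketched as follows. This is only an illustration: the helper name `lowerInvariant` is hypothetical and not part of the PR, and the example temporarily switches the JVM default locale to Turkish to simulate a machine configured that way.

```java
import java.util.Locale;

public class LocaleCaseDemo {
    // Hypothetical helper: lower-cases an identifier independently of the
    // JVM default locale by always passing Locale.ENGLISH.
    public static String lowerInvariant(String s) {
        return s.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        Locale saved = Locale.getDefault();
        try {
            // Simulate a JVM whose default locale is Turkish.
            Locale.setDefault(Locale.forLanguageTag("tr"));

            // No-argument overload uses the default locale: 'I' becomes U+0131.
            String viaDefault = "TITLE".toLowerCase();
            // Explicit locale: result does not depend on the default.
            String viaEnglish = lowerInvariant("TITLE");

            System.out.println(viaDefault.equals("t\u0131tle")); // prints true
            System.out.println(viaEnglish.equals("title"));      // prints true
        } finally {
            // Restore the original default so other code is unaffected.
            Locale.setDefault(saved);
        }
    }
}
```

The same reasoning applies to toUpperCase(): passing an explicit locale makes the result depend only on the input string, not on where the code runs.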