Locale problem #1690

Closed
totembe opened this issue Jul 25, 2017 · 9 comments

totembe commented Jul 25, 2017

Running
Apache Tomcat/8.5.16
JVM Oracle Java 1.8.0_141-b15
OpenGrok 1.1-rc5

I have source repositories which contain Turkish comments. The files are encoded in ISO-8859-9. The system that runs OpenGrok was in the en_EN locale, and after indexing the repositories, Turkish characters were displayed as the ? (&#65533) mark. So I specified the locale while indexing:

sudo LC_ALL=tr_TR /path/to/opengrok/OpenGrok index /path/to/repos

Most of the repos are git repos, plus a few SVN ones. Git is the important one; SVN is not. When I checked the xref gz files afterwards, I saw &#305 instead of &#65533 for the character ı.
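For readers following along, here is a minimal sketch of why the same byte can end up as &#305 or &#65533. It assumes (this is not confirmed anywhere in the thread) that the difference comes from the charset used to decode the file; the class and method names are made up for the illustration.

```java
import java.nio.charset.Charset;

public class TurkishDecodeDemo {
    // Hypothetical helper: decode raw bytes with the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // 0xFD is the ISO-8859-9 encoding of the dotless i (U+0131).
        byte[] raw = {(byte) 0xFD};

        // Decoded with the right charset, the byte becomes code point 305.
        System.out.println((int) decode(raw, "ISO-8859-9").charAt(0));

        // 0xFD is not valid UTF-8, so the decoder substitutes the
        // replacement character, code point 65533.
        System.out.println((int) decode(raw, "UTF-8").charAt(0));
    }
}
```

The two printed numbers are the same decimal values that appear in the entities quoted above.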

In order to see them properly on the web end, I had to change the tomcat8 locale settings. After changing the locale of tomcat8, I see Turkish characters properly, but now I cannot see the revision history. When I click history, it gives me a File Not Found error. When I switch back to the system default locale, it works fine again.

My tomcat script environment variables:

export CATALINA_HOME=/opt/tomcat8
export JAVA_HOME=/opt/jdk/jdk1.8.0_141
export PATH=$JAVA_HOME/bin:$PATH
export LC_ALL=tr_TR
export JAVA_OPTS="$JAVA_OPTS -Duser.language=tr -D.user.region=TR"

Author

totembe commented Jul 25, 2017

25-Jul-2017 14:13:53.240 SEVERE [http-nio-8080-exec-7] org.opensolaris.opengrok.util.Executor.exec Failed to read from process: /usr/bin/git
java.io.IOException: Failed to parse author date: AuthorDate: Sat, 8 Jul 2017 04:06:39 +0300
at org.opensolaris.opengrok.history.GitHistoryParser.process(GitHistoryParser.java:102)
at org.opensolaris.opengrok.history.GitHistoryParser.processStream(GitHistoryParser.java:68)
at org.opensolaris.opengrok.util.Executor.exec(Executor.java:213)
at org.opensolaris.opengrok.history.GitHistoryParser.parse(GitHistoryParser.java:161)
at org.opensolaris.opengrok.history.GitRepository.getHistory(GitRepository.java:520)
at org.opensolaris.opengrok.history.GitRepository.getHistory(GitRepository.java:513)
at org.opensolaris.opengrok.history.FileHistoryCache.get(FileHistoryCache.java:546)
at org.opensolaris.opengrok.history.HistoryGuru.getHistory(HistoryGuru.java:239)
at org.opensolaris.opengrok.history.HistoryGuru.getHistoryUI(HistoryGuru.java:210)
at org.apache.jsp.history_jsp._jspService(history_jsp.java:187)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:443)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:385)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:329)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:231)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.opensolaris.opengrok.web.StatisticsFilter.doFilter(StatisticsFilter.java:55)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.opensolaris.opengrok.web.AuthorizationFilter.doFilter(AuthorizationFilter.java:76)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:478)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:80)
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:624)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:799)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1455)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.text.ParseException: Unparseable date: "Sat, 8 Jul 2017 04:06:39 +0300" with format "EE, d MMM yyyy HH:mm:ss Z" and locale "tr_TR"
at org.opensolaris.opengrok.history.Repository$1.parse(Repository.java:458)
at org.opensolaris.opengrok.history.GitHistoryParser.process(GitHistoryParser.java:96)
... 43 more
Caused by: java.text.ParseException: Unparseable date: "Sat, 8 Jul 2017 04:06:39 +0300" with format "d MMM yyyy HH:mm:ss Z" and locale "tr_TR"
... 45 more

Member

vladak commented Jul 25, 2017

Obviously, parsing the date that git prints in "English" format fails in the Turkish locale. I wonder if it is possible to make git produce a Turkish date.
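The failure can be reproduced outside OpenGrok with a few lines of Java. This is only a sketch: the pattern and the date string are copied from the stack trace above, while the class and method names are invented for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

public class GitDateLocaleDemo {
    // Pattern and input copied from the stack trace in this issue.
    static final String PATTERN = "EE, d MMM yyyy HH:mm:ss Z";
    static final String GIT_DATE = "Sat, 8 Jul 2017 04:06:39 +0300";

    // Returns true if the git date parses with the given locale.
    static boolean parses(Locale locale) {
        try {
            new SimpleDateFormat(PATTERN, locale).parse(GIT_DATE);
            return true;
        } catch (ParseException e) {
            // English day and month names are not recognized in this locale.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("tr-TR: " + parses(Locale.forLanguageTag("tr-TR")));
        System.out.println("en:    " + parses(Locale.ENGLISH));
    }
}
```

Passing the locale explicitly to the SimpleDateFormat constructor keeps the check independent of LC_ALL or -Duser.language.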

Author

totembe commented Jul 26, 2017

It seems git ignores the system locale completely and uses its own format. I tried to make git generate the date according to the locale but failed to do so.

git executed by OpenGrok is

/usr/bin/git log --abbrev-commit --abbrev=8 --name-only --pretty=fuller --date=rfc --follow -- crm/PACKAGE.ARGE.BODY.SQL

Maybe using the iso format instead of the rfc format and adjusting the parser in the OpenGrok Java code would standardize dates across all locales. As I am not familiar with the OpenGrok architecture, I am not sure whether this is feasible.
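To illustrate the iso idea: a purely numeric date has no day or month names, so it should parse the same way in every locale. The sketch below assumes that `git log --date=iso` prints dates shaped like `2017-07-08 04:06:39 +0300`; the pattern, the sample string and the class name are assumptions for the example, not what OpenGrok uses.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class IsoDateDemo {
    // Numeric-only pattern, assumed to match `git log --date=iso` output.
    static final String ISO_PATTERN = "yyyy-MM-dd HH:mm:ss Z";

    // Parse the text with the given locale; fail loudly if it does not match.
    static Date parse(String text, Locale locale) {
        try {
            return new SimpleDateFormat(ISO_PATTERN, locale).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample: the commit date from the stack trace, rewritten by hand.
        String sample = "2017-07-08 04:06:39 +0300";
        Date tr = parse(sample, Locale.forLanguageTag("tr-TR"));
        Date en = parse(sample, Locale.ENGLISH);
        // Both locales should yield the same instant.
        System.out.println(tr.equals(en));
    }
}
```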

Member

vladak commented Jul 26, 2017

Found this unanswered, one-year-old question: https://stackoverflow.com/questions/36191343/how-to-make-git-print-date-in-a-different-locale so it seems that git log indeed does not support printing the date in the current locale.

Repository.java#getDateFormat goes through the list of formats specified for the given repository (there are 2 datePatterns entries defined in GitRepository.java) and tries to parse the date with each of them using the default locale:

    for (SimpleDateFormat formatter : formatters) {
        try {
            return formatter.parse(source);
        } catch (ParseException ex1) {
            /*
             * Adding all exceptions together to get some info in
             * the logs.
             */
            ex1 = new ParseException(
                    String.format("%s with format \"%s\" and locale \"%s\"",
                            ex1.getMessage(),
                            formatter.toPattern(),
                            Locale.getDefault().toString()),
                    ex1.getErrorOffset()
            );

It would be bad to return to the state before #1326 and hack GitRepository to use its own date parser. One idea would be to extend the loop in getDateFormat's parse() and try the English locale in case all other formats fail.
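A rough sketch of that fallback idea follows. It is not the OpenGrok code: the class name and method signature are hypothetical, the primary locale is passed in as a parameter instead of being read from Locale.getDefault(), and the two patterns are the ones visible in the stack trace.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class FallbackDateParser {
    // Try every pattern in the primary locale first, then again in English.
    static Date parse(String source, Locale primary, String... patterns) {
        Locale[] locales = {primary, Locale.ENGLISH};
        for (Locale locale : locales) {
            for (String pattern : patterns) {
                try {
                    return new SimpleDateFormat(pattern, locale).parse(source);
                } catch (ParseException ex) {
                    // Ignore and move on to the next pattern or locale.
                }
            }
        }
        throw new IllegalArgumentException("Unparseable date: " + source);
    }

    public static void main(String[] args) {
        Date date = parse("Sat, 8 Jul 2017 04:06:39 +0300",
                Locale.forLanguageTag("tr-TR"),
                "EE, d MMM yyyy HH:mm:ss Z",
                "d MMM yyyy HH:mm:ss Z");
        System.out.println(date.getTime());
    }
}
```

The cost of this approach is a few extra failed parse attempts per date when the primary locale does not match; a real implementation would probably also want to keep the collected exception messages for the logs, as the excerpt above does.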

Member

vladak commented Jul 26, 2017

Also, I wonder whether filing a bug against Git would be in order.

Member

vladak commented Jul 26, 2017

Actually, I found in git help log that running git log with --date=format:%c prints the date in the system locale, e.g.:

$ LANG=cs_CZ.UTF-8 git log -n 1 --date=format:%c
commit 4984607bf4dd370ebf015197d48c621caa84e606
Author: Vladimir Kotal <Vladimir.Kotal@Oracle.COM>
Date:   25. července 2017 17:30:36 CET

    fix import ordering

The date format that %c expands to is %a %b %d %H:%M:%S %Y, at least on Solaris. I am not sure it is the same on other systems.

Member

vladak commented Jul 26, 2017

I wanted to rename this issue to "Locale problem with git"; however, this is probably not specific to git, e.g. Mercurial behaves in the same way.

vladak added the bug label and removed the question label Jul 26, 2017
Member

vladak commented Jul 26, 2017

Trying to find an SCM that prints the date in the current locale, I found out that svn log does so; however, with --xml, which is what the Subversion history parser uses, the date is printed in "iso-strict" format. So, I wonder whether the SimpleDateFormat objects in getDateFormat() should be constructed with the English locale via https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html#SimpleDateFormat(java.lang.String,%20java.util.Locale).

Author

totembe commented Jul 26, 2017

I am running a Debian wheezy system, which uses old stable packages. At the moment it has git 1.7.1, or git 1.9.1 via backports. The iso8601-strict and format:%c options are not available in either of these versions. I could upgrade Debian to Ubuntu or compile git from source, but that may break some other production systems.
