Japanese text displayed incorrectly when running git diff in version 2.11.0 #981

Closed
1 task done
nyoro712 opened this issue Dec 5, 2016 · 16 comments

@nyoro712

nyoro712 commented Dec 5, 2016

  • I was not able to find an open or closed issue matching what I'm seeing

Setup

  • Which version of Git for Windows are you using? Is it 32-bit or 64-bit?
$ git --version --build-options
git version 2.11.0.windows.1
sizeof-long: 4
machine: x86_64

I'm using a portable release: PortableGit-2.11.0-64-bit.7z.exe

  • Which version of Windows are you running? Vista, 7, 8, 10? Is it 32-bit or 64-bit?
$ cmd.exe /c ver

Microsoft Windows [Version 6.1.7601]

Windows 7 Professional, Service Pack 1, 64-bit

  • What options did you set as part of the installation? Or did you choose the
    defaults?

I have no install-options.txt file.

  • Any other interesting things about your environment that might be related
    to the issue you're seeing?

I'm using diff-highlight.
I downloaded the diff-highlight file from https://github.com/git/git/tree/master/contrib/diff-highlight and put it in the %PathToPortableGit%\usr\bin\ directory.

.gitconfig

[core]
	autocrlf = false
	quotepath = false
[pager]
	log = diff-highlight | less
	show = diff-highlight | less
	diff = diff-highlight | less
[diff]
	compactionHeuristic = true

Details

  • Which terminal/shell are you running Git from? e.g Bash/CMD/PowerShell/other

Bash

  1. Build a test repository.
$ mkdir gitrepo
$ cd gitrepo
$ git init
$ touch test.txt
  2. Rewrite test.txt as follows, and save it as UTF-8 without a BOM.
+日本語のテキスト
+Write in Japanese
+
  3. Add test.txt to the repository.
$ git add .
$ git commit -m "First commit"
  4. Rewrite test.txt as follows.
-日本語のテキスト
-Write in Japanese
+英語のテキスト
+Write in English
  5. Now run git diff
$ git diff head
  • What did you expect to occur after running these commands?
$ git diff head
diff --git a/test.txt b/test.txt
index 19ebe7c..e45d453 100644
--- a/test.txt
+++ b/test.txt
@@ -1,2 +1,2 @@
-日本語のテキスト
-Write in Japanese
+英語のテキスト
+Write in English
  • What actually happened instead?
$ git diff head
diff --git a/test.txt b/test.txt
index 19ebe7c..e45d453 100644
--- a/test.txt
+++ b/test.txt
@@ -1,2 +1,2 @@
-<E6><97><A5><E6><9C><AC><E8><AA><9E><E3><81><AE><E3><83><86><E3><82><AD><E3><82><B9><E3><83><88>
-Write in Japanese
+<E8><8B><B1><E8><AA><9E><E3><81><AE><E3><83><86><E3><82><AD><E3><82><B9><E3><83><88>
+Write in English

Additional Notes

  • It worked correctly when I used version 2.10.2.windows.1
  • I did not change any config when updating to version 2.11.0.windows.1 from 2.10.2.windows.1
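
Editorial note: the `<E6><97><A5>...` sequences in the actual output look like the raw UTF-8 bytes of the Japanese text, written in hexadecimal. That would suggest the file contents are intact and only the display step is affected. The sketch below is one way to check this. The helper name `bytes_as_hex` is hypothetical and not part of Git or less. It assumes a POSIX shell with `od`, `tr`, and `sed`, and assumes the script itself is saved as UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: print the bytes of a string in <XX> notation,
# so they can be compared with the sequences shown in the diff output.
bytes_as_hex() {
    hex=$(printf '%s' "$1" \
        | od -An -tx1 \
        | tr -d '\n' \
        | tr 'a-f' 'A-F' \
        | sed -e 's/ *\([0-9A-F][0-9A-F]\)/<\1>/g' -e 's/ *$//')
    printf '%s\n' "$hex"
}

# The first three characters of the removed line in the report.
bytes_as_hex '日本語'
```

If the printed bytes match the start of the removed line in the report, the text was stored correctly as UTF-8.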
@Suchiman

Suchiman commented Dec 5, 2016

I see the same behavior for German umlauts like äöü in git log.

@Egor-Skriptunoff

Egor-Skriptunoff commented Dec 5, 2016

I have the same problem with Russian characters after upgrading from 2.10 to 2.11.0.windows.1.
It seems that 2.11.0.windows.1 does not honor the quotepath option anymore.

@hwdsl2

hwdsl2 commented Dec 5, 2016

I encountered the same problem for Chinese characters after upgrading from 2.10.2 to 2.11.0.

@shiftkey

shiftkey commented Dec 5, 2016

@nyoro712 to ensure the BOM settings and encodings are correct, are you able to publish this repository somewhere so we're able to test the same bytes?

@Egor-Skriptunoff @Suchiman same thing, are you able to provide sample repositories - and repro steps for the commands you are running - to ensure we're testing the right encodings?

@nyoro712
Author

nyoro712 commented Dec 6, 2016

Published https://github.com/nyoro712/git-for-windows-issue-981

  1. Clone the repository.
$ git clone https://github.com/nyoro712/git-for-windows-issue-981 issue
$ cd issue
  2. Rewrite test.txt
-日本語のテキスト
-Write in Japanese
+英語のテキスト
+Write in English
  3. Run git diff
$ git diff head
  4. The problem will be reproduced.

.gitconfig and diff-highlight are not included in the repository.
They are global config (git config --global) in my environment.

@xtwmx

xtwmx commented Dec 7, 2016

Similar problem with letter Õ in git log.

@dscho
Member

dscho commented Dec 8, 2016

Could you create .minttyrc in your home directory with the single line Charset=UTF-8, like so:

echo Charset=UTF-8 >>~/.minttyrc

then restart Git Bash and try again?
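
Editorial note: the `>>` in the command above appends a new line on every run, so repeating it leaves duplicate `Charset=UTF-8` entries in the file. A minimal sketch of an append that is safe to repeat follows. The helper name `ensure_line` is hypothetical, and it assumes a POSIX shell with `grep`.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE only if FILE does not already
# contain it as a complete line. A missing FILE is created.
ensure_line() {
    file=$1
    line=$2
    if ! grep -qxF -e "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >>"$file"
    fi
}

# Example call, left commented out so that running this sketch does not
# modify any real configuration file:
# ensure_line "$HOME/.minttyrc" 'Charset=UTF-8'
```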

@dscho dscho self-assigned this Dec 8, 2016
@dscho dscho added this to the v2.11.1 milestone Dec 8, 2016
dscho added a commit to dscho/build-extra that referenced this issue Dec 8, 2016
The recent git-wrapper change to set LC_ALL=C unless set (required by
TortoiseGit to be able to pass non-ASCII command-lines to git.exe)
unfortunately broke the encoding in Git Bash.

Let's reinstate it.

This fixes git-for-windows/git#981

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@nyoro712
Author

nyoro712 commented Dec 9, 2016

I tried .minttyrc, but nothing changed.

@dscho
Member

dscho commented Dec 9, 2016

@nyoro712 I settled on another solution.

If you want to work around it in the meantime, right-click on the windows icon in the top left of Git Bash's window, select Options, and make sure that the Text tab lists "UTF-8" as encoding.

@nyoro712
Author

nyoro712 commented Dec 9, 2016

@dscho It is still displayed incorrectly.

screenshot

@nyoro712
Author

nyoro712 commented Dec 9, 2016

The result of locale command on Git Bash with version 2.10.2

$ locale
LANG=ja_JP.UTF-8
LC_CTYPE="ja_JP.UTF-8"
LC_NUMERIC="ja_JP.UTF-8"
LC_TIME="ja_JP.UTF-8"
LC_COLLATE="ja_JP.UTF-8"
LC_MONETARY="ja_JP.UTF-8"
LC_MESSAGES="ja_JP.UTF-8"
LC_ALL=

In version 2.11.0

$ locale
LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=C

So I unset LC_ALL and set LANG, and then it worked well.

  1. Unset LC_ALL
$ unset LC_ALL

The result of locale is

$ locale
LANG=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=

Then git diff works fine, but Japanese filenames are still displayed incorrectly.

  2. Set LANG
$ export LANG=ja_JP.UTF-8

The result of locale is

$ locale
LANG=ja_JP.UTF-8
LC_CTYPE="ja_JP.UTF-8"
LC_NUMERIC="ja_JP.UTF-8"
LC_TIME="ja_JP.UTF-8"
LC_COLLATE="ja_JP.UTF-8"
LC_MONETARY="ja_JP.UTF-8"
LC_MESSAGES="ja_JP.UTF-8"
LC_ALL=

Everything works fine. (However, after restarting Git Bash, the settings revert.)
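
Editorial note: the three `locale` outputs above are consistent with the usual POSIX precedence, where a non-empty LC_ALL overrides every LC_* category and LANG is consulted only when both LC_ALL and the category variable are empty. The sketch below models that lookup for LC_CTYPE. The function name `effective_ctype` and its optional fourth argument (the system fallback, which appears to be C.UTF-8 in the Git Bash output above) are assumptions made for illustration. It is not how Git or MSYS2 implement the lookup.

```shell
#!/bin/sh
# Hypothetical model of the locale lookup order for LC_CTYPE:
#   $1 = value of LC_ALL, $2 = value of LC_CTYPE, $3 = value of LANG,
#   $4 = optional system fallback (defaults to "C").
# Empty values are treated the same as unset.
effective_ctype() {
    printf '%s\n' "${1:-${2:-${3:-${4:-C}}}}"
}

# With LC_ALL=C (the 2.11.0 situation), LANG has no effect:
effective_ctype 'C' '' 'ja_JP.UTF-8'

# With LC_ALL unset, LANG decides:
effective_ctype '' '' 'ja_JP.UTF-8'

# Query the current shell's variables:
effective_ctype "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}"
```

Under this model, setting LANG alone cannot help while LC_ALL=C is in place, which matches the need to unset LC_ALL first.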

@dscho
Member

dscho commented Dec 9, 2016

The fix that closed this ticket is not yet in any published version. Hold on, though, I am still busy preparing a prerelease for you to test.

@dscho
Member

dscho commented Dec 10, 2016

Bummer, I thought this ticket would be auto-tagged... https://github.com/git-for-windows/git/releases/tag/prerelease-v2.11.0.windows.1.1

Could you test that, please?

@nyoro712
Author

@dscho I tried prerelease-v2.11.0.windows.1.1.
It's fine! 🎉

@PhilipOakley

PhilipOakley commented Dec 10, 2016 via email

dscho added a commit to git-for-windows/build-extra that referenced this issue Dec 11, 2016
Non-ASCII characters are [now shown properly
again](git-for-windows/git#981) in Git Bash.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
dscho added a commit that referenced this issue Dec 12, 2016
This merges the "Pre-release to test fixes for #981 and #987".

These commits were actually meant to land in `master` before merging the
Pull Requests; Better late than never.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Member

dscho commented Dec 23, 2016

Aside: I mistakenly thought there was a numbering err on the release page, as I'd expected an increment to the 2.11.0 part (i.e. just before the .rc0). In fact the increment is after the .windows. part of the version < feels like a fool >.

@PhilipOakley it is I who feels like a fool. I assumed that prerelease-v2.11.0.windows.1.1 would be a good name for a release based on v2.11.0.windows.1.

I worked a bit on streamlining the prerelease engineering (basically, I want to start one command in the VM dedicated to build releases and forget about it, it should do everything from coming up with a good tag, to building, to uploading the prerelease and publishing it). The way this script works right now, it will take the base version (v2.11.0), increment the last digit, and then append an auto-incrementing prerelease suffix.

The prerelease which is building right now, as I write this, is labeled v2.11.1.windows-prerelease.1.

Thanks for helping this project!
