Fix #60: Use proper encoding on Windows #91

Merged (2 commits) on Jun 29, 2020

Conversation

hdpoliveira
Collaborator

No description provided.

@incidentist
Collaborator

@hdpoliveira Is the idea here to set command output to a binary encoding, and let Windows decode it at some higher level using whatever weird process it wants? This is a big change to all command output on Windows, so I just want to make sure it works for utf8/16 file names and file contents on Windows. For instance, the extension uses hg cat in many places to output file contents at different revisions.

The general approach looks good, just wanna make sure it works, and unfortunately I don't have a Windows machine. Can you tell me more about the testing you've done?

@hdpoliveira
Collaborator Author

We already configure Mercurial's encoding to be UTF-8, so it seems Node's own output decoding gets in the way.

For the testing, I tested on a root directory with an "á", and on a file name with an "é", but I didn't test the file contents. I'll do that 😊.

The testing I did: initialize one repository blá, check that the extension detects an Hg repository, create a new file, check that the extension detects an untracked file, add the file via the SCM tab, commit it. Clone from blá into blé. Modify file in blá, check that the extension detects modifications to file, commit. Pull from blé. In blé, create new file bló, check it's untracked, add file via SCM tab, commit, push to blá.
As you mentioned, I still need to test file contents that use special characters.

Do you see other gaps?

@incidentist
Collaborator

Great! Oh, put some of those cháractérs into commit messages too, and try the "Show file history" command. That's the other place where user-generated text ends up in command output.

@hdpoliveira
Collaborator Author

Yeah well, it broke hg cat ☹️
I'm checking now to see what can be done.

@hdpoliveira
Collaborator Author

I'm close to concluding that this is unfixable...
Take this example: a repository root named blé containing a file named blé, whose contents are "blé", committed with the commit message "blé". Here is what I get:

PS C:\...\blé> hg log -r . -p
revisÒo:       9:a4ee1adc8024
etiqueta:      tip
usußrio:       Hudson Oliveira <hdpoliveira@gmail.com>
data:          Fri Jun 26 15:19:18 2020 -0300
sumßrio:       blÚ

diff --git a/blÚ b/blÚ
new file mode 100644
--- /dev/null
+++ b/blÚ
@@ -0,0 +1,1 @@
+bl├®


PS C:\...\blé> hg root
C:\...\blÚ

In some places it prints blÚ and in others bl├®.
Using HGENCODING=utf-8 does not make things much better - it only changes the commit message:

PS C:\...\blé> $env:HGENCODING = 'utf-8'
PS C:\...\blé> hg log -r . -p
revisão:       9:a4ee1adc8024
etiqueta:      tip
usuário:       Hudson Oliveira <hdpoliveira@gmail.com>
data:          Fri Jun 26 15:19:18 2020 -0300
sum├írio:       bl├®

diff --git a/blÚ b/blÚ
new file mode 100644
--- /dev/null
+++ b/blÚ
@@ -0,0 +1,1 @@
+bl├®


PS C:\...\blé> hg root
C:\...\blÚ

It looks like file paths use a different encoding than text. #60 concerns file paths.
I don't think there is a global solution here (plus, some commands may output both, like hg log -p).
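
The two garbled forms above can be reproduced byte by byte. The sketch below assumes (this is inferred from the output, not confirmed in the thread) that the console renders bytes as code page 850, that paths are emitted in a Windows-1252 style single-byte encoding, and that file contents are emitted as raw UTF-8. `CP850_SAMPLE`, `renderAsCp850`, and `encodeLatin1` are hypothetical names used only for illustration, and the table covers only the bytes this example needs.

```typescript
// Partial CP850 table (illustrative): only the high bytes used in this example.
const CP850_SAMPLE: Record<number, string> = {
  0xa1: "\u00ed", // í
  0xa9: "\u00ae", // ®
  0xc3: "\u251c", // ├
  0xe1: "\u00df", // ß
  0xe3: "\u00d2", // Ò
  0xe9: "\u00da", // Ú
};

// Render raw bytes the way a CP850 console would; ASCII passes through,
// bytes missing from the partial table become "?".
function renderAsCp850(bytes: Uint8Array): string {
  return Array.from(bytes, (b) =>
    b < 0x80 ? String.fromCharCode(b) : (CP850_SAMPLE[b] ?? "?")
  ).join("");
}

// Single-byte encoding for characters up to U+00FF, where Latin-1 and
// Windows-1252 agree for the accented letters used here.
function encodeLatin1(text: string): Uint8Array {
  return Uint8Array.from(Array.from(text, (ch) => ch.charCodeAt(0) & 0xff));
}

const name = "bl\u00e9"; // "blé"

// Path-style output: one byte per character (62 6c e9).
const singleByte = encodeLatin1(name);
// Content-style output: UTF-8 (62 6c c3 a9).
const utf8 = new TextEncoder().encode(name);

const pathAsShown = renderAsCp850(singleByte); // "blÚ"
const contentAsShown = renderAsCp850(utf8); // "bl├®"
```

Under these assumptions the same mapping also accounts for revisÒo (ã as byte e3) and sum├írio (á as UTF-8 bytes c3 a1), which is why no single decoding of the whole output can be correct for every field.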

@hdpoliveira
Collaborator Author

hdpoliveira commented Jun 26, 2020

OK, this solution is the best I can think of. It resolves #60 without breaking commit messages or file diffs.
The fix is applied on a per-command basis, so we may run into this problem again with other commands. If we do, the remedy is simple: just add {stdoutIsBinaryEncodedInWindows: true} to the exec call.
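
For readers unfamiliar with the idea, here is a minimal sketch of how a per-command flag like this could select the decoding. It is not the code from this PR: only the option name `stdoutIsBinaryEncodedInWindows` comes from the thread, while `ExecOptions`, `decodeStdout`, and the `platform` parameter are hypothetical, and the choice of Latin-1 style "binary" decoding is an assumption.

```typescript
// Hypothetical option shape; only the flag name appears in the thread.
interface ExecOptions {
  stdoutIsBinaryEncodedInWindows?: boolean;
}

// Decode the collected stdout bytes of one command.
// `platform` would typically be process.platform in Node.
function decodeStdout(
  bytes: Uint8Array,
  platform: string,
  options: ExecOptions = {}
): string {
  if (platform === "win32" && options.stdoutIsBinaryEncodedInWindows) {
    // Assumed "binary" decoding: each byte maps to the code point of the
    // same value, so byte e9 becomes "é" instead of an invalid UTF-8 sequence.
    return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
  }
  // Default path: treat the output as UTF-8.
  return new TextDecoder("utf-8").decode(bytes);
}
```

With this shape, a command that prints only paths (such as hg root) could opt in, while commands that print commit messages or file contents keep the UTF-8 default.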

@hdpoliveira
Collaborator Author

Support for accented characters in the project path:
image

image

File history:
image

Collaborator

@incidentist incidentist left a comment


Seems like a good compromise.

@incidentist incidentist merged commit 9cafd97 into mrcrowl:master Jun 29, 2020