Use UTF-8 on Windows 10 Version 1903, fix #1195 #1915
Allows Ninja to use descriptions, filenames and environment variables with characters outside of the ANSI codepage on Windows. Build manifests are now UTF-8 by default (this change needs to be emphasized in the release notes).

WriteConsoleOutput doesn't support UTF-8, but it's deprecated on newer Windows 10 versions anyway (or as Microsoft likes to put it: "no longer a part of our ecosystem roadmap"). We'll use the VT100 sequence just as we do on Linux and macOS.

https://docs.microsoft.com/en-us/windows/uwp/design/globalizing/use-utf8-code-page
https://docs.microsoft.com/en-us/windows/console/writeconsoleoutput
https://docs.microsoft.com/de-de/windows/console/console-virtual-terminal-sequences
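As a rough illustration of the VT100 approach mentioned in the description, here is a minimal Python sketch of how a single status line can be overwritten in place: a carriage return, the new text, then the "erase in line" sequence `ESC [ K`. The helper names `status_line` and `print_status` and the middle-elision rule are assumptions made for this example; this is not Ninja's actual C++ implementation.

```python
import sys

# VT100 "erase in line": clears from the cursor to the end of the line.
ERASE_TO_EOL = "\x1b[K"


def status_line(text: str, width: int) -> str:
    """Return the string that overwrites the current console line with `text`.

    If `text` is longer than `width`, the middle is elided with "..." so the
    line does not wrap (hypothetical rule chosen for this sketch).
    """
    if width > 3 and len(text) > width:
        keep = width - 3          # characters left after the "..." marker
        head = keep // 2          # characters kept from the start
        tail = keep - head        # characters kept from the end
        text = text[:head] + "..." + text[len(text) - tail:]
    # "\r" returns to column 0; the erase sequence removes leftovers
    # from a previous, longer status line.
    return "\r" + text + ERASE_TO_EOL


def print_status(text: str, width: int = 80, stream=sys.stdout) -> None:
    """Write the status line and flush so it appears immediately."""
    stream.write(status_line(text, width))
    stream.flush()
```

Because the text is written as an ordinary character stream, its encoding is whatever the stream uses, which is what makes this route compatible with UTF-8 output.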
@jhasse I'm happy to see Ninja supporting Unicode, but I want to point out consequences of this change that should be taken into account:

This is a breaking change for Ninja build files. Since Ninja feeds the raw bytes from those files into the -A Win32 functions, Ninja build files will now have to be encoded as UTF-8, not ANSI (which is good, but a breaking change). In particular, CMake will have to be updated to not generate Ninja build files as ANSI.

There might also be repercussions with the "include prefix" feature. Ninja does binary comparison of the string coming from its build file and that coming from the
Indeed, thanks for pointing that out. Note that this will only cause issues when there are non-ASCII characters in the build manifest. That would have caused problems on Windows anyway, so I doubt that many people relied on it. It will still be the first point in the release notes ;)
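To make the "only non-ASCII characters are affected" point concrete, here is a small Python sketch that classifies the raw bytes of a build manifest. The function name `classify_manifest` and its three return labels are invented for this illustration; nothing like it is part of Ninja.

```python
def classify_manifest(data: bytes) -> str:
    """Classify raw manifest bytes as 'ascii', 'utf-8' or 'other'.

    'ascii'  : every byte is below 0x80, so ANSI and UTF-8 readers agree.
    'utf-8'  : contains non-ASCII bytes that decode as valid UTF-8.
    'other'  : contains non-ASCII bytes that are not valid UTF-8,
               for example text saved in a local ANSI codepage.
    """
    if all(b < 0x80 for b in data):
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "other"
    return "utf-8"
```

A manifest classified as `ascii` behaves the same before and after this change. Note that the check is only a heuristic: a few ANSI byte sequences also happen to be valid UTF-8, so `utf-8` does not prove what the generator intended.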
I've opened CMake Issue 21866 for this, thanks.
#1918 proposes an additional tool to help generators determine the correct encoding for
It appears this fix isn't providing full UTF-8 support on Windows 10 (build 19043). With a simple
And a source file,
ninja incorrectly generates an intermediate
and rename the aforementioned C source file to
Note that the lexer used by ninja did not encounter issues when the
Yes. Looks like you've found a bug.
release 1.11

This release adds Validation Nodes which are a new way to add jobs like linters or static analyzers to the build graph. They are added using |@ and don't produce any outputs. You can read more about the motivation and the syntax here: ninja-build/ninja#1800

Another big change is that Ninja now uses UTF-8 on Windows. This means that while previous versions of Ninja used the local ANSI encoding it will now always use UTF-8 allowing filenames and output with special characters. For this to work you'll need Windows 10 Version 1903 or newer. And for the console output to show Unicode characters you'll need to set the codepage to 65001. More information at: ninja-build/ninja#1915

Note that this is a breaking change if you relied on non-ASCII characters of the local codepage! If you want to query Ninja what codepage it uses in your generator, call `ninja -t wincodepage` and act accordingly.

There are also two new tools:
- missingdeps: ninja-build/ninja#1331
- inputs: ninja-build/ninja#1730

And as it was often requested, ninja now has a --quiet flag :)

For a complete list of changes see https://github.com/ninja-build/ninja/milestone/3?closed=1
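A generator could consume `ninja -t wincodepage` roughly as in the Python sketch below. It assumes the tool prints a line of the form `Build file encoding: <codepage>` where `<codepage>` is `UTF-8` or `ANSI`, which is how I understand the Ninja manual to describe it; check the manual for your Ninja version before relying on this. The function names are hypothetical, and `mbcs` is Python's Windows-only codec for the active ANSI codepage.

```python
import subprocess

# Assumed output prefix of `ninja -t wincodepage` (see the lead-in).
PREFIX = "Build file encoding:"


def parse_wincodepage(output: str) -> str:
    """Map the tool's output to a Python codec name for writing manifests."""
    for line in output.splitlines():
        if line.startswith(PREFIX):
            name = line[len(PREFIX):].strip()
            if name == "UTF-8":
                return "utf-8"
            if name == "ANSI":
                return "mbcs"  # Windows-only codec: active ANSI codepage
            raise ValueError(f"unexpected codepage name: {name!r}")
    raise ValueError("no 'Build file encoding:' line found")


def manifest_encoding(ninja: str = "ninja") -> str:
    """Ask the given ninja executable which encoding it expects (Windows)."""
    result = subprocess.run(
        [ninja, "-t", "wincodepage"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the tool is unavailable
    )
    return parse_wincodepage(result.stdout)
```

Scanning for the prefix instead of reading only the first line keeps the parser tolerant of extra output lines. A generator would probably also want to catch `subprocess.CalledProcessError` and fall back to ANSI for Ninja versions that predate this tool.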
Fixes #1195 (I hope).