Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inconsistent results from isprint('\t') #7416

Closed
stevengj opened this issue Jun 25, 2014 · 6 comments
Closed

inconsistent results from isprint('\t') #7416

stevengj opened this issue Jun 25, 2014 · 6 comments
Labels
system:windows Affects only Windows

Comments

@stevengj
Copy link
Member

As mentioned in #7402 and JuliaLang/IJulia.jl#200, isprint('\t') returns true on Windows and false on MacOS. This inconsistency seems undesirable.

According to http://en.cppreference.com/w/cpp/string/byte/isprint, the correct result is false. However, other documentation for isprint is vague on the topic. Since isprint(' ') returns true, it seems odd to me that it would return false for a tab.

I'm not sure what the right answer is here, but I'm leaning towards false based on what non-Windows systems do.

@stevengj
Copy link
Member Author

One potential fix might be to use Unicode character categories, as suggested in #5939.

@stevengj
Copy link
Member Author

Note that our current isprint function just calls iswprint, which is broken on Windows anyway because it only handles characters that are in the BMP (i.e. which fit into one 16-bit UTF16 code unit).

stevengj added a commit to JuliaIO/JSON.jl that referenced this issue Jun 25, 2014
@stevengj
Copy link
Member Author

See also this discussion, in which the MinGW developers comment that contrery [sic] to other systems,msvcrt DLL version of iswprint function returns 1 for tab character and hence specifically check for tabs in GDB's iswprint implementation.

@elextr
Copy link

elextr commented Jun 26, 2014

I don't think there is a Unicode "printable" property looking at http://www.unicode.org/reports/tr44/#Character_Properties

The traditional ASCII printable has meant "not a control or delete", so space was "printable" but tab was not.

@stevengj
Copy link
Member Author

The tab character in Unicode is in category Cc (control characters), so we could define isprintas !iscntrl && ....

@vtjnash
Copy link
Member

vtjnash commented Dec 13, 2014

fixed by #8233

@vtjnash vtjnash closed this as completed Dec 13, 2014
earl pushed a commit to earl/llvm-mirror that referenced this issue Jul 26, 2018
The standard library functions ::isprint/std::isprint have platform-
and locale-dependent behavior which makes LLVM's output less
predictable. In particular, regression tests my fail depending on the
implementation of these functions.

Implement llvm::isPrint in StringExtras.h with a standard behavior and
replace all uses of ::isprint/std::isprint by a call it llvm::isPrint.
The function is inlined and does not look up language settings so it
should perform better than the standard library's version.

Such a replacement has already been done for isdigit, isalpha, isxdigit
in r314883. gtest does the same in gtest-printers.cc using the following
justification:

    // Returns true if c is a printable ASCII character.  We test the
    // value of c directly instead of calling isprint(), which is buggy on
    // Windows Mobile.
    inline bool IsPrintableAscii(wchar_t c) {
      return 0x20 <= c && c <= 0x7E;
    }

Similar issues have also been encountered by Julia:
JuliaLang/julia#7416

I noticed the problem myself when on Windows isprint('\t') started to
evaluate to true (see https://stackoverflow.com/questions/51435249) and
thus caused several unit tests to fail. The result of isprint doesn't
seem to be well-defined even for ASCII characters. Therefore I suggest
to replace isprint by a platform-independent version.

Differential Revision: https://reviews.llvm.org/D49680

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338034 91177308-0d34-0410-b5e6-96231b3b80d8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
system:windows Affects only Windows
Projects
None yet
Development

No branches or pull requests

3 participants