use multibyte.h macros for CHAR_T #16

yamt · 2014-03-14T06:35:08Z

this fixes O commands with autoindent and J command at least.

lichray · 2014-03-14T07:36:17Z

CHAR_T literals should be wrapped with the L() macro, yes.

Locale-sensitive upper/lower case sometimes makes sense.

The uses of isspace, isdigit and isblank are intended to be locale insensitive. Otherwise, full-width spaces could be used as vi command splitters, or full-width digits (or even CJK numerals) could be used in hex numbers, etc. I spent a lot of time evaluating them case by case. I suggest you keep locale sensitivity in mind and review those changes again.
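To make the locale-insensitive intent concrete, here is a minimal sketch of predicates that classify a wide character by explicit comparison, so the answer cannot change with the locale. The names `ascii_isblank` and `ascii_isxdigit` are hypothetical and are not taken from nvi2; the sketch also assumes an ASCII-compatible wide encoding, where the letters a to f are contiguous.

```c
#include <stdbool.h>
#include <wchar.h>

/*
 * Hypothetical helpers, not nvi2 code: classify a wide character by
 * explicit comparison so the result never depends on the locale.
 * Assumes an ASCII-compatible wide encoding.
 */

/* True only for the narrow space and tab; U+3000 is rejected. */
bool
ascii_isblank(wchar_t c)
{
	return c == L' ' || c == L'\t';
}

/* True only for ASCII hex digits; full-width digits are rejected. */
bool
ascii_isxdigit(wchar_t c)
{
	return (c >= L'0' && c <= L'9') ||
	    (c >= L'a' && c <= L'f') ||
	    (c >= L'A' && c <= L'F');
}
```

With helpers like these, a full-width space (U+3000) is never accepted as a command splitter and a full-width digit is never accepted in a hex number, whatever LANG is set to.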

yamt · 2014-03-14T14:18:53Z

i'm not sure what you mean. isspace etc is locale sensitive.

lichray · 2014-03-14T15:49:55Z

That's true... Here are the details:

I assumed that I could use the narrow char type functions on wide chars
and get the correct answer for the unsigned char range. Now I know that's
wrong, but it has worked fine so far.

I then tested the effects of the locale on the narrow char type functions
on FreeBSD. Where the effects did not satisfy my needs, I applied isascii
before using those functions. This might not be cross-platform, but if
it also works on other BSDs, then that's fine.

So if you see narrow char type functions on wide chars in nvi2 code, they
are intentional. If you can provide counterexamples to show how they
break, for example, ex scripts becoming locale sensitive when historically
they weren't, or wide char type recognition completely failing on some
BSDs, I'll look at them again.

I would suggest splitting this patch into two, one for L literals, one
for char type functions.

yamt · 2014-03-14T16:47:35Z

it might happen to work for you, but not for me.
using isascii() on wchar_t has the same problem.
please use iswXXX(), or wctob().

at least O commands with autoindent and the J command were broken for fileencoding=iso-2022-jp
on NetBSD. at least for these cases, iswblank() is appropriate.

using a full-width space as a command splitter might be a little icky but not as broken as the current code.

i didn't bother to separate the patch because the L() part is not important at all.
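For context on the wctob() suggestion above: in C, passing a value to a `<ctype.h>` function that is neither EOF nor representable as `unsigned char` is undefined behavior, which is why a narrow function applied directly to a wide character can misbehave. A minimal sketch of the wctob() route follows; the helper name `wc_isdigit_narrow` is hypothetical and not part of nvi2 or this patch.

```c
#include <ctype.h>
#include <stdio.h>
#include <wchar.h>

/*
 * Hypothetical helper, not nvi2 code: apply isdigit() to a wide
 * character only when wctob() reports that it has a single-byte form.
 */
int
wc_isdigit_narrow(wint_t wc)
{
	int b = wctob(wc);	/* EOF if wc has no one-byte representation */

	return b != EOF && isdigit((unsigned char)b) != 0;
}
```

The same pattern would apply to the other narrow classification functions; where a locale-aware answer for the full wide range is wanted, calling the matching iswXXX() function directly is simpler.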

lichray · 2014-03-14T17:00:20Z

On Fri, Mar 14, 2014 at 12:47 PM, YAMAMOTO Takashi wrote:

> it might happen to work for you, but not for me.
> using isascii() on wchar_t has the same problem.
> please use iswXXX(), or wctob().
>
> at least O commands with autoindent and the J command were broken for
> fileencoding=iso-2022-jp
> on NetBSD. at least for these cases, iswblank() is appropriate.

Can you show me the steps to reproduce it, along with your locale settings?
I hope I can also reproduce it on FreeBSD. If not, I'll take this seriously
anyway.

iswblank() must not be used alone for the J command. You definitely don't
want to join a full-width space into a narrow space :)

> using a full-width space as a command splitter might be a little icky but
> not as broken as the current code.

That's not acceptable. We need to fix both.
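One way to read the J-command concern above is that leading blanks on the joined line should be skipped only when they are narrow blanks, so a full-width space survives as text. The sketch below illustrates that reading only; `skip_narrow_blanks` is a hypothetical name, and this is not how nvi2 implements J.

```c
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper, not nvi2 code: skip the leading blanks of a line
 * about to be joined, but only blanks that are also in the ASCII range,
 * so a full-width space (U+3000) is left in place.
 */
const wchar_t *
skip_narrow_blanks(const wchar_t *p)
{
	while (*p != L'\0' && (unsigned long)*p < 0x80 &&
	    iswblank((wint_t)*p))
		p++;
	return p;
}
```

The range check comes first, so iswblank() is only ever consulted for ASCII values and its locale-dependent answers for other characters never matter.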

Zhihao Yuan, ID lichray
The best way to predict the future is to invent it.

4BSD -- http://4bsd.biz/

yamt · 2014-03-15T02:24:06Z

LANG=ja_JP.eucJP
unset LC_xxx

vi
:set fileencoding=iso-2022-jp
:set ai
i今日[ESC]O

and

vi
ihoge[ESC]o今日[ESC]kJ

lichray · 2015-03-31T03:18:30Z

@yamt I looked at this patch again and noticed that none of the calls you asked to change were prefixed by isascii, so even under my theory the status quo is problematic. Now I need some help:

Can you check NetBSD's libc source code and see whether isascii and iswascii are the same? (In short, I expect them to be the same on all ASCII-based systems.)
Do tolower, isdigit and isblank produce the same results as their wide variants for wint_t values within (0, 127)?
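The second question can also be checked empirically on any given system. The following is a rough sketch, not something posted in this thread: it counts the values in 0..127 for which a narrow function and its wide counterpart disagree in the locale currently in effect. The function name `count_ctype_mismatches` is hypothetical, and the sketch assumes the wide encoding maps ASCII values to themselves.

```c
#include <ctype.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical check, not from the thread: count the values in 0..127
 * for which a narrow <ctype.h> function and its wide counterpart
 * disagree in the locale currently in effect.
 * Assumes the wide encoding maps ASCII values to themselves.
 */
int
count_ctype_mismatches(void)
{
	int c, n = 0;

	for (c = 0; c <= 127; c++) {
		wint_t wc = (wint_t)c;

		/* Normalize truth values before comparing them. */
		if ((isdigit(c) != 0) != (iswdigit(wc) != 0))
			n++;
		if ((isblank(c) != 0) != (iswblank(wc) != 0))
			n++;
		/* Case mapping returns characters, so compare directly. */
		if ((wint_t)tolower(c) != towlower(wc))
			n++;
	}
	return n;
}
```

To test a particular locale such as ja_JP.eucJP, call `setlocale(LC_ALL, "")` (from `<locale.h>`) before calling the function; a nonzero result would be a counterexample to the assumption that the narrow and wide variants agree on the ASCII range.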

lichray · 2015-12-29T20:19:00Z

@yamt Can you give this branch a test? Thanks. https://github.com/lichray/nvi2/tree/narrow-wctype

use multibyte.h macros for CHAR_T

7bae61e

this fixes O commands with autoindent and J command at least.

lichray force-pushed the master branch 2 times, most recently from 2c1d2dc to 9f2cc1e Compare April 3, 2015 07:23

lichray force-pushed the master branch 2 times, most recently from 834f889 to 4ee3903 Compare December 29, 2015 19:09
