lua-wcwidth
===========
[Build status](https://travis-ci.org/aperezdc/lua-wcwidth)
[Coverage status](https://coveralls.io/github/aperezdc/lua-wcwidth?branch=master)
[LuaRocks](https://luarocks.org/modules/aperezdc/wcwidth)
When writing output to a fixed-width output system (such as a terminal), the
displayed length of a string does not always match the number of characters
(also known as [runes](https://swtch.com/plan9port/unix/man/rune3.html), or
code points) contained in the string. Some characters occupy two spaces
(full-width characters), and others occupy none.

POSIX.1-2001 and POSIX.1-2008 specify the
[wcwidth(3)](http://man7.org/linux/man-pages/man3/wcwidth.3.html) function,
which reports how many spaces (or *cells*) are needed to display
a Unicode code point. This [Lua](http://lua.org) module contains a portable and
standalone implementation based on the Unicode Standard release files.

This module is mainly useful for programs which write
output to terminals and must keep text properly aligned in the presence of
double-width and zero-width Unicode code points.
Usage
-----
The following snippet defines a function which can determine the display width
for a string:
```lua
local wcwidth, utf8 = require "wcwidth", require "utf8"

local function display_width(s)
  local len = 0
  for _, rune in utf8.codes(s) do
    local l = wcwidth(rune)
    if l >= 0 then
      len = len + l
    end
  end
  return len
end
```
The function above can be used to print any UTF-8 string properly
right-aligned to a terminal:
```lua
local function alignright(s, cols)
  local numspaces = cols - display_width(s)
  local spaces = ""
  while numspaces > 0 do
    numspaces = numspaces - 1
    spaces = spaces .. " "
  end
  return spaces .. s
end

print(alignright("コンニチハ", 80))
```
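
The same per-code-point accumulation can also cut a string so that it fits in
a given number of cells without splitting a double-width character. The
following is a minimal sketch rather than part of the module:
`truncate_to_width` is a hypothetical helper, it assumes Lua 5.3 or later for
the `utf8` library, and it takes the width function as its first argument (for
example the value returned by `require "wcwidth"`) so that any compatible
implementation can be plugged in. Non-printable code points are counted as
zero cells, as in `display_width` above.
```lua
-- Hypothetical helper, not provided by the module.
-- Returns the longest prefix of "s" that fits in "cols" cells, followed by
-- the number of cells that prefix occupies. Raises an error on invalid
-- UTF-8 input, because utf8.codes() does.
local function truncate_to_width(wcwidth, s, cols)
  local used = 0
  for pos, rune in utf8.codes(s) do
    local w = wcwidth(rune)
    if w < 0 then
      w = 0  -- assumption: treat non-printable code points as zero cells
    end
    if used + w > cols then
      -- "pos" is the byte offset of the code point that no longer fits,
      -- so the prefix ends at the byte just before it.
      return s:sub(1, pos - 1), used
    end
    used = used + w
  end
  return s, used
end

-- With the module installed, a call could look like this:
--   truncate_to_width(require "wcwidth", "コンニチハ", 5)
-- Each of these characters is double-width, so only the first two fit.
```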
The `wcwidth()` function takes a Unicode code point as argument, and returns
one of the following values:
* `-1`: Width cannot be determined (the code point is not printable).
* `0`: The code point does not advance the cursor (e.g. `NULL`, or a combining
character).
* `2`: The character is East Asian wide (`W`) or East Asian full-width (`F`),
and is displayed using two spaces.
* `1`: All other characters, which take a single space.
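
The return values listed above can be handled explicitly by calling code. As
an illustration only, the sketch below maps each case to a short description.
`describe_width` is a hypothetical name, not part of the module, and the width
function is passed in as the first argument (for example the value returned by
`require "wcwidth"`).
```lua
-- Hypothetical helper, not provided by the module.
-- Maps the result of a wcwidth-compatible function to a description of
-- the corresponding case.
local function describe_width(wcwidth, codepoint)
  local w = wcwidth(codepoint)
  if w < 0 then
    return "not printable"
  elseif w == 0 then
    return "zero width"
  elseif w == 2 then
    return "double width"
  else
    return "single width"
  end
end

-- With the module installed, a call could look like this:
--   describe_width(require "wcwidth", 0x30B3)  -- U+30B3 is the character コ
```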
Note that the
[wcswidth(3)](http://man7.org/linux/man-pages/man3/wcswidth.3.html) companion
function is *deliberately not provided by this module*: while Lua 5.3 provides
[utf8.codes()](http://www.lua.org/manual/5.3/manual.html#pdf-utf8.codes) and
[utf8.codepoint()](http://www.lua.org/manual/5.3/manual.html#pdf-utf8.codepoint)
to convert UTF-8 byte sequences to code points, other Lua versions would
need to depend on a third-party module, which would go against the
goal of `wcwidth` being standalone. If needed, `wcswidth()` can be
implemented as follows using the Lua 5.3 `utf8` module (or any other
module which provides a compatible interface):
```lua
local wcwidth = require "wcwidth"

-- Calculates the printable length of the first "n" characters of string "s"
-- on a terminal, or of the whole string when "n" is omitted. Returns the
-- number of cells or -1 if the string contains non-printable characters.
-- Raises an error on invalid UTF-8 input.
function wcswidth(s, n)
  local cells, count = 0, 0
  for _, rune in utf8.codes(s) do
    -- Stop once "n" characters have been measured (if a limit was given).
    if n and count >= n then break end
    local w = wcwidth(rune)
    if w < 0 then return -1 end
    cells = cells + w
    count = count + 1
  end
  return cells
end
```
Installation
------------
[LuaRocks](https://luarocks.org) is recommended for installation.
The stable version (recommended) can be installed with:
```sh
luarocks install wcwidth
```
The development version can be installed with:
```sh
luarocks install --server=https://luarocks.org/dev wcwidth
```
Unicode Tables
--------------
The `update-tables` script downloads the following resources from the [Unicode
Consortium website](http://unicode.org):
* @@WIDE_URL@@
* @@ZERO_URL@@
With them, it generates the following files:
* [wcwidth/widetab.lua](./wcwidth/widetab.lua)
* [wcwidth/zerotab.lua](./wcwidth/zerotab.lua)
The most current version of `wcwidth` uses the following versions of the above
Unicode Standard release files:
* `@@WIDE_VER@@`
* `@@ZERO_VER@@`