forked from qsantos/qrcode
-
Notifications
You must be signed in to change notification settings - Fork 0
/
DOCUMENTATION
462 lines (361 loc) · 15.7 KB
/
DOCUMENTATION
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
###########################################################################
########## ##########
########## DOCUMENTATION ##########
########## ##########
###########################################################################
<< Table of contents >>
Introduction
QR-code appearance
Patterns
* Finder patterns
* Timing patterns
* Alignment patterns
Information
* Format information
* Version information
Masking
Codewords
Blocks
* Content splitting
* Interleaving
* Uneven blocks
Error correction
Bibliography
=== Introduction ===
This file intends to give a clear overview of the way QR-codes works. It
is short enough to be read end-to-end easily and contain enough
information to understand the overall idea of QR-encoding. However,
for more technical information and value tables, please refere to the
specification and other sources of information in the bibliography at
the end of the file.
A QR-code is basically a two-dimensional barcode. It is designed to be
easily recognised in a picture (typically shot from a mobile phone) and
to withstand against alterations (physical alteration of the support,
lens flare or recognition glitch).
Their main use is to allow an user to easily access a piece of
information on a physical object using computer-like device (well, a
mobile phone). For example, consumer products often contain an URL to
a promotionnal webpage. An URL could also link to a detailed pack of
information that would not fit on the physicial object.
=== QR-code appearance ===
A QR-code is a square grid block of "modules". A module is a white or
dark square block. They can be seen as individual pixels but will be
usually represented as several pixels on a picture. The "version" of a
QR-code is a number between 1 and 40 (included) which defines its size:
the number of modules on the side of a version 'v' QR-code is 17+4*v.
Examples of QR-codes can be found in the 'examples/' directory. These
images have been generated using 'qrencode' version 4.3.2 available in
Debian repositories. It generates PNG files. However, the PBM file format
(1) is handier to parse. It is basically an ASCII grid of pixels. The
conversion from PNG to PBM has been done using 'convert' from the
ImageMagick software suite.
We will consider the 'hw.pbm' example which contains the single string
"Hello World!". The content of the file is:
= Listing 1: examples/hw.pbm =
P1
21 21
1 1 1 1 1 1 1 0 0 1 0 0 1 0 1 1 1 1 1 1 1
1 0 0 0 0 0 1 0 1 0 0 1 0 0 1 0 0 0 0 0 1
1 0 1 1 1 0 1 0 0 1 0 0 0 0 1 0 1 1 1 0 1
1 0 1 1 1 0 1 0 1 0 0 1 0 0 1 0 1 1 1 0 1
1 0 1 1 1 0 1 0 0 0 1 1 1 0 1 0 1 1 1 0 1
1 0 0 0 0 0 1 0 1 1 1 0 1 0 1 0 0 0 0 0 1
1 1 1 1 1 1 1 0 1 0 1 0 1 0 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0
1 1 1 1 1 0 1 1 1 1 0 0 1 1 0 1 0 1 0 1 0
1 1 1 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 0 1
1 0 0 1 0 0 1 0 1 0 0 1 0 1 1 0 0 1 1 1 0
0 0 0 0 1 0 0 0 1 0 1 0 0 1 0 0 1 1 1 0 0
1 1 0 1 1 1 1 1 0 1 1 1 0 1 1 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 0 0 0
1 1 1 1 1 1 1 0 1 1 0 0 1 0 1 0 0 0 1 1 0
1 0 0 0 0 0 1 0 0 1 1 1 1 1 0 1 0 1 1 0 0
1 0 1 1 1 0 1 0 1 0 0 0 1 0 1 1 0 0 0 1 1
1 0 1 1 1 0 1 0 1 1 0 0 0 1 1 1 1 1 0 0 0
1 0 1 1 1 0 1 0 1 1 1 0 1 0 0 1 0 0 1 0 0
1 0 0 0 0 0 1 0 1 1 0 0 1 1 0 0 1 1 1 0 0
1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 0 1 0 0 1 0
= End of Listing 1 =
The first line is information about the format (ASCII black/white
bitmap). The second line gives the width and height of the image. The size
of this QR-code is 21 so its version is 1. Replacing the 0s with spaces
and the 1s with Xs gets us a more readable representation of the QR-code:
= Figure 1: "Hello World!" QR-code =
X X X X X X X X X X X X X X X X
X X X X X X
X X X X X X X X X X X
X X X X X X X X X X X X
X X X X X X X X X X X X X
X X X X X X X X
X X X X X X X X X X X X X X X X X
X X X
X X X X X X X X X X X X X X
X X X X X X X X X X X X X X X
X X X X X X X X X X
X X X X X X X
X X X X X X X X X X X X X
X X X X
X X X X X X X X X X X X X
X X X X X X X X X X
X X X X X X X X X X X
X X X X X X X X X X X X
X X X X X X X X X X X
X X X X X X X X X
X X X X X X X X X X X X X X X
= End of Figure 1 =
If this file was provided without a QR-code decoder, you can check the
content of the QR-code by using the ZXing online decoder (2).
(1) https://en.wikipedia.org/wiki/Netpbm
(2) http://zxing.org/w/decode.jspx
=== Patterns ===
The first layer in the QR-code decoding stack is the shape recognition. It
is a difficult and error-prone task. To ease the process of this step,
regular fixed patterns are embedded within the image.
The finder patterns and the timing patterns are present in all regular
QR-codes; the number alignment patterns depends on the size of the
QR-code. The first two are present in Figure 1 and Figure 2 help to
locate them (the 0s which are part of the patterns have been put back).
= Figure 2: QR-code patterns =
X X X X X X X 0 0 X X X X X X X
X 0 0 0 0 0 X 0 0 X 0 0 0 0 0 X
X 0 X X X 0 X 0 0 X 0 X X X 0 X
X 0 X X X 0 X 0 0 X 0 X X X 0 X
X 0 X X X 0 X 0 0 X 0 X X X 0 X
X 0 0 0 0 0 X 0 0 X 0 0 0 0 0 X
X X X X X X X 0 X 0 X 0 X 0 X X X X X X X
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
X
0
X
0
X
0 0 0 0 0 0 0 0
X X X X X X X 0
X 0 0 0 0 0 X 0
X 0 X X X 0 X 0
X 0 X X X 0 X 0
X 0 X X X 0 X 0
X 0 0 0 0 0 X 0
X X X X X X X 0
= End of Figure 2 =
== Finder patterns ==
The most obvious patterns are the "finder patterns". They are located
in the top-left, top-right and bottom-left corners. They are composed
of a 3x3 full square within a 7x7 empty square and are separated from
the remaining of the image by a 1-module wide white border.
As their name suggest, they are used to locate a QR-code within a bigger
image and to determine the orientation of the QR-code. Note that they
do not extend with the version/size of the QR-code (their size remain
of 7 modules).
== Timing patterns ==
The second kind of pattern is less outstanding: two lines of alternating
modules. One line is the seventh row and the other is the seventh column
(minus the finder patterns).
They help infering the size of the modules. In the example, they are
5-module long, but their length varies accordingly with the version/size
of the QR-code.
== Alignment patterns ==
There are no alignment patterns in the example. They are made of a dark
module within 5x5 empty square, without any border:
= Figure 3: Alignment pattern =
X X X X X
X 0 0 0 X
X 0 X 0 X
X 0 0 0 X
X X X X X
= End of Figure 3 =
They are used to help the shape-recognition software align the module
grid in bigger QR-codes. Their coordinates varies a bit accross the
QR-code versions and are given in tables.
=== Information ===
== Format information ==
Format information is 5-bit long. 10 more bits help correcting errors
that would happend in this section. These 15 bits are xor-masked with
101010000010010b before being put twice in the QR-code for redundancy.
In the example (see Figure 4), the 15-bit long masked format information
is 111110110101010b. It contains the 5 bits 11000b. The three first bits
110b pick the mask no 6 and the last two set the error correction level
to 0 (L).
= Figure 4: Format information =
X X X X X X X 0 X X X X X X X
X X 1 X X
X X X X X 0 X X X X X
X X X X X 1 X X X X X
X X X X X 0 X X X X X
X X 1 X X
X X X X X X X X X X X X X X X X X
0
1 1 1 1 1 0 X 1 1 1 0 1 0 1 0 1 0
X
X
X
X X X X X X X 1
X X 0
X X X X X 1
X X X X X 1
X X X X X 1
X X 1
X X X X X X X 1
= End of Figure 4 =
== Version information ==
NOTE: the version information is only present from version 7 to 40
The version information is just the version of the QR-code. It is
directly encoded within the image to help the infering of the size of
the modules. The version takes 6 bits, plus 12 error correction bits. No
masking is applied and the 18-bit long chain can be found in two locations
as shown in Figure 5 for a version 8 QR-code.
= Figure 5: Version information =
X X X X X X X 0 0 1 X X X X X X X
X X 1 1 1 X X
X X X X X 0 1 1 X X X X X
X X X X X (...) 0 1 0 X X X X X
X X X X X 0 0 0 X X X X X
X X 1 0 0 X X
X X X X X X X X X X X X X X X X X X
X
(...)
0 1 0 0 0 1 X
0 1 1 1 0 0
1 1 1 0 0 0 X
X X X X X X X
X X
X X X X X
X X X X X
X X X X X
X X
X X X X X X X
= End of Figure 5 =
The previous figure shows a version information chain of
001000010110111100b (bottom-to-top, right-to-left), which corresponds
to the 6-bit number 001000b, for version 8, as expected.
=== Masking ===
The format information contains information about masking. This is
an operation on the content modules which regularly swap some of
them. By selecting the right masking out of the 8 available ones, an
encoder is able to generate a QR-code which content does not contain
a group of modules similar to a pattern (which would otherwise confuse
shape-recognition).
The masks are defined by an expression which depend on the row and the
column of the current module. If this expression is equal to 0, then
the current module is to be swapped.
= Listing 2: The 8 masks =
0 (i+j)%2
1 i%2
2 j%3
3 (i+j)%3
4 (i/2+j/3)%2
5 (i*j)%2 + (i*j)%3
6 ((i*j)%2+(i*j)%3)%2 -> same as ((i*j)%3 + i*j)%2
7 ((i*j)%3+(i+j)%2)%2 -> same as ((i*j)%3 + i+j)%2
= End of Listing 2
When encoding, each mask must be tested in order to find the best one.
=== Codewords ===
The remaining modules represent the content of the QR-code (data and its
own error correction information). As a summary there are five items
which are not content:
* finder patterns;
* timing patterns;
* alignment patterns;
* format information;
* version information.
The content is made of codewords of 8 bits. Each bit correspond to
one module. The bits will be accessed by order of significance (first
accessed is the most significant bit in each codeword).
Basically, the bits of content are arranged as follow. Starting from
the bottom-right, access 8-module codewords from bottom to top in the
following order:
7 6
5 4
3 2
1 0
When the top of the image (row 0) is reached, got to the next (left)
columns and go the same top to bottom:
1 0
3 2
5 4
7 6
However, the areas without content are to be avoided, which will distort
this nice rectangular-shape codeword to hideous monsters. The simplest
way to find the right module to read or write a bit of content is to go
"turtle-mode". We start at the bottom-right corner of the grid and,
whenever the next bit of content is to be accessed, we do the following:
= Listing 3: Content-finding turtle =
if going from bottom-to-top:
go left or go top-right depending on column parity
else:
go left or go bottom-right depending on column parity
iterate until a content module is found
apply masking
= End of Listing 3 =
Note that, for the 6 left-most module columns, to "simplify" things,
the parity is inversed (because of the vertical timing pattern).
=== Blocks ===
== Content splitting ==
To bound the complexity of the decoding algorithm, the error correction
algorithm handles no more than 30 error-correction codewords. Depending
on the size of the QR-code and the error correction level, the number
of data codewords which can be handled with these 30 error-correction
codewords is limited.
Thus, the data is split in blocks so that each block needs no more than
30 error-correction codewords to work. For example, in QR-codes version
4 and error correction level Q, there are 2 blocks of 24 data codewords
alongside with 26 error-correction codewords. For a higher alteration
tolerance, e.g. H, we would choose more blocks with less data, e.g. four
blocks of 9 data codewords and 16 error correction codewords.
= Listing 4: QR-code 4-Q blocks =
Block 1: D1 D2 D3 ... D24 E1 E2 ... E26
Block 2: D25 D26 D27 ... D48 E27 E28 ... E52
= End of Listing 4 =
== Interleaving ==
To mitigate physicial alterations, the blocks are then
interleaved. Because an anomaly is expected to involve an area, the
interleaving should help to share out the errors among the blocks. Without
the interleaving, an anomaly would probably affect only one block but
the limited number of error correction codewords would not be sufficient
to fix them all.
In order to further decrease the risks of one block being corrupted, the
data codewords and the error-correction codewords are separated. Only
half the block should be threatened by the anomaly, and hopefully,
not enough will be to corrupt it.
== Uneven blocks ==
The available content area in a QR-code does not match nicely with
the size of the blocks. Thus, the data part of some blocks may not be
filled. However, the number of error-correction codewords per block is
always the same within a QR-code.
The sizes of the blocks given in the tables are chosen so that there are
two kinds of blocks within a QR-code: one with n data codeword and one
with n+1. All the n-blocks are used first, the (n+1)-blocks are used last.
The interleaving for this case is illustrated with a version 5-H QR-code:
= Listing 5: QR-code 5-H blocks =
Block 1: D1 D2 ... D11 E1 E2 ... E22
Block 2: D12 D13 ... D22 E23 E24 ... E44
Block 3: D23 D24 ... D33 D34 E45 E46 ... E66
Block 4: D35 D36 ... D45 D46 E67 E68 ... E88
= End of Listing 5 =
=== Error correction ===
Note: this section only deals with the error correction for the content
blocks, not for the format or version information.
The Reed-Solomon (RS) error correction algorithm is applied for each
block. In the following section, we will consider a block as a polynomial
over GF(2^8) (each 8-bit long codeword is a coefficient). Note that
the gap in Listing 5 due to the blocks of different size is sicarded:
the last blocks actually have a degree greater than the first ones.
The overall procedure is to first compute "syndromes", which are elements
of GF(2^8) containing information about the state of the block. There
is one syndrome per error-correction codeword. These are then used to
locate the errors, and then correct them.
Note: errors can be either erasures (we know it has been corrupted)
and misreading (it changed without obvious clue it happened)
misreading costs twice as much as erasures to ccorect
=== Bibliography ===
The information used in the design of this program and documentation come
from Wikpedia (1), Wikiversity (2) (3) and the QR-code specification (4)
(5). The last two URLs were found on Stack Overflow (5).
Moreover, some step-by-step execution of the decoding process (printf()
to display the internal state of the program and GIMP to go through the
PBM files) helped understanding ambiguous information and fixing bugs.
(1) https://en.wikipedia.org/wiki/QR-code
(2) https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders
(3) https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders/Additional_information
(4) http://www.codeplex.com/Download?ProjectName=qrcodenet&DownloadId=284291
(5) https://stackoverflow.com/questions/3710937/what-is-the-spec-for-formatting-data-in-qr-codes-i-can-not-find-it-anywhere