QTC (c) 50m30n3 2011, 2012
http://d00m.org/~someone/qtc/
This software is licensed under the GNU GPL V3; see LICENSE for details.
OVERVIEW
The QTC codec is a lossless video and still image codec.
It uses quad trees to quickly find and encode constant and unchanged areas in
image sequences or single images.
QTC is designed for "pixel graphics". Its main goal is to losslessly encode
screen captures at high frame rates. It is not suited for encoding photos, film,
or captures of computer games, although it is certainly capable of doing so.
QTC uses full 24bit RGB color internally to avoid conversion artifacts.
Encoding and decoding should be bit exact under all circumstances.
There are currently 3 file formats associated with this codec:
QTI - Still image container
QTV - Video container
QTW - Video container for Web usage
QTW files are split into multiple blocks to make streaming using JavaScript and
HTTP easier.
Both video file containers support indexing and seeking. Adapting the codec for
other container formats should be trivial and is left as an exercise for the
reader.
The supplied programs implement a full featured QTC decoder and encoder for
video and still images in plain C code. The only exception is the quad tree
compressor itself, which uses local (nested) functions. These are a compiler
extension and are not supported by all compilers.
This reference implementation is not tweaked for speed or minimal file size.
All parameters are set on a per file basis, but the format is theoretically
capable of varying parameters dynamically per frame.
A lot of algorithmic and micro optimizations are omitted for the sake of
clarity.
Programs:
qtienc - Still image encoder
qtidec - Still image decoder
qtvenc - Video encoder
qtvdec - Video decoder
qtvcap - X11 screen capture program
qtvplay - Video player
Aside from the screen capture program, which is built for the X Window System,
and the video player, which uses SDL, the programs rely on no external
dependencies.
ALGORITHM
The QTC algorithm can be split into three parts.
The first part is image preprocessing, the second part is the quad tree
compression itself, and the last part is a range coder based entropy encoder.
In the first step the image is preprocessed using lossless image and color
transformations. These are tweakable on a per frame basis.
The image can be processed using either a full Paeth transform, as used in the
PNG file format, or a simplified Paeth transform that only uses a single
predictor for increased speed.
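As an illustration of the full Paeth transform, the following is a minimal sketch of the Paeth predictor as defined for PNG, applied to a single 8 bit channel plane. The function names and the plane layout are assumptions made for this example and are not taken from the QTC sources. The simplified single predictor variant is not shown.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Paeth predictor as defined for PNG.
 * a = left, b = above, c = upper left neighbour of the current sample. */
uint8_t paeth_predict(uint8_t a, uint8_t b, uint8_t c)
{
    int p  = (int)a + (int)b - (int)c;   /* initial estimate */
    int pa = abs(p - (int)a);
    int pb = abs(p - (int)b);
    int pc = abs(p - (int)c);

    if (pa <= pb && pa <= pc)
        return a;
    if (pb <= pc)
        return b;
    return c;
}

/* Neighbours outside the plane are treated as 0 (assumption of this sketch). */
static uint8_t sample(const uint8_t *p, int w, int x, int y)
{
    return (x < 0 || y < 0) ? 0 : p[y * w + x];
}

/* Forward transform: dst receives the prediction residuals (mod 256). */
void paeth_forward(const uint8_t *src, uint8_t *dst, int w, int h)
{
    assert(w > 0 && h > 0);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            uint8_t pred = paeth_predict(sample(src, w, x - 1, y),
                                         sample(src, w, x, y - 1),
                                         sample(src, w, x - 1, y - 1));
            dst[y * w + x] = (uint8_t)(src[y * w + x] - pred);
        }
}

/* Inverse transform, in place. Raster order guarantees that the left,
 * upper and upper left neighbours are already reconstructed. */
void paeth_inverse(uint8_t *buf, int w, int h)
{
    assert(w > 0 && h > 0);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            uint8_t pred = paeth_predict(sample(buf, w, x - 1, y),
                                         sample(buf, w, x, y - 1),
                                         sample(buf, w, x - 1, y - 1));
            buf[y * w + x] = (uint8_t)(buf[y * w + x] + pred);
        }
}
```

Because the residuals are computed modulo 256 and the inverse adds the same prediction back, the round trip is bit exact.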
The color information can be transformed into a simplified YUV color space
called fakeyuv. This is done to avoid rounding errors that appear when using
the standard YUV color space.
The fakeyuv color space encodes the green channel as Y component, the difference
between red and green as U and the difference between red and blue as V.
In memory the colors are stored as UYV to increase computational efficiency.
This step does not reduce the image size and only serves to make the image more
compressible in the next steps.
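The fakeyuv idea can be sketched as follows. The sign of the differences (R-G and R-B) and the use of wrap around arithmetic modulo 256 are assumptions of this example; the text above only states which channels are combined. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb_t;
typedef struct { uint8_t u, y, v; } uyv_t;   /* UYV order, as stored in memory */

/* Forward transform: Y = G, U = R - G, V = R - B (all modulo 256). */
uyv_t rgb_to_fakeyuv(rgb_t in)
{
    uyv_t out;
    out.y = in.g;
    out.u = (uint8_t)(in.r - in.g);
    out.v = (uint8_t)(in.r - in.b);
    return out;
}

/* Inverse transform: recover G first, then R from U, then B from V. */
rgb_t fakeyuv_to_rgb(uyv_t in)
{
    rgb_t out;
    out.g = in.y;
    out.r = (uint8_t)(in.u + in.y);
    out.b = (uint8_t)(out.r - in.v);
    return out;
}

/* Checks that a sample of the color cube survives the round trip unchanged. */
void fakeyuv_selfcheck(void)
{
    for (int r = 0; r < 256; r += 5)
        for (int g = 0; g < 256; g += 5)
            for (int b = 0; b < 256; b += 5) {
                rgb_t in = { (uint8_t)r, (uint8_t)g, (uint8_t)b };
                rgb_t back = fakeyuv_to_rgb(rgb_to_fakeyuv(in));
                assert(back.r == in.r && back.g == in.g && back.b == in.b);
            }
}
```

Only integer additions and subtractions are involved, so no rounding occurs. For a gray pixel (R = G = B) both U and V become 0, which is why gray scale content compresses well after this step.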
The quad tree compressor itself uses recursive subdivision of the input image to
find areas of constant color or areas that did not change with respect to the
reference image.
When a reference image is available the encoder first checks the current block
against the same block in the reference image. If there are no changes
subdivision stops and the next block gets encoded.
When there is no reference image, or a change was detected in the last step, the
encoder checks if the current block is of a single color.
If the block is of a single color, subdivision stops and the color of the block
is written to the output stream, otherwise the block is subdivided and the
process is repeated.
Should a block at any time get smaller than a certain threshold it gets saved
as "literal" block, where the image contents of the block are written to the
output stream as they are.
At this point an optional caching mechanism can be used to reduce the number of
literal blocks written to the file. This cache allows the encoder to recognize
recently used blocks and reference them using an id.
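One possible way to build such a cache is a small hash table keyed by tile contents. The sketch below is an assumption about how this could work, not the scheme used by the reference encoder: the tile size, the slot count, the hash function and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TILE_PIXELS 4      /* hypothetical: one 2x2 literal block */
#define CACHE_SLOTS 1024   /* hypothetical: one "kilo tile" */

typedef struct {
    uint32_t tiles[CACHE_SLOTS][TILE_PIXELS];
    uint8_t  used[CACHE_SLOTS];
} tile_cache_t;

/* FNV style hash over the packed pixels of a tile, reduced to a slot. */
static uint32_t tile_hash(const uint32_t *tile)
{
    uint32_t h = 2166136261u;
    for (int i = 0; i < TILE_PIXELS; i++) {
        h ^= tile[i];
        h *= 16777619u;
    }
    return h % CACHE_SLOTS;
}

/* Returns the slot id if the tile is cached (the encoder would write the id
 * instead of the pixels). Returns -1 on a miss and stores the tile so that
 * a later occurrence can be referenced. */
int cache_lookup(tile_cache_t *c, const uint32_t *tile)
{
    assert(c != NULL && tile != NULL);
    uint32_t slot = tile_hash(tile);

    if (c->used[slot] &&
        memcmp(c->tiles[slot], tile, sizeof c->tiles[slot]) == 0)
        return (int)slot;

    memcpy(c->tiles[slot], tile, sizeof c->tiles[slot]);
    c->used[slot] = 1;
    return -1;
}
```

In a scheme like this the decoder has to apply the same store on every literal block it reads, so that both sides agree on which id refers to which tile.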
The structure of the subdivision tree built during compression is saved as a
separate command data bit stream so the structure can be replicated during
decompression.
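The subdivision steps described above can be sketched as a recursive function. This is a simplified illustration and not the reference encoder: the order of the checks, the meaning of the command bits, the fixed size output buffers and all names are assumptions. The cache, the recursion depth limit and the entropy coder are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QT_MAX_OUT 4096   /* hypothetical buffer size for this sketch */

typedef struct {
    uint8_t  cmd[QT_MAX_OUT];    /* command stream, one byte per bit */
    size_t   ncmd;
    uint32_t data[QT_MAX_OUT];   /* color stream, one packed pixel each */
    size_t   ndata;
} qt_out_t;

typedef struct {
    const uint32_t *cur;   /* current image */
    const uint32_t *ref;   /* reference image, or NULL if there is none */
    int stride;            /* pixels per image row */
    int minsize;           /* blocks this small are stored as literals */
    qt_out_t *out;
} qt_ctx_t;

static void put_cmd(qt_out_t *o, int bit)
{
    assert(o->ncmd < QT_MAX_OUT);
    o->cmd[o->ncmd++] = (uint8_t)bit;
}

static void put_data(qt_out_t *o, uint32_t px)
{
    assert(o->ndata < QT_MAX_OUT);
    o->data[o->ndata++] = px;
}

/* mode 0: block equals the reference image, mode 1: block is one color */
static int block_test(const qt_ctx_t *c, int x, int y, int w, int h, int mode)
{
    uint32_t first = c->cur[y * c->stride + x];
    for (int j = y; j < y + h; j++)
        for (int i = x; i < x + w; i++) {
            uint32_t px = c->cur[j * c->stride + i];
            uint32_t want = mode ? first : c->ref[j * c->stride + i];
            if (px != want)
                return 0;
        }
    return 1;
}

void qt_encode(const qt_ctx_t *c, int x, int y, int w, int h)
{
    if (w <= 0 || h <= 0)
        return;                           /* empty quadrant of an odd size */

    if (c->ref != NULL) {
        int same = block_test(c, x, y, w, h, 0);
        put_cmd(c->out, same);            /* 1 = unchanged, stop here */
        if (same)
            return;
    }

    int constant = block_test(c, x, y, w, h, 1);
    put_cmd(c->out, constant);            /* 1 = single color, stop here */
    if (constant) {
        put_data(c->out, c->cur[y * c->stride + x]);
        return;
    }

    if (w <= c->minsize && h <= c->minsize) {
        for (int j = y; j < y + h; j++)   /* literal block: raw pixels */
            for (int i = x; i < x + w; i++)
                put_data(c->out, c->cur[j * c->stride + i]);
        return;
    }

    int hw = (w + 1) / 2, hh = (h + 1) / 2;   /* split into four quadrants */
    qt_encode(c, x,      y,      hw,     hh);
    qt_encode(c, x + hw, y,      w - hw, hh);
    qt_encode(c, x,      y + hh, hw,     h - hh);
    qt_encode(c, x + hw, y + hh, w - hw, h - hh);
}
```

A decoder that reads the command stream in the same order can rebuild the tree without any additional size or position information.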
Normally the compressor works on a per pixel basis, but a special mode for
the aforementioned fakeyuv color model exists that encodes the Y component
separately. This is especially useful for gray scale images.
The last step of the encoder is a range coder based entropy encoder.
In the reference encoder this step is integrated into the container format.
The color and index data is encoded 8 bits at a time using a second order
Markov chain model.
The command data is encoded one bit at a time using an eighth order
Markov chain model.
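To illustrate what the model orders mean, the sketch below shows an adaptive bit model whose context is formed by the previous eight command bits, and how a context could be formed from the previous two bytes for the color data. The counting scheme, the probability scale and all names are assumptions of this example. The range coder that would consume the probabilities is not shown.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t ones[256];    /* number of 1 bits seen in each context */
    uint16_t total[256];   /* number of bits seen in each context */
    uint8_t  ctx;          /* the previous eight command bits */
} cmd_model_t;

void cmd_model_init(cmd_model_t *m)
{
    for (int i = 0; i < 256; i++) {
        m->ones[i] = 1;    /* start at 1/2 so no bit has probability 0 */
        m->total[i] = 2;
    }
    m->ctx = 0;
}

/* Estimated probability that the next bit is 1, scaled to 1..4095. */
unsigned cmd_model_p1(const cmd_model_t *m)
{
    unsigned p = (unsigned)m->ones[m->ctx] * 4096u / m->total[m->ctx];
    if (p < 1)
        p = 1;
    if (p > 4095)
        p = 4095;
    return p;
}

/* Update the counts of the current context, then shift the bit into it. */
void cmd_model_update(cmd_model_t *m, int bit)
{
    assert(bit == 0 || bit == 1);
    uint16_t *o = &m->ones[m->ctx];
    uint16_t *t = &m->total[m->ctx];

    *o = (uint16_t)(*o + bit);
    *t = (uint16_t)(*t + 1);
    if (*t >= 1024) {      /* halve the counts to keep the model adaptive */
        *o = (uint16_t)((*o + 1) / 2);
        *t = (uint16_t)((*t + 1) / 2);
    }
    m->ctx = (uint8_t)((m->ctx << 1) | bit);
}

/* Second order context for byte data: the two previously coded bytes. */
unsigned byte_context(uint8_t prev1, uint8_t prev2)
{
    return ((unsigned)prev2 << 8) | prev1;
}
```

An eighth order bit model needs only 256 contexts, while a second order byte model already has 65536 contexts with 256 possible symbols each.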
USAGE
All programs use binary ppm files for uncompressed input and output.
The video en/decoders read/write image sequences in the form of img001.ppm,
img002.ppm, img003.ppm, ... Simply pass the name of the first file.
Concatenated image sequences can also be read from stdin or written to stdout.
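For reference, a binary ppm file (magic number P6) consists of a short text header followed by raw RGB bytes. The sketch below writes one such frame and builds numbered file names with leading zeros. The function names are hypothetical and the code is not part of the supplied programs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write one binary ppm (P6) frame with 8 bits per channel.
 * rgb holds w*h pixels as R,G,B byte triples. Returns 0 on success.
 * The stream should be opened in binary mode. */
int ppm_write(FILE *f, const uint8_t *rgb, int w, int h)
{
    assert(f != NULL && rgb != NULL && w > 0 && h > 0);
    if (fprintf(f, "P6\n%d %d\n255\n", w, h) < 0)
        return -1;
    size_t n = (size_t)w * (size_t)h * 3;
    return fwrite(rgb, 1, n, f) == n ? 0 : -1;
}

/* Build a sequence file name such as img002.ppm from prefix "img",
 * 3 digits and index 2. Returns 0 on success, -1 if buf is too small. */
int ppm_sequence_name(char *buf, size_t len, const char *prefix,
                      int digits, int index)
{
    int n = snprintf(buf, len, "%s%0*d.ppm", prefix, digits, index);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Calling ppm_write repeatedly on the same stream produces a concatenated image sequence, since each frame carries its own header.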
qtienc:
-h - Print help
-t [0..2] - Use image transforms (0)
-e - Compress output data
-y [0..2] - Use fakeyuv transform (0)
-v - Be verbose
-s [1..] - Minimal block size (2)
-d [0..] - Maximum recursion depth (16)
-c [0..] - Cache size in kilo tiles (0)
-l [0..] - Laziness
-i filename - Input file (-)
-o filename - Output file (-)
qtidec:
-h - Print help
-v - Be verbose
-a [0..2] - Analysis mode
-i filename - Input file (-)
-o filename - Output file (-)
qtvenc:
-h - Print help
-t [0..2] - Use image transforms (0)
-e - Compress output data
-w - Create QTW file
-y [0..2] - Use fakeyuv transform (0)
-v - Be verbose
-x - Create index (Needs key frames)
-s [1..] - Minimal block size (2)
-n [1..] - Limit number of frames to encode
-r [1..] - Frame rate (25)
-k [1..] - Place key frames every X seconds
-b [1..] - Maximum size of one QTW block in KiB (1024)
-d [0..] - Maximum recursion depth (16)
-c [0..] - Cache size in kilo tiles (0)
-l [0..] - Laziness
-i filename - Input file (-)
-o filename - Output file (-)
qtvdec:
-h - Print help
-v - Be verbose
-w - Read QTW file
-a [0..2] - Analysis mode
-f [0..] - Begin decoding at specific frame
-n [1..] - Limit number of frames to decode
-i filename - Input file (-)
-o filename - Output file (-)
qtvcap:
-h - Print help
-t [0..2] - Use image transforms (0)
-e - Compress output data
-y [0..2] - Use fakeyuv transform (0)
-v - Be verbose
-x - Create index (Needs key frames)
-m - Capture Mouse
-g geometry - Specify capture region
-s [1..] - Minimal block size (2)
-n [1..] - Limit number of frames to encode
-r [1..] - Frame rate (25)
-k [1..] - Place key frames every X seconds
-d [0..] - Maximum recursion depth (16)
-c [0..] - Cache size in kilo tiles (0)
-l [0..] - Laziness
-i filename - Input screen ($DISPLAY)
-o filename - Output file (-)
qtvplay:
-h - Print help
-v - Be verbose
-r [1..] - Override frame rate
-w - Read QTW file
-i filename - Input file (-)
[space] - Play/Pause
[left] - Seek backwards 10sec
[right] - Seek forwards 10sec
[down] - Seek backwards 1min
[up] - Seek forwards 1min
[a] - Toggle analysis mode
[o] - Toggle overlay mode
[t] - Toggle Paeth transform
[y] - Toggle fakeyuv
[s] - Print stats
OPTIONS
-t:
Choose which image transform to use.
0 - Don't use image transforms (faster, big)
1 - Use simplified Paeth transform (fast, small)
2 - Use full Paeth transform (slow, smaller)
-e:
Compress output data using entropy coding (slower, smaller)
-y:
Choose which color transform to use.
Mode 2 is mostly useful for pure gray scale images like text.
0 - Don't transform color data (fast, big)
1 - Transform color data using fakeyuv transform (fast, small)
2 - Also separate color data during compression (fast, sometimes smaller)
-v:
Print some stats like compression ratio and FPS
-s:
Minimal block size. Small values reduce the amount of data after quad tree
compression, but make entropy coding slightly less efficient.
When using entropy coding 4-8 is optimal, otherwise 1-2.
-d:
Maximum recursion depth during quad tree compression. Setting this to 0
disables the quad tree compression.
-c:
Size of the cache to use during compression in kilo (1024) tiles.
Caching uses an additional cachesize*minsize*minsize*4 bytes of RAM
during en/decoding. Can drastically reduce the size of raw data without
much of a speed impact. Cache size 64 is recommended.
Larger cache sizes need more bits to save the cache indices, increasing
the file size, but allow for more cache hits in large videos.
-l:
Subdivide quad tree n times before beginning real compression.
Saves a bit of time but introduces a tiny overhead.
Values around 3 make sense for FullHD material.
-i:
Input file name. File to read input from. When not set or "-" read from
stdin. For image sequences specify the first file. Numbers need leading
zeros. For QTW files pass the file without number postfix.
For qtvcap this contains the X screen to capture from.
-o:
Output file name. File to write output to. When not set or "-" write to
stdout. For image sequences specify the first file. Numbers need leading
zeros. For QTW files additional block files with numbered postfixes are
created.
-a:
Create a color coded analysis image showing types of blocks and
subdivisions. Green blocks are simplified blocks, red blocks are literal
blocks, blue blocks are reference blocks, white blocks are cached.
When using full fakeyuv encoding a value of 1 will show the luma channel
and a value of 2 will show the chroma channel.
For videos/images without color separation this has no effect.
-w:
Create a QTW file instead of a QTV file. QTW files are designed for web
usage and JavaScript streaming. The file itself only contains the header and
index. The video data is written to sequentially numbered block files for
easier streaming and seeking using JavaScript and HTTP.
QTW files always have an index.
-x:
Append an index to the file containing a list of key frames, their offset,
and in the case of QTW files, their block number. This allows for seeking
inside the video stream.
-f:
Start decoding at a specific frame.
In case the selected frame is not a key frame or the video has no index
the decoder will skip the required number of frames.
This may take a while.
-n:
Only encode a certain number of frames.
-r:
Frame rate.
-k:
Key frame interval. Place a key frame every X seconds. Smaller values increase
the file size since key frames are bigger, but allow for more accurate
seeking. Every 5-10 seconds should be ok.
-b:
Maximum size of one QTW block in KiB. Smaller values create more files and
therefore more server requests but allow for smaller buffer and seek times.
-m:
Include mouse cursor in screen capture.
-g:
Specify the capture region. The region is in the format WxH+X,Y.
W and H are the size of the region, X and Y the offset from the top left
screen corner.
You can use the "getgeom" script to query the geometry of a window by
clicking on it.
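As an illustration of the WxH+X,Y region format used by -g, the following sketch parses such a string with sscanf. It is not the parser used by qtvcap. The type and function names are hypothetical, and rejecting trailing characters and non-positive sizes is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    int w, h;   /* size of the capture region */
    int x, y;   /* offset from the top left screen corner */
} region_t;

/* Parse a region given as WxH+X,Y, for example 640x480+100,50.
 * Returns 0 on success and -1 if the string is malformed. */
int parse_geometry(const char *s, region_t *r)
{
    char tail;   /* only set when unexpected characters follow */

    assert(s != NULL && r != NULL);
    if (sscanf(s, "%dx%d+%d,%d%c", &r->w, &r->h, &r->x, &r->y, &tail) != 4)
        return -1;
    if (r->w <= 0 || r->h <= 0)
        return -1;
    return 0;
}
```

The extra %c conversion only succeeds when something follows the last number, so a return value of exactly 4 means the whole string matched the format.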
EXAMPLES
Encode a still image (simple):
$ qtienc -i image.ppm -o image.qti
Encode a still image (high compression):
$ qtienc -y1 -t2 -s4 -c64 -e -i image.ppm -o image.qti
Encode an image sequence (simple):
$ qtvenc -i frame0000.ppm -o video.qtv
Encode an image sequence (fast):
$ qtvenc -l3 -i frame0000.ppm -o video.qtv
Encode an image sequence (high compression):
$ qtvenc -y1 -t2 -s4 -c64 -e -i frame0000.ppm -o video.qtv
Re-encode a video:
$ qtvdec -i video_old.qtv | qtvenc -y1 -t2 -s4 -c64 -e -o video_new.qtv
Re-encode a video with key frames and index:
$ qtvdec -i video_old.qtv | qtvenc -x -k10 -y1 -t2 -s4 -c64 -e -o video_new.qtv
Re-encode a video for web into a separate directory:
$ mkdir video.qtw
$ qtvdec -i video.qtv | qtvenc -w -x -k10 -y1 -t2 -s4 -c64 -e -o video.qtw/video
Decode a video into an image sequence:
$ qtvdec -i video.qtv -o frame0000.ppm
Capture a full screen video:
$ qtvcap -l3 -c64 -m -o capture.qtv
Capture a video of a single window:
$ qtvcap -l2 -c64 -m -g $(getgeom) -o capture.qtv
Play back video:
$ qtvplay -i video.qtv
Re-encode a video using avconv:
$ qtvdec -i video.qtv | avconv -f image2pipe -r 25 -vcodec ppm -i - video.mp4