GitHub - 50m30n3/QTC: Using quad trees for video and image compression.

50m30n3 / QTC Public

Notifications You must be signed in to change notification settings
Fork 6
Star 14

Using quad trees for video and image compression.

d00m.org/~someone/qtc/

GPL-3.0 license

14 stars 6 forks Branches Tags Activity

Star

Notifications

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 199 Commits
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README		README
databuffer.c		databuffer.c
databuffer.h		databuffer.h
getgeom		getgeom
image.c		image.c
image.h		image.h
ppm.c		ppm.c
ppm.h		ppm.h
qtc.c		qtc.c
qtc.h		qtc.h
qti.c		qti.c
qti.h		qti.h
qtidec.c		qtidec.c
qtienc.c		qtienc.c
qtv.c		qtv.c
qtv.h		qtv.h
qtvcap.c		qtvcap.c
qtvdec.c		qtvdec.c
qtvenc.c		qtvenc.c
qtvplay.c		qtvplay.c
rangecode.c		rangecode.c
rangecode.h		rangecode.h
tilecache.c		tilecache.c
tilecache.h		tilecache.h
utils.c		utils.c
utils.h		utils.h
x11grab.c		x11grab.c
x11grab.h		x11grab.h

Repository files navigation

QTC (c) 50m30n3 2011, 2012

http://d00m.org/~someone/qtc/

This software is licensed under the GNU GPL V3 see LICENSE for details.


OVERVIEW

The QTC codec is a lossless video and still image codec.
It uses quad trees to quickly find and encode constant and unchanged areas in
image sequences or single images.

QTC is designed for "pixel graphics". Its main goal is to lossless encode screen
captures at high frame rates. It is not suited for encoding photos, film,
or captures of computer games, although it is certainly capable of doing so.

QTC uses full 24bit RGB color internally to avoid conversion artifacts.
Encoding and decoding should be bit exact under all circumstances.

There are currently 3 file formats associated with this codec:
	QTI - Still image container
	QTV - Video container
	QTW - Video container for Web usage

QTW files are split into multiple blocks to make streaming using JavaScript and
HTTP easier.

Both video file containers support indexing and seeking. Adapting the codec for
other container formats should be trivial and is left as an exercise for the
reader.

The supplied programs implement a full featured QTC decoder and encoder for
video ans still images in plain C code. The only exception is the quad tree
compressor itself which uses local functions. This is not supported by all
compilers.

This reference implementation is not tweaked for speed or minimal file size.
All parameters are set on a per file basis, but the format is theoretically
capable of varying parameters dynamically per frame.
A lot of algorithmic and micro optimizations are omitted for the sake of
clarity.

Programs:
	qtienc  - Still image encoder
	qtidec  - Still image decoder
	qtvenc  - Video encoder
	qtvdec  - Video decoder
	qtvcap  - X11 screen capture program
	qtvplay - Video player

Aside from the screen capture program, which is build for the X Window System,
and the video player, which uses SDL, the programs rely on no external
dependencies.


ALGORITHM

The QTC algorithm can be split into three parts.
The first part is image preprocessing, the second part is the quad tree
compression itself, and the last part is a range coder based entropy encoder.

In the fist step the image is preprocessed using lossless image and color
transformations. These are tweakable on a per frame basis.

The image can be processed using either a full Paeth transform, as used in the
PNG file format, or a simplified Paeth transform that only uses a single
predictor for increased speed.

The color information can be transformed into a simplified YUV color space
called fakeyuv. This is done to avoid rounding errors that appear when using
the standard YUV color space.
The fakeyuv color space encodes the green channel as Y component, the difference
between red and green as U and the difference between red and blue as V.
In memory the colors are stored as UYV to increase computational efficiency.

This step does not reduce the image size and only serves to make the image more
compressible in the next steps.

The quad tree compressor itself uses recursive subdivision of the input image to
find areas of constant color or areas that did not change in respect to the
reference image.

When a reference image is available the encoder first checks the current block
against the same block in the reference image. If there are no changes
subdivision stops and the next block gets encoded.

When there is no reference image, or a change was detected in the last step, the
encoder checks if the current block is of a single color.
If the block is of a single color, subdivision stops and the color of the block
is written to the output stream, otherwise the block is subdivided and the
process is repeated.

Should a block at any time get smaller than a certain threshold it gets saved
as "literal" block, where the image contents of the block are written to the
output stream as they are.

At this point an optional caching mechanism can be used to reduce the number of
literal blocks written to the file. This cache allows to recognize recently
used blocks and reference them using an id.

The structure of the subdivision tree built during compression is saved as a
separate command data bit stream so the structure can be replicated during
decompression.

Normally the compressor works on a per pixel basis, but a special mode for
the aforementioned fakeyuv color model exists that encodes the Y component
separately. This is especially useful for gray scale images.

The last step of the encoder is a range coder based entropy encoder.

In the reference encoder this step is integrated into the container format.

The color and index data is encoded 8 bit at a time using a second order
Markov chain model.
The command data is encoded one bit at a time using an eight order
Markov chain model.


USAGE

All programs read and write binary ppm files for uncompressed files.
The video en/decoders read/write image sequences in the form of img001.ppm,
img002.ppm, img003.ppm.... Simply pass the name of the first file.
Concatenated image sequences can also be read/written using stdio.

qtienc:
	-h		-	Print help
	-t [0..2]	-	Use image transforms (0)
	-e		-	Compress output data
	-y [0..2]	-	Use fakeyuv transform (0)
	-v		-	Be verbose
	-s [1..]	-	Minimal block size (2)
	-d [0..]	-	Maximum recursion depth (16)
	-c [0..]	-	Cache size in kilo tiles (0)
	-l [0..]	-	Laziness
	-i filename	-	Input file (-)
	-o filename	-	Output file (-)

qtidec:
	-h		-	Print help
	-v		-	Be verbose
	-a [0..2]	-	Analysis mode
	-i filename	-	Input file (-)
	-o filename	-	Output file (-)

qtvenc:
	-h		-	Print help
	-t [0..2]	-	Use image transforms (0)
	-e		-	Compress output data
	-w		-	Create QTW file
	-y [0..2]	-	Use fakeyuv transform (0)
	-v		-	Be verbose
	-x		-	Create index (Needs key frames)
	-s [1..]	-	Minimal block size (2)
	-n [1..]	-	Limit number of frames to encode
	-r [1..]	-	Frame rate (25)
	-k [1..]	-	Place key frames every X seconds
	-b [1..]	-	Maximum size of one QTW block in KiB (1024)
	-d [0..]	-	Maximum recursion depth (16)
	-c [0..]	-	Cache size in kilo tiles (0)
	-l [0..]	-	Laziness
	-i filename	-	Input file (-)
	-o filename	-	Output file (-)

qtvdec:
	-h		-	Print help
	-v		-	Be verbose
	-w		-	Read QTW file
	-a [0..2]	-	Analysis mode
	-f [0..]	-	Begin decoding at specific frame
	-n [1..]	-	Limit number of frames to decode
	-i filename	-	Input file (-)
	-o filename	-	Output file (-)

qtvcap:
	-h		-	Print help
	-t [0..2]	-	Use image transforms (0)
	-e		-	Compress output data
	-y [0..2]	-	Use fakeyuv transform (0)
	-v		-	Be verbose
	-x		-	Create index (Needs key frames)
	-m		-	Capture Mouse
	-g geometry	-	Specify capture region
	-s [1..]	-	Minimal block size (2)
	-n [1..]	-	Limit number of frames to encode
	-r [1..]	-	Frame rate (25)
	-k [1..]	-	Place key frames every X seconds
	-d [0..]	-	Maximum recursion depth (16)
	-c [0..]	-	Cache size in kilo tiles (0)
	-l [0..]	-	Laziness
	-i filename	-	Input screen ($DISPLAY)
	-o filename	-	Output file (-)

qtvplay:
	-h		-	Print help
	-v		-	Be verbose
	-r [1..]	-	Override frame rate
	-w		-	Read QTW file
	-i filename	-	Input file (-)
	[space]		-	Play/Pause
	[left]		-	Seek backwards 10sec
	[right]		-	Seek forwards 10sec
	[down]		-	Seek backwards 1min
	[up]		-	Seek forwards 1min
	[a]		-	Toggle analysis mode
	[o]		-	Toggle overlay mode	
	[t]		-	Toggle Peath transform
	[y]		-	Toggle fakeyuv
	[s]		-	Print stats


OPTIONS

-t:
	Choose which image transform to use.
	0 - Don't use image transforms (faster, big)
	1 - Use simplified Peath transform (fast, small)
	2 - Use full Peath transform (slow, smaller)

-e:
	Compress output data using entropy coding (slower, smaller)

-y:
	Choose which color transform to use.
	Mode 2 is mostly useful for pure gray scale images like text.
	0 - Don't transform color data (fast, big)
	1 - Transform color data using fakeyuv transform (fast, small)
	2 - Also separate color data during compression (fast, sometimes smaller)

-v:
	Print some stats like compression ratio and FPS

-s:
	Minimal block size. Small values reduce the amount of data after quad tree
	compression. Makes entropy coding slightly less efficient.
	When using entropy coding 4-8 is optimal, otherwise 1-2.

-d:
	Maximum recursion depth during quad tree compression. Setting this to 0
	disables the quad tree compression.

-c:
	Size of the cache to use during compression in kilo (1024) tiles.
	Caching uses an additional cachesize*minsize*minsize*4 bytes of ram
	during en/decoding. Can drastically reduce the size of raw data without
	much of a speed impact. Cache size 64 is recommended.
	Larger cache sizes need more bits to save the cache indices, increasing
	the files size, but allow for more cache hits in large videos.

-l:
	Subdivide quad tree n times before beginning real compression.
	Saves a bit of time but introduces a tiny overhead.
	Values around 3 make sense for FullHD material.

-i:
	Input file name. File to read input from. When not set or "-" read from
	stdin. For image sequences specify the first file. Numbers need leading
	zeros. For QTW files pass the file without number postfix.
	For qtvcap this contains the X screen to capture from.

-o:
	Output file name. File to write output to. When not set or "-" write to
	stdout. For image sequences specify the first file. Numbers need leading
	zeros. For QTW files additional block files with numbered postfixes are
	created.

-a:
	Create a color coded analysis image showing types of blocks and
	subdivisions. Green blocks are simplified blocks, red blocks are literal
	blocks, blue blocks are reference blocks, white blocks are cached.
	When using full fakeyuv encoding a value of 1 will show the luma channel
	and a value of 2 will show the chroma channel.
	For videos/images without color separation this has no effect.

-w:
	Create a QTW file instead of a QTV file. QTW files are designed for web
	usage and JavaScript streaming. The file itself only contains the header and
	index. The video data is written to sequentially numbered block files for
	easier streaming and seeking using JavaScript and HTTP.
	QTW files always have an index.

-x:
	Append an index to the file containing a list of key frames, their offset,
	and in the case of QTW files, their block number. This allows for seeking
	inside the video stream.

-f:
	Start decoding at a specific frame.
	In case the selected frame is not a keyframe or the video has no index
	the decoder will skip the required number of frames.
	This may take a while.

-n:
	Only encode a certain number of frames.

-r:
	Frame rate.

-k:
	Key frame rate. Place a key frame every X seconds. Higher values increase
	the file size since key frames are bigger, but allow for more accurate
	seeking. Every 5-10 seconds should be ok.

-b:
	Maximum size of one QTW block in KiB. Smaller values create more files and
	therefore more server requests but allow for smaller buffer and seek times.

-m:
	Include mouse cursor in screen capture.

-g:
	Specify the capture region. The region is in the format WxH+X,Y.
	W and H are the size of the region, X and Y the offset from the top left
	screen corner.
	You can use the "getgeom" script to query the geometry of a window by
	clicking ok it.


EXAMPLES

Encode a still image (simple):
$ qtienc -i image.ppm -o image.qti

Encode a still image (high compression):
$ qtienc -y1 -t2 -s4 -c64 -e -i image.ppm -o image.qti

Encode an image sequence (simple):
$ qtvenc -i frame0000.ppm -o video.qtv

Encode an image sequence (fast):
$ qtvenc -l3 -i frame0000.ppm -o video.qtv

Encode an image sequence (high compression):
$ qtvenc -y1 -t2 -s4 -c64 -e -i frame0000.ppm -o video.qtv

Re-encode a video:
$ qtvdec -i video_old.qtv | qtvenc -y1 -t2 -s4 -c64 -e -o video_new.qtv

Re-encode a video with key frames and index:
$ qtvdec -i video_old.qtv | qtvenc -x -k10 -y1 -t2 -s4 -c64 -e -o video_new.qtv

Re-encode a video for web into a separate directory:
$ mkdir video.qtw
$ qtvdec -i video.qtv | qtvenc -x -k10 -y1 -t2 -s4 -c64 -e -o video.qtw/video

Decode a video into an image sequence:
$ qtvdec -i video.qtv -o frame0000.ppm

Capture a full screen video:
$ qtvcap -l3 -c64 -m -o capture.qtv

Capture a video of a single window:
$ qtvcap -l2 -c64 -m -g $(getgeom) -o capture.qtv

Play back video:
$ qtvplay -i video.qtv

Re-encode a video using avconv:
$ qtvdec -i video.qtv | avconv -f image2pipe -r 25 -vcodec ppm -i - video.mp4