Computer Vision

This page is intended to organize the development of a Computer Vision library for Julia.

Similar libraries for inspiration

Goals

Mostly type-independent algorithms
Support for high-dimensional data: "5D images", with 3 spatial dimensions, multichannel (multicolor), over time (a color movie would be a 5D image)
Ability to handle images larger than can be resident in memory at once, including different storage schemes (see this thread on the mailing list)
Robust handling of missing data (e.g., from bad camera pixels, dropped frames, or in registered images)
Lots of algorithms, efficiently implemented
Optional GPU support (CuMatrix)
DICOM support
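
To make the "5D image" and missing-data goals concrete, here is a minimal sketch using a plain Julia array. The axis order (x, y, z, channel, time) and the use of `missing` to flag bad pixels are assumptions made for illustration; the eventual design may settle on something else.

```julia
# Assumed axis order: x, y, z, channel, time (not prescribed by this page).
movie = Array{Union{Missing,Float32}}(undef, 8, 8, 4, 3, 5)
fill!(movie, 0.5f0)

# Flag one bad camera pixel in every slice and channel of time point 2.
movie[3, 3, :, :, 2] .= missing

# A single time point is a 4D view; no data is copied.
frame = view(movie, :, :, :, :, 2)

# Statistics that skip the missing entries instead of propagating them.
nbad = count(ismissing, frame)
valid_mean = sum(skipmissing(frame)) / count(!ismissing, frame)
println("missing: ", nbad, ", mean of valid pixels: ", valid_mean)
```

Algorithms written against such a layout would need to decide, per operation, whether missing values are skipped, interpolated, or propagated.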

Current status

Several basic image processing algorithms are already implemented in examples/image.jl, but they accept only a limited set of types, and some of them could probably be implemented more efficiently.

Next steps

The most important step will be a proper design, for which we should take a look at the libraries mentioned above.
Consistent naming convention
Document functions

Other random points:

Image transformations (will need interpolation methods!)
Gaussian/Laplacian pyramids
Optimal threshold (Otsu)
Histogram equalization
Morphological operations (dilate, erode, opening, closing)
TGV denoising
Non-linear filtering (median, alpha-trimmed mean,...)
Canny edge detection
Simulation of noise
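
As an illustration of what one of these items involves, here is a minimal sketch of Otsu's threshold selection. The function name `otsu_threshold`, the `nbins` keyword, and the assumption that intensities lie in [0, 1] are all hypothetical choices made for this example, not part of the existing code.

```julia
# Otsu's method for intensities assumed to lie in [0, 1].
# Returns the threshold that maximizes the between-class variance.
function otsu_threshold(img::AbstractArray{<:Real}; nbins::Int = 256)
    # Build the intensity histogram, clamping values into bins 1:nbins.
    counts = zeros(Int, nbins)
    for v in img
        bin = clamp(floor(Int, v * nbins) + 1, 1, nbins)
        counts[bin] += 1
    end

    total = length(img)
    sum_all = sum(i * counts[i] for i in 1:nbins)

    best_var, best_bin = -1.0, 1
    w0, sum0 = 0, 0
    for t in 1:nbins-1
        w0 += counts[t]              # number of pixels in bins 1..t
        sum0 += t * counts[t]
        w1 = total - w0              # number of pixels in bins t+1..nbins
        (w0 == 0 || w1 == 0) && continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1/total^2.
        between = w0 * w1 * (m0 - m1)^2
        if between > best_var
            best_var, best_bin = between, t
        end
    end
    return best_bin / nbins          # upper edge of the last "dark" bin
end

# Usage: a synthetic image with two intensity populations.
img = vcat(fill(0.2, 50), fill(0.8, 50))
t = otsu_threshold(img)
mask = img .> t
```

A type-independent version would also need to handle integer-valued images and missing data, which this sketch ignores.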

Already implemented functions (implemented for 2D grayscale and RGB memory-resident images)

imread
imwrite
ppmwrite (should be replaced by imwrite now that #328 is closed)
imshow (depends on feh for now)
ftshow (logarithmic view of the Fourier spectrum; nice to have for MRI)
rgb2gray
rgb2hsi
hsi2rgb
rgb2ntsc
ntsc2rgb
imcomplement
imlineardiffusion
imROF (TV denoising)
imedge (partial)
Filters: gaussian2d, sobel, imlaplacian, imdog, imlog, prewitt, imaverage
imadjustintensity
similarity metrics: ssd, ssdn, sad, sadn, ncc
imthresh
imgaussiannoise
imstretch
forward/backward differences: backdiffx, backdiffy, forwarddiffx, forwarddiffy
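
For readers unfamiliar with the similarity metrics listed above, the following sketch shows the usual textbook definitions of three of them. It is not the code in examples/image.jl, and the zero-mean form of `ncc` used here is an assumption; the existing implementation may normalize differently.

```julia
using Statistics: mean

# Sum of squared differences between two images of equal size.
ssd(A, B) = sum(abs2, A .- B)

# Sum of absolute differences.
sad(A, B) = sum(abs, A .- B)

# Normalized cross-correlation (zero-mean form, assumed for this sketch).
function ncc(A, B)
    a = A .- mean(A)
    b = B .- mean(B)
    return sum(a .* b) / sqrt(sum(abs2, a) * sum(abs2, b))
end

A = [1.0 2.0; 3.0 4.0]
B = [1.0 2.0; 3.0 6.0]
println(ssd(A, B), " ", sad(A, B), " ", ncc(A, B))
```

With this definition, `ncc` is invariant to an affine change of intensity in either image, which is why it is often preferred when comparing images with different brightness or contrast.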

Design proposal

An image will be represented as one of the following main types (other types will be introduced below):

ImageArray: an "in memory" image stored as a multidimensional array
ImageFileArray: an image stored on disk in multidimensional array format (note: this "raw" format is not intended for general-purpose export, formats like .png, .ppm, .tif will still be used for that)
ImageFileBricks: an image stored on disk as an "array of arrays", for example representing a 256x256 image as a 4x4 array of sub-images with size 64x64. This format is designed to support local operations on images of arbitrarily large size.

These will be composite types whose fields specify details about the representation (more detail below).
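
To give a feel for what such composite types might look like, here is a rough sketch written in current Julia syntax. All field names except `coordinate_ranges`, the abstract supertype `AbstractImage`, and the helper `brickgrid` are hypothetical; they are placeholders for discussion, not the implementation in the repository.

```julia
# Hypothetical common supertype for all image representations.
abstract type AbstractImage end

# In-memory image: the pixel data plus the coordinate ranges it covers
# in the image it was taken from.
mutable struct ImageArray{T,N} <: AbstractImage
    data::Array{T,N}
    coordinate_ranges::NTuple{N,UnitRange{Int}}
end

# Convenience constructor: an image that covers its own full extent.
ImageArray(data::Array{T,N}) where {T,N} =
    ImageArray{T,N}(data, map(n -> 1:n, size(data)))

# On-disk image split into bricks (metadata only in this sketch).
struct ImageFileBricks{T,N} <: AbstractImage
    path::String
    image_size::NTuple{N,Int}
    brick_size::NTuple{N,Int}
end

# Number of bricks along each dimension, rounding up at the borders.
brickgrid(img::ImageFileBricks) = cld.(img.image_size, img.brick_size)

img = ImageArray(zeros(Float32, 256, 256))
bricks = ImageFileBricks{Float32,2}("image.bricks", (256, 256), (64, 64))
println(img.coordinate_ranges)
println(brickgrid(bricks))
```

The second example mirrors the 256x256 image stored as a 4x4 array of 64x64 sub-images described above.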

Image library functions will have a syntax illustrated by the following:

copydata(image_out,image_in): converts from one format to another (or simply copies the data, if the type isn't changing)
imfilter(image_out,image_in,kernel): spatial/temporal filtering

One issue to discuss is whether the output should come first, or last. Here I have shown it first, because additional arguments (as in the imfilter example) are likely to be best thought of as "inputs" and hence should perhaps not be split from image_in.

This syntax leverages the ability of Julia functions to modify their arguments. This has several distinct advantages:

The most efficient algorithm is likely to depend upon the storage format of both the input and the output. Julia's multiple dispatch will make this much easier to optimize.
This obviates the need to pass additional arguments specifying the desired output format, because all the details about how you want to format the output image are already stored as the fields of image_out.
You can readily specify that you only need a sub-region:
```
image_out.coordinate_ranges = [20:50,30:85];
copydata(image_out,image_in)
```
will snip out a rectangular portion of image_in and store it in image_out. As a bonus, image_out automatically keeps track of which region its data came from.
Sub-region strategies will make it easier to implement many algorithms for the ImageFileArray and ImageFileBricks types. For example, for an ImageFileBricks type you can create an ImageArray object corresponding to a single output brick, call the version of the algorithm written for an output of ImageArray type, and use the result as one of the data bricks. Likewise, this same strategy should make it straightforward to implement multithreaded operations, where each thread processes a block of pixels.
If you don't need to keep the original image, you can specify an in-place operation by passing an ImageNil (another image type not yet introduced) as the first argument. This can save memory.
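
The output-first convention and the sub-region idea can be sketched as follows. This is only an illustration under stated assumptions: the two-field `ImageArray` below is a simplified, hypothetical stand-in for the real type, it stores `coordinate_ranges` as a tuple of ranges, and it handles only 2D in-memory images.

```julia
# Simplified, hypothetical ImageArray; the field layout is an assumption.
mutable struct ImageArray{T}
    data::Matrix{T}
    coordinate_ranges::Tuple{UnitRange{Int},UnitRange{Int}}
end

# Output-first convention: fill image_out with the region of image_in
# named by image_out.coordinate_ranges.
function copydata(image_out::ImageArray, image_in::ImageArray)
    rows, cols = image_out.coordinate_ranges
    # Translate absolute coordinates into indices of image_in's data array.
    r0 = first(image_in.coordinate_ranges[1])
    c0 = first(image_in.coordinate_ranges[2])
    # Assigning to the typed field converts the element type if needed.
    image_out.data = image_in.data[rows .- (r0 - 1), cols .- (c0 - 1)]
    return image_out
end

image_in = ImageArray(reshape(collect(1.0:10000.0), 100, 100), (1:100, 1:100))
image_out = ImageArray(zeros(0, 0), (1:0, 1:0))

# Ask for a rectangular sub-region, then copy.
image_out.coordinate_ranges = (20:50, 30:85)
copydata(image_out, image_in)
println(size(image_out.data))
```

Further methods of `copydata` for other combinations of input and output types (for example, a disk-backed output) would be added alongside this one and selected by multiple dispatch.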

Finally, a key component of the library will be to provide pixel iterators of different types. Those used in VIGRA ("iterators") and ImgLib2 ("cursors") will be the models. However, currently I'd propose that we only implement iterators that work on ImageArray objects; we then use Julia's multiple-dispatch capabilities to iterate over bricks of more complicated types. The virtue of this strategy is that we don't need to try to write "one true iterator" for complex formats; we can adapt the iteration strategy to the algorithm. For example, a filtering operation with a kernel that has large extent along the z axis but small extent along x and y might benefit from a different "bricking strategy" than one with a kernel of different shape.
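
The "iterate over bricks, then run the in-memory algorithm on each brick" strategy might look roughly like this. The helpers `brickranges`, `complement!`, and `process_bricks!` are hypothetical names, and plain arrays with views stand in for the image types; a real implementation would dispatch on ImageArray and the disk-backed types instead.

```julia
# Hypothetical helper: the index ranges of every brick tiling an array of size sz.
function brickranges(sz::NTuple{N,Int}, bricksz::NTuple{N,Int}) where {N}
    # Along each axis, split 1:sz[d] into chunks of at most bricksz[d] elements.
    chunks = ntuple(d -> [i:min(i + bricksz[d] - 1, sz[d]) for i in 1:bricksz[d]:sz[d]], N)
    return vec(collect(Iterators.product(chunks...)))
end

# An algorithm written only for in-memory arrays (here: intensity complement).
function complement!(dst::AbstractArray, src::AbstractArray)
    for i in eachindex(dst, src)
        dst[i] = 1 - src[i]
    end
    return dst
end

# Apply the in-memory algorithm brick by brick; the brick shape is a parameter,
# so it can be adapted to the algorithm.
function process_bricks!(dst::AbstractArray, src::AbstractArray, bricksz)
    for r in brickranges(size(src), bricksz)
        complement!(view(dst, r...), view(src, r...))
    end
    return dst
end

src = rand(10, 10, 7)
dst = similar(src)
# Bricks that are small in x and y but span the full z extent.
process_bricks!(dst, src, (4, 4, 7))
```

A filter with a kernel of nonzero extent would additionally need each brick to carry a margin of neighboring pixels, which this sketch leaves out.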

Status: a reasonably complete implementation of ImageArray is currently in the master repository. (Users should be aware that details may still change.) Implementation of algorithms and the ImageFile* types is underway.
