Computer Vision
This page is intended to organize the development of a Computer Vision library for Julia.
- Mostly type-independent algorithms
- Support for high-dimensional data: "5D images", with 3 spatial dimensions, multichannel (multicolor), over time (a color movie would be a 5D image)
- Ability to handle images larger than can be resident in memory at once, including different storage schemes (see this thread on the mailing list)
- Robust handling of missing data (e.g., from bad camera pixels, dropped frames, or in registered images)
- Lots of algorithms, efficiently implemented
- optional GPU support (CuMatrix)
- DICOM support
- Several basic image processing algorithms are already implemented in `examples/image.jl`, though these support only a limited set of types. There are probably also more efficient solutions for some of them.
- The most important step will be a proper design, for which we should take a look at the libraries mentioned above.
- consistent naming convention
- Document functions
Other random points:
- Image transformations (will need interpolation methods!)
- Gaussian/Laplacian pyramids
- optimal threshold (Otsu)
- Histogram equalization
- Morphological operations (dilate, erode, opening, closing)
- TGV denoising
- Non-linear filtering (median, alpha-trimmed mean,...)
- Canny edge detection
- Simulation of noise
- imread
- imwrite
- ppmwrite (should be replaced by imwrite since #328 is closed)
- imshow (depends on feh for now)
- ftshow (logarithmic view of the Fourier spectrum; nice to have for MRI)
- rgb2gray
- rgb2hsi
- hsi2rgb
- rgb2ntsc
- ntsc2rgb
- imcomplement
- imlineardiffusion
- imROF (TV denoising)
- imedge (partial)
- Filters: gaussian2d, sobel, imlaplacian, imdog, imlog, prewitt, imaverage
- imadjustintensity
- similarity metrics: ssd, ssdn, sad, sadn, ncc
- imthresh
- imgaussiannoise
- imstretch
- forward/backward differences: backdiffx, backdiffy, forwarddiffx, forwarddiffy
An image will be represented as one of the following main types (other types will be introduced later below):
- ImageArray: an "in memory" image stored as a multidimensional array
- ImageFileArray: an image stored on disk in multidimensional array format (note: this "raw" format is not intended for general-purpose export; formats like .png, .ppm, .tif will still be used for that)
- ImageFileBricks: an image stored on disk as an "array of arrays", for example representing a 256x256 image as a 4x4 array of sub-images with size 64x64. This format is designed to support local operations on images of arbitrarily large size.
These will be composite types whose fields specify details about the representation (more detail below).
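To make the brick layout concrete, here is a minimal sketch, written for current Julia (1.x), of how a pixel coordinate might be mapped to the brick that holds it. The function name `brick_index` and its arguments are hypothetical and only illustrate the 256x256 example above; nothing here is an existing API.

```julia
# Hypothetical helper: map a 1-based pixel coordinate to the brick that
# contains it and to the pixel's position inside that brick.
function brick_index(pixel::NTuple{N,Int}, bricksize::NTuple{N,Int}) where {N}
    brick  = map((p, b) -> div(p - 1, b) + 1, pixel, bricksize)
    within = map((p, b) -> mod(p - 1, b) + 1, pixel, bricksize)
    return brick, within
end

# A 256x256 image stored as a 4x4 array of 64x64 bricks:
brick, within = brick_index((130, 64), (64, 64))
# pixel (130, 64) lies in brick (3, 1), at position (2, 64) inside it
```

A local operation then only needs to load the bricks whose indices it touches, which is the point of the format.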
Image library functions will have a syntax illustrated by the following:
- `copy(image_out,image_in)`: converts from one format to another (or simply copies the data, if the type isn't changing)
- `imfilter(image_out,image_in,kernel)`: spatial/temporal filtering
One issue to discuss is whether the output should come first, or last. Here I have shown it first, because additional arguments (as in the imfilter example) are likely to be best thought of as "inputs" and hence should perhaps not be split from image_in.
This syntax leverages Julia's ability to modify its input arguments. This has several distinct advantages:
- The most efficient algorithm is likely to depend upon the storage format of both the input and the output. Julia's multiple dispatch will make this much easier to optimize.
- This obviates the need to pass additional arguments specifying the desired output format, because all the details about how you want to format the output image are already stored as the fields of image_out.
- You can readily specify that you only need a sub-region: `image_out.coordinate_ranges = [20:50,30:85]; copy(image_out,image_in)` will snip out a rectangular portion of image_in and store it in image_out. As a bonus, image_out automatically keeps track of which region its data came from.
- Sub-region strategies will make it easier to implement many algorithms for ImageFileArray and ImageFileBricks types. For example, for an ImageFileBricks type you can create an ImageArray object corresponding to a single output brick, call the version of the algorithm written for an output of ImageArray type, and use the result as one of the data bricks. Likewise, this same strategy should make it straightforward to implement multithreaded operations, where each thread processes a block of pixels.
- If you don't need to keep the original image, you can specify an in-place operation by passing an ImageNil (another image type not yet introduced) as the first argument. This can save memory.
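The sub-region idea can be sketched as follows. This is only an illustration under stated assumptions: it is written for current Julia (1.x), `SketchImage` is a hypothetical stand-in with just two of the proposed fields, and the function is named `imcopy!` (also hypothetical) so that it does not extend Base's `copy`.

```julia
# Hypothetical stand-in for the proposed image type, reduced to the two
# fields this example needs.
mutable struct SketchImage{T<:Number}
    data::Array{T}
    coordinate_ranges::Vector{UnitRange{Int}}
end

# Whole-image constructor: the ranges cover every index of `data`.
SketchImage(data::Array{T}) where {T<:Number} =
    SketchImage{T}(data, [1:n for n in size(data)])

# Output-first copy: the region to extract is read from the output image's
# own `coordinate_ranges`, so no extra "region" argument is needed.
function imcopy!(image_out::SketchImage, image_in::SketchImage)
    image_out.data = image_in.data[image_out.coordinate_ranges...]
    return image_out
end

image_in  = SketchImage(reshape(collect(1.0:10000.0), 100, 100))
image_out = SketchImage(zeros(0, 0))
image_out.coordinate_ranges = [20:50, 30:85]
imcopy!(image_out, image_in)
size(image_out.data)  # (31, 56)
```

After the call, `image_out.coordinate_ranges` still records that the data came from rows 20:50 and columns 30:85 of the input.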
Finally, a key component of the library will be to provide pixel iterators of different types. Those used in VIGRA ("iterators") and ImgLib2 ("cursors") will be the models. However, currently I'd propose that we only implement iterators that work on ImageArray objects; we then use Julia's multiple-dispatch capabilities to iterate over bricks of more complicated types. The virtue of this strategy is that we don't need to try to write "one true iterator" for complex formats; we can adapt the iteration strategy to the algorithm. For example, a filtering operation with a kernel that has large extent along the z axis but small extent along x and y might benefit from a different "bricking strategy" than one with a kernel of different shape.
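One way to picture this strategy is the sketch below, written for current Julia (1.x). The algorithm is written once for in-memory arrays, and a separate driver walks over bricks whose size the caller chooses. The names `brick_ranges`, `invert!`, and `invert_bricked!` are hypothetical, and a plain `Array` stands in for both ImageArray and the bricked formats.

```julia
# Hypothetical helper: list the index ranges of every brick that tiles an
# array of size `dims`; bricks at the far edges may be smaller.
function brick_ranges(dims::NTuple{N,Int}, bricksize::NTuple{N,Int}) where {N}
    per_axis = map((d, b) -> [i:min(i + b - 1, d) for i in 1:b:d], dims, bricksize)
    return vec(collect(Iterators.product(per_axis...)))
end

# An "in memory" algorithm, written once for array-like blocks.
invert!(block::AbstractArray) = (block .= 1 .- block; block)

# Driver: apply the in-memory algorithm brick by brick.
function invert_bricked!(img::Array, bricksize)
    for r in brick_ranges(size(img), bricksize)
        invert!(view(img, r...))
    end
    return img
end

img = fill(0.25, 256, 256)
invert_bricked!(img, (64, 64))   # every pixel is now 0.75
```

Because the brick size is an argument of the driver, a filter with a tall, narrow kernel could request bricks such as `(16, 16, 256)` without any change to the in-memory algorithm.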
Here is a very rough beginning for the ImageArray type:
    abstract ImageCoordinate
    abstract Space <: ImageCoordinate
    abstract Time <: ImageCoordinate
    abstract Channel <: ImageCoordinate

    abstract Image

    type ImageArray{T<:Number} <: Image
        data::Array{T}
        coordinate_types::Vector{ImageCoordinate}
        coordinate_units::Vector{Any}      # vector of strings, "microns" or I"\mu m"
        coordinate_names::Vector{Any}      # vector of strings, "X" or "Y"
        coordinate_ranges::Vector{Range1}
        space_directions::Matrix{Float64}  # e.g., 0.15*eye(2) for single image with 0.15 micron pixels
        valid::Array{Any,1}                # can be used to store bad frame/pixel data
        metadata::CompositeKind            # arbitrary metadata, like acquisition time, etc.

        ImageArray{T}() = new()
    end
Supplying a constructor that just takes an array input and provides defaults for the other fields will make it easier for people who don't want/need to worry about all these other fields.
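A minimal sketch of such a convenience constructor follows. It is written for current Julia (1.x), so the keywords differ from the declarations above, and several choices are assumptions made only for illustration: the type is named `SketchImageArray`, the coordinate kinds carry an `Axis` suffix to avoid clashing with `Base.Channel`, `coordinate_types` holds types rather than instances, `metadata` is a `Dict`, and the default units and names (`"pixel"`, `"dim1"`, ...) are placeholders.

```julia
using LinearAlgebra  # provides the identity `I`

abstract type ImageCoordinate end
abstract type SpaceAxis   <: ImageCoordinate end
abstract type TimeAxis    <: ImageCoordinate end
abstract type ChannelAxis <: ImageCoordinate end

mutable struct SketchImageArray{T<:Number}
    data::Array{T}
    coordinate_types::Vector{DataType}
    coordinate_units::Vector{String}
    coordinate_names::Vector{String}
    coordinate_ranges::Vector{UnitRange{Int}}
    space_directions::Matrix{Float64}
    valid::Vector{Any}
    metadata::Dict{String,Any}
end

# Convenience constructor: only the pixel data is required.
function SketchImageArray(data::Array{T,N}) where {T<:Number,N}
    return SketchImageArray{T}(
        data,
        fill(SpaceAxis, N),           # assumed default: every axis is spatial
        fill("pixel", N),             # assumed default unit
        ["dim$i" for i in 1:N],       # assumed default names
        [1:n for n in size(data)],    # ranges cover the whole array
        Matrix{Float64}(I, N, N),     # unit pixel spacing
        Any[],                        # no bad-pixel information
        Dict{String,Any}(),           # no metadata
    )
end

img = SketchImageArray(rand(Float32, 64, 48))
```

Callers who care about units or spacing can still overwrite individual fields afterwards, for example `img.coordinate_units = ["microns", "microns"]`.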
One important detail concerns the data type `T` and the specification of bad pixels via the `valid` field. When `T` is `Float32` or `Float64` and the type is `ImageArray`, it is straightforward to use a `NaN` to represent a known bad pixel: just have `data[i,j,...] = NaN`. However, when `T` is an integer type (e.g., common on-disk formats), this is not an option. In such cases, the `valid` field provides a way of marking bad pixels. `valid` can be a `Bool` array of the same size as the image. Alternatively, in raw acquired data sets, bad pixels are frequently separable: you have certain pixels on your camera that you know are bad, and perhaps something went wrong during the acquisition of a particular frame or image stack. For example, suppose stack 20 is entirely bad, and frame 14 of stack 37 is also bad. `valid` could be specified in the following way:
    # Assume goodpixels is an array of the size of one camera frame, true for the good pixels
    goodframes = trues(1,1,n_frames_per_stack,n_stacks)  # first 2 coords are camera x and y
    goodframes[1,1,:,20] = false
    goodframes[1,1,14,37] = false
    img.valid = cell(2)
    img.valid[1] = goodpixels
    img.valid[2] = goodframes
The notion here is that the 4-dimensional `valid` array is being represented as the outer product of goodpixels and goodframes, but for reasons of memory efficiency we don't directly compute the outer product.
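As a rough illustration in current Julia (1.x), a validity lookup can consult the two factors directly and never allocate the 4-dimensional array. The helper name `isvalidpixel` and the array sizes are hypothetical.

```julia
# Hypothetical helper: look up validity from the two separable factors
# without forming the full 4-dimensional array.
isvalidpixel(goodpixels, goodframes, x, y, frame, stack) =
    goodpixels[x, y] && goodframes[1, 1, frame, stack]

nx, ny, n_frames_per_stack, n_stacks = 8, 6, 20, 40
goodpixels = trues(nx, ny)
goodpixels[3, 4] = false                      # one known-bad camera pixel
goodframes = trues(1, 1, n_frames_per_stack, n_stacks)
goodframes[1, 1, :, 20] .= false              # stack 20 is entirely bad
goodframes[1, 1, 14, 37] = false              # frame 14 of stack 37 is bad

isvalidpixel(goodpixels, goodframes, 1, 1, 5, 20)   # false: bad stack
isvalidpixel(goodpixels, goodframes, 3, 4, 1, 1)    # false: bad pixel
isvalidpixel(goodpixels, goodframes, 1, 1, 1, 1)    # true

# Broadcasting can expand the product when it really is needed, e.g. for checks:
full = goodpixels .& goodframes               # size (8, 6, 20, 40)
```

The two factors here hold 48 + 800 flags, while the expanded array holds 38400, which shows the memory saving even at this toy size.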