Jeff Bezanson edited this page Nov 18, 2011 · 61 revisions
               _
   _       _ _(_)_     |
  (_)     | (_) (_)    | A fresh approach to technical computing
   _ _   _| |_  __ _   |
  | | | | | | |/ _` |  |           http://julialang.org
  | | |_| | | | (_| |  |       julia-math@googlegroups.com
 _/ |\__'_|_|_|\__'_|  |
|__/                   |
## Overview

This is the official documentation for the Julia programming language. It is generally intended to be read in the order below, but can for the most part be sensibly read out of order. Most of the language is documented, so reading this manual should give a fairly good idea of how to write programs in it.

## Resources

## Contents
  1. Introduction

  2. Getting Started

  3. Integers and Floating-Point Numbers

  4. Mathematical Operations

  5. Complex and Rational Numbers

  6. Strings

  7. Functions

  8. Control Flow

  9. Variables and Scoping

  10. Types

  11. Methods

  12. Constructors

  13. Conversion and Promotion

  14. Arrays

  15. Running External Programs

  16. Metaprogramming

  17. Parallel Computing

  18. Calling C and Fortran Code

  19. Standard Library Reference

  20. Potential Features

# 1. Introduction

Scientific computing has traditionally required the highest performance, yet domain experts have largely moved to slower dynamic languages for daily work. We believe there are many good reasons to prefer dynamic languages for these applications, and we do not expect their use to diminish. Fortunately, modern language design and compiler techniques make it possible to mostly eliminate the performance trade-off and provide a single environment productive enough for prototyping and efficient enough for deploying performance-intensive applications. The Julia programming language fills this role: it is a flexible dynamic language, appropriate for scientific and numerical computing, with performance comparable to traditional statically-typed languages.

Julia features optional typing, multiple dispatch, and good performance, achieved using type inference and just-in-time (JIT) compilation, implemented using LLVM. It is multi-paradigm, combining features of imperative, functional, and object-oriented programming. The syntax of Julia is similar to that of MATLAB®, and consequently MATLAB® programmers should feel immediately comfortable with Julia. While MATLAB® is quite effective for prototyping and exploring numerical linear algebra, it has limitations for programming tasks outside of this relatively narrow scope. Julia keeps MATLAB®'s ease and expressiveness for high-level numerical computing, but transcends its general programming limitations. To achieve this, Julia builds upon the lineage of mathematical programming languages, but also borrows much from popular dynamic languages, including Lisp, Perl, Python, Lua, and Ruby.

The most significant departures of Julia from typical dynamic languages are:

  • The core language imposes very little; the standard library is written in Julia itself, including primitive operations like integer arithmetic
  • A rich language of types for constructing and describing objects, which can also optionally be used to make type declarations
  • The ability to define function behavior across many combinations of argument types via multiple dispatch
  • Automatic generation of efficient, specialized code for different argument types
  • Good performance, approaching that of statically-compiled languages like C

Although one sometimes speaks of dynamic languages as being "typeless", they are definitely not: every object, whether primitive or user-defined, has a type. The lack of type declarations in most dynamic languages, however, means that one cannot instruct the compiler about the types of values, and often cannot explicitly talk about types at all. In static languages, on the other hand, while one can — and usually must — annotate types for the compiler, types exist only at compile time and cannot be manipulated or expressed at run time. In Julia, types are themselves run-time objects, and can also be used to convey information to the compiler.

While the casual programmer need not explicitly use types or multiple dispatch, they are the core unifying features of Julia: functions are defined on different combinations of argument types, and applied by dispatching to the most specific matching definition. This model is a good fit for mathematical programming, where it is unnatural for the first argument to "own" an operation as in traditional object-oriented dispatch. Operators are just functions with special notation — to extend addition to new user-defined data types, you define new methods for the + function. Existing code then seamlessly applies to the new data types.
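
As a minimal sketch of dispatching on argument types, the following defines three methods of a single function. The name `describe` is hypothetical and chosen only for illustration, and the sketch assumes that an integer literal matches an `Int` annotation and a floating-point literal matches `Float64`:

```julia
# Three methods of one function; the most specific matching method is chosen.
describe(x::Int, y::Int)         = "two integers"
describe(x::Float64, y::Float64) = "two floats"
describe(x, y)                   = "something else"   # fallback for any other pair

println(describe(1, 2))       # two integers
println(describe(1.0, 2.0))   # two floats
println(describe(1, 2.0))     # something else
```

No single argument "owns" the operation here: the method is selected from the types of both arguments together.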

Partly because of run-time type inference (augmented by optional type annotations), and partly because of a strong focus on performance from the inception of the project, Julia's computational efficiency exceeds that of other dynamic languages, and even rivals that of statically-compiled languages. For large scale numerical problems, speed always has been, continues to be, and probably always will be crucial: the amount of data being processed has easily kept pace with Moore's Law over the past decades.

Julia aims to create an unprecedented combination of ease-of-use, power, and efficiency in a single language. In addition to the above, some advantages of Julia over comparable systems include:

  • Free and open source (MIT licensed)
  • User-defined types are as fast and compact as built-ins
  • Designed for parallelism and distributed computation
  • Lightweight "green" threading (coroutines)
  • Unobtrusive yet powerful type system
  • Elegant and extensible conversions and promotions for numeric and other types
  • Efficient support for Unicode, including but not limited to UTF-8
  • Call C functions directly (no wrappers or special APIs needed)
  • Powerful shell-like capabilities for managing other processes
  • Lisp-like macros and other metaprogramming facilities
# 2. Getting Started

## Installation and Running

The latest version of Julia can be downloaded and installed by following the instructions on the main GitHub page. The easiest way to learn and experiment with Julia is by starting an interactive session (also known as a read-eval-print loop or "REPL"):

$ julia
               _      
   _       _ _(_)_     |
  (_)     | (_) (_)    |  A fresh approach to technical computing.
   _ _   _| |_  __ _   |  
  | | | | | | |/ _` |  |  Version 0 (pre-release)
  | | |_| | | | (_| |  |  Commit 61847c5aa7 (2011-08-20 06:11:31)*
 _/ |\__'_|_|_|\__'_|  |  
|__/                   |

julia> 1 + 2
3

julia> ans
3

julia> load("file.j")

To exit the interactive session, type ^D — the control key together with the d key. When run in interactive mode, julia displays a banner and prompts the user for input. Once the user has entered a complete expression, such as 1 + 2, and hits enter, the interactive session evaluates the expression and shows its value. If an expression is entered into an interactive session with a trailing semicolon, its value is not shown. The variable ans is bound to the value of the last evaluated expression whether it is shown or not. The load function reads and evaluates the contents of the given file.

To run code in a file non-interactively, you can give it as the first argument to the julia command:

$ julia script.j arg1 arg2...

As the example implies, the arguments that follow the script name are passed as command-line arguments to the program script.j. There are various ways to run Julia code and provide options, reminiscent of those taken by the perl and ruby programs:

julia [options] [program] [args...]
 -q --quiet               Quiet startup without banner
 -H --home=<dir>          Load files relative to <dir>
 -T --tab=<size>          Set REPL tab width to <size>

 -e --eval=<expr>         Evaluate <expr> and don't print
 -E --print=<expr>        Evaluate and print <expr>
 -P --post-boot=<expr>    Evaluate <expr> right after boot
 -L --load=file           Load <file> right after boot
 -b --bare                Bare: don't load default startup files
 -J --sysimage=file       Start up with the given system image file

 -p n                     Run n local processes

 -h --help                Print this message
## Major Differences From MATLAB®

The following are the most significant differences that may trip up Julia users accustomed to MATLAB®:

  • Arrays are indexed with square brackets, A[i,j].
  • Multiple values are returned and assigned with parentheses, return (a, b) and (a, b) = f(x).
  • Values are passed and assigned by reference. If a function modifies an array, the changes will be visible in the caller.
  • Use n for nx1: The number of arguments to an array constructor equals the number of dimensions of the result. In particular, rand(n) makes a 1-dimensional array.
  • Concatenating scalars and arrays with the syntax [x,y,z] concatenates in the first dimension ("vertically"). For the second dimension ("horizontally"), use spaces as in [x y z]. To construct block matrices (concatenating in the first two dimensions), the syntax [a b; c d] is used to avoid confusion.
  • Colons a:b and a:b:c construct Range objects. To construct a full vector, use linspace, or "concatenate" the range by enclosing it in brackets, [a:b].
  • Functions return values using the return keyword, instead of by listing their names in the function definition (see The "return" Keyword for details).
  • A file may contain any number of functions, and all definitions will be externally visible when the file is loaded.
  • Reductions such as sum, prod, and max are performed over every element of an array when called with a single argument as in sum(A).
  • Functions such as sort that operate column-wise by default (sort(A) is equivalent to sort(A,1)) do not have special behavior for 1xN arrays; the argument is returned unmodified since it still performs sort(A,1). To sort a 1xN matrix like a vector, use sort(A,2).
  • Parentheses must be used to call a function with zero arguments, as in tic() and toc().
  • Do not use commas to end statements. The results of statements are not automatically printed (except at the interactive prompt), and lines of code do not need to end with semicolons. The function println can be used to print a value followed by a newline.
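
A few of the points above can be sketched together: square-bracket indexing, arguments passed by reference, multiple return values, and whole-array reductions. The function name `set_first` and the variable names are hypothetical, chosen only for illustration:

```julia
# Overwrites the first element of A and returns two values as a tuple.
function set_first(A, value)
    A[1] = value                 # square-bracket indexing; the caller sees this change
    return (A[1], length(A))     # multiple values are returned with parentheses
end

v = [1, 2, 3]
(first_value, n) = set_first(v, 42)

println(v[1])       # 42: the array was modified in place
println(n)          # 3

M = [1 2; 3 4]
println(sum(M))     # 10: the reduction runs over every element
```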
# 3. Integers and Floating-Point Numbers

Integers and floating-point values are the basic building blocks of arithmetic and computation. Built-in representations of such values are called numeric primitives, while representations of integers and floating-point numbers as immediate values in code are known as numeric literals. For example, 1 is an integer literal, while 1.0 is a floating-point literal; their binary in-memory representations as objects are numeric primitives. Julia provides a broad range of primitive numeric types, and a full complement of arithmetic and bitwise operators as well as standard mathematical functions are defined over them. The following are Julia's primitive numeric types:

  • Integer types:

    • Int8 — signed 8-bit integers ranging from -2^7 to 2^7 - 1.
    • Uint8 — unsigned 8-bit integers ranging from 0 to 2^8 - 1.
    • Int16 — signed 16-bit integers ranging from -2^15 to 2^15 - 1.
    • Uint16 — unsigned 16-bit integers ranging from 0 to 2^16 - 1.
    • Int32 — signed 32-bit integers ranging from -2^31 to 2^31 - 1.
    • Uint32 — unsigned 32-bit integers ranging from 0 to 2^32 - 1.
    • Int64 — signed 64-bit integers ranging from -2^63 to 2^63 - 1.
    • Uint64 — unsigned 64-bit integers ranging from 0 to 2^64 - 1.
  • Floating-point types:

    • Float32 — IEEE 754 32-bit floating-point numbers.
    • Float64 — IEEE 754 64-bit floating-point numbers.

Additionally, full support for complex and rational numbers is built on top of these primitive numeric types. All numeric types interoperate naturally without explicit casting, thanks to a flexible type promotion system. Moreover, this promotion system, detailed in Conversion and Promotion, is user-extensible, so user-defined numeric types can be made to interoperate just as naturally as built-in types.

## Integers

Literal integers are represented in the standard manner:

julia> 1
1

julia> 1234
1234

The default type for an integer literal depends on whether the target system has a 32-bit architecture or a 64-bit architecture:

# 32-bit system:
julia> typeof(1)
Int32

# 64-bit system:
julia> typeof(1)
Int64

Larger integer literals which cannot be represented using only 32 bits but can be represented in 64 bits always create 64-bit integers, regardless of the system type:

# 32-bit or 64-bit system:
julia> typeof(3000000000)
Int64

If an integer literal has a value larger than can be represented as an Int64 but smaller than the maximum value that can be represented by a Uint64, then it will create a Uint64 value:

# 32-bit or 64-bit system:
julia> 12345678901234567890
12345678901234567890

julia> typeof(ans)
Uint64

The minimum and maximum representable values of primitive numeric types such as integers are given by the typemin and typemax functions:

julia> (typemin(Int32), typemax(Int32))
(-2147483648,2147483647)

julia> for T = {Int8,Uint8,Int16,Uint16,Int32,Uint32,Int64,Uint64}
         println("$(lpad(T,6)): [$(typemin(T)),$(typemax(T))]")
       end
  Int8: [-128,127]
 Uint8: [0,255]
 Int16: [-32768,32767]
Uint16: [0,65535]
 Int32: [-2147483648,2147483647]
Uint32: [0,4294967295]
 Int64: [-9223372036854775808,9223372036854775807]
Uint64: [0,18446744073709551615]

This last expression uses several features we have yet to introduce, including for loops, strings, and string interpolation, but should be easy enough to understand for people coming from most mainstream programming languages.
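
The limits printed above follow the usual two's-complement pattern: a signed n-bit type spans -2^(n-1) to 2^(n-1) - 1. The following is a small check of that relationship for the two narrowest signed types, kept to small powers so that the arithmetic does not overflow on a 32-bit system:

```julia
# Each comparison prints true: the limits are powers of two.
println(typemin(Int8)  == -(2^7))
println(typemax(Int8)  ==  2^7 - 1)
println(typemin(Int16) == -(2^15))
println(typemax(Int16) ==  2^15 - 1)
```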

Integers can also be input in hexadecimal form using 0x as a prefix, a notation also found in C, Java, Perl, Python and Ruby:

julia> 0xff
255

julia> 0xffffffff
4294967295

There is no literal input format for integer types besides Int32, Int64 and Uint64. On 64-bit systems, there is not even a literal syntax for Int32 values. You can, however, convert values to other integer types easily:

julia> int8(-15)
-15

julia> typeof(ans)
Int8

julia> uint8(231)
231

julia> typeof(ans)
Uint8
## Floating-Point Numbers

Literal floating-point numbers are represented in the standard formats:

julia> 1.0
1.0

julia> 1.
1.0

julia> 0.5
0.5

julia> .5
0.5

julia> -1.23
-1.23

julia> 1e10
1e+10

julia> 2.5e-4
0.00025

The above results are all Float64 values. There is no literal format for Float32, but you can convert values to Float32 easily:

julia> float32(-1.5)
-1.5

julia> typeof(ans)
Float32

There are three specified standard floating-point values that do not correspond to a point on the real number line:

  • Inf — positive infinity — a value larger than all finite floating-point values
  • -Inf — negative infinity — a value smaller than all finite floating-point values
  • NaN — not a number — a value incomparable to all floating-point values (including itself).

For further discussion of how these non-finite floating-point values are ordered with respect to each other and other floats, see Numeric Comparison. By the IEEE 754 standard, these floating-point values are the results of certain arithmetic operations:

julia> 1/0
Inf

julia> -5/0
-Inf

julia> 0.000001/0
Inf

julia> 0/0
NaN

julia> 500 + Inf
Inf

julia> 500 - Inf
-Inf

julia> Inf + Inf
Inf

julia> Inf - Inf
NaN

julia> Inf/Inf
NaN

The typemin and typemax functions also apply to floating-point types:

julia> (typemin(Float32),typemax(Float32))
(-Inf,Inf)

julia> (typemin(Float64),typemax(Float64))
(-Inf,Inf)

Note that Float32 values NaN, Inf and -Inf are shown identically to their Float64 counterparts.

Floating-point types also support the eps function, which gives the distance between 1.0 and the next larger representable floating-point value:

julia> eps(Float32)
1.192092896e-07

julia> eps(Float64)
2.22044604925031308e-16

These values are 2^-23 and 2^-52 as Float32 and Float64 values, respectively. The eps function can also take a floating-point value as an argument, and gives the absolute difference between that value and the next representable floating-point value. That is, eps(x) yields a value of the same type as x such that x + eps(x) is the next representable floating-point value larger than x:

julia> eps(1.0)
2.22044604925031308e-16

julia> eps(1000.)
1.13686837721616030e-13

julia> eps(1e-27)
1.79366203433576585e-43

julia> eps(0.0)
4.94065645841246544e-324

As you can see, the distance to the next larger representable floating-point value is smaller for smaller values and larger for larger values. In other words, the representable floating-point numbers are densest in the real number line near zero, and grow sparser exponentially as one moves farther away from zero. By definition, eps(1.0) is the same as eps(Float64) since 1.0 is a 64-bit floating-point value.
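
One way to see what eps(x) means in practice is to add a full step and a half step to a value. This is only a sketch, assuming the default round-to-nearest behavior of IEEE 754 arithmetic:

```julia
x = 1.0
println(x + eps(x) > x)       # true: a full step reaches the next float above x
println(x + eps(x)/2 == x)    # true: half a step rounds back to x

y = 1000.0
println(eps(y) > eps(x))      # true: the spacing grows with magnitude
println(y + eps(y)/2 == y)    # true: the same rounding happens at the larger scale
```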

### Background and References

For a brief but lucid presentation of how floating-point numbers are represented, see John D. Cook's article on the subject as well as his introduction to some of the issues arising from how this representation differs in behavior from the idealized abstraction of real numbers. For an excellent, in-depth discussion of floating-point numbers and issues of numerical accuracy encountered when computing with them, see David Goldberg's paper What Every Computer Scientist Should Know About Floating-Point Arithmetic. For even more extensive documentation of the history of, rationale for, and issues with floating-point numbers, as well as discussion of many other topics in numerical computing, see the collected writings of William Kahan, commonly known as the "Father of Floating-Point". Of particular interest may be An Interview with the Old Man of Floating-Point.

## Numeric Literal Coefficients

To make common numeric formulas and expressions clearer, Julia allows variables to be immediately preceded by a numeric literal, implying multiplication. This makes writing polynomial expressions much cleaner:

julia> x = 3
3

julia> 2x^2 - 3x + 1
10

julia> 1.5x^2 - .5x + 1
13.0

You can also use numeric literals as coefficients to parenthesized expressions:

julia> 2(x-1)^2 - 3(x-1) + 1
3

Additionally, parenthesized expressions can be used as coefficients to variables, implying multiplication of the expression by the variable:

julia> (x-1)x
6

Neither juxtaposition of two parenthesized expressions, nor placing a variable before a parenthesized expression, however, can be used to imply multiplication:

julia> (x-1)(x+1)
type error: apply: expected Function, got Int64

julia> x(x+1)
type error: apply: expected Function, got Int64

Both of these expressions are interpreted as function application: any expression that is not a numeric literal, when immediately followed by a parenthetical, is interpreted as a function applied to the values in parentheses (see Functions for more about functions). Thus, in both of these cases, an error is caused since the left-hand value is not a function.

The above syntactic enhancements significantly reduce the visual noise incurred when writing common mathematical formulae. Note that no whitespace may come between a numeric literal coefficient and the identifier or parenthesized expression which it multiplies.
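
Where juxtaposition would be read as function application, writing the multiplication explicitly avoids the error. A short sketch using the same value of x as the examples above:

```julia
x = 3
println((x-1)*(x+1))   # 8: explicit * between two parenthesized expressions
println(x*(x+1))       # 12: explicit * between a variable and a parenthesized expression
println((x-1)x)        # 6: a parenthesized coefficient before a variable is allowed
println(2(x-1))        # 4: a numeric literal coefficient is allowed
```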

### Syntax Conflicts

Juxtaposed literal coefficient syntax conflicts with two numeric literal syntaxes: hexadecimal integer literals and scientific (E) notation for floating-point literals. Here are some situations where syntactic conflicts arise:

  • The hexadecimal integer literal expression 0xff could be interpreted as the numeric literal 0 multiplied by the variable xff.
  • The floating-point literal expression 1e10 could be interpreted as the numeric literal 1 multiplied by the variable e10, and similarly with the equivalent E form.

In both cases, we resolve the ambiguity in favor of interpretation as numeric literals:

  • Expressions starting with 0x are always hexadecimal literals.
  • Expressions starting with a numeric literal followed by e or E are always floating-point literals.
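
The two rules can be observed by defining variables whose names would collide with the literal forms. The variable names `xff` and `e10` below exist only for this illustration:

```julia
xff = 7
e10 = 3

println(0xff == 255)             # true: 0xff is the hexadecimal literal, not 0*xff
println(0*xff)                   # 0: the product has to be written explicitly
println(1e10 == 10000000000)     # true: 1e10 is the floating-point literal, not 1*e10
println(1*e10)                   # 3: the product has to be written explicitly
```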
# 4. Mathematical Operations

Julia provides a complete collection of basic arithmetic and bitwise operators across all of its numeric primitive types, as well as providing portable, efficient implementations of a comprehensive collection of standard mathematical functions.

## Arithmetic and Bitwise Operators

The following arithmetic operators are supported on all primitive numeric types:

  • +x — unary plus is the identity operation.
  • -x — unary minus maps values to their additive inverses.
  • x + y — binary plus performs addition.
  • x - y — binary minus performs subtraction.
  • x * y — times performs multiplication.
  • x / y — divide performs division.

The following bitwise operators are supported on all primitive integer types:

  • ~x — bitwise not.
  • x & y — bitwise and.
  • x | y — bitwise or.
  • x $ y — bitwise xor.
  • x >>> y — logical shift right.
  • x >> y — arithmetic shift right.
  • x << y — logical/arithmetic shift left.

Here are some simple examples using arithmetic operators:

julia> 1 + 2 + 3
6

julia> 1 - 2
-1

julia> 3*2/12
0.5

(By convention, we tend to space less tightly binding operators less tightly, but there are no syntactic constraints.)

Julia has a type promotion system that allows arithmetic operations on mixtures of argument types to "just work" naturally and automatically (see Conversion and Promotion for details of the promotion system):

julia> 1 + 2.5
3.5

julia> 0.5*12
6.0

julia> 3*2/12 + 1
1.5

The above expressions all promote to Float64. However, more nuanced promotions also work:

julia> uint8(12) - int8(15)
-3

julia> typeof(ans)
Int16

julia> uint8(12) - float32(1.5)
10.5

julia> typeof(ans)
Float32

Here are some examples with bitwise operators:

julia> ~123
-124

julia> ~uint32(123)
4294967172

julia> ~uint8(123)
132

julia> 123 & 234
106

julia> 123 | 234
251

julia> typeof(ans)
Int64

julia> uint8(123) | uint16(234)
251

julia> typeof(ans)
Uint16

julia> 123 $ 234
145

As a general rule of thumb, arguments are promoted to the smallest type that can accurately represent all of the arguments.

Every binary arithmetic and bitwise operator also has an updating version that assigns the result of the operation back into its left operand. For example, the updating form of + is the += operator. Writing x += 3 is equivalent to writing x = x + 3:

julia> x = 1
1

julia> x += 3
4

julia> x
4

The updating versions of all the binary arithmetic and bitwise operators are:

+=  -=  *=  /=  &=  |=  $=  >>>=  >>=  <<=
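
A short sketch chaining several updating operators; each line rebinds x to the result of the corresponding binary operation:

```julia
x = 12
x += 3     # x = x + 3   gives 15
x *= 2     # x = x * 2   gives 30
x -= 4     # x = x - 4   gives 26
x &= 15    # x = x & 15  gives 10 (binary 11010 & 01111 = 01010)
x <<= 2    # x = x << 2  gives 40
println(x) # 40
```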
## Numeric Comparisons

Standard comparison operations are defined for all the primitive numeric types:

  • == — equality.
  • != — inequality.
  • < — less than.
  • <= — less than or equal to.
  • > — greater than.
  • >= — greater than or equal to.

Here are some simple examples:

julia> 1 == 1
true

julia> 1 == 2
false

julia> 1 != 2
true

julia> 1 == 1.0
true

julia> 1 < 2
true

julia> 1.0 > 3
false

julia> 1 >= 1.0
true

julia> -1 <= 1
true

julia> -1 <= -1
true

julia> -1 <= -2
false

julia> 3 < -0.5
false

As is evident here, promotion also applies to comparisons: the comparisons are performed in whatever type the arguments are promoted to, which is generally the smallest type in which the values can be faithfully represented.

After promotion to a common type, integers are compared in the standard manner: by comparison of bits. Floating-point numbers are compared according to the IEEE 754 standard:

  • finite numbers are ordered in the usual manner
  • Inf is equal to itself and greater than everything else except NaN
  • -Inf is equal to itself and less than everything else except NaN
  • NaN is not equal to, less than, or greater than anything, including itself.

The last point is potentially surprising and thus worth noting:

julia> NaN == NaN
false

julia> NaN != NaN
true

julia> NaN < NaN
false

julia> NaN > NaN
false

For situations where one wants to compare floating-point values so that NaN equals NaN, such as hash key comparisons, the function isequal is also provided, which considers NaNs to be equal to each other:

julia> isequal(NaN,NaN)
true
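
Because NaN is the only floating-point value that is not equal to itself, self-inequality can serve as a NaN test. The helper name `is_nan` below is hypothetical, defined only for this sketch:

```julia
# A value differs from itself only when it is NaN.
is_nan(x) = x != x

println(is_nan(0/0))              # true
println(is_nan(1.0))              # false
println(is_nan(Inf))              # false
println(isequal(0/0, Inf - Inf))  # true: isequal treats NaNs as equal
```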

Unlike in most languages, with the notable exception of Python, comparisons in Julia can be arbitrarily chained:

julia> 1 < 2 <= 2 < 3 == 3 > 2 >= 1 == 1 < 3 != 5
true

Chaining comparisons is often quite convenient in numerical code. Only as many initial comparisons and their operand expressions as are necessary to determine the final truth value of the entire chain are evaluated. See Short-Circuit Evaluation for further discussion of this behavior.
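
A typical use of chaining is a range check written the way it would appear in a formula. The function name `in_half_open` is hypothetical:

```julia
# True when x lies in the half-open interval [lo, hi).
in_half_open(lo, x, hi) = lo <= x < hi

println(in_half_open(0, 0.5, 1))   # true
println(in_half_open(0, 1, 1))     # false: the upper bound is excluded
println(in_half_open(0, -2, 1))    # false: the first comparison already fails
```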

## Mathematical Functions

Julia provides a comprehensive collection of mathematical functions and operators. These mathematical operations are defined over as broad a class of numerical values as permit sensible definitions, including integers, floating-point numbers, rationals, and complexes, wherever such definitions make sense.

  • round(x) — round x to the nearest integer.
  • iround(x) — round x to the nearest integer, giving an integer-typed result.
  • floor(x) — round x towards -Inf.
  • ceil(x) — round x towards +Inf.
  • trunc(x) — round x towards zero.
  • itrunc(x) — round x towards zero, giving an integer-typed result.
  • div(x,y) — truncated division; quotient rounded towards zero.
  • fld(x,y) — floored division; quotient rounded towards -Inf.
  • rem(x,y) — remainder; satisfies x == div(x,y)*y + rem(x,y), implying that sign matches x.
  • mod(x,y) — modulus; satisfies x == fld(x,y)*y + mod(x,y), implying that sign matches y.
  • gcd(x,y...) — greatest common divisor of x, y... with sign matching x.
  • lcm(x,y...) — least common multiple of x, y... with sign matching x.
  • abs(x) — a non-negative value with the magnitude of x.
  • abs2(x) — the squared magnitude of x.
  • sign(x) — indicates the sign of x, returning -1, 0, or +1.
  • signbit(x) — indicates the sign bit of x, returning -1 or +1.
  • copysign(x,y) — a value with the magnitude of x and the sign of y.
  • sqrt(x) — the square root of x.
  • cbrt(x) — the cube root of x.
  • hypot(x,y) — accurate sqrt(x^2 + y^2) for all values of x and y.
  • pow(x,y) — x raised to the exponent y.
  • exp(x) — the natural exponential function at x.
  • expm1(x) — accurate exp(x)-1 for x near zero.
  • ldexp(x,n) — x*2^n computed efficiently for integral n.
  • log(x) — the natural logarithm of x.
  • log(b,x) — the base b logarithm of x.
  • log2(x) — the base 2 logarithm of x.
  • log10(x) — the base 10 logarithm of x.
  • log1p(x) — accurate log(1+x) for x near zero.
  • logb(x) — returns the binary exponent of x.
  • erf(x) — the error function at x.
  • erfc(x) — accurate 1-erf(x) for large x.
  • gamma(x) — the gamma function at x.
  • lgamma(x) — accurate log(gamma(x)) for large x.
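
The difference between the two division and remainder pairs in the list above only shows up when the operands have opposite signs. A small sketch with one negative operand, checking the identities stated for rem and mod:

```julia
x = -7
y = 2

println(div(x, y))   # -3: quotient rounded towards zero
println(fld(x, y))   # -4: quotient rounded towards -Inf
println(rem(x, y))   # -1: sign matches x
println(mod(x, y))   # 1: sign matches y

println(x == div(x, y)*y + rem(x, y))   # true
println(x == fld(x, y)*y + mod(x, y))   # true
```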

For an overview of why functions like hypot, expm1, log1p, and erfc are necessary and useful, see John D. Cook's excellent pair of blog posts on the subject: expm1, log1p, erfc, and hypot.
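
As a rough illustration of the problems these functions avoid, the sketch below compares the naive formulas with the dedicated functions. The specific input values are chosen only to trigger overflow and cancellation:

```julia
# Overflow: squaring the inputs exceeds the Float64 range, although the result would fit.
a = 3e200
b = 4e200
println(sqrt(a^2 + b^2))     # Inf
println(hypot(a, b) < Inf)   # true: hypot stays finite

# Cancellation: exp(x) is so close to 1 that the subtraction discards most digits.
x = 1e-10
println(exp(x) - 1 == expm1(x))   # false: the two results differ
```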

All the standard trigonometric functions are also defined:

sin    cos    tan    cot    sec    csc
sinh   cosh   tanh   coth   sech   csch
asin   acos   atan   acot   asec   acsc
acoth  asech  acsch  sinc   cosc   atan2

These are all single-argument functions, with the exception of atan2, which gives the angle in radians between the x-axis and the point specified by its arguments, interpreted as x and y coordinates.

For notational convenience, there are equivalent operator forms for the mod and pow functions:

  • x % y is equivalent to mod(x,y).
  • x ^ y is equivalent to pow(x,y).

Like arithmetic and bitwise operators, % and ^ also have updating forms. As with other operators, x %= y means x = x % y and x ^= y means x = x^y:

julia> x = 2; x ^= 5; x
32

julia> x = 7; x %= 4; x
3
# 5. Complex and Rational Numbers

Julia ships with predefined types representing both complex and rational numbers, and supports all the mathematical operations discussed in Mathematical Operations on them. Promotions are defined so that operations on any combination of predefined numeric types, whether primitive or composite, behave as expected.

## Complex Numbers

The global constant im is bound to the complex number i, representing one of the square roots of -1. It was deemed harmful to co-opt the name i for a global constant, as that would preclude its use as a variable anywhere, and generally cause confusion. Since Julia allows numeric literals to be juxtaposed with identifiers as coefficients, this binding suffices to provide convenient syntax for complex numbers, similar to the traditional mathematical notation:

julia> 1 + 2im
1 + 2im

You can perform all the standard arithmetic operations with complex numbers:

julia> (1 + 2im)*(2 - 3im)
8 + 1im

julia> (1 + 2im)/(1 - 2im)
-0.6 + 0.8im

julia> (1 + 2im) + (1 - 2im)
2 + 0im

julia> (-3 + 2im) - (5 - 1im)
-8 + 3im

julia> (-1 + 2im)^2
-3 - 4im

julia> (-1 + 2im)^2.5
2.7296244647840089 - 6.9606644595719001im

julia> (-1 + 2im)^(1 + 1im)
-0.2791038107582666 + 0.0870805341410243im

julia> 3(2 - 5im)
6 - 15im

julia> 3(2 - 5im)^2
-63 - 60im

julia> 3(2 - 5im)^-1
0.2068965517241379 + 0.5172413793103448im

The promotion mechanism ensures that combinations of operands of different types just work:

julia> 2(1 - 1im)
2 - 2im

julia> (2 + 3im) - 1
1 + 3im

julia> (1 + 2im) + 0.5
1.5 + 2.0im

julia> (2 + 3im) - 0.5im
2.0 + 2.5im

julia> 0.75(1 + 2im)
0.75 + 1.5im

julia> (2 + 3im) / 2
1.0 + 1.5im

julia> (1 - 3im) / (2 + 2im)
-0.5 - 1.0im

julia> 1 + 3/4im
1.0 + 0.75im

Note that 3/4im parses as 3/4*im, which, since division and multiplication have equal precedence and are left-associative, is equivalent to (3/4)*im rather than the quite different value, 3/(4*im) == -(3/4)*im.

Standard functions to manipulate complex values are provided:

julia> real(1 + 2im)
1

julia> imag(1 + 2im)
2

julia> conj(1 + 2im)
1 - 2im

julia> abs(1 + 2im)
2.23606797749978981

julia> abs2(1 + 2im)
5
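
The functions shown above are related by the usual identities. A small sketch with a value whose magnitude is a whole number:

```julia
z = complex(3, 4)

println(abs(z))                             # 5.0
println(abs2(z))                            # 25
println(abs2(z) == real(z)^2 + imag(z)^2)   # true
println(real(z * conj(z)))                  # 25: z times its conjugate is abs2(z)
```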

As is common, the absolute value of a complex number is its distance from zero. The abs2 function is of particular use for complex numbers, where it avoids taking a square root and can thus return a value of the same type as the real and imaginary parts of its argument. The full gamut of other mathematical functions is also defined for complex numbers:

julia> sqrt(im)
0.7071067811865476 + 0.7071067811865475im

julia> sqrt(1 + 2im)
1.272019649514069 + 0.7861513777574233im

julia> cos(1 + 2im)
2.0327230070196656 - 3.0518977991517997im

julia> exp(1 + 2im)
-1.1312043837568135 + 2.4717266720048188im

julia> sinh(1 + 2im)
-0.4890562590412937 + 1.4031192506220405im

Note that mathematical functions always return real values when applied to real numbers and complex values when applied to complex numbers. Thus, sqrt, for example, behaves differently when applied to -1 versus -1 + 0im even though -1 == -1 + 0im:

julia> sqrt(-1)
NaN

julia> sqrt(-1 + 0im)
0.0 + 1.0im

If you need to construct a complex number using variables, the literal numeric coefficient notation will not work, although explicitly writing the multiplication operation will:

julia> a = 1; b = 2; a + b*im
1 + 2im

Constructing complex numbers from variable values like this, however, is not recommended. Use the complex function to construct a complex value directly from its real and imaginary parts instead:

julia> complex(a,b)
1 + 2im

This construction is preferred for variable arguments because it is more efficient than the multiplication and addition construct, but also because for certain values of b unexpected results can occur:

julia> 1 + Inf*im
NaN + Inf*im

julia> 1 + NaN*im
NaN + NaN*im

These results are natural and unavoidable consequences of the interaction between the rules of complex multiplication and IEEE-754 floating-point arithmetic. Using the complex function to construct complex values directly, however, gives more intuitive results:

julia> complex(1,Inf)
complex(1.0,Inf)

julia> complex(1,NaN)
complex(1.0,NaN)

On the other hand, it can be argued that these values do not represent meaningful complex numbers, and are thus not appreciably different from the results gotten when multiplying explicitly by im.
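
As a minimal sketch of the recommendation above, the following defines a helper that builds a complex value from parts held in variables by calling complex rather than multiplying by im. The name from_polar is hypothetical and not part of Julia; it only assumes the standard cos and sin functions.

```julia
# Hypothetical helper: build a complex number from a magnitude r and an
# angle theta, using complex(re, im) rather than writing a + b*im.
from_polar(r, theta) = complex(r*cos(theta), r*sin(theta))

z = from_polar(2, 0)    # real part 2.0, imaginary part 0.0
```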

## Rational Numbers

Julia has a rational number type to represent exact ratios of integers. Rationals are constructed using the // operator:

julia> 2//3
2//3

If the numerator and denominator of a rational have common factors, they are reduced to lowest terms such that the denominator is non-negative:

julia> 6//9
2//3

julia> -4//8
-1//2

julia> 5//-15
-1//3

julia> -4//-12
1//3

This normalized form for a ratio of integers is unique, so equality of rational values can be tested by checking for equality of the numerator and denominator. The standardized numerator and denominator of a rational value can be extracted using the num and den functions:

julia> num(2//3)
2

julia> den(2//3)
3

Direct comparison of the numerator and denominator is generally not necessary, since the standard arithmetic and comparison operations are defined for rational values:

julia> 2//3 == 6//9
true

julia> 2//3 == 9//27
false

julia> 3//7 < 1//2
true

julia> 3//4 > 2//3
true

julia> 2//4 + 1//6
2//3

julia> 5//12 - 1//4
1//6

julia> 5//8 * 3//12
5//32

julia> 6//5 / 10//7
21//25
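
Because rational arithmetic is exact, a running sum of fractions never picks up rounding error. The sketch below illustrates this with a hypothetical function, harmonic (not part of Julia), that adds 1//1 + 1//2 + ... + 1//n using only the operations shown above.

```julia
# Hypothetical example: sum 1/1 + 1/2 + ... + 1/n exactly.
function harmonic(n)
  h = 0//1              # start from an exact rational zero
  for k = 1:n
    h = h + 1//k        # each partial sum is kept in lowest terms
  end
  h
end

harmonic(4)    # 25//12
```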

Rationals can be easily converted to floating-point numbers:

julia> float(3//4)
0.75

Conversion from rational to floating-point respects the following identity for any integral values of a and b:

julia> isequal(float(a//b), a/b)
true

This includes cases where a == 0 or b == 0, in which situations the conversion from rational value to floating-point produces the appropriate ±Inf or NaN value:

julia> 5//0
1//0

julia> float(ans)
Inf

julia> 0//0
0//0

julia> float(ans)
NaN

julia> -3//0
-1//0

julia> float(ans)
-Inf

In a sense, Julia's rational values are a convenient way of deferring the computation of integer ratios, thereby allowing exact canceling of common factors and avoiding accumulation of floating-point errors. Adherence to floating-point semantics implies that other than increased precision, most algorithms designed for floating-point arithmetic will work similarly for rationals.

As usual, the promotion system makes interactions with other numeric types natural and effortless:

julia> 3//5 + 1
8//5

julia> 3//5 - 0.5
0.1

julia> 2//7 * (1 + 2im)
2//7 + 4//7im

julia> 2//7 * (1.5 + 2im)
0.4285714285714285 + 0.5714285714285714im

julia> 3//2 / (1 + 2im)
3//10 - 3//5im

julia> 1//2 + 2im
1//2 + 2//1im

julia> 1 + 2//3im
1//1 + 2//3im

julia> 0.5 == 1//2
true

julia> 0.33 == 1//3
false

julia> 0.33 < 1//3
true

julia> 1//3 - 0.33
0.0033333333333333
# 6. Strings

Strings are finite sequences of characters. Of course, the real trouble comes when one asks what a character is. The characters that English speakers are familiar with are the letters A, B, C, etc., together with numerals and common punctuation symbols. These characters are standardized together with a mapping to integer values between 0 and 127 by the ASCII standard. There are, of course, many other characters used in non-English languages, including variants of the ASCII characters with accents and other modifications, related scripts such as Cyrillic and Greek, and scripts completely unrelated to ASCII and English, including Arabic, Chinese, Hebrew, Hindi, Japanese, and Korean. The Unicode standard tackles the complexities of what exactly a character is, and is generally accepted as the definitive standard addressing this problem. Depending on your needs, you can either ignore these complexities entirely and just pretend that only ASCII characters exist, or you can write code that can handle any of the characters or encodings that one may encounter when handling non-ASCII text. Julia makes dealing with plain ASCII text simple and efficient, and handling Unicode is as simple and efficient as possible. In particular, you can write C-style string code to process ASCII strings, and it will work as expected, both in terms of performance and semantics. If such code encounters non-ASCII text, it will gracefully fail with a clear error message, rather than silently introducing corrupt results. When this happens, modifying the code to handle non-ASCII data is straightforward and easy.

There are a few noteworthy high-level features about Julia's strings:

  • String is an abstraction, not a concrete type — many different representations can implement the String interface, but they can easily be used together and interact transparently. Any string type can be used in any function expecting a String.
  • Like C and Java, but unlike most dynamic languages, Julia has a first-class type representing a single character, called Char. This is just a special kind of 32-bit integer whose numeric value represents a Unicode code point.
  • As in Java, strings are immutable: the value of a String object cannot be changed. To construct a different string value, you construct a new string from parts of other strings.
  • Conceptually, a string is a partial function from indices to characters: for some index values, no character value is returned, and instead an exception is thrown. This allows efficient indexing into strings by the byte index of an encoded representation rather than by a character index, which cannot be implemented both efficiently and simply for variable-width encodings of Unicode strings.
  • Julia supports the full range of Unicode characters: literal strings are always ASCII or UTF-8 but other encodings for strings from external sources can be supported easily and efficiently.
## Characters

A Char value represents a single character: it is just a 32-bit integer with a special literal representation and appropriate arithmetic behaviors, whose numeric value is interpreted as a Unicode code point. Here is how Char values are input and shown:

julia> 'x'
'x'

julia> typeof(ans)
Char

You can convert a Char to its integer value, i.e. code point, easily:

julia> int('x')
120

julia> typeof(ans)
Int32

You can convert an integer value back to a Char just as easily:

julia> char(120)
'x'

Not all integer values are valid Unicode code points, but for performance, the char conversion does not check that every character value is valid. If you want to check that each converted value is a valid code point, use the safe_char conversion instead:

julia> char(0xd800)
'???'

julia> safe_char(0xd800)
invalid Unicode code point: U+d800

julia> char(0x110000)
'\U110000'

julia> safe_char(0x110000)
invalid Unicode code point: U+110000

As of this writing, the valid Unicode code points are U+00 through U+d7ff and U+e000 through U+10ffff. These have not all been assigned intelligible meanings yet, nor are they necessarily interpretable by applications, but all of these values are considered to be valid Unicode characters.

You can input any Unicode character in single quotes using \u followed by up to four hexadecimal digits or \U followed by up to eight hexadecimal digits (the longest valid value only requires six):

julia> '\u0'
'\0'

julia> '\u78'
'x'

julia> '\u2200'
'∀'

julia> '\U10ffff'
'\U10ffff'

Julia uses your system's locale and language settings to determine which characters can be printed as-is and which must be output using the generic, escaped \u or \U input forms. In addition to these Unicode escape forms, all of C's traditional escaped input forms can also be used:

julia> int('\0')
0

julia> int('\t')
9

julia> int('\n')
10

julia> int('\e')
27

julia> int('\x7f')
127

julia> int('\177')
127

julia> int('\xff')
255

As with any integers, you can do arithmetic and comparisons with Char values:

julia> 'x' - 'a'
23

julia> 'A' < 'a'
true

julia> 'A' <= 'a' <= 'Z'
false

julia> 'A' <= 'X' <= 'Z'
true

Arithmetic with Char values always yields integer values. To create a new Char value, explicit conversion back to the Char type is required:

julia> 'A' + 1
66

julia> char(ans)
'B'
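
The comparison and subtraction operations above are enough to write simple character classification code. The following is a sketch only; the helper names is_ascii_upper, is_ascii_digit, and digit_value are hypothetical and not part of Julia.

```julia
# Hypothetical helpers built only from Char comparison and subtraction.
is_ascii_upper(c) = 'A' <= c <= 'Z'
is_ascii_digit(c) = '0' <= c <= '9'

# Numeric value of a decimal digit character: the distance from '0'.
digit_value(c) = c - '0'

is_ascii_upper('X')    # true
is_ascii_upper('a')    # false
digit_value('7')       # 7
```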
## String Basics

Here a variable is initialized with a simple string literal:

julia> str = "Hello, world.\n"
"Hello, world.\n"

If you want to extract a character from a string, you index into it:

julia> str[1]
'H'

julia> str[6]
','

julia> str[end]
'\n'

All indexing in Julia is 1-based: the first element of any integer-indexed object is found at index 1, not index 0, and the last element is found at index n rather than n-1, when the string has a length of n.

In any indexing expression, the keyword, end, can be used as a shorthand for length(x), where x is the object being indexed into, whether it is a string, an array, or some other indexable object. You can perform arithmetic and other operations with end, just like a normal value:

julia> str[end-1]
'.'

julia> str[end/2]
' '

julia> str[end/3]
'o'

julia> str[end/4]
'l'

Using an index less than 1 or greater than end raises an error:

julia> str[0]
in next: arrayref: index out of range

julia> str[end+1]
in next: arrayref: index out of range

You can also extract a substring using range indexing:

julia> str[4:9]
"lo, wo"

Note the distinction between str[k] and str[k:k]:

julia> str[6]
','

julia> str[6:6]
","

The former is a single character value of type Char, while the latter is a string value that happens to contain only a single character. In Julia these are very different things.
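
Range indexing and end combine naturally. As a small sketch, assuming an ASCII string (so that every index is a valid character index), the hypothetical helper chop_last below drops the final character by taking the range from the first index to the one before end.

```julia
str = "Hello, world.\n"

# Hypothetical helper: everything except the last character.
# Assumes s is ASCII, so each index holds exactly one character.
chop_last(s) = s[1:end-1]

chop_last(str)    # "Hello, world."
```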

## Unicode and UTF-8

Julia fully supports Unicode characters and strings. As discussed above, in character literals, Unicode code points can be represented using Unicode \u and \U escape sequences, as well as all the standard C escape sequences. These can likewise be used to write string literals:

julia> s = "\u2200 x \u2203 y"
"∀ x ∃ y"

Whether these Unicode characters are displayed as escapes or shown as special characters depends on your terminal's locale settings and its support for Unicode. Non-ASCII string literals are encoded using the UTF-8 encoding. UTF-8 is a variable-width encoding, meaning that not all characters are encoded in the same number of bytes. In UTF-8, ASCII characters — i.e. those with code points less than 0x80 (128) — are encoded as they are in ASCII, using a single byte, while code points 0x80 and above are encoded using multiple bytes — up to four per character. This means that not every byte index into a UTF-8 string is necessarily a valid index for a character. If you index into a string at such an invalid byte index, an error is thrown:

julia> s[1]
'∀'

julia> s[2]
invalid UTF-8 character index

julia> s[3]
invalid UTF-8 character index

julia> s[4]
' '

In this case, the character is a three-byte character, so the indices 2 and 3 are invalid and the next character's index is 4.

Because of variable-length encodings, strlen(s) and length(s) are not always the same: strlen(s) gives the number of characters in s while length(s) gives the maximum valid byte index into s. If you iterate through the indices 1 through length(s) and index into s, the sequence of characters returned, when errors aren't thrown, is the sequence of characters comprising the string, s. Thus, we do have the inequality strlen(s) <= length(s), since each character in a string must have its own index. The following is an inefficient and verbose way to iterate through the characters of s:

julia> for i = 1:length(s)
         try
           println(s[i])
         catch
           # ignore the index error
         end
       end
∀
 
x
 
∃
 
y

The blank lines actually have spaces on them. Fortunately, the above awkward idiom is unnecessary for iterating through the characters in a string, since you can just use the string as an iterable object, no exception handling required:

julia> for c = s
         println(c)
       end
∀

x

∃

y
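
Iterating over the string itself is also a simple way to work per character without any byte-index arithmetic. As a sketch, the hypothetical function count_chars below (not part of Julia) counts characters this way; for the string s above it should agree with the character count rather than with the maximum byte index.

```julia
# Hypothetical helper: count characters by iterating over the string,
# so multi-byte characters are each visited exactly once.
function count_chars(s)
  n = 0
  for c = s
    n = n + 1
  end
  n
end

count_chars("\u2200 x \u2203 y")    # 7
```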

UTF-8 is not the only encoding that Julia supports, and adding support for new encodings is quite easy, but discussion of other encodings and how to implement support for them is beyond the scope of this document for the time being. For further discussion of UTF-8 encoding issues, see the section below on byte array literals, which goes into some greater detail.

## Interpolation

One of the most common and useful string operations is concatenation:

julia> greet = "Hello"
"Hello"

julia> whom = "world"
"world"

julia> strcat(greet, ", ", whom, ".\n")
"Hello, world.\n"

Constructing strings like this can become a bit cumbersome, however. To reduce the need for these verbose calls to strcat, Julia allows interpolation into string literals using $, as in Perl:

julia> "$greet, $whom.\n"
"Hello, world.\n"

This is more readable and convenient and equivalent to the above string concatenation — the system rewrites this apparent single string literal into a concatenation of string literals with variables.

The shortest complete expression after the $ is taken as the expression whose value is to be interpolated into the string. Thus, you can interpolate any expression into a string using parentheses:

julia> "1 + 2 = $(1 + 2)"
"1 + 2 = 3"

The expression need not be contained in parentheses, however. For example, since a literal array expression is not complete until the opening [ is closed by a matching ], you can interpolate an array like this:

julia> x = 2; y = 3; z = 5;

julia> "x,y,z: $[x,y,z]."
"x,y,z: [2,3,5]."

Both concatenation and string interpolation call the generic string function to convert objects into String form. Most non-String objects are converted to strings as they are shown in interactive sessions:

julia> v = [1,2,3]
[1,2,3]

julia> "v: $v"
"v: [1,2,3]"

The string function is the identity for String and Char values, so these are interpolated into strings as themselves, unquoted and unescaped:

julia> c = 'x'
'x'

julia> "hi, $c"
"hi, x"

To include a literal $ in a string literal, escape it with a backslash:

julia> print("I have \$100 in my account.\n")
I have $100 in my account.
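
Putting these rules together, the sketch below defines two hypothetical helpers (sum_line and price_tag are illustrative names, not part of Julia) that mix variable interpolation, a parenthesized expression, and an escaped dollar sign.

```julia
# $a and $b interpolate variables; $(a + b) interpolates an expression.
sum_line(a, b) = "$a + $b = $(a + b)"

# \$ produces a literal dollar sign, then $n interpolates the value.
price_tag(n) = "\$$n"

sum_line(2, 3)    # the text: 2 + 3 = 5
price_tag(100)    # the text: $100
```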
## Common Operations

You can lexicographically compare strings using the standard comparison operators:

julia> "abracadabra" < "xylophone"
true

julia> "abracadabra" == "xylophone"
false

julia> "Hello, world." != "Goodbye, world."
true

julia> "1 + 2 = 3" == "1 + 2 = $(1 + 2)"
true

You can search for the index of a particular character using the strchr function:

julia> strchr("xylophone", 'x')
1

julia> strchr("xylophone", 'p')
5

julia> strchr("xylophone", 'z')
char not found

You can start the search for a character at a given offset by providing a third argument:

julia> strchr("xylophone", 'o')
4

julia> strchr("xylophone", 'o', 5)
7

julia> strchr("xylophone", 'o', 8)
char not found

Another handy string function is repeat:

julia> repeat(".:Z:.", 10)
".:Z:..:Z:..:Z:..:Z:..:Z:..:Z:..:Z:..:Z:..:Z:..:Z:."

Some other useful functions include:

  • length(str) gives the maximal (byte) index that can be used to index into str.
  • strlen(str) gives the number of characters in str; this is not the same as length(str).
  • i = start(str) gives the first valid index at which a character can be found in str (typically 1).
  • c, j = next(str,i) returns the next character at or after the index i and the next valid character index following that. With start and length, it can be used to iterate through the characters in str.
  • c, j = prev(str,i) returns the character at or before index i and the index at which it occurs. With length and start, it can be used to iterate through the characters in str in reverse.
  • ind2chr(str,i) gives the number of characters in str up to and including any at index i.
  • chr2ind(str,j) gives the index at which the jth character in str occurs.
## Non-Standard String Literals

There are situations when you want to construct a string or use string semantics, but the behavior of the standard string construct is not quite what is needed. For these kinds of situations, Julia provides non-standard string literals. A non-standard string literal looks like a regular double-quoted string literal, but is immediately prefixed by an identifier, and doesn't behave quite like a normal string literal.

Two types of interpretation are performed on normal Julia string literals: interpolation and unescaping (escaping is the act of expressing a non-standard character with a sequence like \n, whereas unescaping is the process of interpreting such escape sequences as actual characters). There are cases where it's convenient to disable either or both of these behaviors. For such situations, Julia provides three types of non-standard string literals:

  • E"..." interpret escape sequences but do not interpolate, thereby rendering $ a harmless, normal character.
  • I"..." perform interpolation but do not interpret escape sequences specially.
  • L"..." perform neither unescaping nor interpolation.

Suppose, for example, you would like to write strings that will contain many $ characters without interpolation. You can, as described above, escape the $ characters with a preceding backslash. This can become tedious, however. Non-standard string literals prefixed with E do not perform string interpolation:

julia> E"I have $100 in my account.\n"
"I have \$100 in my account.\n"

This allows you to have $ characters inside of string literals without triggering interpolation and without needing to escape those $s by preceding them with a \. Escape sequences, such as the \n above, still behave as usual, so '\n' becomes a newline character.

On the other hand, I"..." string literals perform interpolation but no unescaping:

julia> I"I have $100 in my account.\n"
"I have 100 in my account.\\n"

The value of the expression 100 is interpolated into the string, yielding the decimal string representation of the value 100 — namely "100" (sorry, that might be a bit confusing). The trailing \n sequence is taken as literal backslash and n characters, rather than being interpreted as a single newline character.

The third non-standard string form interprets all the characters between the opening and closing quotes literally: the L"..." form. Here is an example usage:

julia> L"I have $100 in my account.\n"
"I have \$100 in my account.\\n"

Neither the $ nor the \n sequence is specially interpreted.

### Byte Array Literals

Some string literal forms don't create strings at all. In the next section, we will see that regular expressions are written as non-standard string literals. Another useful non-standard string literal, however, is the byte-array string literal: b"...". This form lets you use string notation to express literal byte arrays — i.e. arrays of Uint8 values. The convention is that non-standard literals with uppercase prefixes produce actual string objects, while those with lowercase prefixes produce non-string objects like byte arrays or compiled regular expressions. The rules for byte array literals are the following:

  • ASCII characters and ASCII escapes produce a single byte.
  • \x and octal escape sequences produce the byte corresponding to the escape value.
  • Unicode escape sequences produce a sequence of bytes encoding that code point in UTF-8.

There is some overlap between these rules since the behavior of \x and octal escapes less than 0x80 (128) is covered by both of the first two rules, but here these rules agree. Together, these rules allow one to easily use ASCII characters, arbitrary byte values, and UTF-8 sequences to produce arrays of bytes. Here is an example using all three:

julia> b"DATA\xff\u2200"
[68,65,84,65,255,226,136,128]

The ASCII string "DATA" corresponds to the bytes 68, 65, 84, 65. \xff produces the single byte 255. The Unicode escape \u2200 is encoded in UTF-8 as the three bytes 226, 136, 128. Note that the resulting byte array does not correspond to a valid UTF-8 string — if you try to use this as a regular string literal, you will get a syntax error:

julia> "DATA\xff\u2200"
syntax error: invalid UTF-8 sequence

Also observe the significant distinction between \xff and \uff: the former escape sequence encodes the byte 255, whereas the latter escape sequence represents the code point 255, which is encoded as two bytes in UTF-8:

julia> b"\xff"
[255]

julia> b"\uff"
[195,191]

In character literals, this distinction is glossed over and \xff is allowed to represent the code point 255, because characters always represent code points. In strings, however, \x escapes always represent bytes, not code points, whereas \u and \U escapes always represent code points, which are encoded in one or more bytes. For code points less than \u80, it happens that the UTF-8 encoding of each code point is just the single byte produced by the corresponding \x escape, so the distinction can safely be ignored. For the escapes \x80 through \xff as compared to \u80 through \uff, however, there is a major difference: the former escapes all encode single bytes, which (unless followed by very specific continuation bytes) do not form valid UTF-8 data, whereas the latter escapes all represent Unicode code points with two-byte encodings.

If this is all extremely confusing, try reading "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets". It's an excellent introduction to Unicode and UTF-8, and may help alleviate some confusion regarding the matter.

In byte array literals, objects interpolate as their binary representation rather than as their string representation:

julia> msg = "Hello."
"Hello."

julia> len = uint16(length(msg))
6

julia> b"$len$msg"
[6,0,72,101,108,108,111,46]

Here the first two bytes are the native (little-endian on x86) binary representation of the length of the string "Hello.", encoded as an unsigned 16-bit integer, while the following bytes are the ASCII bytes of the string "Hello." itself.

## Regular Expressions

Julia has Perl-compatible regular expressions, as provided by the PCRE library. Regular expressions are related to strings in two ways: the obvious connection is that regular expressions are used to find regular patterns in strings; the other connection is that regular expressions are themselves input as strings, which are parsed into a state machine that can be used to efficiently search for patterns in strings. In Julia, regular expressions are input using non-standard string literals prefixed with various identifiers beginning with r. The most basic regular expression literal without any options turned on just uses r"...":

julia> r"^\s*(?:#|$)"
r"^\s*(?:#|$)"

julia> typeof(ans)
Regex

To check if a regex matches a string, use the matches function:

julia> matches(r"^\s*(?:#|$)", "not a comment")
false

julia> matches(r"^\s*(?:#|$)", "# a comment")
true

As one can see here, matches simply returns true or false, indicating whether the given regex matches the string or not. Commonly, however, one wants to know not just whether a string matched, but also how it matched. To capture this information about a match, use the match function instead:

julia> match(r"^\s*(?:#|$)", "not a comment")

julia> match(r"^\s*(?:#|$)", "# a comment")
RegexMatch("#")

If the regular expression does not match the given string, match returns nothing, a special value that does not print anything at the interactive prompt. Other than not printing, it is a completely normal value and you can test for it programmatically:

m = match(r"^\s*(?:#|$)", line)
if m == nothing
  println("not a comment")
else
  println("blank or comment")
end

If a regular expression does match, the value returned by match is a RegexMatch object. These objects record how the expression matches, including the substring that the pattern matches and any captured substrings, if there are any. This example only captures the portion of the substring that matches, but perhaps we want to capture any non-blank text after the comment character. We could do the following:

julia> m = match(r"^\s*(?:#\s*(.*?)\s*$|$)", "# a comment ")
RegexMatch("# a comment ", 1="a comment")    

You can extract the following info from a RegexMatch object:

  • the entire substring matched: m.match
  • the captured substrings as a tuple of strings: m.captures
  • the offset at which the whole match begins: m.offset
  • the offsets of the captured substrings as a vector: m.offsets

When a capture doesn't match, m.captures contains nothing in that position instead of a substring, and m.offsets has a zero offset (recall that indices in Julia are 1-based, so a zero offset into a string is invalid). Here is a pair of somewhat contrived examples:

julia> m = match(r"(a|b)(c)?(d)", "acd")
RegexMatch("acd", 1="a", 2="c", 3="d")

julia> m.match
"acd"

julia> m.captures
("a","c","d")

julia> m.offset
1

julia> m.offsets
[1,2,3]

julia> m = match(r"(a|b)(c)?(d)", "ad")
RegexMatch("ad", 1="a", 2=nothing, 3="d")

julia> m.match
"ad"

julia> m.captures
("a",nothing,"d")

julia> m.offset
1

julia> m.offsets
[1,0,2]
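
The pieces above (match, the nothing test, and m.captures) can be combined into a small function. This is a sketch under stated assumptions: comment_text is a hypothetical name, and the pattern is a simplified variant of the one used earlier that only matches lines containing a comment character.

```julia
# Hypothetical helper: return the text after a leading '#', with
# surrounding whitespace removed, or nothing for non-comment lines.
function comment_text(line)
  m = match(r"^\s*#\s*(.*?)\s*$", line)
  if m == nothing
    return nothing        # the pattern did not match at all
  end
  return m.captures[1]    # the first (and only) captured substring
end

comment_text("# a comment ")     # "a comment"
comment_text("not a comment")    # nothing
```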

It is convenient to have captures returned as a tuple so that one can use tuple destructuring syntax to bind them to local variables:

julia> first, second, third = m.captures
("a",nothing,"d")

julia> first
"a"

A collection of variant regular expression string literals indicate various combinations of the i, m, and s flags used to indicate case-insensitivity, multiline matching, and single-line matching, described in the perlre manpage as follows:

i   Do case-insensitive pattern matching.

    If locale matching rules are in effect, the case map is taken
    from the current locale for code points less than 255, and
    from Unicode rules for larger code points. However, matches
    that would cross the Unicode rules/non-Unicode rules boundary
    (ords 255/256) will not succeed.

m   Treat string as multiple lines.  That is, change "^" and "$"
    from matching the start or end of the string to matching the
    start or end of any line anywhere within the string.

s   Treat string as single line.  That is, change "." to match any
    character whatsoever, even a newline, which normally it would
    not match.

    Used together, as /ms, they let the "." match any character
    whatsoever, while still allowing "^" and "$" to match,
    respectively, just after and just before newlines within the
    string.

For example, the following regex has all three flags turned on:

julia> rims"a+.*b+.*?d$"
rims"a+.*b+.*?d$"

julia> match(rims"a+.*b+.*?d$","Goodbye,\nOh, angry,\nBad world\n")
RegexMatch("angry,\nBad world")
# 7. Functions

In Julia, a function is an object that maps a tuple of argument values to a return value. Julia functions are not pure mathematical functions, in the sense that functions can alter and be affected by the global state of the program. The basic syntax for defining functions in Julia is:

function f(x,y)
  x + y
end

This syntax is similar to MATLAB®, but there are some significant differences:

  • In MATLAB®, this definition must be saved in a file, named f.m, whereas in Julia, this expression can appear anywhere, including in an interactive session.
  • In MATLAB®, the closing end is optional, being implied by the end of the file. In Julia, the terminating end is required.
  • In MATLAB®, this function would print the value x + y but would not return any value, whereas in Julia, the last expression evaluated is a function's return value.
  • Expression values are never printed automatically except in interactive sessions. Semicolons are only required to separate expressions on the same line.

In general, while the function definition syntax is reminiscent of MATLAB®, the similarity is largely superficial. Therefore, rather than continually comparing the two, in what follows, we will simply describe the behavior of functions in Julia directly.

There is a second, more terse syntax for defining a function in Julia. The traditional function declaration syntax demonstrated above is equivalent to the following compact "assignment form":

f(x,y) = x + y

In the assignment form, the body of the function must be a single expression, although it can be a compound expression (see Compound Expressions). Short, simple function definitions are common in Julia. The short function syntax is accordingly quite idiomatic, considerably reducing both typing and visual noise.

A function is called using the traditional parenthesis syntax:

julia> f(2,3)
5

Without parentheses, the expression f refers to the function object, and can be passed around like any value:

julia> g = f;

julia> g(2,3)
5

There are two other ways that functions can be applied: using special operator syntax for certain function names (see Operators Are Functions below), or with the apply function:

julia> apply(f,2,3)
5

The apply function applies its first argument — a function object — to its remaining arguments.

## The "return" Keyword

The value returned by a function is the value of the last expression evaluated, which, by default, is the last expression in the body of the function definition. In the example function, f, from the previous section this is the value of the expression x + y. As in C and most other imperative or functional languages, the return keyword causes a function to return immediately, providing an expression whose value is returned:

function g(x,y)
  return x * y
  x + y
end

Since function definitions can be entered into interactive sessions, it is easy to compare these definitions:

f(x,y) = x + y

function g(x,y)
  return x * y
  x + y
end

julia> f(2,3)
5

julia> g(2,3)
6

Of course, in a purely linear function body like g, the usage of return is pointless since the expression x + y is never evaluated and we could simply make x * y the last expression in the function and omit the return. In conjunction with other control flow, however, return is of real use. Here, for example, is a function that computes the hypotenuse length of a right triangle with sides of length x and y, avoiding overflow:

function hypot(x,y)
  x = abs(x)
  y = abs(y)
  if x > y
    r = y/x
    return x*sqrt(1+r*r)
  end
  if y == 0
    return zero(x)
  end
  r = x/y
  return y*sqrt(1+r*r)
end

There are three possible points of return from this function, returning the values of three different expressions, depending on the values of x and y. The return on the last line could be omitted since it is the last expression.
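
Another common use of return is to leave a loop, and the enclosing function, as soon as an answer is known. The sketch below uses a hypothetical function, first_negative, to illustrate this; when no early return happens, the value of the last expression (here nothing) is returned.

```julia
# Hypothetical example: return the first negative value in v,
# or nothing if there is none.
function first_negative(v)
  for x = v
    if x < 0
      return x    # leave the function as soon as a match is found
    end
  end
  nothing         # reached only if no element was negative
end

first_negative([3, -1, -5])    # -1
first_negative([1, 2])         # nothing
```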

## Operators Are Functions

In Julia, most operators are just functions with support for special syntax. The exceptions are operators with special evaluation semantics like && and ||. These operators cannot be functions since short-circuit evaluation (see Short Circuit Evaluation) requires that their operands are not evaluated before evaluation of the operator. All other operators, such as +, can be applied using parenthesized argument lists, just as you would any other function:

julia> 1 + 2 + 3
6

julia> +(1,2,3)
6

The infix form is exactly equivalent to the function application form — in fact the former is parsed to produce the function call internally. This also means that you can assign and pass around operators such as + and * just like you would with other function values:

julia> f = +;

julia> f(1,2,3)
6

Under the name f, the function does not support infix notation, however.
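
Since operators are ordinary function values, they can be handed to other functions. As a sketch, the hypothetical function fold below (an illustrative name, not a claim about the standard library) combines the elements of a collection using whatever two-argument function it is given.

```julia
# Hypothetical helper: combine values left to right with op,
# starting from init.
function fold(op, init, values)
  acc = init
  for v = values
    acc = op(acc, v)
  end
  acc
end

fold(+, 0, [1,2,3,4])    # 10
fold(*, 1, [1,2,3,4])    # 24
```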

## Anonymous Functions

Functions in Julia are first-class objects: they can be assigned to variables and called using the standard function call syntax from the variable they have been assigned to. They can be used as arguments, and they can be returned as values. They can also be created anonymously, without giving them a name:

julia> x -> x^2 + 2x - 1
#<function>

This creates an unnamed function taking one argument and returning the value of the polynomial x^2 + 2x - 1 at that value. The primary use for anonymous functions is passing them to functions which take other functions as arguments. A classic example is the map function, which applies a function to each value of an array and returns a new array containing the resulting values:

julia> map(round, [1.2,3.5,1.7])
[1.0,4.0,2.0]

This is fine if a named function effecting the transform one wants already exists to pass as the first argument to map. Often, however, a ready-to-use, named function does not exist. In these situations, the anonymous function construct allows easy creation of a single-use function object without needing a name:

julia> map(x -> x^2 + 2x - 1, [1,3,-1])
[2,14,-2]
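
Anonymous functions are also a convenient way to return a function as a value. As a small sketch, assuming nothing beyond the syntax already shown, the hypothetical `make_adder` below builds and returns a new anonymous function each time it is called:

```julia
# make_adder is a hypothetical name used only for this illustration.
# It returns an anonymous function that adds n to its argument.
make_adder(n) = x -> x + n

add3 = make_adder(3)
seven = add3(4)          # calls the returned function: 4 + 3

# The returned function can itself be passed to map.
shifted = map(make_adder(10), [1, 2, 3])
```

Each returned function remembers the value of `n` it was created with, so `add3` and `make_adder(10)` behave differently even though they come from the same definition.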

## Multiple Return Values

In Julia, one returns a tuple of values to simulate returning multiple values. However, tuples can be created and destructured without needing parentheses, thereby providing an illusion that multiple values are being returned, rather than a single tuple value. For example, the following function returns a pair of values:

function foo(a,b)
  a+b, a*b
end

If you call it in an interactive session without assigning the return value anywhere, you will see the tuple returned:

julia> foo(2,3)
(5,6)

A typical usage of such a pair of return values, however, extracts each value into a variable. Julia supports simple tuple "destructuring" that facilitates this:

julia> x, y = foo(2,3);

julia> x
5

julia> y
6

You can also return multiple values via an explicit usage of the return keyword:

function foo(a,b)
  return a+b, a*b
end

This has the exact same effect as the previous definition of foo.
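
As a further sketch of creating and destructuring tuples without parentheses, the hypothetical helper `ordered` below returns its two arguments as a pair with the smaller one first. The variable names are chosen only for this example:

```julia
# ordered is a hypothetical helper: it returns its two arguments
# as a pair with the smaller one first.
ordered(a, b) = a <= b ? (a, b) : (b, a)

lo, hi = ordered(5, 2)   # destructure the returned tuple

# The same mechanism swaps two variables without a temporary.
p = 1; q = 2
p, q = q, p
```

After this runs, `lo` is 2 and `hi` is 5, while `p` and `q` have exchanged their values.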

## Varargs Functions

It is often convenient to be able to write functions taking an arbitrary number of arguments. Such functions are traditionally known as "varargs" functions, which is short for "variable number of arguments". You can define a varargs function by following the last argument with an ellipsis:

bar(a,b,x...) = (a,b,x)

The variables a and b are bound to the first two argument values as usual, and the variable x is bound to an iterable collection of the zero or more values passed to bar after its first two arguments:

julia> bar(1,2)
(1,2,())

julia> bar(1,2,3)
(1,2,(3,))

julia> bar(1,2,3,4)
(1,2,(3,4))

julia> bar(1,2,3,4,5,6)
(1,2,(3,4,5,6))

In all these cases, x is bound to a tuple of the trailing values passed to bar.

On the flip side, it is often handy to "splice" the values contained in an iterable collection into a function call as individual arguments. To do this, one also uses ... but in the function call instead:

julia> x = (3,4)
(3,4)

julia> bar(1,2,x...)
(1,2,(3,4))

In this case a tuple of values is spliced into a varargs call precisely where the variable number of arguments go. This need not be the case, however:

julia> x = (2,3,4)
(2,3,4)

julia> bar(1,x...)
(1,2,(3,4))

julia> x = (1,2,3,4)
(1,2,3,4)

julia> bar(x...)
(1,2,(3,4))

Furthermore, the iterable object spliced into a function call need not be a tuple:

julia> x = [3,4]
[3,4]

julia> bar(1,2,x...)
(1,2,(3,4))

julia> x = [1,2,3,4]
[1,2,3,4]

julia> bar(x...)
(1,2,(3,4))

Also, the function that arguments are spliced into need not be a varargs function (although it often is):

baz(a,b) = a + b

julia> args = [1,2]
[1,2]

julia> baz(args...)
3

julia> args = [1,2,3]
[1,2,3]

julia> baz(args...)
no method baz(Int64,Int64,Int64)

As you can see, if the wrong number of elements are in the spliced container, then the function call will fail, just as it would if too many arguments were given explicitly.
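
Putting both uses of ... together, here is a minimal sketch of a varargs function that is called both with explicit arguments and with a spliced array. The name `total` is hypothetical and chosen only for this example:

```julia
# total is a hypothetical helper. It requires at least one argument
# and collects any further arguments into the tuple rest.
function total(x, rest...)
  s = x
  for r = rest
    s = s + r
  end
  s
end

t1 = total(1)            # rest is the empty tuple
t2 = total(1, 2, 3)      # rest is (2,3)

nums = [4, 5, 6]
t3 = total(nums...)      # splices the array into three arguments
```

Requiring one fixed argument before the ellipsis means that calling `total` with no arguments at all fails with a "no method" error instead of silently returning some arbitrary default.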

## Further Reading

We should mention here that this is far from a complete picture of defining functions. Julia has a sophisticated type system and allows multiple dispatch on argument types. None of the examples given here provide any type annotations on their arguments, meaning that they are applicable to all types of arguments. The type system is described in Types and defining a function in terms of methods chosen by multiple dispatch on run-time argument types is described in Methods.

# 8. Control Flow

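Julia provides a variety of control flow constructs:

  • Compound Expressions: begin and (;).
  • Conditional Evaluation: if-elseif-else and ?: (ternary operator).
  • Short-Circuit Evaluation: &&, || and chained comparisons.
  • Repeated Evaluation: Loops: while and for.
  • Exception Handling: try-catch, error and throw.
  • Tasks (aka Coroutines): produce and consume.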

The first five control flow mechanisms are standard to high-level programming languages. Tasks are not so standard: they provide non-local control flow, making it possible to switch between temporarily-suspended computations. This is a powerful construct: both exception handling and cooperative multitasking are implemented in Julia using tasks. Everyday programming requires no direct usage of tasks, but certain problems can be solved much more easily by using tasks.

## Compound Expressions

Sometimes it is convenient to have a single expression which evaluates several subexpressions in order, returning the value of the last subexpression as its value. There are two Julia constructs that accomplish this: begin blocks and (;) chains. The value of both compound expression constructs is that of the last subexpression. Here's an example of a begin block:

julia> z = begin
         x = 1
         y = 2
         x + y
       end
3

Since these are fairly small, simple expressions, they could easily be placed onto a single line, which is where the (;) chain syntax comes in handy:

julia> z = (x = 1; y = 2; x + y)
3

This syntax is particularly useful with the terse single-line function definition form introduced in Functions. Although it is typical, there is no requirement that begin blocks be multiline or that (;) chains be single-line:

julia> begin x = 1; y = 2; x + y end
3

julia> (x = 1;
        y = 2;
        x + y)
3
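
As a sketch of how a (;) chain combines with the single-line function definition form, the hypothetical function `hyp` below computes an intermediate value before producing its result:

```julia
# hyp is a hypothetical name. The (;) chain lets a one-line definition
# compute an intermediate value before producing its result.
hyp(a, b) = (s = a^2 + b^2; sqrt(s))

h = hyp(3, 4)
```

The value of the chain, and therefore of the function call, is that of its last subexpression, `sqrt(s)`.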

## Conditional Evaluation

Conditional evaluation allows portions of code to be evaluated or not evaluated depending on the value of another expression. Here is the anatomy of the if-elseif-else conditional syntax:

if x < y
  println("x is less than y")
elseif x > y
  println("x is greater than y")
else
  println("x is equal to y")
end

The semantics are just what you'd expect: if the condition expression x < y is true, then the corresponding block is evaluated; otherwise the condition expression x > y is evaluated, and if it is true, the corresponding block is evaluated; if neither expression is true, the else block is evaluated. Here it is in action:

julia> function test(x, y)
         if x < y
           println("x is less than y")
         elseif x > y
           println("x is greater than y")
         else
           println("x is equal to y")
         end
       end

julia> test(1, 2)
x is less than y

julia> test(2, 1)
x is greater than y

julia> test(1, 1)
x is equal to y

The elseif and else blocks are optional, and as many elseif blocks as desired can be used. The condition expressions in the if-elseif-else construct are evaluated until the first one evaluates to true, after which the associated block is evaluated, and no further condition expressions or blocks are evaluated.

Unlike C, MATLAB®, Perl, Python, and Ruby — but like Java, and a few other stricter, typed languages — it is an error if the value of a conditional expression is anything but true or false:

julia> if 1
         println("true")
       end
type error: lambda: in if, expected Bool, got Int64

This error indicates that the conditional was of the wrong type: Int64 rather than the required Bool.
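
The usual remedy is to write the comparison out explicitly so that the condition is a Bool. A minimal sketch, in which `describe` is a hypothetical helper:

```julia
# describe is a hypothetical helper. The test n != 0 yields a Bool,
# which is what the if construct requires; writing `if n` would fail.
function describe(n)
  if n != 0
    "nonzero"
  else
    "zero"
  end
end

a = describe(1)
b = describe(0)
```

Here `a` is "nonzero" and `b` is "zero".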

The so-called "ternary operator", ?:, is closely related to the if-elseif-else syntax, but is used where a conditional choice between single expression values is required, as opposed to conditional execution of longer blocks of code. It gets its name from being the only operator in most languages taking three operands:

a ? b : c

The expression a, before the ?, is a condition expression, and the ternary operation evaluates the expression b, before the :, if the condition a is true or the expression c, after the :, if it is false.

The easiest way to understand this behavior is to see an example. In the previous example, the println call is shared by all three branches: the only real choice is which literal string to print. This could be written more concisely using the ternary operator. For the sake of clarity, let's try a two-way version first:

julia> x = 1; y = 2;

julia> println(x < y ? "less than" : "not less than")
less than

julia> x = 1; y = 0;

julia> println(x < y ? "less than" : "not less than")
not less than

If the expression x < y is true, the entire ternary operator expression evaluates to the string "less than" and otherwise it evaluates to the string "not less than". The original three-way example requires chaining multiple uses of the ternary operator together:

julia> test(x, y) = println(x < y ? "x is less than y"    :
                            x > y ? "x is greater than y" : "x is equal to y")

julia> test(1, 2)
x is less than y

julia> test(2, 1)
x is greater than y

julia> test(1, 1)
x is equal to y

To facilitate chaining, the operator associates from right to left.

It is significant that like if-elseif-else, the expressions before and after the : are only evaluated if the condition expression evaluates to true or false, respectively:

v(x) = (println(x); x)

julia> 1 < 2 ? v("yes") : v("no")
yes
"yes"

julia> 1 > 2 ? v("yes") : v("no")
no
"no"

## Short-Circuit Evaluation

Short-circuit evaluation is quite similar to conditional evaluation. The behavior is found in most imperative programming languages having the && and || boolean operators: in a series of boolean expressions connected by these operators, only the minimum number of expressions are evaluated as are necessary to determine the final boolean value of the entire chain. Explicitly, this means that:

  • In the expression a && b, the subexpression b is only evaluated if a evaluates to true.
  • In the expression a || b, the subexpression b is only evaluated if a evaluates to false.

The reasoning is that a && b must be false if a is false, regardless of the value of b, and likewise, the value of a || b must be true if a is true, regardless of the value of b. Both && and || associate to the right, but && has higher precedence than || does. It's easy to experiment with this behavior:

t(x) = (println(x); true)
f(x) = (println(x); false)

julia> t(1) && t(2)
1
2
true

julia> t(1) && f(2)
1
2
false

julia> f(1) && t(2)
1
false

julia> f(1) && f(2)
1
false

julia> t(1) || t(2)
1
true

julia> t(1) || f(2)
1
true

julia> f(1) || t(2)
1
2
true

julia> f(1) || f(2)
1
2
false

You can easily experiment in the same way with the associativity and precedence of various combinations of && and || operators.
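
A common practical use of short-circuit evaluation is to guard an expression that would otherwise raise an error. The sketch below relies on integer division by zero being an error; `positive_quotient` is a hypothetical helper, not a built-in:

```julia
# positive_quotient is a hypothetical helper. div(a, b) raises an error
# when b is zero, but && only evaluates it once b != 0 is known to be true.
positive_quotient(a, b) = b != 0 && div(a, b) > 0

r1 = positive_quotient(6, 3)   # div(6, 3) is evaluated
r2 = positive_quotient(6, 0)   # div(6, 0) is never evaluated
```

Here `r1` is true and `r2` is false. No error occurs in the second call because the right-hand operand is skipped.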

If you want to perform boolean operations without short-circuit evaluation behavior, you can use the bitwise boolean operators introduced in Mathematical Operations: & and |. These are normal functions, which happen to support infix operator syntax, but always evaluate their arguments:

julia> f(1) & t(2)
1
2
false

julia> t(1) | t(2)
1
2
true

Just like condition expressions used in if, elseif or the ternary operator, the operands of && or || must be boolean values (true or false). Using a non-boolean value is an error:

julia> 1 && 2
type error: lambda: in if, expected Bool, got Int64

Chained numeric comparisons also exhibit short-circuit evaluation behavior. Recall from Numeric Comparisons that numeric comparisons in Julia can be arbitrarily chained:

julia> a = 1; b = 2; c = 3;

julia> a < b <= c
true

This last expression is equivalent to the less concise:

julia> a < b && b <= c
true

In particular, this implies that chained comparisons exhibit short-circuit behavior. Let's see this in action:

v(x) = (println(x); x)

julia> v(1) > v(2) <= v(3)
2
1
false

The first two expressions in a comparison chain are always evaluated, since their values are required to check the first comparison, which must be checked. Subsequent expressions, however, need not be evaluated if a comparison earlier in the chain is false. Note that the middle expression is only evaluated once, rather than twice as it would be if the expression were written as v(1) > v(2) && v(2) <= v(3). However, the order of evaluations in a chained comparison is undefined. It is strongly recommended not to use expressions with side effects (such as printing) in chained comparisons. If side effects are required, the short-circuit && operator should be used explicitly.

## Repeated Evaluation: Loops

There are two constructs for repeated evaluation of expressions: the while loop and the for loop. Here is an example of a while loop:

julia> i = 1;

julia> while i <= 5
         println(i)
         i += 1
       end
1
2
3
4
5

The while loop evaluates the condition expression (i <= 5 in this case), and as long as it remains true, keeps evaluating the body of the while loop. If the condition expression is false when the while loop is first reached, the body is never evaluated.

The for loop makes common repeated evaluation idioms easier to write. Since counting up and down like the above while loop does is so common, it can be expressed more concisely with a for loop:

julia> for i = 1:5
         println(i)
       end
1
2
3
4
5

Here the 1:5 is a Range object, representing the sequence of numbers 1, 2, 3, 4, 5. The for loop iterates through these values, assigning each one in turn to the variable i. One rather important distinction between the previous while loop form and the for loop form is the scope during which the variable is visible. If the variable i has not been introduced in another scope, then in the for loop form it is visible only inside the for loop, and not afterwards. You'll either need a new interactive session instance or a different variable name to test this:

julia> for j = 1:5
         println(j)
       end
1
2
3
4
5

julia> j
j not defined

See Variables and Scoping for a detailed explanation of variable scope and how it works in Julia.

In general, the for loop construct can iterate over all sorts of containers:

julia> for i = [1,4,0]
         println(i)
       end
1
4
0

julia> for s = ["foo","bar","baz"]
         println(s)
       end
foo
bar
baz

Various types of iterable containers will be introduced and discussed in later sections.

It is sometimes convenient to terminate the repetition of a while loop before the test condition is falsified, or to stop iterating in a for loop before the end of the iterable object is reached. This can be accomplished with the break keyword:

julia> i = 1;

julia> while true
         println(i) 
         if i >= 5 
           break
         end
         i += 1
       end
1
2
3
4
5

julia> for i = 1:1000
         println(i)               
         if i >= 5
           break
         end
       end
1
2
3
4
5

The above while loop would never terminate on its own, and the for loop would iterate up to 1000. These loops are both exited early by using the break keyword.

In other circumstances, it is handy to be able to stop an iteration and move on to the next one immediately. The continue keyword accomplishes this:

julia> for i = 1:10
         if i % 3 != 0
           continue
         end
         println(i)
       end
3
6
9

This is a somewhat contrived example since we could produce the same behavior more clearly by negating the condition and placing the println call inside the if block. In realistic usage there is more code to be evaluated after the continue, and often there are multiple points from which one calls continue.
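
As a slightly more realistic sketch, the hypothetical function `sum_odds` below uses both keywords in one loop: continue skips the even numbers, and break stops the loop once a running total passes a limit. The names and the task itself are assumptions made for illustration:

```julia
# sum_odds is a hypothetical helper. It adds up the odd numbers in 1:n,
# stopping early as soon as the running total exceeds limit.
function sum_odds(n, limit)
  total = 0
  for i = 1:n
    if i % 2 == 0
      continue         # skip even numbers
    end
    total += i
    if total > limit
      break            # stop iterating altogether
    end
  end
  total
end

s1 = sum_odds(10, 100)   # 1 + 3 + 5 + 7 + 9, limit never reached
s2 = sum_odds(10, 8)     # stops after 1 + 3 + 5
```

Here `s1` is 25 and `s2` is 9. The loop is placed inside a function so that `total` is an ordinary local variable shared by the function body and the loop.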

## Exception Handling

When an unexpected condition occurs, a function may be unable to return a reasonable value to its caller. In such cases, it may be best for the exceptional condition to either terminate the program, printing a diagnostic error message, or if the programmer has provided code to handle such exceptional circumstances, allow that code to take the appropriate action.

The error function is used to indicate that an unexpected condition has occurred which should interrupt the normal flow of control. The built-in sqrt function returns NaN if applied to a negative real value:

julia> sqrt(-1)
NaN

Suppose we want to stop execution immediately if the square root of a negative number is taken. To do this, we can define a fussy version of the sqrt function that raises an error if its argument is negative:

fussy_sqrt(x) = x >= 0 ? sqrt(x) : error("negative x not allowed")

julia> fussy_sqrt(2)
1.4142135623730951

julia> fussy_sqrt(-1)
negative x not allowed

If fussy_sqrt is called with a negative value from another function, instead of trying to continue execution of the calling function, it returns immediately, displaying the error message in the interactive session:

function verbose_fussy_sqrt(x)
  println("before fussy_sqrt")
  r = fussy_sqrt(x)
  println("after fussy_sqrt")
  return r
end

julia> verbose_fussy_sqrt(2)
before fussy_sqrt
after fussy_sqrt
1.4142135623730951

julia> verbose_fussy_sqrt(-1)
before fussy_sqrt
negative x not allowed

Now suppose we want to handle this circumstance rather than just giving up with an error. To catch an error, you use the try and catch keywords. Here is a rather contrived example that computes the square root of the absolute value of x by handling the error raised by fussy_sqrt:

function sqrt_abs(x)
  try
    fussy_sqrt(x)
  catch
    fussy_sqrt(-x)
  end
end

julia> sqrt_abs(2)
1.4142135623730951

julia> sqrt_abs(-2)
1.4142135623730951

Of course, it would be far simpler and more efficient to just return sqrt(abs(x)). However, this demonstrates how try and catch operate: the try block is executed initially, and if no exceptions are thrown during its execution, the value of the entire construct is the value of the last expression in the try block. If an exception is thrown during the evaluation of the try block, however, execution of the try code ceases immediately and the catch block is evaluated instead. If the catch block succeeds without incident (it can in turn raise an exception, which would unwind the call stack further), the value of the entire try-catch construct is that of the last expression in the catch block.
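
Because a try-catch construct has a value, it can serve directly as the result of a function. The following sketch relies on integer division by zero raising an error; `div_or_zero` is a hypothetical helper:

```julia
# div_or_zero is a hypothetical helper. Its value is that of the try
# block when div succeeds, and that of the catch block otherwise.
function div_or_zero(a, b)
  try
    div(a, b)
  catch
    0
  end
end

d1 = div_or_zero(7, 2)   # no exception: value of the try block
d2 = div_or_zero(1, 0)   # exception: value of the catch block
```

Here `d1` is 3 and `d2` is 0. Note that this form of catch swallows every exception raised inside the try block, not only division by zero.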

### Throw versus Error

The error function is convenient for indicating that an error has occurred, but it is built on a more fundamental function: throw. Perhaps throw should be introduced first, but typical usage calls for error, so we have deferred the introduction of throw. Above, we use a form of the try-catch expression in which no value is captured by the catch block, but there is another form:

try
  # execute some code
catch x
  # do something with x
end

In this form, if the built-in throw function is called by the "execute some code" expression, or any callee thereof, the catch block is executed with the argument of the throw function bound to the variable x. The error function is simply a convenience which always throws an instance of a subtype of the abstract type Exception. Here we can see that the object thrown when a divide-by-zero error occurs is of type DivideByZeroError:

julia> div(1,0)
error: integer divide by zero

julia> try 
         div(1,0)
       catch x
         println(typeof(x))
       end
DivideByZeroError

DivideByZeroError is a concrete subtype of Exception, thrown to indicate that some illegal division by zero has occurred in a function whose return type is integral, thereby making it impossible to return a reasonable value. Floating-point functions, on the other hand, can simply return NaN rather than throwing an exception.

Unlike error, which should only be used to indicate an unexpected condition, throw is merely a control construct, and can be used to pass any value back to an enclosing try-catch:

julia> try
         throw("Hello, world.")
       catch x
         println(x)
       end
Hello, world.

This example is very contrived, of course: the power of the try-catch construct lies in the ability to unwind a deeply nested computation immediately to a much higher level in the stack of calling functions. There are situations where no error has occurred, but the ability to unwind the stack and pass a value to a higher level is desirable. These are the circumstances in which throw should be used rather than error. On the other hand, when something unexpected has occurred in a program's execution, error is the more appropriate way to let a caller handle the problem or terminate the program with an error message.
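
As a sketch of using throw purely as a control construct, the hypothetical function below leaves two nested loops at once by throwing the value it was searching for. The function name and the tuple-of-tuples input are assumptions made for this example:

```julia
# find_first_negative is a hypothetical helper. It scans a tuple of
# tuples and uses throw to leave both loops as soon as a negative
# value is found.
function find_first_negative(rows)
  try
    for row = rows
      for x = row
        if x < 0
          throw(x)     # unwind out of both loops with the value
        end
      end
    end
    "none"             # reached only if nothing was thrown
  catch found
    found
  end
end

r1 = find_first_negative(((1, 2), (3, -4, -5)))
r2 = find_first_negative(((1,), (2, 3)))
```

Here `r1` is -4 and `r2` is "none". One caveat of this design is that the catch block also receives any genuine error raised inside the try block, so code written this way should keep the guarded region small.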

## Tasks (aka Coroutines)

Tasks are a control flow feature that allows computations to be suspended and resumed in a flexible manner. This feature is sometimes called by other names, such as symmetric coroutines, lightweight threads, cooperative multitasking, or one-shot continuations.

When a piece of computing work (in practice, executing a particular function) is designated as a Task, it becomes possible to interrupt it by switching to another Task. The original Task can later be resumed, at which point it will pick up right where it left off. At first, this may seem similar to a function call. However there are two key differences. First, switching tasks does not use any space, so any number of task switches can occur without consuming the call stack. Second, you may switch among tasks in any order, unlike function calls, where the called function must finish executing before control returns to the calling function.

This kind of control flow can make it much easier to solve certain problems. In some problems, the various pieces of required work are not naturally related by function calls; there is no obvious "caller" or "callee" among the jobs that need to be done. An example is the producer-consumer problem, where one complex procedure is generating values and another complex procedure is consuming them. The consumer cannot simply call a producer function to get a value, because the producer may have more values to generate and so might not yet be ready to return. With tasks, the producer and consumer can both run as long as they need to, passing values back and forth as necessary.

Julia provides the functions produce and consume for solving this problem. A producer is a function that calls produce on each value it needs to produce:

function producer()
  produce("start")
  for n=1:5
    produce(2n)
  end
  produce("stop")
end

To consume values, first the producer is wrapped in a Task, then consume is called repeatedly on that object:

julia> p = Task(producer)
Task

julia> consume(p)
"start"

julia> consume(p)
2

julia> consume(p)
4

julia> consume(p)
6

julia> consume(p)
8

julia> consume(p)
10

julia> consume(p)
"stop"

One way to think of this behavior is that producer was able to return multiple times. Between calls to produce, the producer's execution is suspended and the consumer has control.

A Task can be used as an iterable object in a for loop, in which case the loop variable takes on all the produced values:

julia> for x = Task(producer)
         println(x)
       end
start
2
4
6
8
10
stop

# 9. Variables and Scoping

Until now, we have simply used variables without any explanation. Julia's usage of variables closely resembles that of other dynamic languages, so we have hopefully gotten away with this liberty. In what follows, however, we address this oversight and provide details of how variables are used, declared, and scoped in Julia.

The scope of a variable is the region of code within which a variable is visible. Variable scoping helps avoid variable naming conflicts. The concept is intuitive: two functions can both have arguments called x without the two x's referring to the same thing. Similarly there are many other cases where different blocks of code can use the same name without referring to the same thing. The rules for when the same variable name does or doesn't refer to the same thing are called scope rules; this section spells them out in detail.

Certain constructs in the language introduce scope blocks, which are regions of code that are eligible to be the scope of some set of variables. The scope of a variable cannot be an arbitrary set of source lines, but will always line up with one of these blocks. The constructs introducing such blocks are:

  • function bodies (either syntax)
  • while loops
  • for loops
  • try blocks
  • catch blocks
  • let blocks
  • type blocks.

Notably missing from this list are begin blocks, which do not introduce a new scope block.

Certain constructs introduce new variables into the current innermost scope. When a variable is introduced into a scope, it is also inherited by all inner scopes unless one of those inner scopes explicitly overrides it. These constructs which introduce new variables into the current scope are as follows:

  • A declaration local x introduces a new local variable.
  • A declaration global x makes x in the current scope and inner scopes refer to the global variable of that name.
  • A function's arguments are introduced as new local variables into the function's body scope.
  • An assignment x = y introduces a new local variable x only if x is neither declared global nor explicitly introduced as local by any enclosing scope, before or after the current line of code.

In the following example, there is only one x assigned both inside and outside a loop:

function foo(n)
    x = 0
    for i = 1:n
        x = x + 1
    end
    x
end

julia> foo(10)
10

In the next example, the loop has a separate x and the function always returns zero:

function foo(n)
    x = 0
    for i = 1:n
        local x
        x = i
    end
    x
end

julia> foo(10)
0

In this example, an x exists only inside the loop, and the function encounters an undefined variable error on its last line (unless there is a global variable x):

function foo(n)
    for i = 1:n
        x = i
    end
    x
end

julia> foo(10)
in foo: x not defined

A variable that is not assigned to or otherwise introduced locally defaults to global, so this function would return the value of the global x if there is such a variable, or produce an error if no such global exists. As a consequence, the only way to assign to a global variable inside a non-top-level scope is to explicitly declare the variable as global within some scope, since otherwise the assignment would introduce a new local rather than assigning to the global. This rule works out well in practice, since the vast majority of variables assigned inside functions are intended to be local variables, and using global variables should be the exception rather than the rule, especially assigning new values to them.
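
The difference an explicit global declaration makes can be sketched as follows. The names `counter`, `bump` and `no_bump` are hypothetical and chosen only for this illustration:

```julia
counter = 0

# bump declares counter as global, so the assignment updates the
# global variable.
function bump()
  global counter
  counter = counter + 1
end

# no_bump has no declaration, so its assignment introduces a new
# local variable and leaves the global untouched.
function no_bump()
  counter = 100
end

bump()
bump()
no_bump()
```

After these calls the global `counter` is 2: the two calls to `bump` each incremented it, and the assignment inside `no_bump` affected only that function's own local variable.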

One last example shows that an outer assignment introducing x need not come before an inner usage:

function foo(n)
    f = y -> n + x + y
    x = 1
    f(2)
end

julia> foo(10)
13

This last example may seem slightly odd for a normal variable, but allows for named functions — which are just normal variables holding function objects — to be used before they are defined. This allows functions to be defined in whatever order is intuitive and convenient, rather than forcing bottom up ordering or requiring forward declarations, both of which one typically sees in C programs. As an example, here is an inefficient, mutually recursive way to test if positive integers are even or odd:

even(n) = n == 0 ? true  :  odd(n-1)
odd(n)  = n == 0 ? false : even(n-1)

julia> even(3)
false

julia> odd(3)
true

Julia provides built-in, efficient functions to test this called iseven and isodd so the above definitions should only be taken as examples.

Since functions can be used before they are defined, as long as they are defined by the time they are actually called, no syntax for forward declarations is necessary, and definitions can be ordered arbitrarily.

At the interactive prompt, variable scope works the same way as anywhere else. The prompt behaves as if there is a scope block wrapped around everything you type, except that this scope block is identified with the global scope. This is especially apparent in the case of assignments:

julia> for i = 1:1; y = 10; end

julia> y
y not defined

julia> y = 0
0

julia> for i = 1:1; y = 10; end

julia> y
10

In the former case, y only exists inside of the for loop. In the latter case, an outer y has been introduced and so is inherited within the loop. Due to the special identification of the prompt's scope block with the global scope, it is not necessary to declare global y inside the loop. However, in code not entered into the interactive prompt this declaration would be necessary in order to modify a global variable.

The let statement provides a different way to introduce variables. Unlike assignments to local variables, let statements allocate new variable bindings each time they run. An assignment modifies an existing value location, and let creates new locations. This difference is usually not important, and is only detectable in the case of variables that outlive their scope via closures. let has the following syntax:

let var1 = value1,
    var2,
    var3 = value3
    code
end

let accepts a comma-separated series of assignments and variable names. Unlike local variable assignments, the assignments do not occur in order. Rather, all assignment right-hand sides are evaluated in the scope outside the let, then the let variables are assigned "simultaneously". In this way, let operates like a function call. Indeed, the following code:

let a = b, c = d
    body
end

is equivalent to ((a,c)->body)(b, d). Therefore it makes sense to write something like let x=x ....
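
A minimal sketch of the let x=x idiom, with a single binding and hypothetical variable names: the right-hand side is evaluated using the outer x, and the new x exists only inside the block.

```julia
x = 1

y = let x = x + 10   # right-hand side uses the outer x, which is 1
  2x                 # inside the block, x is the new binding, 11
end
```

After this runs, `y` is 22 while the outer `x` is still 1, since the let block only ever assigned to its own binding.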

Here is a case where the behavior of let is needed:

Fs = cell(2)
for i=1:2
    Fs[i] = ()->i
end

Fs[1]()  =>  2
Fs[2]()  =>  2

Here we create and store two closures that return variable i. However, it is always the same variable i, so the two closures behave identically. We can use let to create a new binding for i:

Fs = cell(2)
for i=1:2
    let i=i
        Fs[i] = ()->i
    end
end

Fs[1]()  =>  1
Fs[2]()  =>  2

Since the begin construct does not introduce a new scope block, it can be useful to use the zero-argument let to just introduce a new scope block without creating any new bindings:

julia> begin
         local x = 1
         begin
           local x = 2
         end
         x
       end
syntax error: local x declared twice

julia> begin
         local x = 1
         let  
           local x = 2
         end
         x
       end
1

The first example is illegal because you cannot declare the same variable as local in the same scope twice. The second example is legal since the let introduces a new scope block, so the inner local x is a different variable than the outer local x.

# 10. Types

Type systems have traditionally fallen into two quite different camps: static type systems, where every value must have a type computable before the execution of a program, and dynamic type systems, where nothing is known about types of values flowing through a piece of code until run time, when the values themselves are available. Object orientation allows some flexibility in statically typed languages by letting code be written without the precise types of values being known at compile time. The ability to write code that can operate on different types is called polymorphism. All code in classic dynamically typed languages is polymorphic: only by explicitly checking types, or when objects fail to support operations at run-time, are the types of any values ever restricted.

Julia's type system is dynamic, but gains some of the advantages of static type systems by making it possible to indicate that certain values are of specific types. This can be of great assistance in generating efficient code, but even more significantly, it allows method dispatch on the types of function arguments to be deeply integrated with the language. Method dispatch is explored in detail in Methods, but is rooted in the type system presented here.

The default behavior in Julia when types are omitted is to allow values to be of any type. Thus, one can write many useful Julia programs without ever explicitly using types. When additional expressiveness is needed, however, it is easy to gradually introduce explicit type annotations into previously "untyped" code. Doing so will typically increase both the performance and robustness of these systems, and perhaps somewhat counterintuitively, often significantly simplify them.

Describing Julia in the lingo of type systems, it is: dynamic, nominative, parametric and dependent. Generic types can be parameterized by other types and by integers, and the hierarchical relationships between types are explicitly declared, rather than implied by compatible structure. One particularly distinctive feature of Julia's type system is that concrete types may not subtype each other: all concrete types are final and may only have abstract types as their supertypes. While this might at first seem unduly restrictive, it has many beneficial consequences with surprisingly few drawbacks. It turns out that being able to inherit behavior is much more important than being able to inherit structure, and inheriting both causes significant difficulties in traditional object-oriented languages. Other high-level aspects of Julia's type system that should be mentioned up front are:

  • There is no division between object and non-object values: all values in Julia are true objects having a type that belongs to a single, fully connected type graph, all nodes of which are equally first-class as types.
  • There is no meaningful concept of a "compile-time type": the only type a value has is its actual type when the program is running. This is called a "run-time type" in object-oriented languages where the combination of static compilation with polymorphism makes this distinction significant.
  • Only values, not variables, have types — variables are simply names bound to values.
  • Both abstract and concrete types can be parameterized by other types and by integers. Type parameters may be completely omitted when they do not need to be explicitly referenced or restricted.

Julia's type system is designed to be powerful and expressive, yet clear, intuitive and unobtrusive. Many Julia programmers may never feel the need to write code that explicitly uses types. Some kinds of programming, however, become clearer, simpler, faster and more robust with declared types.


A Note On Capitalization.

There is no semantic significance to capitalization of names in Julia, unlike, for example, Ruby, where identifiers beginning with an uppercase letter (including type names) are constants. By convention, however, the first letter of each word in a Julia type name begins with a capital letter and underscores are not used to separate words. Variables, on the other hand, are conventionally given lowercase names, with word separation indicated by underscores ("_"). In numerical code it is not uncommon to use single-letter uppercase variable names, especially for matrices. Since types rarely have single-letter names, this does not generally cause confusion, although type parameter placeholders (see below) also typically use single-letter uppercase names like T or S.

## Type Declarations

The :: operator can be used to attach type annotations to expressions and variables in programs. There are two primary reasons to do this:

  1. As an assertion to help confirm that your program works the way you expect.
  2. To provide extra type information to the compiler, which can then improve performance in many cases.

The :: operator is read as "is an instance of" and can be used anywhere to assert that the value of the expression on the left is an instance of the type on the right. When the type on the right is concrete, the value on the left must have that type as its implementation (recall that all concrete types are final, so no implementation is a subtype of any other). When the type is abstract, it suffices for the value to be implemented by a concrete type that is a subtype of the abstract type. If the type assertion is not true, an exception is thrown; otherwise, the left-hand value is returned:

julia> (1+2)::Float
type error: typeassert: expected Float, got Int64

julia> (1+2)::Int
3

This allows a type assertion to be attached to any expression in place.

When attached to a variable, the :: operator means something a bit different: it declares the variable to always have the specified type, like a type declaration in a statically-typed language such as C. Every value assigned to the variable will be converted to the declared type using the convert function:

julia> function foo()
         x::Int8 = 1000
         x
       end

julia> foo()
-24

julia> typeof(ans)
Int8

This feature is useful for avoiding performance "gotchas" that could occur if one of the assignments to a variable changed its type unexpectedly.

## Abstract Types

Abstract types cannot be instantiated, and serve only as nodes in the type graph, thereby describing sets of related concrete types: those concrete types which are their descendants. We begin with abstract types even though they have no instantiation because they are the backbone of the type system: they form the conceptual hierarchy which makes Julia's type system more than just a collection of object implementations. (A bare collection of object implementations is precisely what C's type system provides, so such a system is by no means useless or absurd.)

Recall that in Integers and Floating-Point Numbers, we introduced a variety of concrete types of numeric values: Int8, Uint8, Int16, Uint16, Int32, Uint32, Int64, Uint64, Float32, and Float64. These are all bits types, which we will discuss in the next section. Although they have different representation sizes, Int8, Int16, Int32 and Int64 all have in common that they are signed integer types. Likewise Uint8, Uint16, Uint32 and Uint64 are all unsigned integer types, while Float32 and Float64 are distinct in being floating-point types rather than integers. It is common for a piece of code to make sense, for example, only if its arguments are some kind of integer, but not really depend on what particular kind of integer, as long as the appropriate low-level implementations of integer operations are used. For example, the greatest common divisor algorithm works for all kinds of integers, but will not work for floating-point numbers. Abstract types allow the construction of a hierarchy of types, providing a context into which concrete types can fit. This allows you, for example, to easily program to any type that is an integer, without restricting an algorithm to a specific type of integer.

Abstract types are declared using the abstract keyword. The general syntaxes for declaring an abstract type are:

abstract «name»
abstract «name» <: «supertype» 

The abstract keyword introduces a new abstract type, whose name is given by «name». This name can be optionally followed by <: and an already-existing type, indicating that the newly declared abstract type is a subtype of this "parent" type.

When no supertype is given, the default supertype is Any — a predefined abstract type that all objects are instances of and all types are subtypes of. In type theory, Any is commonly called "top" because it is at the apex of the type graph. Whenever a supertype is not specified for any kind of newly created type, the default supertype is Any. Julia also has a predefined abstract "bottom" type, at the nadir of the type graph, which is called None. It is the exact opposite of Any: no object is an instance of None and all types are supertypes of None.

As a specific example, let's consider a subset of the abstract types that make up Julia's numerical hierarchy:

abstract Number
abstract Real  <: Number
abstract Int   <: Real
abstract Uint  <: Int
abstract Float <: Real

The Number type is a direct child type of Any, and Real is its child. In turn, Real has two children (it has more, but only two are shown here; we'll get to the others later): Int and Float, separating the world into representations of integers and representations of real numbers. Representations of real numbers include, of course, floating-point types, but also include other types, such as Julia's rationals. Hence, Float is a proper subtype of Real, including only floating-point representations of real numbers. Uint is in turn declared as a child of Int, so unsigned integer types are treated as a special case of integer types.

The <: operator in general means "is a subtype of", and, used in declarations like this, declares the right-hand type to be an immediate supertype of the newly declared type. It can also be used in expressions as a subtype operator which returns true when its left operand is a subtype of its right operand:

julia> Int <: Number
true

julia> Int <: Float
false

Since abstract types have no instantiations and serve as no more than nodes in the type graph, there is not much more to say about them until we introduce parametric abstract types later on in Parametric Types.
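As a small sketch of how declared hierarchies behave, the following uses a made-up set of abstract types (Shape, Polygon, Triangle and Curve are hypothetical names, not part of Julia) and checks the relationships with the <: operator described above. The comments state the results one would expect from the rules given in this section:

```julia
# Hypothetical hierarchy for illustration; none of these types exist in Julia.
abstract Shape
abstract Polygon  <: Shape
abstract Triangle <: Polygon
abstract Curve    <: Shape

Triangle <: Polygon   # true: declared directly
Triangle <: Shape     # true: subtyping follows the chain of declarations
Triangle <: Any       # true: every type is a subtype of Any
Triangle <: Curve     # false: Curve is on a different branch
Shape <: Polygon      # false: a supertype is not a subtype of its child
```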

## Bits Types

A bits type is a concrete type whose data consists of plain old bits. Classic examples of bits types are integers and floating-point values. Unlike most languages, Julia lets you declare your own bits types, rather than providing only a fixed set of built-in bits types. In fact, the standard bits types are all defined in the language itself:

bitstype 8 Bool

bitstype 8  Int8  <: Int
bitstype 16 Int16 <: Int
bitstype 32 Int32 <: Int
bitstype 64 Int64 <: Int

bitstype 8  Uint8  <: Uint
bitstype 16 Uint16 <: Uint
bitstype 32 Uint32 <: Uint
bitstype 64 Uint64 <: Uint
bitstype 32 Char   <: Uint

bitstype 32 Float32 <: Float
bitstype 64 Float64 <: Float

The general syntaxes for declaring a bits type are:

bitstype «bits» «name»
bitstype «bits» «name» <: «supertype»

The number of bits indicates how much storage the type requires and the name gives the new type a name. A bits type can optionally be declared to be a subtype of some supertype. If a supertype is omitted, then the type defaults to having Any as its immediate supertype. The declaration of Bool above therefore means that a boolean value takes eight bits to store, and has Any as its immediate supertype. (Note that Bool values are not considered to be numeric, unlike in many languages; performing arithmetic operations on boolean values will produce errors.) Currently, only sizes of 8, 16, 32, and 64 bits are supported. Therefore, boolean values, although they really need just a single bit, cannot be declared to be any smaller than eight bits.

The types Bool, Int8 and Uint8 all have identical representations: they are eight-bit chunks of memory. Since Julia's type system is nominative, however, they are not interchangeable despite having identical structure. Another fundamental difference between them is that they have different supertypes: Bool's direct supertype is Any, Int8's is Int, and Uint8's is Uint. All other differences between Bool, Int8, and Uint8 are matters of behavior — the way functions are defined to act when given objects of these types as arguments. This is why a nominative type system is necessary: if structure determined type, which in turn dictates behavior, it would be impossible to make Bool behave any differently than Int8 or Uint8.
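The same point can be sketched with user-declared bits types. Here Temperature, Celsius and Kelvin are hypothetical names chosen for illustration. The two bits types have the same 32-bit layout but remain distinct because the type system is nominative. The typeof and super functions used here are described under Operations on Types:

```julia
# Hypothetical types for illustration; they are not defined by Julia itself.
abstract Temperature
bitstype 32 Celsius <: Temperature
bitstype 32 Kelvin  <: Temperature

Celsius <: Temperature   # true
Kelvin  <: Temperature   # true
Celsius <: Kelvin        # false, despite the identical 32-bit layout
typeof(Celsius)          # BitsKind
super(Celsius)           # Temperature
```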

## Composite Types

Composite types are called records, structures ("structs" in C), or objects in various languages. A composite type is a collection of named fields, an instance of which can be treated as a single value. In many languages, composite types are the only kind of user-definable type, and they are by far the most commonly used user-defined type in Julia as well. In mainstream object-oriented languages, such as C++, Java, Python and Ruby, composite types also have named functions associated with them, and the combination is called an "object". In purer object-oriented languages, such as Python and Ruby, all values are objects whether they are composites or not. In less pure object-oriented languages, including C++ and Java, some values, such as integers and floating-point values, are not objects, while instances of user-defined composite types are true objects with associated methods. In Julia, all values are objects, as in Python and Ruby, but functions are not bundled with the objects they operate on since Julia chooses which method of a function to use by multiple dispatch, meaning that the types of all of a function's arguments are considered when selecting a method, rather than just the first one (see Methods for more information on methods and dispatch). Thus, it would be inappropriate for functions to "belong" to only their first argument.

Since composite types are the most common form of user-defined concrete type, they are simply introduced with the type keyword followed by a block of field names, optionally annotated with types using the :: operator:

type Foo
  bar
  baz::Int
  qux::Float64
end

Fields with no type annotation default to Any, and can accordingly hold any type of value.

New objects of the newly created composite type Foo are created by applying the Foo type object like a function to values for its fields, in order:

julia> foo = Foo("Hello, world.", 23, 1.5)
Foo("Hello, world.",23,1.5)

julia> typeof(foo)
Foo

Since the bar field is unconstrained in type, any value will do; the value for baz can be any kind of Int and qux must be a Float64.

If you apply the Foo type constructor to values that do not match the declared types (Any,Int,Float64), Julia will try to convert the given values to the expected types using the convert generic function (see Conversion and Promotion):

julia> Foo((), 23.5, 1)
Foo((),23,1.0)

Here the integer value 1 is converted to the floating-point value 1.0 while the floating-point value 23.5 is truncated to the integer value 23. If no conversion method exists, it will complain:

julia> Foo((), 0, "fail")
no method convert(Type{Float64},ASCIIString)

You can access the field values of a composite object using the traditional foo.bar notation:

julia> foo.bar
"Hello, world."

julia> foo.baz
23

julia> foo.qux
1.5

You can also change the values as one would expect:

julia> foo.qux = 2
2.0

julia> foo.bar = 1//2
1//2

There is much more to say about how instances of composite types are created, but that discussion depends on both Parametric Types and on Methods, and is sufficiently important to be addressed in its own section: Constructors.

## Type Unions

A type union is a special abstract type which includes as objects all instances of any of its argument types, constructed using the special Union function:

julia> IntOrString = Union(Int,String)
Union(Int,String)

julia> 1 :: IntOrString
1

julia> "Hello!" :: IntOrString
"Hello!"

julia> 1.0 :: IntOrString
type error: typeassert: expected Union(Int,String), got Float64

Many languages' compilers have an internal union construct for reasoning about types; Julia simply exposes it to the programmer. The union of no types is the "bottom" type, None:

julia> Union()
None

Recall from the discussion above that None is the abstract type which is the subtype of all other types, and which no object is an instance of. Since a zero-argument Union call has no argument types for objects to be instances of, it should produce a type which no objects are instances of, i.e. None.

## Tuple Types

Tuples are an abstraction of the arguments of a function — without the function itself. The salient aspects of a function's arguments are their order and their types. The type of a tuple of values is the tuple of types of values:

julia> typeof((1,"foo",2.5))
(Int64,ASCIIString,Float64)

Accordingly, a tuple of types can be used anywhere a type is expected:

julia> (1,"foo",2.5) :: (Int64,String,Any)
(1,"foo",2.5)

julia> (1,"foo",2.5) :: (Int64,String,Float32)
type error: typeassert: expected (Int64,String,Float32), got (Int64,ASCIIString,Float64)

If one of the components of the tuple is not a type, however, you will get an error:

julia> (1,"foo",2.5) :: (Int64,String,3)
type error: typeassert: expected Type{T}, got (BitsKind,AbstractKind,Int64)

Note that the empty tuple () is its own type:

julia> typeof(())
()

## Parametric Types

An important and powerful feature of Julia's type system is that it is parametric: types can take parameters, so that type declarations actually introduce a whole family of new types — one for each possible combination of parameter values. There are many languages that support some version of generic programming, wherein data structures and algorithms to manipulate them may be specified without specifying the exact types involved. For example, some form of generic programming exists in ML, Haskell, Ada, Eiffel, C++, Java, C#, F#, and Scala, just to name a few. Some of these languages support true parametric polymorphism (e.g. ML, Haskell, Scala), while others support ad-hoc, template-based styles of generic programming (e.g. C++, Java). With so many different varieties of generic programming and parametric types in various languages, we won't even attempt to compare Julia's parametric types to other languages, but will instead focus on explaining Julia's system in its own right. We will note, however, that because Julia is a dynamically typed language and doesn't need to make all type decisions at compile time, many traditionally intractable difficulties encountered in static parametric type systems can be relatively easily handled.

The only kinds of types that are declared are abstract types, bits types, and composite types. All such types can be parameterized, with the same syntax in each case. We will discuss them in the following order: first, parametric composite types, then parametric abstract types, and finally parametric bits types.

### Parametric Composite Types

Type parameters are introduced immediately after the type name, surrounded by curly braces:

type Point{T}
  x::T
  y::T
end

This declaration defines a new parametric type, Point{T}, holding two "coordinates" of type T. What, one may ask, is T? Well, that's precisely the point of parametric types: it can be any type at all (or an integer, actually, although here it's clearly used as a type). Point{Float64} is a concrete type equivalent to the type defined by replacing T in the definition of Point with Float64. Thus, this single declaration actually declares an unlimited number of types: Point{Float64}, Point{String}, Point{Int64}, etc. Each of these is now a usable concrete type:

julia> Point{Float64}
Point{Float64}

julia> Point{String}
Point{String}

The type Point{Float64} is a point whose coordinates are 64-bit floating-point values, while the type Point{String} is a "point" whose "coordinates" are string objects (see Strings). However, Point itself is also a valid type object:

julia> Point
Point{T}

Here the T is the dummy type symbol used in the original declaration of Point. What does Point by itself mean? It is an abstract type that contains all the specific instances Point{Float64}, Point{String}, etc.:

julia> Point{Float64} <: Point
true

julia> Point{String} <: Point
true

Other types, of course, are not subtypes of it:

julia> Float64 <: Point
false

julia> String <: Point
false

Concrete Point types with different values of T are never subtypes of each other:

julia> Point{Float64} <: Point{Int64}
false

julia> Point{Float64} <: Point{Float}
false

This last point is very important:

Even though Float64 <: Float we DO NOT have Point{Float64} <: Point{Float}.

In other words, in the parlance of type theory, Julia's type parameters are invariant, rather than being covariant (or even contravariant). This is for practical reasons: while any instance of Point{Float64} may conceptually be like an instance of Point{Float} as well, the two types have different representations in memory:

  • An instance of Point{Float64} can be represented compactly and efficiently as an immediate pair of 64-bit values;
  • An instance of Point{Float} must be able to hold any pair of instances of Float. Since objects that are instances of Float can be of arbitrary size and structure, in practice an instance of Point{Float} must be represented as a pair of pointers to individually allocated Float objects.

The efficiency gained by being able to store Point{Float64} objects with immediate values is magnified enormously in the case of arrays: an Array{Float64} can be stored as a contiguous memory block of 64-bit floating-point values, whereas an Array{Float} must be an array of pointers to individually allocated Float objects — which may well be boxed 64-bit floating-point values, but also might be arbitrarily large, complex objects, which are declared to be implementations of the Float abstract type.
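The invariance rule applies to arrays in the same way as to Point. A minimal sketch, using only the <: operator and the array types mentioned above (the second parameter, 1, is the number of dimensions):

```julia
Float64 <: Float                      # true
Array{Float64,1} <: Array{Float,1}    # false: type parameters are invariant
Array{Float64,1} <: Array             # true: Array without parameters covers all arrays
```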

How does one construct a Point object? It is possible to define custom constructors for composite types, which will be discussed in detail in Constructors, but in the absence of any special constructor declarations, there are two default ways of creating new composite objects, one in which the type parameters are explicitly given and the other in which they are implied by the arguments to the object constructor.

Since the type Point{Float64} is a concrete type equivalent to Point declared with Float64 in place of T, it can be used much like a non-parametric concrete type. In particular, it can be applied as a constructor for new Point{Float64} objects:

julia> Point{Float64}(1.0,2.0)
Point(1.0,2.0)

julia> typeof(ans)
Point{Float64}

For the default constructor, exactly one argument must be supplied for each field:

julia> Point{Float64}(1.0)
no method Point(Float64,)

julia> Point{Float64}(1.0,2.0,3.0)
no method Point(Float64,Float64,Float64)

Since the type of the desired object is fully specified in this form, the provided arguments need not match the expected field types exactly, and will be converted (see Conversion and Promotion) automatically to the expected type:

julia> Point{Float64}(1,2.5)
Point(1.0,2.5)

julia> q = Point{Float64}(1,2)
Point(1.0,2.0)

In many cases, it is redundant to provide the type of Point object one wants to construct, since the types of arguments to the constructor call already implicitly provide type information. For that reason, you can also apply Point itself as a constructor, provided that the implied value of the parameter type T is unambiguous:

julia> Point(1.0,2.0)
Point(1.0,2.0)

julia> typeof(ans)
Point{Float64}

julia> Point(1,2)
Point(1,2)

julia> typeof(ans)
Point{Int64}

In the case of Point, the type of T is unambiguously implied if and only if the two arguments to Point have the same type. When this isn't the case, the constructor will fail with a no method error:

julia> Point(1,2.5)
no method Point(Int64,Float64)

Constructor methods to appropriately handle such mixed cases can be defined, but that will not be discussed until later on in Constructors.

### Parametric Abstract Types

Parametric abstract type declarations declare a collection of abstract types, in much the same way:

abstract Pointy{T}

With this declaration, Pointy{T} is a distinct abstract type for each type or integer value of T. As with parametric composite types, each such instance is a subtype of Pointy:

julia> Pointy{Int64} <: Pointy
true

julia> Pointy{1} <: Pointy
true

Parametric abstract types are invariant, much as parametric composite types are:

julia> Pointy{Float64} <: Pointy{Float}
false

julia> Pointy{Float} <: Pointy{Float64}
false

Much as plain old abstract types serve to create a useful hierarchy of types over concrete types, parametric abstract types serve the same purpose with respect to parametric composite types. We could, for example, have declared Point{T} to be a subtype of Pointy{T} as follows:

type Point{T} <: Pointy{T}
  x::T
  y::T
end

Given such a declaration, for each choice of T, we have Point{T} as a subtype of Pointy{T}:

julia> Point{Float64} <: Pointy{Float64}
true

julia> Point{Float} <: Pointy{Float}
true

julia> Point{String} <: Pointy{String}
true

This relationship is also invariant:

julia> Point{Float64} <: Pointy{Float}
false

What purpose do parametric abstract types like Pointy serve? Consider if we create a point-like implementation that only requires a single coordinate because the point is on the diagonal line x = y:

type DiagPoint{T} <: Pointy{T}
  x::T
end

Now both Point{Float64} and DiagPoint{Float64} are implementations of the Pointy{Float64} abstraction, and similarly for every other possible choice of type T. This allows programming to a common interface shared by all Pointy objects, implemented for both Point and DiagPoint. This cannot be fully demonstrated, however, until we have introduced methods and dispatch in the next section, Methods.
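As a brief preview, here is a sketch of what programming to the Pointy interface might look like. It uses method definition syntax that is only explained in Methods, and the function names xcoord and ycoord are hypothetical, chosen for this illustration:

```julia
abstract Pointy{T}

type Point{T} <: Pointy{T}
  x::T
  y::T
end

type DiagPoint{T} <: Pointy{T}
  x::T
end

# Hypothetical accessors, not part of Julia.
xcoord(p::Pointy)    = p.x   # both implementations store an x field
ycoord(p::Point)     = p.y
ycoord(p::DiagPoint) = p.x   # on the diagonal, y equals x

p = Point(1.0, 2.0)
d = DiagPoint(3.0)

(xcoord(p), ycoord(p))   # (1.0,2.0)
(xcoord(d), ycoord(d))   # (3.0,3.0)
```

Code that calls only xcoord and ycoord works with either implementation, and with any further subtype of Pointy that defines them.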

There are situations where it may not make sense for type parameters to range freely over all possible types (or integers). In such situations, one can constrain the range of T like so:

abstract Pointy{T<:Real}

With such a declaration, it is acceptable to use any type that is a subtype of Real in place of T, but not types that are not subtypes of Real:

julia> Pointy{Float64}
Pointy{Float64}

julia> Pointy{Float}
Pointy{Float}

julia> Pointy{String}
type error: Pointy: in T, expected Real, got AbstractKind

julia> Pointy{1}
type error: Pointy: in T, expected Real, got Int64

Type parameters for parametric composite types can be restricted in the same manner:

type Point{T<:Real} <: Pointy{T}
  x::T
  y::T
end

To give a real-world example of how all this parametric type machinery can be useful, here is the actual definition of Julia's Rational type, representing an exact ratio of integers:

type Rational{T<:Int} <: Real
  num::T
  den::T
end

It only makes sense to take ratios of integer values, so the parameter type T is restricted to being a subtype of Int, and a ratio of integers represents a value on the real number line, so any Rational is an instance of the Real abstraction.

#### Singleton Types

There is a special kind of abstract parametric type that must be mentioned here: singleton types. For each type, T, the type Type{T} is an abstract type whose only instance is the object T. The abstract type Type{T} is called the "singleton type" of T. Since the definition is a little difficult to parse, let's look at some examples:

julia> isa(Float64, Type{Float64})
true

julia> isa(Float, Type{Float64})
false

julia> isa(Float, Type{Float})
true

julia> isa(Float64, Type{Float})
false

In other words, isa(A,Type{B}) is true if and only if A and B are the same object and that object is a type. Without the parameter, Type is simply an abstract type which has all type objects as its instances, including, of course, specific singleton types:

julia> isa(Type{Float64},Type)
true

julia> isa(Float64,Type)
true

julia> isa(Float,Type)
true

Any object that is not a type is not an instance of Type:

julia> isa(1,Type)
false

julia> isa("foo",Type)
false

Until we discuss parametric methods and conversions, it is difficult to explain the utility of the singleton type construct, but in short, it allows one to specialize function behavior on specific type values, rather than just kinds of types, which is all that would be possible in the absence of singleton types. This is useful for writing methods (especially parametric ones) whose behavior depends on a type that is given as an explicit argument rather than implied by the type of one of its arguments.

A few popular languages have singleton types, including Haskell, Scala and Ruby. In general usage, the term "singleton type" refers to a type whose only instance is a single value. This meaning applies to Julia's singleton types, but with the caveat that only type objects have singleton types, whereas in other languages with singleton types, every object has one.
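To make the idea of specializing on type values slightly more concrete, here is a minimal sketch. The function default_value is hypothetical, and the method syntax is explained in Methods. Each method applies only when the argument is exactly the named type object:

```julia
# Hypothetical function: one method per type object, selected via Type{T}.
default_value(t::Type{Int64})   = 0
default_value(t::Type{Float64}) = 0.0
default_value(t::Type{Bool})    = false

default_value(Int64)     # 0
default_value(Float64)   # 0.0
default_value(Bool)      # false
```

Calling default_value(Float) would find no matching method, since Float is not an instance of Type{Float64} or of either of the other two singleton types.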

### Parametric Bits Types

Bits types can also be declared parametrically. For example, pointers are represented as boxed bits types which would be declared in Julia like this:

# 32-bit system:
bitstype 32 Ptr{T}

# 64-bit system:
bitstype 64 Ptr{T}

The slightly odd feature of these declarations, as compared to typical parametric composite types, is that the type parameter T cannot be used in the definition of the type itself; it is just an abstract tag, essentially defining an entire family of types with identical structure, differentiated only by their type parameter. Thus, Ptr{Float64} and Ptr{Int64} are distinct types, even though they have identical structure. And of course, all specific pointer types are subtypes of the umbrella Ptr type:

julia> Ptr{Float64} <: Ptr
true

julia> Ptr{Int64} <: Ptr
true

## Type Aliases

Sometimes it is convenient to introduce a new name for an already expressible type. For such occasions, Julia provides the typealias mechanism. For example, PtrInt is type aliased to either Uint32 or Uint64 as is appropriate for the size of pointers on the system:

# 32-bit system:
julia> PtrInt
Uint32

# 64-bit system:
julia> PtrInt
Uint64

This is accomplished via the following code in j/pointer.j:

if WORD_SIZE == 64
  typealias PtrInt Uint64
else
  typealias PtrInt Uint32
end

Here WORD_SIZE is a constant set to either 32 or 64 depending on the size of the system's pointers.

For parametric types, typealias can be convenient for providing a new parametric type name where one of the parameter choices is fixed. For example, Julia's dense arrays have type Array{T,n} where T is the element type and n is the number of array dimensions. For convenience, writing Array{Float64} allows one to specify the element type without specifying the dimension:

julia> Array{Float64,1} <: Array{Float64} <: Array
true

However, there is no way to equally simply restrict just the dimension but not the element type. Yet, one often needs to program to just vectors or matrices. For that reason, the following type aliases are provided:

typealias Vector{T} Array{T,1}
typealias Matrix{T} Array{T,2}

In languages where parametric types must be always specified in full, this is not especially helpful, but in Julia, this allows one to write just Matrix for the abstract type including all two-dimensional dense arrays of any element type. Of course, writing Vector{Int64} for a one-dimensional array of 64-bit integers is also more convenient than writing Array{Int64,1}.
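A short sketch of how these aliases behave, using only the aliases defined above together with <: and isa (isa is described under Operations on Types). The comments give the results one would expect from the definitions:

```julia
Array{Int64,1} <: Vector        # true: a one-dimensional array of any element type
Array{Float64,2} <: Matrix      # true
Array{Float64,1} <: Matrix      # false: wrong number of dimensions
Vector{Int64} <: Array{Int64}   # true: Vector{Int64} is just Array{Int64,1}

isa([1,2,3], Vector)            # true
isa([1,2,3], Matrix)            # false
```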

## Operations on Types

Since types in Julia are themselves objects, ordinary functions can operate on them. Some functions that are particularly useful for working with or exploring types have already been introduced, such as the <: operator, which indicates whether its left-hand operand is a subtype of its right-hand operand.

The isa function tests if an object is of a given type and returns true or false:

julia> isa(1,Int)
true

julia> isa(1,Float)
false

The typeof function, already used throughout the manual in examples, returns the type of its argument. Since, as noted above, types are objects, they also have types, and we can ask what their types are. Here we apply typeof to an instance of each of the kinds of types discussed above:

julia> typeof(Float)
AbstractKind

julia> typeof(Float64)
BitsKind

julia> typeof(Rational)
CompositeKind

julia> typeof(Union(Float,Float64,Rational))
UnionKind

julia> typeof((Float,Float64,Rational,None))
(AbstractKind,BitsKind,CompositeKind,UnionKind)

As you can see, the types of types are called, by convention, "kinds":

  • Abstract types have type AbstractKind
  • Bits types have type BitsKind
  • Composite types have type CompositeKind
  • Unions have type UnionKind
  • Tuples of types have a type that is the tuple of their respective kinds.

What if we repeat the process? What is the type of a kind? Kinds, as it happens, are all composite values and thus all have a type of CompositeKind:

julia> typeof(AbstractKind)
CompositeKind

julia> typeof(BitsKind)
CompositeKind

julia> typeof(CompositeKind)
CompositeKind

julia> typeof(UnionKind)
CompositeKind

The reader may note that CompositeKind shares with the empty tuple (see above) the distinction of being its own type (i.e. a fixed point of the typeof function). The only other types sharing this distinction are tuples recursively built with () and CompositeKind as their only atomic values:

julia> typeof(())
()

julia> typeof(CompositeKind)
CompositeKind

julia> typeof(((),))
((),)

julia> typeof((CompositeKind,))
(CompositeKind,)

julia> typeof(((),CompositeKind))
((),CompositeKind)

Another operation that applies to some kinds of types is super. Only abstract types (AbstractKind), bits types (BitsKind), and composite types (CompositeKind) have a supertype, so these are the only kinds of types that the super function applies to:

julia> super(Float64)
Float

julia> super(Number)
Any

julia> super(String)
Any

julia> super(Any)
Any

If you apply super to other type objects (or non-type objects), a "no method" error is raised:

julia> super(Union(Float64,Int64))
no method super(UnionKind,)

julia> super(None)
no method super(UnionKind,)

julia> super((Float64,Int64))
no method super((BitsKind,BitsKind),)
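Because super(Any) returns Any, repeated application of super can be used to walk from a type up to the top of the type graph. The following is a minimal sketch, assuming the argument is an abstract, bits, or composite type. The function name print_ancestors is hypothetical, and the test isa(t, Type{Any}) relies on the singleton type rule described earlier, under which it is true only when t is Any itself:

```julia
# Hypothetical helper: print a type and each of its ancestors up to Any.
function print_ancestors(t)
  while !isa(t, Type{Any})
    println(t)
    t = super(t)
  end
  println(t)   # finally prints Any
end

print_ancestors(Float64)
```

Given the numeric hierarchy shown in this section, this should print Float64, Float, Real, Number and Any, one per line.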

# 11. Methods

Recall from Functions that a function is an object that maps a tuple of arguments to a return value, or throws an exception if no appropriate value can be returned. It is very common for the same conceptual function or operation to be implemented quite differently for different types of arguments: adding two integers is very different from adding two floating-point numbers, both of which are distinct from adding an integer to a floating-point number. Despite their implementation differences, these operations all fall under the general concept of "addition". Accordingly, in Julia, these behaviors all belong to a single object: the + function.

To facilitate using many different implementations of the same concept smoothly, functions need not be defined all at once, but can rather be defined piecewise by providing specific behaviors for certain combinations of argument types and counts. A definition of one possible behavior for a function is called a method. Thus far, we have presented only examples of functions defined with a single method, applicable to all types of arguments. However, the signatures of method definitions can be annotated to indicate the types of arguments in addition to their number, and more than a single method definition may be provided. When a function is applied to a particular tuple of arguments, the most specific method applicable to those arguments is executed. Thus, the overall behavior of a function is a patchwork of the behaviors of its various methods. If the patchwork is well designed, even though the implementations of the methods may be quite different, the outward behavior of the function will appear seamless and consistent.

The choice of which method to execute when a function is applied is called dispatch. Julia allows the dispatch process to choose which of a function's methods to call based on the number of arguments given, and on the types of all of the function's arguments. This is different from traditional object-oriented languages, where dispatch occurs based only on the first argument, which often has a special argument syntax, and is sometimes implied rather than explicitly written as an argument (see Footnote 1). Using all of a function's arguments to choose which method should be invoked, rather than just the first, is known as multiple dispatch. Multiple dispatch is particularly useful for mathematical code, where it makes little sense to artificially deem the operations to "belong" to one argument more than any of the others: does the addition operation in x + y belong to x any more than it does to y? The implementation of a mathematical operator generally depends on the types of all of its arguments.

**Footnote 1:** In C++ or Java, for example, in a method call like `obj.meth(arg1,arg2)`, the object `obj` "receives" the method call and is implicitly passed to the method via the `this` keyword, rather than as an explicit method argument. When the current `this` object is the receiver of a method call, it can be omitted altogether, writing just `meth(arg1,arg2)`, with `this` implied as the receiving object.

## Defining Methods

Until now, we have, in our examples, defined only functions with a single method having unconstrained argument types. Such functions behave just like they would in traditional dynamically typed languages. Nevertheless, we have used multiple dispatch and methods almost continually without being aware of it: all of Julia's standard functions and operators, like the aforementioned +, have many methods defining their behavior over various possible combinations of argument type and count.

When defining a function, one can optionally constrain the types of parameters it is applicable to, using the :: type-assertion operator, introduced in the section on composite types:

f(x::Float64, y::Float64) = 2x + y

This function definition applies only to calls where x and y are both values of type Float64:

julia> f(2.0, 3.0)
7.0

Applying it to any other types of arguments will result in a "no method" error:

julia> f(2.0, 3)
no method f(Float64,Int64)

julia> f(float32(2.0), 3.0)
no method f(Float32,Float64)

julia> f(2.0, "3.0")
no method f(Float64,ASCIIString)

julia> f("2.0", "3.0")
no method f(ASCIIString,ASCIIString)

As you can see, the arguments must be precisely of type Float64. Other numeric types, such as integers or 32-bit floating-point values, are not automatically converted to 64-bit floating-point, nor are strings parsed as numbers. Because Float64 is a concrete type and concrete types cannot be subclassed in Julia, such a definition can only be applied to arguments that are exactly of type Float64. It may often be useful, however, to write more general methods where the declared parameter types are abstract:

f(x::Number, y::Number) = 2x - y

julia> f(2.0, 3)
1.0

This method definition applies to any pair of arguments that are instances of Number. They need not be of the same type, so long as they are each numeric values. The problem of handling disparate numeric types is delegated to the arithmetic operations in the expression 2x - y.

To define a function with multiple methods, one simply defines the function multiple times, with different numbers and declared types of arguments. The first method definition for a function creates the function object, and subsequent method definitions add new methods to the existing function object. The most specific method definition matching the number and types of the arguments will be executed when a function is applied. Thus, the two method definitions above, taken together, define the behavior for f over all pairs of instances of the abstract type Number — but with a different behavior specific to pairs of Float64 values. If one of the arguments is a 64-bit float but the other one is not, then the f(Float64,Float64) method cannot be called and the more general f(Number,Number) method must be used:

julia> f(2.0, 3.0)
7.0

julia> f(2, 3.0)
1.0

julia> f(2.0, 3)
1.0

julia> f(2, 3)
1

The 2x + y definition is only used in the first case, while the 2x - y definition is used in the others. No automatic casting or conversion of function arguments is ever performed: all conversion in Julia is non-magical and completely explicit. Conversion and Promotion, however, shows how clever application of sufficiently advanced technology can be indistinguishable from magic.
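Because no conversion happens implicitly, a caller who wants the Float64-specific method has to convert the argument by hand. The short sketch below repeats the two definitions of f from above and uses convert (covered properly in Conversion and Promotion) to show this; the variable names a and b are only for illustration.

```julia
# The two methods of f defined earlier in this section.
f(x::Float64, y::Float64) = 2x + y
f(x::Number, y::Number) = 2x - y

# Mixed argument types: only the general (Number,Number) method applies.
a = f(2.0, 3)                     # 2*2.0 - 3 = 1.0

# Converting the integer explicitly makes both arguments Float64,
# so the more specific (Float64,Float64) method is chosen instead.
b = f(2.0, convert(Float64, 3))   # 2*2.0 + 3.0 = 7.0
```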

For non-numeric values, and for fewer or more than two arguments, the function f remains undefined, and applying it will still result in a "no method" error:

julia> f("foo", 3)
no method f(ASCIIString,Int64)

julia> f()
no method f()
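Argument count takes part in dispatch just as argument types do. As a small illustration, the hypothetical function total below (not part of Julia's library) has three methods that differ only in how many arguments they accept:

```julia
# `total` is a made-up name used only for this illustration.
total(x::Number) = x
total(x::Number, y::Number) = x + y
total(x::Number, y::Number, z::Number) = x + y + z

t1 = total(1)          # one-argument method: 1
t2 = total(1, 2)       # two-argument method: 3
t3 = total(1, 2, 3)    # three-argument method: 6
```

Calling total() or total(1,2,3,4) would still raise a "no method" error, since no method accepts zero or four arguments.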

You can easily see which methods exist for a function by entering the function object itself in an interactive session:

julia> f
Methods for generic function f
f(Float64,Float64)
f(Number,Number)

This output tells us that f is a function object with two methods: one taking two Float64 arguments and one taking arguments of type Number.

In the absence of a type declaration with ::, the type of a method parameter is Any by default, meaning that it is unconstrained since all values in Julia are instances of the abstract type Any. Thus, we can define a catch-all method for f like so:

julia> f(x,y) = println("Whoa there, Nelly.")

julia> f("foo", 1)
Whoa there, Nelly.

This catch-all is less specific than any other possible method definition for a pair of parameter values, so it will only be called on pairs of arguments to which no other method definition applies.

Although it seems a simple concept, multiple dispatch on the types of values is perhaps the single most powerful and central feature of the Julia language. Core operations typically have dozens of methods:

julia> +
Methods for generic function +
+(Int8,Int8)
+(Int16,Int16)
+(Int32,Int32)
+(Int64,Int64)
+(Uint8,Uint8)
+(Uint16,Uint16)
+(Uint32,Uint32)
+(Uint64,Uint64)
+(Float32,Float32)
+(Float64,Float64)
+(Char,Char)
+(Int,Ptr{T})
+(Rational{T<:Int},Rational{T<:Int})
+(Real,Range{T<:Real})
+(Real,Range1{T<:Real})
+(Union(Range{T<:Real},Range1{T<:Real}),Real)
+(Union(Range{T<:Real},Range1{T<:Real}),Union(Range{T<:Real},Range1{T<:Real}))
+(Ptr{T},Int)
+()
+(Complex,Complex)
+(T<:Number,T<:Number)
+(Number,)
+(Number,Number)
+(AbstractArray{T<:Number,N},)
+(SparseMatrixCSC{T1},SparseMatrixCSC{T2})
+(SparseMatrixCSC{T},Union(Array{T,N},Number))
+(Number,DArray{T,N,distdim})
+(Number,AbstractArray{T,N})
+(Union(Array{T,N},Number),SparseMatrixCSC{T})
+(AbstractArray{S,N},AbstractArray{T,N})
+(DArray{T,N,distdim},Number)
+(AbstractArray{T,N},Number)
+(Any,Any,Any)
+(Any,Any,Any,Any)
+(Any,Any,Any,Any,Any)
+(Any,Any,Any,Any...)

Multiple dispatch, together with the flexible parametric type system, gives Julia its ability to abstractly express high-level algorithms decoupled from implementation details, yet generate efficient, specialized code to handle each case at run time.

## Method Ambiguities

It is possible to define a set of function methods such that there is no unique most specific method applicable to some combinations of arguments:

julia> g(x::Float64, y) = 2x + y

julia> g(x, y::Float64) = x + 2y
Warning: New definition g(Any,Float64) is ambiguous with g(Float64,Any).
         Make sure g(Float64,Float64) is defined first.

julia> g(2.0, 3)
7.0

julia> g(2, 3.0)
8.0

julia> g(2.0, 3.0)
7.0

Here the call g(2.0, 3.0) could be handled by either the g(Float64, Any) or the g(Any, Float64) method, and neither is more specific than the other. In such cases, Julia warns you about this ambiguity, but allows you to proceed, arbitrarily picking a method. You should avoid method ambiguities by specifying an appropriate method for the intersection case:

julia> g(x::Float64, y::Float64) = 2x + 2y

julia> g(x::Float64, y) = 2x + y

julia> g(x, y::Float64) = x + 2y

julia> g(2.0, 3)
7.0

julia> g(2, 3.0)
8.0

julia> g(2.0, 3.0)
10.0

To suppress Julia's warning, the disambiguating method must be defined first, since otherwise the ambiguity exists, if transiently, until the more specific method is defined.

## Parametric Methods

Method definitions can optionally have type parameters immediately after the method name and before the parameter tuple:

same_type{T}(x::T, y::T) = true
same_type(x,y) = false

The first method applies whenever both arguments are of the same concrete type, regardless of what type that is, while the second method acts as a catch-all, covering all other cases. Thus, overall, this defines a boolean function that checks whether its two arguments are of the same type:

julia> same_type(1, 2)
true

julia> same_type(1, 2.0)
false

julia> same_type(1.0, 2.0)
true

julia> same_type("foo", 2.0)
false

julia> same_type("foo", "bar")
true

julia> same_type(int32(1), int64(2))
false

This kind of definition of function behavior by dispatch is quite common (idiomatic, even) in Julia. Method type parameters are not restricted to being used as the types of parameters: they can be used anywhere a value would be in the signature of the function or body of the function. Here's an example where the method type parameter T is used as the type parameter to the parametric type Vector{T} in the method signature:

julia> myappend{T}(v::Vector{T}, x::T) = [v..., x]

julia> myappend([1,2,3],4)
[1,2,3,4]

julia> myappend([1,2,3],2.5)
no method myappend(Array{Int64,1},Float64)

julia> myappend([1.0,2.0,3.0],4.0)
[1.0,2.0,3.0,4.0]

julia> myappend([1.0,2.0,3.0],4)
no method myappend(Array{Float64,1},Int64)

As you can see, the type of the appended element must match the element type of the vector it is appended to, or a "no method" error is raised. In the following example, the method type parameter T is used as the return value:

julia> mytypeof{T}(x::T) = T

julia> mytypeof(1)
Int64

julia> mytypeof(1.0)
Float64

Just as you can put subtype constraints on type parameters in type declarations (see Parametric Types), you can also constrain type parameters of methods:

same_type_numeric{T<:Number}(x::T, y::T) = true
same_type_numeric(x::Number, y::Number) = false

julia> same_type_numeric(1, 2)
true

julia> same_type_numeric(1, 2.0)
false

julia> same_type_numeric(1.0, 2.0)
true

julia> same_type_numeric("foo", 2.0)
no method same_type_numeric(ASCIIString,Float64)

julia> same_type_numeric("foo", "bar")
no method same_type_numeric(ASCIIString,ASCIIString)

julia> same_type_numeric(int32(1), int64(2))
false

The same_type_numeric function behaves much like the same_type function defined above, but is only defined for pairs of numbers.

# 12. Constructors

Constructors are functions that create new objects — specifically, instances of composite types. In Julia, type objects also serve as constructor functions: they create new instances of themselves when applied to an argument tuple as a function. This much was already mentioned briefly when composite types were introduced. For example:

type Foo
  bar
  baz
end

julia> foo = Foo(1,2)
Foo(1,2)

julia> foo.bar
1

julia> foo.baz
2

For many types, forming new objects by binding their field values together is all that is ever needed to create instances. There are, however, cases where more functionality is required when creating composite objects. Sometimes invariants must be enforced, either by checking arguments or by transforming them. Recursive data structures, especially those that may be self-referential, often cannot be constructed cleanly without first being created in an incomplete state and then altered programmatically to be made whole, as a separate step from object creation. Sometimes, it's just convenient to be able to construct objects with fewer or different types of parameters than they have fields. Julia's system for object construction addresses all of these cases and more.

## Outer Constructor Methods

A constructor is just like any other function in Julia in that its overall behavior is defined by the combined behavior of its methods. Accordingly, you can add functionality to a constructor by simply defining new methods. For example, let's say you want to add a constructor method for Foo objects that takes only one argument and uses the given value for both the bar and baz fields. This is simple:

Foo(x) = Foo(x,x)

julia> Foo(1)
Foo(1,1)

You could also add a zero-argument Foo constructor method that supplies default values for both of the bar and baz fields:

Foo() = Foo(0)

julia> Foo()
Foo(0,0)

Here the zero-argument constructor method calls the single-argument constructor method, which in turn calls the automatically provided two-argument constructor method. For reasons that will become clear very shortly, additional constructor methods declared as normal methods like this are called outer constructor methods. Outer constructor methods can only ever create a new instance by calling another constructor method, such as the automatically provided default one.


A Note On Nomenclature.

While the term "constructor" generally refers to the entire function which constructs objects of a type, it is common to abuse terminology slightly and refer to specific constructor methods as "constructors". In such situations, it is generally clear from context that the term is used to mean "constructor method" rather than "constructor function", especially as it is often used in the sense of singling out a particular method of the constructor from all of the others.

## Inner Constructor Methods

While outer constructor methods succeed in addressing the problem of providing additional convenience methods for constructing objects, they fail to address the other two use cases mentioned in the introduction of this chapter: enforcing invariants, and allowing construction of self-referential objects. For these problems, one needs inner constructor methods. An inner constructor method is much like an outer constructor method, with two differences:

  1. It is declared inside the block of a type declaration, rather than outside of it like normal methods.
  2. It has access to a special locally existent function called new that creates objects of the block's type.

For example, suppose one wants to declare a type that holds a pair of real numbers, subject to the constraint that the first number is not greater than the second one. One could declare it like this:

type OrderedPair
  x::Real
  y::Real

  OrderedPair(x,y) = x > y ? error("out of order") : new(x,y)
end

Now OrderedPair objects can only be constructed such that x <= y:

julia> OrderedPair(1,2)
OrderedPair(1,2)

julia> OrderedPair(2,1)
out of order

You can still reach in and directly change the field values to violate this invariant (support for immutable composites is planned but not yet implemented), but messing around with an object's internals uninvited is considered poor form. You (or someone else) can also provide additional outer constructor methods at any later point, but once a type is declared, there is no way to add more inner constructor methods. Since outer constructor methods can only create objects by calling other constructor methods, ultimately, some inner constructor must be called to create an object. This guarantees that all objects of the declared type must come into existence by a call to one of the inner constructor methods provided with the type, thereby giving some degree of real enforcement of a type's invariants, at least for object creation.

If any inner constructor method is defined, no default constructor method is provided: it is presumed that you have supplied yourself with all the inner constructors you need. The default constructor is equivalent to writing your own inner constructor method which takes all of the object's fields as parameters, passes them directly to new, and returns the resulting object like so:

type Foo
  bar
  baz

  Foo(bar,baz) = new(bar,baz)
end

This declaration has the same effect as the earlier definition of the Foo type without an explicit inner constructor method. Even if a type's fields have constrained types, this equivalence holds because the new function attempts to convert arguments that are not already of the required type:

type T1
  x::Int64
end

type T2
  x::Int64
  T2(x) = new(x)
end

julia> T1(1)
T1(1)

julia> T2(1)
T2(1)

julia> T1(1.9)
T1(1)

julia> T2(1.9)
T2(1)

julia> T1("hello")
no method convert(Type{Int64},ASCIIString)

julia> T2("hello")
no method convert(Type{Int64},ASCIIString)

It is considered good form to provide as few inner constructor methods as possible: only those taking all arguments explicitly and enforcing essential error checking and transformation. Additional convenience constructor methods, supplying default values or auxiliary additional transformations, should be provided as outer constructors, calling the inner constructors to do the heavy lifting. This separation is typically quite natural.

## Incomplete Initialization

The final problem which has still not been addressed is construction of self-referential objects, or more generally, recursive data structures. Since the fundamental difficulty may not be immediately obvious, let us briefly explain it. Consider the following recursive type declaration:

type SelfReferential
  obj::SelfReferential
end

This type may appear innocuous enough, until one considers how to construct an instance of it. If a is an instance of SelfReferential, then a second instance can be created by the call:

b = SelfReferential(a)

But how does one construct the first instance when no instance exists to provide as a valid value for its obj field? The only solution is to allow creating an incompletely initialized instance of SelfReferential with an unassigned obj field, and using that incomplete instance as a valid value for the obj field of another instance, such as, for example, itself.

To allow for the creation of incompletely initialized objects, Julia allows the new function to be called with fewer arguments than the number of fields that the type has, returning an object with the unspecified fields uninitialized. The inner constructor method can then use the incomplete object, finishing its initialization before returning it. Here, for example, we take another crack at defining the SelfReferential type, with a zero-argument inner constructor returning instances having obj fields pointing to themselves:

type SelfReferential
  obj::SelfReferential

  SelfReferential() = (x = new(); x.obj = x)
end

We can verify that this constructor works and constructs objects that are, in fact, self-referential:

x = SelfReferential();

julia> is(x, x)
true

julia> is(x, x.obj)
true

julia> is(x, x.obj.obj)
true

Although it is generally a good idea to return a fully initialized object from an inner constructor, incompletely initialized objects can be returned:

type Incomplete
  xx

  Incomplete() = new()
end

julia> z = Incomplete();

While you are allowed to create objects with uninitialized fields, any access to an uninitialized field is an immediate error:

julia> z.xx
access to undefined reference

This prevents uninitialized fields from propagating throughout a program or forcing programmers to continually check for uninitialized fields, the way they are required to check for null values everywhere in Java. You can also pass incomplete objects to other functions from inner constructors to complete them:

type Lazy
  xx

  Lazy(v) = complete_me(new(), v)
end

As with incomplete objects returned from constructors, if complete_me or any of its callees try to access the xx field of the Lazy object before it has been initialized, an error will immediately be thrown.

## Parametric Constructors

Parametric types add a few wrinkles to the constructor story. Recall from Parametric Types that, by default, instances of parametric composite types can be constructed either with explicitly given type parameters or with type parameters implied by the types of the arguments given to the constructor. For example:

type Point{T<:Real}
  x::T
  y::T
end

## implicit T ##

julia> Point(1,2)
Point(1,2)

julia> Point(1.0,2.5)
Point(1.0,2.5)

julia> Point(1,2.5)
no method Point(Int64,Float64)

## explicit T ##

julia> Point{Int64}(1,2.5)
Point(1,2)

julia> Point{Float64}(1,2.5)
Point(1.0,2.5)

For constructor calls with explicit type parameters, such as Point{Int64}(1,2.5), the arguments can be of any type since the value of T is explicitly given. If the arguments are not already of type T, then conversion is attempted. When the type is implied by the arguments to the constructor call, as in Point(1,2), then the types of the arguments must match — otherwise it's ambiguous which of the arguments should determine the value of T.

What's really going on here is that Point, Point{Float64} and Point{Int64} are all different constructor functions. In fact, Point{T} is a distinct constructor function for each type T. Without any explicitly provided inner constructors, the declaration of the composite type Point{T<:Real} automatically provides an inner constructor, Point{T}, for each possible type T<:Real, which behaves just like non-parametric default inner constructors do. It also provides a single general outer Point constructor that takes pairs of real arguments, which must be of the same type. This automatic provision of constructors is equivalent to the following explicit declaration:

type Point{T<:Real}
  x::T
  y::T

  Point(x,y) = new(x,y)
end

Point{T<:Real}(x::T, y::T) = Point{T}(x,y)

Some features of parametric constructor definitions at work here deserve comment. First, inner constructor declarations always define methods of Point{T} rather than methods of the general Point constructor function. Since Point is not a concrete type, it makes no sense for it to even have inner constructor methods at all. Thus, the inner method declaration Point(x,y) = new(x,y) provides an inner constructor method for each value of T. It is, accordingly, this method declaration that makes the constructor calls with explicit type parameters, like Point{Float64}(1,2) and Point{Int64}(1,2), work. The outer constructor declaration defines a method for the general Point constructor, and only applies to pairs of values of the same real type. This declaration makes constructor calls like Point(1,2) and Point(1.0,2.5), without explicit type parameters, work. Since the method declaration restricts the arguments to being of the same type, calls like Point(1,2.5), with arguments of different types, result in "no method" errors.

Suppose we wanted to make the constructor call Point(1,2.5) work. The simplest way to achieve this is to define the following additional outer constructor method:

Point(x::Int64, y::Float64) = Point{Float64}(x,y)

This method definition calls the explicit type constructor for Point{Float64}, thereby giving the Float64 type precedence over Int64: both x and y will be converted to Float64. With this method definition the previous "no method" error now creates a point:

julia> Point(1,2.5)
Point(1.0,2.5)

julia> typeof(ans)
Point{Float64}

However, other similar calls still don't work:

julia> Point(1.5,2)
no method Point(Float64,Int64)

For a much more general way of making all such calls work sensibly, see Conversion and Promotion. At the risk of spoiling the suspense, we can reveal here that all it takes is the following outer method definition to make all calls to the general Point constructor work as one would expect:

Point(x::Real, y::Real) = Point(promote(x,y)...)

With this method definition, the Point constructor promotes its arguments the same way that numeric operators like + do, and works for all kinds of real numbers:

julia> Point(1.5,2)
Point(1.5,2.0)

julia> Point(1,1//2)
Point(1//1,1//2)

julia> Point(1.0,1//2)
Point(1.0,0.5)
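It can help to look at promote on its own to see what the splatted call hands to Point. The sketch below simply calls promote on the same argument pairs as above; the names p1, p2 and p3 are illustrative.

```julia
# promote returns a tuple whose elements have all been converted to a common type.
p1 = promote(1.5, 2)      # (1.5, 2.0): the Int64 becomes a Float64
p2 = promote(1, 1//2)     # (1//1, 1//2): the Int64 becomes a Rational
p3 = promote(1.0, 1//2)   # (1.0, 0.5): the Rational becomes a Float64
```

The trailing ... in Point(promote(x,y)...) then passes the two elements of the tuple as separate arguments, which now have the same type and so match the default outer constructor.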

While the implicit type parameter constructors provided by default in Julia are fairly strict, it is possible to make them behave in a more relaxed but sensible manner quite easily if one wants to. Moreover, since constructors can leverage all of the power of the type system, methods, and multiple dispatch, providing sophisticated behavior is typically quite simple.

## Case Study: Rational

Perhaps the best way to tie all these pieces together is to present a real world example of a parametric composite type and its constructor methods. To that end, here is the beginning of rational.j, which implements Julia's rational numbers:

type Rational{T<:Int} <: Real
    num::T
    den::T

    function Rational(num::T, den::T)
        if num != 0 || den != 0
            g = gcd(den, num)
            num = div(num, g)
            den = div(den, g)
        end
        new(num, den)
    end
end
Rational{T<:Int}(n::T, d::T) = Rational{T}(n,d)
Rational(n::Int, d::Int) = Rational(promote(n,d)...)
Rational(n::Int) = Rational(n,one(n))

//(n::Int, d::Int) = Rational(n,d)
//(x::Rational, y::Int) = x.num // (x.den*y)
//(x::Int, y::Rational) = (x*y.den) // y.num
//(x::Complex, y::Real) = complex(real(x)//y, imag(x)//y)
//(x::Real, y::Complex) = x*y'//real(y*y')

function //(x::Complex, y::Complex)
    xy = x*y'
    yy = real(y*y')
    complex(real(xy)//yy, imag(xy)//yy)
end

The line type Rational{T<:Int} <: Real declares that Rational takes one type parameter of an integer type, and is itself a real type. The field declarations num::T and den::T indicate that the data held in a Rational{T} object are a pair of integers of type T, one representing the rational value's numerator and the other representing its denominator.

Now things get interesting. Rational has a single inner constructor method which ensures that every rational is constructed in lowest terms, with a numerator and denominator that share no common factors and a non-negative denominator. This is accomplished by dividing the given numerator and denominator values by their greatest common divisor, computed using the gcd function. Since gcd returns the greatest common divisor of its arguments with sign matching the first argument (in this case, den), the new value of den after this division is guaranteed to be non-negative. Because this is the only inner constructor for Rational, we can be certain that Rational objects are always constructed in this normalized form.

Rational also provides several outer constructor methods for convenience. The first is the "standard" general constructor that infers the type parameter T from the type of the numerator and denominator, in the case that they have the same type. The second applies when the given numerator and denominator values have different types: it promotes them to a common type and then delegates construction to the first outer constructor. The third outer constructor turns integer values into rationals by supplying a value of 1 as the denominator.

Following the outer constructor definitions, we have a number of methods for the // operator, which provides a syntax for writing rationals. Before these definitions, // is a completely undefined operator with only syntax and no meaning. Afterwards, it behaves just as described in Rational Numbers: its entire behavior is defined in these few lines. The first and most basic definition just makes a//b, where a and b are integers, construct a Rational by applying the Rational constructor to them. When one of the operands of // is already a rational number, we construct a new rational for the resulting ratio slightly differently; this behavior is actually identical to division involving a rational and an integer. Finally, applying // to complex integral values creates an instance of Complex{Rational}, a complex number whose real and imaginary parts are rationals:

julia> (1 + 2im)//(1 - 2im)
-3//5 + 4//5im

julia> typeof(ans)
ComplexPair{Rational{Int64}}

julia> ans <: Complex{Rational}
true

Thus, although the // operator usually returns an instance of Rational, if either of its arguments is a complex integer, it will return an instance of Complex{Rational} instead. The interested reader should consider perusing the rest of rational.j: it is short, self-contained, and implements an entire basic Julia type in just a little over a hundred lines of code.

# 13. Conversion and Promotion

Julia has a system for promoting arguments of mathematical operators to a common type, which has been mentioned in various other sections, including Integers and Floating-Point Numbers, Mathematical Operations, Types, and Methods. In this section, we explain how this promotion system works, as well as how to extend it to new types and apply it to functions besides built-in mathematical operators. Traditionally, programming languages fall into two camps with respect to promotion of arithmetic arguments:

  • Automatic promotion for built-in arithmetic types and operators. In most languages, built-in numeric types, when used as operands to arithmetic operators with infix syntax, such as +, -, *, and /, are automatically promoted to a common type to produce the expected results. C, Java, Perl, and Python, to name a few, all correctly compute the sum 1 + 1.5 as the floating-point value 2.5, even though one of the operands to + is an integer. These systems are convenient and designed carefully enough that they are generally all-but-invisible to the programmer: hardly anyone consciously thinks of this promotion taking place when writing such an expression, but compilers and interpreters must perform conversion before addition since integers and floating-point values cannot be added as-is. Complex rules for such automatic conversions are thus inevitably part of specifications and implementations for such languages.
  • No automatic promotion. This camp includes Ada and ML — very "strict" statically typed languages. In these languages, every conversion must be explicitly specified by the programmer. Thus, the example expression 1 + 1.5 would be a compilation error in both Ada and ML. Instead one must write real(1) + 1.5, explicitly converting the integer 1 to a floating-point value before performing addition. Explicit conversion everywhere is so inconvenient, however, that even Ada has some degree of automatic conversion: integer literals are promoted to the expected integer type automatically, and floating-point literals are similarly promoted to appropriate floating-point types.

In a sense, Julia falls into the "no automatic promotion" category: mathematical operators are just functions with special syntax, and the arguments of functions are never automatically converted. However, one may observe that applying mathematical operations to a wide variety of mixed argument types is just an extreme case of polymorphic multiple dispatch, something which Julia's dispatch and type systems are particularly well-suited to handle. "Automatic" promotion of mathematical operands simply emerges as a special application: Julia comes with pre-defined catch-all dispatch rules for mathematical operators, invoked when no specific implementation exists for some combination of operand types. These catch-all rules first promote all operands to a common type using user-definable promotion rules, and then invoke a specialized implementation of the operator in question for the resulting values, now of the same type. User-defined types can easily participate in this promotion system by defining methods for conversion to and from other types, and providing a handful of promotion rules defining what types they should promote to when mixed with other types.
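The shape of such a catch-all rule can be sketched with an ordinary function. Here myadd is a hypothetical stand-in for an operator like +, not Julia's actual definition: two methods handle specific same-type pairs, and a general method promotes everything else and dispatches again.

```julia
# `myadd` is a made-up function standing in for a mathematical operator.
# Specific implementations for operands that already share a type:
myadd(x::Int64, y::Int64) = x + y
myadd(x::Float64, y::Float64) = x + y

# Catch-all: promote the operands to a common type, then call myadd again
# so that dispatch selects one of the specific methods above.
# Caveat: if promotion yields a type with no specific method (for example
# a pair of rationals), this sketch would call itself without end.
myadd(x::Number, y::Number) = myadd(promote(x, y)...)

s1 = myadd(1, 2)      # Int64 method directly: 3
s2 = myadd(1, 2.5)    # promoted to (1.0, 2.5), then Float64 method: 3.5
```

A fuller version would also need a same-type fallback that reports an error, so that the catch-all cannot recurse forever; the +(T<:Number,T<:Number) entry in the method listing for + in the Methods chapter appears to play that role.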

## Conversion

Conversion of values to various types is performed by the convert function. The convert function generally takes two arguments: the first is a type object while the second is a value to convert to that type; the returned value is the value converted to an instance of the given type. The simplest way to understand this function is to see it in action:

julia> x = 12
12

julia> typeof(x)
Int64

julia> convert(Uint8, x)
12

julia> typeof(ans)
Uint8

julia> convert(Float, x)
12.0

julia> typeof(ans)
Float64

Conversion isn't always possible, in which case a no method error is thrown indicating that convert doesn't know how to perform the requested conversion:

julia> convert(Float, "foo")
no method convert(Type{Float},ASCIIString)

Some languages consider parsing strings as numbers or formatting numbers as strings to be conversions (many dynamic languages will even perform conversion for you automatically). Julia does not: even though some strings can be parsed as numbers, most strings are not valid representations of numbers; only a very limited subset of them are.

### Defining New Conversions

To define a new conversion, simply provide a new method for convert. That's really all there is to it. For example, the method to convert a number to a boolean is simply this:

convert(::Type{Bool}, x::Number) = (x!=0)

The type of the first argument of this method is a singleton type, Type{Bool}, the only instance of which is Bool. Thus, this method is only invoked when the first argument is the type value Bool. When invoked, the method determines whether a numeric value is true or false as a boolean, by comparing it to zero:

julia> convert(Bool, 1)
true

julia> convert(Bool, 0)
false

julia> convert(Bool, 1im)
true

julia> convert(Bool, 0im)
false

The method signatures for conversion methods are often quite a bit more involved than this example, especially for parametric types.
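
As a minimal sketch of the same pattern applied to a user-defined type, the following defines conversions to and from a hypothetical Celsius wrapper type. The type and its field name are illustrative assumptions, not part of Julia, and the code is written in the syntax this manual describes:

```julia
# Hypothetical wrapper type, for illustration only.
type Celsius
    deg::Float64
end

# Celsius -> Float64: unwrap the stored value.
convert(::Type{Float64}, c::Celsius) = c.deg

# Any real number -> Celsius: convert to Float64 first, then wrap.
convert(::Type{Celsius}, x::Real) = Celsius(convert(Float64, x))

t = convert(Celsius, 20)    # a Celsius holding 20.0
convert(Float64, t)         # 20.0
```

Each direction is a separate method: defining one does not imply the other.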

### Case Study: Rational Conversions

To continue our case study of Julia's Rational type, here are the conversions declared in rational.j, right after the declaration of the type and its constructors:

convert{T<:Int}(::Type{Rational{T}}, x::Rational) = Rational(convert(T,x.num),convert(T,x.den))
convert{T<:Int}(::Type{Rational{T}}, x::Int) = Rational(convert(T,x), convert(T,1))

function convert{T<:Int}(::Type{Rational{T}}, x::Float, tol::Real)
    if isnan(x); return zero(T)//zero(T); end
    if isinf(x); return sign(x)//zero(T); end
    y = x
    a = d = one(T)
    b = c = zero(T)
    while true
        f = convert(T,round(y)); y -= f
        a, b, c, d = f*a+c, f*b+d, a, b
        if y == 0 || abs(a/b-x) <= tol
            return a//b
        end
        y = 1/y
    end
end
convert{T<:Int}(rt::Type{Rational{T}}, x::Float) = convert(rt,x,eps(x))

convert{T<:Float}(::Type{T}, x::Rational) = convert(T,x.num)/convert(T,x.den)
convert{T<:Int}(::Type{T}, x::Rational) = div(convert(T,x.num),convert(T,x.den))

The initial four convert methods provide conversions to rational types. The first method converts one type of rational to another type of rational by converting the numerator and denominator to the appropriate integer type. The second method does the same conversion for integers by taking the denominator to be 1. The third method implements a standard algorithm for approximating a floating-point number by a ratio of integers to within a given tolerance, and the fourth method applies it, using machine epsilon at the given value as the threshold. In general, one should have a//b == convert(Rational{Int64}, a/b).

The last two convert methods provide conversions from rational types to floating-point and integer types. To convert to floating point, one simply converts both numerator and denominator to that floating point type and then divides. To convert to integer, one can use the div operator for truncated integer division (rounded towards zero).
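
A brief usage sketch of these methods, assuming the definitions listed above are in effect. Results are only noted where they follow directly from the method bodies:

```julia
# To a rational type: the float is approximated by a ratio of integers.
r = convert(Rational{Int64}, 0.75)
r == 3//4                   # expected to hold, per the identity stated above

# From a rational type: floating point divides, integer truncates.
convert(Float64, 3//4)      # 3.0/4.0, i.e. 0.75
convert(Int32, 7//2)        # div(7,2), i.e. 3
```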

## Promotion

Promotion refers to converting values of mixed types to a single common type. Although it is not strictly necessary, it is generally implied that the common type to which the values are converted can faithfully represent all of the original values. In this sense, the term "promotion" is appropriate since the values are converted to a "greater" type — i.e. one which can represent all of the input values in a single common type. It is important, however, not to confuse this with object-oriented (structural) super-typing, or Julia's notion of abstract super-types: promotion has nothing to do with the type hierarchy, and everything to do with converting between alternate representations. For instance, although every Int32 value can also be represented as a Float64 value, Int32 is not a subtype of Float64.

Promotion to a common type is performed in Julia by the promote function, which takes any number of arguments, and returns a tuple of the same number of values, converted to a common type, or throws an exception if promotion is not possible. The most common use case for promotion is to convert numeric arguments to a common type:

julia> promote(1, 2.5)
(1.0,2.5)

julia> promote(1, 2.5, 3)
(1.0,2.5,3.0)

julia> promote(2, 3//4)
(2//1,3//4)

julia> promote(1, 2.5, 3, 3//4)
(1.0,2.5,3.0,0.75)

julia> promote(1.5, im)
(1.5 + 0.0im,0.0 + 1.0im)

julia> promote(1 + 2im, 3//4)
(1//1 + 2//1im,3//4 + 0//1im)

Integer values are promoted to the largest type of the integer values. Floating-point values are promoted to the largest of the floating-point types. Mixtures of integers and floating-point values are promoted to a floating-point type big enough to hold all the values. Integers mixed with rationals are promoted to rationals. Rationals mixed with floats are promoted to floats. Complex values mixed with real values are promoted to the appropriate kind of complex value.

That is really all there is to using promotions. The rest is just a matter of clever application, the most typical "clever" application being the definition of catch-all methods for numeric operations like the arithmetic operators +, -, * and /. Here are some of the catch-all method definitions given in promotion.j:

+(x::Number, y::Number) = +(promote(x,y)...)
-(x::Number, y::Number) = -(promote(x,y)...)
*(x::Number, y::Number) = *(promote(x,y)...)
/(x::Number, y::Number) = /(promote(x,y)...)

These method definitions say that in the absence of more specific rules for adding, subtracting, multiplying and dividing pairs of numeric values, promote the values to a common type and then try again. That's all there is to it: nowhere else does one ever need to worry about promotion to a common numeric type for arithmetic operations; it just happens automatically. There are definitions of catch-all promotion methods for a number of other arithmetic and mathematical functions in promotion.j, but beyond that, there are hardly any calls to promote required in the Julia standard library. The most common usages of promote occur in outer constructor methods, provided for convenience, to allow constructor calls with mixed types to delegate to an inner type with fields promoted to an appropriate common type. For example, recall that rational.j provides the following outer constructor method:

Rational(n::Int, d::Int) = Rational(promote(n,d)...)

This allows calls like the following to work:

julia> Rational(int8(15),int32(-5))
-3//1

julia> typeof(ans)
Rational{Int32}

For most user-defined types, it is better practice to require programmers to supply the expected types to constructor functions explicitly, but sometimes, especially for numeric problems, it can be convenient to do promotion automatically.
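
The same catch-all pattern can be applied to one's own functions. The sketch below uses a hypothetical midpoint function (not part of Julia) and assumes that the method requiring both arguments to have the same type is treated as more specific than the catch-all:

```julia
# Hypothetical function, for illustration only.
# Specific method: both arguments already share one type.
midpoint{T<:Real}(a::T, b::T) = (a + b)/2

# Catch-all: promote mixed arguments to a common type, then try again.
midpoint(a::Real, b::Real) = midpoint(promote(a,b)...)

midpoint(1, 2.5)    # arguments are promoted to (1.0, 2.5), giving 1.75
```

Only the catch-all mentions promote; the method that does the actual work never has to consider mixed argument types.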

### Defining Promotion Rules

Although one could, in principle, define methods for the promote function directly, this would require many redundant definitions for all possible permutations of argument types. Instead, the behavior of promote is defined in terms of an auxiliary function called promote_rule, which one can provide methods for. The promote_rule function takes a pair of type objects and returns another type object, such that instances of the argument types will be promoted to the returned type. Thus, by defining the rule:

promote_rule(::Type{Float64}, ::Type{Float32} ) = Float64

one declares that when 64-bit and 32-bit floating-point values are promoted together, they should be promoted to 64-bit floating-point. The promotion type does not need to be one of the argument types, however; the following promotion rules both occur in Julia's standard library:

promote_rule(::Type{Uint8}, ::Type{Int8}) = Int16
promote_rule(::Type{Char}, ::Type{Uint8}) = Int32

The former rule expresses that Int16 is the smallest integer type that contains all the values representable by both Uint8 and Int8 since the former's range extends above 127 while the latter's range extends below 0. In the latter case, the result type is Int32 since Int32 is large enough to contain all possible Unicode code points, and numeric operations on characters always result in plain old integers unless explicitly cast back to characters (see Strings). Also note that one does not need to define both promote_rule(::Type{A}, ::Type{B}) and promote_rule(::Type{B}, ::Type{A}) — the symmetry is implied by the way promote_rule is used in the promotion process.

The promote_rule function is used as a building block to define a second function called promote_type, which, given any number of type objects, returns the common type to which values of those types, as arguments to promote, should be promoted. Thus, if one wants to know, in the absence of actual values, what type a collection of values of certain types would promote to, one can use promote_type:

julia> promote_type(Int8, Uint16)
Int32

Internally, promote_type is used inside of promote to determine what type argument values should be converted to for promotion. It can, however, be useful in its own right. The curious reader can read the code in promotion.j, which defines the complete promotion mechanism in about 35 lines.

### Case Study: Rational Promotions

Finally, we finish off our ongoing case study of Julia's rational number type, which makes relatively sophisticated use of the promotion mechanism with the following promotion rules:

promote_rule{T<:Int}(::Type{Rational{T}}, ::Type{T}) = Rational{T}
promote_rule{T<:Int,S<:Int}(::Type{Rational{T}}, ::Type{S}) = Rational{promote_type(T,S)}
promote_rule{T<:Int,S<:Int}(::Type{Rational{T}}, ::Type{Rational{S}}) = Rational{promote_type(T,S)}
promote_rule{T<:Int,S<:Float}(::Type{Rational{T}}, ::Type{S}) = promote_type(T,S)

The first rule asserts that promoting a rational number with its own numerator/denominator type simply promotes to itself. The second rule says that promoting a rational number with any other integer type promotes to a rational type whose numerator/denominator type is the result of promotion of its numerator/denominator type with the other integer type. The third rule applies the same logic to two different types of rational numbers, resulting in a rational of the promotion of their respective numerator/denominator types. The fourth and final rule dictates that promoting a rational with a float results in the same type as promoting the numerator/denominator type with the float.

This small handful of promotion rules, together with the conversion methods discussed above, is sufficient to make rational numbers interoperate completely naturally with all of Julia's other numeric types: integers, floating-point numbers, and complex numbers. By providing appropriate conversion methods and promotion rules in the same manner, any user-defined numeric type can interoperate just as naturally with Julia's predefined numerics.

# 14. Arrays

Julia, like most technical computing languages, provides a first-class array implementation. Most technical computing languages pay a lot of attention to their array implementation at the expense of other containers. Julia does not treat arrays in any special way. The array library is implemented almost completely in Julia itself, and derives its performance from the compiler, just like any other code written in Julia.

An array is a collection of objects stored in a multi-dimensional grid. In the most general case, an array may contain objects of type Any. For most computational purposes, arrays should contain objects of a more specific type, such as Float64, Int32, etc.

In general, unlike many other technical computing languages, Julia does not expect programs to be written in a vectorized style for performance. Julia's JIT compiler uses type inference and generates optimized code for scalar array indexing, allowing programs to be written in a style that is convenient and readable, without sacrificing performance, and at times using significantly less memory.

In Julia, all arguments to functions are passed by reference. Some technical computing languages pass arrays by value, and this is convenient in many cases. In Julia, modifications made to input arrays within a function will be visible in the parent function. The entire Julia array library ensures that inputs are not modified by library functions. User code, if it needs to exhibit similar behaviour, should take care to create a copy of inputs that it may modify.
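
A small sketch of the difference, using two hypothetical functions (the names are illustrative): one modifies its input in place, the other works on a copy as recommended above.

```julia
# Modifies its argument: the caller sees the change.
function zero_first(v)
    v[1] = 0
    return v
end

# Works on a copy: the caller's array is left alone.
function zero_first_copy(v)
    w = copy(v)
    w[1] = 0
    return w
end

a = [1, 2, 3]
b = zero_first_copy(a)    # a is still [1, 2, 3]; b is [0, 2, 3]
zero_first(a)             # now a is [0, 2, 3]
```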

## Basic Functions
  1. ndims(A) — the number of dimensions of A
  2. size(A) — a tuple containing the dimensions of A
  3. eltype(A) — the type of the elements contained in A
  4. numel(A) — the number of elements in A
  5. length(A) — the size of the largest dimension of A
  6. nnz(A) — the number of nonzero values in A
  7. stride(A,k) — the size of the stride along dimension k
  8. strides(A) — a tuple of the linear index distances between adjacent elements in each dimension
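
As an illustration, here is what several of these functions would be expected to return for a 3-by-4 array of zeros, assuming the column-major dense storage described later in this chapter:

```julia
A = zeros(Float64, 3, 4)

ndims(A)        # 2
size(A)         # (3,4)
eltype(A)       # Float64
numel(A)        # 12
strides(A)      # (1,3): moving one column over skips 3 elements
stride(A, 2)    # 3
```
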
## Construction and Initialization

A broad variety of functions for constructing and initializing arrays are provided. In the following list of such functions, calls with a dims... argument can either take a single tuple of dimension sizes or a series of dimension sizes passed as a variable number of arguments.

  1. Array(type, dims...) — an uninitialized dense array
  2. cell(dims...) — an uninitialized cell array (heterogeneous array)
  3. zeros(type, dims...) — an array of all zeros of specified type
  4. ones(type, dims...) — an array of all ones of specified type
  5. trues(dims...) — a Bool array with all values true
  6. falses(dims...) — a Bool array with all values false
  7. reshape(A, dims...) — an array with the same data as the given array, but with different dimensions.
  8. fill(A, x) — fill the array A with value x
  9. copy(A) — copy A
  10. similar(A, element_type, dims...) — an uninitialized array of the same generic type as the given array (dense, sparse, etc.), but with the specified element type and dimensions. The second and third arguments are both optional, defaulting to the element type and dimensions of A if omitted.
  11. reinterpret(type, A) — an array with the same binary data as the given array, but with the specified element type
  12. rand(dims) — random array with Float64 uniformly distributed values in [0,1)
  13. randf(dims) — random array with Float32 uniformly distributed values in [0,1)
  14. randn(dims) — random array with Float64 normally distributed random values with a mean of 0 and standard deviation of 1
  15. eye(n) — n-by-n identity matrix
  16. eye(m, n) — m-by-n identity matrix
  17. linspace(start, stop, n) — a vector of n linearly-spaced elements from start to stop
### Comprehensions

Comprehensions provide a general and powerful way to construct arrays. The comprehension syntax is similar to the set construction notation in mathematics:

A = [ F(x,y,...) | x=rx, y=ry, ... ]

The meaning of this form is that F(x,y,...) is evaluated with the variables x, y, etc. taking on each value in their given list of values. Values can be specified as any iterable object, but will commonly be ranges like 1:n or 2:(n-1), or explicit arrays of values like [1.2, 3.4, 5.7]. The result is an N-d dense array whose dimensions are the concatenation of the dimensions of the variable ranges rx, ry, etc., where each F(x,y,...) evaluation returns a scalar.

The following example computes a weighted sum of each interior element and its left and right neighbours along a 1-d grid, with weight 1 for the element itself and 0.5 for each neighbour.

julia> const x = rand(10)
[0.6017125321472665,0.55317268439850298,0.83375372173664064,0.20371170284589835,0.50800458572940888,0.52963052092498386,0.33042233578025493,0.49411133447814293,0.29570938193206264,0.81897111867503525]

julia> [ 0.5*x[i-1] + x[i] + 0.5*x[i+1] | i=2:length(x)-1 ]
[1.27090581134045655,1.21219591535884108,0.8745908565789231,0.87467569761484998,0.94884398167981576,0.84229326348181832,0.80717719333430171,0.95225060850865173]

In most high-level technical computing languages, this computation would be performed by computing three vectors (left-shifted, centre, and right-shifted) and then using vector arithmetic, which ends up using at least three times as much memory and perhaps more for temporary vectors that are created along the way. The Julia approach here computes the result vector directly. The code is closer to the math, and thus much more intuitive to understand.

NOTE: In the above example, x is declared as constant because type inference in Julia does not work on non-constant global variables.
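
With more than one variable, each range contributes one dimension to the result. A minimal sketch, using the comprehension syntax given above to build a small multiplication table (the variable name is illustrative):

```julia
# One output dimension per variable: i gives 3 rows, j gives 4 columns.
table = [ i*j | i=1:3, j=1:4 ]

size(table)    # (3,4)
table[2,3]     # 6
```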

## Indexing

The general syntax for indexing into an n-dimensional array A is:

X = A[I_1, I_2, ..., I_n]

where each I_k may be:

  1. A scalar value
  2. A Range of the form :, a:b, or a:b:c
  3. An arbitrary integer vector, including the empty vector []

The result X has the dimensions (size(I_1), size(I_2), ..., size(I_n)), with location (i_1, i_2, ..., i_n) of X containing the value A[I_1[i_1], I_2[i_2], ..., I_n[i_n]].

Indexing syntax is equivalent to a call to ref:

X = ref(A, I_1, I_2, ..., I_n)

Example:

julia> x = reshape(1:16, 4, 4)
4x4 Int64 Array
1 5 9 13 
2 6 10 14 
3 7 11 15 
4 8 12 16 

julia> x[2:3, 2:end-1]
2x2 Int64 Array
6 10 
7 11 
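
The three kinds of index can be mixed freely. A short sketch on the same 4x4 array, with results noted only where they follow from the layout shown above:

```julia
x = reshape(1:16, 4, 4)

x[3, 2]          # scalar indices: the single element 7
x[[1,4], 2:3]    # vector and range: rows 1 and 4 of columns 2 and 3,
                 # i.e. the values 5 9 and 8 12
x[:, 1]          # colon: all of column 1
```
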
## Assignment

The general syntax for assigning values in an n-dimensional array A is:

A[I_1, I_2, ..., I_n] = X

where each I_k may be:

  1. A scalar value
  2. A Range of the form :, a:b, or a:b:c
  3. An arbitrary integer vector, including the empty vector []

The size of X should be (size(I_1), size(I_2), ..., size(I_n)), and the value in location (I_1[i_1], I_2[i_2], ..., I_n[i_n]) of A is overwritten with the value X[i_1, i_2, ..., i_n].

Index assignment syntax is equivalent to a call to assign:

  A = assign(A, X, I_1, I_2, ..., I_n)

Example:

julia> x = reshape(1:9, 3, 3)
3x3 Int64 Array
1 4 7 
2 5 8 
3 6 9 

julia> x[1:2, 2:3] = -1
3x3 Int64 Array
1 -1 -1 
2 -1 -1 
3 6 9
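
The example above assigns a single scalar to every selected location. As a sketch of the general case, the following assigns a 2x2 array X, so that each selected location of x receives the corresponding element of X:

```julia
x = reshape(1:9, 3, 3)

# The right-hand side is 2x2, matching the index ranges 1:2 and 2:3.
x[1:2, 2:3] = [10 30; 20 40]

x[1, 2]    # 10, the (1,1) element of the right-hand side
x[2, 3]    # 40, the (2,2) element of the right-hand side
x[3, 3]    # 9, untouched
```
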
## Concatenation

Arrays can be concatenated along any dimension using the following functions:

  1. cat(dim, A...) — concatenate input n-d arrays along the dimension dim
  2. vcat(A...) — shorthand for cat(1, A...)
  3. hcat(A...) — shorthand for cat(2, A...)
  4. hvcat(A...)

Concatenation operators may also be used for concatenating arrays:

  1. [A B C...] — calls hcat
  2. [A, B, C, ...] — calls vcat
  3. [A B; C D; ...] — calls hvcat
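
A brief sketch of these forms on two small matrices; the sizes noted in the comments follow from concatenating 2x2 blocks:

```julia
A = [1 2; 3 4]
B = [5 6; 7 8]

hcat(A, B)      # 2x4, the same as [A B]
vcat(A, B)      # 4x2
cat(2, A, B)    # the same as hcat(A, B)
[A B; B A]      # 4x4 block matrix
```
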
## Vectorized Operators and Functions

The following operators are supported for arrays. In case of binary operators, the dot version of the operator should be used when both inputs are non-scalar, and any version of the operator may be used if one of the inputs is a scalar.

  1. Unary Arithmetic — -
  2. Binary Arithmetic — +, -, *, .*, /, ./, \, .\, ^, .^, div, mod
  3. Comparison — ==, !=, <, <=, >, >=
  4. Unary Boolean or Bitwise — ~
  5. Binary Boolean or Bitwise — &, |, $
  6. Trigonometrical functions — sin, cos, tan, sinh, cosh, tanh, asin, acos, atan, atan2, sec, csc, cot, asec, acsc, acot, sech, csch, coth, asech, acsch, acoth, sinc, cosc, hypot
  7. Logarithmic functions — log, log2, log10, log1p, logb, ilogb
  8. Exponential functions — exp, expm1, exp2, ldexp
  9. Rounding functions — ceil, floor, trunc, round, ipart, fpart
  10. Other mathematical functions — min, max, abs, pow, sqrt, cbrt, erf, erfc, gamma, lgamma, real, conj, clamp
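
For illustration, a few of these operators applied to small integer vectors, following the rule above about dot versions:

```julia
a = [1, 2, 3]
b = [4, 5, 6]

a + b      # [5, 7, 9]
a .* b     # [4, 10, 18]: dot version, since both inputs are non-scalar
2 * a      # [2, 4, 6]: either version works when one input is a scalar
a .^ 2     # [1, 4, 9]
```
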
## Implementation

The base array type in Julia is the abstract type AbstractArray{T,n}. It is parametrized by the number of dimensions n and the element type T. AbstractVector and AbstractMatrix are aliases for the 1-d and 2-d cases. Operations on AbstractArray objects are defined using higher level operators and functions, in a way that is independent of the underlying storage class. These operations are guaranteed to work correctly as a fallback for any specific array implementation.

The Array{T,n} type is a specific instance of AbstractArray where elements are stored in column-major order. Vector and Matrix are aliases for the 1-d and 2-d cases. Specific operations such as scalar indexing, assignment, and a few other basic storage-specific operations are all that have to be implemented for Array, so that the rest of the array library can be implemented in a generic manner for AbstractArray.

SubArray is a specialization of AbstractArray that performs indexing by reference rather than by copying. A SubArray is created with the sub function, which is called the same way as ref (with an array and a series of index arguments). The result of sub looks the same as the result of ref, except the data is left in place. sub stores the input index vectors in a SubArray object, which can later be used to index the original array indirectly.
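
A minimal sketch of what "left in place" means in practice, assuming sub behaves as described above: a later change to the original array should be visible through the SubArray, which would not be the case for a copy made by ref.

```julia
a = zeros(Float64, 4, 4)
s = sub(a, 2:3, 2:3)    # refers to the 2x2 interior block of a

a[2, 2] = 7.0
s[1, 1]                 # expected to be 7.0, since s indexes into a
```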

StridedVector and StridedMatrix are convenient aliases defined to make it possible for Julia to call a wider range of BLAS and LAPACK functions by passing them either Array or SubArray objects, thus avoiding the inefficiencies of extra indexing and memory allocation.

The following example computes the QR decomposition of a small section of a larger array, without creating any temporaries, and by calling the appropriate LAPACK function with the right leading dimension size and stride parameters.

julia> a = rand(10,10);

julia> b = sub(a, 2:2:8,2:2:4)
4x2 SubArray of 10x10 Float64 Array
0.48291296659328276 0.31639301252254248 
0.11191852765878418 0.80311033863988501 
0.34377272170384798 0.12998312467801409 
0.75207724893767547 0.48974544536835718 

julia> (q,r,p) = qr(b);

julia> q
4x2 Float64 Array
-0.31610281030340204 0.38994108897230212 
-0.80237370921615103 -0.5848318975546335 
-0.12986390146593485 0.36571345172816944 
-0.48929624071011685 0.61005841520202764 

julia> r
2x2 Float64 Array
-1.00091806276211814 -0.65508286752651457 
0.0 0.70738744643074303 

julia> p
[2,1]
# 15. Running External Programs

Julia borrows backtick notation for commands from the shell, Perl, and Ruby. However, in Julia, writing

julia> `echo hello`
`echo hello`

differs in several aspects from the behavior in various shells, Perl, or Ruby:

  • Instead of immediately running the command, backticks create a Cmd object to represent the command. You can use this object to connect the command to others via pipes, run it, and read or write to it.
  • When the command is run, Julia does not capture its output unless you specifically arrange for it to. Instead, the output of the command by default goes to stdout as it would using libc's system call.
  • The command is never run with a shell. Instead, Julia parses the command syntax directly, appropriately interpolating variables and splitting on words as the shell would, respecting shell quoting syntax. The command is run as Julia's immediate child process, using fork and exec calls.

Here's a simple example of actually running an external program:

julia> run(`echo hello`)
hello
true

The hello is the output of the echo command, while the true is the return value of the command, indicating that it succeeded. (These are colored differently by the interactive session if your terminal supports color.)

## Interpolation

Suppose you want to do something a bit more complicated and use the name of a file in the variable file as an argument to a command. You can use $ for interpolation much as you would in a string literal (see Strings):

julia> file = "/etc/passwd"
"/etc/passwd"

julia> `sort $file`
`sort /etc/passwd`

A common pitfall when running external programs via a shell is that if a file name contains characters that are special to the shell, they may cause undesirable behavior. Suppose, for example, rather than /etc/passwd, we wanted to sort the contents of the file /Volumes/External HD/data.csv. Let's try it:

julia> file = "/Volumes/External HD/data.csv"
"/Volumes/External HD/data.csv"

julia> `sort $file`
`sort '/Volumes/External HD/data.csv'`

How did the file name get quoted? Julia knows that file is meant to be interpolated as a single argument, so it quotes the word for you. Actually, that is not quite accurate: the value of file is never interpreted by a shell, so there's no need for actual quoting; the quotes are inserted only for presentation to the user. This will even work if you interpolate a value as part of a shell word:

julia> path = "/Volumes/External HD"
"/Volumes/External HD"

julia> name = "data"
"data"

julia> ext = "csv"
"csv"

julia> `sort $path/$name.$ext`
`sort '/Volumes/External HD/data.csv'`

As you can see, the space in the path variable is appropriately escaped. But what if you want to interpolate multiple words? In that case, just use an array (or any other iterable container):

julia> files = ["/etc/passwd","/Volumes/External HD/data.csv"]
["/etc/passwd","/Volumes/External HD/data.csv"]

julia> `grep foo $files`
`grep foo /etc/passwd '/Volumes/External HD/data.csv'`

If you interpolate an array as part of a shell word, Julia emulates the shell's {a,b,c} argument generation:

julia> names = ["foo","bar","baz"]
["foo","bar","baz"]

julia> `grep xylophone $names.txt`
`grep xylophone foo.txt bar.txt baz.txt`

Moreover, if you interpolate multiple arrays into the same word, the shell's Cartesian product generation behavior is emulated:

julia> names = ["foo","bar","baz"]
["foo","bar","baz"]

julia> exts = ["aux","log"]
["aux","log"]

julia> `rm -f $names.$exts`
`rm -f foo.aux foo.log bar.aux bar.log baz.aux baz.log`

Since you can interpolate literal arrays, you can use this generative functionality without needing to create temporary array objects first:

julia> `rm -rf $["foo","bar","baz","qux"].$["aux","log","pdf"]`
`rm -rf foo.aux foo.log foo.pdf bar.aux bar.log bar.pdf baz.aux baz.log baz.pdf qux.aux qux.log qux.pdf`
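
The single-argument behaviour described above can be observed by actually running a command. In this sketch the variable name msg is illustrative, and echo is assumed to be available on the system:

```julia
msg = "hello   world"    # three spaces inside one string

cmd = `echo $msg`        # msg is interpolated as a single argument
run(cmd)                 # echo receives one word, so the spacing survives
```

A shell given an unquoted variable would instead split the value into two words, and echo would print them separated by a single space.
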
## Quoting

Inevitably, one wants to write commands that aren't quite so simple, and it becomes necessary to use quotes. Here's a simple example of a Perl one-liner at a shell prompt:

sh$ perl -le '$|=1; for (0..3) { print }'
0
1
2
3

The Perl expression needs to be in single quotes for two reasons: so that spaces don't break the expression into multiple shell words, and so that uses of Perl variables like $| (yes, that's the name of a variable in Perl) don't cause interpolation. In other instances, you may want to use double quotes so that interpolation does occur:

sh$ first="A"
sh$ second="B"
sh$ perl -le '$|=1; print for @ARGV' "1: $first" "2: $second"
1: A
2: B

In general, the Julia backtick syntax is carefully designed so that you can just cut-and-paste shell commands as-is into backticks and they will work: the escaping, quoting, and interpolation behaviors are the same as the shell's. The only difference is that the interpolation is integrated and aware of Julia's notion of what is a single string value, and what is a container for multiple values. Let's try the above two examples in Julia:

julia> `perl -le '$|=1; for (0..3) { print }'`
`perl -le '$|=1; for (0..3) { print }'`

julia> run(ans)
0
1
2
3
true

julia> first = "A"; second = "B";

julia> `perl -le 'print for @ARGV' "1: $first" "2: $second"`
`perl -le 'print for @ARGV' '1: A' '2: B'`

julia> run(ans)
1: A
2: B
true

The results are identical, and Julia's interpolation behavior mimics the shell's with some improvements due to the fact that Julia supports first-class iterable objects while most shells use strings split on spaces for this, which introduces ambiguities. When trying to port shell commands to Julia, try cutting and pasting first. Since Julia shows commands to you before running them, you can easily and safely just examine its interpretation without doing any damage.

## Pipelines

Shell metacharacters, such as |, &, and >, are not special inside of Julia's backticks: unlike in the shell, inside of Julia's backticks, a pipe is always just a pipe:

julia> run(`echo hello | sort`)
hello | sort
true

This expression invokes the echo command with three words as arguments: "hello", "|", and "sort". The result is that a single line is printed: "hello | sort". Inside of backticks, a "|" is just a literal pipe character. How, then, does one construct a pipeline? Instead of using "|" inside of backticks, one uses Julia's | operator between Cmd objects:

julia> run(`echo hello` | `sort`)
hello
true

This pipes the output of the echo command to the sort command. Of course, this isn't terribly interesting since there's only one line to sort, but we can certainly do much more interesting things:

julia> run(`cut -d: -f3 /etc/passwd` | `sort -n` | `tail -n5`)
210
211
212
213
214
true

This prints the highest five user IDs on a UNIX system. The cut, sort and tail commands are all spawned as immediate children of the current Julia process, with no intervening shell process. Julia itself does the work to set up pipes and connect file descriptors that is normally done by the shell. Since Julia does this itself, it retains better control and can do some things that shells cannot.
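
Because commands and pipelines are ordinary values, they can be built by functions and run later. The following sketch wraps the pipeline above in a hypothetical helper (the name highest_uids and its argument are illustrative), assuming a UNIX system with /etc/passwd:

```julia
# Hypothetical helper: build, but do not run, a pipeline that lists the
# highest user IDs; count is interpolated as the argument to tail -n.
highest_uids(count) = `cut -d: -f3 /etc/passwd` | `sort -n` | `tail -n $count`

p = highest_uids("3")    # nothing has been run yet
run(p)                   # now the three commands are spawned and connected
```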

Julia can run multiple commands in parallel:

julia> run(`echo hello` & `echo world`)
world
hello
true

The order of the output here is non-deterministic because the two echo processes are started nearly simultaneously, and race to make the first write to the stdout descriptor they share with each other and the julia parent process. Julia lets you pipe the output from both of these processes to another program:

julia> run(`echo world` & `echo hello` | `sort`)
hello
world
true

In terms of UNIX plumbing, what's happening here is that a single UNIX pipe object is created and written to by both echo processes, and the other end of the pipe is read from by the sort command.

The combination of a high-level programming language, a first-class command abstraction, and automatic setup of pipes between processes is a powerful one. To give some sense of the complex pipelines that can be created easily, here are some more sophisticated examples, with apologies for the excessive use of Perl one-liners:

julia> prefixer(prefix, sleep) = `perl -nle '$|=1; print "'$prefix' ", $_; sleep '$sleep';'`

julia> run(`perl -le '$|=1; for(0..9){ print; sleep 1 }'` | prefixer("A",2) & prefixer("B",2))
A	0
B	1
A	2
B	3
A	4
B	5
A	6
B	7
A	8
B	9
true

This is a classic example of a single producer feeding two concurrent consumers: one perl process generates lines with the numbers 0 through 9 on them, while two parallel processes consume that output, one prefixing lines with the letter "A", the other with the letter "B". Which consumer gets the first line is non-deterministic, but once that race has been won, the lines are consumed alternately by one process and then the other. (Setting $|=1 in Perl causes each print statement to flush the stdout handle, which is necessary for this example to work. Otherwise all the output is buffered and printed to the pipe at once, to be read by just one consumer process.)

Here is an even more complex multi-stage producer-consumer example:

julia> run(`perl -le '$|=1; for(0..9){ print; sleep 1 }'` |
           prefixer("X",3) & prefixer("Y",3) & prefixer("Z",3) |
           prefixer("A",2) & prefixer("B",2))
B	Y	0
A	Z	1
B	X	2
A	Y	3
B	Z	4
A	X	5
B	Y	6
A	Z	7
B	X	8
A	Y	9
true

This example is similar to the previous one, except there are two stages of consumers, and the stages have different latency so they use a different number of parallel workers, to maintain saturated throughput.

Finally, we have an example of how you can make a process read from itself:

julia> gen = `perl -le '$|=1; for(0..9){ print; sleep 1 }'`
`perl -le '$|=1; for(0..9){ print; sleep 1 }'`

julia> dup = `perl -ne '$|=1; warn $_; print ".$_"; sleep 1'`
`perl -ne '$|=1; warn $_; print ".$_"; sleep 1'`

julia> run(gen | dup | dup)
0
.0
1
..0
2
.1
3
...0
4
.2
5
..1
6
.3
....0
7
.4
8
9
..2
.5
...1
.6
..3
.....0
.7
..4
.8
.9
...2
..5
....1
..6
...3

This example never terminates since the dup process reads its own output and duplicates it to stderr forever. We strongly encourage you to try all these examples to see how they work.

# 16. Metaprogramming

The strongest legacy of Lisp in the Julia language is its metaprogramming support. Like Lisp, Julia is homoiconic: it represents its own code as a data structure of the language itself. Since code is represented by objects that can be created and manipulated from within the language, it is possible for a program to transform and generate its own code. This allows sophisticated code generation without extra build steps, and also allows true Lisp-style macros, as compared to preprocessor "macro" systems, like that of C and C++, that perform superficial textual manipulation as a separate pass before any real parsing or interpretation occurs. Another aspect of metaprogramming is reflection: the ability of a running program to dynamically discover properties of itself. Reflection emerges naturally from the fact that all data types and code are represented by normal Julia data structures, so the structure of the program and its types can be explored programmatically just like any other data.

## Expressions and Eval

Julia code is represented as a syntax tree built out of Julia data structures of type Expr. This makes it easy to construct and manipulate Julia code from within Julia, without generating or parsing source text. Here is the definition of the Expr type:

type Expr
  head::Symbol
  args::Array{Any,1}
  typ
end

The head is a symbol identifying the kind of expression, and args is an array of subexpressions, which may be symbols referencing the values of variables at evaluation time, may be nested Expr objects, or may be actual values of objects. The typ field is used by type inference to store type annotations, and can generally be ignored.

There is special syntax for "quoting" code (analogous to quoting strings) that makes it easy to create expression objects without explicitly constructing Expr objects. There are two forms: a short form for inline expressions using : followed by a single expression, and a long form for blocks of code, enclosed in quote ... end. Here is an example of the short form used to quote an arithmetic expression:

julia> ex = :(a+b*c+1)
+(a,*(b,c),1)

julia> typeof(ex)
Expr

julia> ex.head
call

julia> typeof(ans)
Symbol

julia> ex.args
{+,a,*(b,c),1}

julia> typeof(ex.args[1])
Symbol

julia> typeof(ex.args[2])
Symbol

julia> typeof(ex.args[3])
Expr

julia> typeof(ex.args[4])
Int64

Expressions provided by the parser generally only have symbols, other expressions, and literal values as their args, whereas expressions constructed by Julia code can easily have arbitrary run-time values without literal forms as args. In this specific example, + and a are symbols, *(b,c) is a subexpression, and 1 is a literal 64-bit signed integer. Here's an example of the longer expression quoting form:

julia> quote
     x = 1
     y = 2
     x + y
   end

begin
  x = 1
  y = 2
  +(x,y)
end
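Because expressions are ordinary data, they can be traversed with ordinary code. The following is a minimal sketch that walks an expression tree and gathers the symbols it contains. The helper name `collect_symbols` is hypothetical (it is not a library function), and the code assumes a current Julia release, where `Symbol[]` denotes an empty vector of symbols.

```julia
# Hypothetical helper: gather every symbol that appears in an expression tree.
collect_symbols(x) = Symbol[]        # literal values contain no symbols
collect_symbols(s::Symbol) = [s]     # a symbol contributes itself

function collect_symbols(ex::Expr)
    # Recurse into each argument and concatenate the results.
    vcat(Symbol[], map(collect_symbols, ex.args)...)
end

ex = :(a+b*c+1)
syms = collect_symbols(ex)           # the symbols +, a, *, b and c, in tree order
```

The literal `1` contributes nothing, while the nested subexpression `*(b,c)` is visited recursively through its own `args`.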

When the argument to : is just a symbol, a Symbol object results instead of an Expr:

julia> :foo
foo

julia> typeof(ans)
Symbol

In the context of an expression, symbols are used to indicate access to variables, and when an expression is evaluated, a symbol evaluates to the value bound to that symbol in the appropriate scope (see Variables and Scoping for further details).

### Eval and Interpolation

Given an expression object, one can cause Julia to evaluate (execute) it at the top level scope — i.e. in effect like loading from a file or typing at the interactive prompt — using the eval function:

julia> :(1 + 2)
+(1,2)

julia> eval(ans)
3

julia> ex = :(a + b)
+(a,b)

julia> eval(ex)
a not defined

julia> a = 1; b = 2;

julia> eval(ex)
3

Expressions passed to eval are not limited to returning values — they can also have side-effects that alter the state of the top-level evaluation environment:

julia> ex = :(x = 1)
x = 1

julia> x
x not defined

julia> eval(ex)
1

julia> x
1

Here, the evaluation of an expression object causes a value to be assigned to the top-level variable x.

Since expressions are just Expr objects which can be constructed programmatically and then evaluated, one can, from within Julia code, dynamically generate arbitrary code which can then be run using eval. Here is a simple example:

julia> a = 1;

julia> ex = Expr(:call, {:+,a,:b}, Any)
+(1,b)

julia> a = 0; b = 2;

julia> eval(ex)
3

The value of a is used to construct the expression ex which applies the + function to the value 1 and the variable b. Note the important distinction between the way a and b are used:

  • The value of the variable a at expression construction time is used as an immediate value in the expression. Thus, the value of a when the expression is evaluated no longer matters: the value in the expression is already 1, independent of whatever the value of a might be.
  • On the other hand, the symbol :b is used in the expression construction, so the value of the variable b at that time is irrelevant — :b is just a symbol and the variable b need not even be defined. At expression evaluation time, however, the value of the symbol :b is resolved by looking up the value of the variable b.

Constructing Expr objects like this is powerful, but somewhat tedious and ugly. Since the Julia parser is already excellent at producing expression objects, Julia allows "splicing" or interpolation of expression objects, prefixed with $, into quoted expressions, written using normal syntax. The above example can be written more clearly and concisely using interpolation:

julia> a = 1;

julia> ex = :($a + b)
+(1,b)

This syntax is automatically rewritten to the form above where we explicitly called Expr. The use of $ for expression interpolation is intentionally reminiscent of string interpolation and command interpolation. Expression interpolation allows convenient, readable programmatic construction of complex Julia expressions.
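Interpolation is not limited to simple values: an expression object can itself be spliced into a larger quoted expression. A small sketch, assuming a current Julia release (the variable names are arbitrary):

```julia
inner = :(b * c)          # an expression object
ex = :(a + $inner)        # splice it into a larger expression

a = 1; b = 2; c = 3
result = eval(ex)         # evaluates a + b * c at the top level

# The spliced expression has the same structure as one written out directly.
same = (ex == :(a + b * c))
```

Here `inner` is inserted as a subexpression, so the symbols `b` and `c` inside it are still resolved only when `ex` is evaluated.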

### Code Generation

When a significant amount of repetitive boilerplate code is required, it is common to generate it programmatically to avoid redundancy. In most languages, this requires an extra build step, and a separate program to generate the repetitive code. In Julia, expression interpolation and eval allow such code generation to take place in the normal course of program execution. For example, the following code defines a series of operators on three arguments in terms of their 2-argument forms:

for op = (:+, :*, :&, :|, :$)
  eval(quote
    ($op)(a,b,c) = ($op)(($op)(a,b),c)
  end)
end

In this manner, Julia acts as its own preprocessor, and allows code generation from inside the language. The above code could be written slightly more tersely using the : prefix quoting form:

for op = (:+, :*, :&, :|, :$)
  eval(:(($op)(a,b,c) = ($op)(($op)(a,b),c)))
end

This sort of in-language code generation, however, using the eval(quote(...)) pattern, is common enough that Julia comes with a macro to abbreviate this pattern:

for op = (:+, :*, :&, :|, :$)
  @eval ($op)(a,b,c) = ($op)(($op)(a,b),c)
end

The @eval macro rewrites this call to be precisely equivalent to the above longer versions. For longer blocks of generated code, the expression argument given to @eval can be a block:

@eval begin
  # multiple lines
end
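Interpolation can supply the name of the generated function as well as values used in its body. The following sketch assumes a current Julia release; the function names `double` and `triple` are purely illustrative.

```julia
# Generate one scaling function per (name, factor) pair.
for (name, k) in ((:double, 2), (:triple, 3))
    @eval $name(x) = $k * x
end

d = double(4)   # 2 * 4
t = triple(4)   # 3 * 4
```

Each pass through the loop evaluates one method definition at the top level, exactly as if it had been typed out by hand.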

Interpolating into an unquoted expression is not supported and will cause a compile-time error:

julia> $a + b
not supported

## Macros

Macros are the analogue of functions for expression generation at compile time: they allow the programmer to automatically generate expressions by transforming zero or more argument expressions into a single result expression, which then takes the place of the macro call in the final syntax tree. Macros are invoked with the following general syntax:

@name expr1 expr2 ...

Note the distinguishing @ before the macro name and the lack of commas between the argument expressions. Before the program runs, this statement will be replaced with the result of calling an expander function for name on the expression arguments. Expanders are defined with the macro keyword:

macro name(expr1, expr2, ...)
    ...
end

Here, for example, is very nearly the definition of Julia's @assert macro (see error.j for the actual definition, which allows @assert to work on boolean arrays as well):

macro assert(ex)
    :($ex ? nothing : error("Assertion failed: ", $string(ex)))
end

This macro can be used like this:

julia> @assert 1==1.0

julia> @assert 1==0
Assertion failed: 1==0

Macro calls are expanded so that the above calls are precisely equivalent to writing

1==1.0 ? nothing : error("Assertion failed: ", "1==1.0")
1==0 ? nothing : error("Assertion failed: ", "1==0")

That is, in the first call, the expression :(1==1.0) is spliced into the test condition slot, while the value of string(:(1==1.0)) is spliced into the assertion message slot. The entire expression, thus constructed, is placed into the syntax tree where the @assert macro call occurs. Therefore, if the test expression is true when evaluated, the entire expression evaluates to nothing, whereas if the test expression is false, an error is raised indicating the asserted expression that was false. Notice that it would not be possible to write this as a function, since only the value of the condition and not the expression that computed it would be available.

### Hygiene

An issue that arises in more complex macros is that of hygiene. In short, one needs to ensure that variables introduced and used by macros do not accidentally clash with the variables used in code interpolated into those macros. To demonstrate the problem before providing the solution, let us consider writing a @time macro that takes an expression as its argument, records the time, evaluates the expression, records the time again, prints the difference between the before and after times, and then has the value of the expression as its final value. A naïve attempt to write this macro might look like this:

macro time(ex)
  quote
    local t0 = clock()
    local val = $ex
    local t1 = clock()
    println("elapsed time: ", t1-t0, " seconds")
    val
  end
end

At first blush, this appears to work correctly:

julia> @time begin
         local t = 0
         for i = 1:10000000
           t += i
         end
         t 
       end
elapsed time: 1.1377708911895752 seconds
50000005000000

Suppose, however, that we change the expression passed to @time slightly:

julia> @time begin
         local t0 = 0
         for i = 1:10000000
           t0 += i
         end
         t0
       end
syntax error: local t0 declared twice

What happened? The trouble is that after macro expansion, the above expression becomes equivalent to:

begin
  local t0 = clock()
  local val = begin
    local t0 = 0
    for i = 1:10000000
      t0 += i
    end
    t0
  end
  local t1 = clock()
  println("elapsed time: ", t1-t0, " seconds")
  val
end

Declaring a local variable twice in the same scope is illegal, and since begin blocks do not introduce a new scope block (see Variables and Scoping), this code is invalid. The root problem is that the naïve @time macro implementation is unhygienic: it is possible for the interpolated code to accidentally use variables that clash with the variables used by the macro's code.

To address the macro hygiene problem, Julia provides the gensym function, which generates unique symbols that are guaranteed not to clash with any other symbols. Called with no arguments, gensym returns a single unique symbol:

julia> s = gensym()
#1007

Since it is common to need more than one unique symbol when generating a block of code in a macro, if you call gensym with an integer argument, it returns a tuple of that many unique symbols, which can easily be captured using tuple destructuring:

julia> s1, s2 = gensym(2)
(#1009,#1010)

julia> s1
#1009

julia> s2
#1010

The gensym function can be used to define the @time macro correctly, avoiding potential variable name clashes:

macro time(ex)
  t0, val, t1 = gensym(3)
  quote
    local $t0 = clock()
    local $val = $ex
    local $t1 = clock()
    println("elapsed time: ", $t1-$t0, " seconds")
    $val
  end
end

The call to gensym(3) generates three unique names for variables to use inside of the generated code block. With this definition, both of the above uses of @time work identically — the behavior of the code no longer depends in any way upon the names of variables in the given expression, since they are guaranteed not to collide with the names of variables used in code generated by the macro.
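The same technique applies to any code that generates expressions, not only to macros. As a small sketch (assuming a current Julia release; `swap_expr` is a hypothetical helper), the function below builds code that swaps two variables through a temporary whose name comes from gensym, so it cannot collide with a user variable that happens to be named `tmp`:

```julia
# Hypothetical code generator: build an expression that swaps two variables.
function swap_expr(a, b)
    tmp = gensym()            # a fresh name that cannot clash with user variables
    quote
        $tmp = $a
        $a = $b
        $b = $tmp
    end
end

x = 1; y = 2
tmp = 99                      # a user variable that happens to be called tmp
eval(swap_expr(:x, :y))
state = (x, y, tmp)           # x and y are exchanged; tmp is untouched
```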

### Non-Standard String Literals

Recall from Strings that string literals prefixed by an identifier are called non-standard string literals, and can have different semantics than un-prefixed string literals. For example:

  • E"$100\n" interprets escape sequences but does no string interpolation
  • r"^\s*(?:#|$)" produces a regular expression object rather than a string
  • b"DATA\xff\u2200" is a byte array literal for [68,65,84,65,255,226,136,128].

Perhaps surprisingly, these behaviors are not hard-coded into the Julia parser or compiler. Instead, they are custom behaviors provided by a general mechanism that anyone can use: prefixed string literals are parsed as calls to specially-named macros. For example, the regular expression macro is just the following:

macro r_str(p)
  Regex(p)
end

That's all. This macro says that the literal contents of the string literal r"^\s*(?:#|$)" should be passed to the @r_str macro and the result of that expansion should be placed in the syntax tree where the string literal occurs. In other words, the expression r"^\s*(?:#|$)" is equivalent to placing the following object directly into the syntax tree:

Regex("^\\s*(?:#|\$)")

Not only is the string literal form shorter and far more convenient, but it is also more efficient: since the regular expression is compiled and the Regex object is actually created when the code is compiled, the compilation occurs only once, rather than every time the code is executed. Consider if the regular expression occurs in a loop:

for line = lines
  m = match(r"^\s*(?:#|$)", line)
  if m.match == nothing
    # non-comment
  else
    # comment
  end
end

Since the regular expression r"^\s*(?:#|$)" is compiled and inserted into the syntax tree when this code is parsed, the expression is only compiled once instead of each time the loop is executed. In order to accomplish this without macros, one would have to write this loop like this:

re = Regex("^\\s*(?:#|\$)")
for line = lines
  m = match(re, line)
  if m.match == nothing
    # non-comment
  else
    # comment
  end
end

Moreover, if the compiler could not determine that the regex object was constant over all loops, certain optimizations might not be possible, making this version still less efficient than the more convenient literal form above. Of course, there are still situations where the non-literal form is more convenient: if one needs to interpolate a variable into the regular expression, one has to take this more verbose approach; in cases where the regular expression pattern itself is dynamic, potentially changing upon each loop iteration, a new regular expression object must be constructed on each iteration. In the vast majority of use cases, however, one does not construct regular expressions dynamically, depending on run-time data. In this majority of cases, the ability to write regular expressions as compile-time values is, well, invaluable.

The mechanism for user-defined string literals is deeply, profoundly powerful. Not only are Julia's non-standard literals implemented using it, but also the command literal syntax (`echo "Hello, $person"`) and regular string interpolation are implemented using it. These two powerful facilities are implemented with the following innocuous-looking pair of macros:

macro cmd(str)
  :(cmd_gen($shell_parse(str)))
end

macro str(s)
  interp_parse(s)
end

Of course, a large amount of complexity is hidden in the functions used in these macro definitions, but they are just functions, written entirely in Julia. You can read their source and see precisely what they do — and all they do is construct expression objects to be inserted into your program's syntax tree.
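Defining a new kind of string literal takes only a macro whose name ends in `_str`. The following sketch assumes a current Julia release; the `upper` prefix is a hypothetical example, not a literal form provided by the language.

```julia
# Hypothetical non-standard string literal: upper"..." yields an upper-case string.
macro upper_str(s)
    uppercase(s)      # runs once, when the literal is expanded
end

greeting = upper"hello, world"
```

As with the regular expression literal, the conversion happens when the literal is expanded, so the resulting string is placed directly into the syntax tree.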

## Reflection

# 17. Parallel Computing

Julia provides a multiprocessing environment based on message passing to allow programs to run on multiple processors in separate memory domains at once.

Julia's implementation of message passing is different from other environments such as MPI. Communication in Julia is generally "one-sided", meaning that the programmer needs to explicitly manage only one processor in a two-processor operation. Furthermore, these operations typically do not look like "message send" and "message receive" but rather resemble higher-level operations like calls to user functions.

Parallel programming in Julia is built on two primitives: remote references and remote calls. A remote reference is an object that can be used from any processor to refer to an object stored on a particular processor. A remote call is a request by one processor to call a certain function on certain arguments on another (possibly the same) processor. A remote call returns a remote reference to its result. Remote calls return immediately; the processor that made the call proceeds to its next operation while the remote call happens somewhere else. You can wait for a remote call to finish by calling wait on its remote reference, and you can obtain the full value of the result using fetch.

Let's try this out. Starting with julia -p n provides n processors on the local machine. Generally it makes sense for n to equal the number of CPU cores on the machine.

$ ./julia -p 2

julia> r = remote_call(2, rand, 2, 2)
RemoteRef(2,1,0)

julia> s = remote_call(2, +, 1, r)
RemoteRef(2,1,1)

julia> fetch(r)
0.10824216411304866 0.13798233877923116 
0.12376292706355074 0.18750497916607167 

julia> fetch(s)
1.10824216411304866 1.13798233877923116 
1.12376292706355074 1.18750497916607167 

The first argument to remote_call is the index of the processor that will do the work. Most parallel programming in Julia does not reference specific processors or the number of processors available, but remote_call is considered a low-level interface providing finer control. The second argument to remote_call is the function to call, and the remaining arguments will be passed to this function. As you can see, in this example we asked processor 2 to construct a 2-by-2 random matrix, then add 1 to it.

Occasionally you might want a remotely-computed value immediately. This typically happens when you read from a remote object to obtain data needed by the next local operation. The function remote_call_fetch exists for this purpose. It is equivalent to fetch(remote_call(...)) but is more efficient.

julia> remote_call_fetch(2, ref, r, 1, 1)
0.10824216411304866

The syntax of remote_call is not especially convenient. The macro @spawn makes things easier. It operates on an expression rather than a function, and picks where to do the operation for you:

julia> r = @spawn rand(2,2)
RemoteRef(1,1,0)

julia> s = @spawn 1+fetch(r)
RemoteRef(1,1,1)

julia> fetch(s)
1.10824216411304866 1.13798233877923116 
1.12376292706355074 1.18750497916607167 

Note that we used 1+fetch(r) instead of 1+r. This is because we do not know where the code will run, so in general a fetch might be required to move r to the processor doing the addition. In this case, @spawn is smart enough to perform the computation on the processor that owns r, so the fetch will be a no-op.

(It is worth noting that @spawn is not built-in but defined in Julia as a macro. It is possible to define your own such constructs.)

## Data Movement

Sending messages and moving data constitute most of the overhead in a parallel program. Reducing the number of messages and the amount of data sent is critical to achieving performance and scalability. To this end, it is important to understand the data movement performed by Julia's various parallel programming constructs.

fetch can be considered an explicit data movement operation, since it directly asks that an object be moved to the local machine. @spawn (and a few related constructs) also moves data, but this is not as obvious, hence it can be called an implicit data movement operation. Consider these two approaches to constructing and squaring a random matrix:

# method 1
A = rand(1000,1000)
Bref = @spawn A^2
...
fetch(Bref)

# method 2
Bref = @spawn rand(1000,1000)^2
...
fetch(Bref)

The difference seems trivial, but in fact is quite significant due to the behavior of @spawn. In the first method, a random matrix is constructed locally, then sent to another processor where it is squared. In the second method, a random matrix is both constructed and squared on another processor. Therefore the second method sends much less data than the first.
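A rough estimate of the extra data the first method sends can be computed directly. The figure below assumes the matrix holds 8-byte Float64 elements and ignores message overhead; the result returned by fetch is the same size in both methods.

```julia
# Bytes needed for the elements of a 1000-by-1000 matrix of Float64 values.
bytes_sent = 1000 * 1000 * sizeof(Float64)   # 8000000 bytes, roughly 8 MB
```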

In this toy example, the two methods are easy to distinguish and choose from. However, in a real program designing data movement might require more thought and very likely some measurement. For example, if the first processor needs matrix A then the first method might be better. Or, if computing A is expensive and only the current processor has it, then moving it to another processor might be unavoidable. Or, if the current processor has very little to do between the @spawn and fetch(Bref) then it might be better to eliminate the parallelism altogether. Or imagine rand(1000,1000) is replaced with a more expensive operation. Then it might make sense to add another @spawn statement just for this step.

## Parallel Map and Loops

Fortunately, many useful parallel computations do not require data movement. A common example is a Monte Carlo simulation, where multiple processors can handle independent simulation trials simultaneously. We can use @spawn to flip coins on two processors:

function count_heads(n)
    c = 0
    for i=1:n
        c += randbit()
    end
    c
end

a = @spawn count_heads(100000000)
b = @spawn count_heads(100000000)
fetch(a)+fetch(b)

The function count_heads simply adds together n random bits. Then we perform some trials on two machines, and add together the results.

At this point it is worth mentioning how to make sure your code is available on all processors (in this case, all processors need the count_heads function). There are two primary methods. First, you can use @bcast to run top-level inputs on all processors:

julia> @bcast load("myfile.j")

Alternatively, all Julia processes will automatically load a file called custom.j (if it exists) in the same directory as the Julia executable on startup. If you regularly work with certain source files, it makes sense to load them from this file.

This example, as simple as it is, demonstrates a powerful and often-used parallel programming pattern. Many iterations run independently over several processors, and then their results are combined using some function. The combination process is called a reduction, since it is generally tensor-rank-reducing: a vector of numbers is reduced to a single number, or a matrix is reduced to a single row or column, etc. In code, this typically looks like the pattern x = f(x,v[i]), where x is the accumulator, f is the reduction function, and the v[i] are the elements being reduced. It is desirable for f to be associative, so that it does not matter what order the operations are performed in.
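The role of associativity can be checked serially. The sketch below (assuming a current Julia release, and using `foldl` to fix the order of operations) reduces a vector in one pass, then reduces two halves separately and combines the partial results, as two processors followed by a final reduction step would.

```julia
v = collect(1:100)

# Reduce everything in one pass, as a single processor would.
serial = foldl(+, v)

# Reduce two halves separately, then combine the partial results.
chunked = foldl(+, [foldl(+, v[1:50]), foldl(+, v[51:100])])

# Subtraction is not associative, so the same split changes the answer.
serial_sub  = foldl(-, v)
chunked_sub = foldl(-, [foldl(-, v[1:50]), foldl(-, v[51:100])])
```

With `+` both approaches give 5050, whereas with `-` the serial result is -5048 and the chunked result is 2400.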

Notice that our use of this pattern with count_heads can be generalized. We used two explicit @spawn statements, which limits the parallelism to two processors. To run on any number of processors, we can use a parallel for loop, which can be written in Julia like this:

nheads = @parallel (+) for i=1:200000000
  randbit()
end

This construct implements the pattern of assigning iterations to multiple processors, and combining them with a specified reduction (in this case (+)). The result of each iteration is taken as the value of the last expression inside the loop. The whole parallel loop expression itself evaluates to the final answer.

Note that although parallel for loops look like serial for loops, their behavior is dramatically different. In particular, the iterations do not happen in a specified order, and writes to variables or arrays will not be globally visible since iterations run on different processors. Any variables used inside the parallel loop will be copied and broadcast to each processor.

For example, the following code will not work as intended:

a = zeros(100000)
@parallel for i=1:100000
  a[i] = i
end

Notice that the reduction operator can be omitted if it is not needed. However, this code will not initialize all of a, since each processor will have a separate copy of it. Parallel for loops like these must be avoided. Fortunately, distributed arrays can be used to get around this limitation, as we will see in the next section.

Using "outside" variables in parallel loops is perfectly reasonable if the variables are read-only:

a = randn(1000)
@parallel (+) for i=1:100000
  f(a[randi(end)])
end

Here each iteration applies f to a randomly-chosen sample from a vector a shared by all processors.

In some cases no reduction operator is needed, and we merely wish to apply a function to all integers in some range (or, more generally, to all elements in some collection). This is another useful operation called parallel map, implemented in Julia as the pmap function. For example, we could compute the singular values of several large random matrices in parallel as follows:

M = {rand(1000,1000) | i=1:10}
pmap(svd, M)

Julia's pmap is designed for the case where each function call does a large amount of work. In contrast, @parallel for can handle situations where each iteration is tiny, perhaps merely summing two numbers.

## Distributed Arrays

Large computations are often organized around large arrays of data. In these cases, a particularly natural way to obtain parallelism is to distribute arrays among several processors. This combines the memory resources of multiple machines, allowing use of arrays too large to fit on one machine. Each processor operates on the part of the array it owns, providing a ready answer to the question of how a program should be divided among machines.

A distributed array (or, more generally, a global object) is logically a single array, but pieces of it are stored on different processors. This means whole-array operations such as matrix multiply, scalar*array multiplication, etc. use the same syntax as with local arrays, and the parallelism is invisible. In some cases it is possible to obtain useful parallelism just by changing a local array to a distributed array.

Julia distributed arrays are implemented by the DArray type. A DArray has an element type and dimensions just like an Array, but it also needs an additional property: the dimension along which data is distributed. There are many possible ways to distribute data among processors, but at this time Julia keeps things simple and only allows distributing along a single dimension. For example, if a 2-d DArray is distributed in dimension 1, it means each processor holds a certain range of rows. If it is distributed in dimension 2, each processor holds a certain range of columns.

Common kinds of arrays can be constructed with functions beginning with d:

dzeros(100,100,10)
dones(100,100,10)
drand(100,100,10)
drandn(100,100,10)
dcell(100,100,10)
dfill(x, 100,100,10)

In the last case, each element will be initialized to the specified value x. These functions automatically pick a distributed dimension for you. To specify the distributed dimension, other forms are available:

drand((100,100,10), 3)
dzeros(Int64, (100,100), 2)
dzeros((100,100), 2, [7, 8])

In the first dzeros call, we specified an element type. In the second dzeros call, we also specified which processors should be used to store the data. When dividing data among a large number of processors, one often sees diminishing returns in performance. Placing DArrays on a subset of processors allows multiple DArray computations to happen at once, with a higher ratio of work to communication on each processor.

distribute(a::Array, dim) can be used to convert a local array to a distributed array, optionally specifying the distributed dimension. localize(a::DArray) is used to obtain the locally-stored portion of a DArray. owner(a::DArray, index) gives the id of the processor storing the given index in the distributed dimension. myindexes(a::DArray) gives a tuple of the indexes owned by the local processor.

A DArray can be stored on a subset of the available processors. Three properties fully describe the distribution of DArray d. d.pmap[i] gives the processor id that owns piece number i of the array. Piece i consists of indexes d.dist[i] through d.dist[i+1]-1. d.distdim gives the distributed dimension. For convenience, d.localpiece gives the number of the piece owned by the local processor (this could also be determined by searching d.pmap).
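To make the relationship between these properties concrete, here is a plain-array model of the metadata. It is only a sketch under stated assumptions: `dist` and `procmap` are ordinary vectors standing in for d.dist and d.pmap, `owner_of` is a hypothetical helper rather than the DArray implementation, and the code targets a current Julia release.

```julia
# Plain-array model of the distribution metadata (not the DArray implementation).
dist    = [1, 26, 51, 76, 101]   # piece i covers dist[i] through dist[i+1]-1
procmap = [2, 3, 4, 5]           # procmap[i] is the processor owning piece i

# Hypothetical helper: which processor owns a given index?
function owner_of(index, dist, procmap)
    for i = 1:length(procmap)
        if dist[i] <= index < dist[i+1]
            return procmap[i]
        end
    end
    error("index out of range")
end

p = owner_of(30, dist, procmap)  # index 30 lies in piece 2 (26 through 50)
```

In this model, 100 indexes are split into four pieces of 25, and index 30 is owned by processor 3.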

Indexing a DArray (square brackets) gathers all of the referenced data to a local Array object.

Indexing a DArray with the sub function creates a "virtual" sub-array that leaves all of the data in place. This should be used where possible, especially for indexing operations that refer to large pieces of the original array.

sub itself, naturally, does no communication and so is very efficient. However, this does not mean it should be viewed as an optimization in all cases. Many situations require explicitly moving data to the local processor in order to do a fast serial operation. For example, functions like matrix multiply perform many accesses to their input data, so it is better to have all the data available locally up front.

## Distributed Array Computations

Whole-array operations (e.g. elementwise operators) are a convenient way to use distributed arrays, but they are not always sufficient. To handle more complex problems, tasks can be spawned to operate on parts of a DArray and write the results to another DArray. For example, here is how you could apply a function f to each 2-d slice of a 3-d DArray:

function compute_something(A::DArray)
    B = darray(eltype(A), size(A), 3)
    for i = 1:size(A,3)
        @spawnat owner(B,i) B[:,:,i] = f(A[:,:,i])
    end
    B
end

We used @spawnat to place each operation near the memory it writes to.

This code works in some sense, but trouble stems from the fact that it performs writes asynchronously. In other words, we don't know when the result data will be written to the array and become ready for further processing. This is known as a "race condition", one of the famous pitfalls of parallel programming. Some form of synchronization is necessary to wait for the result. As we saw above, @spawn returns a remote reference that can be used to wait for its computation. We could use that feature to wait for specific blocks of work to complete:

function compute_something(A::DArray)
    B = darray(eltype(A), size(A), 3)
    deps = cell(size(A,3))
    for i = 1:size(A,3)
        deps[i] = @spawnat owner(B,i) B[:,:,i] = f(A[:,:,i])
    end
    (B, deps)
end

Now a function that needs to access slice i can perform wait(deps[i]) first to make sure the data is available.

Another option is to use a @sync block, as follows:

function compute_something(A::DArray)
    B = darray(eltype(A), size(A), 3)
    @sync begin
        for i = 1:size(A,3)
            @spawnat owner(B,i) B[:,:,i] = f(A[:,:,i])
        end
    end
    B
end

@sync waits for all spawns performed within it to complete. This makes our compute_something function easy to use, at the price of giving up some parallelism (since calls to it cannot overlap with subsequent operations).

Still another option is to use the initial, un-synchronized version of the code, and place a @sync block around a larger set of operations in the function calling this one.
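
A minimal sketch of this last option follows. The name compute_both is hypothetical, and compute_something is assumed to be the initial, un-synchronized version that returns only B.

```julia
# Hypothetical caller: one @sync block covers the spawns made by both calls.
function compute_both(A1::DArray, A2::DArray)
    results = cell(2)
    @sync begin
        results[1] = compute_something(A1)
        results[2] = compute_something(A2)
    end
    # every slice of both result arrays has been written at this point
    results
end
```

The two calls can overlap with each other, and the wait happens only once, at the end of the @sync block.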

## Synchronization With Remote References

## Scheduling

Julia's parallel programming platform uses [Tasks](Control flow#wiki-Tasks-aka-Coroutines) to switch among multiple computations. Whenever code performs a communication operation like fetch or wait, the current task is suspended and a scheduler picks another task to run. A task is restarted when the event it is waiting for completes.

For many problems, it is not necessary to think about tasks directly. However, they can be used to wait for multiple events at the same time, which provides for dynamic scheduling. In dynamic scheduling, a program decides what to compute or where to compute it based on when other jobs finish. This is needed for unpredictable or unbalanced workloads, where we want to assign more work to processors only when they finish their current tasks.

As an example, consider computing the singular values of matrices of different sizes:

M = {rand(800,800), rand(600,600), rand(800,800), rand(600,600)}
pmap(svd, M)

If one processor handles both 800x800 matrices and another handles both 600x600 matrices, we will not get as much scalability as we could. The solution is to make a local task to "feed" work to each processor when it completes its current task. This can be seen in the implementation of pmap:

function pmap(f, lst)
    np = nprocs()
    n = length(lst)
    results = cell(n)
    i = 1
    # function to produce the next work item from the queue.
    # in this case it's just an index.
    next_idx() = (idx=i; i+=1; idx)
    @sync begin
        for p=1:np
            @spawnlocal begin
                while true
                    idx = next_idx()
                    if idx > n
                        break
                    end
                    results[idx] = remote_call_fetch(p, f, lst[idx])
                end
            end
        end
    end
    results
end

@spawnlocal is similar to @spawn, but only runs tasks on the local processor. We use it to create a "feeder" task for each processor. Each task picks the next index that needs to be computed, then waits for its processor to finish, then repeats until we run out of indexes. A @sync block is used to wait for all the local tasks to complete, at which point the whole operation is done. Notice that all the feeder tasks are able to share state via next_idx() since they all run on the same processor. However, no locking is required, since the tasks are scheduled cooperatively and not preemptively. This means context switches only occur at well-defined points (here, during the fetch performed by remote_call_fetch).

## Adding Processors

# 18. Calling C and Fortran Code

Though most code can be written in Julia, there are many high-quality, mature libraries for numerical computing already written in C and Fortran. To allow easy use of this existing code, Julia makes it simple and efficient to call C and Fortran functions. Julia has a "no boilerplate" philosophy: functions can be called directly from Julia without any "glue" code, code generation, or compilation — even from the interactive prompt. This is accomplished in three steps:

  1. Load a shared library and create a handle to it.
  2. Look up a library function by name, getting a handle to it.
  3. Call the library function using the built-in ccall function.

The code to be called must be available as a shared library. Most C and Fortran libraries ship compiled as shared libraries already, but if you are compiling the code yourself using GCC (or Clang), you will need to use the -shared and -fPIC options. The machine instructions generated by Julia's JIT are the same as a native C call would be, so the resulting overhead is the same as calling a library function from C code. (Non-library function calls in both C and Julia can be inlined and thus may have even less overhead than calls to shared library functions. When both libraries and executables are generated by LLVM, it is possible to perform whole-program optimizations that can even optimize across this boundary, but Julia does not yet support that. In the future, however, it may do so, yielding even greater performance gains.)

Shared libraries are loaded with the dlopen function, which provides access to the functionality of the POSIX dlopen(3) call: it locates a shared library binary and loads it into the process's memory, allowing the program to access functions and variables contained in the library. The following call loads the standard C library and stores the resulting handle in a Julia variable called libc:

libc = dlopen("libc")

Once a library has been loaded, functions can be looked up by name using the dlsym function, which exposes the functionality of the POSIX dlsym(3) call. This returns a handle to the clock function from the standard C library:

libc_clock = dlsym(libc, :clock)

Finally, you can use ccall to actually generate a call to the library function. Inputs to ccall are as follows:

  1. Function reference from dlsym — a value of type Ptr{Void}.
  2. Return type, which may be any bits type, including Int32, Int64, Float64, or Ptr{T} for any type parameter T, indicating a pointer to values of type T, or just Ptr for void* "untyped pointer" values.
  3. A tuple of input types, like those allowed for the return type.
  4. The following arguments, if any, are the actual argument values passed to the function.

As a complete but simple example, the following calls the clock function from the standard C library:

julia> t = ccall(dlsym(libc, :clock), Int32, ())
5380445

julia> typeof(ans)
Int32

clock takes no arguments and returns an Int32. One common gotcha is that a 1-tuple must be written with a trailing comma. For example, to call the getenv function to get a pointer to the value of an environment variable, one makes a call like this:

julia> path = ccall(dlsym(libc, :getenv), Ptr{Uint8}, (Ptr{Uint8},), "SHELL")
Ptr{Uint8} @0x00007fff5fbfd670

julia> cstring(path)
"/bin/zsh"

Note that the argument type tuple must be written as (Ptr{Uint8},), rather than just (Ptr{Uint8}). This is because (Ptr{Uint8}) is just Ptr{Uint8}, rather than a 1-tuple containing Ptr{Uint8}:

julia> (Ptr{Uint8})
Ptr{Uint8}

julia> (Ptr{Uint8},)
(Ptr{Uint8},)

In practice, especially when providing reusable functionality, one generally wraps ccall usages in Julia functions that set up arguments and then check for errors in whatever manner the C or Fortran function indicates them, propagating to the Julia caller as exceptions. This is especially important since C and Fortran APIs are notoriously inconsistent about how they indicate error conditions. For example, the getenv C library function is wrapped in the following Julia function in env.j:

function getenv(var::String)
  val = ccall(dlsym(libc, :getenv),
              Ptr{Uint8}, (Ptr{Uint8},), cstring(var))
  if val == C_NULL
    error("getenv: undefined variable: ", var)
  end
  cstring(val)
end

The C getenv function indicates an error by returning NULL, but other standard C functions indicate errors in various different ways, including by returning -1, 0, 1 and other special values. This wrapper throws an exception clearly indicating the problem if the caller tries to get a non-existent environment variable:

julia> getenv("SHELL")
"/bin/zsh"

julia> getenv("FOOBAR")
getenv: undefined variable: FOOBAR
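
The same wrapping pattern carries over to C functions that signal failure with -1 instead of NULL. The sketch below wraps the POSIX chdir function, which returns 0 on success and -1 on error. The wrapper name c_chdir is hypothetical and is not taken from the Julia library.

```julia
libc = dlopen("libc")

# Hypothetical wrapper: turn the C-style -1 error code into a Julia exception.
function c_chdir(dir::String)
  status = ccall(dlsym(libc, :chdir),
                 Int32, (Ptr{Uint8},), cstring(dir))
  if status == -1
    error("chdir: could not change directory to: ", dir)
  end
  status
end
```

Callers then never need to inspect the return code: a failed call raises an error, just as in the getenv wrapper.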

Here is a slightly more complex example that discovers the local machine's hostname:

function gethostname()
  hostname = Array(Uint8, 128)
  ccall(dlsym(libc, :gethostname), Int32, 
        (Ptr{Uint8}, Ulong),
        hostname.data, ulong(length(hostname)))
  return cstring(convert(Ptr{Uint8}, hostname))
end

This example first allocates an array of bytes, then calls the C library function gethostname to fill the array in with the hostname, takes a pointer to the hostname buffer, and converts the pointer to a Julia string, assuming that it is a NUL-terminated C string. It is common for C libraries to use this pattern of requiring the caller to allocate memory to be passed to the callee and filled in. Allocation of memory from Julia like this is generally accomplished by creating an uninitialized array and passing a pointer to its data to the C function.

In the case of a Fortran function, all inputs must be passed by reference. The following example computes a dot product using a BLAS function.

libBLAS = dlopen("libLAPACK")

function compute_dot(DX::Vector, DY::Vector)
  assert(length(DX) == length(DY))
  n = length(DX)
  incx = incy = 1
  product = ccall(dlsym(libBLAS, :ddot_),
                  Float64,
                  (Ptr{Int32}, Ptr{Float64}, Ptr{Int32}, Ptr{Float64}, Ptr{Int32}),
                  int32(n), DX, int32(incx), DY, int32(incy))
  return product
end

Note that no C header files are used anywhere in the process. Currently, it is not possible to pass structs and other non-primitive types from Julia to C libraries. However, C functions that generate and use opaque struct types by passing around pointers to them can return such values to Julia as Ptr{Void}, which can then be passed to other C functions as Ptr{Void}. Memory allocation and deallocation of such objects must be handled by calls to the appropriate cleanup routines in the libraries being used, just like in any C program.
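
To illustrate the opaque pointer pattern, here is a minimal sketch using the C standard library's fopen and fclose, treating the FILE* handle as a Ptr{Void}. The function name open_and_close is hypothetical, and the example only opens and closes the file.

```julia
libc = dlopen("libc")

# Hypothetical example: the FILE* returned by fopen is never inspected from
# Julia; it is only stored and handed back to other C functions.
function open_and_close(name::String)
  fp = ccall(dlsym(libc, :fopen), Ptr{Void},
             (Ptr{Uint8}, Ptr{Uint8}), cstring(name), cstring("r"))
  if fp == C_NULL
    error("fopen: could not open file: ", name)
  end
  # ... fp could be passed to other C functions as Ptr{Void} here ...
  # cleanup is done by the library's own routine; fclose returns 0 on success
  ccall(dlsym(libc, :fclose), Int32, (Ptr{Void},), fp)
end
```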

# 19. Standard Library Reference

## Getting Around

exit([code]) — Quit (or control-D at the prompt).

whos() — Print information about user-defined variables.

edit("file") — Edit a file. Returns to the julia prompt when you quit the editor. If the file name ends in ".j" it is loaded when the editor exits.

edit(function[, types]) — Edit the definition of a function, optionally specifying types to indicate which method to edit. When the editor exits, the source file containing the definition is re-loaded.

## All Objects

is(x, y) — Determine whether x and y refer to the same object in memory.

isa(x, type) — Determine whether x is of the given type.

isequal(x, y) — True if and only if x and y have the same contents. Loosely speaking, this means x and y would look the same when printed.

typeof(x) — Get the concrete type of x.

tuple(xs...) — Construct a tuple of the given objects.

uid(x) — Get a unique integer id for x. uid(x)==uid(y) if and only if is(x,y).

hash(x) — Compute an integer hash code such that isequal(x,y) implies hash(x)==hash(y).

finalizer(x, function) — Register a function to be called on x when there are no program-accessible references to x. The behavior of this function is unpredictable if x is of a bits type.

copy(x) — Create a deep copy of x: i.e. copy is called recursively on all constituent parts of x. If a user-defined type should be recursively copied, a copy method should be defined for it which implements deep copying of an instance.

convert(type, x) — Try to convert x to the given type.

promote(xs...) — Convert all arguments to their common promotion type (if any), and return them all (as a tuple).
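
The difference between identity and equality, and the effect of convert and promote, can be seen in a short session-style sketch. The values in the comments are what the descriptions above imply.

```julia
a = [1, 2, 3]
b = [1, 2, 3]

is(a, a)              # true: the very same object
is(a, b)              # false: equal contents, but two distinct arrays
isequal(a, b)         # true: same contents

isa(1.5, Float64)     # true
isa(1.5, Real)        # true: Float64 is a subtype of Real

convert(Float64, 3)   # 3.0
promote(1, 2.5)       # (1.0, 2.5): both values become Float64
```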

## Types

subtype(type1, type2) — True if and only if all values of type1 are also of type2.

typemin(type) — The lowest value representable by the given (real) numeric type.

typemax(type) — The highest value representable by the given (real) numeric type.

sizeof(type) — Size, in bytes, of the canonical binary representation of the given type, if any.

eps(type) — The distance between 1.0 and the next largest representable floating-point value of type.

eps(x) — The distance between x and the next largest representable floating-point value of the same type as x.

promote_type(type1, type2) — Determine a type big enough to hold values of each argument type without loss.
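
A few example queries, as a sketch of how these functions behave for the standard numeric types:

```julia
typemin(Int8)                  # -128
typemax(Int8)                  # 127
typemax(Uint8)                 # 255
sizeof(Float64)                # 8 (bytes)
subtype(Int32, Real)           # true
promote_type(Int32, Float64)   # Float64

1.0 + eps(Float64) > 1.0       # true: eps(Float64) is the gap just above 1.0
1.0 + eps(Float64)/2 == 1.0    # true: half of that gap is rounded away
```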

## Generic Functions

method_exists(f, tuple) — Determine whether the given generic function has a method matching the given tuple of argument types.

applicable(f, args...) — Determine whether the given generic function has a method applicable to the given arguments.

invoke(f, (types...), args...) — Invoke a method for the given generic function matching the specified types (as a tuple), on the specified arguments. The arguments must be compatible with the specified types. This allows invoking a method other than the most specific matching method, which is useful when the behavior of a more general definition is explicitly needed (often as part of the implementation of a more specific method of the same function).
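
As a small sketch of invoke, consider a function with a general and a specific method. The function describe is hypothetical and exists only for illustration.

```julia
describe(x::Real)    = "some real number"
describe(x::Float64) = "a Float64"

describe(1.5)                      # "a Float64": the most specific method wins
invoke(describe, (Real,), 1.5)     # "some real number": the general method is forced
method_exists(describe, (Real,))   # true
applicable(describe, "text")       # false: no method accepts a string
```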

## Iteration

Sequential iteration is implemented by the methods start, done, and next. The general for loop:

for i = I
  # body
end

is translated to:

state = start(I)
while !done(I, state)
  (i, state) = next(I, state)
  # body
end

The state object may be anything, and should be chosen appropriately for each iterable type.

Fully implemented by: Range, Range1, NDRange, Tuple, Real, AbstractArray, IntSet, IdTable, HashTable, WeakKeyHashTable, LineIterator, String, Set, Task.
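
A user-defined type can take part in for loops by providing these three methods. The following is a minimal sketch. The Countdown type is hypothetical, and the state is simply the next value to produce.

```julia
# Hypothetical iterable that counts down from a starting value to 1.
type Countdown
  from
end

start(c::Countdown)       = c.from              # initial state
done(c::Countdown, state) = state <= 0          # stop once we pass 1
next(c::Countdown, state) = (state, state - 1)  # (current item, next state)

for i = Countdown(3)
  println(i)    # prints 3, then 2, then 1
end
```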

## General Collections

isempty(collection) — Determine whether a collection is empty (has no elements).

length(collection) — For an iterable collection, return the number of elements it generates; for an indexable collection, return the maximum index i for which ref(collection, i) is valid.

Fully implemented by: Range, Range1, Tuple, Number, AbstractArray, IntSet, HashTable, WeakKeyHashTable, String, Set.

Partially implemented by: FDSet.

## Iterable Collections

contains(itr, x) — Determine whether a collection contains the given value, x.

reduce(op, v0, itr) — Reduce the given collection with the given operator, i.e. accumulate v = op(v,elt) for each element, where v starts as v0. Reductions for certain commonly-used operators are available in a more convenient 1-argument form: max(itr), min(itr), sum(itr), prod(itr), any(itr), all(itr).

countp(p, itr) — Count the number of elements in itr for which predicate p is true.

anyp(p, itr) — Determine whether any element of itr satisfies the given predicate.

allp(p, itr) — Determine whether all elements of itr satisfy the given predicate.
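
For illustration, here is how these functions might be used on a small vector. The predicates are written as anonymous functions.

```julia
v = [1, 2, 3, 4]

contains(v, 3)           # true
reduce(+, 0, v)          # 10: accumulates 0+1+2+3+4
sum(v)                   # 10: the 1-argument form of the same reduction
max(v)                   # 4

countp(x -> x > 2, v)    # 2: the elements 3 and 4
anyp(x -> x > 3, v)      # true
allp(x -> x > 0, v)      # true
```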

## Indexable Collections

ref(collection, key...), also called by the syntax collection[key...] — Retrieve the value(s) stored at the given key or index within a collection.

assign(collection, value, key...), also called by the syntax collection[key...] = value — Store the given value at the given key or index within a collection.

Fully implemented by: Array, DArray, AbstractArray, SubArray, IdTable, HashTable, WeakKeyHashTable, String.

Partially implemented by: Range, Range1, Tuple.

## Associative Collections

has(collection, key) — Determine whether a collection has a mapping for a given key.

get(collection, key, default) — Return the value stored for the given key, or the given default value if no mapping for the key is present.

del(collection, key) — Delete the mapping for the given key in a collection.

del_all(collection) — Delete all keys from a collection.

Fully implemented by: IdTable, HashTable, WeakKeyHashTable.

Partially implemented by: IntSet, Set, EnvHash, FDSet, Array.
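
A short sketch of these operations on a HashTable follows. It assumes that HashTable() with no arguments constructs an empty table, which is not stated in this reference.

```julia
h = HashTable()        # assumption: zero-argument constructor gives an empty table

h["apple"] = 3         # same as assign(h, 3, "apple")
h["pear"]  = 5

has(h, "apple")        # true
h["pear"]              # 5, same as ref(h, "pear")
get(h, "plum", 0)      # 0: no mapping for "plum", so the default is returned

del(h, "apple")
has(h, "apple")        # false
```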

## Set-Like Collections

add(collection, key) — Add an element to a set-like collection.

Fully implemented by: IntSet, Set, FDSet.

## Tuples

ntuple(n, f::Function) — Create a tuple of length n, computing each element as f(i), where i is the index of the element.

## Dequeues

push(collection, item) — Insert an item at the end of a collection.

pop(collection) — Remove the last item in a collection and return it.

enq(collection, item) — Insert an item at the beginning of a collection.

insert(collection, index, item) — Insert an item at the given index.

del(collection, index) — Remove the item at the given index.

grow(collection, n) — Add uninitialized space for n elements at the end of a collection.

append!(collection, items) — Add the elements of items to the end of a collection.

Fully implemented by: Vector (aka 1-d Array).
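
The following sketch walks through the dequeue operations on a small Vector. The comments show the contents that the descriptions above imply after each step.

```julia
v = [1, 2, 3]

push(v, 4)         # v is now [1, 2, 3, 4]
x = pop(v)         # x is 4; v is [1, 2, 3] again
enq(v, 0)          # v is now [0, 1, 2, 3]
insert(v, 2, 10)   # v is now [0, 10, 1, 2, 3]
del(v, 1)          # v is now [10, 1, 2, 3]
```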

## Strings

strlen(s) — The number of characters in string s.

length(s) — The last valid index for string s. Indexes are byte offsets and not character numbers.

chars(string) — Return an array of the characters in string.

strcat(strs...) — Concatenate strings.

string(char...) — Create a string with the given characters.

string(x) — Create a string from any value using the show function.

cstring(::Ptr{Uint8}) — Create a string from the address of a C (0-terminated) string.

cstring(s) — Convert a string to a contiguous byte array representation appropriate for passing it to C functions.

ASCIIString(::Array{Uint8,1}) — Create an ASCII string from a byte array.

UTF8String(::Array{Uint8,1}) — Create a UTF-8 string from a byte array.

strchr(string, char[, i]) — Return the index of char in string, giving an error if not found. The third argument optionally specifies a starting index.

lpad(string, n, p) — Make a string at least n characters long by padding on the left with copies of p.

rpad(string, n, p) — Make a string at least n characters long by padding on the right with copies of p.

split(string, char, include_empty) — Return an array of strings by splitting the given string on occurrences of the given character delimiter. The second argument may also be a set of character delimiters to use. The third argument specifies whether empty fields should be included.

join(strings, delim) — Join an array of strings into a single string, inserting the given delimiter between adjacent strings.

## I/O

open(file_name[, read, write, create, truncate, append]) — Open a file in a mode specified by five boolean arguments. The default is to open files for reading only.

memio([size]) — Create an in-memory I/O stream, optionally specifying how much initial space is needed.

fdio(descriptor[, own]) — Create an IOStream object from an integer file descriptor. If own is true, closing this object will close the underlying descriptor. By default, an IOStream is closed when it is garbage collected.

flush(stream) — Commit all currently buffered writes to the given stream.

close(stream) — Close an I/O stream. Performs a flush first.

with_output_stream(stream, f::Function, args...) — Call f(args...) with the current output stream set to the given object. This is typically used to redirect the output of print and show.

write(stream, x) — Write the canonical binary representation of a value to the given stream.

read(stream, type) — Read a value of the given type from a stream, in canonical binary representation.

read(stream, type, dims) — Read a series of values of the given type from a stream, in canonical binary representation. dims is either a tuple or a series of integer arguments specifying the size of Array to return.

## Text I/O

show(x) — Write an informative text representation of a value to the current output stream.

print(x) — Write (to the current output stream) a canonical (un-decorated) text representation of a value if there is one, otherwise call show.

dump(x) — Write a thorough text representation of a value to the current output stream.

readall(stream) — Read the entire contents of an I/O stream as a string.

readline(stream) — Read a single line of text, including a trailing newline character (if one is reached before the end of the input).

readuntil(stream, delim) — Read a string, up to and including the given delimiter byte.

readlines(stream) — Read all lines as an array.

LineIterator(stream) — Create an iterable object that will yield each line from a stream.

## Standard numeric types

Bool, Int8, Uint8, Int16, Uint16, Int32, Uint32, Int64, Uint64, Float32, Float64, Complex64, Complex128

## Mathematical operators and functions

Unary Arithmetic — -

Binary Arithmetic — +, -, *, .*, /, ./, \, .\, ^, .^, div, mod

Comparison — ==, !=, <, <=, >, >=

Unary Boolean or Bitwise — ~

Binary Boolean or Bitwise — &, |, $

Trigonometric functions — sin, cos, tan, sinh, cosh, tanh, asin, acos, atan, atan2, sec, csc, cot, asec, acsc, acot, sech, csch, coth, asech, acsch, acoth, sinc, cosc, hypot

Logarithmic functions — log, log2, log10, log1p, logb, ilogb

Exponential functions — exp, expm1, exp2, ldexp

Rounding functions — ceil, floor, trunc, round, ipart, fpart

Other mathematical functions — min, max, abs, pow, sqrt, cbrt, erf, erfc, gamma, lgamma, real, conj, clamp

## Random numbers

Random numbers are generated in Julia by calling functions from the Mersenne Twister library.

rand — Generate a Float64 random number in (0,1)

randf — Generate a Float32 random number in (0,1)

randi(Int32|Uint32|Int64|Uint64) — Generate a random integer of the given type

randbit — Generate 1 or 0 at random

randbool — Generate a random boolean value

randn — Generate a normally distributed random number with mean 0 and standard deviation 1

randg(a) — Generate a sample from the gamma distribution with shape parameter a

randchi2(n) — Generate a sample from the chi-squared distribution with n degrees of freedom (also available as chi2rnd)

srand — Seed the RNG

## Arrays

### Basic functions

ndims(A) — Returns the number of dimensions of A

size(A) — Returns a tuple containing the dimensions of A

eltype(A) — Returns the type of the elements contained in A

numel(A) — Returns the number of elements in A

length(A) — Returns the size of the largest dimension of A

nnz(A) — Counts the number of nonzero values in A

stride(A, k) — Returns the size of the stride along dimension k

strides(A) — Returns a tuple of the linear index distances between adjacent elements in each dimension

### Constructors

Array(type, dims) — Construct an uninitialized dense array. dims may be a tuple or a series of integer arguments.

cell(dims) — Construct an uninitialized cell array (heterogeneous array). dims can be either a tuple or a series of integer arguments.

zeros(type, dims) — Create an array of all zeros of specified type

ones(type, dims) — Create an array of all ones of specified type

trues(dims) — Create a Bool array with all values set to true

falses(dims) — Create a Bool array with all values set to false

reshape(A, dims) — Create an array with the same data as the given array, but with different dimensions. An implementation for a particular type of array may choose whether the data is copied or shared.

fill(A, x) — Fill an array A with value x

copy(A) — Create a copy of A

similar(array, element_type, dims) — Create an uninitialized array of the same type as the given array, but with the specified element type and dimensions. The second and third arguments are both optional. The dims argument may be a tuple or a series of integer arguments.

reinterpret(type, A) — Construct an array with the same binary data as the given array, but with the specified element type

rand(dims) — Create a random array with Float64 random values in (0,1)

randf(dims) — Create a random array with Float32 random values in (0,1)

randn(dims) — Create a random array with Float64 normally distributed random values with a mean of 0 and standard deviation of 1

eye(n) — n-by-n identity matrix

eye(m, n) — m-by-n identity matrix

linspace(start, stop, n) — Construct a vector of n linearly-spaced elements from start to stop.
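
As a brief sketch, several of these constructors can be combined with the basic query functions listed earlier:

```julia
A = zeros(Float64, 2, 3)        # 2-by-3 array filled with 0.0
size(A)                         # (2,3)
ndims(A)                        # 2
numel(A)                        # 6
eltype(A)                       # Float64

B = reshape(A, (3, 2))          # same data, now with dimensions 3-by-2
size(B)                         # (3,2)

T = trues(2, 2)                 # 2-by-2 Bool array, all true
I2 = eye(2)                     # 2-by-2 identity matrix
x = linspace(0.0, 1.0, 5)       # 0.0, 0.25, 0.5, 0.75, 1.0
```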

### Mathematical operators and functions

All mathematical operations and functions are supported for arrays.

### Indexing, Assignment, and Concatenation

ref(A, ind) — Returns a subset of A as specified by ind, which may be an Int, a Range, or a Vector.

sub(A, ind) — Returns a SubArray, which stores the input A and ind rather than computing the result immediately. Calling ref on a SubArray computes the indices on the fly.

slicedim(A, d, i) — Return all the data of A where the index for dimension d equals i. Equivalent to A[:,:,...,i,:,:,...] where i is in position d.

assign(A, X, ind) — Store an input array X within some subset of A as specified by ind.

cat(dim, A...) — Concatenate the input arrays along the specified dimension

vcat(A...) — Concatenate along dimension 1

hcat(A...) — Concatenate along dimension 2

hvcat — Horizontal and vertical concatenation in one call

flipdim(A, d) — Reverse A in dimension d.

flipud(A) — Equivalent to flipdim(A,1).

fliplr(A) — Equivalent to flipdim(A,2).

find(A) — Return a vector of the linear indexes of the non-zeros in A.

findn(A) — Return a vector of indexes for each dimension giving the locations of the non-zeros in A.
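
The following sketch shows a few of these functions on a 2-by-2 matrix. The results in the comments follow from the descriptions above.

```julia
A = [1 2; 3 4]

A[2, 1]               # 3, same as ref(A, 2, 1)
hcat(A, A)            # 2-by-4: [1 2 1 2; 3 4 3 4]
vcat(A, A)            # 4-by-2: [1 2; 3 4; 1 2; 3 4]
cat(2, A, A)          # same result as hcat(A, A)

flipud(A)             # [3 4; 1 2]
fliplr(A)             # [2 1; 4 3]
slicedim(A, 1, 2)     # the data of A whose first index is 2: the values 3 and 4

find([0, 5, 0, 7])    # [2, 4]: linear indexes of the non-zeros
```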

## Linear Algebra

Linear algebra functions in Julia are largely implemented by calling functions from LAPACK.

* — Matrix multiplication

\ — Matrix division using a polyalgorithm. For input matrices A and B, the result X is such that A*X == B. For rectangular A, QR factorization is used. For triangular A, a triangular solve is performed. For square A, Cholesky factorization is tried if the input is symmetric with a heavy diagonal. LU factorization is used in case Cholesky factorization fails or for general square inputs.

dot — Compute the dot product

norm — Compute the norm of a Vector or a Matrix

(R, p) = chol(A) — Compute Cholesky factorization

(L, U, p) = lu(A) — Compute LU factorization

(Q, R, p) = qr(A) — Compute QR factorization

(D, V) = eig(A) — Compute eigenvalues and eigenvectors of A

(U, S, V) = svd(A) — Compute the SVD of A
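
As an illustration, the sketch below solves a small symmetric system with \ and checks the residual. The matrix and right-hand side are arbitrary example values.

```julia
A = [4.0 1.0; 1.0 3.0]
b = [1.0, 2.0]

x = A \ b              # solve A*x == b
norm(A*x - b)          # residual; expected to be close to zero

dot(b, b)              # 5.0
(D, V) = eig(A)        # eigenvalues and eigenvectors of A
(U, S, W) = svd(A)     # singular value decomposition of A
```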

## FFT

FFT functions in Julia are largely implemented by calling functions from FFTW.

fft(A, dim) — One dimensional FFT if input is a Vector. For n-d cases, compute fft of vectors along dimension dim

fft2 — 2d FFT

fft3 — 3d FFT

fftn — N-d FFT

ifft(A, dim) — Inverse FFT. Same arguments as fft

ifft2 — Inverse 2d FFT

ifft3 — Inverse 3d FFT

ifftn — Inverse N-d FFT

## Parallel Computing

addprocs_local(n) — Add processes on the local machine. Can be used to take advantage of multiple cores.

addprocs_ssh({"host1","host2",...}) — Add processes on remote machines via SSH. Requires julia to be installed in the same location on each node, or to be available via a shared file system.

addprocs_sge(n) — Add processes via the Sun/Oracle Grid Engine batch queue, using qsub.

nprocs() — Get the number of available processors

myid() — Get the id of the current processor

remote_call(id, func, args...) — Call a function asynchronously on the given arguments on the specified processor. Returns a RemoteRef.

wait(RemoteRef) — Wait for a value to become available for the specified remote reference.

fetch(RemoteRef) — Wait for and get the value of a remote reference.

put(RemoteRef, value) — Store a value to a remote reference. Implements "shared queue of length 1" semantics: if a value is already present, blocks until the value is removed with take.

take(RemoteRef) — Fetch the value of a remote reference, removing it so that the reference is empty again.

RemoteRef() — Make an uninitialized remote reference on the local machine.

RemoteRef(n) — Make an uninitialized remote reference on processor n.
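
A minimal sketch of these primitives follows. It targets the current processor via myid() so that it does not depend on any extra processors having been added.

```julia
# Asynchronous call: returns a RemoteRef immediately.
r = remote_call(myid(), +, 1, 2)
fetch(r)               # 3: waits for the result and returns it

# A remote reference used as a shared queue of length 1.
rr = RemoteRef()       # empty reference on the local machine
put(rr, 42)            # store a value; rr is now full
take(rr)               # 42; rr is empty again
```

Replacing myid() with another processor id runs the call on that processor instead, provided it has been added with one of the addprocs functions.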

# 20. Potential Features

Julia is still a very young programming language, and there are many features that have been discussed and planned to various extents, but not yet implemented. This page documents some of these potential future features, but is likely to be out of date. See the mailing list at julia-math@googlegroups.com and GitHub issues for the latest discussion on potential features.

### Immutability

TODO: add discussion / links to discussions about immutability here.


© 2010-2011 Stefan Karpinski, Jeff Bezanson, Viral Shah, Alan Edelman.

The Julia Manual — All Rights Reserved.