Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A multivariate lognormal using MvNormal internally. #455

Merged
merged 1 commit into from
Feb 6, 2016
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
115 changes: 97 additions & 18 deletions doc/source/multivariate.rst
Original file line number Diff line number Diff line change
Expand Up @@ -47,38 +47,38 @@ Computation of statistics

.. function:: entropy(d)

Return the entropy of distribution ``d``.
Return the entropy of distribution ``d``.


Probability evaluation
~~~~~~~~~~~~~~~~~~~~~~~

.. function:: insupport(d, x)

If ``x`` is a vector, it returns whether x is within the support of ``d``.
If ``x`` is a matrix, it returns whether every column in ``x`` is within the support of ``d``.
If ``x`` is a vector, it returns whether x is within the support of ``d``.
If ``x`` is a matrix, it returns whether every column in ``x`` is within the support of ``d``.

.. function:: pdf(d, x)

Return the probability density of distribution ``d`` evaluated at ``x``.

- If ``x`` is a vector, it returns the result as a scalar.
- If ``x`` is a vector, it returns the result as a scalar.
- If ``x`` is a matrix with n columns, it returns a vector ``r`` of length n, where ``r[i]`` corresponds to ``x[:,i]`` (i.e. treating each column as a sample).

.. function:: pdf!(r, d, x)

Evaluate the probability densities at columns of x, and write the results to a pre-allocated array r.
Evaluate the probability densities at columns of x, and write the results to a pre-allocated array r.

.. function:: logpdf(d, x)

Return the logarithm of probability density evaluated at ``x``.

- If ``x`` is a vector, it returns the result as a scalar.
- If ``x`` is a vector, it returns the result as a scalar.
- If ``x`` is a matrix with n columns, it returns a vector ``r`` of length n, where ``r[i]`` corresponds to ``x[:,i]``.

.. function:: logpdf!(r, d, x)

Evaluate the logarithm of probability densities at columns of x, and write the results to a pre-allocated array r.
Evaluate the logarithm of probability densities at columns of x, and write the results to a pre-allocated array r.

.. function:: loglikelihood(d, x)

Expand All @@ -100,7 +100,7 @@ Sampling

.. function:: rand!(d, x)

Draw samples and output them to a pre-allocated array x. Here, x can be either a vector of length ``dim(d)`` or a matrix with ``dim(d)`` rows.
Draw samples and output them to a pre-allocated array x. Here, x can be either a vector of length ``dim(d)`` or a matrix with ``dim(d)`` rows.


**Node:** In addition to these common methods, each multivariate distribution has its own special methods, as introduced below.
Expand All @@ -117,14 +117,14 @@ The probability mass function is given by

.. math::

f(x; n, p) = \frac{n!}{x_1! \cdots x_k!} \prod_{i=1}^k p_i^{x_i},
f(x; n, p) = \frac{n!}{x_1! \cdots x_k!} \prod_{i=1}^k p_i^{x_i},
\quad x_1 + \cdots + x_k = n

.. code-block:: julia

Multinomial(n, p) # Multinomial distribution for n trials with probability vector p

Multinomial(n, k) # Multinomial distribution for n trials with equal probabilities
Multinomial(n, k) # Multinomial distribution for n trials with equal probabilities
# over 1:k


Expand All @@ -133,7 +133,7 @@ The probability mass function is given by
Multivariate Normal Distribution
----------------------------------

The `Multivariate normal distribution <http://en.wikipedia.org/wiki/Multivariate_normal_distribution>`_ is a multidimensional generalization of the *normal distribution*. The probability density function of a d-dimensional multivariate normal distribution with mean vector :math:`\boldsymbol{\mu}` and covariance matrix :math:`\boldsymbol{\Sigma}` is
The `Multivariate normal distribution <http://en.wikipedia.org/wiki/Multivariate_normal_distribution>`_ is a multidimensional generalization of the *normal distribution*. The probability density function of a d-dimensional multivariate normal distribution with mean vector :math:`\boldsymbol{\mu}` and covariance matrix :math:`\boldsymbol{\Sigma}` is

.. math::

Expand Down Expand Up @@ -231,7 +231,7 @@ Multivariate normal distribution is an `exponential family distribution <http://

.. math::

\mathbf{h} = \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}, \quad \text{ and } \quad \mathbf{J} = \boldsymbol{\Sigma}^{-1}
\mathbf{h} = \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}, \quad \text{ and } \quad \mathbf{J} = \boldsymbol{\Sigma}^{-1}

The canonical parameterization is widely used in Bayesian analysis. We provide a type ``MvNormalCanon``, which is also a subtype of ``AbstractMvNormal`` to represent a multivariate normal distribution using canonical parameters. Particularly, ``MvNormalCanon`` is defined as:

Expand All @@ -247,12 +247,12 @@ We also define aliases for common specializations of this parametric type:

.. code:: julia

typealias FullNormalCanon MvNormalCanon{PDMat, Vector{Float64}}
typealias DiagNormalCanon MvNormalCanon{PDiagMat, Vector{Float64}}
typealias FullNormalCanon MvNormalCanon{PDMat, Vector{Float64}}
typealias DiagNormalCanon MvNormalCanon{PDiagMat, Vector{Float64}}
typealias IsoNormalCanon MvNormalCanon{ScalMat, Vector{Float64}}

typealias ZeroMeanFullNormalCanon MvNormalCanon{PDMat, ZeroVector{Float64}}
typealias ZeroMeanDiagNormalCanon MvNormalCanon{PDiagMat, ZeroVector{Float64}}
typealias ZeroMeanFullNormalCanon MvNormalCanon{PDMat, ZeroVector{Float64}}
typealias ZeroMeanDiagNormalCanon MvNormalCanon{PDiagMat, ZeroVector{Float64}}
typealias ZeroMeanIsoNormalCanon MvNormalCanon{ScalMat, ZeroVector{Float64}}

A multivariate distribution with canonical parameterization can be constructed using a common constructor ``MvNormalCanon`` as:
Expand Down Expand Up @@ -286,6 +286,85 @@ A multivariate distribution with canonical parameterization can be constructed u

**Note:** ``MvNormalCanon`` share the same set of methods as ``MvNormal``.

.. _multivariatelognormal:

Multivariate Lognormal Distribution
-----------------------------------

The `Multivariate lognormal distribution <http://en.wikipedia.org/wiki/Log-normal_distribution>`_ is a multidimensional generalization of the *lognormal distribution*.

If :math:`\boldsymbol X \sim \mathcal{N}(\boldsymbol\mu,\,\boldsymbol\Sigma)` has a multivariate normal distribution then :math:`\boldsymbol Y=\exp(\boldsymbol X)` has a multivariate lognormal distribution.

Mean vector :math:`\boldsymbol{\mu}` and covariance matrix :math:`\boldsymbol{\Sigma}` of the underlying normal distribution are known as the *location* and *scale* parameters of the corresponding lognormal distribution.

The package provides an implementation, ``MvLogNormal``, which wraps around ``MvNormal``:

.. code-block:: julia

immutable MvLogNormal <: AbstractMvLogNormal
normal::MvNormal
end

Construction
~~~~~~~~~~~~

``MvLogNormal`` provides the same constructors as ``MvNormal``. See above for details.

Additional Methods
~~~~~~~~~~~~~~~~~~

In addition to the methods listed in the common interface above, we also provide the following methods:

.. function:: location(d)

Return the location vector of the distribution (the mean of the underlying normal distribution).

.. function:: scale(d)

Return the scale matrix of the distribution (the covariance matrix of the underlying normal distribution).

.. function:: median(d)

Return the median vector of the lognormal distribution. which is strictly smaller than the mean.

.. function:: mode(d)

Return the mode vector of the lognormal distribution, which is strictly smaller than the mean and median.

Conversion Methods
~~~~~~~~~~~~~~~~~~

It can be necessary to calculate the parameters of the lognormal (location vector and scale matrix) from a given covariance and mean, median or mode. To that end, the following functions are provided.

.. function:: location{D<:AbstractMvLogNormal}(::Type{D},s::Symbol,m::AbstractVector,S::AbstractMatrix)

Calculate the location vector (the mean of the underlying normal distribution).

If ``s == :meancov``, then m is taken as the mean, and S the covariance matrix of a lognormal distribution.

If ``s == :mean | :median | :mode``, then m is taken as the mean, median or mode of the lognormal respectively, and S is interpreted as the scale matrix (the covariance of the underlying normal distribution).

It is not possible to analytically calculate the location vector from e.g., median + covariance, or from mode + covariance.

.. function:: location!{D<:AbstractMvLogNormal}(::Type{D},s::Symbol,m::AbstractVector,S::AbstractMatrix,μ::AbstractVector)

Calculate the location vector (as above) and store the result in ``μ``

.. function:: scale{D<:AbstractMvLogNormal}(::Type{D},s::Symbol,m::AbstractVector,S::AbstractMatrix)

Calculate the scale parameter, as defined for the location parameter above.

.. function:: scale!{D<:AbstractMvLogNormal}(::Type{D},s::Symbol,m::AbstractVector,S::AbstractMatrix,Σ::AbstractMatrix)

Calculate the scale parameter, as defined for the location parameter above and store the result in ``Σ``.

.. function:: params{D<:AbstractMvLogNormal}(::Type{D},m::AbstractVector,S::AbstractMatrix)

Return (scale,location) for a given mean and covariance

.. function:: params!{D<:AbstractMvLogNormal}(::Type{D},m::AbstractVector,S::AbstractMatrix,μ::AbstractVector,Σ::AbstractMatrix)

Calculate (scale,location) for a given mean and covariance, and store the results in ``μ`` and ``Σ``


.. _dirichlet:
Expand All @@ -298,7 +377,7 @@ The `Dirichlet distribution <http://en.wikipedia.org/wiki/Dirichlet_distribution
.. math::

f(x; \alpha) = \frac{1}{B(\alpha)} \prod_{i=1}^k x_i^{\alpha_i - 1}, \quad \text{ with }
B(\alpha) = \frac{\prod_{i=1}^k \Gamma(\alpha_i)}{\Gamma \left( \sum_{i=1}^k \alpha_i \right)},
B(\alpha) = \frac{\prod_{i=1}^k \Gamma(\alpha_i)}{\Gamma \left( \sum_{i=1}^k \alpha_i \right)},
\quad x_1 + \cdots + x_k = 1


Expand All @@ -308,7 +387,7 @@ The `Dirichlet distribution <http://en.wikipedia.org/wiki/Dirichlet_distribution
Dirichlet(alpha) # Dirichlet distribution with parameter vector alpha

# Let a be a positive scalar
Dirichlet(k, a) # Dirichlet distribution with parameter a * ones(k)
Dirichlet(k, a) # Dirichlet distribution with parameter a * ones(k)



Expand Down
6 changes: 5 additions & 1 deletion src/Distributions.jl
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ using StatsBase
using Compat

import Base.Random
import Base: size, eltype, length, full, convert, show, getindex, scale, rand, rand!
import Base: size, eltype, length, full, convert, show, getindex, scale, scale!, rand, rand!
import Base: sum, mean, median, maximum, minimum, quantile, std, var, cov, cor
import Base: +, -, .+, .-
import Base.Math.@horner
Expand Down Expand Up @@ -99,6 +99,7 @@ export
MixtureModel,
Multinomial,
MultivariateNormal,
MvLogNormal,
MvNormal,
MvNormalCanon,
MvNormalKnownCov,
Expand Down Expand Up @@ -207,6 +208,7 @@ export
sqmahal, # squared Mahalanobis distance to Gaussian center
sqmahal!, # inplace evaluation of sqmahal
location, # get the location parameter
location!, # provide storage for the location parameter (used in multivariate distribution mvlognormal)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little hesitant to export this without support for more distributions, but it does follow the pattern of many other functions defined here. Same for params! and scale! below.

Can you open an issue for defining these for other distributions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kmsquire I opened an issue (#459). The in-place variants would only be relevant to multivariate distributions, and the other multivariates that have been implemented to date may not have a location or scale, but I'll double-check it.

I also realized that location and scale should be defined for some univariate distributions that use that terminology, yet some are missing. I opened #460 for that.

mean, # mean of distribution
meandir, # mean direction (of a spherical distribution)
meanform, # convert a normal distribution from canonical form to mean form
Expand All @@ -221,6 +223,7 @@ export
ncomponents, # the number of components in a mixture model
ntrials, # the number of trials being performed in the experiment
params, # get the tuple of parameters
params!, # provide storage space to calculate the tuple of parameters for a multivariate distribution like mvlognormal
pdf, # probability density function (ContinuousDistribution)
pmf, # probability mass function (DiscreteDistribution)
probs, # Get the vector of probabilities
Expand All @@ -230,6 +233,7 @@ export
rate, # get the rate parameter
sampler, # create a Sampler object for efficient samples
scale, # get the scale parameter
scale!, # provide storage for the scale parameter (used in multivariate distribution mvlognormal)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that scale and scale! are both defined in base, but with a different meaning (that of scaling a vector or matrix, as opposed to extracting a vector).

It's probably okay here (especially since scale has been defined, but in general, having the same function refer to different concepts is discouraged in Julia.

@StefanKarpinski, I often want to cite one of your talks or an issue when saying this, but I can't find a good one--any suggestions? Assuming, of course, that I'm not misrepresenting the idea.

(@KrisDM, this probably has no direct impact on this PR, just things to keep in mind.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kmsquire @StefanKarpinski True, but it's an occasion where Julia's out-of-class function definitions show a problem of - excuse the pun - scalability. "scale" has a different meaning for matrices as it has for distributions. In a C++ class for the lognormal distirbution, we wouldn't think twice about using scale as a function name (even if the standard library would contain an entirely unrelated function scale) because the class context would define the semantics of the word,

I have to say that one thing that surprised me about Julia is that if a method like "scale" is defined in base, and I want to also define it in my module, I have to import and then export it again to avoid warnings (which happened the last time I tried this in 0.3 - haven't tried this anymore in 0.4 so apologies if already changed). A more scalable behaviour, in my view, would be to allow separate definition in the two namespaces, and if I import the two definitions into the same workspace, let the mulitple-dispatch figure out which one to call.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a defensibly similar usage of "scale" to me, so I would just extend Base's versions of these.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if you don't feel so, don't listen to me – I'm not an expert on this domain.

shape, # get the shape parameter
skewness, # skewness of the distribution
span, # the span of the support, e.g. maximum(d) - minimum(d)
Expand Down
Loading