Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distributions on manifolds #708

Open
aplavin opened this issue Feb 6, 2024 · 8 comments
Open

Distributions on manifolds #708

aplavin opened this issue Feb 6, 2024 · 8 comments

Comments

@aplavin
Copy link

aplavin commented Feb 6, 2024

I'm looking for a way to describe distributions on manifolds (for now, spheres), so that their pdf is defined in a consistent way. This repo seems to be the natural place to ask such a question. Maybe, Manifolds.jl already contains something relevant?

Let's take the simplest example, a uniform distribution on the unit sphere in 3d. What is its pdf?

  • Using the measure of the parent 3d space where the sphere is embedded, pdf is zero or infinity.
  • Using the area measure on the sphere surface (2d), pdf = 1/4pi.
  • Parametrizing the sphere with latitude & longitude, and using the natural area measure in the lat-lon space, pdf = cos(lat)/4pi.

What are consistent ways to handle all these variants?

@kellertuer
Copy link
Member

HI,
this is a super-interesting topic. Quite a while back, we started https://github.com/JuliaManifolds/ManifoldMeasures.jl to address this, see also this #57 discussion which we started quite some time ago.

I think, either the mentioned started package or a new approach with https://github.com/JuliaStats/Distributions.jl, that is combining it with ManifoldsBase.jl would be the right way to go.

I personally am an optimiser, so the main thing I needed is “some random” starting point. That is what the Manifolds.jl function rand (or its in-place variant rand!) principally offer, so for now I fear there is no consistent implementation of what you are asking for – but I would be super happy to support you if you plan to start something like that.

@mateuszbaran
Copy link
Member

This is a very interesting and nontrivial topic, and I'm happy to say that we have more or less all basic building blocks to answer such questions. It's mostly a matter of developing applications.

There are many ways to define distributions on a sphere. Probably the most famous example is the von Mises-Fisher distribution. I haven't added it to Manifolds.jl yet (because it's already in other packages) but we have something better: a generic way of converting Euclidean distributions into manifold distributions. See here: https://juliamanifolds.github.io/Manifolds.jl/latest/tutorials/integration.html . Note that Manifolds.jl tries to work in a coordinate-free way, so it focuses on your second question. Libraries like Bijectors.jl seem to have a good enough interface for working in coordinates so I didn't try to compete there but we could develop it further here if it's needed.

I think the main issue with going forward is answering the question of what we want to do with such distributions. Just sampling and computing (log)pdfs? Do we need to compute moments? Do we need parameter estimation? Do we need it to work with some PPLs?

@aplavin
Copy link
Author

aplavin commented Feb 7, 2024

Thanks for those perspectives, glad there are others thinking about this in Julia :)

My usecases only involve random number generation and pdf evaluation (mostly, for MCMC sampling). It's just nontrivial to define a concrete programming object "uniform distribution on the unit sphere".
After looking at those packages, I'm not sure how one would use Manifold[Measures] or Bijectors to support everything relevant to such a distribution. And the Distributions API definitely doesn't seem flexible enough for that! There are two sets of choices that are orthogonal:

  • Point on a sphere can be defined by 3d coordinates or 2d spherical coordinates
  • Density can be wrt 3D volume, 2D area on sphere surface, or area in 2d spherical coordinates space

Potential API could look like this, I guess:

d = Uniform(Sphere())

rand(d, Cartesian)  # return 3d vector
rand(d, LonLat)  # return 2d vector of [lon, lat]

pdf(d, [1, 0, 0])  # error, not clear which measure to use

d_vol = assume_measure(d, VolumeMeasure)
pdf(d_vol, [1, 0, 0]) == Inf  # refer to a point by 3d vector
pdf(d_vol, [0, 0]) == Inf  # refer to a point by 2d vector of [lon, lat]

d_area = assume_measure(d, AreaMeasure)
pdf(d_area, [1, 0, 0]) == 1/4pi
pdf(d_area, [0, 0]) == 1/4pi

d_ll = assume_measure(d, LonLatMeasure)
pdf(d_ll, [1, 0, 0]) == cos(0)/4pi
pdf(d_ll, [0, 0]) == cos(0)/4pi

What do you think, does it look reasonable?

@mateuszbaran
Copy link
Member

d = Uniform(Sphere())

rand(d, Cartesian)  # return 3d vector
rand(d, LonLat)  # return 2d vector of [lon, lat]

rand already has a wide range of standard call patterns (types, RNG generators, number of samples...) so adding another thing as a positional argument is quite hard to do consistently. We decided in Manifolds.jl that it's not worth the effort and all non-standard customization is done either through the "sampler" object or via keyword arguments.

pdf(d_vol, [1, 0, 0]) == Inf  # refer to a point by 3d vector
pdf(d_vol, [0, 0]) == Inf  # refer to a point by 2d vector of [lon, lat]

Going by my experience, I'd either wrap the second argument of pdf in a wrapper struct that indicates whether it means a 3d vector or lon-lat coordinates or make d_vol assume a single representation. After all, what would you do if someone wanted to extend it to other spherical coordinate systems? There would be no way to distinguish between them. Manifolds.jl generally assumes that there is one particular non-wrapped point representation and all other representations need a wrapper.

BTW, I don't really see what would the volume measure be useful for?

@aplavin
Copy link
Author

aplavin commented Feb 8, 2024

BTW, I don't really see what would the volume measure be useful for?

Yeah probably not really useful. My point there was that we shouldn't have pdf(Uniform(Sphere()), [1, 0, 0]) equal to anything finite really: it would break $\int pdf(dist, dx) dx = 1$. Only explicitly opting for a specific measure makes sense for pdf().

I'd either wrap the second argument of pdf in a wrapper struct that indicates whether it means a 3d vector or lon-lat coordinates

Totally, makes sense! So,

pdf(d_area, [1, 0, 0]) == 1/4pi
pdf(d_area, LonLat(0, 0)) == 1/4pi  # generic types like LonLat, PolAz, ...
pdf(d_area, SkyCoords.GalCoords(0, 0)) == 1/4pi  # can work with existing spherical coordinate types from astronomy and geo packages

right?

rand already has a wide range of standard call patterns (types, RNG generators, number of samples...) so adding another thing as a positional argument is quite hard to do consistently. We decided in Manifolds.jl that it's not worth the effort and all non-standard customization is done either through the "sampler" object or via keyword arguments.

Indeed, it can easily be a keyword argument. Plays nicely with the pdf definition above as well!

rand(d_area)::SVector{3}
rand(d_area, T=LonLat)::LonLat
rand(d_area, T=SkyCoords.GalCoords)::SkyCoords.GalCoords

Is that similar to what you are talking about?
Also, what exactly do you mean by going through the sampler object in this scenario?

@mateuszbaran
Copy link
Member

Yeah probably not really useful. My point there was that we shouldn't have pdf(Uniform(Sphere()), [1, 0, 0]) equal to anything finite really: it would break
. Only explicitly opting for a specific measure makes sense for pdf().

I think I'd prefer to have that pdf undefined instead of infinite to avoid issues with tolerances (at which point we change from 0 to infinity?)

Totally, makes sense! So,

pdf(d_area, [1, 0, 0]) == 1/4pi
pdf(d_area, LonLat(0, 0)) == 1/4pi  # generic types like LonLat, PolAz, ...
pdf(d_area, SkyCoords.GalCoords(0, 0)) == 1/4pi  # can work with existing spherical coordinate types from astronomy and geo packages

right?

Yes, that makes sense.

Indeed, it can easily be a keyword argument. Plays nicely with the pdf definition above as well!

rand(d_area)::SVector{3}
rand(d_area, T=LonLat)::LonLat
rand(d_area, T=SkyCoords.GalCoords)::SkyCoords.GalCoords

Is that similar to what you are talking about? Also, what exactly do you mean by going through the sampler object in this scenario?

Sure, that also looks right.

By "going through the sampler" I mean that d could have a field that determines whether it should return random points as a 3-vector, LonLat, GalCoords or something else.

We can have a small new package for this in JuliaManifolds if you are interested in developing it. I can help but I won't have much time to work on it.

@aplavin
Copy link
Author

aplavin commented Feb 8, 2024

By "going through the sampler" I mean that d could have a field that determines whether it should return random points as a 3-vector, LonLat, GalCoords or something else.

Indeed, that's another possible approach. Basically, dist = Uniform(Sphere()) |> with_measure(measure) |> with_sampletype(LonLat).

We can have a small new package for this in JuliaManifolds if you are interested in developing it. I can help but I won't have much time to work on it.

I may draft something in the near future, with implementations of a few distributions on the sphere. I'm actually only interested in spherical ones for now, but interface would be extensible.
What existing parts of Manifolds.jl could be useful there, aside from the definition of a sphere?
I saw ManifoldMeasure, but it seems to define "measures" in the sense of "distributions", not "measures" as in area/volume/... .

@mateuszbaran
Copy link
Member

I may draft something in the near future, with implementations of a few distributions on the sphere. I'm actually only interested in spherical ones for now, but interface would be extensible.

Cool!

What existing parts of Manifolds.jl could be useful there, aside from the definition of a sphere?

Definitely manifold_volume and for some cases volume_density could be useful. Maybe something else too but it depends on the distribution.

I saw ManifoldMeasure, but it seems to define "measures" in the sense of "distributions", not "measures" as in area/volume/... .

ManifoldMeasures is also worth a look because it already has a few spherical distributions defined. It uses a more generic terminology from measure theory but if you look at the implementation, rand is just making random points and logdensity_def computes the logarithm of PDF (with respect to Riemannian volume, the natural generalization of Lebesgue measure to manifolds).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants