Compute Wasserstein distance between a density and a sum of Diracs #184
Comments
Hey, @Vilin97. You don't need to use histograms per se. You can sample the distributions and compute either the Sinkhorn distance or the exact distance. It has been some time since I last used the package. I'll recover some notebooks I have, and perhaps I can do a quick example for your case. It should be straightforward.
Thank you for the answer, @davibarreira . Given X, Y, both d x n matrices (d is the dimension and n is the size of the sample), how can I compute the W2 distance between the empirical distributions given by X and Y? I did not understand how to do it from the documentation of |
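using Random, LinearAlgebra
using Distributions, Distances
using OptimalTransport, Tulip

Random.seed!(3)

# Source: N samples from a standard 2D normal, with uniform weights
σ1 = MvNormal(I(2))
N = 100
μ = fill(1 / N, N)
μsupport = rand(σ1, N)'   # N × 2, one sample per row

# Target: M samples from a 2D normal centered at (5, 5), with uniform weights
M = 50
σ2 = MvNormal([5, 5], I(2))
ν = fill(1 / M, M)
νsupport = rand(σ2, M)'   # M × 2, one sample per row

# Cost matrix of squared Euclidean distances, size N × M
C = pairwise(SqEuclidean(), μsupport', νsupport'; dims=2)

# Exact total cost; with this squared cost it equals W2², so W2 = sqrt(γ)
γ = emd2(μ, ν, C, Tulip.Optimizer())

# Entropy-regularized (Sinkhorn) cost
ε = 0.5
s = sinkhorn2(μ, ν, C, ε)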
@Vilin97, does the code above answer your questions? I'm sampling two multivariate normal distributions and then constructing the empirical (Dirac) distributions from the samples. Then, I compute the cost matrix |
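For the d x n matrix layout asked about earlier in the thread, the same steps can be wrapped in a small helper. This is only a sketch: empirical_w2 is a hypothetical name, not a function of OptimalTransport.jl, and it assumes uniform weights on the columns of X and Y and the Tulip solver used in the snippet above.

```julia
using Distances: pairwise, SqEuclidean
using OptimalTransport: emd2
using Tulip

# Hypothetical helper (not part of OptimalTransport.jl): exact W2 distance
# between the uniform empirical measures supported on the columns of X and Y.
function empirical_w2(X::AbstractMatrix, Y::AbstractMatrix)
    size(X, 1) == size(Y, 1) || throw(DimensionMismatch("X and Y must have the same number of rows"))
    n, m = size(X, 2), size(Y, 2)
    μ = fill(1 / n, n)                          # uniform weights on the columns of X
    ν = fill(1 / m, m)                          # uniform weights on the columns of Y
    C = pairwise(SqEuclidean(), X, Y; dims=2)   # n × m matrix of squared distances
    cost = emd2(μ, ν, C, Tulip.Optimizer())     # optimal total cost, i.e. W2²
    return sqrt(max(cost, zero(cost)))          # clamp solver round-off before the root
end
```

One way to sanity-check the helper: shifting every sample by the vector (3, 4) should give a distance close to 5, since translating a measure by t costs exactly |t| in W2. The sample sizes n and m do not need to match.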
Thank you so much for this snippet! I will play around with it when I get to my laptop, but at first glance it looks like exactly what I wanted. Thank you!
The code you gave works. Thank you! |
I have two distributions in d-dimensional space, between which I want to compute the Wasserstein distance. One distribution is a sum of Dirac delta functions (i.e. an empirical distribution), and the other is given by a density (e.g. a Gaussian). Is my best option to compute histograms of both and compute the distance between the histograms? I don't like this approach because the result will depend on the bin width, and choosing the bin width is a hard problem. Is there a better way?
Here is what I have so far:
Questions:
size(C) == (size(μ, 1), size(ν, 1)) in checksize. I don't quite understand what C should be when μ and ν are not vector-valued.