Simple stick breaking #4129

katosh · 2020-09-23T18:57:00Z

This is another attempt to introduce a new transformation of the n-simplex. The stickbreaking transformation is prominently used by the Dirichlet distribution, as it maps the support of the distribution (the n-simplex) to R^(n-1), where we can sample freely and apply, e.g., ADVI. The issue with the current implementation is that the transformation of later values in the vector depends on previous values. This introduces a dependency that can confound ADVI and seems to produce numerical inaccuracies in some cases.

There was a previous attempt to merge the new transformation, but it had a mistake in the determinant of the Jacobian: #3638

The current stickbreaking in master is an implementation of the transformation from Stan: https://mc-stan.org/docs/2_19/reference-manual/simplex-transform-section.html, which is just a repeated application of the logit transformation with an adjusted range.
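
For reference, the Stan-style transformation linked above can be sketched in NumPy. This follows the formulas in the Stan reference manual rather than the PyMC3 source; the function names are made up for illustration, and the eps clipping used in the PyMC3 implementation is left out.

```python
import numpy as np

def stan_forward(x):
    """Map a point x on the simplex (length K) to y in R^(K-1)."""
    x = np.asarray(x, dtype=float)
    K = x.shape[0]
    # Stick length left before each break: 1, 1 - x_1, 1 - x_1 - x_2, ...
    remaining = 1.0 - np.concatenate([[0.0], np.cumsum(x[:-2])])
    z = x[:-1] / remaining  # fraction of the remaining stick taken at each break
    k = np.arange(1, K)
    # logit(z_k), shifted so that y = 0 corresponds to the uniform point
    return np.log(z) - np.log1p(-z) + np.log(K - k)

def stan_backward(y):
    """Map y in R^(K-1) back to the simplex."""
    y = np.asarray(y, dtype=float)
    K = y.shape[0] + 1
    k = np.arange(1, K)
    z = 1.0 / (1.0 + np.exp(-(y - np.log(K - k))))  # inverse logit with shift
    x = np.empty(K)
    remaining = 1.0
    for i in range(K - 1):
        # each entry depends on how much stick the earlier entries left over
        x[i] = remaining * z[i]
        remaining -= x[i]
    x[-1] = remaining
    return x
```

The loop in stan_backward makes the sequential dependency visible: entry i can only be computed once all earlier entries are known.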

Advantages

It is equivalent to the isometric log-ratio transformation up to rotation. This transformation has some nice properties: https://link.springer.com/article/10.1023/A:1023818214614
simpler implementation
no eps for numeric stability (as in https://github.com/pymc-devs/pymc3/blob/ba77d8502704e8aeb112782ee104fb339393cb19/pymc3/distributions/transforms.py#L475)
it transforms all values equally from R to the simplex (no bias)
less dependency among vector values introduced through the transformation (better for non-full-rank ADVI)
it is numerically more stable close to the edges of the simplex, e.g., in this example:

Current StickBreaking

import numpy as np
import pandas as pd
import pymc3 as pm

with pm.Model() as model:
    decomp = pm.Dirichlet('decomp', np.ones(10)*5e-3, shape=10)
    trace1 = pm.sample()
pd.DataFrame(trace1['decomp_stickbreaking__']).plot.kde(figsize=(10,4));

New StickBreaking2

import numpy as np
import pandas as pd
import pymc3 as pm
from pymc3.distributions.transforms import StickBreaking2

with pm.Model() as model:
    decomp = pm.Dirichlet('decomp', np.ones(10)*5e-3, shape=10,
                          transform=StickBreaking2())
    trace2 = pm.sample()
pd.DataFrame(trace2['decomp_stickbreaking__']).plot.kde(figsize=(10,4));

The PR includes tests and there are no breaking changes as it only introduces a new transformation pymc3.distributions.transforms.StickBreaking2 and leaves the original pymc3.distributions.transforms.StickBreaking untouched.
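
As a rough illustration of the idea, here is a NumPy sketch of a symmetric log-ratio transform of this kind. It assumes the forward map is the centered log-ratio with the last coordinate dropped, and the backward map is a softmax after appending -sum(y). The function names are hypothetical and the code is not copied from the PR.

```python
import numpy as np

def clr_forward(x):
    """Simplex point x (length K) -> y in R^(K-1) via centered log-ratios."""
    log_x = np.log(np.asarray(x, dtype=float))
    # subtract the mean log so the K coordinates sum to zero, then drop the last
    return (log_x - log_x.mean())[:-1]

def clr_backward(y):
    """y in R^(K-1) -> simplex point of length K."""
    y = np.asarray(y, dtype=float)
    # restore the dropped coordinate from the sum-to-zero constraint
    z = np.concatenate([y, [-y.sum()]])
    # numerically stable softmax
    e = np.exp(z - z.max())
    return e / e.sum()
```

Under this assumption every coordinate is treated the same way, with no sequential pass over the vector and no eps clipping.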

pymc3/distributions/transforms.py

codecov · 2020-09-23T23:27:54Z

Codecov Report

Merging #4129 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #4129   +/-   ##
=======================================
  Coverage   88.74%   88.74%           
=======================================
  Files          89       89           
  Lines       14037    14024   -13     
=======================================
- Hits        12457    12446   -11     
+ Misses       1580     1578    -2

Impacted Files	Coverage Δ
pymc3/distributions/transforms.py	`97.70% <100.00%> (+0.25%)`	⬆️
pymc3/distributions/continuous.py	`92.93% <0.00%> (+0.11%)`	⬆️

katosh · 2020-09-24T12:14:57Z

I will remove the NumPy implementation of the backward transformation, StickBreaking2.backwards_val, since it is not tested and not implemented for any other transformation.

katosh · 2020-09-24T13:12:20Z

I investigated sampling divergences in the examples above. I changed the parameter for the Dirichlet distribution to np.ones(10)*1e-2 so that not all NUTS samples diverge. Then I used a pairplot_divergence function analogous to the one in https://docs.pymc.io/notebooks/Diagnosing_biased_Inference_with_Divergences.html.

import matplotlib.pyplot as plt

def pairplot_divergence(trace, var1, var2, i1=0, i2=0):
    """Scatter var1[i1] against var2[i2] and highlight divergent samples in red."""
    v1 = trace.get_values(varname=var1, combine=True)[:, i1]
    v2 = trace.get_values(varname=var2, combine=True)[:, i2]
    _, ax = plt.subplots(1, 1, figsize=(10, 5))
    # all samples in blue
    ax.plot(v1, v2, 'o', color='b', alpha=.5)
    # boolean mask of divergent transitions recorded by the sampler
    divergent = trace['diverging']
    ax.plot(v1[divergent], v2[divergent], 'o', color='r')
    ax.set_xlabel('{}[{}]'.format(var1, i1))
    ax.set_ylabel('{}[{}]'.format(var2, i2))
    ax.set_title('scatter plot between {}[{}] and {}[{}]'.format(var1, i1, var2, i2))
    return ax

Current StickBreaking

pairplot_divergence(trace1, 'decomp', 'decomp', i1=2, i2=3)

New StickBreaking2

pairplot_divergence(trace2, 'decomp', 'decomp', i1=2, i2=3)

Conclusion

The parameterization from StickBreaking2 can cure divergences and hence avoid biases in some cases.

katosh · 2020-09-24T15:32:04Z

It seems forward_val is sometimes called with an argument point, e.g., here:
https://github.com/pymc-devs/pymc3/blob/ba77d8502704e8aeb112782ee104fb339393cb19/pymc3/util.py#L183
So I will include the ignored parameter in StickBreaking2.forward_val.

brandonwillard

Overall, this alternative stickbreaking seems fine, but why wouldn't it entirely replace the old one?

Also, if possible, this PR should add a new test that confirms one of the advantages of this transform over the old one. Ideally, such a test wouldn't require anything as costly as sampling. Is there a value range that demonstrates the improved numerical stability?

RELEASE-NOTES.md

twiecki · 2020-09-26T12:23:07Z

@katosh I would then still take the eps kwarg and raise a deprecation error.

katosh · 2020-09-26T15:50:40Z

Code coverage is reduced since:

The deprecation warning is not tested.
The length of the completely covered code is reduced.

brandonwillard · 2020-09-26T16:16:39Z

The deprecation warning is not tested.

You can add a noqa to that line.

MarcoGorelli · 2020-09-26T16:20:34Z

The deprecation warning is not tested.

You can add a noqa to that line.

why not write a test with

with pytest.warns(DeprecationWarning, match="<warning text>"):
    <test which sets `eps` parameter>

which covers it?
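
A self-contained sketch of such a check, using only the standard library so it runs without pytest. StickBreakingSketch and its warning text are hypothetical stand-ins, not the class or message from the PR.

```python
import warnings

class StickBreakingSketch:
    """Hypothetical stand-in for a transform that no longer uses `eps`."""

    def __init__(self, eps=None):
        # keep accepting the old keyword, but tell callers it is ignored
        if eps is not None:
            warnings.warn(
                "The argument `eps` is deprecated and will not be used.",
                DeprecationWarning,
                stacklevel=2,
            )

def warns_on_eps(**kwargs):
    """Return True if constructing the transform emits a DeprecationWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let default filters hide it
        StickBreakingSketch(**kwargs)
    return any(issubclass(w.category, DeprecationWarning) for w in caught)
```

In a pytest suite the same check would normally be written with pytest.warns(DeprecationWarning, match="eps") around the constructor call.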

brandonwillard · 2020-09-26T16:33:06Z

why not write a test with

You can definitely do that, but we're not really testing much of our own code in this case, so it's not a particularly relevant unit test.

katosh · 2020-09-26T16:33:50Z

I have already added the test :)

katosh · 2020-09-26T21:13:33Z

I tested how close to the edge of the simplex we can go before the transformation starts to break and for the cases I tested it seems to work down to the smallest float64:

>>> import numpy as np
>>> from pymc3.distributions.transforms import stick_breaking
>>> a = 5e-324
>>> vec = np.array([a, a, a, 1-(3*a)]) # a point very close to the edge of the 4-simplex
>>> stick_breaking.backward(stick_breaking.forward(vec).eval()).eval()
array([5.e-324, 5.e-324, 5.e-324, 1.e+000]) # very close to vec

However, the same may not hold for the Jacobian!

katosh · 2020-09-27T11:18:28Z

I investigated the Jacobian by plotting its values for points StickBreaking.forward(Simplex(3)) with

import itertools
import numpy as np
import theano
from pymc3.distributions.transforms import stick_breaking
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older matplotlib

def plot_jacobian_det(line):
    xl = list()
    yl = list()
    zl = list()
    # compile the log-determinant of the Jacobian as a function of the unconstrained point p
    p = theano.tensor.dvector('p')
    jd = theano.function([p], stick_breaking.jacobian_det(p))

    # evaluate on the 2-D grid line x line
    for x, y in itertools.product(line, repeat=2):
        xl.append(x)
        yl.append(y)
        log_jacobian_det = jd(np.array([x, y]))
        zl.append(log_jacobian_det)

    x = np.stack(xl)
    y = np.stack(yl)
    z = np.stack(zl)

    fig = plt.figure()
    # add_subplot attaches the 3-D axes to the figure on both old and new matplotlib
    ax = fig.add_subplot(111, projection='3d')
    ax.plot_trisurf(x, y, z, cmap=cm.jet)

I looked at the values in the center of the simplex:

plot_jacobian_det(np.linspace(-1, 1, 100))

and all the way to the edge

plot_jacobian_det(np.linspace(-500, 500, 100))

Note that stick_breaking.backward(np.array([-350, 350])).eval() is array([9.85967654e-305, 1.00000000e+000, 9.92959040e-153]) so we go as close to the edge of the simplex as possible with float64.

There are no numerical issues apparent, and the log-determinant of the Jacobian seems to behave as expected down to values so close to the edge of the simplex that they cannot be distinguished by the Dirichlet distribution (I believe they are mapped to the simplex before Dirichlet.logp is evaluated). But this is of course not an exhaustive investigation.
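
A cheap complement to the plots is to compare a closed-form log-determinant with a finite-difference Jacobian in plain NumPy. The sketch below assumes the backward map is a softmax of y with -sum(y) appended, which reproduces the stick_breaking.backward(np.array([-350, 350])) value quoted above. The closed form is derived here under that assumption and is not copied from the PR; all function names are illustrative.

```python
import numpy as np

def backward(y):
    """Assumed backward map: append -sum(y), then apply a stable softmax."""
    z = np.concatenate([y, [-y.sum()]])
    e = np.exp(z - z.max())
    return e / e.sum()

def log_jacobian_det(y):
    """Closed form: log K + K*sum(y) - K*logsumexp([y + sum(y), 0])."""
    K = y.shape[0] + 1
    s = y.sum()
    r = np.concatenate([y + s, [0.0]])
    m = r.max()
    lse = m + np.log(np.exp(r - m).sum())  # stable logsumexp
    return np.log(K) + K * s - K * lse

def numerical_log_jacobian_det(y, h=1e-6):
    """Central differences on the first K-1 simplex coordinates."""
    n = y.shape[0]
    J = np.empty((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = h
        J[:, j] = (backward(y + step)[:-1] - backward(y - step)[:-1]) / (2 * h)
    _, logdet = np.linalg.slogdet(J)
    return logdet
```

The finite-difference version is only meaningful away from the edge of the simplex, where the Jacobian entries are not lost to rounding, so it checks the formula rather than the extreme-value behaviour shown in the plots.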

twiecki · 2020-09-27T11:32:36Z

@katosh This looks great and quite thorough. Is there anything missing before merging from your end?

katosh · 2020-09-27T11:40:58Z

I am done so far but of course, I can do further testing if someone has a request.

twiecki · 2020-09-27T11:45:33Z

I think this is great, thanks so much for the contribution!

katosh · 2020-09-27T11:47:55Z

Awesome, thank you for having me be part of this project!

helmutsimon · 2021-08-24T08:01:03Z

It appears that StickBreaking.forward_val is being eliminated, with no equivalent in the new version. This would concern me, as I happen to use it in a public repository. I could work around it, but perhaps there are others using it also. Is there any deprecation warning in the meantime? I only found out about this because I wanted to add a backward_val.

ricardoV94 · 2021-08-24T09:21:08Z

It appears that StickBreaking.forward_val is being eliminated, with no equivalent in the new version. This would concern me, as I happen to use it in a public repository. I could work around it, but perhaps there are others using it also. Is there any deprecation warning in the meantime? I only found out about this because I wanted to add a backward_val.

Do you mind opening a separate issue for that? This one is pretty long, and forward_val was removed for all distributions, not just StickBreaking.

helmutsimon · 2021-08-25T05:21:02Z

Do you mind opening a separate issue for that? This one is pretty long, and forward_val was removed for all distributions, not just StickBreaking.

See discourse topic here

katosh and others added 11 commits October 1, 2019 13:14

simplified StickBreaking

83ab40a

fix StickBreaking jacobian

731e6b5

use stable logsumexp in StickBreaking

61ba98e

Rename n to Km1 to more easily compare patch.

6b78e24

Drop first dimension when computing determinant of the Jacobian of th…

8157760

…e transformation.

Drop newly unused expit import.

8cba7e1

Side-by-side stickbreaking implementations for comparison.

3a2dca2

Use same suffix for alternative stickbreaking transform.

6c556f4

Add separate tests for new stickbreaking implementation.

c0cfbd5

correct jacobain of Stickbreaking2

3f1f9f1

Merge remote-tracking branch 'pymc3/master' into simplify-stick-breaking

4903997

katosh mentioned this pull request Sep 23, 2020

Simple stick breaking (Formerly #3620) #3638

Closed

brandonwillard suggested changes Sep 23, 2020

pymc3/distributions/transforms.py (resolved)

katosh added 2 commits September 24, 2020 00:09

use pymc3.math.logsumexp in StickBreakin2

ebd41b1

remove old deprecated comment in StickBreaking2

c8612c8

remove distributions.transforms.StickBreaking2.backwards_val

b9b72a8

katosh requested a review from brandonwillard September 24, 2020 13:13

katosh added 2 commits September 24, 2020 17:33

include ignored parameter point in StickBreaking2.forwad_val

e0dc317

StickBreaking2 in the release notes

236a713

brandonwillard suggested changes Sep 26, 2020

RELEASE-NOTES.md (outdated, resolved)

katosh added 6 commits September 26, 2020 11:17

fix release notes typo

d84f928

accuracy test that only StickBreaking2 would pass

960e004

replace StickBreaking with the new alternative

7242402

cite isometric logration in StickBreaking

7d400c9

update release notes

4345c5b

update lda-advi-aevb.ipynb with new stickbreaking

3b42c03

katosh added 6 commits September 26, 2020 14:29

deprecation warning for eps in Stickbreaking

db23321

clarify jacobian implementation

1e1fa1e

remove revealing file system reference

56fcd74

Revert "remove unused import"

36f55ec

This reverts commit 30452db.

fix stickbreaking accuracy test for 32bit

53f0fa4

remove unused import

11a9dc9

katosh added 2 commits September 26, 2020 18:30

test eps deprecation of StickBreaking

7540973

fix indentation

231f0b4

Merge remote-tracking branch 'pymc3/master' into simplify-stick-breaking

f2bb87c

twiecki approved these changes Sep 27, 2020

twiecki merged commit fd76e96 into pymc-devs:master Sep 27, 2020

katosh mentioned this pull request Nov 20, 2020

Suggestion: Add tt.nnet.softmax to pm.math? #4226

Closed

This was referenced Dec 1, 2020

Errors building PyMC3 documentation #4276

Closed

Fix warnings from docstrings when building docs, remove t_stick_breaking #4283

Merged

harrig12 mentioned this pull request Jun 2, 2021

uniform dirichlet prior with stickbreaking transform + ADVI #4733

Open
