
Fix ordering Transformation for batched dimensions #6255

Merged · 13 commits · Nov 22, 2022

Conversation

@TimOliverMaier (Contributor) commented Oct 30, 2022

This PR is the continuation of @purna135's PR #5660 addressing issue #5659 (ordering transformation fails for dim > 1).

Short summary:
The initial change adjusted the dimensionality of the Jacobian determinant:

     def log_jac_det(self, value, *inputs): 
-        return at.sum(value[..., 1:], axis=-1)
+        return at.sum(value[..., 1:], axis=-1, keepdims=True)

This led to issues with multivariate distributions (as pointed out by @Sayam753 here).

Based on the suggestions of @ricardoV94, I added a simple case distinction inside the log_jac_det method of the Ordered and SumTo1 transforms, like this:

+    def __init__(self, ndim_supp=0):
+        self.ndim_supp = ndim_supp

     def log_jac_det(self, value, *inputs): 
-        return at.sum(value[..., 1:], axis=-1, keepdims=True)
+        if self.ndim_supp == 0:
+            return at.sum(value[..., 1:], axis=-1, keepdims=True)
+        else:
+            return at.sum(value[..., 1:], axis=-1)
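For intuition, the shape consequence of this case distinction can be sketched in plain NumPy (an illustrative sketch, not the actual Aesara code; `value` stands in for the transformed value variable):

```python
import numpy as np

# Illustrative sketch (NumPy, not Aesara): shape of the summed Jacobian
# term for a batch of 5 vectors with core dimension 4.
value = np.random.default_rng(0).normal(size=(5, 4))

# ndim_supp == 0: keepdims=True leaves a trailing axis of length 1, so the
# determinant broadcasts against an elementwise logp of shape (5, 4).
jac_elemwise = np.sum(value[..., 1:], axis=-1, keepdims=True)

# ndim_supp == 1: the core dimension is reduced away, matching a
# multivariate logp of shape (5,).
jac_multivariate = np.sum(value[..., 1:], axis=-1)

print(jac_elemwise.shape)      # (5, 1)
print(jac_multivariate.shape)  # (5,)
```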

While fixing the mentioned problem with multivariate distributions, two problems remain:

  1. TestElementWiseLogP in test_transform.py fails
    due to the inconsistent dimensionality of the Jacobian determinant among the possible transformations.
    I also ran all the other tests in the distributions subfolder locally. Many of them also fail for me on the main branch,
    so I am unsure whether any of the failing tests are caused by this PR's additions.

  2. While experimenting with the multivariate distributions I realized that ordering of a MvNormal fails in main and still fails in
    this PR. In both branches it fails with a ValueError:

     ValueError: array must not contain infs or NaNs
     Apply node that caused the error: SolveTriangular{lower=True, trans=0, unit_diagonal=False, check_finite=True}(TensorConstant{[[1. 0. 0...0. 0. 1.]]}, SpecifyShape.0)
     Toposort index: 12
     Inputs types: [TensorType(float64, (4, 4)), TensorType(float64, (4, 1))]
     Inputs shapes: [(4, 4), (4, 1)]
     Inputs strides: [(8, 32), (8, 8)]
     Inputs values: ['not shown', array([[0.80230155], [nan], [nan], [nan]])]
     Outputs clients: [[InplaceDimShuffle{1,0}(SolveTriangular{lower=True, trans=0, unit_diagonal=False, check_finite=True}.0)]]

     Backtrace when the node is created (use Aesara flag traceback__limit=N to make it longer):
       File "/home/tim/miniconda3/envs/pymc_dev/lib/python3.10/site-packages/aeppl/joint_logprob.py", line 151, in factorized_joint_logprob
         q_logprob_vars = _logprob(
       File "/home/tim/miniconda3/envs/pymc_dev/lib/python3.10/functools.py", line 889, in wrapper
         return dispatch(args[0].__class__)(*args, **kw)
       File "/home/tim/miniconda3/envs/pymc_dev/lib/python3.10/site-packages/aeppl/transforms.py", line 611, in transformed_logprob
         logprob = _logprob(rv_op, values, *inputs, **kwargs)
       File "/home/tim/miniconda3/envs/pymc_dev/lib/python3.10/functools.py", line 889, in wrapper
         return dispatch(args[0].__class__)(*args, **kw)
       File "/home/tim/Workspaces/pymc/pymc/distributions/distribution.py", line 123, in logp
         return class_logp(value, *dist_params)
       File "/home/tim/Workspaces/pymc/pymc/distributions/multivariate.py", line 288, in logp
         quaddist, logdet, ok = quaddist_parse(value, mu, cov)
       File "/home/tim/Workspaces/pymc/pymc/distributions/multivariate.py", line 156, in quaddist_parse
         dist, logdet, ok = quaddist_chol(delta, chol_cov)
       File "/home/tim/Workspaces/pymc/pymc/distributions/multivariate.py", line 173, in quaddist_chol
         delta_trans = solve_lower(chol_cov, delta.T).T

     HINT: Use the Aesara flag exception_verbosity=high for a debug print-out and storage map footprint of this Apply node.

For reproduction, see code below.

with pm.Model() as model:
    kappa_MV = pm.MvNormal("kappa_mv",
                           mu=[1,1,1,1],
                           cov = np.eye(4),
                           initval=[2,1,0,2],
                           transform=pm.distributions.transforms.ordered
                           )
    pm.sample()

I want to share my progress here as a draft to make collaboration possible ;).

Cheers!

Checklist

Bugfixes / New features

  • ordering transformation for RVs with number of dimensions > 1

codecov bot commented Oct 30, 2022

Codecov Report

Merging #6255 (9daae35) into main (4acd98e) will decrease coverage by 3.72%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main    #6255      +/-   ##
==========================================
- Coverage   94.17%   90.44%   -3.73%     
==========================================
  Files         111      111              
  Lines       23908    23985      +77     
==========================================
- Hits        22515    21693     -822     
- Misses       1393     2292     +899     
Impacted Files Coverage Δ
pymc/distributions/transforms.py 99.37% <100.00%> (-0.63%) ⬇️
pymc/tests/distributions/test_transform.py 100.00% <100.00%> (ø)
pymc/tests/distributions/test_timeseries.py 0.00% <0.00%> (-99.51%) ⬇️
pymc/tests/step_methods/hmc/test_quadpotential.py 0.00% <0.00%> (-95.82%) ⬇️
pymc/distributions/timeseries.py 28.46% <0.00%> (-66.00%) ⬇️
pymc/model_graph.py 66.47% <0.00%> (-12.36%) ⬇️
pymc/step_methods/hmc/quadpotential.py 73.84% <0.00%> (-6.78%) ⬇️
pymc/distributions/logprob.py 61.72% <0.00%> (-5.56%) ⬇️
pymc/distributions/shape_utils.py 97.44% <0.00%> (-0.43%) ⬇️
... and 17 more

@TimOliverMaier (Contributor, Author) commented Oct 31, 2022

Hey.

The ndim_supp member of the transforms.ordered and transforms.sum_to_1 instantiations is set to 1, so the old behavior is mimicked. This fixes the two failing tests. It seems the other failing tests in the distributions folder were due to local issues on my machine, so no problem there :).

Regarding the ValueError when sampling multivariate ordered distributions, I found out it comes from an initval that is not ordered. I think this is a different issue.
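That failure mode is easy to check up front; a minimal, illustrative NumPy helper (`is_ordered` is a hypothetical name, not a PyMC function) flags the repro's initval:

```python
import numpy as np

# Illustrative sketch: an initval for an ordered transform must be
# strictly increasing along the last axis, otherwise the forward
# transform takes log of a non-positive difference and produces nans.
def is_ordered(initval):
    return bool(np.all(np.diff(initval, axis=-1) > 0))

print(is_ordered([2, 1, 0, 2]))  # initval from the repro above -> False
print(is_ordered([0, 1, 2, 3]))  # -> True
```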

@TimOliverMaier TimOliverMaier marked this pull request as ready for review October 31, 2022 12:28
@TimOliverMaier (Contributor, Author) commented:
I know, tests are still missing here. I hesitate because, if my changes are the way to go, then transforms.ordered would stay only for backward compatibility, and its usage in the tests should be changed to either the univariate or multivariate instantiation. After your feedback I am happy to create (and change) the tests.

@ricardoV94 (Member) commented Nov 18, 2022

@TimOliverMaier I think we can remove the old ordered one. Question: should we raise a ValueError for ndim_supp > 1? Sorting only the last axis does not make a lot of sense when there are two core dimensions.

@TimOliverMaier (Contributor, Author) commented:
@ricardoV94 thank you for your response! Alright, I will alter the relevant tests then. But we keep the transforms.ordered instantiation for backwards compatibility for now, I think?

Yes, good point. I will implement the ValueError as well. Just out of interest: can one get ndim_supp > 1 with the distributions provided in pymc?

@ricardoV94 (Member) commented Nov 18, 2022

Only out of interest: can one get ndim_supp>1 with the provided distributions in pymc?

Just some edge cases: MatrixNormal and RandomWalk with multivariate innovations (RandomWalk.ndim_supp = innovation.ndim_supp + 1)

@ricardoV94 (Member) commented Nov 18, 2022

But we keep the transforms.ordered instantiation for backwards compatibility for now, I think?

I am not sure that's better actually...

1. updated variable `__all__` to contain
   `univariate_ordered`,`multivariate_ordered` and
   analogous `sum_to_1` instantiations.
@TimOliverMaier (Contributor, Author) commented:
I updated the tests to use the new instantiations and implemented the ValueError.
I wonder if we should keep the last two tests in test_transform.py:

def test_transforms_ordered():
    with pm.Model() as model:
        pm.Normal(
            "x_univariate",
            mu=[-3, -1, 1, 2],
            sigma=1,
            size=(10, 4),
            transform=tr.univariate_ordered,
        )
    log_prob = model.point_logps()
    np.testing.assert_allclose(list(log_prob.values()), np.array([18.69]))


def test_transforms_sumto1():
    with pm.Model() as model:
        pm.Normal(
            "x",
            mu=[-3, -1, 1, 2],
            sigma=1,
            size=(10, 4),
            transform=tr.univariate_sum_to_1,
        )
    log_prob = model.point_logps()
    np.testing.assert_allclose(list(log_prob.values()), np.array([-56.76]))

These were added earlier in this PR. I think the shape of the transforms is now properly tested in check_vectortransform_elementwise_logp. I'd rather add some extra lines covering the multivariate case here:

def test_ordered():
    check_vector_transform(tr.univariate_ordered, SortedVector(6))
    check_jacobian_det(
        tr.univariate_ordered, Vector(R, 2), at.dvector, np.array([0, 0]), elemwise=False
    )
    vals = get_values(tr.univariate_ordered, Vector(R, 3), at.dvector, np.zeros(3))
    close_to_logical(np.diff(vals) >= 0, True, tol)
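For readers unfamiliar with what these tests exercise, the Ordered bijection can be sketched in plain NumPy (my reading of the transform, hedged: not the Aesara implementation; the helper names `forward`, `backward`, and `log_jac_det` are illustrative):

```python
import numpy as np

# Sketch of the Ordered bijection: the unconstrained value keeps the first
# element plus log-differences, so backward always yields a strictly
# increasing vector, and log|det J| is the sum of the log-difference terms.
def forward(x):
    x = np.asarray(x, dtype=float)
    return np.concatenate([x[..., :1], np.log(np.diff(x, axis=-1))], axis=-1)

def backward(y):
    y = np.asarray(y, dtype=float)
    return np.cumsum(
        np.concatenate([y[..., :1], np.exp(y[..., 1:])], axis=-1), axis=-1
    )

def log_jac_det(y, keepdims):
    # Mirrors the case distinction discussed in this PR.
    return np.sum(np.asarray(y)[..., 1:], axis=-1, keepdims=keepdims)

x = np.array([-1.0, 0.5, 2.0])
assert np.allclose(backward(forward(x)), x)  # round trip

# backward of arbitrary unconstrained batches is always ordered
y_batch = np.random.default_rng(1).normal(size=(10, 4))
assert np.all(np.diff(backward(y_batch), axis=-1) > 0)
```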

@ricardoV94 (Member) left a comment:
@TimOliverMaier we forgot to include this PR in our major release of 4.4.0 last night. As such I would agree with your initial suggestion of keeping the original transforms around for a bit longer so we can merge these fixes and release them with 4.1, without breaking backwards compatibility. I hope that's not too annoying :)

Besides that I have some small suggestions for the new tests. I wrote about the first but it applies to both. Let me know what you think

@@ -598,3 +615,31 @@ def test_discrete_trafo():
    with pytest.raises(ValueError) as err:
        pm.Binomial("a", n=5, p=0.5, transform="log")
    err.match("Transformations for discrete distributions")


def test_transforms_ordered():
@ricardoV94 (Member) commented:

Perhaps a more direct name?

Suggested change
def test_transforms_ordered():
def test_2d_univariate_ordered_():

transform=tr.univariate_ordered,
)

log_prob = model.point_logps()
@ricardoV94 (Member) commented Nov 20, 2022:

To make this test a bit more readable, I would suggest including an equivalent x_1d = pm.Normal(..., shape=(4,)) in the model and comparing that its elemwise logp is the same as each copy of the 2D (10, 4) one that exists now.

You can do that via m.compile_logp(sum=False)({"x_1d_ordered__": np.zeros((4,)), "x_2d_ordered__": np.zeros((10, 4))}) and then asserting closeness across the last axis.

@TimOliverMaier (Contributor, Author) commented:

For the sum_to_1 transform the logps are not the same. Is this expected? Even when the shape is the same, as here:

def test_2d_univariate_sum_to_1():
    with pm.Model() as model:
        x_1d = pm.Normal(
            "x_1d",
            mu=[-3,-1,1,2],
            sigma=1,
            size=(10,4),
            transform=tr.univariate_sum_to_1,
        )
        x_2d = pm.Normal(
            "x_2d",
            mu=[-3, -1, 1, 2],
            sigma=1,
            size=(10, 4),
            transform=tr.univariate_sum_to_1,
        )

    log_p = model.compile_logp(sum=False)(
        {"x_1d_sumto1__": np.ones((10, 3)) * 0.25, "x_2d_sumto1__": np.zeros((10, 3)) * 0.25}
    )
    np.testing.assert_allclose(log_p[0], log_p[1])

Fails with:

./pymc/tests/distributions/test_transform.py::test_2d_univariate_sum_to_1 Failed: AssertionError:
Not equal to tolerance rtol=1e-07, atol=0

Mismatched elements: 40 / 40 (100%)
Max absolute difference: 1.03125
Max relative difference: 0.72677567
 x: array([[-6.200189, -1.700189, -1.200189, -2.450189],
       [-6.200189, -1.700189, -1.200189, -2.450189],
       [-6.200189, -1.700189, -1.200189, -2.450189],...
 y: array([[-5.418939, -1.418939, -1.418939, -1.418939],
       [-5.418939, -1.418939, -1.418939, -1.418939],
       [-5.418939, -1.418939, -1.418939, -1.418939],...

@TimOliverMaier (Contributor, Author) commented:

Ok, funny. This is not the case if I use np.zeros.

@ricardoV94 (Member) commented:

PR looks good otherwise, should we investigate the SumTo1?

@ricardoV94 (Member) commented:

Was this when comparing np.zeros(), vs np.ones()? I don't expect those to be equivalent...

@TimOliverMaier (Contributor, Author) commented Nov 22, 2022:

Oh boy 😆. Maybe it was just that. I thought I was comparing np.ones() to np.ones(). I will check again.

@TimOliverMaier (Contributor, Author) commented:

Yes. This was my mistake, sorry! I checked it properly with np.ones() and it passes.
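In hindsight the mismatch was expected: the two inputs were different points. A minimal sketch of the SumTo1 backward map (my reading, hedged: append the remainder so the result sums to one; not the PyMC implementation) makes this concrete:

```python
import numpy as np

# Illustrative sketch of the SumTo1 backward map: the unconstrained value
# holds the first n-1 entries; the last entry is whatever remains to sum
# to one. ones(3) * 0.25 and zeros(3) are therefore different points.
def sum_to_1_backward(y):
    y = np.asarray(y, dtype=float)
    remainder = 1 - np.sum(y, axis=-1, keepdims=True)
    return np.concatenate([y, remainder], axis=-1)

print(sum_to_1_backward(np.ones(3) * 0.25))  # [0.25 0.25 0.25 0.25]
print(sum_to_1_backward(np.zeros(3)))        # [0. 0. 0. 1.]
```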

1. Old `transforms.ordered` and `transforms.sum_to_1` instantiations
   were added again for backwards compatibility.
2. Tests for multivariate and univariate usage of `SumTo1` and `Ordered`
   transformations were added.
@TimOliverMaier TimOliverMaier requested review from ricardoV94 and removed request for lucianopaz and Sayam753 November 21, 2022 08:51
@ricardoV94 ricardoV94 changed the title Ordering Transformation for higher Dimensions Fix ordering Transformation for batched dimensions Nov 22, 2022
@ricardoV94 ricardoV94 added the bug label Nov 22, 2022
@ricardoV94 (Member) commented:

@TimOliverMaier thanks so much for the fix (and the going back and forth). Is this ready to merge on your end, or did you want to do anything else?

@TimOliverMaier (Contributor, Author) commented:

@ricardoV94 my pleasure :) . Yes, I am finished here.

@ricardoV94 ricardoV94 merged commit 58dfb35 into pymc-devs:main Nov 22, 2022
wrongu pushed a commit to wrongu/pymc that referenced this pull request Dec 1, 2022
Co-authored-by: Purna Chandra Mansingh <purnachandramansingh135@gmail.com>