
Remove automatic normalization in Multinomial and Categorical #5331

Closed
ricardoV94 opened this issue Jan 10, 2022 Discussed in #5246 · 12 comments

Comments

@ricardoV94
Member

ricardoV94 commented Jan 10, 2022

Discussed in #5246

Originally posted by ricardoV94 December 8, 2021
This cropped up in #5234

Should we stop doing automatic normalization of the p parameter? This can hide very wrong inputs such as

import pymc as pm

with pm.Model() as m:
  x = pm.Multinomial('x', n=5, p=[-2, -2, -2, -2])

m.logp({x: [3, 2, 0, 0]})
# array(-4.62888666)

If we decide to keep the automatic normalization, we can at least remove some checks in the logp definition, since they cannot be triggered in this case.


I think this caused some problems, e.g. if a user specifies [.3, .3, .3] where things almost line up.

Originally posted by @twiecki in #5246 (comment)


Whenever the user creates a Multinomial / Categorical distribution with concrete (numpy) values, we check whether they are valid; if not, we normalize them but also issue a UserWarning along the lines of:

p values sum up to 0.9, instead of 1.0. They will be automatically rescaled. You can rescale them directly to get rid of this warning.

In the logp, or whenever we have symbolic inputs, we don't do any invisible normalization and instead let it evaluate to -inf, as with invalid parameters in other distributions.

I think this covers most use cases and does not have big backwards-compatibility issues.

Originally posted by @ricardoV94 in #5246 (comment)
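
To illustrate the proposed split between concrete and symbolic inputs, here is a rough sketch of the intended user-facing behaviour (the exact warning text, and whether the logp raises or returns -inf, are assumptions at this point):

import numpy as np
import aesara.tensor as at
import pymc as pm

# Concrete (list / numpy) p: rescaled in .dist(), with a UserWarning.
x = pm.Multinomial.dist(n=5, p=np.array([0.3, 0.3, 0.3]))

# Symbolic p: no hidden rescaling; an invalid p should instead fail the
# parameter checks when the logp is evaluated, as with other distributions.
p = at.as_tensor_variable([0.3, 0.3, 0.3])
y = pm.Multinomial.dist(n=5, p=p)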

@LukeLB
Contributor

LukeLB commented Jan 11, 2022

@ricardoV94 I'm happy to take this. Just to make sure I understand the ask, the normalisation step in pm.Multinomial() takes place using this line of code in the dist function (line 534 in multivariate.py):
p = p / at.sum(p, axis=-1, keepdims=True)
and the same code is used in the logp function within pm.Categorical() (line 1250 in discrete.py). Is this correct? For consistency, should the normalisation step be moved to the dist function in pm.Categorical()?

@ricardoV94
Member Author

Yes, the normalization in both should be done only in the dist, and only if the inputs are lists or numpy arrays.

@ricardoV94
Member Author

I've assigned the issue to you. Thanks for volunteering!

@LukeLB
Contributor

LukeLB commented Jan 15, 2022

@ricardoV94 I'm close to being done on this; my solution is to insert this code section in the dist() function:

# np, at, and warnings are the existing module-level imports in multivariate.py
if isinstance(p, (np.ndarray, list)):
    p_sum = np.sum([p], axis=-1)
    # Compare with np.isclose rather than exact float equality
    if not np.all(np.isclose(p_sum, 1.0)):
        warnings.warn(
            f"p values sum up to {p_sum}, instead of 1.0. They will be automatically rescaled. You can rescale them directly to get rid of this warning.",
            UserWarning,
        )
        p = p / at.sum(p, axis=-1, keepdims=True)

This works fine for Multinomial. However, for Categorical, when I move the normalisation step out of logp() into dist(), this test fails:

 _____________________________ TestMatchesScipy.test_categorical_valid_p[p2] _____________________________

self = <pymc.tests.test_distributions.TestMatchesScipy object at 0x7fc1bf3826a0>
p = array([-1, -1,  0,  0])

    @aesara.config.change_flags(compute_test_value="raise")
    @pytest.mark.parametrize(
        "p",
        [
            np.array([-0.2, 0.3, 0.5]),
            # A model where p sums to 1 but contains negative values
            np.array([-0.2, 0.7, 0.5]),
            # Hard edge case from #2082
            # Early automatic normalization of p's sum would hide the negative
            # entries if there is a single or pair number of negative values
            # and the rest are zero
            np.array([-1, -1, 0, 0]),
        ],
    )
    def test_categorical_valid_p(self, p):
        with Model():
            x = Categorical("x", p=p)
            with pytest.raises(ParameterValueError):
>               logp(x, 2).eval()
E               Failed: DID NOT RAISE <class 'aeppl.logprob.ParameterValueError'>

This appears to be testing an edge case from a previous issue involving normalisation of the invalid input [-1, -1, 0, 0]. I can fix the failure by just leaving the normalisation in logp(), but this creates inconsistent warning behaviour, where for Multinomial,

with pm.Model() as m:
    x = pm.Multinomial('x', n=5, p=[0.5,0.5,0.5,1])

UserWarning: p values sum up to [2.5], instead of 1.0. They will be automatically rescaled. You can rescale them directly to get rid of this warning.

raises a UserWarning but the same call for Categorical does not. Any suggestions on how to proceed here?

@ricardoV94
Member Author

If some/all of those test conditions no longer make sense in light of the new behavior they can be removed.

@ricardoV94
Member Author

ricardoV94 commented Jan 15, 2022

You can also replace the inputs with TensorVariables so that the .dist does not trigger the renormalization. Actually, testing both numpy inputs and TensorVariables might be the most comprehensive thing to do, since we are now consciously treating them differently.
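
For example, a sketch of what such a test could look like (the test name is made up and the imports are assumptions, mirroring the failing test above):

import aesara.tensor as at
import numpy as np
import pytest
from aeppl.logprob import ParameterValueError
from pymc import Categorical, Model, logp

def test_categorical_negative_p_symbolic():
    # A symbolic p bypasses the concrete-value normalization in .dist(),
    # so the invalid values reach the logp parameter checks unchanged.
    p = at.as_tensor_variable(np.array([-1, -1, 0, 0]))
    with Model():
        x = Categorical("x", p=p)
        with pytest.raises(ParameterValueError):
            logp(x, 2).eval()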

@ricardoV94
Member Author

ricardoV94 commented Jan 15, 2022

Finally, for the negative case we should probably raise a ValueError in the dist directly before we do the normalization check. What do you think?
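
Something along these lines inside dist(), before the rescaling (a sketch of the suggestion, not final code; the helper name is hypothetical):

import numpy as np

def _validate_concrete_p(p):
    # Hypothetical helper sketching the suggestion: for concrete inputs,
    # reject negative entries outright, before the sum check and rescaling,
    # so that the automatic normalization can never mask them.
    if isinstance(p, (np.ndarray, list)):
        p_arr = np.asarray(p)
        if (p_arr < 0).any():
            raise ValueError(f"Negative p values are not valid, got p={p_arr}")
    return p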

@LukeLB
Contributor

LukeLB commented Jan 16, 2022

Apologies, I'm not completely clear on what you are suggesting here. To clarify, are you saying that we just check for negative values in p prior to normalisation and raise a ValueError if any are present? And then rewrite the test highlighted above to check that a ValueError is raised? To me this makes sense, as negative probabilities are not valid in any circumstance anyway.

@ricardoV94
Member Author

@LukeLB Yeah that was my suggestion

@LukeLB
Contributor

LukeLB commented Jan 16, 2022

Great, I'll get on that!

@LukeLB
Contributor

LukeLB commented Jan 18, 2022

@ricardoV94 I have put a PR in for this issue now.

@ricardoV94
Member Author

Closed via #5370
