Tweak DirichletMultinomial logp and refactor some multivariate logp tests #5234
Codecov Report
@@ Coverage Diff @@
## main #5234 +/- ##
==========================================
+ Coverage 78.95% 78.98% +0.02%
==========================================
Files 88 88
Lines 14232 14231 -1
==========================================
+ Hits 11237 11240 +3
+ Misses 2995 2991 -4
I don't get why we would automatically normalize the vector p in the Multinomial, but I'm happy to revert those changes if they cause too much trouble. It hides the same problems that were raised in this old issue for the Categorical: #2082 (comment)
Agreed, the implicit normalization is sneaky and can cause issues down the road
If we changed this, it would make sense to also not normalize things in the Categorical by default...
I reverted the changes to the Multinomial normalization; I'll open a discussion on this for the Multinomial and the Categorical. In the meantime, this can hopefully be merged.
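To make the concern about implicit normalization concrete, here is a minimal sketch in plain Python. It is not the PyMC implementation: the helper `multinomial_logp` and its `normalize` flag are hypothetical names introduced only for this illustration, and `p` is assumed to be non-negative. It shows how silent rescaling gives an invalid `p` exactly the same logp as the valid vector it rescales to, whereas a strict check would flag it.

```python
import math


def multinomial_logp(x, p, normalize=True):
    """Log-probability of the count vector x under a multinomial with n = sum(x).

    Hypothetical helper for illustration; not the PyMC implementation.
    """
    if normalize:
        # Implicit rescaling: any non-negative p is silently made to sum to 1.
        total = sum(p)
        if total <= 0:
            return -math.inf
        p = [pi / total for pi in p]
    elif not math.isclose(sum(p), 1.0):
        # Strict variant: a p that does not sum to 1 gets zero probability.
        return -math.inf

    n = sum(x)
    logp = math.lgamma(n + 1)  # log(n!)
    for xi, pi in zip(x, p):
        logp -= math.lgamma(xi + 1)  # subtract log(x_i!)
        if xi > 0:
            if pi == 0:
                return -math.inf
            logp += xi * math.log(pi)
    return logp


counts = [2, 1, 1]
valid_p = [0.5, 0.25, 0.25]
invalid_p = [1.0, 0.5, 0.5]  # sums to 2, most likely a modelling mistake

logp_valid = multinomial_logp(counts, valid_p)
logp_rescaled = multinomial_logp(counts, invalid_p)  # identical to logp_valid
logp_strict = multinomial_logp(counts, invalid_p, normalize=False)  # -inf
```

With rescaling on, the mistake in `invalid_p` is invisible in the result; with the strict check it surfaces as a logp of `-inf`.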
Probability of each one of the different outcomes. Elements must
be non-negative and sum to 1 along the last axis. They will be
automatically rescaled otherwise.
n: int
But n can be a vector too, can't it? E.g. if the number of trials varies by observation
Yeah, n can be a vector or a matrix, anything as long as the dimensions broadcast properly. I put int to indicate the base case, but perhaps it's best to remove any information about shape and only mention the meaning of the parameters?
Why not int, vector, matrix? That's more explicit
I don't find that very helpful. Other multivariate distributions use an ambiguous "array" and most univariate distributions say only "int" or "float" even though they also support vectors and tensors of any shape.
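As a rough illustration of the base case versus a per-observation n discussed in this thread, the sketch below evaluates the standard Dirichlet-multinomial log-pmf in plain Python. It is an assumption-laden stand-in, not PyMC code: `dm_logp` is a hypothetical helper, and the batching is done with an explicit loop rather than array broadcasting.

```python
import math


def dm_logp(x, n, a):
    """Dirichlet-multinomial log-probability of one count vector x.

    Hypothetical helper for illustration; not the PyMC implementation.
    x: counts per category, n: number of trials, a: positive concentrations.
    """
    if sum(x) != n:
        # Counts that do not add up to n have zero probability.
        return -math.inf
    a_sum = sum(a)
    logp = math.lgamma(n + 1) + math.lgamma(a_sum) - math.lgamma(n + a_sum)
    for xk, ak in zip(x, a):
        logp += math.lgamma(xk + ak) - math.lgamma(ak) - math.lgamma(xk + 1)
    return logp


a = [1.0, 1.0, 1.0]

# Base case: one observation, scalar n, and a scalar logp.
scalar_case = dm_logp([2, 1, 0], 3, a)

# Batched case: the number of trials varies by observation.
xs = [[2, 1, 0], [1, 1, 3]]
ns = [3, 5]
batched = [dm_logp(x, n, a) for x, n in zip(xs, ns)]
```

With all concentrations equal to 1, every count vector summing to n is equally likely, which makes the values easy to check by hand: 1/10 for n = 3 and 1/21 for n = 5 with three categories.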
a : one- or two-dimensional array
Dirichlet parameter. Elements must be strictly positive.
The number of categories is given by the length of the last axis.
n : int
Same remark as for Multinomial
Wow, those tests are much simpler now 🤩
LGTM @ricardoV94 -- I just had a question about the n parameter. Once that's settled we can merge
…nality of n and a Refactor vectorized logp tests
… n and p Refactor vectorized logp tests
@AlexAndorra I added a commit that harmonizes the docstrings of the 3 distributions. The emphasis is on the dtype of each parameter and the special meaning of the last axis of […]. I'd rather not make it explicit that […]
Yeah, that looks better now, thanks @ricardoV94
Couple of tweaks/fixes to the logp/docstrings of Multinomial and DirichletMultinomial
Summary:
1. […] n and a in DirichletMultinomial, related to the discussion in "Adds tests and mode for dirichlet multinomial distribution" (#5225). This also solves a minor issue where the logp was returning a 1D vector for the base case instead of just a scalar.
2. […] n and p in the docstrings of the Multinomial
3. […] Dirichlet
Also:
4. Remove a couple of old/redundant random tests